Topics in Matrix Analysis

Book by Horn & Johnson
To Ceres with love: Sixteen years weren't nearly enough . . .

Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© Cambridge University Press 1991

First published 1991

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data
Horn, Roger A.
Topics in matrix analysis.
1. Matrices. I. Johnson, Charles R. II. Title. III. Title: Matrix analysis.
QA188.H664 1986 512.9'434 86-23310

British Library Cataloguing in Publication applied for.

ISBN 0-521-30587-X hardback

Contents

Preface

Chapter 1 The field of values
1.0 Introduction
1.1 Definitions
1.2 Basic properties of the field of values
1.3 Convexity
1.4 Axiomatization
1.5 Location of the field of values
1.6 Geometry
1.7 Products of matrices
1.8 Generalizations of the field of values

Chapter 2 Stable matrices and inertia
2.0 Motivation
2.1 Definitions and elementary observations
2.2 Lyapunov's theorem
2.3 The Routh-Hurwitz conditions
2.4 Generalizations of Lyapunov's theorem
2.5 M-matrices, P-matrices, and related topics

Chapter 3 Singular value inequalities
3.0 Introduction and historical remarks
3.1 The singular value decomposition
3.2 Weak majorization and doubly substochastic matrices
3.3 Basic inequalities for singular values and eigenvalues
3.4 Sums of singular values: the Ky Fan k-norms
3.5 Singular values and unitarily invariant norms
3.6 Sufficiency of Weyl's product inequalities
3.7 Inclusion intervals for singular values
3.8 Singular value weak majorization for bilinear products

Chapter 4 Matrix equations and the Kronecker product
4.0 Motivation
4.1 Matrix equations
4.2 The Kronecker product
4.3 Linear matrix equations and Kronecker products
4.4 Kronecker sums and the equation AX + XB = C
4.5 Additive and multiplicative commutators and linear preservers

Chapter 5 The Hadamard product
5.0 Introduction
5.1 Some basic observations
5.2 The Schur product theorem
5.3 Generalizations of the Schur product theorem
5.4 The matrices A ∘ (A⁻¹)ᵀ and A ∘ A⁻¹
5.5 Inequalities for Hadamard products of general matrices: an overview
5.6 Singular values of a Hadamard product: a fundamental inequality
5.7 Hadamard products involving nonnegative matrices and M-matrices

Chapter 6 Matrices and functions
6.0 Introduction
6.1 Polynomial matrix functions and interpolation
6.2 Nonpolynomial matrix functions
6.3 Hadamard matrix functions
6.4 Square roots, logarithms, nonlinear matrix equations
6.5 Matrices of functions
6.6 A chain rule for functions of a matrix

Hints for problems
References
Notation
Index

Preface

This volume is a sequel to the previously published Matrix Analysis and includes development of further topics that support applications of matrix theory. We refer the reader to the preface of the prior volume for many general comments that apply here also. We adopt the notation and referencing conventions of that volume and make specific reference to it [HJ] as needed.

Matrix Analysis developed the topics of broadest utility in the connection of matrix theory to other subjects and for modern research in the subject. The current volume develops a further set of slightly more specialized topics in the same spirit. These are: the field of values (or classical numerical range), matrix stability and inertia (including M-matrices), singular values and associated inequalities, matrix equations and Kronecker products, Hadamard (or entrywise) products of matrices, and several ways in which matrices and functions interact. Each of these topics is an area of active current research, and several of them do not yet enjoy a broad exposition elsewhere.

Though this book should serve as a reference for these topics, the exposition is designed for use in an advanced course.
Chapters include motivational background, discussion, relations to other topics, and literature references. Most sections include exercises in the development as well as many problems that reinforce or extend the subject under discussion. There are, of course, other matrix analysis topics not developed here that warrant attention. Some of these already enjoy useful expositions; for example, totally positive matrices are discussed in [And] and [Kar].

We have included many exercises and over 650 problems because we feel they are essential to the development of an understanding of the subject and its implications. The exercises occur throughout the text as part of the development of each section; they are generally elementary and of immediate use in understanding the concepts. We recommend that the reader work at least a broad selection of these. Problems are listed (in no particular order) at the end of sections; they cover a range of difficulties and types (from theoretical to computational) and they may extend the topic, develop special aspects, or suggest alternate proofs of major ideas. In order to enhance the utility of the book as a reference, many problems have hints; these are collected in a separate section following Chapter 6. The results of some problems are referred to in other problems or in the text itself. We cannot overemphasize the importance of the reader's active involvement in carrying out the exercises and solving problems.

As in the prior volume, a broad list of related books and major surveys is given prior to the index, and references to this list are given via mnemonic code in square brackets. Readers may find the reference list of independent utility.

We appreciate the assistance of our colleagues and students who have offered helpful suggestions or commented on the manuscripts that preceded publication of this volume. They include M. Bakonyi, W. Barrett, O. Chan, C. Cullen, M. Cusick, J. Dietrich, S. H. Friedberg, S. Gabriel, F. Hall, C.-K. Li, M. Lundquist, R. Mathias, D. Merino, R. Merris, P. Nylen, A. Sourour, G. W. Stewart, R. C. Thompson, P. van Dooren, and E. M. Wermuth.

The authors wish to maintain the utility of this volume to the community and welcome communication from readers of errors or omissions that they find. Such communications will be rewarded with a current copy of all known errata.

R.A.H.
C.R.J.

Chapter 1  The field of values

1.0 Introduction

Like the spectrum (or set of eigenvalues) σ(·), the field of values F(·) is a set of complex numbers naturally associated with a given n-by-n matrix A:

F(A) = {z*Az : z ∈ Cⁿ, z*z = 1}

The spectrum of a matrix is a discrete point set; while the field of values can be a continuum, it is always a compact convex set. Like the spectrum, the field of values is a set that can be used to learn something about the matrix, and it can often give information that the spectrum alone cannot give. The eigenvalues of Hermitian and normal matrices have especially pleasant properties, and the field of values captures certain aspects of this nice structure for general matrices.

1.0.1 Subadditivity and eigenvalues of sums

If only the eigenvalues σ(A) and σ(B) are known about two n-by-n matrices A and B, remarkably little can be said about σ(A + B), the eigenvalues of the sum. Of course, tr(A + B) = tr A + tr B, so the sum of all the eigenvalues of A + B is the sum of all the eigenvalues of A plus the sum of all the eigenvalues of B. But beyond this, nothing can be said about the eigenvalues of A + B without more information about A and B. For example, even if all the eigenvalues of two n-by-n matrices A and B are known and fixed, the spectral radius of A + B (the largest absolute value of an eigenvalue of A + B, denoted by ρ(A + B)) can be arbitrarily large (see Problem 1). On the other hand, if A and B are normal, then much can be said about the
eigenvalues of A + B; for example, ρ(A + B) ≤ ρ(A) + ρ(B) in this case.

Sums of matrices do arise in practice, and two relevant properties of the field of values F(·) are:

(a) The field of values is subadditive: F(A + B) ⊆ F(A) + F(B), where the set sum has the natural definition of sums of all possible pairs, one from each; and
(b) the eigenvalues of a matrix lie inside its field of values: σ(A) ⊆ F(A).

Combining these two properties yields the inclusions

σ(A + B) ⊆ F(A + B) ⊆ F(A) + F(B)

so if the two fields of values F(A) and F(B) are known, something can be said about the spectrum of the sum.

1.0.2 An application from the numerical solution of partial differential equations

Suppose that A = [a_ij] ∈ M_n(R) satisfies

(a) A is tridiagonal (a_ij = 0 for |i - j| > 1), and
(b) a_{i,i+1} a_{i+1,i} < 0 for i = 1,..., n - 1.

Matrices of this type arise in the numerical solution of partial differential equations and in the analysis of dynamical systems arising in mathematical biology. In both cases, knowledge about the real parts of the eigenvalues of A is important. It turns out that rather good information about the eigenvalues of such a matrix can be obtained easily using the field of values F(·).

1.0.2.1 Fact: For any eigenvalue λ of a matrix A of the type indicated, we have

min_{1≤i≤n} a_ii ≤ Re λ ≤ max_{1≤i≤n} a_ii

A proof of this fact is fairly simple using some properties of the field of values to be developed in Section (1.2). First, choose a diagonal matrix D with positive diagonal entries such that Â = D⁻¹AD satisfies â_ij = -â_ji for j ≠ i. The matrix D = diag(d₁,..., d_n) defined by d₁ = 1 and

d_{j+1} = [-a_{j+1,j}/a_{j,j+1}]^½ d_j,  j = 1,..., n - 1

will do. Since A and Â are similar, their eigenvalues are the same.
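The choice of D can be checked numerically. The sketch below is not from the text; its matrix entries are arbitrary illustrative values chosen to satisfy conditions (a) and (b), and it verifies that D⁻¹AD has a diagonal Hermitian part and that the eigenvalue bound of Fact (1.0.2.1) holds:

```python
import numpy as np

# Arbitrary illustrative entries (not from the text), chosen so that
# (a) A is tridiagonal and (b) a_{i,i+1} * a_{i+1,i} < 0.
A = np.array([[2.0,  3.0, 0.0],
              [-1.0, 5.0, 4.0],
              [0.0, -2.0, 1.0]])
n = A.shape[0]

# D = diag(d_1,...,d_n) with d_1 = 1 and
# d_{j+1} = d_j * sqrt(-a_{j+1,j} / a_{j,j+1}).
d = np.ones(n)
for j in range(n - 1):
    d[j + 1] = d[j] * np.sqrt(-A[j + 1, j] / A[j, j + 1])
Ahat = np.diag(1 / d) @ A @ np.diag(d)

# Ahat is similar to A, and its Hermitian part is diag(a_11,...,a_nn),
# so every eigenvalue satisfies min a_ii <= Re(lambda) <= max a_ii.
H = (Ahat + Ahat.T) / 2
assert np.allclose(H, np.diag(np.diag(A)))
re_eigs = np.linalg.eigvals(A).real
assert np.diag(A).min() - 1e-9 <= re_eigs.min()
assert re_eigs.max() <= np.diag(A).max() + 1e-9
```

The similarity leaves the diagonal of A untouched, while the sign condition (b) makes the scaled off-diagonal entries exactly skew-symmetric.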
We then have

Re σ(A) = Re σ(Â) ⊆ Re F(Â) = F(½(Â + Â*)) = F(diag(a₁₁,..., a_nn)) = Convex hull of {a₁₁,..., a_nn} = [min_i a_ii, max_i a_ii]

The first inclusion follows from the spectral containment property (1.2.6), the next equality follows from the projection property (1.2.5), the next equality follows from the special form achieved for Â, and the last equality follows from the normality property (1.2.9) and the fact that the eigenvalues of a diagonal matrix are its diagonal entries. Since the real part of each eigenvalue λ ∈ σ(A) is a convex combination of the main diagonal entries a₁₁,..., a_nn, the asserted inequalities are clear and the proof is complete.

1.0.3 Stability analysis

In an analysis of the stability of an equilibrium in a dynamical system governed by a system of differential equations, it is important to know if the real part of every eigenvalue of a certain matrix A is negative. Such a matrix is called stable. In order to avoid juggling negative signs, we often work with positive stable matrices (all eigenvalues have positive real parts). Obviously, A is positive stable if and only if -A is stable.

An important sufficient condition for a matrix to be positive stable is the following fact.

1.0.3.1 Fact: Let A ∈ M_n. If A + A* is positive definite, then A is positive stable.

This is another application of properties of the field of values F(·) to be developed in Section (1.2). By the spectral containment property (1.2.6), Re σ(A) ⊆ Re F(A), and, by the projection property (1.2.5), Re F(A) = F(½(A + A*)). But, since A + A* is positive definite, so is ½(A + A*), and hence, by the normality property (1.2.9), F(½(A + A*)) is contained in the positive real axis. Thus, each eigenvalue of A has a positive real part, and A is positive stable.

Actually, more is true. If A + A* is positive definite, and if P ∈ M_n is any positive definite matrix, then PA is positive stable because

(P^½)⁻¹[PA]P^½ = P^½AP^½, and P^½AP^½ + (P^½AP^½)* = P^½(A + A*)P^½

where P^½ is the unique (Hermitian) positive definite square root of P. Since congruence preserves positive definiteness, the eigenvalues of PA have positive real parts for the same reason as A. Lyapunov's theorem (2.2.1) shows that all positive stable matrices arise in this way.

1.0.4 An approximation problem

Suppose we wish to approximate a given matrix A ∈ M_n by a complex multiple of a Hermitian matrix of rank at most one, as closely as possible in the Frobenius norm ||·||_F. This is just the problem

minimize ||A - czz*||²_F for z ∈ Cⁿ with z*z = 1 and c ∈ C   (1.0.4.1)

Since the inner product [A,B] = tr AB* generates the Frobenius norm, we have

||A - czz*||²_F = [A - czz*, A - czz*] = ||A||²_F - 2 Re (c̄[A,zz*]) + |c|²

which, for a given normalized z, is minimized by c = [A,zz*]. Substitution of this value into (1.0.4.1) transforms our problem into

minimize (||A||²_F - |[A,zz*]|²) for z ∈ Cⁿ with z*z = 1

or, equivalently,

maximize |[A,zz*]| for z ∈ Cⁿ with z*z = 1

A vector z₀ that solves the latter problem (and there will be one since we are maximizing a continuous function on a compact set) will yield the rank one solution matrix c₀z₀z₀* = [A,z₀z₀*]z₀z₀* to our original problem. However, a calculation shows that [A,zz*] = tr Azz* = z*Az, so that determining z₀, and solving our problem, is equivalent to finding a point in the field of values F(A) that is at maximum distance from the origin. The absolute value of such a point is called the numerical radius of A [often denoted by r(A)] by analogy with the spectral radius, which is the absolute value of a point in the spectrum σ(A) that is at maximum distance from the origin.

Problems

1. Consider the real matrices

A = [1 a; 0 -1] and B = [1 0; a -1]

Show that σ(A) and σ(B) are independent of the value of a ∈ R. What are they? What is σ(A + B)? Show that ρ(A + B) is unbounded as a → ∞.

2.
In contrast to Problem 1, show that if A, B ∈ M_n are normal, then ρ(A + B) ≤ ρ(A) + ρ(B).

3. Show that "<" in (1.0.2(b)) may be replaced by "≤," the main diagonal entries a_ii may be complex, and Fact (1.0.2.1) still holds if a_ii is replaced by Re a_ii.

4. Show that the problem of approximating a given matrix A ∈ M_n by a positive semidefinite rank one matrix of spectral radius one is just a matter of finding a point in F(A) that is furthest to the right in the complex plane.

1.1 Definitions

In this section we define the field of values and certain related objects.

1.1.1 Definition. The field of values of A ∈ M_n is

F(A) = {z*Az : z ∈ Cⁿ, z*z = 1}

Thus, F(·) is a function from M_n into subsets of the complex plane. F(A) is just the normalized locus of the Hermitian form associated with A. The field of values is often called the numerical range, especially in the context of its analog for operators on infinite dimensional spaces.

Exercise. Show that F(I) = {1} and F(αI) = {α} for all α ∈ C. Show that F([0 0; 0 1]) is the closed unit interval [0,1], and F([0 2; 0 0]) is the closed unit disc {z : |z| ≤ 1}.

The field of values F(A) may also be thought of as the image of the surface of the Euclidean unit ball in Cⁿ (a compact set) under the continuous transformation z → z*Az. As such, F(A) is a compact (and hence bounded) set in C. An unbounded analog of F(·) is also of interest.

1.1.2 Definition. The angular field of values is

F'(A) = {z*Az : z ∈ Cⁿ, z ≠ 0}

Exercise. Show that F'(A) is determined geometrically by F(A); every open ray from the origin that intersects F(A) in a point other than the origin is in F'(A), and 0 ∈ F'(A) if and only if 0 ∈ F(A). Draw a typical picture of an F(A) and F'(A) assuming that 0 ∉ F(A).

It will become clear that F'(A) is an angular sector of the complex plane that is anchored at the origin (possibly the entire complex plane). The angular opening of this sector is of interest.

1.1.3 Definition.
The field angle θ = θ(A) = θ(F'(A)) = θ(F(A)) of A ∈ M_n is defined as follows:

(a) If 0 is an interior point of F(A), then θ(A) = 2π.
(b) If 0 is on the boundary of F(A) and there is a (unique) tangent to the boundary of F(A) at 0, then θ(A) = π.
(c) If F(A) is contained in a line through the origin, then θ(A) = 0.
(d) Otherwise, consider the two different support lines of F(A) that go through the origin, and let θ(A) be the angle subtended by these two lines at the origin. If 0 ∉ F(A), these support lines will be uniquely determined; if 0 is on the boundary of F(A), choose the two support lines that give the minimum angle.

We shall see that F(A) is a compact convex set for every A ∈ M_n, so this informal definition of the field angle makes sense. The field angle is just the angular opening of the smallest angular sector that includes F(A), that is, the angular opening of the sector F'(A).

Finally, the size of the bounded set F(A) is of interest. We measure its size in terms of the radius of the smallest circle centered at the origin that contains F(A).

1.1.4 Definition. The numerical radius of A ∈ M_n is

r(A) = max {|z| : z ∈ F(A)}

The numerical radius is a vector norm on matrices that is not a matrix norm (see Section (5.7) of [HJ]).

Problems

1. Show that among the vectors entering into the definition of F(A), only vectors with real nonnegative first coordinate need be considered.

2. Show that both F(A) and F'(A) are simply connected for any A ∈ M_n.

3. Show that for each 0 ≤ θ ≤ π, there is an A ∈ M₂ with θ(A) = θ. Is θ(A) = 3π/2 possible?

4. Why is the "max" in (1.1.4) attained?

5. Show that the following alternative definition of F(A) is equivalent to the one given:

F(A) = {z*Az/z*z : z ∈ Cⁿ and z ≠ 0}

Thus, F(·) is a normalized version of F'(·).

6. Determine F([0 1; 0 0]), F([0 1; 1 0]), and F([1 0; 0 i]).

7. If A ∈ M_n and α ∈ F(A), show that there is a unitary matrix U ∈ M_n such that α is the 1,1 entry of U*AU.

8. Determine as many different possible types of sets as you can that can be an F'(A).
9. Show that F(Aᵀ) = F(A) and F'(Aᵀ) = F'(A) for all A ∈ M_n.

10. Show that all of the main diagonal entries and eigenvalues of a given A ∈ M_n are in its field of values F(A).

1.2 Basic properties of the field of values

As a function from M_n into subsets of C, the field of values F(·) has many useful functional properties, most of which are easily established. We catalog many of these properties here for reference and later use. The important property of convexity is left for discussion in the next section.

The sum or product of two subsets of C, or of a subset of C and a scalar, has the usual algebraic meaning. For example, if S, T ⊆ C, then S + T = {s + t : s ∈ S, t ∈ T}.

1.2.1 Property: Compactness. For all A ∈ M_n, F(A) is a compact subset of C.

Proof: The set F(A) is the range of the continuous function z → z*Az over the domain {z : z ∈ Cⁿ, z*z = 1}, the surface of the Euclidean unit ball, which is a compact set. Since the continuous image of a compact set is compact, it follows that F(A) is compact. □

1.2.2 Property: Convexity. For all A ∈ M_n, F(A) is a convex subset of C.

The next section of this chapter is reserved for a proof of this fundamental fact, known as the Toeplitz-Hausdorff theorem. At this point, it is clear that F(A) must be a connected set since it is the continuous image of a connected set.

Exercise. If A is a diagonal matrix, show that F(A) is the convex hull of the diagonal entries (the eigenvalues) of A.

The field of values of a matrix is changed in a simple way by adding a scalar multiple of the identity to it or by multiplying it by a scalar.

1.2.3 Property: Translation.
For A€ My, H(A)= (A+ A*) denotes the Hermition part of A and 8(A)= 4(A- A*) denotes the skew-Hermitian port of A; notice that A= H(A) + S(A) and that H(A) and #S(A) are both Hermitian. Just as taking the real part of a complex number projects it onto the real axis, taking the ‘Hermitian part of a matrix projects its field of values onto the real axis. ‘This simple fact helps in locating the field of values, since, as we shall see, it is relatively easy to deal with the field of values of a Hermitian matrix. For aset SCC, we interpret Re $ as {Re s: s€ $}, the projection of Sonto the real axis. 1.2.5 Property: Projection. For all A€ My F(H(A)) =Re F(4) Proof: We calculate 2*H(A)z = 2*}(A + A*)2= 4(2*Az+ 2*A*2) = 4(2*Az + (2*Az)*) = 4(2*Az+ Az) = Re ztAz Thus, each point in F(B(A)) is of the form Re zfor some z¢ F(A) and vice versa. o We denote the open upper half-plane of € by UHP= {z€ €:Imz> 0}, the open left half-plane of € by LHP= {z¢ €: Re z<0}, the open right half- plane of € by RHP= {ze €: Re z> 0}, and the closed right halfplane of € by RHPy={z€C: Rez2 0}. ‘The projection property gives a simple indication of when F(A) ¢ RHP or RHP in terms of positive definiteness or positive semideBiniteness: 0 ‘The field of values 1.25a Property: Positive definite indicator function. Let Ae M,. ‘Then F(A) ¢ RHPif and only if A+ A* is positive definite 1.2.5b Property: Positive semidefinite indicator function. Let A€ My. ‘Then F(A) C REP if and only if A + A* is positive semidefinite Bvercise. Prove (1.2.58) and (1.2.5b) (the proofs are essentially the same) using (1.2.5) and the definition of positive definite and semidefinite (eee Chapter 7 of (H]}). ‘The point set of eigenvalues of A € M, is denoted by o(4), the spectrum of A. A very important property of the field of values is that it includes the cigenvalues of A. 1.2.6 Property: Spectral containment. For all A € Myy o(A)c F(A) Proof: Suppose that \ € o(A). 
Then there exists some nonzero z ∈ Cⁿ, which we may take to be a unit vector, for which Az = λz and hence λ = λz*z = z*(λz) = z*Az ∈ F(A). □

Exercise. Use the spectral containment property (1.2.6) to show that the eigenvalues of a positive definite matrix are positive real numbers.

Exercise. Use the spectral containment property (1.2.6) to show that the eigenvalues of [0 1; -1 0] are imaginary.

The following property underlies the fact that the numerical radius is a vector norm on matrices and is an important reason why the field of values is so useful.

1.2.7 Property: Subadditivity. For all A, B ∈ M_n,

F(A + B) ⊆ F(A) + F(B)

Proof: F(A + B) = {z*(A + B)z : z ∈ Cⁿ, z*z = 1} = {z*Az + z*Bz : z ∈ Cⁿ, z*z = 1} ⊆ {z*Az : z ∈ Cⁿ, z*z = 1} + {y*By : y ∈ Cⁿ, y*y = 1} = F(A) + F(B). □

Exercise. Use (1.2.7) to show that the numerical radius r(·) satisfies the triangle inequality on M_n.

Another important property of the field of values is its invariance under unitary similarity.

1.2.8 Property: Unitary similarity invariance. For all A, U ∈ M_n with U unitary,

F(U*AU) = F(A)

Proof: Since a unitary transformation leaves invariant the surface of the Euclidean unit ball, the complex numbers that comprise the sets F(U*AU) and F(A) are the same. If z ∈ Cⁿ and z*z = 1, we have z*(U*AU)z = y*Ay ∈ F(A), where y = Uz, so y*y = z*U*Uz = z*z = 1. Thus, F(U*AU) ⊆ F(A). The reverse containment is obtained similarly. □

The unitary similarity invariance property allows us to determine the field of values of a normal matrix. Recall that, for a set S contained in a real or complex vector space, Co(S) denotes the convex hull of S, which is the set of all convex combinations of finitely many points of S. Alternatively, Co(S) can be characterized as the intersection of all convex sets containing S, so it is the "smallest" convex set containing S.

1.2.9 Property: Normality. If A ∈ M_n is normal, then

F(A) = Co(σ(A))

Proof: If A is normal, then A = U*ΛU, where Λ = diag(λ₁,..., λ_n) is diagonal and U is unitary. By the unitary similarity invariance property (1.2.8), F(A) = F(Λ) and, since

z*Λz = Σ_{i=1}^n λ_i |z_i|²

F(Λ) is just the set of all convex combinations of the diagonal entries of Λ (z*z = 1 implies Σ_i |z_i|² = 1 and |z_i|² ≥ 0). Since the diagonal entries of Λ are the eigenvalues of A, this means that F(A) = Co(σ(A)). □

Exercise. Show that if H is Hermitian, F(H) is a closed real line segment whose endpoints are the largest and smallest eigenvalues of H.

Exercise. Show that the field of values of a normal matrix is always a polygon whose vertices are eigenvalues of A. If A ∈ M_n, how many sides may F(A) have? If A is unitary, show that F(A) is a polygon inscribed in the unit circle.

Exercise. Show that Co(σ(A)) ⊆ F(A) for all A ∈ M_n.

Exercise. If A, B ∈ M_n, show that

σ(A + B) ⊆ F(A) + F(B)

If A and B are normal, show that

σ(A + B) ⊆ Co(σ(A)) + Co(σ(B))

The next two properties have to do with fields of values of matrices that are built up from or extracted from other matrices in certain ways. Recall that for A ∈ M_n and B ∈ M_m, the direct sum of A and B is the matrix

A ⊕ B = [A 0; 0 B] ∈ M_{n+m}

If J ⊆ {1,2,..., n} is an index set and if A ∈ M_n, then A(J) denotes the principal submatrix of A contained in the rows and columns indicated by J.

1.2.10 Property: Direct sums. For all A ∈ M_n and B ∈ M_m,

F(A ⊕ B) = Co(F(A) ∪ F(B))

Proof: Note that A ⊕ B ∈ M_{n+m}. Partition any given unit vector z ∈ C^{n+m} as z = [x; y], where x ∈ Cⁿ and y ∈ C^m. Then z*(A ⊕ B)z = x*Ax + y*By. If y*y = 1, then x = 0 and z*(A ⊕ B)z = y*By ∈ F(B), so F(A ⊕ B) ⊇ F(B). By a similar argument when x*x = 1, F(A ⊕ B) ⊇ F(A) and hence F(A ⊕ B) ⊇ F(A) ∪ F(B). But since F(A ⊕ B) is convex, it follows that F(A ⊕ B) ⊇ Co(F(A) ∪ F(B)) (see Problem 21).

To prove the reverse containment, let z = [x; y] ∈ C^{n+m} be a unit vector again. If x*x = 1, then y = 0 and z*(A ⊕ B)z = x*Ax ∈ F(A) ⊆ Co(F(A) ∪ F(B)). The argument is analogous if y*y = 1. Now suppose that both x and y are nonzero and write

z*(A ⊕ B)z = x*Ax + y*By = x*x [x*Ax/x*x] + y*y [y*By/y*y]

Since x*x + y*y = z*z = 1, this last expression is a convex combination of

x*Ax/x*x ∈ F(A) and y*By/y*y ∈ F(B)

and we have F(A ⊕ B) ⊆ Co(F(A) ∪ F(B)). □

1.2.11 Property: Submatrix inclusion. For all A ∈ M_n and index sets J ⊆ {1,..., n},

F(A(J)) ⊆ F(A)

Proof: Suppose that J = {j₁,..., j_k} with 1 ≤ j₁ < ··· < j_k ≤ n, and suppose x ∈ C^k satisfies x*x = 1. We may insert zero entries into appropriate locations in x to produce a vector ẑ ∈ Cⁿ such that ẑ_{j_i} = x_i and ẑ_j = 0 for all other indices j. A calculation then shows that x*A(J)x = ẑ*Aẑ and ẑ*ẑ = 1, which verifies the asserted inclusion. □

1.2.12 Property: Congruence and the angular field of values. Let A ∈ M_n and suppose that C ∈ M_n is nonsingular. Then

F'(C*AC) = F'(A)

Proof: Let z ∈ Cⁿ be a nonzero vector, so that z*C*ACz = y*Ay, where y = Cz ≠ 0. Thus, F'(C*AC) ⊆ F'(A). In the same way, one shows that F'(A) ⊆ F'(C*AC) since A = (C⁻¹)*(C*AC)(C⁻¹). □

Problems

1. For what A ∈ M_n does F(A) consist of a single point? Could F(A) consist of k distinct points for finite k > 1?

2. State and prove results corresponding to (1.2.5) and (1.2.5a,b) about projecting the field of values onto the imaginary axis and about when F(A) is in the upper half-plane.

3. Show that F(A) + F(B) is not the same as F(A + B) in general. Why not?

4. If A, B ∈ M_n, is F(AB) ⊆ F(A)F(B)? Prove or give a counterexample.

5. If A, B ∈ M_n, is F'(AB) ⊆ F'(A)F'(B)? Is θ(AB) ≤ θ(A) + θ(B)? Prove or give a counterexample.

6. If A, B ∈ M_n are normal with σ(A) = {α₁,..., α_n} and σ(B) = {β₁,..., β_n}, show that σ(A + B) ⊆ Co({α_i + β_j : i, j = 1,..., n}). If 0 ≤ a ≤ 1 and U, V ∈ M_n are unitary, show that ρ(aU + (1 - a)V) ≤ aρ(U) + (1 - a)ρ(V), where ρ(·) denotes the spectral radius.

7.
If A, B ∈ M_n are Hermitian with ordered eigenvalues α₁ ≤ ··· ≤ α_n and β₁ ≤ ··· ≤ β_n, respectively, use a field of values argument with the subadditivity property (1.2.7) to show that α₁ + β₁ ≤ γ₁ and γ_n ≤ α_n + β_n, where γ₁ ≤ ··· ≤ γ_n are the ordered eigenvalues of C = A + B. What can you say if equality holds in either of the inequalities? Compare with the conclusions and proof of Weyl's theorem (4.3.1) in [HJ].

8. Which convex subsets of C are fields of values? Show that the class of convex subsets of C that are fields of values is closed under the operation of taking convex hulls of finite unions of the sets. Show that any convex polygon and any disc are in this class.

9. According to property (1.2.8), if two matrices are unitarily similar, they have the same field of values. Although the converse is true when n = 2 [see Problem 18 in Section (1.3)], it is not true in general. Construct two matrices that have the same size and the same field of values, but are not unitarily similar. A complete characterization of all matrices of a given size with a given field of values is unknown.

10. According to (1.2.9), if A ∈ M_n is normal, then its field of values is the convex hull of its spectrum. Show that the converse is not true by considering the matrix A = diag(1, i, -1, -i) ⊕ [0 1; 0 0] ∈ M₆. Show that F(A) = Co(σ(A)), but that A is not normal. Construct a counterexample of the form A = diag(λ₁, λ₂, λ₃) ⊕ [0 1; 0 0] ∈ M₅. Why doesn't this kind of example work for M₄? Is there some other kind of counterexample in M₄? See (1.6.9).

11. Let z be a complex number with modulus 1, and let

A = [0 1 0; 0 0 1; z 0 0] ∈ M₃

Show that F(A) is the closed equilateral triangle whose vertices are the cube roots of z. More generally, what is F(A) if A = [a_ij] ∈ M_n has a_{i,i+1} = 1 for i = 1,..., n - 1, a_{n1} = z, and all other a_ij = 0?

12. If A ∈ M_n and F(A) is a real line segment, show that A is Hermitian.

13. Give an example of a matrix A and an index set J for which equality occurs in (1.2.11). Can you characterize the cases of equality?

14. What is the geometric relationship between F'(C*AC) and F'(A) if: (a) C ∈ M_n is singular, or (b) if C is n-by-k and rank C = k?

15. What is the relationship between F(C*AC) and F(A) when C ∈ M_n is nonsingular but not unitary? Compare with Sylvester's law of inertia, Theorem (4.5.8) in [HJ].

16. Properties (1.2.8) and (1.2.11) are special cases of a more general property. Let A ∈ M_n, k ≤ n, and P ∈ M_{n,k} be given. If P*P = I ∈ M_k, then P is called an isometry and P*AP is called an isometric projection of A. Notice that an isometry P ∈ M_{n,k} is unitary if and only if k = n. If A' is a principal submatrix of A, show how to construct an isometry P such that A' = P*AP. Prove the following statement and explain how it includes both (1.2.8) and (1.2.11):

1.2.13 Property: Isometric projection. For all A ∈ M_n and P ∈ M_{n,k} with k ≤ n and P*P = I, F(P*AP) ⊆ F(A), and F(P*AP) = F(A) when k = n.

17. It is natural to inquire whether there are any nonunitary cases in which the containment in (1.2.13) is an equality. Let A ∈ M_n be given, and let P ∈ M_{n,k} be a given isometry with k < n.

26. Let A ∈ M_n be given. Show that the following are equivalent:
(a) r(A) ≤ 1.
(b) ρ(H(e^{iθ}A)) ≤ 1 for all θ ∈ R.
(c) λ_max(H(e^{iθ}A)) ≤ 1 for all θ ∈ R.
(d) |||H(e^{iθ}A)|||₂ ≤ 1 for all θ ∈ R.

1.3 Convexity

In this section we prove the fundamental convexity property (1.2.2) of the field of values and discuss several important consequences of that convexity. We shall make use of several basic properties exhibited in the previous section. Our proof contains several useful observations and consists of three parts:

1. Reduction of the problem to the 2-by-2 case;
2. Use of various basic properties to transform the general 2-by-2 case to 2-by-2 matrices of special form; and
3. Demonstration of convexity of the field of values for the special 2-by-2 form.

See Problems 7 and 10 for two different proofs that do not involve reduction to the 2-by-2 case.
There are other proofs in the literature that are based upon more advanced concepts from other branches of mathematics.

Reduction to the 2-by-2 case

In order to show that a given set S ⊆ C is convex, it is sufficient to show that as + (1 - a)t ∈ S whenever 0 ≤ a ≤ 1 and s, t ∈ S. Thus, for a given A ∈ M_n, F(A) is convex if ax*Ax + (1 - a)y*Ay ∈ F(A) whenever 0 ≤ a ≤ 1 and x, y ∈ Cⁿ satisfy x*x = y*y = 1. It suffices to prove this only in the 2-by-2 case because we need to consider only convex combinations associated with pairs of vectors. For each given pair of vectors x, y ∈ Cⁿ, there is a unitary matrix U and vectors v, w ∈ Cⁿ such that x = Uv, y = Uw, and all entries of v and w after the first two are equal to zero (see Problem 1). Using this transformation, we have

ax*Ax + (1 - a)y*Ay = av*U*AUv + (1 - a)w*U*AUw = aṽ*B({1,2})ṽ + (1 - a)w̃*B({1,2})w̃

where B = U*AU, B({1,2}) is the upper left 2-by-2 principal submatrix of B, and ṽ, w̃ ∈ C² consist of the first two entries of v and w, respectively. Thus, it suffices to show that the field of values of any 2-by-2 matrix is convex. This reduction is possible because of the unitary similarity invariance property (1.2.8) of the field of values.

Sufficiency of a special 2-by-2 form

We prove next that in order to show that F(A) is convex for every matrix A ∈ M₂, it suffices to demonstrate that F([0 a; b 0]) is convex for any a ≥ b ≥ 0. The following observation is useful.

1.3.1 Lemma. For each A ∈ M₂, there is a unitary U ∈ M₂ such that the two main diagonal entries of U*AU are equal.
Such a vector may be normalized and used as the first column of a unitary matrix W, and a calculation reveals that the 1,1 entry of W*AW is zero; the 2,2 entry of W*AW must also be zero since the trace is zero. Construct the vector w as follows: Since A has eigenvalues ±α for some complex number α, let x be a normalized eigenvector associated with −α and let y be a normalized eigenvector associated with +α. If α = 0, just take w = x. If α ≠ 0, x and y are independent and the vector w = e^{iθ}x + y is nonzero for all θ ∈ R. A calculation shows that

   w*Aw = α(e^{−iθ}x*y − e^{iθ}y*x) = 2iα Im(e^{−iθ}x*y)

Now choose θ so that e^{−iθ}x*y is real. □

We now use Lemma (1.3.1) together with several of the properties given in the previous section to reduce the question of convexity in the 2-by-2 case to consideration of the stated special form. If A ∈ M_2 is given, apply the translation property (1.2.3) to conclude that F(A) is convex if and only if F(A + αI) is convex. If we choose α = −½ tr A, we may suppose without loss of generality that our matrix has trace 0. According to (1.3.1) and the unitary similarity invariance property (1.2.8), we may further suppose that both main diagonal entries of our matrix are 0. Thus, we may assume that the given matrix has the form [0 c; d 0] for some c, d ∈ C. Now we can use the unitary similarity invariance property (1.2.8) and a diagonal unitary matrix to show that we may consider

   [e^{iθ}  0] [0  c] [e^{−iθ}  0]   [0          c e^{iθ}]
   [0       1] [d  0] [0        1] = [d e^{−iθ}         0]

for any θ ∈ R. If c = |c|e^{iθ_1} and d = |d|e^{iθ_2}, and if we choose θ = ½(θ_2 − θ_1), the latter matrix becomes e^{iφ}[0 |c|; |d| 0] with φ = ½(θ_1 + θ_2). Thus, it suffices to consider a matrix of the form e^{iφ}[0 a; b 0] with φ ∈ R and a, b ≥ 0. Finally, by the scalar multiplication property (1.2.4), we need to consider only the special form

   A = [0  a]
       [b  0],   a, b ≥ 0                                    (1.3.2)

That is, we have shown that the field of values of every 2-by-2 complex matrix is convex if the field of values of every matrix of the special form (1.3.2) is convex.
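The eigenvector construction in the proof of Lemma (1.3.1) can be carried out numerically. The following sketch (NumPy assumed; the function name is ours) produces, for a trace-zero A ∈ M_2, a unit vector w with w*Aw = 0:

```python
import numpy as np

def annihilating_vector(A):
    """Given trace-zero A in M_2, return a unit w with w*Aw = 0,
    following the construction in the proof of Lemma (1.3.1)."""
    vals, vecs = np.linalg.eig(A)          # eigenvalues are +alpha, -alpha
    x, y = vecs[:, 0], vecs[:, 1]          # unit eigenvectors
    if abs(vals[0]) < 1e-12:               # alpha = 0: any eigenvector works
        return x
    # w = e^{i theta} x + y with theta chosen so e^{-i theta} x*y is real,
    # which makes w*Aw = +/- 2i alpha Im(e^{-i theta} x*y) vanish
    theta = np.angle(np.vdot(x, y))        # np.vdot conjugates x: x*y
    w = np.exp(1j * theta) * x + y
    return w / np.linalg.norm(w)

A = np.array([[1.0, 2.0], [3.0, -1.0]])    # tr A = 0
w = annihilating_vector(A)
assert abs(np.vdot(w, A @ w)) < 1e-10      # w*Aw = 0 up to roundoff
```

Normalizing w and extending it to a unitary W then gives the zero-diagonal unitary similarity of the lemma.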
Convexity of the field of values of the special 2-by-2 form

1.3.3 Lemma. If A ∈ M_2 has the form (1.3.2), then F(A) is an ellipse (with its interior) centered at the origin. Its minor axis is along the imaginary axis and has length |a − b|. Its major axis is along the real axis and has length a + b. Its foci are at ±√(ab), which are the eigenvalues of A.

Proof: Without loss of generality, we assume a ≥ b ≥ 0. Since x*Ax = (e^{iθ}x)*A(e^{iθ}x) for any θ ∈ R, to determine F(A) it suffices to consider x*Ax for unit vectors x whose first component is real and nonnegative. Thus, we consider the 2-vector x = [t, e^{iθ}(1−t²)^{1/2}]^T for 0 ≤ t ≤ 1 and 0 ≤ θ ≤ 2π. A calculation shows that

   x*Ax = t(1−t²)^{1/2}[(a+b)cos θ + i(a−b)sin θ]

As θ varies from 0 to 2π, the point (a+b)cos θ + i(a−b)sin θ traces out a possibly degenerate ellipse E centered at the origin; the major axis extends from −(a+b) to (a+b) on the real axis and the minor axis extends from −i(a−b) to i(a−b) on the imaginary axis in the complex plane. As t varies from 0 to 1, the factor t(1−t²)^{1/2} varies from 0 to ½ and back to 0, ensuring that every point in the interior of the ellipse ½E is attained and verifying that F(A) is the asserted ellipse with its interior, which is convex. The two foci of the ellipse ½E are located on the major axis at distance [¼(a+b)² − ¼(a−b)²]^{1/2} = √(ab) from the center, that is, at ±√(ab). □

This completes the argument to prove the convexity property (1.2.2).

There are many important consequences of convexity of the field of values. One immediate consequence is that Lemma (1.3.1) holds for matrices of any size, not just for 2-by-2 matrices.

1.3.4 Theorem. For each A ∈ M_n there is a unitary matrix U ∈ M_n such that all the diagonal entries of U*AU have the same value tr(A)/n.

Proof: Without loss of generality, we may suppose that tr A = 0, since we may replace A by A − [tr(A)/n]I. We proceed by induction to show that A is unitarily similar to a matrix with all zero main diagonal entries.
We know from Lemma (1.3.1) that this is true for n = 2, so let n ≥ 3 and suppose that the assertion has been proved for all matrices of all orders less than n. We have

   0 = (1/n) tr A = (1/n)(λ_1 + ··· + λ_n)

and this is a convex combination of the eigenvalues λ_i of A. Since each λ_i is in F(A), and since F(A) is convex, we conclude that 0 ∈ F(A). If z ∈ C^n is a unit vector such that z*Az = 0, let W = [z w_2 ... w_n] ∈ M_n be a unitary matrix whose first column is z. One computes that

   W*AW = [0  ★]
          [★  Â],   Â ∈ M_{n−1}

But 0 = tr A = tr W*AW = tr Â, and so by the induction hypothesis there is some unitary V ∈ M_{n−1} such that all the main diagonal entries of V*ÂV are zero. Define the unitary direct sum U ≡ [1] ⊕ V; then U*(W*AW)U has a zero main diagonal by construction. □

A different proof of (1.3.4) using compactness of the set of unitary matrices is given in Problem 3 of Section (2.2) of [HJ].

Another important, and very useful, consequence of convexity of the field of values is the following rotation property of a matrix whose field of values does not contain the point 0.

1.3.5 Theorem. Let A ∈ M_n be given. There exists a real number θ such that the Hermitian matrix H(e^{iθ}A) = ½[e^{iθ}A + e^{−iθ}A*] is positive definite if and only if 0 ∉ F(A).

Proof: If H(e^{iθ}A) is positive definite for some θ ∈ R, then F(e^{iθ}A) ⊂ RHP by
‘The first step was a translation A—+ A-(Jtr A)I= Ap to achieve tr Ay =0. The second step was a unitary similarity Ay —+ UA,U* = A, to make both diagonal entries of A, zero, The third step was another unitary similarity 4y—+ VA, V* 2 A, to put A, into the form A= e[) J with o, 62 Oand geR ‘The last step was a unitary rotation A, ~e-#"A, = As to achieve the special form (1.3.2). Since the field of values of Ag is an ellipse (possibly degenerate, ‘hat is, a point or line segment) centered at the otigin with its major axis along the real axis and its foci at +/ab, the eigenvalues of As, the field of values of A is also an ellipse centered at the origin, but its major axis is {tilted at an an angle to the real axis. A line through the two eigenvalues of ‘Ag, +e Jab, which are the foci of the ellipse, contains the major axis of the ellipse; if ab = 0, the ellipse is a circle (possibly degenerate), so any diameter i amajor axis. Since A, and Ay are achieved from A, by successive unitary imilarities, each of which leaves the eigenvalues and field of values invariant, we have F (49) = F (Ay) = F (4g). Finally, P(A) = F (Ag+ [ite AJI) = F(Ag) + dtr A= (AQ) + te A ‘uahift that moves both eigenvalues by jtr A, 50 we conclude that the field of values of any matrix A € Mp is an ellipse (possibly degenerate) with center at the point jtr A. ‘The major axis of this ellipse lies on a line through the two Clgenvalues of A, which are the foci of the ellipse; if the two eigenvalues coincide, the ellipse is a circle or a point, ‘According to Lemma (1.3.3), the ellipse /*(A) is degenerate if and only 1.3 Convexity 23 if Ay= [} G] has a= 6. Notice that A$4y= [p° fs] and 445 = [9° ba], 00 = if and only if Ay is normal, But A can be recovered from Ay by 8 nonzero scalar multiplication, two unitary similarities, and a translation, each of which preserves both normality and nonnormality. 
Thus, A_3 is normal if and only if A is normal, and we conclude that for A ∈ M_2, the ellipse F(A) is degenerate if and only if A is normal. The eigenvalues of A_3 are located at the foci on the major axis of F(A_3) at a distance of √(ab) from the center, and the length of the semimajor axis is ½(a + b) by (1.3.3). Thus, ½(a + b) − √(ab) = ½(√a − √b)² ≥ 0 with equality if and only if a = b, that is, if and only if A is normal. We conclude that the eigenvalues of a nonnormal A ∈ M_2 always lie in the interior of F(A).

For A ∈ M_2, the parameters of the ellipse F(A) (even if degenerate) can be computed easily using (1.3.3) if one observes that a and b are the singular values of A_3 = [0 a; b 0], that is, the square roots of the eigenvalues of A_3*A_3, and the singular values of A_3 are invariant under pre- or post-multiplication by any unitary matrix. Thus, the singular values σ_1 ≥ σ_2 ≥ 0 of A_0 = A − (½ tr A)I are the same as those of A_3. The length of the major axis of F(A) is a + b = σ_1 + σ_2, the length of the minor axis is |a − b| = σ_1 − σ_2, and the distance of the foci from the center is [(a+b)² − (a−b)²]^{1/2}/2 = √(ab) = √(σ_1σ_2) = |det A_3|^{1/2} = |det A_0|^{1/2}. Moreover, σ_1² + σ_2² = tr A_0*A_0 (the sum of the squares of the moduli of the entries of A_0), so σ_1 + σ_2 = [σ_1² + σ_2² + 2σ_1σ_2]^{1/2} = [tr A_0*A_0 + 2|det A_0|]^{1/2}. We summarize these observations for convenient reference in the following theorem.

1.3.6 Theorem. Let A ∈ M_2 be given, and set A_0 ≡ A − (½ tr A)I. Then

(a) The field of values F(A) is a closed ellipse (with interior, possibly degenerate).
(b) The center of the ellipse F(A) is at the point ½ tr A. The length of the major axis is [tr A_0*A_0 + 2|det A_0|]^{1/2}; the length of the minor axis is [tr A_0*A_0 − 2|det A_0|]^{1/2}; the distance of the foci from the center is |det A_0|^{1/2}.
The major axis lies on a line passing through the two eigenvalues of A, which are the foci of F(A); these two eigenvalues coincide if and only if the ellipse is a circle (possibly a point).
(c) F(A) is a closed line segment if and only if A is normal; it is a single point if and only if A is a scalar matrix.
(d) F(A) is a nondegenerate ellipse (with interior) if and only if A is not normal, and in this event the eigenvalues of A are interior points of F(A).

Problems

1. Let x, y ∈ C^n be two given vectors. Show how to construct vectors v, w ∈ C^n and a unitary matrix U ∈ M_n such that x = Uv, y = Uw, and all entries of v and w after the first two are zero.

2. Verify all the calculations in the proof of (1.3.1).

3. Sketch the field of values of a matrix of the form (1.3.2), with a ≥ 0, b ≥ 0.

4. Use (1.3.3) to show that the field of values of [1 1; 0 0] is a closed ellipse (with interior) with foci at 0 and 1, major axis of length √2, and minor axis of length 1. Verify these assertions using Theorem (1.3.6).

5. Show that if A ∈ M_n(R), then F(A) is symmetric with respect to the real axis.

6. If x_1, ..., x_k ∈ C^n are given orthonormal vectors, let P ≡ [x_1 ... x_k] ∈ M_{n,k} and observe that P*P = I ∈ M_k. If A ∈ M_n, show that F(P*AP) ⊆ F(A) and that x_i*Ax_i ∈ F(P*AP) for i = 1, ..., k. Use this fact for k = 2 to give an alternate reduction of the question of convexity of F(A) to the 2-by-2 case.

7. Let A ∈ M_n be given. Provide details for the following proof of the convexity of F(A) that does not use a reduction to the 2-by-2 case. There is nothing to prove if F(A) is a single point. Pick any two distinct points in F(A). Using (1.2.3) and (1.2.4), there is no loss of generality to assume that these two points are 0 and α ∈ R, α > 0. We must show that the line segment joining 0 and α lies in F(A). Write A as A = H + iK, in which H = H(A) and K = −iS(A) = −i·½(A − A*), so that H and K are Hermitian. Assume further, without loss of generality after using a unitary similarity, that K is diagonal.
Let x, y ∈ C^n satisfy x*Ax = 0, x*x = 1, y*Ay = α, and y*y = 1. Note that x*Hx = x*Kx = y*Ky = 0 and y*Hy = α. ...

... such that F(A) is not a line segment. Why can't this happen in M_2?

13. Let A ∈ M_2 be given. If F(A) is a line segment or a point, show that A must be normal.

14. Consider the upper triangular matrix A = [λ_1 b; 0 λ_2] ∈ M_2. Show that the length of the major axis of the ellipse F(A) is [|λ_1 − λ_2|² + |b|²]^{1/2} and the length of the minor axis is |b|. Where is the center? Where are the foci? In particular, conclude that the eigenvalues of A are interior points of F(A) if and only if b ≠ 0.

15. Let A ∈ M_n be given with 0 ∉ F(A). Provide the details for the following proof that F(A) lies in an open half-plane determined by some line through the origin, and explain how this result may be used as an alternative to the argument involving the separating hyperplane theorem in the proof of (1.3.5). For z ∈ F(A), consider the function φ(z) ≡ z/|z| = exp(i arg z). Since φ(·) is a continuous function on the compact connected set F(A), its range φ(F(A)) = R is a compact connected subset of the unit circle. Thus, R is a closed arc whose length must be strictly less than π since F(A) is convex and 0 ∉ F(A).

16. Let A ∈ M_n be given. Ignoring for the moment that we know the field of values F(A) is convex, F(A) is obviously nonempty, closed, and bounded, so its complement F(A)^c has an unbounded component and possibly some bounded components. The outer boundary of F(A) is the intersection of F(A) with the closure of the unbounded component of F(A)^c. Provide details for the following proof (due to Toeplitz) that the outer boundary of F(A) is a convex curve. For any given θ ∈ [0, 2π], let e^{iθ}A = H + iK with Hermitian H, K ∈ M_n. Let λ_max(H) denote the algebraically largest eigenvalue of H, let S_θ ≡ {z ∈ C^n: z ≠ 0, Hz = λ_max(H)z}, and suppose dim(S_θ) = k ≥ 1.
Then the intersection of F(e^{iθ}A) with the vertical line Re z = λ_max(H) is the set λ_max(H) + i{z*Kz: z ∈ S_θ, ||z||_2 = 1}, which is a single point if k = 1 and can be a finite interval if k > 1 (this is the convexity property of the field of values for a Hermitian matrix, which follows simply from the spectral theorem). Conclude, by varying θ, that the outer boundary of F(A) is a convex curve, which may contain straight line segments. Why doesn't this prove that F(A) is convex?

17. (a) Let A ∈ M_2 be given. Use Schur's unitary triangularization theorem (Theorem (2.3.1) in [HJ]) and a diagonal unitary similarity to show that A is unitarily similar to an upper triangular matrix of the form

   [λ_1  α(A)]
   [0    λ_2],   α(A) ≥ 0

Show that tr A*A = |λ_1|² + |λ_2|² + α(A)² and that α(A) is a unitary similarity invariant, that is, α(A) = α(UAU*) for any unitary U ∈ M_2.
(b) If A, B ∈ M_2 have the same eigenvalues, show that A is unitarily similar to B if and only if tr A*A = tr B*B.
(c) Show that A, B ∈ M_2 have the same eigenvalues if and only if tr A = tr B and tr A² = tr B².
(d) Conclude that two given matrices A, B ∈ M_2 are unitarily similar if and only if tr A = tr B, tr A² = tr B², and tr A*A = tr B*B.

18. Use Theorem (1.3.6) and the preceding problem to show that two given 2-by-2 complex or real matrices are unitarily similar if and only if their fields of values are identical. Consider A_t = diag(0, 1, t) for t ∈ [0,1] to show that nonsimilar 3-by-3 Hermitian matrices can have the same fields of values.

19. Let A = [a_ij] ∈ M_2 be given and suppose F(A) ⊆ UHP ≡ {z ∈ C: Im z ≥ 0}. Use Theorem (1.3.6) to show that if either (a) a_11 and a_22 are real, or (b) tr A is real, then A is Hermitian.

Notes and Further Readings. The convexity of the field of values (1.2.2) was first discussed in O. Toeplitz, Das algebraische Analogon zu einem Satze von Fejér, Math. Zeit. 2 (1918), 187-197, and F. Hausdorff, Der Wertvorrat einer Bilinearform, Math. Zeit. 3 (1919), 314-316.
Toeplitz showed that the outer boundary of F(A) is a convex curve, but left open the question of whether the interior of this curve is completely filled out with points of F(A); see Problem 16 for Toeplitz's elegant proof. He also proved the inequality |||A|||_2 ≤ 2r(A) between the spectral norm and the numerical radius; see Problem 21 in Section (5.7) of [HJ]. In a paper dated six months after Toeplitz's (the respective dates of their papers were May 22 and November 28, 1918), Hausdorff rose to the challenge and gave a short proof, similar to the argument outlined in Problem 7, that F(A) is actually a convex set. There are many other proofs, besides the elementary one given in this section, such as that of W. Donoghue, On the Numerical Range of a Bounded Operator, Mich. Math. J. 4 (1957), 261-263. The result of Theorem (1.3.4) was first noted by W. V. Parker in Sets of Complex Numbers Associated with a Matrix, Duke Math. J. 15 (1948), 711-715. The modification of Hausdorff's original convexity proof outlined in Problem 7 was given by Donald Robinson; the proof outlined in Problem 10 is due to Roy Mathias.

The fact that the field of values of a square complex matrix is convex has an immediate extension to the infinite-dimensional case. If T is a bounded linear operator on a complex Hilbert space H with inner product <·,·>, then its field of values (often called the numerical range) is F(T) ≡ {<Tx, x>: x ∈ H and <x, x> = 1}. One can show that F(T) is convex by reducing to the two-dimensional case, just as we did in the proof in this section.

1.4 Axiomatization

It is natural to ask (for both practical and aesthetic reasons) whether the list of properties of F(A) given in Section (1.2) is, in some sense, complete.
Since special cases and corollary properties may be of interest, it may be that no finite list is truly complete; but a mathematically precise version of the completeness question is whether or not, among the properties given thus far, there is a subset that characterizes the field of values. If so, then further properties, and possibly some already noted, would be corollary to a set of characterizing properties, and the mathematical utility of the field of values would be captured by these properties. This does not mean that it is not useful to write down properties beyond a characterizing set. Some of the most applicable properties do follow, if tediously, from others.

1.4.1 Example. Spectral containment (1.2.6) follows from compactness (1.2.1), translation (1.2.3), scalar multiplication (1.2.4), unitary invariance (1.2.8), and submatrix inclusion (1.2.11) in the sense that any set-valued function on M_n that has these five properties also satisfies (1.2.6). If A ∈ M_n and if β ∈ σ(A), then for some unitary U ∈ M_n the matrix U*AU is upper triangular with β in the 1,1 position. Then by (1.2.8) and (1.2.11) it is enough to show that β ∈ F([β]), and (because of (1.2.3)) it suffices to show that 0 ∈ F([0]); here we think of [β] and [0] as members of M_1. However, because of (1.2.4), F([0]) = αF([0]) for any α, and there are only two nonempty subsets of the complex plane possessing this property: {0} and the entire plane. The latter is precluded by (1.2.1), and hence (1.2.6) follows.

Exercise. Show that (1.2.8) and (1.2.10) together imply (1.2.9).

The main result of this section is that there is a subset of the properties already mentioned that characterizes F(·) as a function from M_n into subsets of C.

1.4.2 Theorem. Properties (1.2.1-4 and 5b) characterize the field of values.
That is, the usual field of values F(·) is the only complex set-valued function F_1(·) on M_n such that

(a) F_1(A) is compact (1.2.1) and convex (1.2.2) for all A ∈ M_n;
(b) F_1(A + αI) = F_1(A) + α (1.2.3) and F_1(αA) = αF_1(A) (1.2.4) for all α ∈ C and all A ∈ M_n; and
(c) F_1(A) is a subset of the closed right half-plane if and only if A + A* is positive semidefinite (1.2.5b).

Proof: Suppose F_1(·) and F_2(·) are two given complex set-valued functions on M_n that satisfy the five cited functional properties. Let A ∈ M_n be given. We first show that F_1(A) ⊆ F_2(A). Suppose, to the contrary, that β ∈ F_1(A) and β ∉ F_2(A) for some complex number β. Then because of (1.2.1) and (1.2.2), there is a straight line L in the complex plane that has the point β strictly on one side of it and the set F_2(A) on the other side (by the separating hyperplane theorem for convex sets; see Appendix B of [HJ]). The plane may be rotated and translated so that the imaginary axis coincides with L and β lies in the open left half-plane. That is, there exist complex numbers α_1 ≠ 0 and α_2 such that Re(α_1β + α_2) < 0 while α_1F_2(A) + α_2 is contained in the closed right half-plane. However, α_1F_2(A) + α_2 = F_2(α_1A + α_2I) because of (1.2.3) and (1.2.4), so Re β' < 0 while F_2(A') lies in the closed right half-plane, where A' ≡ α_1A + α_2I and β' ≡ α_1β + α_2 ∈ F_1(A'). Then, by (1.2.5b), A' + A'* is positive semidefinite. This, however, contradicts the fact that F_1(A') is not contained in the closed right half-plane, and we conclude that F_1(A) ⊆ F_2(A). Reversing the roles of F_1(·) and F_2(·) shows that F_2(A) ⊆ F_1(A) as well. Thus, F_1(A) = F_2(A) for all A ∈ M_n. Since the usual field of values F(·) satisfies the five stated properties (1.2.1-4 and 5b), we obtain the desired conclusion by taking F_2(A) = F(A). □

Problems

1. Show that no four of the five properties cited in Theorem (1.4.2) are sufficient to characterize F(·); that is, each subset of four of the five properties in the theorem is satisfied by some complex set-valued function other than F(·).

2.
Show that (1.2.2-4 and 5a) also characterize F(·), and that no subset of these four properties is sufficient to characterize F(·).

3. Determine other characterizing sets of properties. Can you find one that does not contain (1.2.2)?

4. Show that the complex set-valued function F(·) ≡ Co(σ(·)) is characterized by the four properties (1.2.1-4) together with the fifth property "F(A) is contained in the closed right half-plane if and only if all the eigenvalues of A have nonnegative real parts" (A is positive semistable).

5. Give some other complex set-valued functions on M_n that satisfy (1.2.7).

Further Reading. This section is based upon C. R. Johnson, Functional Characterization of the Field of Values and the Convex Hull of the Spectrum, Proc. Amer. Math. Soc. 61 (1976), 201-204.

1.5 Location of the field of values

Thus far we have said little about where the field of values F(A) sits in the complex plane, although it is clear that knowledge of its location could be useful for applications such as those mentioned in (1.0.2-4). In this section, we give a Geršgorin-type inclusion region for F(A) and some observations that facilitate its numerical determination.

Because the eigenvalues of a matrix depend continuously upon its entries and because the eigenvalues of a diagonal matrix are the diagonal entries, it is not too surprising that there is a spectral location result such as Geršgorin's theorem (6.1.1) in [HJ]. We argue by analogy that because the set F(A) depends continuously upon the entries of A and because the field of values of a diagonal matrix is the convex hull of the diagonal entries, there ought to be some sort of inclusion region for F(A) that, like the Geršgorin discs, depends in a simple way on the entries of A. We next present such an inclusion region.

1.5.1 Definition.
Let A = [a_ij] ∈ M_n, let the deleted absolute row and column sums of A be denoted by

   R_i'(A) ≡ Σ_{j≠i} |a_ij|,   C_i'(A) ≡ Σ_{j≠i} |a_ji|

respectively, and let

   g_i(A) ≡ ½[R_i'(A) + C_i'(A)]

be the average of the ith deleted absolute row and column sums of A. Define the complex set-valued function G_F(·) on M_n by

   G_F(A) ≡ Co( ∪_{i=1}^n {z ∈ C: |z − a_ii| ≤ g_i(A)} )

Recall Geršgorin's theorem about the spectrum σ(A) of A ∈ M_n, which says that

   σ(A) ⊆ G(A) ≡ ∪_{i=1}^n {z ∈ C: |z − a_ii| ≤ R_i'(A)}

1.5.2 Theorem. For all A ∈ M_n, F(A) ⊆ G_F(A).

Proof: We first show that if G_F(A) ⊂ RHP, then F(A) ⊂ RHP. If G_F(A) ⊂ RHP, then Re a_ii > g_i(A) for each i. Let H(A) = ½(A + A*) ≡ B = [b_ij]. Since R_i'(A*) = C_i'(A) and R_i'(B) ≤ g_i(A) (by the preceding exercise), it follows that b_ii = Re a_ii > g_i(A) ≥ R_i'(B). In particular, G(B) ⊂ RHP. Since σ(B) ⊆ G(B) by Geršgorin's theorem, we have σ(B) ⊂ RHP. But since B is Hermitian, F(H(A)) = F(B) = Co(σ(B)) ⊂ RHP and hence F(A) ⊂ RHP by (1.2.5).

We next show that if 0 ∉ G_F(A), then 0 ∉ F(A). Suppose 0 ∉ G_F(A). Since G_F(A) is convex, there is some θ ∈ [0, 2π) such that G_F(e^{iθ}A) = e^{iθ}G_F(A) ⊂ RHP. As we showed in the first step, this means that F(e^{iθ}A) ⊂ RHP, and, since F(A) = e^{−iθ}F(e^{iθ}A), it follows that 0 ∉ F(A).

Finally, if α ∉ G_F(A), then 0 ∉ G_F(A − αI) since the set function G_F(·) satisfies the translation property. By what we have just shown, it follows that 0 ∉ F(A − αI) and hence α ∉ F(A), so F(A) ⊆ G_F(A). □

A simple bound for the numerical radius r(A) (1.1.4) follows directly from Theorem (1.5.2).

1.5.3 Corollary. For all A ∈ M_n,

   r(A) ≤ max_{1≤i≤n} {|a_ii| + g_i(A)}

It follows immediately from (1.5.3) that

   r(A) ≤ ½[ max_{1≤i≤n} Σ_{j=1}^n |a_ij| + max_{1≤i≤n} Σ_{j=1}^n |a_ji| ]

and the right-hand side of this inequality is just the average of the maximum absolute row and column sum matrix norms |||A|||_∞ and |||A|||_1 (see Section (5.6) of [HJ]).

1.5.4 Corollary. For all A ∈ M_n, r(A) ≤ ½(|||A|||_∞ + |||A|||_1).

A norm on matrices is called spectrally dominant if, for all A ∈ M_n, it is an upper bound for the spectral radius ρ(A). It is apparent from the spectral containment property (1.2.6) that the numerical radius r(·) is spectrally dominant.
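Corollaries (1.5.3)-(1.5.5) are easy to check numerically. The sketch below (NumPy; the test matrix and the function name are ours) compares the norm bound ½(|||A|||_∞ + |||A|||_1) with the spectral radius and with a sampled lower estimate of r(A):

```python
import numpy as np

def num_radius_sample(A, m=4000, seed=1):
    """Lower estimate of r(A) = max |x*Ax| by random unit-vector sampling."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    r = 0.0
    for _ in range(m):
        x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        x /= np.linalg.norm(x)
        r = max(r, abs(np.vdot(x, A @ x)))
    return r

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1j,  1.0],
              [1.0, 0.0, -1.0]])
row = np.abs(A).sum(axis=1).max()        # |||A|||_inf: max absolute row sum
col = np.abs(A).sum(axis=0).max()        # |||A|||_1:  max absolute column sum
bound = 0.5 * (row + col)                # Corollary (1.5.4)
rho = max(abs(np.linalg.eigvals(A)))     # spectral radius
r_lo = num_radius_sample(A)              # sampled lower estimate of r(A)
assert rho <= bound + 1e-10              # rho(A) <= r(A) <= bound, (1.5.5)
assert r_lo <= bound + 1e-10
```

Sampling only bounds r(A) from below; the exact numerical radius is computed from the boundary-point procedure developed next.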
1.5.5 Corollary. For all A ∈ M_n,

   ρ(A) ≤ r(A) ≤ ½(|||A|||_∞ + |||A|||_1)

We next discuss a procedure for determining and plotting F(A) numerically. Because the set F(A) is convex and compact, it suffices to determine the boundary of F(A), which we denote by ∂F(A). The general strategy is to calculate many well-spaced points on ∂F(A) and support lines of F(A) at these points. The convex hull of these boundary points is then a convex polygonal approximation to F(A) that is contained in F(A), while the intersection of the half-planes determined by the support lines is a convex polygonal approximation to F(A) that contains F(A). The area of the region between these two convex polygonal approximations may be thought of as a measure of how well either one approximates F(A). Furthermore, if the boundary points and support lines are produced as one traverses ∂F(A) in one direction, it is easy to plot these two approximating polygons. A pictorial summary of this general scheme is given in Figure (1.5.5.1). The points q_i are at intersections of consecutive support lines and, therefore, are the vertices of the external approximating polygon.

Figure 1.5.5.1

The purpose of the next few observations is to show how to produce boundary points and support lines around ∂F(A). From (1.2.4) it follows that

   e^{−iθ}F(e^{iθ}A) = F(A)                                  (1.5.6)

for every A ∈ M_n and all θ ∈ [0, 2π). Furthermore, we have the following lemma.

1.5.7 Lemma. If z ∈ C^n, z*z = 1, and A ∈ M_n, the following three conditions are equivalent:

(a) Re z*Az = max {Re α: α ∈ F(A)}
(b) z*H(A)z = max {r: r ∈ F(H(A))}
(c) H(A)z = λ_max(H(A))z

where λ_max(B) denotes the algebraically largest eigenvalue of the Hermitian matrix B.

Proof: The equivalence of (a) and (b) follows from the calculation

   Re z*Az = ½(z*Az + z*A*z) = z*H(A)z

and the projection property (1.2.5).
If {y_1, ..., y_n} is an orthonormal set of eigenvectors of the Hermitian matrix H(A) and if H(A)y_k = λ_k y_k, then z may be written as

   z = Σ_{k=1}^n α_k y_k   with   Σ_{k=1}^n |α_k|² = 1

since z*z = 1. Thus,

   z*H(A)z = Σ_{k=1}^n λ_k |α_k|²

from which the equivalence of (b) and (c) is immediately deduced. □

It follows from the lemma that

   max {Re α: α ∈ F(A)} = max {r: r ∈ F(H(A))} = λ_max(H(A))   (1.5.8)

This means that the farthest point to the right in F(H(A)) is the real part of the farthest point to the right in F(A), which is λ_max(H(A)). A unit vector yielding any one of these values also yields the others.

Lemma (1.5.7) shows that, if we compute λ_max(H(A)) and an associated unit eigenvector z, we obtain a boundary point z*Az of F(A) and a support line {λ_max(H(A)) + ti: t ∈ R} of the convex set F(A) at this boundary point; see Figure (1.5.8.1).

Figure 1.5.8.1

Using (1.5.6), however, one can obtain as many such boundary points and support lines as desired by rotating F(A) and carrying out the required eigenvalue-eigenvector calculation. For an angle θ ∈ [0, 2π), we define

   λ_θ ≡ λ_max(H(e^{iθ}A))                                   (1.5.9)

and let z_θ ∈ C^n be an associated unit eigenvector:

   H(e^{iθ}A)z_θ = λ_θ z_θ,   z_θ*z_θ = 1                    (1.5.10)

We denote

   L_θ ≡ the line {e^{−iθ}(λ_θ + ti): t ∈ R}

and denote the half-plane determined by the line L_θ by

   H_θ ≡ the half-plane e^{−iθ}{z: Re z ≤ λ_θ}

Based upon (1.5.6), (1.5.7), and the preceding discussion, we then have the following:

1.5.11 Theorem. For each A ∈ M_n and each θ ∈ [0, 2π), the complex number p_θ ≡ z_θ*Az_θ is a boundary point of F(A). The line L_θ is a support line for F(A), with p_θ ∈ L_θ ∩ F(A) and F(A) ⊆ H_θ for all θ ∈ [0, 2π).

Because F(A) is convex, it is geometrically clear that each extreme point of F(A) occurs as a p_θ and that for any α ∉ F(A) there is an H_θ separating F(A) and α, that is, α ∉ H_θ. Thus, we may represent F(A) in the following way:

1.5.12 Theorem.
For all A ∈ M_n,

   F(A) = Co({p_θ: 0 ≤ θ < 2π}) = ∩_{0 ≤ θ < 2π} H_θ

Since it is not possible to compute infinitely many points p_θ and lines L_θ, we must be content with a discrete analog of (1.5.12) with equalities replaced by set containments. Let Θ denote a set of angular mesh points, Θ = {θ_1, θ_2, ..., θ_k}, where 0 ≤ θ_1 < θ_2 < ··· < θ_k < 2π.

1.5.13 Definition. Let A ∈ M_n be given, let a finite set of angular mesh points Θ = {0 ≤ θ_1 < ··· < θ_k < 2π} be given, let {p_θi} be the associated set of boundary points of F(A) given by (1.5.11), and let {H_θi} be the half-planes associated with the support lines L_θi for F(A) at the points p_θi. Then we define

   F_in(A, Θ) ≡ Co({p_θ1, ..., p_θk})

and

   F_out(A, Θ) ≡ H_θ1 ∩ ··· ∩ H_θk

These are the constructive inner and outer approximating sets for F(A), as illustrated in Figure (1.5.5.1).

1.5.14 Theorem. For every A ∈ M_n and every angular mesh Θ,

   F_in(A, Θ) ⊆ F(A) ⊆ F_out(A, Θ)

The set F_out(A, Θ) is most useful as an outer estimate if the angular mesh points θ_i are sufficiently numerous and well spaced that the set ∩{H_θi: 1 ≤ i ≤ k} is bounded (which we assume henceforth). In this case, it is also simply determined. Let q_θi denote the (finite) intersection point of L_θi and L_θ(i+1), where i = 1, ..., k and i = k + 1 is identified with i = 1. The existence of these intersection points is equivalent to the assumption that F_out(A, Θ) is bounded, in which case we have the following simple alternate representation of F_out(A, Θ):

   F_out(A, Θ) = Co({q_θ1, ..., q_θk})                        (1.5.15)

Because of the ordering of the angular mesh points θ_i, the points p_θi and q_θi
[opty ‘Thus, each approximating set is easily plotted, and the difference of their areas (or some other measure of their set difference), which is easily calcu- lated (see Problem 10), may be taken as a measure of the closeness of the approximation. If the approximation is not sufficiently close, a finer angular mesh may be used; because of (1.5.12), such approximations can be made arbitrarily close to (A). It is interesting to note that (A) (which com- tains o(A) for all Ae M,), may be approximated arbitrarily closely with only a series of Hermitian eigenvalue-eigenvector computations. ‘These procedures allow us to calculate the numerical radius r(A) as well. The following result is an immediate consequence of Theorem (1.5.14). 1.5.16 Gorollary. For each A ¢ M, and every angular mesh ©, max |», Prec a $r(A)¢ max |g] gig Recall from (1.2.11) that F[A(i)} ¢ F(A) for i= 1,,...m, where AC") denotes the principal submatrix of Ae M, formed by deleting row and column i. It follows from (1.2.2) that oof Flaw] cr(A) ‘A natural question to ask is: How much of the right-hand side does the left-hand side fil up? ‘The answer is “all of it in the limit," as the dimension goes to infinity. In order to describe this fact conveniently, we define ‘Area(S) for a convex subset § of the complex plane to be the conventional area unless Sis a line segment (possibly a point), in which case Area(S) is understood to be the length of the line segment (possibly zero). Further- more, in the following area ratio we take 0/0 to be 1. 15.17 ‘Theorem. For Ae M, with n22, let A(i’)€M,. denote the principal submatrix of A obtained by deleting row and column 4 from A. ‘There exists a sequence of constants cy, ¢y,..€[0,1] such that for any AcM,, 1.5 Location of the field of values 30 Area(Co U,FLAG))) ee ‘Area[F(A)] and lim c,=1asn—ra. The constants 2 fe Re Rea (1.5.18) on aaa 8 G[a(n— a cata satisfy these conditions, and so do the constants eee ee (1.5.18) Problems 1. 
Show by example that there is no general containment relation between any two of the sets Co(G(A)), Co(G(A7)), and Gp(A). 2. Show that neither r(A)<|jj Al]; nor r(A) 1~cq, then Ais positive definite. 17, Consider the constants c, and cf, defined in (1.5.18a,b). Show that the sequence {c,} increases monotonically to 1, ¢, > oj, for n= 2, 3,.. Lim ¢q/cy, = 18 n—ve, and the approximation c,w cf, as n—ra is quite good aince 4-61, = 9/(2n8) + O(1/n4), 42 ‘The field of values ‘The following four probleris improve Problem 8's simple estimate that the field of values of a doubly stochastic matrix is contained in the unit disc. 18. Let m2 2andllet Q=[9,;] € M,, bea cyclic permutation matris, that i, Git = L40r 1= 1,4 M=1, G1 = 1, and all other gy= 0. What does t! TEVe to do witha cyale pert ation of the integers yam}? Show that FQ) is the convex hull of the mth roots of unity e2#*¥/ yoy M, Which is contained in the angular wedge L, inthe complex plan wil vertex = Land passing through e*2"/™ fg? fem 14 rellr20, |o-a| 4} (1.5.19) Sketch L,,. Notice that the directed graph of Q contains a simple cycle of length m, that is, a directed path that begins and ends at the same node, ‘which occurs exactly twice in the path, and no other node occurs more than once in the path; see (6.2.12) in [HJ]. 19. For any Be M,, define p(B) = the length of a maximum-ength simple cycle in the directed graph of B. Notice that 0< u(B) 0. Show that p(ayB,-+--+ + ajB,)> max {u(By)y--- 4(B,)}- 20. Let Pe M, bea given permutation matrix. Since any permutation of the integers {1,..., n} is a product of cyclic permutations, explain why P is permutation similar (hence, unitarily similar) to a direct sum of cyclic permutation matrices QhronyQy QE Myy 16% 5m Mt" m= My where (P) = max {m,..., ny}. Use (1.2.10) to show that *(P) is contained in the wedge 1, defined by (1.5.19). ‘21, Let A€ M, be a given doubly stochastic matrix. 
Use Birkhoff’s theo- rem (8.7.1) in [HJ] to write A as a finite positive convex combination of permutation matrices, that is, A=YaP;, alla;>0, SY) Although there may be many ways to express A as such a sum, show that WA)2 max {P) always, and hence we have the uniform upper bound P(P)C Lig4y CL for all i Use (1.2.4) and (1.2.7) to show that F(A)c 1.8 Location of the field of values 43 Lyca) Eq Conclude that (A) is contained in the part of the wedge Lay (and £,) that is contained in the unit disc. Sketch this set. What does this say when n= 2? When (4) =2? When Ais tridiagonal? From what you have proved about F(A), what can you say about oA)? Do not use the fact, that a doubly stochastic tridiagonal matrix must be symmetric. ‘22. Let A € M,(R) have nonnegative entries and spectral radius 1. Let (A) denote the length of the maximum-length simple cycle in the directed graph of A. It is a theorem of Kellogg and Stephens that of A) is contained in the intersection of the unit disc and the angular wedge L,y) defined by (1.5.19). (a) Compare this statement with the result of Eeobtem 21. (b) If p(A) = 2, use the Kellogg-Stephens theorem to show that all the eigenvalues of A are real, and deduce that all the eigenvalues of a nonnegative tridiagonal matrix are real. See Problem 5 of Section (4.1) of [HJ] for a different proof of this fact. (c) Use the Kellogg-Stephens theorem to show that all the eigenvalues of a general n-by-n nonnegative matrix with spectral radius 1 are contained in the intersection of the unit disc and the wedge Z,,. Sketch this set. A precise, but rather complicated, description of the set of all possible eigenvalues of a nonnegative matrix with spectral radius 1 has ‘been given by Dmitriev, Dynkin, and Karpelevich. 23. Verify the following facts about the numerical radius function r(-): M,—R, defined by r(A)= max {|2|:2€ F(A)}. 
We write ρ(A) for the spectral radius, |||A|||₂ for the spectral norm, ‖A‖₂ for the Frobenius norm, and |||A|||₁, |||A|||_∞ for the maximum column sum and maximum row sum matrix norms, respectively. The following statements hold for all A, B ∈ M_n. See Section (5.7) of [HJ] and its problems for background information about the numerical radius and its relationship to other norms.

(a) r(·) is a norm on M_n but is not a matrix norm.
(b) 4r(·) is submultiplicative with respect to the ordinary matrix product, that is, 4r(AB) ≤ 4r(A)4r(B), and 4 is the least positive constant with this property. Thus, 4r(·) is a matrix norm on M_n.
(c) 2r(·) is submultiplicative with respect to the Hadamard product, that is, 2r(A∘B) ≤ 2r(A)2r(B), and 2 is the least positive constant with this property, but if either A or B is normal then r(A∘B) ≤ r(A)r(B); see Corollaries (1.7.24-25).
(d) r(A^m) ≤ [r(A)]^m for all m = 1, 2, ...; this is the power inequality for the numerical radius. Give an example to show that it is not always true that r(A^{k+m}) ≤ r(A^k)r(A^m).
(e) ρ(A) ≤ r(A).
(f) r(A*) = r(A).
(g) r(A) ≤ |||A|||₂ ≤ 2r(A), and both bounds are sharp.
(h) …

… = 1, and AS ⊆ S by the Cayley-Hamilton theorem. Let U = [U₁ U₂] ∈ M_n be a unitary matrix such that the columns of U₁ form an orthonormal basis of S. Then U*AU is a block triangular matrix of the form

    U*AU = [ U₁*AU₁  ✱ ]
           [   0     ✱ ]

where A acts as a Euclidean isometry on S (a linear transformation from S to S that preserves the Euclidean length of every vector), so U₁*AU₁ is unitary (see (2.1.45) in [HJ]). Thus, 1 = ρ(U₁*AU₁) ≤ ρ(U*AU) = ρ(A) ≤ |||A|||₂ ≤ 1.

26. Modify the proof of the preceding problem to show that if A ∈ M_n has |||A|||₂ ≤ 1, then ρ(A) = 1 if and only if |||A^m|||₂ = 1 for some positive integer m greater than or equal to the degree of the minimal polynomial of A.

27. Let A ∈ M_n be given.
Combine the results of the preceding three problems and verify that the following conditions are equivalent:

(a) A is radial.
(b) ρ(A) = |||A|||₂.
(c) r(A) = |||A|||₂.
(d) |||A²|||₂ = (|||A|||₂)².
(e) |||A^m|||₂ = (|||A|||₂)^m for some integer m not less than the degree of the minimal polynomial of A.
(f) |||A^k|||₂ = (|||A|||₂)^k for all k = 1, 2, ....

In addition, one more equivalent condition follows from Problem 37(f) in Section (1.6); see Problem 38 in Section (1.6).

(g) A is unitarily similar to |||A|||₂(U ⊕ B), where U ∈ M_k is unitary, 1 ≤ k ≤ n, and B ∈ M_{n-k} has ρ(B) < 1 and |||B|||₂ ≤ 1.

28. (a) If A ∈ M_n is normal, show that A is radial. If A is radial and n = 2, show that A is normal. If n ≥ 3, exhibit a radial matrix in M_n that is not normal. (b) Show that if A ∈ M_n is radial, then every positive integer power of A is radial, but it is possible for some positive integer power of a matrix to be radial without the matrix being radial.

29. Let J_n(0) denote the n-by-n nilpotent Jordan block considered in Problem 9 in Section (1.3), where it was shown that F(J_n(0)) is a closed disc centered at the origin with radius r(J_n(0)) = ρ(H(J_n(0))). If k is a positive integer such that 2k ≤ n, show that 2^{-1/k} ≤ r(J_n(0)) < 1. Describe F(J_n(0)).

30. A given A ∈ M_{m,n} is called a contraction if its spectral norm satisfies |||A|||₂ ≤ 1; it is a strict contraction if |||A|||₂ < 1. T. Ando has proved two useful representations for the unit ball of the numerical radius norm in M_n:

(a) r(A) ≤ 1 if and only if

    A = (I - Z)^{1/2} C (I + Z)^{1/2}        (1.5.20)

for some contractions C, Z ∈ M_n with Z Hermitian; the indicated square roots are the unique positive semidefinite square roots.

(b) r(A) ≤ 1 if and only if …

… UHP ≡ {z : Im z ≥ 0}. Then 0 ∈ F(C) ⊆ UHP, 0 is not in the relative interior of F(C), and α₁c₁₁ + ··· + α_n c_nn = 0. Since each c_ii ∈ F(C), Im c_ii ≥ 0 for all i. But α₁ Im c₁₁ + ··· + α_n Im c_nn = 0 and each α_i > 0, so all the main diagonal entries of C are real.
For any given indices i, j ∈ {1, ..., n}, let T_ij denote the 2-by-2 principal submatrix of C obtained as the intersections of rows and columns i and j, and let λ₁, λ₂ denote its eigenvalues. Since F(T_ij) ⊆ F(C) ⊆ UHP, we have λ₁, λ₂ ∈ UHP. But Im(λ₁) + Im(λ₂) = Im(λ₁ + λ₂) = Im(tr T_ij) = 0, so λ₁ and λ₂ are both real and neither is an interior point of F(T_ij) ⊆ UHP. It follows from Theorem (1.3.6(d)) that T_ij is normal; it is actually Hermitian since its eigenvalues are real. Since i and j are arbitrary indices, it follows that C is Hermitian and hence F(C) is a real interval, which must have 0 as an endpoint since 0 is not in the relative interior of F(C). Thus, every point in F(C), in particular, every c_ii and every λ_i, has the same sign. But α₁c₁₁ + ··· + α_n c_nn = 0 and all α_i > 0, so all c_ii = 0. Since λ₁ + ··· + λ_n = c₁₁ + ··· + c_nn, it follows that all λ_i = 0 as well, and hence C = 0, so that A = ζI, and hence (a) implies (b).

To show that (a) implies (c), suppose … is not in the relative interior of F(A). Choose a unitary U ∈ M_n such that Δ ≡ U*AU is upper triangular and has main diagonal entries λ₁, ..., λ_n. Then F(A) = F(Δ), and a strict convex combination of the main diagonal entries of Δ is not in the relative interior of its field of values. By the equivalence of (a) and (b) we conclude that Δ = ζI and hence A = UΔU* = ζI. □

We next investigate the smoothness of the boundary ∂F(A) of the field of values and its relationship with σ(A). Intuitively, a "sharp point" of a convex set S is an extreme point at which the boundary takes an abrupt turn, a boundary point where there are nonunique tangents, or a "corner."

1.6.2 Definition. Let A ∈ M_n. A point α ∈ ∂F(A) is called a sharp point of F(A) if there are angles θ₁ and θ₂ with 0 ≤ θ₁ < θ₂ < 2π for which

    Re e^{iθ}α = max {Re β : β ∈ F(e^{iθ}A)}  for all θ ∈ (θ₁, θ₂)

1.6.3 Theorem. Let A ∈ M_n. If α is a sharp point of F(A), then α is an eigenvalue of A.
Proof: If α is a sharp point of F(A), we know from (1.5.7) that there is a unit vector z ∈ C^n such that

    H(e^{iθ}A)z = λ_max(H(e^{iθ}A))z  for all θ ∈ (θ₁, θ₂)

from which it follows that (see (1.5.9))

    H(e^{iθ}A)z = λ_θ z  for all θ ∈ (θ₁, θ₂)        (1.6.3a)

The vector z is independent of θ. Differentiation of (1.6.3a) with respect to θ yields H(ie^{iθ}A)z = λ′_θ z, which is equivalent to

    S(e^{iθ}A)z = -iλ′_θ z        (1.6.3b)

Adding (1.6.3a) and (1.6.3b) gives e^{iθ}Az = (λ_θ - iλ′_θ)z, or

    Az = e^{-iθ}(λ_θ - iλ′_θ)z

Interpreted as an evaluation at any θ in the indicated interval, this means that

    α = z*Az = e^{-iθ}(λ_θ - iλ′_θ)

is an eigenvalue of A. □

Since each A ∈ M_n has a finite number of eigenvalues, Theorem (1.6.3) implies that if a set is the field of values of a finite matrix, it can have only a finite number of sharp points. Moreover, if the only extreme points of F(A) are sharp points, then F(A) is the convex hull of some eigenvalues of A.

1.6.4 Corollary. Let A ∈ M_n be given. Then F(A) has at most n sharp points, and F(A) is a convex polygon if and only if F(A) = Co(σ(A)).

Although every sharp point on the boundary of F(A) is an eigenvalue, not every eigenvalue on the boundary of F(A) is a sharp point. Nevertheless, every such point does have a special characteristic.

1.6.5 Definition. A point λ ∈ σ(A) is a normal eigenvalue for the matrix A ∈ M_n if

(a) Every eigenvector of A corresponding to λ is orthogonal to every eigenvector of A corresponding to each eigenvalue different from λ, and
(b) The geometric multiplicity of the eigenvalue λ (the dimension of the corresponding eigenspace of A) is equal to the algebraic multiplicity of λ (as a root of the characteristic polynomial of A).

Exercise. Show that every eigenvalue of a normal matrix is a normal eigenvalue.

Exercise. If A ∈ M_n has as many as n - 1 normal eigenvalues, counting multiplicity, show that A is a normal matrix.

Exercise.
Show that λ is a normal eigenvalue of a matrix A ∈ M_n if and only if it is a normal eigenvalue of UAU* for every unitary matrix U ∈ M_n, that is, the property of being a normal eigenvalue is a unitary similarity invariant.

Our main observation here is that every eigenvalue on the boundary of the field of values is a normal eigenvalue.

1.6.6 Theorem. If A ∈ M_n and if α ∈ ∂F(A) ∩ σ(A), then α is a normal eigenvalue of A. If m is the multiplicity of α, then A is unitarily similar to αI ⊕ B, with I ∈ M_m, B ∈ M_{n-m}, and α ∉ σ(B).

Proof: If the algebraic multiplicity of α is m, then A is unitarily similar to an upper triangular matrix T (this is Schur's theorem; see (2.3.1) in [HJ]) whose first m main diagonal entries are equal to α and whose remaining diagonal entries (all different from α) are the other eigenvalues of A. Suppose there were a nonzero entry off the main diagonal in one of the first m rows of T. Then T would have a 2-by-2 principal submatrix T₂ of the form

    T₂ = [ α  β ]
         [ 0  λ ] ,  β ≠ 0

Since F(T₂) is either a circular disc about α with radius ½|β| or a nondegenerate ellipse (with interior) with foci at α and λ, the point α must be in the interior of F(T₂). But F(T₂) ⊆ F(T) = F(A) by (1.2.8) and (1.2.11), which means that α is in the interior of F(A). This contradiction shows that there are no nonzero off-diagonal entries in the first m rows of T, and hence T = αI ⊕ B with I ∈ M_m and B ∈ M_{n-m}. The remaining assertions are easily verified. □

Exercise. Complete the details of the proof of Theorem (1.6.6).

We have already noted that the converse of the normality property (1.2.9) does not hold, but we are now in a position to understand fully the relationship between the two conditions on A ∈ M_n: A is normal, and F(A) = Co(σ(A)).

1.6.7 Corollary. If σ(A) ⊆ ∂F(A), then A is normal.

Proof: If all the eigenvalues of A are normal eigenvalues, then there is an orthonormal basis of C^n consisting of eigenvectors of A, so A must be normal. □

Exercise.
Show that a given matrix A ∈ M_n is normal if and only if A is unitarily similar to a direct sum A₁ ⊕ A₂ ⊕ ··· ⊕ A_k with σ(A_i) ⊆ ∂F(A_i), i = 1, ..., k.

Two further corollary facts complete our understanding of the extent to which there is a converse to (1.2.9).

1.6.8 Theorem. Let A ∈ M_n. Then F(A) = Co(σ(A)) if and only if either A is normal or A is unitarily similar to a matrix of the form

    [ A₁  0  ]
    [ 0   A₂ ]

where A₁ is normal and F(A₂) ⊆ F(A₁).

1.6.9 Corollary. If A ∈ M_n and n ≤ 4, then A is normal if and only if F(A) = Co(σ(A)).

Exercise. Supply a proof for (1.6.8) using (1.6.6).

Exercise. Supply a proof for (1.6.9) by considering the geometrical possibilities allowed in (1.6.8).

We wish to discuss one additional connection between normality and the field of values. If A ∈ M_n is given and if B = [A ✱; ✱ ✱] is a larger matrix that contains A in its upper-left corner, we say that B is a dilation of A; by the submatrix inclusion property (1.2.11), F(A) ⊆ F(B) whenever B ∈ M_m is a dilation of A ∈ M_n. One can always find a 2n-by-2n normal dilation of A, for example, the matrix B defined by (1.6.11), in which case F(B) = Co(σ(B)). It is clear that F(A) is contained in the intersection of the fields of values of all the normal dilations of A, and it is a pleasant observation that this intersection is exactly F(A).

1.6.10 Theorem. Let A ∈ M_n be given. Then

    F(A) = ∩ { F(B) : B = [ A ✱; ✱ ✱ ] ∈ M_{2n} is normal }

Proof: Since we have already noted that F(A) is a subset of the given intersection, we need only prove the reverse inclusion. Because F(A) is closed and convex, it is the intersection of all the closed half-planes that contain it, a fact used in the preceding section to develop a numerical algorithm to compute F(A). Thus, it is sufficient to show that for each closed half-plane that contains F(A), there is some normal matrix B = [A ✱; ✱ ✱] ∈ M_{2n} such that F(B) is contained in the same half-plane.
By the translation and scalar multiplication properties (1.2.3-4), there is no loss of generality to assume that the given half-plane is the closed right half-plane RHP. Thus, we shall be done if we show that whenever A + A* is positive semidefinite, that is, F(A) ⊆ RHP, then A has a normal dilation B ∈ M_{2n} such that B + B* is positive semidefinite, that is, F(B) ⊆ RHP. Consider

    B ≡ [ A   A* ]
        [ A*  A  ]        (1.6.11)

Then B is a dilation of A and

    BB* = [ AA* + A*A   A² + (A*)² ]  = B*B
          [ A² + (A*)²  A*A + AA*  ]

so B is normal, independent of any assumption about F(A). Moreover, if A + A* is positive semidefinite, then

    B + B* = [ A + A*  A + A* ]
             [ A + A*  A + A* ]        (1.6.12)

is positive semidefinite, since if y, z ∈ C^n and w = [y; z] ∈ C^{2n}, then w*(B + B*)w = (y + z)*(A + A*)(y + z) ≥ 0. □

Since the field of values of a normal matrix is easily shown to be convex and the intersection of convex sets is convex, Theorem (1.6.10) suggests a clean conceptual proof of the convexity property of the field of values (1.2.2). Unfortunately, the convexity of F(A) is used in a crucial way in the proof of Theorem (1.6.10); thus, it would be very pleasant to have a different proof that does not rely on the Toeplitz-Hausdorff theorem.

The matrix defined by (1.6.11) gives a 2n-by-2n normal dilation of any given matrix A ∈ M_n. It is sometimes useful to know that one can find dilations with even more structure. For example, one can choose the dilation B ∈ M_{2n} to be of the form B = cV, where V ∈ M_{2n} is unitary and c ≡ max{|||A|||₂, 1} (see Problem 22). Moreover, for any given integer k ≥ 1, one can find a dilation B ∈ M_{(k+1)n} that is a scalar multiple of a unitary matrix and has the property that B^m = [A^m ✱; ✱ ✱] for all m = 1, 2, ..., k (see Problem 25).

Problems

1. If A ∈ M₂, show that F(A) is a possibly degenerate ellipse with center at ½ tr A and foci at the two eigenvalues of A. Characterize the case in which F(A) is a circular disc. What is the radius of this disc? Characterize the case in which F(A) is a line segment. What are its endpoints?
In the Temaining case of an ellipse, what is the eccentricity and what is the equa- tion of the boundary? 2. For A=[2 4] € M(B), show that (A) is an ellipse (possibly degen- exate) with center at the point 4(a+ d) on the real axis, whose vertices are the two points 4(a+ d+ [(a-d)? + (b+ ¢)?]#) on the real axis and the two 1.6 Geometry 55 points #(a+dsi]d-c|). Show that the degenerate cases correspond exactly to the cases in which A is normal, Can you find similarly explicit formulae if A € Mj(6)? 3. If Ae My has distinct eigenvalues, and if 4g =A~(jtr A)J, show that Ag is nonsingular and the major axis of F(A) lies in the direction =(det 4p)/|det Ao| in the complex plane. What is the length of the major axis? What is the length of the minor axis? 4. If A€ Mg, and if one of the eigenvalues of A appears on the main diag- onal of A, show that at least one of the off-diagonal entries of A must be zero. 5. Show that any given (possibly degenerate) ellipse (with interior) is the ficld of values of some matrix A € My, Show how to construct such a matrix Affor a given ellipse. 6. Show by example that 4 is the best possible value in (1.6.9), that is, for every n> 5 show that theres a nonnormal A € M, with F(A) = Co(o(4)). 7. Let n23 be given and let 1¢ 0, and all 0;¢ (0,22). Tf A has some diagonal entry equal to 0, show ‘that there does nod exist a value ¢euch that < 0;< ¢-+ rfor all i= 1, ‘that is, the eigenvalues do not project onto the unit circle inside any semi- circle, 1.6 Geometry ar 20, Give another proof that the matrix B defined by (1.6.11) ie normal, and that B+ B* is positive semidefinite if A+ A* is positive semidefinite, as follows: Show that the 2nby-2n matrix Us [.j 7] /yZis unitary, U*BU= [4 Sar]: and 0°(8+ BUH |) acadan)|- Thus, Band B+ BY aze unitarily similar to something obviously normal and something positive semidefinite, respectively. How is the spectrum of B determined by A? Show that [I Ble <2||| A llb- ‘21. 
Recall that a matrix A is a contraction if|l| Allp <1. See Problem 30 in Section (1.5) for an important link between contractions and the field of values, (2) If there exists a unitary dilation V= [£ {] of a given square complex matrix A, show that A must be contraction. (b) Conversely, if A € Mf, is a given contraction, show that A has a unitary dilation as follows: Let A= PU be a polar decomposition of A, where Ue M, is unitary and P= (AA®*)} is positive semidefinite. Let A r- Pty ae[ a4 iy "| ) Jet and show that Zis unitary. (©) IfA€ M, isa contraction, show that [ 4 (I= Aart ce wns | ean (16.14) is also a-unitary dilation of A. (a) Use the dilation Z in (1.6.14) and the argument in the proof of Theorem (1.6.10) to show that if 4 is a normalcontraction, then F(4)=() {rw u=[f fe M,, it eitacy} (€) Combine the normal dilation (1.6.11) with the result in (4) and the norm ‘bound in Problem 20 to show that if A € M, is a contraction with||| Alb < 4, then FA)= (fro: esl € Mi itary} 58 ‘The field of values (f) Now consider a third unitary dilation, one that does not require the polar decomposition, the singular value decomposition, or matrix square roots. ‘The following construction can result in a dilation of smaller size than 2n; an analogous construction gives a complex orthogonal dilation of an arbi- trary matrix (see Problem 40). Let A € M, bea given contraction and let 5 rank(I-A*A), 60 0< 6 n with 6=0 if and only if A is unitary; 6 may be thought of a5 a measure of how far A is from being unitary. Explain why rank(I-AA*) = Sand why there are nonsingular X, Ye M, with r-Adt=x[ 08 O] xr ana r-ata= ve [} a Y and Ig€ My. Define [o'] ee Ts DzI0 i] vaxey[] eM, 0 Tp] Ve My and and form Zz (6 | Mays (16.18) Use the definitions (+) and the identity A(7-A*A)=(I-AA*)A to show that san [S SJL Spears “ What do (1.6.15) and (**) become when §= n, that is, when A is a strict contraction (| Alp <1)? In this special case, verify that (1.6.15) gives a unitary matrix. 
Now consider the general case when 61 bea given integer. Show that there exists a unitary Ve Mfg. sye such that (cV)™= [2 $] for m=1, 2ya9 h where ¢= 60 ‘The field of values max {|All 1}. Compare with Theorem (1.6.10). 26. Let AeM, be a given contraction, that is, || All <1. If p(+) is a polynomial such that |p(2)| <1 for all |2| = 1, use Problem 25 to show that (A) is a contraction. 27. Use Theorem (1.6.6) to show that the algebraic and geometric multi- plicities of the Perron root (and any other eigenvalues of maximum modulus) of a doubly stochastic matrix must be equal, and show by example that thie is not true of a general nonnegative matrix. Show, however, that this is true of a nonnegative matrix that is either row or column stochastic, and give an example of such a matrix whose Perron root does not lie on the boundary ofits field of values. 28. Use Corollary (1.6.7) to give a shorter proof of Theorem (1.6.1). Is there any circular reasoning involved in doing so? 29, A matrix A€ M, is said to be unitarily reducible if there is some unitary Ue M, such that U*AU= A, © A, with A; € My Aye My p, and 1 @Jyy(Qg) is the Jordan canonical form of A (see (8.1.11) in [HJ]), show that there is some one nonsingular SeM, such that Co(o(4))= F(SAS-) if and only if nj=1 for every eigenvalue A; that lies on the boundary of Co( o( A)). 31. If <-,-> is a given inner product on €%, the field of values generated by is defined by Fe, .,(A)= {: 2€ (, = 1} for Ae M,. Note that the usual field of values is the field of values generated by the usual Euclidean inner product zy"z Show that Co(a(A))= NF. ,.>(A): <-> is an inner product on (*}. 32. Let Ae M, be given. In general, o(A) 2, exhibit a nonspectral matrix Be M, such that B" is spectral forall m2, 36. Let Ae M, be given with r(A) <1. Show that if (A) =1, then r(A™) 1 for all m=1,2,.... 
Goldberg, Tadmor, and Zwas have proved a con- verse: If r(A) <1, then p(4) = 1 if and only if r(A™) = 1 for some positive integer ma not less than the degree of the minimal polynomial of A. Use this result to establish the following: (a) Ais spectral if and only if r(A™) = r (A)™ for some por mnot less than the degree of the minimal polynomial of A. (b) Show that the 3-by-3 nilpotent Jordan block ive integer 010 oon 000 A has p(A) =0 and (A?) = r(A)?=4, but A is not spectral, Explain why the bound on the exponent in the Goldberg-Tadmor-Zwas theorem issharp. (c) Show that A is spectral ifand only ifr (A*) = r(A)". (a) When n= 2, show that A is normal if and only if r(A2) = r(A)2. Compare the Goldberg-Tadmor-Zwas theorem with Ptak’s theorem on radial matrices and its generalization in Problems 25 and 26 in Section (1.5). 37, Let Ae M, be given. Combine the results of the preceding five prob- Jems and verify that the following conditions are equivalent: (a) Ais spectral. 62 (b) () @ minimal polynomial of A. (2) (4) =r(A)Ffor all k=1, 2, () Ais unitarily similar to r(4)(Ue B), where Ve M; is unitary, 1¢k¢ mand Be M, ,has (B) <1and r(B) <1. Compare with the conditions for a matrix to be radial given in Problem 27 in Section (1.5). 38. Use Problem 37(f) to prove the assertion in Problem 27(g) in Section (3). 39. The main geometrical content of Theorem (1.6.6) is the following fact: Any eigenvector of A corresponding to an eigenvalue that is a boundary point of the fiéld of values is orthogonal to any eigenvector of A corre- sponding to any other (different) eigenvalue. Provide details for the fol- lowing alternative proof of this fact. We write for the usual inner product. (1) Let \ € #(A), and let ud be any other eigenvalue of A. Let + y be unit vectors with Az= \zand Ay= yy. Since F(A) is compact and convex, and ) is a boundary point of F(A), there is a supporting line for F(A) that passes through A. 
Show that this geometrical statement is ‘equivalent to the existence of a nonzero c€ € and a real number a such that Im c > o for all unit vectors », with equality for z= 2, that is, Im cd (2)Let Ay=cA-ial, y= eA-ia, and y= quia. Show that Im 20 for all unit vectors 2 A, is real, 7 #A,, and Im, #0. (8) For arbitrary é,7€C, note that Im 20 and deduce that Im [|]? + (A,-74)é%<2y>] 20. (4) Show that you may choose 9 so that |y|=1 and Im[(Ay-A))n]= |(Ay-fi)|5 deduce that Im p, + |(Ay-7,)|€20 for all €€R. (5) Conclude that =0, as desired. Remark: The proof we have outlined is of interest because it does not rely on an assumption that A is a (finite-dimensional) matrix, tis valid for any bounded linear operator A on a complex Hilbert space with inner product <-,->, and uses the fact that F(A) is a compact convex set. Thus, the geometrical statement about orthogonality of eigen- vectors given at the beginning ofthis problem is valid for any bounded linear ‘operator A on a complex Hilbert space. 40. Although a given matrix has a unitary dilation if and only if it is a contraction (Problem 21), every comples matriz has a complex orthogonal 1.6 Geometry 63 dilation. (a) Let Ae M, and let 6=rank(J-ATA), 60 0< 6 wl y (7) and J,€ M;. Define Is sex() ] €M,,5 C240 Ig] ¥e My,q,and I Dz [0 I] vant ys[ 5] €Ms and form Qe ie a € Mas (16.17) Use the definitions (7") and the identity A(7- ATA) =(I-AA™)A to show that 00 4 ape? ] * ay role 0) x4r C2 Notice that these calculations carry over exactly to (or from) (**) in Prob- lem 21; one just interchanges * and 7. What do (1.6.17) and (77) become when §= n? In this special case, verify that (1.6.17) gives a complex orthog- ‘onal matrix. The algebraic manipulations necessary to establish the general case are analogous to thote needed to establish the general casein Problem a1. 
(b) Assuming that (1.6.17) gives a complex orthogonal dilation of A, explain why the construction of (1.6.16) in Problem 24 gives a complex orthogonal dilation P= [4 7] € Myyys such that Pm = [4°] for m= ik (¢) Let n and k be given positive integers. Show that a given Ae Myis a 64 ‘The field of values principal submatrix of a complex orthogonal Qe M, if and only if rank(T-A7A) < min{k, n- &}, which imposes no restriction on A if k< n/2. (@) Show that every 4 € M,, , has a complex orthogonal dilation. Notes and Further Readings. The proof given for Theorem (1.6.1) is due to Onn Chan. Theorem (1.6.3) may be found in Section I of R. Kippenhahn, ‘Wher den Wertevorrat einer Matrix, Moth. Nachr. 6 (1951), 193-228, where « proof different from the three given here may be found (see Problems 9 and 10), a8 well as many additional geometric reaults about the fidd of values; however, see the comments at the end of Section (1.8) about some of the quaternion results in Section II of Kippenhahn’s paper. ‘Theorems (1.6.6) and (1.6.8) are from C. R. Johnson, Normality and the Numerical Range, Linear Algebra Appl 15 (1976), 89-04, where additional related references may be found. A version of (1.6.1) and related ideas are discussed in O. ‘Taussky, Matrices with Trace Zero, Amer. Math. Monthly 69 (1962), 40-42. ‘Theorem (1.6.10) and its proof are in P. R. Halmos, Numerical Ranges and Normal Dilations, Acta Sei. Math. (Szeged) 25 (1964), 1-5. The constructions of unitary dilations of a given contraction given in (1.6.18) and (1.6.16) are in E. Egervary, On the Contractive Linear Transformations of n-Dimensional Vector Space, Acta Sci. Math. (Szeged) 15 (1953), 178-182. ‘The constructions for (1.6.15-17) as well as sharp bounds on the sizes of unitary and complex orthogonal dilations with the power properties discussed in Problems 24 and 40 are given in R. C. Thompson and C.-C. 'T. Kuo, Doubly Stochastic, Unitary, Unimodular, and Complex Orthogonal Power. Embeddings, Acta. Sci, Math. 
(Szeged) 44 (1982), 345-357. Generalizations and extensions of the result in Problem 26 are in K. Fan, Applications and Sharpened Forms of an Inequality of Von Neumann, pp. 113-121 of [UhGr]; p(z) need not be a polynomial, but may be any analytic function on an open set containing the closed unit disc, and there are matrix analytic versions of classical results such as the Schwarz lemma, Pick's theorem, and the Koebe ¼-theorem for univalent functions. The theorem referred to in Problem 36 is proved in M. Goldberg, E. Tadmor, and G. Zwas, The Numerical Radius and Spectral Matrices, Linear Multilinear Algebra 2 (1975), 317-326. The argument in Problem 39 is due to Ky Fan, A Remark on Orthogonality of Eigenvectors, Linear Multilinear Algebra 23 (1988), 283-284.

1.7 Products of matrices

We survey here a broad range of facts relating products of matrices to the field of values. These fall into four categories:

(a) Examples of the failure of submultiplicativity of F(·);
(b) Results about the usual product when zero is not in the field of values of one of the factors;
(c) Discussion of simultaneous diagonalization by congruence;
(d) A brief survey of the field of values of a Hadamard product.

Examples of the failure of submultiplicativity for F(·)

Unfortunately, even very weak multiplicative analogs of the subadditivity property (1.2.7) do not hold. We begin by noting several examples that illustrate what is not true, and place in proper perspective the few facts that are known about products. The containment

    F(AB) ⊆ F(A)F(B),  A, B ∈ M_n        (1.7.1)

fails to hold both "angularly" and "magnitudinally."

1.7.2 Example. Let A = [0 2; 0 0] and B = [0 0; 2 0]. Then F(A) = F(B) = F(A)F(B) = the unit disc, while F(AB) is the line segment joining 0 and 4. Thus, F(AB) contains points much further (four times as far) from the origin than F(A)F(B), so this example shows that the numerical radius r(·) is not a matrix norm.
However, 4r(·) is a matrix norm (see Problem 22 in Section (5.7) of [HJ]).

1.7.3 Example. Let A = [0 1; 1 0] and B = [1 0; 0 -1]. Then F(A) = F(B) = F(A)F(B) = the line segment joining -1 and 1, while F(AB) is the line segment joining -i and i. Also, F′(A) = F′(B) = F′(A)F′(B) = the real line, while F′(AB) = the imaginary axis. Note that r(AB) ≤ r(A)r(B) in this case, but that (1.7.1) still fails for angular reasons; F′(AB) is not contained in F′(A)F′(B) either.

1.7.4 Example. Let A = [1 0; 0 i]. Then F(A) is the line segment joining 1 and i, while F(A²) is the line segment joining -1 and 1. Thus, F(A²) is not even contained in F(A)², since 0 ∈ F(A²) and 0 ∉ F(A)². However, Co(F(A)²) = Co({-1, i, 1}), so F(A²) ⊆ Co(F(A)²).

One reason it is not surprising that (1.7.1) does not hold in general is that F(A)F(B) is not generally a convex set, whereas F(AB) is always convex. But, since F(A)F(B) is convex in the case of both examples (1.7.2) and (1.7.3), the inclusion

    F(AB) ⊆ Co(F(A)F(B)),  A, B ∈ M_n        (1.7.1a)

also fails to hold in general. Examples (1.7.2) and (1.7.3) also show that another weaker statement

    σ(AB) ⊆ F(A)F(B)        (1.7.1b)

fails to hold, too.

Product results when zero is not in the field of values of a factor

With a realization that the available results must be limited, we now turn to examining some of what can be said about the field of values and products of matrices. The importance of the condition that zero not be in the field of values of one of the factors should be noted at the outset. This means, for example, that the field of values may be rotated into the right half-plane, and links many of the results presented.

1.7.5 Observation. If A ∈ M_n is nonsingular, then F′(A^{-1}) = F′(A*).

Proof: Employing the congruence property (1.2.12), we have F′(A^{-1}) = F′(AA^{-1}A*) = F′(A*). □

Exercise. Verify that F(A*) = {z̄ : z ∈ F(A)} and, therefore, that F′(A^{-1}) = {z̄ : z ∈ F′(A)}.
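The numbers in Examples (1.7.2)-(1.7.3) can be checked by brute force: sampling x*Mx over a grid of unit vectors x ∈ C² produces points of F(M), and the largest sampled modulus approximates r(M) from below. The sketch below is illustrative only; the parametrization of the unit vectors and the grid size are our own arbitrary choices, not anything from the text.

```python
import cmath
import math

def fov_samples(M, steps=200):
    """Sample x*Mx over unit vectors x = (cos t, e^{is} sin t) in C^2.
    Every sampled point lies in F(M); the largest sampled modulus
    approaches the numerical radius r(M) as the grid is refined."""
    pts = []
    for i in range(steps + 1):
        t = 0.5 * math.pi * i / steps
        c, s = math.cos(t), math.sin(t)
        for j in range(steps):
            phase = cmath.exp(2j * math.pi * j / steps)
            x0, x1 = c, phase * s
            Mx0 = M[0][0] * x0 + M[0][1] * x1
            Mx1 = M[1][0] * x0 + M[1][1] * x1
            pts.append(x0.conjugate() * Mx0 + x1.conjugate() * Mx1)
    return pts

def num_radius(M, steps=200):
    return max(abs(p) for p in fov_samples(M, steps))

# Example (1.7.2): r(A) = r(B) = 1 but r(AB) = 4, so r(.) is not a matrix norm.
A = [[0, 2], [0, 0]]
B = [[0, 0], [2, 0]]
AB = [[4, 0], [0, 0]]   # the product AB, computed by hand
print(num_radius(A), num_radius(B), num_radius(AB))

# Example (1.7.3): there AB = [[0,-1],[1,0]], whose sampled field of values
# should lie on the imaginary axis (the segment joining -i and i).
AB2 = [[0, -1], [1, 0]]
print(max(abs(p.real) for p in fov_samples(AB2)))
```

With steps = 200 the grid happens to contain the maximizing vectors for these particular matrices, so the first line prints values within roundoff of 1, 1, and 4, confirming r(AB) = 4 > 1 = r(A)r(B); the second prints an essentially zero real part, consistent with F(AB) lying on the imaginary axis in Example (1.7.3).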
Although (1.7.1b) fails to hold, the following four facts indicate that certain limited statements can be made about the relationship between the spectrum of a product and the product of the fields of values.

1.7.6 Theorem. Let A, B ∈ M_n and assume that 0 ∉ F(B). Then σ(AB⁻¹) ⊆ F(A)/F(B).

Proof: Since 0 ∉ F(B), we know from the spectral containment property (1.2.6) that B is nonsingular. The set ratio F(A)/F(B) (which has the usual algebraic interpretation) makes sense and is bounded. If λ ∈ σ(AB⁻¹), then there is a nonzero vector z ∈ ℂⁿ such that AB⁻¹z = λz. Setting x = B⁻¹z, we have Ax = λBx, x*Ax = λx*Bx, and x*Bx ≠ 0; hence λ = x*Ax/x*Bx ∈ F(A)/F(B).  []

1.7.7 Corollary. Let A, B ∈ M_n. If B is positive semidefinite, then σ(AB) ⊆ F(A)F(B).

Proof: First suppose that B is nonsingular and set B = C⁻¹. If λ ∈ σ(AB) = σ(AC⁻¹), then by (1.7.6) we must have λ = a/c for some a ∈ F(A) and some c ∈ F(C). If β_min and β_max are the smallest and largest eigenvalues of B, then 1/β_min and 1/β_max are the largest and smallest eigenvalues of the positive definite matrix C, and F(C) is the interval [1/β_max, 1/β_min]. Therefore, c⁻¹ ∈ [β_min, β_max] = F(B) and λ = ac⁻¹ ∈ F(A)F(B). The case in which B is singular may be reduced to the nonsingular case by replacing B with B + εI, ε > 0, and then letting ε → 0.  []

1.7.8 Theorem. Let A, B ∈ M_n and assume that 0 ∉ F(B). Then σ(AB) ⊆ F'(A)F'(B).

Proof: Since 0 ∉ F(B), B is nonsingular and we may write B = C⁻¹. By (1.7.6), for each λ ∈ σ(AB) = σ(AC⁻¹) we have λ = a/c for some a ∈ F(A) ⊆ F'(A) and some c ∈ F(C). For some unit vector x ∈ ℂⁿ we have c = x*Cx = (Cx)*B*(Cx) = y*B*y ∈ F'(B*), where we have set y = Cx. Then the conjugate c̄ ∈ F'(B), c ≠ 0, and c⁻¹ = c̄/|c|² ∈ F'(B). Thus, λ = ac⁻¹ ∈ F'(A)F'(B).  []

1.7.9 Theorem. Let A ∈ M_n and λ ∈ ℂ be given with λ ≠ 0. The following are equivalent:
(a) λ ∈ F'(A);
(b) λ ∈ σ(HA) for some positive definite H ∈ M_n; and
(c) λ ∈ σ(C*AC) for some nonsingular C ∈ M_n.
Proof: Since H is positive definite if and only if H = CC* for some nonsingular C ∈ M_n, and since σ(CC*A) = σ(C*AC), it is clear that (b) and (c) are equivalent. To show that (c) implies (a), suppose that λ ∈ σ(C*AC) with an associated unit eigenvector y. Then λ = y*C*ACy = (Cy)*A(Cy) = x*Ax for x = Cy ≠ 0, and hence λ ∈ F'(A). Conversely, suppose λ = x*Ax. Then x ≠ 0 since λ ≠ 0, and we let C₁ ∈ M_n be any nonsingular matrix whose first column is x. The 1,1 entry of C₁*AC₁ is λ. Let αᵀ be the row vector formed by the remaining n-1 entries of the first row of C₁*AC₁ and let zᵀ = -αᵀ/λ. Then C₂ = [[1, zᵀ],[0, I]] ∈ M_n is nonsingular and the first row of C₂*(C₁*AC₁)C₂ is [λ 0 ⋯ 0]. Thus, λ ∈ σ(C*AC), where C = C₁C₂ is nonsingular.  []

It is now possible to characterize the so-called H-stable matrices, that is, those A ∈ M_n such that all eigenvalues of HA have positive real part for all positive definite matrices H.

1.7.10 Corollary. Let A ∈ M_n. Then σ(HA) ⊆ RHP for all positive definite H ∈ M_n if and only if F(A) ⊆ RHP ∪ {0} and A is nonsingular.

Exercise. Prove (1.7.10) using (1.7.9).

We have already seen the special role played by matrices for which zero is exterior to the field of values. Another example of their special nature is the following result, which amounts to a characterization.

1.7.11 Theorem. Let A ∈ M_n be nonsingular. The following are equivalent:
(a) A⁻¹A* = B⁻¹B* for some B ∈ M_n with 0 ∉ F(B);
(b) A is *congruent to a normal matrix;
(b') A is *congruent to a normal matrix via a positive definite congruence; and
(c) A⁻¹A* is similar to a unitary matrix.

Proof: Since A⁻¹A* is unitary if and only if (A⁻¹A*)⁻¹ = (A⁻¹A*)*, which holds if and only if A*A = AA*, we see that A⁻¹A* is unitary if and only if A is normal. The equivalence of (b) and (c) then follows directly from the calculation S⁻¹A⁻¹A*S = S⁻¹A⁻¹(S*)⁻¹S*A*S = (S*AS)⁻¹(S*AS)*; if S⁻¹A⁻¹A*S is unitary, then S*AS (a *congruence of A) must be normal, and conversely.
To verify that (c) implies (a), suppose that U = S⁻¹A⁻¹A*S is unitary, and let U^{1/2} be one of the (several) unitary square roots of U such that all eigenvalues of U^{1/2} lie on an arc of the unit circle of length less than π. By (1.2.9), 0 ∉ F(U^{-1/2}), and by (1.2.12) and (1.7.5), 0 ∉ F((S⁻¹)*U^{-1/2}S⁻¹), where U^{-1/2} = (U^{1/2})*. Now calculate

  A⁻¹A* = SUS⁻¹ = SU^{1/2}U^{1/2}S⁻¹ = SU^{1/2}S*(S⁻¹)*U^{1/2}S⁻¹ = B⁻¹B*

with B = (S⁻¹)*U^{-1/2}S⁻¹ and 0 ∉ F(B), as (a) asserts.

Assuming (a), we may suppose that 0 ∉ F(A), and then we may suppose, without loss of generality, that H(A) is positive definite by (1.3.5). We now show that (a) implies (b') by writing A = H + S, where H = H(A) = ½(A + A*) is positive definite and S = S(A) = ½(A - A*). If H^{-1/2} is the inverse of the unique positive definite square root of H, we have H^{-1/2}AH^{-1/2} = H^{-1/2}(H + S)H^{-1/2} = I + H^{-1/2}SH^{-1/2}, which is easily verified to be normal. Thus, A is *congruent to a normal matrix via the positive definite congruence H(A)^{-1/2}. Finally, (b') trivially implies (b).  []

We next relate angular information about the spectrum of A to the angular field of values of positive definite multiples of A. Compare the next result to (1.7.9).

1.7.12 Theorem. Let Γ be an open angular sector of ℂ anchored at the origin, with angle not greater than π, and let A ∈ M_n. The following are equivalent:
(a) σ(A) ⊆ Γ; and
(b) F'(HA) ⊆ Γ for some positive definite H ∈ M_n.

Proof: By (3.1.13) in [HJ] there is for each ε > 0 a nonsingular S_ε ∈ M_n such that S_ε⁻¹AS_ε is in modified Jordan canonical form: In place of every off-diagonal 1 that occurs in the Jordan canonical form of A is ε. Then, for sufficiently small ε, F(S_εAS_ε⁻¹) ⊆ Γ if σ(A) ⊆ Γ. But, by (1.2.12), we have F'(S_ε*S_εA) = F'(S_ε*(S_εAS_ε⁻¹)S_ε) = F'(S_εAS_ε⁻¹) ⊆ Γ. Letting H = S_ε*S_ε demonstrates that (a) implies (b). The proof that (b) implies (a) is similar. If F'(HA) ⊆ Γ, then F'(H^{1/2}AH^{-1/2}) = F'(H^{-1/2}(HA)H^{-1/2}) = F'(HA) ⊆ Γ, using (1.2.12) again, where H^{1/2} is the positive definite square root of H and H^{-1/2} is its inverse. But σ(A) = σ(H^{1/2}AH^{-1/2}) ⊆ F(H^{1/2}AH^{-1/2}) ⊆ F'(H^{1/2}AH^{-1/2}) ⊆ Γ, which completes the proof.  []
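Theorem (1.7.6) is easy to exercise numerically. In the sketch below (the random test matrices and helper code are our own, not from the text), B is built with a positive definite Hermitian part, which forces 0 ∉ F(B), and each eigenvalue of AB⁻¹ is then recovered as a ratio x*Ax/x*Bx of field-of-values points, exactly as in the proof:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
# B = (positive definite Hermitian part) + (skew-Hermitian part), so that
# H(B) > 0 and hence 0 is outside F(B), the hypothesis of Theorem 1.7.6.
B = G @ G.conj().T + n * np.eye(n) + (K - K.conj().T) / 2

evals, evecs = np.linalg.eig(A @ np.linalg.inv(B))
err = 0.0
for lam, z in zip(evals, evecs.T):
    x = np.linalg.solve(B, z)            # x = B^{-1} z, so Ax = lam * Bx
    ratio = (x.conj() @ A @ x) / (x.conj() @ B @ x)
    err = max(err, abs(lam - ratio))
```

Each ratio lies in F(A)/F(B) by construction, so `err` should vanish up to rounding.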
Simultaneous diagonalization by congruence

We now return to the notion of simultaneous diagonalization of matrices by congruence (*congruence), which was discussed initially in Section (4.5) of [HJ].

1.7.13 Definition. We say that A₁, A₂, …, A_m ∈ M_n are simultaneously diagonalizable by congruence if there is a nonsingular matrix C ∈ M_n such that each of C*A₁C, C*A₂C, …, C*A_mC is diagonal.

A notion that arises in the study of simultaneous diagonalization by congruence and links it to the field of values, especially to the topics studied in this section, is that of zeroes of the Hermitian form x*Ax.

1.7.14 Definition. A nonzero vector x ∈ ℂⁿ is said to be an isotropic vector for a given matrix A ∈ M_n if x*Ax = 0, and x is further said to be a common isotropic vector for A₁, …, A_m ∈ M_n if x*A_ix = 0, i = 1, …, m.

Two simple observations will be of use.

1.7.15 Replacement Lemma. Let α₁, …, α_m ∈ ℂ with α₁ ≠ 0 and A₁, …, A_m ∈ M_n be given. The matrices A₁, …, A_m are simultaneously diagonalizable by congruence if and only if the matrices α₁A₁ + α₂A₂ + ⋯ + α_mA_m, A₂, …, A_m are simultaneously diagonalizable by congruence.

Exercise. Perform the calculation necessary to verify (1.7.15).

If m = 2 and if A₁, A₂ ∈ M_n are Hermitian, we often study A = A₁ + iA₂ in order to determine whether A₁ and A₂ are simultaneously diagonalizable by congruence. Note that A₁ = H(A) and iA₂ = S(A).

1.7.16 Observation. Let A₁, A₂ ∈ M_n be Hermitian. Then A₁ and A₂ are simultaneously diagonalizable by congruence if and only if A = A₁ + iA₂ is *congruent to a normal matrix.

Proof: If A₁ and A₂ are simultaneously diagonalizable by congruence via C ∈ M_n, then C*AC = C*A₁C + iC*A₂C = D₁ + iD₂, where D₁ and D₂ are real diagonal matrices. The matrix C*AC is then diagonal and, thus, is normal. Conversely, suppose that a given nonsingular matrix B ∈ M_n is such that B*AB is normal. Then there is a unitary matrix U ∈ M_n such that U*B*ABU = D is diagonal.
Now set C = BU and notice that H(D) = C*H(A)C = C*A₁C is diagonal and that -iS(D) = -iS(C*AC) = C*A₂C is diagonal.  []

We may now apply (1.7.11) to obtain a classical sufficient condition for two Hermitian matrices to be simultaneously diagonalizable by congruence.

Exercise. If A₁, A₂ ∈ M_n are Hermitian and if A = A₁ + iA₂, show that 0 ∉ F(A) if and only if A₁ and A₂ have no common isotropic vector.

1.7.17 Theorem. If A₁, A₂ ∈ M_n are Hermitian and have no common isotropic vector, then A₁ and A₂ are simultaneously diagonalizable by congruence.

Proof: If A₁ and A₂ have no common isotropic vector, then 0 ∉ F(A), where A = A₁ + iA₂. Choosing B = A in (1.7.11a), we conclude from the equivalence of (1.7.11a) and (1.7.11b) that A is *congruent to a normal matrix. According to (1.7.16), then, A₁ and A₂ are simultaneously diagonalizable by congruence.  []

Other classical sufficient conditions for pairs of Hermitian matrices to be simultaneously diagonalizable by congruence follow directly from (1.7.17).

Exercise. Let A₁, A₂ ∈ M_n be given. Show that the set of common isotropic vectors of A₁ and A₂ is the same as the set of common isotropic vectors of α₁A₁ + α₂A₂ and A₂ if α₁ ≠ 0. Since a positive definite matrix has no isotropic vectors, conclude that A₁ and A₂ have no common isotropic vectors if some linear combination of A₁ and A₂ is positive definite.

1.7.18 Corollary. If A₁ ∈ M_n is positive definite and if A₂ ∈ M_n is Hermitian, then A₁ and A₂ are simultaneously diagonalizable by congruence.

1.7.19 Corollary. If A₁, A₂ ∈ M_n are Hermitian and if there is a linear combination of A₁ and A₂ that is positive definite, then A₁ and A₂ are simultaneously diagonalizable by congruence.

Exercise. Deduce (1.7.18) and (1.7.19) from (1.7.17), using (1.7.15) in the case of (1.7.19).

Exercise. Prove (1.7.18) directly, using Sylvester's law of inertia.
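Corollary (1.7.18) has a well-known constructive proof that is easy to code: congruence by A₁^{-1/2} reduces A₁ to I, and a unitary diagonalization of the Hermitian matrix A₁^{-1/2}A₂A₁^{-1/2} finishes the job. A sketch (the helper name `simdiag_congruence` and the random data are ours, not the text's):

```python
import numpy as np

def simdiag_congruence(A1, A2):
    # Returns C with C*A1C and C*A2C diagonal, assuming A1 positive
    # definite and A2 Hermitian (the hypotheses of Corollary 1.7.18).
    w, V = np.linalg.eigh(A1)
    R = V @ np.diag(w ** -0.5) @ V.conj().T      # R = A1^{-1/2}
    _, U = np.linalg.eigh(R @ A2 @ R)            # unitary diagonalizer
    return R @ U                                 # C = A1^{-1/2} U

rng = np.random.default_rng(7)
n = 4
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A1 = G @ G.conj().T + n * np.eye(n)              # Hermitian positive definite
Hm = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A2 = (Hm + Hm.conj().T) / 2                      # Hermitian

C = simdiag_congruence(A1, A2)
D1 = C.conj().T @ A1 @ C                         # = I, in particular diagonal
D2 = C.conj().T @ A2 @ C                         # diagonal
off = max(np.abs(D - np.diag(np.diag(D))).max() for D in (D1, D2))
```

Note the design point: the congruence C need not be unitary, which is why a single C can diagonalize both matrices even though A₁ and A₂ need not commute.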
Exercise. If A₁, A₂ ∈ M_n are Hermitian, show that there is a real linear combination of A₁ and A₂ that is positive definite if and only if A₁ and A₂ have no common isotropic vector. Hint: Choose 0 ≤ θ < 2π so that F(e^{iθ}A) ⊆ RHP, where A = A₁ + iA₂, and calculate H(e^{iθ}A) = (cos θ)A₁ - (sin θ)A₂.

1.7.20 Example. The simple example A₁ = A₂ = [[1, 0],[0, -1]] shows that the direct converses of (1.7.17), (1.7.18), and (1.7.19) do not hold.

Using (1.7.11) again, however, we can provide a converse to (1.7.17) that emphasizes the role of the condition "0 ∉ F(A)" in the study of simultaneous diagonalization by congruence. We conclude this discussion of simultaneous diagonalization by congruence for two Hermitian matrices by listing this converse (1.7.21e) along with other equivalent conditions.

1.7.21 Theorem. Let A₁, A₂ ∈ M_n be Hermitian and assume that A₁ and A = A₁ + iA₂ are nonsingular. The following are equivalent:
(a) A₁ and A₂ are simultaneously diagonalizable by congruence;
(b) A is diagonalizable by congruence;
(c) A is *congruent to a normal matrix;
(d) A⁻¹A* is similar to a unitary matrix;
(e) There is some B ∈ M_n with 0 ∉ F(B) such that A⁻¹A* = B⁻¹B*; and
(f) A₁⁻¹A₂ is similar to a real diagonal matrix.

Exercise. Prove (1.7.21). These equivalences are merely a compilation of previous developments. Observation (1.7.16) shows that (a) and (c) are equivalent, independent of the nonsingularity assumptions. Show that (b) and (c) are equivalent, independent of the nonsingularity assumptions, using the theory of normal matrices. Items (c), (d), and (e) are equivalent because of (1.7.11), and the equivalence of (a) and (f) was proven in Theorem (4.5.15) in [HJ]. Notice that the assumption of nonsingularity of A₁ is necessary only because of (f) and the assumption of nonsingularity of A is necessary only because of (d) and (e).
Exercise. Many conditions for two Hermitian matrices to be simultaneously diagonalizable by congruence can be thought of as generalizations of Corollary (1.7.18). Consider (1.7.21f) and show that if A₁ is positive definite, then A₁⁻¹A₂ has real eigenvalues and is diagonalizable by similarity.

Extensions of the Hermitian pair case to non-Hermitian matrices and to more than two matrices are developed in Problems 18 and 19.

Hadamard products

Recall that the Hadamard product A∘B is just the entrywise product of two matrices of the same size (see Chapter 5). For comparison, we state, without proof here, some results relating F(·) and the Hadamard product. Tools for the first of these (1.7.22) are developed at the end of Section (4.2). The remaining four results (1.7.23-26) may be deduced from (1.7.22).

1.7.22 Theorem. If A, N ∈ M_n and if N is normal, then F(N∘A) ⊆ Co(F(N)F(A)).

Proof: See Corollary (4.2.17).  []

1.7.23 Corollary. If A, H ∈ M_n and if H is positive semidefinite, then F(H∘A) ⊆ F(H)F(A).

Exercise. Deduce (1.7.23) from (1.7.22) using the facts that H is normal and F(H)F(A) is convex.

1.7.24 Corollary. If A, B ∈ M_n and if either A or B is normal, then r(A∘B) ≤ r(A)r(B).

… θ > 0, be a given eigenvalue of BA. Show that θ₁ ≤ θ ≤ θ_n and, if all e^{iθ_j} are contained in an arc of the unit circle of length less than or equal to π, show that θ is also contained in this arc.

9. Under the assumptions of (1.7.8), explain why one can prove the apparently better result σ(AB) ⊆ F(A)F'(B). Why isn't this better?

10. Theorem (1.7.8) shows that some angular information about σ(AB) may be gained in special circumstances. Note that this might be mated with the value of a matrix norm of A and B to give magnitudinal information also. Give an example.

11. If A = [a_ij] ∈ M_n has positive diagonal entries, define Â = [â_ij] by

  â_ij = (|a_ij| + |a_ji|)/(2√(a_ii a_jj)) if i ≠ j, and â_ij = 0 if i = j

Let Γ_A = {re^{iθ} : r > 0, -θ_A < θ < θ_A}, where θ_A = arcsin(ρ(Â)), 0 ≤ θ_A < π/2, and …

18. Let A₁, …, A_m ∈ M_n. Show that the following statements are equivalent:
(a) A₁, …,
A_m are simultaneously diagonalizable by congruence.
(b) A₁, …, A_m are simultaneously *congruent to commuting normal matrices.
(c) The 2m Hermitian matrices H(A₁), -iS(A₁), H(A₂), -iS(A₂), …, H(A_m), -iS(A_m) are simultaneously diagonalizable by congruence.

19. If A₁, …, A_{2m} ∈ M_n are Hermitian, show that A₁, …, A_{2m} are simultaneously diagonalizable by congruence if and only if A₁ + iA₂, A₃ + iA₄, …, A_{2m-1} + iA_{2m} are simultaneously diagonalizable by congruence.

20. Let a nonsingular A ∈ M_n be given. Explain why both factors in a polar decomposition A = PU are uniquely determined, where U ∈ M_n is unitary. Prove that F'(U) ⊆ F'(A).

21. Let A ∈ M_n be given with 0 ∉ F(A) and let A = PU be the polar factorization of A with U ∈ M_n unitary. Prove that U is a cramped unitary matrix, that is, the eigenvalues of U lie on an arc of the unit circle of length less than π. Also show that Θ(U) ≤ Θ(A), where Θ(·) is the field angle (1.1.3). Show that a unitary matrix V is cramped if and only if 0 ∉ F(V).

22. Let G ⊆ ℂ be a nonempty open convex set, and let A ∈ M_n be given. Show that σ(A) ⊆ G if and only if there is a nonsingular matrix S ∈ M_n such that F(S⁻¹AS) ⊆ G.

23. If A, B ∈ M_n are both normal with σ(A) = {α₁, …, α_n} and σ(B) = {β₁, …, β_n}, show that F(A∘B) is contained in the convex polygon determined by the points α_iβ_j for i, j = 1, …, n.

24. Let A ∈ M_n be given. Show that the following are equivalent:
(a) A* = S⁻¹AS for some S ∈ M_n with 0 ∉ F(S).
(b) A* = PAP⁻¹ for some positive definite P ∈ M_n.
(c) A is similar to a Hermitian matrix.
(d) A = PK for P, K ∈ M_n with P positive definite and K Hermitian.
Compare with (4.1.7) in [HJ].

25. Let A ∈ M_n be a given normal matrix. Show that there is an S ∈ M_n with 0 ∉ F(S) and A* = S⁻¹AS if and only if A is Hermitian.

26. Let A ∈ M_n be given. Show that there is a unitary U ∈ M_n with 0 ∉ F(U) and A* = U*AU if and only if A is Hermitian.
Consider A = [[0, 1],[0, 0]] and U = [[0, 1],[1, 0]] to show that the identity A* = U*AU for a general unitary U does not imply normality of A.

27. (a) If A ∈ M_n and 0 ∈ F(A), show that ‖A - zI‖₂ ≥ |z| for all z ∈ ℂ.
(b) Use (a) to show that ‖AB - BA - I‖₂ ≥ 1 for all A, B ∈ M_n; in particular, the identity matrix is not a commutator. See Problem 6 in Section (4.5) for a generalization of this inequality to any matrix norm.
(c) If A ∈ M_n and there is some z ∈ ℂ such that ‖A - zI‖₂ < |z|, show that 0 ∉ F(A).

28. Let nonsingular A, B ∈ M_n be given and let C = A⁻¹B⁻¹AB. If A and C are normal, AC = CA, and 0 ∉ F(B), show that AB = BA, that is, C = I. In particular, use Problem 21 to show that if A and B are unitary, AC = CA, and B is a cramped unitary matrix, then AB = BA. For an additive commutator analog of this result, see Problem 12 in Section (2.4) of [HJ].

Notes and Further Readings. The results (1.7.6-8) are based upon H. Wielandt, On the Eigenvalues of A + B and AB, J. Research N.B.S. 77B (1973), 61-63, which was redrafted from Wielandt's National Bureau of Standards Report #1367, December 27, 1951, by C. R. Johnson. Theorem (1.7.9) is from C. R. Johnson, The Field of Values and Spectra of Positive Definite Multiples, J. Research N.B.S. 78B (1974), 197-198, and (1.7.10) was first proved by D. Carlson in A New Criterion for Stability of Complex Matrices, Linear Algebra Appl. 1 (1968), 59-64. Theorem (1.7.11) is from C. R. DePrima and C. R. Johnson, The Range of A⁻¹A* in GL(n,ℂ), Linear Algebra Appl. 9 (1974), 209-222, and (1.7.12) is from C. R. Johnson, A Lyapunov Theorem for Angular Cones, J. Research National Bureau Standards 78B (1974), 7-10. The treatment of simultaneous diagonalization by congruence is a new exposition centered around the field of values, which includes some classical results such as (1.7.17).

1.8 Generalizations of the field of values

There is a rich variety of generalizations of the field of values, some of which have been studied in detail.
These generalizations emphasize various algebraic, analytic, or axiomatic aspects of the field of values, making it one of the most generalized concepts in mathematics. With no attempt to be complete, we mention, with occasional comments, several prominent or natural generalizations; there are many others. A natural question to ask about any generalized field of values is whether or not it is always convex: For some generalizations it is, and for some it is not. This gives further insight into the convexity (1.2.2) of the usual field of values, certainly one of its more subtle properties.

The first generalization involves a natural alteration of the inner product used to calculate each point of the field.

1.8.1 Generalized inner product: Let H ∈ M_n be a given positive definite matrix. For any A ∈ M_n, define

  F_H(A) ≡ {x*HAx : x ∈ ℂⁿ, x*Hx = 1}

1.8.1a Observation. Since H is positive definite, it can be written as H = S*S with S ∈ M_n nonsingular. Then F_H(A) = F(SAS⁻¹), so F_H(A) is just the usual field of values of a similarity of A, by a fixed similarity matrix.

Proof: F_H(A) = {x*S*SAx : x ∈ ℂⁿ, x*S*Sx = 1} = {y*SAS⁻¹y : y ∈ ℂⁿ, y*y = 1} = F(SAS⁻¹), where y = Sx.  []

Exercise. There are many different matrices S ∈ M_n such that S*S = H. Why does it not matter which is chosen? Hint: If T*T = S*S, show that ST⁻¹ is unitary and apply (1.2.8).

Exercise. Consider the fixed similarity generalization F_S(A) ≡ F(SAS⁻¹) for a fixed nonsingular S ∈ M_n. Show that F_S(·) = F_{S*S}(·) = F_H(·) for H = S*S, so that the fixed similarity and generalized inner product generalizations are the same.

Exercise. Why is F_H(A) always convex?

Exercise. Show that σ(A) ⊆ F_H(A) for any positive definite matrix H. Show further that ∩{F_H(A) : H ∈ M_n is positive definite} = Co(σ(A)). Hint: Use (2.4.7) in [HJ]. For which matrices A does there exist a positive definite H with F_H(A) = Co(σ(A))?
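Observation (1.8.1a) can be spot-checked numerically: a point of F_H(A) generated by an H-normalized vector x equals the point of F(SAS⁻¹) generated by y = Sx. A sketch (the random data and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = S.conj().T @ S                        # positive definite, H = S*S

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = x / np.sqrt((x.conj() @ H @ x).real)  # normalize so that x*Hx = 1
p_H = x.conj() @ H @ A @ x                # a point of F_H(A)

y = S @ x                                 # then y*y = x*Hx = 1
p_F = y.conj() @ (S @ A @ np.linalg.inv(S)) @ y   # same point of F(SAS^{-1})
```

The two computed points agree to rounding error, mirroring the one-line proof above.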
Another generalization of the field of values is motivated by the fact that x*Ax ∈ M₁ when x ∈ ℂⁿ, so the usual field is just the set of determinants of all such matrices for normalized x.

1.8.2 Determinants of unitary projections: For any m …

… For n ≥ 2, the function r_c(·) is a norm on M_n if and only if c₁ + ⋯ + c_n ≠ 0 and the scalars c_i are not all equal; this condition is clearly met for c = [1, 0, …, 0]ᵀ, in which case r_c(·) is the usual numerical radius r(·). It is known that if A, B ∈ M_n are Hermitian, then r_c(A) = r_c(B) for all c ∈ ℝⁿ if and only if A is unitarily similar to ±B.

A generalization of (1.8.4) is the C-field of values.

1.8.5 The C-field of values: Let C ∈ M_n be given. For any A ∈ M_n, define

  F^C(A) ≡ {tr(CU*AU) : U ∈ M_n is unitary}

Exercise. Show that F^{V₁*CV₁}(V₂*AV₂) = F^C(A) if V₁, V₂ ∈ M_n are unitary.

Exercise. If C ∈ M_n is normal, show that F^C(A) = F_c(A), where the vector c is the vector of eigenvalues of the matrix C. Thus, the C-field is a generalization of the c-field.

Exercise. Show that F^A(C) = F^C(A) for all A, C ∈ M_n. Deduce that F^C(A) is convex if either C or A is Hermitian, or, more generally, if either A or C is normal and has eigenvalues that are collinear as points in the complex plane.

Known properties of the c-field cover the issue of convexity of C-fields of values for normal C, but otherwise it is not known which pairs of matrices C, A ∈ M_n produce a convex F^C(A), nor for which C the set F^C(A) is convex for all A ∈ M_n.

Associated with the C-field of values are natural generalizations of the numerical radius, spectral norm, and spectral radius:

  |||A|||_C ≡ max {|tr(CUAV)| : U, V ∈ M_n are unitary}
  r_C(A) ≡ max {|tr(CUAU*)| : U ∈ M_n is unitary}
  ρ_C(A) ≡ max {|Σ_i λ_i(A)λ_{τ(i)}(C)| : τ is a permutation of 1, …, n}

where the eigenvalues of A and C are {λ_i(A)} and {λ_i(C)}, respectively. For any A, C ∈ M_n, these three quantities satisfy the inequalities

  ρ_C(A) ≤ r_C(A) ≤ |||A|||_C
Exercise. Show that it is unnecessary to consider F_q(·) for a q ∈ ℂ that is not nonnegative, for if q ∈ ℂ with |q| ≤ 1 and F_q(·) is defined analogously (y*x = q), then F_q(A) = φF_{|q|}(A), where φ ≡ q/|q|.

Exercise. Let q = 1 and show that F₁(A) = F(A) for all A. Thus, F₁(A) is always convex.

Exercise. Let q = 0 and show that F₀(A) is always a disc centered at the origin, and hence F₀(A) is always convex. Hint: If z ∈ F₀(A), show that e^{iθ}z ∈ F₀(A) for all θ ∈ [0, 2π). If w, z are two given points in F₀(A), show that they are connected by a continuous path lying in F₀(A) (any pair of orthonormal vectors can be rotated continuously into any other pair of orthonormal vectors). It follows that F₀(A) is an annulus centered at the origin. Finally, show that 0 ∈ F₀(A) by considering y*Ax for an eigenvector x of A and some y that is orthogonal to x.

Exercise. Show that F₀(A) is the set of 1,2 entries of matrices that are unitarily similar to A. The ordinary field of values may be thought of as the set of 1,1 entries of matrices that are unitarily similar to A.

N. K. Tsing has shown that F_q(A) is convex for all q ∈ [0, 1] and all A ∈ M_n, n ≥ 2.

Thus far, the classical field of values and all the generalizations of it we have mentioned have been objects that lie in two real dimensions, with the exception of the real field of values (1.8.7), which lies in one real dimension. Another generalization, sometimes called the shell, lies in three real dimensions in an attempt to capture more information about the matrix.

1.8.9 Definition. The Davis-Wielandt shell: For any A ∈ M_n, define

  DW(A) ≡ {[Re x*Ax, Im x*Ax, x*A*Ax]ᵀ : x ∈ ℂⁿ, x*x = 1}

Exercise. Let A ∈ M_n have eigenvalues {λ₁, …, λ_n}. If A is normal, show that DW(A) = Co({[Re λ_i, Im λ_i, |λ_i|²]ᵀ : i = 1, …, n}).

One motivation for the definition of the Davis-Wielandt shell is that the converse of the assertion in the preceding exercise is also true.
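For a normal (here diagonal) matrix, the unit eigenvectors generate the points [Re λ_i, Im λ_i, |λ_i|²]ᵀ whose convex hull the preceding exercise describes. A small numerical sketch (the `shell_point` helper and the sample matrix are our own choices):

```python
import numpy as np

A = np.diag([1.0 + 1.0j, -2.0, 0.5j])    # diagonal, hence normal

def shell_point(A, x):
    # One point of the Davis-Wielandt shell DW(A) generated by unit vector x.
    w = x.conj() @ A @ x
    return np.array([w.real, w.imag, (x.conj() @ (A.conj().T @ A) @ x).real])

e0 = np.eye(3)[:, 0].astype(complex)     # unit eigenvector for lam = 1 + i
pt = shell_point(A, e0)                  # the shell vertex [Re lam, Im lam, |lam|^2]
```

For λ = 1 + i this produces the vertex (1, 1, 2) of DW(A).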
This is in contrast to the situation for the classical field of values, for which the simple converse to the analogous property (1.2.8) is not valid, as discussed in Section (1.6).

There are several useful multi-dimensional generalizations of the field of values that involve more than one matrix. The first is suggested naturally by thinking of the usual field of values as an object that lies in two real dimensions. For any A ∈ M_n, write A = A₁ + iA₂, where A₁ = H(A) = (A + A*)/2 and A₂ = -iS(A) = -i(A - A*)/2 are both Hermitian. Then

  F(A) = {x*Ax : x ∈ ℂⁿ, x*x = 1} = {x*A₁x + ix*A₂x : x ∈ ℂⁿ, x*x = 1}

which (since x*A₁x and x*A₂x are both real) describes the same set in the plane as {(x*A₁x, x*A₂x) : x ∈ ℂⁿ, x*x = 1}. Thus, the Toeplitz-Hausdorff theorem (1.2.2) says that the latter set in ℝ² is convex for any two Hermitian matrices A₁, A₂ ∈ M_n, and we are led to the following generalizations of F(A) and FR(A).

1.8.10 Definition. The k-dimensional field of k matrices: Let k ≥ 1 be a given integer. For any A₁, …, A_k ∈ M_n, define

  FC_k(A₁, …, A_k) ≡ {[x*A₁x, …, x*A_kx]ᵀ : x ∈ ℂⁿ, x*x = 1} ⊆ ℂᵏ

For any A₁, …, A_k ∈ M_n(ℝ), define

  FR_k(A₁, …, A_k) ≡ {[xᵀA₁x, …, xᵀA_kx]ᵀ : x ∈ ℝⁿ, xᵀx = 1} ⊆ ℝᵏ

Notice that when k = 1, FC₁(A) = F(A) and FR₁(A) = FR(A). For k = 2, FC₂((A + A*)/2, -i(A - A*)/2) describes a set in a real two-dimensional subspace of ℂ² that is the same as F(A). If the matrices A₁, …, A_k are all Hermitian, then FC_k(A₁, …, A_k) ⊆ ℝᵏ ⊆ ℂᵏ. In considering FC_k, the case in which all of the k matrices are Hermitian is quite different from the general case in which the matrices are arbitrary, but in studying FR_k it is convenient to know that we always have FR_k(A₁, …, A_k) = FR_k((A₁ + A₁ᵀ)/2, …, (A_k + A_kᵀ)/2).

Exercise. For any Hermitian matrices A₁, A₂ ∈ M_n, note that FC₁(A₁) ⊆ ℝ¹ and FC₂(A₁, A₂) ⊆ ℝ² are convex sets. For any real symmetric matrix A₁ ∈ M_n(ℝ), note that FR₁(A₁) ⊆ ℝ is convex.

Exercise. For n = 2, consider the two real symmetric (Hermitian) matrices

  A₁ = [[1, 0],[0, -1]] and A₂ = [[0, 1],[1, 0]]

Show that FR₂(A₁, A₂) is not convex but that FC₂(A₁, A₂) is convex.
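The exercise above can be checked numerically with the stated pair A₁, A₂: real unit vectors trace out exactly the unit circle (so FR₂ misses the origin and is not convex), while a suitable complex unit vector reaches the origin, which does lie in FC₂. A sketch (our own code, not the text's):

```python
import numpy as np

A1 = np.array([[1.0, 0.0], [0.0, -1.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])

# Real unit vectors x = (cos t, sin t) generate the points of FR_2(A1, A2).
t = np.linspace(0.0, np.pi, 400)
X = np.stack([np.cos(t), np.sin(t)], axis=1)
pts = np.stack([(x @ A1 @ x, x @ A2 @ x) for x in X])
radii = np.hypot(pts[:, 0], pts[:, 1])   # identically 1: the unit circle

# A complex unit vector that reaches the origin, so 0 is in FC_2 but not FR_2:
z = np.array([1.0, 1j]) / np.sqrt(2.0)
p = (z.conj() @ A1 @ z, z.conj() @ A2 @ z)
```

Since (xᵀA₁x, xᵀA₂x) = (cos 2t, sin 2t) for real x, the real field is the circle; the complex vector z above kills both quadratic forms at once.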
Compare the latter set with F(A₁ + iA₂). Hint: Consider the points in FR₂(A₁, A₂) generated by x = [1, 0]ᵀ and x = [0, 1]ᵀ. Is the origin in FR₂(A₁, A₂)?

The preceding exercise illustrates two special cases of what is known generally about convexity of the k-dimensional fields.

Convexity of FR_k: For n = 1, FR_k(A₁, …, A_k) is a single point and is therefore convex for all A₁, …, A_k ∈ M₁(ℝ) and all k ≥ 1. For n = 2, FR₁(A) is convex for every A ∈ M₂(ℝ), but for every k ≥ 2 there are matrices A₁, …, A_k ∈ M₂(ℝ) such that FR_k(A₁, …, A_k) is not convex. For n ≥ 3, FR₁(A₁) and FR₂(A₁, A₂) are convex for all A₁, A₂ ∈ M_n(ℝ), but for every k ≥ 3 there are matrices A₁, …, A_k ∈ M_n(ℝ) such that FR_k(A₁, …, A_k) is not convex.

Convexity of FC_k for Hermitian matrices: For n = 1, FC_k(A₁, …, A_k) is a single point and is therefore convex for all Hermitian A₁, …, A_k ∈ M₁ and all k ≥ 1. For n = 2, FC₁(A₁) and FC₂(A₁, A₂) are convex for all Hermitian A₁, A₂ ∈ M₂, but for every k ≥ 3 there are Hermitian matrices A₁, …, A_k ∈ M₂ such that FC_k(A₁, …, A_k) is not convex. For n ≥ 3, FC₁(A₁), FC₂(A₁, A₂), and FC₃(A₁, A₂, A₃) are convex for all Hermitian A₁, A₂, A₃ ∈ M_n, but for every k ≥ 4 there are Hermitian matrices A₁, …, A_k ∈ M_n such that FC_k(A₁, …, A_k) is not convex.

The definition (1.1.1) for the field of values carries over without change to matrices and vectors with quaternion entries, but there are some surprising new developments in this case. Just as a complex number can be written as z = a₁ + ia₂ with a₁, a₂ ∈ ℝ and i² = -1, a quaternion can be written as ζ = a₁ + ia₂ + ja₃ + ka₄ with a₁, a₂, a₃, a₄ ∈ ℝ,

  i² = j² = k² = -1, ij = -ji = k, jk = -kj = i, and ki = -ik = j

The conjugate of the quaternion ζ = a₁ + ia₂ + ja₃ + ka₄ is ζ̄ = a₁ - ia₂ - ja₃ - ka₄; its absolute value is |ζ| = (a₁² + a₂² + a₃² + a₄²)^{1/2}; its real part is Re ζ = a₁. The set of quaternions, denoted by Q, is a division ring (noncommutative field) in which the inverse of a nonzero quaternion ζ is given by ζ⁻¹ = ζ̄/|ζ|².
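The quaternion arithmetic just described can be modeled concretely by the standard embedding of the quaternions into M₂(ℂ), ζ = a₁ + ia₂ + ja₃ + ka₄ ↦ [[a₁ + ia₂, a₃ + ia₄],[-a₃ + ia₄, a₁ - ia₂]], under which |ζ|² appears as a determinant. A sketch verifying the multiplication table (this 2-by-2 complex representation is a standard device, not something used in the text):

```python
import numpy as np

I2 = np.eye(2)
qi = np.array([[1j, 0], [0, -1j]])               # the quaternion unit i
qj = np.array([[0, 1], [-1, 0]], dtype=complex)  # the quaternion unit j
qk = np.array([[0, 1j], [1j, 0]])                # the quaternion unit k

def quat(a1, a2, a3, a4):
    # a1 + i a2 + j a3 + k a4 as a 2x2 complex matrix.
    return a1 * I2 + a2 * qi + a3 * qj + a4 * qk

# |zeta|^2 = a1^2 + a2^2 + a3^2 + a4^2 is the determinant of the image:
zeta = quat(1.0, 2.0, 3.0, 4.0)
abs2 = np.linalg.det(zeta).real                  # 1 + 4 + 9 + 16 = 30
```

Matrix multiplication then reproduces ij = k, jk = i, ki = j and i² = j² = k² = -1, including the noncommutativity ij = -ji.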
The quaternions may be thought of as lying in four real dimensions, and the real and complex fields may be thought of as subfields of Q in a natural way. We denote by M_n(Q) the set of n-by-n matrices with quaternion entries and write Qⁿ for the set of n-vectors with quaternion entries; for x ∈ Qⁿ, x* denotes the transpose of the entrywise conjugate of x.

1.8.11 Definition. The quaternion field of values: For any A ∈ M_n(Q), define

  FQ(A) ≡ {x*Ax : x ∈ Qⁿ and x*x = 1}

Although the quaternion field of values FQ(A) shares many properties with the complex field of values, it need not be convex even when A is a normal complex matrix. If we set

  A₁ = [[i, 0, 0],[0, 1, 0],[0, 0, 1]] and A₂ = [[i, 0, 0],[0, i, 0],[0, 0, 1]]

then FQ(A₁) is not convex but FQ(A₂) is convex; in the classical case, F(A₁) and F(A₂) are identical. It is known that, for a given A ∈ M_n(Q), FQ(A) is convex if and only if

  {Re ζ : ζ ∈ FQ(A)} = {ζ : ζ ∈ FQ(A) and ζ = Re ζ}

that is, if and only if the projection of FQ(A) onto the real axis is the same as the intersection of FQ(A) with the real axis.

Notes and Further Readings. The generalizations of the field of values that we have mentioned in this section are a selection of only a few of the many possibilities. Several other generalizations of the field of values are mentioned with references in [Hal 67]. Generalization (1.8.1) was studied by W. Givens in Fields of Values of a Matrix, Proc. Amer. Math. Soc. 3 (1952), 206-209. Some of the generalizations of the field of values discussed in this section are the objects of current research by workers in the field such as M. Marcus and his students. The convexity results in (1.8.4) are in R. Westwick, A Theorem on Numerical Range, Linear Multilinear Algebra 2 (1975), 311-315, and in Y. Poon, Another Proof of a Result of Westwick, Linear Multilinear Algebra 9 (1980), 35-37. The converse is in Y.-H. Au-Yeung and N.-K.
Tsing, A Conjecture of Marcus on the Generalized Numerical Range, Linear Multilinear Algebra 14 (1983), 235-239. The fact that the c-field of values is star-shaped is in N.-K. Tsing, On the Shape of the Generalized Numerical Ranges, Linear Multilinear Algebra 10 (1981), 173-182. For a survey of results about the c-field of values and the numerical radius, with an extensive bibliography, see C.-K. Li and N.-K. Tsing, Linear Operators that Preserve the c-Numerical Range or Radius of Matrices, Linear Multilinear Algebra 23 (1988), 27-46. For a discussion of the generalized spectral norm, generalized numerical radius, and generalized spectral radius introduced in Section (1.8.5), see C.-K. Li, T.-Y. Tam, and N.-K. Tsing, The Generalized Spectral Radius, Numerical Radius and Spectral Norm, Linear Multilinear Algebra 16 (1984), 215-237. The convexity of F_q(A) mentioned in (1.8.8) is shown in N.-K. Tsing, The Constrained Bilinear Form and the C-Numerical Range, Linear Algebra Appl. 56 (1984), 195-206. The shell generalization (1.8.9) was considered independently and in alternate forms in C. Davis, The Shell of a Hilbert Space Operator, Acta Sci. Math. (Szeged) 29 (1968), 69-86, and in H. Wielandt, Inclusion Theorems for Eigenvalues, U.S. Department of Commerce, National Bureau of Standards, Applied Mathematics Series 29 (1953), 75-78. For more information about FR_k(A₁, …, A_k), see L. Brickman, On the Field of Values of a Matrix, Proc. Amer. Math. Soc. 12 (1961), 61-66. For a proof that FC₃(A₁, A₂, A₃) is convex when A₁, A₂, A₃ ∈ M_n are Hermitian and n ≥ 3, and for many related results, see P. Binding, Hermitian Forms and the Fibration of Spheres, Proc. Amer. Math. Soc. 94 (1985), 581-584. Also see Y.-H. Au-Yeung and N.-K. Tsing, An Extension of the Hausdorff-Toeplitz Theorem on the Numerical Range, Proc. Amer. Math. Soc. 89 (1983), 215-218. For references to the literature and proofs of the assertions made about the quaternion field of values FQ(A), see Y.-H.
Au-Yeung, On the Convexity of Numerical Range in Quaternionic Hilbert Spaces, Linear Multilinear Algebra 16 (1984), 93-100. There is also a discussion of the quaternion field of values in the 1951 paper by R. Kippenhahn cited at the end of Section (1.6), but the reader is warned that Kippenhahn's basic Theorem 36 is false: the quaternion field of values is not always convex. There are strong links between the field of values and Lyapunov's theorem (2.2). Some further selected readings for Chapter 1 are: C. S. Ballantine, Numerical Range of a Matrix: Some Effective Criteria, Linear Algebra Appl. 19 (1978), 117-188; C. R. Johnson, Computation of the Field of Values of a 2-by-2 Matrix, J. Research National Bureau Standards 78B (1974), 105-107; C. R. Johnson, Numerical Location of the Field of Values, Linear Multilinear Algebra 3 (1975), 9-14; C. R. Johnson, Numerical Ranges of Principal Submatrices, Linear Algebra Appl. 37 (1981), 23-34; F. Murnaghan, On the Field of Values of a Square Matrix, Proc. National Acad. Sci. U.S.A. 18 (1932), 246-248; B. Saunders and H. Schneider, A Symmetric Numerical Range for Matrices, Numer. Math. 26 (1976), 99-105; O. Taussky, A Remark Concerning the Similarity of a Finite Matrix A and A*, Math. Z. 117 (1970), 189-190; C. Zenger, Minimal Subadditive Inclusion Domains for the Eigenvalues of Matrices, Linear Algebra Appl. 17 (1977), 233-268.

Chapter 2 Stable matrices and inertia

2.0 Motivation

The primary motivation for the study of stable matrices (matrices whose eigenvalues have negative real parts) stems from a desire to understand stability properties of equilibria of systems of differential equations.
Ques- tions about such equilibria arise in a variety of forms in virtually every icipline to which mathematics is applied—physical sciences, engineering, economics, biological sciences, etc, In each of these fields it is necessary to study the dynamics of systems whose state changes, according to some rule, over time and, in particular, to ask questions about the long-term behavior of the system. ‘Thus, matrix stability, an initial tool in the study of such questions, is an important topic that has been a major area in the interplay ‘between the applications and evolution of matrix theory. 2.0.1 Stability of equilibria for systems of linear, constant coefficient, ordinary differential equations Consider the first-order linear constant coefficient system of n ordinary differential equations: = Ale()-2) (2.01.1) where A€ M,(R) and x(f), Z€R®. It is clear that if 2(#) = Zat some time ¢, 2(t) will cease changing at {= %. ‘Thus, 2 is called an equilibrium for this system. If A is nonsingular, z(f) will cease changing only when it has reached the equilibrium %. Central questions about such a system are: 90 Stable matrices and inertia (a) Will (9) converge to 2s ta, given an initial point 2 (0)? (b) More importantly, will 2(f) converge to # for all choices of the initial ata 2(0)? If 2(£) converges to's as t—+= for every choice of the initial data =(0), the equilibrium 2is said to be globally stable. It is not difficult to see that the equilibrium is globally stable if and only if each eigenvalue of A has negative real part. A matriz A satisfying this condition is called a stable matrix. If we define et by the power series te SL AA ets Yh ae =) ‘then e4* ig well defined (that is, the series converges) for all ¢ and all A (see Chapter 5 of [BJ}) and de4*= AeAt, The unique solution 2(¢) to our differential equation may then be written as 2(i) = eAt[2(0)-a] +2 (2.0.1.2) Brercise. Verify that this function 2(t) satisfies the differential equation (2.0.1.1). Bvercise. 
Just as e^{at} → 0 as t → ∞ if and only if the complex scalar a satisfies Re a < 0, show that e^{At} → 0 as t → ∞ if and only if each eigenvalue λ of A satisfies Re λ < 0. Conclude that (2.0.1.2) satisfies x(t) → x̄ as t → ∞ for all choices of the initial data x(0) if and only if A is stable. Hint: How are the eigenvalues of e^A related to those of A? If e^{At} → 0 as t → ∞, t ∈ R, then (e^A)^k → 0 as k → ∞, k = 1, 2,.... Now use the criterion in Theorem (5.6.12) in [HJ] for a matrix to be convergent. Conversely, |||e^{At}||| = |||e^{A[t]}e^{A(t)}||| ≤ |||(e^A)^{[t]}||| |||e^{A(t)}||| ≤ c|||(e^A)^{[t]}||| for any matrix norm |||·|||, where c bounds |||e^{A(t)}|||, [t] denotes the greatest integer in t, and (t) = t - [t] is the fractional part of t. If A is stable, |||(e^A)^{[t]}||| → 0 as t → ∞.

2.0.2  Local stability of equilibria for a nonlinear system of ordinary differential equations

Stable matrices are also relevant to the study of more general systems of n not necessarily linear ordinary differential equations

    x'(t) = f(x(t) - x̄)    (2.0.2.1)

where each component of f: R^n → R^n is assumed to have continuous partial derivatives and f(x) = 0 if and only if x = 0. Again, x(t) stops changing if and only if it reaches the equilibrium x̄, and questions about the stability of this equilibrium arise naturally. Now, however, there are many possible notions of stability, not all of which may be addressed with matrix theory alone. One that may be so addressed is a notion of local stability. Suppose, instead of global stability, we ask whether x(t) converges to x̄ for every choice of the initial value x(0) that is sufficiently close to x̄, that is, will the system converge to equilibrium after small perturbations from equilibrium? The notions of "small" and "sufficiently close" can be made precise, and, by appeal to Taylor's theorem, the system (2.0.2.1) may be replaced by the linear system

    x'(t) = J_f(0)[x(t) - x̄]

in some neighborhood of x̄, without altering the qualitative dynamical properties of the original system. Here J_f(0) = [∂f_i/∂x_j(0)] is the n-by-n Jacobian of f evaluated at the origin.
Thus, x(t) → x̄ for all initial values x(0) in this neighborhood (x̄ is said to be a locally stable equilibrium) if and only if all eigenvalues of the Jacobian J_f(0) have negative real parts, that is, J_f(0) is a stable matrix.

2.1  Definitions and elementary observations

We begin the study of stable matrices by considering the general notion of inertia for matrices in M_n.

2.1.1  Definition. If A ∈ M_n, define:

    i_+(A) = the number of eigenvalues of A, counting multiplicities, with positive real part;
    i_-(A) = the number of eigenvalues of A, counting multiplicities, with negative real part; and
    i_0(A) = the number of eigenvalues of A, counting multiplicities, with zero real part.

Then i_+(A) + i_-(A) + i_0(A) = n, and the row vector i(A) = [i_+(A), i_-(A), i_0(A)] is called the inertia of A. This generalizes the notion of the inertia of a Hermitian matrix defined in Section (4.5) of [HJ].

Exercise. Show that

    i(-A)^T = [0 1 0; 1 0 0; 0 0 1] i(A)^T

For Hermitian matrices A, B ∈ M_n, Sylvester's law of inertia (Theorem (4.5.8) in [HJ]) says that B is *congruent to A if and only if i(A) = i(B).

Exercise. For a Hermitian A ∈ M_n, define positive and negative definite and semidefinite in terms of i(A).

Although the motivation of this chapter is the study of stable matrices, for reasons purely of mathematical convenience we shall speak in terms of matrices all of whose eigenvalues have positive real parts and use the term positive stable to avoid confusion. There is no substantive difference if the theory is developed this way.

2.1.2  Definition. A matrix A ∈ M_n is said to be positive stable if i(A) = [n, 0, 0], that is, if i_+(A) = n.

Exercise. Show that inertia, and therefore stability, is similarity invariant.

Exercise. Show that A ∈ M_n is positive stable if and only if -A is stable in the sense arising in the study of differential equations mentioned in Section (2.0); thus any theorem about positive stable matrices can be translated to one about stable matrices by appropriate insertion of minus signs.
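The inertia of a general matrix can be computed directly from its spectrum. A minimal numpy sketch (illustrative only; the tolerance used to decide "zero real part" is an assumption, since floating-point eigenvalues are rarely exactly purely imaginary):

```python
import numpy as np

def inertia(A, tol=1e-10):
    """Return i(A) = [i_+, i_-, i_0], counting eigenvalues of A
    (with multiplicity) by the sign of their real parts."""
    re = np.linalg.eigvals(np.asarray(A, dtype=complex)).real
    i_plus = int(np.sum(re > tol))
    i_minus = int(np.sum(re < -tol))
    i_zero = len(re) - i_plus - i_minus
    return [i_plus, i_minus, i_zero]

def is_positive_stable(A):
    # Definition (2.1.2): i(A) = [n, 0, 0]
    n = np.asarray(A).shape[0]
    return inertia(A) == [n, 0, 0]

A = np.array([[1.0, 5.0], [0.0, 2.0]])   # eigenvalues 1 and 2
print(inertia(A))        # [2, 0, 0]
print(inertia(-A))       # [0, 2, 0]: i_+ and i_- swap under negation
print(is_positive_stable(A))   # True
```

The last two lines illustrate the exercise above: negating A exchanges i_+ and i_- while leaving i_0 fixed.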
Exercise. If A ∈ M_n is positive stable, show that A is nonsingular.

If A is positive stable, then so are many matrices simply related to A.

2.1.3  Observation. If A ∈ M_n is positive stable, then each of the following is positive stable:

(a) aA + bI, a ≥ 0, b ≥ 0, a + b > 0,
(b) A^T,
(c) A*, and
(d) A^{-1}.

Exercise. Prove the four assertions in Observation (2.1.3).

2.1.4  Observation. If A ∈ M_n(R) is positive stable, then det A > 0.

Proof: Since A is real, any complex eigenvalues occur in conjugate pairs whose product contributes positively to the determinant, and, since A is positive stable, any real eigenvalues are positive and hence their product contributes positively to the determinant. □

Exercise. Let A ∈ M_n(R) be positive stable. Show that A^k has positive determinant for all integers k.

Exercise. (a) Show by example that the converse of (2.1.4) is false. (b) If A is positive stable, show by example that A² need not be.

Exercise. If A ∈ M_n is positive stable, show that Re tr A > 0; if A ∈ M_n(R), show that tr A > 0.

There is, of course, considerable interplay between the root location problem of determining conditions under which zeroes of a polynomial lie in some given region, such as the right half of the complex plane, and the problem of determining whether a matrix is positive stable. We will not explore this interplay fully in this chapter, but we note two obvious facts here.
All zeroes of the polynomial p(·) lie in the right half-plane if and only if the companion matrix associated with p(·) is positive stable, and a matrix is positive stable if and only if all zeroes of its characteristic polynomial lie in the right half-plane. Thus, conditions for the zeroes of a polynomial to lie in the right half-plane and conditions for positive stability of a matrix may be used interchangeably, where convenient. Also, since the coefficients of the characteristic polynomial of a matrix are, up to sign, just the sums of principal minors of a given size, polynomial conditions may often be interpreted in terms of principal minor sums; see (1.2.9-12) in [HJ].

Problems

1. If A ∈ M_n is nonsingular and has all real eigenvalues, show that A² is positive stable.

2. Show that A ∈ M₂(R) is positive stable if and only if tr A > 0 and det A > 0.

3. Let A ∈ M_n(R) be a given real positive stable matrix with characteristic polynomial p_A(t) = t^n + c₁t^{n-1} + c₂t^{n-2} + ··· + c_{n-1}t + c_n. Show that the coefficients c_k are all real, nonzero, and strictly alternate in sign: c₁ < 0, c₂ > 0, c₃ < 0,.... Conclude that E_k(A), the sum of all the k-by-k principal minors of the real positive stable matrix A, is positive, k = 1,..., n. See (1.2.9-12) in [HJ], where it is shown that E_k(A) is equal to the sum of all the (n choose k) possible products of k eigenvalues of A.

4. Suppose A ∈ M_n is positive stable. Show that A^k is positive stable for all positive integers k (in fact, all integers) if and only if the eigenvalues of A are real.

5. Recall that a Hermitian matrix has positive leading principal minors if and only if it is positive definite. Show by example that positive leading principal minors are neither necessary nor sufficient for positive stability, even for real matrices.

6. If A ∈ M_n is partitioned as

    A = [A₁₁ A₁₂; 0 A₂₂]

in which A_ii ∈ M_{n_i}, i = 1, 2, and n₁ + n₂ = n, show that i(A) = i(A₁₁) + i(A₂₂).

7. Let A ∈ M_n(R) be given with n ≥ 2.
Show by a diagonal counterexample that the converse to the result of Problem 3 is false; that is, it is possible for A to have all its principal minor sums E_k(A) positive but not be positive stable. What is the smallest value of n for which this can happen? Why? If E_k(A) > 0 for k = 1,..., n, there is something useful that can be said about the location of σ(A): Show that all the eigenvalues of A lie in the open angular wedge W_n = {z = re^{iθ} ∈ C: r > 0, |θ| < π - π/n}.

2.2  Lyapunov's theorem

In the scalar case n = 1, ga + āg = 2g Re a, which is positive for g > 0 if and only if Re a > 0.

2.2.1  Theorem (Lyapunov). Let A ∈ M_n be given. Then A is positive stable if and only if there exists a positive definite G ∈ M_n such that

    GA + A*G = H    (2.2.2)

is positive definite. Furthermore, suppose there are Hermitian matrices G, H ∈ M_n that satisfy (2.2.2), and suppose H is positive definite; then A is positive stable if and only if G is positive definite.

Proof: For any nonsingular S ∈ M_n we may perform a *congruence of the identity (2.2.2) and obtain S*GS S⁻¹AS + S*A*(S*)⁻¹ S*GS = S*HS, which may be rewritten as

    G̃Ã + Ã*G̃ = H̃    (2.2.3)

in which G̃ = S*GS, Ã = S⁻¹AS, and H̃ = S*HS. That is, *congruence applied to (2.2.2) replaces G and H by *congruent Hermitian matrices G̃ and H̃ and replaces A by a matrix Ã that is similar to A. Recall that *congruence preserves inertia (and therefore positive definiteness) of Hermitian matrices, and, of course, similarity always preserves eigenvalues. Thus, any one of the three matrices G, H, or A in (2.2.2) may be assumed to be in any special form achievable by simultaneous *congruence (for G and H) and similarity (for A).

Now assume that A is positive stable and that A is in the modified Jordan canonical form in which any nonzero super-diagonal entries are all equal to ε > 0, where ε is chosen to be less than the real part of every eigenvalue of A; see Corollary (3.1.13) in [HJ].
Then set G = I ∈ M_n and observe that the resulting matrix GA + A*G = A + A* = 2H(A) = H ∈ M_n in (2.2.2) is Hermitian (even tridiagonal, real, symmetric), has positive diagonal entries (twice the real parts of the eigenvalues of A), and is strictly diagonally dominant (see Theorem (6.1.10b) in [HJ]). Thus, G and H are positive definite and the forward implication of the first assertion is proved.

Now suppose there are positive definite matrices G and H ∈ M_n that satisfy (2.2.2). Then write GA = B, so that B + B* = H is positive definite. Let G^{1/2} be the unique positive definite square root of G and let G^{-1/2} be its inverse (see Theorem (7.2.6) in [HJ]). Multiplication on both left and right by G^{-1/2} yields G^{-1/2}BG^{-1/2}, for which G^{-1/2}BG^{-1/2} + (G^{-1/2}BG^{-1/2})* = G^{-1/2}HG^{-1/2} is positive definite. Since G^{-1/2}BG^{-1/2} = G^{1/2}AG^{-1/2} therefore has positive definite Hermitian part (see the preceding exercise), both it and A (to which it is similar) are positive stable. This verifies the backward implication of both assertions.

To verify the last necessary implication, let A, G₁ ∈ M_n be such that A is positive stable, G₁ is Hermitian, and G₁A + A*G₁ = H₁ is positive definite. Suppose that G₁ is not positive definite. By the forward implication already proved, there is a positive definite G₂ ∈ M_n such that G₂A + A*G₂ = H₂ is also positive definite. Set G₁A = B₁ and G₂A = B₂, so that B₁ = ½(H₁ + S₁) and B₂ = ½(H₂ + S₂), with S₁, S₂ ∈ M_n skew-Hermitian. Consider the Hermitian matrix function G(α) = αG₁ + (1-α)G₂ for α ∈ [0,1]. All the eigenvalues of G(0) = G₂ are positive, while at least one eigenvalue of G(1) = G₁ is not positive. Since the eigenvalues of G(α) are always real and depend continuously on α, there is some α₀ ∈ (0,1] such that at least one eigenvalue of G(α₀) is zero, that is, G(α₀) is singular. Notice that

    G(α₀)A = α₀B₁ + (1-α₀)B₂ = ½α₀H₁ + ½(1-α₀)H₂ + ½α₀S₁ + ½(1-α₀)S₂

has positive definite Hermitian part.
This is a contradiction: A matrix with positive definite Hermitian part is nonsingular (its field of values lies in the open right half-plane), while G(α₀)A is singular because G(α₀) is singular. Thus, the original Hermitian matrix G₁ could not fail to be positive definite and the proof is complete. □

The equation (2.2.2) is often referred to as the Lyapunov equation, and a matrix G for which GA + A*G is positive definite is called a Lyapunov solution of the Lyapunov equation. The basic Lyapunov theorem may be refined to indicate the attainable right-hand sides in (2.2.2) and give a more definitive procedure to test for the stability of A ∈ M_n.

Note that (2.2.2) is a system of linear equations for the entries of G. In Chapter 4 the linear matrix equation (2.2.2) will be studied and we shall see (Theorem (4.4.6) and Corollary (4.4.7)) that (2.2.2) has a unique solution G for any given right-hand side H if and only if σ(A*) ∩ σ(-A) = ∅, a condition that is certainly fulfilled if A is positive stable. Thus, as a consequence of general facts about matrix equations, we have the first assertion of the following theorem:

2.2.3  Theorem. Let A ∈ M_n be positive stable. For each given H ∈ M_n, there is a unique solution G to (2.2.2). If a given right-hand side H is Hermitian, then the corresponding solution G is necessarily Hermitian; if H is positive definite, then G is also positive definite.

Proof: The assertion "H is Hermitian implies that G is Hermitian" follows from the first (uniqueness) assertion, for if G is the solution to GA + A*G = H, then H = H* = (GA + A*G)* = G*A + A*G*, and hence G* = G. The assertion "H is positive definite implies that G is positive definite" follows from the last part of (2.2.1). □

Exercise. Let A ∈ M_n be positive stable.
According to (2.2.1) and (2.2.3), the solution G of (2.2.2) is positive definite for each given positive definite H. Show by example, however, that a positive definite G need not produce a positive definite H. Hint: Choose a positive stable A such that H(A) = ½(A + A*) is not positive definite, showing that G = I qualifies.

Thus, each positive stable A induces, via Lyapunov's theorem, a function G_A(·): M_n → M_n given by G_A(H) = the unique solution G to GA + A*G = H. This function is evidently one-to-one, for if G_A(H₁) = G_A(H₂), then

    H₁ = G_A(H₁)A + A*G_A(H₁) = G_A(H₂)A + A*G_A(H₂) = H₂

In particular, the function G_A(·) establishes a one-to-one correspondence between the set of positive definite matrices and a (typically proper) subset of itself. When A is diagonalizable, there is a simple explicit formula for the function G_A(·); see Problem 3.

Because of (2.2.1) and (2.2.3), a common form in which Lyapunov's theorem is often seen is the following:

2.2.4  Corollary. Let A ∈ M_n be given. Then A is positive stable if and only if there is a positive definite matrix G satisfying the equation

    GA + A*G = I    (2.2.5)

If A is positive stable, there is precisely one solution G to this equation, and G is positive definite. Conversely, if for a given A ∈ M_n there exists a positive definite solution G, then A is positive stable and G is the unique solution to the equation.

Exercise. If A ∈ M_n is positive stable, show that σ(A*) ∩ σ(-A) = ∅.

Exercise. Prove Corollary (2.2.4).

Exercise. Show that (G 3] is positive stable by solving equation (2.2.5) and checking the solution for positive definiteness.

A definitive test for the positive stability of a given matrix A ∈ M_n may now be summarized as follows: Choose your favorite positive definite matrix H (perhaps H = I), attempt to solve the linear system (2.2.2) for G, and if there is a solution, check G for positive definiteness.
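This test is easy to sketch numerically. The sketch below is an illustration, not a procedure from the text: it assumes SciPy's solve_continuous_lyapunov, which solves aX + Xa* = Q, so passing a = A* matches equation (2.2.2) with the choice H = I:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lyapunov_stability_test(A, H=None):
    """Test positive stability of A via (2.2.2): pick a positive
    definite H (default H = I), solve GA + A*G = H for G, and
    check whether G is positive definite."""
    n = A.shape[0]
    if H is None:
        H = np.eye(n)
    # solve_continuous_lyapunov(a, q) solves a X + X a* = q;
    # with a = A* this is A*G + GA = H, i.e., equation (2.2.2).
    G = solve_continuous_lyapunov(A.conj().T, H)
    # G should be Hermitian by Theorem (2.2.3); symmetrize to
    # absorb rounding, then test its smallest eigenvalue.
    G = (G + G.conj().T) / 2
    return bool(np.linalg.eigvalsh(G).min() > 0)

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # eigenvalues 2, 3: positive stable
print(lyapunov_stability_test(A))        # True
print(lyapunov_stability_test(-A))       # False
```

Note that for -A the linear system is still uniquely solvable (σ((-A)*) ∩ σ(A) = ∅ here), but the solution G fails the positive definiteness test, exactly as the theorem predicts.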
The matrix A is positive stable if and only if a solution G can be found and G passes the test of positive definiteness. This method has the pleasant advantage that the question of positive stability for an arbitrary A ∈ M_n can be transferred to the question of positive definiteness for a Hermitian matrix. Unfortunately, the cost of this transferal is the solution of an n²-by-n² linear system, although this can be reduced somewhat. Fortunately, a Lyapunov solution sometimes presents itself in a less costly way, but this general procedure can be useful as a tool.

Problems

1. Show that if G₁ and G₂ are two Lyapunov solutions for A, then aG₁ + bG₂ is a Lyapunov solution for A as long as a, b ≥ 0 and a + b > 0. A set with this property is called a (convex) cone. Show that the positive definite matrices themselves form a cone and that A is positive stable if and only if its set of Lyapunov solutions is nonempty and is contained in the cone of positive definite matrices.

2. (a) Suppose z(t) = [z_i(t)] ∈ C^n, where each complex-valued function z_i(t) is absolutely integrable over an interval [0,β] ⊆ R. Show that the matrix

    A = ∫₀^β z(t)z(t)* dt ∈ M_n

is positive semidefinite.

(b) Let λ₁,..., λ_n ∈ C be given with Re λ_i > 0 for i = 1,..., n. Show that the matrix L = [(λ̄_i + λ_j)⁻¹] ∈ M_n is positive semidefinite and has positive main diagonal entries.

3. Let A ∈ M_n be a given diagonalizable positive stable matrix with A = SΛS⁻¹, S, Λ ∈ M_n, S nonsingular, and Λ = diag(λ₁,..., λ_n) with Re λ_i > 0 for i = 1,..., n. Show that GA + A*G = H if and only if

    G = (S⁻¹)*[L(Λ) ∘ (S*HS)]S⁻¹

where L(Λ) = [(λ̄_i + λ_j)⁻¹] and ∘ denotes the Hadamard product. Deduce that the cone of Lyapunov solutions for A is given by {(S⁻¹)*[L(Λ) ∘ H]S⁻¹: H ∈ M_n is positive definite}.

4. Use Problem 3 to determine the cone of Lyapunov solutions for A = [3 2; 4 3].

5. Use the ideas in Problem 3 to give a different proof of the last implication demonstrated in the proof of Theorem (2.2.1).

6.
If A and B are both positive stable, show that A + B is positive stable if A and B have a common Lyapunov solution. Is it possible for A + B to be positive stable without A and B having a common Lyapunov solution?

7. Give an example of a matrix A ∈ M₂, with i₀(A) = 0, such that (2.2.2) does not have a solution for every Hermitian H ∈ M₂.

8. If H is Hermitian and there is a unique solution G to (2.2.2), show that G must be Hermitian.

9. Let A ∈ M_n be positive stable and let H ∈ M_n be positive definite. For t ≥ 0, let P(t) = e^{-A*t}He^{-At} and show that

    G = ∫₀^∞ P(t) dt

exists, is positive definite, and satisfies GA + A*G = H.

2.3  The Routh-Hurwitz conditions

There is another important stability criterion that focuses on the principal minor sums or characteristic polynomial of A ∈ M_n(R).

2.3.1  Definition. Let A ∈ M_n. For k = 1,..., n, E_k(A) is the sum of the (n choose k) principal minors of A of order k, that is, (-1)^k E_k(A) is the coefficient of t^{n-k} in the characteristic polynomial of A. For example, tr A = E₁(A) and det A = E_n(A); see Theorem (1.2.12) in [HJ].

2.3.2  Definition. The Routh-Hurwitz matrix Ω(A) ∈ M_n(R) associated with A ∈ M_n(R) is

    Ω(A) = [E₁(A) E₃(A) E₅(A) ···   0
              1   E₂(A) E₄(A) ···   0
              0   E₁(A) E₃(A) ···   0
              ⋮     ⋮     ⋮          ⋮
              0     0     0   ··· E_n(A)]

The diagonal entries of Ω(A) = [ω_ij] are ω_ii = E_i(A). In the column above ω_ii are ω_{i-1,i} = E_{i+1}(A), ω_{i-2,i} = E_{i+2}(A),..., up to the first row, or to E_n(A), whichever comes first; all entries above E_n(A) are zero. In the column below ω_ii are ω_{i+1,i} = E_{i-1}(A), ω_{i+2,i} = E_{i-2}(A),..., E₁(A), 1, 0, 0,..., down to the last row.

The Routh-Hurwitz criterion for A ∈ M_n(R) to be positive stable is that the leading principal minors of Ω(A) all be positive. Two items are worth noting, however.

(a) The Routh-Hurwitz stability criterion is a special case of more general and complicated conditions for determining the inertia of A ∈ M_n(R) in certain circumstances.
(b) The Routh-Hurwitz criterion may equivalently be stated as a test for "positive stability" of a given polynomial, that is, a test for whether all its zeroes are in the right half-plane.

2.3.3  Theorem (Routh-Hurwitz stability criterion). A matrix A ∈ M_n(R) is positive stable if and only if the leading principal minors of the Routh-Hurwitz matrix Ω(A) are positive.

We omit a proof of the Routh-Hurwitz criterion. One modern technique of proof may be described as follows: Write the characteristic polynomial of A in terms of the E_k(A) and construct a matrix of simple form (such as a companion matrix) that has this polynomial as its characteristic polynomial. Then apply Lyapunov's theorem (and perform many algebraic calculations) to obtain the Routh-Hurwitz conditions.

Problems

1. Consider the polynomial p(t) = t^n + a_{n-1}t^{n-1} + ··· + a₁t + a₀ with real coefficients. Use the Routh-Hurwitz criterion to give necessary and sufficient conditions on the coefficients so that all roots of p(t) have positive real part.

2. Prove (2.3.3) for n = 2 and n = 3.

3. Consider the matrix

    A = [x a b; c y d; e f z] ∈ M₃(R)

Show that DA is positive stable for every positive diagonal matrix D ∈ M₃(R) if (a) all principal minors of A are positive, and (b) xyz > (···)/2.

4. Show that the sufficient condition (b) in Problem 3 is not necessary by considering the matrix

    A = [6 5 4; 1 2 5; 5 3 1]

Show that H(A) = ½(A + A^T) is positive definite, so DA is positive stable for all positive diagonal D ∈ M₃(R); see Problem 10 in Section (2.1).

Further Reading: For a proof of Theorem (2.3.3), see Chapter 3 of [Bar 83].

2.4  Generalizations of Lyapunov's theorem

Lyapunov's fundamental Theorem (2.2.1) may be generalized in a variety of ways. We mention a selection of the many generalizations in this section.
These include:

(a) Circumstances under which positive stability may be concluded when a solution G of (2.2.2) is positive definite, but the right-hand side H is only positive semidefinite;

(b) The general inertia result when the right-hand side H of (2.2.2) is positive definite, but the solution G is Hermitian of general inertia; and

(c) Positive stability conditions involving possibly varying positive definite multiples of A.

For additional generalizations that involve the field of values, see Section (1.7).

Consider the Lyapunov equation

    GA + A*G = H,  A, G, H ∈ M_n    (2.4.1)

in which we assume that G, and therefore H, is Hermitian. Define

    S = GA - A*G    (2.4.2)

so that S is skew-Hermitian and

    2GA = H + S    (2.4.3)

We first consider the case of a positive definite G for which the matrix H in (2.4.1) is only positive semidefinite. Recall that a positive semidefinite matrix is necessarily Hermitian.

2.4.4  Definition. A matrix A ∈ M_n is said to be positive semistable if i_-(A) = 0, that is, if every eigenvalue of A has nonnegative real part.

2.4.5  Lemma. Let A ∈ M_n. If GA + A*G = H for some positive definite G ∈ M_n and some positive semidefinite H ∈ M_n, then A is positive semistable.

Proof: Suppose that λ ∈ σ(A) and Az = λz, z ≠ 0. Then 2GAz = 2λGz and from (2.4.3) we have (H + S)z = 2λGz. Multiplication on the left by z* gives z*(H + S)z = 2λz*Gz, and extraction of real parts yields

    2(Re λ)z*Gz = z*Hz    (2.4.6)

Since z*Gz > 0 and z*Hz ≥ 0, we conclude that Re λ ≥ 0. □

If the circumstances of the lemma guarantee positive semistability, then what more is needed for positive stability? This question is answered in the following theorem.

2.4.7  Theorem. Let A ∈ M_n, suppose that GA + A*G = H for some positive definite G ∈ M_n and some positive semidefinite H ∈ M_n, and let S = GA - A*G. Then A is positive stable if and only if no eigenvector of G⁻¹S lies in the nullspace of H.

Proof: Suppose that z ≠ 0 is an eigenvector of G⁻¹S that also lies in the nullspace of H.
Then, for some λ ∈ C, λGz = Sz = (H + S)z = 2GAz, and we conclude that Az = ½λz. Since ½λ is then an eigenvalue of A and by (2.4.6) Re ½λ = z*Hz/(2z*Gz) = 0 (because z is in the nullspace of H), the matrix A is not positive stable. This means that positive stability of A implies the "no eigenvector" condition. Because of (2.4.5), all eigenvalues of A have nonnegative real part. If A is not positive stable, then there is a λ ∈ σ(A) with Re λ = 0. From (2.4.6), we conclude that z*Hz = 0 if z ≠ 0 is an eigenvector of A associated with λ. Since H is positive semidefinite, this can happen only if Hz = 0, that is, z is in the nullspace of H. Combining (H + S)z = 2λGz (a consequence of z being an eigenvector associated with λ ∈ σ(A)) and Hz = 0, we conclude that Sz = 2λGz, that is, z is an eigenvector of G⁻¹S that lies in the nullspace of H. This means that the "no eigenvector" condition implies positive stability for A. □

Exercise. Show that Theorem (2.4.7) implies the backward implication of the first statement in Theorem (2.2.1).

Exercise. Show that Lemma (2.4.5) is just a generalization to nonstrict inequalities (semidefinite, semistable) of the backward implication in the first statement of Theorem (2.2.1). Prove the lemma by a continuity argument.

Exercise. Unlike the positive stable case, the converse of (2.4.5) does not hold. Give an example of a positive semistable A ∈ M_n for which no positive definite G ∈ M_n and positive semidefinite H ∈ M_n exist that satisfy (2.4.1). Hint: Try something with nontrivial Jordan block structure associated with a purely imaginary multiple eigenvalue.

2.4.8  Definition. Let A ∈ M_n, B ∈ M_{n,m}. The pair of matrices (A, B) is said to be controllable if rank [B AB A²B ... A^{n-1}B] = n.

The concept of controllability arises in the theory of linear control differential equation systems. Let A ∈ M_n(R) and B ∈ M_{n,m}(R) be given.
For any given continuous vector function u(t) ∈ R^m, the initial-value problem

    x'(t) = Ax(t) + Bu(t),  x(0) = x₀

has a unique solution x(t) for each given x₀ ∈ R^n; the vector function u(·) is the control. The question of controllability of this system is the following: For each x₀, x₁ ∈ R^n, is there some t₁ < ∞ and some choice of the control vector u(·) such that the solution satisfies x(t₁) = x₁? The answer: There is if and only if the pair (A, B) is controllable in the sense (2.4.8).

2.4.9  Remark. The condition on G⁻¹S in (2.4.7) is known to be equivalent to the statement that the pair (A*, H) is controllable. Thus, if a positive definite G ∈ M_n and a positive semidefinite H ∈ M_n satisfy (2.4.1), then A ∈ M_n is positive stable if and only if the pair (A*, H) is controllable.

We next turn to information about the inertia of A that follows if one has a Hermitian (but not necessarily positive definite or positive semidefinite) solution G of the Lyapunov equation (2.4.1) with H positive definite. This is known as the general inertia theorem.

2.4.10  Theorem. Let A ∈ M_n be given. There exists a Hermitian G ∈ M_n and a positive definite H ∈ M_n such that GA + A*G = H if and only if i₀(A) = 0. In this event, i(A) = i(G).

Proof: If i₀(A) = 0, recall the form (2.2.3), equivalent to (2.4.1), in which we again take A = Ã to be in modified Jordan canonical form with any nonzero super-diagonal entries equal to ε > 0, and choose ε < min {|Re λ|: λ ∈ σ(A)}. Now choose G = E = diag(ε₁,..., ε_n), in which

    ε_i = 1 if Re λ_i > 0, and ε_i = -1 if Re λ_i < 0

The resulting Hermitian matrix H = EA + A*E has positive diagonal entries and is strictly diagonally dominant, and therefore it is positive definite. On the other hand, suppose there is a Hermitian G such that H = GA + A*G is positive definite, and suppose that i₀(A) ≠ 0. Again appeal to the equivalent form (2.2.3) and take A to be in (usual) Jordan canonical form with a purely imaginary 1,1 entry.
Since the 1,1 entry of G is real, and all entries below the 1,1 entry in the first column of A are zero, a calculation reveals that the 1,1 entry of the matrix H resulting in (2.2.2) is zero. Such an H cannot be positive definite, contradicting our assumption that i₀(A) ≠ 0. We conclude that i₀(A) = 0, which completes the proof of the first assertion in the theorem.

If a Hermitian G (necessarily nonsingular) and a positive definite H satisfy (2.4.1), we verify that i(A) = i(G) by induction on the dimension n. For n = 1, the assertion is clear: If Re ga > 0 and g real, then Re a and g have the same sign. Now suppose the assertion is verified for values of the dimension up to and including n-1 and again appeal to the equivalent form (2.2.2). We claim that a nonsingular matrix S ∈ M_n can be chosen so that

    G̃ = S*GS = [G₁₁ 0; 0 g_nn]  and  Ã = S⁻¹AS = [A₁₁ A₁₂; 0 a_nn]

in which G₁₁ ∈ M_{n-1} is Hermitian and A₁₁ ∈ M_{n-1}. This may be seen by choosing S₁ so that S₁⁻¹AS₁ is in Jordan canonical form and then choosing S₂ of the form

    S₂ = [I z; 0 1]

in which z = -G₁₁⁻¹g₁₂ and [G₁₁ g₁₂; g₁₂* g_nn] = S₁*GS₁. If G₁₁ is singular, then the original G may be perturbed slightly, G → G + εI, with ε > 0 sufficiently small so that neither the positive definiteness of H nor the inertia of G is altered (note that G itself must be nonsingular) and so that the resulting G₁₁ is nonsingular. Nothing essential about the problem is thus changed and so there is no loss of generality to assume that G₁₁ is nonsingular. Then let S = S₁S₂ to produce Ã and G̃ in (2.2.3) that have the claimed form. The induction then proceeds as follows: Since G̃Ã + Ã*G̃ = H̃ is positive definite and since principal submatrices of positive definite matrices are positive definite, G₁₁A₁₁ + A₁₁*G₁₁ = H₁₁ is positive definite.
Note that Re Jypiq= the last diagonal entry of Hf (and His positive definite), so the real number jpn has the same sign a5 Re Gp. This completes the proof ofthe second assertion and the theorem. Oo ‘Beercise. Show that Lyapunov’s theorem (2.2.1) follows from (2.4.10). Lyapunov’s theorem asks for a single positive definite matrix G such that the Hermitian quadratic form 2*GAz has positive real part for all nonzero 2¢ €*. If there is such a G, Lyapunov’s theorem guarantees that A is positive stable. This global requirement (one G for all 2) can be relaxed considerably to a very local requirement (one G for each 2). 24.11 Theorem. Let A M, be given. Then A is positive stable if and only if for each nonzero 2. €* there exists a positive definite G (which may depend on 2) such that Re z*GAz> 0, Proof: Let 2¢ €* be an eigenvector of A associated with a given A€ o(A) and let Ge M, be any given positive definite matrix. ‘Then Rez*GAz= Re 2#G(\2) = 'Rez*Gz\= (z*Gz)Red has the same sign as Re) since z*Gz>0. We conclude that if for each nonzero 26" there is a positive definite Ge M,, with Re z*GAz> 0, then A must be positive stable. On the other hand, if is positive stable, apply Lyapunov's Theorem (2.2.1) to produce a G satisfying (2.2.2) with H positive definite. Then, for any non- zero £¢ €*, Rect GAz= j(z? GAz-+ 2*A*Ga) = ate > 0. 0 2412 Definition. Let Ae M, and a nonzero vector 2€C* be given. Define £(4,2) = {Ge M,: Gis positive definite and Re 2*GAz> 0}. ‘Theorem (2.4.11) says that A is positive stable if and only if Z(A,2) # ¢ for all nonzero te o ‘Beercise. Prove the somewhat remarkable fact that, for a given A M, 1(4,2) # 6 for all nonzero 2 ¢ ("if and only if n{L(4,2): 04 2 C7} 4 Brercise. Prove the forward implication of (2.4.11) by direct argument without using Lyapunov’s theorem. 108 Stable matrices and inertia If H, Ke M, are Hermitian, it is a basic fact that if H'is positive def- nite, then i(HK) = i(K). 
If H is negative definite, the inertia of HK is also completely determined by that of K. In general, there are multiple possibilities for i(HK) given i(H) and i(K). It is known that A ∈ M_n is similar to a matrix in M_n(R) if and only if A may be factored as A = HK with H and K Hermitian. These facts are discussed in (4.1.7) in [HJ]. For more completeness in our discussion of general inertia theory, we include without proof a description of the possible values of the inertia of the product HK, given i(H) and i(K), for H, K ∈ M_n Hermitian and nonsingular, that is, i₀(H) = i₀(K) = 0. In this event, of course, i_-(H) = n - i_+(H) and i_-(K) = n - i_+(K), but i₀(HK) may be a nonzero even integer. For simplicity in description, let

    K(p,q,n) = {|p+q-n|, |p+q-n|+2, |p+q-n|+4,..., n-|p-q|}

for nonnegative integers p, q ≤ n.

Problems

Can you determine i(B) in general?

10. Let A = [a_ij] ∈ M_n(R) be tridiagonal. Show that if all principal minors of A are positive, then A is positive stable. Show further that "tridiagonal" may be generalized to "no simple circuits of more than two edges in the directed graph of A." Note that this is a circumstance in which the converse of the statement in Problem 3 of Section (2.1) is valid. Compare to the case n = 2.

11. Use (1.7.8) to give an independent proof of Theorem (2.4.15).

12. Use the general inertia theorem (2.4.10) to give a short proof that if A ∈ M_n and A + A* is positive definite, then DA is positive stable for every positive diagonal D ∈ M_n, and the same is true for any positive definite H ∈ M_n.

Notes and Further Readings. The treatment up to and including (2.4.7) and (2.4.8) is based on D. Carlson, B. Datta, and C. R. Johnson, A Semi-definite Lyapunov Theorem and the Characterization of Tridiagonal D-Stable Matrices, SIAM J. Alg. Discr. Meth. 3 (1982), 293-304. Various forms of (2.4.10) were published independently by H. Wielandt, On the Eigenvalues of A + B and AB, J.
Research NBS 77B (1973), 61-63 (adapted from Wielandt's 1951 NBS Report #1367 by C. R. Johnson), O. Taussky, A Generalization of a Theorem of Lyapunov, SIAM J. Appl. Math. 9 (1961), 640-643, and A. Ostrowski and H. Schneider, Some Theorems on the Inertia of General Matrices, J. Math. Anal. Appl. 4 (1962), 72-84. Wielandt, though earliest, gave only the equality-of-inertias portion. Theorem (2.4.11) is from C. R. Johnson, A Local Lyapunov Theorem and the Stability of Sums, Linear Algebra Appl. 13 (1976), 37-43. Theorem (2.4.13) may be found in C. R. Johnson, The Inertia of a Product of Two Hermitian Matrices, J. Math. Anal. Appl. 57 (1977), 85-90.

2.5 M-matrices, P-matrices, and related topics

In this section we study a very important special class of real positive stable matrices that arise in many areas of application: the M-matrices. They also link the nonnegative matrices (Chapter 8 of [HJ]) with the positive stable matrices. If X, Y ∈ M_{m,n}(R), we write X ≥ Y (respectively, X > Y) if all the entries of X − Y are nonnegative (respectively, positive); in particular, X ≥ 0 means that X has nonnegative entries. For X ∈ M_n, we denote the spectral radius of X by ρ(X).

2.5.1 Definition. The set Z_n ⊂ M_n(R) is defined by

Z_n ≡ {A = [a_ij] ∈ M_n(R) : a_ij ≤ 0 if i ≠ j, i, j = 1, ..., n}

The simple sign pattern of the matrices in Z_n has many striking consequences.

Exercise. Show that A = [a_ij] ∈ Z_n if and only if A ∈ M_n(R) and A = αI − P for some P ≥ 0 and some α ≥ max {a_ii : i = 1, ..., n}.

This representation A = αI − P for a matrix in Z_n is often convenient because it suggests connections with the Perron-Frobenius theory of nonnegative matrices.

2.5.2 Definition. A matrix A is called an M-matrix if A ∈ Z_n and A is positive stable.

The following simple facts are very useful in the study of M-matrices.

2.5.2.1 Lemma. Let A ∈ Z_n, and suppose A = αI − P with α ∈ R and P ≥ 0.
Then α − ρ(P) is an eigenvalue of A, every eigenvalue of A lies in the disc {z ∈ C : |z − α| ≤ ρ(P)}, and hence every eigenvalue λ of A satisfies Re λ ≥ α − ρ(P). In particular, A is an M-matrix if and only if α > ρ(P). If A is an M-matrix, one may always write A = γI − P with γ ≡ max {a_ii : i = 1, ..., n} and P ≡ γI − A ≥ 0; necessarily, γ > ρ(P).

Proof: If A = αI − P with P ≥ 0 and α ∈ R, notice that every eigenvalue of A is of the form α − λ(P), where λ(P) is an eigenvalue of P. Thus, every eigenvalue of A lies in a disc in the complex plane with radius ρ(P) and centered at z = α. Since ρ(P) is an eigenvalue of P (see Theorem (8.3.1) in [HJ]), α − ρ(P) is a real eigenvalue of A. If A is an M-matrix, it is positive stable and hence α − ρ(P) > 0. Conversely, if α > ρ(P) then the disc {z ∈ C : |z − α| ≤ ρ(P)} lies in the right half-plane, so A is positive stable. □

A very important property of the class of M-matrices is that principal submatrices and direct sums of M-matrices are again M-matrices. Notice that the same is true of the class of positive definite matrices; there are a great many analogies between positive definite matrices and M-matrices.

Exercise. If A is an M-matrix, show that any principal submatrix of A is an M-matrix. In particular, conclude that an M-matrix has positive diagonal entries. Show by example that a matrix A ∈ Z_n with positive diagonal need not be an M-matrix. Hint: Use the fact that the spectral radius of a principal submatrix of a nonnegative matrix P is no larger than the spectral radius of P (see Corollary (8.1.20) in [HJ]).

Exercise. Show that a direct sum of M-matrices is an M-matrix. Hint: If A_i = α_i I − P_i with all α_i > 0 and all P_i ≥ 0, consider the direct sum A = A_1 ⊕ ··· ⊕ A_k.

It is mathematically intriguing, and important for applications, that there are a great many different (not obviously equivalent) conditions that are necessary and sufficient for a given matrix in Z_n to be an M-matrix.
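The eigenvalue localization in Lemma (2.5.2.1) is easy to verify numerically. A minimal NumPy sketch (the particular P and α are our illustrative choices, not from the text):

```python
import numpy as np

# A member of Z_2 written as A = alpha*I - P with P >= 0 entrywise
P = np.array([[0.0, 1.0], [2.0, 0.0]])
alpha = 2.0
A = alpha * np.eye(2) - P                      # A = [[2, -1], [-2, 2]]

rho = max(abs(np.linalg.eigvals(P)))           # rho(P) = sqrt(2)
eigs = np.linalg.eigvals(A)

# alpha - rho(P) is itself an eigenvalue of A ...
assert np.isclose(min(eigs.real), alpha - rho)
# ... every eigenvalue lies in the disc |z - alpha| <= rho(P) ...
assert all(abs(z - alpha) <= rho + 1e-12 for z in eigs)
# ... and since alpha > rho(P), A is positive stable: an M-matrix
assert alpha > rho and all(eigs.real > 0)
```

Here α − ρ(P) = 2 − √2 > 0, so this A is an M-matrix, in agreement with the lemma.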
In order to recognize M-matrices in the immense variety of ways in which they arise, it is useful to list several of these. It should be noted that there are many more such conditions and the ones we list are a selection from the more useful ones.

2.5.3 Theorem. If A ∈ Z_n, the following statements are equivalent:

2.5.3.1 A is positive stable, that is, A is an M-matrix.
2.5.3.2 A = αI − P, P ≥ 0, α > ρ(P).
2.5.3.3 Every real eigenvalue of A is positive.
2.5.3.4 A + tI is nonsingular for all t ≥ 0.
2.5.3.5 A + D is nonsingular for every nonnegative diagonal matrix D.
2.5.3.6 All principal minors of A are positive.
2.5.3.7 The sum of all k-by-k principal minors of A is positive for k = 1, ..., n.
2.5.3.8 The leading principal minors of A are positive.
2.5.3.9 A = LU, where L is lower triangular, U is upper triangular, and all the diagonal entries of each are positive.
2.5.3.10 For each nonzero z ∈ R^n, there is an index 1 ≤ i ≤ n such that z_i(Az)_i > 0.
2.5.3.11 For each nonzero z ∈ R^n, there is a positive diagonal matrix D such that z^T ADz > 0.
2.5.3.12 There is a positive vector z ∈ R^n with Az > 0.
2.5.3.13 The diagonal entries of A are positive and AD is strictly row diagonally dominant for some positive diagonal matrix D.
2.5.3.14 The diagonal entries of A are positive and A^T D is strictly row diagonally dominant for some positive diagonal matrix D.
2.5.3.15 The diagonal entries of A are positive and there exist positive diagonal matrices D, E such that DAE is both strictly row diagonally dominant and strictly column diagonally dominant.
2.5.3.16 There is a positive diagonal matrix D such that DA + A^T D is positive definite; that is, there is a positive diagonal Lyapunov solution.
2.5.3.17 A is nonsingular and A^{-1} ≥ 0.
2.5.3.18 Az ≥ 0 implies z ≥ 0.

We do not give a complete proof of the equivalence of all 18 conditions in (2.5.3), but do present a selected sample of implications. Other implications, as well as an ordering of all 18 that gives an efficient proof of equivalence, should be considered as exercises. Many of the other implications are immediate.
A longer list of conditions, together with references to proofs of various implications and to lists of equivalent conditions, may be found in [BP].

Selected implications in (2.5.3). Numbers correspond to final digits of the listing in (2.5.3.x).

1 ⟺ 2: See Lemma (2.5.2.1).

3 ⟹ 2: Since A ∈ Z_n may be written A = αI − P, P ≥ 0, and α − ρ(P) ∈ σ(A), (3) means that α − ρ(P) > 0, or α > ρ(P).

2 ⟹ 4: Given (2), A + tI = (α + t)I − P and α + t > ρ(P) for t ≥ 0. Thus, (2) holds with A replaced by A + tI, and since (2) implies (1), A + tI ∈ M_n(R) is positive stable. Since positive stable real matrices have positive determinant, A + tI is nonsingular if t ≥ 0.

2 ⟹ 6: Since (2) is inherited by principal submatrices of A, and because (2) implies (1), it follows that every principal submatrix of A is positive stable. But, since A ∈ M_n(R), each of these principal submatrices has positive determinant.

7 ⟹ 4: Expansion of det(A + tI) as a polynomial in t gives a monic polynomial of degree n whose coefficients are the E_k(A), which are all guaranteed to be positive by (7). Therefore, this polynomial takes on positive values for nonnegative t. Since A + tI has positive determinant for t ≥ 0, it is nonsingular for t ≥ 0.

8 ⟹ 9: This follows from the development of the LU factorization in Section (3.5) of [HJ]. See a following exercise to note that both factors L and U are M-matrices if A is an M-matrix.

2 ⟹ 17: Since division by a positive number α changes nothing important in (17), we assume α = 1. The power series I + P + P² + ··· then converges since ρ(P) < 1. Because P ≥ 0, the limit (I − P)^{-1} = A^{-1} of this series is nonnegative.

17 ⟹ 18: Suppose that Az ≡ y ≥ 0. Then z = A^{-1}y ≥ 0 because A^{-1} ≥ 0.

18 ⟹ 2: Write A = αI − P for some P ≥ 0 and α > 0, and let v be a Perron vector for P, so 0 ≠ v ≥ 0 and Pv = ρ(P)v. If α ≤ ρ(P), then A(−v) = (ρ(P) − α)v ≥ 0, so (18) forces −v ≥ 0 and hence v = 0, a contradiction; thus α > ρ(P).
2 ⟹ 12: If P ≥ 0 is irreducible, let z > 0 be a Perron-Frobenius right eigenvector of P; if P is reducible, perturb it by placing sufficiently small values ε > 0 in the positions in which P has zeroes, and let z be the Perron-Frobenius right eigenvector of the result. Then Az = αz − Pz, and Pz is either ρ(P)z, or as close to it as we like. In either event, Az is sufficiently close to [α − ρ(P)]z > 0 so that Az > 0.

12 ⟹ 13: Let D ≡ diag(z) and note that the row sums of AD are just the entries of ADe = Az > 0, where e ≡ [1, 1, ..., 1]^T ∈ R^n. Since all off-diagonal entries of A ∈ Z_n are nonpositive, the diagonal entries of AD, and therefore those of A, must then be positive and AD must be strictly row diagonally dominant.

15 ⟹ 16: If (15) holds, then DAE + (DAE)^T has positive diagonal entries, is strictly diagonally dominant, and is, therefore, positive definite. A congruence by E^{-1} shows that E^{-1}[DAE + (DAE)^T]E^{-1} = (E^{-1}D)A + A^T(E^{-1}D) is positive definite also. This means that E^{-1}D is a positive diagonal Lyapunov solution.

16 ⟹ 6: Note that condition (16) is inherited by principal submatrices of A and that (16) implies positive stability by Lyapunov's theorem. Thus, every principal submatrix of A is a real positive stable matrix and therefore has positive determinant; hence, (6) holds.

11 ⟹ 1: This follows immediately from the local version of Lyapunov's theorem (2.4.11).

It should be noted that, for the most part, the conditions of (2.5.3) are not equivalent outside of Z_n. The structure imposed by Z_n, mostly because of its relationship with the nonnegative matrices, is remarkable.

Exercise. Show that no two of conditions (2.5.3.1, 6, 7, and 8) are equivalent, in general, in M_n(R).

Exercise. Show that (2.5.3.8 and 9) are equivalent as conditions on a matrix A ∈ M_n(R).

Exercise. Show that (2.5.3.9) can be strengthened and, in fact, an M-matrix may be LU-factored with L and U both M-matrices. If L and U are M-matrices, is the product LU an M-matrix?
Hint: Reduce the M-matrix A to upper triangular form using only lower triangular type 3 elementary operations and keep careful track.

There are remarkable parallels between the M-matrices and the positive definite matrices, for example, the equivalence of (2.5.3.6, 7, and 8). These parallels extend to some of the classical matrix and determinantal inequalities for positive definite matrices discussed in Sections (7.7-8) of [HJ].

2.5.4 Theorem. Let A, B ∈ Z_n be given and assume that A = [a_ij] is an M-matrix and B ≥ A. Then

(a) B is an M-matrix;
(b) A^{-1} ≥ B^{-1} ≥ 0; and
(c) det B ≥ det A > 0.

Moreover, A satisfies the determinantal inequalities of

(d) Hadamard: det A ≤ a_11 a_22 ··· a_nn;
(e) Fischer: det A ≤ det A(α) det A(α') for any α ⊂ {1, ..., n}; and
(f) Szász: P_k^{1/C(n-1,k-1)} ≥ P_{k+1}^{1/C(n-1,k)}, k = 1, 2, ..., n − 1.

In the Szász inequality, P_k is the product of all k-by-k principal minors of A and C(m,j) denotes the binomial coefficient m-choose-j.

Proof: Let A and B = [b_ij] satisfy the stated conditions. Then (a) follows from (2.5.3.12), for if z > 0 and Az > 0, then Bz ≥ Az > 0. The inequality (b) now follows from (2.5.3.17) since A^{-1} − B^{-1} = A^{-1}(B − A)B^{-1} is the product of three nonnegative matrices and hence is nonnegative. To prove (c), proceed by induction on the dimension n. The asserted inequality is trivial for n = 1, so suppose it holds for all dimensions 1, ..., n − 1. Partition A and B as

A = [A_11  a_12]    and    B = [B_11  b_12]
    [a_21  a_nn]              [b_21  b_nn]

with A_11, B_11 ∈ Z_{n-1}. The principal submatrices A_11 and B_11 are M-matrices and B_11 ≥ A_11, so det B_11 ≥ det A_11 > 0 by the induction hypothesis. We also have A^{-1} ≥ B^{-1} ≥ 0, det A > 0, and det B > 0. The n,n entry of the nonnegative matrix A^{-1} − B^{-1} is (det A_11)/det A − (det B_11)/det B ≥ 0, so det B ≥ (det B_11/det A_11) det A ≥ det A > 0.

The Hadamard inequality (d) follows from the Fischer inequality (e), which follows easily from the determinant inequality (c) as follows: Assume without loss of generality that α = {1, ..., k}, 1 ≤ k < n. The block diagonal matrix B ≡ A(α) ⊕ A(α') lies in Z_n and satisfies B ≥ A, so (c) gives det A ≤ det B = (det A(α))(det A(α')), which is (e).
Szász's inequality (f) may be deduced with the same technique used in Corollary (7.8.2) in [HJ]. □

There are many other M-matrix inequalities that are analogs of classical results for positive definite matrices. See Problem 12(c) for an M-matrix analog of Oppenheim's inequality and see Problem 15 for a result involving Schur complements.

Exercise. Deduce Hadamard's inequality for M-matrices from Fischer's inequality for M-matrices.

Exercise. Prove Szász's inequality for M-matrices. Hint: It is possible to argue directly using Fischer's inequality and (0.8.4) in [HJ], or to use the fact that inverse M-matrices satisfy Hadamard's inequality (Problem 9)—in either event mimicking the proof of Szász's inequality for positive definite matrices; see (7.8.4) in [HJ].

Within the set Z_n, the M-matrices are natural analogs of the positive definite matrices. The corresponding natural analogs of the positive semidefinite matrices are the positive semistable matrices in Z_n. According to Lemma (2.5.2.1), any A ∈ Z_n can be written as A = αI − P with α ∈ R and P ≥ 0, α − ρ(P) is an eigenvalue of A, and every eigenvalue of A lies in a disc with radius ρ(P) and centered at z = α. Thus, if a positive semistable A ∈ Z_n is represented in this way as A = αI − P, we must have α ≥ ρ(P), and the only way such a matrix could fail to be positive stable is to have α = ρ(P), that is, A is singular. Thus, a positive semistable matrix A ∈ Z_n is an M-matrix if and only if it is nonsingular. This observation justifies the terminology singular M-matrix for an A ∈ Z_n that is positive semistable but not positive stable. Just as for M-matrices, there are a host of equivalent conditions for a matrix A ∈ Z_n to be a singular M-matrix, and many follow easily from the observation that if A is a singular M-matrix then A + εI is an M-matrix for all ε > 0.
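This observation is easy to illustrate numerically. A minimal NumPy sketch (the Laplacian-style singular M-matrix is our illustrative choice, not from the text):

```python
import numpy as np

# A singular M-matrix: in Z_2, positive semistable, but singular
A = np.array([[1.0, -1.0], [-1.0, 1.0]])       # eigenvalues 0 and 2

eigs = np.linalg.eigvals(A)
assert all(eigs.real >= -1e-12)                # positive semistable
assert np.isclose(abs(eigs).min(), 0.0)        # singular: 0 is an eigenvalue

# For every eps > 0, A + eps*I is a nonsingular M-matrix; by
# condition (2.5.3.17) its inverse is then entrywise nonnegative.
for eps in (1e-3, 0.1, 1.0):
    B = A + eps * np.eye(2)
    assert all(np.linalg.eigvals(B).real > 0)  # positive stable
    assert (np.linalg.inv(B) >= 0).all()       # inverse nonnegative
```

The shift by εI moves the eigenvalue at 0 to ε > 0 while keeping the off-diagonal sign pattern, which is exactly why the equivalent conditions below follow so easily from their M-matrix counterparts.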
For example, the following equivalent conditions for a matrix A ∈ Z_n are analogs of conditions in (2.5.3):

A is positive semistable.
A = αI − P with P ≥ 0 and α ≥ ρ(P).
Every real eigenvalue of A is nonnegative.
All principal minors of A are nonnegative.
The sum of all k-by-k principal minors of A is nonnegative for k = 1, 2, ..., n.

2.5.5 Definition. A nonsingular matrix B ∈ M_n is said to be an inverse M-matrix if B^{-1} is an M-matrix.

It is not surprising that inverse M-matrices inherit considerable structure from the M-matrices.

Exercise. Show that an inverse M-matrix B is a nonnegative matrix that (a) is positive stable; (b) has a positive diagonal Lyapunov solution; and (c) has the property that DB is an inverse M-matrix for any positive diagonal matrix D.

Exercise. Show that: (a) The principal minors of an inverse M-matrix are positive; (b) Every principal submatrix of an inverse M-matrix is an inverse M-matrix; and (c) If A is an inverse M-matrix and D is a positive diagonal matrix, then A + D is an inverse M-matrix. Verification of these facts requires some effort.

The M-matrices naturally suggest a number of notions (that of a P-matrix, D-stability, etc.) that are also of interest in other contexts.

2.5.6 Definition. A matrix A ∈ M_n(R) is called a P-matrix (respectively, a P_0-matrix, or a P_0^+-matrix) if all k-by-k principal minors of A are positive (respectively, are nonnegative, or are nonnegative with at least one positive) for each k = 1, ..., n.

Just as for M-matrices, there are many different equivalent conditions for a matrix to be a P-matrix. All of the following are easily shown to be equivalent for a given A ∈ M_n(R):

2.5.6.1 All the principal minors of A are positive, that is, A is a P-matrix.
2.5.6.2 For each nonzero z ∈ R^n, some entry of the Hadamard product z ∘ (Az) is positive, that is, for each nonzero z = [z_i] ∈ R^n there is some k ∈ {1, ..., n} such that z_k(Az)_k > 0.
2.5.6.3 For each nonzero z ∈ R^n, there is some positive diagonal matrix D = D(z) ∈ M_n(R) such that z^T(D(z)A)z > 0.
2.5.6.4 For each nonzero z ∈ R^n, there is some nonnegative diagonal matrix E = E(z) ∈ M_n(R) such that z^T(E(z)A)z > 0.
2.5.6.5 Every real eigenvalue of every principal submatrix of A is positive.

Sketch of the implications in (2.5.6). Numbers correspond to final digits of the listing in (2.5.6.x).

1 ⟹ 2: If not, suppose z ∘ (Az) ≤ 0 for some nonzero z. Let J = {j_1, ..., j_k} be the set of indices for which z_j ≠ 0, 1 ≤ k ≤ n, 1 ≤ j_1 < ··· < j_k ≤ n, let z(J) ≡ [z_{j_1}, ..., z_{j_k}]^T ∈ R^k, and consider the principal submatrix A(J) ∈ M_k. Notice that [A(J)z(J)]_i = (Az)_{j_i}, i = 1, ..., k. Then z(J) ∘ [A(J)z(J)] ≤ 0 and there exists a nonnegative diagonal matrix D ∈ M_k such that A(J)z(J) = −Dz(J); the diagonal entries of D are just the entries of the Hadamard quotient of −A(J)z(J) by z(J). Then [A(J) + D]z(J) = 0, which is impossible since det [A(J) + D] ≥ det A(J) > 0 (see Problem 18) and hence A(J) + D is nonsingular.

2 ⟹ 3: Let 0 ≠ z ∈ R^n be given, and let k ∈ {1, ..., n} be an index for which z_k(Az)_k > 0. Then z_k(Az)_k + ε Σ_{i≠k} z_i(Az)_i > 0 for some ε > 0 and we may take D(z) ≡ diag(d_1, ..., d_n) with d_k = 1, d_i = ε if i ≠ k.

3 ⟹ 4: Trivial.

4 ⟹ 5: Let J = {j_1, ..., j_m} ⊂ {1, ..., n} be given with 1 ≤ j_1 < ··· < j_m ≤ n and 1 ≤ m ≤ n, and suppose A(J)ξ = λξ with λ ∈ R and 0 ≠ ξ ∈ R^m. Let z = [z_i] ∈ R^n be defined by z_{j_i} = ξ_i, i = 1, ..., m, and z_i = 0 if i ∉ J, so ξ = z(J). Let E = E(z) ∈ M_n be a nonnegative diagonal matrix such that z^T(EA)z > 0. Then

0 < z^T(EA)z = (Ez)^T(Az) = [E(J)z(J)]^T[A(J)z(J)] = [E(J)ξ]^T(λξ) = λ ξ^T E(J)ξ

which implies that both λ and ξ^T E(J)ξ are positive.

5 ⟹ 1: Since A is real, its complex nonreal eigenvalues occur in conjugate pairs, whose product is positive. But det A is the product of the complex nonreal eigenvalues of A and the real ones, which are all positive by (5), so det A > 0. The same argument works for every principal submatrix of A. □

Exercise. Show that any real eigenvalue of a P_0^+-matrix is nonnegative. Hint: Use the fact that det(A + tI) > 0 if t > 0 and A is a P_0^+-matrix; see Problem 18.

2.5.7 Definition. A matrix A ∈ M_n is called D-stable if DA is positive stable for all positive diagonal matrices D ∈ M_n.

Exercise.
Note that the class of P-matrices is closed under positive diagonal multiplication.

Exercise. Show that every M-matrix is a P-matrix and is D-stable.

Exercise. If A ∈ M_n is positive definite or, more generally, if A + A* is positive definite, show that A is D-stable.

Exercise. Show that a D-stable matrix is nonsingular and that the inverse of a D-stable matrix is D-stable.

Exercise. A matrix A ∈ M_n is called D-semistable if DA is positive semistable for all positive diagonal matrices D ∈ M_n. Show that a principal submatrix of a D-stable matrix is D-semistable, but is not necessarily D-stable. Hint: Suppose A is D-stable and α ⊂ {1, ..., n} is an index set. Let E be a diagonal matrix with ones in the diagonal positions indicated by α and a value ε > 0 in the other diagonal positions. Now consider DEA, where D is positive diagonal, and show that σ(DEA) (approximately) includes σ(D(α)A(α)).

We next give a basic necessary condition and a basic sufficient condition for D-stability. Unfortunately, no effective characterization of D-stability is yet known.

2.5.8 Theorem. Let A ∈ M_n(R) be given.

(a) If A is D-stable, then A is a P_0^+-matrix; and
(b) If A has a positive diagonal Lyapunov solution, then A is D-stable.

Proof: Suppose that A ∈ M_n(R) is D-stable. Consider the principal submatrices A(α) ∈ M_k. If one of them had negative determinant, this would contradict the fact that each must be D-semistable; thus, all k-by-k principal minors are nonnegative. Since E_k(A) is the kth elementary symmetric function of the eigenvalues of A, not all the k-by-k principal minors can be zero because A is real and positive stable. Thus, A ∈ P_0^+ and (a) is proved. Now suppose there is a positive diagonal matrix H ∈ M_n such that HA + A^T H ≡ B is positive definite. Let D ∈ M_n be positive diagonal, consider DA, and notice that (HD^{-1})(DA) + (DA)^T(HD^{-1}) = HA + A^T H = B, so the positive definite matrix HD^{-1} is a Lyapunov solution for DA.
Thus, DA is positive stable for any positive diagonal matrix D, and A is D-stable. □

Exercise. Give examples to show that neither the converse of (2.5.8a) nor the converse of (2.5.8b) is true.

Exercise. Show that A ∈ M_2(R) is D-stable if and only if A ∈ P_0^+. Show that A = [a_ij] ∈ M_3(R) is D-stable if (a) A ∈ P_0^+ and (b) a_11 a_22 a_33 > ... . Hint: This may be shown by direct argument or by using the Routh-Hurwitz criterion.

Exercise. Compare (2.5.6.3) and (2.4.11). Why may we not conclude that a P-matrix is positive stable?

Exercise. Show that if A ∈ M_2(R) is a P-matrix, then A is (positive) stable. Give an example of a P-matrix in M_3(R) that is not positive stable.

Except for n ≤ 2, there is no subset-superset relationship between the P-matrices and the positive stable matrices; not all positive stable matrices are P-matrices and not all P-matrices are positive stable. However, being an M-matrix or a P-matrix does have implications for eigenvalue location. There is an open angular wedge anchored at the origin (extending into the left half-plane if n > 2), symmetric about the real axis, that is the union of all the eigenvalues of all P-matrices in M_n(R). There is a smaller angular wedge anchored at the origin (open if n > 2 and equal to the positive real axis if n = 2), symmetric about the real axis and strictly contained within the right half-plane, that is the union of all the eigenvalues of all M-matrices in M_n(R).

2.5.9 Theorem. Let A ∈ M_n(R) be given with n ≥ 2.

(a) Assume that E_k(A) > 0 for all k = 1, ..., n, where E_k(A) is the sum of all the principal minors of A of order k. In particular, this condition is satisfied if A is a P-matrix or a P_0^+-matrix. Then every eigenvalue of A lies in the open angular wedge W_n ≡ {z = re^{iθ} : r > 0, |θ| < π − π/n}. Moreover, every point in W_n is an eigenvalue of some n-by-n P-matrix.

(b) Assume that A is an M-matrix.
Then every eigenvalue of A lies in the open angular wedge W_n' ≡ {z = re^{iθ} : r > 0, |θ| < π/2 − π/n} if n > 2, and in W_2' = (0, ∞) if n = 2. Moreover, every point in W_n' is an eigenvalue of some n-by-n M-matrix.

Proofs of the assertions in (a) and (b) are outlined in Problem 7 of Section (2.1) and Problem 20 at the end of this section, respectively. Constructions to show that every point in the respective angular regions is an eigenvalue of a matrix of the given type are given in Problems 23 and 21-22.

2.5.10 Definition. The comparison matrix M(A) = [m_ij] of a given matrix A = [a_ij] ∈ M_n is defined by

m_ij = |a_ii| if i = j, and m_ij = −|a_ij| if i ≠ j

A given matrix A ∈ M_n is called an H-matrix if its comparison matrix M(A) is an M-matrix.

Notice that we always have M(A) ∈ Z_n, that M(A) = A if and only if A ∈ Z_n, and that M(A) = |I∘A| − (|A| − |I∘A|) is the difference between a nonnegative diagonal matrix and a nonnegative matrix with zero diagonal. Here, |X| = [|x_ij|] denotes the entrywise absolute value of X = [x_ij]. Furthermore, a given matrix A ∈ M_n is an M-matrix if and only if A ∈ Z_n, A has positive diagonal entries, and A is an H-matrix. The representation A = |I∘A| − (|A| − |I∘A|) = (I∘A) − [(I∘A) − A] for an M-matrix is an alternative to the usual representation A = αI − P that can be very useful; see Problem 6(c) and Corollary (7.4.4).

Exercise. Show that a given matrix A ∈ M_n is an H-matrix if and only if there is a diagonal matrix D ∈ M_n such that AD is strictly row diagonally dominant.

Exercise. Show that an H-matrix with real entries is positive stable if and only if it has positive diagonal entries.

Exercise. Show that an H-matrix with positive diagonal entries has a positive diagonal Lyapunov solution and is, therefore, D-stable. Hint: Use (2.5.3.15).

Exercise. Show that a strictly diagonally dominant matrix with positive diagonal entries has a positive diagonal Lyapunov solution and is, therefore, D-stable.

Exercise. Show that a nonsingular triangular matrix T ∈ M_n is an H-matrix.

Exercise. Show that a triangular matrix T ∈ M_n is D-stable if and only if its diagonal entries have positive real part.
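The comparison matrix M(A) of (2.5.10) is straightforward to compute. A minimal NumPy sketch (the example matrix is our illustrative choice, not from the text) that certifies a matrix outside Z_n as an H-matrix:

```python
import numpy as np

def comparison_matrix(A):
    """M(A): |a_ii| on the diagonal, -|a_ij| off the diagonal."""
    M = -np.abs(np.asarray(A, dtype=float))
    np.fill_diagonal(M, np.abs(np.diag(A)))
    return M

A = np.array([[3.0, -2.0], [1.0, 2.0]])        # not in Z_2, since a_21 > 0
M = comparison_matrix(A)                        # [[3, -2], [-1, 2]]

# M(A) is an M-matrix: its inverse is entrywise nonnegative, which is
# condition (2.5.3.17).  Hence A is an H-matrix; its diagonal entries
# are positive, and indeed A turns out to be positive stable.
assert (M == np.array([[3.0, -2.0], [-1.0, 2.0]])).all()
assert (np.linalg.inv(M) >= 0).all()
assert all(np.linalg.eigvals(A).real > 0)
```

Checking M(A) through any single condition of (2.5.3), as done here with the nonnegativity of the inverse, is enough, since the theorem makes all eighteen conditions equivalent on Z_n.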
For A = [a_ij] ∈ M_n, we say that A is strictly row diagonally dominant if

|a_ii| > Σ_{j≠i} |a_ij| for i = 1, ..., n

and we say that A is strictly column diagonally dominant if A^T is strictly row diagonally dominant. A weaker, but also important, concept is the following:

2.5.11 Definition. A matrix A = [a_ij] ∈ M_n is said to be strictly diagonally dominant of its row entries (respectively, of its column entries) if |a_ii| > |a_ij| (respectively, |a_ii| > |a_ji|) for each i = 1, ..., n and all j ≠ i.

Exercise. Note that if A ∈ M_n is strictly row diagonally dominant, then it must be strictly diagonally dominant of its row entries, but not conversely.

Exercise. Let A = [a_ij] ∈ M_n(R) have positive diagonal entries. Show that A is strictly row diagonally dominant if and only if

a_ii + Σ_{j≠i} ε_j a_ij > 0

for every choice of signs ε_j = ±1 (there are 2^{n-1} such choices), for every i = 1, ..., n.

Exercise. If A = [a_ij] ∈ M_n(R) is strictly row or column diagonally dominant, show that det A has the same sign as the product a_11 a_22 ··· a_nn of its main diagonal entries. Hint: Either reduce to the case in which all a_ii > 0, or let D ≡ diag(a_11, ..., a_nn) and consider D + ε(A − D) as ε decreases from 1 to 0. In either case, Geršgorin's theorem (Theorem (6.1.1) in [HJ]) is the key.

2.5.12 Theorem. If A ∈ M_n(R) is strictly row diagonally dominant, then A^{-1} is strictly diagonally dominant of its column entries.

Proof: The Levy-Desplanques theorem ensures that A is invertible because 0 cannot be an eigenvalue of a strictly (row or column) diagonally dominant matrix; see (5.6.17) in [HJ]. Let A = [a_ij] and denote A^{-1} = [α_ij]. Since α_ij = (−1)^{i+j} det A_ji/det A, where A_ji denotes the submatrix of A with row j and column i deleted, it suffices to show that |det A_ii| > |det A_ij| for all i, j = 1, ..., n and all j ≠ i. Without loss of generality, we take i = 1 and j = 2 for convenience and, via multiplication of each row of A by ±1, we may also assume that a_ii > 0 for i = 1, ..., n. By the preceding exercise, det A_11 > 0.
To complete the proof, it suffices to show that det A_11 ± det A_12 > 0. But, for ε = ±1,

det A_11 + ε det A_12 = det [a_22 + ε a_21   a_23   ···   a_2n]
                            [a_32 + ε a_31   a_33   ···   a_3n]
                            [      ···                        ]
                            [a_n2 + ε a_n1   a_n3   ···   a_nn]

By the two preceding exercises, the latter matrix is strictly row diagonally dominant for ε = ±1, and its determinant is positive because all its diagonal entries are positive. □

Exercise. Let A = [a_ij] ∈ M_n be of the form A = PMD, in which P ∈ M_n(R) is a permutation matrix, D ∈ M_n is diagonal and nonsingular, and M ∈ M_n is strictly row diagonally dominant. Show that any matrix B = [b_ij] ∈ M_n such that |b_ij| = |a_ij|, i, j = 1, ..., n, is nonsingular.

One might ask if there is a converse to the observation of the preceding exercise, that is, a kind of converse to the "diagonal dominance implies nonsingularity" version of Geršgorin's theorem. An affirmative answer is provided by a theorem of Camion and Hoffman.

2.5.13 Definition. A matrix B = [b_ij] ∈ M_n is said to be equimodular with a given matrix A = [a_ij] ∈ M_n if |b_ij| = |a_ij| for all i, j = 1, ..., n.

2.5.14 Theorem. Let A ∈ M_n be given. Every B ∈ M_n that is equimodular with A is nonsingular if and only if A may be written as A = PMD, where P ∈ M_n(R) is a permutation matrix, D ∈ M_n(R) is a positive diagonal matrix, and M ∈ M_n is strictly row diagonally dominant.

Exercise. Suppose that A ∈ M_n(R) is a matrix of the type characterized in Theorem (2.5.14). Show that there is a generalized diagonal such that the sign of det B is determined by the product of the entries along this diagonal for every B ∈ M_n(R) that is equimodular with A.

Exercise. Show that a matrix of the type characterized in Theorem (2.5.14) is just a permutation multiple of an H-matrix.

Problems

1. If A ∈ M_n is an M-matrix and P ∈ M_n(R) is a permutation matrix, show
