


Eigenvalues, Eigenvectors
(CDT-28)
Luciano da Fontoura Costa
[email protected]

São Carlos Institute of Physics – DFCM/USP

14th Apr 2020

Abstract
The concept and properties of eigenvalues and eigenvectors are presented in a concise and introductory manner. The
importance of eigenvalues and eigenvectors in several areas is also briefly illustrated with respect to characterization of
scalar field extrema, dynamical systems, Markov chains, and multivariate statistics.

“Know thyself.”

Delphic proverb.

1 Introduction

Eigenvalues, eigenvectors. These forever companion terms appear recurrently in many areas of science and technology. Among their many applications, we have the inference of the type of extrema (e.g. maximum, minimum, saddle point, etc.) of a multivariate function, characterization of the stability and convergence of dynamical systems, study of physical properties such as axes of inertia, characterization of human faces, study of neuronal network dynamics, development of important statistical methods (such as principal component analysis), detection of communities in network research, study of Markov processes... The list, which seems endless, continues to grow, helped by the fact that eigenvalues and eigenvectors can relate not only to matrices, but also to functions, dynamical states, graphs, etc.

Figure 1: An eigenvector \vec{u} of a linear transformation T will not be modified by it other than by a scaling by λ, its associated eigenvalue.

Though eigenvalues and eigenvectors (e.g. [1, 2, 3]) are frequently used in several scientific and technological areas, they are not always well understood conceptually. Yet, they are ultimately intuitive and relatively simple, provided they are approached from the perspective of their properties, such as preserving the inclination of vectors under a linear transformation (see Figure 1).

The present work aims at providing a concise introduction to the concept of eigenvalues and eigenvectors, also covering some of their most important and useful properties, and including a few application examples. We aimed at addressing applications from the eigenvector and eigenvalue perspective, in the sense that the respective concept and some important properties are presented as a preparation for discussing some of the many representative applications of eigenvalues and eigenvectors.

Before proceeding, it is interesting to consider the origin of these two terms. Since ‘values’ and ‘vectors’ are well-known mathematical entities, we focus on the term ‘eigen’. It derives from German, where ‘eigen’ means peculiar, proper, own, characteristic, self, inherent. Actually, the use of this term goes back at least to H. von Helmholtz’s ‘eigentöne’, signifying the tones and resonances produced by vibrating bodies or systems (e.g. [4]).

2 Some Basic Concepts

An N × N matrix A has the general form:

A = \begin{pmatrix} a_{1,1} & a_{1,2} & \ldots & a_{1,j} & \ldots & a_{1,N} \\ a_{2,1} & a_{2,2} & \ldots & a_{2,j} & \ldots & a_{2,N} \\ \vdots & \vdots & & & & \vdots \\ a_{i,1} & a_{i,2} & \ldots & a_{i,j} & \ldots & a_{i,N} \\ \vdots & \vdots & & & & \vdots \\ a_{N,1} & a_{N,2} & \ldots & a_{N,j} & \ldots & a_{N,N} \end{pmatrix}    (1)

A square matrix A is invertible if and only if its determinant is different from zero.

The trace of a square matrix A corresponds to the sum of the elements of its main diagonal.

A real, square matrix A is orthogonal if and only if A A^T = A^T A = I, where I is the identity matrix, a diagonal matrix with all diagonal elements equal to 1.

The conjugate transpose, or Hermitian transpose, of a complex matrix A is (A^*)^T = (A^T)^* = A^H. A complex, square matrix A is Hermitian if A^H = A, and unitary if A^H A = A A^H = I.

The spectral radius R_λ of a square matrix A is the maximum magnitude of its eigenvalues.

A square, nonnegative (all entries ≥ 0) matrix A is said to be primitive if for some positive integer k > 0 all elements of A^k are positive (> 0).

A graph can be associated to a real matrix A by making a_{i,j} = 1 whenever we have a directed connection from node j to node i, and a_{i,j} = 0 otherwise. Such a matrix is called the adjacency matrix of the graph.

Let A be a positive square matrix, and make each of its non-zero entries equal to 1, and 0 otherwise. This matrix can be understood as an adjacency matrix defining a respective graph.

A matrix A is irreducible if the resulting directed graph is strongly connected, i.e. given any two nodes i and j, there is a directed path from i to j. This can also be interpreted as: given any element a_{i,j} of A, there is an integer k > 0 such that the corresponding element p_{i,j} of P = A^k is positive (> 0).

So, we have that every primitive matrix (nonnegative) is irreducible, with all its elements becoming positive for the same power k, but not vice versa, implying that being primitive is a more restrictive condition than being irreducible. Also, every positive square matrix is necessarily irreducible.

The linear transformation of a vector \vec{x} into a vector \vec{y} can be expressed in matrix form as:

\vec{y} = A \vec{x}    (2)

where \vec{x} = (x_1, x_2, \ldots, x_N) in \Re^N is a column vector.

Two matrices A and B are said to be similar if a third matrix P can be found such that:

A = P^{-1} B P    (3)

A diagonalizable real matrix A is similar to a diagonal matrix D, i.e. a matrix P exists such that:

A = P^{-1} D P,    (4)

in which case the corresponding D = P A P^{-1} is the diagonalization of A.

Let A be a symmetric matrix. Matrix A determines a respective quadratic form:

q_A(\vec{x}) = \vec{x}^T A \vec{x}    (5)

The sign of the quadratic form of a real symmetric matrix A can be taken into account while classifying it into several types, as described in the following:

Positive definite: q_A(\vec{x}) > 0 for any vector \vec{x} ≠ \vec{0}.
Negative definite: q_A(\vec{x}) < 0 for any vector \vec{x} ≠ \vec{0}.
Positive semidefinite: q_A(\vec{x}) ≥ 0 for any vector \vec{x} ≠ \vec{0}.
Negative semidefinite: q_A(\vec{x}) ≤ 0 for any vector \vec{x} ≠ \vec{0}.
Indefinite: det(A) ≠ 0 and A is neither positive definite nor negative definite.
Null-determinant case: if det(A) = 0, A can be indefinite, positive semidefinite or negative semidefinite.

Observe that a matrix being positive is not the same as being positive definite (and similarly for negative and negative definite).

Let's consider as an example the matrix A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. Its quadratic form is:

q_A(\vec{x}) = [x \; y] \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = [x \; y] \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + y^2.    (6)

Observe that the quadratic form of A corresponds to a second order polynomial on the variables x and y. This polynomial can only take positive values for (x, y) ≠ (0, 0), from which we conclude that A is positive definite, but it is not positive as it includes the element zero (however, it can be said to be nonnegative).
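For readers who wish to experiment, this classification can be checked numerically. The following minimal R sketch (the helper name classify_definiteness is ours, not part of the text) relies on the equivalence, summarized later in Table 2, between the sign of the quadratic form of a real symmetric matrix and the signs of its eigenvalues:

# Minimal R sketch: classify a real symmetric matrix by the signs of its eigenvalues,
# which is equivalent to the sign of its quadratic form (see also Table 2).
classify_definiteness <- function(A, tol = 1e-12) {
  ev <- eigen(A, symmetric = TRUE)$values   # eigenvalues of a symmetric matrix are real
  if (all(ev >  tol)) return("positive definite")
  if (all(ev < -tol)) return("negative definite")
  if (all(ev >= -tol)) return("positive semidefinite")
  if (all(ev <=  tol)) return("negative semidefinite")
  "indefinite"
}
classify_definiteness(diag(2))                      # the identity matrix: positive definite
classify_definiteness(matrix(c(1, 2, 2, 1), 2, 2))  # eigenvalues 3 and -1: indefinite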
3 Eigenvalues and Eigenvectors of a Matrix

We now proceed to defining and presenting some of the many important properties of the eigenvalues and respective eigenvectors of a matrix A.

Given a matrix A, which we will consider as being associated to its respectively implemented linear transformation, we can define one of its eigenvalues λ and respective eigenvector \vec{v} as obeying:

λ \vec{v} = A \vec{v}    (7)

Thus, an eigenvector \vec{v} of A is such that, when transformed by A, it yields a new vector λ \vec{v} that preserves the inclination of the original vector, though being scaled by the respective eigenvalue λ.

The eigenvalues of a matrix A can be determined by making the respective characteristic polynomial

p(λ) = det(A − λ I)    (8)

equal to zero, and solving the resulting secular (or characteristic) equation. Then, one possible way to obtain the respective eigenvectors is by substituting the eigenvalues, one by one, into Equation 7 and solving the respectively obtained systems of linear equations.

It is also interesting to consider the equation

\vec{r}^T A = λ \vec{r}^T    (9)

defining the left-eigenvectors of A with respect to λ (the eigenvectors of Equation 7 are accordingly called right-eigenvectors).

It can be shown that the eigenvalues λ_i are the same for both Equations 7 and 9, but the respective eigenvectors \vec{v} and \vec{r} are generally not identical, unless A is symmetric. This can be verified by transposing Equation 9 and comparing it with Equation 7.

Let's consider the determination of the eigenvalues and respective eigenvectors of A = \begin{pmatrix} 1 & 1 \\ 0 & −1 \end{pmatrix}. We have:

p(λ) = det(A − λ I) = det \begin{pmatrix} 1 − λ & 1 \\ 0 & −1 − λ \end{pmatrix} = (1 − λ)(−1 − λ) = λ^2 − 1 = 0  ⟹  λ_1 = 1 and λ_2 = −1

For λ_1 = 1, we go back to Equation 7:

A \vec{v} = λ_1 \vec{v} = \vec{v}  ⟹  v_y = 0, with v_x arbitrary (e.g. v_x = 1, v_y = 0).    (10)

The reader is encouraged to find the eigenvector corresponding to λ_2 = −1.
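A quick numerical check of this example can be performed with base R's eigen() function:

# Quick numerical check of the example above using base R's eigen().
A <- matrix(c(1, 1,
              0, -1), nrow = 2, byrow = TRUE)
e <- eigen(A)
e$values    # 1 and -1, as obtained from the characteristic polynomial
e$vectors   # columns hold normalized eigenvectors, paired with the entries of e$values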
Given a matrix A, its eigenstructure can be understood as defined by its respective set of eigenvalues and eigenvectors, while the set of eigenvalues is called the spectrum of A.

Given an N × N matrix A, one of its eigenvalues λ_i may correspond to a multiple root of the characteristic equation, implying that eigenvalue to have a respective algebraic multiplicity µ_A(λ_i).

An eigenvalue λ_i can also have a respective geometric multiplicity, γ_A(λ_i), which is given as:

γ_A(λ_i) = N − rank(A − λ_i I)    (11)

It can be shown that:

1 ≤ γ_A(λ_i) ≤ µ_A(λ_i) ≤ N    (12)

If γ_A(λ_i) > 1, implying that there are two or more linearly independent eigenvectors with the same eigenvalue, the respective λ_i is said to be degenerate. In case all eigenvalues have multiplicity 1, the spectrum is said to be simple.

The Perron-Frobenius theorem (e.g. [5]) provides important information regarding the eigenvalue and eigenvector properties of a square, nonnegative, irreducible matrix A. Among other things, it implies that the respective R_λ exists and is a simple root of the respective characteristic polynomial. We also have that there is an eigenvector associated to R_λ that has strictly positive elements.

A list of several additional eigenvalue and eigenvector properties is given in Table 1, while Table 2 includes some eigenvalue and eigenvector properties related to the positive or negative definiteness of a symmetric real matrix A.

Given an N × N matrix A with respective non-degenerate eigenvalues λ_i, i = 1, 2, \ldots, N, with associated linearly independent eigenvectors \vec{v}_i, we can define the following diagonal matrix:

Λ = \begin{pmatrix} λ_1 & 0 & \ldots & 0 \\ 0 & λ_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & λ_N \end{pmatrix},    (13)

as well as the matrix of corresponding eigenvectors, arranged as columns:

V = \begin{pmatrix} \vec{v}_1 & \vec{v}_2 & \ldots & \vec{v}_N \end{pmatrix}.    (14)

The previous Equation 7 can now be expressed in a ‘parallel’ manner (e.g. [1]) as:

A V = V Λ.    (15)

Therefore, this equation expresses the interrelationships between all eigenvalues and eigenvectors of the original matrix A.

If we left-multiply both sides of Equation 15 by V^{-1}, we get:

V^{-1} A V = Λ,    (16)

which is known as the eigendecomposition of A. Observe that this decomposition corresponds to the diagonalization of A.
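The eigendecomposition can also be verified numerically. The following R sketch uses an arbitrarily chosen symmetric 2 × 2 matrix (our choice, for illustration only) and also anticipates properties [P5] and [P6] of Table 1:

# Numerical illustration (R) of Equations 15-16 for an arbitrarily chosen symmetric matrix,
# together with properties [P5] and [P6] of Table 1.
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
e <- eigen(A)
V      <- e$vectors
Lambda <- diag(e$values)
A %*% V - V %*% Lambda         # approximately the zero matrix: A V = V Lambda (Equation 15)
solve(V) %*% A %*% V           # approximately Lambda: the eigendecomposition (Equation 16)
c(sum(diag(A)), sum(e$values)) # trace equals the sum of the eigenvalues ([P5])
c(det(A), prod(e$values))      # determinant equals the product of the eigenvalues ([P6])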

The above result allows us to obtain all the N eigenvalues (in case they exist) given the matrix A and its respective eigenvectors.

Observe that a matrix A may not have an inverse. In this case, its singular value decomposition can be used for the respective diagonalization (e.g. [1]), being applicable even to non-square matrices.

Now, if we right-multiply both sides of Equation 15 by V^{-1}, and assuming a set of eigenvalues given as Λ, we have:

A = V Λ V^{-1},    (17)

which provides a method for designing a matrix A with pre-specified eigenvalues and eigenvectors.

Given a (column) vector \vec{v} with at least 2 components, it is easy to obtain a matrix A of which it is an eigenvector. This can be done as:

A = \vec{v} \vec{v}^T    (18)

If \vec{p} is another vector orthogonal to \vec{v} with the same dimension, a matrix A having \vec{v} and \vec{p} as eigenvectors, but a priori unspecified eigenvalues, can now be obtained as:

A = \vec{v} \vec{v}^T + \vec{p} \vec{p}^T    (19)

Up to N vectors of dimension N × 1, each one orthogonal to all the others, can be combined in this manner so as to obtain an N × N matrix A having them as eigenvectors.
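These constructions can be sketched in a few lines of R; the specific vectors and eigenvalues below are arbitrary choices of ours, used only to illustrate Equations 17–19:

# R sketch: designing a matrix with prescribed eigenvectors, via Equation 17 and via the
# outer-product construction of Equations 18-19 (vectors and eigenvalues chosen arbitrarily).
v1 <- c(1, 1) / sqrt(2)                      # two orthonormal column vectors
v2 <- c(1, -1) / sqrt(2)
V  <- cbind(v1, v2)
Lambda <- diag(c(3, 0.5))                    # pre-specified eigenvalues
A1 <- V %*% Lambda %*% solve(V)              # Equation 17
A2 <- 3 * v1 %*% t(v1) + 0.5 * v2 %*% t(v2)  # weighted sum of outer products (cf. Equations 18-19)
all.equal(A1, A2)                            # TRUE, up to floating-point precision
eigen(A1)$values                             # recovers the chosen eigenvalues 3 and 0.5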
4 Gershgorin Discs

Proposed by the Russian applied mathematician S. A. Gershgorin [6], the discs, as well as the respective theorem that carries his name, provide an interesting resource for bounding the eigenvalues of complex matrices.

Given an N × N complex matrix A, let the radius of the discs be calculated as:

R_i = \sum_{j=1, j ≠ i}^{N} |a_{i,j}|    (20)

Observe that R_i is a real value corresponding to the sum of the absolute values of all off-diagonal elements along the i-th row of A. The Gershgorin disc for row i, G(i), is a closed disc defined in the complex plane with center at a_{i,i} and radius R_i. Therefore, for a real matrix, all such discs are centered at points along the real axis.

Gershgorin's theorem states that all eigenvalues of A will necessarily be found inside the union of all the N respective Gershgorin discs, which therefore bound the eigenvalues of a matrix.

The following R algorithm can be used to visualize the Gershgorin discs of the N × N input matrix A. Observe that Nth is the angular resolution for plotting the discs.

Algorithm 1 Gershgorin(A, N)

1. Nth ← 300
2. dth ← 2 ∗ pi/(Nth − 1)
3. th ← seq(0, 2 ∗ pi, dth)
4. a ← 7
5. for (i in seq(1, N)) {
   (a) R ← sum(abs(A[i, ])) − abs(A[i, i])
   (b) Gx ← R ∗ cos(th) + A[i, i]
   (c) Gy ← R ∗ sin(th)
   (d) plot(Gx, Gy, xlim = c(−a, a), ylim = c(−a, a), type = "l")
   (e) par(new = TRUE) }

Figure 2 illustrates the Gershgorin discs (in salmon), as well as the actual three eigenvalues (green triangles, calculated by using an eigenvalue/eigenvector library), of the following matrix:

A = \begin{pmatrix} −1 & 1 & 0 \\ −2 & 2 & −1 \\ 1 & 2 & 0 \end{pmatrix}.    (21)

Figure 2: The Gershgorin discs for the matrix A in Eq. 21 shown in salmon, as well as the three respective actual eigenvalues, shown as green triangles.

The above results allow some interesting interpretations and predictions about the eigenvalues of a matrix. For instance, we infer that the higher the dispersion of the values of the elements along the diagonal of A, while preserving all the other entries, the higher also will be the dispersion of the discs along the real axis.
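Complementing the plot produced by Algorithm 1, the disc centers, radii and actual eigenvalues of the matrix in Equation 21 can also be compared directly (a short R check):

# Complementary R check for the matrix of Equation 21: disc centers and radii (Equation 20)
# versus the actual eigenvalues, which must lie inside the union of the discs.
A <- matrix(c(-1, 1,  0,
              -2, 2, -1,
               1, 2,  0), nrow = 3, byrow = TRUE)
centers <- diag(A)
radii   <- rowSums(abs(A)) - abs(diag(A))   # off-diagonal absolute row sums
cbind(centers, radii)
eigen(A)$values                             # the three eigenvalues of A (compare with Figure 2)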

In addition, the larger the absolute values along the respective rows of A, the higher the uncertainty, as inferred by this approach, in bounding the eigenvalues. However, we should not take larger bounding regions to necessarily imply larger eigenvalues.

An interesting situation arises when we have a row i in which only the respective element corresponding to the diagonal, a_{i,i}, is non-zero. In this case, one of the eigenvalues will necessarily be equal to a_{i,i}.

5 Extrema of Multivariate Functions

In our first application example, we will briefly address the classification of the extrema of a scalar field ψ(x_1, x_2, \ldots, x_N) defined on \Re^N.

First, we identify the points yielding a null respective gradient, i.e. ||\vec{∇}ψ(x_1, x_2, \ldots, x_N)|| = 0. These so-called critical points are candidates for being extrema of ψ. However, additional testing is required, considering the eigenvalues of the Hessian matrix of ψ, which for three variables (x, y, z) reads:

Hψ(\vec{p}) = \begin{pmatrix} \frac{∂^2 ψ}{∂x^2} & \frac{∂^2 ψ}{∂x ∂y} & \frac{∂^2 ψ}{∂x ∂z} \\ \frac{∂^2 ψ}{∂y ∂x} & \frac{∂^2 ψ}{∂y^2} & \frac{∂^2 ψ}{∂y ∂z} \\ \frac{∂^2 ψ}{∂z ∂x} & \frac{∂^2 ψ}{∂z ∂y} & \frac{∂^2 ψ}{∂z^2} \end{pmatrix}    (22)

The following criteria can then be used while trying to classify the types of extrema of a scalar field:

1. Positive-definite Hessian: All eigenvalues of H(x̃, ỹ, z̃) are positive (i.e. the Hessian is positive definite) ⟹ (x̃, ỹ, z̃) is a local minimum point;

2. Negative-definite Hessian: All eigenvalues of H(x̃, ỹ, z̃) are negative (i.e. the Hessian is negative definite) ⟹ (x̃, ỹ, z̃) is a local maximum point;

3. Indefinite Hessian: The eigenvalues of H(x̃, ỹ, z̃) are a mixture of positive and negative values ⟹ (x̃, ỹ, z̃) is a saddle point;

4. Otherwise: One or more null eigenvalues ⟹ additional analysis is needed.

Observe the importance of the eigenvalues of H(ψ) in identifying the types of extrema of a scalar field.

Let's illustrate the identification of the extrema of the scalar field ψ = x^2 − y^2. Its gradient is \vec{∇}ψ = (2x, −2y), which has null magnitude only when x = y = 0, so that we have \tilde{X} = (0, 0) as the sole critical point of ψ. The Hessian of ψ is given as:

H(\tilde{X}) = \begin{pmatrix} 2 & 0 \\ 0 & −2 \end{pmatrix}.    (23)

We have, from property [P4] in Table 1, that this H(\tilde{X}) has eigenvalues λ_1 = 2 and λ_2 = −2, implying the found critical point to be a saddle point.
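This conclusion can be checked directly by inspecting the signs of the eigenvalues of the Hessian at the critical point (a two-line R check):

# Checking the example in R: the Hessian of psi = x^2 - y^2 at the critical point (0, 0).
H <- matrix(c(2,  0,
              0, -2), nrow = 2, byrow = TRUE)
eigen(H, symmetric = TRUE)$values   # 2 and -2: mixed signs, hence a saddle point (criterion 3)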
6 Linear Dynamical Systems

A homogeneous (no constant terms in the equations) linear dynamical system involving N variables (\Re^N) with constant coefficients can be expressed as:

S: \begin{cases} \dot{x}_1(t) = a_{1,1} x_1(t) + a_{1,2} x_2(t) + \ldots + a_{1,N} x_N(t) \\ \dot{x}_2(t) = a_{2,1} x_1(t) + a_{2,2} x_2(t) + \ldots + a_{2,N} x_N(t) \\ \ldots \\ \dot{x}_N(t) = a_{N,1} x_1(t) + a_{N,2} x_2(t) + \ldots + a_{N,N} x_N(t) \end{cases}

where a_{i,j}, i, j = 1, 2, \ldots, N are real values.

The system S can be placed in the equivalent matrix form:

\begin{pmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \vdots \\ \dot{x}_N(t) \end{pmatrix} = \begin{pmatrix} a_{1,1} & a_{1,2} & \ldots & a_{1,N} \\ a_{2,1} & a_{2,2} & \ldots & a_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N,1} & a_{N,2} & \ldots & a_{N,N} \end{pmatrix} \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_N(t) \end{pmatrix}    (24)

or, more synthetically:

\dot{\vec{x}}(t) = A \vec{x}(t),    (25)

where A is an N × N matrix, and both \dot{\vec{x}} and \vec{x} are column vectors.

It can be shown (e.g. [7]) that the general solution of a linear system of ordinary differential equations with constant coefficients can be expressed as:

\vec{x}(t) = c_1 e^{λ_1 t} \vec{v}_1 + c_2 e^{λ_2 t} \vec{v}_2 + \ldots + c_N e^{λ_N t} \vec{v}_N    (26)

where c_1, c_2, \ldots, c_N are constants, and λ_1, λ_2, \ldots, λ_N are the eigenvalues of A with respective eigenvectors \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_N, provided they can be found.

In case an initial condition \vec{x}_0 is provided, the constants can be determined as:

\vec{C} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} = \begin{pmatrix} \vec{v}_1 & \vec{v}_2 & \ldots & \vec{v}_N \end{pmatrix}^{-1} \begin{pmatrix} x_{0,1} \\ x_{0,2} \\ \vdots \\ x_{0,N} \end{pmatrix}    (27)

As an example, let's consider the solution of the linear ODE system with:

A = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 0 & 1 \\ 0 & 1 & 3 \end{pmatrix}    (28)
We obtain λ_1 = 3.81912..., λ_2 = 0.09043... + 1.14062...i, and λ_3 = 0.09043... − 1.14062...i, with respective eigenvectors:

\vec{v}_1 = \begin{pmatrix} 0.63557... \\ 0.48922... \\ 0.59725... \end{pmatrix}; \quad \vec{v}_2 = \begin{pmatrix} 0.15670... + 0.50738...i \\ 0.80703... \\ −0.24042... − 0.09425...i \end{pmatrix}; \quad \vec{v}_3 = \begin{pmatrix} 0.15670... − 0.50738...i \\ 0.80703... \\ −0.24042... + 0.09425...i \end{pmatrix}.    (29)

Considering the initial condition:

\vec{x}_0 = \begin{pmatrix} 0 \\ −1 \\ 1 \end{pmatrix}    (30)

we get the constants:

\vec{C} = \begin{pmatrix} 0.87520... \\ −0.88482... + 0.27487...i \\ −0.88482... − 0.27487...i \end{pmatrix}    (31)

Figure 3 shows the three obtained solutions x_1(t), x_2(t), x_3(t) for a period of time starting at t = 0.

Figure 3: The solutions of the ODE in the considered example unfolding along time.

where A = B + I and in which the next state ~xt+1


depends only on a linear combination of the current states
corresponding to the entries of ~x(t).
An initial condition can be expressed as ~x(t = 0).
Let A be an N ×N real matrix. It is said to be stochastic
or Markov if and only if it only contains nonnegative real
entries and the sums of each of its columns is equal to 1.
As a simple example, we have:
 
0.32 0.49
A=
0.68 0.51
1 1

Observe that each of the rows of this type of matrix A


can be understood as a normalized set of probabilities, in
the sense that they add to 1.
Figure 3: The solutions of the ODE in the considered example un- It is also interesting to consider a row -stochastic ma-
folding along time. trix, which has all rows adding up to one. In this case, we
shall call the previous type of matrix as being column-
stochastic. A matrix that has these two properties is
called a doubly stochastic matrix.
7 Markov Chains Stochastic matrices can be sn to have all eigenvalues
with magnitudes smaller or equal to 1. This follows from
Markov chains constitute and interesting statistical ap- Gershgorin theorem: we have that each of the rows of a
proach (e.g. [8]) that is largely applicable. row-stochastic matrix adds up to 1, so we know by that

In addition, we have that every column-stochastic matrix has at least one left-eigenvector identical to \vec{1} (a row vector with all entries identical to 1), because \vec{1} A = \vec{1} effectively implements the sum of each of the columns of A. This vector is associated to the eigenvalue 1, so every row- and column-stochastic matrix has at least one eigenvalue equal to 1.

If A is irreducible, we have from the Perron-Frobenius theorem that the eigenvector associated to λ = 1 can be placed in a form with strictly positive elements. This eigenvector will be associated to the stationary state of the respective Markov chain, also implying that every respective state will have a non-null probability.

It should be kept in mind that a stochastic matrix A can have: (i) more than one eigenvalue equal to 1; (ii) eigenvalues equal to zero; (iii) negative eigenvalues; (iv) complex eigenvalues.

If A is a stochastic matrix, Equation 36 defines a respective Markov chain on the states in \vec{x}(t).

Let's consider that A is irreducible and regular. This effectively means that the state x_i(t) associated to any node i of the graph representing A will, along time, influence any of the other nodes with a non-null contribution.

As already observed, A has the eigenvalue 1; for a column-stochastic matrix, the associated (right-)eigenvector corresponds to the equilibrium or stationary distribution of probabilities of the Markov chain states, i.e.:

A \vec{p} = \vec{p}    (37)

Interestingly, this eigenvector does not depend on the initial state \vec{x}(t = 0), and therefore has no ‘memory’ of the past dynamics or initial condition.

Let's consider the simple example of Markov chain presented in Figure 4.

Figure 4: The graph associated to the stochastic matrix in the considered example. It is also interesting to imagine a uniformly random walk performed by a hypothetical agent along this graph, taking the outgoing links according to the respective transition probabilities.

The respective transition matrix A can be obtained as:

A = \begin{pmatrix} 0.3 & 0 & 0.1 & 0.6 \\ 0.7 & 0.9 & 0 & 0.1 \\ 0 & 0.1 & 0.5 & 0 \\ 0 & 0 & 0.4 & 0.3 \end{pmatrix}.    (38)

This matrix can be verified to be irreducible. The respective eigenvalues are λ_1 = 1, λ_2 = 0.45461... + 0.30132...i, λ_3 = 0.4546... − 0.3013...i, and λ_4 = 0.0907..., with corresponding eigenvectors:

\vec{v}_1 = \begin{pmatrix} −0.122... \\ −0.967... \\ −0.193... \\ −0.110... \end{pmatrix}; \quad \vec{v}_2 = \begin{pmatrix} 0.435... − 0.334...i \\ −0.745... \\ 0.036... + 0.241...i \\ −0.273... + 0.092...i \end{pmatrix}; \quad \vec{v}_3 = \begin{pmatrix} 0.435... + 0.334...i \\ −0.745... \\ 0.036... − 0.241...i \\ −0.273... − 0.092...i \end{pmatrix}; \quad \vec{v}_4 = \begin{pmatrix} −0.734... \\ 0.600... \\ −0.146... \\ 0.280... \end{pmatrix}

Observe the coexistence of real and complex eigenvalues and eigenvectors.

As expected, A has one eigenvalue identical to 1, with an associated real eigenvector corresponding to the stationary state. This can be transformed into probabilities by normalizing \vec{p}_1 = \vec{v}_1 / sum(\vec{v}_1), which yields:

\vec{p}_1 = \begin{pmatrix} 0.0878... \\ 0.6940... \\ 0.1388... \\ 0.0793... \end{pmatrix}

In case we understand the transition probabilities in A as corresponding to a uniformly random walk on the respective system, the obtained distribution \vec{p}_1 indicates that node 2 will be much more frequently visited than the others, followed by the third, first and fourth nodes.

Figure 5 illustrates the unfolding of the state values associated to the nodes along the discrete time steps t = 0, 1, 2, \ldots, 10.

Figure 5: The values of the state probabilities (frequency of visits by a hypothetical agent) along the discrete time t from the initial condition \vec{x}_0 = [0, 0, 0, 1]. Observe how, as a consequence of the interconnections and respective probability transitions, the value of the state of node 4, initially equal to 1, decreases quickly, as node 2 progressively concentrates the density of visiting agents.
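A short R sketch for this example extracts the stationary distribution from the eigenvector associated to the eigenvalue 1 and iterates Equation 36 from the initial condition used in Figure 5:

# R sketch for this example: stationary distribution of the matrix in Equation 38 and a few
# iterations of Equation 36 from the initial condition of Figure 5.
A <- matrix(c(0.3, 0.0, 0.1, 0.6,
              0.7, 0.9, 0.0, 0.1,
              0.0, 0.1, 0.5, 0.0,
              0.0, 0.0, 0.4, 0.3), nrow = 4, byrow = TRUE)
colSums(A)                           # each column adds up to 1 (column-stochastic)
e <- eigen(A)
k <- which.min(abs(e$values - 1))    # locate the eigenvalue (numerically) equal to 1
p <- Re(e$vectors[, k])
p <- p / sum(p)                      # stationary distribution, approx. (0.088, 0.694, 0.139, 0.079)
x <- c(0, 0, 0, 1)                   # initial condition
for (t in 1:10) x <- A %*% x         # Equation 36
cbind(p, x)                          # the iterated state approaches the stationary distribution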
8 Multivariate Statistics

The multivariate normal distribution (e.g. [9]) is particularly important for modeling and trying to make predictions on a whole set of random variables (measurements).

Considering an N-dimensional domain, the multivariate normal probability density function with average vector \vec{µ} and covariance matrix K can be expressed as:

g_{\vec{µ}, K}(\vec{X}) = \frac{1}{(2π)^{N/2}} |K|^{−1/2} \exp\left(−\frac{1}{2} (\vec{X} − \vec{µ})^T K^{−1} (\vec{X} − \vec{µ})\right)    (39)

Observe the quadratic form of K^{−1} in the argument of the exponential above.

Given N random variables X_i, i = 1, 2, \ldots, N, represented as the random vector \vec{X}, and their respective joint probability density function, the corresponding covariances can be defined as:

cov(X_i, X_j) = \int_{−∞}^{∞} (X_i − µ_{X_i})(X_j − µ_{X_j}) \, p(\vec{X}) \, d\vec{X}    (40)

The respective unbiased estimator, from M samples of the random variables, is given as:

cov(X_i, X_j) ≈ \frac{1}{M − 1} \sum_{k=1}^{M} (x_{i,k} − \tilde{µ}_{X_i})(x_{j,k} − \tilde{µ}_{X_j})    (41)

where x_{i,k} is the k-th sample of X_i and \tilde{µ}_{X_i} is the respective sample mean.

The covariance matrix K can now be defined so that each of its elements is k_{i,j} = cov(X_i, X_j). This matrix has important properties, some of which are presented as follows.

First, we have, as a consequence of its own definition, that it is necessarily real and symmetric. Then, as the diagonal elements correspond to the respective variances (i.e. var(X_i) = cov(X_i, X_i), an average of squared values), all its diagonal elements are necessarily nonnegative, and so is its trace, implying that the respective eigenvalues add up to a nonnegative value.

Indeed, the covariance matrix can be shown to be positive semidefinite, i.e. its eigenvalues are all larger than or equal to 0, and therefore its determinant is also nonnegative.

In addition, by being symmetric, its eigenvalues are all real, and the eigenvectors \vec{v}_i corresponding to distinct eigenvalues are orthogonal.

Therefore, if we define the matrix:

P = \begin{pmatrix} ← \vec{v}_1 → \\ ← \vec{v}_2 → \\ \vdots \\ ← \vec{v}_N → \end{pmatrix},    (42)

it will be orthogonal. We can apply this matrix on the original random vector, yielding a new random vector \vec{Y} whose elements are linear combinations of the original random variables, i.e.:

\vec{Y} = P \vec{X}    (43)

which corresponds to a linear statistical transformation known as the discrete Karhunen-Loève transform, which provides the basis for the Principal Component Analysis (PCA) methodology [10, 11]. PCA implements a rotation of the original coordinate axes so as to align the first axes with the directions of largest variation, as quantified by the respective variances. The obtained random variables are completely uncorrelated, and their variances are equal to the eigenvalues associated to the respective axes.
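A minimal R sketch of this transformation on synthetic (assumed) data: the covariance matrix is estimated, diagonalized, and the transformed variables are verified to be uncorrelated, with variances equal to the respective eigenvalues:

# Minimal R sketch of Equations 42-43 on synthetic data (the data and seed are assumed,
# chosen only for illustration).
set.seed(1)
X <- cbind(rnorm(500), rnorm(500))
X[, 2] <- 0.8 * X[, 1] + 0.3 * X[, 2]     # introduce correlation between the two variables
K <- cov(X)                                # sample covariance matrix (Equation 41)
e <- eigen(K, symmetric = TRUE)            # real, nonnegative eigenvalues; orthogonal eigenvectors
P <- t(e$vectors)                          # rows are the eigenvectors, as in Equation 42
Y <- t(P %*% t(sweep(X, 2, colMeans(X))))  # Y = P X applied to the centered data (Equation 43)
round(cov(Y), 10)                          # approximately diagonal, with the eigenvalues of K on it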
9 Concluding Remarks

The present work presented, briefly and in an introductory manner, the concept, properties and applications of eigenvalues and eigenvectors from an ‘eigen-centered’ position. By starting with a review of some of their important properties, it was possible to discuss the subsequent applications in a more integrated and systematic way. The interesting Gershgorin approach was also briefly outlined and illustrated.

The addressed eigenvalue and eigenvector applications included scalar field extrema characterization, solution of linear dynamical systems with constant coefficients, Markov chains, as well as some aspects of multivariate statistics.

The already large potential of theoretical and practical applications of eigenvalues and eigenvectors is being constantly further enhanced thanks to continuing advances in computer science, allowing respective calculations on matrices of ever increasing sizes. This opens up new prospects in theoretical and applied research. It is hoped that the covered presentation may motivate the reader to probe further in this interesting area.

Acknowledgments.

Luciano da F. Costa thanks CNPq (grant no. 307085/2018-0) for sponsorship. This work has benefited from FAPESP grant 15/22308-2.

References

[1] G. H. Golub and C. F. van Loan. Matrix Computations. The Johns Hopkins University Press, 1996.

[2] Wikipedia. Eigenvalues and eigenvectors. Wikipedia, the free encyclopedia, 2020. https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors. [Online; accessed 10-Apr-2020.]

[3] H. Sagan. Boundary and Eigenvalue Problems in Mathematical Physics. Dover, 1989.

[4] H. von Helmholtz. Die Lehre von den Tonempfindungen als Physiologische Grundlage für die Theorie der Musik. Braunschweig: Druck und Verlag von Friedrich Vieweg und Sohn, 1896.

[5] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 2012.

[6] S. Gershgorin. Über die Abgrenzung der Eigenwerte einer Matrix. Izv. Akad. Nauk. URSS Otd. Fiz.-Mat. Nauk, pages 749–754, 1931.

[7] R. K. Nagle, E. B. Saff, and A. D. Snider. Fundamentals of Differential Equations. Pearson, 2017.

[8] J. G. Kemeny and J. L. Snell. Finite Markov Chains. Van Nostrand, Princeton, 1960.

[9] L. da F. Costa. Multivariate statistical modeling. Researchgate, 2019. https://www.researchgate.net/publication/340442989_Multivariate_Statistical_Modeling_CDT-26. [Online; accessed 10-Apr-2020.]

[10] F. Gewers, G. R. Ferreira, H. F. Arruda, F. N. Silva, C. H. Comin, D. R. Amancio, and L. da F. Costa. Principal component analysis: A natural approach to data exploration. Researchgate, 2019. https://www.researchgate.net/publication/324454887_Principal_Component_Analysis_A_Natural_Approach_to_Data_Exploration. [Online; accessed 25-Dec-2019.]

[11] L. da F. Costa. Features transformation and normalization. Researchgate, 2019. https://www.researchgate.net/publication/340114268_Features_Transformation_and_Normalization_A_Visual_Approach_CDT-24. [Online; accessed 10-Apr-2020.]

Costa's Didactic Texts – CDTs

CDTs intend to be a halfway point between a formal scientific article and a dissemination text, in the sense that they: (i) explain and illustrate concepts in a more informal, graphical and accessible way than the typical scientific article; and (ii) provide more in-depth mathematical developments than a more traditional dissemination work.

It is hoped that CDTs can also incorporate new insights and analogies concerning the reported concepts and methods. We hope these characteristics will contribute to making CDTs interesting both to beginners as well as to more senior researchers.

Each CDT focuses on a limited set of interrelated concepts. Though attempting to be relatively self-contained, CDTs also aim at being relatively short. Links to related material are provided in order to complement the covered subjects.

Observe that CDTs, which come with absolutely no warranty, are non-distributable and for non-commercial use only.

The complete set of CDTs can be found at: https://www.researchgate.net/project/Costas-Didactic-Texts-CDTs.

Table 1: Some of the main properties of the eigenvalues and eigenvectors of a matrix A.

[P1]: Let α ∈ \Re, α ≠ 0. The matrix αA has the same eigenvectors as A, while the eigenvalues are multiplied by α.
[P2]: Let α ∈ \Re, α ≠ 0. If \vec{v} is an eigenvector of A, so is α\vec{v}. Therefore, the magnitude of an eigenvector cannot be uniquely specified.
[P3]: The characteristic polynomial of an N × N matrix has degree N, and so has N roots, which may appear with multiplicity.
[P4]: The eigenvalues of a diagonal matrix are the elements of its diagonal.
[P5]: The trace of a matrix is equal to the sum of its eigenvalues.
[P6]: The product of the eigenvalues of a matrix is equal to its determinant.
[P7]: If λ is an eigenvalue of A, it is also an eigenvalue of A^T.
[P8]: The eigenvalues of A^r, for a positive integer r, are (λ_i)^r.
[P9]: If λ is an eigenvalue of A, then λ^{−1} is an eigenvalue of A^{−1}.
[P10]: Eigenvectors associated to distinct eigenvalues are linearly independent.
[P11]: The eigenvalues of a real matrix are real or come in conjugate pairs.
[P12]: All the eigenvalues of a complex/real matrix A which is Hermitian/symmetric are real.
[P13]: If a complex/real matrix A is Hermitian/symmetric, all its eigenvectors are linearly independent and mutually orthogonal.
[P14]: If a complex/real matrix A is unitary/orthogonal, then all its eigenvalues have absolute value equal to 1.
[P15]: A is invertible if and only if all its eigenvalues are non-null.

Table 2: Some positive and negative definiteness-related properties of a symmetric real matrix A.

Positive definite: if and only if all its eigenvalues are positive.
Negative definite: if and only if all its eigenvalues are negative.
Positive semidefinite: if and only if all its eigenvalues are nonnegative (≥ 0).
Negative semidefinite: if and only if all its eigenvalues are non-positive (≤ 0).
Indefinite: if and only if it has both positive and negative eigenvalues.

