
1 Deterministic Matrices

Matrices appear in all corners of science, from mathematics to physics, computer science,
biology, economics and quantitative finance. In fact, before Schrödinger’s equation, quan-
tum mechanics was formulated by Heisenberg in terms of what he called “Matrix Mechan-
ics”. In many cases, the matrices that appear are deterministic, and their properties are
encapsulated in their eigenvalues and eigenvectors. This first chapter gives several elemen-
tary results in linear algebra, in particular concerning eigenvalues. These results will be
extremely useful in the rest of the book where we will deal with random matrices, and in
particular the statistical properties of their eigenvalues and eigenvectors.

1.1 Matrices, Eigenvalues and Singular Values


1.1.1 Some Problems Where Matrices Appear
Let us give three examples motivating the study of matrices and illustrating the different
forms they can take.

Dynamical System
Consider a generic dynamical system describing the time evolution of a certain
N-dimensional vector x(t), for example the three-dimensional position of a point in
space. Let us write the equation of motion as
dx/dt = F(x),    (1.1)
where F(x) is an arbitrary vector field. Equilibrium points x∗ are such that F(x∗ ) = 0.
Consider now small deviations from equilibrium, i.e. x = x∗ + εy where ε ≪ 1. To first
order in ε, the dynamics becomes linear, and given by

dy/dt = Ay,    (1.2)
where A is a matrix whose elements are given by Aij = ∂j Fi (x∗ ), where i,j are indices
that run from 1 to N . When F can itself be written as the gradient of some potential V , i.e.
Fi = −∂i V (x), the matrix A becomes symmetric, i.e. Aij = Aji = −∂i ∂j V . But this is not


always the case; in general the linearized dynamics is described by a matrix A without any
particular property – except that it is a square N × N array of real numbers.

Master Equation
Another standard setting is the so-called Master equation for the evolution of probabilities.
Call i = 1, . . . ,N the different possible states of a system and Pi (t) the probability to
find the system in state i at time t. When memory effects can be neglected, the dynamics
is called Markovian and the evolution of Pi (t) is described by the following discrete time
equation:


Pi (t + 1) = ∑_{j=1}^{N} Aij Pj (t),    (1.3)

meaning that the system has a probability Aij to jump from state j to state i between t and
t + 1. Note that all elements of A are positive; furthermore, since all jump possibilities must
be exhausted, one must have, for each j , ∑_i Aij = 1. This ensures that ∑_i Pi (t) = 1 at
all times, since


∑_{i=1}^{N} Pi (t + 1) = ∑_{i=1}^{N} ∑_{j=1}^{N} Aij Pj (t) = ∑_{j=1}^{N} ∑_{i=1}^{N} Aij Pj (t) = ∑_{j=1}^{N} Pj (t) = 1.    (1.4)

Matrices whose elements are all positive and whose columns each sum
to unity are called stochastic matrices. In matrix form, Eq. (1.3) reads
P(t + 1) = AP(t), leading to P(t) = At P(0), i.e. A raised to the t-th power applied to the
initial distribution.
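As a quick numerical illustration (a minimal Python/numpy sketch of my own, not part of the original text), one can generate a random column-stochastic matrix, iterate Eq. (1.3), and check that total probability is conserved while P(t) approaches a stationary distribution (anticipating Section 1.2.2):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

# Random stochastic matrix: positive entries, each column sums to one,
# so that P(t+1) = A P(t) conserves total probability.
A = rng.random((N, N))
A /= A.sum(axis=0, keepdims=True)

P = np.zeros(N)
P[0] = 1.0                     # all probability initially in state 1

for t in range(100):           # iterate the Master equation, P(t) = A^t P(0)
    P = A @ P

print(P.sum())                 # remains equal to 1 at all times
print(P)                       # close to the stationary distribution
```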

Covariance Matrices
As a third important example, let us consider random, N -dimensional real vectors X, with
some given multivariate distribution P (X). The covariance matrix C of the X’s is defined as

Cij = E[Xi Xj ] − E[Xi ]E[Xj ], (1.5)

where E means that we are averaging over the distribution P (X). Clearly, the matrix C is
real and symmetric. It is also positive semi-definite, in the sense that for any vector x,

xT Cx ≥ 0. (1.6)

If it were not the case, it would be possible to find a linear combination of the vectors X
with a negative variance, which is obviously impossible.
The three examples above are all such that the corresponding matrices are N × N square
matrices. Examples where matrices are rectangular also abound. For example, one could
consider two sets of random real vectors: X of dimension N1 and Y of dimension N2 . The
cross-covariance matrix defined as
1.1 Matrices, Eigenvalues and Singular Values 5

Cia = E[Xi Ya ] − E[Xi ]E[Ya ]; i = 1, . . . ,N1 ; a = 1, . . . ,N2, (1.7)


is an N1 × N2 matrix that describes the correlations between the two sets of vectors.
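As a small numerical check (my own sketch, using numpy; the sample size and the mixing matrix are arbitrary choices), one can estimate C from samples of X and verify that it is symmetric and positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_samples = 4, 10_000

# Samples of a random vector X with some arbitrary correlations.
X = rng.standard_normal((n_samples, N)) @ rng.standard_normal((N, N))

# Empirical covariance matrix  C_ij = E[X_i X_j] - E[X_i] E[X_j].
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / n_samples

print(np.allclose(C, C.T))                 # C is real and symmetric
print(np.all(np.linalg.eigvalsh(C) >= 0))  # positive semi-definite: x^T C x >= 0
```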

1.1.2 Eigenvalues and Eigenvectors


One learns a great deal about matrices by studying their eigenvalues and eigenvectors. For
a square matrix A, a pair (λ,v) of a scalar and a non-zero vector satisfying
Av = λv    (1.8)
is called an eigenvalue–eigenvector pair.
Trivially, if v is an eigenvector then αv is also an eigenvector for any non-zero real
number α. Sometimes multiple non-collinear eigenvectors share the same eigenvalue; we say
that this eigenvalue is degenerate and has multiplicity equal to the dimension of the vector
space spanned by its eigenvectors.
If Eq. (1.8) is true, it implies that the equation (A − λ1)v = 0 has non-trivial solutions,
which requires that det(λ1 − A) = 0. The eigenvalues λ are thus the roots of the so-called
characteristic polynomial of the matrix A, obtained by expanding det(λ1 − A). Clearly, this
polynomial1 is of order N and therefore has at most N different roots, which correspond
to the (possibly complex) eigenvalues of A. Note that the characteristic polynomial of AT
coincides with the characteristic polynomial of A, so the eigenvalues of A and AT are
identical.
Now, let λ1,λ2, . . . ,λN be the N eigenvalues of A with v1,v2, . . . ,vN the corresponding
eigenvectors. We define Λ as the N × N diagonal matrix with λi on the diagonal, and V
as the N × N matrix whose j th column is vj , i.e. Vij = (vj )i is the ith component of vj .
Then, by definition,
AV = VΛ,    (1.9)
since once expanded, this reads

∑_k Aik Vkj = Vij λj ,    (1.10)

or Avj = λj vj . If the eigenvectors are linearly independent (which is not true for all
matrices), the matrix inverse V−1 exists and one can therefore write A as
A = VΛV−1,    (1.11)
which is called the eigenvalue decomposition of the matrix A.
Symmetric matrices (such that A = AT ) have very nice properties regarding their
eigenvalues and eigenvectors.

1 The characteristic polynomial QN (λ) = det(λ1 − A) always has a coefficient 1 in front of its highest power
(QN (λ) = λ^N + O(λ^{N−1} )); such polynomials are called monic.

• They have exactly N eigenvalues when counted with their multiplicity.


• All their eigenvalues and eigenvectors are real.
• Their eigenvectors are orthogonal and can be chosen to be orthonormal (i.e. viT vj =
δij ). Here we assume that for degenerate eigenvalues we pick an orthogonal set of
corresponding eigenvectors.

If we choose orthonormal eigenvectors, the matrix V has the property VT V = 1 (⇒
VT = V−1 ). Hence it is an orthogonal matrix V = O and Eq. (1.11) reads

A = OΛOT ,    (1.12)

where Λ is a diagonal matrix containing the eigenvalues associated with the eigenvectors
in the columns of O. A symmetric matrix can be diagonalized by an orthogonal matrix.
Remark that an N × N orthogonal matrix is fully parameterized by N (N − 1)/2 “angles”,
whereas Λ contains N diagonal elements. So the total number of parameters of the diagonal
decomposition is N (N − 1)/2 + N, which is identical, as it should be, to the number of
different elements of a symmetric N × N matrix.
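A minimal numpy sketch of Eq. (1.12) (my own, not from the text): diagonalize a random symmetric matrix and check that the eigenvector matrix is orthogonal and reconstructs A.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6

# A random real symmetric matrix.
H = rng.standard_normal((N, N))
A = (H + H.T) / 2

# Diagonalization by an orthogonal matrix: A = O Lambda O^T.
lam, O = np.linalg.eigh(A)

print(np.allclose(O.T @ O, np.eye(N)))          # O is orthogonal
print(np.allclose(O @ np.diag(lam) @ O.T, A))   # eigenvalue decomposition (1.12)
```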
Let us come back to our dynamical system example, Eq. (1.2). One basic question is to
know whether the perturbation y will grow with time, or decay with time. The answer to
this question is readily given by the eigenvalues of A. For simplicity, we assume F to be
a gradient such that A is symmetric. Since the eigenvectors of A are orthonormal, one can
decompose y in terms of the v’s as

y(t) = ∑_{i=1}^{N} ci (t) vi .    (1.13)

Taking the dot product of Eq. (1.2) with vi then shows that the dynamics of the coefficients
ci (t) are decoupled and given by

dci /dt = λi ci ,    (1.14)
where λi is the eigenvalue associated with vi . Therefore, any component of the initial
perturbation y(t = 0) that is along an eigenvector with positive eigenvalue will grow expo-
nentially with time, until the linearized approximation leading to Eq. (1.2) breaks down.
Conversely, components along directions with negative eigenvalues decrease exponentially
with time. An equilibrium x∗ is called stable provided all eigenvalues are negative, and
marginally stable if some eigenvalues are zero while all others are negative.
The important message carried by the example above is that diagonalizing a matrix
amounts to finding a way to decouple the different degrees of freedom, and convert a matrix
equation into a set of N scalar equations, as in Eq. (1.14). We will see later that the same
idea holds for covariance matrices as well: their diagonalization allows one to find a set of
uncorrelated vectors. This is usually called Principal Component Analysis (PCA).

Exercise 1.1.1 Instability of eigenvalues of non-symmetric matrices


Consider the N × N square band diagonal matrix M0 defined by
[M0 ]ij = 2δi,j −1 :

         ⎛ 0  2  0  ···  0 ⎞
         ⎜ 0  0  2  ···  0 ⎟
    M0 = ⎜ 0  0  0   ⋱   0 ⎟ .    (1.15)
         ⎜ 0  0  0  ···  2 ⎟
         ⎝ 0  0  0  ···  0 ⎠


(a) Show that M0^N = 0 and so all the eigenvalues of M0 must be zero.


Use a numerical eigenvalue solver for non-symmetric matrices and confirm
numerically that this is the case for N = 100.
(b) If O is an orthogonal matrix (OOT = 1), OM0 OT has the same eigenvalues
as M0 . Following Exercise 1.2.4, generate a random orthogonal matrix O.
Numerically find the eigenvalues of OM0 OT . Do you get the same answer as
in (a)?
(c) Consider M1 whose elements are all equal to those of M0 except for one
element in the lower left corner [M1 ]N,1 = (1/2)^{N−1} . Show that M1^N = 1;
more precisely, show that the characteristic polynomial of M1 is given by
det(λ1 − M1 ) = λ^N − 1, therefore M1 has N distinct eigenvalues equal to the
N complex roots of unity λk = e^{2πik/N} .
(d) For N greater than about 60, OM0 OT and OM1 OT are indistinguishable to
machine precision. Compare numerically the eigenvalues of these two rotated
matrices.
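A possible numerical sketch for parts (a), (b) and (d) (my own Python; the random orthogonal matrix is obtained here from a QR decomposition, one standard way of generating one):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100

# M0: 2's on the superdiagonal; nilpotent, so its exact eigenvalues are all 0.
M0 = 2 * np.diag(np.ones(N - 1), k=1)

# M1: same matrix plus the tiny corner element (1/2)^(N-1).
M1 = M0.copy()
M1[N - 1, 0] = 0.5 ** (N - 1)

# Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
O, _ = np.linalg.qr(rng.standard_normal((N, N)))

for M in (M0, M1):
    lam = np.linalg.eigvals(O @ M @ O.T)
    print(np.max(np.abs(lam)))

# In floating point the two rotated matrices coincide to machine precision
# (they differ by ~(1/2)^(N-1)), so the solver returns essentially the same
# spectrum for both, with spurious eigenvalues typically of order one instead
# of the exact answers ({0} for M0, the N-th roots of unity for M1):
# eigenvalues of non-symmetric matrices can be highly ill-conditioned.
```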

1.1.3 Singular Values


A non-symmetric, square matrix cannot in general be decomposed as A = OΛOT , where
Λ is a diagonal matrix and O an orthogonal matrix. One can however find a very useful
alternative decomposition as

A = VSUT ,    (1.16)

where S is a non-negative diagonal matrix, whose elements are called the singular values
of A, and U,V are two real, orthogonal matrices. Whenever A is symmetric positive semi-
definite, one has S = Λ and U = V.
Equation (1.16) also holds for rectangular N × T matrices, where V is N × N orthogonal,
U is T × T orthogonal and S is N × T diagonal as defined below. To construct the
singular value decomposition (svd) of A, we first introduce two matrices B and B̃, defined
as B := AAT and B̃ := AT A. It is plain to see that these matrices are symmetric, since
BT = (AAT )T = (AT )T AT = AAT = B (and similarly for B̃). They are also positive semi-definite,
as for any vector x we have xT Bx = ||AT x||^2 ≥ 0.
We can show that B and B̃ have the same non-zero eigenvalues. Indeed, let λ > 0 be an
eigenvalue of B and v ≠ 0 the corresponding eigenvector. Then we have, by definition,

AAT v = λv.    (1.17)

Let u = AT v; then we get from the above equation that

AT AAT v = λAT v  ⇒  B̃u = λu.    (1.18)

Moreover,

||u||^2 = vT AAT v = vT Bv = λ||v||^2 > 0  ⇒  u ≠ 0.    (1.19)

Hence λ is also an eigenvalue of B̃. Note that for degenerate eigenvalues λ of B, an
orthogonal set of corresponding eigenvectors {v} gives rise to an orthogonal set {AT v}
of eigenvectors of B̃. Hence the multiplicity of λ in B̃ is at least that of B. Similarly, we can
show that any non-zero eigenvalue of B̃ is also an eigenvalue of B. This finishes the proof
of the claim.
Note that B has at most N non-zero eigenvalues and B̃ has at most T non-zero eigen-
values. Thus by the above claim, if T > N , B̃ has at least T − N zero eigenvalues, and if
T < N, B has at least N − T zero eigenvalues. We denote the other min{N,T } eigenvalues
of B and B̃ by {λk }1≤k≤min{N,T } . Then the svd of A is expressed as Eq. (1.16), where V is
the N × N orthogonal matrix consisting of the N normalized eigenvectors of B, U is the
T × T orthogonal matrix consisting of the T normalized eigenvectors of B̃, and S is an
N × T rectangular diagonal matrix with Skk = √λk ≥ 0, 1 ≤ k ≤ min{N,T } and all other
entries equal to zero.
For instance, if N < T , we have

        ⎛ √λ1   0    0     0   ···  0 ⎞
    S = ⎜  0   √λ2   0     0   ···  0 ⎟ .    (1.20)
        ⎜  0    0    ⋱     0   ···  0 ⎟
        ⎝  0    0    0   √λN   ···  0 ⎠

Although (non-degenerate) normalized eigenvectors are unique up to a sign, the choice of
the positive sign for the square-root √λk imposes a condition on the combined sign for the
left and right singular vectors vk and uk . In other words, simultaneously changing both vk
and uk to −vk and −uk leaves the matrix A invariant, but for non-zero singular values one
cannot individually change the sign of either vk or uk .
The recipe to find the svd, Eq. (1.16), is thus to diagonalize both AAT (to obtain V
and S2 ) and AT A (to obtain U and again S2 ). It is insightful to again count the number of
parameters involved in this decomposition. Consider a general N × T matrix with T ≥ N
(the case N ≥ T follows similarly). The N eigenvectors of AAT are generically unique
up to a sign, while for T − N > 0 the matrix AT A will have a degenerate eigenspace
associated with the eigenvalue 0 of size T − N , hence its eigenvectors are only unique up
to an arbitrary rotation in T − N dimensions. So generically the svd decomposition amounts


to writing the NT elements of A as
NT ≡ N (N − 1)/2 + N + T (T − 1)/2 − (T − N )(T − N − 1)/2.    (1.21)
The interpretation of Eq. (1.16) for N × N matrices is that one can always find an
orthonormal basis of vectors {u} such that the application of a matrix A amounts to a
rotation (or an improper rotation) of {u} into another orthonormal set {v}, followed by a

dilation of each vk by a positive factor √λk .
Normal matrices are such that U = V. In other words, A is normal whenever A com-
mutes with its transpose: AAT = AT A. Symmetric, skew-symmetric and orthogonal matri-
ces are normal, but other cases are possible. For example a 3 × 3 matrix such that each row
and each column has exactly two elements equal to 1 and one element equal to 0 is normal.
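The construction above is easy to verify numerically; here is a minimal numpy sketch (my own, not from the text) comparing the singular values of a rectangular matrix with the eigenvalues of B = AAT and B̃ = AT A:

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 4, 7
A = rng.standard_normal((N, T))

# Singular value decomposition A = V S U^T (numpy returns V, the singular
# values, and U^T, in that order).
V, s, Ut = np.linalg.svd(A)

eig_B  = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]     # spectrum of B
eig_Bt = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]     # spectrum of B~

print(np.allclose(s**2, eig_B))        # s_k^2 are the eigenvalues of B
print(np.allclose(s**2, eig_Bt[:N]))   # and the non-zero eigenvalues of B~
print(np.allclose(eig_Bt[N:], 0))      # B~ has T - N extra zero eigenvalues
```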

1.2 Some Useful Theorems and Identities


In this section, we state without proof some very useful theorems and identities concerning eigenvalues and matrices.

1.2.1 Gershgorin Circle Theorem



Let A be a real matrix, with elements Aij . Define Ri as Ri = ∑_{j≠i} |Aij |, and Di a disk in
the complex plane centered on Aii and of radius Ri . Then every eigenvalue of A lies within
at least one disk Di . For example, for the matrix

        ⎛  1    −0.2   0.2 ⎞
    A = ⎜ −0.3    2   −0.2 ⎟ ,    (1.22)
        ⎝  0     1.1    3  ⎠

the three circles are located on the real axis at x = 1,2 and 3 with radii 0.4, 0.5 and 1.1
respectively (see Fig. 1.1).
In particular, eigenvalues corresponding to eigenvectors with a maximum amplitude on
i lie within the disk Di .
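A short numpy check of this example (my own sketch): compute the Gershgorin centres and radii of the matrix (1.22) and verify that every eigenvalue lies in at least one disk.

```python
import numpy as np

A = np.array([[ 1.0, -0.2,  0.2],
              [-0.3,  2.0, -0.2],
              [ 0.0,  1.1,  3.0]])

centres = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centres)   # R_i = sum_{j != i} |A_ij|
print(centres, radii)                             # centres 1, 2, 3; radii 0.4, 0.5, 1.1

for lam in np.linalg.eigvals(A):
    in_disk = np.abs(lam - centres) <= radii      # distance to each disk centre
    print(lam, in_disk.any())                     # each eigenvalue lies in some disk
```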

1.2.2 The Perron–Frobenius Theorem


Let A be a real matrix, with all its elements positive Aij > 0. Then the top eigenvalue λmax
is unique and real (all other eigenvalues have a smaller real part). The corresponding top
eigenvector v∗ has all its elements positive:
Av∗ = λmax v∗ ; v∗k > 0, ∀k. (1.23)
The top eigenvalue satisfies the following inequalities:

min_i ∑_j Aij ≤ λmax ≤ max_i ∑_j Aij .    (1.24)

[Figure 1.1 about here: eigenvalues plotted in the complex plane, Re(λ) on the horizontal axis, Im(λ) on the vertical axis.]

Figure 1.1 The three complex eigenvalues of the matrix (1.22) (crosses) and its three Gershgorin
circles. The first eigenvalue λ1 ≈ 0.92 falls in the first circle while the other two λ2,3 ≈ 2.54 ± 0.18i
fall in the third one.

Application: Suppose A is a stochastic matrix, such that all its elements are positive
and satisfy ∑_i Aij = 1, ∀j . Then clearly the vector 1 is an eigenvector of AT , with
eigenvalue λ = 1. But since the Perron–Frobenius theorem can be applied to AT , the inequalities
(1.24) ensure that λ = 1 is the top eigenvalue of AT , and thus also of A. All the elements of the
corresponding eigenvector v∗ are positive, and describe the stationary state of the associated
Master equation, i.e.

Pi∗ = ∑_j Aij Pj∗  →  Pi∗ = v∗i / ∑_k v∗k .    (1.25)
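A hedged illustration (my own sketch, reusing the stochastic-matrix construction from Section 1.1.1): the top eigenvalue of a positive column-stochastic matrix is indeed 1, and normalizing the corresponding eigenvector gives the stationary distribution of Eq. (1.25).

```python
import numpy as np

rng = np.random.default_rng(5)
N = 6

# Positive column-stochastic matrix: every column sums to one.
A = rng.random((N, N))
A /= A.sum(axis=0, keepdims=True)

lam, vecs = np.linalg.eig(A)
top = np.argmax(lam.real)
print(lam[top])                          # top eigenvalue equals 1

v = vecs[:, top].real
P_stat = v / v.sum()                     # P*_i = v*_i / sum_k v*_k
print(np.all(P_stat > 0))                # all elements positive (Perron-Frobenius)
print(np.allclose(A @ P_stat, P_stat))   # fixed point of the Master equation
```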

Exercise 1.2.1 Gershgorin and Perron–Frobenius


Show that the upper bound in Eq. (1.24) is a simple consequence of the
Gershgorin theorem.

1.2.3 The Eigenvalue Interlacing Theorem


Let A be an N × N symmetric matrix (or more generally Hermitian matrix) with eigenvalues
λ1 ≥ λ2 ≥ · · · ≥ λN . Consider the (N − 1) × (N − 1) submatrix A\i obtained by removing the ith
row and ith column of A. Its eigenvalues are μ^{(i)}_1 ≥ μ^{(i)}_2 ≥ · · · ≥ μ^{(i)}_{N−1} . Then the following
interlacing inequalities hold:

λ1 ≥ μ^{(i)}_1 ≥ λ2 ≥ · · · ≥ μ^{(i)}_{N−1} ≥ λN .    (1.26)

Very recently, a formula relating eigenvectors to eigenvalues was (re-)discovered. Calling
vi the eigenvector of A associated with λi , one has2

(vi )j^2 = ∏_{k=1}^{N−1} (λi − μ^{(j)}_k) / ∏_{ℓ=1, ℓ≠i}^{N} (λi − λℓ) .    (1.27)
1.2.4 Sherman–Morrison Formula


The Sherman–Morrison formula gives the inverse of a matrix A perturbed by a rank-1
perturbation:
(A + uvT )−1 = A−1 − (A−1 uvT A−1) / (1 + vT A−1 u),    (1.28)
valid for any invertible matrix A and vectors u and v such that the denominator does not
vanish. This is a special case of the Woodbury identity, which reads

(A + UCVT )−1 = A−1 − A−1 U (C−1 + VT A−1 U)−1 VT A−1,    (1.29)

where U,V are N × K matrices and C is a K × K matrix. Equation (1.28) corresponds to


the case K = 1.
The associated Sherman–Morrison determinant lemma reads

det(A + vuT ) = det A · (1 + uT A−1 v)    (1.30)

for invertible A.

Exercise 1.2.2 Sherman–Morrison


Show that Eq. (1.28) is correct by multiplying both sides by (A + uvT ).
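Beyond the algebraic check asked for in the exercise, a quick numerical sanity test of Eqs. (1.28) and (1.30) (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(7)
N = 5
A = rng.standard_normal((N, N)) + N * np.eye(N)   # comfortably invertible
u = rng.standard_normal(N)
v = rng.standard_normal(N)
Ainv = np.linalg.inv(A)

# Sherman-Morrison, Eq. (1.28)
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = Ainv - (Ainv @ np.outer(u, v) @ Ainv) / (1 + v @ Ainv @ u)
print(np.allclose(lhs, rhs))

# Determinant lemma, Eq. (1.30)
print(np.isclose(np.linalg.det(A + np.outer(v, u)),
                 np.linalg.det(A) * (1 + u @ Ainv @ v)))
```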

1.2.5 Schur Complement Formula


The Schur complement, also called inversion by partitioning, relates the blocks of the
inverse of a matrix to the inverse of blocks of the original matrix. Let M be an invertible
matrix which we divide in four blocks as

    M = ⎛ M11  M12 ⎞    and    M−1 = Q = ⎛ Q11  Q12 ⎞ ,    (1.31)
        ⎝ M21  M22 ⎠                      ⎝ Q21  Q22 ⎠

where [M11 ] = n × n, [M12 ] = n × (N − n), [M21 ] = (N − n) × n, [M22 ] = (N − n) ×
(N − n), and M22 is invertible. The integer n can take any value from 1 to N − 1.

2 See: P. Denton, S. Parke, T. Tao, X. Zhang, Eigenvalues from Eigenvectors: a survey of a basic identity in linear algebra,
arXiv:1908.03795.

Then the upper left n × n block of Q is given by

(Q11 )−1 = M11 − M12 (M22 )−1 M21 ,    (1.32)

where the right hand side is called the Schur complement of the block M22 of the matrix M.

Exercise 1.2.3 Combining Schur and Sherman–Morrison


In the notation of Eq. (1.31) for n = 1 and any N > 1, combine the Schur
complement of the lower right block with the Sherman–Morrison formula to
show that
Q22 = (M22 )−1 + (M22 )−1 M21 M12 (M22 )−1 / (M11 − M12 (M22 )−1 M21 ).    (1.33)
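A numerical check of Eqs. (1.32) and (1.33) for n = 1 (my own sketch; the block sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
N, n = 6, 1
M = rng.standard_normal((N, N)) + N * np.eye(N)   # generic invertible matrix

M11, M12 = M[:n, :n], M[:n, n:]
M21, M22 = M[n:, :n], M[n:, n:]
Q = np.linalg.inv(M)
Q11, Q22 = Q[:n, :n], Q[n:, n:]
M22inv = np.linalg.inv(M22)

# Eq. (1.32): the Schur complement of M22 is the inverse of the block Q11.
print(np.allclose(np.linalg.inv(Q11), M11 - M12 @ M22inv @ M21))

# Eq. (1.33): for n = 1 the denominator is a scalar.
denom = (M11 - M12 @ M22inv @ M21)[0, 0]
print(np.allclose(Q22, M22inv + (M22inv @ M21 @ M12 @ M22inv) / denom))
```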

1.2.6 Function of a Matrix and Matrix Derivative


In our study of random matrices, we will need to extend real or complex scalar functions
to take a symmetric matrix M as its argument. The simplest way to extend such a function
is to apply it to each eigenvalue of the matrix M = OΛOT :

F (M) = O F (Λ) OT ,    (1.34)

where F (Λ) is the diagonal matrix where we have applied the function F to each (diag-
onal) entry of Λ. The function F (M) is now a matrix valued function of a matrix. Scalar
polynomial functions can obviously be extended directly as


F (x) = ∑_{k=0}^{K} ak x^k  ⇒  F (M) = ∑_{k=0}^{K} ak M^k ,    (1.35)

but this is equivalent to applying the polynomial to the eigenvalues of M. By extension,


when the Taylor series of the function F (x) converges for every eigenvalue of M the matrix
Taylor series coincides with our definition.
Taking the trace of F (M) will yield a matrix function that returns a scalar. This con-
struction is rotationally invariant in the following sense:

Tr F (UMUT ) = Tr F (M) for any UUT = 1. (1.36)

We can take the derivative of a scalar-valued function Tr F (M) with respect to each
element of the matrix M:
d Tr(F (M)) / d[M]ij = [F ′(M)]ij  ⇒  d Tr(F (M)) / dM = F ′(M).    (1.37)

Equation (1.37) is easy to derive when F (x) is a monomial ak x^k , and by linearity for
polynomial or Taylor series F (x).
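A small numerical illustration of Eqs. (1.34) and (1.37) (my own sketch), taking F(x) = e^x on a random symmetric matrix and checking the trace derivative by symmetric finite differences:

```python
import numpy as np

rng = np.random.default_rng(9)
N = 4
H = rng.standard_normal((N, N))
M = (H + H.T) / 2                       # random symmetric matrix

def F_mat(M, f):
    """Matrix function via the spectrum: F(M) = O f(Lambda) O^T, Eq. (1.34)."""
    lam, O = np.linalg.eigh(M)
    return O @ np.diag(f(lam)) @ O.T

# Finite-difference gradient of Tr F(M), bumping M_ij and M_ji together so the
# perturbed matrix stays symmetric; with this normalization the result is [F'(M)]_ij.
eps = 1e-6
grad = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        dM = np.zeros((N, N)); dM[i, j] += eps; dM[j, i] += eps
        grad[i, j] = (np.trace(F_mat(M + dM, np.exp)) -
                      np.trace(F_mat(M - dM, np.exp))) / (4 * eps)

# For F = exp one has F' = exp, so the gradient should equal exp(M).
print(np.allclose(grad, F_mat(M, np.exp), atol=1e-5))
```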

1.2.7 Jacobian of Simple Matrix Transformations


Suppose one transforms an N × N matrix A into another N × N matrix B through some
function of the matrix elements. The Jacobian of the transformation is defined as the
determinant of the partial derivatives:
Gij,kℓ = ∂Bkℓ / ∂Aij .    (1.38)

The simplest case is just multiplication by a scalar: B = αA, leading to Gij,kℓ = α δik δjℓ .
G is therefore the tensor product of α1 with 1, and its determinant is thus equal to α^{N²} .
Not much more difficult is the case of an orthogonal transformation B = OAOT , for which
Gij,kℓ = Oik Ojℓ . G is now the tensor product G = O ⊗ O and therefore its determinant is
unity.
Slightly more complicated is the case where B = A−1 . Using simple algebra, one readily
obtains, for symmetric matrices,
Gij,kℓ = (1/2) [A−1 ]ik [A−1 ]jℓ + (1/2) [A−1 ]iℓ [A−1 ]jk .    (1.39)
Let us now assume that A has eigenvalues λα and eigenvectors vα . One can easily diago-
nalize Gij,kℓ within the symmetric sector, since

∑_{kℓ} Gij,kℓ (vα,k vβ,ℓ + vα,ℓ vβ,k ) = (1/(λα λβ)) (vα,i vβ,j + vα,j vβ,i ).    (1.40)

So the determinant of G is simply ∏_{α, β≥α} (λα λβ )−1 . Taking the logarithm of this product
helps avoid counting mistakes, and finally leads to the result

det G = (det A)−N −1 . (1.41)
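Equation (1.41) can be checked numerically by building the Jacobian of A → A−1 over the independent entries of a symmetric matrix by finite differences (my own sketch; the matrix size and perturbation step are arbitrary choices):

```python
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(10)
N = 4
H = rng.standard_normal((N, N))
A = H @ H.T + N * np.eye(N)              # symmetric, positive definite, invertible

# Independent entries of a symmetric matrix: the upper triangle i <= j.
idx = list(combinations_with_replacement(range(N), 2))
upper = lambda M: np.array([M[i, j] for i, j in idx])

# Finite-difference Jacobian of the map A -> B = A^{-1} on symmetric matrices.
eps = 1e-6
J = np.zeros((len(idx), len(idx)))
for col, (i, j) in enumerate(idx):
    dA = np.zeros((N, N)); dA[i, j] = eps; dA[j, i] = eps
    J[:, col] = (upper(np.linalg.inv(A + dA)) -
                 upper(np.linalg.inv(A - dA))) / (2 * eps)

ratio = np.abs(np.linalg.det(J)) / np.linalg.det(A) ** (-N - 1)
print(ratio)                             # close to 1, as predicted by Eq. (1.41)
```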

Exercise 1.2.4 Random Matrices


We conclude this chapter on deterministic matrices with a numerical exercise
on random matrices. Most of the results of this exercise will be explored
theoretically in the following chapters.
• Let M be a random real symmetric orthogonal matrix, that is an N × N matrix
satisfying M = MT = M−1 . Show that all the eigenvalues of M are ±1.
• Let X be a Wigner matrix, i.e. an N × N real symmetric matrix whose diagonal
and upper triangular entries are iid Gaussian random numbers with zero mean
and variance σ 2 /N. You can use X = σ (H + HT )/√(2N) where H is a non-
symmetric N × N matrix with iid standard Gaussians.
• The matrix P+ is defined as P+ = (M + 1N )/2. Convince yourself that P+ is
the projector onto the eigenspace of M with eigenvalue +1. Explain the effect
of the matrix P+ on eigenvectors of M.

• An easy way to generate a random matrix M is to generate a Wigner matrix


(independent of X), diagonalize it, replace every eigenvalue by its sign and
reconstruct the matrix. The procedure does not depend on the σ used for the
Wigner.
• We consider a matrix E of the form E = M+X. To wit, E is a noisy version of
M. The goal of the following is to understand numerically how the matrix E is
corrupted by the Wigner noise. Using the computer language of your choice,
for a large value of N (as large as possible while keeping computing times
below one minute), for three interesting values of σ of your choice, do the
following numerical analysis.
(a) Plot a histogram of the eigenvalues of E, for a single sample first, and then for
many samples (say 100).
(b) From your numerical analysis, in the large N limit, for what values of σ do
you expect a non-zero density of eigenvalues near zero?
(c) For every normalized eigenvector vi of E, compute the norm of the vector
P+ vi . For a single sample, do a scatter plot of |P+ vi |2 vs λi (its eigenvalue).
Turn your scatter plot into an approximate conditional expectation value
(using a histogram) including data from many samples.
(d) Build an estimator of M using only data from E; call it Ξ(E). We want to minimize
the error ℰ = (1/N) ||Ξ(E) − M||_F^2 where ||A||_F^2 = Tr AAT . Consider first
Ξ1 (E) = E and then Ξ0 (E) = 0. What is the error ℰ of these two estimators?
Try to build an ad-hoc estimator Ξ(E) that has a lower error ℰ than these two.
(e) Show numerically that the eigenvalues of E are not iid. For each sample E
rank its eigenvalues λ1 < λ2 < · · · < λN . Consider the eigenvalue spacing
sk = λk − λk−1 for eigenvalues in the bulk (.2N < k < .3N and .7N < k <
.8N). Make a histogram of {sk } including data from 100 samples. Make 100
pseudo-iid samples: mix eigenvalues for 100 different samples and randomly
choose N from the 100N possibilities, do not choose the same eigenvalue
twice for a given pseudo-iid sample. For each pseudo-iid sample, compute
sk in the bulk and make a histogram of the values using data from all 100
pseudo-iid samples. (Bonus) Try to fit an exponential distribution to these two
histograms. The iid case should be well fitted by the exponential but not the
original data (not iid).
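A possible starting point for this exercise (my own Python sketch, covering the construction of M, X and E, part (a), and the overlaps needed in part (c); matplotlib is assumed for the histogram):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(11)
N, sigma = 1000, 0.5

# Random symmetric orthogonal M: diagonalize an auxiliary Wigner matrix and
# replace every eigenvalue by its sign.
G = rng.standard_normal((N, N))
w, O = np.linalg.eigh((G + G.T) / 2)
M = O @ np.diag(np.sign(w)) @ O.T          # M = M^T = M^{-1}, eigenvalues +-1

# Wigner noise X with variance sigma^2 / N, and the noisy matrix E = M + X.
H = rng.standard_normal((N, N))
X = sigma * (H + H.T) / np.sqrt(2 * N)
E = M + X

lam, V = np.linalg.eigh(E)

# Part (a): histogram of the eigenvalues of E (single sample).
plt.hist(lam, bins=60, density=True)
plt.xlabel("eigenvalue of E"); plt.ylabel("density")
plt.show()

# Part (c): overlap of each eigenvector of E with the +1 eigenspace of M.
P_plus = (M + np.eye(N)) / 2
overlaps = np.sum((P_plus @ V) ** 2, axis=0)    # |P+ v_i|^2 for each eigenvector
```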
