where Φ⊤ = Φ⁻¹ because Φ is an orthogonal matrix. Moreover, note that we always have Φ⊤ Φ = I for an orthogonal Φ, but we only have Φ Φ⊤ = I if "all" the columns of the orthogonal Φ exist (it is not truncated, i.e., it is a square matrix). Eq. (3) is referred to as "eigenvalue decomposition", "eigen-decomposition", or "spectral decomposition".

2.2. Generalized Eigenvalue Problem

The generalized eigenvalue problem (Parlett, 1998; Golub & Van Loan, 2012) of two symmetric matrices A ∈ R^{d×d} and B ∈ R^{d×d} is defined as:

A φi = λi B φi,  ∀i ∈ {1, . . . , d},  (4)

and in matrix form, it is:

A Φ = B Φ Λ,  (5)

where the columns of R^{d×d} ∋ Φ := [φ1, . . . , φd] are the eigenvectors and the diagonal elements of R^{d×d} ∋ Λ := diag([λ1, . . . , λd]⊤) are the eigenvalues. Note that φi ∈ R^d and λi ∈ R.

The generalized eigenvalue problem of Eq. (4) or (5) is denoted by (A, B). The (A, B) is called a "pair" or "pencil" (Parlett, 1998), and the order in the pair matters. The Φ and Λ are called the generalized eigenvectors and eigenvalues of (A, B). The (Φ, Λ) or (φi, λi) is called the "eigenpair" of the pair (A, B) in the literature (Parlett, 1998).

Comparing Eqs. (1) and (4), or Eqs. (2) and (5), shows that the eigenvalue problem is a special case of the generalized eigenvalue problem where B = I.

3. Eigenvalue Optimization

In this section, we introduce the optimization problems which yield the eigenvalue problem.

3.1. Optimization Form 1

Consider the following optimization problem with the variable φ ∈ R^d:

maximize_φ   φ⊤ A φ,
subject to   φ⊤ φ = 1,  (6)

where A ∈ R^{d×d}. The Lagrangian (Boyd & Vandenberghe, 2004) for Eq. (6) is:

L = φ⊤ A φ − λ (φ⊤ φ − 1),

where λ ∈ R is the Lagrange multiplier. Setting the derivative of the Lagrangian to zero gives:

R^d ∋ ∂L/∂φ = 2 A φ − 2 λ φ = 0  ⟹  A φ = λ φ,

which is an eigenvalue problem for A according to Eq. (1). The φ is the eigenvector of A and the λ is the eigenvalue. As Eq. (6) is a maximization problem, the desired eigenvector is the one with the largest eigenvalue. If Eq. (6) is instead a minimization problem, the desired eigenvector is the one with the smallest eigenvalue.

3.2. Optimization Form 2

Consider the following optimization problem with the variable Φ ∈ R^{d×d}:

maximize_Φ   tr(Φ⊤ A Φ),
subject to   Φ⊤ Φ = I,  (7)

where A ∈ R^{d×d}, tr(·) denotes the trace of a matrix, and I is the identity matrix. Note that, according to the properties of the trace, the objective function can equivalently be written as any of tr(Φ⊤ A Φ) = tr(Φ Φ⊤ A) = tr(A Φ Φ⊤).

The Lagrangian (Boyd & Vandenberghe, 2004) for Eq. (7) is:

L = tr(Φ⊤ A Φ) − tr(Λ⊤ (Φ⊤ Φ − I)),

where Λ ∈ R^{d×d} is a diagonal matrix whose entries are the Lagrange multipliers. Setting the derivative of L to zero gives:

R^{d×d} ∋ ∂L/∂Φ = 2 A Φ − 2 Φ Λ = 0  ⟹  A Φ = Φ Λ,

which is an eigenvalue problem for A according to Eq. (2). The columns of Φ are the eigenvectors of A and the diagonal elements of Λ are the eigenvalues.

As Eq. (7) is a maximization problem, the eigenvalues and eigenvectors in Λ and Φ are sorted from the largest to the smallest eigenvalues. If Eq. (7) is instead a minimization problem, the eigenvalues and eigenvectors in Λ and Φ are sorted from the smallest to the largest eigenvalues.
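Before moving on to Form 3, here is a minimal numerical sketch of Forms 1 and 2, assuming NumPy is available; the random symmetric matrix A below is only an illustrative stand-in, not an object defined in the text.

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M + M.T                                # a symmetric matrix A

# Eigen-decomposition A = Phi Lambda Phi^T; eigh returns ascending eigenvalues.
eigvals, Phi = np.linalg.eigh(A)

# Form 1: the maximizer of phi^T A phi subject to phi^T phi = 1 is the
# eigenvector with the largest eigenvalue; the optimal value is that eigenvalue.
phi = Phi[:, -1]
print(phi @ A @ phi, eigvals[-1])          # the two printed values coincide

# Form 2: the stationarity condition A Phi = Phi Lambda of Eq. (2) holds,
# and tr(Phi^T A Phi) equals the sum of the eigenvalues.
Lam = np.diag(eigvals)
print(np.allclose(A @ Phi, Phi @ Lam))     # True
print(np.isclose(np.trace(Phi.T @ A @ Phi), eigvals.sum()))  # True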
3.3. Optimization Form 3

Consider the following optimization problem with the variable φ ∈ R^d:

minimize_φ   ||X − φ φ⊤ X||²_F,
subject to   φ⊤ φ = 1,  (8)

where X ∈ R^{d×n}. The objective function in Eq. (8) is simplified as:

||X − φ φ⊤ X||²_F
= tr((X − φ φ⊤ X)⊤ (X − φ φ⊤ X))
= tr((X⊤ − X⊤ φ φ⊤)(X − φ φ⊤ X))
= tr(X⊤ X − X⊤ φ φ⊤ X + X⊤ φ (φ⊤ φ) φ⊤ X)
= tr(X⊤ X − X⊤ φ φ⊤ X),

where the last step uses the constraint φ⊤ φ = 1.
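By the cyclic property of the trace and the constraint φ⊤ φ = 1, the last expression equals tr(X⊤ X) − φ⊤ X X⊤ φ, so minimizing the reconstruction error amounts to maximizing φ⊤ X X⊤ φ; by Form 1, the minimizer is the leading eigenvector of X X⊤. A small numerical check of this, assuming NumPy; the random X is only illustrative:

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 50))           # d = 4 features, n = 50 samples

eigvals, V = np.linalg.eigh(X @ X.T)       # eigenvalues in ascending order
phi = V[:, -1]                             # leading eigenvector of X X^T

# The identity derived above holds for any unit-norm phi.
lhs = np.linalg.norm(X - np.outer(phi, phi) @ X, "fro") ** 2
rhs = np.trace(X.T @ X) - phi @ (X @ X.T) @ phi
print(np.isclose(lhs, rhs))                # True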
3.5. Optimization Form 5

Consider the following optimization problem with the variable φ ∈ R^d:

maximize_φ   (φ⊤ A φ) / (φ⊤ φ).  (10)

According to the Rayleigh–Ritz quotient method (Croot, 2005), this optimization problem can be restated as:

maximize_φ   φ⊤ A φ,
subject to   φ⊤ φ = 1,  (11)

whose solution, by Section 3.1, is the eigenvector of A having the largest eigenvalue.

4. Generalized Eigenvalue Optimization

In this section, we introduce the optimization problems which yield the generalized eigenvalue problem.

4.1. Optimization Form 1

Consider the following optimization problem with the variable φ ∈ R^d:

maximize_φ   φ⊤ A φ,
subject to   φ⊤ B φ = 1,  (12)

where A, B ∈ R^{d×d}. Following the same steps as in Section 3.1, the Lagrangian is L = φ⊤ A φ − λ (φ⊤ B φ − 1), and setting its derivative to zero gives A φ = λ B φ, which is the generalized eigenvalue problem (A, B) of Eq. (4).

As Eq. (12) is a maximization problem, the eigenvector is the one having the largest eigenvalue. If Eq. (12) is instead a minimization problem, the eigenvector is the one having the smallest eigenvalue.

Comparing Eqs. (6) and (12) shows that the eigenvalue problem is a special case of the generalized eigenvalue problem where B = I.

4.2. Optimization Form 2

Consider the following optimization problem with the variable Φ ∈ R^{d×d}:

maximize_Φ   tr(Φ⊤ A Φ),
subject to   Φ⊤ B Φ = I,  (13)

where A ∈ R^{d×d} and B ∈ R^{d×d}. Note that, according to the properties of the trace, the objective function can equivalently be written as any of tr(Φ⊤ A Φ) = tr(Φ Φ⊤ A) = tr(A Φ Φ⊤).

The Lagrangian (Boyd & Vandenberghe, 2004) for Eq. (13) is:

L = tr(Φ⊤ A Φ) − tr(Λ⊤ (Φ⊤ B Φ − I)),

where Λ ∈ R^{d×d} is a diagonal matrix whose entries are the Lagrange multipliers. Setting the derivative of L to zero gives:

R^{d×d} ∋ ∂L/∂Φ = 2 A Φ − 2 B Φ Λ = 0  ⟹  A Φ = B Φ Λ,

which is the generalized eigenvalue problem (A, B) according to Eq. (5). The columns of Φ are the eigenvectors and the diagonal elements of Λ are the eigenvalues.

As Eq. (13) is a maximization problem, the eigenvalues and eigenvectors in Λ and Φ are sorted from the largest to the smallest eigenvalues. If Eq. (13) is instead a minimization problem, the eigenvalues and eigenvectors in Λ and Φ are sorted from the smallest to the largest eigenvalues.

4.3. Optimization Form 3

Consider the following optimization problem with the variable φ ∈ R^d:

minimize_φ   ||X − φ φ⊤ X||²_F,
subject to   φ⊤ B φ = 1,  (14)

where X ∈ R^{d×n}.

Similar to what we had for Eq. (8), the objective function in Eq. (14) is simplified as:

||X − φ φ⊤ X||²_F = tr(X⊤ X − X X⊤ φ φ⊤).

The Lagrangian (Boyd & Vandenberghe, 2004) is:

L = tr(X⊤ X) − tr(X X⊤ φ φ⊤) − λ (φ⊤ B φ − 1),

where λ is the Lagrange multiplier. Setting the derivative of L to zero gives:

R^d ∋ ∂L/∂φ = 2 X X⊤ φ − 2 λ B φ = 0  ⟹  X X⊤ φ = λ B φ  ⟹  A φ = λ B φ,

with A := X X⊤, which is a generalized eigenvalue problem (A, B) according to Eq. (4). The φ is the eigenvector and the λ is the eigenvalue.

4.4. Optimization Form 4

Consider the following optimization problem with the variable Φ ∈ R^{d×d}:

minimize_Φ   ||X − Φ Φ⊤ X||²_F,
subject to   Φ⊤ B Φ = I,  (15)

where X ∈ R^{d×n}.

Similar to what we had for Eq. (9), the objective function in Eq. (15) is simplified as:

||X − Φ Φ⊤ X||²_F = tr(X⊤ X − X X⊤ Φ Φ⊤).

The Lagrangian (Boyd & Vandenberghe, 2004) is:

L = tr(X⊤ X) − tr(X X⊤ Φ Φ⊤) − tr(Λ⊤ (Φ⊤ B Φ − I)),

where Λ ∈ R^{d×d} is a diagonal matrix containing the Lagrange multipliers. Setting the derivative of L to zero gives:

R^{d×d} ∋ ∂L/∂Φ = 2 X X⊤ Φ − 2 B Φ Λ = 0  ⟹  X X⊤ Φ = B Φ Λ  ⟹  A Φ = B Φ Λ,

with A := X X⊤, which is the generalized eigenvalue problem (A, B) according to Eq. (5). The columns of Φ are the eigenvectors and the diagonal elements of Λ are the eigenvalues.
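Before turning to Form 5, here is a minimal numerical sketch of the generalized forms above, assuming NumPy and SciPy; the random symmetric A and the positive definite B are illustrative stand-ins. SciPy's generalized symmetric solver returns eigenvectors that satisfy exactly the constraint Φ⊤ B Φ = I of Eqs. (13) and (15):

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = M + M.T                                     # symmetric A
N = rng.standard_normal((5, 5))
B = N @ N.T + 5 * np.eye(5)                     # symmetric positive definite B

eigvals, Phi = eigh(A, B)                       # solves A phi = lambda B phi

Lam = np.diag(eigvals)
print(np.allclose(A @ Phi, B @ Phi @ Lam))      # Eq. (5): A Phi = B Phi Lambda
print(np.allclose(Phi.T @ B @ Phi, np.eye(5)))  # B-orthonormality: Phi^T B Phi = I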
4.5. Optimization Form 5

Consider the following optimization problem (Parlett, 1998) with the variable φ ∈ R^d:

maximize_φ   (φ⊤ A φ) / (φ⊤ B φ).  (16)

According to the Rayleigh–Ritz quotient method (Croot, 2005), this optimization problem can be restated as:

maximize_φ   φ⊤ A φ,
subject to   φ⊤ B φ = 1.  (17)

The Lagrangian (Boyd & Vandenberghe, 2004) is:

L = φ⊤ A φ − λ (φ⊤ B φ − 1),

where λ is the Lagrange multiplier. Setting the derivative of L to zero gives:

∂L/∂φ = 2 A φ − 2 λ B φ = 0  ⟹  2 A φ = 2 λ B φ  ⟹  A φ = λ B φ,

which is a generalized eigenvalue problem (A, B) according to Eq. (4). The φ is the eigenvector and the λ is the eigenvalue.

As Eq. (16) is a maximization problem, the eigenvector is the one having the largest eigenvalue. If Eq. (16) is instead a minimization problem, the eigenvector is the one having the smallest eigenvalue.

5. Examples for the Optimization Problems

In this section, we introduce some examples in machine learning that use the introduced optimization problems.

5.1. Examples for Eigenvalue Problem

5.1.1. Variance in Principal Component Analysis

In Principal Component Analysis (PCA) (Pearson, 1901; Friedman et al., 2009), if we want to project onto one vector (a one-dimensional PCA subspace), the problem is:

maximize_u   u⊤ S u,
subject to   u⊤ u = 1,  (18)

where u is the projection direction and S is the covariance matrix. Therefore, u is the eigenvector of S with the largest eigenvalue.

If we want to project onto a PCA subspace spanned by several directions, we have:

maximize_U   tr(U⊤ S U),
subject to   U⊤ U = I,  (19)

where the columns of U span the PCA subspace.

5.1.2. Reconstruction in Principal Component Analysis

We can look at PCA from another perspective: PCA is the best linear projection with the smallest reconstruction error. If we have one PCA direction, the projection is u⊤ X and the reconstruction is u u⊤ X. We want the error between the reconstructed data and the original data to be minimized:

minimize_u   ||X − u u⊤ X||²_F,
subject to   u⊤ u = 1.  (20)

Therefore, u is the eigenvector of the covariance matrix S = X X⊤ (where X is already centered by removing its mean).

If we consider several PCA directions, i.e., the columns of U, the minimization of the reconstruction error is:

minimize_U   ||X − U U⊤ X||²_F,
subject to   U⊤ U = I.  (21)

Thus, the columns of U are the eigenvectors of the covariance matrix S = X X⊤ (where X is already centered by removing its mean).

5.2. Examples for Generalized Eigenvalue Problem

5.2.1. Kernel Supervised Principal Component Analysis

Kernel Supervised PCA (SPCA) (Barshan et al., 2011) uses the following optimization problem:

maximize_Θ   tr(Θ⊤ Kx H Ky H Kx Θ),
subject to   Θ⊤ Kx Θ = I,  (22)

where Kx and Ky are the kernel matrices over the training data and the labels of the training data, respectively, H := I − (1/n) 1 1⊤ is the centering matrix, and the columns of Θ span the kernel SPCA subspace.

According to Eq. (13), the solution to Eq. (22) is:

Kx H Ky H Kx Θ = Kx Θ Λ,  (23)

which is the generalized eigenvalue problem (Kx H Ky H Kx, Kx) according to Eq. (5), where Θ and Λ are the eigenvector and eigenvalue matrices, respectively.
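As a concrete illustration of Eq. (23) on synthetic data, assuming NumPy and SciPy: the RBF kernel for Kx, the label-agreement kernel for Ky, the small ridge added to Kx (to keep it numerically positive definite, in the spirit of the trick revisited in Section 7.1), and the choice of two directions are all illustrative assumptions, not prescriptions from the text.

import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
n, d = 60, 5
X = rng.standard_normal((n, d))                # rows are training points
y = rng.integers(0, 3, size=n)                 # class labels

Kx = np.exp(-cdist(X, X, "sqeuclidean"))       # kernel matrix over the training data
Ky = (y[:, None] == y[None, :]).astype(float)  # kernel matrix over the labels
H = np.eye(n) - np.ones((n, n)) / n            # centering matrix H = I - (1/n) 1 1^T

A = Kx @ H @ Ky @ H @ Kx                       # left-hand matrix of the pair in Eq. (23)
B = Kx + 1e-8 * np.eye(n)                      # right-hand matrix, with a small ridge

eigvals, Theta = eigh(A, B)                    # generalized eigenvectors, ascending eigenvalues
Theta = Theta[:, ::-1][:, :2]                  # keep the top-2 directions of the kernel SPCA subspace
print(Theta.shape)                             # (n, 2)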
5.2.2. Fisher Discriminant Analysis

Another example is Fisher Discriminant Analysis (FDA) (Fisher, 1936; Friedman et al., 2009), in which the Fisher criterion (Xu & Lu, 2006) is maximized:

maximize_w   (w⊤ SB w) / (w⊤ SW w),  (24)

where w is the projection direction and SB and SW are the between- and within-class scatters:

SB = Σ_{j=1}^{c} (µj − µt)(µj − µt)⊤,  (25)

SW = Σ_{j=1}^{c} Σ_{i=1}^{nj} (xj,i − µj)(xj,i − µj)⊤,  (26)

where c is the number of classes, nj is the sample size of the j-th class, xj,i is the i-th data point in the j-th class, µj is the mean of the j-th class, and µt is the total mean.

According to the Rayleigh–Ritz quotient method (Croot, 2005), the optimization problem in Eq. (24) can be restated as:

maximize_w   w⊤ SB w,
subject to   w⊤ SW w = 1.  (27)

The Lagrangian (Boyd & Vandenberghe, 2004) is:

L = w⊤ SB w − λ (w⊤ SW w − 1),

where λ is the Lagrange multiplier. Setting the derivative of L to zero gives:

∂L/∂w = 2 SB w − 2 λ SW w = 0  ⟹  2 SB w = 2 λ SW w  ⟹  SB w = λ SW w,

which is a generalized eigenvalue problem (SB, SW) according to Eq. (4). The w is the eigenvector with the largest eigenvalue and λ is the corresponding eigenvalue.

6. Solution to Eigenvalue Problem

In this section, we introduce the solution to the eigenvalue problem. Consider Eq. (1):

A φi = λi φi  ⟹  (A − λi I) φi = 0,  (28)

which is a homogeneous linear system of equations. According to Cramer's rule, such a system has non-trivial solutions if and only if the determinant of its coefficient matrix vanishes. Therefore:

det(A − λi I) = 0,  (29)

where det(·) denotes the determinant of a matrix. Eq. (29) gives a degree-d polynomial equation (the characteristic equation), which has d roots. Note that if A is not full rank (i.e., it is a singular matrix), some of the roots will be zero. Moreover, if A is positive semi-definite, i.e., A ⪰ 0, all the roots are non-negative.

The roots of Eq. (29) are the eigenvalues of A. After finding the roots, we put each of them back into Eq. (28) and solve for its corresponding eigenvector, φi ∈ R^d. Note that the vector obtained from Eq. (28) can be normalized, because it is the direction of the eigenvector that matters, not its magnitude; the information about magnitude lives in the corresponding eigenvalue.

7. Solution to Generalized Eigenvalue Problem

In this section, we introduce the solution to the generalized eigenvalue problem. Recall Eq. (16):

maximize_φ   (φ⊤ A φ) / (φ⊤ B φ).

Let ρ denote this fraction, known as the Rayleigh quotient (Croot, 2005):

ρ(u; A, B) := (u⊤ A u) / (u⊤ B u),  ∀u ≠ 0.  (30)

The ρ is stationary at φ ≠ 0 if and only if:

(A − λ B) φ = 0,  (31)

for some scalar λ (Parlett, 1998). Eq. (31) is a linear system of equations, which can also be obtained from Eq. (4):

A φi = λi B φi  ⟹  (A − λi B) φi = 0.  (32)

As mentioned earlier, the eigenvalue problem is a special case of the generalized eigenvalue problem (where B = I), which is obvious by comparing Eqs. (28) and (32).

According to Cramer's rule, a homogeneous linear system of equations has non-trivial solutions if and only if the determinant vanishes. Therefore:

det(A − λi B) = 0.  (33)

Similar to the explanations for Eq. (29), we can solve for the roots of Eq. (33). However, note that Eq. (33) is obtained from Eq. (4) or (16), where only one eigenvector φ is considered.

For solving Eq. (5) in the general case, there exist two approaches to the generalized eigenvalue problem: one is a quick-and-dirty solution and the other is a rigorous method. Both are explained in the following.

7.1. The Quick & Dirty Solution

Consider Eq. (5) again:

A Φ = B Φ Λ.

If B is not singular (i.e., it is invertible), we can left-multiply both sides by B⁻¹:

B⁻¹ A Φ = Φ Λ  ⟹  C Φ = Φ Λ,  (34)

where we define C := B⁻¹ A. Eq. (34) is the eigenvalue problem for C according to Eq. (2) and can be solved using the approach of Eq. (29).

Note that even if B is singular, we can use a numeric hack (which is a little dirty) and slightly strengthen its main diagonal in order to make it full rank:

(B + ε I)⁻¹ A Φ = Φ Λ  ⟹  C Φ = Φ Λ,  (35)

where ε is a very small positive number, e.g., ε = 10⁻⁵, just large enough to make B full rank.
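To make the quick-and-dirty route concrete, here is a minimal numerical sketch, assuming NumPy and SciPy and using illustrative random matrices: it forms C = B⁻¹ A as in Eq. (34) and checks its eigenvalues against SciPy's generalized symmetric solver, which operates directly on the pair (A, B).

import numpy as np
from scipy.linalg import eig, eigh

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = M + M.T                                    # symmetric A
N = rng.standard_normal((5, 5))
B = N @ N.T + 5 * np.eye(5)                    # invertible (positive definite) B

# Quick & dirty: C = B^{-1} A is generally not symmetric, so a general solver is used.
C = np.linalg.solve(B, A)
quick_vals = np.sort(eig(C)[0].real)           # eigenvalues of C, as in Eq. (34)

# Solver that works directly on the pair (A, B).
direct_vals = eigh(A, B, eigvals_only=True)    # ascending generalized eigenvalues

print(np.allclose(quick_vals, direct_vals))    # True: same spectrum of the pencil (A, B)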
7.2. The Rigorous Solution

Consider Eq. (5) again:

A Φ = B Φ Λ.

8. Conclusion

This paper was a tutorial introducing the eigenvalue and generalized eigenvalue problems. The problems were introduced, their optimization formulations were described, and some examples from machine learning were provided for them. Moreover, the solutions to the eigenvalue and generalized eigenvalue problems were presented.
References
Barshan, Elnaz, Ghodsi, Ali, Azimifar, Zohreh, and
Jahromi, Mansoor Zolghadri. Supervised principal com-
ponent analysis: Visualization, classification and regres-
sion on subspaces and submanifolds. Pattern Recogni-
tion, 44(7):1357–1371, 2011.
Boyd, Stephen and Vandenberghe, Lieven. Convex opti-
mization. Cambridge university press, 2004.
Croot, Ernie. The Rayleigh principle for finding
eigenvalues. Technical report, Georgia Institute of
Technology, School of Mathematics, 2005. Online:
http://people.math.gatech.edu/~ecroot/notes_linear.pdf,
Accessed: March 2019.
Fisher, Ronald A. The use of multiple measurements in
taxonomic problems. Annals of eugenics, 7(2):179–188,
1936.
Friedman, Jerome, Hastie, Trevor, and Tibshirani, Robert.
The elements of statistical learning, volume 2. Springer
Series in Statistics. Springer, New York, NY, USA, 2009.
Golub, Gene H. and Van Loan, Charles F. Matrix compu-
tations, volume 3. The Johns Hopkins University Press,
2012.
Jolliffe, Ian. Principal component analysis. Springer, 2011.
Parlett, Beresford N. The symmetric eigenvalue problem.
Classics in Applied Mathematics, 20, 1998.
Pearson, Karl. LIII. On lines and planes of closest fit to
systems of points in space. The London, Edinburgh, and
Dublin Philosophical Magazine and Journal of Science,
2(11):559–572, 1901.
Wang, Ruye. Generalized eigenvalue problem.
http://fourier.eng.hmc.edu/e161/lectures/algebra/node7.html,
2015. Accessed: January 2019.
Wilkinson, James Hardy. The algebraic eigenvalue prob-
lem, volume 662. Oxford Clarendon, 1965.
Xu, Yong and Lu, Guangming. Analysis on Fisher discrim-
inant criterion and linear separability of feature space. In
2006 International Conference on Computational Intel-
ligence and Security, volume 2, pp. 1671–1676. IEEE,
2006.