Matrix Exponential Explained
x′ = A(t)x (1)
From Theorem 3 proved earlier, this set of solutions is linearly independent for each t ∈ I and
hence forms a linearly independent set in the solution space S.
We now prove that an arbitrary solution x(t) of (1) is a linear combination of these solutions
xk (t). Given such an arbitrary solution x(t), define
x0 = x(t0 ) . (4)
x0 = c1 e1 + c2 e2 + · · · + cn en . (5)
Now use these constants to define a new function, y : I → Rn,
y(t) = c1 x1(t) + c2 x2(t) + · · · + cn xn(t).
From Eq. (4), the solutions x(t) and y(t) satisfy the same initial condition. Therefore, from the uniqueness part of Theorem 1 (Existence-Uniqueness for linear homogeneous DEs),
y(t) = x(t) for all t ∈ I. (6)
From (6), it follows that our arbitrary solution x(t) may be written as
x(t) = c1 x1(t) + c2 x2(t) + · · · + cn xn(t), (7)
i.e., a linear combination of the solutions xk(t). Therefore the set of solutions,
{x1(t), x2(t), · · · , xn(t)},
is a basis for the solution space S, implying that the dimension of S is n. The proof is complete.
Important remarks: This result finally proves a statement made earlier in this course, namely that the dimension of the solution space of linear second order homogeneous DEs of the form,
y′′ + p(t)y′ + q(t)y = 0, (11)
is two. (Note that t is used as the independent variable here, i.e., y = y(t).) From our earlier discussion, the above DE may be written as a linear homogeneous DE x′ = A(t)x in R2, where
x1 = y, x2 = y′. (12)
Given a solution y(t) to (11), with x1(t) = y(t) and x2(t) = x′1(t) = y′(t), one might wonder why the dimension of the solution space isn't 1, instead of 2. After all, if we know x1(t), we know y(t). The solution x2(t), which is, in fact, x′1(t) = y′(t), seems to be redundant.
The fact, however, is that for general linear systems in x1(t) and x2(t), one cannot solve for x1(t) without also solving for x2(t). Furthermore, in order to isolate a particular solution of the second order DE in (11), one must prescribe the initial conditions y(t0) and y′(t0), which become the initial conditions x1(t0) and x2(t0). The requirement of two initial pieces of information is a consequence of the two-dimensionality of the solution space S of solutions to (11).
Also recall that an nth order homogeneous linear differential equation in the function y(t)
having the form,
an(t) y^(n) + an−1(t) y^(n−1) + · · · + a1(t) y′ + a0(t) y = 0 , (13)
can be transformed into a linear system of n first order DEs in the functions,
x1 = y, x2 = y′, · · · , xn = y^(n−1). (14)
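As an illustration (added here; not part of the original notes), the following Python sketch builds the coefficient matrix of the equivalent first order system in the constant-coefficient case, assuming the equation has been normalized so that an = 1; the helper name companion_matrix is our own:

```python
import numpy as np

def companion_matrix(coeffs):
    """Coefficient matrix of the first order system equivalent to
        y^(n) + c[n-1] y^(n-1) + ... + c[1] y' + c[0] y = 0,
    under the substitution x1 = y, x2 = y', ..., xn = y^(n-1)."""
    n = len(coeffs)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)        # x_k' = x_{k+1} for k < n
    A[-1, :] = -np.asarray(coeffs)    # x_n' = -c0*x1 - ... - c_{n-1}*xn
    return A

# y'' + 3y' + 2y = 0 becomes x' = A x with the 2 x 2 matrix below.
print(companion_matrix([2.0, 3.0]))   # [[0. 1.], [-2. -3.]]
```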
Special case: Constant matrix A
Recall from AMATH 250/251 that the solutions to linear systems with constant coefficients, i.e.,
x′ = Ax , (15)
may be sought in the exponential form
x(t) = e^{λt} v , (16)
where v is a constant vector. Substituting (16) into (15) gives
λ e^{λt} v = e^{λt} A v , (17)
i.e., the eigenvalue problem
Av = λv . (18)
The Fundamental Matrix associated with x′ = Ax
Let’s now return to the following idea used in the proof of Theorem 4 (dimensionality of the
solution space S for a linear system in Rn ). Let ek , k = 1, 2, · · · , n, denote the “standard basis”
in Rn :
e1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} , e2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} , · · · , en = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} . (23)
Suppose that we have n solutions X1 (t), ..., Xn (t) to the linear system,
x′ = A(t)x , (24)
Recall that these solutions are represented by n × 1 column vectors, where Xk(t) denotes the solution of (24) satisfying the initial condition Xk(t0) = ek. Now put these solutions/column vectors together to form the following n × n matrix,
Φ(t, t0) = [X1(t) X2(t) · · · Xn(t)]. (25)
Then this matrix, known as the fundamental matrix of the linear system in (24) at t0 ,
satisfies the following properties:
1. Identity property: Since Xk(t0) = ek for each k,
Φ(t0, t0) = [X1(t0) X2(t0) · · · Xn(t0)]
= [e1 e2 · · · en]
= I,
the n × n identity matrix.
2. Initial value property: Let the matrix Φ(t, t0) operate on a constant vector a ∈ Rn as follows:
x(t) = Φ(t, t0)a = a1 X1(t) + a2 X2(t) + · · · + an Xn(t),
which implies that x(t) is also a solution to (24), being a linear combination of solutions. But note that
x(t0) = Φ(t0, t0)a
= Ia
= a.
In other words, the solution x(t) = Φ(t, t0)a is the unique solution of the initial value problem
x′ = A(t)x , x(t0) = a . (31)
3. Derivative property: The n×n matrix Φ(t, t0 ) is a solution of the matrix differential
equation
\frac{d}{dt} Φ(t, t0) = A(t) Φ(t, t0) . (32)
(It is understood that the term on LHS is the n × n matrix of the time derivatives of the
elements of Φ(t, t0 ).)
This result is a consequence of matrix multiplication performed column by column. Recall that the kth column of Φ(t, t0) is the solution vector Xk(t). The time derivative of this column is X′k(t) = A(t)Xk(t), which is precisely the kth column of the product A(t)Φ(t, t0). Placing each vector in the kth column on each side preserves these relations.
Alternatively, recall that for any a ∈ Rn,
x(t) = Φ(t, t0)a (33)
is a solution of (24), so that
x′(t) = A(t)x(t). (34)
Now substitute the left and right hand sides of (33) into the respective sides of (34):
Φ′(t, t0)a = A(t)Φ(t, t0)a. (35)
Rearranging, we have
[Φ(t, t0 )′ − A(t)Φ(t, t0 )]a = 0. (36)
Since this equation is true for all a ∈ Rn , it follows that the matrix expression in the
square brackets is zero, which proves (32).
Example: Let us return to the earlier example, namely, the linear system x′ = Ax, with
matrix A given by
A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} . (37)
Recall that the general solution of this system was found to be
x(t) = c1 x1(t) + c2 x2(t) = \begin{pmatrix} c1 e^{3t} + c2 e^{−t} \\ 2c1 e^{3t} − 2c2 e^{−t} \end{pmatrix} . (38)
We now determine the particular solutions X1 (t) and X2 (t) from which the fundamental matrix
Φ(t, 0) will be constructed:
(i) Find X1(t) such that X1(0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. This leads to the following linear system of equations in c1 and c2:
c1 + c2 = 1
2c1 − 2c2 = 0,
with solution c1 = c2 = 1/2.
(ii) Find X2(t) such that X2(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. This leads to the following linear system of equations in c1 and c2:
c1 + c2 = 0
2c1 − 2c2 = 1,
with solution c1 = 1/4, c2 = −1/4.
Therefore the fundamental matrix associated with this linear system is given by
Φ(t, 0) = \begin{pmatrix} \frac{1}{2} e^{3t} + \frac{1}{2} e^{−t} & \frac{1}{4} e^{3t} − \frac{1}{4} e^{−t} \\ e^{3t} − e^{−t} & \frac{1}{2} e^{3t} + \frac{1}{2} e^{−t} \end{pmatrix} . (39)
You will note that Φ(0, 0) = I.
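As an aside (a numerical illustration added here, not from the lecture), one can build Φ(t, 0) by integrating the system once for each standard basis initial condition and compare the result with the closed form (39). A minimal Python sketch, assuming scipy is available:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])

def Phi(t):
    """Fundamental matrix Phi(t, 0): the k-th column is the solution
    of x' = Ax with initial condition e_k."""
    if t == 0.0:
        return np.eye(2)              # identity property
    cols = []
    for k in range(2):
        sol = solve_ivp(lambda s, x: A @ x, (0.0, t), np.eye(2)[:, k],
                        rtol=1e-10, atol=1e-12)
        cols.append(sol.y[:, -1])
    return np.column_stack(cols)

t = 0.7
Phi_exact = np.array([
    [0.5*np.exp(3*t) + 0.5*np.exp(-t), 0.25*np.exp(3*t) - 0.25*np.exp(-t)],
    [np.exp(3*t) - np.exp(-t),         0.5*np.exp(3*t) + 0.5*np.exp(-t)]])
print(np.allclose(Phi(t), Phi_exact, atol=1e-7))   # True
```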
The particular solution of this linear system satisfying the initial condition,
x0 = a = \begin{pmatrix} 1 \\ 1 \end{pmatrix} , (40)
is
x(t) = Φ(t, 0)a = \begin{pmatrix} \frac{3}{4} e^{3t} + \frac{1}{4} e^{−t} \\ \frac{3}{2} e^{3t} − \frac{1}{2} e^{−t} \end{pmatrix} . (41)
We used this result in the previous lecture to illustrate the vector Picard iteration method.
The fundamental matrix also satisfies a composition property:
Φ(t, t1)Φ(t1, t0) = Φ(t, t0). (42)
To see this, let x(t) = Φ(t, t0)a denote the solution of (24) satisfying x(t0) = a, set b = x(t1) = Φ(t1, t0)a, and define
y(t) = Φ(t, t1 )b. (45)
Both x(t) and y(t) are solutions of (24), and by construction y(t1) = Φ(t1, t1)b = b = x(t1). By the uniqueness of solutions, they must coincide for all t ∈ I:
Φ(t, t1 )b = Φ(t, t0 )a, (47)
or
Φ(t, t1 )Φ(t1 , t0 )a = Φ(t, t0 )a. (48)
Since this result follows for any a ∈ Rn , the property in (42) follows.
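The composition property is easy to check numerically, even for a time-varying coefficient matrix, by integrating the matrix differential equation (32). The following Python sketch is an added illustration; the particular A(t) below is an arbitrary choice of ours:

```python
import numpy as np
from scipy.integrate import solve_ivp

def A(t):
    # an arbitrary time-varying coefficient matrix (illustrative only)
    return np.array([[0.0, 1.0],
                     [-1.0 - 0.5 * t, 0.0]])

def transition(t, t0, n=2):
    """Integrate Phi' = A(t) Phi with Phi(t0, t0) = I."""
    rhs = lambda s, m: (A(s) @ m.reshape(n, n)).ravel()
    sol = solve_ivp(rhs, (t0, t), np.eye(n).ravel(),
                    rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(n, n)

t0, t1, t = 0.0, 0.4, 1.1
lhs = transition(t, t1) @ transition(t1, t0)
print(np.allclose(lhs, transition(t, t0), atol=1e-7))  # True: (42) holds
```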
Special case: “Linear time-invariant systems”
In the special case that the matrix A in the linear system is constant, the solutions of the DE
along with its fundamental matrix possess some additional properties. For example, the solutions are invariant under translation in time: if x(t) is a solution of x′ = Ax, then so is the shifted function y(t) = x(t − a) for any constant a.
To illustrate, let us return to the linear system studied in the previous lecture and defined by
the constant matrix
A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix} . (52)
The general solution of this system was found to be
x(t) = c1 e^{3t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c2 e^{−t} \begin{pmatrix} 1 \\ −2 \end{pmatrix} . (53)
Now examine the function y(t) = x(t − a):
x(t − a) = c1 e^{3(t−a)} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c2 e^{−(t−a)} \begin{pmatrix} 1 \\ −2 \end{pmatrix} (54)
= c1 e^{−3a} e^{3t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c2 e^{a} e^{−t} \begin{pmatrix} 1 \\ −2 \end{pmatrix}
= d1 e^{3t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + d2 e^{−t} \begin{pmatrix} 1 \\ −2 \end{pmatrix} , where d1 = c1 e^{−3a} and d2 = c2 e^{a},
which is also a solution. We can prove this result more formally as follows.
Let y(t) = x(t − a). Then
\frac{d}{dt} y(t) = \frac{d}{dt} x(t − a) (55)
= \frac{d}{ds} x(s) \frac{ds}{dt} , where s = t − a,
= x′(s)
= Ax(s)
= Ay(t),
so that y(t) satisfies x′ = Ax and is indeed a solution.
Lecture 16
At the end of the previous lecture, we proved the following important property of linear systems defined by constant matrices A: if x(t) is a solution of x′ = Ax, then so is the time translate y(t) = x(t − t0) for any constant t0.
The above property extends to the fundamental matrix associated with the linear system
x′ = Ax, where A is a constant matrix, as we now show.
Let x(t) be an arbitrary solution and, for a fixed t0, define y(t) = x(t − t0). From the previous result, y(t) is also a solution. By the definition of the fundamental matrix:
y(t) = Φ(t, t0) y(t0) (58)
and
x(t) = Φ(t, 0) x(0). (59)
But y(t0) = x(t0 − t0) = x(0), so that (58) becomes
y(t) = Φ(t, t0) x(0), (60)
while writing y(t) = x(t − t0) and applying (59) gives
y(t) = x(t − t0) = Φ(t − t0, 0) x(0). (61)
Comparing (60) and (61), and noting that x(0) ∈ Rn is arbitrary, we have the result
Φ(t, t0) = Φ(t − t0, 0). (62)
In other words: the fundamental matrix of a time-invariant system depends only on the elapsed time t − t0. Therefore, for a linear time-invariant system, without loss of generality, we may define
Φ(t) ≡ Φ(t, 0), (63)
where t denotes the time elapsed from t0 = 0. This fundamental matrix satisfies some important
properties:
Let us return to the linear system studied earlier, with fundamental matrix given by
Φ(t, 0) = \begin{pmatrix} \frac{1}{2} e^{3t} + \frac{1}{2} e^{−t} & \frac{1}{4} e^{3t} − \frac{1}{4} e^{−t} \\ e^{3t} − e^{−t} & \frac{1}{2} e^{3t} + \frac{1}{2} e^{−t} \end{pmatrix} . (64)
1. Φ(−t)Φ(t) = I and
2. the matrix Φ(t) satisfies the matrix differential equation, Φ′ (t) = AΦ(t).
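Both properties can be verified numerically. The sketch below (added for illustration) uses scipy.linalg.expm for Φ(t) = e^{tA} with the sample matrix above, and checks the derivative property with a central finite difference; the step size h is our own choice:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
Phi = lambda s: expm(s * A)
t, h = 0.6, 1e-6

# 1. Phi(-t) Phi(t) = I
print(np.allclose(Phi(-t) @ Phi(t), np.eye(2)))        # True

# 2. Phi'(t) = A Phi(t), via a central difference approximation
dPhi = (Phi(t + h) - Phi(t - h)) / (2.0 * h)
print(np.allclose(dPhi, A @ Phi(t), atol=1e-5))        # True
```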
The exponential matrix
We now examine more closely the fundamental matrix Φ(t, t0 ) associated with a linear time-
invariant system, i.e.,
x′ = Ax, x(0) = a, (65)
with solution
x(t) = Φ(t)a . (66)
Recall that the corresponding scalar problem, x′ = ax with x(0) = x0 , has the solution
x(t) = x0 e^{at} . (67)
If we replace the constant “a” with the matrix A, then one might conjecture that the solution to the linear system in (65) is given by
x(t) = e^{tA} a . (68)
But what should the exponential of a matrix mean? Recall the Taylor series of the scalar exponential function,
e^x = 1 + x + \frac{1}{2!} x^2 + · · · + \frac{1}{n!} x^n + · · · . (69)
By analogy, we define the exponential of an n × n matrix A as
e^A = I + A + \frac{1}{2!} A^2 + · · · + \frac{1}{n!} A^n + · · · . (70)
It still remains to determine whether the above infinite series makes sense. For the moment, we'll simply continue under the assumption that everything above is fine, i.e., that the above infinite series converges. The technical points will be addressed at the end of the lecture.
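As a quick sanity check (an added illustration, not from the lecture), one can compare a truncated partial sum of the series against scipy.linalg.expm; the truncation length of 30 terms is an arbitrary choice that is ample for the small sample matrix:

```python
import numpy as np
from scipy.linalg import expm

def expm_series(A, terms=30):
    """Partial sum I + A + A^2/2! + ... + A^(terms-1)/(terms-1)!."""
    S = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # term is now A^k / k!
        S = S + term
    return S

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
print(np.allclose(expm_series(A), expm(A)))   # True
```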
Some properties of matrix exponentials
Some of the properties listed below follow immediately from the series definition. The others
require a little more work, which we omit here.
1. e^0 = I, where 0 denotes the n × n zero matrix.
2. e^{aI} = e^a I for any scalar a.
3. If A and B commute, i.e., AB = BA, then e^{A+B} = e^A e^B. (This fails for general non-commuting matrices.)
4. e^A is invertible for every A, with (e^A)^{−1} = e^{−A}.
Some examples:
Example 1: Let A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}. Then A^k = \begin{pmatrix} a^k & 0 \\ 0 & b^k \end{pmatrix}, and summing the series (70) entry by entry gives
e^A = \begin{pmatrix} e^a & 0 \\ 0 & e^b \end{pmatrix}.
Example 2: Let A = \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = aI + bN, where
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} . (75)
The matrix N is nilpotent, since N^2 = N^3 = · · · = 0. This implies that the series for the exponential of N terminates quite rapidly, i.e.,
e^N = I + N . (76)
Then
e^A = e^{aI + bN} (77)
= e^{aI} e^{bN} (since I and N commute)
= e^a I [I + bN]
= e^a \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} .
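This closed form is easily confirmed numerically (an added check; the values of a and b are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

a, b = 1.3, -0.7
I = np.eye(2)
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # the nilpotent matrix of Eq. (75)

closed_form = np.exp(a) * np.array([[1.0, b],
                                    [0.0, 1.0]])
print(np.allclose(expm(a * I + b * N), closed_form))   # True
```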
Example 3: Let A = \begin{pmatrix} a & b \\ −b & a \end{pmatrix} = aI + B, where
B = \begin{pmatrix} 0 & b \\ −b & 0 \end{pmatrix} . (80)
Since I and B commute,
e^A = e^{aI + B} = e^{aI} e^B = e^a e^B . (81)
To compute e^B, examine the powers of B:
B^2 = \begin{pmatrix} −b^2 & 0 \\ 0 & −b^2 \end{pmatrix} = −b^2 I (82)
and
B^3 = B B^2 = \begin{pmatrix} 0 & b \\ −b & 0 \end{pmatrix} \begin{pmatrix} −b^2 & 0 \\ 0 & −b^2 \end{pmatrix} = \begin{pmatrix} 0 & −b^3 \\ b^3 & 0 \end{pmatrix} . (83)
Combining all of the above results:
e^B = \begin{pmatrix} 1 − \frac{b^2}{2!} + · · · & b − \frac{b^3}{3!} + · · · \\ −b + \frac{b^3}{3!} − · · · & 1 − \frac{b^2}{2!} + · · · \end{pmatrix} = \begin{pmatrix} \cos b & \sin b \\ −\sin b & \cos b \end{pmatrix} . (84)
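Again, a quick numerical confirmation (added for illustration; the value of b is arbitrary) that e^B is the rotation-type matrix in (84):

```python
import numpy as np
from scipy.linalg import expm

b = 0.9
B = np.array([[0.0, b],
              [-b, 0.0]])
rotation = np.array([[np.cos(b), np.sin(b)],
                     [-np.sin(b), np.cos(b)]])
print(np.allclose(expm(B), rotation))   # True
```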
The reader may recall that the three examples discussed above represent the three “Jordan
canonical forms” for 2 × 2 matrices. Example 1 corresponds to a diagonalizable matrix (which
also includes the case a = b), Example 2 corresponds to a nondiagonalizable matrix with equal
eigenvalues a and Example 3 corresponds to the complex conjugate pair of eigenvalues a ± bi.
We shall return to these cases as they are encountered in linear systems of ODEs.
We now come to the main result of this section, which is to verify the conjecture in Eq. (68), i.e., that the solution to the initial value problem x′ = Ax, x(0) = a, is x(t) = e^{tA} a. Expanding the series definition of e^{tA},
x(t) = \left[ I + tA + \frac{t^2}{2!} A^2 + · · · + \frac{t^n}{n!} A^n + · · · \right] a . (88)
Differentiating term by term (a step that can be justified once convergence is established),
x′(t) = \left[ A + tA^2 + · · · + \frac{t^{n−1}}{(n−1)!} A^n + · · · \right] a . (89)
Now factor out an A from the above series,
x′(t) = A \left[ I + tA + · · · + \frac{t^{n−1}}{(n−1)!} A^{n−1} + · · · \right] a (90)
= A e^{tA} a
= A x(t).
Moreover, x(0) = Ia = a, so that x(t) = e^{tA} a is indeed the solution of the initial value problem.
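The verified formula x(t) = e^{tA} a can also be checked against a direct numerical integration of the initial value problem. A minimal sketch (added here; the matrix, initial condition and time are our own choices):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
a = np.array([1.0, 1.0])
t = 0.8

x_expm = expm(t * A) @ a                       # x(t) = e^{tA} a
sol = solve_ivp(lambda s, x: A @ x, (0.0, t), a,
                rtol=1e-10, atol=1e-12)
print(np.allclose(x_expm, sol.y[:, -1], atol=1e-7))   # True
```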
(These comments were not presented in this lecture. They are presented here for the interested
reader.)
To make sense of the series in (70), we first need a notion of the size, or norm, of a matrix. For an n × n matrix A, define
‖A‖ = max_{‖x‖=1} ‖Ax‖ . (92)
Note that we are looking for the maximum value of ‖Ax‖ for x in the “unit ball” in Rn. It follows that
‖Ax‖ ≤ ‖A‖ , for any x such that ‖x‖ = 1 . (93)
For an arbitrary nonzero vector y ∈ Rn, define x = y/‖y‖, which is clearly a unit vector. Now insert this vector into Eq. (93) to give
\frac{1}{‖y‖} ‖Ay‖ ≤ ‖A‖ , (96)
i.e.,
‖Ay‖ ≤ ‖A‖ ‖y‖ for all y ∈ Rn . (97)
Now consider the partial sums of the exponential series (70),
Sn = I + A + \frac{1}{2!} A^2 + · · · + \frac{1}{n!} A^n , n = 0, 1, 2, · · · . (98)
The powers A^k are all n × n matrices, implying that Sn is an n × n matrix. Let us now examine the norm of the matrices Sn. From the triangle inequality for norms,
‖Sn‖ = ‖ I + A + \frac{1}{2!} A^2 + · · · + \frac{1}{n!} A^n ‖
≤ ‖I‖ + ‖A‖ + \frac{1}{2!} ‖A^2‖ + · · · + \frac{1}{n!} ‖A^n‖ . (99)
We now use the fact that
‖A^n‖ ≤ ‖A‖^n . (100)
To see this, let's go back to Eq. (97) and let y = Ax, for any x such that ‖x‖ = 1. Then
‖A^2 x‖ = ‖A(Ax)‖ ≤ ‖A‖ ‖Ax‖ ≤ ‖A‖^2 ‖x‖ = ‖A‖^2 , for ‖x‖ = 1 , (101)
so that ‖A^2‖ ≤ ‖A‖^2; iterating this argument yields (100).
Combining (99) and (100), and noting that ‖I‖ = 1,
‖Sn‖ ≤ ‖I‖ + ‖A‖ + \frac{1}{2!} ‖A‖^2 + · · · + \frac{1}{n!} ‖A‖^n
= 1 + a + \frac{1}{2!} a^2 + · · · + \frac{1}{n!} a^n , where a = ‖A‖ . (102)
Note that
‖Sn‖ ≤ e^a for all n ≥ 0 . (103)
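The bound (103) can be observed numerically: numpy's matrix two-norm is exactly the operator norm (92) induced by the Euclidean vector norm. A small sketch (added for illustration):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
a = np.linalg.norm(A, 2)       # operator norm ||A|| of Eq. (92)

S = np.eye(2)                  # S_0 = I
term = np.eye(2)
for k in range(1, 25):
    term = term @ A / k        # A^k / k!
    S = S + term               # S_k
    assert np.linalg.norm(S, 2) <= np.exp(a)   # Eq. (103) holds
print("all partial sums bounded by e^||A|| =", np.exp(a))
```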
Note that for any n ≥ 0, the quantity ‖Sn‖ is a real number – it is the norm of the matrix Sn. From (103), provided the limit exists,
lim_{n→∞} ‖Sn‖ ≤ e^a = e^{‖A‖} . (104)
We're almost there – a tiny amount of extra work is required. Let us return to the partial sums in (98),
Sn = I + A + \frac{1}{2!} A^2 + · · · + \frac{1}{n!} A^n , n = 0, 1, 2, · · · . (105)
In order to conclude that the sequence of partial sums {Sn} is convergent, we must show that it is a Cauchy sequence, i.e., that for any ε > 0, there exists an N > 0 such that for any n, m > N,
‖Sm − Sn‖ < ε . (106)
This can be done in the same way as is done for convergent series of real numbers. Without loss of generality, let us assume that m > n. Then
‖Sm − Sn‖ = ‖ \frac{1}{(n+1)!} A^{n+1} + \frac{1}{(n+2)!} A^{n+2} + · · · + \frac{1}{m!} A^m ‖
≤ \frac{1}{(n+1)!} ‖A^{n+1}‖ + · · · + \frac{1}{m!} ‖A^m‖
≤ \frac{a^{n+1}}{(n+1)!} + \frac{a^{n+2}}{(n+2)!} + · · · + \frac{a^m}{m!} , (107)
where a = ‖A‖. We know that the series
\sum_{n=0}^{∞} \frac{a^n}{n!} (108)
converges – it converges to the value e^a. This implies that the sequence of scalar partial sums,
sn = 1 + a + \frac{1}{2!} a^2 + · · · + \frac{1}{n!} a^n , (109)
converges to the limit e^a. An elementary theorem from analysis states that if a sequence is convergent, i.e., if it converges to a limit, then it is a Cauchy sequence. From Eq. (107), this implies that for any ε > 0, there exists an N > 0 such that for all m > n > N,
‖Sm − Sn‖ ≤ sm − sn < ε . (110)
This establishes that the sequence of partial sums Sn in (98) is a convergent sequence and that
the limit of this sequence S is an n × n matrix. We call this limit the exponential of matrix A
and denote it as follows,
lim_{n→∞} Sn = S = e^A . (111)
Lecture 17
We now apply the results obtained earlier to examine the matrix exponential solutions of three
classes of linear systems in R2 :
Example 1: x′ = Ax where A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}.
In the previous lecture, we found e^A to be the diagonal matrix with entries e^a and e^b. To compute e^{tA}, we may simply replace a and b with at and bt, so that
e^{tA} = \begin{pmatrix} e^{at} & 0 \\ 0 & e^{bt} \end{pmatrix} . (112)
This means that the solution to the initial value problem x(0) = a is
e^{tA} a = \begin{pmatrix} e^{at} & 0 \\ 0 & e^{bt} \end{pmatrix} \begin{pmatrix} a1 \\ a2 \end{pmatrix} = \begin{pmatrix} a1 e^{at} \\ a2 e^{bt} \end{pmatrix} . (113)
Of course, the two DEs in x1 (t) and x2 (t) could easily have been solved separately. Since A is
diagonal, the two DEs are uncoupled. Nevertheless, it is still worthwhile to see the exponential
matrix applied to this problem.
Example 2: x′ = Ax where A = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}.
As in the previous lecture, we shall write A as
A = aI + N, (114)
where
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} . (115)
Recall that the matrix N is nilpotent, since N^2 = N^3 = · · · = 0. Then
e^{tA} = e^{atI + tN} (116)
= e^{atI} e^{tN} , since IN = NI,
= e^{at} I [I + tN]
= e^{at} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} e^{at} & t e^{at} \\ 0 & e^{at} \end{pmatrix} .
The solution to the initial value problem x(0) = a is
e^{tA} a = \begin{pmatrix} e^{at} & t e^{at} \\ 0 & e^{at} \end{pmatrix} \begin{pmatrix} a1 \\ a2 \end{pmatrix} = \begin{pmatrix} (a1 + a2 t) e^{at} \\ a2 e^{at} \end{pmatrix} . (117)
Here we see the appearance of solutions in which polynomials multiply exponentials. This is, of course, reminiscent of the case of second-order linear DEs with constant coefficients for which the roots of the characteristic polynomial are equal. The matrix A above is the Jordan canonical form for such a situation.
Example 3: x′ = Ax where A = \begin{pmatrix} a & b \\ −b & a \end{pmatrix}.
As in the previous lecture, we shall write A as
A = aI + B, (118)
where
B = \begin{pmatrix} 0 & b \\ −b & 0 \end{pmatrix} . (119)
Since I and B commute,
e^{tA} = e^{atI + tB} = e^{atI} e^{tB} = e^{at} e^{tB} . (120)
The matrix eB was computed in the previous lecture. We obtain etB by replacing b with bt.
The final result is
e^{tA} = e^{at} \begin{pmatrix} \cos bt & \sin bt \\ −\sin bt & \cos bt \end{pmatrix} . (121)
The solution to the initial value problem x(0) = a is
e^{tA} a = \begin{pmatrix} e^{at} \cos bt & e^{at} \sin bt \\ −e^{at} \sin bt & e^{at} \cos bt \end{pmatrix} \begin{pmatrix} a1 \\ a2 \end{pmatrix} = \begin{pmatrix} e^{at}(a1 \cos bt + a2 \sin bt) \\ e^{at}(−a1 \sin bt + a2 \cos bt) \end{pmatrix} . (122)
The eigenvalues of A are complex, i.e., λ = a±bi. As we know for linear second order DEs with
constant coefficients, the solutions involve exponentials multiplied by trigonometric functions.
The above three examples actually take care of all situations that would be encountered in
R2 . Why is this? Because they are the three fundamental Jordan canonical (or normal)
forms of 2 × 2 matrices into which all 2 × 2 matrices can be converted by means of a similarity
transformation.
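The three closed forms derived above can be checked in one pass against scipy.linalg.expm (an added illustration; the values of t, a and b are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

t, a, b = 0.5, 1.2, 0.8
cases = {
    "diagonal":     (np.array([[a, 0.0], [0.0, b]]),
                     np.array([[np.exp(a*t), 0.0], [0.0, np.exp(b*t)]])),
    "Jordan block": (np.array([[a, 1.0], [0.0, a]]),
                     np.exp(a*t) * np.array([[1.0, t], [0.0, 1.0]])),
    "complex pair": (np.array([[a, b], [-b, a]]),
                     np.exp(a*t) * np.array([[np.cos(b*t), np.sin(b*t)],
                                             [-np.sin(b*t), np.cos(b*t)]])),
}
for name, (A, closed_form) in cases.items():
    print(name, np.allclose(expm(t * A), closed_form))   # all True
```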
This leads to the important question, “How do we compute matrix exponentials etA in
general?” The answer will almost always be, “Certainly not by using the series definition!”
Fortunately, we can rely upon similarity transformations to simplify the problem. Recall, from
linear algebra, that for a general n × n matrix A, we can construct a matrix C such that the
matrix B defined as
B = C−1 AC (123)
is in Jordan form. In the simplest possible case, where A has n real and distinct eigenvalues
λ1 , λ2 , · · · , λn , the matrix B will be a diagonal matrix with entries λi , i.e.,
B = diag[λ1 , λ2 , · · · , λn ] = \begin{pmatrix} λ1 & 0 & \cdots & 0 \\ 0 & λ2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λn \end{pmatrix} . (124)
In this case, the matrix C that accomplishes this job is the matrix formed by concatenating
the n linearly independent λk -eigenvectors vk , k = 1, 2, · · · , n, to form an n × n matrix:
C = [v1 v2 · · · vn ] . (125)
If, on the other hand, an eigenvalue λ is repeated but has only one linearly independent eigenvector, the Jordan form B will, for example, have the following 2 × 2 block in the upper left corner:
\begin{pmatrix} λ & 1 \\ 0 & λ \end{pmatrix} . (126)
And if two eigenvalues λk and λk+1 are complex conjugates of each other, e.g., λk = a + bi,
with an appropriate selection of basis vectors (we omit a detailed discussion here), there will
be a corresponding 2 × 2 block containing the kth and k + 1st diagonal elements of B; that
block will have the form
\begin{pmatrix} a & b \\ −b & a \end{pmatrix} . (127)
The exponentiation of B is then relatively straightforward: in the simplest cases, e^{tB} will generally have blocks that correspond to the 2 × 2 blocks produced in the first three examples. (I use the word “generally” because there can be additional complications, e.g., triply degenerate eigenvalues, doubly degenerate complex eigenvalues, etc. We avoid discussion of these complications here.)
Question: So once you have produced the matrix etB , how do you get back to etA ?
By definition, and using A = CBC^{−1},
e^{tA} = e^{tCBC^{−1}} (130)
= I + tCBC^{−1} + \frac{t^2}{2!} (CBC^{−1})(CBC^{−1}) + · · ·
= CC^{−1} + tCBC^{−1} + \frac{t^2}{2!} CB^2 C^{−1} + · · ·
= C \left[ I + tB + \frac{t^2}{2!} B^2 + · · · \right] C^{−1}
= C e^{tB} C^{−1} .
This result implies that we can take the general matrix A, transform it to matrix B, exponen-
tiate tB, then inverse transform back to yield etA .
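In the diagonalizable case, this procedure is a few lines of numerical linear algebra. A minimal sketch (added for illustration), using numpy's eigendecomposition and comparing against scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
t = 0.8

# Columns of C are eigenvectors of A, so B = C^{-1} A C is diagonal.
eigvals, C = np.linalg.eig(A)
etB = np.diag(np.exp(eigvals * t))       # exponentiate the diagonal matrix tB
etA = C @ etB @ np.linalg.inv(C)         # transform back: e^{tA} = C e^{tB} C^{-1}
print(np.allclose(etA, expm(t * A)))     # True
```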
Example 4: x′ = Ax where A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}.
The eigenvalues of A are λ1 = 3 and λ2 = −1, with corresponding eigenvectors v1 = (1, 2)^T and v2 = (1, −2)^T, so that
C = \begin{pmatrix} 1 & 1 \\ 2 & −2 \end{pmatrix} , C^{−1} = \frac{1}{4} \begin{pmatrix} 2 & 1 \\ 2 & −1 \end{pmatrix} , B = C^{−1}AC = \begin{pmatrix} 3 & 0 \\ 0 & −1 \end{pmatrix} .
Therefore
e^{tA} = C e^{tB} C^{−1} = C \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{−t} \end{pmatrix} C^{−1} = \begin{pmatrix} \frac{1}{2}(e^{3t} + e^{−t}) & \frac{1}{4}(e^{3t} − e^{−t}) \\ e^{3t} − e^{−t} & \frac{1}{2}(e^{3t} + e^{−t}) \end{pmatrix} ,
which agrees with the earlier result.
We now have outlined the basic procedure to determine all solutions to linear homogeneous
systems having the form,
x′ = Ax , (137)
where A is a constant n × n matrix. The solution satisfying the initial condition
x(0) = a , (139)
is
x(t) = Φ(t)a = etA a . (140)
For example, the solution to this initial value problem for Example 4 above is,
x(t) = e^{tA} a
= \begin{pmatrix} \frac{1}{2}(e^{3t} + e^{−t}) & \frac{1}{4}(e^{3t} − e^{−t}) \\ e^{3t} − e^{−t} & \frac{1}{2}(e^{3t} + e^{−t}) \end{pmatrix} \begin{pmatrix} a1 \\ a2 \end{pmatrix}
= \begin{pmatrix} (\frac{1}{2} a1 + \frac{1}{4} a2) e^{3t} + (\frac{1}{2} a1 − \frac{1}{4} a2) e^{−t} \\ (a1 + \frac{1}{2} a2) e^{3t} + (−a1 + \frac{1}{2} a2) e^{−t} \end{pmatrix} . (141)
Linear homogeneous DEs as “dynamical systems”
(Relevant section from AMATH 351 Course Notes by J. Wainwright: Section 2.2.5,
p. 85)
Dynamical systems theory is the study of evolution of systems with respect to a parameter,
usually time. Here, we view the solutions of the linear homogeneous DE,
x′ = Ax , (142)
from a dynamical systems viewpoint, which essentially means examining solutions, or the entire
family of solutions, from a geometrical point of view. Recall that these solutions may be
expressed in the form,
x(t) = etA a (143)
where
x(0) = a . (144)
We'll be viewing these solutions as curves in the state space Rn. These curves are called orbits of the DE. As the time variable t varies, the point x(t) traces out a curve in Rn, and the motion of x(t) is determined by the flow operator
Φ(t) = e^{tA} . (145)
We don't simply have to think about the curve x(t) for positive values of t, i.e., forward motion – we can also consider its motion for t decreasing from zero. The net result is that the orbit of the flow e^{tA} through a is defined as
γ(a) = { x ∈ Rn : x = e^{tA} a , t ∈ R } . (146)
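For a concrete picture (an added illustration), one can sample the orbit through a point a for both negative and positive times; the matrix and the time grid below are arbitrary choices:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
a = np.array([1.0, 1.0])

# points on the orbit through a, for t running backward and forward
for t in np.linspace(-1.0, 1.0, 9):
    x = expm(t * A) @ a
    print(f"t = {t:+.2f}   x = ({x[0]: .4f}, {x[1]: .4f})")
```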
There is one special point for this system, namely, the point x = 0 since
x(t) = 0 (147)
is a solution of the DE. Since
A0 = 0 , (148)
it is easy to see, from the series expansion definition of the matrix exponential, that
e^{tA} 0 = 0 for all t . (149)
The point x = 0 is called an equilibrium point of the DE. It is also called a fixed point of the flow operator Φ(t). Note that the orbit of 0 is simply the point 0 itself,
γ(0) = {0} . (150)