MIT System Theory Solutions
Exercise 1.1 a) Given square matrices $A_1$ and $A_4$, we know that $A$ is square as well:
$$A = \begin{bmatrix} A_1 & A_2 \\ 0 & A_4 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & A_4 \end{bmatrix} \begin{bmatrix} A_1 & A_2 \\ 0 & I \end{bmatrix}.$$
Note that
$$\det \begin{bmatrix} I & 0 \\ 0 & A_4 \end{bmatrix} = \det(I)\det(A_4) = \det(A_4),$$
which can be verified by recursively computing the principal minors. Also, by elementary row operations, we have
$$\det \begin{bmatrix} A_1 & A_2 \\ 0 & I \end{bmatrix} = \det \begin{bmatrix} A_1 & 0 \\ 0 & I \end{bmatrix} = \det(A_1).$$
Finally, note that when $A$ and $B$ are square, we have that $\det(AB) = \det(A)\det(B)$. Thus we have
$$\det(A) = \det(A_1)\det(A_4).$$
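The determinant identity can be spot-checked numerically; the sketch below (NumPy, with block sizes chosen arbitrarily) assembles a block upper-triangular matrix and compares the two sides.

```python
import numpy as np

rng = np.random.default_rng(0)
A1 = rng.standard_normal((3, 3))   # square diagonal blocks
A4 = rng.standard_normal((2, 2))
A2 = rng.standard_normal((3, 2))   # off-diagonal block, any compatible size

# Assemble the block upper-triangular matrix A = [[A1, A2], [0, A4]]
A = np.block([[A1, A2], [np.zeros((2, 3)), A4]])

lhs = np.linalg.det(A)
rhs = np.linalg.det(A1) * np.linalg.det(A4)
```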
b) Assume $A_1^{-1}$ and $A_4^{-1}$ exist. Then $A^{-1} = B$ must satisfy
$$AA^{-1} = \begin{bmatrix} A_1 & A_2 \\ 0 & A_4 \end{bmatrix} \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix},$$
and solving block by block gives
$$A^{-1} = \begin{bmatrix} A_1^{-1} & -A_1^{-1} A_2 A_4^{-1} \\ 0 & A_4^{-1} \end{bmatrix}.$$
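The block-inverse formula can likewise be checked against a direct inverse; the sizes and random seed below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
A1 = rng.standard_normal((3, 3))
A4 = rng.standard_normal((2, 2))
A2 = rng.standard_normal((3, 2))

A = np.block([[A1, A2], [np.zeros((2, 3)), A4]])

A1i = np.linalg.inv(A1)
A4i = np.linalg.inv(A4)
# Block formula: A^{-1} = [[A1^{-1}, -A1^{-1} A2 A4^{-1}], [0, A4^{-1}]]
Ainv_block = np.block([[A1i, -A1i @ A2 @ A4i], [np.zeros((2, 3)), A4i]])

err = np.max(np.abs(Ainv_block - np.linalg.inv(A)))
```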
Exercise 1.2 a)
$$\begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix} \begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix} = \begin{bmatrix} A_3 & A_4 \\ A_1 & A_2 \end{bmatrix}.$$
b) Let us find
$$B = \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix}$$
such that
$$BA = \begin{bmatrix} A_1 & A_2 \\ 0 & A_4 - A_3 A_1^{-1} A_2 \end{bmatrix};$$
the choice
$$B = \begin{bmatrix} I & 0 \\ -A_3 A_1^{-1} & I \end{bmatrix}$$
works. c) Using linear operations, $\det(B) = 1$. Then, by Exercise 1.1,
$$\det(A) = \det(B)\det(A) = \det(BA) = \det(A_1)\det\!\left(A_4 - A_3 A_1^{-1} A_2\right).$$
Note that $A_4 - A_3 A_1^{-1} A_2$ does not have to be invertible for the proof.
Exercise 1.3 a) We have to prove that $\det(I - AB) = \det(I - BA)$, where $A$ is $p \times q$ and $B$ is $q \times p$.
Proof: Since $I$ and $I - BA$ are square,
$$\det(I - BA) = \det \begin{bmatrix} I & 0 \\ B & I - BA \end{bmatrix} = \det\!\left( \begin{bmatrix} I & A \\ B & I \end{bmatrix} \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix} \right) = \det \begin{bmatrix} I & A \\ B & I \end{bmatrix} \det \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix};$$
yet, from Exercise 1.1, we have
$$\det \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix} = \det(I)\det(I) = 1.$$
Thus,
$$\det(I - BA) = \det \begin{bmatrix} I & A \\ B & I \end{bmatrix}.$$
Now, subtracting $A$ times the second block row from the first,
$$\det \begin{bmatrix} I & A \\ B & I \end{bmatrix} = \det \begin{bmatrix} I - AB & 0 \\ B & I \end{bmatrix} = \det(I - AB).$$
Therefore
$$\det(I - BA) = \det(I - AB).$$
Note that $(I - BA)$ is a $q \times q$ matrix while $(I - AB)$ is a $p \times p$ matrix. Thus, when one wants to compute the determinant of $(I - AB)$ or $(I - BA)$, one can compare $p$ and $q$ and work with the product ($AB$ or $BA$) of smaller size.
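A small numerical check of the identity, which also illustrates the size remark: for $p \gg q$ it is cheaper to form the $q \times q$ determinant.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 5, 2
A = rng.standard_normal((p, q))
B = rng.standard_normal((q, p))

d_big = np.linalg.det(np.eye(p) - A @ B)    # p x p determinant
d_small = np.linalg.det(np.eye(q) - B @ A)  # q x q determinant (cheaper)
```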
b) We have to show that $(I - AB)^{-1} A = A(I - BA)^{-1}$.
Proof: Assume that $(I - BA)^{-1}$ and $(I - AB)^{-1}$ exist. Then,
$$A = A\,I = A(I - BA)(I - BA)^{-1} = (A - ABA)(I - BA)^{-1} = (I - AB)A(I - BA)^{-1},$$
and multiplying on the left by $(I - AB)^{-1}$ gives
$$(I - AB)^{-1} A = A(I - BA)^{-1}.$$
This completes the proof.
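The push-through identity can be verified numerically; `np.linalg.solve` is used on the left side to avoid forming an explicit inverse.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 4))

lhs = np.linalg.solve(np.eye(4) - A @ B, A)   # (I - AB)^{-1} A
rhs = A @ np.linalg.inv(np.eye(3) - B @ A)    # A (I - BA)^{-1}
err = np.max(np.abs(lhs - rhs))
```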
Exercise 1.6 a) We use the definition of the derivative as a limit:
$$\frac{d}{dt}\left(A(t)B(t)\right) = \lim_{\Delta t \to 0} \frac{A(t+\Delta t)B(t+\Delta t) - A(t)B(t)}{\Delta t}.$$
We substitute first order Taylor series expansions
$$A(t + \Delta t) = A(t) + \Delta t\,\frac{d}{dt}A(t) + o(\Delta t), \qquad B(t + \Delta t) = B(t) + \Delta t\,\frac{d}{dt}B(t) + o(\Delta t)$$
to obtain
$$\frac{d}{dt}\left(A(t)B(t)\right) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\left[ A(t)B(t) + \Delta t\,\frac{dA(t)}{dt}B(t) + \Delta t\,A(t)\frac{dB(t)}{dt} + \text{h.o.t.} - A(t)B(t) \right].$$
Here h.o.t. stands for the terms
$$\text{h.o.t.} = \left(A(t) + \Delta t\,\frac{dA(t)}{dt}\right)o(\Delta t) + o(\Delta t)\left(B(t) + \Delta t\,\frac{dB(t)}{dt}\right) + o(\Delta t^2),$$
a matrix quantity, where $\lim_{\Delta t \to 0} \text{h.o.t.}/\Delta t = 0$ (verify). Reducing the expression and taking the limit, we obtain
$$\frac{d}{dt}[A(t)B(t)] = \frac{dA(t)}{dt}B(t) + A(t)\frac{dB(t)}{dt}.$$
b) For this part we write the identity $A^{-1}(t)A(t) = I$. Taking the derivative on both sides, we have
$$\frac{d}{dt}\left[A^{-1}(t)A(t)\right] = \frac{dA^{-1}(t)}{dt}A(t) + A^{-1}(t)\frac{dA(t)}{dt} = 0,$$
and solving for the derivative of the inverse,
$$\frac{dA^{-1}(t)}{dt} = -A^{-1}(t)\frac{dA(t)}{dt}A^{-1}(t).$$
The underlying space $X$ here consists of the polynomials
$$p(x) = \sum_{i=0}^{M} \alpha_i x^i.$$
b) $T : X \to X$ and $T(g(x)) = \frac{d}{dx}g(x)$.
Proof:
$$T(ag_1(x) + bg_2(x)) = \frac{d}{dx}\left(ag_1(x) + bg_2(x)\right) = a\frac{d}{dx}g_1 + b\frac{d}{dx}g_2 = aT(g_1) + bT(g_2).$$
Thus, $T$ is linear.
2. $g(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_M x^M$, so
$$T(g(x)) = \alpha_1 + 2\alpha_2 x + \cdots + M\alpha_M x^{M-1}.$$
Thus it can be written as follows:
$$\begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 2 & \cdots & 0 \\ 0 & 0 & 0 & 3 & \vdots \\ \vdots & \vdots & \vdots & \ddots & M \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_M \end{bmatrix} = \begin{bmatrix} \alpha_1 \\ 2\alpha_2 \\ 3\alpha_3 \\ \vdots \\ M\alpha_M \\ 0 \end{bmatrix}.$$
The big matrix, $\mathbf{M}$, is a matrix representation of $T$ with respect to the basis $B$. The column vector on the left is the representation of $g(x)$ with respect to $B$. The column vector on the right is $T(g)$ with respect to the basis $B$.
3. Since the matrix $\mathbf{M}$ is upper triangular with zeros along the diagonal (in fact $\mathbf{M}$ is Hessenberg), the eigenvalues are all 0:
$$\lambda_i = 0, \quad i = 1, \ldots, M+1.$$
$$V_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
is one eigenvector. Since the $\lambda_i$'s are not distinct, the eigenvectors are not necessarily independent. Thus, in order to compute the $M$ others, one uses the generalized eigenvector formula.
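The matrix representation and its spectrum can be checked for a small $M$; the construction below builds the differentiation matrix for $M = 4$ (an arbitrary choice).

```python
import numpy as np

M = 4  # polynomials of degree at most M
# Matrix of d/dx acting on the coefficient vector (a_0, ..., a_M):
# column j (the x^j coefficient) maps to j * x^{j-1}.
D = np.zeros((M + 1, M + 1))
for j in range(1, M + 1):
    D[j - 1, j] = j

eigvals = np.linalg.eigvals(D)

# e_1 = (1, 0, ..., 0) represents the constant polynomial, an eigenvector for 0
e1 = np.zeros(M + 1)
e1[0] = 1.0
De1 = D @ e1
```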
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Proof: From i), by switching $A$ with $A'$, we know that $\mathcal{N}(A) = \mathcal{R}(A')^{\perp}$. That implies that
$$\mathcal{N}(A)^{\perp} = \left\{\mathcal{R}(A')^{\perp}\right\}^{\perp} = \mathcal{R}(A').$$
b) Show that $\operatorname{rank}(A) + \operatorname{rank}(B) - n \le \operatorname{rank}(AB) \le \min\{\operatorname{rank}(A), \operatorname{rank}(B)\}$.
Each column of $AB$ is a combination of the columns of $A$, which implies that $\mathcal{R}(AB) \subseteq \mathcal{R}(A)$. Each row of $AB$ is a combination of the rows of $B$, so rowspace$(AB) \subseteq$ rowspace$(B)$; together these give the upper bound.
Now, let $\{v_1, \ldots, v_{r_B}\}$ be a basis set of $\mathcal{R}(B)$, and add $n - r_B$ linearly independent vectors $w_1, \ldots, w_{n - r_B}$ to complete a basis of the full space:
$$\begin{bmatrix} v_1 \mid v_2 \mid \cdots \mid v_{r_B} \mid w_1 \mid \cdots \mid w_{n - r_B} \end{bmatrix} = \begin{bmatrix} V \mid W \end{bmatrix}.$$
Using fact 1, we see that the number of linearly independent columns of $A$ is less than or equal to the number of linearly independent columns of $AV$ plus the number of linearly independent columns of $AW$, which means that
$$\operatorname{rank}(A) \le \operatorname{rank}(AV) + \operatorname{rank}(AW).$$
Using fact 2, we see that
$$\operatorname{rank}(AV) = \operatorname{rank}(AB) \implies \operatorname{rank}(A) \le \operatorname{rank}(AB) + \operatorname{rank}(AW);$$
yet, there are only $n - r_B$ columns in $AW$. Thus,
$$\operatorname{rank}(AW) \le n - r_B \implies \operatorname{rank}(A) - \operatorname{rank}(AB) \le n - r_B,$$
that is, $r_A - (n - r_B) \le r_{AB}$.
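Both rank bounds can be exercised on matrices of known deficient rank; the dimensions below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
# Build A and B with known, deficient ranks (products of thin factors)
A = rng.standard_normal((n, 3)) @ rng.standard_normal((3, n))  # rank 3
B = rng.standard_normal((n, 4)) @ rng.standard_normal((4, n))  # rank 4

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)
```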
The relation between the data points $y_i$ and the second order polynomial $p_2(t) = a_0 + a_1 t + a_2 t^2$ can be written as
$$\begin{bmatrix} y_1 \\ \vdots \\ y_{16} \end{bmatrix} = \begin{bmatrix} 1 & t_1 & t_1^2 \\ \vdots & \vdots & \vdots \\ 1 & t_{16} & t_{16}^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} e_1 \\ \vdots \\ e_{16} \end{bmatrix}.$$
The coefficients $a_0$, $a_1$, and $a_2$ are determined by the least squares solution:
$$a_{LS} = \begin{bmatrix} 0.5296 \\ 0.2061 \\ 0.375 \end{bmatrix}.$$
For the 15th order polynomial, by a similar reasoning we can express the relation between the data points $y_i$ and the polynomial as follows:
$$\begin{bmatrix} y_1 \\ \vdots \\ y_{16} \end{bmatrix} = \begin{bmatrix} 1 & t_1 & t_1^2 & \cdots & t_1^{15} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & t_{16} & t_{16}^2 & \cdots & t_{16}^{15} \end{bmatrix} \begin{bmatrix} a_0 \\ \vdots \\ a_{15} \end{bmatrix} + \begin{bmatrix} e_1 \\ \vdots \\ e_{16} \end{bmatrix}.$$
This can be rewritten as $y = Aa + e$. Observe that the matrix $A$ is invertible for distinct $t_i$'s. So the coefficients $a_i$ of the polynomial are $a_{exact} = A^{-1}y$, where $a_{exact} = \begin{bmatrix} a_0 & a_1 & \cdots & a_{15} \end{bmatrix}'$. The resulting error in fitting the data is $e = 0$; thus we have a perfect fit at these particular time instants.
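The perfect-fit claim can be reproduced with a Vandermonde solve; the sample grid below (16 equispaced points on $[0, 2]$, with $f(t) = \frac{1}{2}e^{0.8t}$ from part (c)) is an assumption, since the exact $t_i$'s are not restated here.

```python
import numpy as np

# 16 samples of f(t) = 0.5 * exp(0.8 t) on [0, 2] (an assumed grid)
t = np.linspace(0, 2, 16)
y = 0.5 * np.exp(0.8 * t)

# 16x16 Vandermonde matrix with columns [1, t, t^2, ..., t^15]
A = np.vander(t, 16, increasing=True)

a_exact = np.linalg.solve(A, y)   # interpolation: the residual e is zero
residual = np.max(np.abs(A @ a_exact - y))
```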
$$a_{exact} = \begin{bmatrix} 0.49999998876521 \\ 0.39999826604650 \\ 0.16013119161635 \\ 0.04457531385982 \\ 0.00699544100513 \\ 0.00976690595462 \\ 0.02110628552919 \\ 0.02986537283027 \\ 0.03799813521505 \\ 0.00337725219202 \\ 0.00252507772183 \\ 0.00072658523695 \\ 0.00021752221402 \\ 0.00009045014791 \\ 0.00015170733465 \\ 0.00001343734075 \end{bmatrix}$$
Figure 2.2a
The function $f(t)$ as well as the approximating polynomials $p_{15}(t)$ and $p_2(t)$ are plotted in Figure 2.2a. Note that while both polynomials are a good fit, the fifteenth order polynomial is a better approximation, as expected.
b) Now we have measurements affected by some noise. The corrupted data is
$$\bar{y}_i = f(t_i) + e(t_i), \quad i = 1, \ldots, 16, \quad t_i \in T.$$
Following the reasoning in part (a), we can express the relation between the noisy data points $\bar{y}_i$ and the polynomial coefficients as
$$\bar{y} = Aa + e.$$
The solution procedure is the same as in part (a), with $y$ replaced by $\bar{y}$. Numerically, the values of the coefficients are:
$$a_{exact} = \begin{bmatrix} 0.00001497214861 \\ 0.00089442543781 \\ 0.01844588716755 \\ 0.14764397515270 \\ 0.63231582484352 \\ 1.62190727992829 \\ 2.61484909708492 \\ 2.67459894145774 \\ 1.67594757924772 \\ 0.56666848864500 \\ 0.06211921500456 \\ 0.00219622725954 \\ 0.01911248745682 \\ 0.01085690854235 \\ 0.00207893294346 \\ 0.00010788458590 \end{bmatrix}$$
Figure 2.2b
and
$$a_{LS} = \begin{bmatrix} 1.2239 \\ 0.1089 \\ 0.3219 \end{bmatrix}.$$
The function $f(t)$ as well as the approximating polynomials $p_{15}(t)$ and $p_2(t)$ are plotted in Figure 2.2b. The second order polynomial does much better in this case, as the fifteenth order polynomial ends up fitting the noise. Overfitting is a common problem encountered when trying to fit a finite data set corrupted by noise using a class of models that is too rich.
Additional Comments: A stochastic derivation shows that the minimum variance unbiased estimator for $a$ is $\hat{a} = \arg\min \|y - Aa\|_W^2$, where $W = R_n^{-1}$ and $R_n$ is the covariance matrix of the random variable $e$. So,
$$\hat{a} = (A'WA)^{-1}A'Wy.$$
Roughly speaking, this says that measurements with more noise are given less weight in the estimate of $a$. In our problem, $R_n = I$ because the $e_i$'s are independent, zero mean, and have unit variance. That is, each of the measurements is equally noisy, or treated as equally reliable.
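The weighted estimator can be sketched as follows; the noise variances, matrix sizes, and "true" coefficient vector below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 20, 3
A = rng.standard_normal((m, n))
a_true = np.array([0.5, 0.4, 0.16])   # hypothetical true coefficients

# Per-measurement noise variances; W = R_n^{-1} down-weights noisy rows
sig2 = rng.uniform(0.1, 2.0, size=m)
y = A @ a_true + rng.standard_normal(m) * np.sqrt(sig2)
W = np.diag(1.0 / sig2)

# Weighted least squares: a_hat = (A' W A)^{-1} A' W y
a_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
```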
c) $p_2(t)$ can be written as
$$p_2(t) = a_0 + a_1 t + a_2 t^2.$$
In order to minimize the approximation error in the least squares sense, the optimal $p_2(t)$ must be such that the error, $f - p_2$, is orthogonal to the span of $\{1, t, t^2\}$:
$$\langle f - p_2, 1 \rangle = 0 \iff \langle f, 1 \rangle = \langle p_2, 1 \rangle,$$
$$\langle f - p_2, t \rangle = 0 \iff \langle f, t \rangle = \langle p_2, t \rangle,$$
$$\langle f - p_2, t^2 \rangle = 0 \iff \langle f, t^2 \rangle = \langle p_2, t^2 \rangle.$$
Figure 2.2c
We have that $f = \frac{1}{2}e^{0.8t}$ for $t \in [0, 2]$. So,
$$\langle f, 1 \rangle = \int_0^2 \frac{1}{2}e^{0.8t}\,dt = \frac{5}{8}e^{8/5} - \frac{5}{8},$$
$$\langle f, t \rangle = \int_0^2 \frac{t}{2}e^{0.8t}\,dt = \frac{15}{32}e^{8/5} + \frac{25}{32},$$
$$\langle f, t^2 \rangle = \int_0^2 \frac{t^2}{2}e^{0.8t}\,dt = \frac{85}{64}e^{8/5} - \frac{125}{64}.$$
And,
$$\langle p_2, 1 \rangle = 2a_0 + 2a_1 + \frac{8}{3}a_2,$$
$$\langle p_2, t \rangle = 2a_0 + \frac{8}{3}a_1 + 4a_2,$$
$$\langle p_2, t^2 \rangle = \frac{8}{3}a_0 + 4a_1 + \frac{32}{5}a_2.$$
Therefore the problem reduces to solving another set of linear equations:
$$\begin{bmatrix} 2 & 2 & \frac{8}{3} \\ 2 & \frac{8}{3} & 4 \\ \frac{8}{3} & 4 & \frac{32}{5} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \langle f, 1 \rangle \\ \langle f, t \rangle \\ \langle f, t^2 \rangle \end{bmatrix}.$$
Numerically, the values of the coefficients are:
$$a = \begin{bmatrix} 0.5353 \\ 0.2032 \\ 0.3727 \end{bmatrix}.$$
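Solving the $3 \times 3$ Gram system numerically reproduces the reported coefficients to within the rounding shown (a sketch):

```python
import numpy as np

e85 = np.exp(8 / 5)
G = np.array([[2, 2, 8/3],
              [2, 8/3, 4],
              [8/3, 4, 32/5]])
b = np.array([(5/8) * e85 - 5/8,
              (15/32) * e85 + 25/32,
              (85/64) * e85 - 125/64])

a = np.linalg.solve(G, b)   # optimal coefficients of p2(t)
```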
The function $f(t)$ and the approximating polynomial $p_2(t)$ are plotted in Figure 2.2c. Here we use a different notion of closeness of the approximating polynomial, $p_2(t)$, to the original function, $f$. Roughly speaking, in parts (a) and (b), the optimal polynomial is the one for which there is the smallest discrepancy between $f(t_i)$ and $p_2(t_i)$ for all $t_i$, i.e., the polynomial that comes closest to passing through all the sample points, $f(t_i)$. All that matters is the 16 sample points, $f(t_i)$. In this part, however, all the points of $f$ matter.
Here we stack the two sets of measurements:
$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad A = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \quad \text{and} \quad S = \begin{bmatrix} S_1 & 0 \\ 0 & S_2 \end{bmatrix}.$$
Note that $A$ has full column rank because $C_1$ has full column rank. Also note that $S$ is symmetric positive definite since both $S_1$ and $S_2$ are symmetric positive definite. Therefore, we know that $\hat{x} = \arg\min e'Se$ exists, is unique, and is given by
$$\hat{x} = (A'SA)^{-1}A'Sy.$$
Thus, by direct substitution of terms, we have:
$$\hat{x} = (C_1'S_1C_1 + C_2'S_2C_2)^{-1}(C_1'S_1y_1 + C_2'S_2y_2).$$
Recall that $\hat{x}_1 = (C_1'S_1C_1)^{-1}C_1'S_1y_1$ and that $\hat{x}_2 = (C_2'S_2C_2)^{-1}C_2'S_2y_2$. Hence, with $Q_i = C_i'S_iC_i$, $\hat{x}$ can be re-written as:
$$\hat{x} = (Q_1 + Q_2)^{-1}(Q_1\hat{x}_1 + Q_2\hat{x}_2).$$
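The equivalence of the direct solution and the merged form can be checked numerically; the weights $S_1$, $S_2$ below are simple scaled identities, an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3
C1 = rng.standard_normal((5, n))   # full column rank (generic random matrix)
C2 = rng.standard_normal((4, n))
y1 = rng.standard_normal(5)
y2 = rng.standard_normal(4)
S1 = np.eye(5) * 2.0               # symmetric positive definite weights
S2 = np.eye(4) * 0.5

Q1 = C1.T @ S1 @ C1
Q2 = C2.T @ S2 @ C2
x1 = np.linalg.solve(Q1, C1.T @ S1 @ y1)   # estimate from data set 1 alone
x2 = np.linalg.solve(Q2, C2.T @ S2 @ y2)   # estimate from data set 2 alone

# Combined estimate two ways: direct formula vs. merging the two estimates
x_direct = np.linalg.solve(Q1 + Q2, C1.T @ S1 @ y1 + C2.T @ S2 @ y2)
x_merged = np.linalg.solve(Q1 + Q2, Q1 @ x1 + Q2 @ x2)
```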
Exercise 2.8 We can think of the two data sets as sequentially available data sets. $\hat{x}$ is the least squares solution to $y \approx Ax$, corresponding to minimizing the Euclidean norm of $e_1 = y - Ax$. $\bar{x}$ is the least squares solution to the augmented problem
$$\begin{bmatrix} y \\ z \end{bmatrix} \approx \begin{bmatrix} A \\ D \end{bmatrix} x.$$
In the trivial case where $D$ is a square (hence non-singular) matrix, the set of values of $x$ over which we seek to minimize the cost function consists of a single element, $D^{-1}z$. Thus, $\bar{x}$ in this case is simply $\bar{x} = D^{-1}z$. It is easy to verify that the expression we obtained does in fact reduce to this when $D$ is invertible.
Exercise 3.1 The first and the third facts given in the problem are the keys to solving this problem, in addition to the fact that
$$UA = \begin{bmatrix} R \\ 0 \end{bmatrix}.$$
Here note that $R$ is a nonsingular, upper-triangular matrix, so it can be inverted. Now the problem reduces to showing that
$$\hat{x} = \arg\min_x \|y - Ax\|_2^2 = \arg\min_x (y - Ax)'(y - Ax)$$
is indeed equal to
$$\hat{x} = R^{-1}y_1.$$
Let's transform the problem into the familiar form. We introduce an error $e$ such that
$$y = Ax + e,$$
and we would like to minimize $\|e\|_2$, which is equivalent to minimizing $\|y - Ax\|_2$. Using the property of an orthogonal matrix, we have that
$$\|e\|_2 = \|Ue\|_2.$$
Thus, with $e = y - Ax$, we have
$$\|e\|_2^2 = \|Ue\|_2^2 = (U(y - Ax))'(U(y - Ax)) = \|Uy - UAx\|_2^2 = \left\| \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} - \begin{bmatrix} R \\ 0 \end{bmatrix}x \right\|_2^2 = \|y_1 - Rx\|_2^2 + \|y_2\|_2^2.$$
Since $\|y_2\|_2^2 = y_2'y_2$ is just a constant, it does not play any role in this minimization. Thus we would like to have
$$y_1 - R\hat{x} = 0,$$
and because $R$ is an invertible matrix, $\hat{x} = R^{-1}y_1$.
Exercise 3.2 i) We would like to minimize the 2-norm of $u$, i.e., $\|u\|_2^2$. Since $y_n$ is given as
$$y_n = \sum_{i=1}^{n} h_i u_{n-i},$$
we can write
$$y_n = \begin{bmatrix} h_1 & h_2 & \cdots & h_n \end{bmatrix} \begin{bmatrix} u_{n-1} \\ u_{n-2} \\ \vdots \\ u_0 \end{bmatrix}.$$
We want to find the $u$ with the smallest 2-norm such that
$$y = Au,$$
where we assume that $A$ has full rank (i.e., $h_i \neq 0$ for some $i$, $1 \le i \le n$). Then the solution reduces to the familiar form:
$$\hat{u} = A'(AA')^{-1}y.$$
By noting that $AA' = \sum_{i=1}^{n} h_i^2$, we can obtain the entries of $\hat{u}$ as follows:
$$\hat{u}_{n-j} = \frac{h_j\,y}{\sum_{i=1}^{n} h_i^2}, \quad \text{for } j = 1, \ldots, n.$$
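The closed form for the minimum-norm input agrees with the general formula $\hat{u} = A'(AA')^{-1}y$; a sketch (the impulse response and target value are made up):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
h = rng.standard_normal(n)        # impulse response h_1, ..., h_n (hypothetical)
y = 3.0                           # desired final output

A = h.reshape(1, n)               # y_n = A u with u ordered (u_{n-1}, ..., u_0)
u_hat = (A.T @ np.linalg.inv(A @ A.T) @ np.array([y])).ravel()

# Closed form: the entry paired with h_j equals h_j * y / sum(h_i^2)
u_closed = h * y / np.sum(h**2)
```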
ii) a) Let's introduce $e$ as an error such that $y_n = y - e$; it can also be written as $y - y_n = e$. Then the quantity we would like to minimize can be written as
$$r(y - y_n)^2 + u_0^2 + \cdots + u_{n-1}^2,$$
where $r$ is a positive weighting parameter. The problem becomes to solve the following minimization problem:
$$\hat{u} = \arg\min_u \left( \sum_{i=0}^{n-1} u_i^2 + re^2 \right) = \arg\min_u \left( \|u\|_2^2 + r\|e\|_2^2 \right),$$
from which we see that $r$ is a weight that characterizes the tradeoff between the size of the final error, $y - y_n$, and the energy of the input signal, $u$.
In order to reduce the problem to the familiar form, i.e., $y \approx Ax$, let's append $\sqrt{r}\,e$ at the bottom of $u$, so that the new augmented vector, $\tilde{u}$, is
$$\tilde{u} = \begin{bmatrix} u \\ \sqrt{r}\,e \end{bmatrix}.$$
This choice of $\tilde{u}$ follows from the observation that this is the $\tilde{u}$ that has $\|\tilde{u}\|_2^2 = \|u\|_2^2 + re^2$, the quantity we aim to minimize.
Now we can write $y$ as follows:
$$y = \begin{bmatrix} A & \frac{1}{\sqrt{r}} \end{bmatrix} \tilde{u} = \tilde{A}\tilde{u} = Au + e = y_n + e.$$
Now, $\hat{\tilde{u}}$ can be obtained using the augmented $A$, $\tilde{A}$, as
$$\hat{\tilde{u}} = \tilde{A}'(\tilde{A}\tilde{A}')^{-1}y = \begin{bmatrix} A' \\ \frac{1}{\sqrt{r}} \end{bmatrix} \left( AA' + \frac{1}{r} \right)^{-1} y.$$
By noting that
$$\tilde{A}\tilde{A}' = AA' + \frac{1}{r} = \sum_{i=1}^{n} h_i^2 + \frac{1}{r},$$
we can obtain the entries of $\hat{u}$ as follows:
$$\hat{u}_{n-j} = \frac{h_j\,y}{\sum_{i=1}^{n} h_i^2 + \frac{1}{r}}, \quad \text{for } j = 1, \ldots, n.$$
ii) b) When $r \to 0$, it can be interpreted that the error can be anything, but we would like to minimize the input energy. Thus we expect that the solution will have all the $u_i$'s equal to zero; in fact, the expression obtained in ii) a) goes to zero as $r \to 0$. The other limit is an interesting case: we put a weight of $\infty$ on the final state error, and the expression from ii) a) gives the same expression as in i) as $r \to \infty$.
We seek the minimum length $x(t)$ satisfying
$$\begin{bmatrix} y \\ 0 \end{bmatrix} = \begin{bmatrix} \langle T - t, x(t) \rangle \\ \langle 1, x(t) \rangle \end{bmatrix} = \left\langle \begin{bmatrix} T - t \\ 1 \end{bmatrix}, x(t) \right\rangle,$$
where $\langle \cdot, \cdot \rangle$ denotes the Gramian, as defined in Chapter 2. Now, in Chapter 3, it was shown that the minimum length solution to $y = \langle A, x \rangle$ is $\hat{x} = A'\langle A, A' \rangle^{-1}y$. So, for our problem,
$$\hat{x} = \begin{bmatrix} T - t & 1 \end{bmatrix} \left\langle \begin{bmatrix} T - t \\ 1 \end{bmatrix}, \begin{bmatrix} T - t & 1 \end{bmatrix} \right\rangle^{-1} \begin{bmatrix} y \\ 0 \end{bmatrix}.$$
Using the definition of the inner product to find the individual entries, $\langle T - t, T - t \rangle = \int_0^T (T - t)^2\,dt = T^3/3$, $\langle T - t, 1 \rangle = \int_0^T (T - t)\,dt = T^2/2$, and $\langle 1, 1 \rangle = T$. Plugging these in, one can simplify the expression for $\hat{x}$ and obtain
$$\hat{x}(t) = \frac{12y}{T^2}\left[\frac{1}{2} - \frac{t}{T}\right], \quad t \in [0, T].$$
Alternatively, we have that $\ddot{p}(t) = x(t)$. Integrating both sides and taking into account that $p(0) = 0$ and $\dot{p}(0) = 0$, we have $p(t) = \int_0^t \int_0^{t_1} x(\tau)\,d\tau\,dt_1 = \int_0^t f(t_1)\,dt_1$. Now, we use the integration by parts formula, $\int_0^t u\,dv = uv\big|_0^t - \int_0^t v\,du$, with $u = f(t_1) = \int_0^{t_1} x(\tau)\,d\tau$ and $dv = dt_1$; hence $du = df(t_1) = x(t_1)\,dt_1$ and $v = t_1$. Plugging in and simplifying, we get that
$$p(t) = \int_0^t \int_0^{t_1} x(\tau)\,d\tau\,dt_1 = \int_0^t (t - \tau)x(\tau)\,d\tau.$$
Thus, $y = p(T) = \int_0^T (T - \tau)x(\tau)\,d\tau = \langle T - t, x(t) \rangle$. In addition, we have that $0 = \dot{p}(T) = \int_0^T x(\tau)\,d\tau = \langle 1, x(t) \rangle$. That is, we seek to find the minimum length $x(t)$ such that
$$y = \langle T - t, x(t) \rangle, \qquad 0 = \langle 1, x(t) \rangle.$$
Recall that the minimum length solution $\hat{x}(t)$ must be a linear combination of $T - t$ and $1$, i.e., $\hat{x}(t) = a_1(T - t) + a_2$. So,
$$y = \langle T - t, a_1(T - t) + a_2 \rangle = a_1 \int_0^T (T - t)^2\,dt + a_2 \int_0^T (T - t)\,dt = a_1 \frac{T^3}{3} + a_2 \frac{T^2}{2},$$
$$0 = \langle 1, a_1(T - t) + a_2 \rangle = \int_0^T \left( a_1(T - t) + a_2 \right)dt = a_1 \frac{T^2}{2} + a_2 T.$$
This is a system of two equations in two unknowns, which we can rewrite in matrix form:
$$\begin{bmatrix} \frac{T^3}{3} & \frac{T^2}{2} \\ \frac{T^2}{2} & T \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} y \\ 0 \end{bmatrix}.$$
So,
$$\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \frac{T^3}{3} & \frac{T^2}{2} \\ \frac{T^2}{2} & T \end{bmatrix}^{-1} \begin{bmatrix} y \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{12y}{T^3} \\ -\frac{6y}{T^2} \end{bmatrix},$$
which again gives $\hat{x}(t) = \frac{12y}{T^2}\left[\frac{1}{2} - \frac{t}{T}\right]$.
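Both constraints can be verified by quadrature on the closed-form $\hat{x}(t)$; a simple trapezoidal rule is written out explicitly (the values of $T$ and $y$ are arbitrary).

```python
import numpy as np

T, y = 2.0, 5.0
t = np.linspace(0, T, 20001)
x = (12 * y / T**2) * (0.5 - t / T)   # candidate minimum-energy input

def trapz(f, t):
    """Simple trapezoidal rule (avoids NumPy version differences)."""
    return float(np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2.0)

# Check the two constraints <T - t, x> = y and <1, x> = 0
c1 = trapz((T - t) * x, t)
c2 = trapz(x, t)
```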
Next we show that, for $A \in \mathbb{C}^{m \times n}$,
$$\frac{1}{\sqrt{n}}\|A\|_\infty \le \|A\|_2 \le \sqrt{m}\,\|A\|_\infty.$$
For any $v \in \mathbb{C}^m$,
$$\|v\|_\infty \le \|v\|_2 \le \sqrt{m}\,\|v\|_\infty. \tag{1}$$
Using (1), for any $x \neq 0$,
$$\|Ax\|_2 \le \sqrt{m}\,\|Ax\|_\infty \le \sqrt{m}\,\|A\|_\infty \|x\|_\infty \le \sqrt{m}\,\|A\|_\infty \|x\|_2,$$
so that
$$\frac{\|Ax\|_2}{\|x\|_2} \le \sqrt{m}\,\|A\|_\infty. \tag{2}$$
Equation (2) must hold for all $x \neq 0$; therefore
$$\|A\|_2 = \max_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} \le \sqrt{m}\,\|A\|_\infty.$$
In the other direction, $\|Ax\|_\infty \le \|Ax\|_2$ and $\|x\|_2 \le \sqrt{n}\,\|x\|_\infty$ for all $x \neq 0$, including the $x$ that maximizes $\frac{\|Ax\|_\infty}{\|x\|_\infty}$. So,
$$\frac{\|Ax\|_\infty}{\|x\|_\infty} \le \frac{\sqrt{n}\,\|Ax\|_\infty}{\|x\|_2} \le \frac{\sqrt{n}\,\|Ax\|_2}{\|x\|_2} \le \sqrt{n}\,\|A\|_2. \tag{3}$$
Taking $x$ in (3) to be the maximizer gives
$$\|A\|_\infty \le \sqrt{n}\,\|A\|_2, \quad \text{or equivalently,} \quad \frac{1}{\sqrt{n}}\|A\|_\infty \le \|A\|_2.$$
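The two-sided bound can be spot-checked with NumPy's induced norms (matrix dimensions arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 5, 7
A = rng.standard_normal((m, n))

norm2 = np.linalg.norm(A, 2)         # induced 2-norm (largest singular value)
norminf = np.linalg.norm(A, np.inf)  # induced infinity-norm (max abs row sum)

lower = norminf / np.sqrt(n)
upper = np.sqrt(m) * norminf
```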
Exercise 4.5 Suppose the $m \times n$ matrix $A$ has the singular value decomposition
$$A = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V',$$
where $U$ and $V$ are unitary matrices. The Moore-Penrose inverse, or pseudo-inverse, of $A$, denoted by $A^+$, is then defined as the $n \times m$ matrix
$$A^+ = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U'.$$
a) Now we have to show that $A^+A$ and $AA^+$ are symmetric, and that $AA^+A = A$ and $A^+AA^+ = A^+$. Suppose that $\Sigma$ is a diagonal invertible matrix of dimension $r \times r$. Using the given definitions, as well as the fact that for a unitary matrix $U$, $U'U = UU' = I$, we have
$$AA^+ = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V'V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U' = U \begin{bmatrix} I_{r \times r} & 0 \\ 0 & 0 \end{bmatrix} U',$$
$$A^+A = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U'U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V' = V \begin{bmatrix} I_{r \times r} & 0 \\ 0 & 0 \end{bmatrix} V',$$
both of which are symmetric. The facts derived above can be used to show the other two:
$$AA^+A = (AA^+)A = U \begin{bmatrix} I_{r \times r} & 0 \\ 0 & 0 \end{bmatrix} U'U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V' = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V' = A.$$
Also,
$$A^+AA^+ = (A^+A)A^+ = V \begin{bmatrix} I_{r \times r} & 0 \\ 0 & 0 \end{bmatrix} V'V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U' = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U' = A^+.$$
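The four Moore-Penrose conditions of part a) can be verified numerically by building $A^+$ from the SVD of a deliberately rank-deficient matrix (sizes and rank chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(9)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

U, s, Vh = np.linalg.svd(A)
Sinv = np.zeros((n, m))
Sinv[:r, :r] = np.diag(1.0 / s[:r])   # invert only the nonzero singular values
Aplus = Vh.T @ Sinv @ U.T             # A+ = V Sigma^+ U'

# The four Moore-Penrose conditions
c1 = np.allclose(A @ Aplus @ A, A)
c2 = np.allclose(Aplus @ A @ Aplus, Aplus)
c3 = np.allclose((A @ Aplus).T, A @ Aplus)
c4 = np.allclose((Aplus @ A).T, Aplus @ A)
```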
b) We have to show that when $A$ has full column rank then $A^+ = (A'A)^{-1}A'$, and that when $A$ has full row rank then $A^+ = A'(AA')^{-1}$. If $A$ has full column rank, then we know that $m \ge n$, $\operatorname{rank}(A) = n$, and
$$A = U \begin{bmatrix} \Sigma_{n \times n} \\ 0 \end{bmatrix} V'.$$
Also, as shown in Chapter 2, when $A$ has full column rank, $(A'A)^{-1}$ exists. Hence
$$(A'A)^{-1}A' = \left( V \begin{bmatrix} \Sigma & 0 \end{bmatrix} U'U \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V' \right)^{-1} V \begin{bmatrix} \Sigma & 0 \end{bmatrix} U' = V(\Sigma'\Sigma)^{-1}V'V \begin{bmatrix} \Sigma & 0 \end{bmatrix} U' = V \begin{bmatrix} \Sigma^{-1} & 0 \end{bmatrix} U' = A^+.$$
If $A$ has full row rank, then
$$A = U \begin{bmatrix} \Sigma_{m \times m} & 0 \end{bmatrix} V'.$$
It can be proved that when $A$ has full row rank, $(AA')^{-1}$ exists. Hence,
$$A'(AA')^{-1} = V \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} U' \left( U \begin{bmatrix} \Sigma & 0 \end{bmatrix} V'V \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} U' \right)^{-1} = V \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} U'U(\Sigma\Sigma')^{-1}U' = V \begin{bmatrix} \Sigma^{-1} \\ 0 \end{bmatrix} U' = A^+.$$
c) Show that, of all $x$ that minimize $\|y - Ax\|_2$, the one with the smallest length $\|x\|_2$ is given by $\hat{x} = A^+y$. If $A$ has full row rank, we have shown in Chapter 3 that the solution with the smallest length is given by
$$\hat{x} = A'(AA')^{-1}y,$$
and from part (b), $A'(AA')^{-1} = A^+$. Therefore
$$\hat{x} = A^+y.$$
Similarly, it can be shown that the pseudo-inverse is the solution for the case when the matrix $A$ has full column rank (compare the results in Chapter 2 with the expression you found in part (b) for $A^+$ when $A$ has full column rank).
Now, let's consider the case when the matrix $A$ is rank deficient, i.e., $\operatorname{rank}(A) = r < \min(m, n)$, where $A \in \mathbb{C}^{m \times n}$, and is thus of neither full row nor full column rank. Suppose we have a singular value decomposition of $A$ as
$$A = U\Sigma V',$$
where $U$ and $V$ are unitary matrices. Then the norm we are minimizing is
$$\|Ax - y\| = \|U\Sigma V'x - y\| = \|U(\Sigma V'x - U'y)\| = \|\Sigma z - U'y\|,$$
where $z = V'x$, since the norm is unaltered by the orthogonal transformation $U$. Thus, $x$ minimizes $\|Ax - y\|$ if and only if $z$ minimizes $\|\Sigma z - c\|$, where $c = U'y$. Since the rank of $A$ is $r$, the matrix $\Sigma$ has the nonzero singular values $\sigma_1, \sigma_2, \ldots, \sigma_r$ in its diagonal entries. Then we can rewrite $\|\Sigma z - c\|^2$ as follows:
$$\|\Sigma z - c\|^2 = \sum_{i=1}^{r}(\sigma_i z_i - c_i)^2 + \sum_{i=r+1}^{m} c_i^2.$$
It is clear that the minimum of the norm is achieved when $z_i = \frac{c_i}{\sigma_i}$ for $i = 1, 2, \ldots, r$, and the rest of the $z_i$'s can be chosen arbitrarily. Thus, there are infinitely many solutions $z$, and the solution with the minimum norm is achieved when $z_i = 0$ for $i = r+1, r+2, \ldots, n$. Thus, we can write this $z$ as
$$z = \Sigma_1 c, \quad \text{where} \quad \Sigma_1 = \begin{bmatrix} \Sigma_r^{-1} & 0 \\ 0 & 0 \end{bmatrix}$$
and $\Sigma_r$ is a square matrix with the nonzero singular values on its diagonal in decreasing order. This value of $z$ also yields the value of $x$ of minimal 2-norm, since $V$ is a unitary matrix. Thus the solution to this problem is
$$\hat{x} = Vz = V\Sigma_1 c = V\Sigma_1 U'y = A^+y.$$
It can easily be shown that this choice of $A^+$ satisfies all the conditions, or definitions, of the pseudo-inverse in a).
a) Since $A$ has full column rank, we can write
$$A = U \begin{bmatrix} \Sigma_n \\ 0 \end{bmatrix} V',$$
where $\Sigma_n$ is an $n \times n$ diagonal matrix with the singular values on the diagonal. Let $Q = U$ and $R = \Sigma_n V'$, and we get the QR factorization. Since $Q$ is an orthogonal matrix, we can represent any $Y \in \mathbb{C}^{p \times m}$ as
$$Y = Q \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}.$$
Next,
$$\|Y - AX\|_F^2 = \left\| Q \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} - Q \begin{bmatrix} R \\ 0 \end{bmatrix} X \right\|_F^2 = \left\| Q \begin{bmatrix} Y_1 - RX \\ Y_2 \end{bmatrix} \right\|_F^2.$$
Denote
$$D = \begin{bmatrix} Y_1 - RX \\ Y_2 \end{bmatrix}$$
and note that multiplication by an orthogonal matrix does not change the Frobenius norm of a matrix:
$$\|QD\|_F^2 = \operatorname{tr}(D'Q'QD) = \operatorname{tr}(D'D) = \|D\|_F^2.$$
Since the Frobenius norm squared is equal to the sum of squares of all elements, the square of the Frobenius norm of a block matrix is equal to the sum of the squares of the Frobenius norms of the blocks:
$$\left\| \begin{bmatrix} Y_1 - RX \\ Y_2 \end{bmatrix} \right\|_F^2 = \|Y_1 - RX\|_F^2 + \|Y_2\|_F^2.$$
Since the $Y_2$ block cannot be affected by the choice of the matrix $X$, the problem reduces to the minimization of $\|Y_1 - RX\|_F^2$. Recalling that $R$ is invertible (because $A$ has full column rank), the solution is
$$X = R^{-1}Y_1.$$
b) Evaluate the expression with the pseudo-inverse using the representations of $A$ and $Y$ from part a):
$$A^+Y = (A'A)^{-1}A'Y = \left( \begin{bmatrix} R' & 0 \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix} \right)^{-1} \begin{bmatrix} R' & 0 \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = (R'R)^{-1}R'Y_1 = R^{-1}Y_1.$$
From 4.5 b) we know that if a matrix has full column rank, $A^+ = (A'A)^{-1}A'$; therefore both expressions give the same solution.
c)
$$\|Y - AX\|_F^2 + \|Z - BX\|_F^2 = \left\| \begin{bmatrix} Y \\ Z \end{bmatrix} - \begin{bmatrix} A \\ B \end{bmatrix} X \right\|_F^2.$$
Since $A$ has full column rank, $\begin{bmatrix} A \\ B \end{bmatrix}$ also has full column rank; therefore we can apply the results from parts a) and b) to conclude that
$$X = \left( \begin{bmatrix} A \\ B \end{bmatrix}' \begin{bmatrix} A \\ B \end{bmatrix} \right)^{-1} \begin{bmatrix} A \\ B \end{bmatrix}' \begin{bmatrix} Y \\ Z \end{bmatrix} = \left( A'A + B'B \right)^{-1} \left( A'Y + B'Z \right).$$
Exercise 4.7 Given a complex square matrix $A$, the definition of the structured singular value function is as follows:
$$\mu_{\Delta}(A) = \frac{1}{\min_{\Delta \in \mathbf{\Delta}} \{\sigma_{\max}(\Delta) \mid \det(I - \Delta A) = 0\}}.$$
If $\mathbf{\Delta} = \mathbb{C}^{n \times n}$, we first show that any $\Delta$ making $I - \Delta A$ singular satisfies $\sigma_{\max}(\Delta) \ge \frac{1}{\sigma_{\max}(A)}$: if $\det(I - \Delta A) = 0$, there exists $x \neq 0$ with $x = \Delta A x$, so
$$\|x\|_2 = \|\Delta A x\|_2 \le \sigma_{\max}(\Delta)\,\sigma_{\max}(A)\,\|x\|_2,$$
which gives $\sigma_{\max}(\Delta) \ge \frac{1}{\sigma_{\max}(A)}$.
Then, we show that the lower bound can be achieved. Since $\mathbf{\Delta} = \{\Delta \in \mathbb{C}^{n \times n}\}$, we can choose $\Delta$ such that
$$\Delta = V \begin{bmatrix} \frac{1}{\sigma_{\max}(A)} & & \\ & 0 & \\ & & \ddots \end{bmatrix} U',$$
where $U$ and $V$ are from the SVD of $A$, $A = U\Sigma V'$. Note that this choice results in
$$I - \Delta A = I - V \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} V' = V \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} V',$$
which is singular, as required. Also, from the construction of $\Delta$, $\sigma_{\max}(\Delta) = \frac{1}{\sigma_{\max}(A)}$. Therefore,
$$\mu_{\Delta}(A) = \sigma_{\max}(A).$$
c) If $\mathbf{\Delta} = \{\operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{C}\}$ and $\mathcal{D} = \{\operatorname{diag}(d_1, \ldots, d_n) \mid d_i > 0\}$, we first note that $D^{-1}$ exists. Thus:
$$\det(I - \Delta D^{-1}AD) = \det(I - D^{-1}\Delta AD) = \det\!\left((D^{-1} - D^{-1}\Delta A)D\right) = \det\!\left(D^{-1}(I - \Delta A)\right)\det(D) = \det(D^{-1})\det(I - \Delta A)\det(D) = \det(I - \Delta A),$$
where the first equality follows because $\Delta$ and $D^{-1}$ are diagonal (hence commute) and the last equality holds because $\det(D^{-1}) = 1/\det(D)$. Thus, $\mu_{\Delta}(A) = \mu_{\Delta}(D^{-1}AD)$.
Now let's show the left-side inequality first. Since $\mathbf{\Delta}_1 \subseteq \mathbf{\Delta}_2$, where $\mathbf{\Delta}_1 = \{\delta I \mid \delta \in \mathbb{C}\}$ and $\mathbf{\Delta}_2 = \{\operatorname{diag}(\delta_1, \ldots, \delta_n)\}$, we have that
$$\min_{\Delta \in \mathbf{\Delta}_1} \{\sigma_{\max}(\Delta) \mid \det(I - \Delta A) = 0\} \ge \min_{\Delta \in \mathbf{\Delta}_2} \{\sigma_{\max}(\Delta) \mid \det(I - \Delta A) = 0\}.$$
Hence,
$$\mu_{\Delta_2}(A) = \mu_{\Delta_2}(D^{-1}AD) \le \mu_{\Delta_3}(D^{-1}AD) = \sigma_{\max}(D^{-1}AD).$$
Exercise 4.8 We are given a complex square matrix $A$ with $\operatorname{rank}(A) = 1$. According to the SVD of $A$, we can write $A = uv'$, where $u$, $v$ are complex vectors of dimension $n$. To simplify computations, we are asked to minimize the Frobenius norm of $\Delta$ in the definition of $\mu_{\Delta}(A)$. So
$$\mu_{\Delta}(A) = \frac{1}{\min_{\Delta \in \mathbf{\Delta}} \{\|\Delta\|_F \mid \det(I - \Delta A) = 0\}},$$
where $\mathbf{\Delta}$ is the set of diagonal matrices with complex entries, $\mathbf{\Delta} = \{\operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{C}\}$. Introduce the column vector $\delta = (\delta_1, \ldots, \delta_n)^T$ and the row vector $B = (u_1\bar{v}_1, \ldots, u_n\bar{v}_n)$; then the original problem can be reformulated, after some algebraic manipulations, as
$$\mu_{\Delta}(A) = \frac{1}{\min_{\delta \in \mathbb{C}^n} \{\|\delta\|_2 \mid B\delta = 1\}}.$$
To see this, we use the fact that $A = uv'$, and (from Exercise 1.3(a))
$$\det(I - \Delta A) = \det(I - \Delta uv') = \det(1 - v'\Delta u) = 1 - \begin{pmatrix} \bar{v}_1 u_1 & \cdots & \bar{v}_n u_n \end{pmatrix} \begin{pmatrix} \delta_1 \\ \vdots \\ \delta_n \end{pmatrix} = 1 - B\delta.$$
We are dealing with an underdetermined system of equations, and we are seeking a minimum norm solution. Using the projection theorem, the optimal $\delta$ is given by $\delta^o = B^*(BB^*)^{-1}$. Substituting into the expression of the structured singular value function, we obtain:
$$\mu_{\Delta}(A) = \left( \sum_{i=1}^{n} |u_i v_i|^2 \right)^{1/2}.$$
In the second part of this exercise, we define $\mathbf{\Delta}$ to be the set of diagonal matrices with real entries, $\mathbf{\Delta} = \{\operatorname{diag}(\delta_1, \ldots, \delta_n) \mid \delta_i \in \mathbb{R}\}$. The idea remains the same; we just have to alter the constraint equation, namely $B\delta = 1 + 0j$. Equivalently, one can write $D\delta = d$, where
$$D = \begin{bmatrix} \operatorname{Re}(B) \\ \operatorname{Im}(B) \end{bmatrix} \quad \text{and} \quad d = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
Again, the optimal $\delta$ is obtained by use of the projection theorem, and $\delta^o = D^T(DD^T)^{-1}d$. Substituting into the expression of the structured singular value function, we obtain:
$$\mu_{\Delta}(A) = \frac{1}{\sqrt{d^T(DD^T)^{-1}d}}.$$
By the triangle inequality,
$$\|A\| = \|A + E - E\| \le \|A + E\| + \|E\| \implies \|A\| - \|E\| \le \|A + E\|.$$
Also,
$$\|A + E\| \le \|A\| + \|E\| \implies \|A + E\| - \|A\| \le \|E\|.$$
Thus, putting the two inequalities above together, we get that
$$\big| \|A + E\| - \|A\| \big| \le \|E\|.$$
Note that the norm can be any matrix norm; thus the above inequality holds for the 2-induced norm, which gives us
$$|\sigma_{\max}(A + E) - \sigma_{\max}(A)| \le \sigma_{\max}(E).$$
A matrix $E$ that achieves the upper bound is
$$E = U \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0 \end{bmatrix} V' = A,$$
for which $\sigma_{\max}(A + E) = 2\sigma_1$ and $\sigma_{\max}(E) = \sigma_1$, so that equality is achieved.
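Both the bound and the equality case $E = A$ can be checked numerically (matrix sizes and the perturbation scale are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((5, 4))
E = 0.1 * rng.standard_normal((5, 4))

s_A = np.linalg.norm(A, 2)
s_AE = np.linalg.norm(A + E, 2)
s_E = np.linalg.norm(E, 2)

gap = abs(s_AE - s_A)

# Equality case: E = A doubles the largest singular value
gap_eq = abs(np.linalg.norm(A + A, 2) - s_A)
```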
2. Suppose that $A$ has less than full column rank, i.e., $\operatorname{rank}(A) < n$, but $A + E$ has full column rank. Show that
$$\sigma_{\min}(A + E) \le \sigma_{\max}(E).$$
Since $\operatorname{rank}(A) < n$, there exists $x \neq 0$ such that $Ax = 0$, so that
$$\frac{\|(A + E)x\|_2}{\|x\|_2} = \frac{\|Ex\|_2}{\|x\|_2} \le \|E\|_2 = \sigma_{\max}(E).$$
But,
$$\sigma_{\min}(A + E) \le \frac{\|(A + E)x\|_2}{\|x\|_2},$$
as shown in Chapter 4 (please refer to the proof in the lecture notes!). Thus
$$\sigma_{\min}(A + E) \le \sigma_{\max}(E).$$
Finally, a matrix $E$ that results in $A + E$ having full column rank and that achieves the upper bound is
$$E = U \begin{bmatrix} 0 & & & & \\ & \ddots & & & \\ & & 0 & & \\ & & & \sigma_{r+1} & \\ & & & & \ddots \\ & & & & & \sigma_{r+1} \end{bmatrix} V', \quad \text{for} \quad A = U \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0 \\ & & & & \ddots \end{bmatrix} V',$$
where $\sigma_{r+1} > 0$ fills the remaining $n - r$ diagonal positions, so that
$$A + E = U \begin{bmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & \sigma_{r+1} & \\ & & & & \ddots \\ & & & & & \sigma_{r+1} \end{bmatrix} V'.$$
It is easy to see that $\sigma_{\min}(A + E) = \sigma_{r+1}$ and that $\sigma_{\max}(E) = \sigma_{r+1}$.
The result in part 2, and some extensions to it, gives rise to the following procedure (which is widely used in practice) for estimating the rank of an unknown matrix $A$ from a known matrix $A + E$, where $\|E\|_2$ is known as well. Essentially, the SVD of $A + E$ is computed, and the rank of $A$ is then estimated to be the number of singular values of $A + E$ that are larger than $\|E\|_2$.
3. Let
$$A = U \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_k & \\ & & & 0 \end{bmatrix} V',$$
where $U$ and $V$ are unitary matrices and $k \ge r + 1$. Following the given procedure, let's select the first $r + 1$ columns of $V$: $\{v_1, v_2, \ldots, v_{r+1}\}$. Since $V$ is unitary, those $v_i$'s are orthonormal and hence independent. Note that $\{v_1, v_2, \ldots, v_n\}$ span $\mathbb{R}^n$, and if $\operatorname{rank}(E) = r$, then $\dim \mathcal{N}(E) = n - r$. So, given any $r + 1$ linearly independent vectors in $\mathbb{R}^n$, at least one nonzero combination of them must be in the nullspace of $E$. That is, there exist coefficients $\alpha_i$ for $i = 1, \ldots, r+1$, not all zero, such that
$$E(\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_{r+1}v_{r+1}) = 0.$$
These coefficients can be normalized to obtain a nonzero vector $z$, $\|z\|_2 = 1$, given by
$$z = \sum_{i=1}^{r+1} \alpha_i v_i = \begin{bmatrix} v_1 & \cdots & v_{r+1} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_{r+1} \end{bmatrix}$$
and such that $Ez = 0$. Thus,
$$(A - E)z = Az = U\Sigma V'\sum_{i=1}^{r+1} \alpha_i v_i = U \begin{bmatrix} \sigma_1\alpha_1 \\ \sigma_2\alpha_2 \\ \vdots \\ \sigma_{r+1}\alpha_{r+1} \\ 0 \\ \vdots \\ 0 \end{bmatrix},$$
so that
$$\|(A - E)z\|_2 = \left( \sum_{i=1}^{r+1} |\sigma_i\alpha_i|^2 \right)^{1/2} \ge \sigma_{r+1}\left( \sum_{i=1}^{r+1} |\alpha_i|^2 \right)^{1/2} = \sigma_{r+1}, \tag{1}$$
since
$$\|z\|_2^2 = \left\| \begin{bmatrix} v_1 & \cdots & v_{r+1} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_{r+1} \end{bmatrix} \right\|_2^2 = \sum_{i=1}^{r+1} |\alpha_i|^2 = 1. \tag{2}$$
Therefore $\|A - E\|_2 \ge \|(A - E)z\|_2 \ge \sigma_{r+1}$ for any $E$ of rank $r$. The bound is achieved by
$$E = U \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0 \\ & & & & \ddots \end{bmatrix} V'.$$
$E$ has rank $r$, and
$$A - E = U \begin{bmatrix} 0 & & & & & \\ & \ddots & & & & \\ & & 0 & & & \\ & & & \sigma_{r+1} & & \\ & & & & \ddots & \\ & & & & & \sigma_k \\ & & & & & & 0 \end{bmatrix} V',$$
and $\|A - E\|_2 = \sigma_{r+1}$.
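This is the Eckart–Young fact: the truncated SVD achieves the bound. A sketch (sizes and target rank arbitrary):

```python
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((6, 5))
r = 2

U, s, Vh = np.linalg.svd(A, full_matrices=False)
# Best rank-r approximation: keep the r largest singular values
E = (U[:, :r] * s[:r]) @ Vh[:r, :]

best_err = np.linalg.norm(A - E, 2)   # equals sigma_{r+1}

# A different rank-r matrix does at least as badly (one spot check)
E2 = (U[:, 1:r+1] * s[1:r+1]) @ Vh[1:r+1, :]
other_err = np.linalg.norm(A - E2, 2)
```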
Exercise 6.1 The model is linear; one needs to note that the integration operator is a linear operator. Formally, one writes
$$S(\alpha u_1 + \beta u_2)(t) = \int e^{-(t-s)}\left(\alpha u_1(s) + \beta u_2(s)\right)ds = \alpha\int e^{-(t-s)}u_1(s)\,ds + \beta\int e^{-(t-s)}u_2(s)\,ds = \alpha S(u_1)(t) + \beta S(u_2)(t).$$
It is non-causal, since future inputs are needed in order to determine the current value of $y$: formally,
$$y(t) = \int e^{-(t-s)}u(s)\,ds$$
involves $u(s)$ for $s > t$. It is not memoryless, since the current output depends on the integration of past inputs. It is also time-varying, since shifting the input gives
$$\int e^{-(t-T-s)}u(s)\,ds,$$
which in general differs from $y(t - T)$; however, one can argue that if the only valid input signals are those where $u(t) = 0$ if $t < 0$, then the system is time invariant.
for $t \ge 0$. One may assume that $u(t) = 0$ for $t < 0$; this will just alter the lower limit of integration in the convolution formula, but will not affect the state-space description. Note also that the system is causal:
$$y(t) = \int_0^t 2\left(e^{-(t-\tau)} - e^{-2(t-\tau)}\right)u(\tau)\,d\tau, \quad t \ge 0.$$
A state-space description is
$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t), \qquad y(t) = 2x_1(t).$$
Since $x_1(t)$ and $x_2(t)$ satisfy $\dot{x} = Ax + Bu$, these variables satisfy the continuous-time state property and are thus valid state variables.
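The realization can be checked against the impulse response $2(e^{-t} - e^{-2t})$; here $e^{At}$ is formed by eigendecomposition rather than a library `expm`, and the particular split of the overall gain between $B$ and $C$ is one of several equivalent choices.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([0.0, 1.0])
C = np.array([2.0, 0.0])

# e^{At} via eigendecomposition (A has distinct real eigenvalues -1, -2)
lam, M = np.linalg.eig(A)
Minv = np.linalg.inv(M)

def impulse_response(t):
    eAt = (M * np.exp(lam * t)) @ Minv   # M diag(e^{lam t}) M^{-1}
    return float(np.real(C @ eAt @ B))

t = 0.7
h_state = impulse_response(t)
h_formula = 2 * (np.exp(-t) - np.exp(-2 * t))
```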
b) The transfer function of the system is
$$H(s) = \frac{2(s + 2) - c(s + 1)}{s^2 + 3s + 2}, \quad \operatorname{Re}(s) > -1.$$
When $c = 2$, there are no $s$ terms in the numerator, which implies that the output $y(t)$ depends only on $u(t)$ but not on $\dot{u}(t)$. Our selection of state variables is valid only for $c = 2$. If $c \neq 2$, the reachability canonical form may guide us to the selection of state variables.
For the time-varying first-order system
$$\dot{y} = -a_0(t)y(t) + b_0(t)u(t) + b_1(t)\dot{u}(t),$$
notice that in the TI case the coefficients $a_0$, $b_0$, and $b_1$ were constants, so we were able to integrate the term $b_1\dot{u}(t)$. In this case, we can still get rid of the $\dot{u}(t)$ term, by integration by parts. We have
$$y = b_1(t)u(t) + \int \left( -a_0(t)y(t) + \left( b_0(t) - \dot{b}_1(t) \right)u(t) \right)dt.$$
Now, let
$$\dot{x} = -a_0(t)y(t) + \left( b_0(t) - \dot{b}_1(t) \right)u(t),$$
so that
$$y = x + b_1(t)u(t),$$
and substituting $y$ into the equation for $\dot{x}$, we get:
$$\dot{x} = -a_0(t)x(t) + \left( b_0(t) - \dot{b}_1(t) - a_0(t)b_1(t) \right)u(t), \qquad y = x + b_1(t)u(t).$$
Exercise 10.1 a) A simple example is
$$A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}.$$
b) Let
$$J = \begin{bmatrix} J_1 & & \\ & \ddots & \\ & & J_q \end{bmatrix}$$
be the Jordan form decomposition of $A$. Note that
$$J^k = \begin{bmatrix} J_1^k & & \\ & \ddots & \\ & & J_q^k \end{bmatrix}, \quad \text{so } J^k = 0 \iff J_i^k = 0,\ 1 \le i \le q.$$
Also note that $A^k = MJ^kM^{-1}$, and hence $A^k = 0 \iff J^k = 0$. Thus, it suffices to show that $J_i^k = 0$ for all $i \in \{1, \ldots, q\}$ for some finite positive power $k$ iff all the eigenvalues of $A$ are 0.
First, we prove sufficiency: if all the eigenvalues of $A$ are 0, then the corresponding Jordan blocks have zero diagonal elements and are such that $J_i^{n_i} = 0$ for every $i$, where $n_i$ is the size of $J_i$. Let $k = \max_i n_i$; then $J_i^k = 0$ for all $i$.
Next, we prove necessity. Suppose there exists at least one eigenvalue of $A$, say $\lambda_{i_0}$, that is non-zero. Note that the diagonal elements of the $k$-th power of the corresponding Jordan block(s) are $\lambda_{i_0}^k \ne 0$, for any positive power $k$. Hence, there exists at least one $i$ such that $J_i^k \ne 0$, for any positive power $k$.
If $A$ has size $n$, then the size of each of the Jordan blocks in its Jordan form decomposition is at most $n$, so for a nilpotent $A$ the smallest such power satisfies $k \le n$.
d) Let
$$J = \begin{bmatrix} J_1 & & \\ & \ddots & \\ & & J_q \end{bmatrix}$$
be the Jordan form decomposition of $A$. We have that:
$$\mathcal{R}(A^{k+1}) = \mathcal{R}(A^k) \iff \mathcal{R}(J^{k+1}) = \mathcal{R}(J^k).$$
Thus, it suffices to look for the smallest value of $k$ for which $\mathcal{R}(J^{k+1}) = \mathcal{R}(J^k)$. Note that a Jordan block associated with a non-zero eigenvalue has full column rank, and retains full column rank when raised to any positive power $k$. On the other hand, a nilpotent Jordan block of size $n_i$ has column rank $n_i - 1$, and is such that $\operatorname{rank} J_i^k = \max\{0, n_i - k\}$. Let $N = \{i \mid J_i \text{ is nilpotent}\}$ and define $k_{\min} = \max_{i \in N} n_i$. Then $k_{\min}$ is the smallest value of $k$ for which $\mathcal{R}(J^{k+1}) = \mathcal{R}(J^k)$.
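The nilpotency characterization can be illustrated with a single Jordan block of size 4 (eigenvalue 0), whose fourth power is the first to vanish:

```python
import numpy as np

# A 4x4 nilpotent matrix: one Jordan block with eigenvalue 0
J = np.diag(np.ones(3), k=1)   # ones on the superdiagonal

powers = [np.linalg.matrix_power(J, k) for k in range(1, 5)]
first_zero = next(k for k, P in zip(range(1, 5), powers) if np.all(P == 0))
```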
Exercise 11.1 Since the characteristic polynomial of $A$ is the determinant of $zI - A$, and
$$\det(zI - A) = \det((zI - A)^T) = \det(zI - A^T),$$
we first show that
$$\det(zI - A_1) = \det(zI - A_2) = q(z)$$
for
$$A_1 = \begin{bmatrix} -q_{n-1} & 1 & 0 & \cdots & 0 \\ -q_{n-2} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ -q_1 & 0 & 0 & \cdots & 1 \\ -q_0 & 0 & 0 & \cdots & 0 \end{bmatrix} \quad \text{and} \quad A_2 = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -q_0 & -q_1 & -q_2 & \cdots & -q_{n-1} \end{bmatrix}.$$
We have
$$zI - A_1 = \begin{bmatrix} z + q_{n-1} & -1 & 0 & \cdots & 0 \\ q_{n-2} & z & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ q_1 & 0 & \cdots & z & -1 \\ q_0 & 0 & \cdots & 0 & z \end{bmatrix} \quad \text{and} \quad zI - A_2 = \begin{bmatrix} z & -1 & 0 & \cdots & 0 \\ 0 & z & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & z & -1 \\ q_0 & q_1 & \cdots & q_{n-2} & z + q_{n-1} \end{bmatrix}.$$
Expanding $\det(zI - A_1)$ by cofactors along the first column, each minor is block triangular (a triangular block of $-1$'s above a triangular block of $z$'s), and the signs from the cofactors and from the $-1$'s cancel, leaving
$$\det(zI - A_1) = (z + q_{n-1})z^{n-1} + q_{n-2}z^{n-2} + \cdots + q_1 z + q_0;$$
the sign bookkeeping for the last term depends on whether $n$ is an even or odd number, but the result is the same. Similarly, if we take the determinant of $zI - A_2$ using cofactors on the last row of $zI - A_2$, it is clear that we have
$$\det(zI - A_1) = \det(zI - A_2) = q(z).$$
Also, it is true that
$$\det(zI - A) = \det((zI - A)^T) = \det(zI - A^T).$$
Hence
$$\det(zI - A_1) = \det(zI - A_1^T) = \det(zI - A_2) = \det(zI - A_2^T) = q(z).$$
Then we have
$$q(z) = (z + q_{n-1})z^{n-1} + q_{n-2}z^{n-2} + \cdots + q_1 z + q_0 = z^n + q_{n-1}z^{n-1} + q_{n-2}z^{n-2} + \cdots + q_1 z + q_0.$$
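The companion-matrix characteristic polynomial can be confirmed with `np.poly`; the cubic below, $q(z) = z^3 + 2z^2 - 5z - 6$, is an arbitrary example.

```python
import numpy as np

# Bottom-companion form A2 for q(z) = z^3 + 2 z^2 - 5 z - 6:
# last row carries (-q_0, -q_1, -q_2)
A2 = np.zeros((3, 3))
A2[0, 1] = 1.0
A2[1, 2] = 1.0
A2[2, :] = [6.0, 5.0, -2.0]

char_coeffs = np.poly(A2)   # highest-degree coefficient first
```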
b) For $A_2$, we have
$$\lambda_i I - A_2 = \begin{bmatrix} \lambda_i & -1 & 0 & \cdots & 0 \\ 0 & \lambda_i & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_i & -1 \\ q_0 & q_1 & \cdots & q_{n-2} & \lambda_i + q_{n-1} \end{bmatrix}.$$
An eigenvector $v_i$ associated with $\lambda_i$ satisfies $(\lambda_i I - A_2)v_i = 0$, and the choice
$$v_i = \begin{bmatrix} 1 \\ \lambda_i \\ \lambda_i^2 \\ \vdots \\ \lambda_i^{n-1} \end{bmatrix} \tag{1}$$
works: each of the first $n - 1$ rows gives $\lambda_i \cdot \lambda_i^{j-1} - \lambda_i^j = 0$, and the last row gives
$$q_0 + q_1\lambda_i + \cdots + q_{n-1}\lambda_i^{n-1} + \lambda_i^n = q(\lambda_i) = 0.$$
For example, consider
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 6 & 5 & -2 \end{bmatrix}.$$
Its eigenvalues are $\lambda_1 = -1$, $\lambda_2 = -3$, and $\lambda_3 = 2$. Note that this $A$ has the form of $A_2$; thus the corresponding eigenvectors can be written as follows:
$$v_1 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}.$$
Using those three eigenvectors, we can obtain the matrix that diagonalizes $A$:
$$M = \begin{bmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ -1 & -3 & 2 \\ 1 & 9 & 4 \end{bmatrix}, \qquad M^{-1} = \begin{bmatrix} 1 & -\frac{1}{6} & -\frac{1}{6} \\ -\frac{1}{5} & -\frac{1}{10} & \frac{1}{10} \\ \frac{1}{5} & \frac{4}{15} & \frac{1}{15} \end{bmatrix}.$$
Thus, with
$$\Lambda = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 2 \end{bmatrix},$$
we have $A = M\Lambda M^{-1}$, which implies that
$$A^k = M\Lambda^k M^{-1} = M \begin{bmatrix} (-1)^k & 0 & 0 \\ 0 & (-3)^k & 0 \\ 0 & 0 & 2^k \end{bmatrix} M^{-1}$$
and
$$e^{At} = M \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & e^{-3t} & 0 \\ 0 & 0 & e^{2t} \end{bmatrix} M^{-1}.$$
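The diagonalization and the power formula can be verified directly; the Vandermonde structure of the eigenvectors is used to build $M$.

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [6.0, 5.0, -2.0]])

lam = np.array([-1.0, -3.0, 2.0])
M = np.vander(lam, 3, increasing=True).T   # columns (1, lam_i, lam_i^2)
Minv = np.linalg.inv(M)

# A = M Lambda M^{-1}
A_rebuilt = M @ np.diag(lam) @ Minv

k = 5
Ak_direct = np.linalg.matrix_power(A, k)
Ak_diag = M @ np.diag(lam**k) @ Minv
```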
Exercise 11.3 This equality can be shown in a number of ways. Here we will see two. One way is by diagonalization of the matrix
$$A = \begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix}.$$
Its characteristic equation is $(\sigma - \lambda)^2 + \omega^2 = 0$, yielding eigenvalues $\lambda = \sigma \pm j\omega$. Using the associated eigenvectors, we can show that it has the diagonalization
$$A = \begin{bmatrix} 1 & 1 \\ j & -j \end{bmatrix} \begin{bmatrix} \sigma + j\omega & 0 \\ 0 & \sigma - j\omega \end{bmatrix} \begin{bmatrix} \frac{1}{2} & -j\frac{1}{2} \\ \frac{1}{2} & j\frac{1}{2} \end{bmatrix}.$$
Now
$$\exp\left( t \begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix} \right) = \begin{bmatrix} 1 & 1 \\ j & -j \end{bmatrix} \begin{bmatrix} e^{(\sigma + j\omega)t} & 0 \\ 0 & e^{(\sigma - j\omega)t} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & -j\frac{1}{2} \\ \frac{1}{2} & j\frac{1}{2} \end{bmatrix} = e^{\sigma t} \begin{bmatrix} \frac{e^{j\omega t} + e^{-j\omega t}}{2} & \frac{e^{j\omega t} - e^{-j\omega t}}{2j} \\ -\frac{e^{j\omega t} - e^{-j\omega t}}{2j} & \frac{e^{j\omega t} + e^{-j\omega t}}{2} \end{bmatrix} = \begin{bmatrix} e^{\sigma t}\cos(\omega t) & e^{\sigma t}\sin(\omega t) \\ -e^{\sigma t}\sin(\omega t) & e^{\sigma t}\cos(\omega t) \end{bmatrix}.$$
An arguably simpler way is via the Laplace transform:
$$sI - A = \begin{bmatrix} s - \sigma & -\omega \\ \omega & s - \sigma \end{bmatrix},$$
and so
$$(sI - A)^{-1} = \frac{1}{(s - \sigma)^2 + \omega^2} \begin{bmatrix} s - \sigma & \omega \\ -\omega & s - \sigma \end{bmatrix}.$$
Taking the inverse Laplace transform element-wise gives us the previous result.
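The closed form can be checked against a truncated Taylor series for the matrix exponential (parameter values arbitrary; 40 terms are ample at this scale):

```python
import numpy as np

sigma, omega, t = 0.3, 1.7, 0.9
A = np.array([[sigma, omega], [-omega, sigma]])

# Matrix exponential by truncated Taylor series: sum_k (tA)^k / k!
expm = np.eye(2)
term = np.eye(2)
for k in range(1, 40):
    term = term @ (t * A) / k
    expm = expm + term

closed = np.exp(sigma * t) * np.array(
    [[np.cos(omega * t), np.sin(omega * t)],
     [-np.sin(omega * t), np.cos(omega * t)]])
```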
Exercise 11.4 This equality is shown through the definition of the matrix exponential. The derivation is as follows:
$$\exp\left( t \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} \right) = \sum_{k=0}^{\infty} \frac{1}{k!} t^k \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}^k = \sum_{k=0}^{\infty} \frac{1}{k!} \begin{bmatrix} t^k A^k & 0 \\ 0 & t^k B^k \end{bmatrix} = \begin{bmatrix} \sum_{k=0}^{\infty} \frac{1}{k!}t^k A^k & 0 \\ 0 & \sum_{k=0}^{\infty} \frac{1}{k!}t^k B^k \end{bmatrix} = \begin{bmatrix} e^{tA} & 0 \\ 0 & e^{tB} \end{bmatrix}.$$
Exercise 11.5 By direct substitution of the proposed solution, $x(t) = e^{-tA}e^{(t - t_o)(A + B)}e^{t_o A}x(t_o)$, into the right-hand side of the differential equation, we have:
$$\left( e^{-tA}Be^{tA} \right)x(t) = e^{-tA}Be^{tA}\,e^{-tA}e^{(t - t_o)(A + B)}e^{t_o A}x(t_o) = e^{-tA}Be^{(t - t_o)(A + B)}e^{t_o A}x(t_o). \tag{2}$$
Differentiating the proposed solution directly, and using the fact that $A$ commutes with $e^{-tA}$,
$$\dot{x}(t) = -Ae^{-tA}e^{(t - t_o)(A + B)}e^{t_o A}x(t_o) + e^{-tA}(A + B)e^{(t - t_o)(A + B)}e^{t_o A}x(t_o) = e^{-tA}Be^{(t - t_o)(A + B)}e^{t_o A}x(t_o). \tag{3}$$
Since (3) and (2) are equal, the proposed solution satisfies the system of ODEs and hence is a solution. Moreover, it can be shown that the solution is unique (though this is not the subject of this class).
Additionally, when > 0 there are two more equilibrium points: 0, 2 (In particular, this is
the case for in the interval 0 < 1).
(b) Linearizing the system around (0, 0) we get the Jacobian:
0 1
A =
2 0
The characteristic polynomial of the system is det(A I) = 2 2. If > 0 there is an unstable
root, hence the linearized model is unstable around (0, 0). If < 0 both roots are on imaginary
axis, and the linearized system is marginally stable (neither asymptotically stable nor unstable). To
analyze stability of the original non-linear system in this case we would have to look at the higher
order terms.
For the two other equilibrium points (which exist for > 0) we get the Jacobian:
0
1
4 0
The characteristic polynomial for the system is det(A I) = 2 + 4. The complex conjugate
roots lie on the jω axis and the linearized system is marginally stable.
Exercise 13.2 a) Notice that the input-output differential equation can be written as
ÿ = (d/dt)(u − a1 y) + (u − a2 y − c y²)
and we can use the observability-like realization employed for the discrete-time system of exercise 7.1
(c). The differential equations for the states are
ẋ1 = −a1 x1 + x2 + u
ẋ2 = −a2 x1 − c x1² + u
and the output equation is y = x1. You can check that it is indeed a correct realization by
differentiating the first state equation and plugging in the expression for ẋ2 from the second equation.
A =
3 1
2 0
P =
1
2
21
1
12
for various constants C. Let us find such C that if the point (x1, x2) is within the boundary, then
V̇(x) < 0. Then a trajectory started in this set will stay there, and will asymptotically decay to
zero. Note that on the boundary
x1² + (1/4)(x1 − 2x2)² = C²,
therefore
|2x2 − x1| ≤ 2C.
Therefore if C < 1/4 the derivative is strictly less than zero. Hence we have found a region of attraction
as an ellipse, given by
x1² + (1/4)(x1 − 2x2)² < 1/16.
Any ball located completely within this ellipse will also be a region of attraction. Note also that
this set is not exhaustive; there are other points in state space that converge to zero.
Exercise 14.2 (a) The system is asymptotically stable if all the roots of the characteristic polynomial
lie in the left half of the complex plane. Note that the characteristic polynomial for a matrix A
in control canonical form is given by
det(λI − A) = λ^N + a0 λ^{N−1} + . . . + a_{N−1}.
b) Use a continuity argument to prove that a destabilizing perturbation with the smallest Frobenius
norm will place an eigenvalue of A + Δ on the imaginary axis. Suppose that the minimum perturbation
is Δ, and assume that A + Δ has an eigenvalue in the right half plane. Consider a perturbation of
the form cΔ, where 0 ≤ c ≤ 1. As c changes from 0 to 1, at least one eigenvalue has to cross the jω
axis, and the resulting perturbation has a smaller Frobenius norm than Δ. This contradicts
the original assumption that the minimal A + Δ has an eigenvalue in the right half plane.
c) The characteristic polynomial for the perturbed matrix is
det(λI − A − Δ) = λ^N + (a0 + δ0)λ^{N−1} + . . . + (a_{N−1} + δ_{N−1}).
We know that there exists a root λ = jω, where ω is real. If we plug this solution in, assemble
the real and imaginary parts, and set them equal to zero, we will get two linear equations in δ with
coefficients dependent on the a_k and powers of ω. For example, for a 4th order polynomial:
(jω)⁴ + (a0 + δ0)(jω)³ + (a1 + δ1)(jω)² + (a2 + δ2)(jω) + a3 + δ3 = 0
results in the following two equations:
ω⁴ − (a1 + δ1)ω² + a3 + δ3 = 0
−(a0 + δ0)ω³ + (a2 + δ2)ω = 0.
These equations can be written in matrix form, A(ω)δ = B(ω), as follows:
[ 0  −ω²  0  1 ; −ω³  0  ω  0 ] [ δ0 ; δ1 ; δ2 ; δ3 ] = [ −ω⁴ + a1ω² − a3 ; a0ω³ − a2ω ].
Therefore the problem can be formulated as finding a minimal norm solution to an underdetermined
system of equations:
min ||δ||  subject to  A(ω)δ = B(ω).
By inspection we can see that the matrix A has full row rank for any value of ω unequal to zero. If
ω = 0 the solution is δ3 = −a3, and the rest of the δ_k equal to zero. For all other values of ω the
solution can be expressed as a function of ω:
δ(ω) = A*(ω) ( A(ω)A*(ω) )^{-1} B(ω).
Note that the matrix AA* is diagonal, and can be easily inverted. By minimizing the norm of this
expression over ω we can find the ω that corresponds to the minimizing perturbation, and then plug this
minimizer in for the solution. This way we have converted the problem to minimization of a function of a
single variable, which can be easily solved.
d) In case N = 2 the characteristic polynomial of the perturbed matrix is
λ² + (a0 + δ0)λ + (a1 + δ1) = 0,
where λ = jω. For ω = 0 the minimizing solution is δ1 = −a1, δ0 = 0. If ω ≠ 0, plug in λ = jω,
and the resulting system of equations is
δ1 = ω² − a1
δ0 = −a0.
This is a proper system (the number of equations is equal to the number of unknowns), and its solution
is given directly by the equations. To minimize the norm of the solution we set ω² = a1. Note
that stability of the original matrix A requires that a1 > 0, a0 > 0 (in fact positivity of all coefficients
is always a necessary condition, but not sufficient, for N > 2; use the Routh criterion for a test in that
case!). Next, we have to compare |a1| and |a0|, and choose the smaller of them to null with δ1
or δ0. In our problem a0 = a1 = a, therefore there are 2 solutions: (0, −a) and (−a, 0) for the set of δ's.
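The N = 2 case admits a quick numerical check (a sketch assuming numpy; the candidate perturbations are (δ0, δ1) = (0, −a) and (−a, 0) with signs restored): each one places a root of the perturbed characteristic polynomial on the imaginary axis.

```python
import numpy as np

a = 0.7  # example with a0 = a1 = a > 0 (stable unperturbed polynomial)
for d0, d1 in [(0.0, -a), (-a, 0.0)]:
    # perturbed characteristic polynomial: s^2 + (a0 + d0) s + (a1 + d1)
    roots = np.roots([1.0, a + d0, a + d1])
    # each candidate minimal perturbation places a root on the jw axis
    assert min(abs(roots.real)) < 1e-9
```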
around. Then,
ẋ = f(x, u) ≈ (∂f/∂x)|_{(x̄,ū)=(0,0)} x + (∂f/∂u)|_{(x̄,ū)=(0,0)} u + h.o.t.(x, u),
where
f(x, u) = [ x2 ; −x1³ − x2² u − x2⁴ ].
Thus the unique equilibrium point x̄ is found to be x̄ = (0, 0), which is independent of ū.
2) Choose ū = 0. Then the linearized system is
ẋ = [ 0  1 ; −3x̄1²  −2x̄2ū − 4x̄2³ ]|_{(x̄,ū)=(0,0)} x + [ 0 ; −x̄2² ]|_{(x̄,ū)=(0,0)} u
= [ 0  1 ; 0  0 ] x + [ 0 ; 0 ] u
= Ax + Bu.
Since the eigenvalues of the matrix A are at 0, and the u term does not enter the linearized
system, the linearization cannot be used to conclude local stability of the nonlinear system S1.
c) Let u = c − x2², where c is a function of x1 and x2. Then
ẋ1 = x2
ẋ2 = −x1³ − c x2².
Now, choose a Lyapunov function candidate of the form V(x) = x1⁴ + 2x2². Then
V̇(x) = 4x1³ x2 + 4x2 (−x1³ − c x2²) = −4c x2³.
The choice c = x2 makes V̇(x) = −4x2⁴ ≤ 0, so the feedback u
stabilizes the system.
The linearized system is given by:
[ ẋ1 ; ẋ2 ] = A [ x1 ; x2 ],  A = [ −1  0 ; 0  −1 ].
A has repeated eigenvalues at −1. Thus, the nonlinear system is locally asymptotically stable about
the origin.
(b) The linearized system is given by:
[ ẋ1 ; ẋ2 ] = A [ x1 ; x2 ],  A = [ 0  1 ; 1  −1 ].
The characteristic polynomial is λ(1 + λ) − 1 = 0. The eigenvalues are thus λ1,2 = −1/2 ± √5/2 (one
of the eigenvalues is in the right half plane), and the nonlinear system is unstable about the origin.
(c) The linearized system is given by:
[ ẋ1 ; ẋ2 ] = A [ x1 ; x2 ],  A = [ −1  1 ; 0  −1 ].
A has repeated eigenvalues at −1. Thus, the nonlinear system is locally asymptotically stable about
the origin.
(d) The linearized system is given by:
[ x1(k+1) ; x2(k+1) ] = A [ x1(k) ; x2(k) ],  A = [ 2  0 ; 1  −1 ].
A has eigenvalues λ1 = 2 and λ2 = −1. Since one of the eigenvalues is outside the unit disk, the
nonlinear system is unstable about the origin.
(e) The linearized system is given by:
[ x1(k+1) ; x2(k+1) ] = A [ x1(k) ; x2(k) ],  A = [ 0  0 ; 1  2 ].
A has eigenvalues λ1 = 0 and λ2 = 2. Since one of the eigenvalues is outside the unit disk, the
nonlinear system is unstable about the origin.
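The eigenvalue tests used above are mechanical, and can be sketched in a few lines (numpy assumed): continuous-time linearizations require eigenvalues in the open left half plane, discrete-time ones require eigenvalues inside the open unit disk.

```python
import numpy as np

def ct_stable(A):
    # CT linearization: asymptotically stable iff all eigenvalues in open LHP
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def dt_stable(A):
    # DT linearization: asymptotically stable iff all eigenvalues inside unit disk
    return bool(np.all(np.abs(np.linalg.eigvals(A)) < 1))

assert ct_stable(np.array([[-1.0, 1.0], [0.0, -1.0]]))       # case (c)
assert not ct_stable(np.array([[0.0, 1.0], [1.0, -1.0]]))    # case (b)
assert not dt_stable(np.array([[2.0, 0.0], [1.0, -1.0]]))    # case (d)
assert not dt_stable(np.array([[0.0, 0.0], [1.0, 2.0]]))     # case (e)
```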
Therefore, for a system represented by a first-order transfer function to be causal, the ROC has to be
to the right of the pole (in fact this is true for a multiple pole as well). Since a rational function
can be represented by a partial fraction expansion, and the region of convergence is defined by the
intersection of the individual regions of convergence, the ROC of the system has to lie to the right
of the rightmost pole for the system to be causal. In the case of a rational transfer function this is
also a sufficient condition. Note that if an LTI system has a rational transfer function, its impulse
response consists of diverging or decaying exponentials (maybe multiplied by powers of t), therefore all
concepts of p-stability are equivalent. For BIBO stability the impulse response has to be absolutely
integrable, which is equivalent to existence of the Fourier transform. The Fourier transform is the Laplace
transform evaluated on the jω axis. Therefore for stability the ROC has to include the jω axis.
Using these two rules we can see that the system
G(s) = (s + 2)/((s − 2)(s + 1))
(i) is neither causal nor stable for ROC given by Re(s) < −1;
(ii) is non-causal and stable for ROC −1 < Re(s) < 2;
(iii) is causal and unstable for ROC Re(s) > 2.
Another way to solve the problem is to find (look up in the tables) the inverse Laplace transforms
corresponding to the transfer function and ROC pairs. Compute the partial fraction expansion
G(s) = (s + 2)/((s − 2)(s + 1)) = (1/3) [ 4/(s − 2) − 1/(s + 1) ].
The impulse response functions in the three different cases are
(i) h(t) = −(1/3)( 4e^{2t} − e^{−t} ) u[−t],  Re(s) < −1: anticausal, unstable
(ii) h(t) = −(1/3) 4e^{2t} u[−t] − (1/3) e^{−t} u[t],  −1 < Re(s) < 2: non-causal, stable
(iii) h(t) = (1/3)( 4e^{2t} − e^{−t} ) u[t],  Re(s) > 2: causal, unstable
(b) Note that if there is a diverging exponential in the impulse response, an input which is non-zero
on some interval will result in an exponentially diverging output. For example, in case (iii) choose
f(t) = 1 for 0 < t < 1 and 0 otherwise. The output for any positive t will be a linear combination
of e^{2t} and e^{−t}. For example, for t > 1:
y(t) = ∫₀¹ (1/3)( 4e^{2(t−τ)} − e^{−(t−τ)} ) u[t − τ] f(τ) dτ = (2/3) e^{2t} (1 − e^{−2}) − (1/3) e^{−t} (e − 1).
Clearly this function grows unbounded and has an infinite p-norm. However, the input f(t) has
p-norm equal to 1 for any p including ∞. In case (i) we can use f(t) = 1 for −1 < t < 0, for
example. Infinitely long bounded inputs that do not cancel an unstable pole will also result in
unbounded output. For example, choose e^{−2t}, t ≥ 0 in case (iii).
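The convolution computed above for case (iii) can be checked against a numerical quadrature of the same integral (a sketch assuming numpy and the impulse response reconstructed above):

```python
import numpy as np

def h(t):
    # causal impulse response of case (iii)
    return (1.0 / 3.0) * (4.0 * np.exp(2.0 * t) - np.exp(-t)) * (t >= 0)

t = 2.0
tau = np.linspace(0.0, 1.0, 200001)
y_num = np.trapz(h(t - tau), tau)   # response to f = 1 on (0, 1), evaluated at t
y_closed = (2.0 / 3.0) * np.exp(2.0 * t) * (1.0 - np.exp(-2.0)) \
    - (1.0 / 3.0) * np.exp(-t) * (np.e - 1.0)
assert abs(y_num - y_closed) / abs(y_closed) < 1e-6
```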
Exercise 15.2 a) When g(x) = cos(x), the system is unstable for p ≥ 1. Proof: Suppose the
system is p-stable. Then there exists a constant C such that ||y||_p ≤ C ||u||_p. Now, with
||z||_p → 0, there exists a T such that cos(z(t)) ≈ 1 for all t ≥ T. So we have a contradiction:
||u||_p → 0 while ||y||_p ≥ 1, which implies that there is no such constant C satisfying the
condition. Therefore the system is not p-stable for any p ≥ 1.
b) When g(x) = sin(x), the system is p-stable for p ≥ 1. Proof: Consider the Taylor series
expansion of y(t) = sin(z(t)) about the origin. Then, we have
y(t) = sin(z(t)) = z(t) − (1/3!) z³(t) + H.O.T.
This implies that
||y||_p ≤ ||z||_p + O(||z||_p).   (1)
Since the system from u to z is p-stable,
||z||_p ≤ C ||u||_p   (2)
for some constant C. Thus, combining Eqns (1) and (2), we have
||y||_p ≤ C ||u||_p + O(C ||u||_p).
So, for all ε > 0 there exists δ such that O(C ||u||_p) ≤ ε ||u||_p, which implies that ||y||_p ≤ (C + ε) ||u||_p.
That concludes the p-stability, with ||u||_p < δ.
c) When g(x) is a saturation function with a scale of 1, the system is p-stable for p ≥ 1. Proof:
Again, since the system from u to z is p-stable, there exists a constant C such that ||z||_p ≤ C ||u||_p.
So, for all u with ||u||_p ≤ δ, if we take C to be 1/δ, then we have:
||z||_p ≤ C ||u||_p ≤ 1.
Since
|g(z)| = { |z|  for |z| ≤ 1 ; 1  for |z| > 1 },
u(t) = Σ_{i=1}^{N} u_i e^{jω_i t},  where u_i ∈ R^n, ω_i ∈ R.
With
u*(t) u(t) = ( Σ_{i=1}^{N} u_iᵀ e^{−jω_i t} ) ( Σ_{k=1}^{N} u_k e^{jω_k t} ) = Σ_{i=1}^{N} Σ_{k=1}^{N} e^{j(ω_k − ω_i)t} u_iᵀ u_k,
we have
P_u = lim_{L→∞} (1/2L) ∫_{−L}^{L} u*(t) u(t) dt
= Σ_i Σ_k u_iᵀ u_k lim_{L→∞} (1/2L) ∫_{−L}^{L} e^{j(ω_k − ω_i)t} dt.
Note
lim_{L→∞} (1/2L) ∫_{−L}^{L} e^{j(ω_k − ω_i)t} dt = { 0 : i ≠ k ; 1 : i = k }.
Thus
P_u = Σ_{i=1}^{N} u_iᵀ u_i = Σ_{i=1}^{N} ||u_i||₂².
b) The output of the system can be expressed as y(t) = h(t) ∗ u(t) in the time domain or Y(s) =
H(s)U(s) in the frequency domain. For a CT LTI system, we have y = H(jω_i) u_i e^{jω_i t} if u = u_i e^{jω_i t}.
Thus
y(t) = Σ_{i=1}^{N} H(jω_i) u_i e^{jω_i t},
and, by the same computation as in part a),
P_y = lim_{L→∞} (1/2L) Σ_i Σ_k u_i* H*(jω_i) H(jω_k) u_k ∫_{−L}^{L} e^{j(ω_k − ω_i)t} dt = Σ_{i=1}^{N} ||H(jω_i) u_i||₂².
Now
P_y = Σ_{i=1}^{N} ||H(jω_i) u_i||₂²
≤ Σ_{i=1}^{N} σ_max²(H(jω_i)) ||u_i||₂²
≤ max_i σ_max²(H(jω_i)) Σ_{i=1}^{N} ||u_i||₂²
= max_i σ_max²(H(jω_i)) P_u
≤ sup_ω σ_max²(H(jω)) P_u = ||H||_∞² P_u,
so
sup_{P_u = 1} P_y ≤ ||H||_∞².
To see that the bound is attained, consider an SVD of H(jω_0):
H(jω_0) = U Σ V* = [ u_1 . . . u_n ] Σ [ v_1 . . . v_n ]*.
Exercise 16.3 We can restrict our attention to the SISO system since one can prove the MIMO
case with similar arguments and use of the SVD.
i.) Input l∞ → Output l∞:
this case was treated in chapter 16.2 of the notes.
ii.) Input l2 → Output l2:
this case was treated in chapter 16.3 of the notes.
iii.) Input Power → Output Power:
this case was treated in Exercise 16.1. Please note that P_y = ||H||_∞² P_u; the given entry in the
table corresponds to the rms values.
iv.) Input Power → Output l2:
a finite power input normally produces a finite power output (unless the gain of the system is zero
at all frequencies), and in that case the 2-norm of the output is infinite.
v.) Input l2 → Output Power:
This is now the reversed situation, but with the same reasoning. A finite energy input produces a
finite energy output, which has zero power.
vi.) Input Power → Output l∞:
Here the idea is that a finite power input can produce a finite power output whose ∞-norm is
unbounded. Thinking along the lines of example 15.2, consider the signal u = Σ_{m=1}^{∞} v_m(t), where
v_m(t) = m if m < t < m + m^{−3} and otherwise 0. This signal has finite power and becomes unbounded
over time. Take that signal as the input to an LTI system that is just a constant gain.
vii.) Input l2 → Output l∞:
|y(t)| = | ∫ h(t − s) u(s) ds |
[ y ; u ] = [ (I + PK)^{-1}P   −(I + PK)^{-1}PK ; (I + KP)^{-1}   −(I + KP)^{-1}K ] [ w1 ; w2 ].
So, if K is given as
K = Q(I − PQ)^{-1} = (I − QP)^{-1}Q,
then
(I + PK)^{-1}P = (I + PQ(I − PQ)^{-1})^{-1}P
= ( ((I − PQ) + PQ)(I − PQ)^{-1} )^{-1} P
= (I − PQ)P,
(I + PK)^{-1}PK = (I − PQ)PK = PQ,
(I + KP)^{-1} = (I + (I − QP)^{-1}QP)^{-1}
= ( (I − QP + QP)(I − QP)^{-1} )^{-1}
= I − QP,
(I + KP)^{-1}K = (I − QP)(I − QP)^{-1}Q
= Q.
Thus, the closed loop transfer function can now be written as follows:
[ y ; u ] = [ (I − PQ)P   −PQ ; I − QP   −Q ] [ w1 ; w2 ].
In order for the closed loop system to be stable, all the transfer functions in the large matrix
above must be stable as well.
The entries are (I − PQ)P = P − PQP, −PQ, I − QP, and −Q: each is built from sums and
products of P and Q alone.
Since P and Q are stable from the assumptions, we know that all the transfer functions are stable.
Therefore the closed loop system is stable if K = Q(I P Q)1 = (I QP )1 Q.
2) From 1), we can express Q in terms of P and K in the following manner:
K = Q(I − PQ)^{-1}
K(I − PQ) = Q
K − KPQ = Q
K = (I + KP)Q
Q = (I + KP)^{-1}K = K(I + PK)^{-1},
by the push-through rule.
For any stable Q, the closed loop with a stable P is stabilized by the controller K =
Q(I − PQ)^{-1}. Yet not every stable Q can be used in this formulation, because of the well-posedness
of the closed loop. In terms of the state-space descriptions of P and Q, in order for the interconnected
system, in this case K(s), to be well-posed, we must have condition (17.4) in the lecture
notes, i.e., that (I − D_P Q(∞)) is invertible.
3) Suppose P is SISO, w1 is a step, and w2 = 0. Then we have the following closed loop response:
[ Y(s) ; U(s) ] = [ (I − PQ)P ; I − QP ] (1/s),
so
U(s) = (1 − Q(s)P(s)) (1/s).
Then, using the final value theorem, in order to have the steady state value u(∞) equal to zero, we
need:
u(∞) = lim_{s→0} s (1 − Q(s)P(s)) (1/s) = 0
1 − Q(0)P(0) = 0
Q(0) = 1/P(0).
Therefore, Q(0) must be nonzero and is equal to 1/P (0). Note that this condition implies that P
cannot have a zero at s = 0 because then Q would have a pole at s = 0, which contradicts that Q
is stable.
Exercise 17.5 a) Let l(s) be the signal at the output of Q(s); then we have
l = Q(r − (P − P0)l)
(I + Q(P − P0)) l = Q r
l = (I + Q(P − P0))^{-1} Q r.
With
P(s) = 2/(s − 1),  P0(s) = 1/(s − 1),  Q(s) = 2,
and Y(s) = P(s)L(s), we get
Y(s) = (2/(s − 1)) (1 + 2·(1/(s − 1)))^{-1} 2 R(s)
= (4/(s − 1)) ((s − 1)/(s + 1)) R(s)
= (4/(s + 1)) R(s),
so
Y(s)/R(s) = 4/(s + 1).
b) There is an unstable pole/zero cancellation so that the system is not internally stable.
c) Suppose P(s) = P0(s) = H(s) for some H(s). Then, using part of the equation in a), we have
Y(s) = H(s) (I + Q(s)(H(s) − H(s)))^{-1} Q(s) R(s)
= H(s) I^{-1} Q(s) R(s)
= H(s) Q(s) R(s),
so
Y(s)/R(s) = H(s) Q(s).
Therefore in order for the system to be internally stable for any Q(s), H(s) has to be stable.
Exercise 19.2 The characteristic polynomial for the closed loop system is given by
s(s + 2)(s + a) + 1 = 0.
Computing the locus of the closed loop poles as a function of a can be done numerically. The closed
loop system is stable if a > 0.225. The above bound can also be derived by means of root locus
techniques or by evaluating the Routh-Hurwitz criterion. Another way of deriving bounds for
the value of a is by casting this parametric uncertainty problem as an additive or multiplicative
perturbation problem; see also 19.5. One can expect that the derived bounds in such a case would
be rather conservative.
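The numerical route mentioned above is a few lines (a sketch assuming numpy): expand the characteristic polynomial to s³ + (2 + a)s² + 2as + 1 and test the root locations on either side of the bound.

```python
import numpy as np

def closed_loop_stable(a):
    # closed-loop characteristic polynomial: s(s+2)(s+a) + 1
    roots = np.roots([1.0, 2.0 + a, 2.0 * a, 1.0])
    return bool(np.all(roots.real < 0))

# the Routh-Hurwitz boundary is at 2a^2 + 4a - 1 = 0, i.e. a ~ 0.2247
assert not closed_loop_stable(0.22)
assert closed_loop_stable(0.23)
```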
Exercise 19.4 We can represent the uncertainty in a feedback configuration, as shown below.
Note that the plant is SISO, and we consider the blocks Δ and W to be SISO systems as well, so
we can commute them. The transfer function seen by the Δ block can be derived as follows:
z = P0 (W w − K z)
z = (I + P0 K)^{-1} P0 W w
M = (I + P0 K)^{-1} P0 W.
Figure 19.4
Apply the small gain theorem, and obtain the condition for stability robustness of the closed loop
system as follows:
sup_ω | W(jω) P0(jω) / (1 + P0(jω) K(jω)) | < 1.
With P0(s) = 1/s and K = 10,
1 + P0 K = 1 + 10/s = (s + 10)/s,
and the robustness condition gives a < 10.
Note that with W = −a,
P0/(1 + ΔW P0) = (1/s)/(1 − Δa/s) = 1/(s − Δa),
so when Δ = 1, we have
P0/(1 + W P0) = 1/(s − a) = P,
which says that P is clearly in Ω.
c) The transfer function seen by the Δ block, as derived in the previous problem, is
M = (I + P0 K)^{-1} P0 W.
The small gain condition
sup_ω | P0 W / (1 + P0 K) | < 1
becomes
sup_ω | −a / (10 + jω) | < 1,
i.e.
|a| < sqrt(ω² + 100) for all ω.
Now take P0(s) = 1/(s + 100). Then
P0/(1 + ΔW P0) = (1/(s + 100))/(1 + ΔW/(s + 100)) = 1/(s + 100 + ΔW);
with Δ = 1, the denominator becomes s + 100 + W, which we want to equate to s − a. Thus we
have a new W:
W = −a − 100.
Then, in order to derive the condition for the closed loop system to be stable on the set Ω, we use
the small gain theorem again:
sup_ω | P0 W / (1 + P0 K) | < 1
sup_ω | ((−a − 100)/(jω + 100)) / (1 + 10/(jω + 100)) | < 1
|a + 100| < sqrt(ω² + 110²) for all ω
|a + 100| < 110.
W = [ W21  0 ; 0  W12 ],  Δ = [ 0  Δ1 ; Δ2  0 ],  P0 = [ P11  0 ; 0  P22 ],  K = [ K1  0 ; 0  K2 ].
Calculating the transfer function from the output of the Δ block (w) to its input (z) we get
M = [ 0   W21 K1/(1 + K1 P11) ; W12 K2/(1 + K2 P22)   0 ].
By the assumption in the problem statement the decoupled system is stable, therefore the perturbed
system will be stable if I − MΔ does not have zeros in the closed RHP for any Δ such that
||Δ1||∞ ≤ 1 and ||Δ2||∞ ≤ 1. By a continuity argument this will be true if I − MΔ does not have
zeros on the jω axis, or equivalently |det(I − M(jω)Δ(jω))| > 0. Let us calculate the determinant in
question:
det(I − MΔ) = 1 − W12 W21 K1 K2 Δ1 Δ2 / ((1 + K1 P11)(1 + K2 P22)).
To have a stable perturbed system for arbitrary ||Δ1||∞ ≤ 1 and ||Δ2||∞ ≤ 1 it is necessary and
sufficient to impose
|| W12 W21 K1 K2 / ((1 + K1 P11)(1 + K2 P22)) ||∞ < 1.
Since the uncertainty blocks enter as a product, the answer will not change if only one block is
perturbed.
Figure 21.1
Exercise 21.2
According to the lecture notes, we can state the problem equivalently as:
1/μ(M) = inf_{δ ∈ R^n} { max_i |δ_i| : Σ_i δ_i c_i = 1 }
= inf_{δ ∈ R^n} { max_i |δ_i| : Θδ = [ 1 ; 0 ] },
where:
Θ = [ R(c) ; I(c) ],
with R(c) and I(c) the real and imaginary parts of c.
Notice that the equality Θδ = [ 1 ; 0 ] defines a line, say L0, in R^n. This makes our problem
equivalent to finding the smallest α such that the cube B_α = { δ ∈ R^n : max_i |δ_i| ≤ α } touches the
line L0. That can be done by looking at the following projections:
For every j such that I(c_j) ≠ 0, do the following: look for the smallest α_j such that max_i |δ_i| = α_j
and
Σ_i δ_i ( R(c_i) − γ_j I(c_i) ) = 1,
where γ_j = R(c_j)/I(c_j). For each such j, the example done in the lecture notes applies. By doing
that, we find the smallest α_j such that the projection of B_{α_j} onto δ_j = 0 touches the projection of
L0. Among all of the above candidate solutions, the only admissible ones are those j such that
|δ_j| ≤ α_j.
The final solution is:
1/μ(M) = α_j = 1 / Σ_{i≠j} | R(c_i) − γ_j I(c_i) |.
Exercise 21.3 The perturbed system can be represented by the diagram in Figure 21.3. The closed
loop transfer function from reference input to output is
H(s) = N(s) / (D(s) + K(s)N(s)).
The system is stable if the denominator does not have zeros in the closed RHP.
Figure 21.3
We can see that the minimum norm Δ at which stability is lost puts at least one root on the jω axis.
Therefore we can rewrite the problem in the following way:
inf_ω min_Δ { ||Δ||₂ : ((DΔ + K NΔ)/(D0 + K N0))(jω) = −1 }.
We can expand the constraint expression, taking into account that the real part has to be equal to
−1 and the imaginary part to zero:
inf_ω min_Δ ||Δ||₂  subject to  A(ω)Δ = b,
where
A(ω) = [ Re((DΔ + K NΔ)/(D0 + K N0))(jω) ; Im((DΔ + K NΔ)/(D0 + K N0))(jω) ],  b = [ −1 ; 0 ].
The above represents an underdetermined least squares problem. For all ω such that rank A(ω) is 2,
the solution is
Δ(ω) = A* (A A*)^{-1} b.
The 2-norm of this expression can be minimized over ω, and compared to the solutions (if any)
with rank(A(ω)) < 2.
Exercise 22.3 a) The modal test is the most convenient in this case. The system is reachable if
and only if rank [λI − A | B] = 5 for every λ (it suffices to check λ equal to the eigenvalues of A).
Observe that when λ = 2, [λI − A | B] is
[ 0 −1 0 0 0 | b1 ]
[ 0 0 0 0 0 | b2 ]
[ 0 0 0 0 0 | b3 ]
[ 0 0 0 −1 −1 | b4 ]
[ 0 0 0 0 −1 | b5 ]
which has rank 5 if and only if b2 and b3 are linearly independent. Similarly, for λ = 3, [λI − A | B] is
[ 1 −1 0 0 0 | b1 ]
[ 0 1 0 0 0 | b2 ]
[ 0 0 1 0 0 | b3 ]
[ 0 0 0 0 −1 | b4 ]
[ 0 0 0 0 0 | b5 ]
which has rank 5 if and only if b5 ≠ 0.
(b) Suppose that A ∈ R^{n×n} has k Jordan blocks of dimensions (number of rows) r1, r2, . . . , rk.
Then we must have that b_{r1}, b_{r1+r2}, . . . , b_{r1+r2+···+rk} ≠ 0. Furthermore, if blocks ri and rj have
the same eigenvalue, b_{r1+···+ri} and b_{r1+···+rj} must be linearly independent. These conditions
imply that the input can excite the beginning of each Jordan chain, and hence has an impact
on each of the states.
(c) If the b_i's are scalars, then they are linearly dependent (multiples of each other), so if two of
the Jordan blocks have the same eigenvalue the rank of [λI − A | B] is less than n.
Alternatively:
a) The system is reachable if none of the left eigenvectors of the matrix A is orthogonal to B.
Notice that to control the states corresponding to a Jordan block, it is sufficient to excite only
the state corresponding to the beginning of the Jordan chain, i.e. the last element in the Jordan block
(convince yourself of this by considering a DT system, for example). Thus it is not necessary that the
generalized eigenvectors are not orthogonal to the B matrix! Besides, notice that if two or more
Jordan blocks have the same eigenvalue, then any linear combination of eigenvectors corresponding
to those Jordan blocks is again a left eigenvector. In case (a) we can identify the left eigenvectors of
matrix A:
w2 = [ 0 1 0 0 0 ],  w3 = [ 0 0 1 0 0 ],  w5 = [ 0 0 0 0 1 ].
Any linear combination of w2 and w3 is also a left eigenvector. We can see that w_k B = b_k, the k-th row
of matrix B. Therefore for reachability we need to have at least one non-zero element
in the 5th row and linear independence of the 2nd and 3rd rows of matrix B.
b) Generalizing to an arbitrary matrix in Jordan form, we can see that all rows of matrix B corresponding
to a Jordan block with a unique eigenvalue should have at least one non-zero element, and
rows corresponding to Jordan blocks with repeated eigenvalues should be linearly independent.
c) If there are two or more Jordan blocks with the same eigenvalue, then we can find a linear combination
of the eigenvectors which is orthogonal to the vector b, since two real numbers are obviously linearly
dependent.
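The modal (PBH) test argued above can be sketched directly (numpy assumed; the example matrices are illustrative, not the exercise's A and B):

```python
import numpy as np

def pbh_reachable(A, B, tol=1e-9):
    """Modal (PBH) test: reachable iff rank[lambda*I - A | B] = n for each eigenvalue."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        M = np.hstack([lam * np.eye(n) - A, B])
        if np.linalg.matrix_rank(M, tol) < n:
            return False
    return True

# two Jordan blocks with the same eigenvalue and a single input: unreachable,
# because the corresponding scalar rows of B cannot be linearly independent
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
B = np.array([[0.0], [1.0], [1.0]])
assert not pbh_reachable(A, B)
```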
Exercise 22.4 The open loop system is reachable and the closed-loop system has the following
expression:
x_{k+1} = A x_k + B(w_k + f(x_k)),
where f(·) is an arbitrary but known function. Since the open loop system is reachable, there
exists a control sequence
u* = ( u*(0), . . . , u*(n − 1) )
that can drive the system to a target state x_f = x*(n) ∈ R^n. Thus let's define a trajectory
x*(k) such that it starts from the origin and gets to x*(n) under the control input u*. Then, since
u(k) = w(k) + f(x(k)), let w(k) = u*(k) − f(x*(k)). This w(k) can always take the system
state from the origin to any specified target state in no more than n steps.
ẋ = Ax + bu,  A = [ 0  1 ; 0  0 ],  b = [ 0 ; 1 ].
The solution is expressed by:
x(t) = e^{At} x(0) + ∫₀ᵗ e^{A(t−τ)} b u(τ) dτ.
Calculate the exponential of matrix A by summing up the series and taking into account that Aⁿ =
0 for n > 1:
e^{At} = I + At = [ 1  t ; 0  1 ],
thus
e^{At} b = [ t ; 1 ].
The reachability matrix
[ b  Ab ] = [ 0  1 ; 1  0 ]
has rank 2, therefore the system is reachable. Now, we compute the
reachability Grammian over an interval of length 1:
G = ∫₀¹ e^{A(T−τ)} b bᵀ e^{Aᵀ(T−τ)} dτ = [ 1/3  1/2 ; 1/2  1 ].
The system is reachable, thus the Grammian is invertible, so given any final state x_f we can always
find η such that x_f = Gη, namely η = G^{-1} x_f.
c) According to 23.5, define Fᵀ(t) = e^{A(1−t)} b. Then u(t) = F(t)η is a control input that produces
a trajectory that satisfies the terminal constraint x(1) = x_f. The control effort is given as:
∫₀¹ u²(τ) dτ = ηᵀ G η.
In fact this input corresponds to the minimum energy input required to reach x_f in 1 second. This
can be verified by solving the corresponding underconstrained least squares problem by means of
the tools we learned in chapter 3.
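The Grammian value above is easy to reproduce by quadrature (a sketch assuming numpy): since e^{A(1−s)}b = [1−s, 1]ᵀ, the Grammian is the integral of the outer products of that vector.

```python
import numpy as np

# e^{A(1-s)} b = [1-s, 1]^T for the double integrator, so the Grammian over [0,1]
# is G = integral_0^1 [1-s, 1]^T [1-s, 1] ds
s = np.linspace(0.0, 1.0, 200001)
v = np.vstack([1.0 - s, np.ones_like(s)])
G = np.trapz(v[:, None, :] * v[None, :, :], s)   # integrate the outer products
assert np.allclose(G, [[1.0 / 3.0, 1.0 / 2.0], [1.0 / 2.0, 1.0]], atol=1e-6)
```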
Find δ s.t. min ||δ||₂ subject to w_iᵀ δ = −w_iᵀ b.
This is exactly in the form of the least squares problem. Since both δ and b are real, even when
w_i ∈ Cⁿ, let W̃_i = [ w_iᴿ  w_iᴵ ], where w_iᴿ and w_iᴵ are the real and imaginary parts of w_i respectively.
Then the formulation still remains a least squares problem as follows:
Find δ s.t. min ||δ||₂ subject to W̃_iᵀ δ = −W̃_iᵀ b.
The minimum norm solution is δ = −W̃_i (W̃_iᵀ W̃_i)^{-1} W̃_iᵀ b, so that
min ||δ||₂² = bᵀ W̃_i (W̃_iᵀ W̃_i)^{-1} W̃_iᵀ b.
The last expression has to be minimized over all possible left eigenvectors of A. Note that the expression
does not depend on the norm of the eigenvectors, thus we can minimize over eigenvectors
with unit norm. If all Jordan blocks of matrix A have different eigenvalues, this is a minimization
over a finite set. In the other case we can represent eigenvectors corresponding to Jordan blocks
with the same eigenvalue as a linear combination of eigenvectors corresponding to the particular Jordan
blocks, and then minimize over the coefficients in the linear combination.
b) NO. The explanation is as follows. With the control suggested, the closed loop dynamics are now
ẋ = Ax + (b + δ)u
u = fᵀx + v
ẋ = (A + (b + δ)fᵀ)x + (b + δ)v.
Suppose that w_i was the minimizing eigenvector of unit norm in part a). Then it is also an
eigenvector of the matrix A + (b + δ)fᵀ, since w_i is orthogonal to b + δ. Therefore feedback does not
improve reachability.
Exercise 24.5 a) The given system, for all t ≥ 0 with u(k) = 0 ∀ k ≥ 0, has the following
expression for the output:
y(t) = Σ_{k=1}^{∞} C A^{t+k−1} B u(−k) = C Aᵗ Σ_{k=1}^{∞} A^{k−1} B u(−k),
since matrix A is stable. Note that because of the stability of matrix A all of its eigenvalues are strictly
within the unit circle, and from the Jordan decomposition we can see that
lim_{k→∞} ||A^k||₂ = 0,
therefore x(−∞) does not influence x(0). Thus the above equation can be used in order to find
x(0) as follows:
x(0) = Σ_{k=1}^{∞} A^{k−1} B u(−k).
b) Since the system is reachable, any α ∈ Rⁿ can be achieved by some choice of an input of the
above form. Also, since the system is reachable, the reachability matrix R has full row rank. As
a consequence, (RRᵀ)^{-1} exists. Thus, in order to minimize the input energy, we have to solve the
following familiar least squares problem:
Find u s.t. min ||u||₂ subject to α = Σ_{k=1}^{∞} A^{k−1} B u(−k).
Then the solution can be written in terms of the reachability matrix as follows:
u_min = Rᵀ (RRᵀ)^{-1} α,
so that its square can be expressed as
||u_min||₂² = u_minᵀ u_min = αᵀ (RRᵀ)^{-1} α,
where the last equality comes from the fact that the inverse of a symmetric positive definite matrix is
still symmetric positive definite. Also, the controllability Gramian of the DT system is
P = Σ_{k=0}^{∞} A^k B Bᵀ (Aᵀ)^k = RRᵀ,
and is symmetric positive definite. Thus the square of the minimum energy, denoted η₁(α), can
be expressed as
η₁(α) = αᵀ P^{-1} α = ||Mα||₂²,
where M is a Hermitian square root matrix of P^{-1}, which is still symmetric positive definite.
c) Suppose some input u_min results in x(0) = α; then the output for t ≥ 0 can be expressed as
y(t) = C x(t) = C Aᵗ α.
Thus the square of the energy of the output for t ≥ 0 can be written as
||y||₂² = Σ_{t=0}^{∞} yᵀ(t) y(t) = αᵀ [ C ; CA ; CA² ; . . . ]ᵀ [ C ; CA ; CA² ; . . . ] α = αᵀ OᵀO α.
Defining
Q = Σ_{k=0}^{∞} (Aᵀ)^k Cᵀ C A^k = OᵀO,
the square of the energy of the output for t ≥ 0, which we now denote η₂(α), can be expressed as
a function of α as follows:
η₂(α) = αᵀ Q α.
Also, because of the symmetric positive definiteness of Q, η₂(α) can be written as
η₂(α) = ||Nα||₂²,
where N is a Hermitian square root matrix of Q.
d)
σ² = max_u { Σ_{t=0}^{∞} ||y(t)||² : Σ_{t=−∞}^{−1} ||u(t)||² ≤ 1, u(k) = 0 ∀ k ≥ 0 }
= max_α { η₂(α) : ||u_min||₂² ≤ 1 }
= max_α { η₂(α) : η₁(α) ≤ 1 }.
e) Now, using the fact shown in d) and noting that P^{-1} = MᵀM, where M is a Hermitian
square root matrix which is invertible, we can compute σ²:
σ² = max { η₂(α) | η₁(α) ≤ 1 }
= max { ||Nα||₂² | ||Mα||₂² ≤ 1 }   (set α = M^{-1} l)
= σ_max²(O M^{-1})
= λ_max( (M^{-1})ᵀ Oᵀ O M^{-1} )
= λ_max( (M^{-1})ᵀ Q M^{-1} )
= λ_max( Q M^{-1} (M^{-1})ᵀ )
= λ_max( Q P ).
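The Gramian quantities in this derivation are straightforward to compute for a concrete stable DT system. A minimal sketch (numpy assumed; the A, B, C here are illustrative): the truncated series for P and Q satisfy the discrete Lyapunov equations, and λ_max(QP) is then read off directly.

```python
import numpy as np

A = np.array([[0.5, 0.2], [0.0, 0.3]])   # a stable DT example
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, -1.0]])

P = np.zeros((2, 2))   # reachability Gramian: sum of A^k B B^T (A^T)^k
Q = np.zeros((2, 2))   # observability Gramian: sum of (A^T)^k C^T C A^k
Ak = np.eye(2)
for _ in range(200):
    P += Ak @ B @ B.T @ Ak.T
    Q += Ak.T @ C.T @ C @ Ak
    Ak = A @ Ak

# sanity check: the Gramians satisfy the DT Lyapunov equations
assert np.allclose(A @ P @ A.T - P + B @ B.T, 0.0, atol=1e-10)
assert np.allclose(A.T @ Q @ A - Q + C.T @ C, 0.0, atol=1e-10)
gain_sq = np.max(np.linalg.eigvals(Q @ P).real)   # lambda_max(QP) from part e)
assert gain_sq > 0
```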
H1(s) = (s + f)/(s³ + 12s² + 48s + 64) = (s + f)/(s + 4)³,  H2(s) = 1/(s − 2).
Thus the state-space realizations in controller canonical form for H1(s) and H2(s) are:
A1 = [ −12 −48 −64 ; 1 0 0 ; 0 1 0 ],  B1 = [ 1 ; 0 ; 0 ],  C1 = [ 0  1  f ],  D1 = 0,
and
A2 = 2,  B2 = 1,  C2 = 1,  D2 = 0.
Since f does not appear in the controllability matrix for H1(s) with this realization, controllability,
which is equivalent to reachability for CT systems, is independent of the value
of f. Thus, check the rank of the controllability matrix:
rank(C) = rank [ 1 −12 96 ; 0 1 −12 ; 0 0 1 ] = 3.
Thus, the system with this realization is controllable. On the other hand, the observability matrix
O for H1(s) contains f in it as follows:
O = [ 0  1  f ; 1  f  0 ; f − 12  −48  −64 ].
Thus, when f = 4, O decreases its rank from 3 to 2.
Now, let's consider the state-space realization in observer canonical form for H1(s). It can be
expressed as follows:
A1 = [ 0 0 −64 ; 1 0 −48 ; 0 1 −12 ],  B1 = [ f ; 1 ; 0 ],  C1 = [ 0  0  1 ],  D1 = 0.
Since C1 does not contain f, observability is independent of the value of f. Thus check the rank
of the observability matrix:
rank(O) = rank [ 0 0 1 ; 0 1 −12 ; 1 −12 96 ] = 3.
Thus the system with this realization is observable. The controllability matrix, however, is
C = [ f  0  −64 ; 1  f  −48 ; 0  1  f − 12 ].
Thus, again, when f = 4, C decreases its rank from 3 to 2.
b) Let H(s) be the cascaded system H2(s)H1(s). Then the augmented system H(s) has the
following state-space representation:
ẋ = [ A1  0 ; B2 C1  A2 ] x + [ B1 ; 0 ] u
y = [ 0  C2 ] x,
i.e.
ẋ = [ −12 −48 −64 0 ; 1 0 0 0 ; 0 1 0 0 ; 0 1 f 2 ] x + [ 1 ; 0 ; 0 ; 0 ] u
y = [ 0 0 0 1 ] x,
which we write as
ẋ = Ax + Bu
y = Cx.
Here, we use A1, B1, and C1 from the controller canonical form obtained in a). Since matrix A has a
zero block in its upper triangle, the eigenvalues of the cascaded system are those of A1 and A2, i.e.,
−4, −4, −4, and 2. Thus the cascaded system is not asymptotically stable. Since C1 does not enter
the eigenvalue computation for A, the stability does not depend on the value of f.
The controllability matrix C for H(s) is
C = [ B  AB  A²B  A³B ] = [ 1 −12 96 −640 ; 0 1 −12 96 ; 0 0 1 −12 ; 0 0 1 −12 + f + 2 ],
which decreases its rank from 4 to 3 when f = −2. On the other hand, the observability matrix O
for H(s) is
O = [ C ; CA ; CA² ; CA³ ] = [ 0 0 0 1 ; 0 1 f 2 ; 1 f + 2 2f 4 ; −12 + f + 2  −48 + 2f + 4  −64 + 4f  8 ],
so with the choice f = 4, O drops its rank from full rank to 3. Thus the cascaded system is
unobservable at f = 4.
It can be seen immediately that the f = −2 case corresponds to an unstable pole-zero cancellation
(the zero of H1 at s = −f = 2 cancels the unstable pole of H2 at s = 2). Thus,
for f = −2, the cascaded system is BIBO stable, but is not asymptotically stable due to the unstable
pole-zero cancellation.
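The rank drops of the cascade can be confirmed numerically. A minimal sketch (numpy assumed; ctrb and obsv are local helpers built from the definitions):

```python
import numpy as np

def ctrb(A, B):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def obsv(A, C):
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

def cascade(f):
    # H2(s)H1(s) with H1 in controller canonical form and H2 = 1/(s-2)
    A = np.array([[-12.0, -48.0, -64.0, 0.0],
                  [1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 1.0, f, 2.0]])
    B = np.array([[1.0], [0.0], [0.0], [0.0]])
    C = np.array([[0.0, 0.0, 0.0, 1.0]])
    return A, B, C

A, B, C = cascade(-2.0)
assert np.linalg.matrix_rank(ctrb(A, B)) == 3   # uncontrollable at f = -2
A, B, C = cascade(4.0)
assert np.linalg.matrix_rank(obsv(A, C)) == 3   # unobservable at f = 4
A, B, C = cascade(1.0)
assert np.linalg.matrix_rank(ctrb(A, B)) == 4 and np.linalg.matrix_rank(obsv(A, C)) == 4
```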