Applied Linear Algebra and Optimization Using MATLAB®
License, Disclaimer of Liability, and Limited Warranty
By purchasing or using this book (the “Work”), you agree that this license grants
permission to use the contents contained herein, but does not give you the right
of ownership to any of the textual content in the book or ownership to any of
the information or products contained in it. This license does not permit up-
loading of the Work onto the Internet or on a network (of any kind) without the
written consent of the Publisher. Duplication or dissemination of any text, code,
simulations, images, etc. contained herein is limited to and subject to licensing
terms for the respective products, and permission must be obtained from the
Publisher or the owner of the content, etc., in order to reproduce or network
any portion of the textual material (in any media) that is contained in the Work.
The author, developers, and the publisher of any accompanying content, and
anyone involved in the composition, production, and manufacturing of this work
will not be liable for damages of any kind arising out of the use of (or the inabil-
ity to use) the algorithms, source code, computer programs, or textual material
contained in this publication. This includes, but is not limited to, loss of revenue
or profit, or other incidental, physical, or consequential damages arising out of
the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to
replacement of the book, and only at the discretion of the Publisher. The use of
“implied warranty” and certain “exclusions” vary from state to state, and might
not apply to the purchaser of this product.
This publication, portions of it, or any accompanying software may not be reproduced
in any way, stored in a retrieval system of any type, or transmitted by any means,
media, electronic display or mechanical display, including, but not limited to,
photocopy, recording, Internet postings, or scanning, without prior permission in
writing from the publisher.
ISBN: 978-1-9364200-4-9
The publisher recognizes and respects all marks used by companies, manufacturers,
and developers as a means to distinguish their products. All brand names and
product names mentioned in this book are trademarks or service marks of their
respective companies. Any omission or misuse (of any kind) of service marks or
trademarks, etc. is not an attempt to infringe on the property of others.
Our titles are available for adoption, license, or bulk purchase by institutions,
corporations, etc. For additional information, please contact the
Customer Service Dept. at 1-800-758-3756 (toll free).
Preface xv
Acknowledgments xix
Appendices 917
Bibliography 1129
Index 1145
Preface
The main approach used in this book is quite different from currently
available books, which are either too theoretical or too computational. The
approach adopted in this book lies between the above two extremities. The
book fully exploits MATLAB’s symbolic, numerical, and graphical capabil-
ities to develop a thorough understanding of linear algebra and optimiza-
tion algorithms.
The book covers two distinct topics: linear algebra and optimization
theory. Linear algebra plays an important role in both applied and
theoretical mathematics, as well as in all of science and engineering,
ing with the intent of this book, this chapter presents the mathematical
formulations of basic linear programming problems. In Chapter 7, we de-
scribe nonlinear programming formulations. We discuss many numerical
methods for solving unconstrained and constrained problems. In the be-
ginning of the chapter some of the basic mathematical concepts useful in
developing optimization theory are presented. For unconstrained optimiza-
tion problems we discuss the golden-section search method and quadratic
interpolation method, which depend on the initial guesses that bracket the
single optimum, and Newton’s method, which is based on the idea from
calculus that the minimum or maximum can be found by solving f′(x) = 0.
For the functions of several variables, we use the steepest descent method
and Newton’s method. For handling nonlinear optimization problems with
constraints, we discuss the generalized reduced-gradient method, Lagrange
multipliers, and KT conditions. At the end of the chapter, we also discuss
quadratic programming problems and the separable programming prob-
lems.
My sincere thanks are also due to the Deanship of the Scientific Re-
search Center, College of Science, King Saud University, Riyadh, KSA,
for financial support and for providing facilities throughout the research
project No. (Math/2008/05/B).
It has taken me five years to write this book and thanks must go to my
long-suffering family for my frequent unsocial behavior over these years. I
am profoundly grateful to my wife Saima, and our children Fatima, Usman,
1.1 Introduction
When engineering systems are modeled, the mathematical description is
frequently developed in terms of a set of algebraic simultaneous equations.
Sometimes these equations are nonlinear and sometimes linear. In this
chapter, we discuss systems of simultaneous linear equations and describe
the numerical methods for the approximate solutions of such systems. The
solution of a system of simultaneous linear algebraic equations is proba-
bly one of the most important topics in engineering computation. Prob-
lems involving simultaneous linear equations arise in the areas of elasticity,
electric-circuit analysis, heat transfer, vibrations, and so on. Also, the
numerical integration of some types of ordinary and partial differential
equations may be reduced to the solution of such a system of equations. It
has been estimated, for example, that about 75% of all scientific problems
require the solution of a system of linear equations at one stage or another.
It is therefore important to be able to solve linear problems efficiently and
accurately.
For example,
4x1 − 2x2 = 5
3x1 + 2x2 = 4
is a system of two equations in the two variables x1 and x2.
There are three possible types of linear systems that arise in engineering
problems, and they are described as follows:
1. If there are more equations than unknown variables (m > n), then
the system is usually called overdetermined. Typically, an overdeter-
mined system has no solution. For example, the following system
4x1 = 8
3x1 + 9x2 = 13
3x2 = 9
has no solution.
2. If there are more unknown variables than the number of the equations
(n > m), then the system is usually called underdetermined. Typi-
cally, an underdetermined system has an infinite number of solutions.
For example, the system
x1 + 5x2 = 45
3x2 + 4x3 = 21
3. If the number of equations is equal to the number of unknown variables (m = n), then the system typically has a unique solution. For example, the system

2x1 + x2 − x3 = 1
x1 − 2x2 + 3x3 = 4
x1 + x2 = 1

has the unique solution

x1 = 1, x2 = 0, x3 = 1.

However, a square system need not have a unique solution. For example, the system

5x1 + x2 + x3 = 4
3x1 − x2 + x3 = −2
x1 + x2 = 3

does not have a unique solution since the equations are not linearly
independent; the first equation is equal to the second equation plus
twice the third equation.
Every system of linear equations has either no solution, exactly one solu-
tion, or infinitely many solutions. •
First, the two lines may be parallel and distinct, in which case the system has no solution. For example, consider the system

x1 + x2 = 1
2x1 + 2x2 = 3.
From the graphs (Figure 1.1(a)) of the given two equations we can see
that the lines are parallel, so the given system has no solution. It can be
proved algebraically simply by multiplying the first equation of the system
by 2 to get a system of the form
2x1 + 2x2 = 2
2x1 + 2x2 = 3,

which is impossible, since the same quantity cannot equal both 2 and 3.
Second, the two lines may not be parallel, and they may meet at exactly
one point, so in this case the system has exactly one solution. For example,
consider the system
x1 − x2 = −1
3x1 − x2 = 3.
From the graphs (Figure 1.1(b)) of these two equations we can see that
the lines intersect at exactly one point, namely, (2, 3), and so the system
has exactly one solution, x1 = 2, x2 = 3. To show this algebraically, if we
substitute x2 = x1 + 1 in the second equation, we have 3x1 − x1 − 1 = 3,
or x1 = 2, and using this value of x1 in x2 = x1 + 1 gives x2 = 3.
Finally, the two lines may actually be the same line, and so in this case,
every point on the lines gives a solution to the system and therefore there
are infinitely many solutions. For example, consider the system
x1 + x2 = 1
2x1 + 2x2 = 2.
Here, both equations have the same line for their graph (Figure 1.1(c)).
So this system has infinitely many solutions because any point on this line
gives a solution to this system, since any solution of the first equation is
also a solution of the second equation. For example, if we set x2 = 1 − x1, we
can choose x1 = 0, x2 = 1; x1 = 1, x2 = 0; and so on. •
Note that a system of equations with no solution is said to be an incon-
sistent system and if it has at least one solution, it is said to be a consistent
system.
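This distinction is easy to test numerically. As a short MATLAB sketch (using the inconsistent pair of parallel lines above), a system is consistent exactly when rank(A) = rank([A b]):

>> A = [1 1; 2 2];
>> b = [1; 3];
>> rank(A)      % returns 1
>> rank([A b])  % returns 2, so the system is inconsistent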
The system of linear equations (1.3) can be written as the single matrix
equation
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}. \qquad (1.4)$$
If we compute the product of the two matrices on the left-hand side of (1.4), we have

$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}. \qquad (1.5)$$
But two matrices are equal if and only if their corresponding elements
are equal. Hence, the single matrix equation (1.4) is equivalent to the
system of the linear equations (1.3). If we define

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},$$
the coefficient matrix, the column matrix of unknowns, and the column
matrix of constants, respectively, and then the system (1.3) can be written
very compactly as
Ax = b, (1.6)
which is called the matrix form of the system of linear equations (1.3). The
column matrices x and b are called vectors.
If the right-hand sides of the equal signs of (1.6) are not zero, then the
linear system (1.6) is called a nonhomogeneous system, and we will find
that all the equations must be independent to obtain a unique solution.
If the constants b of (1.6) are added to the coefficient matrix A as a
column of elements in the position shown below,

$$[A|b] = \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{array}\right], \qquad (1.7)$$
then the matrix [A|b] is called the augmented matrix of the system (1.6).
In many instances, it may be convenient to operate on the augmented ma-
trix instead of manipulating the equations. It is customary to put a bar
between the last two columns of the augmented matrix to remind us where
the last column came from. However, the bar is not absolutely necessary.
The coefficient and augmented matrices of a linear system will play key
roles in our methods of solving linear systems.
>> A = [1 2 3; 4 5 6; 7 8 9];
>> b = [10; 11; 12];
>> Aug = [A b]
Aug =
1 2 3 10
4 5 6 11
7 8 9 12
The system of linear equations (1.8) can be written as the single matrix
equation
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \qquad (1.9)$$
It can also be written in more compact form as
Ax = 0, (1.10)
where
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad 0 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
For a homogeneous system, there are only two possibilities: either the zero solution is the only
solution, or there are infinitely many solutions (called nontrivial solutions).
Of course, it is usually nontrivial solutions that are of interest in physical
problems. A nontrivial solution to the homogeneous system can occur with
certain conditions on the coefficient matrix A, which we will discuss later.
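In MATLAB, the built-in function null can be used to look for nontrivial solutions; a short sketch with an assumed singular coefficient matrix:

>> A = [1 2; 2 4];   % singular by construction
>> N = null(A)       % one basis vector, proportional to [-2; 1]
>> A*N               % essentially the zero vector

Here null(A) returns an orthonormal basis for the solution space of Ax = 0; any nonzero multiple of N is a nontrivial solution, and an empty result would mean that only the trivial solution exists.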
The numbers a11 , a12 , . . . , amn that make up the array are called the ele-
ments of the matrix. The first subscript for the element denotes the row
and the second denotes the column in which the element appears. The
elements of a matrix may take many forms. They could be all numbers
(real or complex), or variables, or functions, or integrals, or derivatives, or
even matrices themselves.
The order or size of a matrix is specified by the number of rows (m)
and columns (n); thus, the matrix A in (1.11) is of order m by n, usually
written as m × n.
A vector can be considered a special case of a matrix having only one
row or one column. A row vector containing n elements is a 1 × n matrix,
called a row matrix, and a column vector of n elements is an n × 1 matrix,
called a column matrix. A matrix of order 1 × 1 is called a scalar.
Two matrices A = (aij) and B = (bij) are equal if they are the same size
and the corresponding elements in A and B are equal, i.e., aij = bij for each
pair (i, j). For example, let

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 1 \\ 5 & 2 \end{bmatrix}.$$

Then

$$A + B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 4 & 1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 3 \\ 8 & 6 \end{bmatrix} = C. \qquad •$$
>> A = [1 2; 3 4];
>> B = [4 1; 5 2];
>> C = A + B
C=
5 3
8 6
Similarly, for the same matrices A and B, the product is

$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 4+10 & 1+4 \\ 12+20 & 3+8 \end{bmatrix} = \begin{bmatrix} 14 & 5 \\ 32 & 11 \end{bmatrix} = C.$$
For example, if

$$A = \begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix},$$

then

$$AB = \begin{bmatrix} 2 & 3 \\ -2 & 2 \end{bmatrix} \quad\text{while}\quad BA = \begin{bmatrix} 1 & 7 \\ -1 & 3 \end{bmatrix}.$$

Thus, AB ≠ BA. •
>> A = [1 2; -1 3];
>> B = [2 1; 0 1];
>> C = A*B
C=
2 3
−2 2
>> u = [1 2 3 4];
>> v = [5 3 0 2];
>> x = u.*v
x =
5 6 0 8
>> y = u./v
y =
0.2000 0.6667 Inf 2.0000
>> A = [1 2 3; 4 5 6; 7 8 9];
>> B = [9 8 7; 6 5 4; 3 2 1];
>> C = A.*B
C=
9 16 21
24 25 24
21 16 9
>> A = [1 2 3; 4 5 6; 7 8 9];
>> D = A.^2
D=
1 4 9
16 25 36
49 64 81
A matrix A which has the same number of rows m and columns n, i.e.,
m = n, is called a square matrix. For example, the 2 × 2 and 3 × 3 matrices
used above are square matrices because both have the same number of rows
and columns. •
In MATLAB, identity matrices are created with the eye function, which
can take either one or two input arguments:
>> I = eye(n)
>> I = eye(m, n)
>> A = [1 2 3; 4 5 6; 7 8 9]
>> B = A'
B=
1.0000 4.0000 7.0000
2.0000 5.0000 8.0000
3.0000 6.0000 9.0000
Note that

1. (A^T)^T = A.

An n × n matrix A is said to be invertible (or nonsingular) if there exists a matrix B of the same size such that

AB = BA = In.

Then the matrix B is called the inverse of A and is denoted by A^{-1}. For
example, let
$$A = \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} -1 & 3/2 \\ 1 & -1 \end{bmatrix}.$$

Then we have

AB = BA = I2,
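A quick numerical check of this pair in MATLAB (using the matrices as reconstructed above):

>> A = [2 3; 2 2];
>> B = [-1 3/2; 1 -1];
>> A*B   % returns the 2-by-2 identity matrix
>> B*A   % also the identity, so B = inv(A)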
>> A = [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2]
>> Ainv = INVMAT(A)
Ainv =
0.8000 0.6000 0.4000 0.2000
0.6000 1.2000 0.8000 0.4000
0.4000 0.8000 1.2000 0.6000
0.2000 0.4000 0.6000 0.8000
Program 1.1
MATLAB m-file for Finding the Inverse of a Matrix
function [Ainv]=INVMAT(A)
% Gauss-Jordan elimination on the augmented matrix [A | I];
% assumes A is square and no zero pivot is encountered.
[n,n]=size(A); I=zeros(n,n);
for i=1:n; I(i,i)=1; end
m(1:n,1:n)=A; m(1:n,n+1:2*n)=I;
for i=1:n
  m(i,1:2*n)=m(i,1:2*n)/m(i,i);      % normalize the pivot row
  for k=1:n
    if i~=k
      m(k,1:2*n)=m(k,1:2*n)-m(k,i)*m(i,1:2*n);
    end
  end
end
Ainv=m(1:n,n+1:2*n);
>> I = Ainv*A;
>> format short e
>> I
I =
1.0000e+00  -1.1102e-16   0            0
0            1.0000e+00   0            0
0            0            1.0000e+00   2.2204e-16
0            0            0            1.0000e+00
The values of I(1, 2) and I(3, 4) are very small, but nonzero, due to
round-off errors in the computation of Ainv and I. It is often preferable to
use rational numbers rather than decimal numbers. The MATLAB function
rats(x) returns a rational approximation to x.
3. Its product with another invertible matrix is invertible, and the in-
verse of the product is the product of the inverses in the reverse order.
If A and B are invertible matrices of the same size, then AB is in-
vertible and (AB)−1 = B −1 A−1 .
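This property is easy to verify numerically; a sketch with two arbitrarily chosen invertible matrices:

>> A = [2 1; 1 1]; B = [1 2; 0 1];
>> norm(inv(A*B) - inv(B)*inv(A))   % essentially zero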
It is a square matrix having all elements equal to zero except those on the
main diagonal, i.e.,

$$A = (a_{ij}), \qquad a_{ij} = 0 \ \text{if}\ i \neq j, \quad a_{ij} \neq 0 \ \text{if}\ i = j.$$
Note that a diagonal matrix is invertible if all of its diagonal entries are
nonzero. •
>> B = [2 -4 1; 6 10 -3; 0 5 8]
>> M = diag(B)
M=
2
10
8
It is a square matrix which has zero elements below and to the left of the
main diagonal. The diagonal as well as the above-diagonal elements can
take on any value, e.g.,

$$U = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix}.$$
>> A = [1 2 3; 4 5 6; 7 8 9];
>> U = triu(A)
U=
1 2 3
0 4 5
0 0 6
We can also create a strictly upper-triangular matrix, i.e., an upper-
triangular matrix with a zero diagonal, from a given matrix A by using the
MATLAB built-in function triu(A,1) as follows:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> U = triu(A, 1)
U=
0 2 3
0 0 5
0 0 0
It is a square matrix which has zero elements above and to the right of the
main diagonal, and the rest of the elements can take on any value, i.e.,
Note that all the triangular matrices (upper or lower) with nonzero
diagonal entries are invertible.
>> A = [1:4; 5:8; 9:12]
% A is not symmetric
>> B = A'*A
B=
107 122 137 152
122 140 158 176
137 158 179 200
152 176 200 224
>> C = A*A'
C=
30 70 110
70 174 278
110 278 446
Example 1.1 Find all the values of a, b, and c for which the following
matrix is symmetric:
$$A = \begin{bmatrix} 4 & a+b+c & 0 \\ -1 & 3 & b-c \\ -a+2b-2c & 1 & b-2c \end{bmatrix}.$$

Solution. If the given matrix is symmetric, then A = A^T, i.e.,

$$A = \begin{bmatrix} 4 & a+b+c & 0 \\ -1 & 3 & b-c \\ -a+2b-2c & 1 & b-2c \end{bmatrix} = \begin{bmatrix} 4 & -1 & -a+2b-2c \\ a+b+c & 3 & 1 \\ 0 & b-c & b-2c \end{bmatrix} = A^T,$$
and equating corresponding entries gives

−a + 2b − 2c = 0
a + b + c = −1
b − c = 1.

Solving these three equations gives

a = 2, b = −1, c = −2,

and using these values, we have the given matrix of the form

$$A = \begin{bmatrix} 4 & -1 & 0 \\ -1 & 3 & 1 \\ 0 & 1 & 3 \end{bmatrix}.$$
•
Theorem 1.3 If A and B are symmetric matrices of the same size, and
if k is any scalar, then:
1. A^T is also symmetric;
If for a matrix A we have aij = −aji for all i ≠ j and the main diagonal
elements are not all zero, then the matrix A is called a skew matrix. If
all the elements on the main diagonal of a skew matrix are zero, then the
matrix is called skew symmetric, i.e., aij = −aji for all i and j.
Any square matrix may be split into the sum of a symmetric and a skew
symmetric matrix. Thus,

$$A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T),$$

where ½(A + A^T) is a symmetric matrix and ½(A − A^T) is a skew symmetric matrix.
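In MATLAB this splitting takes one line per part; a sketch using a made-up 3 × 3 matrix:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> S = (A + A')/2;              % symmetric part
>> K = (A - A')/2;              % skew symmetric part
>> norm(S + K - A)              % zero: the two parts recover A
>> norm(S - S'), norm(K + K')   % both zero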
The partitioning lines must always extend entirely through the matrix
as in the above example. If the submatrices of A are denoted by the symbols
A11 , A12 , A21 , and A22 so that
$$A_{11} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \quad A_{12} = \begin{bmatrix} a_{14} & a_{15} \\ a_{24} & a_{25} \\ a_{34} & a_{35} \end{bmatrix},$$
$$A_{21} = \begin{bmatrix} a_{41} & a_{42} & a_{43} \end{bmatrix}, \quad A_{22} = \begin{bmatrix} a_{44} & a_{45} \end{bmatrix},$$

then the original matrix can be written in the form

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$
A partitioned matrix may be transposed by appropriate transposition
and rearrangement of the submatrices. For example, it can be seen by
inspection that the transpose of the matrix A is
$$A^T = \begin{bmatrix} A_{11}^T & A_{21}^T \\ A_{12}^T & A_{22}^T \end{bmatrix}.$$
Partitioned matrices such as the one given above can be added, sub-
tracted, and multiplied provided that the partitioning is performed in an
appropriate manner. For the addition and subtraction of two matrices, it is
necessary that both matrices be partitioned in exactly the same way. Thus,
a partitioned matrix B of order 4 × 5 (compare with matrix A above) will
be conformable for addition with A only if it is partitioned as follows:
$$B = \left[\begin{array}{ccc|cc} b_{11} & b_{12} & b_{13} & b_{14} & b_{15} \\ b_{21} & b_{22} & b_{23} & b_{24} & b_{25} \\ b_{31} & b_{32} & b_{33} & b_{34} & b_{35} \\ \hline b_{41} & b_{42} & b_{43} & b_{44} & b_{45} \end{array}\right],$$
in which B11 , B12 , B21 , and B22 represent the corresponding submatrices.
In order to add A and B and obtain a sum C, it is necessary according to
the rules for addition of matrices that the following represent the sum:
$$A + B = \begin{bmatrix} A_{11}+B_{11} & A_{12}+B_{12} \\ A_{21}+B_{21} & A_{22}+B_{22} \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} = C.$$
Note that like A and B, the sum matrix C will also have the same par-
titions.
Then, when forming the product AD according to the usual rules for
matrix multiplication, the following result is obtained:
$$M = AD = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix} = \begin{bmatrix} A_{11}D_{11}+A_{12}D_{21} & A_{11}D_{12}+A_{12}D_{22} \\ A_{21}D_{11}+A_{22}D_{21} & A_{21}D_{12}+A_{22}D_{22} \end{bmatrix} = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}.$$
For this product to be defined, the columns of the first partitioned matrix
must be partitioned in the same way as the rows of the second partitioned
matrix. It does not matter how the rows of the first partitioned matrix and
the columns of the second partitioned matrix are partitioned. •
For example, a matrix of the form

$$\begin{bmatrix} a_{11} & a_{12} & 0 & 0 \\ a_{21} & a_{22} & a_{23} & 0 \\ 0 & a_{32} & a_{33} & a_{34} \\ 0 & 0 & a_{43} & a_{44} \end{bmatrix}$$

is a tridiagonal matrix.
A permutation matrix P has only 0s and 1s and there is exactly one in each
row and column of P . For example, the following matrices are permutation
matrices:
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
The product P A has the same rows as A but in a different order (permuted),
while AP is just A with the columns permuted. •
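A short MATLAB illustration of this behavior, using the first permutation matrix above and an arbitrary example matrix:

>> P = [1 0 0; 0 0 1; 0 1 0];
>> A = [1 2 3; 4 5 6; 7 8 9];
>> P*A   % rows 2 and 3 of A are interchanged
>> A*P   % columns 2 and 3 of A are interchanged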
2. The first entry from the left of a nonzero row is 1. This entry is
called the leading one of its row.
3. For each nonzero row, the leading one appears to the right and below
any leading ones in preceding rows.
Note that, in particular, in any column containing a leading one, all
entries below the leading one are zero. For example, the following matrices
are in row echelon form:
$$\begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
0x1 + 0x2 = 0,
Similarly, the linear system that corresponds to the second above matrix
is
x1 = 2
x2 = 2
0 = 1.
The third equation of this system shows that
0x1 + 0x2 = 1,
which is not possible for any choices of x1 and x2 . Hence, the system has
no solution.
Finally, the linear system that corresponds to the third above matrix is
x1 + 2x2 + 3x3 = 4
x3 = 2
0x1 + 0x2 + 0x3 = 0,
and by backward substitution (without using the third equation of the system),
we get x3 = 2 and x1 = −2 − 2x2, where x2 can be chosen arbitrarily; hence,
the system has infinitely many solutions.
2. The first entry from the left of a nonzero row is 1. This entry is
called the leading one of its row.
3. For each nonzero row, the leading one appears to the right and below
any leading ones in preceding rows.
For example, the following matrices are in reduced row echelon form:

$$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 6 \end{bmatrix}, \quad \begin{bmatrix} 1 & 4 & 5 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix},$$

and the following matrices are not in reduced row echelon form:

$$\begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 4 \end{bmatrix}, \quad \begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 5 & 4 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 1 & 0 & 6 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 2 & 0 & 0 \\ 0 & 1 & 1 & 0 & 2 \\ 0 & 2 & 0 & 1 & 1 \end{bmatrix}.$$
There are usually many sequences of row operations that can be used to
transform a given matrix to reduced row echelon form—they all, however,
lead to the same reduced row echelon form. In the following, we shall
discuss how to transform a given matrix in reduced row echelon form.
Theorem 1.4 Every matrix can be brought to reduced row echelon form
by a series of elementary row operations. •
Using the finite sequence of elementary row operations, we get the matrix
of the form

$$R_1 = \begin{bmatrix} 1 & -3 & 0 & 0 & 1 \\ 0 & 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix},$$
which is in row echelon form. If we continue with the matrix R1 and make
all elements above the leading ones equal to zero, we obtain

$$R_2 = \begin{bmatrix} 1 & -3 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix},$$
>> A = [1 -3 0 0 1; 2 -6 -1 1 1; 3 -9 2 -1 5];
>> B = rref(A)
B =
1 -3 0 0 1
0  0 1 0 1
0  0 0 1 0
so R1 is row equivalent to A.
Interchanging row 2 and row 3 of the matrix R1 gives the matrix of the
form

$$R_2 = \begin{bmatrix} 1 & 3 & 6 & 5 \\ 2 & -7 & -3 & -1 \\ 2 & 1 & 4 & 3 \end{bmatrix},$$
so R2 is row equivalent to R1 .
Theorem 1.5
1. Every matrix is row equivalent to itself.
2. If a matrix A is row equivalent to a matrix B, then B is row equivalent
to A.
3. If a matrix A is row equivalent to a matrix B and B is row equivalent
to a matrix C, then A is row equivalent to C. •
Example 1.5 Use elementary row operations on matrices to solve the lin-
ear system
− x2 + x3 = 1
x1 − x2 − x3 = 1
−x1 + 3x3 = −2.
x1 = −4
x2 = −3
x3 = −2,
and we get the solution [−4, −3, −2]T of the given linear system. •
1. det(A) = a11 , if n = 1.
For example, if

$$A = \begin{bmatrix} 4 & 2 \\ -3 & 7 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 6 & 3 \\ 2 & 5 \end{bmatrix},$$

then det(A) = (4)(7) − (2)(−3) = 34 and det(B) = (6)(5) − (3)(2) = 24.
>> A = [2 2; 6 7];
>> B = det(A)
B=
2.0000
Another way to find the determinants of only 2 × 2 and 3 × 3 matrices
can be found easily and quickly using diagonals (or direct evaluation). For
a 2 × 2 matrix, the determinant can be obtained by forming the product of
the entries on the line from left to right and subtracting from this number
the product of the entries on the line from right to left. For a matrix of
size 3 × 3, the diagonals of an array consisting of the matrix with the first
two columns added to the right are used. Then the determinant can be
obtained by forming the sum of the products of the entries on the lines
from left to right, and subtracting from this number the products of the
entries on the lines from right to left, as shown in Figure (1.2).
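The diagonal rule is easy to express directly; a sketch comparing it with MATLAB's det, using the 3 × 3 matrix of the minors example below:

>> A = [2 3 -1; 5 3 2; 4 -2 4];
>> d = A(1,1)*A(2,2)*A(3,3) + A(1,2)*A(2,3)*A(3,1) + A(1,3)*A(2,1)*A(3,2) ...
     - A(1,3)*A(2,2)*A(3,1) - A(1,1)*A(2,3)*A(3,2) - A(1,2)*A(2,1)*A(3,3);
>> [d det(A)]   % both entries equal 18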
For example, for the matrix

$$A = \begin{bmatrix} 2 & 3 & -1 \\ 5 & 3 & 2 \\ 4 & -2 & 4 \end{bmatrix},$$

the minor M11 is obtained by deleting the first row and the first column of A, i.e.,

$$M_{11} = \begin{vmatrix} 3 & 2 \\ -2 & 4 \end{vmatrix} = (3)(4) - (-2)(2) = 12 + 4 = 16.$$
Similarly, we can find the other possible minors of the given matrix as follows:

$$M_{12} = \begin{vmatrix} 5 & 2 \\ 4 & 4 \end{vmatrix} = 20 - 8 = 12, \qquad M_{13} = \begin{vmatrix} 5 & 3 \\ 4 & -2 \end{vmatrix} = -10 - 12 = -22,$$
$$M_{21} = \begin{vmatrix} 3 & -1 \\ -2 & 4 \end{vmatrix} = 12 - 2 = 10, \qquad M_{22} = \begin{vmatrix} 2 & -1 \\ 4 & 4 \end{vmatrix} = 8 + 4 = 12,$$
$$M_{23} = \begin{vmatrix} 2 & 3 \\ 4 & -2 \end{vmatrix} = -4 - 12 = -16, \qquad M_{31} = \begin{vmatrix} 3 & -1 \\ 3 & 2 \end{vmatrix} = 6 + 3 = 9,$$
$$M_{32} = \begin{vmatrix} 2 & -1 \\ 5 & 2 \end{vmatrix} = 4 + 5 = 9, \qquad M_{33} = \begin{vmatrix} 2 & 3 \\ 5 & 3 \end{vmatrix} = 6 - 15 = -9,$$
>> A = [2 3 -1; 5 3 2; 4 -2 4];
>> CofA = cofactor(A,1,1);
>> CofA = cofactor(A,1,2);
>> CofA = cofactor(A,1,3);
>> CofA = cofactor(A,2,1);
>> CofA = cofactor(A,2,2);
>> CofA = cofactor(A,2,3);
>> CofA = cofactor(A,3,1);
>> CofA = cofactor(A,3,2);
>> CofA = cofactor(A,3,3);
Program 1.2
MATLAB m-file for Finding Minors and Cofactors of a Matrix
function CofA = cofactor(A,i,j)
% Cofactor of entry (i,j): the signed determinant of the
% submatrix obtained by deleting row i and column j.
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
A1 = A([1:i-1,i+1:n],[1:j-1,j+1:n]);
Minor = det(A1);
CofA = (-1)^(i+j)*Minor;
$$\det(A) = \sum a_{ij}A_{ij},$$

where the summation is on i for any fixed value of the jth column (1 ≤ j ≤
n), or on j for any fixed value of the ith row (1 ≤ i ≤ n), and Aij is the
cofactor of the element aij. •
Example 1.6 Find the minors and cofactors of the matrix A and use them
to evaluate the determinant of the matrix
$$A = \begin{bmatrix} 3 & 1 & -4 \\ 2 & 5 & 6 \\ 1 & 4 & 8 \end{bmatrix}.$$
From these values of the minors, we can calculate the cofactors of the
elements of the given matrix as follows:
Now by using the cofactor expansion along the first row, we can find
the determinant of the matrix as follows:
>> A = [3 1 -4; 2 5 6; 1 4 8];
>> DetA = CofFexp(A);
Program 1.3
MATLAB m-file for Finding the Determinant of a Matrix by Cofactor Expansion
function DetA = CofFexp(A)
% Cofactor expansion of det(A) along the first row;
% uses the cofactor function of Program 1.2.
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
a = A(1,:); c = [];
for i=1:n
  c1i = cofactor(A,1,i);
  c = [c; c1i];
end
DetA = a*c;
Then det(A) can be computed as

$$\det(A) = \sum_{j=1}^{n} a_{ij}A_{ij},$$

which is called the cofactor expansion along the ith row, and also as

$$\det(A) = \sum_{i=1}^{n} a_{ij}A_{ij},$$

which is called the cofactor expansion along the jth column. This is called
the Laplace expansion theorem. •
Note that the cofactor and minor of an element aij differ only in sign,
i.e., Aij = ±Mij. A quick way to determine whether to use the + or −
is to use the fact that the sign relating Aij and Mij is in the ith row and
jth column of the checkerboard array
+ − + − + ···
− + − + − ···
+ − + − + ··· .
− + − + − ···
.. .. .. .. .. ..
. . . . . .
For example, A11 = M11 , A21 = −M21 , A12 = −M12 , A22 = M22 , and so on.
If A is any n × n matrix and Aij is the cofactor of aij, then the matrix

$$\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$$
is called the matrix of cofactors from A. For example, the cofactors of
the matrix

$$A = \begin{bmatrix} 3 & 2 & -1 \\ 1 & 6 & 3 \\ 2 & -4 & 0 \end{bmatrix}$$

can be calculated as follows:
Thus, the matrix of cofactors is

$$\begin{bmatrix} 12 & 6 & -16 \\ 4 & 2 & 16 \\ 12 & -10 & 16 \end{bmatrix}.$$
The following are special properties, which will be helpful in reducing the
amount of work involved in evaluating determinants.
Let A be an n × n matrix:
det(B) = 8 = det(A).
then
det(A) = (3)(4)(5) = 60.
11. The determinant of the kth power of a matrix A is equal to the kth
power of the determinant of the matrix A, i.e., det(A^k) = (det(A))^k.
For example, if

$$A = \begin{bmatrix} 2 & -2 & 0 \\ 2 & 3 & -1 \\ 1 & 0 & 1 \end{bmatrix},$$

then det(A) = 12, and for the matrix

$$B = A^3 = \begin{bmatrix} -18 & -30 & 12 \\ 24 & -3 & -9 \\ 3 & -12 & 3 \end{bmatrix},$$

obtained by taking the cubic power of the matrix A, we have

det(B) = det(A³) = (det(A))³ = (12)³ = 1728.
Example 1.8 Find all the values of α for which det(A) = 0, where
$$A = \begin{bmatrix} \alpha-3 & 1 & 0 \\ 0 & \alpha-1 & 1 \\ 0 & 2 & \alpha \end{bmatrix}.$$
Solution. We find the determinant of the given matrix by using the co-
factor expansion along the first row, so we compute
$$|A| = (\alpha-3)\left[(\alpha-1)\alpha - 2\right] = (\alpha-3)(\alpha+1)(\alpha-2) = 0,$$
which gives
α = −1, α = 2, α = 3,
the required values of α for which det(A) = 0. •
Solution. Since

$$\begin{vmatrix} 4\alpha & \alpha \\ 1 & \alpha+1 \end{vmatrix} = 4\alpha(\alpha+1) - \alpha,$$

which is equivalent to

$$\begin{vmatrix} 4\alpha & \alpha \\ 1 & \alpha+1 \end{vmatrix} = 4\alpha^2 + 3\alpha.$$
Also,

$$\begin{vmatrix} 3 & -1 & 0 \\ 0 & \alpha & -2 \\ -1 & 3 & \alpha+1 \end{vmatrix} = 3[\alpha(\alpha+1)+6] - (-1)[(0)(\alpha+1) - 2] + 0 = 3\alpha^2 + 3\alpha + 16.$$

Given that

$$\begin{vmatrix} 4\alpha & \alpha \\ 1 & \alpha+1 \end{vmatrix} = \begin{vmatrix} 3 & -1 & 0 \\ 0 & \alpha & -2 \\ -1 & 3 & \alpha+1 \end{vmatrix},$$

we get

$$4\alpha^2 + 3\alpha = 3\alpha^2 + 3\alpha + 16.$$
Simplifying this quadratic equation, we have

α² = 16, or α² − 16 = 0,

which gives

α = −4 and α = 4,

the required values of α. •
if

$$\det\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = 4.$$
or

$$|A| = (-5)(-1)(2)(-1)\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}.$$
Since it is given that

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = 4,$$

we have

|A| = (−5)(−1)(2)(−1)(4) = −40,

the required determinant of the given matrix. •
One can easily transform the given determinant into upper-triangular form
by using the following row operations:
1. Add a multiple of one row to another row, and this will not affect
the determinant.
Now to create the zeros below the main diagonal, column by column, we
do as follows:
Replace the second row of the determinant with the sum of itself and (−6)
times the first row of the determinant and then replace the third row of
the determinant with the sum of itself and (3) times the first row of the
determinant, which gives
$$(3)\begin{vmatrix} 1 & 2 & 3 \\ 0 & -10 & -25 \\ 0 & 7 & 8 \end{vmatrix}.$$

Multiplying row 2 of the determinant by −1/10 gives

$$(3)(-10)\begin{vmatrix} 1 & 2 & 3 \\ 0 & 1 & 5/2 \\ 0 & 7 & 8 \end{vmatrix}.$$
Replacing the third row of the determinant with the sum of itself and (−7)
times the second row of the determinant, we obtain
$$(3)(-10)\begin{vmatrix} 1 & 2 & 3 \\ 0 & 1 & 5/2 \\ 0 & 0 & -19/2 \end{vmatrix} = (3)(-10)(1)(1)(-19/2) = 285,$$
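As a quick check in MATLAB (a sketch assuming the original matrix was A = [3 6 9; 6 2 −7; −3 1 −1], which is what undoing the row operations above implies):

>> A = [3 6 9; 6 2 -7; -3 1 -1];
>> det(A)   % returns 285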
Example 1.12 For what values of α does the following matrix have an
inverse?
$$A = \begin{bmatrix} 1 & 0 & \alpha \\ 2 & 2 & 1 \\ 0 & 2\alpha & 1 \end{bmatrix}.$$
Solution. We find the determinant of the given matrix by using cofactor
expansion along the first row as follows:
which is equal to

$$|A| = (1)C_{11} + (0)C_{12} + (\alpha)C_{13},$$

where

$$C_{11} = (-1)^{1+1}M_{11} = M_{11} = \begin{vmatrix} 2 & 1 \\ 2\alpha & 1 \end{vmatrix} = 2 - 2\alpha,$$
$$C_{13} = (-1)^{1+3}M_{13} = M_{13} = \begin{vmatrix} 2 & 2 \\ 0 & 2\alpha \end{vmatrix} = 4\alpha.$$
Thus,

|A| = (1)(2 − 2α) + (α)(4α) = 2 − 2α + 4α².

From Theorem 1.9 we know that the matrix has an inverse if det(A) ≠ 0, so we need

2 − 2α + 4α² ≠ 0,

which holds for every real α, since the quadratic 4α² − 2α + 2 has a negative
discriminant ((−2)² − 4(4)(2) = −28) and a positive leading coefficient. Hence,
the given matrix has an inverse for all real values of α. •
Example 1.13 Use the adjoint method to compute the inverse of the fol-
lowing matrix:
$$A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & -1 & 1 \\ 1 & 2 & 2 \end{bmatrix}.$$
Also, find the inverse and determinant of the adjoint matrix.
Solution. The cofactors of the given matrix are

$$C_{11} = +\begin{vmatrix} -1 & 1 \\ 2 & 2 \end{vmatrix} = -4, \quad C_{12} = -\begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = -3, \quad C_{13} = +\begin{vmatrix} 2 & -1 \\ 1 & 2 \end{vmatrix} = 5,$$
$$C_{21} = -\begin{vmatrix} 2 & -1 \\ 2 & 2 \end{vmatrix} = -6, \quad C_{22} = +\begin{vmatrix} 1 & -1 \\ 1 & 2 \end{vmatrix} = 3, \quad C_{23} = -\begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix} = 0,$$
$$C_{31} = +\begin{vmatrix} 2 & -1 \\ -1 & 1 \end{vmatrix} = 1, \quad C_{32} = -\begin{vmatrix} 1 & -1 \\ 2 & 1 \end{vmatrix} = -3, \quad C_{33} = +\begin{vmatrix} 1 & 2 \\ 2 & -1 \end{vmatrix} = -5.$$
Thus, the cofactor matrix has the form

$$\begin{bmatrix} -4 & -3 & 5 \\ -6 & 3 & 0 \\ 1 & -3 & -5 \end{bmatrix},$$

and the adjoint is the transpose of the cofactor matrix

$$\operatorname{adj}(A) = \begin{bmatrix} -4 & -3 & 5 \\ -6 & 3 & 0 \\ 1 & -3 & -5 \end{bmatrix}^T = \begin{bmatrix} -4 & -6 & 1 \\ -3 & 3 & -3 \\ 5 & 0 & -5 \end{bmatrix}.$$
To get the adjoint of the matrix of Example 1.13, we use the MATLAB
Command Window as follows:
>> A = [1 2 -1; 2 -1 1; 1 2 2];
>> AdjA = Adjoint(A);
Program 1.4
MATLAB m-file for Finding the Adjoint of a Matrix
function AdjA = Adjoint(A)
% Builds the adjoint (adjugate): the transpose of the cofactor matrix.
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
A1 = [];
for i = 1:n
  for j = 1:n
    A1 = [A1; cofactor(A,i,j)];
  end
end
AdjA = reshape(A1,n,n);   % column-wise reshape yields the transpose
Then by using Theorem 1.9 we can have the inverse of the matrix as
follows:

$$A^{-1} = \frac{\operatorname{adj}(A)}{\det(A)} = -\frac{1}{15}\begin{bmatrix} -4 & -6 & 1 \\ -3 & 3 & -3 \\ 5 & 0 & -5 \end{bmatrix} = \begin{bmatrix} 4/15 & 2/5 & -1/15 \\ 1/5 & -1/5 & 1/5 \\ -1/3 & 0 & 1/3 \end{bmatrix}.$$

Using Theorem 1.9 we can compute the inverse of the adjoint matrix as

$$(\operatorname{adj}(A))^{-1} = \frac{A}{\det(A)} = \begin{bmatrix} -1/15 & -2/15 & 1/15 \\ -2/15 & 1/15 & -1/15 \\ -1/15 & -2/15 & -2/15 \end{bmatrix},$$
by using the adjoint and the determinant of the matrix in the MATLAB
Command Window as:
>> A = [1 -1 1 2; 1 0 1 3; 0 0 2 4; 1 1 -1 1];
The cofactors Aij of elements of the given matrix A can also be found
directly by using the MATLAB Command Window as follows:
>> B =
[A11 A12 A13 A14;
A21 A22 A23 A24;
A31 A32 A33 A34;
A41 A42 A43 A44]
which gives
B=
−2 −4 −4 2
6 6 8 −4
−3 −2 −3 2
−2 −2 −4 2
>> adjA = B'
adjA =
−2 6 −3 −2
−4 6 −2 −2
−4 8 −3 −4
2 −4 2 2
The determinant of the matrix can be obtained as:
>> det(A)
ans =
2
The inverse of A is the adjoint matrix divided by the determinant of A.
>> inv(A)
$$\det(A^2 B^{-1} A^T B^3) = 432.$$
The system of linear equations (1.14) can be written as the single matrix
equation
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \qquad (1.15)$$
If we compute the product of the two matrices on the left-hand side of
(1.15), we have
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \qquad (1.16)$$
But the two matrices are equal if and only if their corresponding elements
are equal. Hence, the single matrix equation (1.15) is equivalent to the
system of the linear equations (1.14). If we define
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad 0 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},$$

the coefficient matrix, the column matrix of unknowns, and the zero column
matrix, respectively, then the system (1.14) can be written very compactly as

$$Ax = 0, \qquad (1.17)$$
which is called the matrix form of the homogeneous system. •
[A|0].
x1 + x2 + 2x3 = 0
2x1 + 3x2 + 4x3 = 0
3x1 + 4x2 + 7x3 = 0.
Next, using the elementary row operations: row3 – row2 and row1 – row2,
we get
$$\equiv \left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right].$$
Thus,
x1 = 0, x2 = 0, x3 = 0, i.e., the given system has only the trivial solution.
x1 + 2x2 + x3 = 0
2x1 − 3x2 + 4x3 = 0.
$$\sim \left[\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 0 & 1 & -2/7 & 0 \end{array}\right].$$
Using the second row to eliminate the second entry of the first row gives the equivalent system

x1 + (11/7)x3 = 0
x2 − (2/7)x3 = 0,

and from it we get x2 = (2/7)x3 and x1 = −(11/7)x3. Taking x3 = t, for
t ∈ R and t ≠ 0, we get the nontrivial solution

$$[x_1, x_2, x_3]^T = \left[-\tfrac{11}{7}t,\ \tfrac{2}{7}t,\ t\right]^T.$$
Thus, the given system has infinitely many solutions, and this is to be
expected because the given system has three unknowns and only two equa-
tions. •
Example 1.17 For what values of α does the homogeneous linear system
(α − 2)x1 + x2 = 0
x1 + (α − 2)x2 = 0
have nontrivial solutions?

Solution. Applying forward elimination (after interchanging the two rows) to the augmented matrix of the system gives

$$\sim \left[\begin{array}{cc|c} 1 & \alpha-2 & 0 \\ 0 & 1-(\alpha-2)^2 & 0 \end{array}\right].$$
Using backward substitution, we obtain

x1 + (α − 2)x2 = 0
[1 − (α − 2)²]x2 = 0.

Notice that if x2 = 0, then x1 = 0, and the given system has only the trivial
solution, so let x2 ≠ 0. This implies that
1 − (α − 2)² = 0
1 − α² + 4α − 4 = 0
α² − 4α + 3 = 0
(α − 3)(α − 1) = 0,
which gives
α = 1 and α = 3.
Notice that for these values of α, the two equations of the given system
become identical, i.e.,
(for α = 1)
−x1 + x2 = 0
x1 − x2 = 0,
and (for α = 3)
x1 + x2 = 0
x1 + x2 = 0.
Thus, the given system has nontrivial solutions (infinitely many solu-
tions) for α = 1 and α = 3. •
The following basic theorems on the solvability of linear systems are
proved in linear algebra.
A⁻¹Ax = A⁻¹b
Ix = A⁻¹b

or

x = A⁻¹b. (1.18)
If A is a square invertible matrix, there exists a sequence of elementary
row operations that carry A to the identity matrix I of the same size, i.e.,
A −→ I. This same sequence of row operations carries I to A−1 , i.e.,
I −→ A−1 . This can also be written as
[A|I] −→ [I|A−1 ].
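This reduction can be carried out in one MATLAB call using rref; a sketch with the coefficient matrix of Example 1.18 below:

>> A = [1 2 0; -2 1 2; -1 1 1];
>> R = rref([A eye(3)]);
>> Ainv = R(:, 4:6)   % the right-hand block equals inv(A)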
Example 1.18 Use the matrix inversion method to find the solution of
the following linear system:
x1 + 2x2 = 1
−2x1 + x2 + 2x3 = 1
−x1 + x2 + x3 = 1.
Solution. Form the augmented matrix [A|I]. Multiply the first row by −2 and −1 and then, subtracting the results from the second and third rows, respectively, we get

$$\sim \left[\begin{array}{ccc|ccc} 1 & 2 & 0 & 1 & 0 & 0 \\ 0 & 5 & 2 & 2 & 1 & 0 \\ 0 & 3 & 1 & 1 & 0 & 1 \end{array}\right].$$
Dividing the second row by 5, then multiplying it by 2 and 3 and subtracting the results from the first and third rows, respectively, we get

$$\sim \left[\begin{array}{ccc|ccc} 1 & 0 & -4/5 & 1/5 & -2/5 & 0 \\ 0 & 1 & 2/5 & 2/5 & 1/5 & 0 \\ 0 & 0 & -1/5 & -1/5 & -3/5 & 1 \end{array}\right].$$
Multiplying the third row by −5 gives

$$\sim \left[\begin{array}{ccc|ccc} 1 & 0 & -4/5 & 1/5 & -2/5 & 0 \\ 0 & 1 & 2/5 & 2/5 & 1/5 & 0 \\ 0 & 0 & 1 & 1 & 3 & -5 \end{array}\right].$$
Multiplying the third row by 2/5 and −4/5 and then subtracting the results
from the second and first rows, respectively, we get

$$\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & -4 \\ 0 & 1 & 0 & 0 & -1 & 2 \\ 0 & 0 & 1 & 1 & 3 & -5 \end{array}\right].$$
Thus, the inverse of the given matrix is

$$A^{-1} = \begin{bmatrix} 1 & 2 & -4 \\ 0 & -1 & 2 \\ 1 & 3 & -5 \end{bmatrix},$$

and the unique solution of the system can be computed as

$$x = A^{-1}b = \begin{bmatrix} 1 & 2 & -4 \\ 0 & -1 & 2 \\ 1 & 3 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix},$$
i.e.,
x1 = −1, x2 = 1, x3 = −1,
the solution of the given system by the matrix inversion method. •
Thus, when the matrix inverse A−1 of the coefficient matrix A is com-
puted, the solution vector x of the system (1.6) is simply the product of
inverse matrix A−1 and the right-hand side vector b.
>> A = [1 2 0; -2 1 2; -1 1 1];
>> b = [1; 1; 1];
>> x = A \ b
x =
-1.0000
1.0000
-1.0000
Not all matrices have inverses. Singular matrices don’t have inverses
and thus the corresponding systems of equations do not have unique solu-
tions. The inverse of a matrix can also be computed by using the following
numerical methods for linear systems: Gauss-elimination method, Gauss–
Jordan method, and LU decomposition method. But the best and simplest
method for finding the inverse of a matrix is to perform the Gauss–Jordan
method on the augmented matrix with an identity matrix of the same size.
E3E2E1A = I,

and so

A = (E3E2E1)⁻¹.

This means that

$$A = E_1^{-1}E_2^{-1}E_3^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \qquad •$$
4. It has n pivots. •
In the following, we will discuss the direct methods for solving the linear
systems.
In a similar way, one can use Cramer's rule for a set of n linear equations
as follows:

$$x_i = \frac{|A_i|}{|A|}, \qquad i = 1, 2, 3, \ldots, n, \qquad (1.19)$$
i.e., the solution for any one of the unknowns xi in a set of simultaneous
equations is equal to the ratio of two determinants; the determinant in
the denominator is the determinant of the coefficient matrix A, while the
determinant in the numerator is the same determinant with the ith column
replaced by the elements from the right-hand sides of the equation.
5x1 + x3 + 2x4 = 3
x1 + x2 + 3x3 + x4 = 5
x1 + x2 + 2x4 = 1
x1 + x2 + x3 + x4 = −1.
gives

$$A = \begin{bmatrix} 5 & 0 & 1 & 2 \\ 1 & 1 & 3 & 1 \\ 1 & 1 & 0 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix} 3 \\ 5 \\ 1 \\ -1 \end{bmatrix}.$$
The determinant of the matrix A can be calculated by using cofactor
expansion as follows:

$$|A| = \begin{vmatrix} 5 & 0 & 1 & 2 \\ 1 & 1 & 3 & 1 \\ 1 & 1 & 0 & 2 \\ 1 & 1 & 1 & 1 \end{vmatrix} = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} + a_{14}C_{14} = 5(2) + 0(-2) + 1(0) + 2(0) = 10 \neq 0,$$
which shows that the given matrix A is nonsingular. Then the matrices
A1, A2, A3, and A4 can be computed as

$$A_1 = \begin{bmatrix} 3 & 0 & 1 & 2 \\ 5 & 1 & 3 & 1 \\ 1 & 1 & 0 & 2 \\ -1 & 1 & 1 & 1 \end{bmatrix}, \qquad A_2 = \begin{bmatrix} 5 & 3 & 1 & 2 \\ 1 & 5 & 3 & 1 \\ 1 & 1 & 0 & 2 \\ 1 & -1 & 1 & 1 \end{bmatrix},$$

$$A_3 = \begin{bmatrix} 5 & 0 & 3 & 2 \\ 1 & 1 & 5 & 1 \\ 1 & 1 & 1 & 2 \\ 1 & 1 & -1 & 1 \end{bmatrix}, \qquad A_4 = \begin{bmatrix} 5 & 0 & 1 & 3 \\ 1 & 1 & 3 & 5 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & -1 \end{bmatrix}.$$
The determinants of the matrices A1, A2, A3, and A4 can be computed in
the same way as |A|, giving |A1| = −20, |A2| = −70, |A3| = 30, and
|A4| = 50. Thus, by Cramer's rule,

x1 = |A1|/|A| = −20/10 = −2
x2 = |A2|/|A| = −70/10 = −7
x3 = |A3|/|A| = 30/10 = 3
x4 = |A4|/|A| = 50/10 = 5,
which is the required solution of the given system. •
Solving a system of n linear equations by Cramer's rule will require
N = (n + 1)n³/3 multiplications. Therefore, this rule is much less efficient
for large values of n and is almost never used for computational purposes.
When the number of equations is large (n > 4), other methods of solution
are more desirable.
Use MATLAB commands to find the solution of the above linear sys-
tem by Cramer’s rule as follows:
>> A = [5 0 1 2; 1 1 3 1; 1 1 0 2; 1 1 1 1];
>> b = [3; 5; 1; -1];
>> A1 = [b A(:,2:4)];
>> x1 = det(A1)/det(A);
>> A2 = [A(:,1) b A(:,3:4)];
>> x2 = det(A2)/det(A);
>> A3 = [A(:,1:2) b A(:,4)];
>> x3 = det(A3)/det(A);
>> A4 = [A(:,1:3) b];
>> x4 = det(A4)/det(A);
The m-file CRule.m and the following MATLAB commands can be used
to generate the solution of Example 1.21 as follows:
>> A = [5 0 1 2; 1 1 3 1; 1 1 0 2; 1 1 1 1];
>> b = [3; 5; 1; -1];
>> sol = CRule(A, b);
Program 1.5
MATLAB m-file for Cramer's Rule for a Linear System
function sol=CRule(A,b)
% Solves Ax = b by Cramer's rule: b replaces column i of A in turn.
[m,n] = size(A);
if m ~= n, error('Matrix is not square.'); end
if det(A) == 0, error('Matrix is singular.'); end
for i = 1:n
  B = A; B(:,i) = b;
  sol(i) = det(B)/det(A);
end
sol = sol';
Forward Elimination

The first equation of the system is taken as the first pivotal equation, with the first pivot element a11. Then the first equation times the multiples m_{i1} = a_{i1}/a_{11}, i = 2, 3, \ldots, n, is subtracted from the ith equation to eliminate the first variable x1. Eliminating the second variable x2 from the last n − 2 equations in the same way produces an equivalent system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)} \\ a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n &= b_3^{(2)} \\ &\;\;\vdots \\ a_{n3}^{(2)}x_3 + \cdots + a_{nn}^{(2)}x_n &= b_n^{(2)}. \end{aligned} \qquad (1.24)$$
Backward Substitution
After the triangular set of equations has been obtained, the last equation of
system (1.26) yields the value of xn directly. The value is then substituted
into the equation next to the last one of the system (1.26) to obtain a value
of x_{n−1}, which is, in turn, used along with the value of x_n in the second-to-last
equation to obtain x_{n−2}, and so on, until all of the unknowns have been found.
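A minimal backward-substitution sketch for an upper-triangular system Ux = c (here using the triangular system obtained in Example 1.22 below):

>> U = [1 2 1; 0 1 1; 0 0 2]; c = [2; -3; 6];
>> n = length(c); x = zeros(n,1);
>> x(n) = c(n)/U(n,n);
>> for i = n-1:-1:1
     x(i) = (c(i) - U(i,i+1:n)*x(i+1:n))/U(i,i);
   end
>> x   % returns [11; -6; 3]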
The Gaussian elimination can be carried out by writing only the co-
efficients and the right-hand side terms in a matrix form, the augmented
matrix form. Indeed, this is exactly what a computer program for Gaus-
sian elimination does. Even for hand calculations, the augmented matrix
form is more convenient than writing all sets of equations. The augmented
matrix is formed as follows:
$$\left[\begin{array}{ccccc|c} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} & b_2 \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} & b_3 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} & b_n \end{array}\right]. \qquad (1.28)$$
$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 2 \\ 0 & 1 & 1 & -3 \\ 0 & 1 & 3 & 3 \end{array}\right].$$
Since a_{22}^{(1)} = 1 ≠ 0, we eliminate the entry in the a_{32}^{(1)} position by subtracting
the multiple m32 = 1/1 = 1 of the second row from the third row to get
$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 2 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & 2 & 6 \end{array}\right].$$
Obviously, the original set of equations has been transformed to an
upper-triangular form. Since all the diagonal elements of the obtaining
upper-triangular matrix are nonzero, the coefficient matrix of the given
system is nonsingular, and the given system has a unique solution. Now
expressing the set in algebraic form yields
x1 + 2x2 + x3 = 2
x2 + x3 = −3
2x3 = 6.
Now using backward substitution, we get
2x3 = 6 gives x3 = 3
x2 = −x3 − 3 = −(3) − 3 = −6 gives x2 = −6
x1 = 2 − 2x2 − x3 = 2 − 2(−6) − 3 gives x1 = 11,
which is the required solution of the given system. •
The above results can be obtained using MATLAB commands as fol-
lows:
>> B = [1 2 1 2; 2 5 3 1; 1 3 4 5];
%B = [A|b] = Augmented matrix
>> x = WP(B);
>> disp(x)
Program 1.6
MATLAB m-file for the Simple Gaussian Elimination Method
function x=WP(B)
% B is the augmented matrix [A|b]: forward elimination,
% then backward substitution; no pivoting is performed.
[n,t]=size(B); U=B;
for k=1:n-1
  for i=k:n-1
    m=U(i+1,k)/U(k,k);
    for j=1:t, U(i+1,j)=U(i+1,j)-m*U(k,j); end
  end
end
x(n,1)=U(n,t)/U(n,n);
for i=n-1:-1:1
  s=0;
  for k=n:-1:i+1, s = s + U(i,k)*x(k,1); end
  x(i,1)=(U(i,t)-s)/U(i,i);
end
Example 1.23 Solve the following linear system using the simple Gaus-
sian elimination method:
x2 + x3 = 1
x1 + 2x2 + 2x3 = 1
2x1 + x2 + 2x3 = 3.
To solve this system, the simple Gaussian elimination method will fail
immediately because the element in the first row on the leading diagonal,
the pivot, is zero. Thus, it is impossible to divide that row by the pivot
element. The difficulty is removed by interchanging the first two rows;
forward elimination then proceeds as usual, and backward substitution gives

x3 = 4, x2 = 1 − x3 = 1 − 4 = −3, x1 = 1 − 2x2 − 2x3 = −1.
Example 1.24 Solve the following linear system using the simple Gaus-
sian elimination method:
x1 + x2 + x3 = 3
2x1 + 2x2 + 3x3 = 7
x1 + 2x2 + 3x3 = 6.
The first elimination step is to eliminate the elements a21 = 2 and a31 = 1
from the second and third rows by subtracting the multiples m21 = 2/1 = 2
and m31 = 1/1 = 1 of row 1 from row 2 and row 3, respectively, which gives

$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{array}\right].$$
We finished the second column. So the third row of the equivalent upper-
triangular system is
0x1 + 0x2 + 0x3 = b − a. (1.31)
First, if b = a, then (1.31) places no constraint on the unknowns x1, x2, and
x3, and the upper-triangular system represents only two nontrivial equations,
namely,

2x1 − x2 + 3x3 = 1
4x2 + 4x3 = 2a − 2,
in the three unknowns. As a result, one of the unknowns can be chosen
arbitrarily, say x3 = x3*; then x2* and x1* can be obtained by using backward
substitution:

$$x_2^* = \frac{a}{2} - \frac{1}{2} - x_3^*; \qquad x_1^* = \frac{1}{2}\left(1 + \frac{a}{2} - \frac{1}{2} - 4x_3^*\right).$$

Hence,

$$x^* = \left[\frac{1}{2}\left(1 + \frac{a}{2} - \frac{1}{2} - 4x_3^*\right),\ \frac{a}{2} - \frac{1}{2} - x_3^*,\ x_3^*\right]^T$$
is a solution of the given system for any value of x3* and any real value of a.
Hence, the given linear system is consistent (it has infinitely many solutions).
Then, subtracting the multiple m32 = 4/2 = 2 of the second row from the third row,
we get

$$\left[\begin{array}{ccc|c} 1 & 1 & -2 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & -1 & 0 \end{array}\right].$$
Obviously, the original set of equations has been transformed to an
upper-triangular form. Thus, the system has the unique solution [0, 0, 0]T ,
i.e., the system has only the trivial solution. •
Example 1.27 Find the value of k for which the following homogeneous
linear system has nontrivial solutions by using the simple Gaussian elimi-
nation method:
2x1 − 3x2 + 5x3 = 0
−2x1 + 6x2 − x3 = 0
4x1 − 9x2 + kx3 = 0.
Solution. The process begins with the augmented matrix form

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ -2 & 6 & -1 & 0 \\ 4 & -9 & k & 0 \end{array}\right],$$
which gives

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ 0 & 3 & 4 & 0 \\ 0 & -3 & k-10 & 0 \end{array}\right].$$
Also, by using the multiple m32 = −3/3 = −1, we get

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ 0 & 3 & 4 & 0 \\ 0 & 0 & k-6 & 0 \end{array}\right].$$

The system has nontrivial solutions when the third pivot vanishes, i.e.,
k − 6 = 0, which gives k = 6.
Example 1.28 Use the simple Gaussian elimination method to find all
the values of α which make the following matrix singular:
$$A = \begin{bmatrix} 1 & -1 & \alpha \\ 2 & 2 & 1 \\ 0 & \alpha & -1.5 \end{bmatrix}.$$
Solution. Apply the forward elimination step of the simple Gaussian elimination
on the given matrix A and eliminate the element a21 by subtracting
from the second row the appropriate multiple of the first row. In this case,
the multiple is m21 = 2/1 = 2, which gives

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 4 & 1-2\alpha \\ 0 & \alpha & -1.5 \end{bmatrix}.$$
We finished the first elimination step. The second elimination step is
to eliminate the element a_{32}^{(1)} = α by subtracting the multiple m32 = α/4 of row 2
from row 3, which gives

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 4 & 1-2\alpha \\ 0 & 0 & -1.5 - \dfrac{\alpha(1-2\alpha)}{4} \end{bmatrix}.$$
To show that the given matrix is singular, we have to set the third diagonal
element equal to zero (by Theorem 1.19), i.e.,

$$-1.5 - \frac{\alpha(1-2\alpha)}{4} = 0.$$
After simplifying, we obtain
2α² − α − 6 = 0.
Solving the above quadratic equation, we get
α = −3/2 and α = 2,
which are the possible values of α, which make the given matrix singular.•
Example 1.29 Use the smallest positive integer value of α to find the
unique solution of the linear system Ax = [1, 6, −4]^T by the simple Gaussian
elimination method, where

$$A = \begin{bmatrix} 1 & -1 & \alpha \\ 2 & 2 & 1 \\ 0 & \alpha & -1.5 \end{bmatrix}.$$
Solution. Since we know from Example 1.28 that the given matrix A is
singular when α = −3/2 and α = 2, to find the unique solution we take the
smallest positive integer value α = 1 and consider the augmented matrix

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 2 & 2 & 1 & 6 \\ 0 & 1 & -1.5 & -4 \end{array}\right].$$
Applying the forward elimination step of the simple Gaussian elimination
and eliminating the element a21 by subtracting from the second row the
appropriate multiple m21 = 2 of the first row gives

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 0 & 4 & -1 & 4 \\ 0 & 1 & -1.5 & -4 \end{array}\right].$$
The second elimination step is to eliminate the element a_{32}^{(1)} = 1 by subtracting
the multiple m32 = 1/4 of row 2 from row 3, which gives

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 0 & 4 & -1 & 4 \\ 0 & 0 & -5/4 & -5 \end{array}\right].$$

Expressing the set in algebraic form yields
x1 − x2 + x3 = 1
4x2 − x3 = 4
−(5/4)x3 = −5.

Using backward substitution gives

x3 = 4, x2 = 2, x1 = −1,

the required solution of the given system. •
Note that the inverse of a nonsingular matrix A can be easily determined
by using the simple Gaussian elimination method. Here, we consider the
augmented matrix as a combination of the given matrix A and the identity
matrix I (the same size as A). To find the inverse matrix B = A⁻¹, we
solve n linear systems: the jth column of B is the solution of the linear
system whose right-hand side is the jth column of the matrix I.
Example 1.30 Use the simple Gaussian elimination method to find the
inverse of the following matrix:
$$A = \begin{bmatrix} 2 & -1 & 3 \\ 4 & -1 & 6 \\ 2 & -3 & 4 \end{bmatrix}.$$

Solution. Suppose that the inverse A⁻¹ = B of the given matrix exists
and let

$$AB = \begin{bmatrix} 2 & -1 & 3 \\ 4 & -1 & 6 \\ 2 & -3 & 4 \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I.$$

Solving the first linear system (with right-hand side the first column of I)
by forward elimination and backward substitution gives

b11 = 7, b21 = −2, b31 = −5.
Similarly, the solution of the second linear system, after forward elimination,

$$\begin{bmatrix} 2 & -1 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} b_{12} \\ b_{22} \\ b_{32} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix},$$

gives

b12 = −5/2, b22 = 1, b32 = 2.
The third system gives

b13 = −3/2, b23 = 0, b33 = 1.

Hence, the elements of the inverse matrix B are

$$B = A^{-1} = \begin{bmatrix} 7 & -5/2 & -3/2 \\ -2 & 1 & 0 \\ -5 & 2 & 1 \end{bmatrix}.$$
2. Check the first pivot element a11 6= 0, then move to the next step;
otherwise, interchange rows so that a11 6= 0.
3. Multiply row one by the multiplier mi1 = ai1/a11 and subtract the
result from the ith row, for i = 2, 3, . . . , n.
4. Repeat steps 2 and 3 for the remaining pivot elements until the
coefficient matrix A becomes an upper-triangular matrix U.
5. Use backward substitution to solve for xn from the nth equation,
x_n = b_n^{(n-1)}/a_{nn}^{(n-1)}, and solve for the other (n − 1) unknown variables by
using (1.27).
We finished the first elimination step. The second pivot is in the (2, 2)
position, and after eliminating the element below it, we find the triangular
form to be

$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & -1 & 1 \\ 0 & 0 & 3 \end{bmatrix}.$$
Since the number of pivots are three, the rank of the given matrix is 3. Note
that the original matrix is nonsingular since the rank of the 3 × 3 matrix
is 3. •
>> A = [1 2 4; 1 1 5; 1 1 6];
>> rank(A)
ans =
3
Note that:

rank(AB) ≤ min(rank(A), rank(B)),
rank(A + B) ≤ rank(A) + rank(B),
rank(AA^T) = rank(A) = rank(A^T A).

Although the rank of a matrix is very useful to categorize the behavior
of matrices and systems of equations, the rank of a matrix is usually not
computed. •
0.000100x1 + x2 = 1
x1 + x2 = 2,
which has the exact solution x = [1.00010, 0.99990]T . Now we solve this
system by simple Gaussian elimination. The first elimination step is to
eliminate the first variable x1 from the second equation by subtracting
multiple m21 = 10000 of the first equation from the second equation, which
gives
0.000100x1 + x2 = 1
− 10000x2 = −10000.
Partial Pivoting
Here, we develop an implementation of Gaussian elimination that utilizes
the pivoting strategy discussed above. In using Gaussian elimination by
partial pivoting (or row pivoting), the basic approach is to use the largest
(in absolute value) element on or below the diagonal in the column of
current interest as the pivotal element for elimination in the rest of that
column.
One immediate effect of this will be to force all the multiples used to be
not greater than 1 in absolute value. This will inhibit the growth of error in
the rest of the elimination phase and in subsequent backward substitution.
At stage k of forward elimination, it is necessary, therefore, to be able to
identify the largest element from |a_{kk}|, |a_{k+1,k}|, . . . , |a_{nk}|, where these a_{ik}
are the elements in the current partially triangularized coefficient matrix. If
this maximum occurs in row p, then the pth and kth rows of the augmented
matrix are interchanged and the elimination proceeds as usual. In solving
n linear equations, a total of N = n(n+1)/2 coefficients must be examined.
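The search for the pivot row at stage k can be written compactly; a sketch (A and k assumed already defined, here the coefficient matrix of Example 1.32 below at stage k = 1):

>> A = [1 1 1; 2 3 4; 4 9 16]; k = 1;
>> [~, r] = max(abs(A(k:end, k)));   % largest entry on/below the diagonal
>> p = r + k - 1;                    % its row index in A
>> A([k p], :) = A([p k], :)         % rows 1 and 3 are interchanged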
Example 1.32 Solve the following linear system using Gaussian elimina-
tion with partial pivoting:
x1 + x2 + x3 = 1
2x1 + 3x2 + 4x3 = 3
4x1 + 9x2 + 16x3 = 11.
Solution. For the first elimination step, since 4 is the largest absolute
coefficient of the first variable x1 , the first row and the third row are inter-
changed, which gives us
4x1 + 9x2 + 16x3 = 11
2x1 + 3x2 + 4x3 = 3
x1 + x2 + x3 = 1.
Eliminate the first variable x1 from the second and third rows by subtracting
the multiples m21 = 2/4 = 1/2 and m31 = 1/4 of row 1 from row 2 and row 3,
respectively, which gives

4x1 + 9x2 + 16x3 = 11
−(3/2)x2 − 4x3 = −5/2
−(5/4)x2 − 3x3 = −7/4.
For the second elimination step, −3/2 is the largest absolute coefficient of the
second variable x2, so eliminate x2 from the third row by subtracting the
multiple m32 = 5/6 of row 2 from row 3, which gives (1/3)x3 = 1/3. Using
backward substitution, we obtain

x1 = 1, x2 = −1, x3 = 1,
The following MATLAB commands will give the same results we ob-
tained in Example 1.32 of the Gaussian elimination method with partial
pivoting:
>> B = [1 1 1 1; 2 3 4 3; 4 9 16 11];
>> x = PP(B);
>> disp(x)
Program 1.7
MATLAB m-file for Gaussian Elimination by Partial Pivoting
function x=PP(B)
% B = input('input matrix in form [A|b]');
[n,t] = size(B); U = B;
for M = 1:n-1
  % find the row r with the largest |U(i,M)| on or below the diagonal
  mx(M) = abs(U(M,M)); r = M;
  for i = M+1:n
    if mx(M) < abs(U(i,M)), mx(M)=abs(U(i,M)); r = i; end
  end
  % interchange rows M and r
  rw1(1,1:t)=U(r,1:t); rw2(1,1:t)=U(M,1:t);
  U(M,1:t)=rw1; U(r,1:t)=rw2;
  for k=M+1:n
    m=U(k,M)/U(M,M);
    for j=M:t, U(k,j) = U(k,j) - m*U(M,j); end
  end
end
x(n)=U(n,t)/U(n,n);
for i=n-1:-1:1
  s=0;
  for k=n:-1:i+1, s = s + U(i,k)*x(k); end
  x(i)=(U(i,t)-s)/U(i,i);
end
1. Suppose we are about to work on the ith column of the matrix. Then
we search that portion of the ith column below and including the di-
agonal and find the element that has the largest absolute value. Let
p denote the index of the row that contains this element.
Total Pivoting
In the case of total pivoting (or complete pivoting), we search for the largest
number (in absolute value) in the entire array instead of just in the first
column, and this number is the pivot. This means that we shall probably
need to interchange the columns as well as rows. When solving a system
of equations using complete pivoting, each row interchange is equivalent to
interchanging two equations, while each column interchange is equivalent
to interchanging the two unknowns.
At the kth step, interchange both the rows and columns of the matrix
so that the largest number in the remaining matrix is used as the pivot,
i.e., after the pivoting,

$$|a_{kk}| = \max |a_{ij}|, \qquad i = k, k+1, \ldots, n, \; j = k, k+1, \ldots, n.$$
There are times when the partial pivoting procedure is inadequate.
When some rows have coefficients that are very large in comparison to
those in other rows, partial pivoting may not give a correct solution.
Then eliminate the third variable x3 from the second and third rows by
subtracting the multiples m21 = 4/16 and m31 = 1/16 of row 1 from rows 2
and 3, respectively.
For the second elimination step, 1 is the largest absolute coefficient of the
first variable x1 in the second row and third column, so the second and third
columns are interchanged. Eliminating x1 from the third row by subtracting
the multiple m32 = 3/4 of row 2 from row 3 gives
x1 = 1, x2 = −1, x3 = 1,
Program 1.8
MATLAB m-file for Gaussian Elimination by Total Pivoting
function X=TP(B)
% B = input('input matrix in form [A|b]');
[n,m]=size(B); U=B; w=zeros(n,n);
for i=1:n, N(i)=i; end            % N tracks the ordering of the unknowns
for M = 1:n-1
  mx(M)=0; r=M; c=M;
  % search the remaining submatrix for the largest entry
  for i = M:n
    for j = M:n
      if mx(M) < abs(U(i,j)), mx(M)=abs(U(i,j)); r=i; c=j; end
    end
  end
  % interchange rows M and r, and columns M and c
  rw1(1,1:m)=U(r,1:m); rw2(1,1:m)=U(M,1:m);
  U(M,1:m)=rw1; U(r,1:m)=rw2;
  cl1(1:n,1)=U(1:n,c); cl2(1:n,1)=U(1:n,M);
  U(1:n,M)=cl1(1:n,1); U(1:n,c)=cl2(1:n,1);
  p=N(M); N(M)=N(c); N(c)=p; w(M,1:n)=N;
  for k = M+1:n
    e = U(k,M)/U(M,M);
    for j = M:m, U(k,j) = U(k,j) - e*U(M,j); end
  end
end
x(n,1)=U(n,m)/U(n,n);
for i = n-1:-1:1
  s=0;
  for k = n:-1:i+1, s = s + U(i,k)*x(k,1); end
  x(i,1)=(U(i,m)-s)/U(i,i);
end
for i=1:n, X(N(i),1)=x(i,1); end  % undo the column interchanges
>> B = [1 1 1 1; 2 3 4 3; 4 9 16 11];
>> x = TP(B);
>> disp(x)
Total pivoting offers little advantage over partial pivoting and it is
significantly slower, requiring N = n(n + 1)(2n + 1)/6 elements to be examined
in total. It is rarely used in practice because interchanging columns changes
the order of the unknowns and, consequently, adds significant and usually
unjustified complexity to the program.
The Gauss–Jordan method continues the elimination until the augmented
matrix is reduced to the form

[A|b] → [I|c],

where I is the identity matrix and c is the column matrix, which represents
the possible solution of the given linear system.
Example 1.34 Solve the following linear system using the Gauss–Jordan
method:
x1 + 2x2 = 3
−x1 − 2x3 = −5
−3x1 − 5x2 + x3 = −4.
Solution. Write the given system in the augmented matrix form

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ -1 & 0 & -2 & -5 \\ -3 & -5 & 1 & -4 \end{array}\right].$$
The first elimination step is to eliminate the elements a21 = −1 and a31 = −3
by subtracting the multiples m21 = −1 and m31 = −3 of row 1 from rows 2
and 3, respectively.
Program 1.9
MATLAB m-file for the Gauss–Jordan Method
function sol=GaussJ(Ab)
% Ab is the augmented matrix [A|b]; each row is normalized and then
% used to clear its pivot column above and below the diagonal.
[m,n]=size(Ab);
for i=1:m
  Ab(i,:) = Ab(i,:)/Ab(i,i);
  for j=1:m
    if j == i, continue; end
    Ab(j,:) = Ab(j,:) - Ab(j,i)*Ab(i,:);
  end
end
sol=Ab;
3. Use the nth row to reduce the nth column to an equivalent identity
matrix column.
4. Repeat step 3 for n–1 through 2 to get the augmented matrix of the
form [I|c].
Obviously, the original augmented matrix [A|I] has been transformed to the
augmented matrix of the form [I|A⁻¹]. Hence, the solution of the linear
system can be obtained by the matrix multiplication (1.32) as

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -0.36 & -0.16 & 0.28 \\ 1.6 & 0.6 & -0.8 \\ -0.6 & -0.2 & 0.4 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ 1.4 \end{bmatrix}.$$
A = LU, (1.33)
LU x = b
or can be written as
Ly = b,
where
y = U x.
The unknown elements of matrix L and matrix U are computed by equating
corresponding elements in matrices A and LU in a systematic way. Once
the matrices L and U have been constructed, the solution of system (1.35)
can be computed in the following two steps:
Now define the three elementary matrices (each of them can be obtained by adding a multiple of row i to row j) associated with these row operations:
\[
E_1 = \begin{bmatrix} 1&0&0\\ -2&1&0\\ 0&0&1 \end{bmatrix},\quad
E_2 = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 1&0&1 \end{bmatrix},\quad
E_3 = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&-2&1 \end{bmatrix}.
\]
Then
\[
E_3E_2E_1 = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&-2&1 \end{bmatrix}
\begin{bmatrix} 1&0&0\\ 0&1&0\\ 1&0&1 \end{bmatrix}
\begin{bmatrix} 1&0&0\\ -2&1&0\\ 0&0&1 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ -2&1&0\\ 5&-2&1 \end{bmatrix}
\]
and
\[
E_3E_2E_1A = \begin{bmatrix} 1&0&0\\ -2&1&0\\ 5&-2&1 \end{bmatrix}
\begin{bmatrix} 2&4&2\\ 4&9&7\\ -2&-2&5 \end{bmatrix}
= \begin{bmatrix} 2&4&2\\ 0&1&3\\ 0&0&1 \end{bmatrix} = U.
\]
So
\[
A = E_1^{-1}E_2^{-1}E_3^{-1}U = LU,
\]
where
\[
E_1^{-1}E_2^{-1}E_3^{-1} = \begin{bmatrix} 1&0&0\\ 2&1&0\\ 0&0&1 \end{bmatrix}
\begin{bmatrix} 1&0&0\\ 0&1&0\\ -1&0&1 \end{bmatrix}
\begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&2&1 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 2&1&0\\ -1&2&1 \end{bmatrix} = L.
\]
Thus, A = LU is a product of a lower-triangular matrix L and an upper-
triangular matrix U . Naturally, this is called an LU decomposition of A.
Doolittle’s Method
In Doolittle’s method (called Gauss factorization), the upper-triangular
matrix U is obtained by forward elimination of the Gaussian elimination
method and the lower-triangular matrix L containing the multiples used in
the Gaussian elimination process as the elements below the diagonal with
unity elements on the main diagonal.
where the unknown elements of matrix L are the used multiples and the
matrix U is the same as we obtained in the forward elimination process.
Example 1.36 Construct the LU decomposition of the following matrix A by using Gauss factorization (i.e., LU decomposition by Doolittle's method). Find the value(s) of α for which the matrix
\[
A = \begin{bmatrix} 1 & -1 & \alpha\\ -1 & 2 & -\alpha\\ \alpha & 1 & 1 \end{bmatrix}
\]
is singular. Also, find the unique solution of the linear system Ax = [1, 1, 2]^T by using the smallest positive integer value of α.
Solution. We use only the forward elimination step of the simple Gaussian elimination method to convert the given matrix A into the upper-triangular matrix U. Since a11 = 1 ≠ 0, we wish to eliminate the elements a21 = −1 and a31 = α by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are given by
\[
m_{21} = \frac{-1}{1} = -1 \quad\text{and}\quad m_{31} = \frac{\alpha}{1} = \alpha.
\]
Hence,
\[
\begin{bmatrix} 1 & -1 & \alpha\\ 0 & 1 & 0\\ 0 & 1+\alpha & 1-\alpha^2 \end{bmatrix}.
\]
Since a^{(1)}_{22} = 1 ≠ 0, we eliminate the entry in the a^{(1)}_{32} = 1 + α position by subtracting the multiple m32 = (1 + α)/1 of the second row from the third row to get
\[
\begin{bmatrix} 1 & -1 & \alpha\\ 0 & 1 & 0\\ 0 & 0 & 1-\alpha^2 \end{bmatrix}.
\]
Obviously, the original set of equations has been transformed to an upper-triangular form. Thus,
\[
\begin{bmatrix} 1 & -1 & \alpha\\ -1 & 2 & -\alpha\\ \alpha & 1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0\\ -1 & 1 & 0\\ \alpha & 1+\alpha & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1 & \alpha\\ 0 & 1 & 0\\ 0 & 0 & 1-\alpha^2 \end{bmatrix},
\]
so A is singular when 1 − α² = 0, i.e., α = ±1. The smallest positive integer value of α for which the system has a unique solution is therefore α = 2.
With α = 2, forward substitution applied to Ly = [1, 1, 2]^T gives
y1 = 1 gives y1 = 1,
−y1 + y2 = 1 gives y2 = 2,
2y1 + 3y2 + y3 = 2 gives y3 = −6,
and backward substitution applied to Ux = y gives
x1 − x2 + 2x3 = 1 gives x1 = −1,
x2 = 2 gives x2 = 2,
−3x3 = −6 gives x3 = 2,
which gives
x1 = −1, x2 = 2, x3 = 2,
the solution of the given system. •
>> A = [1 2 0; -1 0 -2; -3 -5 1];
>> B = lu_gauss(A);
>> L = eye(size(B)) + tril(B,-1);
>> U = triu(B);
>> b = [3 -5 -4]';
>> y = L \ b;
>> x = U \ y;
Program 1.10
MATLAB m-file for the LU Decomposition Method
function A = lu_gauss(A)
% LU factorization without pivoting; on return, the strict lower
% triangle of A holds the multipliers and its upper triangle holds U.
[n,n] = size(A);
for i = 1:n-1
    pivot = A(i,i);
    for k = i+1:n
        A(k,i) = A(k,i)/pivot;                % store the multiplier
        for j = i+1:n
            A(k,j) = A(k,j) - A(k,i)*A(i,j);  % update row k
        end
    end
end
There is another way to find the values of the unknown elements of the
matrices L and U , which we describe in the following example.
Solution. Since
\[
A = LU = \begin{bmatrix} 1&0&0\\ l_{21}&1&0\\ l_{31}&l_{32}&1 \end{bmatrix}
\begin{bmatrix} u_{11}&u_{12}&u_{13}\\ 0&u_{22}&u_{23}\\ 0&0&u_{33} \end{bmatrix},
\]
equating the elements of the first column gives
1 = u11, so u11 = 1,
1 = l21 u11, so l21 = 1,
2 = l31 u11, so l31 = 2.
Equating the elements of the second column gives
2 = u12, so u12 = 2,
3 = l21 u12 + u22, so u22 = 3 − 2 = 1,
2 = l31 u12 + l32 u22, so l32 = 2 − 4 = −2.
Equating the elements of the third column gives
4 = u13, so u13 = 4,
3 = l21 u13 + u23, so u23 = 3 − 4 = −1,
2 = l31 u13 + l32 u23 + u33, so u33 = 2 − 10 = −8.
Thus, we obtain
\[
\begin{bmatrix} 1&2&4\\ 1&3&3\\ 2&2&2 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 1&1&0\\ 2&-2&1 \end{bmatrix}
\begin{bmatrix} 1&2&4\\ 0&1&-1\\ 0&0&-8 \end{bmatrix},
\]
\[
u_{ij} = a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}, \qquad 2 \le i \le j,
\]
\[
l_{ij} = \frac{1}{u_{jj}}\left[a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}\right], \qquad i > j \ge 2, \qquad (1.38)
\]
\[
u_{1j} = a_{1j}, \quad i = 1; \qquad l_{i1} = \frac{a_{i1}}{u_{11}} = \frac{a_{i1}}{a_{11}}, \quad j = 1.
\]
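The recurrences in (1.38) translate almost line for line into MATLAB; the following is a minimal sketch without pivoting, assuming all pivots u(i,i) turn out to be nonzero:

[n,n] = size(A); L = eye(n); U = zeros(n);
for i = 1:n
    for j = i:n                                      % row i of U
        U(i,j) = A(i,j) - L(i,1:i-1)*U(1:i-1,j);
    end
    for r = i+1:n                                    % column i of L
        L(r,i) = (A(r,i) - L(r,1:i-1)*U(1:i-1,i))/U(i,i);
    end
end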
Solving Ly = b = [−2, 3, −6]^T by forward substitution gives y = [−2, 5, 8]^T, and then backward substitution on Ux = y, i.e.,
\[
\begin{bmatrix} 1&2&4\\ 0&1&-1\\ 0&0&-8 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \begin{bmatrix} -2\\ 5\\ 8 \end{bmatrix},
\]
gives
x1 = −6, x2 = 4, x3 = −1.
We can also write the MATLAB m-file called Doolittle.m to get the
solution of the linear system by LU decomposition by using Doolittle’s
method. In order to reproduce the above results using MATLAB com-
mands, we do the following:
>> A = [1 2 4; 1 3 3; 2 2 2];
>> b = [-2 3 -6];
>> sol = Doolittle(A, b);
Program 1.11
MATLAB m-file for using Doolittle’s Method
function sol = Doolittle(A,b)
[n,n]=size(A); u=A; l=zeros(n,n);
for i=1:n-1; if abs(u(i,i)) > 0
for i1=i+1:n; m(i1,i)=u(i1,i)/u(i,i);
for j=1:n
u(i1,j) = u(i1,j) - m(i1,i)*u(i,j); end; end; end; end
for i=1:n; l(i,1)=A(i,1)/u(1,1); end
for j=2:n; for i=2:n; s=0;
for k=1:j-1; s = s + l(i,k)*u(k,j); end
l(i,j)=(A(i,j)-s)/u(j,j); end; end
y(1)=b(1)/l(1,1);
for k=2:n; sum=b(k);
for i=1:k-1; sum = sum - l(k,i)*y(i); end
y(k)=sum/l(k,k); end
x(n)=y(n)/u(n,n);
for k=n-1:-1:1; sum=y(k);
for i=k+1:n; sum = sum - u(k,i)*x(i); end
x(k)=sum/u(k,k); end; sol=x; l; u; y;
Let D denote the diagonal matrix having the same diagonal elements as the upper-triangular matrix U; in other words, D contains the pivots on its diagonal and zeros everywhere else. Let V be the rescaled upper-triangular matrix obtained from U by dividing each row by its pivot, so that V has all 1s on the diagonal. It is easily seen that U = DV, which allows any LU decomposition to be written as
A = LDV,
where L and V are lower- and upper-triangular matrices with 1s on both of their diagonals. This is called the LDV factorization of A.
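In MATLAB, once L and U are available, D and V are easily recovered; a small sketch:

D = diag(diag(U));    % pivots of U on the diagonal
V = D \ U;            % each row of U divided by its pivot
% now A = L*D*V up to round-off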
L = V^T and V = L^T.
Note that not every symmetric matrix has an LDL^T factorization. However, if A = LDL^T, then A must be symmetric, because A^T = (LDL^T)^T = LDL^T = A.
Note that
\[
V = \begin{bmatrix} 1&3&2\\ 0&1&1\\ 0&0&1 \end{bmatrix} = L^T, \qquad
L = \begin{bmatrix} 1&0&0\\ 3&1&0\\ 2&1&1 \end{bmatrix}.
\]
Thus, we obtain
\[
A = LDL^T = \begin{bmatrix} 1&0&0\\ 3&1&0\\ 2&1&1 \end{bmatrix}
\begin{bmatrix} 1&0&0\\ 0&-5&0\\ 0&0&3 \end{bmatrix}
\begin{bmatrix} 1&3&2\\ 0&1&1\\ 0&0&1 \end{bmatrix},
\]
Crout’s Method
Crout’s method, in which matrix U has unity on the main diagonal, is
similar to Doolittle’s method in all other aspects. The L and U matrices
are obtained by expanding the matrix equation A = LU term by term to
determine the elements of the L and U matrices.
Solution. Since
\[
A = LU = \begin{bmatrix} l_{11}&0&0\\ l_{21}&l_{22}&0\\ l_{31}&l_{32}&l_{33} \end{bmatrix}
\begin{bmatrix} 1&u_{12}&u_{13}\\ 0&1&u_{23}\\ 0&0&1 \end{bmatrix},
\]
performing the multiplication on the right-hand side gives
\[
\begin{bmatrix} 1&2&3\\ 6&5&4\\ 2&5&6 \end{bmatrix}
= \begin{bmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13}\\
l_{21} & l_{21}u_{12}+l_{22} & l_{21}u_{13}+l_{22}u_{23}\\
l_{31} & l_{31}u_{12}+l_{32} & l_{31}u_{13}+l_{32}u_{23}+l_{33} \end{bmatrix}.
\]
Then equate the elements of the first column to obtain
1 = l11, 6 = l21, 2 = l31.
Then equate the elements of the second column to obtain
2 = l11 u12, so u12 = 2,
\[
l_{ij} = a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}, \qquad i \ge j,\; i = 1, 2, \ldots, n,
\]
\[
u_{ij} = \frac{1}{l_{ii}}\left[a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}\right], \qquad i < j,\; j = 2, 3, \ldots, n, \qquad (1.39)
\]
\[
l_{i1} = a_{i1}, \quad j = 1; \qquad u_{1j} = \frac{a_{1j}}{a_{11}}, \quad i = 1.
\]
y1 = 1 gives y1 = 1,
6y1 − 7y2 = −1 gives y2 = 1,
2y1 + y2 − 2y3 = 5 gives y3 = −1.
Then solve Ux = y, i.e.,
\[
\begin{bmatrix} 1&2&3\\ 0&1&2\\ 0&0&1 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \begin{bmatrix} 1\\ 1\\ -1 \end{bmatrix}.
\]
>> A = [1 2 3; 6 5 4; 2 5 6];
>> b = [1 -1 5];
>> sol = Crout(A, b);
Program 1.12
MATLAB m-file for the Crout’s Method
function sol = Crout(A, b)
[n,n]=size(A); u=zeros(n,n); l=u;
for i=1:n; u(i,i)=1; end
l(1,1)=A(1,1);
for i=2:n
u(1,i)=A(1,i)/l(1,1); l(i,1)=A(i,1); end
for i=2:n; for j=2:n; s=0;
if i <= j; K=i-1;
else; K=j-1; end
for k=1:K; s = s + l(i,k)*u(k,j); end
if j > i; u(i,j)=(A(i,j)-s)/l(i,i); else
l(i,j)=A(i,j)-s; end; end; end
y(1)=b(1)/l(1,1);
for k=2:n; sum=b(k);
for i=1:k-1; sum = sum - l(k,i)*y(i); end
y(k)=sum/l(k,k); end
x(n)=y(n)/u(n,n);
for k=n-1:-1:1; sum=y(k);
for i=k+1:n; sum = sum - u(k,i)*x(i); end
x(k)=sum/u(k,k); end; sol=x; l; u; y;
Example 1.43 Find the determinant and inverse of the following matrix using LU decomposition by Doolittle's method:
\[
A = \begin{bmatrix} 1&-2&1\\ 1&-1&1\\ 1&1&2 \end{bmatrix}.
\]
Solution. We know that
\[
A = \begin{bmatrix} 1&-2&1\\ 1&-1&1\\ 1&1&2 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ m_{21}&1&0\\ m_{31}&m_{32}&1 \end{bmatrix}
\begin{bmatrix} u_{11}&u_{12}&u_{13}\\ 0&u_{22}&u_{23}\\ 0&0&u_{33} \end{bmatrix} = LU.
\]
Now we use only the forward elimination step of the simple Gaussian elimination method to convert the given matrix A into the upper-triangular matrix U. Since a11 = 1 ≠ 0, we wish to eliminate the elements a21 = 1 and a31 = 1 by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are given as
m21 = 1/1 = 1 and m31 = 1/1 = 1.
Hence,
\[
\begin{bmatrix} 1&-2&1\\ 0&1&0\\ 0&3&1 \end{bmatrix}.
\]
Since a^{(1)}_{22} = 1 ≠ 0, we eliminate the entry in the a^{(1)}_{32} = 3 position by subtracting the multiple m32 = 3 of the second row from the third row to get
\[
\begin{bmatrix} 1&-2&1\\ 0&1&0\\ 0&0&1 \end{bmatrix}.
\]
Obviously, the original set of equations has been transformed to an upper-triangular form. Thus,
\[
\begin{bmatrix} 1&-2&1\\ 1&-1&1\\ 1&1&2 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 1&1&0\\ 1&3&1 \end{bmatrix}
\begin{bmatrix} 1&-2&1\\ 0&1&0\\ 0&0&1 \end{bmatrix},
\]
and det(A) = det(L) det(U) = (1)(1 · 1 · 1) = 1.
To find the inverse of matrix A, first we compute the inverse of the lower-triangular matrix L from
\[
LL^{-1} = \begin{bmatrix} 1&0&0\\ 1&1&0\\ 1&3&1 \end{bmatrix}
\begin{bmatrix} l'_{11}&0&0\\ l'_{21}&l'_{22}&0\\ l'_{31}&l'_{32}&l'_{33} \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{bmatrix} = I,
\]
and solving the resulting triangular systems column by column, the second column gives
l'22 = 1, l'32 = −3.
Finally, the solution of the third linear system
\[
\begin{bmatrix} 1&0&0\\ 1&1&0\\ 1&3&1 \end{bmatrix}
\begin{bmatrix} 0\\ 0\\ l'_{33} \end{bmatrix}
= \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix}
\]
gives l'33 = 1. Hence, the elements of the matrix L^{-1} are
\[
L^{-1} = \begin{bmatrix} 1&0&0\\ -1&1&0\\ 2&-3&1 \end{bmatrix},
\]
and the inverse of A then follows from A^{-1} = U^{-1}L^{-1}.
For LU decomposition we have not used pivoting, for the sake of simplicity. However, pivoting is important for the same reason as in Gaussian elimination. We know that pivoting in Gaussian elimination is equivalent to interchanging the rows of the coefficient matrix together with the terms on the right-hand side. This indicates that pivoting may be applied to LU decomposition as long as the interchanges are applied to the left- and right-hand terms in the same way. When performing pivoting in LU decomposition, the changes in the order of the rows are recorded. The same reordering is then applied to the right-hand side terms before starting the solution in accordance with the forward elimination and backward substitution steps.
Indirect LU Decomposition
A remedy is to premultiply by a permutation matrix P, replacing the matrix A by P A. For example, using the above matrix A, we have
\[
PA = \begin{bmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{bmatrix}
\begin{bmatrix} 2&2&-4\\ 2&2&-1\\ 3&2&-3 \end{bmatrix}
= \begin{bmatrix} 2&2&-4\\ 3&2&-3\\ 2&2&-1 \end{bmatrix}.
\]
From this multiplication we see that rows 2 and 3 of the original matrix A are interchanged, and the resulting matrix P A has an LU factorization:
\[
\begin{bmatrix} 1&0&0\\ 1.5&1&0\\ 1&0&1 \end{bmatrix}
\begin{bmatrix} 2&2&-4\\ 0&-1&3\\ 0&0&3 \end{bmatrix}
= \begin{bmatrix} 2&2&-4\\ 3&2&-3\\ 2&2&-1 \end{bmatrix}.
\]
Writing A' = PA, or equivalently A = P^{-1}A', we have
A' = PA = LU,
and so
A = P^{-1}LU = (P^T L)U,
since P^{-1} = P^T. The determinant of A may now be written as
det(A) = det(P^{-1}) det(L) det(U),
or
det(A) = β det(L) det(U),
where β = det(P^{-1}) equals −1 or +1 depending on whether the number of row interchanges is odd or even, respectively. •
One can use the MATLAB built-in lu function to obtain the permuta-
tion matrix P so that the P A matrix has a LU decomposition:
>> A = [0 1 2; -1 4 2; 2 2 1];
>> [L, U, P] = lu(A);
It will give us the permutation matrix P and the matrices L and U as follows:
\[
P = \begin{bmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{bmatrix}
\]
and
\[
PA = \begin{bmatrix} 1&0&0\\ -0.5&1&0\\ 0&0.2&1 \end{bmatrix}
\begin{bmatrix} 2&2&1\\ 0&5&2.5\\ 0&0&1.5 \end{bmatrix} = LU.
\]
So
A = P^{-1}LU,
or
\[
A = (P^T L)U = \begin{bmatrix} 0&0.2&1\\ -0.5&1&0\\ 1&0&0 \end{bmatrix}
\begin{bmatrix} 2&2&1\\ 0&5&2.5\\ 0&0&1.5 \end{bmatrix}.
\]
then:
1. Show that A does not have LU factorization;
Note that during this elimination process two row interchanges were needed, which means we got two elementary permutation matrices of the interchanges (from Theorem 1.23), which are
\[
p_1 = \begin{bmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{bmatrix} \quad\text{and}\quad
p_2 = \begin{bmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{bmatrix}.
\]
The multiples used are
\[
m_{21} = 2, \qquad m_{31} = 0, \qquad m_{32} = -\frac{3}{2},
\]
and the elimination proceeds as follows:
\[
\begin{bmatrix} 3&2&5\\ 6&2&4\\ 0&3&3 \end{bmatrix} \rightarrow
\begin{bmatrix} 3&2&5\\ 0&-2&-6\\ 0&3&3 \end{bmatrix} \rightarrow
\begin{bmatrix} 3&2&5\\ 0&-2&-6\\ 0&0&-6 \end{bmatrix} = U.
\]
Thus, P A = LU, where
\[
L = \begin{bmatrix} 1&0&0\\ 2&1&0\\ 0&-3/2&1 \end{bmatrix} \quad\text{and}\quad
U = \begin{bmatrix} 3&2&5\\ 0&-2&-6\\ 0&0&-6 \end{bmatrix}.
\]
(3) Solve the first system Ly = P b = [4, 3, 6]^T for the unknown vector y, i.e.,
\[
\begin{bmatrix} 1&0&0\\ 2&1&0\\ 0&-3/2&1 \end{bmatrix}
\begin{bmatrix} y_1\\ y_2\\ y_3 \end{bmatrix}
= \begin{bmatrix} 4\\ 3\\ 6 \end{bmatrix},
\]
which by forward substitution gives
y1 = 4 gives y1 = 4,
2y1 + y2 = 3 gives y2 = −5,
−(3/2)y2 + y3 = 6 gives y3 = −1.5.
Then solve the second system U x = y for the unknown vector x, i.e.,
\[
\begin{bmatrix} 3&2&5\\ 0&-2&-6\\ 0&0&-6 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \begin{bmatrix} 4\\ -5\\ -1.5 \end{bmatrix}.
\]
The function
\[
x^TAx = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix},
\]
or
\[
x^TAx = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j, \qquad (1.42)
\]
can be used to represent any quadratic polynomial in the variables x1, x2, ..., xn and is called a quadratic form. A matrix A is said to be positive-definite if its quadratic form is positive for all real nonzero vectors x, i.e.,
\[
x^TAx > 0, \quad\text{for every } n\text{-dimensional column vector } x \ne 0.
\]
Example 1.45 The matrix
\[
A = \begin{bmatrix} 4&-1&0\\ -1&4&-1\\ 0&-1&4 \end{bmatrix}
\]
is positive-definite. To see this, suppose x is any nonzero three-dimensional column vector; then
\[
x^TAx = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
\begin{bmatrix} 4&-1&0\\ -1&4&-1\\ 0&-1&4 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
\]
or
\[
x^TAx = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
\begin{bmatrix} 4x_1 - x_2\\ -x_1 + 4x_2 - x_3\\ -x_2 + 4x_3 \end{bmatrix}.
\]
Thus,
\[
x^TAx = 4x_1^2 - 2x_1x_2 + 4x_2^2 - 2x_2x_3 + 4x_3^2.
\]
After rearranging the terms, we have
\[
x^TAx = 3x_1^2 + (x_1 - x_2)^2 + 2x_2^2 + (x_2 - x_3)^2 + 3x_3^2 > 0,
\]
unless x1 = x2 = x3 = 0. •
The leading principal minors of a matrix A are the determinants of the square submatrices lying in the upper-left-hand corner of A. An n × n matrix A has n such minors. For example, for the matrix
\[
A = \begin{bmatrix} 6&2&1\\ 2&3&2\\ 1&1&2 \end{bmatrix},
\]
\[
\det(6) = 6 > 0, \qquad
\det\begin{bmatrix} 6&2\\ 2&3 \end{bmatrix} = 18 - 4 = 14 > 0, \qquad
\det\begin{bmatrix} 6&2&1\\ 2&3&2\\ 1&1&2 \end{bmatrix} = 19 > 0.
\]
\[
\det(4) = 4 > 0, \qquad
\det\begin{bmatrix} 4&-1\\ -1&3 \end{bmatrix} = 12 - 1 = 11 > 0, \qquad
\det\begin{bmatrix} 4&-1&2\\ -1&3&0\\ 2&0&5 \end{bmatrix} = 43 > 0.
\]
we can form
\[
A^TA = \begin{bmatrix} 3&4&4\\ 4&6&5\\ 4&5&6 \end{bmatrix},
\]
whose leading principal minors are
\[
\det(3) = 3 > 0, \qquad
\det\begin{bmatrix} 3&4\\ 4&6 \end{bmatrix} = 18 - 16 = 2 > 0, \qquad
\det\begin{bmatrix} 3&4&4\\ 4&6&5\\ 4&5&6 \end{bmatrix} = 1 > 0.
\]
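This test is easy to automate; a small MATLAB illustration for the first matrix above:

>> A = [6 2 1; 2 3 2; 1 1 2];
>> for k = 1:3; d(k) = det(A(1:k,1:k)); end
>> d
d =
     6    14    19

All the leading principal minors are positive, so A is positive-definite.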
Cholesky Method
The Cholesky method (or square root method) is of the same form as Doolittle's method and Crout's method, except that it is limited to equations involving symmetric coefficient matrices. In the case of a symmetric and positive-definite matrix A, it is possible to construct an alternative triangular factorization with about half the computational work of the previous factorizations. Here, we decompose the matrix A into the product LL^T, i.e.,
A = LL^T, (1.43)
1. Solve Ly = b for y (using forward substitution).
2. Solve L^T x = y for x (using backward substitution).
In this procedure, it is necessary to take the square root of the elements
on the main diagonal of the coefficient matrix. However, for a positive-
definite matrix the terms on its main diagonal are positive, so no difficulty
will arise when taking the square root of these terms.
Example 1.46 Construct the LU decomposition of the following matrix using the Cholesky method:
\[
A = \begin{bmatrix} 1&1&2\\ 1&2&4\\ 2&4&9 \end{bmatrix}.
\]
Solution. Since
\[
A = LL^T = \begin{bmatrix} l_{11}&0&0\\ l_{21}&l_{22}&0\\ l_{31}&l_{32}&l_{33} \end{bmatrix}
\begin{bmatrix} l_{11}&l_{21}&l_{31}\\ 0&l_{22}&l_{32}\\ 0&0&l_{33} \end{bmatrix},
\]
equating the elements of the first column gives
1 = l11² gives l11 = √1 = 1, 1 = l21 l11 gives l21 = 1, 2 = l31 l11 gives l31 = 2.
Note that l11 could be −√1, and so the matrix L is not (quite) unique. Now equate the elements of the second column to obtain
2 = l21² + l22² gives l22 = 1,
4 = l31 l21 + l32 l22 gives l32 = 2,
and the third column gives
9 = l31² + l32² + l33² gives l33 = 1.
Thus, we obtain
\[
\begin{bmatrix} 1&1&2\\ 1&2&4\\ 2&4&9 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 1&1&0\\ 2&2&1 \end{bmatrix}
\begin{bmatrix} 1&1&2\\ 0&1&2\\ 0&0&1 \end{bmatrix},
\]
The method fails if l_{jj} = 0 or if the expression inside the square root is negative, in which case the elements in column j are zero or purely imaginary. There is, however, a special class of matrices for which these problems do not occur.
The Cholesky method provides a convenient way of investigating the positive-definiteness of symmetric matrices. The formal definition, x^TAx > 0 for all x ≠ 0, is not easy to verify in practice. However, it is relatively straightforward to attempt the construction of a Cholesky decomposition of a symmetric matrix.
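In MATLAB, the built-in chol function can be used for exactly this purpose; its two-output form reports failure through a flag instead of an error:

>> [R, p] = chol(A);   % p == 0 exactly when A is (numerically) positive-definite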
Solution. Since
\[
A = LL^T = \begin{bmatrix} l_{11}&0&0\\ l_{21}&l_{22}&0\\ l_{31}&l_{32}&l_{33} \end{bmatrix}
\begin{bmatrix} l_{11}&l_{21}&l_{31}\\ 0&l_{22}&l_{32}\\ 0&0&l_{33} \end{bmatrix},
\]
performing the multiplication on the right-hand side gives
\[
\begin{bmatrix} 9&3&6\\ 3&10&8\\ 6&8&9 \end{bmatrix}
= \begin{bmatrix} l_{11}^2 & l_{11}l_{21} & l_{11}l_{31}\\
l_{11}l_{21} & l_{21}^2 + l_{22}^2 & l_{21}l_{31} + l_{22}l_{32}\\
l_{11}l_{31} & l_{31}l_{21} + l_{22}l_{32} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{bmatrix}.
\]
Then equating the elements of the first column gives
\[
9 = l_{11}^2 \;\text{ gives }\; l_{11} = \sqrt{9} = 3.
\]
y1 = 2 gives y1 = 2,
y1 + y2 = 1 gives y2 = −1,
2y1 + 2y2 + y3 = 1 gives y3 = −1,
and then
x1 + x2 + 2x3 = 2 gives x1 = 3,
x2 + 2x3 = −1 gives x2 = 1,
x3 = −1 gives x3 = −1,
Now use the following MATLAB commands to obtain the above results:
>> A = [1 1 2; 1 2 4; 2 4 9];
>> b = [2 1 1];
>> sol = Cholesky(A, b);
Example 1.49 Find the bounds on α for which the Cholesky factorization of the following matrix with real elements
\[
A = \begin{bmatrix} 1&2&\alpha\\ 2&8&2\alpha\\ \alpha&2\alpha&9 \end{bmatrix}
\]
is possible.
Solution. Since
\[
A = LL^T = \begin{bmatrix} l_{11}&0&0\\ l_{21}&l_{22}&0\\ l_{31}&l_{32}&l_{33} \end{bmatrix}
\begin{bmatrix} l_{11}&l_{21}&l_{31}\\ 0&l_{22}&l_{32}\\ 0&0&l_{33} \end{bmatrix},
\]
performing the multiplication on the right-hand side gives
\[
\begin{bmatrix} 1&2&\alpha\\ 2&8&2\alpha\\ \alpha&2\alpha&9 \end{bmatrix}
= \begin{bmatrix} l_{11}^2 & l_{11}l_{21} & l_{11}l_{31}\\
l_{11}l_{21} & l_{21}^2 + l_{22}^2 & l_{21}l_{31} + l_{22}l_{32}\\
l_{11}l_{31} & l_{31}l_{21} + l_{22}l_{32} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{bmatrix}.
\]
Equating the elements of the first column gives
1 = l11² gives l11 = √1 = 1, 2 = l21 l11 gives l21 = 2, α = l31 l11 gives l31 = α.
Note that l11 could be −√1, and so matrix L is not (quite) unique. Now equate the elements of the second column to obtain
8 = l21² + l22² gives l22 = 2,
2α = l31 l21 + l32 l22 gives l32 = 0.
Finally, the third column gives
\[
9 = l_{31}^2 + l_{32}^2 + l_{33}^2 \;\text{ gives }\; l_{33} = \sqrt{9 - \alpha^2},
\]
which is real and nonzero only if 9 − α² > 0. Hence, the Cholesky factorization of A is possible when −3 < α < 3. •
Program 1.13
MATLAB m-file for the Cholesky Method
function sol = Cholesky(A, b)
[n,n]=size(A); l=zeros(n,n); u=l;
l(1,1)=(A(1,1))^0.5; u(1,1)=l(1,1);
for i=2:n; u(1,i)=A(1,i)/l(1,1);
l(i,1)=A(i,1)/u(1,1); end
for i=2:n; for j=2:n; s=0;
if i <= j; K=i-1; else; K=j-1; end
for k=1:K; s = s + l(i,k)*u(k,j); end
if j > i; u(i,j)=(A(i,j)-s)/l(i,i);
elseif i == j
l(i,j)=(A(i,j)-s)^0.5; u(i,j)=l(i,j);
else; l(i,j)=(A(i,j)-s)/u(j,j); end; end; end
y(1)=b(1)/l(1,1);
for k=2:n; sum=b(k);
for i=1:k-1; sum = sum - l(k,i)*y(i); end
y(k)=sum/l(k,k); end
x(n)=y(n)/u(n,n);
for k=n-1:-1:1; sum=y(k);
for i=k+1:n; sum = sum - u(k,i)*x(i); end
x(k)=sum/u(k,k); end; sol=x; l; u; y;
Solution. By using the simple Gauss elimination method, one can convert the given matrix into the upper-triangular matrix
\[
U = \begin{bmatrix} 4&-2&4\\ 0&1&4\\ 0&0&9 \end{bmatrix},
\]
where
\[
\hat L = LD^{1/2} = \begin{bmatrix} 1&0&0\\ -0.5&1&0\\ 1&4&1 \end{bmatrix}
\begin{bmatrix} 2&0&0\\ 0&1&0\\ 0&0&3 \end{bmatrix}
= \begin{bmatrix} 2&0&0\\ -1&1&0\\ 2&4&3 \end{bmatrix}
\]
and
\[
\hat L^T = D^{1/2}V = \begin{bmatrix} 2&0&0\\ 0&1&0\\ 0&0&3 \end{bmatrix}
\begin{bmatrix} 1&-0.5&1\\ 0&1&4\\ 0&0&1 \end{bmatrix}
= \begin{bmatrix} 2&-1&2\\ 0&1&4\\ 0&0&3 \end{bmatrix}.
\]
Thus, we obtain
\[
A = \hat L\hat L^T = \begin{bmatrix} 2&0&0\\ -1&1&0\\ 2&4&3 \end{bmatrix}
\begin{bmatrix} 2&-1&2\\ 0&1&4\\ 0&0&3 \end{bmatrix}.
\]
1. Matrix A is nonsingular.
Example 1.53 Solve the following linear system using the simple Gaus-
sian elimination method and also find the LU decomposition of the matrix
using Doolittle’s method and Crout’s method:
5x1 + x2 + x3 = 7
2x1 + 6x2 + x3 = 9
x1 + 2x2 + 9x3 = 12.
Solution. The augmented matrix of the system is
\[
\left[\begin{array}{ccc|c} 5&1&1&7\\ 2&6&1&9\\ 1&2&9&12 \end{array}\right],
\]
and since a11 = 5 ≠ 0, we can eliminate the elements a21 and a31 by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are given by
\[
m_{21} = \frac{2}{5} = 0.4 \quad\text{and}\quad m_{31} = \frac{1}{5} = 0.2.
\]
Hence,
\[
\left[\begin{array}{ccc|c} 5&1&1&7\\ 0&5.6&0.6&6.2\\ 0&1.8&8.8&10.6 \end{array}\right].
\]
Since a^{(1)}_{22} = 5.6 ≠ 0, we eliminate the entry in the a^{(1)}_{32} position by subtracting the multiple m32 = 1.8/5.6 ≈ 0.32 of the second row from the third row to get
\[
\left[\begin{array}{ccc|c} 5&1&1&7\\ 0&5.6&0.6&6.2\\ 0&0&8.6&8.6 \end{array}\right].
\]
Obviously, the original set of equations has been transformed to an upper-triangular form. All the diagonal elements of the resulting upper-triangular matrix are nonzero, which means that the coefficient matrix of the given system is nonsingular; therefore, the given system has a unique solution.
Now expressing the set in algebraic form yields
5x1 + x2 + x3 = 7
5.6x2 + 0.6x3 = 6.2
8.6x3 = 8.6.
Now use backward substitution to get the solution of the system as
8.6x3 = 8.6 gives x3 = 1
5.6x2 = −0.6x3 + 6.2 = −0.6 + 6.2 = 5.6 gives x2 = 1
5x1 = 7 − x2 − x3 = 7 − 1 − 1 = 5 gives x1 = 1.
We know that when using LU decomposition by Doolittle’s method the un-
known elements of matrix L are the multiples used and the matrix U is the
same as we obtained in the forward elimination process of the simple Gauss
elimination. Thus, the LU decomposition of matrix A can be obtained by
using Doolittle’s method as follows:
\[
A = \begin{bmatrix} 5&1&1\\ 2&6&1\\ 1&2&9 \end{bmatrix}
= \begin{bmatrix} 1&0&0\\ 0.4&1&0\\ 0.2&0.32&1 \end{bmatrix}
\begin{bmatrix} 5&1&1\\ 0&5.6&0.6\\ 0&0&8.6 \end{bmatrix} = LU.
\]
Similarly, the LU decomposition of matrix A can be obtained by using Crout's method as
\[
A = \begin{bmatrix} 5&1&1\\ 2&6&1\\ 1&2&9 \end{bmatrix}
= \begin{bmatrix} 5&0&0\\ 2&5.6&0\\ 1&1.8&8.6 \end{bmatrix}
\begin{bmatrix} 1&0.2&0.2\\ 0&1&0.1\\ 0&0&1 \end{bmatrix} = LU.
\]
Thus, the conditions of Theorem 1.30 are satisfied. •
After finding the values of li and ui, they are used along with the elements ci to solve the tridiagonal system (1.47), by first solving the bidiagonal system
Ly = b, (1.50)
for y using forward substitution, i.e.,
\[
y_1 = b_1, \qquad y_i = b_i - l_iy_{i-1}, \quad i = 2, 3, \ldots, n, \qquad (1.51)
\]
and then solving the bidiagonal system
U x = y, (1.52)
for x using backward substitution, i.e.,
\[
x_n = \frac{y_n}{u_n}, \qquad x_i = \frac{y_i - c_ix_{i+1}}{u_i}, \quad i = n-1, \ldots, 1.
\]
The entire process for solving the original system (1.47) requires 3n additions, 3n multiplications, and 2n divisions. Thus, the total number of multiplications and divisions is approximately 5n.
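The recurrences above translate directly into MATLAB; the following sketch assumes the diagonals are stored in vectors alpha (main), beta (sub), and c (super), with right-hand side b:

u(1) = alpha(1);
for i = 2:n
    l(i) = beta(i)/u(i-1); u(i) = alpha(i) - l(i)*c(i-1);   % (1.48)
end
y(1) = b(1);
for i = 2:n; y(i) = b(i) - l(i)*y(i-1); end                 % (1.51)
x(n) = y(n)/u(n);
for i = n-1:-1:1; x(i) = (y(i) - c(i)*x(i+1))/u(i); end     % (1.52)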
x1 + x2 = 1
x1 + 2x2 + x3 = 0
x2 + 3x3 + x4 = 1
x3 + 4x4 = 1.
Then the elements of the L and U matrices can be computed by using (1.48) as follows:
\[
\begin{aligned}
u_1 &= \alpha_1 = 1,\\
l_2 &= \frac{\beta_2}{u_1} = \frac{1}{1} = 1, \qquad u_2 = \alpha_2 - l_2c_1 = 2 - (1)1 = 1,\\
l_3 &= \frac{\beta_3}{u_2} = \frac{1}{1} = 1, \qquad u_3 = \alpha_3 - l_3c_2 = 3 - (1)1 = 2,\\
l_4 &= \frac{\beta_4}{u_3} = \frac{1}{2}, \qquad u_4 = \alpha_4 - l_4c_3 = 4 - \tfrac{1}{2}(1) = \tfrac{7}{2}.
\end{aligned}
\]
After finding the elements of the bidiagonal matrices L and U, we solve the two systems. Forward substitution on
\[
\begin{bmatrix} 1&0&0&0\\ 1&1&0&0\\ 0&1&1&0\\ 0&0&\tfrac12&1 \end{bmatrix}
\begin{bmatrix} y_1\\ y_2\\ y_3\\ y_4 \end{bmatrix}
= \begin{bmatrix} 1\\ 0\\ 1\\ 1 \end{bmatrix}
\]
gives y = [1, −1, 2, 0]^T, and backward substitution on
\[
\begin{bmatrix} 1&1&0&0\\ 0&1&1&0\\ 0&0&2&1\\ 0&0&0&\tfrac72 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix}
= \begin{bmatrix} 1\\ -1\\ 2\\ 0 \end{bmatrix}
\]
gives the solution.
Program 1.14
MATLAB m-file for LU Decomposition for a Tridiagonal System
function sol = TRiDLU(Tb)
% Tb = [T | b], the tridiagonal matrix augmented with the right-hand side
[m,n] = size(Tb); L = eye(m); U = zeros(m);
U(1,1) = Tb(1,1);
for i = 2:m
    U(i-1,i) = Tb(i-1,i);
    L(i,i-1) = Tb(i,i-1)/U(i-1,i-1);
    U(i,i) = Tb(i,i) - L(i,i-1)*U(i-1,i);
end
disp('The lower-triangular matrix'); L
disp('The upper-triangular matrix'); U
y = L\Tb(:,n); x = U\y; sol = x;
Vector Norms
There are three norms in R^n that are most commonly used in applications, called the l1-norm, l2-norm, and l∞-norm, and defined for a given vector x = [x1, x2, ..., xn]^T by
\[
\|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad
\|x\|_2 = \left(\sum_{i=1}^{n} x_i^2\right)^{1/2}, \qquad
\|x\|_\infty = \max_{1\le i\le n} |x_i|.
\]
The l1-norm is called the absolute norm; the l2-norm is frequently called the Euclidean norm, as it is just the formula for distance in ordinary three-dimensional Euclidean space extended to dimension n; and, finally, the l∞-norm is called the maximum norm or occasionally the uniform norm. All three norms are also called the natural norms.
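All three vector norms are available through MATLAB's built-in norm function; for example:

>> x = [2 -1 -6 3]';
>> n1 = norm(x,1), n2 = norm(x,2), ninf = norm(x,inf)

returns n1 = 12, n2 = 7.0711, and ninf = 6.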
Matrix Norms
A matrix norm is a measure of how well one matrix approximates another,
or, more accurately, of how well their difference approximates the zero ma-
trix. An iterative procedure for inverting a matrix produces a sequence
of approximate inverses. Since, in practice, such a process must be termi-
nated, it is desirable to have some measure of the error of an approximate
inverse.
1. ‖A‖ > 0, A ≠ 0;
2. ‖A‖ = 0, A = 0;
5. ‖A + B‖ ≤ ‖A‖ + ‖B‖;
6. ‖AB‖ ≤ ‖A‖‖B‖;
7. ‖A − B‖ ≥ ‖A‖ − ‖B‖.
Several norms for matrices have been defined, and we shall use the following three natural norms l1, l2, and l∞ for a square matrix of order n:
\[
\|A\|_1 = \max_{j} \left(\sum_{i=1}^{n} |a_{ij}|\right) = \text{maximum column sum},
\]
\[
\|A\|_\infty = \max_{i} \left(\sum_{j=1}^{n} |a_{ij}|\right) = \text{maximum row sum}.
\]
The l1 -norm and l∞ -norm are widely used because they are easy to cal-
culate. The matrix norm kAk2 that corresponds to the l2 -norm is related
to the eigenvalues of the matrix. It sometimes has special utility because
no other norm is smaller than this norm. It, therefore, provides the best
measure of the size of a matrix, but is also the most difficult to compute.
We will discuss this natural norm later in the chapter.
\[
\|A\|_F = \left(\sum_{i=1}^{m}\sum_{j=1}^{n} |a_{ij}|^2\right)^{1/2}.
\]
so
‖A‖1 = max{8, 9, 10} = 10.
Also,
\[
\sum_{j=1}^{3} |a_{1j}| = |4| + |2| + |-1| = 7, \quad
\sum_{j=1}^{3} |a_{2j}| = |3| + |5| + |-2| = 10, \quad
\sum_{j=1}^{3} |a_{3j}| = |1| + |-2| + |-7| = 10,
\]
so
‖A‖∞ = max{7, 10, 10} = 10.
Finally, we have
>> A = [4 2 -1; 3 5 -2; 1 -2 -7];
>> B = norm(A, 1)
B =
    10
The l∞-norm of the matrix A is:
>> B = norm(A, inf)
B =
    10
Finally, the Frobenius norm of the matrix A is:
>> B = norm(A, 'fro')
B =
    10.6301
Program 1.15
MATLAB m-file for Finding the Residual Vector
function r = RES(A,b,x0)
% Residual vector r = b - A*x0
[n,n] = size(A);
for i = 1:n
    R(i) = b(i);
    for j = 1:n
        R(i) = R(i) - A(i,j)*x0(j);
    end
    RES(i) = R(i);
end
r = RES';
has the approximate solution x∗ = [3, 0]T . To see how good this solution
is, we compute the residual, r = [0, −0.0002]T .
has the exact solution x = [1, 1, 1, 1]T and the approximate solution due to
Gaussian elimination without pivoting is
We found that all the elements of the residual for the second case (with pivoting) are less than 0.6 × 10^{-7}, whereas for the first case (without pivoting) they are as large as 0.2 × 10^{-4}. Even without knowing the exact
solution, it is clear that the solution obtained in the second case is much
better than the first case. The residual provides a reasonable measure of
the accuracy of a solution in those cases where the error is primarily due
to the accumulation of round-off errors.
We have seen that for ill-conditioned systems the residual is not neces-
sarily a good measure of the accuracy of a solution. How then can we tell
when a system is ill-conditioned? In the following we discuss some possible
indicators of ill-conditioned systems.
Note that the condition number K(A) for A depends on the matrix norm
used and can, for some matrices, vary considerably as the matrix norm is
changed. Since
The condition numbers provide bounds for the sensitivity of the solution
of a set of equations to changes in the coefficient matrix. Unfortunately,
the evaluation of any of the condition numbers of a matrix A is not a trivial
task since it is necessary first to obtain its inverse.
Example 1.57 Compute the condition number of the following matrix using the l∞-norm:
\[
A = \begin{bmatrix} 2&-1&0\\ 2&-4&-1\\ -1&0&2 \end{bmatrix}.
\]
Solution. The condition number of a matrix is defined as K(A) = ‖A‖∞‖A^{-1}‖∞. Here, ‖A‖∞ = max{3, 7, 3} = 7, and
\[
\|A^{-1}\|_\infty = \max\left\{\frac{8}{13}+\frac{2}{13}+\frac{1}{13},\;
\frac{3}{13}+\frac{4}{13}+\frac{2}{13},\;
\frac{4}{13}+\frac{1}{13}+\frac{6}{13}\right\},
\]
which gives
\[
\|A^{-1}\|_\infty = \frac{11}{13}.
\]
Therefore,
\[
K(A) = \|A\|_\infty\|A^{-1}\|_\infty = (7)\frac{11}{13} \approx 5.9231.
\]
Depending on the application, we might consider this number to be reasonably small and conclude that the given matrix A is reasonably well-conditioned. •
To get the above results using MATLAB commands, we do the following:
>> A = [2 -1 0; 2 -4 -1; -1 0 2];
>> Ainv = inv(A);
>> KA = norm(A, inf) * norm(Ainv, inf)
KA =
    5.9231
Some matrices are notoriously ill-conditioned. For example, consider the 4 × 4 Hilbert matrix
\[
H = \begin{bmatrix}
1 & \tfrac12 & \tfrac13 & \tfrac14\\
\tfrac12 & \tfrac13 & \tfrac14 & \tfrac15\\
\tfrac13 & \tfrac14 & \tfrac15 & \tfrac16\\
\tfrac14 & \tfrac15 & \tfrac16 & \tfrac17
\end{bmatrix},
\]
whose entries are defined by
\[
a_{ij} = \frac{1}{i+j-1}, \qquad i, j = 1, 2, \ldots, n.
\]
The inverse of the matrix H can be obtained as
\[
H^{-1} = \begin{bmatrix}
16 & -120 & 240 & -140\\
-120 & 1200 & -2700 & 1680\\
240 & -2700 & 6480 & -4200\\
-140 & 1680 & -4200 & 2800
\end{bmatrix}.
\]
Its condition number in the l∞-norm is K(H) = ‖H‖∞‖H^{-1}‖∞ = (25/12)(13620) = 28375, which is quite large. Note that the condition numbers of Hilbert matri-
ces increase rapidly as the sizes of the matrices increase. Therefore, large
Hilbert matrices are considered to be extremely ill-conditioned.
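MATLAB has a built-in generator for these matrices, which makes this growth easy to observe:

>> H = hilb(4);
>> cond(H, inf)
ans =
       28375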
for which det(A) = 10^{-14} ≈ 0, so one can expect the condition number of the given matrix to be very large.
The condition number of a matrix K(A) using the l2-norm can be computed by the built-in cond command in MATLAB as follows:
>> A = [1 -1 2; 3 1 -1; 2 0 1];
>> KA = cond(A)
KA =
   19.7982
‖x − x*‖ ≤ ‖r‖‖A^{-1}‖. (1.56)
Ax − Ax* = b − (b − r) = r,
and also
‖b‖ ≤ ‖A‖‖x‖, or ‖x‖ ≥ ‖b‖/‖A‖.
Hence,
\[
\frac{\|x - x^*\|}{\|x\|} \le \frac{\|A^{-1}\|\|r\|}{\|b\|/\|A\|} \le K(A)\frac{\|r\|}{\|b\|}.
\]
The inequalities (1.56) and (1.57) imply that the quantities ‖A^{-1}‖ and K(A) can be used to give an indication of the connection between the residual vector and the accuracy of the approximation. If the quantity K(A) ≈ 1, the relative error will be fairly close to the relative residual. But if K(A) ≫ 1, then the relative error could be many times larger than the relative residual.
x1 + x2 − x3 = 1
x1 + 2x2 − 2x3 = 0
−2x1 + x2 + x3 = −1.
>> A = [1 1 -1; 1 2 -2; -2 1 1];
>> KA = norm(A, inf) * norm(inv(A), inf);
(b) The residual vector can be calculated as
\[
r = b - Ax^* = \begin{bmatrix} 1\\ 0\\ -1 \end{bmatrix}
- \begin{bmatrix} 1&1&-1\\ 1&2&-2\\ -2&1&1 \end{bmatrix}
\begin{bmatrix} 2.01\\ 1.01\\ 1.98 \end{bmatrix}
= \begin{bmatrix} -0.04\\ -0.07\\ 0.03 \end{bmatrix},
\]
and it gives
‖r‖∞ = 0.07.
>> A = [1 1 -1; 1 2 -2; -2 1 1];
>> b = [1 0 -1]';
>> x0 = [2.01 1.01 1.98]';
>> r = RES(A, b, x0);
>> rnorm = norm(r, inf);
(c) From (1.57), we have
\[
\frac{\|x - x^*\|}{\|x\|} \le K(A)\frac{\|r\|}{\|b\|}.
\]
By using parts (a) and (b) and the value ‖b‖∞ = 1, we obtain
\[
\frac{\|x - x^*\|}{\|x\|} \le (22.5)\frac{(0.07)}{1} = 1.575.
\]
After applying the forward elimination step of the simple Gauss elimination method, we obtain
\[
\left[\begin{array}{ccc|c} 1&1&-1&-0.04\\ 0&1&-1&-0.03\\ 0&0&2&0.04 \end{array}\right].
\]
Now by using backward substitution, we obtain the solution
Conditioning
Let us consider the conditioning of the linear system
Ax = b. (1.59)
Case 1.1 Suppose that the right-hand side term b is replaced by b + δb, where δb is an error in b. If x + δx is the solution corresponding to the right-hand side b + δb, then we have
Ax + Aδx = b + δb,
and since Ax = b,
Aδx = δb, or δx = A^{-1}δb.
Taking norms gives
‖δx‖ ≤ ‖A^{-1}‖‖δb‖.
Thus, the change ‖δx‖ in the solution is bounded by ‖A^{-1}‖ times the change ‖δb‖ in the right-hand side.
The conditioning of the linear system is connected with the ratio between the relative error ‖δx‖/‖x‖ and the relative change ‖δb‖/‖b‖ in the right-hand side. Combining ‖δx‖ ≤ ‖A^{-1}‖‖δb‖ with ‖b‖ ≤ ‖A‖‖x‖ gives
\[
\frac{\|\delta x\|}{\|x\|} \le K(A)\frac{\|\delta b\|}{\|b\|}.
\]
Thus, the relative change in the solution is bounded by the condition number of the matrix times the relative change in the right-hand side. When the product on the right-hand side is small, the relative change in the solution is small.
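A quick numerical check of this bound (the matrix and perturbation below are chosen only for illustration):

>> A = [1 1 -1; 1 2 -2; -2 1 1]; b = [1 0 -1]';
>> x = A\b; db = 1e-6*[1 1 1]';
>> dx = A\(b + db) - x;
>> norm(dx)/norm(x) <= cond(A)*norm(db)/norm(b)
ans =
     1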
or
Aδx = −δA(x + δx).
Multiplying by A^{-1}, we get
δx = −A^{-1}δA(x + δx),
so that
‖δx‖ ≤ ‖A^{-1}‖‖δA‖(‖x‖ + ‖δx‖),
or
‖δx‖(1 − ‖A^{-1}‖‖δA‖) ≤ ‖A^{-1}‖‖δA‖‖x‖,
which can be written as
\[
\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\|\|\delta A\|}{(1 - \|A^{-1}\|\|\delta A\|)}
= \frac{K(A)\|\delta A\|/\|A\|}{(1 - \|A^{-1}\|\|\delta A\|)}. \qquad (1.64)
\]
If the product ‖A^{-1}‖‖δA‖ is much smaller than 1, the denominator in (1.64) is near 1. Consequently, when ‖A^{-1}‖‖δA‖ is much smaller than 1, (1.64) implies that the relative change in the solution is bounded by the condition number of the matrix times the relative change in the coefficient matrix.
Case 1.3 Suppose that there is a change in the coefficient matrix A and the right-hand side term b together, and if x + δx is the solution corresponding to the coefficient matrix A + δA and the right-hand side b + δb, then we have
(A + δA)(x + δx) = b + δb, (1.65)
which implies that
Ax + Aδx + δA x + δA δx = b + δb,
or
Aδx + δA δx = δb − δA x.
Multiplying by A^{-1}, we get
(I + A^{-1}δA)δx = A^{-1}(δb − δA x),
or
δx = (I + A^{-1}δA)^{-1}A^{-1}(δb − δA x). (1.66)
Since we know that if A is nonsingular and δA is the error in A, with
‖A^{-1}δA‖ ≤ ‖A^{-1}‖‖δA‖ < 1, (1.67)
it then follows that (see Fröberg 1969) the matrix (I + A^{-1}δA) is nonsingular and
\[
\|(I + A^{-1}\delta A)^{-1}\| \le \frac{1}{1 - \|A^{-1}\delta A\|} \le \frac{1}{1 - \|A^{-1}\|\|\delta A\|}. \qquad (1.68)
\]
1.6 Applications
In this section we discuss applications of linear systems, tackling a variety of real-life problems from several areas of science.
Solution. Observe that in this example we are given three points and we
want to find a polynomial of degree 2 (one less than the number of data
points). Let the polynomial be
p(x) = a0 + a1 x + a2 x2 .
We are given three points and shall use these three sets of information to
determine the three unknowns a0 , a1 , and a2 . Substituting
x = 1, y = 6; x = 2, y = 3; x = 3, y = 2,
in turn, into the polynomial leads to the following system of three linear
equations in a0 , a1 , and a2 :
a0 + a1 + a2 = 6
a0 + 2a1 + 4a2 = 3
a0 + 3a1 + 9a2 = 2.
Solve this system for a0, a1, and a2 using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} 1&1&1&6\\ 1&2&4&3\\ 1&3&9&2 \end{array}\right] \approx
\left[\begin{array}{ccc|c} 1&1&1&6\\ 0&1&3&-3\\ 0&2&8&-4 \end{array}\right] \approx
\left[\begin{array}{ccc|c} 1&1&1&6\\ 0&1&3&-3\\ 0&0&2&2 \end{array}\right].
\]
Now use backward substitution to get the solution of the system (Figure 1.4):
2a2 = 2 gives a2 = 1,
a1 + 3a2 = −3 gives a1 = −6,
a0 + a1 + a2 = 6 gives a0 = 11.
Thus,
p(x) = 11 − 6x + x²
is the required polynomial. •
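The same coefficients follow directly from MATLAB's backslash operator (a quick check):

>> V = [1 1 1; 1 2 4; 1 3 9];   % rows are [1 x x^2] at x = 1, 2, 3
>> a = V \ [6 3 2]'
a =
    11
    -6
     1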
1. Junctions: All the current flowing into a junction must flow out of
it.
2. Paths: The sum of the IR terms (where I denotes current and R
resistance) in any direction around a closed path is equal to the total voltage
in the path in that direction. •
Example 1.60 Consider the electric network in Figure 1.5. Let us determine the currents through each branch of this network.
Solution. The batteries are 8 volts and 16 volts. The resistances are 1 ohm, 4 ohms, and 2 ohms. The current entering each battery is the same as the current leaving it.
Junctions
Junction B : I1 + I2 = I3
Junction D : I3 = I1 + I2
These two equations result in a single linear equation
I1 + I2 − I3 = 0.
Paths
The problem thus reduces to solving the following system of three linear
equations in three variables I1 , I2 , and I3 :
I1 + I2 − I3 = 0
4I1 + I3 = 8
4I2 + I3 = 16.
Solve this system for I1, I2, and I3 using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} 1&1&-1&0\\ 4&0&1&8\\ 0&4&1&16 \end{array}\right] \approx
\left[\begin{array}{ccc|c} 1&1&-1&0\\ 0&-4&5&8\\ 0&4&1&16 \end{array}\right] \approx
\left[\begin{array}{ccc|c} 1&1&-1&0\\ 0&-4&5&8\\ 0&0&6&24 \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
6I3 = 24 gives I3 = 4,
−4I2 + 5I3 = 8 gives I2 = 3,
I1 + I2 − I3 = 0 gives I1 = 1.
Thus, the currents are I1 = 1, I2 = 3, and I3 = 4. The units are amps.
The solution is unique, as is to be expected in this physical situation. •
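As a check, the same system can be handed to MATLAB:

>> A = [1 1 -1; 4 0 1; 0 4 1]; b = [0 8 16]';
>> I = A \ b
I =
     1
     3
     4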
Traffic Flow
\[
\approx
\left[\begin{array}{ccccccc|c}
1&0&0&0&0&1&-1&600\\
0&1&0&0&0&0&-1&500\\
0&0&1&0&0&0&-1&-200\\
0&0&0&1&0&1&-1&600\\
0&0&0&0&1&-1&1&0\\
0&0&0&0&0&0&0&0
\end{array}\right].
\]
The system of equations that corresponds to this form is:
x1 = 600 − x6 + x7
x2 = 500 + x7
x3 = −200 + x7
x4 = 600 − x6 + x7
x5 = x6 − x7.
Let us now examine what the flows in the other branches will be when this minimum flow along Adams is attained, i.e., when x7 = 200, which gives
x1 = −x6 + 800
x2 = 700
x3 = 0
x4 = −x6 + 800
x5 = x6 − 200.
Since x7 = 200 implies that x3 = 0 and vice-versa, we see that the minimum
flow in branch x7 can be attained by making x3 = 0; i.e., by closing branch
DE to traffic. •
Suppose we have a thin rectangular metal plate whose edges are kept at fixed temperatures. As an example, let the left edge be 0°C, the right edge 2°C, and the top and bottom edges 1°C (Figure 1.7). We want to know
the temperature inside the plate. There are several ways of approaching
this kind of problem. The simplest approach of interest to us will be
the following type of approximation: we shall overlay our plate with finer
and finer grids, or meshes. The intersections of the mesh lines are called
mesh points. Mesh points are divided into boundary and interior points,
depending on whether they lie on the boundary or the interior of the plate.
We may consider these points as heat elements, such that each influences
its neighboring points. We need the temperature of the interior points,
given the temperature of the boundary points. It is obvious that the finer
the grid, the better the approximation of the temperature distribution of
the plate. To compute the temperature of the interior points, we use the
following principle.
Theorem 1.34 (Mean Value Property for Heat Conduction)
The temperature at any interior mesh point is the average of the temperatures at its four neighboring mesh points.
Applying this property to the four interior points of Figure 1.7 gives
\[
x_1 = \tfrac14(x_2 + x_3 + 1), \quad
x_2 = \tfrac14(x_1 + x_4 + 3), \quad
x_3 = \tfrac14(x_1 + x_4 + 1), \quad
x_4 = \tfrac14(x_2 + x_3 + 3).
\]
The problem thus reduces to solving the following system of four linear
equations in four variables x1 , x2 , x3 , and x4 :
4x1 − x2 − x3 = 1
−x1 + 4x2 − x4 = 3
−x1 + 4x3 − x4 = 1
− x2 − x3 + 4x4 = 3.
Solve this system for x1, x2, x3, and x4 using the Gauss elimination method:
\[
\left[\begin{array}{cccc|c} 4&-1&-1&0&1\\ -1&4&0&-1&3\\ -1&0&4&-1&1\\ 0&-1&-1&4&3 \end{array}\right]
\approx \cdots \approx
\left[\begin{array}{cccc|c} 4&-1&-1&0&1\\ 0&\frac{15}{4}&-\frac14&-1&\frac{13}{4}\\ 0&0&\frac{56}{15}&-\frac{16}{15}&\frac{22}{15}\\ 0&0&0&\frac{24}{7}&\frac{30}{7} \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
\[
\tfrac{24}{7}x_4 = \tfrac{30}{7} \text{ gives } x_4 = \tfrac54, \qquad
\tfrac{56}{15}x_3 - \tfrac{16}{15}x_4 = \tfrac{22}{15} \text{ gives } x_3 = \tfrac34,
\]
\[
\tfrac{15}{4}x_2 - \tfrac14x_3 - x_4 = \tfrac{13}{4} \text{ gives } x_2 = \tfrac54, \qquad
4x_1 - x_2 - x_3 = 1 \text{ gives } x_1 = \tfrac34.
\]
Thus, the temperatures are x1 = 3/4, x2 = 5/4, x3 = 3/4, and x4 = 5/4. •
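The same temperatures fall out of a direct solve in MATLAB:

>> A = [4 -1 -1 0; -1 4 0 -1; -1 0 4 -1; 0 -1 -1 4];
>> b = [1 3 1 3]';
>> x = A \ b
x =
    0.7500
    1.2500
    0.7500
    1.2500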
Solve this system for x, y, and z using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} 2.5&4.2&5.6&26.50\\ 3.4&4.7&2.8&22.86\\ 3.6&6.1&3.7&29.12 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} 2.5&4.2&5.6&26.50\\ 0&-1.012&-4.816&-13.18\\ 0&0.052&-4.364&-9.04 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} 2.5&4.2&5.6&26.50\\ 0&-1.012&-4.816&-13.18\\ 0&0&-4.612&-9.717 \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
For example, for the reaction in which hydrogen gas (H2 ) and oxygen
(O2 ) combine to form water (H2 O), a balanced chemical equation is
2H2 + O2 −→ 2H2 O,
Nitrogen: w = 2y
Hydrogen: 3w = 2z
Oxygen: 2x = z.
w − 2y = 0
3w − 2z = 0
2x − z = 0.
The augmented matrix form of this homogeneous system is
\[
\left[\begin{array}{cccc|c} 1&0&-2&0&0\\ 3&0&0&-2&0\\ 0&2&0&-1&0 \end{array}\right].
\]
Solve this system for w, x, y, and z using the Gauss elimination method with partial pivoting:
\[
\left[\begin{array}{cccc|c} 3&0&0&-2&0\\ 1&0&-2&0&0\\ 0&2&0&-1&0 \end{array}\right]
\approx
\left[\begin{array}{cccc|c} 3&0&0&-2&0\\ 0&2&0&-1&0\\ 0&0&-2&\frac23&0 \end{array}\right].
\]
Now use backward substitution to get the solution of the homogeneous system:
−2y + (2/3)z = 0 gives y = (1/3)z,
2x − z = 0 gives x = (1/2)z,
3w − 2z = 0 gives w = (2/3)z.
The smallest positive value of z that will produce integer values for all four variables is the least common denominator of the fractions 2/3, 1/2, and 1/3—namely, 6—which gives
w = 4, x = 3, y = 2, z = 6.
Therefore,
4NH3 + 3O2 −→ 2N2 + 6H2O
is the balanced chemical equation. •
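Balancing can also be phrased as a null-space computation; MATLAB's rational-basis option recovers the same coefficients (a small illustration):

>> A = [1 0 -2 0; 3 0 0 -2; 0 2 0 -1];   % columns correspond to w, x, y, z
>> v = null(A, 'r');                     % rational basis of the null space
>> 6*v'
ans =
     4     3     2     6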
Solve this system for x, y, and z using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} 15&12&10&1250\\ 4&4.5&3&400\\ 5&2.5&2.5&320 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} 15&12&10&1250\\ 0&\frac{13}{10}&\frac13&\frac{200}{3}\\ 0&-\frac32&-\frac56&-\frac{290}{3} \end{array}\right]
\approx
\left[\begin{array}{ccc|c} 15&12&10&1250\\ 0&\frac{13}{10}&\frac13&\frac{200}{3}\\ 0&0&-\frac{35}{78}&-\frac{770}{39} \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
−(35/78)z = −770/39 gives z = 44,
(13/10)y + (1/3)z = 200/3 gives y = 40,
15x + 12y + 10z = 1250 gives x = 22.
Example 1.65 (Weather) The average of the temperatures for the cities of Jeddah, Makkah, and Riyadh was 50°C during a given summer day. The temperature in Makkah was 5°C higher than the average of the temperatures of the other two cities. The temperature in Riyadh was 5°C lower than the average temperature of the other two cities. What was the temperature in each of the cities?
Solution. Let x, y, and z denote the temperatures in Jeddah, Makkah, and Riyadh, respectively. The average (x + y + z)/3 equals 50°C. On the other hand, the temperature in Makkah exceeds the average temperature of Jeddah and Riyadh, (x + z)/2, by 5°C, so y = (x + z)/2 + 5. Likewise, we have z = (x + y)/2 − 5. So the system becomes
\[
\frac{x + y + z}{3} = 50, \qquad
y = \frac{x + z}{2} + 5, \qquad
z = \frac{x + y}{2} - 5.
\]
Rewriting the above system in standard form, we get
x + y + z = 150
−x + 2y − z = 10
−x − y + 2z = −10.
Solve this system for x, y, and z using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} 1&1&1&150\\ -1&2&-1&10\\ -1&-1&2&-10 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} 1&1&1&150\\ 0&3&0&160\\ 0&0&3&140 \end{array}\right].
\]
Backward substitution gives z = 140/3 ≈ 46.7, y = 160/3 ≈ 53.3, and x = 150 − y − z = 50. Thus, the temperature in Jeddah was 50°C, and the temperatures in Makkah and Riyadh were approximately 53°C and 47°C, respectively. •
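In MATLAB:

>> A = [1 1 1; -1 2 -1; -1 -1 2]; b = [150 10 -10]';
>> t = A \ b
t =
   50.0000
   53.3333
   46.6667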
following rates: the dollar was 60 rupees, 0.6 pounds, and 3.75 riyals. The
second time he exchanged a total of $25500 at these rates: the dollar was
65 rupees, 0.56 pounds, and 3.76 riyals. The third time he exchanged again
a total of $25500 at these rates: the dollar was 65 rupees, 0.6 pounds, and
3.75 riyals. How many rupees, pounds, and riyals did he buy each time?
Solution. Let x, y, and z be the amounts of rupees, pounds, and riyals he bought, respectively. The first time, a dollar bought 60 rupees, 0.6 pounds, and 3.75 riyals, so
\[
\frac{1}{60}x + \frac{1}{0.6}y + \frac{1}{3.75}z = 26000.
\]
The same reasoning applies to the other two purchases, and we get the system
\[
\frac{1}{60}x + \frac{5}{3}y + \frac{4}{15}z = 26000, \qquad
\frac{1}{65}x + \frac{25}{14}y + \frac{25}{94}z = 25500, \qquad
\frac{1}{65}x + \frac{5}{3}y + \frac{4}{15}z = 25500.
\]
Solve this system for x, y, and z using the Gauss elimination method:
\[
\left[\begin{array}{ccc|c} \frac{1}{60}&\frac53&\frac{4}{15}&26000\\ \frac{1}{65}&\frac{25}{14}&\frac{25}{94}&25500\\ \frac{1}{65}&\frac53&\frac{4}{15}&25500 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} \frac{1}{60}&\frac53&\frac{4}{15}&26000\\ 0&\frac{45}{182}&\frac{121}{6110}&1500\\ 0&\frac{5}{39}&\frac{4}{195}&1500 \end{array}\right]
\approx
\left[\begin{array}{ccc|c} \frac{1}{60}&\frac53&\frac{4}{15}&26000\\ 0&\frac{45}{182}&\frac{121}{6110}&1500\\ 0&0&\frac{13}{1269}&\frac{6500}{9} \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
(13/1269)z = 6500/9 gives z = 70500,
(45/182)y + (121/6110)z = 1500 gives y = 420,
(1/60)x + (5/3)y + (4/15)z = 26000 gives x = 390000.
Therefore, each time he bought 390000 rupees, 420 pounds, and 70500 riyals for his trips. •
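A floating-point check in MATLAB:

>> A = [1/60 5/3 4/15; 1/65 25/14 25/94; 1/65 5/3 4/15];
>> b = [26000 25500 25500]';
>> m = A \ b
m =
   1.0e+05 *
    3.9000
    0.0042
    0.7050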
Example 1.67 (Inheritance) A father plans to distribute his estate, worth SR234,000, among his four daughters as follows: 2/3 of the estate is to be split equally among the daughters. For the rest, each daughter is to receive SR3,000 for each year that remains until her 21st birthday. Given that the daughters are all 3 years apart, how much would each receive from her father's estate? How old are the daughters now?
Solve this system for x1, x2, x3, and x4 using the Gauss elimination method with partial pivoting:
\[
\left[\begin{array}{cccc|c} 1&1&1&1&78{,}000\\ 0&0&-1&1&9{,}000\\ 0&-1&1&0&9{,}000\\ -1&1&0&0&9{,}000 \end{array}\right]
\approx
\left[\begin{array}{cccc|c} 1&1&1&1&78{,}000\\ 0&0&-1&1&9{,}000\\ 0&-1&1&0&9{,}000\\ 0&2&1&1&87{,}000 \end{array}\right]
\]
and
\[
\left[\begin{array}{cccc|c} 1&1&1&1&78{,}000\\ 0&2&1&1&87{,}000\\ 0&0&\frac32&\frac12&52{,}500\\ 0&0&-1&1&9{,}000 \end{array}\right]
\approx
\left[\begin{array}{cccc|c} 1&1&1&1&78{,}000\\ 0&2&1&1&87{,}000\\ 0&0&\frac32&\frac12&52{,}500\\ 0&0&0&\frac43&44{,}000 \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
(4/3)w = 44,000 gives w = 33,000,
(3/2)z + (1/2)w = 52,500 gives z = 24,000,
18x + 9y + 9z + 12w = 72
6x + 6y + 9z + 12w = 45
6x + 12y + 6z + 9w = 42
6x + 9y + 9z + 18w = 60.
Solve this system for x, y, z, and w using the Gauss elimination method:
\[
\left[\begin{array}{cccc|c} 18&9&9&12&72\\ 6&6&9&12&45\\ 6&12&6&9&42\\ 6&9&9&18&60 \end{array}\right]
\approx
\left[\begin{array}{cccc|c} 18&9&9&12&72\\ 0&3&6&8&21\\ 0&9&3&5&18\\ 0&6&6&14&36 \end{array}\right]
\]
and
\[
\left[\begin{array}{cccc|c} 18&9&9&12&72\\ 0&3&6&8&21\\ 0&0&-15&-19&-45\\ 0&0&-6&-2&-6 \end{array}\right]
\approx
\left[\begin{array}{cccc|c} 18&9&9&12&72\\ 0&3&6&8&21\\ 0&0&-15&-19&-45\\ 0&0&0&\frac{28}{5}&12 \end{array}\right].
\]
Now use backward substitution to get the solution of the system:
(28/5)w = 12 gives w = 15/7,
−15z − 19w = −45 gives z = 2/7,
3y + 6z + 8w = 21 gives y = 5/7,
18x + 9y + 9z + 12w = 72 gives x = 29/14.
Thus, the amounts in ounces of foods A, B, C, and D are x = 29/14, y = 5/7, z = 2/7, and w = 15/7, respectively. •
1.7 Summary
The basic methods for solving systems of linear algebraic equations were discussed in this chapter. Since these methods use matrices and determinants, the basic properties of matrices and determinants were presented. Several direct solution methods were also discussed. Among them were Cramer's rule, Gaussian elimination and its variants, the Gauss–Jordan method, and the LU decomposition method. Cramer's rule is impractical for solving systems with more than three or four equations. Gaussian elimination is generally the best choice for solving linear systems. For systems of equations having a constant coefficient matrix but many right-hand side vectors, the LU decomposition method is preferred.
1.8 Problems
1. Determine the matrix C given by the following expression
C = 2A − 3B,
4. Let
\[
A = \begin{bmatrix} 1&2&3\\ 0&-1&2\\ 2&0&2 \end{bmatrix}, \quad
B = \begin{bmatrix} 1&1&2\\ -1&1&-1\\ 1&0&2 \end{bmatrix}, \quad
C = \begin{bmatrix} 1&0&1\\ 0&1&2\\ 2&0&1 \end{bmatrix}.
\]
6. Find the values of a and b such that each of the following matrices is symmetric:
\[
\text{(a) } A = \begin{bmatrix} 1&3&5\\ a+2&5&6\\ b+1&6&7 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} -2&a+b&2\\ 3&4&2a+b\\ 2&5&-3 \end{bmatrix},
\]
\[
\text{(c) } C = \begin{bmatrix} 1&4&a-b\\ 4&2&a+3b\\ 7&3&4 \end{bmatrix}, \quad
\text{(d) } D = \begin{bmatrix} 1&a-4b&2\\ 2&8&6\\ 7&a-7b&8 \end{bmatrix}.
\]
(a)
\[
A = \begin{bmatrix} 1&-5\\ 5&0 \end{bmatrix}, \quad B = \begin{bmatrix} 0&-4\\ 4&0 \end{bmatrix},
\]
(b)
\[
C = \begin{bmatrix} 1&9\\ -9&7 \end{bmatrix}, \quad D = \begin{bmatrix} 1&6\\ -6&2 \end{bmatrix},
\]
(c)
\[
E = \begin{bmatrix} 0&2&-2\\ -2&0&4\\ 2&-4&0 \end{bmatrix}, \quad
F = \begin{bmatrix} 3&-3&-3\\ 3&3&-3\\ 3&3&3 \end{bmatrix},
\]
(d)
\[
G = \begin{bmatrix} 1&-5&1\\ 5&1&4\\ -1&-4&1 \end{bmatrix}, \quad
H = \begin{bmatrix} 2&8&6\\ -8&4&2\\ -6&-2&5 \end{bmatrix}.
\]
(a)
\[
A = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1&0&8\\ 0&1&2\\ 0&0&0 \end{bmatrix},
\]
(b)
\[
C = \begin{bmatrix} 1&2&3&0\\ 0&0&0&1\\ 0&0&0&1 \end{bmatrix}, \quad
D = \begin{bmatrix} 1&2&0&0&1\\ 0&0&1&0&1\\ 0&0&0&1&0 \end{bmatrix},
\]
(c)
\[
E = \begin{bmatrix} 1&4&5&6\\ 0&1&7&8\\ 0&0&1&9\\ 0&0&0&0 \end{bmatrix}, \quad
F = \begin{bmatrix} 0&1&0&0&3\\ 0&0&1&0&4\\ 0&0&0&1&5 \end{bmatrix},
\]
(d)
\[
G = \begin{bmatrix} 1&0&0&3\\ 0&1&0&4\\ 0&0&0&5\\ 0&0&0&6 \end{bmatrix}, \quad
H = \begin{bmatrix} 0&0&0&0&0\\ 0&0&1&2&4\\ 0&0&0&1&0\\ 0&0&0&0&0 \end{bmatrix}.
\]
9. Find the row echelon form of each of the following matrices using elementary row operations, and then solve the linear system:
\[
\text{(a) } A = \begin{bmatrix} 0&1&2\\ 2&3&4\\ 1&3&2 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ -1\\ 2 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 1&2&3\\ 0&3&1\\ -1&4&5 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 0\\ -3 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 0&-1&0\\ 3&0&1\\ 0&1&1 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 3\\ 2 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 0&-1&2&4\\ 2&3&5&6\\ 1&3&-2&4\\ 1&2&-1&3 \end{bmatrix}, \; b = \begin{bmatrix} 2\\ 1\\ -1\\ 2 \end{bmatrix}.
\]
10. Find the row echelon form of each of the following matrices using elementary row operations, and then solve the linear system:
\[
\text{(a) } A = \begin{bmatrix} 1&4&-2\\ 2&3&2\\ 6&4&1 \end{bmatrix}, \; b = \begin{bmatrix} 5\\ -3\\ 4 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 2&2&7\\ 0&3&2\\ 3&2&1 \end{bmatrix}, \; b = \begin{bmatrix} 3\\ 2\\ 5 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 0&-1&0\\ 5&0&2\\ -1&1&4 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 1&1&2&4\\ 1&3&4&5\\ 1&4&2&4\\ 2&2&-1&3 \end{bmatrix}, \; b = \begin{bmatrix} 11\\ 7\\ 6\\ 4 \end{bmatrix}.
\]
11. Find the reduced row echelon form of each of the following matrices using elementary row operations, and then solve the linear system:
\[
\text{(a) } A = \begin{bmatrix} 1&2&3\\ -1&2&1\\ 0&1&2 \end{bmatrix}, \; b = \begin{bmatrix} 4\\ 3\\ 1 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 0&1&4\\ 2&1&-1\\ 1&3&4 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 1\\ -1 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 0&-1&3&2\\ 3&2&5&4\\ -1&3&1&2\\ 2&3&4&1 \end{bmatrix}, \; b = \begin{bmatrix} 6\\ 4\\ 4\\ 4 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 1&2&-4&1\\ -2&0&2&3\\ 0&1&-1&2\\ 2&3&0&-1 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ -1\\ 2\\ 4 \end{bmatrix}.
\]
14. Let
\[
A = \begin{bmatrix} 1&1\\ 0&1 \end{bmatrix}, \quad B = \begin{bmatrix} 1&0\\ 1&1 \end{bmatrix};
\]
then show that (AB)^{-1} = B^{-1}A^{-1}.
15. Evaluate the determinant of each of the following matrices using the Gauss elimination method:
\[
A = \begin{bmatrix} 3&1&-1\\ 2&0&4\\ 1&-5&1 \end{bmatrix}, \quad
B = \begin{bmatrix} 4&1&6\\ -3&6&4\\ 5&0&9 \end{bmatrix}, \quad
C = \begin{bmatrix} 17&46&7\\ 20&49&8\\ 23&52&19 \end{bmatrix}.
\]
16. Evaluate the determinant of each of the following matrices using the Gauss elimination method:
\[
A = \begin{bmatrix} 4&2&5&-1\\ 2&5&4&6\\ 3&1&4&3\\ 11&7&1&1 \end{bmatrix}, \quad
B = \begin{bmatrix} 4&-2&5&-3\\ 1&8&12&7\\ 4&5&1&6\\ 5&3&-3&6 \end{bmatrix},
\]
\[
C = \begin{bmatrix} 13&22&-12&8\\ 15&10&33&4\\ 9&-12&5&7\\ 15&33&-19&26 \end{bmatrix}, \quad
D = \begin{bmatrix} 9&11&2&8\\ 15&1&3&12\\ 9&-12&5&17\\ 13&17&21&15 \end{bmatrix}.
\]
17. Find all zeros (values of x such that f(x) = 0) of the polynomial f(x) = det(A), where
\[
A = \begin{bmatrix} x-1&3&2\\ 3&x&1\\ 2&1&x-2 \end{bmatrix}.
\]
18. Find all zeros (values of x such that f(x) = 0) of the polynomial f(x) = det(A), where
\[
A = \begin{bmatrix} x&0&1\\ 2&1&3\\ 0&x&2 \end{bmatrix}.
\]
19. Find all zeros (values of x such that f(x) = 0) of the polynomial f(x) = det(A), where
\[
A = \begin{bmatrix} x&-8&5&2\\ -3&x&2&1\\ 3&4&x&1\\ 3&6&-5&17 \end{bmatrix}.
\]
23. Find the inverse and determinant of the adjoint matrix of each of the following matrices:
\[
A = \begin{bmatrix} 4&1&5\\ 5&6&3\\ 5&4&4 \end{bmatrix}, \quad
B = \begin{bmatrix} 3&4&-2\\ 2&5&4\\ 7&-3&4 \end{bmatrix}, \quad
C = \begin{bmatrix} 1&2&4\\ 1&4&0\\ 3&1&1 \end{bmatrix}.
\]
24. Find the inverse and determinant of the adjoint matrix of each of the following matrices:
\[
A = \begin{bmatrix} 3&2&5\\ 2&5&4\\ 5&4&6 \end{bmatrix}, \quad
B = \begin{bmatrix} 5&3&-2\\ 3&5&6\\ -2&6&5 \end{bmatrix}, \quad
C = \begin{bmatrix} 1&2&3\\ 4&5&6\\ 7&8&8 \end{bmatrix}.
\]
25. Find the inverse of each of the following matrices using the determinant:
\[
A = \begin{bmatrix} 0&1&5\\ 3&1&2\\ 2&3&4 \end{bmatrix}, \quad
B = \begin{bmatrix} 2&4&-2\\ -4&7&5\\ 5&-4&4 \end{bmatrix}, \quad
C = \begin{bmatrix} 0&4&2&-4\\ 6&1&4&-3\\ 4&3&1&3\\ 8&4&-3&2 \end{bmatrix}.
\]
(a)
x1 − 2x2 + x3 = 0
x1 + x2 + 3x3 = 0
2x1 + 3x2 − 5x3 = 0.
(b)
x1 − 5x2 + 3x3 = 0
2x1 + 3x2 + 2x3 = 0
x1 − 2x2 − 4x3 = 0.
(c)
3x1 + 4x2 − 2x3 = 0
2x1 − 5x2 − 4x3 = 0
3x1 − 2x2 + 3x3 = 0.
(d)
x1 + x2 + 3x3 − 2x4 = 0
x1 + 2x2 + 5x3 + x4 = 0
x1 − 3x2 + x3 + 2x4 = 0.
27. Find value(s) of α such that each of the following homogeneous linear
systems has a nontrivial solution:
(a)
2x1 − (1 − 3α)x2 = 0
x1 + αx2 = 0.
(b)
2x1 + 2αx2 − x3 = 0
x1 − 2x2 + x3 = 0
αx1 + 2x2 − 3x3 = 0.
(c)
x1 + 2x2 + 4x3 = 0
3x1 + 7x2 + αx3 = 0
3x1 + 3x2 + 15x3 = 0.
(d)
x1 + x2 + 2x3 − 3x4 = 0
x1 + 2x2 + x3 − 2x4 = 0
3x1 + x2 + αx3 + 3x4 = 0
2x1 + 3x2 + x3 + αx4 = 0.
28. Using the matrices in Problem 15, solve the following systems using
the matrix inversion method:
29. Solve the following systems using the matrix inversion method:
(a)
x1 + 3x2 − x3 = 4
5x1 − 2x2 − x3 = −2
2x1 + 2x2 + x3 = 9.
(b)
x1 + x2 + 3x3 = 2
5x1 + 3x2 + x3 = 3
2x1 + 3x2 + x3 = −1.
(c)
4x1 + x2 − 3x3 = −1
3x1 + 2x2 − 6x3 = −2
x1 − 5x2 + 3x3 = −3.
(d)
7x1 + 11x2 − 15x3 = 21
3x1 + 22x2 − 18x3 = 12
2x1 − 13x2 + 9x3 = 16.
30. Solve the following systems using the matrix inversion method:
(a)
3x1 − 2x2 − 4x3 = 7
5x1 − 2x2 − 3x3 = 8
7x1 + 4x2 + 2x3 = 9.
(b)
−3x1 + 4x2 + 3x3 = 11
5x1 + 3x2 + x3 = 12
x1 + x2 + 5x3 = 10.
(c)
x1 + 4x2 − 8x3 = 7
2x1 + 7x2 − 5x3 = −5
3x1 − 6x2 + 6x3 = 4.
(d)
17x1 + 18x2 − 19x3 = 10
43x1 + 22x2 − 14x3 = 11
25x1 − 33x2 + 21x3 = 12.
31. Solve the following systems using the matrix inversion method:
(a)
2x1 + 3x2 − 4x3 + 4x4 = 11
x1 + 3x2 − 4x3 + 2x4 = 12
4x1 + 3x2 + 2x3 + 3x4 = 14
3x1 − 4x2 + 5x3 + 6x4 = 15.
(b)
7x1 + 13x2 + 12x3 + 9x4 = 21
3x1 + 23x2 − 5x3 + 2x4 = 10
4x1 − 7x2 + 22x3 + 3x4 = 11
3x1 − 4x2 + 25x3 + 16x4 = 10.
(c)
12x1 + 6x2 + 5x3 − 2x4 = 21
11x1 + 13x2 + 7x3 + 2x4 = 22
14x1 + 9x2 + 2x3 − 6x4 = 23
7x1 − 24x2 − 7x3 + 8x4 = 24.
(d)
15x1 − 26x2 + 15x3 − 11x4 = 17
14x1 + 15x2 + 7x3 + 7x4 = 18
17x1 + 14x2 − 22x3 − 16x4 = 19
21x1 − 12x2 − 7x3 + 8x4 = 20.
(a)
3x1 + 4x2 + 5x3 = 1
3x1 + 2x2 + x3 = 2
4x1 + 3x2 + 5x3 = 3.
(b)
x1 − 4x2 + 2x3 = 4
−4x1 + 5x2 + 6x3 = 0
7x1 − 3x2 + 5x3 = 4.
(c)
6x1 + 7x2 + 8x3 = 1
−5x1 + 3x2 + 2x3 = 1
x1 + 2x2 + 3x3 = 1.
(d)
x1 + 3x2 − 4x3 + 5x4 = 2
6x1 − x2 + 6x3 + 3x4 = −3
2x1 + x2 + 3x3 + 2x4 = 4
x1 + 5x2 + 6x3 + 7x4 = 2.
(a)
2x1 − 2x2 + 8x3 = 1
5x1 + 6x2 + 5x3 = 2
7x1 + 7x2 + 9x3 = 3.
(b)
3x1 − 3x2 + 12x3 = 14
−4x1 + 5x2 + 16x3 = 18
x1 − 15x2 + 24x3 = 19.
(c)
9x1 − 11x2 + 12x3 = 3
−5x1 + 3x2 + 2x3 = 4
7x1 − 12x2 + 13x3 = 5.
(d)
11x1 + 3x2 − 13x3 + 15x4 = 22
26x1 − 5x2 + 6x3 + 13x4 = 23
22x1 + 6x2 + 13x3 + 12x4 = 24
17x1 − 25x2 + 16x3 + 27x4 = 25.
36. Use the simple Gaussian elimination method to show that the fol-
lowing system does not have a solution:
3x1 + x2 = 1.5
2x1 − x2 − x3 = 2
4x1 + 3x2 + x3 = 0.
38. Solve the following systems using the simple Gaussian elimination
method:
(a)
x1 − x2 = −2
−x1 + 2x2 − x3 = 5
4x1 − x2 + 4x3 = 1.
(b)
3x1 + x2 − x3 = 5
5x1 − 3x2 + 2x3 = 7
2x1 − x2 + x3 = 3.
(c)
3x1 + x2 + x3 = 2
2x1 + 2x2 + 4x3 = 3
4x1 + 9x2 + 16x3 = 1.
(d)
2x1 + x2 + x3 − x4 = 9
x1 + 9x2 + 8x3 + 4x4 = 11
−x1 + 3x2 + 5x3 + 2x4 = 10
5x1 + x2 + x4 = 12.
39. Solve the following systems using the simple Gaussian elimination
method:
(a)
2x1 + 5x2 − 4x3 = 3
2x1 + 2x2 − x3 = 1
3x1 + 2x2 − 3x3 = −5.
(b)
2x2 − x3 = 1
3x1 − x2 + 2x3 = 4
x1 + 3x2 − 5x3 = 1.
(c)
x1 + 2x2 = 3
−x1 − 2x3 = −5
−3x1 − 5x2 + x3 = −4.
(d)
3x1 + 2x2 + 4x3 − x4 = 2
x1 + 4x2 + 5x3 + x4 = 1
4x1 + 5x2 + 4x3 + 3x4 = 5
2x1 + 3x2 + 2x3 + 4x4 = 6.
40. For what values of a and b does the following linear system have no
solution or infinitely many solutions:
(a)
2x1 + x2 + x3 = 2
−2x1 + x2 + 3x3 = a
2x1 − x3 = b.
(b)
2x1 + 3x2 − x3 = 1
x1 − x2 + 3x3 = a
3x1 + 7x2 − 5x3 = b.
(c)
2x1 − x2 + 3x3 = 3
3x1 + x2 − 5x3 = a
−5x1 − 5x2 + 21x3 = b.
(d)
2x1 − x2 + 3x3 = 5
4x1 + 2x2 + bx3 = 6
−2x1 + ax2 + 3x3 = 4.
41. Find the value(s) of α so that each of the following linear systems
has a nontrivial solution:
(a)
2x1 + 2x2 + 3x3 = 1
3x1 + αx2 + 5x3 = 3
x1 + 7x2 + 3x3 = 2.
(b)
x1 + 2x2 + x3 = 2
x1 + 3x2 + 6x3 = 5
2x1 + 3x2 + αx3 = 6.
(c)
αx1 + x2 + x3 = 7
x1 + x2 − x3 = 2
x1 + x2 + αx3 = 1.
(d)
2x1 + αx2 + 3x3 = 9
3x1 − 4x2 − 5x3 = 11
4x1 + 5x2 + αx3 = 12.
42. Find the inverse of each of the following matrices by using the simple Gauss elimination method:
\[
A = \begin{bmatrix} 3&3&3\\ 0&2&2\\ 2&4&5 \end{bmatrix}, \quad
B = \begin{bmatrix} 5&3&2\\ 3&2&2\\ 2&6&5 \end{bmatrix}, \quad
C = \begin{bmatrix} 1&2&3\\ 2&5&2\\ 3&4&3 \end{bmatrix}.
\]
43. Find the inverse of each of the following matrices by using the simple Gauss elimination method:
\[
A = \begin{bmatrix} 3&2&3\\ 4&2&2\\ 2&4&3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1&-3&2\\ 3&2&6\\ 2&-6&5 \end{bmatrix}, \quad
C = \begin{bmatrix} 5&2&3\\ 2&5&5\\ 3&2&4 \end{bmatrix}.
\]
48. Solve the following linear systems using Gaussian elimination with
partial and without pivoting:
(a)
1.001x1 + 1.5x2 = 0
2x1 + 3x2 = 1.
(b)
x1 + 1.001x2 = 2.001
x1 + x2 = 2.
(c)
6.122x1 + 1500.5x2 = 1506.622
2000x1 + 3x2 = 2003.
49. The elements of matrix A, the Hilbert matrix, are defined by
aij = 1/(i + j − 1), for i, j = 1, 2, . . . , n.
Find the solution of the system Ax = b for n = 4 and b = [1, 2, 3, 4]T
using Gaussian elimination by partial pivoting.
50. Solve the following systems using the Gauss–Jordan method:
(a)
x1 + 4x2 + x3 = 1
2x1 + 4x2 + x3 = 9
3x1 + 5x2 − 2x3 = 11.
(b)
x1 + x 2 + x3 = 1
2x1 − x2 + 3x3 = 4
3x1 + 2x2 − 2x3 = −2.
(c)
2x1 + 3x2 + 6x3 + x4 = 2
x1 + x2 − 2x3 + 4x4 = 1
3x1 + 5x2 − 2x3 + 2x4 = 11
2x1 + 2x2 + 2x3 − 3x4 = 2.
51. The following sets of linear equations have a common coefficients ma-
trix but different right-side terms:
(a)
2x1 + 3x2 + 5x3 = 0
3x1 + x2 − 2x3 = −2
x1 + 3x2 + 4x3 = −3.
(b)
2x1 + 3x2 + 5x3 = 1
3x1 + x2 − 2x3 = 2
x1 + 3x2 + 4x3 = 4.
(c)
2x1 + 3x2 + 5x3 = −5
3x1 + x2 − 2x3 = 6
x1 + 3x2 + 4x3 = −1.
The coefficients and the three sets of right-side terms may be combined into an augmented matrix of the form
\[
\left[\begin{array}{ccc|ccc} 2&3&5&0&1&-5\\ 3&1&-2&-2&2&6\\ 1&3&4&-3&4&-1 \end{array}\right].
\]
If we apply the Gauss–Jordan method to this augmented matrix form
and reduce the first three columns to the unity matrix form, the solu-
tion for the three problems are automatically obtained in the fourth,
fifth, and sixth columns when elimination is completed. Calculate
the solution in this way.
52. Calculate the inverse of each matrix using the Gauss–Jordan method:
\[
\text{(a) } \begin{bmatrix} 3&-9&5\\ 0&5&1\\ -1&6&3 \end{bmatrix}, \quad
\text{(b) } \begin{bmatrix} 1&4&5\\ 2&1&2\\ 8&1&1 \end{bmatrix}, \quad
\text{(c) } \begin{bmatrix} 5&-2&0&0\\ -2&5&-2&0\\ 0&-2&5&-2\\ 0&0&-2&5 \end{bmatrix}.
\]
53. Find the inverse of the Hilbert matrix of size 4 × 4 using the Gauss–
Jordan method. Then solve the linear system Ax = [1, 2, 3, 4]T .
54. Find the LU decomposition of each matrix A using Doolittle's method and then solve the systems:
\[
\text{(a) } A = \begin{bmatrix} 2&-1&1\\ -3&4&-1\\ 1&-1&1 \end{bmatrix}, \; b = \begin{bmatrix} 4\\ 5\\ 6 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 7&6&5\\ 5&4&3\\ 3&7&6 \end{bmatrix}, \; b = \begin{bmatrix} 2\\ 1\\ 2 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 2&2&2\\ 1&2&1\\ 3&3&4 \end{bmatrix}, \; b = \begin{bmatrix} 0\\ -4\\ 1 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 2&4&-6\\ 1&5&3\\ 1&3&2 \end{bmatrix}, \; b = \begin{bmatrix} -4\\ 10\\ 5 \end{bmatrix}.
\]
\[
\text{(e) } A = \begin{bmatrix} 1&-1&0\\ 2&-1&1\\ 2&-2&-1 \end{bmatrix}, \; b = \begin{bmatrix} 2\\ 4\\ 3 \end{bmatrix}. \qquad
\text{(f) } A = \begin{bmatrix} 1&5&3\\ 2&4&6\\ 1&3&2 \end{bmatrix}, \; b = \begin{bmatrix} 4\\ 11\\ 5 \end{bmatrix}.
\]
55. Find the LU decomposition of each matrix A using Doolittle's method, and then solve the systems:
\[
\text{(a) } A = \begin{bmatrix} 3&-2&1&1\\ -3&7&4&-3\\ 2&-5&3&4\\ 7&-3&2&4 \end{bmatrix}, \; b = \begin{bmatrix} 3\\ 2\\ 1\\ 2 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 2&-4&5&3\\ 3&5&-4&3\\ 1&6&2&6\\ 7&2&5&1 \end{bmatrix}, \; b = \begin{bmatrix} 6\\ 5\\ 2\\ 4 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 2&2&3&-2\\ 10&2&13&11\\ 2&5&4&6\\ 1&-4&-2&7 \end{bmatrix}, \; b = \begin{bmatrix} 10\\ 14\\ 11\\ 9 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 5&12&4&-11\\ 21&15&13&23\\ 31&33&12&22\\ -17&15&14&11 \end{bmatrix}, \; b = \begin{bmatrix} 44\\ 33\\ 55\\ 22 \end{bmatrix}.
\]
\[
\text{(e) } A = \begin{bmatrix} 1&-1&10&8\\ 12&-17&11&22\\ 22&31&13&-1\\ 8&24&13&9 \end{bmatrix}, \; b = \begin{bmatrix} -2\\ 7\\ 6\\ 5 \end{bmatrix}.
\]
(f )
41
41 25 23 −18 1
A = 2 13 −16 12 , 15 .
b=
11 13 9 7
13
\[
\text{(a) } A = \begin{bmatrix} 1&-1&2\\ -1&3&-1\\ \alpha&-2&3 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 1&5&7\\ 4&4&\alpha\\ -2&\alpha&9 \end{bmatrix}. \qquad
\text{(c) } A = \begin{bmatrix} 2&-4&\alpha\\ 2&4&3\\ 4&-2&5 \end{bmatrix}.
\]
\[
\text{(d) } A = \begin{bmatrix} 2&\alpha&1-\alpha\\ 2&5&-2\\ 2&5&4 \end{bmatrix}. \qquad
\text{(e) } A = \begin{bmatrix} 1&-1&3\\ 3&2&3\\ 4&\alpha-2&7 \end{bmatrix}. \qquad
\text{(f) } A = \begin{bmatrix} 1&5&\alpha\\ 1&4&\alpha-2\\ 1&-2&8 \end{bmatrix}.
\]
58. Use the smallest positive integer to find the unique solution of each of the linear systems of Problem 56 using LU decomposition by Doolittle's method:
\[
\text{(a) } A = \begin{bmatrix} 3&4&3\\ 2&3&3\\ 1&3&5 \end{bmatrix}, \quad
B = \begin{bmatrix} 4&-2&3\\ 5&2&-3\\ 4&3&6 \end{bmatrix}.
\]
\[
\text{(b) } A = \begin{bmatrix} 2&5&4\\ 2&1&6\\ 3&2&7 \end{bmatrix}, \quad
B = \begin{bmatrix} 3&2&-6\\ 2&2&-5\\ 3&4&7 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 1&-5&4\\ 2&3&-4\\ 3&2&6 \end{bmatrix}, \quad
B = \begin{bmatrix} 4&7&-6\\ 5&5&-5\\ 6&-4&9 \end{bmatrix}.
\]
\[
\text{(d) } A = \begin{bmatrix} 3&-1&4\\ 2&2&-1\\ 3&2&2 \end{bmatrix}, \quad
B = \begin{bmatrix} 2&3&4&5\\ 3&1&2&4\\ 3&1&1&1\\ 4&3&1&2 \end{bmatrix}.
\]
\[
\text{(a) } A = \begin{bmatrix} 2&3&4\\ 3&5&2\\ 4&2&6 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 3&-2&4\\ -2&2&1\\ 4&1&3 \end{bmatrix}. \qquad
\text{(c) } A = \begin{bmatrix} 2&1&-1\\ 1&3&2\\ -1&2&2 \end{bmatrix}.
\]
\[
\text{(d) } A = \begin{bmatrix} 1&-2&3&4\\ -2&3&4&5\\ 3&4&5&-6\\ 4&5&-6&7 \end{bmatrix}.
\]
\[
\text{(a) } A = \begin{bmatrix} 1&-1&1\\ -1&5&-1\\ 1&-1&10 \end{bmatrix}, \; b = \begin{bmatrix} 2\\ 2\\ 2 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 10&2&1\\ 2&10&3\\ 1&3&10 \end{bmatrix}, \; b = \begin{bmatrix} 7\\ -4\\ 3 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 4&2&3\\ 2&17&1\\ 3&1&5 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 2\\ 5 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 3&4&-6&0\\ 4&5&3&1\\ -6&3&3&1\\ 0&1&1&3 \end{bmatrix}, \; b = \begin{bmatrix} 4\\ 5\\ 2\\ 3 \end{bmatrix}.
\]
\[
\text{(a) } A = \begin{bmatrix} 5&-1&1\\ -3&5&-2\\ 2&-1&7 \end{bmatrix}, \; b = \begin{bmatrix} 5\\ 7\\ 9 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 6&2&-3\\ 3&12&-4\\ 6&3&13 \end{bmatrix}, \; b = \begin{bmatrix} 5\\ 2\\ 4 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 5&2&-5\\ 2&4&4\\ -3&-2&7 \end{bmatrix}, \; b = \begin{bmatrix} 3\\ 11\\ 14 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 1&4&-6&0\\ 2&2&3&3\\ -3&6&7&1\\ 0&2&-3&5 \end{bmatrix}, \; b = \begin{bmatrix} 12\\ 13\\ 14\\ 15 \end{bmatrix}.
\]
\[
\text{(a) } A = \begin{bmatrix} 3&-1&0\\ -1&3&-1\\ 0&-1&3 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 2&3&0&0\\ 3&2&3&0\\ 0&3&2&3\\ 0&0&3&2 \end{bmatrix}, \; b = \begin{bmatrix} 6\\ 7\\ 5\\ 3 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 4&-1&0&0\\ -1&4&-1&0\\ 0&-1&4&-1\\ 0&0&-1&4 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 1\\ 1\\ 1 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 2&3&0&0\\ 3&5&4&0\\ 0&4&6&3\\ 0&0&3&4 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 2\\ 3\\ 4 \end{bmatrix}.
\]
\[
\text{(a) } A = \begin{bmatrix} 4&-2&0\\ -2&5&-2\\ 0&-2&6 \end{bmatrix}, \; b = \begin{bmatrix} 5\\ 6\\ 7 \end{bmatrix}. \qquad
\text{(b) } A = \begin{bmatrix} 8&1&0&0\\ 1&8&1&0\\ 0&1&8&1\\ 0&0&1&8 \end{bmatrix}, \; b = \begin{bmatrix} 2\\ 2\\ 2\\ 2 \end{bmatrix}.
\]
\[
\text{(c) } A = \begin{bmatrix} 5&-3&0&0\\ -3&6&-2&0\\ 0&-2&7&-5\\ 0&0&-5&8 \end{bmatrix}, \; b = \begin{bmatrix} 7\\ -5\\ 4\\ 2 \end{bmatrix}. \qquad
\text{(d) } A = \begin{bmatrix} 2&-4&0&0\\ -4&5&7&0\\ 0&7&6&2\\ 0&0&2&8 \end{bmatrix}, \; b = \begin{bmatrix} 11\\ 12\\ 13\\ 14 \end{bmatrix}.
\]
67. Find ‖x‖1, ‖x‖2, and ‖x‖∞ for the following vectors:
(a) [2, −1, −6, 3]^T.
(b) [sin k, cos k, 3^k]^T, for a fixed integer k.
68. Find ‖·‖1, ‖·‖∞, and ‖·‖e for the following matrices:
\[
A = \begin{bmatrix} 3&1&-1\\ 2&0&4\\ 1&-5&1 \end{bmatrix}, \quad
B = \begin{bmatrix} 4&1&6\\ -3&6&4\\ 5&0&9 \end{bmatrix},
\]
\[
C = \begin{bmatrix} 17&46&7\\ 20&49&8\\ 23&52&9 \end{bmatrix}, \quad
D = \begin{bmatrix} 3&11&-5&2\\ 6&8&-11&6\\ -4&-8&10&14\\ 13&14&-12&9 \end{bmatrix}.
\]
(a)
0.89x1 + 0.53x2 = 0.36
0.47x1 + 0.28x2 = 0.19,
x = [1, −1]^T, x* = [0.702, −0.500]^T.
(b)
0.986x1 + 0.579x2 = 0.235
0.409x1 + 0.237x2 = 0.107,
x = [2, −3]^T, x* = [2.110, −3.170]^T.
(c)
1.003x1 + 58.090x2 = 68.12
5.550x1 + 321.8x2 = 377.3,
x = [10, 1]^T, x* = [−10, 1]^T.
1.01x1 + 0.99x2 = 2
0.99x1 + 1.01x2 = 2.
\[
\frac{1}{K(A)} \le \frac{\|A - B\|}{\|A\|}.
\]
(a)
K(A) ≥ 1 and K(B) ≥ 1.
(b)
K(AB) ≤ K(A)K(B).
80. If ‖A‖ < 1, then show that the matrix (I − A) is nonsingular and
\[
\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}.
\]
85. Determine the currents through the various branches of the electrical
network in Figure 1.8:
Note how the current through the branch AB is reversed in (b). What
would the voltage of C have to be for no current to pass through AB?
87. Figure 1.10 represents the traffic entering and leaving a “roundabout”
road junction. Such junctions are very common in Europe. Construct
a mathematical model that describes the flow of traffic along the vari-
ous branches. What is the minimum flow theoretically possible along
the branch BC? Is this flow ever likely to be realized in practice?
(a) C7H6O2 + O2 −→ H2O + CO2.
94. The average of the temperatures for the cities of Jeddah, Makkah, and Riyadh was 15°C during a given winter day. The temperature in Makkah was 6°C higher than the average of the temperatures of the other two cities. The temperature in Riyadh was 6°C lower than the average temperature of the other two cities. What was the temperature in each one of the cities?
97. A biologist has placed three strains of bacteria (denoted by I, II, and
III) in a test tube, where they will feed on three different food sources
(A, B, and C). Each day 2300 units of A, 800 units of B, and 1500
units of C are placed in the test tube, and each bacterium consumes
a certain number of units of each food per day, as shown in the given
table. How many bacteria of each strain can coexist in the test tube
and consume all the food?
98. Al-karim hires three types of laborers, I, II, and III, and pays them
SR20, SR15, and SR10 per hour, respectively. If the total amount
paid is SR20,000 for a total of 300 hours of work, find the possible
number of hours put in by the three categories of workers if the
category III workers must put in the maximum amount of hours.
Chapter 2
Iterative Methods for Linear Systems
2.1 Introduction
The methods discussed in Chapter 1 for the solution of systems of linear equations are direct methods, which require a finite number of arithmetic operations. The elimination methods for solving such systems usually yield sufficiently accurate solutions for approximately 20 to 25 simultaneous equations, where most of the unknowns are present in all of the equations. When the coefficient matrix is sparse (has many zeros), a considerably larger number of equations can be handled by the elimination methods. But these methods are generally impractical when many hundreds or thousands of equations must be solved simultaneously.
There are, however, several methods that can be used to solve large
numbers of simultaneous equations. These methods, called iterative meth-
ods, are methods by which an approximation to the solution of a system
of linear equations may be obtained. The iterative methods are used most
often for large, sparse systems of linear equations and they are efficient in
terms of computer storage and time requirements. Systems of this type
arise frequently in the numerical solutions of boundary value problems
and partial differential equations. Unlike the direct methods, the iterative
methods may not always yield a solution, even if the determinant of the
coefficient matrix is not zero.
$$Ax = b \tag{2.1}$$
$$x = Tx + c \tag{2.2}$$
for some square matrix $T$ and vector $c$. After the initial vector $x^{(0)}$ is selected, the sequence of approximate solution vectors is generated by computing
$$x^{(k+1)} = Tx^{(k)} + c, \quad \text{for } k = 0, 1, 2, \ldots. \tag{2.3}$$
Among them, the most useful methods are the Jacobi method, the
Gauss–Seidel method, the Successive Over-Relaxation (SOR) method, and
the conjugate gradient method.
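All of these methods fit the template (2.3), which is easy to express in MATLAB. The following is a minimal sketch in the style of the m-files later in this chapter; the iteration matrix T, vector c, starting vector x0, tolerance tol, and iteration cap maxit are placeholders to be supplied by the particular method:

function x = FixedPoint(T, c, x0, tol, maxit)
% Generic iteration x(k+1) = T*x(k) + c of (2.3); a sketch only.
x = x0;
for k = 1:maxit
    xnew = T*x + c;                  % one sweep of the iteration
    if norm(xnew - x, inf) < tol     % stopping test in the l-infinity norm
        x = xnew; return
    end
    x = xnew;
end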
A = L + D + U, (2.5)
or in matrix form
Dx = b − (L + U )x.
Divide both sides of the above three equations by their diagonal elements, $a_{11}$, $a_{22}$, and $a_{33}$, respectively, to get
$$x_1 = \frac{1}{a_{11}}\Big[b_1 - a_{12}x_2 - a_{13}x_3\Big]$$
$$x_2 = \frac{1}{a_{22}}\Big[b_2 - a_{21}x_1 - a_{23}x_3\Big]$$
$$x_3 = \frac{1}{a_{33}}\Big[b_3 - a_{31}x_1 - a_{32}x_2\Big],$$
or in matrix form,
$$x = D^{-1}[b - (L + U)x].$$
Let $x^{(0)} = \big[x_1^{(0)}, x_2^{(0)}, x_3^{(0)}\big]^T$ be an initial approximation to the exact solution $x$ of the linear system (2.1). Then define an iterative sequence
or in matrix form
where k is the number of iterative steps. Then the form (2.7) is called
the Jacobi formula for the system of three equations and (2.8) is called its
matrix form. For a general system of n linear equations, the Jacobi method
is defined by
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left[b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k)}\right], \tag{2.9}$$
$$i = 1, 2, \ldots, n, \qquad k = 0, 1, 2, \ldots$$
or
$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}^{(k+1)} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} + \begin{bmatrix} 0 & -t_{12} & \cdots & -t_{1n} \\ -t_{21} & 0 & \cdots & -t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -t_{n1} & -t_{n2} & \cdots & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}^{(k)}, \tag{2.11}$$
where the Jacobi iteration matrix $T_J$ and vector $c$ are defined as follows:
$$t_{ij} = \frac{a_{ij}}{a_{ii}}, \quad i \ne j; \qquad t_{ij} = 0, \quad i = j; \qquad c_i = \frac{b_i}{a_{ii}}, \quad i = 1, 2, \ldots, n.$$
The Jacobi iterative method is sometimes called the method of simultaneous iterations, because all values of $x_i$ are iterated simultaneously. That is, all values of $x_i^{(k+1)}$ depend only on the values of $x_j^{(k)}$.
Note that the diagonal elements of the Jacobi iteration matrix $T_J$ are always zero. As usual with iterative methods, an initial approximation $x_i^{(0)}$ must be supplied. If we don't have knowledge of the exact solution, it is conventional to start with $x_i^{(0)} = 0$, for all $i$. The iterations defined by (2.9) are stopped when
$$\frac{\|x^{(k+1)} - x^{(k)}\|}{\|x^{(k+1)}\|} < \epsilon. \tag{2.14}$$
Example 2.1 Solve the following system of equations using the Jacobi iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:
Note that the Jacobi method converged, and after 15 iterations we obtained the good approximation $[2.24373, 2.93123, 3.57829, 4.18940]^T$ to the solution of the given system, which has the exact solution $[2.24374, 2.93124, 3.57830, 4.18941]^T$. Ideally, the iterations should stop automatically when we obtain the required accuracy using one of the stopping criteria mentioned in (2.13) or (2.14). •
Example 2.2 Solve the following system of equations using the Jacobi it-
erative method:
Solution. Results for this linear system are listed in Table 2.2. Note that
in this case the Jacobi method diverges rapidly. Although the given linear
system is the same as the linear system of Example 2.1, the first and second
equations are interchanged. From this example we conclude that the Jacobi
iterative method is not always convergent.
Program 2.1
MATLAB m-file for the Jacobi Iterative Method
function x=JacobiM(Ab,x,acc)           % Ab = [A b]
[n,t]=size(Ab); b=Ab(1:n,t); R=1; k=1;
d(1,1:n+1)=[0 x];
while R > acc
for i=1:n
sum=0;
for j=1:n; if j ~= i
sum = sum + Ab(i,j)*d(k,j+1); end; end
x(1,i) = (1/Ab(i,i))*(b(i,1)-sum); end
k=k+1; d(k,1:n+1)=[k-1 x];
R=max(abs(d(k,2:n+1)-d(k-1,2:n+1)));
if k > 10 && R > 100
disp('Jacobi method diverges')
break; end; end; x=d;
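A typical call to JacobiM from the MATLAB Command Window, using a small strictly diagonally dominant system invented here purely for illustration, looks as follows (each row of the returned array holds one iterate):

>> A = [5 -1 1; 2 8 -1; -1 1 4];
>> b = [10; 11; 3];
>> x0 = [0 0 0];                  % initial guess as a row vector
>> x = JacobiM([A b], x0, 1e-5);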
3. Compute the constant vector $c = D^{-1}b$, i.e., $c_i = \dfrac{b_i}{a_{ii}}$, for $i = 1, 2, \ldots, n$.
5. Solve for the approximate solutions $x^{(k+1)} = T_J\,x^{(k)} + c$, for $i = 1, 2, \ldots, n$ and $k = 0, 1, \ldots$.
6. Repeat step 5 until $\|x^{(k+1)} - x^{(k)}\| < \epsilon$.
From the Jacobi iterative formula (2.9), it is seen that the new estimates of the solution $x$ are computed from the old estimates, and only when all the new estimates have been determined are they used in the right-hand side of the equation to perform the next iteration. The Gauss–Seidel method, by contrast, makes use of the new estimates in the right-hand side of the equation as soon as they become available. For example, the Gauss–Seidel formula for the system of three equations can be defined as an iterative sequence:
which are called the Gauss–Seidel iteration matrix and the vector, respec-
tively.
Example 2.3 Solve the following system of equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:
Note that the Gauss–Seidel method converged for the given system and re-
quired nine iterations to obtain the approximate solution [2.24374, 2.93123,
3.57830, 4.18941]T , which is equal to the exact solution [2.24374, 2.93124,
3.57830, 4.18941]T up to six significant digits, which is six iterations less
than required by the Jacobi method for the same linear system. •
Example 2.4 Solve the following system of equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:
Solution. Results for this linear system are listed in Table 2.4. Note that
in this case the Gauss–Seidel method diverges rapidly. Although the given
linear system is the same as the linear system of the previous Example 2.3,
the first and second equations are interchanged. From this example we
conclude that the Gauss–Seidel iterative method is not always convergent.•
Program 2.2
MATLAB m-file for the Gauss–Seidel Iterative Method
function x=GaussSM(Ab,x,acc)           % Ab = [A b]
[n,t]=size(Ab); b=Ab(1:n,t); R=1; k=1;
d(1,1:n+1)=[0 x]; k=k+1;
while R > acc
for i=1:n; sum=0; for j=1:n
if j <= i-1; sum = sum + Ab(i,j)*d(k,j+1);
elseif j >= i+1
sum = sum + Ab(i,j)*d(k-1,j+1); end; end
x(1,i) = (1/Ab(i,i))*(b(i,1)-sum);
d(k,1)=k-1; d(k,i+1)=x(1,i); end
R=max(abs(d(k,2:n+1)-d(k-1,2:n+1)));
k=k+1; if R > 100 && k > 10
disp('Gauss-Seidel method diverges')
break; end; end; x=d;
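The Gauss–Seidel program is called in exactly the same way as JacobiM; for instance, with the same illustrative system as before:

>> A = [5 -1 1; 2 8 -1; -1 1 4];
>> b = [10; 11; 3];
>> x = GaussSM([A b], [0 0 0], 1e-5);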
Procedure 2.2 (Gauss–Seidel Method)
1. Check that the coefficient matrix A is strictly diagonally dominant (for guaranteed convergence).
2. Initialize the first approximation $x^{(0)} \in \mathbb{R}^n$ and the preassigned accuracy $\epsilon$.
The first and subsequent iterations are listed in Table 2.5. Now we solve
the same system by the Gauss–Seidel method and for the given system, the
Gauss–Seidel formula is
The first and subsequent iterations are listed in Table 2.6. Note that the Ja-
cobi method diverged and the Gauss–Seidel method converged after 28 itera-
tions with the approximate solution [0.66150, −0.28350, 0.63177, 0.48758]T
of the given system, which has the exact solution [0.66169, −0.28358, 0.63184,
0.48756]T . •
Example 2.6 Solve the following system of equations using the Jacobi and Gauss–Seidel iterative methods, using $\epsilon = 10^{-5}$ in the $l_\infty$-norm and taking the initial solution $x^{(0)} = [0, 0, 0, 0]^T$:
x1 + 2x2 − 2x3 = 1
x1 + x2 + x3 = 2
2x1 + 2x2 + x3 = 3
x1 + x2 + x3 + x 4 = 4.
Solution. First, we solve by the Jacobi method and for the given system,
the Jacobi formula is
The first and subsequent iterations are listed in Table 2.7. Now we solve
the same system by the Gauss–Seidel method and for the given system, the
Gauss–Seidel formula is
$$x_1^{(k+1)} = \frac{1}{1}\Big[1 - 2x_2^{(k)} + 2x_3^{(k)}\Big]$$
Jacobi method converged quickly (only five iterations) but the Gauss–Seidel
method diverged for the given system. •
$$A = L + U + D = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 3 & -2 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 2 & 0 \\ 0 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 6 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 9 \end{bmatrix}.$$
(a) Since the matrix form of the Jacobi iterative method can be written as
x(k+1) = TJ x(k) + c, k = 0, 1, 2, . . .
where
TJ = −D−1 (L + U ) and c = D−1 b
one can easily compute the Jacobi iteration matrix $T_J$ and the vector $c$ as follows:
$$T_J = \begin{bmatrix} 0 & -\frac{2}{6} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{3}{9} & \frac{2}{9} & 0 \end{bmatrix} \qquad \text{and} \qquad c = \begin{bmatrix} \frac{1}{6} \\ \frac{2}{7} \\ -\frac{1}{9} \end{bmatrix}.$$
Thus, the matrix form of the Jacobi iterative method is
$$x^{(k+1)} = \begin{bmatrix} 0 & -\frac{2}{6} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{3}{9} & \frac{2}{9} & 0 \end{bmatrix} x^{(k)} + \begin{bmatrix} \frac{1}{6} \\ \frac{2}{7} \\ -\frac{1}{9} \end{bmatrix}, \qquad k = 0, 1, 2, \ldots.$$
(b) Now by writing the above iterative matrix form in component form, we have
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} \frac{1}{6} \\ \frac{2}{7} \\ -\frac{1}{9} \end{bmatrix},$$
and it is equivalent to
$$x_1 = -\frac{1}{3}x_2 + \frac{1}{6}$$
$$x_2 = -\frac{1}{7}x_1 + \frac{2}{7}x_3 + \frac{2}{7}$$
$$x_3 = -\frac{1}{3}x_1 + \frac{2}{9}x_2 - \frac{1}{9}.$$
Now solving for $x_1$, $x_2$, and $x_3$, we get
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{1}{12} \\ \frac{1}{4} \\ -\frac{1}{12} \end{bmatrix},$$
which is the exact solution of the given system.
(c) Since the error in the $(k+1)$th step is defined as
$$e^{(k+1)} = x - x^{(k+1)},$$
and the exact solution satisfies $x = T_J x + c$ while $x^{(k+1)} = T_J x^{(k)} + c$, subtracting these two relations gives
$$e^{(k+1)} = T_J\big(x - x^{(k)}\big) = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{bmatrix} e^{(k)},$$
which is the required error in the $(k+1)$th step.
(d) Now, finding the first approximation of the error, we have to compute
$$e^{(1)} = T_J e^{(0)}, \qquad \text{where} \qquad e^{(0)} = x - x^{(0)}.$$
Using $x^{(0)} = [0, 0, 0]^T$, we have
$$e^{(0)} = \begin{bmatrix} \frac{1}{12} \\ \frac{1}{4} \\ -\frac{1}{12} \end{bmatrix}.$$
Thus,
$$e^{(1)} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{12} \\ \frac{1}{4} \\ -\frac{1}{12} \end{bmatrix} = \begin{bmatrix} -\frac{1}{12} \\ -\frac{1}{28} \\ \frac{1}{36} \end{bmatrix}.$$
Similarly, for the second approximation of the error, we compute
$$e^{(2)} = T_J e^{(1)} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{bmatrix} \begin{bmatrix} -\frac{1}{12} \\ -\frac{1}{28} \\ \frac{1}{36} \end{bmatrix} = \begin{bmatrix} \frac{1}{84} \\ \frac{5}{252} \\ \frac{5}{252} \end{bmatrix},$$
(a) Now by using the Gauss–Seidel method, we first compute the Gauss–Seidel iteration matrix $T_G$ and the vector $c$ as follows:
$$T_G = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ 0 & \frac{1}{21} & \frac{2}{7} \\ 0 & \frac{23}{189} & \frac{4}{63} \end{bmatrix} \qquad \text{and} \qquad c = \begin{bmatrix} \frac{1}{6} \\ \frac{11}{42} \\ -\frac{41}{378} \end{bmatrix}.$$
Thus, the matrix form of the Gauss–Seidel iterative method is
$$x^{(k+1)} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ 0 & \frac{1}{21} & \frac{2}{7} \\ 0 & \frac{23}{189} & \frac{4}{63} \end{bmatrix} x^{(k)} + \begin{bmatrix} \frac{1}{6} \\ \frac{11}{42} \\ -\frac{41}{378} \end{bmatrix}, \qquad k = 0, 1, 2, \ldots.$$
(d) The first and second approximations of the error can be calculated as follows:
$$e^{(1)} = T_G e^{(0)} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ 0 & \frac{1}{21} & \frac{2}{7} \\ 0 & \frac{23}{189} & \frac{4}{63} \end{bmatrix} e^{(0)} = \Big[-\frac{1}{12},\ -\frac{1}{84},\ \frac{19}{756}\Big]^T$$
and
$$e^{(2)} = T_G e^{(1)} = \begin{bmatrix} 0 & -\frac{1}{3} & 0 \\ 0 & \frac{1}{21} & \frac{2}{7} \\ 0 & \frac{23}{189} & \frac{4}{63} \end{bmatrix} e^{(1)} = \Big[\frac{1}{252},\ \frac{5}{756},\ \frac{1}{6804}\Big]^T,$$
which is the required second approximation of the error. •
$$\|x - x^{(k)}\| \le \|T\|^k\,\|x^{(0)} - x\| \tag{2.20}$$
$$\|x - x^{(k)}\| \le \frac{\|T\|^k}{1 - \|T\|}\,\|x^{(1)} - x^{(0)}\|.$$
Note that the smaller the value of kT k, the faster the convergence of
the iterative methods.
the Gauss–Seidel iterative method converges faster than the Jacobi iterative
method.
Solution. Here we will show that the $l_\infty$-norm of the Gauss–Seidel iteration matrix $T_G$ is less than the $l_\infty$-norm of the Jacobi iteration matrix $T_J$, i.e.,
$$\|T_G\| < \|T_J\|.$$
The Jacobi iteration matrix $T_J$ can be obtained from the given matrix $A$ as follows:
$$T_J = -D^{-1}(L + U) = -\begin{bmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 0 & -1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & \frac{1}{5} \\ \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \end{bmatrix}.$$
Then the $l_\infty$-norm of the matrix $T_J$ is
$$\|T_J\|_\infty = \max\left\{\frac{1}{5}, \frac{1}{3}, \frac{1}{4}\right\} = \frac{1}{3} = 0.3333 < 1.$$
The Gauss–Seidel iteration matrix $T_G$ is defined as
$$T_G = -(D + L)^{-1}U = -\begin{bmatrix} 5 & 0 & 0 \\ -1 & 3 & 0 \\ 0 & -1 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
and it gives
$$T_G = \begin{bmatrix} 0 & 0 & \frac{1}{5} \\ 0 & 0 & \frac{1}{15} \\ 0 & 0 & \frac{1}{60} \end{bmatrix}.$$
Then the $l_\infty$-norm of the matrix $T_G$ is
$$\|T_G\|_\infty = \max\left\{\frac{1}{5}, \frac{1}{15}, \frac{1}{60}\right\} = \frac{1}{5} = 0.2000 < 1,$$
which shows that the Gauss–Seidel method will converge faster than the
Jacobi method for the given linear system. •
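The norms computed in this example can be checked numerically in the MATLAB Command Window; the following sketch builds both iteration matrices directly from the splitting A = L + D + U using the same matrix:

>> A = [5 0 -1; -1 3 0; 0 -1 4];
>> D = diag(diag(A)); L = tril(A,-1); U = triu(A,1);
>> TJ = -D\(L+U);                % Jacobi iteration matrix
>> TG = -(D+L)\U;                % Gauss-Seidel iteration matrix
>> [norm(TJ,inf) norm(TG,inf)]   % should return 0.3333 and 0.2000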
Note that the condition $\|T_J\|_\infty < 1$ is guaranteed whenever the matrix $A$ is strictly diagonally dominant.
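Strict diagonal dominance itself is a one-line test in MATLAB; this check is a small illustrative sketch for any square matrix A:

>> % true if |a_ii| exceeds the sum of the other entries in each row
>> isSDD = all( 2*abs(diag(A)) > sum(abs(A),2) )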
For the Jacobi method for a general matrix $A$, the norm of the Jacobi iteration matrix is defined as
$$\|T_J\| = \max_{1 \le i \le n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left|\frac{a_{ij}}{a_{ii}}\right|.$$
$$T_J = -D^{-1}(L + U),$$
which for the given matrix gives
$$T_J = -\begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 12 & 0 & 0 \\ 0 & 0 & 13 & 0 \\ 0 & 0 & 0 & 15 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 2 & 1 & 1 \\ 1 & 0 & 1 & 2 \\ 2 & 1 & 0 & 3 \\ 1 & 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -\frac{2}{10} & -\frac{1}{10} & -\frac{1}{10} \\ -\frac{1}{12} & 0 & -\frac{1}{12} & -\frac{2}{12} \\ -\frac{2}{13} & -\frac{1}{13} & 0 & -\frac{3}{13} \\ -\frac{1}{15} & -\frac{2}{15} & -\frac{1}{15} & 0 \end{bmatrix};$$
then the $l_\infty$-norm of the matrix $T_J$ is
$$\|T_J\|_\infty = \max\left\{\frac{4}{10}, \frac{4}{12}, \frac{6}{13}, \frac{4}{15}\right\} = \frac{6}{13} = 0.46154 < 1.$$
Thus, the Jacobi method will converge for the given linear system.
$$x^{(1)} = [0.5, 0.75, 0.07692, 0.86667]^T, \qquad x^{(2)} = [0.25564, 0.62970, -0.12436, 0.72821]^T.$$
$$\|x - x^{(k)}\| \le \frac{\|T_J\|^k}{1 - \|T_J\|}\,\|x^{(1)} - x^{(0)}\| \le 10^{-4}.$$
It gives
$$\frac{\left(\frac{6}{13}\right)^k}{\frac{7}{13}}\,(0.86667) \le 10^{-4}$$
or
$$\left(\frac{6}{13}\right)^k \le \frac{\left(\frac{7}{13}\right) \times 10^{-4}}{0.86667},$$
which gives
$$(0.46154)^k \le 6.21 \times 10^{-5}.$$
Taking $\ln$ of both sides, we obtain
$$k \ln\left(\frac{6}{13}\right) \le \ln\big(6.21 \times 10^{-5}\big)$$
or
$$k(-0.77327) \le (-9.68621),$$
and it gives
$$k \ge 12.5263 \quad \text{or} \quad k = 13,$$
$$T_G = -(D + L)^{-1}U.$$
Thus, the Gauss–Seidel method will converge for the given linear system.
$$\|x - x^{(k)}\| \le \frac{\|T_G\|^k}{1 - \|T_G\|}\,\|x^{(1)} - x^{(0)}\| \le 10^{-4}.$$
It gives
$$\frac{(0.4)^k}{0.6}\,(0.74252) \le 10^{-4}$$
or
$$(0.4)^k \le 8.08 \times 10^{-5}.$$
Taking $\ln$ of both sides, we obtain
$$k(-0.91629) \le (-9.4235),$$
and it gives
$$k \ge 10.28441 \quad \text{or} \quad k = 11,$$
Example 2.10 Solve the following system of linear equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm and taking the initial solution $x^{(0)} = [0, 0, 0, 0]^T$:
5x1 − x3 = 1
14x2 − x3 − x4 = 1
−x1 − x2 + 13x3 = 4
− x2 + 9x4 = 3.
Solution. For the given system, the Gauss–Seidel formula gives
$$x_1^{(k+1)} = \frac{1}{5}\Big[1 + x_3^{(k)}\Big], \qquad \ldots, \qquad x_4^{(k+1)} = \frac{1}{9}\Big[3 + x_2^{(k+1)}\Big].$$
So starting with the initial approximation $x_1^{(0)} = 0$, $x_2^{(0)} = 0$, $x_3^{(0)} = 0$, $x_4^{(0)} = 0$, and for $k = 0$, we get
$$x_1^{(1)} = \frac{1}{5}\Big[1 + x_3^{(0)}\Big] = 0.200000, \qquad \ldots, \qquad x_4^{(1)} = \frac{1}{9}\Big[3 + x_2^{(1)}\Big] = 0.341270.$$
The first and subsequent iterations are listed in Table 2.9. •
Note that the Gauss–Seidel method converged very fast (only five it-
erations) and the approximate solution of the given system [0.267505,
0.120302, 0.337524, 0.346700]T is equal to the exact solution [0.267505,
0.120302, 0.337524, 0.346700]T up to six decimal places.
Example 2.11 Find the eigenvalues and eigenvectors of the following matrix:
$$A = \begin{bmatrix} -6 & 0 & 0 \\ 11 & -3 & 0 \\ -3 & 6 & 7 \end{bmatrix}.$$
Solution. To find the eigenvalues of the given matrix A by using (2.23), we have
$$\begin{vmatrix} -6-\lambda & 0 & 0 \\ 11 & -3-\lambda & 0 \\ -3 & 6 & 7-\lambda \end{vmatrix} = 0,$$
which gives a characteristic equation of the form
$$\lambda^3 + 2\lambda^2 - 45\lambda - 126 = 0.$$
It factorizes to
$$(-6-\lambda)(-3-\lambda)(7-\lambda) = 0$$
and gives us the eigenvalues $\lambda = -6$, $\lambda = -3$, and $\lambda = 7$ of the given
matrix A. Note that the sum of these eigenvalues is −2, and this agrees
with the trace of A. After finding the eigenvalues of the matrix we turn to
the problem of finding eigenvectors. The eigenvectors of A corresponding
to the eigenvalues λ are the nonzero vectors x that satisfy (2.22). Equiv-
alently, the eigenvectors corresponding to λ are the nonzero vectors in the
solution space of (2.22). We call this solution space the eigenspace of A
corresponding to λ.
>> D = eig(A);
>> A = [4 1 − 3; 0 0 2; 0 0 − 3];
>> B = max(eig(A))
B=
4
gives
$$\lambda_1 = -\sqrt{\frac{cb}{ad}}, \qquad \lambda_2 = \sqrt{\frac{cb}{ad}},$$
and
$$\lambda_{\max} = \sqrt{\frac{cb}{ad}}.$$
Similarly, we can find the Gauss–Seidel iteration matrix as
$$T_G = -(L + D)^{-1}U,$$
which gives
$$\mu_1 = 0, \qquad \mu_2 = \frac{cb}{ad},$$
and
$$\mu_{\max} = \frac{cb}{ad}.$$
Thus,
$$\mu_{\max} = \left(\sqrt{\frac{cb}{ad}}\right)^2 = \lambda_{\max}^2,$$
which is the required result. •
The necessary and sufficient condition for the convergence of the Jacobi
iterative method and the Gauss–Seidel iterative method is defined in the
following theorem.
Note that the condition $\rho(T) < 1$ is satisfied when $\|T\| < 1$, because $\rho(T) \le \|T\|$ for any natural norm. •
Example 2.13 Find the spectral radius of the Jacobi and the Gauss–Seidel iteration matrices using each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 0 & -1 \\ -1 & 3 & 0 \\ 0 & -1 & 4 \end{bmatrix}$, (b) $A = \begin{bmatrix} 1 & -1 & 1 \\ -2 & 2 & -1 \\ 0 & 1 & 5 \end{bmatrix}$,
(c) $A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 2 & 0 \\ 0 & -1 & 3 \end{bmatrix}$, (d) $A = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$.
Solution. (a) The Jacobi iteration matrix $T_J$ for the given matrix $A$ can be obtained as
$$T_J = \begin{bmatrix} 0 & 0 & \frac{1}{2} \\ \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \end{bmatrix},$$
and the characteristic equation of the matrix $T_J$ is
$$\det(T_J - \lambda I) = -\lambda^3 + \frac{1}{24} = 0.$$
Solving this cubic polynomial, the maximum eigenvalue (in absolute value) of $T_J$ is $\sqrt[3]{1/24}$, i.e.,
$$\rho(T_J) = \sqrt[3]{\tfrac{1}{24}} = 0.3467.$$
Also, the Gauss–Seidel iteration matrix $T_G$ for the given matrix $A$ is
$$T_G = \begin{bmatrix} 0 & 0 & \frac{1}{2} \\ 0 & 0 & \frac{1}{6} \\ 0 & 0 & \frac{1}{24} \end{bmatrix}$$
and has the characteristic equation of the form
$$\det(T_G - \lambda I) = -\lambda^3 + \frac{1}{24}\lambda^2 = 0.$$
Solving this cubic polynomial, we obtain the maximum eigenvalue of $T_G$, namely $\frac{1}{24}$, i.e.,
$$\rho(T_G) = \frac{1}{24} = 0.0417.$$
and
$$T_J = \begin{bmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}, \qquad T_G = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{bmatrix},$$
with
$$\rho(T_J) = 1.0000 \quad \text{and} \quad \rho(T_G) = 1.0000,$$
respectively. •
and it gives
$$\lim_{k \to \infty} \left(\frac{1}{3}\right)^k = 0 \quad \text{and} \quad \lim_{k \to \infty} \frac{k}{3^{k+1}} = 0.$$
Hence, the given matrix A is convergent. •
Since the above matrix has the eigenvalue $\frac{1}{3}$ of order two, its spectral radius is $\frac{1}{3}$. This shows the important relation existing between the spectral radius of a matrix and the convergence of a matrix.
Theorem 2.6 The following statements are equivalent:
1. A is a convergent matrix.
2. $\lim_{n \to \infty} \|A^n\| = 0$, for all natural norms.
3. $\rho(A) < 1$.
4. $\lim_{n \to \infty} A^n x = 0$, for every $x$. •
Solving this cubic equation, the eigenvalues of A are –2, –1, and 1. Thus the spectral radius of A is $\rho(A) = \max\{|-2|, |-1|, |1|\} = 2$.
Also,
$$A^T A = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 2 & 0 & 0 \end{bmatrix} \begin{bmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 5 & -2 & -4 \\ -2 & 2 & 2 \\ -4 & 2 & 4 \end{bmatrix},$$
which gives the eigenvalues 0.4174, 1, and 9.5826. Therefore, the spectral radius of $A^T A$ is 9.5826. Hence,
$$\|A\|_2 = \sqrt{\rho(A^T A)} = \sqrt{9.5826} \approx 3.0956.$$
$$-\lambda^3 + 4\lambda^2 + 9\lambda - 36 = 0.$$
Note that this result is also true for any natural norm. •
which gives the eigenvalues 17.96 and 0.04. The spectral radius of $A^T A$ is 17.96. Hence,
$$\|A\|_2 = \sqrt{\rho(A^T A)} = \sqrt{17.96} \approx 4.24.$$
Since a characteristic equation of $(A^{-1})^T(A^{-1})$ is
$$\det[(A^{-1})^T(A^{-1}) - \lambda I] = \begin{vmatrix} 13-\lambda & 4 \\ 4 & 5-\lambda \end{vmatrix} = \lambda^2 - 18\lambda + 49 = 0,$$
which gives the eigenvalues 14.64 and 3.36 of $(A^{-1})^T(A^{-1})$, its spectral radius is 14.64. Hence,
$$\|A^{-1}\|_2 = \sqrt{\rho\big((A^{-1})^T(A^{-1})\big)} = \sqrt{14.64} \approx 3.83.$$
Note that the eigenvalues of A are 3.73 and 0.27; therefore, its spectral radius is 3.73. Hence,
$$\frac{1}{3.83} < 3.73 < 4.24,$$
which satisfies Theorem 2.9. •
which is equivalent to
$$x^{(k+1)} = T_\omega x^{(k)} + c, \tag{2.28}$$
where
$$T_\omega = (D + \omega L)^{-1}\big[(1 - \omega)D - \omega U\big] \quad \text{and} \quad c = \omega(D + \omega L)^{-1} b \tag{2.29}$$
are called the SOR iteration matrix and vector, respectively.
which is equal to
$$T_\omega = \begin{bmatrix} 5 & 0 \\ -1.005 & 10 \end{bmatrix}^{-1} \begin{bmatrix} -0.025 & 1.005 \\ 0 & -0.05 \end{bmatrix}.$$
Thus,
$$T_\omega = \begin{bmatrix} 0.2 & 0 \\ 0.0201 & 0.1 \end{bmatrix} \begin{bmatrix} -0.025 & 1.005 \\ 0 & -0.05 \end{bmatrix}$$
or
$$T_\omega = \begin{bmatrix} -0.005 & 0.201 \\ -0.0005 & 0.0152 \end{bmatrix}.$$
The $l_\infty$-norm of the matrix $T_\omega$ is
$$\|T_\omega\|_\infty = \max\{0.206, 0.0157\} = 0.206. \quad \bullet$$
Example 2.20 Solve the following system of linear equations, taking the initial approximation $x^{(0)} = [0, 0, 0, 0]^T$ and with $\epsilon = 10^{-4}$ in the $l_\infty$-norm:
2x1 + 8x2 = 1
5x1 − x2 + x3 = 2
−x1 + x2 + 4x3 + x4 = 12
x2 + x3 + 5x4 = 12.
(a) Using the Gauss–Seidel method.
(b) Using the SOR method with ω = 0.33.
$$x_1^{(1)} = \frac{1}{2}\Big[1 - 8x_2^{(0)}\Big] = 0.5$$
Example 2.21 Solve the following system of linear equations using the SOR method, with $\epsilon = 0.5 \times 10^{-6}$ in the $l_\infty$-norm:
2x1 + x2 = 4
x1 + 2x2 + x3 = 8
x2 + 2x3 + x4 = 12
x3 + 2x4 = 11.
Start with an initial approximation x(0) = [0, 0, 0, 0]T and take ω = 1.27.
Solution. For the given system, the SOR method with ω = 1.27 gives
$$x_1^{(1)} = (1 - 1.27)x_1^{(0)} + \frac{1.27}{2}\Big[4 - x_2^{(0)}\Big] = 2.54$$
In practice, ω should be chosen in the range 1 < ω < 2, but the precise
choice of ω is a major problem. Finding the optimum value for ω depends
on the particular problem (size of the system of equations and the nature
of the equations) and often requires careful work. A detailed study for
the optimization of ω can be found in Isaacson and Keller (1966). The
following theorems can be used in certain situations for the convergence of
the SOR method.
Now to find the spectral radius of the Jacobi iteration matrix $T_J$, we use the characteristic equation
$$\det(T_J - \lambda I) = -\lambda^3 + \frac{\lambda}{2} = 0,$$
which gives the eigenvalues of the matrix $T_J$ as $\lambda = 0, \pm\frac{1}{\sqrt{2}}$. Thus,
$$\rho(T_J) = \frac{1}{\sqrt{2}} = 0.707107,$$
and the optimal value of ω is
$$\omega = \frac{2}{1 + \sqrt{1 - (0.707107)^2}} = 1.171573.$$
Also, note that the Gauss–Seidel iteration matrix $T_G$ has the form
$$T_G = \begin{bmatrix} 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{4} & \frac{1}{2} \\ 0 & \frac{1}{8} & \frac{1}{4} \end{bmatrix},$$
and its characteristic equation is
$$\det(T_G - \lambda I) = -\lambda^3 + \frac{\lambda^2}{2} = 0.$$
Thus,
$$\rho(T_G) = \frac{1}{2} = 0.50000 = (\rho(T_J))^2,$$
which agrees with Theorem 2.12. •
Note that the optimal value of ω can also be found by using (2.30) if the
eigenvalues of the Jacobi iteration matrix TJ are real and 0 < ρ(TJ ) < 1. •
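In MATLAB, this optimal relaxation factor can be computed directly from the spectral radius of $T_J$; the following is a sketch under the same assumption that the eigenvalues of $T_J$ are real with $0 < \rho(T_J) < 1$:

>> D = diag(diag(A)); L = tril(A,-1); U = triu(A,1);
>> TJ = -D\(L+U);
>> rho = max(abs(eig(TJ)));        % spectral radius of TJ
>> w = 2/(1 + sqrt(1 - rho^2))     % optimal relaxation factor, as in (2.30)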
Example 2.23 Find the optimal choice for the relaxation factor ω by using the matrix
$$A = \begin{bmatrix} 5 & -1 & -1 & -1 \\ 2 & 5 & -1 & 0 \\ -1 & -1 & 5 & -1 \\ -1 & -1 & -1 & 5 \end{bmatrix}.$$
Solution. Using the given matrix $A$, we can find the Jacobi iteration matrix $T_J$ as
$$T_J = \begin{bmatrix} 0 & \frac{1}{5} & \frac{1}{5} & \frac{1}{5} \\ -\frac{2}{5} & 0 & \frac{1}{5} & 0 \\ \frac{1}{5} & \frac{1}{5} & 0 & \frac{1}{5} \\ \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & 0 \end{bmatrix}.$$
Now to find the spectral radius of the Jacobi iteration matrix $T_J$, we use the characteristic equation
$$\det(T_J - \lambda I) = 0,$$
which simplifies to
$$(5\lambda - 3)(5\lambda + 1)^3 = 0.$$
Solving the above polynomial equation, we obtain
$$\lambda = \frac{3}{5}, -\frac{1}{5}, -\frac{1}{5}, -\frac{1}{5},$$
which are the eigenvalues of the matrix $T_J$. From this we get
$$\rho(T_J) = \frac{3}{5} = 0.6,$$
$$\lambda^4 - 0.1875\lambda^2 + \frac{1}{256} = 0,$$
which shows that the Jacobi method will converge for the given linear sys-
tem.
ρ(Tω ) = ω − 1.
Now to find the spectral radius of the SOR iteration matrix $T_\omega$, we first calculate the optimal value of ω by using
$$\omega = \frac{2}{1 + \sqrt{1 - [\rho(T_J)]^2}}.$$
Using this optimal value of ω, we can compute the spectral radius of the
SOR iteration matrix Tω as follows:
Thus the SOR method will also converge for the given system, and faster than the other two methods, because $\rho(T_\omega) < \rho(T_G) < \rho(T_J)$.
Program 2.3
MATLAB m-file for the SOR Iterative Method
function sol=SORM(Ab,x,w,acc)          % Ab = [A b]
[n,t]=size(Ab); b=Ab(1:n,t); R=1; k=1;
d(1,1:n+1)=[0 x];
k=k+1; while R > acc
for i=1:n
sum=0;
for j=1:n
if j <= i-1; sum = sum + Ab(i,j)*d(k,j+1);
elseif j >= i+1; sum = sum + Ab(i,j)*d(k-1,j+1);
end; end
x(1,i) = (1-w)*d(k-1,i+1) + (w/Ab(i,i))*(b(i,1)-sum);
d(k,1) = k-1; d(k,i+1) = x(1,i); end
R = max(abs(d(k,2:n+1) - d(k-1,2:n+1)));
if R > 100 && k > 10; break; end
k=k+1; end; sol=d;
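A typical call to SORM, using an illustrative diagonally dominant system and a relaxation factor chosen here only for demonstration, is:

>> A = [4 3 0; 3 4 -1; 0 -1 4];
>> b = [24; 30; -24];
>> x = SORM([A b], [0 0 0], 1.25, 1e-6);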
This method has merit for nonlinear systems and optimization problems, but it is not used for linear systems because of slow convergence. An alternative approach uses a set of nonzero direction vectors $\{v^{(1)}, \ldots, v^{(n)}\}$ that satisfy
$$\langle v^{(i)}, Av^{(j)} \rangle = 0, \quad \text{if } i \ne j.$$
This is called an A-orthogonality condition, and the set of vectors $\{v^{(1)}, \ldots, v^{(n)}\}$ is said to be A-orthogonal.
In the conjugate gradient method, we use $v^{(1)}$ equal to $r^{(0)}$ only at the beginning of the process. For all later iterations, we choose
$$v^{(k+1)} = r^{(k)} + \frac{\|r^{(k)}\|^2}{\|r^{(k-1)}\|^2}\, v^{(k)}$$
to be conjugate to all previous direction vectors.
Note that the initial approximation $x^{(0)}$ can be chosen by the user, with $x^{(0)} = 0$ as the default. The number of iterations, $m \le n$, can be chosen by the user in advance; alternatively, one can impose a stopping criterion based on the size of the residual vector, $\|r^{(k)}\|$, or on the distance between successive iterates, $\|x^{(k+1)} - x^{(k)}\|$. If the process is carried on to the bitter end, i.e., $m = n$, then, in the absence of round-off errors, the result will be the exact solution to the linear system. More iterations than $n$ may be required in practical applications because of the introduction of round-off errors.
Example 2.25 The linear system
2x1 − x2 = 1
−x1 + 2x2 − x3 = 0
− x2 + x3 = 1
has the exact solution x = [2, 3, 4]T . Solve the system by the conjugate
gradient method.
Solution. Start with an initial approximation x(0) = [0, 0, 0]T and find the
residual vector as
which, as one can check, is conjugate to both $v^{(1)}$ and $v^{(2)}$. Thus, the solution is obtained from
$$x^{(3)} = x^{(2)} + \frac{\|r^{(2)}\|^2}{\langle v^{(3)}, Av^{(3)} \rangle}\, v^{(3)} = \begin{bmatrix} \frac{13}{6} \\ \frac{8}{3} \\ \frac{11}{3} \end{bmatrix} + \frac{2}{3} \begin{bmatrix} -\frac{1}{4} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}.$$
Since we applied the method $n = 3$ times, this is the actual solution. •
Note that in larger examples, one would not carry through the method
to the bitter end, since an approximation to the solution is typically ob-
tained with only a few iterations. The result can be a substantial saving
in computational time and effort required to produce an approximation to
the solution.
To get the above results using MATLAB commands, we do the following:
>> A = [2 -1 0; -1 2 -1; 0 -1 1];
>> b = [1 0 1]';
>> x0 = [0 0 0]';
>> acc = 0.5e-6;
>> maxI = 3;
>> CONJG(A, b, x0, acc, maxI);
Program 2.4
MATLAB m-file for the Conjugate Gradient Method
function x=CONJG(A,b,x0,acc,maxI)
x = x0; r = b - A*x0; v = r;
alpha = r'*r; iter = 0; flag = 0;
normb = norm(b); if normb < eps; normb = 1; end
while (norm(r)/normb > acc)
u = A*v; t = alpha/(u'*v); x = x + t*v;
r = r - t*u; beta = r'*r;
v = r + (beta/alpha)*v; alpha = beta;
iter = iter + 1; if (iter == maxI); flag = 1;
break; end; end
$$Ax = b \tag{2.33}$$
$$x = x^{(1)} + y.$$
$$Ay = r = b - Ax^{(1)}, \tag{2.34}$$
where $r$ is the residual. The system (2.34) can now be solved to give the correction $y$ to the approximation $x^{(1)}$. Thus, the new approximation
$$x^{(2)} = x^{(1)} + y$$
will be closer to the solution than $x^{(1)}$. If necessary, we compute the new residual
$$r = b - Ax^{(2)}$$
and solve system (2.34) again to get new corrections. Normally, two or three iterations are enough to obtain a highly accurate solution. This iterative method can be used to improve a solution whenever an approximate solution has been obtained by any means.
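The whole correction cycle takes only a few lines of MATLAB. The sketch below is generic (it is not the RES.m and WP.m files mentioned later) and performs a fixed number of refinement steps:

% Residual correction (iterative refinement), a minimal sketch.
x = A\b;             % initial approximate solution
for k = 1:3          % two or three corrections are normally enough
    r = b - A*x;     % residual of the current approximation
    y = A\r;         % correction from A*y = r, as in (2.34)
    x = x + y;       % improved approximation
end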
$$1.01x_1 + 0.99x_2 = 2$$
$$0.99x_1 + 1.01x_2 = 2$$
has the exact solution $x = [1, 1]^T$. The approximate solution using the Gaussian elimination method is $x^{(1)} = [1.01, 1.01]^T$, with residual $r^{(1)} = [-0.02, -0.02]^T$. Then the solution to the system
$$Ay = r^{(1)},$$
For MATLAB commands for the above iterative method, the two m-
files RES.m and WP.m have been used, then the first iteration is easily
performed by the following sequence of MATLAB commands:
2.9 Summary
Several iterative methods were discussed, among them the Jacobi method, the Gauss–Seidel method, and the SOR method. All of these methods converge if the coefficient matrix is strictly diagonally dominant. The SOR method is the best choice among them. Although the determination of the optimum value of the relaxation factor ω is difficult, it is generally worthwhile if the system of equations is to be solved many times for different right-hand side vectors. The need for estimating parameters is removed in the conjugate gradient method, which, although more complicated to code, can rival the SOR method in efficiency when dealing with large, sparse systems. Iterative methods are generally used when the number of equations is large and the coefficient matrix is strictly diagonally dominant. At the end of this chapter we discussed the residual correction method, which improves an approximate solution.
2.10 Problems
1. Find the Jacobi iteration matrix and its $l_\infty$-norm for each of the following matrices:
(a) $\begin{bmatrix} 11 & -3 & 2 \\ 4 & 10 & 3 \\ -2 & 5 & 9 \end{bmatrix}$, (b) $\begin{bmatrix} 7 & 1 & 1 \\ 3 & 13 & 2 \\ -4 & 3 & 14 \end{bmatrix}$, (c) $\begin{bmatrix} 11 & -3 & 2 \\ 4 & 10 & 3 \\ -2 & 5 & 9 \end{bmatrix}$, (d) $\begin{bmatrix} 7 & 1 & 1 \\ 3 & 13 & 2 \\ -4 & 3 & 14 \end{bmatrix}$,
(e) $\begin{bmatrix} 8 & 1 & -1 & 0 \\ 2 & 13 & -2 & 1 \\ -1 & 3 & 15 & 2 \\ 1 & 4 & 5 & 20 \end{bmatrix}$, (f) $\begin{bmatrix} 7 & 1 & -3 & 1 \\ 1 & 10 & 2 & -3 \\ 1 & -5 & 25 & 4 \\ 1 & 2 & 3 & 17 \end{bmatrix}$.
2. Find the Jacobi iteration matrix and its $l_2$-norm for each of the following matrices:
(a) $\begin{bmatrix} 5 & 2 & 3 \\ -3 & 6 & 4 \\ 2 & 5 & 8 \end{bmatrix}$, (b) $\begin{bmatrix} 6 & 3 & 0 \\ 4 & 7 & 5 \\ -3 & 2 & 11 \end{bmatrix}$, (c) $\begin{bmatrix} 21 & -13 & 6 \\ 5 & 15 & 2 \\ 11 & 2 & 19 \end{bmatrix}$, (d) $\begin{bmatrix} 9 & 0 & 11 \\ 1 & 11 & 3 \\ -1 & 0 & 4 \end{bmatrix}$,
(e) $\begin{bmatrix} 18 & 2 & -3 & 2 \\ 6 & 17 & -2 & 1 \\ 1 & 13 & 25 & 2 \\ -6 & 8 & 7 & 21 \end{bmatrix}$, (f) $\begin{bmatrix} 5 & 4 & 3 & 0 \\ 2 & 10 & 3 & 3 \\ 3 & 4 & 12 & -3 \\ 2 & 0 & -1 & 7 \end{bmatrix}$.
3. Solve the following linear systems using the Jacobi method. Start with the initial approximation $x^{(0)} = 0$ and iterate until $\|x^{(k+1)} - x^{(k)}\|_\infty \le 10^{-5}$ for each system:
(a)
4x1 − x2 + x3 = 7
4x1 − 8x2 + x3 = −21
−2x1 + x2 + 5x3 = 15
(b)
3x1 + x2 + x3 = 5
2x1 + 6x2 + x3 = 9
x1 + x2 + 4x3 = 6
(c)
4x1 + 2x2 + x3 = 1
x1 + 7x2 + x3 = 4
x1 + x2 + 20x3 = 7
(d)
5x1 + 2x2 − x3 = 6
x1 + 6x2 − 3x3 = 4
2x1 + x2 + 4x3 = 7
(e)
6x1 − x2 + 3x3 = −2
3x2 + x3 = 1
−2x1 + x2 + 5x3 = 5
(f )
4x1 + x2 = −1
2x1 + 5x2 + x3 = 0
−x1 + 2x2 + 4x3 = 3
(g)
5x1 − x2 + x3 = 1
3x2 − x3 = −1
x1 + 2x2 + 4x3 = 2
(h)
9x1 + x2 + x3 = 10
2x1 + 10x2 + 3x3 = 19
3x1 + 4x2 + 11x3 = 0
4x1 + 2x2 + x3 = 1
x1 + 7x2 + x3 = 4
x1 + x2 + 20x3 = 7.
(a) Show that the Jacobi method converges by using $\|T_J\|_\infty < 1$.
(b) Compute the second approximation $x^{(2)}$, starting with $x^{(0)} = [0, 0, 0]^T$.
(c) Compute the error estimate $\|x - x^{(2)}\|_\infty$ for your approximation.
5. If
$$A = \begin{bmatrix} 4 & 1 & 0 \\ 1 & 3 & -1 \\ 0 & -1 & 4 \end{bmatrix}, \qquad b = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix},$$
find the Jacobi iteration matrix $T_J$. If the first approximate solution of the given linear system by the Jacobi method is $[\frac{3}{4}, \frac{4}{3}, \frac{5}{4}]^T$, using $x^{(0)} = [0, 0, 0]^T$, then estimate the number of iterations necessary to obtain approximations accurate to within $10^{-6}$.
Find the Jacobi iteration matrix $T_J$ and show that $\|T_J\| < 1$. Use the Jacobi method to find the first approximate solution $x^{(1)}$ of the linear system by using $x^{(0)} = [0, 0, 0]^T$. Also, compute the error bound $\|x - x^{(10)}\|$. Compute the number of steps needed to get accuracy within $10^{-5}$.
Find the Gauss–Seidel iteration matrix $T_G$ and show that $\|T_G\| < 1$. Use the Gauss–Seidel method to find the second approximate solution $x^{(2)}$ of the linear system by using $x^{(0)} = [-0.5, -2.5, -1.5]^T$. Also, compute the error bound.
Show that the Gauss–Seidel method converges for the given linear system. If the first approximate solution of the given linear system by the Gauss–Seidel method is $x^{(1)} = [0.6, -2.7, -1]^T$, using the initial approximation $x^{(0)} = [0, 0, 0]^T$, then compute the number of steps needed to get accuracy within $10^{-4}$. Also, compute an upper bound for the relative error in solving the given linear system.
4x1 + x2 − 2x3 = 4
2x1 + 9x2 − 3x3 = 3
x1 − 2x2 + 8x3 = 2.
(a) Find the matrix form of both iterative (Jacobi and Gauss–Seidel) methods.
(b) If $x^{(k)} = [x_1^{(k)}, x_2^{(k)}, x_3^{(k)}]^T$, then write the iterative forms of part (a) in component form and find the exact solution of the given system.
(c) Find the formula for the error $e^{(k+1)}$ in the $(k+1)$th step.
(d) Find the second approximation of the error, $e^{(2)}$, using part (c), if $x^{(0)} = [0, 0, 0]^T$.
14. Consider the following system:
16x1 − 3x2 + 2x3 = 11
x1 + 15x2 − 3x3 = 12
5x1 − 3x2 + 14x3 = 13.
(a) Find the matrix form of both iterative (Jacobi and Gauss–Seidel) methods.
(b) If $x^{(k)} = [x_1^{(k)}, x_2^{(k)}, x_3^{(k)}]^T$, then write the iterative forms of part (a) in component form and find the exact solution of the given system.
(c) Find the formula for the error $e^{(k+1)}$ in the $(k+1)$th step.
(d) Find the second approximation of the error, $e^{(2)}$, using part (c), if $x^{(0)} = [0, 0, 0]^T$.
15. Which of the following matrices is convergent?
(a) $\begin{bmatrix} 2 & 2 & 3 \\ 1 & 2 & 1 \\ 2 & -2 & 1 \end{bmatrix}$, (b) $\begin{bmatrix} 1 & 0 & 0 \\ -1 & 3 & 0 \\ 3 & 2 & -2 \end{bmatrix}$, (c) $\begin{bmatrix} 1 & 0 & -1 & 1 \\ 2 & 2 & -1 & 1 \\ 0 & 1 & 3 & -2 \\ 1 & 0 & 1 & 4 \end{bmatrix}$.
16. Find the eigenvalues and their associated eigenvectors of the matrix
$$A = \begin{bmatrix} 2 & -2 & 3 \\ 0 & 3 & -2 \\ 0 & -1 & 2 \end{bmatrix}.$$
Also, show that $\|A\|_2 > \rho(A)$.
17. Find the $l_2$-norm of each of the following matrices:
(a) $\begin{bmatrix} 2 & 1 & 3 \\ 1 & 2 & 4 \\ 2 & -2 & 1 \end{bmatrix}$, (b) $\begin{bmatrix} 1 & 2 & 0 \\ 1 & 2 & 0 \\ 1 & 2 & -2 \end{bmatrix}$, (c) $\begin{bmatrix} 1 & 3 & 2 & 1 \\ 2 & 2 & -1 & 1 \\ 0 & 1 & 2 & -2 \\ 1 & 3 & 1 & 5 \end{bmatrix}$.
22. Use the given parameter ω to solve each of the following linear systems by the SOR method to within an accuracy of $10^{-6}$ in the $l_\infty$-norm, starting with $x^{(0)} = 0$:
(a) ω = 1.323
6x1 − 2x2 + x3 = 9
2x1 + 7x2 + 3x3 = 11
−4x1 + 5x2 + 15x3 = 10
(b) ω = 1.201
2x1 − 3x2 + x3 = 6
2x1 + 16x2 + 5x3 = −5
2x1 + 3x2 + 8x3 = 3
(c) ω = 1.110
4x1 + 3x2 = 11
x1 + 7x2 − 3x3 = 13
2x1 + 7x2 + 20x3 = 20
(d) ω = 1.543
x1 − 2x2 = −3
−2x1 + 5x2 − x3 = 5
− x2 + 2x3 − 0.5x4 = 2
− 0.5x3 + 1.25x4 = 3.5
(a) ω = 1.25962
$$A = \begin{bmatrix} 2 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & 2 \end{bmatrix}.$$
(b) ω = 1.15810
$$A = \begin{bmatrix} 4 & -1 & 0 & 0 \\ -1 & 4 & -1 & 0 \\ 0 & -1 & 4 & -1 \\ 0 & 0 & -1 & 4 \end{bmatrix}.$$
28. Perform only two steps of the conjugate gradient method for the fol-
lowing linear systems, starting with x(0) = 0 :
(a)
3x1 − x2 + x3 = 7
−x1 + 3x2 + 2x3 = 1
x1 + 2x2 + 5x3 = 5
(b)
3x1 − 2x2 + x3 = 5
−2x1 + 6x2 − x3 = 9
x1 − x2 + 4x3 = 6
(c)
4x1 − 2x2 + x3 = 1
−2x1 + 7x2 + x3 = 4
x1 + x2 + 20x3 = 1
(d)
5x1 − 3x2 − x3 = 6
−3x1 + 6x2 − 3x3 = 4
−x1 − 3x2 + 4x3 = 7
(e)
9x1 − 3x2 − x3 + 2x4 = 11
3x1 − 10x2 − 2x3 + x4 = 9
2x1 + 3x2 − 11x3 + 3x4 = 15
x1 − 3x2 − 2x3 + 12x4 = 8
29. Perform only two steps of the conjugate gradient method for the fol-
lowing linear systems, starting with x(0) = 0 :
(a)
6x1 + 2x2 + x3 = 1
2x1 + 3x2 − x3 = 0
x1 − x2 + 2x3 = −2
(b)
5x1 − 2x2 + x3 = 3
−2x1 + 4x2 − x3 = 2
x1 − x2 + 3x3 = 1
(c)
6x1 − x2 − x3 + 5x4 = 1
−x1 + 7x2 + x3 − x4 = 2
−x1 + x2 + 3x3 − 3x4 = 0
5x1 − x2 − 3x3 + 6x4 = −1
(d)
3x1 − 2x2 − x3 + 3x4 = 1
−2x1 + 7x2 + x3 − x4 = 0
−x1 + x2 + 3x3 − 3x4 = 0
3x1 − x2 − 3x3 + 6x4 = 0
x1 + 2x2 = 1
2x1 + 4.0001x2 = 1.9999
using simple Gaussian elimination, and then use the residual cor-
rection method (two iterations only) to improve the approximate
solution.
31. The following linear system has the exact solution x = [10, 1]T . Find
the approximate solution of the system
by using simple Gaussian elimination, and then use the residual cor-
rection method (one iteration only) to improve the approximate so-
lution.
32. The following linear system has the exact solution x = [1, 1]T . Find
the approximate solution of the system
x1 + 2x2 = 3
x1 + 2.01x2 = 3.01
by using simple Gaussian elimination, and then use the residual cor-
rection method (one iteration only) to improve the approximate so-
lution.
Chapter 3
The Eigenvalue Problems
3.1 Introduction
In this chapter we describe numerical methods for solving eigenvalue problems, which arise in many branches of science and engineering and seem to be a very fundamental part of the structure of the universe. Eigenvalue problems are also important in a less direct manner in numerical applications. For example, finding the condition number for the solution of a set of linear algebraic equations involves finding the ratio of the largest to the smallest eigenvalue (in magnitude) of the underlying matrix. Also, the eigenvalue problem is involved when establishing the stiffness of ordinary differential equation problems. In solving eigenvalue problems, we are mainly concerned with the task of finding the values of the parameter λ and vector x, which satisfy a set of equations of the form
$$Ax = \lambda x. \tag{3.1}$$
$$Ax = \lambda I x,$$
which gives
$$(A - \lambda I)x = 0, \tag{3.2}$$
where $I$ is an $n \times n$ identity matrix. The matrix $(A - \lambda I)$ appears as
$$\begin{bmatrix} (a_{11} - \lambda) & a_{12} & \cdots & a_{1n} \\ a_{21} & (a_{22} - \lambda) & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & (a_{nn} - \lambda) \end{bmatrix}.$$
Then by using Cramer’s rule, we see that the determinant of the de-
nominator, namely, the determinant of the matrix of the system (3.3) must
vanish if there is to be a nontrivial solution, i.e., a solution other than
x = 0. Geometrically, Ax = λx says that under transformation by A,
The Eigenvalue Problems 329
is defined as
trace(A) = 7 + 6 + 3 = 16. •
Theorem 3.1 If A and B are square matrices of the same size, then:
Then
$$A^T = \begin{bmatrix} 5 & 1 & 3 \\ -4 & 4 & 2 \\ 2 & 3 & 7 \end{bmatrix}$$
and
$$\mathrm{trace}(A^T) = 16 = \mathrm{trace}(A).$$
Also,
$$4A = \begin{bmatrix} 20 & -16 & 8 \\ 4 & 16 & 12 \\ 12 & 8 & 28 \end{bmatrix}$$
and
$$\mathrm{trace}(4A) = 64 = 4(16) = 4\,\mathrm{trace}(A),$$
and
$$\mathrm{trace}(A + B) = 38 = 16 + 22 = \mathrm{trace}(A) + \mathrm{trace}(B).$$
$$A - B = \begin{bmatrix} -4 & -7 & 6 \\ 0 & -1 & 1 \\ 0 & 4 & -1 \end{bmatrix},$$
and
$$\mathrm{trace}(A - B) = -6 = 16 - 22 = \mathrm{trace}(A) - \mathrm{trace}(B).$$
$$AB = \begin{bmatrix} 47 & -9 & -12 \\ 22 & 17 & 28 \\ 50 & 5 & 48 \end{bmatrix}$$
and
$$BA = \begin{bmatrix} 36 & -32 & -1 \\ 16 & 20 & 31 \\ 37 & -4 & 56 \end{bmatrix}.$$
Then
$$\mathrm{trace}(AB) = 112 = \mathrm{trace}(BA).$$
>> A = [5 − 4 2; 1 4 3; 3 2 7];
>> B = [9 3 − 4; 1 5 2; 3 − 2 8];
>> C = A + B;
>> D = A ∗ B;
>> trac(C);
>> trac(D);
Program 3.1
MATLAB m-file for Finding the Trace of a Matrix
function [trc]=trac(A)
n=max(size(A)); trc=0;
for i=1:n; for k=1:n
if i==k; tracc=A(i,k); trc=trc+tracc; else
trc=trc;
end; end; end
Then the eigenvectors are determined by setting one of the nonzero ele-
ments of x to unity and calculating the remaining elements by equating
coefficients in the relation (3.2).
Eigenvalues of 2 × 2 Matrices
Let $\lambda_1$ and $\lambda_2$ be the eigenvalues of a $2 \times 2$ matrix $A$; then a quadratic polynomial $p(\lambda)$ is defined as
$$p(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1\lambda_2.$$
Note that
$$\mathrm{trace}(A) = \lambda_1 + \lambda_2, \qquad \det(A) = \lambda_1\lambda_2.$$
So
$$p(\lambda) = \lambda^2 - \mathrm{trace}(A)\,\lambda + \det(A).$$
For example, if the given matrix is
$$A = \begin{bmatrix} 5 & 4 \\ 3 & 4 \end{bmatrix},$$
then
$$A - \lambda I = \begin{bmatrix} 5-\lambda & 4 \\ 3 & 4-\lambda \end{bmatrix}$$
and
$$p(\lambda) = (5 - \lambda)(4 - \lambda) - 12 = \lambda^2 - 9\lambda + 8.$$
By solving the above quadratic polynomial, we get
$$\lambda_1 = 8, \qquad \lambda_2 = 1.$$
Note that
$$\mathrm{trace}(A) = \lambda_1 + \lambda_2 = 9, \qquad \det(A) = \lambda_1\lambda_2 = 8,$$
$$D = [\mathrm{trace}(A)]^2 - 4\det(A).$$
$$D = [2]^2 - 4(1) = 0. \quad \bullet$$
Example 3.1 Find the eigenvalues and eigenvectors of the following matrix:
$$A = \begin{bmatrix} 6 & -3 \\ -4 & 5 \end{bmatrix}.$$
Solution. The eigenvalues of the given matrix are real and distinct because the discriminant $D = [\mathrm{trace}(A)]^2 - 4\det(A) = (11)^2 - 4(18) = 49 > 0$. Then
$$A - \lambda I = \begin{bmatrix} 6-\lambda & -3 \\ -4 & 5-\lambda \end{bmatrix}$$
and
$$p(\lambda) = (6 - \lambda)(5 - \lambda) - 12 = \lambda^2 - 11\lambda + 18.$$
By solving the above quadratic polynomial, we get
$$\lambda_1 = 9, \qquad \lambda_2 = 2.$$
Note that
$$\mathrm{trace}(A) = \lambda_1 + \lambda_2 = 11, \qquad \det(A) = \lambda_1\lambda_2 = 18.$$
When $\lambda_2 = 2$, we have
$$\begin{bmatrix} 4 & -3 \\ -4 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
>> A = [6 -3; -4 5];
>> EigTwo(A);
Program 3.2
MATLAB m-file for Finding Eigenvalues of a 2 × 2 Matrix
function [L1,x1,L2,x2]=EigTwo(A)
detA = A(1,1)*A(2,2) - A(1,2)*A(2,1);
traceA = A(1,1) + A(2,2);
L1 = (traceA + sqrt(traceA^2 - 4*detA))/2;
L2 = (traceA - sqrt(traceA^2 - 4*detA))/2;
if A(1,2) ~= 0
x1 = [A(1,2); L1-A(1,1)];
x2 = [A(1,2); L2-A(1,1)];
elseif A(2,1) ~= 0
x1 = [L1-A(2,2); A(2,1)];
x2 = [L2-A(2,2); A(2,1)];
else; x1 = [1; 0]; x2 = [0; 1]; end
disp(['L^2 - ' num2str(traceA) '*L + ' num2str(detA) ' = 0'])
For matrices of larger size, there is no doubt that the eigenvalue problem is computationally more difficult than the linear system $Ax = b$. With a linear system, a finite number of elimination steps produces the exact answer in a finite time. In the case of an eigenvalue, no such steps and no such formula can exist: the characteristic polynomial of a $5 \times 5$ matrix is a quintic, and it has been proved that there can be no algebraic formula for the roots of a general fifth-degree polynomial. However, there are a few simple checks on the eigenvalues after they have been computed, and we mention two of them here:
It should be noted that the system matrix A of (3.1) may be real and
symmetric, or real and nonsymmetric, or complex with symmetric real and
skew symmetric imaginary parts. These different types of a matrix A are
explained as follows:
$$x_i^T x_j = 0, \quad i \ne j \tag{3.5}$$
and
$$x_i^T A x_j = 0, \quad i \ne j. \tag{3.6}$$
Equations (3.5) and (3.6) represent the orthogonality relationships. Note that if $i = j$, then in general, $x_i^T x_i$ and $x_i^T A x_i$ are not zero. Recalling that $x_i$ includes an arbitrary scaling factor, the product $x_i^T x_i$ must also be arbitrary. However, if the arbitrary scaling factor is adjusted so that
$$x_i^T x_i = 1, \tag{3.7}$$
then
$$x_i^T A x_i = \lambda_i, \tag{3.8}$$
and the eigenvectors are said to be normalized.
Sometimes the eigenvalues are not distinct, and the eigenvectors associated with these equal or repeated eigenvalues are not, of necessity, orthogonal. If $\lambda_i = \lambda_j$ and the other eigenvalues, $\lambda_k$, are distinct, then
$$x_i^T x_k = 0, \quad k = 1, 2, \ldots, n, \quad k \ne i, \; k \ne j \tag{3.9}$$
and
$$x_j^T x_k = 0, \quad k = 1, 2, \ldots, n, \quad k \ne i, \; k \ne j. \tag{3.10}$$
When $\lambda_i = \lambda_j$, the eigenvectors $x_i$ and $x_j$ are not unique, and a linear combination of them, i.e., $ax_i + bx_j$, where $a$ and $b$ are arbitrary constants, also satisfies the eigenvalue problem. One important result is that a symmetric matrix of order $n$ always has $n$ linearly independent eigenvectors, even if some of the eigenvalues are repeated.
2. If a given A is a real nonsymmetric matrix, then a pair of related eigenvalue problems can arise as follows:
$$Ax = \lambda x \tag{3.11}$$
and
$$A^T y = \beta y. \tag{3.12}$$
By taking the transpose of (3.12), we have
$$y^T A = \beta y^T. \tag{3.13}$$
The vectors x and y are called the right-hand and left-hand vectors of A, respectively. The eigenvalues of A and $A^T$ are identical, i.e., $\lambda_i = \beta_i$, but the eigenvectors x and y will, in general, differ from each other. The eigenvalues and eigenvectors of a nonsymmetric real matrix are either real or pairs of complex conjugates. If $\lambda_i, x_i, y_i$ and $\lambda_j, x_j, y_j$ are solutions that satisfy the eigenvalue problems of (3.11) and (3.12) and $\lambda_i$ and $\lambda_j$ are distinct, then
$$y_j^T x_i = 0, \quad i \ne j \tag{3.14}$$
and
$$y_j^T A x_i = 0, \quad i \ne j. \tag{3.15}$$
Equations (3.14) and (3.15) are called bi-orthogonal relationships. Note that if, in these equations, $i = j$, then, in general, $y_i^T x_i$ and $y_i^T A x_i$ are not zero. The eigenvectors $x_i$ and $y_i$ include arbitrary scaling factors, and so the product of these vectors will also be arbitrary. However, if the vectors are adjusted so that
$$y_i^T x_i = 1, \tag{3.16}$$
then
$$y_i^T A x_i = \lambda_i. \tag{3.17}$$
3. Let us consider the case when the given A is a complex matrix. One particular complex matrix is a Hermitian matrix, which is defined as
$$H = A + iB, \tag{3.20}$$
where A and B are real matrices such that $A = A^T$ and $B = -B^T$. Hence, A is symmetric and B is skew symmetric with zero terms on the leading diagonal. Thus, by definition of a Hermitian matrix, H has a symmetric real part and a skew symmetric imaginary part, making H equal to the transpose of its complex conjugate, denoted by $H^*$. Consider now the eigenvalue problem
$$Hx = \lambda x. \tag{3.21}$$
$$x_i^* x_j = 0, \quad i \ne j \tag{3.22}$$
and
$$x_i^* H x_j = 0, \quad i \ne j, \tag{3.23}$$
where $x_i^*$ is the transpose of the complex conjugate of $x_i$. As before, $x_i$ includes an arbitrary scaling factor and the product $x_i^* x_i$ must also be arbitrary. However, if the arbitrary scaling factor is adjusted so that
$$x_i^* x_i = 1, \tag{3.24}$$
then
$$x_i^* H x_i = \lambda_i, \tag{3.25}$$
and the eigenvectors are then said to be normalized.
Example 3.2 Find the eigenvalues and eigenvectors of the following matrix:
$$A = \begin{bmatrix} 3 & 0 & 1 \\ 0 & -3 & 0 \\ 1 & 0 & 3 \end{bmatrix}.$$
Solution. First, we shall find the eigenvalues of the given matrix A. From (3.2), we have
$$\begin{bmatrix} 3-\lambda & 0 & 1 \\ 0 & -3-\lambda & 0 \\ 1 & 0 & 3-\lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
For nontrivial solutions, using (3.4), we get
$$\begin{vmatrix} 3-\lambda & 0 & 1 \\ 0 & -3-\lambda & 0 \\ 1 & 0 & 3-\lambda \end{vmatrix} = 0,$$
$$\lambda^3 - 3\lambda^2 - 10\lambda + 24 = 0,$$
which factorizes to
(λ + 3)(λ − 2)(λ − 4) = 0,
which gives the eigenvalues 4, –3, and 2 of the given matrix A. One can
note that the sum of these eigenvalues is 3, and this agrees with the trace
of A. •
>> A = [3 0 1; 0 -3 0; 1 0 3];
>> P = poly(A);
>> PP = poly2sym(P)
PP =
x^3 - 3*x^2 - 10*x + 24
>> roots(P);
>> d = eig(A);
To get the results of Example 3.2, we use the MATLAB Command Win-
dow as follows:
>> A = [3 0 1; 0 -3 0; 1 0 3];
>> P = poly(A);
>> PP = poly2sym(P);
>> [X, D] = eig(A);
>> lambda = diag(D);
>> x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
Example 3.3 Find the eigenvalues and eigenvectors of the following matrix:
$$A = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 3 & 3 \\ -1 & 1 & 1 \end{bmatrix}.$$
Solution. From (3.2), we have
$$\begin{bmatrix} 1-\lambda & 2 & 2 \\ 0 & 3-\lambda & 3 \\ -1 & 1 & 1-\lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
For nontrivial solutions, using (3.4), we get
$$\begin{vmatrix} 1-\lambda & 2 & 2 \\ 0 & 3-\lambda & 3 \\ -1 & 1 & 1-\lambda \end{vmatrix} = 0,$$
which gives a characteristic equation of the form
$$-\lambda^3 + 5\lambda^2 - 6\lambda = 0.$$
It factorizes to
λ(λ − 2)(λ − 3) = 0,
which gives the eigenvalues 0, 2, and 3 of the given matrix A. One can
note that the sum of these three eigenvalues is 5, and this agrees with the
trace of A.
x1 + 2x2 + 2x3 = 0
0x1 + x2 + x3 = 0
0x1 + 0x2 + 0x3 = 0.
Example 3.4 Find the eigenvalues and eigenvectors of the following matrix:
$$A = \begin{bmatrix} 3 & 2 & -1 \\ 2 & 6 & -2 \\ -1 & -2 & 3 \end{bmatrix}.$$
Solution. From (3.2), we have
$$\begin{bmatrix} 3-\lambda & 2 & -1 \\ 2 & 6-\lambda & -2 \\ -1 & -2 & 3-\lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
5. $1u = u$. •
whenever
$$u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}, \qquad v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.$$
The zero vector is $0 = (0, 0, \ldots, 0)^T$. The fact that vectors in $\mathbb{R}^n$ satisfy all of the vector space axioms is an immediate consequence of the laws of vector addition and scalar multiplication. •
1. $0u = 0$, for all $u \in V$.
2. $\alpha 0 = 0$, for all $\alpha \in \mathbb{R}$.
3. If $\alpha u = 0$, then $\alpha = 0$ or $u = 0$.
4. $(-1)u = -u$, for all $u \in V$. •
For example, every vector space has at least two subspaces, itself and the
subspace{0} (called the zero subspace) consisting of only the zero vector. •
v = k1 v1 + k2 v2 + · · · + kn vn , (3.26)
k1 v1 + k2 v2 + · · · + kn vn = 0, (3.27)
The above results can be obtained using the MATLAB Command Win-
dow as follows:
>> v1 = [1 2]';
>> v2 = [-1 1]';
>> A = [v1 v2];
>> null(A);
Note that using the MATLAB command, we obtained
ans =
Empty matrix: 2 − by − 0,
p1 (t) = t2 + t + 2
p2 (t) = 2t2 + t + 3
p3 (t) = 3t2 + 2t + 2.
Show that the set {p1 (t), p2 (t), p3 (t)} is linearly independent.
k1 + 2k2 + 3k3 = 0
k1 + k2 + 2k3 = 0
2k1 + 3k2 + 2k3 = 0.
we get
1 0 0 k1 0
0 1 0 k2 = 0 .
0 0 1 k3 0
Since the only solution to the above system is a trivial solution
k1 = k2 = k3 = 0,
k1 + k2 + k3 = 0
k1 − k2 + k3 = 0
k1 − k2 − k3 = 0.
k1 = 0, k2 = 0, k3 = 0,
x = k1 v1 + k2 v2 + · · · + kn vn , (3.29)
Example 3.8 Consider the vectors v1 = (1, 2, 1), v2 = (1, 3, −2), and
v3 = (0, 1, −3) in R3 . If k1 , k2 , and k3 are numbers with
k1 v1 + k2 v2 + k3 v3 = 0,
this is equivalent to
k1 + k2 + 0k3 = 0
2k1 + 3k2 + k3 = 0
k1 − 2k2 − 3k3 = 0.
The above results can be obtained using the MATLAB Command Win-
dow as follows:
>> v1 = [1 2 1]';
>> v2 = [1 3 -2]';
>> v3 = [0 1 -3]';
>> A = [v1 v2 v3];
>> null(A);
By using this MATLAB command, the answer we obtained means that
there is a nonzero solution to the homogeneous system Ak = 0.
Example 3.9 Find the value of α for which the set {(1, −2), (4, −α)} is linearly dependent.
Solution. Suppose a linear combination of the given vectors (1, −2) and (4, −α) vanishes, i.e.,
$$k_1 + 4k_2 = 0$$
$$-2k_1 - \alpha k_2 = 0$$
or
$$\begin{bmatrix} 1 & 4 \\ -2 & -\alpha \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
By reducing this system, we obtain
$$\begin{bmatrix} 1 & 4 \\ 0 & 8-\alpha \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
which shows that the system has infinitely many solutions for α = 8. Thus, the given set {(1, −2), (4, −α)} is linearly dependent for α = 8. •
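The value α = 8 can be confirmed in the MATLAB Command Window, since the coefficient matrix of the system must be singular there:

>> alpha = 8;
>> A = [1 4; -2 -alpha];
>> det(A)            % returns 0, so the set is linearly dependent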
If, in addition
Theorem 3.8 An orthogonal set of vectors that does not contain the zero
vectors is linearly independent. •
The proof of this theorem is beyond the scope of this text and will be omitted. However, the result is extremely important and can be easily understood and used. We illustrate this result by considering the matrix
$$A = \begin{bmatrix} 6 & -2 & 2 \\ -2 & 5 & 0 \\ 2 & 0 & 7 \end{bmatrix},$$
k1 v1 + k2 v2 + k3 v3 = 0,
Theorem 3.9 The determinant of a matrix is zero, if and only if the rows
(or columns) of the matrix form a linearly dependent set. •
and are called spectral matrices, i.e., all the diagonal elements of D are the eigenvalues of A. This simple but useful result makes it desirable to find ways to transform a general n × n matrix A into a diagonal matrix having the same eigenvalues. Unfortunately, the elementary operations that can be used to reduce A → D are not suitable, because the scale and subtract operations alter eigenvalues. What we need here are similarity transformations. Similarity transformations occur frequently in the context of relating coordinate systems.
Solution. Let
$$B = Q^{-1}AQ = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$
$$= \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix} \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
2. If A ≡ B, then B ≡ A.
3. If A ≡ B and B ≡ C, then A ≡ C. •
Note that Theorem 3.11 gives necessary, but not sufficient, conditions for two matrices to be similar. For example, for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},$$
we have
$$\det(A) = \det(B), \qquad \mathrm{rank}(A) = 2 = \mathrm{rank}(B),$$
$$(\lambda_1 = 1, \lambda_2 = 1) = (\lambda_1 = 1, \lambda_2 = 1),$$
yet A and B are not similar.
$$= |A - \lambda I|\,|Q^{-1}Q| = |A - \lambda I|\,|I| = |A - \lambda I|.$$
$$D = Q^{-1}AQ \tag{3.32}$$
is a diagonal matrix. Note that all of its diagonal elements are the eigenvalues of A, and an invertible matrix Q can be written as
$$Q = \big[x_1 \,|\, x_2 \,|\, \cdots \,|\, x_n\big],$$
$$Q^{-1}AQ = D,$$
$$Q = (x_1 \cdots x_n).$$
Multiplying A by the columns of Q gives
$$AQ = (Ax_1 \cdots Ax_n) = (\lambda_1 x_1 \cdots \lambda_n x_n) = (x_1 \cdots x_n) \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} = Q \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}.$$
Note that the converse of the above theorem also exists, i.e., if A is
diagonalizable, then it has n linearly independent eigenvectors. •
λ3 − 6λ2 + 11λ − 6 = 0,
$$(\lambda - 1)(\lambda - 2)(\lambda - 3) = 0.$$
Thus,
$$Q^{-1}AQ = \begin{bmatrix} 3 & 1 & -3 \\ -3 & -2 & 5 \\ 1 & 1 & -2 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 \\ 3 & 7 & -9 \\ 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 3 & 6 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix},$$
•
The above results can be obtained using the MATLAB Command Win-
dow as follows:
>> A = [0 0 1; 3 7 -9; 0 2 -1];
>> P = poly(A);
>> PP = poly2sym(P);
>> [X, D] = eig(A);
>> eigenvalues = diag(D);
>> x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
>> Q = [x1 x2 x3];
>> D = inv(Q)*A*Q;
It is possible that independent eigenvectors may exist even though the
eigenvalues are not distinct, though no theorem exists to show under what
conditions they may do so. The following example shows the situation that
can arise.
Example 3.12 Consider the matrix
$$A = \begin{bmatrix} 2 & 1 & 1 \\ 2 & 3 & 2 \\ 3 & 3 & 4 \end{bmatrix},$$
which has a characteristic equation
$$\lambda^3 - 9\lambda^2 + 15\lambda - 7 = 0,$$
and it can be easily factorized to give
$$(\lambda - 7)(\lambda - 1)^2 = 0.$$
The eigenvalues of A are 7 of multiplicity one and 1 of multiplicity two. The eigenvectors corresponding to these eigenvalues are $x_1 = [1, 2, 3]^T$, $x_2 = [1, 0, -1]^T$, and $x_3 = [0, 1, -1]^T$. Thus, the nonsingular matrix Q is given by
$$Q = \begin{bmatrix} 1 & 1 & 0 \\ 2 & 0 & 1 \\ 3 & -1 & -1 \end{bmatrix},$$
and the inverse of this matrix is
$$Q^{-1} = \frac{1}{6} \begin{bmatrix} 1 & 1 & 1 \\ 5 & -1 & -1 \\ -2 & 4 & -2 \end{bmatrix}.$$
Thus,
$$Q^{-1}AQ = \begin{bmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = D. \quad \bullet$$
Computing Powers of a Matrix
$$\lambda^3 + \lambda^2 - 4\lambda - 4 = 0$$
and factorizes to
$$(\lambda + 1)(\lambda + 2)(\lambda - 2) = 0.$$
It gives the eigenvalues –1, –2, and 2 of the given matrix A with the corresponding eigenvectors $[1, 2, 1]^T$, $[1, 1, 1]^T$, and $[1, 1, 0]^T$. Then the factorization
$$A = QDQ^{-1}$$
becomes
$$\begin{bmatrix} 1 & 1 & -4 \\ 2 & 0 & -4 \\ -1 & 1 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} -1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} -1 & 1 & 0 \\ 1 & -1 & 1 \\ 1 & 0 & -1 \end{bmatrix}.$$
From this formula, we can easily compute any power of the given matrix A. For example, if k = 10, then
$$A^{10} = \begin{bmatrix} 2047 & -1023 & 0 \\ 2046 & -1022 & 0 \\ 1023 & -1023 & 1024 \end{bmatrix}.$$
The above results can be obtained using the MATLAB Command Window as follows:
>> A = [1 1 -4; 2 0 -4; -1 1 -2];
>> P = poly(A);
>> [X, D] = eig(A);
>> eigenvalues = diag(D);
>> x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
>> Q = [x1 x2 x3];
>> A10 = Q*D^10*inv(Q);
Orthogonal Diagonalization
$$D = Q^{-1}AQ = Q^T AQ$$
is a diagonal matrix. •
$$D = Q^T AQ.$$
Therefore,
$$A = QDQ^T.$$
Taking its transpose gives
$$A^T = (QDQ^T)^T = QD^T Q^T = QDQ^T = A.$$
Thus, A is symmetric. The converse of this theorem is also true, but it is beyond the scope of this text and will be omitted. •
Symmetric Matrices
Theorem 3.16 The following conditions are equivalent for an n×n matrix
Q:
(a) Q is invertible and Q−1 = QT .
$$\lambda^3 - 8\lambda^2 + 19\lambda - 12 = 0,$$
and it gives the eigenvalues 1, 3, and 4 for the given matrix A. Corresponding to these eigenvalues, the eigenvectors of A are $x_1 = [1, 2, 1]^T$, $x_2 = [1, 0, -1]^T$, and $x_3 = [1, -1, 1]^T$, and they form an orthogonal set. Note that the following vectors
$$u_1 = \frac{x_1}{\|x_1\|_2} = \frac{1}{\sqrt{6}}[1, 2, 1]^T, \qquad u_2 = \frac{x_2}{\|x_2\|_2} = \frac{1}{\sqrt{2}}[1, 0, -1]^T, \qquad u_3 = \frac{x_3}{\|x_3\|_2} = \frac{1}{\sqrt{3}}[1, -1, 1]^T$$
form an orthonormal set, so that
$$Q = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{2}{\sqrt{6}} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \end{bmatrix}$$
and
$$Q^T AQ = \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix} \begin{bmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{2}{\sqrt{6}} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}.$$
Note that the eigenvalues 1, 3, and 4 of the matrix A are real and its eigenvectors form an orthonormal set, since they inherit orthogonality from $x_1$, $x_2$, and $x_3$, which satisfy the preceding theorem. •
The results of Example 3.15 can be obtained using the MATLAB Com-
mand Window as follows:
>> A = [3 -1 0; -1 2 -1; 0 -1 3];
>> P = poly(A);
>> PP = poly2sym(P);
>> [X, D] = eig(A);
>> lambda = diag(D);
>> x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
>> u1 = x1/norm(x1); u2 = x2/norm(x2); u3 = x3/norm(x3);
>> Q = [u1 u2 u3];
>> D = Q'*A*Q;
$$A = A^* = \bar{A}^T,$$
i.e., whenever $a_{ij} = \bar{a}_{ji}$. This is the complex analog of symmetry. For example, a matrix A of the form
$$A = \begin{bmatrix} a & b + ic \\ b - ic & d \end{bmatrix}$$
is Hermitian; here the real part is symmetric but the imaginary part is not. Note that:
1. Every diagonal matrix is Hermitian if and only if it is real.
2. A square matrix A is said to be skew Hermitian when
$$A = -A^* = -\bar{A}^T,$$
i.e., whenever $a_{ij} = -\bar{a}_{ji}$. This is the complex analog of skew symmetry. For example, the following matrix A is skew Hermitian:
$$A = \begin{bmatrix} 0 & 1+i \\ -1+i & i \end{bmatrix}. \quad \bullet$$
$$AA^* = A^*A = I_n,$$
However, its eigenvalues are $a + ib$. Note that all Hermitian, skew Hermitian, and unitary matrices are normal matrices. •
$$-\lambda^3 + 3\lambda^2 = 0.$$
$$-\lambda^3 + 4\lambda^2 - 3\lambda = 0.$$
$$\lambda^3 - 3\lambda^2 - 10\lambda + 24 = 0,$$
$$|A - \lambda I| = 0,$$
$$\lambda^3 - 10\lambda^2 + 9\lambda = 0.$$
Multiplying each term in (3.34) by $A^{-1}$ (when $A^{-1}$ exists, and thus $\alpha_0 \ne 0$) gives an important relationship for the inverse of a matrix:
$$A^{-1} = -\frac{1}{\alpha_0}\big[A^{n-1} + \alpha_{n-1}A^{n-2} + \alpha_{n-2}A^{n-3} + \cdots + \alpha_1 I\big].$$
Program 3.3
MATLAB m-file for Using the Cayley–Hamilton Theorem
function [c,Ainv]=chim(A)
n=max(size(A)); I=eye(n);
AA=A; AAA=A; c=[1];
for k=1:n
traceA=0;
for g=1:n                    % trace of the current matrix
traceA=traceA+A(g,g); end
cc = -1/k*traceA;            % next coefficient of the polynomial
c = [c, cc];
if k < n
b = A + cc*I;                % B_k = A_k + c_k*I
A = AA*b;                    % A_{k+1} = A*B_k
end; end
su = c(n)*I + c(n-1)*AA;     % assemble the Cayley-Hamilton sum
if n > 2
for z=2:n-1
AAA = AAA*AA;                % AA^z
su = su + c(n-z)*AAA;
end; end
Ainv = -1/c(n+1)*su;         % inverse via the characteristic polynomial
$$A^2 - 9A + 24I - 20A^{-1} = 0,$$
which gives
$$A^{-1} = \frac{1}{20}\big[A^2 - 9A + 24I\big].$$
Similarly, one can also find higher powers of the given matrix A. For example, one can compute the value of the matrix $A^5$ by solving the expression
$$A^5 = 9A^4 - 24A^3 + 20A^2,$$
and it gives
$$A^5 = \begin{bmatrix} 32 & 80 & 3013 \\ 0 & 32 & 3093 \\ 0 & 0 & 3125 \end{bmatrix}. \quad \bullet$$
>> A = [2 1 2; 0 2 3; 0 0 5];
c = chim(A);
$$Ax = \lambda x,$$
then
$$A^{-1}Ax = \lambda A^{-1}x.$$
Hence,
$$A^{-1}x = \frac{1}{\lambda}x,$$
which shows that x is also an eigenvector of $A^{-1}$, corresponding to the eigenvalue $1/\lambda$.
$$\lambda^3 - 6\lambda^2 + 11\lambda - 6 = 0,$$
$$p(\lambda) = \lambda^3 + c_1\lambda^2 + c_2\lambda + c_3.$$
$$A_1 = AB_0 = AI = A,$$
so
$$c_1 = -\frac{1}{1}\mathrm{tr}(A_1) = -12.$$
Now
$$B_1 = A_1 + c_1 I = A_1 - 12I = \begin{bmatrix} -7 & -2 & -4 \\ -2 & -10 & 2 \\ -4 & 2 & -7 \end{bmatrix}$$
and
$$A_2 = AB_1 = \begin{bmatrix} -15 & 2 & 4 \\ 2 & -12 & -2 \\ 4 & -2 & -15 \end{bmatrix},$$
so
$$c_2 = -\frac{1}{2}\mathrm{tr}(A_2) = 21.$$
Now
$$B_2 = A_2 + c_2 I = A_2 + 21I = \begin{bmatrix} 6 & 2 & 4 \\ 2 & 9 & -2 \\ 4 & -2 & 6 \end{bmatrix}$$
and
$$A_3 = AB_2 = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 10 & 0 \\ 0 & 0 & 10 \end{bmatrix},$$
so
$$c_3 = -\frac{1}{3}\mathrm{tr}(A_3) = -10.$$
Thus,
$$p(\lambda) = \lambda^3 - 12\lambda^2 + 21\lambda - 10$$
and
$$A^{-1} = -\frac{1}{c_3}B_2 = \frac{1}{10}\begin{bmatrix} 6 & 2 & 4 \\ 2 & 9 & -2 \\ 4 & -2 & 6 \end{bmatrix}$$
is the inverse of the given matrix A. •
>> A = [5 − 2 − 4; −2 2 2; −4 2 5];
>> [c, Ainv] = sourian(A);
The Eigenvalue Problems 383
Program 3.4
MATLAB m-file for Using the Sourian–Frame Theorem
function [c,Ainv]=sourian(A)
[n,n]=size(A);
for i=1:n; for j=1:n; b0(i,j)=0;b0(i,i)=1;end;end
AA=A; c[1]; for k = 1:n;
traceA=0; for i=1:n; traceA=traceA+A(i,i);end;
cc = −1/k ∗ tracA; c = [c, cc]; if k < n;
for i=1:n; for j=1:n
b(i, j) = A(i, j) + cc ∗ b0(i, j);end;end
for i=1:n; for j=1:n; s = 0; for m =1:n;
ss = AA(i, m) ∗ b(m, j); s = s + ss; end;
A(i, j) = s; end; end; end; end;
for i=1:n; for j=1:n;
ai(i,j)=-1/c(n+1)*b(i,j); end; end
disp(’Coefficients of the polynomial’) c
disp(’Inverse of the matrix A=ai’) ai
αn−1 = −tr(A)
1
αn−2 = − [αn−1 tr(A) + tr(A2 )]
2
1
αn−3 = − [αn−2 tr(A) + αn−1 tr(A2 ) + tr(A3 )]
3
.. ..
. .
1
α0 = − [α1 tr(A) + α2 tr(A2 ) + · · · + tr(An )].
n
This formula is called Bocher’s formula, which can be used to find
the coefficients of a characteristic equation of a square matrix.
384 Applied Linear Algebra and Optimization using MATLAB
λ3 + α2 λ2 + α1 λ + α0 = 0,
where
α2 = −tr(A)
1
α1 = − [α2 tr(A) + tr(A2 )]
2
1
α0 = − [α1 tr(A) + α2 tr(A2 ) + tr(A3 )].
3
In order to find the values of the above coefficients, we must compute
the powers of matrix A as follows:
4 1 3 13 1 12
2 3
A = 9 0 9 and A = 27 0 27 .
5 −1 6 14 −1 15
α2 = −(4) = −4
1
α1 = − [−4(4) + 10] = 3
2
1
α0 = − [3(4) + (−4)(10) + 28] = 0.
3
The Eigenvalue Problems 385
λ3 − 4λ2 + 3λ = 0.
>> A = [1 1 0; 3 0 1; 2 − 1 3];
>> c = BOCH(A);
Program 3.5
MATLAB m-file for Using Bocher’s Theorem
function c=BOCH(A)
[n,n]=size(A);
for i=1:n; for j=1:n; I(i,i)=1; end; end
A(n-1)=-trace(A);
for i=2:n; s=0; p=1;
for k=i-1:-1:1
s = s + A(n − k) ∗ trace(Aˆ p); p = p + 1; end
˜ A(n − i) = (−1/i) ∗ (s + trace(Aˆ i)); else
if i=n;
Ao = (−1/i) ∗ (s + trace(Aˆ i)); end; end
coeff=[Ao A]; if ao=0;
˜ s=A ˆ (n-1);
for i=1:n-2
s = s + a(n − i) ∗ Aˆ (n − (i + 1)); end
s = s + A(1) ∗ I; Ainv = −s/Ao; else; end
1
αn−2 = − (AD2 )
2
1
αn−3 = − (AD3 )
3
.. ..
. .
1
α0 = − (ADn ),
n
where
D1 = I
D2 = AD1 + αn−1 I = A + αn−1 I
D3 = AD2 + αn−2 I = A2 + αn−1 A + αn−2 I
..
.
Dn = ADn−1 + α1 I = An−1 + αn−1 An−2 + · · · + α2 A + α1 I
and also
Dn+1 = ADn + α0 I = 0.
Then the determinant of A is
det(A) = |A| = (−1)n α0 ,
the adjoint of A is
adj(A) = (−1)n+1 Dn ,
and the inverse of A is
1
A−1 = − Dn .
α0
Note that a singular matrix is indicated by α0 = 0.
|A − λI| = λ3 + α2 λ2 + α1 λ + α0 = 0.
and
25 0 0
AD3 = 0 25 0 ,
0 0 25
388 Applied Linear Algebra and Optimization using MATLAB
which gives
1 1
α0 = − tr(AD3 ) = − (25) = −25,
3 3
which shows that the given matrix is nonsingular. Hence, the char-
acteristic equation is
|A − λI| = λ3 + λ2 + 3λ − 25 = 0.
Ax = λx,
Thus, the scalar is equal to its own conjugate and hence, real. There-
fore, λ is real.
The Eigenvalue Problems 389
λ3 − 7λ2 + 15λ − 9 = 0,
and it gives the real eigenvalues 1, 3, and 3 for the given matrix A.•
So
AA∗ = (QBQ−1 )(QB ∗ Q−1 )
= (QBB ∗ Q−1 )
= (QB ∗ BQ−1 )
= (QB ∗ Q−1 QBQ−1 )
= (QB ∗ Q−1 )(QBQ−1 ) = A∗ A.
This shows that the matrix A is normal.
18. The value of the exponential of a matrix A can be calculated from
eA = Q(expΛ)Q−1 ,
where expΛ is a diagonal matrix whose elements are the exponential
of successive eigenvalues, and Q is a matrix of the eigenvectors of A.
λ2 − αᾱ = λ2 − |α|2 = 0,
(αᾱ − λ)2 = 0,
•
394 Applied Linear Algebra and Optimization using MATLAB
which gives the eigenvalues −1, 31 of A−1 . Hence, the spectral radius
of the matrix A−1 is
1
ρ(A−1 ) = max{| − 1|, | |} = 1.
3
Thus,
1 > min{−1.7321, −0.2680},
which satisfies the relation (3.35). •
The Eigenvalue Problems 395
>> A = hilb(3);
A=
1.0000 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
and one can find the condition number of this Hilbert matrix as
>> cond(A)
ans = .
524.0568
Solution. Since
T
25 − λ 26
det(A A − λI) = = 0,
26 29 − λ
solving the above equation gives
λ2 − 54λ + 49 = 0.
The solutions 53.08 and 0.92 of the above equation are called the
eigenvalues of the matrix AT A. Thus, the conditioning of the given
matrix can be obtained as
1/2
53.08
condA = ≈ 7.6,
0.92
which shows that the given matrix A is not ill-conditioned. •
The Eigenvalue Problems 397
where the aij are known constants. This is called a linear system of differ-
ential equations. To write (3.36) in matrix form, we have
f1 (t) f10 (t) a11 a12 · · · a1n
f2 (t)
0
f20 (t) a21 a22 · · · a2n
f (t) = , f (t) = , A= .. .
.. .. .. .. ..
. . . . . .
fn (t) fn0 (t) an1 an2 · · · ann
A set of vector functions {f (1) (t), f (2) (t), . . . , f (n) (t)} is said to be a fun-
damental system for (3.36) if every solution to (3.36) can be written in the
form (3.38). In this case, the right side of (3.38), where c1 , c2 , . . . , cn are
arbitrary constants, is said to be the general solution to (3.37).
and B is the n × n matrix whose columns are f (1) (0), f (2) (0), . . . , f (n) (0),
respectively.
Note that, if f (1) (t), f (2) (t), . . . , f (n) (t) form a fundamental system for
(3.36), then B is nonsingular, so (3.40) always has a unique solution.
Example 3.34 The simplest system of the form (3.36) is the single equa-
tion
df
= αf, (3.41)
dt
where α is a constant. The general solution to (3.41) is
f1 (t) = c1 ea11 t
f2 (t) = c2 ea22 t
.. .. (3.44)
. .
fn (t) = cn eann t ,
400 Applied Linear Algebra and Optimization using MATLAB
is the general solution to the given system of differential equations, and the
functions
1 0 0
f (1) (t) = 0 e2t , f (2) (t) = 1 e5t , f (3) (t) = 0 et
0 0 1
form a fundamental system for the given diagonal system. •
If the system (3.37) is not diagonal, then there is an extension of the
method discussed in the preceding example that yields the general solution
in the case where A is diagonalizable. Suppose that A is diagonalizable and
Q is a nonsingular matrix such that
D = Q−1 AQ, (3.45)
where D is diagonal. Then by multiplying Q−1 to the system (3.37), we
get
Q−1 f 0 (t) = Q−1 Af (t)
or
Q−1 f 0 (t) = Q−1 AQQ−1 f (t) = (Q−1 AQ)(Q−1 f (t)), (3.46)
−1
(since QQ = In ). Let
g(t) = Q−1 f (t), (3.47)
and by taking its derivative, we have
g0 (t) = Q−1 f 0 (t). (3.48)
Since Q−1 is a constant matrix, using (3.48) and (3.45), we can write (3.46)
as
g0 (t) = Dg(t). (3.49)
Then the system (3.49) is a diagonal system and can be solved by the
method just discussed. Since the matrix D is a diagonal matrix and all its
diagonal elements are also called the eigenvalues λ1 , λ2 , . . . , λn of A, it can
be written as
λ1 0 · · · 0
0 λ2 · · · 0
D = .. .. .
.. ..
. . . .
0 0 · · · λn
402 Applied Linear Algebra and Optimization using MATLAB
where
1 0 0
(1)
0
λ1 t (2)
1
λ2 t (n)
0
λn t
g (t) = e , g (t) = e , ...,g (t) = e ,
.. .. ..
. . .
0 0 1
(3.50)
and c1 , c2 , . . . , cn are arbitrary constants. The system (3.47) can also be
written as
f (t) = Qg(t). (3.51)
So the general solution to the given system (3.37) is
However, since the constant vectors in (3.50) are the columns of the
identity matrix and QIn = Q, (3.52) can be rewritten as
f 0 (t) = Af (t)
Q = [q1 , q2 , . . . , qn ],
The Eigenvalue Problems 403
f10 = 3f1 − f2 − f3
0
f2 = −12f1 + 5f3
0
f3 = 4f1 − 2f2 − f3 .
or
0 = f (λ) = −(λ + 1)(λ − 1)(λ − 2).
So the eigenvalues of A are λ1 = −1, λ2 = 1, and λ3 = 2, and the associated
eigenvectors are
1
3 1
2 −1 , −1 ,
1 ,
1 7 2
404 Applied Linear Algebra and Optimization using MATLAB
1 7 2
or 1
3 1 c1 1
2
1 −1 −1 c2 = 4 .
1 7 2 c3 3
Solving this system for c1 , c2 , and c3 using the Gauss elimination method,
we obtain
c1 = 2, c2 = 1, c3 = −3.
Therefore, the solution to the initial-value problem is
1
3 1
f (t) = 2 21 e−t + −1 et − 3 −1 e2t .
1 7 2
•
The Eigenvalue Problems 405
an = n3 + 2. (3.54)
a0 = 2, a1 = 3, a2 = 10, . . . .
where p and q are fixed numbers and a0 and a1 are the given initial con-
ditions. This equation is called a linear difference equation (because each
ai appears to the first power) and is of order 2 (because an is expressed
in terms of two preceding terms an−1 and an−2 ). To solve (3.56), let us
introduce a relation bn = an−1 . So we get the system
an = pan−1 + qbn−1
bn = an−1 , n = 2, 3, 4, . . . . (3.57)
Then
n−1
An−1 = QDQ−1 = QDQ−1 QDQ−1 · · · QDQ−1 = QDn−1 Q−1
or
n−1 (λ1 )n−1 0
λ1 0
An−1 = Q Q−1 = Q Q−1 .
0 λ2 n−1
0 (λ2 )
This gives
(λ1 )n−1 0
Vn = Q Q−1 V1 . (3.62)
n−1
0 (λ2 )
Example 3.37 Solve the difference equation
an = 2an−1 + 8an−2 , n = 2, 3, 4, . . .
an = 2an−1 + 8bn−1
bn = an−1 , n = 2, 3, 4, . . . .
p(λ) = λ2 − 2λ − 8 = (λ − 4)(λ + 2) = 0,
λ1 = 4 and λ2 = −2,
2 1 2 1
a20 = (4)20 + (−2)20 = (1099511627776) + (1048576),
3 3 3 3
so
a20 = 733008101376. •
3.7 Summary
In this chapter we discussed the approximation of eigenvalues and eigen-
vectors. We discussed similar, unitary, and diagonalizable matrices. The
set of diagonalizable matrices includes matrices with n distinct eigenvalues
and symmetric matrices. Matrices that are not diagonalizable are some-
times referred to as defective matrices.
3.8 Problems
1. Find the characteristic polynomial, eigenvalues, and eigenvectors of
each matrix:
3 2 1 −2 1 1 2 −1 −1
(a) 2 1 1 , (b) −6 1 3 , (c) −1 2 −1 ,
1 1 1 −12 −2 8 −1 −1 2
4 3 2 1
1 1 1 3 2 −2 3 3 2 1
(d) 1 1 0 , (e) −3 −1 3 , (f )
2 2 2 1 .
1 0 1 1 2 0
1 1 1 1
2. Determine whether each of the given sets of vectors is linearly depen-
dent or independent:
(a) p1 = 3x2 − 1, p2 = x2 + 2x − 1, p3 = x2 − 4x + 3.
(b) p1 = x2 + 5x + 12, p2 = 3x2 + 5x − 3, p3 = 4x2 − 3x + 7.
(c) p1 = 2x2 − 8x + 9, p2 = 6x2 + 13x − 22, p3 = 4x2 − 11x + 2.
(d) p1 = −2x2 + 3x, p2 = 7x2 − 5x − 10, p3 = −3x2 + 9x − 13.
4. For what values of k are the following vectors in R3 linearly indepen-
dent?
(a) (−1, 0, −1), (2, 1, 2), and (1, 1, k).
(b) (1, 2, 3), (2, −1, 4), and (3, k, 4).
(c) (2, k, 1), (1, 0, 1), and (0, 1, 3).
(d) (k, 21 , 12 ), ( 21 , k, 12 ), and ( 21 , 12 , k).
5. Show that the vectors (1, a, a2 ), (1, b, b2 ), and (1, c, c2 ) are linearly
independent, if a 6= b, a =
6 c, b 6= c.
The Eigenvalue Problems 411
9. Find the formula for the kth power of each matrix considered in Prob-
lem 5, and then compute A5 .
Find a formula for the kth power of the matrix and compute A10 .
412 Applied Linear Algebra and Optimization using MATLAB
16. Find the characteristic polynomial and inverse of each of the matrices
considered in Problem 5 by using the Cayley–Hamilton theorem.
The Eigenvalue Problems 413
18. Find the characteristic polynomial and inverse for each of the follow-
ing matrices by using the Sourian–Frame theorem:
5 0 0 2 2 1 1 1 −1
(a) 2 1 2 , (b) −2 1 2 , (c) 3 1 0 ,
0 1 1 1 −2 2 1 −2 1
1 0 2 −1 4
4 0 0 2
1 2 1 5 3 −1 0 1
0 3 0 −1
(d) 1 2 0 , (e)
1 0
, (f ) 8 5 −3 −1 4 .
2 2
1 4 2 6 2 0 0 1
0 0 0 2
0 1 4 2 0
19. Find the characteristic polynomial and inverse of the following matrix
by using the Sourian–Frame theorem:
1 0 0 0
−2 1 0 0
A=
5 −4 1
.
0
−5 3 0 1
20. Find the characteristic polynomial and inverse of each of the given
matrices considered in Problem 11 by using the Sourian–Frame the-
orem.
24. Find the general solution to the following system. Then find the
solution to the initial-value problem determined by the given initial
condition.
f10 = −f1 + 5f2
f20 = f1 + 3f2
f1 (0) = 1, f2 (0) = −1.
25. Find the general solution to the following system. Then find the
solution to the initial-value problem determined by the given initial
condition.
f10 = 9f1 + 2f2
f20 = 3f1 + 5f2
f1 (0) = 2, f2 (0) = 4.
26. Find the general solution to each of the following systems. Then find
the solution to the initial-value problem determined by the given ini-
tial condition.
(a)
f10 = 2f1 + f2 + 2f3
f20 = 2f1 + 2f2 − 2f3
f30 = 3f1 + f2 + f3
f1 (0) = 1, f2 (0) = 1, f3 (0) = 1.
The Eigenvalue Problems 415
(b)
f10 = 3f1 + 2f2 + 3f3
f20 = −f1 + f2 + 4f3
f30 = f1 + f2 + 2f3
f1 (0) = 4, f2 (0) = 2, f3 (0) = −1.
(c)
f10 = 5f1 + 8f2 + 16f3
0
f2 = 4f1 + f2 + 8f3
f30 = −4f1 − 4f2 − 11f3
f1 (0) = 1, f2 (0) = 1, f3 (0) = 1.
(d)
f10 = 2f1 − 3f2 − 2f3
f20 = −2f1 + 3f3
0
f3 = f1 − 6f2 + 9f3
f1 (0) = 3, f2 (0) = 2, f3 (0) = 1.
27. Find the general solution to each of the following systems. Then find
the solution to the initial-value problem determined by the given ini-
tial condition.
(a)
f10 = 5f1
f20 = + 7f2
f30 = + 6f3
f1 (0) = 1, f2 (0) = 2, f3 (0) = 3.
(b)
f10 = 3f1 + 2f3
f20 = 4f1 − 3f2 + f3
f30 = 2f1 + 2f2 + 4f3
f1 (0) = 3, f2 (0) = 3, f3 (0) = 3.
416 Applied Linear Algebra and Optimization using MATLAB
(c)
f10 = 2f1 + f2 + 2f3
f20 = 1f1 + 5f2 + 5f3
f30 = 2f1 + f2 + f3
f1 (0) = −1, f2 (0) = 2, f3 (0) = 5.
(d)
f10 = f1 + f3
f20 = 2f1 + f2 + 3f3
f30 = f1 + 3f2 + 5f3
f1 (0) = 2, f2 (0) = 1, f3 (0) = 6.
28. Solve each of the following difference equations and use the solution
to find the given term.
29. Solve each of the following difference equations and use the solution
to find the given term.
Numerical Computation of
Eigenvalues
4.1 Introduction
The importance of the eigenvalues of a square matrix in a broad range of
applications has been amply demonstrated in the previous chapters. How-
ever, finding the eigenvalues and associated eigenvectors is not such an easy
task. At this point, the only method we have for computing the eigenvalues
of a matrix is to solve the characteristic equation. However, there are sev-
eral problems with this method that render it impractical in all but small
examples. The first problem is that it depends on the computation of a de-
terminant, which is a very time-consuming process for large matrices. The
second problem is that the characteristic equation is a polynomial equa-
tion, and there are no formulas for solving polynomial equations of degrees
higher than 4 (polynomials of degrees 2, 3, and 4 can be solved using the
quadratic formula and its analogues). Thus, we are forced to approximate
417
418 Applied Linear Algebra and Optimization using MATLAB
The power methods include three versions. First, is the regular power
method or simple iterations based on the power of a matrix. Second, is
the inverse power method which is based on the inverse power of a matrix.
Third, is the shifted inverse power method in which the given matrix A is
Numerical Computation of Eigenvalues 419
Avi = λi vi , (4.1)
where λi is the ith eigenvalue and vi is the corresponding ith eigenvector of
A. The power method can be used on both symmetric and nonsymmetric
matrices. If A is a symmetric matrix, then all the eigenvalues are real. If
A is a nonsymmetric, then there is a possibility that there is not a single
real dominant eigenvalue but a complex conjugate pair. Under these con-
ditions the power method does not converge. We assume that the largest
eigenvalue is real and not repeated and that eigenvalues are numbered in
increasing order, i.e.,
and it gives
x1 = Ax0
x2 = Ax1 = A2 x0
x3 = Ax2 = A3 x0 .
..
.
Thus,
xk = A k x 0 , for k = 1, 2, . . . .
x0 = α1 v1 + α2 v2 + · · · + αn vn .
Let
x1 = Ax0 = A(α1 v1 + α2 v2 + · · · + αn vn )
= α1 Av1 + α2 Av2 + · · · + αn Avn
= α1 λ1 v1 + α2 λ2 v2 + · · · + αn λn vn ,
x2 = Ax1 = A(α1 λ1 v1 + α2 λ2 v2 + · · · + αn λn vn )
= α1 λ1 Av1 + α2 λ2 Av2 + · · · + αn λn Avn
= α1 λ21 v1 + α2 λ22 v2 + · · · + αn λ2n vn .
All of the terms except the first in the above relation (4.4) converge to
the zero vector as k → ∞, since |λ1 | > |λi | for i 6= 1. Hence,
Numerical Computation of Eigenvalues 421
Axk ≈ λ1 xk ,
Example 4.1 Find the first five iterations obtained by the power method
applied to the following matrix using the initial approximation x0 = [1, 1, 1]T :
1 1 2
A = 1 2 1 .
1 1 0
which gives
1.0000
Ax0 = λ1 x1 = 4.0000 1.0000 .
0.5000
422 Applied Linear Algebra and Optimization using MATLAB
>> A = [1 1 2; 1 2 1; 1 1 0];
>> X = [1 1 1]0 ;
>> maxI = 5;
>> sol = P M (A, X, maxI);
Program 4.1
MATLAB m-file for the Power method
function sol=PM (A, X, maxI)
[n, n] = size(A);
for k=1:maxI; for i=1:n; s=0;
for j=1:n; ss = A(i, j) ∗ X(j, 1); s = s + ss;end
XX(i, 1) = s;end; X = XX; y = max(X);
for i=1:n; X(i, 1) = 1/y ∗ X(i, 1);end; yy=abs(y-y1);
if (yy <= 1e − 6); break; end; y; end
The power method has the disadvantage that it is unknown at the out-
set whether or not a matrix has a single dominant eigenvalue. Nor is it
known how an initial vector x0 should be chosen to ensure that its repre-
sentation in terms of the eigenvectors of a matrix will contain a nonzero
contribution from the eigenvector associated with the dominant eigenvalue,
should it exist.
The power method will converge if the given n × n matrix A has linearly
independent eigenvectors and a symmetric matrix satisfies this property.
Now we will discuss the power method for finding the dominant eigenvalue
of a symmetric matrix only.
The basic steps for the power method with Euclidean scaling are:
1. Choose an arbitrary nonzero vector and normalize it to obtain a unit
vector x0 .
Example 4.3 Apply the power method with Euclidean scaling to the ma-
trix
2 2
A= ,
2 5
with x0 = [0, 1]T to get the first four approximations to the dominant unit
eigenvector and the dominant eigenvalue.
Solution. Starting with the unit vector x0 = [0, 1]T , we get the first ap-
proximation of the dominant unit eigenvector as follows:
2 2 0 2 Ax0 1 2 0.3714
Ax0 = = , x1 = =√ ≈ .
2 5 1 5 max(Ax0 ) 29 5 0.9285
Similarly, for the second, third, and fourth approximations of the dominant
unit eigenvector, we find
2 2 0.3714 2.5997
Ax1 = = ,
2 5 0.9285 5.3852
Ax1 1 2.5997 0.4347
x2 = ≈ ≈ .
max(Ax1 ) 5.9799 5.3852 0.9006
426 Applied Linear Algebra and Optimization using MATLAB
2 2 0.4347 2.6706
Ax2 = = ,
2 5 0.9006 5.3723
Ax2 1 2.6706 0.4451
x3 = ≈ ≈ .
max(Ax2 ) 5.9994 5.3723 0.8955
2 2 0.4451 2.6812
Ax3 = = ,
2 5 0.8955 5.3676
Ax3 1 2.6812 0.4469
x4 = ≈ ≈ .
max(Ax3 ) 6 5.3676 0.8946
(Ax1 )T x1
Ax1 .x1 0.3714
λ1 = = = (2.5997 5.3852) ≈ 5.9655,
x1 .x1 xT1 x1 0.9285
(Ax2 )T x2
Ax2 .x2 0.4347
λ2 = = = (2.6706 5.3723) ≈ 5.9992,
x2 .x2 xT2 x2 0.9006
(Ax3 )T x3
Ax3 .x3 0.4451
λ3 = = = (2.6812 5.3676) ≈ 6.0001,
x3 .x3 xT3 x3 0.8955
(Ax4 )T x4
Ax4 .x4 0.4469
λ4 = = = (2.6830 5.3668) ≈ 6.0002.
x4 .x4 xT4 x4 0.8946
Example 4.4 Apply the power method with maximum entry scaling to the
matrix
2 2
A= ,
2 5
with x0 = [0, 1]T , to get the first four approximations to the dominant
eigenvector and the dominant eigenvalue.
428 Applied Linear Algebra and Optimization using MATLAB
Solution. Starting with x0 = [0, 1]T , we get the first approximation of the
dominant eigenvector as follows:
2 2 0 2 Ax0 1 2 0.4000
Ax0 = = , x1 = = = .
2 5 1 5 max(Ax0 ) 5 5 1.0000
Similarly, for the second, third, and fourth approximations of the dominant
eigenvector, we find
2 2 0.4000 2.8000
Ax1 = = ,
2 5 1.0000 5.8000
Ax1 1 2.8000 0.4828
x2 = = ≈ .
max(Ax1 ) 5.8000 5.8000 1.0000
2 2 0.4828 2.9655
Ax2 = ≈ ,
2 5 1.0000 5.9655
Ax2 1 2.9655 0.4971
x3 = = ≈ .
max(Ax2 ) 5.9655 5.9655 1.0000
2 2 0.4971 2.9942
Ax3 = ≈ ,
2 5 1.0000 5.9942
Ax3 1 2.9942 0.4995
x4 = = ≈ ,
max(Ax3 ) 5.9942 5.9942 1.0000
which are the required first four approximations of the dominant eigenvec-
tor.
Notice that the main difference between the power method with Eu-
clidean scaling and the power method with maximum entry scaling is that
the Euclidean scaling gives a sequence that approaches a unit dominant
eigenvector, whereas maximum entry scaling gives a sequence that ap-
proaches a dominant eigenvector whose largest component is 1.
A−1 Ax = λA−1 x
430 Applied Linear Algebra and Optimization using MATLAB
or
1
A−1 x = x. (4.10)
λ
The solution procedure is initiated by starting with an initial guess for
the vector xi and improving the solution by getting a new vector xi+1 , and
so on until the vector xi is approximately equal to xi+1 .
Example 4.5 Use the inverse power method to find the first seven approxi-
mations of the least dominant eigenvalue and the corresponding eigenvector
of the following matrix using an initial approximation x0 = [0, 1, 2]T :
3 0 1
A = 0 −3 0 .
1 0 3
3 1
0 −
8 8
−1
1
A = 0 − .
0
3
1 3
− 0
8 8
3 1
0 −
8 8 0 −0.2500
1
A−1 x0 =
0 −3 0 = −0.3333
1
1 3 2 0.7500
− 0
8 8
−0.3333
= 0.75 −0.4444 = λ1 x1 .
1.0000
Numerical Computation of Eigenvalues 431
3 1
0 −
8 8 −0.3333 −0.2500
1
A−1 x1 = 0 − 0 −0.4444 = 0.1481
1 3
1.0000 0.4167
3
− 0
8 8
−0.6000
= 0.4167 0.3558 = λ2 x2 .
1.0000
3 1
0 −
8 8 −0.6000 −0.3500
1
A−1 x2 =
0 −3 0
0.3558 = −0.1185
1 3 1.0000 0.4500
− 0
8 8
−0.7778
= 0.4500 −0.2634 = λ3 x3 .
1.0000
3 1
0 −
8 8 −0.7778 −0.4167
1
A−1 x3 =
0 −3 0 −0.2634
= 0.0878
1 3 1.0000 0.4722
− 0
8 8
−0.8824
= 0.4722 0.1859 = λ4 x4 .
1.0000
432 Applied Linear Algebra and Optimization using MATLAB
3 1
0 −
−0.8824
8 8 −0.4559
1
A−1 x4 =
0 −3 0
0.1859 = −0.0620
1 3 1.0000 0.4853
− 0
8 8
−0.9394
= 0.4853 −0.1277 = λ5 x5 .
1.0000
3 1
0 −
8 8 −0.9394 −0.4773
1
A−1 x5 =
0 −3 0 −0.1277
= −0.0426
1 3 1.0000 0.4924
− 0
8 8
−0.9692
= 0.4924 −0.0864 = λ6 x6 .
1.0000
3 1
0 −
8 8 −0.9692 −0.4885
1
A−1 x6 =
0 −3 0
0.0864 = −0.0288
1 3 1.0000 0.4962
− 0
8 8
−0.9845
= 0.4962 −0.0581 = λ7 x7 .
1.0000
Since the eigenvalues of the given matrix A are −3.0000, 2.0000, and
4.0000, the dominant eigenvalue of A−1 after the seven iterations is λ7 =
0.4962 and converges to 21 and so the smallest dominant eigenvalue of the
given matrix A is the reciprocal of the dominant eigenvalue 21 of the matrix
A−1 , i.e., 2 and the corresponding eigenvector is [−0.9845, −0.0581, 1.0000]T .
Numerical Computation of Eigenvalues 433
To get the above results using the MATLAB Command Window, we do:
>> A = [3 0 1; 0 − 3 0; 1 0 3];
>> X = [0 1 2]0 ;
>> maxI = 7;
>> sol = IP M (A, X, maxI);
Program 4.2
MATLAB m-file for using the Inverse Power method
function sol=IPM (A, X, maxI)
[n, n] = size(A); B = inv(A);
for k=1:maxI; for i=1:n; s=0;
for j=1:n; ss = B(i, j) ∗ X(j, 1); s = s + ss;end
XX(i, 1) = s;end; X = XX; y = max(X);
for i=1:n; X(i, 1) = 1/y ∗ X(i, 1);end; yy=abs(y-y1);
if (yy <= 1e − 6); break; end; y; end
and it follows that the eigenvalues of (A − µI) are the same as those of A
except that they have all been shifted by an amount µ. The eigenvectors
remain unaffected by the shift.
The shifted inverse power method is to apply the power method to the
system
1
(A − µI)−1 x = x. (4.12)
(λ − µ)
Thus the iteration of (A − µI)−1 leads to the largest value of (λ−µ)
1
, i.e.,
the smallest value of (λ − µ). The smallest value of (λ − µ) implies that
the value of λ will be the value closest to µ. Thus, by a suitable choice
434 Applied Linear Algebra and Optimization using MATLAB
(A − µI)−1 us = vs (4.13)
and
vs
us+1 = . (4.14)
max(vs )
By rearranging (4.13), we obtain
us = (A − µI)vs
= LU vs .
Let
U vs = z, (4.15)
then
Lz = us . (4.16)
By using an initial value, we can find z from (4.16) by applying for-
ward substitution, and knowing z we can find vs from (4.15) by applying
backward substitution. The new estimate for the vector us+1 can then be
found from (4.14). The iteration is terminated when us+1 is sufficiently
close to us , and it can be easily shown when convergence is completed.
1
λµ = + µ. (4.17)
dominant eigenvalue of (A − µI)−1
The shifted inverse power method uses the power method as a basis
but gives faster convergence. Convergence is to the eigenvalue λ that is
Numerical Computation of Eigenvalues 435
Example 4.6 Use the shifted inverse power method to find the first five
approximations of the eigenvalue nearest µ = 6 of the following matrix
using the initial approximation x0 = [1, 1]T :
4 2
A= .
3 5
Solution. Consider
−2 2
B = (A − 6I) = .
3 −1
The inverse of B is
1 1
4 2
B −1 = (A − 3I)−1 = .
3 1
4 2
436 Applied Linear Algebra and Optimization using MATLAB
1
λµ = + 6 = 7.
1
To get the above results using the MATLAB Command Window, we
do the following:
>> A = [4 2; 3 5];
>> mu = 6;
>> X = [1 1]0 ;
>> maxI = 5;
>> sol = SIP M (A, X, mu, maxI);
Program 4.3
MATLAB m-file for Using Shifted Inverse Power method
function sol=SIPM (A, X, mu, maxI)
[n, n] = size(A); B = A − mu ∗ eye(n); C = inv(B);
for k=1:maxI; for i=1:n; s=0;
for j=1:n; ss = C(i, j) ∗ X(j, 1); s = s + ss;end
XX(i, 1) = s;end; X = XX; y = max(X);
for i=1:n; X(i, 1) = 1/y ∗ X(i, 1);end; yy=abs(y-y1);
if (yy <= 1e − 6); break; end; lmu = (1/y) + mu; end
438 Applied Linear Algebra and Optimization using MATLAB
n
X
Ri = {z ∈ C : |z − aii | ≤ |aij |}, i = 1, 2, · · · , n, (4.18)
j=1
j6=i
−10 and −5 each and must contain an eigenvalue. The other eigenvalues
must lie in the interval [3, 14]. By using the shifted inverse power method,
with = 0.000005, with initial approximations of 10, 5, −5, and −10, leads
to approximations of
λ1 = 10.4698, λ2 = 4.8803
λ3 = −5.1497, λ4 = −10.2004,
respectively. The number of iterations required ranges from 9 to 13. •
440 Applied Linear Algebra and Optimization using MATLAB
Rayleigh quotient as
xT Ax
µ= . (4.19)
xT x
The maximum eigenvalue λ1 can be obtained when x is the corresponding
vector, as in
xT Ax
λ1 = max T . (4.20)
x6=0 x x
λ1 ≥ λ2 ≥ λ3 · · · ≥ λn , (4.21)
It is evident that this process can be continued until all of the eigen-
values have been extracted. Although this method shows promise, it does
have a significant drawback, i.e., at each iteration performed in deflating
Numerical Computation of Eigenvalues 443
the original matrix, any errors in the computed eigenvalues and eigenvec-
tors will be passed on to the next eigenvectors. This could result in serious
inaccuracy, especially when dealing with large eigenvalue problems. This
is precisely why this method is generally used for small eigenvalue problems.
and let C be an (n−1)×(n−1) matrix obtained by deleting the first row and
first column of a matrix B. The matrix B has eigenvalues λ1 together with
the (n−1) eigenvalues of C. Moreover, if (β2 , β3 , . . . , βn )T is an eigenvector
of C with eigenvalue µ 6= λ1 , then the corresponding eigenvector of B is
(β1 , β2 , . . . , βn )T , with
Xn
a1j βj
j=2
β1 = . (4.24)
µ − λ1
Note that eigenvectors xi of A can be recovered by premultiplication by
Q. •
which has the dominant eigenvalue λ1 = 18, with the corresponding eigen-
vector x1 = [1, −1, − 21 ]T . Use the deflation method to find the other eigen-
values and eigenvectors of A.
Now we can easily find the eigenvalues of C, which are 6 and 3, with the
corresponding eigenvectors [1, − 21 ]T and [1, 1]T respectively. Thus, the other
two eigenvalues of A are 6 and 3. Now we calculate the eigenvectors of A
corresponding to these two eigenvalues. First, we calculate the eigenvectors
of B corresponding to λ = 6 from the system
β
β1
18 −6 −4 1
0 5 −2 1 = 6 1 .
1 1
0 −1 4 − −
2 2
Numerical Computation of Eigenvalues 445
18β1 − 4 = 6β1 ,
Program 4.4
MATLAB m-file for Using the Deflation method
function [Lamda,X]=DEFLATED(A,Lamda,XA)
[n,n]=size(a); Q=eye(n);
Q(:, 1) = XA(:, 1); B = inv(Q)∗A∗Q; c=B(2:n,2:n);
[xv, ev] = eig(c,0 nobalance0 );
for i=1:n-1
b = −(B(1, 2 : n) ∗ xv(:, i))/(Lamda − ev(i, i));
Xb(:, i) = [bxv(:, i)0 ]0 ; XA(:, i + 1) = Q ∗ Xb(:, i); end
Lamda=[Lamda;diag(ev)]; Lamda; XA; end
A = QBQT . (4.25)
B T = (QT AQ)T = QT AQ = B.
Av = λv, (4.27)
vT Av = λ. (4.29)
Assume that
A1 = QT1 AQ1
.. ..
. .
Ak = QTk · · · QT1 AQ1 · · · Qk .
Ak → λ and Q1 Q2 · · · Qk → v. (4.30)
1. QT Q = I,
2. QT AQ = D,
A1 = QT1 AQ1
or
a∗11 a∗12
cos θ sin θ a11 a12 cos θ − sin θ
= .
a∗12 a∗22 − sin θ cos θ a21 a22 sin θ cos θ
Since our task is to reduce a∗12 to zero, carrying out the multiplication on
the right-hand side and using matrix equality gives
a∗12 = 0 = −(sin θ cos θ)a11 + (cos2 θ)a12 − (sin2 θ)a12 + (cos θ sin θ)a22 .
Numerical Computation of Eigenvalues 451
Q1 Q2 · · · Qk = v.
Example 4.11 Use the Jacobi method to find the eigenvalues and the
eigenvectors of the matrix
3.0 0.01 0.02
A = 0.01 2.0 0.1 .
0.02 0.1 1.0
452 Applied Linear Algebra and Optimization using MATLAB
Solution. The largest off-diagonal entry of the given matrix A is a23 = 0.1,
so we begin by reducing element a23 to zero. Since p = 2 and q = 3, the
first orthogonal transformation matrix has the form
1 0 0
Q1 = 0 c −s .
0 s c
Then
1 0 0 1 0 0
Q1 = 0 0.9951 −0.0985 and QT1 = 0 0.9951 0.0985
0 0.0985 0.9951 0 −0.0985 0.9951
and
3.0 0.0119 0.0189
A1 = QT1 AQ1 = 0.0119 2.0099 0 .
0.0189 0 0.9901
Note that the rotation makes a32 and a23 zero, increasing slightly a21
and a12 , and decreasing the second dominant off-diagonal entries a13 and
a31 .
Then
0.9999 0 −0.0094 0.9999 0 0.0094
Q2 = 0 1 0 and QT2 = 0 1 0 .
0.0094 0 0.9999 −0.0094 0 0.9999
Hence,
3.0002 0.0119 0
A2 = QT2 A1 Q2 = QT2 QT1 AQ1 Q2 = 0.0119 2.0099 −0.0001 .
0 −0.0001 0.9899
and
1 2a12 1 2(0.0119)
θ = arctan = arctan ≈ 0.7638
2 a11 − a22 2 3.0002 − 2.0099
Then
0.9999 −0.0120 0
Q3 = 0.0120 0.9999 0 and
0 0 1
454 Applied Linear Algebra and Optimization using MATLAB
0.9999 0.0120 0
QT3 = −0.0120 0.9999 0 .
0 0 1
Hence,
3.0003 0 −1.35E − 6
A3 = QT3 QT2 QT1 AQ1 Q2 Q3 = 0 2.00 −1.122E − 4 ,
−1.35E − 6 −1.122E − 4 0.9899
which gives the diagonal matrix D, and its diagonal elements converge
to 3, 2, and 1, which are the eigenvalues of the original matrix A. The
corresponding eigenvectors can be computed as follows:
0.9998 −0.0121 −0.0094
v = Q1 Q2 Q3 = 0.0111 0.9951 −0.0985 .
0.0106 0.0984 0.9951
To reproduce the above results by using the Jacobi method and the
MATLAB Command Window, we do the following:
Program 4.5
MATLAB m-file for the Jacobi method
function sol=JOBM(A)
[n,n]=size(A); QQ=[ ];
for u = 1 : .5 ∗ n ∗ (n − 1); for i=1:n; for j=1:n
if (j > i); aa(i,j)=A(i,j); else; aa(i,j)=0;
end; end; end
aa=abs(aa); mm=max(aa); m=max(mm);
[i,j]=find(aa==m); i=i(1); j=j(1);
t = .5 ∗ atan(2 ∗ A(i, j)/(A(i, i) − A(j, j))); c=cos(t); s=sin(t);
for ii=1:n; for jj=1:n; Q(ii,jj)=0.0;
if (ii==jj); Q(ii,jj)=1.0; end; end; end
Q(i,i)=c; Q(i,j)=-s; Q(j,i)=s; Q(j,j)=c;
for i1=1:n; for j1=1:n;
QT(i1,j1)=Q(j1,i1); end; end
for i2=1:n; for j2=1:n; s=0;
for k = 1 : n; ss = QT (i2, k) ∗ A(k, j2);
s=s+ss; end; QTA(i2,j2)=s; end; end
for i3=1:n; for j3=1:n; s=0;
for k=1:n; ss = QT A(i3, k) ∗ Q(k, j3); s=s+ss; end
A(i3,j3)=s; end; end; QQ=[QQ,Q]; end; D=A
y=[]; for k=1:n; yy=A(k,k); y=[y;yy]; end; eigvals=y
x=Q; if (n > 2) % Compute eigenvectors
x(1:n,1:n)=QQ(1:n,1:n);
for c = n + 1 : n : n ∗ n; xx(1:n,1:n)=QQ(1:n,c:n+c-1);
for i=1:n; for j=1:n; s=0;
for k=1:n; ss = x(i, k) ∗ xx(k, j); s=s+ss; end
x1(i,j)=s; end; end; x=x1; end; end
0 0 b 4 a4
and assume that bi 6= 0, for each i = 2, 3, 4. Then one can define the
characteristic polynomial of a given matrix A as
f4 (λ) = det(A − λI), (4.32)
which is equivalent to
a1 − λ b2 0 0
b 2 a 2 − λ b 3 0
f4 (λ) = .
0 b 3 a 3 − λ b4
0 0 b 4 a4 − λ
Example 4.12 Use the Sturm sequence iteration to find the eigenvalues
of the symmetric tridiagonal matrix
1 2 0 0
2 4 1 0
A= .
0 1 5 −1
0 0 −1 3
f0 (λ) = 1
f1 (λ) = (a1 − λ)
= 1 − λ.
Thus,
f4 (λ) = λ4 − 13λ3 − 53λ2 − 66λ − 3 = 0.
Solving the above equation, we have the eigenvalues 6.11, 4.41, 2.54, and
−0.04 of the given symmetric tridiagonal matrix. •
458 Applied Linear Algebra and Optimization using MATLAB
>> A = [1 2 0 0; 2 4 1 0; 0 1 5 − 1; 0 0 − 1 3];
>> sol = SturmS(A);
Program 4.6
MATLAB m-file for the Sturm Sequence method
function sol=SturmS(A)
% This evaluates the eigenvalues of a tridiagonal symmetric ma-
trix
[n,n] = size(A);
ff(1,:)=[1 0 0 0 0]; ff(2,:)=[A(1,1) -1 0 0 0];
for i=3:n+1; h=[A(i-1,i-1) -1];
f f (i, 1) = h(1) ∗ f f (i − 1, 1) − A(i − 1, i − 2)ˆ 2 ∗ f f (i − 2, 1);
for z=2:n+1
f f (i, z) = h(1)∗f f (i−1, z)+h(2)∗f f (i−1, z−1)−A(i−1, i−2)ˆ
2 ∗ f f (i − 2, z); end; end
for i=1:n+1; y(i)=ff(n+1,n+2-i);end; eigval=roots(y)
Theorem 4.7 For any real number λ∗ , the number of agreements in signs
of successive terms of the Sturm sequence {f0 (λ∗ ), f1 (λ∗ ), . . . , fn (λ∗ )} is
equal to the number of eigenvalues of the tridiagonal matrix A greater than
λ∗ . The sign of a zero is taken to be opposite to that of the previous term.
•
Example 4.13 Find the number of eigenvalues of the matrix
3 −1 0
A = −1 2 −1
0 −1 3
lying in the interval (0, 4).
Also,
f2 (0) = (a2 − 0)f1 (0) − b22 f0 (0)
which have signs + + ++, with three agreements. So all three eigenvalues
are greater than λ∗ = 0.
Also,
f2 (4) = (a2 − 4)f1 (4) − b22 f0 (4)
For Given’s method, the angle θ is chosen to create zeros, not in the
(p, q) and (q, p) positions as in the Jacobi method, but in the (p − 1, q)
and (q, p − 1) positions. This is because zeros can be created in row order
without destroying those previously obtained.
in exactly
1
(n − 2) + (n − 3) + · · · + 1 ≡ (n − 1)(n − 2)
2
steps. This method also uses rotation matrices as the Jacobi method does,
but in the following form:
cos θ = (p−1, p−1), sin θ = (p−1, q), − sin θ = (q, p−1), cos θ = (q, q)
and
ap−1q
θ = − arctan .
ap−1p
We can also find the values of cos θ and sin θ by using
|ap−1p | |ap−1q |
cos θ = and sin θ = ,
R R
where q
R= (ap−1p )2 + (ap−1q )2 .
Solution.
Step I. Create a zero in the (1, 3) position by using the first orthogonal
transformation matrix as
1 0 0 0
0 c s 0
Q23 =
0 −s c 0 .
0 0 0 1
462 Applied Linear Algebra and Optimization using MATLAB
Then
1 0 0 0 1 0 0 0
0 0.3333 0 0.9428 0 0.3333 0 −0.9428
Q24 =
0
and QT24 = ,
0 1 0 0 0 1 0
0 −0.9428 0 0.3333 0 0.9428 0 0.3333
which gives
2.0 −4.2426 0 0
−4.2426 3.4444 0.3333 −3.6927
A2 = QT24 A1 Q24 = .
0 0.3333 5.0 −1.1785
0 −3.6927 −1.7185 5.5556
Step III. Create a zero in the (2, 4) position by using the third orthogonal
transformation matrix as
1 0 0 0
0 1 0 0
Q34 = 0
0 c s
0 0 −s c
and
a24 −3.6927
θ = − arctan = − arctan ≈ 94.2695
a23 0.3333
>> A = [2 − 1 1 4; −1 3 1 2; 1 1 5 3; 4 2 − 3 6];
>> sol = Given(A);
Program 4.7
MATLAB m-file for Given’s method
function sol=Given(A)
[n, n] = size(A); t = 0; for i=1:n; for j=1:n
if i==j; Q(i,j)=1; else; Q(i,j)=0; end; end;end
for i=1:n-2; for j=i+2:n; t=t+1;
for f=1:n; for g=1:n
Q(f, t ∗ n + g) = Q(f, g); end; end;
theta=atan(A(i,j)/A(i,i+1));
Q(i+1, t∗n+i+1) = cos(theta); Q(i+1, t∗n+j) = −sin(theta);
Q(j, t ∗ n + i + 1) = sin(theta); Q(j, t ∗ n + j) = cos(theta);
for f=1:n; for g=1:n; sum=0; for l=1:n
sum = sum + a(f, l) ∗ Q(l, t ∗ n + g); ;end; aa(f,g)=sum; end; end
for f=1:n; for g=1:n; sum=0; for l=1:n
sum = sum + Q(l, t ∗ n + f ) ∗ aa(l, g); end
A(f,g)=sum; end; end; end; end T=A
% Solve the tridiagonal matrix using Sturm sequence method
ff(1,:)=[1 0 0 0 0]; ff(2,:)=[A(1,1) -1 0 0 0];
for i=3:n+1; h=[A(i-1,i-1) -1];
f f (i, 1) = h(1) ∗ f f (i − 1, 1) − A(i − 1, i − 2)ˆ 2∗f f (i − 2, 1);
for z=2:n+1
f f (i, z) = h(1)∗f f (i−1, z)+h(2)∗f f (i−1, z −1)−A(i−1, i−2)
ˆ 2∗f f (i − 2, z); end;end
for i=1:n+1; y(i) = f f (n + 1, n + 2 − i); end; eigval = roots(y)
Numerical Computation of Eigenvalues 465
Thus,
Hw = Hw−1 = HwT ,
which shows that Hw is symmetric. Note that the determinant of a House-
holder matrix Hw is always equal to −1.
>> w = [1 2]0 ;
>> w = w/norm(w);
>> Hw = eye(2) − 2 ∗ w ∗ w0 ;
The basic steps of Householder’s method that require us to convert the
symmetric matrix into a symmetric tridiagonal matrix are as follows:
A1 = A
A2 = QT1 A1 Q1
A3 = QT2 A2 Q2
.. ..
. .
Ak+1 = QTk Ak Qk ,
Numerical Computation of Eigenvalues 467
and v
u n
uX
wk+1k = ak+1k ± t a2ik .
i=k+1
Solution. Since the given matrix is of size 3 × 3, only one iteration is re-
quired in order to reduce the given symmetric matrix into symmetric tridi-
agonal form. Thus, for k = 1, we construct the elements of the vector w1
as follows:
w11 = 0
w31 = a31 = p
5 √
w21 = a21 ± a221 + a231 = 6 ± 62 + 52 = 6 ± 7.81.
Since the given coefficient a21 is positive, the positive sign must be used for
w21 , i.e.,
w21 = 13.81.
Therefore, the vector w1 is now determined to be
and
2
s1 = = 0.0093.
(0)2 + (13.81)2 + (5)2
Thus, the first transformation matrix Q1 for the first iteration is
1 0 0 0
Q1 = 0 1 0 − 0.009 13.81 0 13.81 5 ,
0 0 1 5
and it gives
1 0 0
Q1 = 0 −0.7682 −0.6402 .
0 −0.6402 0.7682
Therefore,
30.0 −7.810 0
A2 = QT1 A1 Q1 = −7.810 38.85 −1.622 ,
0 −1.622 21.15
Program 4.8
MATLAB m-file for Householder’s method
function sol=HHHM(A)
[n, n] = size(A); Q = eye(n); for k=1:n-2
alf a = sign(A(k+1, k))∗sqrt(A((k+1) : n, k)0 ∗A((k+1) : n, k));
w = zeros(n, 1);
w(k + 1, 1) = A(k + 1, k) + alf a; w((k + 2) : n, 1) = A((k + 2) :
n, k);
P = eye(n) − 2 ∗ w ∗ w0 /(w0 ∗ w); Q = Q ∗ P ; A = P ∗ A ∗ P ; end
T=A % this is the tridiagonal matrix
% using Sturm sequence method
ff(1,:)=[1 0 0 0 0]; ff(2,:)=[A(1,1) -1 0 0 0];
for i=3:n+1
h = [A(i−1, i−1)−1]; f f (i, 1) = h(1)∗f f (i−1, 1)−A(i−1, i−2)
ˆ 2∗f f (i − 2, 1);
for z=2:n+1
f f (i, z) = h(1)∗f f (i−1, z)+h(2)∗f f (i−1, z−1)−A(i−1, i−2)ˆ
2*f f (i − 2, z); end; end
for i=1:n+1; y(i)=ff(n+1,n+2-i); end; alfa; u; Q; eig-
val=roots(y)
tion.
w11 = 0
w31 = a31 = 2
w41 = a41 = p
1 √ √
w21 = a21 ± a221 + a231 + a241 = 1 ± 12 + 22 + 11 = 1 ± 6.
Since the given coefficient a21 > 0, the positive sign must be used for w21 ,
and it gives
w21 = 1 + 2.4495 = 3.4495.
Thus, the vector w1 takes the form
and
2 2 2
s1 = = = = 0.1183.
w1T w1 (0)2 2 2
+ (3.4495) + (2) + 1 2 16.83
and it gives
1.0000 0 0 0
0 −0.4082 −0.8165 −0.4082
Q1 = .
0 −0.8165 0.5266 −0.2367
0 −0.4082 −0.2367 0.8816
Numerical Computation of Eigenvalues 471
Therefore,
7.0000 −2.4495 0.0000 0
−2.4495 4.6667 1.5700 1.1933
A2 = QT1 A1 Q1 =
0.0000
.
1.5700 4.7816 2.9972
0 1.1933 2.9972 3.5518
w12 = 0
w22 = 0
w42 = 1.1933 p √
w32 = 1.5700 ± (1.5700)2 + (1.1933)2 = 1.5700 ± 0000003.855.
= 1.5700 ± 1.9721
Since the given coefficient a32 > 0, the positive sign must be used for w32 ,
and it gives
w32 = 1.5700 + 1.9721 = 3.5421.
Thus, the vector w2 takes the form
and
2 2
s2 = = = 0.1432.
w2T w2 13.9704
Thus, the second transformation matrix Q2 for the second iteration is
1 0 0 0 0
0 1 0 0
−0.1432 0
Q2 = I−s2 w2 w2T =
0 0 3.5421 1.1933 ,
0 0 1 0 3.5421
0 0 0 1 1.1933
and it gives
1.0000 0 0 0
0 −0.4082 0.8971 0.1690
Q2 = .
0 −0.8165 −0.2760 −0.5071
0 −0.4082 −0.3450 0.8452
472 Applied Linear Algebra and Optimization using MATLAB
Therefore,
7.0000 −2.4495 0.0000 0.0000
−2.4495 4.6667 −1.9720 0.0000
A3 = QT2 A2 Q2 =
0.0000 −1.9720
= T,
7.2190 −0.2100
0.0000 0.0000 −0.2100 1.1143
which is the symmetric tridiagonal form.
where
f3 (λ) = (a3 − λ)f2 (λ) − b23 f1 (λ)
and
f2 (λ) = (a2 − λ)f1 (λ) − b22 f0 (λ),
with
f1 (λ) = (a1 − λ) and f0 (λ) = 1.
Since
and
b2 = −2.4495, b3 = −1.9720, b4 = −0.2100.
Thus,
which are the eigenvalues of the symmetric tridiagonal matrix T and are
also the eigenvalues of the given matrix A. Once the eigenvalues of A are
obtained, then the corresponding eigenvectors of A can be obtained by using
the shifted inverse power method. •
Numerical Computation of Eigenvalues 473
4.6.1 QR Method
We know that the Jacobi, Given’s, and Householder’s methods are appli-
cable only to symmetric matrices for finding all the eigenvalues of a matrix
A. First, we describe the QR method, which can find all the eigenvalues of
a general matrix. In this method we decompose an arbitrary real matrix
A into a product QR, where Q is an orthogonal matrix and R is an upper-
triangular matrix with nonnegative diagonal elements. Note that when A
is nonsingular, this decomposition is unique.
Ri = Q−1
i Ai ,
Ai+1 = Q−1 T
i Ai Qi = Qi Ai Qi ,
where all Ai are similar to A, and thus have the same eigenvalues. It turns
out that in the case where the eigenvalues of A all have different magnitude,
A. When there are distinct eigenvalues of the same size, the iterates Ai
may not approach an upper-triangular matrix; however, they do approach
a matrix that is near enough to an upper-triangular matrix to allow us to
find the eigenvalues of A.
Then
0.7453 0 −0.6667 0.7453 0 0.6667
Q13 = 0 1 0 and QT13 = 0 1 0 ,
0.6667 0 0.7453 −0.6667 0 0.7453
which gives
3.0001 7.3336 5.0002
QT13 (QT12 A) = 0 −2.2360 −2.2360 .
0.0001 1.4909 2.2363
Step III. Create a zero in the (3, 2) position by using the third orthogonal
transformation matrix
1 0 0
Q23 = 0 c s ,
0 −s c
with
a32 1.4909
θ = − arctan = − arctan ≈ 37.4393
a22 −2.2360
0.9999 3.9997 2.9997
A1 = Q1 R1 = 2.0001 3.0001 1.0000 ,
2.0001 6.0001 5.0001
9.2222 1.3506 −0.9509
A2 = R1 Q1 = −3.8589 −0.6068 −1.2564 ,
0.4134 −0.2564 0.3846
Note that if we continue in the same way with the 21 iterations, the
new matrix A21 becomes the upper-triangular matrix
8.5826 −4.9070 −2.1450
A21 = R20 Q20 = 0 1 −1.1491 ,
0 0 −0.5825
>> A = [1 4 3; 2 3 1; 2 6 5];
>> sol = QRM (A);
478 Applied Linear Algebra and Optimization using MATLAB
Program 4.9
MATLAB m-file for the QR method
function sol=QRM(A)
[n,n]=size(A); M=0; for i=1:n; for j=1:n
if j < i; M=M+1;end; end;end;
for i=1:n; I(i,i)=1;end
dd=1; while dd > 0.0001; Q=I; Qs=I; kk=1;
for i=2:n; for j=1:i-1
t = −atan((A(i, j)/A(j, j))); Q(j, j) = cos(t);
Q(j, i) = sin(t); Q(i, j) = −sin(t); Q(i, i) = cos(t); Q;
A = Q0 ∗ A; Qs(:, :, kk) = Q; kk = kk + 1; Q = I; end; end;
Q = Qs(:, :, M ); f orc = M − 1 : −1 : 1
Q = Qs(:, :, c) ∗ Q; end; R = A; Q; A = R ∗ Q; k = 1;
for i=1:n; for j=1:n
if j < i; m(k) = A(i, j); k = k + 1; end; end; end;
m; dd = max(abs(m)); end; for i=1:n; eigvals(i)=A(i,i); end
Solution. First, create a zero in the (2, 1) position with the help of the
orthogonal transformation matrix
c s
Q12 = ,
−s c
and then, for finding the value of the θ, c, and s, we calculate the
a21
θ = − arctan = − arctan(−0.4) = 0.3805,
a11
So,
0.9285 0.3714
Q1 = Q12 =
−0.3714 0.9285
and
5.3853 −4.8282
R1 = QT12 A = .
0 6.6852
Since
0.9285 −0.3714 7 3.5283
c= QT1 b = = ,
0.3714 0.9285 8 10.0278
4.6.2 LR Method
Another method, which is very similar to the QR method, is Rutishauser’s
LR method. This method is based upon the decomposition of a matrix A
into the product of lower-triangular matrix L (with unit diagonal elements)
and upper-triangular matrix R. Starting with A1 = A, the LR method
iteratively computes similar matrices Ai , i = 2, 3, . . . , in two stages.
(1) Factor Ai into Li Ri , i.e., Ai = Li Ri .
Ai+1 = Ri Li = L−1
i Ai Li ,
and so all of the matrices Ai have the same eigenvalues. This triangular
decomposition-based method enables us to reduce a given nonsymmetric
480 Applied Linear Algebra and Optimization using MATLAB
and Pi−1 = Pi .
2 −3.4978 3.0000
= 0 3.9985 −2.0000
0 −0.0022 1.0015
2 −3.4978 3 1 0 0 2 −3.4995 3.0000
A7 = 0 3.9985 −2 0 1.0000 0 = 0 3.9996 −2.0000
0 0 1 0 −0.0005 1 0 −0.0005 1.0004
2 −3.4995 3 1 0 0 2 −3.4999 3.0000
A8 = 0 3.9996 −2 0 1.0000 0 = 0 3.9999 −2.0000
0 0 1 0 −0.0001 1 0 −0.0001 1.0001
2 −3.4999 3 1 0 0 2 −3.5 3
A9 = 0 3.9999 −2 0 1 0 = 0 4 −2 .
0 0 1 0 0 1 0 0 1
•
a11 a12 a13 a14 a15
a21 a22 a23 a24 a25
A=
a31 a32 a33 a34 a35 .
a41 a42 a43 a44 a45
a51 a52 a53 a54 a55
The first step of reducing the given matrix A = A1 into upper Hes-
senberg form is to eliminate the elements in the (3, 1), (4, 1), and (5, 1)
positions. It can be done by subtracting multiples m31 = aa31 21
, m41 = aa41
21
a51
and m51 = a21 of row 2 from rows 3, 4, and 5, respectively, and considering
the matrix
1 0 0 0 0
0 1 0 0 0
M1 =
0 m31 1 0 0 .
0 m41 0 1 0
0 m51 0 0 1
A2 = M1−1 A1 M1 .
484 Applied Linear Algebra and Optimization using MATLAB
1 0 0 0 0
0 1 0 0 0
M3 =
0 0 1 0 0 .
0 0 0 1 0
0 0 0 m53 1
Hence,
(2) (3) (4)
a11 a12 a13 a14 a15
a (2) (3) (4)
21 a22 a23 a24 a25
A4 = M3−1 A3 M3 = 0 (2) (3) (4)
a32 a33 a34 a35
(2)
,
0 (3) (4) (4)
0 a43 a44 a45
(4) (4)
0 0 0 a54 a55
Example 4.21 Use the Gaussian elimination method to convert the ma-
trix
5 3 6 4 9
4 6 5 3 4
4
A1 = 2 3 1 1
2 4 6 3 3
2 5 6 4 7
Solution. In the first step, we eliminate the elements in the (3, 1), (4, 1)
and (5, 1) positions. It can be done by subtracting multiples m31 = 44 =
1, m41 = 24 = 0.5, and m51 = 24 = 0.5 of row 2 from rows 3, 4, and 5,
486 Applied Linear Algebra and Optimization using MATLAB
In the second step, we eliminate the elements in the (4, 2) and (5, 2)
5.75
positions. This can be done by subtracting multiples m42 = −8.50 = −0.6765
9.25
and m52 = −8.50 = −1.0882 of row 3 from rows 4 and 5, respectively. The
matrices M2 and M2−1 are as follows:
1 0 0 0 0 1 0 0 0 0
0 1 0 0 0 0 1 0 0 0
−1
M2 = 0 0 1 0 0 and M2 = 0 0 1 0 0
.
0 0 0.6765 1 0 0 0 −0.6765 1 0
0 0 1.0882 0 1 0 0 −1.0882 0 1
In the last step, we eliminate the elements in the (5, 3) position. This
can be done by subtracting multiples m53 = −0.7837
3.1678
= −0.2474 of row 4
Numerical Computation of Eigenvalues 487
>>>> A = [5 3 6 4 9; 4 6 5 3 4; 4 2 3 1 1; 2 4 6 3 3; 2 5 6 4 7];
>> sol = hes(A);
Program 4.10
MATLAB m-file for the Upper Hessenberg Form
function sol=hes(A)
n = length(A(1, :)); for i = 1:n-1; m = eye(n);
[wj] = max(abs(A(i + 1 : n, i)));
if j > i + 1;
t = m(i + 1, :); m(i + 1, :) = m(j, :);
m(j, :) = t; A = m ∗ A ∗ m0 ; end;
m = eye(n); m(i + 2 : n, i + 1) = −A(i + 2 : n, i)/(A(i +
1, i));
mi = m; mi(i + 2 : n, i + 1) = −m(i + 2 : n, i + 1);
A = m ∗ A ∗ mi; mesh(abs(A)); end
488 Applied Linear Algebra and Optimization using MATLAB
(j)
Note that the above reduction fails if any aj+1,j = 0 and, as in Gaussian
elimination, is unstable whenever |mij | > 1. Row and column interchanges
are used to avoid these difficulties (i.e., Gaussian elimination with pivot-
ing). At step j, the elements below the diagonal in column j are examined.
If the element of the largest modulus occurs in row rj , say, then rows j + 1
and rj are interchanged. Here, we perform the transformation
where Ij+1,rj denotes a matrix obtained from the identity matrix by inter-
changing rows j + 1 and rj , and the elements of Mj are all less than or
equal to one in the modulus. Note that
−1
Ij+1,r j
= Ij+1,rj .
Example 4.22 Use Gaussian elimination with pivoting to convert the ma-
trix
3 2 1 −1
1 4 2 1
A= 2 2 3 −2
5 1 2 3
into upper Hessenberg form.
Solution. The element of the largest modulus below the diagonal occurs
in the fourth row, so we need to interchange rows 2 and 3 and columns 2
and 3 to get
1 0 0 0 3 2 1 −1 1 0 0 0
0 0 0 1 1 4 2 1 0 0 0 1
A1 = I24 AI24 =
0 0 1 0 2 2 3 −2 0 0 1 0 ,
0 1 0 0 5 1 2 3 0 1 0 0
which gives
3 −1 1 2
5 3 2 1
A1 =
2 −2 3 2 .
1 1 2 4
Numerical Computation of Eigenvalues 489
Now we eliminate the elements in the (3, 1) and (4, 1) positions. It can
be done by subtracting multiples m31 = 52 = 0.4 and m41 = 15 = 0.2 of row
2 from rows 3 and 4, respectively. Then the transformation
1 0 0 0 3 −1 1 2 1 0 0 0
0 1 0 0
A2 = M −1 A1 M = 5 3 2 1 0 1 0 0
0 −0.4 1 0 2 −2 3 2 0 0.4 1 0
0 −0.2 0 1 1 1 2 4 0 0.2 0 1
gives
3 −0.2 1 2
5 4 2 1
A2 = .
0 −2 2.2 1.6
0 1.8 1.6 3.8
The element of the largest modulus below the diagonal in the second
column occurs in the third row, and so there is no need to interchange the
row and column. Now we eliminate the elements in the (4, 2) position.
This can be done by subtracting multiples m42 = 1.8
−2
= −0.9 of row 3 from
row 4. Then the transformation
1 0 0 0 3 −0.2 1 2 1 0 0 0
0 1 0 0 5 4 2 1 0 1 0 0
A3 = M2−1 A2 M2 =
0 0 1 0 0 −2 2.2 1.6 0 0 1 0
0 0 0.9 1 0 1.8 1.6 3.8 0 0 0.9 1
gives
3 −0.2 −0.8 2
5 4 1.1 1
A3 =
0 −2
,
0.76 1.6
0 0 −1.136 5.24
which is in upper Hessenberg form. •
is changed to
Hi − µi I = Qi Ri
Hi+1 = Ri Qi + µi I. (4.36)
This change is called shift because subtracting µi I from Hi shifts the
eigenvalues of the right side by µi as well as the eigenvalues of Ri Qi . Adding
µi I in the second equation in (4.36) shifts the eigenvalues of Hi+1 back
to the original values. However, the shifts accelerate convergence of the
eigenvalues close to µi .
Solution. Since the singular values of A are the square roots of the eigen-
values of AT A, we compute
1 1 2 1 1
1 0 1
AT A = 0 1 = 1 1 0 .
1 1 0
1 0 1 0 1
Note that the singular values of A are not the same as its eigenvalues,
but there is a connection between them if A is a symmetric matrix.
Theorem 4.8 If A = AT is a symmetric matrix, then its singular values
are the absolute values of its nonzero eigenvalues, i.e.,
σi = |λi | > 0. •
The following are some of the properties that make singular value decom-
positions useful:
2. A real square matrix is invertible, if and only if all its singular values
are nonzero.
v1T
σ1 0
σ2 v2T
...
..
.
A = U DV T = (u1 u2 · · · ur ur+1 · · · un ) σr vrT .
T
0 vr+1
.. ..
. .
0 0 vnT
•
494 Applied Linear Algebra and Optimization using MATLAB
For the orthogonal matrix U , we first note that {Av1 , Av2 , . . . , Avn }
is an orthogonal set of vectors in Rm . To see this, suppose that vi is an
eigenvector of AT A corresponding to an eigenvalue λi , then, for i 6= j, we
have
since the eigenvectors vi are orthogonal. Now recall that the singular values
satisfy σi = kAvi k and that the first r of these are nonzero. Therefore, we
can normalize Av1 , . . . , Avr by setting
1
ui = Avi , for i = 1, 2, . . . , r.
σi
This guarantees that {u1 , u2 , . . . , ur } is an orthonormal set in Rm , but
if r < m, it will not be a basis for Rm . In this case, we extend the set
{u1 , u2 , . . . , ur } to an orthonormal basis {u1 , u2 , . . . , um } for Rm .
Example 4.25 Find the singular value decomposition of the following ma-
trix:
1 0 1
A= .
1 1 0
Solution. We compute
1 1 2 1 1
1 0 1
AT A = 0 1 = 1 1 0 ,
1 1 0
1 0 1 0 1
Numerical Computation of Eigenvalues 495
λ1 = 3, λ2 = 1, λ3 = 0,
2 1
√ 0 −√
6 3
1
1
− √
1
v1 = √ , v2 = , v3 = √ .
6 2
3
1
1
√ 1
√ 2 √
6 3
Thus,
2 1
√ 0 −√
6 3
1
√
√ −√1 1
3 0 0
V = √ ,
D= .
6 2 3 0 1 0
1 1 1
√ √ √
6 2 3
496 Applied Linear Algebra and Optimization using MATLAB
To find U , we compute
2
√
6 1
√
2
1 1 1 0 1 1
u1 = Av1 = √ √ =
σ1 3 1 1 0
6 1
√
2
1
√
6
and
0
1
√
1
2
1 1 1 0 1 −√
u2 = Av2 = = .
σ2 1 1 1 0
2 1
−√
1 2
√
2
These vectors already form an orthonormal basis for R2 , so we have
1 1
√ √
2 2
U = .
1 1
√ −√
2 2
This yields the SVD
2 1
√ 0 −√
1 1 6 3
√ √
2 2 √
3 0 0 √1 1 1
A=
−√ √ .
1 0 1 0 6
1 2 3
√ −√
2 2
1 1 1
√ √ √
6 2 3
•
Numerical Computation of Eigenvalues 497
>> A = [1 0 1; 1 1 0];
>> [U, D, V ] = svd(A);
Thus,
0 1 10 0
V = and D= .
1 0 0 5
To find U , we compute
1 1 −4 −6 0 −0.6
u1 = Av1 = =
σ1 10 3 −8 1 −0.8
and
1 1 −4 −6 1 −0.8
u2 = Av2 = = .
σ2 5 3 −8 0 0.6
These vectors already form an orthonormal basis for R2 , so we have
−0.6 −0.8
U= .
−0.8 0.6
x = V D−1 U T b
or
x1 0 1 0.1 0 −0.6 −0.8 −0.04
= = .
x2 1 0 0 0.2 −0.8 0.6 −0.14
So
x1 = −0.04 and x2 = −0.14,
which is the solution of the given linear system. •
4.7 Summary
We discussed many numerical methods for finding eigenvalues and eigen-
vectors. Many eigenvalue problems do not require computation of all of
the eigenvalues. The power method gives us a mechanism for computing
the dominant eigenvalue along with its associated eigenvector for an arbi-
trary matrix. The convergence rate of the power method is poor when the
two largest eigenvalues in magnitude are nearly equal. The technique of
shifting the matrix by an amount (−µI) can help us to overcome this dis-
advantage, and it can also be used to find intermediate eigenvalues by the
power method. Also, if a matrix A is symmetric, then the power method
gives faster convergence to the dominant eigenvalue and associated eigen-
vector. The inverse power method is used to estimate the least dominant
eigenvalue of a nonsingular matrix. The inverse power method is guar-
anteed to converge if a matrix A is diagonalizable with the single least
dominant nonzero eigenvalue. The inverse power method requires more
computational effort than the power method, because a linear algebraic
system must be solved at each iteration. The LU decomposition method
(Chapter 1) can be used to efficiently accomplish this task. We also dis-
cussed the deflation method to obtain other eigenvalues once the dominant
eigenvalue is known, and the Gerschgorin Circles theorem, which gives a
crude approximation of the location of the eigenvalues of a matrix.
500 Applied Linear Algebra and Optimization using MATLAB
4.8 Problems
1. Find the first four iterations of the power method applied to each of
the following matrices:
2 3 1
(a) 1 4 −1 , start with x0 = [0, 1, 1]T .
3 1 2
5 4 6
(b) 2 2 −3 , start with x0 = [1, 1, 1]T .
3 1 1
1 1 1
(c) −2 2 1 , start with x0 = [1, 1, 0]T .
5 1 1
Numerical Computation of Eigenvalues 501
3 0 0 2
0 3 0 −1
, start with x0 = [1, 0, 0, 0]T .
(d)
1 0 2 2
0 0 4 2
2. Find the first four iterations of the power method with Euclidean
scaling applied to each of the following matrices:
2 1 2
(a) 1 4 −1 , start with x0 = [1, 0, 1]T .
2 −1 2
2 4 −1
(b) 4 2 −3 , start with x0 = [1, 0, 0]T .
−1 −3 5
3 2 4
(c) 2 3 −1 , start with x0 = [0, 1, 1]T .
4 −1 3
3 1 0 1
1 4 −1 3 , start with x0 = [1, 0, 1, 1]T .
(d) 0 −1 5 1
1 3 1 2
5. Find the first four iterations of the following matrices by using the
shifted inverse power method:
2 3 3
(a) 1 4 −1 , start with x0 = [0, 1, 1]T , µ = 4.5.
3 1 2
1 1 −1
(b) 2 1 −3 , start with x0 = [1, 1, 1]T , µ = 5.
2 −4 1
502 Applied Linear Algebra and Optimization using MATLAB
1 1 1
(c) −2 2 1 , start with x0 = [1, 1, 0]T , µ = 4.
3 3 3
3 0 3 2
1 3 0 −1
(d) , start with x0 = [1, 0, 0, 0]T , µ = 3.5.
1 0 2 2
0 0 0 2
6. Find the dominant eigenvalue and corresponding eigenvector by using
the power method, with x(0) = [1, 1, 1]t (only four iterations):
3 0 1
A = 2 2 2 .
4 2 5
Also, solve by using the inverse power method by taking the initial
value of the eigenvalue by using the Rayleigh quotient theorem.
7. Find the dominant eigenvalue and corresponding eigenvector of the
matrix A by using the power method. Start with x(0) = [2, 1, 0, −1]T
and = 0.0001:
3 1 −2 1
1 8 −1 0
A= −2 −1
.
3 −1
1 0 −1 8
Also, use the shifted inverse power method with the same x(0) as given
above to find the eigenvalue nearest to µ, which can be calculated by
using the Rayleigh quotient.
8. Find the dominant eigenvalue and corresponding eigenvector by using
the power method, with u0 = [1, 1, 1]t (only four iterations):
3 0 1
A = 2 2 2 .
4 2 5
Also, solve by using the inverse power method by taking the initial
value of the eigenvalue by using the Rayleigh quotient.
Numerical Computation of Eigenvalues 503
9. Use the Gerschgorin Circles theorem to determine the bounds for the
eigenvalues of each of the given matrices:
3 2 1 1 1 1 2 −2 1
(a) 2 3 0 , (b) 1 1 0 , (c) −2 1 1 .
1 0 3 1 0 1 1 1 2
4 4 4 1 2 −1 3 2
0.4 0.3 0.1 4 6 1 4
−1 3 1 −2
(d) 0.3 0.5 0.2 , (e)
4 , (f ) .
1 6 4 3 1 4 1
0.1 0.2 0.6
1 4 4 6 2 −2 1 −3
14. Use the Jacobi method to find all the eigenvalues and eigenvectors of
the matrix
5 −2 −0.5 1.5
−2 5 1.5 −0.5
A= −0.5
.
1.5 5 −2
1.5 −0.5 −2 5
15. Use the Sturm sequence iteration to find the number of eigenvalues
of the following matrices lying in the given intervals (a, b):
2 −1 0 5 −1 0
(a) −1 2 −1 , (−1, 3) (b) −1 2 2 , (0, 4).
0 −1 2 0 2 3
16. Use the Sturm sequence iteration to find the eigenvalues of the fol-
lowing matrices:
1 4 0 1 2 0 1 2 0
(a) 4 1 4 , (b)
2 2 1 , (c) 2 1 2 .
0 1 1 0 4 4 0 2 1
17. Find the eigenvalues and eigenvectors of the given symmetric matrix
A by using the Jacobi method:
0.6532 0.2165 0.0031
A= 0.4105 0.0052 .
0.2132
Also, use Given’s method to tridiagonalize the above matrix.
18. Use Given’s method to convert the given matrix into tridiagonal form:
2 −1 3 2
−1 3 1 −2
A= 3
.
1 4 1
2 −2 1 −3
Numerical Computation of Eigenvalues 505
24. Find the first four QR iterations for each of the given matrices:
1 0 2 2 −1 2 −21 −9 12
(a) −2 1 1 , (b) 3 1 0 , (c) 0 6 0 .
−2 −5 1 0 2 1 −24 −8 15
25. Find the first 15 QR iterations for each of the matrices in Problem
9.
506 Applied Linear Algebra and Optimization using MATLAB
27. Find the eigenvalues using the LR method for each of the given ma-
trices:
3 1 1 2 1 2 4 0 1
(a) 2 1 1 , (b) 3 1 0 , (c) −2 1 0 .
1 1 1 1 2 1 −2 0 1
28. Find the eigenvalues using the LR method for each of the given ma-
trices:
1 2 4 3 3 3 15 13 20
(a) 5 1 1 , (b) 3 3 3 , (c) −21 12 15 .
2 1 1 −3 −3 −3 −8 −8 11
29. Transform each of the given matrices into upper Hessenberg form:
1 6 4 5 4 3 2 5 2
(a) 5 1 3 , (b) 2 3 3 , (c) 11 6 7 ,
2 4 4 −3 −3 8 9 15 22
2 −1 4 2 2 1 −2 −3
3
, (e) 2
2 3 2 2 −3 2
(d)
1 −3 −3
,
2 2 2 4 5
2 −3 4 4 7 8 3 2
9 2 1 −2
2 1 1 −5
(f )
−2
.
1 6 −2
−2 −1 1 −3
Numerical Computation of Eigenvalues 507
30. Transform each of the given matrices into upper Hessenberg form
using Gaussian elimination with pivoting. Then use the QR method
and the LR method to find their eigenvalues:
14 22 2 1
11 33 45 4 3 3 5 1 5 −2
(a) 12 21 23 , (b)
2 5 4 , (c) 6
.
1 6 1
18 22 31 −3 2 1
7 −2 7 4
31. Find the singular values for each of the given matrices:
3 0 0 1 0 1
2 0 1
(a) , (b) −2 3 −2 , (c) 0 1 0 ,
0 2 0
2 0 5 0 1 2
2 0 1 2
4 0 1 2 0 1 0 1 1 3
(d) 0 1 0 , (e)
−4 6 −2 , (f ) 0 3 2 1 .
2 1 1 2 0 7
1 0 3 1
32. Show that the singular values of the following matrices are the same
as the eigenvalues of the matrices:
4 2 1 3 0 1 2 0 0
(a) 2 8 0 , (b) 0 5 0 , (c) 0 6 0 .
1 0 8 1 0 5 0 0 7
(a)
1 −3 x1 1
A= , x= , b= .
3 −5 x2 2
(b)
1 −1 x1 1.1
A= , x= , b= .
1 4 x2 0.5
Numerical Computation of Eigenvalues 509
(c)
3 −1 4 x1 1
A = −1 0 1 , x = x2 , b = 2 .
4 1 2 x3 3
(d)
4 3 2 x1 2.5
A = 1 2 −1 , x = x2 , b = 1.5 .
1 3 2 x3 0.85
39. Find the solution each of the following linear systems, Ax = b, using
singular value decomposition:
(a)
2 2 x1 1
A= , x= , b= .
1 3 x2 0.9
(b)
1 0 x1 1
A= , x= , b= .
3 −2 x2 2
(c)
1 −1 0 x1 1
A= 2 0 1 , x = x2 , b = 1 .
3 0 2 x3 1
(d)
1 2 3 x1 1
A = 2 1 2 , x = x2 , b = 0 .
1 1 1 x3 1
Chapter 5
Interpolation and
Approximation
5.1 Introduction
In this chapter we describe the numerical methods for the approximation
of functions other than elementary functions. The main purpose of these
techniques is to replace a complicated function with one that is simpler
and more manageable. We sometimes know the value of a function f (x) at
a set of points (say, x0 < x1 < x2 · · · < xn ), but we do not have an analytic
expression for f (x) that lets us calculate its value at an arbitrary point.
We will concentrate on techniques that may be adapted if, for example, we
have a table of values of functions that may have been obtained from some
physical measurement or some experiments or long numerical calculations
that cannot be cast into a simple functional form. The task now is to
estimate f (x) for an arbitrary point x by, in some sense, drawing a smooth
curve through (and perhaps beyond) the data points xi . If the desired
x is between the largest and smallest of the data points, then the problem
is called interpolation; and if x is outside that range, it is called extrapola-
tion. Here, we shall restrict our attention to interpolation. It is a rational
process generally used in estimating a missing functional value by taking
a weighted average of known functional values at neighboring data points.
pn (x) = a0 + a1 x + a2 x² + · · · + an xⁿ. (5.1)
If f (x) is a continuous function in the closed interval [a, b], then for every ε > 0 there exists a polynomial pn (x), where the value of n depends on the value of ε, such that for all x in [a, b],
|f (x) − pn (x)| < ε.
Suppose the values of a function f (x) are known at a set of points:
x     | x0      x1      · · ·  xn
f (x) | f (x0)  f (x1)  · · ·  f (xn)
Generally, the data points x0 , x1 , . . . , xn are arbitrary; the spacing between adjacent points need not be the same, and we assume that the points are ordered so that x0 < x1 < x2 < · · · < xn−1 < xn .
When the data points in a given functional relationship are not equally
spaced, the interpolation problem becomes more difficult to solve. The
basis for this assertion lies in the fact that the interpolating polynomial
coefficient will depend on the functional values as well as on the data
points given in the table.
p1 (x1 ) = a0 + a1 x1 = y1 = f (x1 ).
where
L0 (x) = (x − x1)/(x0 − x1) and L1 (x) = (x − x0)/(x1 − x0). (5.6)
and
Σ_{k=0}^{n} Lk (x) = 1. (5.9)
where none of the terms in the denominator can be zero, from the assump-
tion of distinct points. Hence,
Li (x) = Π_{k=0, k≠i}^{n} (x − xk)/(xi − xk). (5.14)
or, simplifying, we obtain
p(x) = (1/(2h²))[f (−h) − 2f (0) + f (h)]x² + (1/(2h))[f (h) − f (−h)]x + f (0).
= (1/2)(x² − 5x + 6)
L1 (x) = ((x − x0)(x − x2))/((x1 − x0)(x1 − x2)) = ((x − 1)(x − 3))/((2 − 1)(2 − 3)) = −(x² − 4x + 3)
L2 (x) = ((x − x0)(x − x1))/((x2 − x0)(x2 − x1)) = ((x − 1)(x − 2))/((3 − 1)(3 − 2)) = (1/2)(x² − 3x + 2).
Thus,
p2 (x) = (1/2)(x² − 5x + 6)(2) − (x² − 4x + 3)(3) + (1/2)(x² − 3x + 2)(α).
Separating the coefficients of x², x, and the constant term, we get
p2 (x) = (1 − 3 + α/2)x² + (−5 + 12 − 3α/2)x + (6 − 9 + α)
or
p2 (x) = (−2 + α/2)x² + (7 − 3α/2)x + (−3 + α).
Since the given value of the constant term is 5, we get (−3 + α) = 5, which gives α = 8.
Now using this value of α and the given point x = 2.5, the approximation of f (x) is
p2 (2.5) = (−2 + 8/2)(2.5)² + (7 − 24/2)(2.5) + (−3 + 8),
and it gives
f (2.5) ≈ p2 (2.5) = 12.50 − 12.50 + 5 = 5. •
L1 (x) = ((x − x0)(x − x2))/((x1 − x0)(x1 − x2))
L2 (x) = ((x − x0)(x − x1))/((x2 − x0)(x2 − x1)).
Since the given interpolating point is x = 2.7, the best three points for the
quadratic polynomial should be
x0 = 1.5, x1 = 2.5, x2 = 3,
and the function values at these points are
f (x0 ) = 2.167, f (x1 ) = 2.9, f (x2 ) = 3.333.
So using these values, we have
p2 (x) = 2.167L0 (x) + 2.9L1 (x) + 3.333L2 (x),
where
L0 (x) = ((x − 2.5)(x − 3))/((1.5 − 2.5)(1.5 − 3)) = (1/1.5)(x² − 5.5x + 7.5)
L1 (x) = ((x − 1.5)(x − 3))/((2.5 − 1.5)(2.5 − 3)) = (1/(−0.5))(x² − 4.5x + 4.5)
L2 (x) = ((x − 1.5)(x − 2.5))/((3 − 1.5)(3 − 2.5)) = (1/0.75)(x² − 4x + 3.75).
Thus,
p2 (x) = (2.167/1.5)(x² − 5.5x + 7.5) + (2.9/(−0.5))(x² − 4.5x + 4.5) + (3.333/0.75)(x² − 4x + 3.75),
and simplifying, we get
p2 (x) = 0.0889x² + 0.3776x + 1.4003,
which is the required quadratic polynomial. At x = 2.7, we have
f (2.7) ≈ p2 (2.7) = 3.0679.
The relative error is
|f (2.7) − p2 (2.7)| / |f (2.7)| = |3.0704 − 3.0679| / |3.0704| = 0.0008. •
Note that the sum of the Lagrange coefficients is equal to 1 as it should
be:
L0 (2.7) + L1 (2.7) + L2 (2.7) = −0.0400 + 0.7200 + 0.3200 = 1.
Using MATLAB commands, the above results can be reproduced as
follows:
Program 5.1
MATLAB m-file for the Lagrange Interpolation Method
function fi = lint(x,y,x0)
% Evaluate the Lagrange interpolating polynomial through (x,y) at x0
dxi = x0 - x; m = length(x); L = zeros(size(y));
L(1) = prod(dxi(2:m))/prod(x(1) - x(2:m));
L(m) = prod(dxi(1:m-1))/prod(x(m) - x(1:m-1));
for j = 2:m-1
    num = prod(dxi(1:j-1))*prod(dxi(j+1:m));
    den = prod(x(j) - x(1:j-1))*prod(x(j) - x(j+1:m));
    L(j) = num/den;
end
fi = sum(y.*L);
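As a quick check (not in the original text), the data of the preceding example (the points 1.5, 2.5, 3) can be passed to this routine; it should reproduce the value obtained by hand:
>> x = [1.5 2.5 3]; y = [2.167 2.9 3.333];
>> fi = lint(x, y, 2.7)
fi = 3.0679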
Example 5.4 Using the cubic Lagrange interpolation formula
p3 (x) = α1 f (0) + α2 f (0.2) + α3 f (0.4) + α4 f (0.6),
for the approximation of f (0.5), show that
α1 + α2 + α3 + α4 = 1.
Solution. Consider the cubic Lagrange interpolating polynomial
p3 (x) = α1 f (x0 ) + α2 f (x1 ) + α3 f (x2 ) + α4 f (x3 ),
where the values of α1 , α2 , α3 , α4 can be defined as follows:
α1 = ((x − x1)(x − x2)(x − x3))/((x0 − x1)(x0 − x2)(x0 − x3))
α2 = ((x − x0)(x − x2)(x − x3))/((x1 − x0)(x1 − x2)(x1 − x3))
α3 = ((x − x0)(x − x1)(x − x3))/((x2 − x0)(x2 − x1)(x2 − x3))
α4 = ((x − x0)(x − x1)(x − x2))/((x3 − x0)(x3 − x1)(x3 − x2)).
|f (x) − p1 (x)| ≤ (2/2!)|(x − 0)(x − π)|.
Since the function |(x − 0)(x − π)| attains its maximum in [0, π] at x = (0 + π)/2, the maximum value is (π − 0)²/4. Hence,
|f (x) − p1 (x)| ≤ (π − 0)²/4 = π²/4,
which is the required bound of error in the linear interpolation of f (x). •
Example 5.6 Use the best Lagrange interpolating polynomial to find the
approximation of f (1.5), if f (−2) = 2, f (−1) = 1.5, f (1) = 3.5, and
f (2) = 5. Estimate the error bound if the maximum value of |f (4) (x)| is
0.025 in the interval [−2, 2].
f (x) ≈ p3 (x) = L0 (x)f (x0 ) + L1 (x)f (x1 ) + L2 (x)f (x2 ) + L3 (x)f (x3 ),
and taking f (−2) = 2, f (−1) = 1.5, f (1) = 3.5, f (2) = 5, and the interpo-
lating point x = 1.5, we have
f (1.5) ≈ p3 (1.5) = L0 (1.5)f (−2)+L1 (1.5)f (−1)+L2 (1.5)f (1)+L3 (1.5)f (2)
or
f (1.5) ≈ p3 (1.5) = 2L0 (1.5) + 1.5L1 (1.5) + 3.5L2 (1.5) + 5L3 (1.5).
|f (1.5) − p3 (1.5)| ≤ (M/4!)|(1.5 + 2)(1.5 + 1)(1.5 − 1)(1.5 − 2)|,
which gives
|f (1.5) − p3 (1.5)| ≤ (0.025)(2.1875)/24 = 0.0023,
the desired error bound. •
Solution. Suppose that the given table contains the function values f (xi ),
for the points xi = 1 + ih, i = 0, 1, . . . , n, where n = (2 − 1)/h. If x ∈
[xi−1 , xi+1 ], then we approximate the function f (x) by degree 2 polynomial
p2 (x), which interpolates f (x) at xi−1 , xi , xi+1 . Then the error formula
(5.17) for these data points becomes
|f (x) − p2 (x)| = |((x − xi−1)(x − xi)(x − xi+1))/3!| |f'''(η(x))|,
where η(x) ∈ (xi−1 , xi+1 ). Since the point η(x) is unknown, we cannot
estimate f'''(η(x)), so we let
|f'''(η(x))| ≤ M = max_{1≤x≤2} |f'''(x)|.
Then
|f (x) − p2 (x)| ≤ (M/6)|(x − xi−1)(x − xi)(x − xi+1)|.
Since f (x) = e^x and f'''(x) = e^x,
|f'''(η(x))| ≤ M = e² = 7.3891.
Now to find the maximum value of |(x − xi−1)(x − xi)(x − xi+1)|, we substitute t = x − xi and have
max_{x∈[xi−1, xi+1]} |(x − xi−1)(x − xi)(x − xi+1)| = max_{t∈[−h,h]} |(t − h)t(t + h)|.
Hence,
max_{x∈[xi−1, xi+1]} |(x − xi−1)(x − xi)(x − xi+1)| = 2h³/(3√3).
Thus, for any x ∈ [1, 2], we have
|f (x) − p2 (x)| ≤ (2h³/(3√3)) e²/6 = h³e²/(9√3),
if p2 (x) is chosen as the polynomial of degree 2, which interpolates f (x) = e^x at the three tabular points nearest x. If we wish to obtain six decimal place accuracy this way, we would have to choose h so that
h³e²/(9√3) < 5 × 10⁻⁷,
which implies that
h³ < 10.5483 × 10⁻⁷
and gives h = 0.01. •
While the Lagrange interpolation formula is at the heart of polynomial
interpolation, it is not, by any stretch of the imagination, the most prac-
tical way to use it. Just consider for a moment that if we had to add an
additional data point in the previous Example 5.6, in order to find the
cubic polynomial p3 (x), we would have to repeat the whole process again
because we cannot use the solution of the quadratic polynomial p2 (x) in
the construction of the cubic polynomial p3 (x). Therefore, one can note
that the Lagrange method is not particularly efficient for large values of
n, the degree of the polynomial. When n is large and the data for x is
ordered, some improvement in efficiency can be obtained by considering
only the data pairs in the vicinity of the x values for which f (x) is sought.
Let us consider the nth-degree polynomial pn (x) that agrees with the
function f (x) at the distinct numbers x0 , x1 , . . . , xn . The divided differ-
ences of f (x) with respect to x0 , x1 , . . . , xn are derived to express pn (x) in
the form
pn (x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + · · ·
+ an (x − x0 )(x − x1 ) · · · (x − xn−1 ), (5.18)
Divided Differences
The first-order or first divided difference at the points xi and xi+1 can be defined by
f [xi , xi+1 ] = (f [xi+1 ] − f [xi ])/(xi+1 − xi) = (f (xi+1) − f (xi))/(xi+1 − xi). (5.22)
In general, the nth divided difference f [xi , xi+1 , . . . , xi+n ] is defined by
f [xi , xi+1 , . . . , xi+n ] = (f [xi+1 , xi+2 , . . . , xi+n ] − f [xi , xi+1 , . . . , xi+n−1 ])/(xi+n − xi). (5.23)
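As a quick check of these definitions, take f (x) = x² at the nodes 0, 1, 2: then f [0, 1] = (1 − 0)/(1 − 0) = 1, f [1, 2] = (4 − 1)/(2 − 1) = 3, and f [0, 1, 2] = (3 − 1)/(2 − 0) = 1, a constant, as expected for a polynomial of degree 2.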
By using this definition, (5.19) and (5.20) can be written as
a0 = f [x0 ]; a1 = f [x0 , x1 ],
respectively. Similarly, one can have the values of other constants involved
in (5.18) such as
a2 = f [x0 , x1 , x2 ],
a3 = f [x0 , x1 , x2 , x3 ],
. . .
an = f [x0 , x1 , . . . , xn ].
Putting the values of these constants in (5.18), we get
pn (x) = f [x0 ] + f [x0 , x1 ](x − x0 ) + f [x0 , x1 , x2 ](x − x0 )(x − x1 )
+ · · · + f [x0 , x1 , . . . , xn ](x − x0 )(x − x1 ) · · · (x − xn−1 ), (5.24)
which can also be written as
pn (x) = f [x0 ] + Σ_{k=1}^{n} f [x0 , x1 , . . . , xk ](x − x0)(x − x1) · · · (x − xk−1). (5.25)
This form is often more convenient to use than the Lagrange form defined by (5.10). For example, the Newton divided difference
interpolation polynomial of degree one is
Example 5.8 Construct the fourth divided differences table for the func-
tion f (x) = 4x4 + 3x3 + 2x2 + 10 using the values x = 3, 4, 5, 6, 7, 8.
From the results in Table 5.5, one can note that for a polynomial of degree n, the nth divided difference is always constant and the (n + 1)th divided difference is always zero. •
Using the following MATLAB commands one can construct Table 5.5 as follows:
>> x = [3 4 5 6 7 8];
>> y = 4*x.^4 + 3*x.^3 + 2*x.^2 + 10;
>> D = divdiff(x, y);
Table 5.2: Divided differences table for f (x) = ex at the given points.
p2 (xi ) = f (xi ), i = 0, 1, 2.
and it gives
p2 (x1 ) = f (x0 ) + f (x1 ) − f (x0 ) = f (x1 ).
Finally, taking x = x2 , we have
Program 5.2
MATLAB m-file for the Divided Differences
function D = divdiff(x,y)
% Construct the divided difference table
m = length(x); D = zeros(m,m); D(:,1) = y(:);
for j = 2:m
    for i = j:m
        D(i,j) = (D(i,j-1) - D(i-1,j-1))/(x(i) - x(i-j+1));
    end
end
The main advantage of the Newton divided difference form over the Lagrange form is that the polynomial pn (x) can be calculated from the polynomial pn−1 (x) by adding just one extra term, since it follows from (5.25) that
pn (x) = pn−1 (x) + f [x0 , x1 , . . . , xn ](x − x0)(x − x1) · · · (x − xn−1).
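In code, this update is a single correction term. The following sketch (an illustration, not from the text) assumes the divided difference table D produced by Program 5.2, the nodes x, and a previously computed value pnm1 of the degree-(n − 1) polynomial at the evaluation point x0:
% One-term Newton update: degree-n value from the degree-(n-1) value
pn = pnm1 + D(n+1,n+1)*prod(x0 - x(1:n));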
Example 5.11 (a) Construct the divided difference table for the function
f (x) = ln(x + 2) in the interval 0 ≤ x ≤ 3 for the stepsize h = 1.
(b) Use Newton’s divided difference interpolation formula to construct
the interpolating polynomials of degree 2 and degree 3 to approximate
ln(3.5).
(c) Compute error bounds for the approximations in part (b).
Solution. (a) The results of the divided differences are listed in Table 5.4.
(b) First, we construct the second degree polynomial p2 (x) by using the
quadratic Newton interpolation formula as follows:
then with the help of the divided differences Table 5.4, we get
p2 (1.5) = 1.2620,
p3 (x) = p2 (x) + 0.0089(x − 0)(x − 1)(x − 2)
or
p3 (x) = p2 (x) + 0.0089(x³ − 3x² + 2x),
(c) Now to compute the error bounds for the approximations in part (b),
we use the error formula (5.17). For the polynomial p2 (x), we have
|f (x) − p2 (x)| = (|f'''(η(x))|/3!)|(x − x0)(x − x1)(x − x2)|.
Since the third derivative of the given function is
f'''(x) = 2/(x + 2)³
and
|f'''(η(x))| = 2/(η(x) + 2)³, for η(x) ∈ (0, 2),
then
M = max_{0≤x≤2} 2/(x + 2)³ = 0.25
and
|f (1.5) − p2 (1.5)| ≤ (0.25/6)(0.375) = 0.0156,
which is the required error bound for the approximation p2 (1.5).
For the polynomial p3 (x),
|f (x) − p3 (x)| = (|f^(4)(η(x))|/4!)|(x − x0)(x − x1)(x − x2)(x − x3)|;
taking the fourth derivative of the given function, we have
f^(4)(x) = −6/(x + 2)⁴
and
|f^(4)(η(x))| = 6/(η(x) + 2)⁴, for η(x) ∈ (0, 3).
Since
|f^(4)(0)| = 0.375 and |f^(4)(3)| = 0.0096,
we have
|f^(4)(η(x))| ≤ max_{0≤x≤3} 6/(x + 2)⁴ = 0.375
and
|f (1.5) − p3 (1.5)| ≤ (0.375/24)(0.5625) = 0.0088,
which is the required error bound for the approximation p3 (1.5). •
Note that in Example 5.11, we used the value of the quadratic poly-
nomial p2 (1.5) in calculating the cubic polynomial p3 (1.5). It was possible
because the initial value for both polynomials was the same as x0 = 0.
But the situation will be quite different if the initial point for both poly-
nomials is different. For example, if we have to find the approximate
value of ln(4.5), then the suitable data points for the quadratic polyno-
mial will be x0 = 1, x1 = 2, x2 = 3 and for the cubic polynomial will be
x0 = 0, x1 = 1, x2 = 2, x3 = 3. So for getting the best approximation of
ln(4.5) by the cubic polynomial p3 (2.5), we cannot use the value of the
quadratic polynomial p2 (2.5) in the cubic polynomial p3 (2.5). The best
way is to use the cubic polynomial form
which gives
>> x = [0 1 2 3];
>> y = log(x + 2);
>> x0 = 1.5;
>> Y = Ndivf(x, y, x0);
Program 5.3
MATLAB m-file for Newton’s Linear Interpolation Method
function Y = Ndivf(x,y,x0)
% Build the divided difference table and evaluate the Newton
% interpolating polynomial at x0 by nested multiplication
m = length(x); D = zeros(m,m); D(:,1) = y(:);
for j = 2:m
    for i = j:m
        D(i,j) = (D(i,j-1) - D(i-1,j-1))/(x(i) - x(i-j+1));
    end
end
Y = D(m,m)*ones(size(x0));
for i = m-1:-1:1
    Y = D(i,i) + (x0 - x(i)).*Y;
end
|f (x) − p5 (x)| = (|f^(6)(η(x))|/6!)|(x − x0)(x − x1)(x − x2)(x − x3)(x − x4)(x − x5)|;
taking the sixth derivative of the given function, we have
f^(6)(x) = e^x
and
|f^(6)(η(x))| = e^(η(x)), for η(x) ∈ (0.5, 1.5).
Since
|f^(6)(0.5)| = 1.6487 and |f^(6)(1.5)| = 4.4817,
we get
|f (0.6) − p5 (0.6)| ≤ (0.00095)(4.4817)/720 = 0.000006,
which is the required error bound for the approximation p5 (0.6). •
Example 5.13 Consider the points x0 = 0.5, x1 = 1.5, x2 = 2.5, x3 =
3.0, x4 = 4.5, and for a function f (x), the divided differences are
f [x2 ] = 73.8125, f [x1 , x2 ] = 59.5, f [x0 , x1 , x2 ] = 23.5,
f [x1 , x2 , x3 ] = 47.25, f [x0 , x1 , x2 , x3 , x4 ] = 1.
Using this information, construct the complete divided differences table for
the given data points.
Solution. From the given second divided difference
f [x1 , x2 , x3 ] = (f [x2 , x3 ] − f [x1 , x2 ])/(x3 − x1),
we have
47.25 = (f [x2 , x3 ] − 59.50)/(3.0 − 1.5),
which gives f [x2 , x3 ] = 130.375. The third and fourth divided differences give
f [x0 , x1 , x2 , x3 ] = (47.25 − 23.5)/(3.0 − 0.5) = 9.5,
f [x1 , x2 , x3 , x4 ] = 9.5 + (1)(4.5 − 0.5) = 13.5,
f [x2 , x3 , x4 ] = 47.25 + (13.5)(4.5 − 1.5) = 87.75,
and then from
f [x2 , x3 , x4 ] = (f [x3 , x4 ] − f [x2 , x3 ])/(x4 − x2),
87.75 = (f [x3 , x4 ] − 130.375)/(4.5 − 2.5),
we get f [x3 , x4 ] = 305.875. Also,
f [x1 , x2 ] = (f [x2 ] − f [x1 ])/(x2 − x1),
59.50 = (73.8125 − f [x1 ])/(2.5 − 1.5),
gives f [x1 ] = 14.3125; similarly, f [x0 , x1 ] = 59.5 − (23.5)(2.5 − 0.5) = 12.5 and f [x0 ] = 14.3125 − (12.5)(1.5 − 0.5) = 1.8125. Finally,
f [x2 , x3 ] = (f [x3 ] − f [x2 ])/(x3 − x2),
130.375 = (f [x3 ] − 73.8125)/(3.0 − 2.5),
gives f [x3 ] = 139.00, and
f [x3 , x4 ] = (f [x4 ] − f [x3 ])/(x4 − x3),
305.875 = (f [x4 ] − 139.00)/(4.5 − 3.0),
gives f [x4 ] = 597.8125.
Table 5.6: Complete divided differences table for the given points.
Zeroth First Second Third Fourth
Divided Divided Divided Divided Divided
k xk Difference Difference Difference Difference Difference
0 0.5 1.8125
1 1.5 14.3125 12.5000
2 2.5 73.8125 59.5000 23.5000
3 3.0 139.000 130.375 47.2500 9.5000
4 4.5 597.8125 305.875 87.7500 13.500 1.0000
Also, find the values of p[0, 1] and q[0, 1], when f [0, 1] = 4, f (1) = 5, p(1) =
q(0) = 2.
Rn+1 (x) = (f^(n+1)(η(x))/(n + 1)!) Ln (x),
or
f (x) − pn (x) = Ln (x)f [x0 , x1 , . . . , xn , x]. (5.28)
Since the interpolation polynomial agreeing with f (x) at x0 , x1 , . . . , xn is
unique, it follows that these two error expressions must be equal.
Theorem 5.3 Let pn (x) be the polynomial of degree at most n that inter-
polates a function f (x) at a set of n + 1 distinct points x0 , x1 , . . . , xn . If x
is a point different from the points x0 , x1 , . . . , xn , then
f (x) − pn (x) = f [x0 , x1 , . . . , xn , x] Π_{j=0}^{n} (x − xj). (5.29)
One can easily show the relationship between the divided differences and
the derivative. From (5.23), we have
f [x0 , x1 ] = (f (x1) − f (x0))/(x1 − x0).
Now applying the Mean Value theorem to the above equation implies that
when the derivative f 0 exists, we have
f [x0 , x1 ] = f 0 (η(x))
for the unknown point η(x), which lies between x0 and x1 . The following
theorem generalizes this result.
f [x0 , x1 , x2 ] = f''(η(x))/2!; (5.31)
to compute the value of the left-hand side of the relation (5.31), we have
to find the values of the first-order divided differences
and
f [x1 , x2 ] = (f (x2) − f (x1))/(x2 − x1) = (0.3411 − 0.2188)/(1.3 − 1.2) = 1.2230.
Using these values, we can compute the second-order divided difference as
f [x0 , x1 , x2 ] = (f [x1 , x2 ] − f [x0 , x1 ])/(x2 − x0) = (1.2230 − 1.1400)/(1.3 − 1.1) = 0.4150.
Now we calculate the right-hand side of the relation (5.31) for the given
points, which gives us
f''(x0)/2 = 1/(2x0) = 0.4546
f''(x1)/2 = 1/(2x1) = 0.4167
f''(x2)/2 = 1/(2x2) = 0.3846.
We note that the left-hand side of (5.31) is nearly equal to the right-
hand side when x1 = 1.2. Hence, the best approximate value of η(x) is
x1 = 1.2. •
Properties of Divided Differences
f [xi ] = f (xi ), i = 0, 1, . . . , n.
f [x0 , x0 ] = lim_{x1→x0} f [x0 , x1 ] = lim_{x1→x0} (f (x1) − f (x0))/(x1 − x0) = f'(x0). (5.32)
f [x0 , x0 , . . . , x0 ] = f^(n)(x0)/n!,
where the left-hand side denotes the nth divided difference, for which
all points are x0 .
Solution. Since
f [1, 1, 0, 0] = f [0, 0, 1, 1] = (f [0, 1, 1] − f [0, 0, 1])/(1 − 0)
= f [0, 1, 1] − f [0, 0, 1]
= (f [1, 1] − f [0, 1])/(1 − 0) − (f [0, 1] − f [0, 0])/(1 − 0),
we have
f [1, 1, 0, 0] = 2e² − 2 ln 2 + 1.5 = 14.8918.
Also,
f [0, 0, 0, 1, 1, 1] = (f [0, 0, 1, 1, 1] − f [0, 0, 0, 1, 1])/(1 − 0)
= f [0, 0, 1, 1, 1] − f [0, 0, 0, 1, 1]
= (f [0, 1, 1, 1] − f [0, 0, 1, 1])/(1 − 0) − (f [0, 0, 1, 1] − f [0, 0, 0, 1])/(1 − 0).
= (1/(x2 − x1)) [(x − x1)P02 (x) − (x − x2)P01 (x)]
and
P01m (x) = (1/(xm − x1)) det[x − x1, P01 (x); x − xm, P0m (x)]
= (1/(xm − x1)) [(x − x1)P0m (x) − (x − xm)P01 (x)], (5.35)
where m can now take any value from 2 to n, and P01m denotes a polynomial of degree 2 that passes through the three points (x0 , f (x0)), (x1 , f (x1)), and (xm , f (xm)). By repeated use of this procedure, higher degree polynomials can be generated. In general, one can define this procedure as follows:
P012···n (x) = (1/(xn − xn−1)) det[x − xn−1, P01···(n−1) (x); x − xn, P01···(n−2)n (x)]
= (1/(xn − xn−1)) [(x − xn−1)P01···(n−2)n (x) − (x − xn)P01···(n−1) (x)]. (5.36)
This is a polynomial of degree n and it fits all the data. Table 5.7 shows
the construction of P012···n (x). When using Aitken’s method in practice,
only the values of the polynomials for specified values of x are computed
and coefficients of the polynomials are not determined explicitly. Further-
more, if for a specified x, the stage is reached when the difference in value
between successive degree polynomials is negligible, then the procedure
can be terminated. It is an advantage of this method compared with the
Lagrange interpolation formula.
= (1/2)[(2.5)(1.3863) − (0.5)(0.6932)] = 1.5596
and
P03 (x) = (1/(x3 − x0)) det[x − x0, f (x0); x − x3, f (x3)]
P03 (4.5) = (1/(5 − 2)) det[4.5 − 2, 0.6932; 4.5 − 5, 1.6094]
= (1/3)[(2.5)(1.6094) − (−0.5)(0.6932)] = 1.4567.
Similarly, the values of the second-degree polynomials can be generated as follows:
P012 (x) = (1/(x2 − x1)) det[x − x1, P01 (x); x − x2, P02 (x)]
P012 (4.5) = (1/(4 − 3))[(1.5)(1.5596) − (0.5)(1.7067)] = 1.4860
and
P013 (x) = (1/(x3 − x1)) det[x − x1, P01 (x); x − x3, P03 (x)]
P013 (4.5) = (1/(5 − 3))[(1.5)(1.4567) − (−0.5)(1.7067)] = 1.5193.
Finally, the value of the third-degree polynomial can be generated as follows:
P0123 (x) = (1/(x3 − x2)) det[x − x2, P012 (x); x − x3, P013 (x)]
P0123 (4.5) = (1/(5 − 4))[(0.5)(1.5193) − (−0.5)(1.4860)] = 1.5027.
The results obtained are listed in Table 5.8. Note that the approximate
value of ln(4.5) is P0123 (4.5) = 1.5027 and its exact value is 1.5048. •
To get the above results using the MATLAB Command Window, we do the following:
>> x = [2 3 4 5];
>> y = [0.6932 1.0986 1.3863 1.6094];
>> x0 = 4.5;
>> P = Aitken1(x, y, x0);
Program 5.4
MATLAB m-file for Aitken’s Method
function P = Aitken1(x,y,x0)
% Aitken's successive interpolation: P(i,j) holds the value at x0 of
% the degree-(j-1) polynomial through the corresponding nodes
n = size(x,1);
if n == 1, n = size(x,2); end
for i = 1:n
    P(i,1) = y(i);
end
for i = 2:n
    t = 0;
    for j = 2:i
        t = t + 1;
        P(i,j) = (P(i,j-1)*(x0 - x(j-1)) - P(t,t)*(x0 - x(i)))/(x(i) - x(j-1));
    end
end
Then
Tn (x) = cos(nθ), for x ∈ [−1, 1]. (5.39)
For example,
T0 (x) = cos(0·θ) = 1
T1 (x) = cos(1·θ) = x
T2 (x) = cos(2·θ) = 2x² − 1
T3 (x) = cos(3·θ) = 4x³ − 3x
T4 (x) = cos(4·θ) = 8x⁴ − 8x² + 1
T5 (x) = cos(5·θ) = 16x⁵ − 20x³ + 5x
T6 (x) = cos(6·θ) = 32x⁶ − 48x⁴ + 18x² − 1
T7 (x) = cos(7·θ) = 64x⁷ − 112x⁵ + 56x³ − 7x
T8 (x) = cos(8·θ) = 128x⁸ − 256x⁶ + 160x⁴ − 32x² + 1
T9 (x) = cos(9·θ) = 256x⁹ − 576x⁷ + 432x⁵ − 120x³ + 9x
T10 (x) = cos(10·θ) = 512x¹⁰ − 1280x⁸ + 1120x⁶ − 400x⁴ + 50x² − 1.
>> n = 10;
>> T = CHEBP(n)
T =
512 0 −1280 0 1120 0 −400 0 50 0 −1
Note that we got the coefficients of the Chebyshev polynomials in de-
scending order of powers.
Program 5.5
MATLAB m-file for Computing Chebyshev Polynomials
function T = CHEBP(n)
% Coefficients of T_n in descending powers of x, built from the
% triple recursion T_i = 2x*T_{i-1} - T_{i-2}
x0 = 1; x1 = [1 0];
if n == 0
    T = x0;
elseif n == 1
    T = x1;
else
    for i = 2:n
        T = [2*x1 0] - [0 0 x0];
        x0 = x1; x1 = T;
    end
end
The higher-order polynomials can be generated from the recursion relation called the triple recursion relation. This relation can be easily constructed with the help of the trigonometric addition formulas, which give
Tn+1 (x) = 2xTn (x) − Tn−1 (x), n ≥ 1.
4. When n = 2m, T2m (x) is an even function, i.e., T2m (−x) = T2m (x).
6. Tn+1 (x) has n + 1 distinct zeros xk (called the Chebyshev points), all lying in the interval [−1, 1]:
xk = cos((2k + 1)π/(2(n + 1))), for k = 0, 1, . . . , n.
and
x = −1, θ = π, and x = 1, θ = 0.
cos nθ cos mθ = (1/2)[cos(n + m)θ + cos(n − m)θ],
then we get
∫₋₁¹ Tn (x)Tm (x)/√(1 − x²) dx = (1/2)∫₀^π cos[(n + m)θ] dθ + (1/2)∫₀^π cos[(n − m)θ] dθ
or
∫₋₁¹ Tn (x)Tm (x)/√(1 − x²) dx = 0.
Now when n = m, we have
∫₋₁¹ Tn (x)Tn (x)/√(1 − x²) dx = (1/2)∫₀^π cos[(n + n)θ] dθ + (1/2)∫₀^π cos[(n − n)θ] dθ
= (1/2)∫₀^π cos(2nθ) dθ + (1/2)∫₀^π cos(0) dθ
= (1/2)[sin(2nθ)/(2n)]₀^π + (1/2)[θ]₀^π
= 0 + π/2 = π/2,
for each n ≥ 1. •
In the following example, we shall find the Chebyshev points for the
linear, quadratic, and cubic interpolations for the given function.
Example 5.18 Let f (x) = x²e^x on the interval [−1, 1]. Then the Chebyshev points for the linear interpolation (n = 1) are given by
x0 = cos((2(0) + 1)π/(2(1 + 1))) = cos(π/4) = 0.7071
x1 = cos((2(1) + 1)π/(2(1 + 1))) = cos(3π/4) = −0.7071.
Now using the linear Lagrange polynomial using these two Chebyshev points,
we have
p1 (x) = L0 (x)f (x0 ) + L1 (x)f (x1 ),
where
L0 (x) = (x − x1)/(x0 − x1) = (x + 0.7071)/(0.7071 + 0.7071) = (x + 0.7071)/1.4142
L1 (x) = (x − x0)/(x1 − x0) = (x − 0.7071)/(−0.7071 − 0.7071) = (x − 0.7071)/(−1.4142),
and the function values at the Chebyshev points are
f (0.7071) = (0.7071)² e^0.7071 = 1.0140
f (−0.7071) = (−0.7071)² e^(−0.7071) = 0.2465.
Thus,
p1 (x) = ((x + 0.7071)/1.4142)(1.0140) + ((x − 0.7071)/(−1.4142))(0.2465)
gives
p1 (x) = 0.6302 + 0.5427x.
Now to find the quadratic Lagrange polynomial, we need to calculate
three Chebyshev points as follows:
x0 = cos((2(0) + 1)π/(2(2 + 1))) = cos(π/6) = 0.8660
x1 = cos((2(1) + 1)π/(2(2 + 1))) = cos(3π/6) = 0.0
x2 = cos((2(2) + 1)π/(2(2 + 1))) = cos(5π/6) = −0.8660.
Thus,
p3 (x) = −0.1389 − 0.0756x + 1.7592x2 + 1.5404x3 . •
Note that because Tn (x) = cos(nθ), Chebyshev polynomials have a succes-
sion of maximums and minimums of alternating signs, each of magnitude
one. Also, | cos(nθ)| = 1, for nθ = 0, π, 2π, . . . , and because θ varies from
0 to π as x varies from 1 to −1, Tn (x) assumes its maximum magnitude of
unity (n + 1) times on the interval [−1, 1]. An important result of Cheby-
shev polynomials is the fact that, of all polynomials of degree n whose coefficient of xⁿ is unity, the polynomial (1/2^(n−1))Tn (x) has a smaller bound to its magnitude on the interval [−1, 1] than any other. Because the maximum magnitude of Tn (x) is one, the upper bound referred to is 1/2^(n−1).
R(x) = (x − x0 )(x − x1 ) · · · (x − xn ),
The factor f^(n+1)(η)/(n + 1)! takes its maximum value over all x in [−1, 1]. To minimize the factor max |R(x)|, Chebyshev found that x0 , x1 , . . . , xn should be chosen so that
R(x) = (1/2ⁿ)Tn+1 (x).
Theorem 5.7 Let f ∈ C^(n+1)([−1, 1]) be given, and let pn (x) be the nth degree polynomial interpolating f at the Chebyshev points. Then
max_{−1≤x≤1} |f (x) − pn (x)| ≤ (1/(2ⁿ(n + 1)!)) max_{−1≤x≤1} |f^(n+1)(x)|. (5.41)
Note that
1/2ⁿ ≤ max_{−1≤x≤1} |(x − x0)(x − x1) · · · (x − xn)|,
for any choice of x0 , x1 , . . . , xn on the interval [−1, 1]. •
Solution. First, we construct the polynomial with the use of the three
equidistant points
x0 = −1, x1 = 0, x2 = 1,
and their corresponding function values
p2 (x) = · · · + ((x + 1.0)(x − 1.0))/((0 + 1.0)(0 − 1.0)) (2) + ((x + 1.0)(x − 0.0))/((1.0 + 1.0)(1.0 − 0)) (3e).
f (x0 ) = 6.81384, f (x1 ) = 2.0, f (x2 ) = 0.4770,
as follows:
Q2 (x) = ((x − 0)(x + 0.8660))/((0.8660 − 0)(0.8660 + 0.8660)) (6.81384)
+ ((x − 0.8660)(x + 0.8660))/((0 − 0.8660)(0 + 0.8660)) (2)
+ ((x − 0.8660)(x − 0))/((−0.8660 − 0.8660)(−0.8660 − 0)) (0.4770).
Thus, the Lagrange polynomial at the Chebyshev points is
Note that the coefficients of p2 (x) and Q2 (x) are different because they use different points and function values. Also, the actual errors at x = 0.5
using both polynomials are
and
f (0.5) − Q2 (0.5) = −0.2561.
•
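The practical effect of interpolating at the Chebyshev points can also be checked numerically. The sketch below (an illustration, not from the text) compares the maximum interpolation error over [−1, 1] for equally spaced nodes against Chebyshev points, for the function f (x) = x²e^x of Example 5.18, reusing lint from Program 5.1:
f = @(x) x.^2.*exp(x); n = 5;
xe = linspace(-1,1,n+1);                   % equally spaced nodes
k = 0:n; xc = cos((2*k+1)*pi/(2*(n+1)));   % Chebyshev points
t = linspace(-1,1,201);
for i = 1:length(t)
    ee(i) = abs(f(t(i)) - lint(xe, f(xe), t(i)));
    ec(i) = abs(f(t(i)) - lint(xc, f(xc), t(i)));
end
[max(ee) max(ec)]   % the Chebyshev nodes give the smaller maximum error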
x = (b(1 + z) + a(1 − z))/2 or z = (2x − a − b)/(b − a),
where a ≤ x ≤ b and −1 ≤ z ≤ 1. The Chebyshev points transform accordingly:
xk = (b(1 + zk) + a(1 − zk))/2, for k = 0, 1, . . . , n. (5.43)
Theorem 5.8 (Lagrange–Chebyshev Approximation Polynomial)
Suppose that pn (x) is the Lagrange polynomial that is based on the Chebyshev points
xk = (b(1 + zk) + a(1 − zk))/2, for k = 0, 1, . . . , n.
If f ∈ C^(n+1)([a, b]), then
|f (x) − pn (x)| ≤ (2(b − a)^(n+1)/(4^(n+1)(n + 1)!)) max_{a≤x≤b} |f^(n+1)(x)|. (5.44)
•
p2 (x) = ((x − 2)(x − 2.5))/((1.13398 − 2)(1.13398 − 2.5)) (0.75799)
+ ((x − 1.13398)(x − 2.5))/((2 − 1.13398)(2 − 2.5)) (1.09861)
+ ((x − 1.13398)(x − 2))/((2.5 − 1.13398)(2.5 − 2)) (1.25276),
then using f'''(x) = 2/(x + 1)³, we get
|f (x) − p2 (x)| ≤ (2(3 − 1)^(2+1)/(4^(2+1)(2 + 1)!)) max_{1≤x≤3} |2/(x + 1)³|
or
|f (x) − p2 (x)| ≤ 0.01042,
which is the required error bound. •
a4 = (2/5) Σ_{j=0}^{4} (xj + 2) ln(xj + 2)T4 (xj) = 0.0015.
T0 (x) = 1
T1 (x) = x
T2 (x) = 2x² − 1
T3 (x) = 4x³ − 3x
T4 (x) = 8x⁴ − 8x² + 1,
we have the function values
f (x0 ) = 0.0, f (x1 ) = 0.6082, f (x2 ) = 1.3863, f (x3 ) = 2.2907, f (x4 ) = 3.2958,
as follows:
Thus, using the MATLAB Command Window:
>> n = 4; a = -1; b = 1;
>> y = '(x+2).*log(x+2)';
>> A = CHEBPA(y, n, a, b)
A =
1.5156 1.6597 0.1308 −0.0115 0.0015
Program 5.6
MATLAB m-file for Computing Coefficients of Chebyshev Polynomial Approximation
function A = CHEBPA(fn,n,a,b)
% Chebyshev approximation coefficients of fn on [a,b];
% fn is a string in the variable x, evaluated with eval
if nargin == 2, a = -1; b = 1; end
d = pi/(2*n + 2); A = zeros(1,n+1);
for i = 1:n+1
    x(i) = cos((2*i - 1)*d);
end
x = (b - a)*x/2 + (a + b)/2; y = eval(fn);
for i = 1:n+1
    z = (2*i - 1)*d;
    for j = 1:n+1
        A(j) = A(j) + y(i)*cos((j - 1)*z);
    end
end
A = 2*A/(n + 1); A(1) = A(1)/2;
p1 (x) = a + bx (5.49)
should be fitted through the given points (x1 , y1 ), . . . , (xn , yn ) so that the
sum of the squares of the distances of these points from the straight line
is minimum, where the distance is measured in the vertical direction (the
y-direction). Hence, it will suffice to minimize the function
E(a, b) = Σ_{j=1}^{n} (yj − a − bxj)². (5.50)
Setting the partial derivatives of E with respect to a and b equal to zero gives
Σ_{j=1}^{n} yj − na − b Σ_{j=1}^{n} xj = 0,
Σ_{j=1}^{n} xj yj − a Σ_{j=1}^{n} xj − b Σ_{j=1}^{n} xj² = 0,
which can be rearranged as the normal equations
na + b Σ_{j=1}^{n} xj = Σ_{j=1}^{n} yj,
a Σ_{j=1}^{n} xj + b Σ_{j=1}^{n} xj² = Σ_{j=1}^{n} xj yj,
where
S1 = Σ xj, S2 = Σ yj, S3 = Σ xj², S4 = Σ xj yj.
We shall call a and b the least squares linear parameters for the data
and the linear guess function with parameters, i.e.,
p1 (x) = a + bx
will be called the least squares line (or regression line) for the data.
Example 5.22 Using the method of least squares, fit a straight line to the
four points (1, 1), (2, 2), (3, 2), and (4, 3).
Solution. The sums required for the normal equation (5.53) are easily
obtained using the values in Table 5.9. The linear system involving a and
b in (5.53) form is
[4 10; 10 30][a; b] = [8; 23].
Then solving the above linear system using LU decomposition by the Cholesky
method discussed in Chapter 1, the solution of the linear system is
a = 0.5 and b = 0.6.
>> x = [1 2 3 4];
>> y = [1 2 2 3];
>> [a, b] = linefit(x, y);
To plot Figure 5.7 one can use the MATLAB Command Window:
>> xfit = 0:0.1:5;
>> yfit = 0.6*xfit + 0.5;
>> plot(x, y, 'o', xfit, yfit, '-');
Table 5.10 shows the error analysis of the straight line using least squares
approximation.
Hence, we have
E(a, b) = Σ_{i=1}^{4} (yi − p1 (xi))² = 0.2000.
Program 5.7
MATLAB m-file for the Linear Least Squares Fit
function [a,b] = linefit(x,y)
% Least squares line p1(x) = a + b*x
n = length(x);
S1 = sum(x);
S2 = sum(y);
S3 = sum(x.*x);
S4 = sum(x.*y);
a = (S3*S2 - S4*S1)/(n*S3 - S1^2);   % intercept
b = (n*S4 - S1*S2)/(n*S3 - S1^2);    % slope
for k = 1:n
    p1 = a + b*x(k);
    Error(k) = abs(p1 - y(k));
end
Error = sum(Error.*Error);
pn (x) = b0 + b1 x + b2 x² + · · · + bn xⁿ. (5.55)
The least squares error is
E = Σ_{j=1}^{m} yj² − 2 Σ_{j=1}^{m} pn (xj)yj + Σ_{j=1}^{m} (pn (xj))²
= Σ_{j=1}^{m} yj² − 2 Σ_{j=1}^{m} (Σ_{i=0}^{n} bi xj^i) yj + Σ_{j=1}^{m} (Σ_{i=0}^{n} bi xj^i)²
= Σ_{j=1}^{m} yj² − 2 Σ_{i=0}^{n} bi (Σ_{j=1}^{m} yj xj^i) + Σ_{i=0}^{n} Σ_{k=0}^{n} bi bk (Σ_{j=1}^{m} xj^(i+k)).
Setting the partial derivatives with respect to each bi equal to zero gives the normal equations
b0 Σ_{j=1}^{m} xj + b1 Σ_{j=1}^{m} xj² + b2 Σ_{j=1}^{m} xj³ + · · · + bn Σ_{j=1}^{m} xj^(n+1) = Σ_{j=1}^{m} yj xj
. . .
b0 Σ_{j=1}^{m} xj^n + b1 Σ_{j=1}^{m} xj^(n+1) + b2 Σ_{j=1}^{m} xj^(n+2) + · · · + bn Σ_{j=1}^{m} xj^(2n) = Σ_{j=1}^{m} yj xj^n.
Note that the coefficients matrix of this system is symmetric and positive-
definite. Hence, the normal equations possess a unique solution.
xj | 0 1 2 4 6
yj | 3 1 0 1 4
The sums required for the normal equation (5.58) are easily obtained using
the values in Table 5.11. The linear system involving unknown coefficients
b0 , b1 , and b2 is
5b0 + 13b1 + 57b2 = 9
13b0 + 57b1 + 289b2 = 29
57b0 + 289b1 + 1569b2 = 161.
Then solving the above linear system, we obtain
b0 = 2.8252, b1 = −2.0490, b2 = 0.3774,
so the least squares parabola is p2 (x) = 2.8252 − 2.0490x + 0.3774x².
>> x = [0 1 2 4 6];
>> y = [3 1 0 1 4];
>> n = 2;
>> C = polyfit(x, y, n);
Clearly, p2 (x) replaces the tabulated functional relationship given by y = f (x). The original data along with the approximating polynomial are shown graphically in Figure 5.9. To plot Figure 5.9 one can use the MATLAB Command Window as follows:
>> xfit = -1:0.1:7;
>> yfit = 2.8252 - 2.0490.*xfit + 0.3774.*xfit.*xfit;
>> plot(x, y, 'o', xfit, yfit, '-');
Program 5.8
MATLAB m-file for the Polynomial Least Squares Fit
function C = polyfit(x,y,n)
% Least squares polynomial of degree n (note: this file shadows
% MATLAB's built-in polyfit if it is placed on the path)
m = length(x);
for i = 1:2*n+1
    a(i) = sum(x.^(i-1));
end
for i = 1:n+1
    b(i) = sum(y.*x.^(i-1));   % right-hand side entries
end
for i = 1:n+1
    for j = 1:n+1
        A(i,j) = a(j+i-1);     % normal equations matrix
    end
end
C = A\b';                      % solve the linear system
for k = 1:m
    S = C(1);
    for i = 2:n+1
        S = S + C(i).*x(k).^(i-1);
    end
    p2(k) = S; Error(k) = abs(y(k) - p2(k));
end
Error = sum(Error.*Error);
Table 5.12 shows the error analysis of the parabola using least squares approximation. Hence, the error associated with the least squares polynomial approximation of degree 2 is
E(b0 , b1 , b2 ) = Σ_{j=1}^{5} (yj − p2 (xj))² = 0.2345.
or
y(x) = ae^(bx). (5.60)
We can develop the normal equations for these analogously to the pre-
vious development for least squares. The least squares error for (5.59) is
given by
E(a, b) = Σ_{j=1}^{n} (yj − a xj^b)², (5.61)
and setting its partial derivatives equal to zero gives
∂E/∂a = −2 Σ_{j=1}^{n} (yj − a xj^b) xj^b = 0,
∂E/∂b = −2 Σ_{j=1}^{n} (yj − a xj^b)(a xj^b ln xj) = 0. (5.62)
Then the set of normal equations (5.62) represents the system of two
equations in the two unknowns a and b. Such nonlinear simultaneous
equations can be solved using Newton’s method for nonlinear systems. The
details of this method of nonlinear systems will be discussed in Chapter 7.
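As a preview of that discussion, here is a minimal sketch of such a Newton iteration (an illustration, not from the text; the function name newton2 is hypothetical, and the Jacobian is approximated by forward differences rather than formed analytically):
function p = newton2(F, p, tol)
% Newton's method for a 2-by-2 nonlinear system F(p) = 0;
% p = [a; b] is the initial guess, refined until norm(dp) < tol
h = 1e-6;
for it = 1:50
    f = F(p); J = zeros(2,2);
    for k = 1:2
        e = zeros(2,1); e(k) = h;
        J(:,k) = (F(p + e) - f)/h;   % forward-difference column of J
    end
    dp = J\f; p = p - dp;
    if norm(dp) < tol, return; end
end
For the fit y = ax^b, the residual F encodes the two normal equations (5.62); with the data of Example 5.24 below, the iteration should approach a = 3.083, b = 0.4875:
x = [1 2 4 10]'; y = [2.87 4.51 6.11 9.43]';
F = @(p) [sum((y - p(1)*x.^p(2)).*x.^p(2));
          sum((y - p(1)*x.^p(2)).*x.^p(2).*log(x))];
p = newton2(F, [2; 1], 1e-5);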
Example 5.24 Find the best-fit of the form y = ax^b by using the data
x | 1 2 4 10
y | 2.87 4.51 6.11 9.43
by Newton’s method, starting with the initial approximation (a0 , b0 ) = (2, 1) and taking a desired accuracy within ε = 10⁻⁵.
By using the given data points, the nonlinear system (5.63) gives
2.87 − a(1 + 2^(2b) + 4^(2b) + 10^(2b)) + 4.5(2^b) + 6.11(4^b) + 9.43(10^b) = 0
−a(0.69(2^(2b)) + 1.39(4^(2b)) + 2.30(10^(2b))) + 3.12(2^b) + 8.47(4^b) + 21.72(10^b) = 0.
Let us consider the two functions
f1 (a, b) = 2.87 − a(1 + 2^(2b) + 4^(2b) + 10^(2b)) + 4.5(2^b) + 6.11(4^b) + 9.43(10^b)
f2 (a, b) = −a(0.69(2^(2b)) + 1.39(4^(2b)) + 2.30(10^(2b))) + 3.12(2^b) + 8.47(4^b) + 21.72(10^b),
where
J = [∂f1/∂a ∂f1/∂b; ∂f2/∂a ∂f2/∂b];
let us start with the initial approximation (a0 , b0 ) = (2, 1), and the values
of the functions at this initial approximation are as follows:
f1 (2, 1) = −111.39
f2 (2, 1) = −253.216.
The Jacobian matrix J and its inverse J⁻¹ at the given initial approximation can be calculated as
J(2, 1) = [−121 −763.576; −255.248 −1700.534]
and
J⁻¹(2, 1) = [−0.1565 0.0703; 0.0235 −0.0111].
Substituting all these values in the above Newton’s formula, we get the first approximation as
[a1; b1] = [2.0; 1.0] − [−0.1565 0.0703; 0.0235 −0.0111][−111.39; −253.216] = [2.3615; 0.7968],
and the second approximation as
[a2; b2] = [2.3615; 0.7968] − [−0.2323 0.1063; 0.0339 −0.0169][−35.4457; −81.1019] = [2.7444; 0.6282].
The first two and the further steps of the method are listed in Table 7.4, taking the desired accuracy within ε = 10⁻⁵.
y(x) = 3.08314x^0.48751.
Y = A + BX, (5.64)
with A = ln a, B = b, X = ln x, and Y = ln y. The values of A and B can
be chosen to minimize
E(A, B) = Σ_{j=1}^{n} (Yj − (A + BXj))², (5.65)
and setting the partial derivatives with respect to A and B equal to zero gives the normal equations
nA + B Σ_{j=1}^{n} Xj = Σ_{j=1}^{n} Yj,
A Σ_{j=1}^{n} Xj + B Σ_{j=1}^{n} Xj² = Σ_{j=1}^{n} Xj Yj,
where
S1 = Σ Xj, S2 = Σ Yj, S3 = Σ Xj², S4 = Σ Xj Yj.
The solution is
A = (S3S2 − S1S4)/(nS3 − (S1)²),
B = (nS4 − S1S2)/(nS3 − (S1)²). (5.67)
a = e^A and b = B. (5.68)
The resulting curve
y(x) = ax^b
will be called the nonlinear least squares approximation for the data.
Example 5.25 Find the best-fit of the form y = ax^b by using the following data:
x | 1 2 4 10
y | 2.87 4.51 6.11 9.43
Solution. The sums required for the normal equation (5.66) are easily obtained using the values in Table 5.14. The linear system involving A and B in (5.66) form is
[4 4.3821; 4.3821 7.7043][A; B] = [6.6144; 8.7201].
Then solving the above linear system, the solution of the linear system is
A = 1.0975 and B = 0.5076.
Using the values of A and B in (5.68), we have the values of the parameters a and b as
a = e^A = 2.9969 and b = B = 0.5076.
Hence, the best nonlinear fit is
y(x) = 2.9969x^0.5076.
•
Use the MATLAB Command Window as follows:
>> x = [1 2 4 10];
>> y = [2.87 4.51 6.11 9.43];
>> [A, B] = exp1fit(x, y);
Program 5.9
MATLAB m-file for the Nonlinear Least Squares Fit
function [A,B] = exp1fit(x,y)
% Least squares fit y = a*x^b via the linearization Y = A + B*X,
% where X = log(x), Y = log(y), A = log(a), B = b
n = length(x); X = log(x); Y = log(y);
S1 = sum(X); S2 = sum(Y); S3 = sum(X.*X); S4 = sum(X.*Y);
B = (n*S4 - S1*S2)/(n*S3 - S1^2);
A = (S3*S2 - S4*S1)/(n*S3 - S1^2);
b = B; a = exp(A);
for k = 1:n
    yfit = a*x(k)^b;              % fitted value at x(k)
    Error(k) = abs(y(k) - yfit);
end
Error = sum(Error.*Error);
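With the data of Example 5.25, the call below should reproduce the linearized parameters found there:
>> [A, B] = exp1fit([1 2 4 10], [2.87 4.51 6.11 9.43])
% expected: A = 1.0975 and B = 0.5076, so a = exp(A) = 2.9969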
Table 5.15 shows the error analysis of the nonlinear least squares approxi-
mation.
Hence, the error associated with the nonlinear least squares approximation
is
E(a, b) = Σ_{i=1}^{4} (yi − a xi^b)² = 0.1267.
•
Similarly, for the other nonlinear curve y(x) = ae^(bx), the least squares error is defined as
E(a, b) = Σ_{j=1}^{n} (yj − ae^(bxj))², (5.69)
Then the set of normal equations (5.70) represents the nonlinear simulta-
neous system.
Example 5.26 Find the best-fit of the form y = ae^(bx) by using the data
x | 0 0.25 0.4 0.5
y | 9.532 7.983 4.826 5.503
by Newton’s method, starting with the initial approximation (a0 , b0 ) = (8, 0) and taking the desired accuracy within ε = 10⁻⁵.
By using the given data points, the nonlinear system (5.71) gives
f1 (a, b) = 9.532 − a + 7.983e^(0.25b) − ae^(0.5b) + 4.826e^(0.4b) − ae^(0.8b) + 5.503e^(0.5b) − ae^b = 0
f2 (a, b) = 1.9958e^(0.25b) − 0.25ae^(0.5b) + 1.9304e^(0.4b) − 0.4ae^(0.8b) + 2.7515e^(0.5b) − 0.5ae^b = 0,
where
J = [∂f1/∂a ∂f1/∂b; ∂f2/∂a ∂f2/∂b];
let us start with the initial approximation (a0 , b0 ) = (8, 0), and the values of the functions at this initial approximation are as follows:
f1 (8, 0) = −4.156
f2 (8, 0) = −2.522.
The Jacobian matrix J and its inverse J⁻¹ at the given initial approximation can be computed as
J(8, 0) = [−4 −11.722; −1.15 −4.913]
and
J⁻¹(8, 0) = [−0.7961 1.8993; 0.1863 −0.6481].
Substituting all these values in the above Newton’s formula, we get the first approximation as
[a1; b1] = [8.0; 0.0] − [−0.7961 1.8993; 0.1863 −0.6481][−4.156; −2.522] = [9.48168; −0.86015].
The first two and the further steps of the method are listed in Table 7.4, taking the desired accuracy within ε = 10⁻⁵. Hence, the best nonlinear fit is
y(x) = 9.73060e^(−1.26492x).
•
Once again, to make this exponential form a linearized form, we take the
logarithms of both sides of (5.60), and we get
ln y = ln a + bx,
to get the values of A and B, the data set may be transformed to (xj , ln yj )
and determining a and b is a linear least squares problem. The values of
unknowns a and b are deduced from the relations
a = e^A and b = B. (5.75)
The resulting curve
y(x) = ae^(bx)
will be called the nonlinear least squares approximation for the data.
Example 5.27 Find the best-fit of the form y = ae^(bx) by using the following data:
x | 0 0.25 0.4 0.5
y | 9.532 7.983 4.826 5.503
Solution. The sums required for the normal equation (5.74) are easily
obtained using the values in Table 5.17.
4A + 1.1500B = 7.6112
1.1500A + 0.4725B = 2.0016.
Then solving the above linear system, the solution of the linear system is
A = 2.2812 and B = −1.3157.
Using the values in (5.75), we have the values of the unknown parameters
a and b as
a = e^A = 9.7874 and b = B = −1.3157.
Hence, the best nonlinear fit is
y(x) = 9.7874e^(−1.3157x).
Note that the value of a and b calculated for the linearized problem
will not necessarily be the same as the values obtained for the original
least squares problem. In this example, the first of the normal equations of the nonlinear system becomes
9.532 − a + 7.983e^(0.25b) − ae^(0.5b) + 4.826e^(0.4b) − ae^(0.8b) + 5.503e^(0.5b) − ae^b = 0.
Now Newton’s method for nonlinear systems can be applied to this system, and we get the values of a and b as
a = 9.73060 and b = −1.26492, in agreement with Example 5.26.
Table 5.18 shows the error analysis of the nonlinear least squares approxi-
mation.
Hence, the error associated with the nonlinear least squares approximation
is
E(a, b) = Σ_{i=1}^{4} (yi − ae^(bxi))² = 2.0496.
•
Table 5.19 shows the conversion of nonlinear forms into linear forms by
using a change of variables and constants.
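For example, a form such as y = 1/(a + bx) (one typical case of this kind) is linearized by the change of variable Y = 1/y, which gives Y = a + bx; a and b then follow from the ordinary least squares line through the transformed points (xj , 1/yj).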
Example 5.28 Find the best-fit of the form y = axe^(−bx) by using a change of variables to linearize the following data points:
x | 1.5 2.5 4.0 5.5
y | 3.0 4.3 6.5 7.0
Using these values of A and B, we have the values of the parameters a and b as
a = e^A = 2.3224 and b = −B = 0.1043.
Hence,
y(x) = 2.3224xe^(−0.1043x)
Program 5.10
MATLAB m-file for the Nonlinear Least Squares Fit
function [A,B] = exp2fit(x,y)
% Least squares fit y = a*exp(b*x) via the linearization
% Y = A + B*x, where Y = log(y), A = log(a), B = b
n = length(x); Y = log(y); S1 = sum(x); S2 = sum(Y);
S3 = sum(x.*x); S4 = sum(x.*Y);
B = (n*S4 - S1*S2)/(n*S3 - S1^2);
A = (S3*S2 - S4*S1)/(n*S3 - S1^2);
b = B; a = exp(A);
for k = 1:n
    yfit = a*exp(b*x(k));         % fitted value at x(k)
    Error(k) = abs(y(k) - yfit);
end
Error = sum(Error.*Error);
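With the data of Example 5.27, the call below should reproduce the linearized parameters obtained above:
>> [A, B] = exp2fit([0 0.25 0.4 0.5], [9.532 7.983 4.826 5.503])
% expected: A = 2.2812 and B = -1.3157, so a = exp(A) = 9.7874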
z = ax + by + c, (5.76)
The least squares error E(a, b, c) = Σ_{j=1}^{n} (zj − axj − byj − c)² is minimized by setting its partial derivatives equal to zero:
∂E/∂a = 2 Σ_{j=1}^{n} (zj − axj − byj − c)(−xj) = 0,
∂E/∂b = 2 Σ_{j=1}^{n} (zj − axj − byj − c)(−yj) = 0,
∂E/∂c = 2 Σ_{j=1}^{n} (zj − axj − byj − c)(−1) = 0.
Solution. The sums required for the normal equation (5.78) are easily
obtained using the values in Table 5.23.
14a + 15b + 8c = 82
15a + 19b + 9c = 93
8a + 9b + 5c = 49.
Then solving the above linear system, the solution of the linear system is
a = 2.4, b = 1.2, and c = 3.8,
so the best least squares plane is z = 2.4x + 1.2y + 3.8.
>> x = [1 1 2 2 2];
>> y = [1 2 1 2 3];
>> z = [7 9 10 11 12];
>> sol = planefit(x, y, z);
Table 5.22 shows the error analysis of the least squares plane approximation. Hence, the error associated with the least squares plane approximation is
E(a, b, c) = Σ_{i=1}^{5} (zi − axi − byi − c)² = 0.4000. •
Program 5.11
MATLAB m-file for the Least Squares Plane Fit
function Sol = planefit(x,y,z)
% Least squares plane z = a*x + b*y + c
n = length(x); S1 = sum(x); S2 = sum(y); S3 = sum(z);
S4 = sum(x.*x); S5 = sum(x.*y); S6 = sum(y.*y);
S7 = sum(z.*x); S8 = sum(z.*y);
A = [S4 S5 S1; S5 S6 S2; S1 S2 n]; B = [S7 S8 S3]'; C = A\B;
for k = 1:n
    zfit = C(1)*x(k) + C(2)*y(k) + C(3);
    Error(k) = abs(z(k) - zfit);
end
Error = sum(Error.*Error);
Sol = C;
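With the data of the plane-fitting example above, the call should return the coefficients found there:
>> sol = planefit([1 1 2 2 2], [1 2 1 2 3], [7 9 10 11 12])
% expected: sol = [2.4; 1.2; 3.8]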
Suppose that a trigonometric polynomial
p(x) = a0/2 + Σ_{k=1}^{m} [ak cos(kx) + bk sin(kx)] (5.79)
is fitted to the data. Minimizing the least squares error
E(a0 , a1 , . . . , am , b1 , . . . , bm ) = Σ_{j=1}^{n} [yj − p(xj)]² (5.80)
requires
∂E/∂ak = 0, for k = 0, 1, . . . , m,
∂E/∂bk = 0, for k = 1, 2, . . . , m, (5.81)
which gives
Σ_{j=1}^{n} [a0/2 + Σ_{i=1}^{m} ai cos(ixj) + Σ_{i=1}^{m} bi sin(ixj)] = Σ_{j=1}^{n} yj,
Σ_{j=1}^{n} [a0/2 + Σ_{i=1}^{m} ai cos(ixj) + Σ_{i=1}^{m} bi sin(ixj)] cos(kxj) = Σ_{j=1}^{n} cos(kxj)yj,
Σ_{j=1}^{n} [a0/2 + Σ_{i=1}^{m} ai cos(ixj) + Σ_{i=1}^{m} bi sin(ixj)] sin(kxj) = Σ_{j=1}^{n} sin(kxj)yj,
for k = 1, 2, . . . , m. (5.82)
Then the set of these normal equations (5.83) represents the system of
(2m + 1) equations in (2m + 1) unknowns and can be solved using any
numerical method discussed in Chapter 2. Note that the derivation of the
coefficients ak and bk is usually called discrete Fourier analysis.
where
S1 = Σ_{j=1}^{n} cos(xj), S2 = Σ_{j=1}^{n} sin(xj),
S3 = Σ_{j=1}^{n} cos²(xj), S4 = Σ_{j=1}^{n} cos(xj) sin(xj),
S5 = Σ_{j=1}^{n} sin²(xj), S6 = Σ_{j=1}^{n} yj,
S7 = Σ_{j=1}^{n} cos(xj)yj, S8 = Σ_{j=1}^{n} sin(xj)yj,
where
S1 = Σ cos(xj) = 4.7291, S2 = Σ sin(xj) = 1.4629,
S3 = Σ cos²(xj) = 4.4817, S4 = Σ cos(xj) sin(xj) = 1.3558,
S5 = Σ sin²(xj) = 0.5183, S6 = Σ yj = 3.2100,
S7 = Σ cos(xj)yj = 3.0720, S8 = Σ sin(xj)yj = 0.8223.
So,
[2.5 4.7291 1.4629; 4.7291 4.4817 1.3558; 1.4629 1.3558 0.5183][a0; a1; b1] = [3.2100; 3.0720; 0.8223],
and using the Gauss elimination method, we get the values of the unknowns as
a0 = −0.0002, a1 = 0.9847, b1 = −0.9886.
The original data along with the approximating polynomial are shown
graphically in Figure 5.12. To plot Figure 5.12, one can use the MATLAB
Command Window as follows:
>> x = [0.1 0.2 0.3 0.4 0.5];
>> y = [0.90 0.75 0.64 0.52 0.40];
>> xfit = 0:0.01:0.6;
>> yfit = -0.0002/2 + 0.9847.*cos(xfit) - 0.9886.*sin(xfit);
>> plot(x, y, 'o', xfit, yfit, '-');
The sums used above were computed as follows (C = cos xj , S = sin xj):
j | xj  | yj   | C      | S      | C²     | S²     | CS     | C·yj   | S·yj
1 | 0.1 | 0.90 | 0.9950 | 0.0998 | 0.9900 | 0.0100 | 0.0993 | 0.8955 | 0.0899
2 | 0.2 | 0.75 | 0.9801 | 0.1987 | 0.9605 | 0.0395 | 0.1947 | 0.7350 | 0.1490
3 | 0.3 | 0.64 | 0.9553 | 0.2955 | 0.9126 | 0.0873 | 0.2823 | 0.6114 | 0.1891
4 | 0.4 | 0.52 | 0.9211 | 0.3894 | 0.8484 | 0.1516 | 0.3587 | 0.4790 | 0.2025
5 | 0.5 | 0.40 | 0.8776 | 0.4794 | 0.7702 | 0.2298 | 0.4207 | 0.3510 | 0.1918
Σ | 1.5 | 3.21 |        | 4.7291 | 1.4629 | 4.4817 | 0.5183 | 1.3558 | 3.0720 | 0.8223
number of columns (m > n), then the linear system is called an overdeter-
mined system. Typically, an overdetermined system has no solution. This
type of system generally arises when dealing with experimental data. It is
also common in optimization-related problems.
2x1 = 3
4x1 = 1. (5.85)
Subtracting twice the first equation from the second gives 0 = −5, so the two equations are inconsistent. In vector form, the system is
[2; 4] x1 = [3; 1]. (5.86)
The left-hand side of (5.86) is [0, 0]T when x1 = 0, and is [2, 4]T when
x1 = 1. Note that as x1 takes on all possible values, the left-hand side of
(5.86) generates the line connecting the origin and the point (2, 4) (Fig-
ure 5.13). On the other hand, the right-hand side of (5.86) is the vector
[3, 1]T . Since the point (3, 1) does not lie on the line, the left-hand side
and the right-hand side of (5.86) are never equal. The given system (5.86)
is only consistent when the point corresponding to the right-hand side is
contained in the line corresponding to the left-hand side. Thus, the least
squares solution to (5.86) is the value of x1 for which the point on the line
is closest to the point (3, 1). In Figure 5.13, we see that the point (1, 2) on
the line is closest to (3, 1), which we get when x1 = 1/2. So the least squares solution to (5.85) is x1 = 1/2. Now consider the following linear system of
three equations in two variables:
a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2 . (5.87)
a31 x1 + a32 x2 = b3
Again, it is impossible to find a solution that can satisfy all of the equations
unless two of the three equations are dependent. That is, if only two out
of the three equations are unique, then a solution is possible. Otherwise,
our best hope is to find a solution that minimizes the error, i.e., the least
squares solution. Now, we discuss the method for finding the least squares
solution to the overdetermined system.
The least squares solution to (5.87) consists of the values of x1 and x2 that minimize the expression
(b1 − a11 x1 − a12 x2)² + (b2 − a21 x1 − a22 x2)² + (b3 − a31 x1 − a32 x2)². (5.88)
The minimum is found by differentiating (5.88) with respect to x1 and x2 and setting the derivatives to zero. Solving for x1 and x2 then gives the least squares solution x̂ = [x1 , x2 ]T to the system (5.87).
Replacing r by b − Ax gives
AT (b − Ax̂) = 0 (5.94)
or
AT Ax̂ = AT b, (5.95)
which is called the normal equation.
Solving the normal equations (5.95) for the system used below gives
x̂ = [−1, 0, 3]T,
and these values are the least squares solution of the given overdetermined system. •
Using the MATLAB Command Window, the above result can be reproduced as follows:
>> A = [2 5 1; 3 -4 2; 4 3 3; 5 -2 4];
>> b = [1; 3; 5; 7];
>> x = overD(A, b);
Program 5.12
MATLAB m-file for the Overdetermined Linear System
function sol = overD(A,b)
x = (A'*A)\(A'*b);   % solve the normal equations
sol = x;
In general, the coefficient in row i and column j for the matrix AAT is
the dot product between row i and row j from A.
We want to find the least squares solution to (5.96). The set of all points (x1 , x2 ) that satisfy (5.96) forms a line with slope −4/3, and the distance from the origin to the point (x1 , x2 ) is (x1² + x2²)^(1/2).
To find the least squares solution to (5.96), we choose the point (x1 , x2 )
that is as close to the origin as possible. The point (z1 , z2 ) in Figure 5.14,
which is closest to the origin, is the least squares solution to (5.96). We
see in Figure 5.14 that the vector from the origin to (z1 , z2 ) is orthogonal
to the line 4x1 + 3x2 = 15.
x = p + (x − p) = p + q,
where q = x − p. Since
A(x − p) = b − b = 0,
any solution x can be expressed as x = p + q with Aq = 0.
The set of all q such that Aq = 0 is called the null space of A (kernel
of A) letting
N = {q : Aq = 0}.
then
z = t1 [a11; a12; . . . ; a1n] + t2 [a21; a22; . . . ; a2n] + · · · + tm [am1; am2; . . . ; amn],
or
z = [a11 t1 + a21 t2 + · · · + am1 tm; a12 t1 + a22 t2 + · · · + am2 tm; . . . ; a1n t1 + a2n t2 + · · · + amn tm]
= [a11 a21 · · · am1; a12 a22 · · · am2; . . . ; a1n a2n · · · amn][t1; t2; . . . ; tm].
So,
z = AT t, where t = [t1; t2; . . . ; tm].
Substituting x = z = AT t into the linear system Ax = b, we have
AAT t = b, (5.98)
and solving this equation yields t, i.e., t = (AAT)⁻¹b, while the least squares solution z to the underdetermined system is
z = AT t = AT (AAT)⁻¹ b. (5.99)
Now solving the underdetermined equation (5.96),
[4 3][x1; x2] = 15,
we first use (5.98) as follows:
[4 3][4; 3] t = 15,
which gives 25t = 15, i.e., t = 15/25 = 0.6. Now using (5.99), we have
z = AT t = [4; 3](0.6) = [2.4; 1.8],
the required least squares solution of the given underdetermined equation. •
•
x1 + 2x2 − 3x3 = 42
5x1 − x2 + x3 = 54.
AAT t = b,
we obtain
[1 2 −3; 5 −1 1][1 5; 2 −1; −3 1][t1; t2] = [42; 54],
i.e., [14 0; 0 27][t1; t2] = [42; 54], which gives
t1 = 3, t2 = 2.
Since the best least squares solution z to the given linear system is z = AT t, i.e.,
[z1; z2; z3] = [1 5; 2 −1; −3 1][3; 2] = [13; 4; −7],
it is called the least squares solution of the given underdetermined system.
•
Using the MATLAB Command Window, the above result can be reproduced as:
>> A = [1 2 -3; 5 -1 1];
>> b = [42; 54];
>> x = underD(A, b);
Program 5.13
MATLAB m-file for the Underdetermined Linear System
function sol = underD(A,b)
t = (A*A')\b;   % solve A*A'*t = b
x = A'*t;       % minimum norm least squares solution
sol = x;
then we have
AT A = [1 2 3; 2 3 4][1 2; 2 3; 3 4] = [14 20; 20 29],
>> A = [1 2; 2 3; 3 4];
>> pinv(A);
Note that if A is a square matrix, then A+ = A−1 and in such a case, the
least squares solution of a linear system Ax = b is the exact solution, since
x̂ = A+ b = A−1 b.
Example 5.33 Find the pseudoinverse of the matrix of the following lin-
ear system, and then use it to compute the least squares solution of the
system:
x1 + 2x2 = 3
2x1 − 3x2 = 4.
Solution. The matrix form of the given system is
[1 2; 2 −3][x1; x2] = [3; 4],
and so
AT A = [1 2; 2 −3][1 2; 2 −3] = [5 −4; −4 13].
The inverse of the matrix AT A can be computed as
(AT A)⁻¹ = [13/49 4/49; 4/49 5/49].
The pseudoinverse of the matrix of the given system is
A⁺ = (AT A)⁻¹AT = [13/49 4/49; 4/49 5/49][1 2; 2 −3] = [3/7 2/7; 2/7 −1/7].
Now we compute the least squares solution of the system as
x̂ = A⁺b = [3/7 2/7; 2/7 −1/7][3; 4],
which gives
x̂ = [17/7; 2/7],
>> A = [1 2; 2 -3];
>> b = [3 4]';
>> x = pinv(A)*b;
1. AA+ A = A.
2. A+ AA+ = A+ .
3. (AT )+ = (A+ )T .
1. A+ = A−1 .
2. (A+ )+ = A.
3. (AT )+ = (A+ )T .
So we conclude that
Rx̂ = QT b
must be satisfied by the solution of AT Ax̂ = AT b, but because, in general, R is not square, we cannot use multiplication by (RT)⁻¹ to arrive at this conclusion. In fact, it is not true, in general, that the solution of
Rx̂ = QT b
RT Rx̂ = RT QT b. (5.103)
Since R = [R1; 0], where R1 is n × n upper triangular, this reduces to
R1 x̂ = [In 0](QT b).
The left-hand side, R1 x̂, is (n × n) × (n × 1) −→ n × 1, and the right-hand side is (n × (n + (m − n))) × (m × m) × (m × 1) −→ n × 1. If we define the vector q to be equal to the first n components of QT b, then this becomes
R1 x̂ = q, (5.105)
which is a square linear system involving a nonsingular upper triangular
n×n matrix. So (5.105) is called the least squares solution of the overdeter-
mined system Ax = b, with QR decomposition by backward substitution,
where A = QR is the QR decomposition of A and q is essentially QT b.
Note that the last (m − n) columns of Q are not needed to solve the
least squares solution of the linear system with QR decomposition. The
block-matrix representation of Q corresponding to R (by (5.104)) is
Q = [Q1 , Q2 ],
where Q1 is the matrix composed of the first n columns of Q and Q2 is a matrix composed of the remaining (m − n) columns. Note that only the first n
columns of Q are needed to create A using the coefficients in R, and we can
save effort and memory in the process of creating the QR decomposition.
The so-called short QR decomposition of A is
A = Q1 R1 . (5.106)
The only difference between the full QR decomposition and the short de-
composition is that the full QR decomposition contains the additional
(m − n) columns of Q.
Example 5.35 Find the least squares solution of the following linear sys-
tem Ax = b using QR decomposition, where
A = [2 1; 1 0; 3 1], x = [x1; x2], b = [1.9; 0.9; 2.8].
and
QT b = [−0.5345 −0.2673 −0.8018; 0.6172 −0.7715 −0.1543; −0.5774 −0.5774 0.5774][1.9; 0.9; 2.8] = [−3.5011; 0.0463; 0.0000],
so that
R1 = [−3.7417 −1.3363; 0 0.4629] and q = [−3.5011; 0.0463].
Then we solve
R1 x̂ = q,
i.e.,
[−3.7417 −1.3363; 0 0.4629][x1; x2] = [−3.5011; 0.0463].
Using backward substitution, we obtain
x2 = 0.0463/0.4629 = 0.1000 and x1 = (−3.5011 + 1.3363(0.1000))/(−3.7417) = 0.9000.
>> A = [2 1; 1 0; 3 1];
>> [Q, R] = qr(A);
The short QR decomposition of A can be obtained by using the second form of the built-in function qr as
>> A = [2 1; 1 0; 3 1];
>> [Q1, R1] = qr(A, 0);
Q1 =
−0.5345  0.6172
−0.2673 −0.7715
−0.8018 −0.1543
R1 =
−3.7417 −1.3363
 0       0.4629
As expected, Q1 and the first two columns of Q are identical, as are R1
and the first two rows of R. The short QR decomposition of A possesses
all the necessary information in the columns of Q1 and R1 to reconstruct
A.
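The whole computation condenses to a few commands using the short decomposition (a sketch of the same solve):
>> A = [2 1; 1 0; 3 1]; b = [1.9; 0.9; 2.8];
>> [Q1, R1] = qr(A, 0);
>> xhat = R1\(Q1'*b)   % returns approximately [0.9; 0.1]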
(UDV T)T UDV T x̂ = (UDV T)T b
V DT UT UDV T x̂ = V DT UT b
V DT DV T x̂ = V DT UT b
V T V DT DV T x̂ = V T V DT UT b
DT DV T x̂ = DT UT b
DV T x̂ = UT b
V T x̂ = D⁻¹UT b
x̂ = V D⁻¹UT b.
This is the same formal solution that we found for the linear system Ax = b
(see Chapter 6), but recall that A is no longer a square matrix.
D1 V T x̂ = q (5.107)
or
x̂ = V D1⁻¹ q. (5.108)
Example 5.36 Find the least squares solution of the following linear sys-
tem Ax = b using SVD, where
A = [1 1; 0 1; 1 0], x = [x1; x2], b = [1; 1; 1].
Solution. First, we find the SVD of the given matrix. The first step is to
find the eigenvalues of the following matrix:
AT A = [1 0 1; 1 1 0][1 1; 0 1; 1 0] = [2 1; 1 2].
The characteristic polynomial is
p(λ) = λ² − 4λ + 3 = (λ − 3)(λ − 1) = 0,
which gives
λ1 = 3, λ2 = 1,
and the corresponding eigenvectors of AT A are
[1; 1] and [−1; 1].
These vectors are orthogonal, so we normalize them to obtain
v1 = [√2/2; √2/2] and v2 = [−√2/2; √2/2].
The singular values of A are
σ1 = √λ1 = √3 and σ2 = √λ2 = √1 = 1.
Thus,
V = [√2/2 −√2/2; √2/2 √2/2] = [0.7071 −0.7071; 0.7071 0.7071]
and
D = [√3 0; 0 1; 0 0] = [1.7321 0; 0 1.0000; 0 0].
u3 = [−1/√3; 1/√3; 1/√3].
So we have
U = [√6/3 0 −1/√3; √6/6 √2/2 1/√3; √6/6 −√2/2 1/√3] = [0.8165 0.0000 −0.5774; 0.4082 0.7071 0.5774; 0.4082 −0.7071 0.5774].
Hence,
D1 = [1.7321 0; 0 1.0000] and D1⁻¹ = [0.5774 0; 0 1.0000].
Also,
UT b = [0.8165 0.4082 0.4082; 0.0000 0.7071 −0.7071; −0.5774 0.5774 0.5774][1; 1; 1] = [1.6330; 0.0000; 0.5774],
so q = [1.6330; 0.0000] and
x̂ = V D1⁻¹ q,
which gives
[x1; x2] = [0.7071 −0.7071; 0.7071 0.7071][0.5774 0; 0 1.0000][1.6330; 0.0000] = [0.6667; 0.6667],
In Example 5.36, we apply the full SVD of A using the first form of the built-in function svd:
>> A = [1 1; 0 1; 1 0];
>> [U, D, V] = svd(A);
The short SVD of A can be obtained by using the second form of the built-in function svd:
>> A = [1 1; 0 1; 1 0];
>> [U1, D1, V] = svd(A, 0);
U1 =
0.8165  0.0000
0.4082  0.7071
0.4082 −0.7071
D1 =
1.7321  0
0       1.0000
V =
0.7071 −0.7071
0.7071  0.7071
As expected, U1 and the first two columns of U are identical, as are D1 and
the first two rows of D (no change in V in either form). The short SVD
of A possesses all the necessary information in the columns of U1 and D1
(with V also) to reconstruct A.
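As with QR, the least squares solve itself condenses to a few commands using the short SVD (a sketch of the same computation):
>> A = [1 1; 0 1; 1 0]; b = [1; 1; 1];
>> [U1, D1, V] = svd(A, 0);
>> xhat = V*(D1\(U1'*b))   % returns approximately [0.6667; 0.6667]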
Note that when m and n are similar in size, SVD is significantly more
expensive to compute than QR decomposition. If m and n are equal, then
solving a least squares problem by SVD is about an order of magnitude
more costly than using QR decomposition. So for least squares problems
it is generally advisable to use QR decomposition. When a least squares
problem is known to be a difficult one, using SVD is probably justified.
5.4 Summary
In this chapter, we discussed the procedures for developing approximating
polynomials for discrete data. First, we discussed the Lagrange and Newton divided differences polynomials; both yield the same interpolation
for a given set of n data pairs (x, f (x)). The pairs are not required to
be ordered, nor is the independent variable required to be equally spaced.
The dependent variable is approximated as a single-valued function. The
Lagrange polynomial works well for a small number of data points. The Newton di-
vided differences polynomial is generally more efficient than the Lagrange
polynomial, and it can be adjusted easily for additional data. For effi-
cient implementation of divided difference interpolation, we used Aitken’s
method, which is designed for the easy evaluation of the polynomial. We
also discussed the Chebyshev polynomial interpolation of the function over
the interval [−1, 1]. These types of polynomials are used to minimize ap-
proximation error.
5.5 Problems
1. Use the Lagrange interpolation formula based on the points x0 = 0, x1 = 1, and x2 = 2 to find the equation of the quadratic polynomial approximating f(x) = (x + 1)/(x + 5) at x = 1.5. Compute the absolute error.
function f (x) at x = 0.5 and x = 3.5. Compute the error bounds for
each case.
7. Consider the following table having the data for f(x) = e^(3x) cos 2x:
(a) Construct the divided differences table for the tabulated function.
(b) Compute the Newton interpolating polynomials p2 (x) and p3 (x)
at x = 3.7.
11. Consider the following table of f(x) = √x:

x      4        5        6        7        8
f(x)   2.0000   2.2361   2.4495   2.6458   2.8284
(a) Construct the divided differences table for the tabulated function.
(b) Find the Newton interpolating polynomials p3 (x) and p4 (x) at
x = 5.9.
(c) Compute error bounds for your approximations in part (b).
12. Let f (x) = ln(x + 3) sin x, with x0 = 0, x1 = 2, x2 = 2.5, x3 = 4, x4 =
4.5. Then:
(a) Construct the divided differences table for the given data points.
(b) Find the Newton divided difference polynomials p2 (x), p3 (x) and
p4 (x) at x = 2.4.
(c) Compute error bounds for your approximations in part (b).
(d) Compute the actual error.
13. Show that if x0 , x1 , and x2 are distinct, then
f [x0 , x1 , x2 ] = f [x1 , x2 , x0 ] = f [x2 , x0 , x1 ].
18. Find the Chebyshev polynomial p3(x) that approximates the function f(x) = e^(2x+1) over [−1, 1].
20. Find the four Chebyshev points in 2 ≤ x ≤ 4 and write the Lagrange interpolating polynomial for e^x (x + 2). Compute the error bound.
25. Find the least squares line fit y = ax + b for the following data:
27. Find the least squares parabolic fit y = a + bx + cx2 for the following
data:
28. Repeat Problem 27 to find the best fit of the form y = ax^b.

29. Find the best fit of the form y = ae^(bx) for the following data:
30. Use a change of variable to linearize each of the following data points:

(a) For the given data (1, 10), (3, 20), (5, 35), (7, 55), (9, 70), find the least squares curve f(x) = 4/(1 + c e^(ax)).

(b) For the given data (0.5, 2), (1.5, 5), (2.5, 6.5), (4, 7), (6.5, 11), find the least squares curve f(x) = 1/(a + bx^2).

(c) For the given data (1.3, 1/4), (1.8, 1/9), (2.5, 1/16), (3.6, 1/25), (4.2, 1/36), find the least squares curve f(x) = 1/(a + bx)^2.
(d) For the given data (4, 5.6), (7, 7.2), (9, 11.5), (12, 15.5), (17, 18.7), find the least squares curve f(x) = (a ln x + b)/ln x.
31. Find the least squares planes for the following data:
32. Find the plane z = ax + by + c that best fits the following data:
(a) (1, 2, 3), (1, −2, 1), (2, 1, 3), (2, 2, 1).
(b) (2, 4, −1), (2, 2, 5), (1, 3, 1), (7, 8, 2).
(c) (3, −1, 1), (2, 3, −2), (9, 6, −2), (7, 1, 2).
(d) (1, 2, 2), (3, 1, 6), (1, 2, 2), (2, 5, 1).
33. Find the trigonometric least squares polynomial fit y = a0 +a1 cos x+
b1 sin x for each of the following data:
(a) (1.5, 7.5), (2.5, 11.4), (3.5, 15.3), (4.5, 19.2), (5.5, 23.5).
(b) (0.2, 3.0), (0.4, 5.0), (0.6, 7.0), (0.8, 9.0), (1.0, 11.0).
(c) (1.1, 6.5), (1.3, 8.3), (1.5, 10.4), (1.7, 12.9), (1.9, 14.6).
(d) (−2.0, 1.5), (−1.0, 2.5), (0.0, 3.5), (1.0, 4.5), (2.0, 5.5).
35. Find the least squares solution for each of the following overdeter-
mined systems:
(a)
3x1 + x2 = 1
2x1 + 5x2 = 2
x1 − 4x2 = 3
(b)
7x1 + 6x2 = 5
3x1 + 5x2 = 1
2x1 + 6x2 = 2
(c)
2x1 − 3x2 + 4x3 = 13
x1 + 5x2 + 3x3 = 7
3x1 + 2x2 + x3 = 11
4x1 + x2 + 5x3 = 10
(d)
4x1 − 3x2 + 4x3 = 9
3x1 + 2x2 − 5x3 = 3
2x1 + 5x2 − 9x3 = 5
4x1 + 12x2 + 3x3 = 7
36. Find the least squares solution for each of the following overdeter-
mined systems:
(a)
12x1 − 9x2 = 7
11x1 + 21x2 = 13
17x1 − 22x2 = 24
(b)
x1 + 5x2 + 19x3 = 13
4x1 − 2x2 + 3x3 = 14
3x1 + x2 − x3 = 12
5x1 − 4x2 + 4x3 = 19
(c)
2x1 − 4x2 + 11x3 = 9
3x1 − 5x2 + 5x3 = 3
11x1 + 11x2 − 7x3 = 2
x1 − 8x2 + 3x3 = 7
(d)
2x1 + 5x2 + 2x3 + 2x4 = 12
x1 + 4x2 + 6x3 + x4 = 14
3x1 + 7x2 + 2x3 − 2x4 = 23
5x1 − 2x2 + 11x3 + 7x4 = 11
x1 + 4x2 − 7x3 + 13x4 = 19
37. Find the least squares solution for each of the following overdeter-
mined systems:
(a)
7x1 + 6x2 − 4x3 = −3
8x1 + 5x2 + 3x3 = −5
9x1 + 3x2 + 5x3 = 6
−3x1 + 2x2 + 6x3 = 7
(b)
x1 + 3x2 + 9x3 = 23
2x1 − 2x2 + 6x3 = 24
3x1 − 7x2 + 5x3 = 11
4x1 − 4x2 + 9x3 = 22
(c)
2x1 − 5x2 + 4x3 + 3x4 = 15
3x1 + 2x2 + x3 + 5x4 = 14
7x1 − 3x2 + 4x3 + 9x4 = 13
11x1 + 8x2 + 5x3 + 7x4 = 12
3x1 + x2 − 2x3 + 6x4 = 11
(d)
x1 + 7x2 + 7x3 − 3x4 = 5
3x1 − 2x2 − 5x3 + 2x4 = 6
3x1 + 3x2 − 5x3 − 2x4 = 7
x1 − 3x2 + 12x3 + 11x4 = 8
2x1 + 4x2 − 15x3 + 3x4 = 9
38. Find the least squares solution for each of the following overdeter-
mined systems:
(a)
2x1 − 9x2 = 5
4x1 + 8x2 = 1
12x1 + 13x2 = 2
(b)
3x1 + 4x2 + 9x3 = 3
4x1 − 2x2 + 7x3 = 4
2x1 − 8x2 + 4x3 = 2
x1 − 4x2 + 4x3 = 9
(c)
2x1 − 24x2 + 9x3 = 3
11x1 − 15x2 + 14x3 = 1
13x1 + 21x2 − 6x3 = 2
x1 − 8x2 + 3x3 = 3
(d)
x1 + 5x2 + 2x3 − 2x4 = 1
3x1 − 2x2 + 6x3 + x4 = 1
3x1 + 7x2 − 5x3 − 2x4 = 2
5x1 − 2x2 + 12x3 + 11x4 = 1
x1 + 4x2 − 15x3 + 3x4 = 1
39. Find the least squares solution for each of the following underdeter-
mined systems:
(a)
2x1 + x2 + x3 = 4
x1 + 3x2 + 4x3 = 5
(b)
x1 − 3x2 + 4x3 = 2
−x1 + 2x2 + 5x3 = 11
(c)
x1 + 5x2 + 3x3 − 5x4 = 15
2x1 + 5x2 + 6x3 + x4 = 18
3x1 + 2x2 + 5x3 − 3x4 = 10
(d)
2x1 + 5x2 + 2x3 + x4 = −5
4x1 + 3x2 − 2x3 + 9x4 = 6
x1 + x2 + 3x3 + 8x4 = 12
40. Find the least squares solution for each of the following underdeter-
mined systems:
(a)
2x1 + 11x2 + 7x3 = 13
5x1 + 13x2 + 3x3 = 11
(b)
12x1 + 8x2 − 9x3 = 4
15x1 − 13x2 + 14x3 = 5
(c)
x1 + 5x2 + 4x3 + 3x4 = 22
7x1 + 15x2 + 6x3 + x4 = 15
−3x1 + 7x2 + 3x3 + 6x4 = 19
(d)
x1 + 5x2 + x3 + 13x4 = 3
2x1 + 15x2 + 12x3 + 9x4 = 8
x1 − 11x2 + 17x3 + 22x4 = −2
41. Find the least squares solution for each of the following underdeter-
mined systems:
(a)
3x1 − 9x2 + 5x3 = 21
4x1 + 17x2 + 15x3 = 23
(b)
9x1 − 5x2 − 8x3 = 14
7x1 − 3x2 + 4x3 = 11
(c)
2x1 − 6x2 + 7x3 + 9x4 = 9
5x1 + 11x2 + 9x3 − 4x4 = 8
2x1 + 5x2 + 6x3 + 8x4 = 7
(d)
3x1 + 5x2 + 7x3 + 3x4 = 33
2x1 + 21x2 + 2x3 + 29x4 = 18
5x1 − 31x2 + 19x3 + 12x4 = 22
42. Find the least squares solution for each of the following underdeter-
mined systems:
(a)
3x1 + 23x2 + 14x3 = 51
4x1 − 37x2 + 35x3 = 13
(b)
5x1 − 16x2 − 18x3 = 44
2x1 − 23x2 + 34x3 = 51
(c)
2x1 + 7x2 − 9x3 + 5x4 = 19
−x1 + 11x2 + 18x3 − 24x4 = 27
−3x1 + 15x2 + 6x3 + 8x4 = 39
(d)
13x1 + 5x2 − 13x3 + 23x4 = 17
17x1 − 22x2 − 12x3 + 29x4 = 28
15x1 + 11x2 − 19x3 + 22x4 = 32
(a) A = [1 3; 1 4; 2 1]    (b) A = [1 −2; 2 −3; 5 1]

(c) A = [2 3; 3 4; 5 2]    (d) A = [1 2; 4 −2; −1 1]

(a) A = [2 2; 3 0; 1 4]    (b) A = [3 −1; 2 3; 6 5]
(c) A = [3 0; 2 1; 1 −1; −1 2]    (d) A = [1 −1; 1 1; 2 3; 3 5]
45. Find the least squares solution for each of the following linear sys-
tems by using the pseudoinverse of the matrices:
(a)
2x1 + 5x2 = 4
x1 + 11x2 = 5
(b)
3x1 − 7x2 = 2
5x1 + 2x2 = 5
(c)
x1 + 3x2 + 2x3 = 1
2x1 − 7x2 + x3 = 1
5x1 + 3x2 + 5x3 = 1
(d)
2x1 + 7x2 + 12x3 = 3
x1 + 13x2 − 2x3 = 2
7x1 + 11x2 + 3x3 = 9
46. Find the least squares solution for each of the following linear sys-
tems by using the pseudoinverse of the matrices:
(a)
7x1 + 13x2 = 3
8x1 + 15x2 = 1
(b)
2x1 + 5x2 = 14
5x1 − 3x2 = 11
(c)
3x1 + 5x2 + 3x3 = 2
x1 + 5x2 + 6x3 = 5
−3x1 + 2x2 + 3x3 = 9
(d)
x1 + 3x2 + 4x3 = 13
x1 − 6x2 + 17x3 = 9
4x1 − 15x2 + 9x3 = 2
(a)
A = [5 3; 1 3; 2 1], x = [x1; x2], b = [4.9; 0.8; 1.7].

(b)
A = [1 −1 4; 0 2 1; 1 1 0; 2 −1 1], x = [x1; x2; x3], b = [1.1; 0.2; 0.9; 1.7].

(c)
A = [1 −1 1; −1 4 2; −2 1 2; 1 4 2], x = [x1; x2; x3], b = [0.7; −0.8; −1.5; 1.02].
(d)
A = [3 2 1; 1 2 2; 1 0 −1; 2 1 −2], x = [x1; x2; x3], b = [2.5; 1.1; 0.8; 1.9].
57. Find the least squares solution of each of the following linear systems
Ax = b using QR decomposition:
(a)
A = [2 1; 1 1; 2 2], x = [x1; x2], b = [1; 1; 1].

(b)
A = [1 −1; 0 2; 1 1], x = [x1; x2], b = [0.1; 1.7; 0.9].

(c)
A = [3 −1; −1 4; −2 1], x = [x1; x2], b = [2.7; −0.8; −1.5].

(d)
A = [4 2; 1 2; 1 0], x = [x1; x2], b = [3.5; 1.1; 0.8].
58. Solve Problem 55 using singular value decomposition.
59. Find the least squares solution of each of the following linear systems
Ax = b using singular value decomposition:
(a)
A = [−2 2; −1 1; 2 2], x = [x1; x2], b = [−1.8; −0.9; 1.9].
(b)
A = [1 −1; 1 2; 1 1], x = [x1; x2], b = [1.1; 0.7; 0.9].

(c)
A = [1 0; 1 1; −1 1], x = [x1; x2], b = [0.9; 0.85; −0.9].

(d)
A = [3 2; 1 −1; 1 3], x = [x1; x2], b = [2.5; 1.05; 0.85].
Chapter 6

Linear Programming
6.1 Introduction
In this chapter, we give an introduction to linear programming. Linear
Programming (LP) is a mathematical method for finding optimal solutions
to problems. It deals with the problem of optimizing (maximizing or min-
imizing) a linear function, subject to the constraints imposed by a system
of linear inequalities. It is widely used in industry and in government. His-
torically, linear programming was first developed and applied in 1947 by
George Dantzig, Marshall Wood, and their associates in the U.S. Air Force;
the early applications of LP were thus in the military field. However, the
emphasis in applications has now moved to the general industrial area. LP
today is concerned with the efficient use or allocation of limited resources
to meet desired objectives.
A linear function of N variables has the form

Z(x1, x2, . . . , xN) = c1 x1 + c2 x2 + · · · + cN xN.

For any linear function Z(x1, x2, . . . , xN) and any real number b, the inequalities
Z(x1 , x2 , . . . , xN ) ≤ b
and
Z(x1 , x2 , . . . , xN ) ≥ b
are linear inequalities. For example, 3x1 + 2x2 ≤ 11 and 10x1 + 15x2 ≥ 17 are linear inequalities, but 2x1^2 x2 ≥ 3 is not a linear inequality. •
b = [b1; b2; . . . ; bM], c = [c1; c2; . . . ; cN], x = [x1; x2; . . . ; xN], 0 = [0; 0; . . . ; 0].   (6.8)
6.3 Terminology
The following terms are commonly used in LP:
Step II. Identify the Constraints: in this problem, the constraints are
the limited availability of the two resources—labor and material. Model
A requires 7 hours of labor for each unit, and its production quantity is
xA . Hence, the labor requirement for model A alone will be 7xA hours
(assuming a linear relationship). Similarly, models B and C will require
3xB and 6xC hours, respectively. Thus, the total requirement of labor will
be 7xA + 3xB + 6xC , which should not exceed the available 250 hours. So
the labor constraint becomes
7xA + 3xB + 6xC ≤ 250.
Similarly, the raw material requirements will be 6xA pounds for model A,
7xB pounds for model B, and 8xC pounds for model C. Thus, the raw
material constraint is given by
6xA + 7xB + 8xC ≤ 360.
Step III. Identify the Objective: the objective is to maximize the total
profit for sales. Assuming that a perfect market exists for the product such
that all that is produced can be sold, the total profit from sales becomes
Thus, the complete mathematical model for the product mix problem may
now be summarized as follows:
xA ≥ 0, xB ≥ 0, xC ≥ 0.
•
hour. Each time an error is made by an inspector, the cost to the company
is $2.00. The company has available for the inspection job eight Grade 1
inspectors and ten Grade 2 inspectors. The company wants to determine
the optimal assignment of inspectors, which will minimize the total cost of
the inspection.
To develop the objective function, we note that the company incurs two
types of costs during inspections: wages paid to the inspector, and the cost
of the inspection error. The hourly cost of each Grade 1 inspector is
Solution. We first find the inequalities that describe the time and mon-
etary constraints. Let the company manufacture x1 of model A and x2 of
model B. Then the total manufacturing time is (5x1 + 2x2 ) hours. There
are 900 hours available. Therefore,
5x1 + 2x2 ≤ 900,
Z = 3x1 + 2x2 .
Thus, the mathematical model for the given LP problem with the profit
function and the system of linear inequalities may be written as
x1 ≥ 0, x2 ≥ 0.
In this problem, we are interested in determining the values of the vari-
ables x1 and x2 that will satisfy all the restrictions and give the maximum
value of the objective function. As a first step in solving this problem, we
want to identify all possible values of x1 and x2 that are nonnegative and
satisfy the constraints. The solution of an LP problem is merely finding
the best feasible solution (optimal solution) in the feasible region (set of all
feasible solutions). In our example, an optimal solution is a feasible solu-
tion which maximizes the objective function 3x1 + 2x2 . The value of the
region. Our objective is to identify the feasible point with the largest value
of the objective function Z.
It has been proved that the maximum value of Z will occur at a vertex
of the feasible region, namely, at one of the points A, B, C, or O; if there
is more than one point at which the maximum occurs, it will be along one
Syntax:

>> x = linprog(Z, A, b)
>> x = linprog(Z, A, b, Aeq, beq, lb, ub)

The second form defines a set of lower and upper bounds on the design variables, x, so that the solution is always in the range lb <= x <= ub. If no equalities exist, then set Aeq = [ ] and beq = [ ].
Input Parameters:
Z is the vector of objective function coefficients.
A is the matrix of inequality constraint coefficients.
b is the right-hand side in the inequality constraints.
Aeq is the matrix of equality constraint coefficients.
beq is the right-hand side in the equality constraints.
lb is the lower bounds on the design values; -Inf == unbounded below.
Empty lb ==> -Inf on all variables.
ub is the upper bounds on the design values; Inf == unbounded above.
Empty ub ==> Inf on all variables.

Output Parameters:
x is the vector of optimal design parameters.
Fval is the optimal value of the objective function.
exitflag is the exit condition of linprog.
>> Z = [3 2];
>> A = [5 2; 8 10];
>> b = [900; 2800];
Second, evaluate linprog:

>> x = linprog(-Z, A, b)
x =
100.0000
200.0000
Note that the optimization functions in the toolbox minimize the objec-
tive function. To maximize a function Z, apply an optimization function
to minimize −Z. The resulting point where the maximum of Z occurs is
also the point where the minimum of −Z occurs.
We graph the regions specified by the constraints. Let’s put in the two
constraint inequalities:
>> X = 0:200;
>> Y1 = max((900 - 5.*X)./2, 0);
>> Y2 = max((2800 - 8.*X)./10, 0);
>> Ytop = min([Y1; Y2]);
>> area(X, Ytop);
>> Z = [3 2];
>> A = [5 2; 8 10];
>> b = [900; 2800];
>> simlp(-Z, A, b)
ans =
100.0000
200.0000
This is the same answer we obtained before. Note that we entered the
negative of the coefficient vector for the objective function Z because simlp
also searches for a minimum rather than a maximum.
Z = 2x1 + x2 ,
O(0, 0) : ZO = 0 + 0 = 0
Thus, the maximum value of the objective function Z is 145/17; namely, when x1 = 55/17 and x2 = 35/17. •
To get the above results using MATLAB’s built-in function, simlp, we
do the following:
>> Z = [2 1];
>> A = [3 5; 4 1];
>> b = [20; 15];
>> simlp(-Z, A, b)
ans =
3.2353
2.0588
In fact, we can compute the maximum of an objective function by minimizing its negative:

maximize Z = c1 x1 + c2 x2 + · · · + cN xN

is equivalent to

minimize −Z = −c1 x1 − c2 x2 − · · · − cN xN,

since for any two feasible points A and B,

ZA ≤ ZB is equivalent to −ZA ≥ −ZB.
The diet problem arises in the choice of foods for a healthy diet. The prob-
lem is to determine the foods in a diet that minimize the total cost per day,
subject to constraints that ensure minimum daily nutritional requirements.
Let
• M = number of nutrients
and
x1 ≥ 0, x2 ≥ 0, . . . , xN ≥ 0,
where aij , bi , cj (i = 1, 2, . . . , M ; j = 1, 2, . . . , N ) are constants. •
x1 ≤ 8
x2 ≤ 10
5x1 + 3x2 ≥ 45
x1 ≥ 0, x2 ≥ 0.
In this problem, we are interested in determining the values of the vari-
ables x1 and x2 that will satisfy all the restrictions and give the least value
of the objective function. As a first step in solving this problem, we want
to identify all possible values of x1 and x2 that are nonnegative and satisfy
the constraints. For example, a solution x1 = 8 and x2 = 10 is positive
and satisfies all the constraints. In our example, an optimal solution is a
feasible solution which minimizes the objective function 40x1 + 36x2 .
nonnegativity constraints imply that all feasible values of the two variables
will be in the first quadrant. The constraint 5x1 + 3x2 ≥ 45 requires that
any feasible solution (x1 , x2 ) to the problem should be on one side of the
straight line 5x1 + 3x2 = 45. The proper side is found by testing whether
the origin satisfies the constraint or not. The line 5x1 + 3x2 = 45 is first
plotted by taking two convenient points (for example, x1 = 0, x2 = 15 and
x1 = 9, x2 = 0).
In some LP problems, there may exist more than one feasible solution such
that their objective values are equal to the optimal values of the linear
program. In such cases, all of these feasible solutions are optimal solu-
tions, and the LP problem is said to have alternative or multiple optimal
solutions. To illustrate this, consider the following LP problem.
Z = x1 + 2x2 ,
x1 + 2x2 ≤ 10
x1 + x2 ≥ 1
x2 ≤ 4
and
x1 ≥ 0, x2 ≥ 0.
Solution. The feasible region is shown in Figure 6.4. The objective func-
tion lines are drawn for Z = 2, 6, and 10. The optimal value for the LP
problem is 10, and the corresponding objective function line x1 + 2x2 = 10
coincides with side BC of the feasible region. Thus, the corner point fea-
sible solutions x1 = 10, x2 = 0(B), and x1 = 2, x2 = 4(C), and all other
points on the line BC are optimal solutions. •
Unbounded Solution
Notice that each feasible region we have discussed is such that the whole of the straight line segment joining any two points of the region lies within the region. Such a region is called convex. A theorem states that the feasible region of an LP problem is convex (see Figure 6.5).
xi ≥ 0, i = 1, 2, . . . , M.
The characteristics of this form are:
1. All decision variables are nonnegative.
2. All the constraints are of the ≤ form.
3. The objective function is to maximize.
Note that the variables corresponding to the M identity columns are called
basic variables and the remaining variables are called nonbasic variables.
The feasible solution obtained by setting the nonbasic variables equal to
zero and using the constraint equations to solve for the basic variables is
called the basic feasible solution.
and
x1 ≥ 0, x2 ≥ 0, . . . , xN ≥ 0
b1 ≥ 0, b2 ≥ 0, . . . , bM ≥ 0. (6.11)
x≥0
b ≥ 0, (6.14)
Leather Limited manufactures two types of belts: the deluxe model and the
regular model. Each type requires 3 square yards of leather. A regular belt
requires 5 hours of skilled labor and a deluxe belt requires 4 hours. Each
week, 55 square yards of leather and 75 hours of skilled labor are available.
Each regular belt contributes $10 to profit and each deluxe belt, $15. For-
mulate the LP problem.
Solution. Let x1 be the number of deluxe belts and x2 be the regular belts
that are produced weekly. Then the appropriate LP problem takes the form:
maximize Z = 10x1 + 15x2 ,
subject to the constraints
3x1 + 3x2 ≤ 55
4x1 + 5x2 ≤ 75
x1 ≥ 0, x2 ≥ 0.
To convert the above inequality constraints to equality constraints, we de-
fine for each ≤ constraint a slack variable ui (ui = slack variable for ith
constraint), which is the amount of the resource unused in the ith con-
straint. Because 3x1 + 3x2 square yards of leather are being used, and 55 square yards are available, we define u1 by

u1 = 55 − 3x1 − 3x2 or 3x1 + 3x2 + u1 = 55.
Similarly, we define u2 by
u2 = 75 − 4x1 − 5x2 or 4x1 + 5x2 + u2 = 75.
Observe that a point (x1 , x2 ) satisfies the ith constraint, if and only if ui ≥
0. Thus, the converted LP problem
maximize Z = 10x1 + 15x2 ,
subject to the constraints
3x1 + 3x2 + u1 = 55
4x1 + 5x2 + u2 = 75
x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0
is in standard form.
or
a11 x1 + a12 x2 + · · · + a1N xN − v1 = b1 .
We do the same for the other remaining ≥ constraints; the converted
standard form of the diet problem after adding the sign restrictions vi ≥
0(i = 1, 2, . . . , M ) may be written as
minimize Z = c1 x1 + c2 x2 + · · · + cN xN ,
and
xi ≥ 0, vi ≥ 0 (i = 1, 2, . . . , M ).
A point (x1 , x2 , . . . , xN ) satisfies the ith ≥ constraint, if and only if vi is
nonnegative.
x1 ≤ 30
x2 ≤ 45
15x1 + 25x2 ≤ 70
30x1 + 35x2 ≥ 90
x1 ≥ 0, x2 ≥ 0
x1 + u1 = 30
x2 + u2 = 45
15x1 + 25x2 + u3 = 70
30x1 + 35x2 − v4 = 90
x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0, u3 ≥ 0, v4 ≥ 0.
Before proceeding further with our discussion with the simplex algo-
rithm, we must define the concept of a basic solution to a linear system
(6.13).
Any basic solution to a linear system (6.13) in which all variables are
nonnegative is a basic feasible solution. •
The simplex method deals only with basic feasible solutions in the sense
that it moves from one basic solution to another. Each basic solution is
associated with an iteration. As a result, the maximum number of itera-
tions in the simplex method cannot exceed the number of basic solutions
of the standard form. We can thus conclude that the maximum number of
iterations cannot exceed
(N choose M) = N! / ((N − M)! M!).

For example, with N = 6 variables and M = 3 constraints, there are at most (6 choose 3) = 20 basic solutions.
The basic–nonbasic swap gives rise to two suggestive concepts: The en-
tering variable is a current nonbasic variable that will “enter” the set of
basic variables at the next iteration. The leaving variable is a current basic
variable that will “leave” the basic solution in the next iteration.
For any LP problem with M constraints, two basic feasible solutions are
said to be adjacent if their sets of basic variables have M −1 basic variables
in common. In other words, an adjacent feasible solution differs from the
present basic feasible solution in exactly one basic variable. •
2. Locate the negative element in the last row, other than the last element, that is largest in magnitude (if two or more entries share this property, any one of these can be selected). If no such negative entry exists, the tableau is in final form.
4. Select the divisor that yields the smallest quotient. This element is
called the pivot element (if two or more elements share this property,
any one of these can be selected as the pivot).
6. Repeat steps 2-5 until all such negative elements have been eliminated
from the last row. The final matrix is called the final simplex tableau
and it leads to the optimal solution.
x1 + x2 + x3 ≤ 100
3x1 + 2x2 + 4x3 ≤ 200
x1 + 2x2 ≤ 150
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.
Solution. Take the three slack variables u1 , u2 , and u3 , which must be
added to the given 3 constraints to get the standard constraints, which may
be written in the LP problem as
x1 + x2 + x3 + u1 = 100
3x1 + 2x2 + 4x3 + u2 = 200
x1 + 2x2 + u3 = 150
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, u2 ≥ 0, u3 ≥ 0.
The objective function Z = 3x1 + 5x2 + 8x3 is rewritten in the form

−3x1 − 5x2 − 8x3 + Z = 0.

Thus, the entire problem now becomes that of determining the solution to the following system of equations:

x1 + x2 + x3 + u1 = 100
3x1 + 2x2 + 4x3 + u2 = 200
x1 + 2x2 + u3 = 150
−3x1 − 5x2 − 8x3 + Z = 0.
Since we know that the simplex algorithm starts with an initial basic
feasible solution, by inspection, we see that if we set nonbasic variables
x1 = x2 = x3 = 0, we can solve for the values of the basic variables
u1 , u2 , u3 . So the basic feasible solution for the basic variables is
u1 = 100, u2 = 200, u3 = 150, x1 = x2 = x3 = 0.
It is important to observe that each basic variable may be associated
with the row of the canonical form in which the basic variable has a coef-
ficient of 1. Thus, for the initial canonical form, u1 may be thought of as
the basic variable for row 1, as may u2 for row 2, and u3 for row 3. To
perform the simplex algorithm, we also need a basic (although not neces-
sarily nonnegative) variable for the last row. Since Z appears in the last
row with a coefficient of 1, and Z does not appear in any other row, we use
Z as its basic variable. With this convention, the basic feasible solution for
our initial canonical form has
basic variables {u1 , u2 , u3 , Z} and nonbasic variables {x1 , x2 , x3 }.
For this basic feasible solution
u1 = 100, u2 = 200, u3 = 150, Z = 0, x1 = x2 = x3 = 0
Note that a slack variable can be used as a basic variable for an equation
if the right-hand side of the constraint is nonnegative.
basis   x1    x2    x3    u1    u2    u3    Z   constants
u1       1     1     1     1     0     0    0      100
u2       3     2    [4]    0     1     0    0      200
u3       1     2     0     0     0     1    0      150
Z       −3    −5    −8     0     0     0    1        0

Pivoting on the x3 column (pivot element 4 in the u2 row) gives:

basis   x1    x2    x3    u1    u2    u3    Z   constants
u1      1/4   1/2    0     1   −1/4    0    0       50
x3      3/4   1/2    1     0    1/4    0    0       50
u3       1    [2]    0     0     0     1    0      150
Z        3    −1     0     0     2     0    1      400

Pivoting on the x2 column (pivot element 2 in the u3 row) gives:

basis   x1    x2    x3    u1    u2    u3    Z   constants
u1       0     0     0     1   −1/4  −1/4   0     25/2
x3      1/2    0     1     0    1/4  −1/4   0     25/2
x2      1/2    1     0     0     0    1/2   0       75
Z       7/2    0     0     0     2    1/2   1      475
Since all negative elements have been eliminated from the last row, the final tableau gives the following system of equations:

u1 − (1/4)u2 − (1/4)u3 = 25/2
(1/2)x1 + x3 + (1/4)u2 − (1/4)u3 = 25/2
(1/2)x1 + x2 + (1/2)u3 = 75
(7/2)x1 + 2u2 + (1/2)u3 + Z = 475.
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, u2 ≥ 0, u3 ≥ 0.
The final equation, under the constraints, implies that Z has a maximum
value of 475 when x1 = 0, u2 = 0, u3 = 0. On substituting these values back
into the equations, we get
x2 = 75, x3 = 25/2, u1 = 25/2.

Thus, Z = 3x1 + 5x2 + 8x3 has a maximum value of 475 at

x1 = 0, x2 = 75, x3 = 25/2.
Note that the element in the last row and last column of the final tableau always equals the maximum value of the objective function Z. •
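As a quick check (a minimal sketch, assuming the Optimization Toolbox is available; variable names are ours), the same solution is obtained with linprog by minimizing −Z:

>> Z = [3 5 8];
>> A = [1 1 1; 3 2 4; 1 2 0];
>> b = [100; 200; 150];
>> x = linprog(-Z, A, b, [ ], [ ], zeros(3, 1))

which returns x = [0; 75; 12.5], so that Z = 475.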
Z = 8x1 + 2x2 ,
4x1 + x2 ≤ 32
4x1 + 3x2 ≤ 48
x1 ≥ 0, x2 ≥ 0.
Solution. Take the two slack variables u1 and u2 , which must be added
to the given 2 constraints to get the standard constraints, which may be
written in the LP problem as
4x1 + x2 + u1 = 32
4x1 + 3x2 + u2 = 48
x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0.
The objective function Z = 8x1 + 2x2 is rewritten in the form
−8x1 − 2x2 + Z = 0.
Thus, the entire problem now becomes that of determining the solution to
the following system of equations:
4x1 + x2 + u1 = 32
4x1 + 3x2 + u2 = 48
−8x1 − 2x2 + Z = 0.
basis   x1    x2    u1    u2    Z   constants
u1      [4]    1     1     0    0       32
u2       4     3     0     1    0       48
Z       −8    −2     0     0    1        0

Pivoting on the x1 column (pivot element 4 in the u1 row) gives:

basis   x1    x2    u1    u2    Z   constants
x1       1    1/4   1/4    0    0        8
u2       0     2    −1     1    0       16
Z        0     0     2     0    1       64
Since all negative elements have been eliminated from the last row, the
final tableau gives the following system of equations:
x1 + (1/4)x2 + (1/4)u1 = 8
2x2 − u1 + u2 = 16
2u1 + Z = 64,
with the constraints

x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0.

The final equation, under the constraints, implies that Z has a maximum value of 64 when u1 = 0. Substituting u1 = 0 into the second equation gives

2x2 + u2 = 16,

x1 ≥ 0, x2 ≥ 0, u2 ≥ 0,

and taking x2 = 0 (so u2 = 16) in the first equation gives x1 = 8. Thus, Z = 8x1 + 2x2 has a maximum value of 64 at x1 = 8, x2 = 0. •
To use the simplex method, set 'LargeScale' to 'off' and 'Simplex' to 'on' in the options structure:

>> options = optimset('LargeScale', 'off', 'Simplex', 'on');

Then call the function linprog with the options input argument:

>> Z = [8 2];
>> A = [4 1; 4 3];
>> b = [32; 48];
>> lb = [0; 0]; ub = [20; 20];
>> [x, Fval, exitflag, output] = linprog(-Z, A, b, [ ], [ ], lb, ub, [ ], options);
basis   x1    x2    u1    u2    u3    Z1   constants
u1       0    [3]    1    −2     0     0       12
x1       1    −1     0     1     0     0        4
u3       0     0     0     1     1     0        9
Z1       0    −1     0     2     0     1        8
basis   x1    x2    u1     u2    u3    Z1   constants
x2       0     1    1/3   −2/3    0     0        4
x1       1     0    1/3    1/3    0     0        8
u3       0     0     0      1     1     0        9
Z1       0     0    1/3    4/3    0     1       12
u2 + u3 = 9
(1/3)u1 + (4/3)u2 + Z1 = 12,

with the constraints

x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0, u3 ≥ 0.
2x1 + x2 + u 1 = 20
x1 − x2 + u2 = 4
−x1 + x2 + u3 = 5
2x1 − x2 + Z = 0.
basis   x1    x2    u1    u2    u3    Z   constants
u1       2     1     1     0     0    0       20
u2      [1]   −1     0     1     0    0        4
u3      −1     1     0     0     1    0        5
Z        2    −1     0     0     0    1        0

basis   x1    x2    u1    u2    u3    Z   constants
u1       0    [3]    1    −2     0    0       12
x1       1    −1     0     1     0    0        4
u3       0     0     0     1     1    0        9
Z        0     1     0    −2     0    1       −8

basis   x1    x2    u1     u2    u3    Z   constants
x2       0     1    1/3   −2/3    0    0        4
x1       1     0    1/3    1/3    0    0        8
u3       0     0     0      1     1    0        9
Z        0     0   −1/3   −4/3    0    1      −12
Thus, Z = −2x1 + x2 has a minimum value of −12 at x1 = 8 and x2 = 4.•
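The result can be checked with linprog (a minimal sketch; variable names are ours):

>> f = [-2; 1];
>> A = [2 1; 1 -1; -1 1];
>> b = [20; 4; 5];
>> x = linprog(f, A, b, [ ], [ ], [0; 0])

which returns x = [8; 4], giving Z = −12.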
5x1 − x2 ≤ 30
x1 ≤ 5
x1 ≥ 0, x2 unrestricted.
Solution. Since x2 is unrestricted in sign, we replace x2 by x2' − x2'' in the objective function and in the first constraint, and we obtain

x1 ≥ 0, x2' ≥ 0, x2'' ≥ 0.
Now convert the problem into standard form by adding two slack vari-
ables, u1 and u2 , in the first and second constraints, respectively, and we
get
maximize Z = 30x1 − 4x2' + 4x2'',

subject to the constraints

5x1 − x2' + x2'' + u1 = 30
x1 + u2 = 5

x1 ≥ 0, x2' ≥ 0, x2'' ≥ 0, u1 ≥ 0, u2 ≥ 0.

The simplex tableaus are as follows:

basis   x1    x2'   x2''   u1    u2    Z   constants
u1       5    −1     1      1     0    0       30
u2      [1]    0     0      0     1    0        5
Z      −30     4    −4      0     0    1        0
Note that the variables x2' and x2'' will never both be basic variables in the same tableau. •
minimize Z = −3x1 + x2 + x3 ,
x1 − 2x2 + x3 ≤ 11
−4x1 + x2 + 2x3 ≥ 3
2x1 − x3 = −1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.
First, the problem is converted to the standard form as follows:
minimize Z = −3x1 + x2 + x3 ,
x1 − 2x2 + x3 + u1 = 11
−4x1 + x2 + 2x3 − v2 = 3
−2x1 + x3 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0.
The slack variable u1 in the first constraint equation is a basic variable.
Since there are no basic variables in the other constraint equations, we add
artificial variables, w3 and w4 , to the second and third constraint equations,
respectively. To retain the standard form, w3 and w4 will be restricted to
be nonnegative. Thus, we now have an artificial system given by:
x1 − 2x2 + x3 + u1 = 11
−4x1 + x2 + 2x3 − v2 + w3 = 3
−2x1 + x3 + w4 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0, w3 ≥ 0, w4 ≥ 0.
The artificial system has a basic feasible solution in canonical form given
by
x1 = x2 = x3 = 0, u1 = 11, v2 = 0, w3 = 3, w4 = 1.
But this is not a feasible solution to the original problem due to the presence
of the artificial variables w3 and w4 at positive values. •
On the other hand, it is easy to see that any basic feasible solution
to the artificial system in which the artificial variables (w3 and w4 in the
above example) are zero is automatically a basic feasible solution to the
original problem. Hence, the object is to reduce the artificial variables
to zero as soon as possible. This can be accomplished in two ways, and
each one gives rise to a variant of the simplex method, the Big M simplex
method and the Two-Phase simplex method.
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0.
Solution. In order to drive the artificial variables to zero, a large cost
will be assigned to w3 and w4 so that the objective function becomes:
minimize Z = −3x1 + x2 + x3 + M w3 + M w4 ,
where M is a very large positive number. Thus, the LP problem with its
artificial variables becomes:
minimize Z = −3x1 + x2 + x3 + M w3 + M w4 ,
x1 − 2x2 + x3 + u1 = 11
−4x1 + x2 + 2x3 − v2 + w3 = 3
−2x1 + x3 + w4 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0, w3 ≥ 0, w4 ≥ 0.
Note the reason behind the use of the artificial variables. We have three
equations and seven unknowns. Hence, the starting basic solution must
include 7−3 = 4 zero variables. If we put x1 , x2 , x3 , and v2 at the zero level,
we immediately obtain the solution u1 = 11, w3 = 3, and w4 = 1, which is
the required starting feasible solution. Having constructed a starting feasible
solution, we must “condition” the problem so that when we put it in tabular
form, the right-hand side column will render the starting solution directly.
This is done by using the constraint equations to substitute out w3 and w4
in the objective function. Thus,
w3 = 3 + 4x1 − x2 − 2x3 + v2
w4 = 1 + 2x1 − x3,

so that the objective function becomes

Z = (−3 + 6M)x1 + (1 − M)x2 + (1 − 3M)x3 + M v2 + 4M,
basis   x1      x2      x3      u1    v2    w3    w4    Z   constants
u1       1      −2       1       1     0     0     0    0       11
w3      −4       1       2       0    −1     1     0    0        3
w4      −2       0      [1]      0     0     0     1    0        1
Z      3−6M   −1+M    −1+3M      0    −M     0     0    1       4M

Pivoting on the x3 column (pivot element 1 in the w4 row) gives:

basis   x1      x2      x3    u1    v2     w3      w4     Z   constants
u1       3      −2       0     1     0      0      −1     0       10
w3       0      [1]      0     0    −1      1      −2     0        1
x3      −2       0       1     0     0      0       1     0        1
Z        1    −1+M       0     0    −M      0    1−3M     1      M+1

Pivoting on the x2 column (pivot element 1 in the w3 row) gives:

basis   x1    x2    x3    u1    v2     w3      w4     Z   constants
u1      [3]    0     0     1    −2      2      −5     0       12
x2       0     1     0     0    −1      1      −2     0        1
x3      −2     0     1     0     0      0       1     0        1
Z        1     0     0     0    −1    1−M    −1−M     1        2

Now both the artificial variables w3 and w4 have been reduced to zero. Thus, the preceding tableau represents a basic feasible solution to the original problem. Of course, this is not an optimal solution, since x1 can reduce the objective function further by replacing u1 in the basis. Pivoting on the x1 column (pivot element 3 in the u1 row) gives:

basis   x1    x2    x3    u1     v2      w3        w4      Z   constants
x1       1     0     0    1/3   −2/3     2/3      −5/3     0        4
x2       0     1     0     0     −1       1        −2      0        1
x3       0     0     1    2/3   −4/3     4/3      −7/3     0        9
Z        0     0     0   −1/3   −1/3   1/3−M     2/3−M     1       −2

Thus, the optimal solution is x1 = 4, x2 = 1, x3 = 9, with the minimum value Z = −2.
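This solution can also be verified with linprog (a minimal sketch; the ≥ constraint is multiplied by −1 to put it in ≤ form, and variable names are ours):

>> f = [-3; 1; 1];
>> A = [1 -2 1; 4 -1 -2];
>> b = [11; -3];
>> Aeq = [2 0 -1]; beq = -1;
>> x = linprog(f, A, b, Aeq, beq, zeros(3, 1))

which returns x = [4; 1; 9] with objective value −2.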
The following steps describe the Two-Phase simplex method. Note that
steps 1 − 3 for the Two-Phase simplex method are similar to steps 1 − 3
for the Big M simplex method.
4. For now, ignore the original LP’s objective function. Instead solve
an LP problem whose objective function is minimize W = (sum of
all the artificial variables). This is called the Phase 1 LP problem.
The act of solving the Phase 1 LP problem will force the artificial
variables to be zero.
Note that:
If the optimal value of W is equal to zero, and no artificial variables
are in the optimal Phase 1 basis, then we drop all columns in the
optimal Phase 1 tableau that correspond to the artificial variables.
We now combine the original objective function with the constraints
from the optimal Phase 1 tableau. This yields the Phase 2 LP prob-
lem. The optimal solution to the Phase 2 LP problem is the optimal
solution to the original LP problem.
If the optimal value W is greater than zero, then the original LP
problem has no feasible solution.
If the optimal value of W is equal to zero and at least one artificial
variable is in the optimal Phase 1 basis, then we can find the optimal
solution to the original LP problem if, at the end of Phase 1, we drop
from the optimal Phase 1 tableau all nonbasic artificial variables and
any variable from the original problem that has a negative coefficient
in the last row of the optimal Phase 1 tableau.
Example 6.15 To illustrate the Two-Phase simplex method, let us con-
sider again the standard form of Example 6.13:
minimize Z = −3x1 + x2 + x3 ,
subject to the constraints
x1 − 2x2 + x3 + u1 = 11
−4x1 + x2 + 2x3 − v2 = 3
−2x1 + x3 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0.
Solution.
Phase 1 Problem:
Since we need artificial variables w3 and w4 in the second and third equa-
tions, the Phase 1 problem reads as
minimize W = w3 + w4 ,
subject to the constraints
x1 − 2x2 + x3 + u1 = 11
−4x1 + x2 + 2x3 − v2 + w3 = 3
−2x1 + x3 + w4 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0, w3 ≥ 0, w4 ≥ 0.
Because w3 and w4 are in the starting solution, they must be substituted
out in the objective function as follows:
W = w3 + w4
= (3 + 4x1 − x2 − 2x3 + v2 ) + (1 + 2x1 − x3 )
= 4 + 6x1 − x2 − 3x3 + v2 ,
W − 6x1 + x2 + 3x3 − v2 = 4.
The initial basic feasible solution for the Phase 1 problem is given below:
basis   x1    x2    x3    u1    v2    w3    w4    W   constants
u1       1    −2     1     1     0     0     0    0       11
w3      −4     1     2     0    −1     1     0    0        3
w4      −2     0    [1]    0     0     0     1    0        1
W       −6     1     3     0    −1     0     0    1        4

Pivoting on the x3 column (pivot element 1 in the w4 row) gives:

basis   x1    x2    x3    u1    v2    w3    w4    W   constants
u1       3    −2     0     1     0     0    −1    0       10
w3       0    [1]    0     0    −1     1    −2    0        1
x3      −2     0     1     0     0     0     1    0        1
W        0     1     0     0    −1     0    −3    1        1

Pivoting on the x2 column (pivot element 1 in the w3 row) gives:

basis   x1    x2    x3    u1    v2    w3    w4    W   constants
u1       3     0     0     1    −2     2    −5    0       12
x2       0     1     0     0    −1     1    −2    0        1
x3      −2     0     1     0     0     0     1    0        1
W        0     0     0     0     0    −1    −1    1        0
Phase 2 Problem: The artificial variables have now served their purpose
and must be dispensed with in all subsequent computations. This means
that the equations of the optimum tableau in Phase 1 can be written as
3x1 + u1 − 2v2 = 12
x2 − v2 = 1
−2x1 + x3 = 1.
These equations are exactly equivalent to those in the standard form of the
original problem (before artificial variables are added). Thus, the original
problem can be written as
minimize Z = −3x1 + x2 + x3 ,
3x1 + u1 − 2v2 = 12
x2 − v2 = 1
−2x1 + x3 = 1
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0, v2 ≥ 0.
As we can see, the principal contribution of the Phase 1 computations is to
provide a ready starting solution to the original problem. Since the prob-
lem has three equations and five variables, by setting 5 − 3 = 2 variables to zero,
namely x1 = v2 = 0, we immediately obtain the starting basic feasible so-
lution u1 = 12, x2 = 1, and x3 = 1.
Z = −3x1 + x2 + x3
  = −3(4 − (1/3)u1 + (2/3)v2) + (1 + v2) + (1 + 2(4 − (1/3)u1 + (2/3)v2))
  = −2 + (1/3)u1 + (1/3)v2.
basis   x1    x2    x3    u1     v2    Z   constants
u1      [3]    0     0     1     −2    0       12
x2       0     1     0     0     −1    0        1
x3      −2     0     1     0      0    0        1
Z        0     0     0   −1/3   −1/3   1       −2

Pivoting on the x1 column (pivot element 3 in the u1 row) gives:

basis   x1    x2    x3    u1     v2    Z   constants
x1       1     0     0    1/3   −2/3   0        4
x2       0     1     0     0     −1    0        1
x3       0     0     1    2/3   −4/3   0        9
Z        0     0     0   −1/3   −1/3   1       −2

which again gives the optimal solution x1 = 4, x2 = 1, x3 = 9, with Z = −2.
Comparing the Big M simplex method and the Two-Phase simplex method,
we observe the following:
• The basic approach to both methods is the same. Both add the
artificial variables to get the initial canonical system and then derive
them to zero as soon as possible.
• The Big M simplex method solves the linear problem in one pass,
while the Two-Phase simplex method solves it in two stages as two
linear programs.
6.11 Duality
From both the theoretical and practical points of view, the theory of duality
is one of the most important and interesting concepts in linear program-
ming. Each LP problem has a related LP problem called the dual problem.
The original LP problem is called the primal problem. For the primal prob-
lem defined by (6.1)–(6.3) above, the corresponding dual problem is to find
the values of the M variables y1 , y2 , . . . , yM to solve the following:
minimize V = b1 y1 + b2 y2 + · · · + bM yM,   (6.17)

a11 y1 + a21 y2 + · · · + aM1 yM ≥ c1
a12 y1 + a22 y2 + · · · + aM2 yM ≥ c2
   ⋮                                  (6.18)
a1N y1 + a2N y2 + · · · + aMN yM ≥ cN

and

y1 ≥ 0, y2 ≥ 0, . . . , yM ≥ 0.   (6.19)
In matrix notation, the primal and the dual problems are formulated
as
Primal:  Maximize Z = c^T x, subject to the constraints Ax ≤ b, x ≥ 0.
Dual:    Minimize V = b^T y, subject to the constraints A^T y ≥ c, y ≥ 0,
where

A = [a11 a12 · · · a1N; a21 a22 · · · a2N; . . . ; aM1 aM2 · · · aMN]

and

b = [b1; b2; . . . ; bM],  c = [c1; c2; . . . ; cN],  x = [x1; x2; . . . ; xN],  y = [y1; y2; . . . ; yM].
The concept of a dual can be introduced with the help of the following
LP problem.
Primal Problem:

maximize Z = x1 + 2x2 − 3x3 + 4x4,

subject to the constraints

x1 + 2x2 + 2x3 − 3x4 ≤ 25
2x1 + 2x2 − 3x3 + 2x4 ≤ 15

x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0.
The above linear problem has two constraints and four variables. The dual
of this primal problem is written as:
Dual Problem:
minimize V = 25y1 + 15y2 ,
subject to the constraints
y1 + 2y2 ≥ 1
2y1 + 2y2 ≥ 2
2y1 − 3y2 ≥ −3
−3y1 + 2y2 ≥ 4
y1 ≥ 0, y2 ≥ 0,
where y1 and y2 are called the dual variables. •
In both the primal and the dual problems, the variables are nonnegative
and the constraints are inequalities. Such problems are called symmetric
dual linear programs.
The general rules for writing the dual of a linear program in symmetric
form are summarized below:
2. Make the cost vector of the primal the right-hand side constants of
the dual.
3. Make the right-hand side vector of the primal the cost vector of the
dual.
Example 6.17 Write the following linear problem in symmetric form and
then find its dual:
x1 + x2 + x3 ≤ 300
x4 + x5 + x6 ≤ 600
x1 + x4 ≥ 200
x2 + x5 ≥ 300
x3 + x6 ≥ 400
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0, x5 ≥ 0, x6 ≥ 0.
Solution. For the above linear program (minimization) to be in symmetric
form, all the constraints must be in “greater than or equal to” form. Hence,
we multiply the first two constraints by −1, then we have the primal problem
as
minimize Z = 2x1 + 4x2 + 3x3 + 5x4 + 3x5 + 4x6 ,
subject to the constraints
−x1 − x2 − x3 ≥ −300
− x4 − x5 − x6 ≥ −600
x1 + x4 ≥ 200
x2 + x5 ≥ 300
x3 + x6 ≥ 400
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0, x5 ≥ 0, x6 ≥ 0.

The dual of the above primal problem becomes

maximize V = −300y1 − 600y2 + 200y3 + 300y4 + 400y5,

subject to the constraints
−y1 + y3 ≤ 2
−y1 + y4 ≤ 4
−y1 + y5 ≤ 3
− y2 + y3 ≤ 5
− y2 + y4 ≤ 3
− y2 + y5 ≤ 4
y1 ≥ 0, y2 ≥ 0, y3 ≥ 0, y4 ≥ 0, y5 ≥ 0.
•
Example 6.18 Write the standard form of the primal-dual problem of the
following linear problem:

maximize Z = 5x1 + 12x2 + 4x3,

subject to the constraints

x1 + 2x2 + x3 ≤ 10
2x1 − x2 + x3 = 8
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.
Solution. The given primal can be put in the standard primal form as

maximize Z = 5x1 + 12x2 + 4x3,

subject to the constraints

x1 + 2x2 + x3 + u1 = 10
2x1 − x2 + x3 = 8
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, u1 ≥ 0.
Notice that u1 is a slack in the first constraint. Now its dual form can
be written as
minimize V = 10y1 + 8y2 ,
subject to the constraints
y1 + 2y2 ≥ 5
2y1 − y2 ≥ 12
y1 + y2 ≥ 4
y1 ≥ 0, y2 unrestricted.
•
Example 6.19 Write the standard form of the primal-dual problem of the
following linear problem:
minimize Z = 5x1 − 2x2,

subject to the constraints

−x1 + x2 ≥ −3
2x1 + 3x2 ≤ 5

x1 ≥ 0, x2 ≥ 0.
Solution. The given primal can be put in the standard primal form as

minimize Z = 5x1 − 2x2,

subject to the constraints

x1 − x2 + u1 = 3
2x1 + 3x2 + u2 = 5

x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, u2 ≥ 0.
Notice that u1 and u2 are the slack variables in the first and second constraints. The dual form is
maximize V = 3y1 + 5y2 ,
subject to the constraints
y1 + 2y2 ≤ 5
−y1 + 3y2 ≤ −2
y1 ≤ 0
y2 ≤ 0
y1 , y2 unrestricted.
•
If the primal problem has an optimal solution, then the dual problem also
has an optimal solution, and the optimal values of their objective functions
are equal, i.e.,
Maximize Z = Minimize V.
•
basis    x1     x2    x3    u1     u2    Z   constants
u1     [5/3]   2/3     0     1    −1/3   0     50/3
x3      1/3    1/3     1     0     1/3   0     40/3
Z       −7     −4      0     0      5    1      200

Pivoting on the x1 column (pivot element 5/3 in the u1 row) gives:

basis    x1     x2    x3    u1     u2    Z   constants
x1        1   [2/5]    0    3/5   −1/5   0       10
x3        0    1/5     1   −1/5    2/5   0       10
Z         0   −6/5     0   21/5   18/5   1      270

Pivoting on the x2 column (pivot element 2/5 in the x1 row) gives:

basis    x1    x2    x3    u1     u2    Z   constants
x2       5/2    1     0    3/2   −1/2   0       25
x3      −1/2    0     1   −1/2    1/2   0        5
Z         3     0     0     6      3    1      300

The optimal solution to the primal problem is therefore

x1 = 0, x2 = 25, x3 = 5,  with Z = 300.
The optimal solution to the dual problem is found in the objective row
under the slack variables u1 and u2 columns as
y1 = 6 and y2 = 3.
Primal:  Maximize Z = c^T x, subject to the constraints Ax ≤ b, x ≥ 0.
Dual:    Minimize V = b^T y, subject to the constraints A^T y ≥ c, y ≥ 0.
The value of the objective function of the minimization problem (dual) for
any feasible solution is always greater than or equal to that of the maxi-
mization problem (primal). •
Example 6.21 Consider the following LP problem:
Primal:
Dual:
y1 ≥ 0, y2 ≥ 0.
The feasible solution for the primal is x1 = x2 = x3 = x4 = 1, and y1 = y2 = 1 is feasible for the dual. The value of the primal objective is

Z = c^T x = 10,

and the value of the dual objective is

V = b^T y = 40.
Note that
cT x < bT y,
which satisfies the Weak Duality Theorem 6.4. •
We will discuss here only changes in the right-hand side constants bi , which
are the most common in sensitivity analysis.
Example 6.22 A small towel company makes two types of towels, stan-
dard and deluxe. Both types have to go through two processing departments,
cutting and sewing. Each standard towel needs 1 minute in the cutting de-
partment and 3 minutes in the sewing department. The total available time
in cutting is 160 minutes for a production run. Each deluxe towel needs
2 minutes in the cutting department and 2 minutes in the sewing depart-
ment. The total available time in sewing is 240 minutes for a production
run. The profit on each standard towel is $1.00, whereas the profit on each
deluxe towel is $1.50. Determine the number of towels of each type to pro-
duce to maximize profit.
Solution. Let x1 be the number of standard towels and x2 the number of deluxe towels produced. The LP problem is then:

maximize Z = x1 + 1.5x2,

subject to the constraints

x1 + 2x2 ≤ 160   (cutting)
3x1 + 2x2 ≤ 240   (sewing)

x1 ≥ 0, x2 ≥ 0.

Solving by the simplex method gives the final tableau:

basis   x1    x2    u1     u2    Z   constants
x2       0     1    3/4   −1/4   0       60
x1       1     0   −1/2    1/2   0       40
Z        0     0    5/8    1/8   1      130

Thus, the maximum profit is $130, attained with x1 = 40 standard towels and x2 = 60 deluxe towels.
Suppose now that one more minute of cutting time is available (161 minutes instead of 160). The revised problem is:

maximize Z = x1 + 1.5x2,

subject to the constraints

x1 + 2x2 ≤ 161
3x1 + 2x2 ≤ 240

x1 ≥ 0, x2 ≥ 0.
Of course, we can again solve this revised problem using the simplex method.
However, since the modification is not drastic, we would wonder whether
there is an easy way to utilize the final tableau for the original problem in-
stead of going through all the iteration steps for the revised problem. There
is a way, and this way is the key idea of the sensitivity analysis.
1. Since the slack variable for the cutting department is u1 , then use the
u1 -column.
basis   x1    x2    u1     u2    Z   constants
x2       0     1    3/4   −1/4   0    60 + 1 × (3/4)
x1       1     0   −1/2    1/2   0    40 + 1 × (−1/2)
Z        0     0    5/8    1/8   1    130 + 1 × (5/8)
(where, in the last column, the first number is the original entry, the second is the one-unit (minute) increase, and the third is the corresponding u1-column entry), i.e.,
basis   x1    x2    u1     u2    Z   constants
x2       0     1    3/4   −1/4   0      60 3/4
x1       1     0   −1/2    1/2   0      39 1/2
Z        0     0    5/8    1/8   1     130 5/8
x1       1     0   −1/2    1/2   0    40 + (−8) × (1/2) = 36
Z        0     0    5/8    1/8   1    130 + (−8) × (1/8) = 129
The bottom-row entry, 5/8, represents the net profit increase for a one unit (minute) increase of the available time in the cutting department. It is called the shadow price of the cutting department. Similarly, the other bottom-row entry, 1/8, is called the shadow price of the sewing department.
A negative entry in the bottom row represents the net profit increase when one unit of the variable in that column is introduced. For example, if a negative entry in the x1 column is −1/4, then introducing one unit of x1 will result in $(1/4) = 25 cents net profit gain. Therefore, the bottom-row entry, 5/8, in the preceding tableau indicates a net profit loss of $(5/8) when one unit of u1 is introduced, keeping the constraint, 160, the same.
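In fact, these shadow prices are exactly the optimal dual variables of the towel problem. A minimal linprog sketch (variable names are ours) solving the primal and its dual illustrates this:

>> x = linprog([-1; -1.5], [1 2; 3 2], [160; 240], [ ], [ ], [0; 0])
>> y = linprog([160; 240], -[1 3; 2 2], -[1; 1.5], [ ], [ ], [0; 0])

The primal returns x = [40; 60] with Z = 130, and the dual returns y = [0.625; 0.125] = [5/8; 1/8], again with objective value 130.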
6.13 Summary
In this chapter we gave a brief introduction to linear programming. Prob-
lems were described by systems of linear inequalities. One can see that
small systems can be solved in a graphical manner, but that large sys-
tems are solved using row operations on matrices by means of the simplex
method. For finding the basic feasible solution to artificial systems, we
discussed the Big M simplex method and the Two-Phase simplex method.
6.14 Problems
1. The Oakwood Furniture Company has 12.5 units of wood on hand
from which to manufacture tables and chairs. Making a table uses
two units of wood and making a chair uses one unit. Oakwood’s
distributor will pay $20 for each table and $15 for each chair, but
they will not accept more than eight chairs, and they want at least
twice as many chairs as tables. How many tables and chairs should
the company produce to maximize its revenue? Formulate this as a
linear programming problem.
Food                     1     2     3     4     5
Vitamin A content        1     0     1     1     2
Vitamin B content        0     1     2     1     1
Cost per unit (cents)   20    20    31    11    12
Formulate this as a linear programming problem.
4. Consider a problem of scheduling the weekly production of a certain
item for the next 4 weeks. The production cost of the item is $10
for the first two weeks, and $15 for the last two weeks. The weekly
demands are 300, 700, 800, and 900 units, which must be met. The
plant can produce a maximum of 700 units each week. In addition,
the company can employ overtime during the second and third weeks.
This increases weekly production by an additional 200 units, but the
cost of production increases by $5 per unit. Excess production can be
stored at a cost of $3 an item per week. How should the production
be scheduled to minimize the total cost? Formulate this as a linear
programming problem.
5. An oil refinery can blend three grades of crude oil to produce regular
and super gasoline. Two possible blending processes are available.
For each production run the older process uses 5 units of crude A,
7 units of crude B, and 2 units of crude C to produce 9 units of
regular and 7 units of super gasoline. The newer process uses 3
units of crude A, 9 units of crude B, and 4 units of crude C to
produce 5 units of regular and 9 units of super gasoline for each
production run. Because of prior contract commitments, the refinery
must produce at least 500 units of regular gasoline and at least 300
units of super for the next month. It has available 1500 units of crude
A, 1900 units of crude B, and 1000 units of crude C. For each unit of
regular gasoline produced the refinery receives $6, and for each unit
of super it receives $9. Determine how to use the resources of crude
oil and the two blending processes to meet the contract commitments
and, at the same time, maximize revenue. Formulate this as a linear
programming problem.
6. A tailor has 80 square yards of cotton material and 120 square yards
of woolen material. A suit requires 2 square yards of cotton and 1
a pair of slacks is $3 and the net profit on a skirt is $4. How many
skirts and how many slacks should be made to maximize profit?
12. The Apple Company has a contract with the government to supply
1200 microcomputers this year and 2500 next year. The company has
the production capacity to make 1400 microcomputers each year, and
it has already committed its production line for this level. Labor and
management have agreed that the production line can be used for at
most 80 overtime shifts each year, each shift costing the company an
additional $20, 000. In each overtime shift, 50 microcomputers can
be manufactured this year and used to meet next year’s demand, but
must be stored at a cost of $100 per unit. How should the production
be scheduled to minimize cost?
13. Solve each of the following linear programming problems using the
graphical method:
x1 ≥ 0, x2 ≥ 0
15. Solve each of the following linear programming problems using the
graphical method:
(d) Maximize: Z = x1 − x2
Subject to: x1 + x2 ≤ 6
x1 − x2 ≥ 0
−x1 − x2 ≥ 3
x1 ≥ 0, x2 ≥ 0
16. Solve each of the following linear programming problems using the
graphical method:
3x1 + 11x2 ≤ 33
3x1 + 4x2 ≥ 24
x1 ≥ 0, x2 ≥ 0
17. Solve each of the following linear programming problems using the
simplex method:
x1 ≥ 0, x2 ≥ 0
19. Solve each of the following linear programming problems using the
simplex method:
x1 + x2 + x3 ≥ 120
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0
20. Solve each of the following linear programming problems using the
simplex method:
21. Use Big M simplex method to solve each of the following linear pro-
gramming problems:
22. Use the Two-Phase simplex method to solve each of the following
linear programming problems:
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0
23. Write the duals of each of the following linear programming problems:
24. Write the duals of each of the following linear programming problems:
x1 ≥ 0, x3 ≥ 0, x2 ≤ 0
(d) Maximize: Z = x1 + x2
Subject to: 2x1 + x2 = 5
3x1 − x2 = 6
x1 ≥ 0, x2 ≥ 0
(f) Maximize: Z = x1 + 2x2 + 3x3
Subject to: 2x1 + x2 + x3 = 5
x1 + 3x2 + 4x3 = 3
x1 ≥ 0, x2 ≥ 0
Chapter 7
Nonlinear Programming
7.1 Introduction
In the previous chapter, we studied linear programming problems in some
detail. For such cases, our goal was to maximize or minimize a linear func-
tion subject to linear constraints. But in many interesting maximization
and minimization problems, the objective function may not be a linear
function, or some of the constraints may not be linear constraints. Such
an optimization problem is called a Nonlinear Programming (NLP) prob-
lem.
The equation

lim_{x→a} f(x) = L

means that as x gets closer to the real number a, the value of f(x) gets arbitrarily close to L. Note that the limit of a function may or may not exist. •
lim_{x→3} (2x^3 − 6x^2 + x − 3)/(x − 3).

Solution. The domain of the given function

f(x) = (2x^3 − 6x^2 + x − 3)/(x − 3)

is all the real numbers except the number 3. To find the limit, we change the form of f(x) by factoring:

lim_{x→3} (2x^2 + 1)(x − 3)/(x − 3).

When we investigate the limit as x → 3, we assume that x ≠ 3. Hence, x − 3 ≠ 0, and we can cancel the factor x − 3. Thus,

lim_{x→3} f(x) = lim_{x→3} (2x^2 + 1) = 19.
To find the limit of f(x), we have to find the one-sided limits, i.e., the right-hand limit

lim_{x→3+} f(x),
does not exist because f (x) is not defined throughout an open interval con-
taining 3. •
(iii) lim_{x→4} (16 − x^2)/(4 − x) = 8 = f(4). •
f′(x) = lim_{h→0} [f(x + h) − f(x)]/h,  provided the limit exists.

The process of finding derivatives is called differentiation. •
If the limit does not exist, then the function is not differentiable. Remember that the derivative of f(x) at x = a, i.e., f′(a), is called the slope of f(x) at x = a. If f′(a) > 0, then f(x) is increasing at x = a, whereas if f′(a) < 0, then f(x) is decreasing at x = a.
f(x) = c                 f′(x) = 0
f(x) = x                 f′(x) = 1
f(x) = ax + b            f′(x) = a
f(x) = x^n               f′(x) = n x^(n−1)
f(x) = e^(bx)            f′(x) = b e^(bx)
f(x) = a^x               f′(x) = a^x ln a
f(x) = ln x              f′(x) = 1/x
f(x) = [g(x)]^n          f′(x) = n [g(x)]^(n−1) g′(x)
f(x) = f1(x) + f2(x)     f′(x) = f1′(x) + f2′(x)
f(x) = f1(x) f2(x)       f′(x) = f1′(x) f2(x) + f1(x) f2′(x)
f(x) = f1(x)/f2(x)       f′(x) = [f2(x) f1′(x) − f1(x) f2′(x)] / [f2(x)]^2
Higher Derivatives
Sometimes we have to find the derivatives of derivatives. For this we can
take sequential derivatives to form second derivatives, third derivatives,
and so on. As we have seen, if we differentiate a function f , we obtain
Similarly, we can find the second and the third derivatives of the function
as follows:
To plot the above function f (x) and its first three derivatives f 0 (x),
f 00 (x), f 000 (x), we use the following MATLAB commands:
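For instance, taking f(x) = x^3 − 2x^2 + x + 1 from Example 7.5 as an illustrative choice (the Symbolic Math Toolbox is assumed):

>> syms x
>> f = x^3 - 2*x^2 + x + 1;
>> f1 = diff(f, x); f2 = diff(f, x, 2); f3 = diff(f, x, 3);
>> ezplot(f), hold on
>> ezplot(f1), ezplot(f2), ezplot(f3), hold off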
Let f(x) be continuous on the open interval (a, b) and let f′(x) exist and be continuous on (a, b). If f′(x) > 0 in (a, c) and f′(x) < 0 in (c, b), then f(x) is concave downward at x = c. On the other hand, if f′(x) < 0 in (a, c) and f′(x) > 0 in (c, b), then f(x) is concave upward at x = c. The type of concavity is related to the sign of the second derivative, and so we have the second derivative test to determine if a critical point is a local extremum or not.
Example 7.5 Find the local extrema and inflection points of the function

f(x) = x^3 − 2x^2 + x + 1.

Solution. The first two derivatives are

f′(x) = 3x^2 − 4x + 1,
f″(x) = 6x − 4.

The critical points, where f′(x) = (3x − 1)(x − 1) = 0, are x = 1/3 and x = 1. The fact that f″(1/3) = −2 < 0 and f″(1) = 2 > 0 tells us that the critical point x = 1/3 is a local maximum (f(1/3) = 31/27) and x = 1 is a local minimum (f(1) = 1) of f(x). The inflection point is given by f″(x) = 0, i.e., at x = 2/3, giving the point (2/3, 29/27). •
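These values can be checked symbolically (a minimal sketch):

>> syms x
>> f = x^3 - 2*x^2 + x + 1;
>> cp = solve(diff(f, x))
>> subs(diff(f, x, 2), x, cp)
>> subs(f, x, cp)

which returns the critical points 1/3 and 1, the second derivative values −2 and 2, and the function values 31/27 and 1.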
Example 7.6 Find the maximum and minimum values of the function
f (x) = x3 − 6x2 − 15x + 1
on the closed interval [−2, 6].
Note that the extrema of a function on the closed interval [a, b] is also
called the absolute extrema of a function. •
x = fminbnd('function', a, b)

Note that the function can be entered as a string, as the name of a function file, or as the name of an inline function. The value of the function at the minimum can be added to the output by using the following command:

[x, fval] = fminbnd('function', a, b)
Also, the fminbnd command can be used to find the maximum of a func-
tion, which can be done by multiplying the function by −1 and then finding
the minimum. For example:
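(Here we use a hypothetical function, f(x) = 9 − (x + 1)^2, chosen so that its maximum is 9 at x = −1.)

>> [x, fval] = fminbnd(@(x) -(9 - (x + 1).^2), -4, 2)
x =
    -1
fval =
    -9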
Note that the maximum of the function is at x = −1, and the value of the
function at this point is 9.
fx1(x1, x2) = lim_{h→0} [f(x1 + h, x2) − f(x1, x2)] / h

fx2(x1, x2) = lim_{h→0} [f(x1, x2 + h) − f(x1, x2)] / h.

In this definition, x1 and x2 are fixed (but arbitrary) and h is the only variable; hence, we use the notation for limits of functions of one variable instead of the (x1, x2) → (a, b) notation introduced previously. We
usually write

fx1 = ∂f/∂x1,   fx2 = ∂f/∂x2

and

fx1(x1, x2) = ∂/∂x1 f(x1, x2) = ∂z/∂x1 = zx1,
fx2(x1, x2) = ∂/∂x2 f(x1, x2) = ∂z/∂x2 = zx2.
fx1x2 = fx2x1

throughout R. •
Solution. The first partial derivatives of the given function are as follows:

fx1 = ∂f/∂x1 = 2x1 + x1/√(x1^2 + x2^2 + 4)

fx2 = ∂f/∂x2 = x2/√(x1^2 + x2^2 + 4).

The mixed second partial derivative is

fx1x2(x1, x2) = ∂fx1/∂x2 = −x1 x2/(x1^2 + x2^2 + 4)^(3/2),

so that

fx1x2(1, 2) = −(1)(2)/(1^2 + 2^2 + 4)^(3/2) = −0.0741. •
To plot the above function f (x1 , x2 ) and its partial derivatives fx1 , fx2 ,
fx1 x2 , we use the following MATLAB commands
>> syms x1 x2;
>> f = x1^2 + sqrt(x1^2 + x2^2 + 4);
>> fx1 = diff(f, x1); fx2 = diff(f, x2);
>> fx1x2 = diff(fx1, x2);
>> simplify(fx1x2);
>> subs(fx1x2, {x1, x2}, {1, 2});
>> ezmesh(f);
>> ezmesh(fx1);
>> ezmesh(fx1x2);
Solution. The first partial derivatives of the given function are as follows:

fx1 = ∂f/∂x1 = 9x1^2 + 8x1 x2

fx2 = ∂f/∂x2 = 4x1^2 + 6x2^2 − 2x3^2

fx3 = ∂f/∂x3 = −4x2 x3 + 12x3^2.
D = D(x10, x20) = fx1x1(x10, x20) fx2x2(x10, x20) − [fx1x2(x10, x20)]^2.

Then:

(i) if D > 0 and fx1x1(x10, x20) < 0, f(x10, x20) is a local maximum value;

(ii) if D > 0 and fx1x1(x10, x20) > 0, f(x10, x20) is a local minimum value;

(iii) if D < 0, f(x10, x20) is not an extreme value ((x10, x20) is a saddle point);
Solution. The first derivatives of the function with respect to x1 and x2 are

fx1 = 12x1^2 - 12 and fx2 = 4x2 + 8,

so the critical points, obtained by solving the simultaneous equations fx1(x1, x2) = fx2(x1, x2) = 0, are (1, -2) and (-1, -2).
To find the critical points for the given function f (x1 , x2 ) using MAT-
LAB commands we do the following:
>> syms x1 x2;
>> f = 4*x1^3 + 2*x2^2 - 12*x1 + 8*x2 + 2;
>> fx1 = diff(f, x1);
>> fx2 = diff(f, x2);
>> [x1c, x2c] = solve(fx1, fx2)
x1c =
[ 1]
[-1]
x2c =
[-2]
[-2]
Similarly, the second partial derivatives of the function are

fx1x1 = 24x1,   fx2x2 = 4,   fx1x2 = 0.

Testing at the critical point (1, -2) gives

D = fx1x1(1, -2) fx2x2(1, -2) - [fx1x2(1, -2)]^2 = 24(4) - 0 = 96 > 0.

Furthermore, fx1x1(1, -2) = 24 > 0 and so, by (ii), f(1, -2) = -14 is a local minimum value of the given function f(x1, x2).
Now testing the given function at the other critical point, (-1, -2), we find that

D = fx1x1(-1, -2) fx2x2(-1, -2) - [fx1x2(-1, -2)]^2 = (-24)(4) - 0 = -96 < 0.

Thus, by (iii), (-1, -2) is a saddle point and f(-1, -2) is not an extremum. •
>> syms x1 x2
>> f = 4*x1^3 + 2*x2^2 - 12*x1 + 8*x2 + 2;
>> fx1 = diff(f, x1);
>> fx1x1 = diff(fx1, x1);
>> fx2 = diff(f, x2);
>> fx2x2 = diff(fx2, x2);
>> fx1x2 = diff(fx1, x2);
>> D = fx1x1*fx2x2 - fx1x2^2;
>> [ax1, ax2] = solve(fx1, fx2);
>> T = [ax1 ax2 subs(D, {x1, x2}, {ax1, ax2}) subs(fx1x1, {x1, x2}, {ax1, ax2})];
>> double(T)
ans =
     1    -2    96    24
    -1    -2   -96   -24
>> ezplot(f);
u = (u1, u2)) is

Du f(x1, x2) = u1 ∂f/∂x1 + u2 ∂f/∂x2. •
we can easily compute the first partial derivatives of the given function as

fx1 = ∂f/∂x1 = 2x1x2 and fx2 = ∂f/∂x2 = x1^2 + 6x2.

Thus, at the point (1, 2) we have fx1(1, 2) = 4 and fx2(1, 2) = 13, and for the given direction vector u = (3, -2), using Theorem 7.5, we have

Du f(1, 2) = 4(3) + 13(-2) = 12 - 26 = -14,

which is the required solution. •
>> syms x1 x2;
>> u1 = 3; u2 = -2;
>> f = x1^2*x2 + 3*x2^2;
>> fx1 = diff(f, x1); fx2 = diff(f, x2);
>> duf = subs(fx1, {x1, x2}, {1, 2})*u1 + subs(fx2, {x1, x2}, {1, 2})*u2
One can easily compute the directional derivatives by using the following
theorem.
∇f(x1, x2, x3) = (fx1, fx2, fx3) = (cos(x2x3), -x1x3 sin(x2x3), -x1x2 sin(x2x3)).

>> syms x1 x2 x3
>> f = x1*cos(x2*x3);
>> gradf = jacobian(f, [x1, x2, x3])
gradf =
[ cos(x2*x3), -x1*sin(x2*x3)*x3, -x1*sin(x2*x3)*x2]
∇f(x̄) = 0.

then

∇f(x1, x2) = [2x1, 2(x2 - 1)]^T,

and at the point x̄ = [0, 1/2]^T the gradient vector of the function is

∇f(0, 1/2) = [0, -1]^T.

Since this vector is not zero, x̄ = [0, 1/2]^T is not a stationary point; the only stationary point of the function is [0, 1]^T, where the gradient vanishes. •
This matrix is formally referred to as the Hessian of f. Note that the Hessian matrix is square and symmetric.

∂f/∂x1 = 2x1 + 3x2,   ∂f/∂x2 = 4(x2 - 1) + 3x1,
∂²f/∂x1² = 2,   ∂²f/∂x2² = 4,

∂²f/∂x1∂x2 = 3 = ∂²f/∂x2∂x1.

Thus,

H(x1, x2) = [ 2  3 ]
            [ 3  4 ]

is the Hessian matrix of the given function. •
>> syms x1 x2
>> fun = x1^2 + 2*(x2 - 1)^2 + 3*x1*x2 + 4;
>> T = Hessian(fun, [x1, x2])
>> double(T)
ans =
     2     3
     3     4
Program 7.1
MATLAB m-file for the Hessian Matrix
function H = Hessian(fun,Vars)
% Symbolic Hessian of fun with respect to the variables in Vars
n = numel(Vars);
Hess = vpa(ones(n,n));
for j = 1:n
  for i = 1:n
    % second partial derivative with respect to Vars(i) and Vars(j)
    Hess(j,i) = diff(diff(fun,Vars(i),1),Vars(j),1);
  end
end
H = Hess;
For the n-dimensional case, the Hessian matrix H(x) is defined as follows:

H(x) = [ ∂²f(x)/∂x1²     ∂²f(x)/∂x1∂x2   · · ·  ∂²f(x)/∂x1∂xn ]
       [ ∂²f(x)/∂x2∂x1   ∂²f(x)/∂x2²     · · ·  ∂²f(x)/∂x2∂xn ]
       [      ...             ...        · · ·       ...      ]
       [ ∂²f(x)/∂xn∂x1   ∂²f(x)/∂xn∂x2   · · ·  ∂²f(x)/∂xn²   ].
The relationships between the definiteness of the Hessian matrix and the classification of stationary points are discussed in the following two theorems. Just as the second derivative test for a function of one variable gives no information when the second derivative is zero, if H(x̄) is indefinite, or if H(x̄) is positive-semidefinite at x̄ but not at all points in a neighborhood of x̄, then the function might have a maximum, or a minimum, or neither at x̄.
which gives

z^T Hz = [z1 z2] [ 4z1 ] = 4z1² + 4z2².
                 [ 4z2 ]

Note that

z^T Hz = 4(z1² + z2²) > 0,

for z ≠ 0, so the Hessian matrix is positive-definite and the stationary point x̄ = [0, 0]^T is a strict local minimizing point. •
which gives

x̄ = [1, 0]^T.

The Hessian matrix for the given function is

H(x1, x2) = [ 2   0 ]
            [ 0  -1 ],

and it gives

z^T Hz = [z1 z2] [ 2z1 ] = 2z1² - z2².
                 [ -z2 ]

The sign of z^T Hz clearly depends on the particular values taken on by z1 and z2, so the Hessian matrix is indefinite and the stationary point x̄ cannot be classified on the basis of this test. •
f(x) ≈ T2(x) = f(x0) + (x - x0)f'(x0) + [(x - x0)²/2!] f''(x0). (7.2)
Example 7.16 Find the cubic approximation of the function f(x) = e^x cos x expanded about x0 = 0.

>> syms x
>> f = inline('exp(x)*cos(x)');
>> taylor(f(x), 4, 0)
ans = 1 + x - 1/3*x^3
Now consider a function f(x, y) of two variables whose partial derivatives are all continuous. Then we can approximate this function about a given point (x0, y0) using Taylor's series as follows:

f(x, y) = f(x0, y0) + (x - x0) ∂f(x0, y0)/∂x + (y - y0) ∂f(x0, y0)/∂y
        + (1/2) [ (x - x0)² ∂²f(x0, y0)/∂x² + 2(x - x0)(y - y0) ∂²f(x0, y0)/∂x∂y
        + (y - y0)² ∂²f(x0, y0)/∂y² ] + · · · . (7.4)

Writing the above expression in more compact form by using matrix notation gives

f(x, y) = f(x0, y0) + [ ∂f(x0, y0)/∂x  ∂f(x0, y0)/∂y ] [ x - x0 ]
                                                       [ y - y0 ]

        + (1/2) [x - x0  y - y0] [ ∂²f(x0, y0)/∂x²   ∂²f(x0, y0)/∂x∂y ] [ x - x0 ]
                                 [ ∂²f(x0, y0)/∂x∂y  ∂²f(x0, y0)/∂y²  ] [ y - y0 ] + · · · .
and

∇²f(x) = [ ∂²f(x)/∂x1²     ∂²f(x)/∂x1∂x2   · · ·  ∂²f(x)/∂x1∂xn ]
         [ ∂²f(x)/∂x2∂x1   ∂²f(x)/∂x2²     · · ·  ∂²f(x)/∂x2∂xn ]
         [      ...             ...        · · ·       ...      ]
         [ ∂²f(x)/∂xn∂x1   ∂²f(x)/∂xn∂x2   · · ·  ∂²f(x)/∂xn²   ].

For a twice continuously differentiable function, the mixed second partial derivatives are symmetric, i.e.,

∂²f(x)/∂xi∂xj = ∂²f(x)/∂xj∂xi,  i, j = 1, 2, . . . , n,

which implies that the Hessian matrix is always symmetric. Thus, Taylor's series approximation for a function of several variables about a given point x* can be written as

f(x) = f(x*) + ∇f(x*)^T Δx + (1/2) Δx^T ∇²f(x*) Δx + · · · .
Example 7.17 Find the linear and quadratic approximations of the function

f(x1, x2) = x1^2 + 5x1x2^2 + 3x2^2 + 4x1^3x2^3

at the given point (a, b) = (1, 1) using Taylor's series formulas.

Solution. The first partial derivatives of the function are

∂f/∂x1 = 2x1 + 5x2^2 + 12x1^2x2^3

and

∂f/∂x2 = 10x1x2 + 6x2 + 12x1^3x2^2.
and

Δx = [ x1 - a ]
     [ x2 - b ].

At the given point, f(a, b) = f(1, 1) = 13 and ∇f(1, 1) = [19, 28]^T, so the linear approximation is

f(x1, x2) ≈ T1(x1, x2) = 13 + 19(x1 - 1) + 28(x2 - 1).

Evaluating the second partial derivatives at (1, 1) gives

∇²f(1, 1) = [ 26  46 ]
            [ 46  40 ].

So using the quadratic approximation formula

f(x1, x2) ≈ T2(x1, x2) = T1(x1, x2) + (1/2) Δx^T ∇²f(a, b) Δx,

we get

f(x1, x2) ≈ T2(x1, x2) = T1(x1, x2) + (1/2) [x1 - 1  x2 - 1] [ 26  46 ] [ x1 - 1 ]
                                                             [ 46  40 ] [ x2 - 1 ],

which gives

T2(x1, x2) = 13 + 19(x1 - 1) + 28(x2 - 1) + 13(x1 - 1)² + 46(x1 - 1)(x2 - 1) + 20(x2 - 1)².
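These expansions can be checked symbolically; the following is a small sketch (variable names are illustrative) that builds the gradient and Hessian with jacobian:

>> syms x1 x2
>> f = x1^2 + 5*x1*x2^2 + 3*x2^2 + 4*x1^3*x2^3;
>> g = jacobian(f, [x1, x2]);        % gradient (row vector)
>> H = jacobian(g, [x1, x2]);        % Hessian matrix
>> a = 1; b = 1;
>> g1 = subs(g, {x1, x2}, {a, b})    % [19, 28]
>> H1 = subs(H, {x1, x2}, {a, b})    % [26 46; 46 40]
>> dx = [x1 - a; x2 - b];
>> T2 = subs(f, {x1, x2}, {a, b}) + g1*dx + expand(dx.'*H1*dx/2)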
So if

x = (x1, x2, . . . , xn)^T

and

A = [ a11  a12  · · ·  a1n ]
    [ a21  a22  · · ·  a2n ]
    [ a31  a32  · · ·  a3n ]
    [  ..   ..   ..    ..  ]
    [ an1  an2  · · ·  ann ],

then the function

q(x) = x^T Ax = [x1 x2 · · · xn] A [x1, x2, . . . , xn]^T = Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi xj,
Solution. If

x = [ x1 ]
    [ x2 ]
    [ x3 ],

then

x^T Ax = [x1 x2 x3] [  2  2  -1 ] [ x1 ]
                    [  2  6   0 ] [ x2 ]
                    [ -1  0   4 ] [ x3 ]

or

x^T Ax = [x1 x2 x3] [ 2x1 + 2x2 - x3 ]
                    [ 2x1 + 6x2      ]
                    [ -x1 + 4x3      ].

Thus,

x^T Ax = 2x1² + 4x1x2 - 2x1x3 + 6x2² + 4x3².

After rearranging the terms, we have

x^T Ax = (x1 + 2x2)² + 2x2² + (x1 - x3)² + 3x3².

Hence,

(x1 + 2x2)² + 2x2² + (x1 - x3)² + 3x3² > 0,

unless

x1 = x2 = x3 = 0. •
Example 7.19 Find the matrix associated with the quadratic form

q(x1, x2, x3) = 4x1² + 2x2² + 2x3² + 2x1x2 - 4x1x3 + 4x2x3.

Thus,

x^T Ax = [x1 x2 x3] [  4  1  -2 ] [ x1 ]
                    [  1  2   2 ] [ x2 ]
                    [ -2  2   2 ] [ x3 ]. •

It can be proved that the necessary and sufficient conditions for the realization of the preceding cases are:

One can easily compute the eigenvalues of the above matrix, which are 8, 8, and 5. Since all of these eigenvalues are positive, q(x1, x2, x3) is a positive-definite quadratic form. •
x^T A^T x = (x^T Ax)^T

and

x^T B^T x = (1/2) x^T (A + A^T) x
          = (1/2) x^T Ax + (1/2) x^T A^T x
          = (1/2) x^T Ax + (1/2) (x^T Ax)^T
          = (1/2) x^T Ax + (1/2) x^T Ax
          = x^T Ax.

Also,

B^T = [(1/2)(A + A^T)]^T = (1/2)(A^T + A) = (1/2)(A + A^T) = B.
Note that the quadratic forms of A and B are the same but the matrices A and B are not, unless A is symmetric. For example, for the matrix

A = [ 4  4 ]
    [ 2  6 ],

we have

B = (1/2) ( [ 4  4 ] + [ 4  2 ] ),
            [ 2  6 ]   [ 4  6 ]

and it gives

B = (1/2) [ 8   6 ] = [ 4  3 ]
          [ 6  12 ]   [ 3  6 ].
On the other hand, for the symmetric matrix

A = [ 4  4 ]
    [ 4  6 ],

we have

B = (1/2) ( [ 4  4 ] + [ 4  4 ] ),
            [ 4  6 ]   [ 4  6 ]

which gives

B = (1/2) [ 8   8 ] = [ 4  4 ] = A.
          [ 8  12 ]   [ 4  6 ]

Also, the quadratic forms

x^T Ax = [x1 x2] [ 4  4 ] [ x1 ] = 4x1² + 6x1x2 + 6x2²
                 [ 2  6 ] [ x2 ]

and

x^T Bx = [x1 x2] [ 4  3 ] [ x1 ] = 4x1² + 6x1x2 + 6x2²
                 [ 3  6 ] [ x2 ]

are the same.
The eigenvalues of the above matrix A are −12, −10, and −1.5, and since
all of these eigenvalues are negative, q(x1 , x2 , x3 ) is a negative-definite
quadratic form. •
x² + 5x + 6 = 0;   x³ = 2x + 1;   x^100 + x² + 1 = 0.
x = cos(x);   e^x + x - 10 = 0;   x + ln x = 10.

There may be many roots of the given nonlinear equation, but we will seek the approximation of only one of its roots, which lies in the given interval [a, b]. This root may be simple (not repeating) or multiple (repeating).
f(x) = 0. (7.6)

We seek the values of x, called the roots of (7.6) or the zeros of the function f(x), for which (7.6) is true. The roots of (7.6) may be real or complex. Here, we will look for approximations of the real roots of (7.6). There are many methods that will give us information about the real roots of (7.6). The methods we will discuss are all iterative methods: the bisection method, the fixed-point method, and Newton's method.

y = f(x).

Our objective is to find an x value for which y is zero. Using this method, we begin by supposing f(x) is a continuous function defined on the interval [a, b] and then by evaluating the function at two x values, say, a and b, such that

f(a) · f(b) < 0.
The implication is that one of the values is negative and the other is positive. These conditions can be easily satisfied by sketching the function (Figure 7.6).

If f(c) ≈ 0, then c ≈ α is the desired root, and, if not, then there are two possibilities. First, if f(a) · f(c) < 0, then f(x) has a zero between point a and point c. The process can then be repeated on the new interval [a, c]. Second, if f(a) · f(c) > 0, it follows that f(b) · f(c) < 0, since it is known that f(b) and f(c) have opposite signs. Hence, f(x) has a zero between point c and point b, and the process can be repeated with [c, b]. We see that after one step of the process, we have found either a zero or a new bracketing interval that is precisely half the length of the original one. The process continues until the desired accuracy is achieved. We use the bisection process in the following example.
Example 7.22 Use the bisection method to find the approximation to the root of the equation

x³ = 4x - 2

that is located in the interval [1.0, 2.0], accurate to within 10⁻⁴.

Solution. With f(x) = x³ - 4x + 2, the first midpoint is

c1 = (a1 + b1)/2 = (1.0 + 2.0)/2 = 1.5;   f(c1) = f(1.5) = -0.625.

We see that the functional values approach zero as the number of iterations increases. We get the desired approximation to the root α = 1.6751309 of the given equation x³ = 4x - 2, i.e., c17 = 1.675133, which is obtained after 17 iterations, with accuracy ε = 10⁻⁴. •
function y = fn(x)
y = x.^3 - 4*x + 2;
Program 7.2
MATLAB m-file for the Bisection Method
function sol=bisect(fn,a,b,tol)
fa = feval(fn,a); fb = feval(fn,b);
if fa*fb > 0
  fprintf('Endpoints have same sign'); return
end
while abs(b-a) > tol
  c = (a+b)/2; fc = feval(fn,c);
  % keep the half-interval in which the sign change occurs
  if fa*fc < 0; b = c; else a = c; end
end; sol = (a+b)/2;
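Assuming the m-file fn.m above is on the path, a typical (illustrative) call is:

>> sol = bisect('fn', 1.0, 2.0, 1e-4)   % sol is approximately 1.6751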
|α - cn| ≤ (b - a)/2ⁿ,   n ≥ 1. (7.8)

Moreover, to obtain an accuracy of

|α - cn| ≤ ε

(for ε = 10⁻ᵏ), it suffices to take

n ≥ ln(10ᵏ(b - a)) / ln 2, (7.9)

where k is a nonnegative integer. •

The above Theorem 7.13 gives us information about bounds for errors in approximation and the number of bisections needed to obtain any given accuracy.
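As a quick illustrative check of (7.9) for the interval [1, 2] and k = 4:

>> a = 1; b = 2; k = 4;
>> n = ceil(log(10^k*(b - a))/log(2))   % n = 14 bisections suffice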
x = g(x). (7.11)

Any solution of (7.11) is called a fixed point of the iteration function g(x) and, hence, a root of (7.10).

(a) g has at least one fixed point on the given interval [a, b].

then:

(b) The sequence (7.12) will converge to the attractive (unique) fixed point α in [a, b].

(c) The iteration (7.12) will converge to α for any initial approximation.

|α - xn| ≤ (kⁿ/(1 - k)) |x1 - x0|, for all n ≥ 1. (7.14)
function y = fn(x)
y = sqrt((4*x - 2)/x);

>> x0 = 1.5; tol = 0.00001;
>> sol = fixpt('fn', x0, tol)

Program 7.3
MATLAB m-file for the Fixed-Point Method
function sol=fixpt(fn,x0,tol)
old = x0 + 1;
while abs(x0 - old) > tol; old = x0;
% next iterate x_{n+1} = g(x_n)
x0 = feval(fn, old); end; sol = x0;
Let f ∈ C²[a, b] and let xn be the nth approximation to the root α such that f'(xn) ≠ 0 and |α - xn| is small. Consider the first Taylor polynomial for f(x) expanded about xn:

f(x) = f(xn) + (x - xn)f'(xn) + [(x - xn)²/2] f''(η(x)), (7.16)

where η(x) lies between x and xn. Since f(α) = 0, (7.16), with x = α, gives

f(α) = 0 = f(xn) + (α - xn)f'(xn) + [(α - xn)²/2] f''(η(α)).

Since |α - xn| is small, we neglect the term involving (α - xn)², and so

0 ≈ f(xn) + (α - xn)f'(xn).

Solving for α and calling the result xn+1 gives

xn+1 = xn - f(xn)/f'(xn),   f'(xn) ≠ 0, for all n ≥ 0. (7.18)
Example 7.24 Use Newton's method to find the root of the equation x³ - 4x + 2 = 0 that is located in the interval [1.0, 2.0], accurate to 10⁻⁴, taking the initial approximation x0 = 1.5.

Solution. Given

f(x) = x³ - 4x + 2,

since Newton's method requires the value of the derivative of the function, we compute

f'(x) = 3x² - 4.

Now evaluating f(x) and f'(x) at the given approximation x0 = 1.5 gives
Using the iterative formula (7.18) again, we get the next approximation as follows:

x2 = x1 - f(x1)/f'(x1) = 1.727273 - 0.244177/4.950413 = 1.677948.
Thus, the successive iterates are shown in Table 7.3. Just after the third
iteration, the root is approximated to be x4 = 1.67513087056 and the
functional value is reduced to 4.05 × 10−10 . Since the exact solution is
1.67513087057, the actual error is 1 × 10−10 . We see that the convergence
is faster than the methods considered previously. •
To get the above results using MATLAB commands, first the function
x3 − 4x + 2 and its derivative 3x2 − 4 are saved in m-files called fn.m and
dfn.m, respectively, written as follows:
function y = fn(x)
y = x.^3 - 4*x + 2;

and

function dy = dfn(x)
dy = 3*x.^2 - 4;
Program 7.4
MATLAB m-file for Newton's Method
function sol=newton(fn,dfn,x0,tol)
old = x0 + 1;
while abs(x0 - old) > tol
  old = x0;
  % Newton step: x_{n+1} = x_n - f(x_n)/f'(x_n)
  x0 = old - feval(fn, old)/feval(dfn, old);
end; sol = x0;
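A typical call, assuming the m-files fn.m and dfn.m above are on the path (illustrative):

>> sol = newton('fn', 'dfn', 1.5, 1e-4)   % sol is approximately 1.67513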
f1 (x, y) = 0 (7.19)
and
f2 (x, y) = 0. (7.20)
The problem can be stated as follows:
Given the continuous functions f1 (x, y) and f2 (x, y), find the values x = α
and y = β such that
f1 (α, β) = 0
f2 (α, β) = 0. (7.21)
By using Taylor's theorem for functions of two variables to expand f1(x, y) and f2(x, y) about (xn, yn), we have

f1(x, y) = f1(xn + (x - xn), yn + (y - yn))
         = f1(xn, yn) + (x - xn) ∂f1(xn, yn)/∂x + (y - yn) ∂f1(xn, yn)/∂y + · · ·

and

f2(x, y) = f2(xn + (x - xn), yn + (y - yn))
         = f2(xn, yn) + (x - xn) ∂f2(xn, yn)/∂x + (y - yn) ∂f2(xn, yn)/∂y + · · · .
[ xn+1 ]   [ xn ]   [ ∂f1/∂x  ∂f1/∂y ]⁻¹ [ f1 ]
[ yn+1 ] = [ yn ] - [ ∂f2/∂x  ∂f2/∂y ]   [ f2 ]. (7.24)
A comparison of (7.18) and (7.24) shows that the above procedure is indeed an extension of Newton's method in one variable, where division by f' is generalized to premultiplication by J⁻¹.
Example 7.25 Solve the following system of two equations using Newton's method with accuracy ε = 10⁻⁵:

4x³ + y = 6
x²y = 1.

Solution. Define f1(x, y) = 4x³ + y - 6 and f2(x, y) = x²y - 1, and take the initial approximation (x0, y0) = (1.0, 0.5). Then

f2(1.0, 0.5) = -0.5,   ∂f2/∂x = f2x = 1.0,   ∂f2/∂y = f2y = 1.0.
The Jacobian matrix J and its inverse J⁻¹ at the given initial approximation can be calculated as

J = [ ∂f1/∂x  ∂f1/∂y ] = [ 12.0  1.0 ]
    [ ∂f2/∂x  ∂f2/∂y ]   [  1.0  1.0 ]

and

J⁻¹ = (1/11.0) [  1.0  -1.0 ]
               [ -1.0  12.0 ].
>> syms x y
>> fun = [4*x^3 + y - 6, x^2*y - 1];
>> var = [x, y];
>> R = jacobian(fun, var)
Substituting all these values into (7.25), we get the first approximation as

[ x1 ]   [ 1.0 ]            [  1.0  -1.0 ] [ -1.5 ]   [ 1.090909 ]
[ y1 ] = [ 0.5 ] - (1/11.0) [ -1.0  12.0 ] [ -0.5 ] = [ 0.909091 ].

Similarly, the second approximation is

[ x2 ]   [ 1.088264 ]
[ y2 ] = [ 0.844686 ].
The first two and the remaining steps of the method are listed in Table 7.4. •

Note that a typical iteration of this method for this pair of equations can be implemented in the MATLAB Command Window using:

Using the starting value (1.0, 0.5), the successive approximations are shown in Table 7.4. •

We see that the values of both functions approach zero as the number of iterations is increased. We get the desired approximations to the roots after 3 iterations, with accuracy ε = 10⁻⁵.
Newton’s method is fairly easy to implement for the case of two equa-
tions in two unknowns. We first need the function m-files for the equations
and the partial derivatives. For the equations in Example 7.25, we do the
following:
function f = fn2(v)
% Here f and v are vector quantities
x = v(1); y = v(2);
f(1) = 4*x.^3 + y - 6;
f(2) = x.^2*y - 1;

function J = dfn2(v)
% Jacobian matrix for fn2.m
x = v(1); y = v(2);
J(1,1) = 12*x.^2; J(1,2) = 1;
J(2,1) = 2*x*y;   J(2,2) = x.^2;
Then the following MATLAB commands can be used to generate the so-
lution of Example 7.25:
>> s = newton2(0 f n20 ,0 df n20 , [1.0, 0.5], 1e − 5)
s=
1.088282 0.844340
The m-file newton2.m needs both the function and its partial derivatives, as well as a starting vector and a tolerance. The following code can be used:
Program 7.5
MATLAB m-file for Newton's Method for a Nonlinear System
function sol=newton2(fn2,dfn2,x0,tol)
old = x0 + 1;
while max(abs(x0 - old)) > tol
  old = x0;
  f = feval(fn2, old); f1 = f(1); f2 = f(2);
  J = feval(dfn2, old);
  f1x = J(1,1); f1y = J(1,2); f2x = J(2,1); f2y = J(2,2);
  D = f1x*f2y - f1y*f2x;   % determinant of the Jacobian
  % Cramer's rule for the Newton correction [h, k]
  h = (f2*f1y - f1*f2y)/D; k = (f1*f2x - f2*f1x)/D;
  x0 = old + [h, k];
end; sol = x0;
Similarly, for a large system of equations it is convenient to use vector
notation. Consider the system
f (x) = 0,
This represents a system of linear equations for Z[n] and can be solved
by any of the methods described in Chapter 3. Once Z[n] has been found,
the next iterate is calculated from
2. The method requires the user to provide the derivatives of each func-
tion with respect to each variable. Therefore, one must evaluate
the n functions and the n2 derivatives at each iteration. So solv-
ing systems of nonlinear equations is a difficult task. For systems
of nonlinear equations that have analytical partial derivatives, New-
ton’s method can be used; otherwise, multidimensional minimization
techniques should be used.
1. Choose the initial guess for the roots of the system so that the deter-
minant of the Jacobian matrix is not zero.
or

(∂g1/∂x)n (∂g2/∂x)n - (∂g1/∂y)n (∂g2/∂y)n < 1. (7.33)
Note that the fixed-point iteration may fail to converge even though the
condition (7.32) is satisfied, unless the process is started with an initial
guess (x0 , y0 ) sufficiently close to (α, β).
Example 7.26 Solve the following system of two equations using the fixed-point iteration, with accuracy ε = 10⁻⁵:

4x³ + y = 6
x²y = 1.
A set S is convex if x0 ∈ S and x00 ∈ S implies that all points on the line
segment joining x0 and x00 are members of S. This ensures that
Note that the intersection of convex sets is a convex set, but the union
of convex sets is not necessarily a convex set. •
Given f(x) = x², the left-hand side of the convexity inequality can be written as

f(cx′ + (1 - c)x″) = (cx′ + (1 - c)x″)² = c²x′² + (1 - c)²x″² + 2c(1 - c)x′x″.

Thus, the inequality is equivalent to

(c² - c)[x′² + x″² - 2x′x″] ≤ 0

or

(c² - c)(x′ - x″)² ≤ 0.

For c = 0 and c = 1, this inequality holds with equality, and for c ∈ (0, 1), we have

c² - c < 0.

Also,

(x′ - x″)² ≥ 0,

so the inequality holds. Hence, the given f(x) is a convex function. •
From the above definitions of convex and concave functions, we see that
f (x1 , x2 , . . . , xn ) is a convex function, if and only if −f (x1 , x2 , . . . , xn ) is a
concave function, and vice-versa.
From Figure 7.18, we see a function that is neither convex nor concave, because the line segment AB lies below y = f(x) while another chord lies above it.
A function f (x) is said to be strictly convex if, for two distinct points x0
and x00 ,
f (cx0 + (1 − c)x00 ) < cf (x0 ) + (1 − c)f (x00 ),
where 0 < c < 1. Conversely, a function f (x) is strictly concave if −f (x)
is strictly convex.
f (x) = Kx + xT Ax,
Suppose that the second derivative of a function f (x) exists for all x in a
convex set S. Then f (x) is a convex function on S, if and only if
f 00 (x) ≥ 0, for all x ∈ S.
For example, the function f (x) = x2 is a convex function on S = R1
because
f 0 (x) = 2x, f 00 (x) = 2 ≥ 0.
Theorem 7.16 (Concave Function)
Suppose that the second derivative of a function f (x) exists for all x in a
convex set S. Then f (x) is a concave function on S, if and only if
f 00 (x) ≤ 0, for all x ∈ S.
For example, the function f(x) = x^(1/2) is a concave function on S = R¹ (for x ≥ 0) because

f'(x) = (1/2)x^(-1/2),   f''(x) = -(1/4)x^(-3/2) ≤ 0.
Also, the function f (x) = 3x + 2 is both a convex and concave function on
S = R1 because
f 0 (x) = 3, f 00 (x) = 0.
Using the definitions and the above two theorems, it is difficult to check for convexity of a given function, because it would require consideration of infinitely many points. However, using the sign of the Hessian matrix of the function, we can determine the convexity of a function.
Theorem 7.17
is a convex function.
Solution. First, we find the first and second partial derivatives of the
given function as follows:
fx1 = ∂f/∂x1 = 6x1 + 2x2

fx2 = ∂f/∂x2 = 2x1 + 4x2 - 2x3

fx3 = ∂f/∂x3 = -2x2 + 4x3,

and the second derivatives of the function are as follows:

(fx1)x1 = ∂²f/∂x1² = 6

(fx2)x2 = ∂²f/∂x2² = 4

(fx3)x3 = ∂²f/∂x3² = 4
(fx1)x2 = ∂fx1/∂x2 = 2 = ∂fx2/∂x1 = (fx2)x1

(fx1)x3 = ∂fx1/∂x3 = 0 = ∂fx3/∂x1 = (fx3)x1

(fx2)x3 = ∂fx2/∂x3 = -2 = ∂fx3/∂x2 = (fx3)x2.

Hence, the Hessian matrix for the given function can be found as

H(x1, x2, x3) = [ 6   2   0 ]
                [ 2   4  -2 ]
                [ 0  -2   4 ].

To check the definiteness of H, take

z^T Hz = (z1, z2, z3) [ 6   2   0 ] [ z1 ]
                      [ 2   4  -2 ] [ z2 ]
                      [ 0  -2   4 ] [ z3 ],

which gives

z^T Hz = (z1, z2, z3) [ 6z1 + 2z2 + 0z3 ]
                      [ 2z1 + 4z2 - 2z3 ]
                      [ 0z1 - 2z2 + 4z3 ]
Example 7.30 Show that the function f(x1, x2) = 3x1² + 4x1x2 + 2x2² is a convex function on S = R².

Solution. The Hessian matrix of the given function is

H(x1, x2) = [ 6  4 ]
            [ 4  4 ].

The first principal minors of the Hessian matrix are the diagonal entries, both 6 > 0 and 4 > 0. The second principal minor of the Hessian matrix is the determinant of the Hessian matrix, which is

6(4) - 4(4) = 8 > 0.

So at every point, all principal minors of the Hessian matrix H(x1, x2) are nonnegative; therefore, Theorem 7.18 shows that f(x1, x2) is a convex function on R². •
Example 7.31 Show that the function f(x1, x2) = -x1² - 2x1x2 - 3x2² is a concave function on S = R².

Solution. The Hessian matrix of the given function has the form

H(x1, x2) = [ -2  -2 ]
            [ -2  -6 ].

The first principal minors of the Hessian matrix are the diagonal entries (-2 and -6). These are both negative (nonpositive). The second principal minor is the determinant of the Hessian matrix H(x1, x2) and equals (-2)(-6) - (-2)(-2) = 8 ≥ 0, so f(x1, x2) is a concave function on R². •
Example 7.32 Show that the function f(x1, x2) = x1² - 4x1x2 + (3/2)x2² is neither a convex nor a concave function on S = R².

Solution. The Hessian matrix of the given function has the form

H(x1, x2) = [  2  -4 ]
            [ -4   3 ].

The first principal minors of the Hessian matrix are 2 and 3. Because both principal minors are positive, f(x1, x2) cannot be concave. The second principal minor is the determinant of the Hessian matrix H(x1, x2) and it is equal to

2(3) - (-4)(-4) = 6 - 16 = -10 < 0.

Thus, f(x1, x2) cannot be a convex function on R². Together, these facts show that f(x1, x2) is neither a convex nor a concave function. •
is a convex function on S = R³.

Solution. The Hessian matrix of the given function has the form

H(x1, x2, x3) = [  4  -2  -2 ]
                [ -2   4  -2 ]
                [ -2  -2   4 ].

The first-order principal minors are the diagonal entries, all equal to 4 > 0. By deleting row 2 and column 2 of the Hessian matrix, we find the second-order principal minor

|H2| = |  4  -2 | = 16 - 4 = 12 > 0.
       | -2   4 |

By deleting row 3 and column 3 of the Hessian matrix, we find the second-order principal minor

|H2| = |  4  -2 | = 16 - 4 = 12 > 0.
       | -2   4 |

The third-order principal minor is simply the determinant of the Hessian matrix itself. Expanding by row 1 cofactors, we find the third-order principal minor as follows:

|H3| = 4[(4)(4) - (-2)(-2)] - (-2)[(-2)(4) - (-2)(-2)] + (-2)[(-2)(-2) - (-2)(4)] = 0.

Because for all (x1, x2, x3) all principal minors of the Hessian matrix are nonnegative, we have shown that f(x1, x2, x3) is a convex function on R³. •
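A quick numerical way to check such definiteness statements is to inspect the eigenvalues of the Hessian; a small illustrative sketch for the matrix above:

>> H = [4 -2 -2; -2 4 -2; -2 -2 4];
>> eig(H)    % eigenvalues 0, 6, 6: all nonnegative, so H is
             % positive-semidefinite and f is convex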
a ≤ 0, c ≤ 0, and 4ac - b² ≥ 0,
maximize(minimize) Z = f (x)
subject to (7.34)
gj (x) (≤, =, ≥) 0, j = 1, 2, . . . , m.
In the following we give two very important theorems that illustrate the
importance of convex and concave functions in NLP problems.
nonlinear equations.

This method is an iterative method and starts with two initial guesses, xL and xu, that bracket one local extremum of f(x) (here taken to be a maximum); the function is therefore assumed to be unimodal on the interval. Next, we look for two interior points, x1 and x2, chosen according to the golden ratio

d = [(√5 - 1)/2] (xu - xL),

which gives

x1 = xL + d
x2 = xu - d.
After finding the two interior points, the given function is evaluated at
these points and two results can occur:
1. If f(x1) > f(x2), then the domain of x to the left of x2, from xL to x2, can be eliminated because it does not contain the maximum. Since the optimum now lies on the interval (x2, xu), we set xL = x2 and let the old x1 become the new x2 for the next iteration.
Remember that we do not have to recalculate all the function values for the next iteration; we need only one new function value. For example, when the optimum is on the interval (x2, xu), then we set xL = x2 and x2 = x1, i.e., the old x1 becomes the new x2 and f(x2) = f(x1). After this, we have to find only the new x1 for the next iteration, and it can be obtained as

x1 = xL + [(√5 - 1)/2] (xu - xL).

A similar approach is used for the other possible case, when the optimum is on the interval (xL, x1): we set xu = x1 and x1 = x2, i.e., the old x2 becomes the new x1 and f(x1) = f(x2). Then we need to find only the new x2 for the next iteration, which can be obtained as

x2 = xu - [(√5 - 1)/2] (xu - xL).
As the iterations are repeated, the interval containing the optimum shrinks rapidly. With each iteration the interval is reduced to about 61.8% of its previous length (a factor of the golden ratio). This means that after 10 iterations the interval has shrunk to about 0.008, or 0.8%, of the initial interval.
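A one-line check of this reduction factor (illustrative):

>> r = (sqrt(5) - 1)/2;
>> r^10      % about 0.0081, i.e., 0.8% of the initial interval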
Solution. To find the two interior points x1 and x2, first we compute the value of the golden-section distance as

d = [(√5 - 1)/2] (xu - xL) = [(√5 - 1)/2] (2.25 - 1) = 0.7725,

and with this value, we have the values of the interior points as follows:

x1 = xL + d = 1 + 0.7725 = 1.7725,
x2 = xu - d = 2.25 - 0.7725 = 1.4775.

Next, we have to compute the function values at these interior points, which are:

Since f(x1) > f(x2), the maximum is on the interval defined by x2, x1, and xu, i.e., (x2, xu). For this, we set the following scheme:

xL = x2 = 1.4775
x2 = x1 = 1.7725
xu = xu = 2.25.
So we have to find only the new value of x1 for the second iteration, and it can be computed with the help of the new value of the golden-section distance as follows:

d = [(√5 - 1)/2] (xu - xL) = [(√5 - 1)/2] (2.25 - 1.4775) = 0.4774

and

x1 = xL + d = 1.4775 + 0.4774 = 1.9549.

For the next iteration we set

xL = x2 = 1.7725
x2 = x1 = 1.9549
xu = xu = 2.25.

Repeating the process, the numerical results for the corresponding iterations, starting with the initial approximations xL = 1.0 and xu = 2.25 with accuracy 5 × 10⁻⁴, are given in Table 7.6. From Table 7.6, we can see that within 14 iterations the result converges (slowly) on the true maximum value of 1.8082 at x = 2.0793. •
Program 7.6
MATLAB m-file for the Golden-Section Search Method for Optimization
function sol=golden(fn,xL,xu,tol)
disp(' k xL f(xL) x2 f(x2) x1 f(x1) xu f(xu) d ')
d=((sqrt(5)-1)/2)*(xu-xL); x1=xL+d; x2=xu-d;
fL=feval(fn,xL); fu=feval(fn,xu);
f1=feval(fn,x1); f2=feval(fn,x2); k=0;
[k xL fL x2 f2 x1 f1 xu fu d]
while abs(x1-x2)>tol; f1=feval(fn,x1); f2=feval(fn,x2);
k=k+1; if f1>f2
% maximum lies in (x2,xu): keep the right subinterval
xL=x2; x2=x1;
d=((sqrt(5)-1)/2)*(xu-xL); x1=xL+d; f1=feval(fn,x1);
E=abs(x1-x2); sol=x2; f2=feval(fn,x2);
fL=feval(fn,xL); fu=feval(fn,xu);
[k xL fL x2 f2 x1 f1 xu fu d sol]
else
% maximum lies in (xL,x1): keep the left subinterval
xu=x1; x1=x2; d=((sqrt(5)-1)/2)*(xu-xL);
x2=xu-d; f2=feval(fn,x2); E=abs(x1-x2); sol=x1;
f1=feval(fn,x1); fL=feval(fn,xL); fu=feval(fn,xu);
[k xL fL x2 f2 x1 f1 xu fu d sol]
end; end;
function y = fn(x)
y = 2*x - 1.75*x.^2 + 1.1*x.^3 - 0.25*x.^4;
Just as there is only one straight line connecting two points, there is only one quadratic, or parabola, connecting three points. Suppose that we are given three distinct points x0, x1, and x2 and a quadratic function p(x) passing through the corresponding function values f(x0), f(x1), and f(x2). Thus, if these three points jointly bracket an optimum, we can fit a quadratic function to the points as follows:

p(x) = [(x - x1)(x - x2)] / [(x0 - x1)(x0 - x2)] f(x0) + [(x - x0)(x - x2)] / [(x1 - x0)(x1 - x2)] f(x1)
     + [(x - x0)(x - x1)] / [(x2 - x0)(x2 - x1)] f(x2).

The necessary condition for the minimum of this quadratic function can be obtained by differentiating it with respect to x, setting the result equal to zero, and solving the equation for an estimate of the optimal x, i.e.,

p'(x) = (2x - x1 - x2) / [(x0 - x1)(x0 - x2)] f(x0) + (2x - x0 - x2) / [(x1 - x0)(x1 - x2)] f(x1)
      + (2x - x0 - x1) / [(x2 - x0)(x2 - x1)] f(x2) = 0.
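Solving p'(x) = 0 in closed form gives the vertex of the interpolating parabola; the following small helper function (the name is illustrative) implements that formula:

function xopt = quadvertex(x0, x1, x2, f0, f1, f2)
% Vertex of the parabola through (x0,f0), (x1,f1), and (x2,f2)
A = f0*(x1^2 - x2^2) + f1*(x2^2 - x0^2) + f2*(x0^2 - x1^2);
B = 2*(f0*(x1 - x2) + f1*(x2 - x0) + f2*(x0 - x1));
xopt = A/B;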
After finding the new (optimum) point, the next job is to determine which one of the given three points is discarded before repeating the process. To discard a point we check the following two cases:

1. xopt ≤ x1:

2. xopt > x1:

(i) If f(x1) ≥ f(xopt), then the minimum of the actual function is on the interval (x1, x2); therefore, we will use the three points x1, xopt, and x2 for the next iteration.

(ii) If f(x1) < f(xopt), then the minimum of the actual function is on the interval (x0, xopt); therefore, in this case we will use the three points x0, x1, and xopt for the next iteration.
xopt = 2.0617.

To perform the second iteration, we have to discard one point by using the same strategy as in the golden-section search. Since the function value at xopt is greater than that at the intermediate point x1, and xopt lies to the right of x1, the first initial guess x0 is discarded. So for the second iteration we obtain

xopt = 2.0741.
Repeating the process, the numerical results for the corresponding iterations, starting with the initial approximations x0 = 1.75, x1 = 2.0, and x2 = 2.25, with accuracy 5 × 10⁻², are given in Table 7.7.

From Table 7.7, we can see that within five iterations, the result converges on the true maximum value of 1.8082 at x = 2.0793. Also, note that for this problem the quadratic interpolation method converges on only one end of the interval, and the convergence can be slow for this reason. •
function y = fn(x)
y = 2*x - 1.75*x.^2 + 1.1*x.^3 - 0.25*x.^4;
Remember that the procedure is essentially complete except for the choice of the three initial points. Choosing three arbitrary values of x may cause problems if the denominator of the xopt equation is zero. Assume that the three points are chosen as 0, ε, and 2ε, where ε is a chosen positive parameter (say, ε = 1). In this case, the expression for xopt takes the form

xopt = ε (3f(x0) - 4f(x1) + f(x2)) / (2f(x0) - 4f(x1) + 2f(x2)),

and the denominator is nonzero (and the parabola has a minimum) provided that

(f(x0) + f(x2))/2 > f(x1).
When the method converges, the interval on which the minimum lies becomes smaller, the quadratic function becomes closer to the actual function, and the process is terminated when

|f(xopt) - p(xopt)| / |f(xopt)| ≤ tol,
Program 7.7
MATLAB m-file for the Quadratic Interpolation Method
function sol=Quadratic2(fn,x0,x1,x2,tol)
disp(' k x0 f(x0) x1 f(x1) x2 f(x2) x3 f(x3)')
f0=feval(fn,x0); f1=feval(fn,x1); f2=feval(fn,x2); k=0;
while abs(x2-x0) > tol; k=k+1;
% vertex of the parabola through the three current points
A = f0*(x1.^2-x2.^2) + f1*(x2.^2-x0.^2) + f2*(x0.^2-x1.^2);
B = 2*f0*(x1-x2) + 2*f1*(x2-x0) + 2*f2*(x0-x1);
x3 = A/B; f3 = feval(fn,x3);
if f3 > f1; x0=x1; f0=f1; x1=x3; f1=f3;
else; x2=x1; f2=f1; x1=x3; f1=f3; end
[k x0 f0 x1 f1 x2 f2 x3 f3]
end; sol=x3;
Recall that for finding the root of the nonlinear equation f(x) = 0, Newton's method can be written as

xn+1 = xn - f(xn)/f'(xn),   n ≥ 0.

A similar open approach can be used to find an optimum of f(x) by defining a new function, F(x) = f'(x). Thus, because the same optimal value x* satisfies

F(x*) = f'(x*) = 0,

Newton's method for optimization can be written as

xn+1 = xn - f'(xn)/f''(xn),   n ≥ 0, (7.36)

which can be used to find the minimum or maximum of f(x), if f(x) is twice continuously differentiable.
Consider the Taylor expansion of f(x) about x0:

f(x) = f(x0) + (x - x0)f'(x0) + [(x - x0)²/2!] f''(x0) + higher-order terms.

Taking the derivative with respect to x and ignoring the higher-order terms, we get

f'(x) ≈ f'(x0) + (x - x0)f''(x0).

Setting f'(x) = 0 and solving the expression for x, we obtain

x ≈ x0 - f'(x0)/f''(x0).

This is an improved approximation and can be written as

x1 = x0 - f'(x0)/f''(x0),
Example 7.37 Use Newton's method to find the local maximum of the function

f(x) = 2x - 1.75x² + 1.1x³ - 0.25x⁴,

with an initial guess x0 = 2.5.

Solution. To use formula (7.36), first we compute the first and second derivatives of the given function as follows:
Repeating the process, the numerical results for the corresponding iterations, starting with the initial approximation x0 = 2.5 with accuracy 5 × 10⁻², are given in Table 7.8.

From Table 7.8, we can see that within four iterations, the result converges rapidly on the true value of 1.8082 at x = 2.0793. Also, note that this method does not require initial guesses that bracket the optimum. On the other hand, the method shares the disadvantage that it may diverge. To confirm which kind of optimum the method has found, we must check the sign of the second derivative of the function: for a maximum, the second derivative of the function should be less than zero, and for a minimum it should be greater than zero. In both cases, the first derivative of the function should be as close to zero as possible, because an optimum here means the same as a root of f'(x) = 0. Note that if the second derivative of the function equals zero at the given initial guess, then change the initial guess. •
To get the above results using MATLAB commands, first the function 2x - 1.75x² + 1.1x³ - 0.25x⁴ and its first and second derivatives 2 - 3.5x + 3.3x² - x³ and -3.5 + 6.6x - 3x² were saved in three m-files called fn.m, dfn.m, and ddfn.m, respectively, written as follows:

function y = fn(x)
y = 2*x - 1.75*x.^2 + 1.1*x.^3 - 0.25*x.^4;

function dy = dfn(x)
dy = 2 - 3.5*x + 3.3*x.^2 - x.^3;
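The third file, ddfn.m, is not printed in the text; it would presumably read:

function ddy = ddfn(x)
ddy = -3.5 + 6.6*x - 3*x.^2;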
Program 7.8
MATLAB m-file for Newton's Method for Optimization
function sol=newtonO(fn,dfn,ddfn,x0,tol)
old = x0 + 1; k = 0;
while abs(x0 - old) > tol
  old = x0;
  % Newton step on f'(x): x_{n+1} = x_n - f'(x_n)/f''(x_n)
  x0 = old - feval(dfn, old)/feval(ddfn, old);
end; sol = x0;
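A typical call with the three m-files above (illustrative):

>> sol = newtonO('fn', 'dfn', 'ddfn', 2.5, 5e-2)   % sol is approximately 2.0793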
∂f/∂xi (x̄) = 0,   i = 1, 2, . . . , n, (7.38)
Example 7.38 Find all local minimum, local maximum, and saddle points for the function

f(x1, x2) = (1/3)x1³ - (2/3)x2³ + (1/2)x1² - 6x1 + 32x2 + 4.

Solution. The first partial derivatives of the function are

∂f(x1, x2)/∂x1 = x1² + x1 - 6

and

∂f(x1, x2)/∂x2 = -2x2² + 32.

Since ∂f/∂x1 and ∂f/∂x2 exist for every (x1, x2), the only stationary points are the solutions of the system

x1² + x1 - 6 = 0
-2x2² + 32 = 0.

Solving this system gives x1 = -3 or x1 = 2, and x2 = ±4, so the stationary points are (-3, -4), (-3, 4), (2, -4), and (2, 4). The second partial derivatives of f are

∂²f/∂x1² = 2x1 + 1,   ∂²f/∂x2² = -4x2

and

∂²f/∂x1∂x2 = 0 = ∂²f/∂x2∂x1.

Hence, the Hessian matrix for the function f(x) is

H(x1, x2) = [ 2x1 + 1    0   ]
            [   0      -4x2  ].
At (-3, -4) the Hessian matrix is

H(-3, -4) = [ -5   0 ]
            [  0  16 ].

Since

H1(-3, -4) = -5 < 0

and

H2(-3, -4) = | -5   0 | = -80 ≠ 0,
             |  0  16 |

the conditions of Theorem 7.23 and Theorem 7.24 cannot be satisfied; therefore, the stationary point x̄ = (-3, -4) is not a local extremum for the given function. But Theorem 7.25 now implies that x̄ = (-3, -4) is a saddle point, i.e.,

(-3, -4, f(-3, -4)) = (-3, -4, -407/6).
At (-3, 4) the Hessian matrix is

H(-3, 4) = [ -5    0 ]
           [  0  -16 ].

Since

H1(-3, 4) = -5 < 0

and

H2(-3, 4) = | -5    0 | = 80 > 0,
            |  0  -16 |

the conditions of Theorem 7.24 are satisfied, and it shows that the stationary point x̄ = (-3, 4) is a local maximum for the given function, i.e.,

f(-3, 4) = 617/6.
At (2, -4) the Hessian matrix is

H(2, -4) = [ 5   0 ]
           [ 0  16 ].

Since

H1(2, -4) = 5 > 0

and

H2(2, -4) = | 5   0 | = 80 > 0,
            | 0  16 |

the conditions of Theorem 7.23 are satisfied, and it shows that the stationary point x̄ = (2, -4) is a local minimum for the given function, i.e.,

f(2, -4) = -266/3.
Finally, at (2, 4) the Hessian matrix is

H(2, 4) = [ 5    0 ]
          [ 0  -16 ].

Since

H1(2, 4) = 5 > 0

and

H2(2, 4) = | 5    0 | = -80 ≠ 0,
           | 0  -16 |

the conditions of Theorem 7.23 and Theorem 7.24 cannot be satisfied; therefore, the stationary point x̄ = (2, 4) is not a local extremum for the given function. From Theorem 7.25, we see that x̄ = (2, 4) is a saddle point, i.e.,

(2, 4, f(2, 4)) = (2, 4, 82). •
Example 7.39 Find all local minimum, local maximum, and saddle points for the function

f(x1, x2) = -x1² + x2².

Solution. The first partial derivatives of the function are

∂f(x1, x2)/∂x1 = -2x1,   ∂f(x1, x2)/∂x2 = 2x2.

Since ∂f/∂x1 and ∂f/∂x2 exist for every (x1, x2), the only stationary points are the solutions of the system

-2x1 = 0
2x2 = 0.

Solving this system, we obtain the only stationary point (0, 0). The second partial derivatives of f are

∂²f/∂x1² = -2,   ∂²f/∂x2² = 2

and

∂²f/∂x1∂x2 = 0 = ∂²f/∂x2∂x1,

so the Hessian matrix is

H(x1, x2) = [ -2  0 ]
            [  0  2 ].

Since

H1(0, 0) = -2 < 0

and

H2(0, 0) = | -2  0 | = -4 ≠ 0,
           |  0  2 |

the conditions of Theorem 7.23 and Theorem 7.24 cannot be satisfied; therefore, the stationary point x̄ = (0, 0) is not a local extremum for the given function. From Theorem 7.25, we conclude that x̄ = (0, 0) is a saddle point, i.e.,

(0, 0, f(0, 0)) = (0, 0, 0). •
is called a unit vector; it has length 1 and defines the same direction as x. Also, ∇f(x) defines the direction ∇f(x)/‖∇f(x)‖. •

For example, if

f(x1, x2) = x1² + x2²,

then

∇f(x1, x2) = [2x1, 2x2]^T.

Thus, at (2, 3), the gradient vector of the function is

∇f(2, 3) = [4, 6]^T

and

‖∇f(2, 3)‖ = √(4² + 6²) = √(16 + 36) = √52.

So the gradient vector ∇f(2, 3) defines the direction

∇f(2, 3)/‖∇f(2, 3)‖ = (4/√52, 6/√52) = (0.5547, 0.8321).
Note that if any point x̄ lies on the level curve f(x) = c, then the vector ∇f(x̄)/‖∇f(x̄)‖ will be perpendicular to the curve at x̄.

Also, moving from v0 in the direction of ∇f(x) to get to the local maximum, we have to find the new point v1 as

v1 = v0 + α0∇f(v0),

for some α0 > 0. Since we desire v1 to be as close as possible to the maximum, we need to find the unknown variable α0 > 0 such that

f(v1) = f(v0 + α0∇f(v0))

is as large as possible.
Beginning at any point v0 and moving in the direction of ∇f(v0) will result in the maximum rate of increase for f. So we begin by moving away from v0 in the direction of ∇f(v0). For some nonnegative value of α0, we move to a point v1, which can be written as

v1 = v0 + α0∇f(v0),

where α0 solves the following one-dimensional optimization problem:

maximize z(α0) = f(v0 + α0∇f(v0))
subject to α0 ≥ 0.

If ‖∇f(v1)‖ is not sufficiently small, then we move away from v1 a distance α1 in the direction of ∇f(v1). As before, we determine α1 by solving

maximize z(α1) = f(v1 + α1∇f(v1))
subject to α1 ≥ 0.

If ‖∇f(v2)‖ is sufficiently small, then we terminate the process with the knowledge that v2 is a good approximation of the stationary point v̄ of the given function f(x), with ∇f(v̄) = 0.
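A minimal symbolic sketch of one steepest-ascent step for the function of Example 7.40 below (all variable names are illustrative; the line search is done exactly with solve):

>> syms x1 x2 a
>> f = -(x1 - 3)^2 - (x2 - 2)^2;
>> g = jacobian(f, [x1, x2]);                      % gradient (row vector)
>> v = [1, 1];                                     % starting point v0
>> g0 = double(subs(g, {x1, x2}, {v(1), v(2)}));   % [4, 2]
>> phi = subs(f, {x1, x2}, {v(1) + a*g0(1), v(2) + a*g0(2)});
>> a0 = double(solve(diff(phi, a), a));            % exact step length 0.5
>> v = v + a0*g0                                   % new point v1 = (3, 2)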
Example 7.40 Use the steepest ascent method to approximate the solution to

maximize z = -(x1 - 3)² - (x2 - 2)²
subject to (x1, x2) ∈ R²,

starting with v0 = (1, 1).

Solution. Given f(x1, x2) = -(x1 - 3)² - (x2 - 2)², we have ∇f(v0) = ∇f(1, 1) = [4, 2]^T, so that

f(α0) = f(v0 + α0∇f(v0)) = -20α0² + 20α0 - 5.
Setting f'(α0) = 0 gives

-40α0 + 20 = 0,   α0 = 0.5.

Thus, our new point can be found as

v1 = v0 + α0∇f(v0) = (1, 1) + 0.5(4, 2) = (3, 2).

Since

∇f(3, 2) = [0, 0]^T,

we terminate the process. Thus, (3, 2) is the optimal solution of the given NLP problem because f(x1, x2) is a concave function:

H(3, 2) = [ -2   0 ]
          [  0  -2 ].

The first principal minors of the Hessian matrix are the diagonal entries (-2 and -2). These are both negative (nonpositive). The second principal minor is the determinant of the Hessian matrix H and equals

-2(-2) - 0 = 4 ≥ 0.
1. Start with initial point x(0) and initial (given) function f0 (x).
Example 7.41 Use the steepest ascent method to approximate the solution to the problem.

Solution. Proceeding as before, with v0 = (1, 1) we obtain

f(α0) = -20α0² + 20α0 - 5.
Setting f'(α0) = 0 gives

-40α0 + 20 = 0,   α0 = 0.5.

Thus, our new point can be found as

v1 = v0 + α0∇f(v0) = (3, 2).

Since

∇f(3, 2) = [0, 0]^T,

the process terminates here. It is important to note that this method generally converges very slowly (only linearly); therefore, it is mostly used to provide a good initial guess for the approximation of an extreme value of a function by other iterative methods. •
Example 7.42 Use the steepest descent method to approximate the solution to the following problem:

minimize z = 2x1² + x2² - x1x2 + x1 - x2,

starting with v0 = (1, 1).

Solution. Given f(x1, x2) = 2x1² + x2² - x1x2 + x1 - x2, we have ∇f(v0) = ∇f(1, 1) = [4, 0]^T and d0 = -∇f(v0) = [-4, 0]^T. Minimizing f(v0 + α0 d0) = f(1 - 4α0, 1) = 2(1 - 4α0)² with respect to α0, we set f'(α0) = 0, which gives

64α0 = 16, or α0 = 16/64 = 1/4.
Notice that

f''(α0) = f''(1/4) = 64 > 0,

so α0 = 1/4 is the minimizing point. Thus, our new point can be found as

v1 = v0 + α0 d0 = (1, 1) + (1/4)(-4, 0) = (0, 1)

and

∇f(v1) = ∇f(0, 1) = [0, 1]^T,

which gives

d1 = -∇f(v1) = -∇f(0, 1) = [0, -1]^T.

Now we find α1 to minimize

f(v1 + α1 d1) = f(0, 1 - α1),

and it gives

f(α1) = 0 + (1 - α1)² + 0 - (1 - α1) = α1² - α1.
Setting

f'(α1) = 0 = 2α1 - 1

gives α1 = 1/2. So the new point can be found as

v2 = v1 + α1 d1 = (0, 1) + (1/2)(0, -1) = (0, 1/2)

and

∇f(v2) = ∇f(0, 1/2) = [1/2, 0]^T,

which gives

d2 = -∇f(v2) = -∇f(0, 1/2) = [-1/2, 0]^T.

Continuing in the same way, we obtain

α3 = 1/2,   v4 = (-1/8, 7/16),   ∇f(v4) = [1/16, 0]^T.

Since ∇f(v4) ≈ 0, the process can be terminated at this point. The approximate minimum point is given by v4 = (-1/8, 7/16). Notice that the gradients at the points v3 and v4,

[0, 1/8]^T and [1/16, 0]^T,

are orthogonal. •
(x - x*)^T H(x*)(x - x*) ≥ 0.

Setting the gradient of the quadratic model to zero gives

0 ≈ ∇f(x0)^T + (x - x0)^T H(x0),

or

x ≈ x0 - H⁻¹(x0)∇f(x0),

and so on. Note that in both formulas, (7.45) and (7.46), the Hessian matrix and gradient vector on the right-hand side are evaluated at (x, y) = (xk, yk) and (x, y, z) = (xk, yk, zk), respectively.
Example 7.43 Use Newton's method to find the local minimum of the given function

f(x, y, z) = x + 2z + yz - x² - y² - z².

Solution. The first partial derivatives of the function are

∂f/∂x = 1 - 2x

∂f/∂y = z - 2y

∂f/∂z = 2 + y - 2z,

so the gradient of the function can be written as

∇f(x, y, z) = [1 - 2x, z - 2y, 2 + y - 2z]^T.

The second partial derivatives are

∂²f/∂x² = -2,   ∂²f/∂y² = -2,   ∂²f/∂z² = -2,

∂²f/∂x∂y = ∂²f/∂y∂x = 0,

∂²f/∂x∂z = ∂²f/∂z∂x = 0,

∂²f/∂y∂z = ∂²f/∂z∂y = 1,

so the Hessian of f is

H(x, y, z) = [ -2   0   0 ]
             [  0  -2   1 ]
             [  0   1  -2 ].
Second iteration:

(x2, y2, z2)^T = (x1, y1, z1)^T - H⁻¹∇f(x1, y1, z1) = (1/2, 2/3, 2/3)^T,

and the norm

‖∇f(x2, y2, z2)‖ = ‖(0, -2/3, 4/3)^T‖ = √(4/9 + 16/9) = 2√5/3.

Third iteration:

(x3, y3, z3)^T = (1/2, 2/3, 2/3)^T - [ -1/2 0 0; 0 -2/3 -1/3; 0 -1/3 -2/3 ] (0, -2/3, 4/3)^T = (1/2, 2/3, 4/3)^T,

and the norm

‖∇f(x3, y3, z3)‖ = ‖(0, 0, 0)^T‖ = 0.
We note that the convergence is very fast because we started sufficiently close to the optimal solution, which can also be computed analytically from

∂f/∂x = 1 - 2x = 0

∂f/∂y = z - 2y = 0

∂f/∂z = 2 + y - 2z = 0,

giving the optimal point (x, y, z) = (1/2, 2/3, 4/3).
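The gradient, Hessian, and a Newton step can also be generated symbolically; a small sketch (variable names illustrative):

>> syms x y z
>> f = x + 2*z + y*z - x^2 - y^2 - z^2;
>> g = jacobian(f, [x, y, z]).';    % gradient as a column vector
>> H = jacobian(g, [x, y, z]);      % constant Hessian matrix
>> v = [0; 0; 0];                   % an arbitrary starting point
>> v = v - double(H)\double(subs(g, {x, y, z}, {v(1), v(2), v(3)}))
% for this quadratic f, a single step reaches (1/2, 2/3, 4/3)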
This method can be used to solve the NLP problem in which all the
constraints are equality constraints. We consider an NLP problem of the
following type:
maximize (or minimize) z = f (x1 , x2 , . . . , xn )
subject to
g1 (x1 , x2 , . . . , xn ) = 0
g2 (x1 , x2 , . . . , xn ) = 0
..
. (7.47)
gm (x1 , x2 , . . . , xn ) = 0.
Then we attempt to find an optimal point (x̄1, . . . , x̄n, λ̄1, . . . , λ̄m) that maximizes (or minimizes) L(x1, . . . , xn, λ1, . . . , λm). If (x̄1, . . . , x̄n, λ̄1, . . . , λ̄m) maximizes the Lagrangian L, then at this point we have

∂L/∂λi = bi - gi(x1, . . . , xn) = 0,

where ∂L/∂λi is the partial derivative of L with respect to λi. This shows that (x̄1, x̄2, . . . , x̄n) will satisfy the constraints in (7.47). To show that (x̄1, x̄2, . . . , x̄n) solves (7.47), let (x′1, x′2, . . . , x′n) be any point that is in (7.47)'s feasible region. Since (x̄1, x̄2, . . . , x̄n, λ̄1, λ̄2, . . . , λ̄m) maximizes L, for any numbers λ′1, λ′2, . . . , λ′m, we have

L(x̄1, x̄2, . . . , x̄n, λ̄1, . . . , λ̄m) ≥ L(x′1, x′2, . . . , x′n, λ′1, . . . , λ′m). (7.49)

Since (x̄1, . . . , x̄n) and (x′1, . . . , x′n) are both feasible in (7.47), the terms in (7.48) involving the λs are all zero, and (7.49) becomes

f(x̄1, . . . , x̄n) ≥ f(x′1, . . . , x′n).

Thus, (x̄1, . . . , x̄n) does solve problem (7.47). In short, if (x̄1, . . . , x̄n, λ̄1, . . . , λ̄m) solves the unconstrained maximization problem (7.51), then (x̄1, . . . , x̄n) solves (7.47). We know that for (x̄1, x̄2, . . . , x̄n, λ̄1, . . . , λ̄m) to solve (7.51), it is necessary that at (x̄1, . . . , x̄n, λ̄1, . . . , λ̄m)

∂L/∂x1 = · · · = ∂L/∂xn = ∂L/∂λ1 = · · · = ∂L/∂λm = 0. (7.52)

The following theorems give conditions under which any point (x̄1, . . . , x̄n, λ̄1, . . . , λ̄m) that satisfies (7.52) will yield an optimal solution (x̄1, . . . , x̄n) to (7.47).
and we set

∂L/∂x1 = 0 = 1 - 2x1 + 3λ
∂L/∂x2 = 0 = 1 - 2x2 + 2λ

∂L/∂x3 = 0 = 1 - 2x3 + λ

∂L/∂λ = 0 = 3x1 + 2x2 + x3 - 10.

From the first equation of the above system, we have

1 - 2x1 + 3λ = 0,

and it gives

x1 = (1 + 3λ)/2.

The second equation of the above system is

1 - 2x2 + 2λ = 0,

which gives

x2 = (1 + 2λ)/2.

Also, the third equation of the system is

1 - 2x3 + λ = 0,

which gives

x3 = (1 + λ)/2.

Finally, the last equation of the system is simply the given constraint

3x1 + 2x2 + x3 = 10,

and using the values of x1, x2, and x3 in this equation, we get

3(1 + 3λ)/2 + 2(1 + 2λ)/2 + (1 + λ)/2 = 10.

Simplifying this expression, we get

14λ = 14, which gives λ = 1.
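The Lagrange conditions can also be handed directly to the symbolic solver; a sketch, assuming the objective f = x1 + x2 + x3 - x1² - x2² - x3² implied by the partial derivatives above:

>> syms x1 x2 x3 lam
>> L = x1 + x2 + x3 - x1^2 - x2^2 - x3^2 + lam*(3*x1 + 2*x2 + x3 - 10);
>> S = solve(diff(L,x1), diff(L,x2), diff(L,x3), diff(L,lam));
>> [S.x1, S.x2, S.x3, S.lam]   % expected: 2, 3/2, 1, and 1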
Solution. Given

f(x1, x2, x3) = (x1 - 1)² + (x2 - 2)² + (x3 - 2)²,

with the constraint

g1(x1, x2, x3) = x1² + x2² + x3² - 36,

the Lagrangian L(x1, x2, x3, λ) is defined as

L(x1, x2, x3, λ) = (x1 - 1)² + (x2 - 2)² + (x3 - 2)² + λ(x1² + x2² + x3² - 36),

which leads to the equations

∂L/∂x1 = 0 = 2(x1 - 1) + 2x1λ

∂L/∂x2 = 0 = 2(x2 - 2) + 2x2λ

∂L/∂x3 = 0 = 2(x3 - 2) + 2x3λ

∂L/∂λ = 0 = x1² + x2² + x3² - 36.
Assuming that x1, x2, x3 ≠ 0, the first three equations give

(1 - x1)/x1 = λ,   (2 - x2)/x2 = λ,   (2 - x3)/x3 = λ.

Equating the first two,

(1 - x1)/x1 = (2 - x2)/x2,

gives x2 = 2x1, and similarly

x3 = 2x1.

Substituting into the constraint, we obtain

9x1² = 36, or x1 = ±2.

Since f(2, 4, 4) = 9 and f(-2, -4, -4) = 81, the function has a minimum value at the point (2, 4, 4) and a maximum value at the other point (-2, -4, -4). •
3. Find all of the solutions (x̄, λ̄) to the following system of nonlinear algebraic equations:

∇L(x, λ) = ∇f(x) + Σ_{i=1}^{m} λi∇gi(x) = 0

∂L(x, λ)/∂λi = gi(x) = 0.

These equations are called the Lagrange conditions, and the points (x̄, λ̄) are the Lagrange points.
2x1 + x2 + x3 = 2
x1 - x2 - 3x3 = 4.

Solution. Given

f(x1, x2, x3) = x1² + x2² + x3²,

with constraints

g1(x1, x2, x3) = 2x1 + x2 + x3 - 2
g2(x1, x2, x3) = x1 - x2 - 3x3 - 4,

and m = 2, n = 3, so n > m as required, i.e., the method can be used for the given problem.

We compute the derivatives of L(x, λ) and set them equal to zero, which gives

∂L/∂x1 = 2x1 + 2λ1 + λ2 = 0
∂L/∂x2 = 2x2 + λ1 - λ2 = 0

∂L/∂x3 = 2x3 + λ1 - 3λ2 = 0

∂L/∂λ1 = 2x1 + x2 + x3 - 2 = 0

∂L/∂λ2 = x1 - x2 - 3x3 - 4 = 0.

Note that this system is linear and quite easy to solve, but in many problems the systems are nonlinear, and in such cases the Lagrange conditions cannot be solved analytically, on account of the particular nonlinearities they contain. While in other cases an analytical solution is possible, some ingenuity might be needed to find it.

Thus, solving the above linear system, one can get the values of x1, x2, and x3 as follows:

x1 = -(2λ1 + λ2)/2

x2 = -(λ1 - λ2)/2

x3 = -(λ1 - 3λ2)/2.

One can easily verify that the solution is a minimizing point, as the gradients of the constraints

∇g1(x) = [2, 1, 1]^T and ∇g2(x) = [1, -1, -3]^T

are linearly independent vectors. •
Some applications may involve more than two constraints. In particular, consider the problem of finding the extremum of f(x1, x2, x3, x4) subject to the constraints gi(x) = 0 (for i = 1, 2, 3). If f(x) has an extremum subject to these constraints, then the following conditions must be satisfied for some real numbers λ1, λ2, and λ3:

∇f + λ1∇g1 + λ2∇g2 + λ3∇g3 = 0.

By equating components and using the constraints, we obtain a system of seven equations in the seven unknowns x1, x2, x3, x4, λ1, λ2, λ3. This method can also be extended to functions of more than four variables and to more than three constraints.
minimize z = f(x1, x2, . . . , xn)
subject to
g1(x1, x2, . . . , xn) = 0
g2(x1, x2, . . . , xn) = 0
...
gm(x1, x2, . . . , xn) = 0. (7.53)

If x̄ is a local minimizing point for the NLP problem (7.53), and n > m (there are more variables than constraints), and the constraints gi (i = 1, 2, . . . , m) have continuous first derivatives with respect to xj (j = 1, 2, . . . , n), and the gradients ∇gi(x̄) are linearly independent vectors, then there is a vector λ = [λ1, λ2, . . . , λm]^T such that

∇f(x̄) + Σ_{i=1}^{m} λi∇gi(x̄) = 0. (7.54)

For fixed x̄, this vector equation is simply a system of linear equations in the variables λi (i = 1, 2, . . . , m). If the assumptions of Theorem 7.30 hold, and if there is no λ such that the preceding gradient equation holds at x̄, then the point x̄ cannot be a minimizing point.
L(x1, x2, x3, λ, µ) = 20 + 2x1 + 2x2 + x3² + λ(x1² + x2² + x3² - 11) + µ(x1 + x2 + x3 - 3),

which gives

∂L/∂x1 = 0 = 2 + 2λx1 + µ

∂L/∂x2 = 0 = 2 + 2λx2 + µ

∂L/∂x3 = 0 = 2x3 + 2λx3 + µ.

Writing the above system in matrix form, we have

[ 0 ]   [ 2   ]     [ 2x1 ]     [ 1 ]
[ 0 ] = [ 2   ] + λ [ 2x2 ] + µ [ 1 ].
[ 0 ]   [ 2x3 ]     [ 2x3 ]     [ 1 ]

Suppose that the feasible point x̄ = [-1, 3, 1]^T is the minimizing point; then

[ 0 ]   [ 2 ]     [ -2 ]     [ 1 ]
[ 0 ] = [ 2 ] + λ [  6 ] + µ [ 1 ]
[ 0 ]   [ 2 ]     [  2 ]     [ 1 ]

or

0 = 2 - 2λ + µ
0 = 2 + 6λ + µ
0 = 2 + 2λ + µ.
can use the gradient condition, the orthogonality condition, and the original constraint inequalities to find the stationary points of an inequality-constrained problem.

minimize z = f(x)
subject to
gi(x) ≤ 0, i = 1, 2, . . . , m
hi(x) = 0, i = 1, 2, . . . , p

In the above standard form, the objective function is of the minimization type, all constraint right-hand sides are zero, and the inequality constraints are of the less-than type. Before applying the KT conditions, it is necessary to ensure that the given problem has been converted to this standard form.
h(x1, x2, . . . , xn) ≥ 0

must be written as

-h(x1, x2, . . . , xn) ≤ 0.

Also, a constraint of the form

h(x1, x2, . . . , xn) = 0

can be replaced by

h(x1, x2, . . . , xn) ≤ 0

and

-h(x1, x2, . . . , xn) ≤ 0.

For example,

x1 + 3x2 = 4

can be replaced by

x1 + 3x2 ≤ 4

and

-x1 - 3x2 ≤ -4.
2. Find all of the solutions (x̄, λ̄) to the following system of nonlinear algebraic equations and inequalities:

∂L/∂xj = 0, j = 1, 2, . . . , n (gradient condition)
gi(x) ≤ 0, i = 1, 2, . . . , m (feasibility condition)
λi gi(x) = 0, i = 1, 2, . . . , m (orthogonality condition)
λi ≥ 0, i = 1, 2, . . . , m (nonnegativity condition)

3. If f(x) and the functions gi(x) are all convex, the point x̄ is the global minimizing point. Otherwise, examine each solution (x̄, λ̄) to see if x̄ is a minimizing point.

Note that for each inequality constraint, we need to consider the following two possibilities.
Now, we discuss necessary and sufficient conditions for x̄ = (x̄1, x̄2, . . . , x̄n) to be an optimal solution for the following NLP problem:

maximize (or minimize) z = f(x1, x2, . . . , xn)
subject to
g1(x1, x2, . . . , xn) ≤ 0
g2(x1, x2, . . . , xn) ≤ 0
...
gm(x1, x2, . . . , xn) ≤ 0. (7.55)

The following theorems give conditions (KT conditions) that are necessary for a point x̄ = (x̄1, x̄2, . . . , x̄n) to solve (7.55).

Theorem 7.31 (Necessary Conditions, Maximization Problem)

∂f(x̄)/∂xj - Σ_{i=1}^{m} λ̄i ∂gi(x̄)/∂xj = 0, j = 1, 2, . . . , n (7.57)

λ̄i gi(x̄) = 0, i = 1, 2, . . . , m (7.58)

λ̄i ≥ 0, i = 1, 2, . . . , m. (7.59) •
λ̄i ≥ 0, i = 1, 2, . . . , m. (7.62)

λ̄i ≥ 0, i = 1, 2, . . . , m (7.67)
µ̄j ≥ 0, j = 1, 2, . . . , n. (7.68) •
Since µ̄j ≥ 0, the first equation in the above system is equivalent to

∂f(x̄)/∂xj - Σ_{i=1}^{m} λ̄i ∂gi(x̄)/∂xj ≤ 0, j = 1, 2, . . . , n.

Thus, the KT conditions for the above maximization problem with nonnegativity constraints may be written as

∂f(x̄)/∂xj - Σ_{i=1}^{m} λ̄i ∂gi(x̄)/∂xj ≤ 0, j = 1, 2, . . . , n (7.69)

λ̄i ≥ 0, i = 1, 2, . . . , m. (7.72)
Theorem 7.34 (Necessary Conditions, Minimization Problem)

λ̄i ≥ 0, i = 1, 2, . . . , m (7.76)
µ̄j ≥ 0, j = 1, 2, . . . , n. (7.77) •

Since µ̄j ≥ 0, the first equation in the above system is equivalent to

∂f(x̄)/∂xj + Σ_{i=1}^{m} λ̄i ∂gi(x̄)/∂xj ≥ 0, j = 1, 2, . . . , n.
Thus, the KT conditions for the above minimization problem with nonnegativity constraints may be written as

∂f(x̄)/∂xj + Σ_{i=1}^{m} λ̄i ∂gi(x̄)/∂xj ≥ 0, j = 1, 2, . . . , n (7.78)

λ̄i ≥ 0, i = 1, 2, . . . , m. (7.81)

Theorems 7.31, 7.32, 7.33, and 7.34 give the necessary conditions for a point x̄ = (x̄1, x̄2, . . . , x̄n) to be an optimal solution to (7.55) and (7.69). The following theorems give the sufficient conditions for a point x̄ = (x̄1, x̄2, . . . , x̄n) to be an optimal solution to (7.55) and (7.69).
x1 + x2 - x3 ≤ 1
2x1 - x2 - 2x3 ≤ 2,

with constraints written in standard form as

g1(x1, x2, x3) = x1 + x2 - x3 - 1 ≤ 0
g2(x1, x2, x3) = 2x1 - x2 - 2x3 - 2 ≤ 0,

the Lagrangian function is

L(x, λ) = (x1 - 2)² + (x2 - 3)² + (x3 - 3)² + λ1[x1 + x2 - x3 - 1] + λ2[2x1 - x2 - 2x3 - 2].

(a) The KT conditions include

∂L/∂λ2 = g2(x) = 2x1 - x2 - 2x3 - 2 ≤ 0

λ1[x1 + x2 - x3 - 1] = 0
λ2[2x1 - x2 - 2x3 - 2] = 0
λ1 ≥ 0, λ2 ≥ 0.
(b) Consider the four possible cases:

λ1 = 0, λ2 = 0
λ1 ≠ 0, λ2 ≠ 0
λ1 = 0, λ2 ≠ 0
λ1 ≠ 0, λ2 = 0.

First Case: When λ1 = 0, λ2 = 0, using the set of equations we got from the gradient condition, we get

2(x1 - 2) = 0, which gives x1 = 2
2(x2 - 3) = 0, which gives x2 = 3
2(x3 - 3) = 0, which gives x3 = 3.

Putting these values of x1, x2, and x3 into the given first constraint, we have

x1 + x2 - x3 - 1 = 2 + 3 - 3 - 1 = 1 ≰ 0.

Hence, this case does not hold.
Second Case: When λ1 ≠ 0, λ2 ≠ 0, both constraints are active:

x1 + x2 - x3 - 1 = 0
2x1 - x2 - 2x3 - 2 = 0.

Third Case: When λ1 = 0, λ2 ≠ 0, the gradient condition gives, in particular,

2(x2 - 3) - λ2 = 0, which gives x2 = (6 + λ2)/2,

together with the active constraint

2x1 - x2 - 2x3 - 2 = 0.

The first and second partial derivatives of the objective function are

∂f/∂x1 = 2(x1 - 2);   ∂²f/∂x1² = 2

∂f/∂x2 = 2(x2 - 3);   ∂²f/∂x2² = 2

∂f/∂x3 = 2(x3 - 3);   ∂²f/∂x3² = 2

∂²f/∂x1∂x2 = ∂²f/∂x1∂x3 = ∂²f/∂x2∂x3 = 0.
Also, since the functions g1 and g2 are linear, they are convex by the definition of a convex function; hence, all the functions are convex and the point x̄ = [5/3, 8/3, 10/3]^T is the global minimum.
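With the Optimization Toolbox available, this KT solution can be checked numerically; a sketch using fmincon with the two linear inequality constraints in matrix form (x0 is an arbitrary starting point):

>> f = @(x) (x(1)-2)^2 + (x(2)-3)^2 + (x(3)-3)^2;
>> A = [1 1 -1; 2 -1 -2];    % rows encode g1 <= 0 and g2 <= 0 as A*x <= b
>> b = [1; 2];
>> x0 = [0; 0; 0];
>> x = fmincon(f, x0, A, b)  % expected: x close to [5/3, 8/3, 10/3]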
Example. Minimize f(x) = x1² + x2² + x3² + x4²

subject to

x1 + 2x2 + 3x3 + 5x4 = 10
x1 + 2x2 + 5x3 + 6x4 = 15.

Solution. We will solve the given constraints for two of the variables in terms of the other two. Solving for x1 and x3 in terms of x2 and x4, we multiply the first constraint equation by 5 and the second constraint equation by 3 and subtract the results, which gives 2x1 + 4x2 + 7x4 = 5, i.e.,

x1 = 5/2 − 2x2 − (7/2)x4.
Next, subtracting the two given constraint equations, we get 2x3 + x4 = 5, i.e.,

x3 = 5/2 − (1/2)x4.
Putting these two expressions for x1 and x3 into the given objective function, we obtain the new problem (called the reduced problem):

minimize f(x2, x4) = (5/2 − 2x2 − (7/2)x4)² + x2² + (5/2 − (1/2)x4)² + x4²,

or it can be written as

minimize f(x2, x4) = 25/2 − 10x2 + 5x2² − 20x4 + 14x2x4 + (27/2)x4².
Note that this is now an unconstrained problem, and it can be solved by setting the first partial derivatives with respect to x2 and x4 equal to zero, i.e.,

∂f/∂x2 = −10 + 10x2 + 14x4 = 0

and

∂f/∂x4 = −20 + 14x2 + 27x4 = 0.
Thus, we have a linear system of the form
10x2 + 14x4 = 10
14x2 + 27x4 = 20.
Solving this system by taking x2 = 1 − (14/10)x4 from the first equation and then putting this value into the second equation, we get

14(1 − (14/10)x4) + 27x4 = 20,

which gives x4 = 30/37, and it implies that x2 = −5/37.
Now using these values of x2 and x4, we obtain the other two variables x1 and x3 as follows:

x1 = 5/2 − 2(−5/37) − (7/2)(30/37) = −5/74

and

x3 = 5/2 − (1/2)(30/37) = 155/74.

Thus, the optimal solution is

x̄ = [−5/74, −5/37, 155/74, 30/37]^T.
Note that the main difference between this method and the method of Lagrange multipliers is that this method easily handles several constraint equations simultaneously when they are linear. The gradient of the reduced objective function is called the reduced gradient, and the method is therefore called the reduced-gradient method.
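Since the objective in this example is the sum of squares of the variables, its minimizer subject to Ax = b is the minimum-norm solution, which MATLAB computes directly; the following check (not from the text) reproduces the answer above:

>> A = [1 2 3 5; 1 2 5 6]; b = [10; 15];
>> x = A'*((A*A')\b)   % minimum-norm solution: [-5/74; -5/37; 155/74; 30/37]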
The same expressions can be obtained by Gauss–Jordan elimination on the constraint equations:

x1   x2   x3   x4   constants
 1    2    3    5       10
 1    2    5    6       15

x1   x2   x3   x4   constants
 1    2    3    5       10
 0    0    2    1        5

x1   x2   x3   x4   constants
 1    2    0   7/2      5/2
 0    0    1   1/2      5/2
which gives

x1 = 5/2 − 2x2 − (7/2)x4
x3 = 5/2 − (1/2)x4,
the same as above. •
Nonlinear Constraints

In the previous example, we solved a problem having linear constraints by the reduced-gradient method. Now we will deal with a problem having nonlinear constraints and consider the possibility of approximating such a problem by a problem with linear constraints. To do this we expand each nonlinear constraint function in a Taylor series and then truncate the terms beyond the linear one.
Consider the problem: minimize

z = x1 + 2x2² + x3 + x4²,

subject to

x1 + x2² + x3 + 2x4 = 2
2x1 + 2x2 + x3 + x4² = 4.

If we truncate the Taylor series after the linear term, we obtain the linear approximation

f(x) ≈ f(x*) + ∇f(x*)^T (x − x*).

With the help of this formula we can replace each nonlinear constraint by a linear constraint that approximates the true constraint in the vicinity of the point x* at which the linearization is performed:
g1(x) ≈ [x1* + (x2*)² + x3* + 2x4* − 2] + [1, 2x2*, 1, 2] [x1 − x1*, x2 − x2*, x3 − x3*, x4 − x4*]^T.
As the iterations proceed, the solutions of the subproblems approach the optimal point, and the linearized constraints of the subproblems become better approximations to the original nonlinear constraints in the neighborhood of the optimal point. At the optimal point, the linearized problem has the same solution as the original nonlinear problem.

Solving the linearized constraints for x1 and x3 gives

x1 = 1 − 4x2 + 4x4
x3 = 3 + 6x2 − 6x4,

so that

f^(0)(x) = (1 − 4x2 + 4x4) + 2x2² + (3 + 6x2 − 6x4) + x4² = 4 + 2x2 + 2x2² − 2x4 + x4².
We can also solve this problem by converting the given constraints to express two of the variables in terms of the other two. Solving for x1 and x3, we obtain

x1 = 2 − x2² − x3 − 2x4

and

x3 = 4 − 2x1 − 2x2 − x4².

Putting the x1 equation into the x3 equation, we get

x3 = 2x2 − 2x2² − 4x4 + x4², and hence x1 = 2 + x2² − 2x2 + 2x4 − x4².

Using these new values of x1 and x3, the given objective function becomes
f(x) = (2 + x2² − 2x2 + 2x4 − x4²) + 2x2² + (2x2 − 2x2² − 4x4 + x4²) + x4²

or

f(x) = 2 + x2² − 2x4 + x4².
Setting the first partial derivatives with respect to x2 and x4 equal to zero gives

∂f/∂x2 = 2x2 = 0

and

∂f/∂x4 = −2 + 2x4 = 0.
These give x2 = 0 and x4 = 1. Then

x1 = 2 + (0)² − 2(0) + 2(1) − (1)² = 3

and

x3 = 2(0) − 2(0)² − 4(1) + (1)² = −4 + 1 = −3.

Thus, the optimal solution is x̄ = [3, 0, −3, 1]^T, with z = 3 + 0 − 3 + 1 = 1. •
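As an independent check (not in the original text), the problem with nonlinear equality constraints can be handed to fmincon; the deal trick below is one common way to return the constraint pair [c, ceq] from an anonymous function:

>> f = @(x) x(1) + 2*x(2)^2 + x(3) + x(4)^2;         % objective
>> ceq = @(x) [x(1)+x(2)^2+x(3)+2*x(4)-2; ...        % equality constraints = 0
               2*x(1)+2*x(2)+x(3)+x(4)^2-4];
>> nonlcon = @(x) deal([], ceq(x));                  % no inequality part
>> x0 = [0; 0; 0; 0];
>> x = fmincon(f, x0, [], [], [], [], [], [], nonlcon)   % approx. [3; 0; -3; 1]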
3. Solve the linear constraint equations for the basic variables in terms of the other (nonbasic) variables.

5. Find the basic variables using the nonbasic variables from the linear constraint equations.

6. Repeat all the previous steps until you reach the desired accuracy.
For example, the function f(x1, x2) = x1³ − 4x1² + 2x1 + 2x2² − 3x2 is separable because f(x1, x2) = f1(x1) + f2(x2), where

f1(x1) = x1³ − 4x1² + 2x1 and f2(x2) = 2x2² − 3x2.

But a function in which the variables appear together in mixed terms (products or compositions of x1 and x2) is not separable.
Sometimes the given nonlinear functions are not separable, but they can be made separable by an appropriate substitution. For example, given the nonlinear programming problem

maximize z = x1 e^(x2),

if we substitute

w = x1 e^(x2),

then

ln w = ln x1 + x2.

Thus, the problem

maximize z = w,

subject to

ln w = ln x1 + x2,

is called a separable programming problem. Note that the substitution assumes that x1 is a positive variable. •
There are different ways to deal with the separable programming prob-
lems, but we will solve the problems by the McMillan method.
where each variable is written in terms of grid (breakpoint) values,

xn = Σ_{k=0}^{d} λkn xkn,

and the λ's must satisfy:

1. Σ_{k=0}^{d} λkj = 1, j = 1, 2, . . . , n,

2. λkj ≥ 0, k = 0, 1, . . . , d, j = 1, 2, . . . , n, and

3. no more than two of the λ's that are associated with any one variable j are greater than zero, and if two are greater than zero, they must be adjacent.
Solution. Both the objective function and the constraint are separable functions, because

f(x) = f1(x1) + f2(x2),

where

f1(x1) = (x1 − 2)² and f2(x2) = 4(x2 − 6)²,

and also

g(x) = g1(x1) + g2(x2),

where

g1(x1) = 6x1 and g2(x2) = 4(x2 − 6)².

Notice that both x1 and x2 enter the problem nonlinearly in the objective function and the constraint, so we must write both x1 and x2 in terms of the λ's. (If a variable were linear throughout the entire problem, it would not be necessary to write it in terms of the λ's; it could be used as a variable itself.)
With the bounds

0 ≤ x1 ≤ 2 and 0 ≤ x2 ≤ 1,

the problem in the λ's is subject to

6λ11 + 12λ21 + 3λ02 + (27/4)λ12 + 12λ22 ≤ 12
λ01 + λ11 + λ21 = 1
λ02 + λ12 + λ22 = 1
λij ≥ 0, i, j = 0, 1, 2.
Carrying out the simplex iterations on this problem leads to a final tableau with objective value z = 104 and basic solution λ01 = 1, λ22 = 1, λ11 = 0 (all other λ's nonbasic and zero).
Since
λ11 = 0, λ01 = 1, λ22 = 1,
we have
x1 = λ01 x01 + λ11 x11 + λ21 x21 = 0
x2 = λ02 x02 + λ12 x12 + λ22 x22 = 1.
Hence,
f (x) = (x1 − 2)2 + 4(x2 − 6)2 = 4 + 100 = 104.
•
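If the adjacency restriction is set aside (the optimum found above happens to satisfy it anyway), the λ-formulation is an ordinary LP and can be checked with MATLAB's linprog; the objective coefficients below are f1 and f2 evaluated at the assumed grid points x1 ∈ {0, 1, 2} and x2 ∈ {0, 1/2, 1}:

>> fcoef = [4; 1; 0; 144; 121; 100];          % f1(0),f1(1),f1(2),f2(0),f2(1/2),f2(1)
>> A = [0 6 12 3 27/4 12]; b = 12;            % constraint row from the tableau
>> Aeq = [1 1 1 0 0 0; 0 0 0 1 1 1]; beq = [1; 1];
>> lb = zeros(6,1);
>> lam = linprog(fcoef, A, b, Aeq, beq, lb)   % lam01 = 1, lam22 = 1, objective 104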
An NLP problem whose constraints are linear and whose objective function is the sum of terms of the form x1^{k1} x2^{k2} · · · xn^{kn} (with each term having a degree of 2, 1, or 0) is a quadratic programming problem.

There are several algorithms that can be used to solve quadratic programming problems. Here we describe Wolfe's method, which applies the simplex method to the KT conditions of the problem; the basic requirement of this method is that all the variables be nonnegative.
Example. Minimize z = 2x1² + 4x1x2 + 3x2² − 6x1 − 3x2,

subject to

x1 + x2 ≤ 1
2x1 + 3x2 ≤ 4
x1, x2 ≥ 0.
e1 x1 = 0
e2 x2 = 0
λ1 s1 = 0
λ2 s2 = 0.
Observe that, with the exception of the last four equations, the KT conditions are all linear or nonnegativity constraints. The last four equations are the complementary slackness conditions for this quadratic programming problem, which state:

"ei and xi cannot both be positive, and the excess variable for the ith constraint and λi cannot both be positive."

To ensure that the final solution (with all the artificial variables equal to zero) satisfies these slackness conditions, Wolfe's method modifies the simplex choice of the entering variable as follows:
1. Never perform a pivot that would make ei and xi both basic variables.

2. Never perform a pivot that would make the slack (or excess) variable for the ith constraint and λi both basic variables.
To apply Wolfe's method to the given problem, we have to solve the LP problem

minimize w = a1 + a2,

subject to

4x1 + 4x2 + λ1 + 2λ2 − e1 + a1 = 6
4x1 + 6x2 + λ1 + 3λ2 − e2 + a2 = 3
x1 + x2 + s1 = 1
2x1 + 3x2 + s2 = 4
e1 x1 = e2 x2 = λ1 s1 = λ2 s2 = 0
x1, x2 ≥ 0, λi ≥ 0, i = 1, 2.
The initial tableau is

x1   x2   λ1   λ2   e1   e2   s1   s2   a1   a2    rhs
 8   10    2    5   −1   −1    0    0    0    0    w = 9
 4    4    1    2   −1    0    0    0    1    0    a1 = 6
 4    6    1    3    0   −1    0    0    0    1    a2 = 3
 1    1    0    0    0    0    1    0    0    0    s1 = 1
 2    3    0    0    0    0    0    1    0    0    s2 = 4
After x2 enters the basis and a2 leaves (the a2 column is dropped once a2 leaves the basis), we obtain

x1    x2   λ1    λ2    e1    e2    s1   s2   a1    rhs
4/3    0   1/3    0    −1    2/3    0    0    0    w = 4
4/3    0   1/3    0    −1    2/3    0    0    1    a1 = 4
2/3    1   1/6   1/2    0   −1/6    0    0    0    x2 = 1/2
1/3    0  −1/6  −1/2    0    1/6    1    0    0    s1 = 1/2
 0     0  −1/2  −3/2    0    1/2    0    1    0    s2 = 5/2
Next, x1 enters and x2 leaves:

x1    x2    λ1    λ2    e1   e2    s1   s2   a1    rhs
 0    −2     0    −1    −1    1     0    0    0    w = 3
 0    −2     0    −1    −1    1     0    0    1    a1 = 3
 1    3/2   1/4   3/4    0   −1/4   0    0    0    x1 = 3/4
 0   −1/2  −1/4  −3/4    0    1/4   1    0    0    s1 = 1/4
 0     0   −1/2  −3/2    0    1/2   0    1    0    s2 = 5/2
Then e2 enters and s1 leaves:

x1   x2   λ1   λ2   e1   e2   s1   s2   a1    rhs
 0    0    1    2   −1    0   −4    0    0    w = 2
 0    0    1    2   −1    0   −4    0    1    a1 = 2
 1    1    0    0    0    0    1    0    0    x1 = 1
 0   −2   −1   −3    0    1    4    0    0    e2 = 1
 0    1    0    0    0    0   −2    1    0    s2 = 2
After λ2 enters and a1 leaves (the a1 column is now dropped as well), we get

x1   x2   λ1    λ2   e1    e2   s1   s2    rhs
 0    0    0     0    0     0    0    0    w = 0
 0    0   1/2    1   −1/2   0   −2    0    λ2 = 1
 1    1    0     0    0     0    1    0    x1 = 1
 0   −2   1/2    0   −3/2   1   −2    0    e2 = 4
 0    1    0     0    0     0   −2    1    s2 = 2

Although w = 0 here, this basic solution has λ2 = 1 and s2 = 2 both positive, violating the complementary slackness condition λ2 s2 = 0, so one more pivot is performed, bringing λ1 into the basis in place of λ2:
x1   x2   λ1   λ2   e1   e2   s1   s2    rhs
 0    0    0    0    0    0    0    0    w = 0
 0    0    1    2   −1    0   −4    0    λ1 = 2
 1    1    0    0    0    0    1    0    x1 = 1
 0   −2    0   −1   −1    1    0    0    e2 = 3
 0    1    0    0    0    0   −2    1    s2 = 2
We note from the last tableau that w = 0, so we have found a solution that
satisfies the KT conditions and is optimal for the quadratic programming
problem. Thus, the optimal solution to the quadratic programming problem
is
x1 = 1 and x2 = 0.
We also note that
λ1 = 2, λ2 = 0, e1 = 0, e2 = 3, s1 = 0, s2 = 2,
which satisfies
e1 x1 = 0, e2 x2 = 0, λ1 s1 = 0, λ2 s2 = 0.
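Assuming the quadratic objective reconstructed above (z = 2x1² + 4x1x2 + 3x2² − 6x1 − 3x2, i.e., H = [4 4; 4 6] and f = [−6; −3] in the standard form (1/2)x^T H x + f^T x), the result can be confirmed with quadprog from the Optimization Toolbox:

>> H = [4 4; 4 6]; f = [-6; -3];
>> A = [1 1; 2 3]; b = [1; 4];
>> lb = [0; 0];
>> [x, fval] = quadprog(H, f, A, b, [], [], lb)   % x = [1; 0], matching the tableau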
7.12 Summary

Nonlinear programming is a vast subject, and in this chapter we gave a brief introduction to nonlinear programming problems. We started with a review of differential calculus. Classical optimization theory uses differential calculus to determine points of extrema (maxima and minima) for unconstrained and constrained problems. These methods may not be suitable for efficient numerical computation, but they provide the theory that is the basis for most nonlinear programming algorithms. Solution methods for nonlinear programming problems were discussed, including direct and indirect methods. For the one-dimensional optimization problem we used three numerical methods. First, we used the golden-section search, a direct search method that brackets the optimum within an interval of uncertainty and locates it by iteratively shrinking that interval to any given accuracy. The other two one-dimensional methods we discussed are the quadratic interpolation method and Newton's method; both converge faster than the golden-section search.
7.13 Problems
1. Find the following limits as x approaches 0:

(a) sin x² / (x tan x).
(b) (1 + cos²x + 3x² − 2 cos x) / x².
(c) (√(x + 2) − √2) / x.
(d) (sin x − 1 + cos x) / x.
2. Find the values of a and b that make each of the following functions continuous:

(a) f(x) = { 3x + 1, if x ≤ −1;  ax + b, if −1 < x < 4;  x + 4, if x ≥ 4. }

(b) f(x) = { x + 1, if 1 < x < 3;  x² + ax + b, if |x − 2| ≥ 1. }

(c) f(x) = { ax − b, if x ≤ −2;  (x² − 4)/(x + 2), if −2 < x < 1;  x² + b, if x ≥ 1. }

(d) f(x) = { (2 sin 2x)/x, if x < 0;  a − 3x, if x > 0;  b, if x = 0. }
4. Find the local extrema using the second derivative test, and find the
point of inflection of the following functions:
9. Find the Hessian matrix for the following functions at the indicated
points:
10. Find the linear and quadratic approximations of the following func-
tions at the given point (a, b) using Taylor’s series formulas:
12. Find the matrices associated with each of the following quadratic
forms:
14. Use the bisection method to find solutions accurate to within 10−4
on the indicated interval of the following functions:
15. Use the bisection method to find solutions accurate to within 10−4
on the indicated intervals of the following functions:
17. Use Newton’s method to find a solution accurate to within 10−4 using
the given initial approximations of the following functions:
20. Solve the following systems by fixed-point iteration using the indicated initial approximation (x0, y0), stopping when successive iterates differ by less than 10⁻⁴:

(a) x² + y = 4, x + y² = 3.
(b) 3x + e^y = 4, x² + y = 2.
21. Find the maximum value of the following functions using accuracy = 0.005 by the golden-section search:

(b) f(x) = sin x − x²/5, [0, 2].
(c) f(x) = (1/5)x³ − 2x² − 1.75x + 1.5, [0, 1].
(d) f(x) = x⁶ − 5x⁵ + 3x⁴ + 2x² + 4, [2, 4].
22. Find the extrema of the following functions using accuracy = 0.005 by the quadratic interpolation method for optimization:

(a) Minimize f(x) = x³ + (3/2)x² − 6x + 5, x0 = 0.0, x1 = 0.5, x2 = 1.5.
(b) Maximize f(x) = x³ − 3x² − 9x + 2, x0 = −2.0, x1 = 0.0, x2 = 1.0.

23. Find the extrema of the following functions using accuracy = 0.005 by the quadratic interpolation method for optimization:

(b) Maximize f(x) = (1/3)x³ + 2x² − 12x + 1, x0 = −4.0, x1 = −5.5, x2 = −7.0.
24. Find the extrema of the following functions using accuracy = 0.005
by Newton’s method for optimization:
25. Find the extrema of the following functions using accuracy = 0.005
by Newton’s method for optimization:
26. Find the extrema and saddle points of each of the following functions:
27. Find the extrema and saddle points of each of the following functions:
(c) z = f(x, y) = x³ + y³ − 6y² − 3x + 9.
28. Use the method of steepest ascent to approximate (up to given iter-
ations) the optimal solution to the following problems:
29. Use the method of steepest descent to approximate (up to the given
iterations) the optimal solution to the following problems:
(a) w = f(x, y, z) = x² + y² + z²
Subject to
g1(x, y, z) = x + y + z − 12 = 0
g2(x, y, z) = x² + y² − z = 0

(b) w = f(x, y, z) = x + y + z
Subject to
g1(x, y, z) = x² − y² − z = 0
g2(x, y, z) = x² + z² − 4 = 0

(c) w = f(x, y, z) = x + 2y
Subject to
g1(x, y, z) = x + y + z − 1 = 0
g2(x, y, z) = x² + z² − 4 = 0

(d) w = f(x, y, z) = xy + yz
Subject to
g1(x, y, z) = xy − 1 = 0
g2(x, y, z) = y² + z² − 1 = 0

(a) w = f(x, y, z) = x² + y² + z²
Subject to
g1(x, y, z) = x − y − 1 = 0
g2(x, y, z) = y² − z² − 1 = 0

(b) w = f(x, y, z) = x² + y² + z²
Subject to
g1(x, y, z) = x² + y² − z² = 0
g2(x, y, z) = x − 4z + 5 = 0

(c) w = f(x, y, z) = x² + y² + z²
Subject to
g1(x, y, z) = x² + y² + 2z − 16 = 0
g2(x, y, z) = x + y − 4 = 0

(d) w = f(x, y, z) = 4x − 2y − 5z
Subject to
g1(x, y, z) = x + y − z − 1 = 0
g2(x, y, z) = x² + y² + 2z² − 2 = 0
36. Use the KT conditions to find a solution to the following nonlinear
programming problems:
x ≥ 0, y ≥ 0
x ≥ 0, y ≥ 0, z ≥ 0
38. Use the reduced-gradient method to find the extrema of function f
subject to the stated constraints:
A.1 Introduction

Here, we study in broad outline the floating-point representation used in computers for real numbers and the errors that result from the finite nature of this representation. We give a general overview of how the computer represents and manipulates numbers. We see later that such considerations affect the choice and design of computer algorithms for solving higher-order problems. We introduce several definitions and concepts that may be unfamiliar. The reader should not spend time trying to master all of these immediately, but should rather try to acquire a rough idea of the sorts of difficulties that can arise from computer solutions of mathematical problems. We describe methods for representing numbers on computers and the errors introduced by these representations. In addition, we examine other sources of various types of computational errors.
This takes care of the positive whole numbers. A number between 0 and 1 is represented by a string of digits to the right of a decimal point. In such an expansion, the integer part is the first summation and the fractional part is the second. Computers, however, don't use the decimal system in computations and memory; they use the binary system. The binary system is natural for computers because computer memory consists of a huge number of two-state devices.
1 × 2⁴ + 1 × 2³ + 1 × 2² + 0 × 2¹ + 1 × 2⁻¹ + 1 × 2⁻²
The base of a number system is also called the radix. The base of a number
is denoted by a subscript, for example, (4.445)10 is 4.445 in base 10 (deci-
mal), (1011.11)2 is 1011.11 in base 2 (binary), and (18C7.90)16 is 18C7.90
in base 16 (hexadecimal).
The conversion of an integer from one system to another is fairly simple and can probably best be presented in terms of an example. Let k = 275 in decimal form, i.e., k = (2 × 10²) + (7 × 10¹) + (5 × 10⁰). Now (k/16²) > 1 but (k/16³) < 1, so in hexadecimal form k can be written as k = (α2 × 16²) + (α1 × 16¹) + (α0 × 16⁰). Now, 275 = 1(16²) + 19 = 1(16²) + 1(16) + 3, so that k = (113)16.
x = ±M × 10^e,

where the normalized mantissa M satisfies 1/10 ≤ M < 1. Normalization consists of finding the exponent e for which |x|/10^e lies in the interval [1/10, 1), then taking M = |x|/10^e. This corresponds to "floating" the decimal point to the left of the leading significant digit of x's decimal representation, then adjusting e as needed. For example,

−12.75 has representation −0.1275 × 10² (M = 0.1275, e = 2)
0.1 has representation +0.1 × 10⁰ (M = 0.1, e = 0)
1/15 = 0.0666· · · has representation +0.666· · · × 10⁻¹ (M = 0.666· · ·, e = −1).
A machine number for a calculator is a real number that it stores exactly in normalized floating-point form. For calculator storage, a nonzero x is a machine number if and only if its normalized floating-point decimal representation is of the form

x = ±M × 10^e,

where

M = 0.d1 d2 · · · dk (each di = 0, 1, 2, . . . , 9), with d1 ≠ 0
e = −100, −99, · · · , +99.

The condition d1 ≠ 0 ensures normalization (i.e., M ≥ 1/10).
In the binary system the normalized form is

x = ±M × 2^e.

For example,

−12.75 = −51/4 can be represented as −(51/64) × 2⁴ (M = 51/64, e = 4)
0.1 = 1/10 can be represented as +(4/5) × 2⁻³ (M = 4/5, e = −3)
1/15 = 0.0666· · · can be represented as +(8/15) × 2⁻³ (M = 8/15, e = −3).
                  Single precision   Double precision
Sign (bits)              1                  1
Exponent (bits)          8                 11
Mantissa (bits)         24                 53
From this we see that the above representation uses 11 bits for the binary
exponent, which therefore ranges from about −210 to 210 . (The actual range
is not exactly this because of special representations for small numbers
and for ±∞.) The mantissa has 53 bits including the implicit bit. If x =
M × 2e is a normalized MATLAB floating-point number, then M ∈ [1, 2)
is represented by
M = 1 + Σ_{k=1}^{52} dk 2⁻ᵏ.
Since 210 = 1024 ≈ 103 , these 53 significant bits are equivalent to approxi-
mately 16 significant decimal digits accuracy in MATLAB representation.
The fact that the mantissa has 52 bits after the binary point means that
the next machine number greater than 1 is 1 + 2−52 . This gap is called
machine epsilon.
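One way to see this gap in MATLAB itself (a small illustrative check, not from the text):

>> eps              % machine epsilon, 2.2204e-16
>> 1 + eps > 1      % true: eps is the smallest gap above 1
>> 1 + eps/2 > 1    % false: eps/2 is lost when rounding to the nearest double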
In MATLAB, neither underflow nor overflow cause a program to stop.
Underflow is replaced by a zero, while overflow is replaced by ±∞. This
allows subsequent instructions to be executed and may permit meaningful
results. Frequently, however, it will result in meaningless answers such
as ±∞ or NaN, which stands for Not-a-Number. NaN is the result of
indeterminate arithmetic operations such as 0/0, ∞/∞, 0 · ∞, ∞ − ∞, etc.
There are two commonly used ways of translating a given real number x into a k-digit floating-point number: rounding and chopping, which we discuss in the following section.
There are two ways of rounding off numbers to a given number (k) of decimals. In chopping, one simply leaves off all the decimals to the right of the kth. That way of abridging a number is not recommended, since the error systematically has the opposite sign of the number itself. Also, the magnitude of the error can be as large as 10⁻ᵏ. A surprising number of computers use chopping on the results of every arithmetical operation. This usually does not do much harm, because the number of digits used in the operations is generally far greater than the number of significant digits in the data. In rounding (sometimes called "correct rounding"), one chooses the number closest to the given number. Thus, if the part of the number that stands to the right of the kth decimal is less than 1/2 × 10⁻ᵏ in magnitude, one leaves the kth decimal unchanged. If it is greater than 1/2 × 10⁻ᵏ, one raises the kth decimal by 1. In the boundary case, when the part that stands to the right of the kth decimal is exactly 1/2 × 10⁻ᵏ, one raises the kth decimal by 1 if it is odd and leaves it unchanged if it is even. In this way, the error is positive or negative about equally often. Most computers that perform rounding always, in this boundary case, raise the number by 1/2 × 10⁻ᵏ (or the corresponding operation in a base other than 10), because this is easier to realize technically. Whichever convention one chooses in the boundary case, the error in rounding will always lie in the interval [−1/2 × 10⁻ᵏ, 1/2 × 10⁻ᵏ]. For example, when shortening 0.23557 to three decimals, chopping gives 0.235 while rounding gives 0.236.
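The two schemes can be imitated with the anonymous functions below (a sketch; note that MATLAB's built-in round rounds halves away from zero rather than to the even digit):

>> chop = @(x,k) fix(x*10^k)/10^k;    % drop everything past the kth decimal
>> rnd  = @(x,k) round(x*10^k)/10^k;  % correct rounding to k decimals
>> chop(0.23557, 3)                   % 0.2350
>> rnd(0.23557, 3)                    % 0.2360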
A.3 Error

An approximate number p is a number that differs slightly from an exact number α. We write

p ≈ α.

The error of the approximation is

E = α − p. (A.5)

When rounding a number, if the discarded part is less than one-half unit in the last retained place, the retained digits are left unchanged; for example,

48.47263 ≈ 48.4726.

3. If the first discarded digit is exactly 5 and there are nonzero digits among those discarded, add 1 to the last retained digit. For example,

3.0554 ≈ 3.06.
4. If the first discarded digit is exactly 5 and all other discarded dig-
its are zero, the last retained digit is left unchanged if it is even,
otherwise, 1 is added to it. For example,
3.05500 ≈ 3.06
3.04500 ≈ 3.04.
With these rules, the error is never larger in magnitude than one-half unit
of the place of the nth digit in the rounded number.
To understand the nature of round-off errors, it is necessary to learn
the ways numbers are stored and additions and subtractions are performed
in a computer. •
(For sums, the error bounds simply add; e.g., adding two numbers rounded to three and four decimal places gives an error of at most (1/2 × 10⁻³) + (1/2 × 10⁻⁴).)

Now consider the product of two approximate numbers. If ce = ae be, then

ce = (ar − EA)(br − EB) = ar br − EA br − EB ar + EA EB,

and since cr = ar br,

EC = EA br + EB ar − EA EB,

so

EC/cr = (EA br + EB ar − EA EB)/(ar br) = EA/ar + EB/br − (EA EB)/(ar br).
The last term has as its numerator the product of two very small numbers, both of which are also small compared with ar and br, so we neglect the last term; then we have

EC/cr = EA/ar + EB/br. (A.15)

The number EA/ar is called the relative error in ar. Then from (A.15), we have

|EC/cr| ≤ |EA/ar| + |EB/br|. (A.16)
This result can be extended to the product of more than two numbers and simply increases the number of terms on the right-hand side of the formula. For example, consider the error in 1.015 × 0.3573, where both numbers have been rounded off. Then

|EC/0.363| ≤ (1/2 × 10⁻³)/1.015 + (1/2 × 10⁻⁴)/0.3573.

Hence,

|EC/0.363| ≤ (0.49 × 10⁻³) + (0.14 × 10⁻³) ≤ 0.63 × 10⁻³.

So, we have

|EC| ≤ 0.63 × 0.363 × 10⁻³ ≤ 0.23 × 10⁻³.

Thus, the true product lies within

0.3626595 ± 0.00023,

i.e., between 0.3624295 and 0.3628895, so this result may be correctly rounded off to 0.36, i.e., to 2dp.
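The bound in this example is easy to reproduce numerically (an illustrative check):

>> a = 1.015; b = 0.3573;
>> relBound = (0.5e-3)/a + (0.5e-4)/b   % about 0.63e-3
>> absBound = relBound*(a*b)            % about 0.23e-3, as found above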
For the quotient of two approximate numbers, let

ce = ae/be.

Then

cr − EC = (ar − EA)/(br − EB) = [ar(1 − EA/ar)] / [br(1 − EB/br)] = (ar/br)(1 − EA/ar)(1 − EB/br)⁻¹.

The factor

(1 − EB/br)⁻¹

is expanded using the binomial series, neglecting those terms involving powers of the relative error EB/br. Thus,

cr − EC = (ar/br)(1 − EA/ar)(1 + EB/br + · · ·)
        = (ar/br)(1 − EA/ar + EB/br)   (neglecting EA EB/(ar br))
        = ar/br − EA/br + (EB ar)/br²,

and

EC/cr = (EA/br − (EB ar)/br²) ÷ (ar/br) = EA/ar − EB/br.

Hence,

|EC/cr| ≤ |EA/ar| + |EB/br|,
which gives the same result as for the product of the two numbers. It
follows that it is possible to extend this result to quotients with two or
more factors in the numerator or denominator by simply increasing the
number of terms on the right-hand side. For example, consider the error
in 17.28 ÷ 2.136, where both numbers have been rounded off. Then
Therefore,

|EC/8.09| ≤ (1/2 × 10⁻²)/17.28 + (1/2 × 10⁻³)/2.136,

so that |EC| ≤ 0.0042, and the true quotient lies within

8.08989 ± 0.0042,

i.e., between 8.08569 and 8.09409, so this result may be correctly rounded off to 8.09, i.e., to 2dp; the value of |EC| suggested this directly. The quotient could be given to 3dp as 8.090, but with a large error of up to 5 units in the third decimal place.
Finally, consider a power be = (ae)^p. Using ae = ar − EA, we have

br − EB = (ar − EA)^p = (ar)^p (1 − EA/ar)^p = (ar)^p (1 − pEA/ar + · · ·).

Using the binomial series and neglecting those terms involving powers of the relative error EA/ar gives

br − EB = (ar)^p − pEA(ar)^{p−1},

which implies that

EB = pEA(ar)^{p−1},

and so

EB/br = pEA(ar)^{p−1}/(ar)^p = p(EA/ar).
Hence,

|EB/br| = |p| |EA/ar|,

i.e., the relative error modulus of a power of a number is equal to the product of the modulus of the power and the relative error modulus of the number. For example, consider √8.675, where 8.675 has been rounded off. Here, p = 1/2, and by calculator √8.675 = 2.9453, retaining 4dp. Thus,

|EB/2.945| ≤ (1/2)(1/2 × 10⁻³)/8.675 ≤ 0.029 × 10⁻³,

so that

|EB| ≤ 0.85 × 10⁻⁴.

This means that √8.675 may be correctly rounded off to 2.945, i.e., to 3dp, or may be given to 4dp with an error of up to 1 unit in the fourth decimal place. •
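Again, the bound can be reproduced numerically (illustrative only):

>> p = 1/2; a = 8.675;
>> relBound = abs(p)*((0.5e-3)/a)   % about 0.029e-3
>> absBound = relBound*sqrt(a)      % about 0.85e-4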
A.6 Summary
In this chapter, we discussed the storage and arithmetic of numbers on a
computer. Efficient storage of numbers in computer memory requires allo-
cation of a fixed number of bits to each value. The fixed bit size translates
to a limit on the number of decimal digits associated with each number,
which limits the range of numbers that can be stored in computer memory.
The three number systems most commonly used in computing are bi-
nary (base 2), decimal (base 10), and hexadecimal (base 16). Techniques
were developed for transforming back and forth between the number sys-
tems. Binary numbers are a natural choice for computers because they
correspond directly to the underlying hardware, which features transistors
that are switched on and off.
The absolute and relative errors were discussed as measures of the difference between an exact value x and an approximate value x̂. They were applied to the storage mechanisms of chopping and rounding to estimate the maximum error introduced when storing a number. Rounding is somewhat more accurate than chopping (ignoring excess digits), but chopping is typically used because it is simpler to implement in hardware.
Round-off error is one of the principal sources of error in numerical
computations. Mathematical operations on floating-point values introduce
round-off errors because the results must be stored with a limited number
of decimal digits. In numerical calculations involving many operations,
round-off gradually corrupts the least significant digits of the results.
The other main source of error in numerical computations is called trun-
cation error. Truncation error is the error that arises when approximations
to exact mathematical expressions are used, such as the truncation of an
infinite series to a finite number of terms. Truncation error is independent
of round-off errors, although these two sources of error combine to affect
the accuracy of a computed result. Truncation error considerations are
important in many procedures and are discussed throughout the book. •
A.7 Problems
1. Convert the following binary numbers to decimal form:
16. Find the absolute error in each of the following calculations (all numbers are rounded):

(a) 187.2 + 93.5.
(b) 0.281 × 3.7148.
(c) √28.315.
(d) √((6.2342 × 0.82137)/27.268).
Appendix B
Mathematical Preliminaries
This appendix presents some of the basic mathematical concepts that are
used frequently in our discussion. We start with the concept of vector
space, which is useful for the discussion of matrices and systems of linear
equations. We also give a review of complex numbers and how they can
be used in linear algebra. This appendix is also devoted to general inner
product spaces and how the different notations and processes generalize.
>> u = [3 − 4];
>> v = norm(u);
Operations on Vectors
u + v = v + u
(u + v) + w = u + (v + w).
5. The two vectors i =< 1, 0 > and j =< 0, 1 > have magnitude 1 and
can be used to obtain another way of denoting vectors as
u =< u1 , u2 >= u1 i + u2 j.
If a ≠ 0, then the unit vector u that has the same direction as a is defined as

u = a/‖a‖.

For example, the unit vector u that has the same direction as 4i − 3j is

u = (4i − 3j)/√(4² + (−3)²) = (4i − 3j)/5 = (4/5)i − (3/5)j.

The unit vector can be obtained using the MATLAB Command Window as follows:

>> a = [4 − 3];
>> u = a/norm(a);
Now we define two useful concepts that involve vectors u and v—the dot
product, which is a scalar, and the cross product, which is a vector. First,
we define the dot product of two vectors as follows.
Definition B.3 (Dot Product of Vectors)
The multiplication for two vectors u =< u1 , u2 > and v =< v1 , v2 > is
called the dot product (or scalar product) and is symbolized by u.v. It is
defined as
u.v =< u1 , u2 > . < v1 , v2 >= u1 v1 + u2 v2 .
For example, if u =< 3, −4 > and v =< 2, 3 > are two vectors, then
>> u = [3 − 4];
>> v = [2 3];
>> dot(u, v);
5. u.u = ‖u‖². •
The dot product of two vectors can also be defined as follows.

Definition B.4 (Dot Product of Vectors)
If u and v are nonzero vectors, and θ is the angle between them, then

u.v = ‖u‖ ‖v‖ cos θ.

For example, for the vectors < 4, 3 > and < 2, 5 >,

cos θ = (< 4, 3 > . < 2, 5 >)/(‖< 4, 3 >‖ ‖< 2, 5 >‖),

which gives

θ = cos⁻¹(23/(5√29)) ≈ 0.5468 radian = 31.3287 degrees,

which is called the angle between the given vectors. •

The angle between two vectors can be obtained using the MATLAB Command Window as follows:

>> u = [4 3];
>> v = [2 5];
>> T = acos(dot(u, v)/(norm(u)*norm(v)));
>> (360*T)/(2*pi);
Example B.2 If u =< 3, 4, −2 > and v =< −5, 7, 6 >, then find u + v, 4u − 3v, and ‖u‖.

Hence,

θ = cos⁻¹(−17/√9660) ≈ 99.96°,

which is the required angle between the given vectors. •
and

u.i =< u1, u2, u3 > . < 1, 0, 0 >= u1,

and it follows that

cos α = u1/‖u‖.

By similar reasoning with the basis vectors j and k, we have

cos β = u2/‖u‖ and cos γ = u3/‖u‖,

where α, β, and γ are, respectively, the angles between u and i, u and j, and u and k.

Consequently, any nonzero vector u in space has the normalized form

u/‖u‖ = (u1/‖u‖)i + (u2/‖u‖)j + (u3/‖u‖)k = cos α i + cos β j + cos γ k,

and because u/‖u‖ is a unit vector, it follows that

cos²α + cos²β + cos²γ = 1.

Note that the vector < cos α, cos β, cos γ > is a unit vector with the same direction as the original vector u. •
Example B.4 Find the direction cosines and angles for the vector u = 3i + 6j + 2k, and show that cos²α + cos²β + cos²γ = 1.

Solution. Because

‖u‖ = √(3² + 6² + 2²) = √(9 + 36 + 4) = √49 = 7,

we can write

cos α = u1/‖u‖ = 3/7, which gives α = cos⁻¹(3/7) ≈ 64.62°.

Similarly,

cos β = u2/‖u‖ = 6/7, β = cos⁻¹(6/7) ≈ 31.00°

and

cos γ = u3/‖u‖ = 2/7, γ = cos⁻¹(2/7) ≈ 73.40°.

Furthermore, the sum of the squares of the direction cosines is

cos²α + cos²β + cos²γ = 9/49 + 36/49 + 4/49 = 1.
•
comp_i u = u.i = u1
comp_j u = u.j = u2
comp_k u = u.k = u3.

Thus, the components of u along i, j, and k are the same as the components u1, u2, and u3 of the vector u. •

Thus, for Example B.5 (with u =< 3, 2, −6 > and v =< 2, 2, 1 >),

comp_u v = (u.v)/‖u‖ = ((2)(3) + (2)(2) + (1)(−6))/7 = 4/7,

the required solution. •

To get the results of Example B.5, we use the MATLAB Command Window as follows:

>> u = [3 2 − 6];
>> v = [2 2 1];
>> compu = (dot(u, v))/norm(u);
>> compv = (dot(u, v))/norm(v);
If u and v are nonzero vectors, then the projection of vector u onto vector v is denoted by proj_v u and is defined as

proj_v u = (u.v/‖v‖²) v.

Note that the projection of u onto v can be written as a scalar multiple of a unit vector in the direction of v, i.e.,

proj_v u = (u.v/‖v‖²) v = ((u.v)/‖v‖)(v/‖v‖) = K (v/‖v‖),

where

K = u.v/‖v‖ = ‖u‖ cos θ

is called the component of u in the direction of v. •
Example B.6 If u = 4i − 5j + 3k and v = 6i − 3j + 2k, then find the projection of u onto v.

Solution. Since

u.v = (4)(6) + (−5)(−3) + (3)(2) = 24 + 15 + 6 = 45

and

‖v‖ = √(6² + (−3)² + 2²) = √(36 + 9 + 4) = √49 = 7,

using the above definition, we have

proj_v u = (u.v/‖v‖²) v = (45/49)(6i − 3j + 2k)

or

proj_v u = (270/49)i − (135/49)j + (90/49)k,

which is the required projection of u onto v. •
To get the results of Example B.6, we use the MATLAB Command Win-
dow as follows:
>> u = [4 − 5 3];
>> v = [6 − 3 2];
>> K = dot(u, v)/norm(v);
>> X = v/norm(v);
>> Proj = K*X;
If, for example, the distance is in feet and the magnitude of the force is in
pounds, then the work done is 4 ft-lb. If the distance is in meters and the
force is in Newtons, then the work done is 4 joules. •
The other way to multiply two vectors u =< u1, u2, u3 > and v =< v1, v2, v3 > is known as the cross product (or vector product) and is symbolized by u × v. It is defined by the determinant

u × v = | i   j   k  |
        | u1  u2  u3 |
        | v1  v2  v3 |
      = (u2v3 − u3v2)i − (u1v3 − u3v1)j + (u1v2 − u2v1)k.

For example, if u =< 1, −1, 2 > and v =< 2, −1, −2 > are two vectors, then their cross product is

u × v = | i   j   k  |
        | 1  −1   2  |
        | 2  −1  −2  |
      = ((−1)(−2) − (2)(−1))i − ((1)(−2) − (2)(2))j + ((1)(−1) − (−1)(2))k,

and it gives

u × v = 4i + 6j + k.
•
Note that the length of the cross product u × v is equal to the area of the parallelogram determined by the vectors u and v, i.e.,

A = ‖u × v‖.

Also, the area of the triangle is half the area of the parallelogram, i.e.,

Area of triangle = A = (1/2)‖u × v‖.
Example B.8 Find the area of the parallelogram made by P⃗Q and P⃗R, where P(3, 1, 2), Q(2, −1, 1), and R(4, 2, −1) are points in the plane.

Solution. Since

P⃗Q = (2 − 3)i + (−1 − 1)j + (1 − 2)k = −i − 2j − k

and

P⃗R = (4 − 3)i + (2 − 1)j + (−1 − 2)k = i + j − 3k,

their cross product is

P⃗Q × P⃗R = | i   j   k  |
           | −1  −2  −1 |
           | 1   1   −3 |
         = ((−2)(−3) − (−1)(1))i − ((−1)(−3) − (−1)(1))j + ((−1)(1) − (−2)(1))k
         = 7i − 4j + k.

Thus,

A = ‖P⃗Q × P⃗R‖ = √(7² + (−4)² + 1²) = √66 unit²

is the required area of the parallelogram. •
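MATLAB's cross and norm functions give the same area (a quick check):

>> P = [3 1 2]; Q = [2 -1 1]; R = [4 2 -1];
>> A = norm(cross(Q - P, R - P))   % 8.1240, i.e., sqrt(66)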
Example B.9 Find a vector perpendicular to the plane that passes through the points P(2, 1, 4), Q(−3, 4, −2), and R(2, −2, 1).

Solution. The vector P⃗Q × P⃗R is perpendicular to both P⃗Q and P⃗R and therefore perpendicular to the plane through P, Q, and R. Since

P⃗Q = (−3 − 2)i + (4 − 1)j + (−2 − 4)k = −5i + 3j − 6k

and

P⃗R = (2 − 2)i + (−2 − 1)j + (1 − 4)k = 0i − 3j − 3k,

the required perpendicular vector is

P⃗Q × P⃗R = ((3)(−3) − (−6)(−3))i − ((−5)(−3) − (−6)(0))j + ((−5)(−3) − (3)(0))k = −27i − 15j + 15k. •
Solution. Since the area of the triangle is half the area of the parallelogram, we first compute the area of the parallelogram. The area of the parallelogram with adjacent sides PQ and PR is the length of the cross product P⃗Q × P⃗R; therefore, we find the vectors P⃗Q and P⃗R as follows:

P⃗Q = (3 − 2)i + (1 − 1)j + (2 − 1)k = i + 0j + k

and

P⃗R = (1 − 2)i + (−2 − 1)j + (1 − 1)k = −i − 3j + 0k.

Now we compute the cross product of these two vectors:

P⃗Q × P⃗R = | i   j   k |
           | 1   0   1 |
           | −1  −3  0 |
         = ((0)(0) − (1)(−3))i − ((1)(0) − (1)(−1))j + ((1)(−3) − (0)(−1))k
         = 3i − j − 3k,

so the area of the triangle is A = (1/2)‖3i − j − 3k‖ = (1/2)√19. •
1. u × v = −(v × u).
2. (cu) × v = c(u × v) = u × (cv).
3. u × (v + w) = (u × v) + (u × w).
4. (u + v) × w = (u × w) + (v × w).
5. u.(v × w) = (u × v).w.
6. u × (v × w) = (u.w)v − (u.v)w. •
Note that the product u.(v × w) that occurs 5th in Theorem B.5 is called the scalar triple product of the vectors u, v, and w. We can write the scalar triple product of the vectors as a determinant:

u.(v × w) = | u1  u2  u3 |
            | v1  v2  v3 |
            | w1  w2  w3 |.

To get the scalar triple product of the given vectors of Example B.11, we use the MATLAB Command Window as follows:
>> u = [2 1 3];
>> v = [3 2 4];
>> w = [4 3 5];
>> x = cross(v, w);
>> y = dot(u, x);
Note that the volume of the parallelepiped determined by the vectors u, v, and w is the magnitude of their scalar triple product:

V = |u.(v × w)|.
Example B.12 Find the volume of the parallelepiped having adjacent sides
AB, AC, and AD, where
A(0, 1, 0), B(2, −2, 3), C(1, 1, −1), and D(4, −1, −1).
Solution. Since

u = A⃗B = (2 − 0)i + (−2 − 1)j + (3 − 0)k = 2i − 3j + 3k
v = A⃗C = (1 − 0)i + (1 − 1)j + (−1 − 0)k = i + 0j − k
w = A⃗D = (4 − 0)i + (−1 − 1)j + (−1 − 0)k = 4i − 2j − k,

we use the following determinant to compute the scalar triple product of the given vectors:

u.(v × w) = | 2  −3   3 |
            | 1   0  −1 |
            | 4  −2  −1 |
          = 2(0 − 2) + 3(−1 + 4) + 3(−2 − 0) = −4 + 9 − 6 = −1.

Thus, the volume of the parallelepiped is

V = |u.(v × w)| = |−1| = 1.
>> u = [2 − 3 3];
>> v = [1 0 − 1];
>> w = [4 − 2 − 1];
>> x = cross(v, w);
>> y = dot(u, x);
>> v = abs(y);
Note that if the volume of the parallelepiped determined by the vectors
u, v, and w is zero, then the vectors must lie in the same plane; i.e., they
are coplanar.
Example B.13 Use the scalar triple product to show that the vectors
u = 4i + 6j + 2k, v = 2i − 2j, and w = 14i + 6j + 4k are coplanar.
Solution. Given
u = 4i + 6j + 2k
v = 2i − 2j
w = 14i + 6j + 4k,
we use the following determinant to compute the scalar triple product of the given vectors:

u.(v × w) = | 4   6   2 |
            | 2  −2   0 |
            | 14  6   4 |
          = 4(−8 − 0) − 6(8 − 0) + 2(12 + 28)
          = −32 − 48 + 80 = 0.
Since
V = |u.(v × w)| = 0,
the volume of the parallelepiped determined by the given vectors u, v, and
w is zero. This means that u, v, and w are coplanar. •
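Equivalently, coplanarity can be tested by evaluating the determinant whose rows are the three vectors (a quick MATLAB check):

>> u = [4 6 2]; v = [2 -2 0]; w = [14 6 4];
>> det([u; v; w])   % 0, so the vectors are coplanar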
Note that the product u × (v × w) that occurs 6th in Theorem B.5 is called
the triple vector product of the vectors u, v, and w. We can write the triple
vector product of the vectors in dot product form as
u × (v × w) = (u.w)v − (u.v)w,
Appendix B: Mathematical Preliminaries 963
and the result of the triple vector product of the vectors is a vector.
Example B.14 Find the triple vector product of the vectors u = 3i − j, v = 2i + j + k, and w = i − j + k.

Solution. To find the triple vector product of u =< 3, −1, 0 >, v =< 2, 1, 1 >, and w =< 1, −1, 1 >, we compute the dot products

u.w = (3)(1) + (−1)(−1) + (0)(1) = 4

and

u.v = (3)(2) + (−1)(1) + (0)(1) = 5.

Thus,

u × (v × w) = 4v − 5w
= 4 < 2, 1, 1 > − 5 < 1, −1, 1 >
= < 8, 4, 4 > − < 5, −5, 5 >
= < 3, 9, −1 >.
To check, we can compute x = v × w directly and then u × x:

x = v × w = | i  j  k |
            | 2  1  1 |
            | 1 −1  1 |
          = 2i − j − 3k,

and

u × (v × w) = u × x = | i   j   k  |
                      | 3  −1   0  |
                      | 2  −1  −3  |
= ((−1)(−3) − (0)(−1))i − ((3)(−3) − (0)(2))j + ((3)(−1) − (−1)(2))k
= 3i + 9j − k,

which agrees with the result above.
To get the triple vector product of the given vectors of Example B.14, we
use the MATLAB Command Window as follows:
>> u = [3 − 1 0];
>> v = [2 1 1];
>> w = [1 − 1 1];
>> x = cross(v, w);
>> y = cross(u, x);
Lines in Space

Let us consider a line that passes through the point P1 = (x1, y1, z1) and is parallel to the position vector a = (a1, a2, a3). For any other point P = (x, y, z) on the line, the vector P⃗1P must be parallel to a, i.e.,

P⃗1P = ta

for some scalar t.
Example B.15 Find the parametric and symmetric equations of the line passing through the points (1, 3, −2) and (3, −2, 5).

Solution. Begin by letting P1 = (1, 3, −2) and P2 = (3, −2, 5); then a direction vector for the line passing through P1 and P2 is given by

a = P⃗1P2 = (3 − 1, −2 − 3, 5 + 2) = (2, −5, 7),

which is parallel to the given line, and taking either point will give us an equation for the line. So using direction numbers a1 = 2, a2 = −5, and a3 = 7 with the point P1 = (1, 3, −2), we obtain the parametric equations

x − 1 = 2t, y − 3 = −5t, z + 2 = 7t.

Similarly, the symmetric equations of the line are

(x − 1)/2 = (y − 3)/(−5) = (z + 2)/7.
•
Solution. The lines l1 and l2 are given, respectively, by

x = 1 + 3t, y = 5 − 3t, z = −1 + 5t

and

x = 2 + 7s, y = 4 − 3s, z = 5 + s.

For the lines to intersect, there must be values t1 and t2 satisfying three conditions:

1 + 3t1 = 2 + 7t2
5 − 3t1 = 4 − 3t2
−1 + 5t1 = 5 + t2.

Solving the first two equations gives t1 = 1/3 and t2 = 0. With these values of t1 and t2, the third equation of the above system is not satisfied, so the lines do not intersect. Thus, the given lines are skew lines. •
Planes in Space

As we have seen, an equation of a line in space can be obtained from a point on the line and a vector parallel to it. A plane in space is determined by specifying a vector n =< a, b, c > that is normal (perpendicular) to the plane (i.e., orthogonal to every vector lying in the plane), and a point P1 = (x1, y1, z1) lying in the plane.
In order to find an equation of the plane, let P = (x, y, z) represent any point in the plane. Then, since P and P1 are both points in the plane, the vector

P⃗1P = (x − x1, y − y1, z − z1)

lies in the plane and so must be orthogonal to n, i.e.,

n.P⃗1P = 0
(a, b, c).(x − x1, y − y1, z − z1) = 0
a(x − x1) + b(y − y1) + c(z − z1) = 0.

The third equation above is called the equation of the plane in standard form, or sometimes the point-normal form of the equation of the plane.

Let us rewrite the equation as

ax − ax1 + by − by1 + cz − cz1 = 0

or

ax + by + cz − ax1 − by1 − cz1 = 0.

Since the last three terms are constant, combine them into one constant d and write

ax + by + cz + d = 0.

This is called the general form of the equation of the plane.
Given the general form of the equation of the plane, it is easy to find a
normal vector to the plane. Simply use the coefficients of x, y, and z and
write n =< a, b, c > .
Example B.18 Find an equation of the plane through the point (3, −4, 3)
with normal vector n =< 3, −4, 5 > .
Solution. Using the direction number for n =< 3, −4, 5 >=< a, b, c >
and the point (x1 , y1 , z1 ) = (3, −4, 3), we can obtain
a(x − x1 ) + b(y − y1 ) + c(z − z1 ) = 0
3(x − 3) − 4(y + 4) + 5(z − 3) = 0 (standard form)
3x − 4y + 5z − 40 = 0, (general form)
the equation of the plane. Observe that the given point (3, −4, 3) satisfies
this equation. •
Example B.19 Find the general equation of the plane containing the three points (2, −1, 3), (3, 1, 2), and (4, 5, −3).

Solution. To find the equation of the plane, we need a point in the plane and a vector that is normal to the plane. There are three choices for the point, but no normal vector is given. To find a normal vector, use the cross product of the vectors a and b extending from the point P1(2, −1, 3) to the points P2(3, 1, 2) and P3(4, 5, −3). The component forms of a and b are

a = (3 − 2, 1 + 1, 2 − 3) = (1, 2, −1)
b = (4 − 2, 5 + 1, −3 − 3) = (2, 6, −6),

so

n = a × b = −6i + 4j + 2k

is a vector normal to the given plane. Using the direction numbers for n and the point (x1, y1, z1) = (2, −1, 3), we obtain an equation of the plane:

−6(x − 2) + 4(y + 1) + 2(z − 3) = 0, i.e., −6x + 4y + 2z + 10 = 0, or 3x − 2y − z − 5 = 0.

Note that each of the given points (2, −1, 3), (3, 1, 2), and (4, 5, −3) satisfies this plane equation. •
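The normal vector and the constant term can be obtained in MATLAB as follows (a quick check of the example):

>> P1 = [2 -1 3]; P2 = [3 1 2]; P3 = [4 5 -3];
>> n = cross(P2 - P1, P3 - P1)   % [-6 4 2], the normal vector
>> d = -dot(n, P1)               % 10, so the plane is -6x + 4y + 2z + 10 = 0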
Note that since

n2 =< 8, −8, 12 >= 4 < 2, −2, 3 >= 4n1,

the vectors n1 and n2 are parallel, and so are the planes. •
The distance between a plane and a point R (which is not in the plane) is defined as

D = |P⃗R.n| / ‖n‖,

where P is a point in the plane and n is normal to the plane. •
Example B.21 Find the distance between the point R = (3, 7, −3) and the plane given by 4x − 3y + 5z = 8.

From Theorem B.6, the distance between a point R = (x0, y0, z0) and the plane ax + by + cz + d = 0 is

D = |ax0 + by0 + cz0 + d| / √(a² + b² + c²).
Example B.22 Find the distance between the point P(1, −2, −3) and the plane 6x − 2y + 3z = 2.

Solution. Writing the plane as

6x − 2y + 3z − 2 = 0,

we obtain

a = 6, b = −2, c = 3, d = −2.

Using these values, we get

D = |(6)(1) + (−2)(−2) + (3)(−3) − 2| / √(6² + (−2)² + 3²)

or

D = |−1| / √49 = 1/7,

which is the distance from the given point to the given plane. •
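The same distance in MATLAB (illustrative check):

>> n = [6 -2 3]; P = [1 -2 -3]; d = -2;
>> D = abs(dot(n, P) + d)/norm(n)   % 0.1429, i.e., 1/7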
Solution. First, we note that the planes are parallel because their normal vectors < 9, 3, −3 > and < 3, 1, −1 > are parallel, i.e.,

< 9, 3, −3 >= 3 < 3, 1, −1 > .

To find the distance between the planes, we choose any point on one plane; say (x0, y0, z0) = (1, 0, 0) is a point in the first plane. Then, from the second plane, we can take

a = 3, b = 1, c = −1, d = −2,

so that

D = |(3)(1) + (1)(0) + (−1)(0) − 2| / √(3² + 1² + (−1)²) = 1/√11 ≈ 0.3015.
Example B.24 Show that the following system of equations has no solution:

x1 − x2 + 4x3 = 1
−2x1 + 2x2 − 8x3 = 3
x1 + x2 + 3x3 = 2.

Solution. Consider the general form of the equation of the plane

ax + by + cz + d = 0,

where the vector (a, b, c) is normal to this plane. Interpret each of the given equations as defining a plane in R³. On comparison with the general form, the following vectors are normal to these three planes:

n1 = (1, −1, 4), n2 = (−2, 2, −8), n3 = (1, 1, 3).

Note that

(−2, 2, −8) = −2(1, −1, 4),

which shows that the normals to the first two planes are parallel, while the right-hand sides 1 and 3 are not in the same ratio −2. Thus, these two planes are parallel and distinct, so the three planes have no point in common; therefore, the given system has no solution. •
The distance between a point R and a line is

D = ‖P⃗R × u‖ / ‖u‖,

where u is the direction vector of the line and P is a point on the line. •
Example B.25 Find the distance between the point R = (4, −2, 5) and the line given by

x = −1 + 3t, y = 2 − 5t, z = 3 + 7t.

Solution. Using the direction numbers 3, −5, 7, the direction vector of the line is

u =< 3, −5, 7 > .

To find a point P on the line, let t = 0, which gives P = (−1, 2, 3). Thus, the vector from P to R is

P⃗R =< 4 + 1, −2 − 2, 5 − 3 >=< 5, −4, 2 >,

and we can form the cross product

P⃗R × u = | i   j   k |
          | 5  −4   2 |
          | 3  −5   7 |
        = (−28 + 10)i − (35 − 6)j + (−25 + 12)k
        = −18i − 29j − 13k
        = < −18, −29, −13 > .

Thus, the distance between the point R and the given line is

D = ‖P⃗R × u‖ / ‖u‖ = √((−18)² + (−29)² + (−13)²) / √(3² + (−5)² + 7²) = √1334 / √83 = 36.5240/9.1104 = 4.0090,

which is the required distance between the given point and the line. •
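The cross-product formula translates directly into MATLAB (a quick check of the example):

>> R = [4 -2 5]; P = [-1 2 3]; u = [3 -5 7];
>> D = norm(cross(R - P, u))/norm(u)   % 4.0090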
Solution. Since the two lines l1 and l2 are skew, they can be viewed as lying in two parallel planes P1 and P2, and the distance between l1 and l2 is the same as the distance between P1 and P2. The common normal vector to both planes must be orthogonal to both u1 =< 2, −2, 1 > (the direction of l1) and u2 =< 3, 1, 2 > (the direction of l2). So a normal vector is

n = u1 × u2 = | i   j   k |
              | 2  −2   1 |
              | 3   1   2 |
  = (−4 − 1)i − (4 − 3)j + (2 + 6)k
  = −5i − j + 8k
  = < −5, −1, 8 > .

If we put s = 0 in the equations of l2, we get the point (2, 2, −4) on P2, and so the equation for P2 is

−5(x − 2) − (y − 2) + 8(z + 4) = 0,

which can also be written as

−5x − y + 8z + 44 = 0.

If we set t = 0 in the equations of l1, we get the point (1, 3, 5) on P1. So the distance between l1 and l2 is the same as the distance from (1, 3, 5) to −5x − y + 8z + 44 = 0. Thus, the distance is

D = |(−5)(1) + (−1)(3) + (8)(5) + 44| / √((−5)² + (−1)² + 8²) = 76/√90 ≈ 8.0111,

which is the required distance between the skew lines. •
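The computation condenses to the standard formula D = |(p2 − p1).n|/‖n‖ with n = u1 × u2 (an illustrative MATLAB check):

>> u1 = [2 -2 1]; u2 = [3 1 2];
>> p1 = [1 3 5]; p2 = [2 2 -4];      % points on l1 (t = 0) and l2 (s = 0)
>> n = cross(u1, u2);
>> D = abs(dot(p2 - p1, n))/norm(n)  % 8.0111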
Complex numbers arise naturally, for example, when solving a quadratic equation

ax² + bx + c = 0

whose discriminant is negative. A complex number z has the form

z = a + ib, (B.1)

where a and b are real numbers; a is called the real part of z and is denoted by Re(z), and b is called the imaginary part of z and is denoted by Im(z). We say that two complex numbers z1 = a1 + ib1 and z2 = a2 + ib2 are equal if their real and imaginary parts are equal, i.e., if

a1 = a2 and b1 = b2.
The difference and product of two complex numbers are given by

z1 − z2 = (a1 − a2) + i(b1 − b2)

and

z1 z2 = (a1 a2 − b1 b2) + i(a1 b2 + a2 b1).

This multiplication formula is obtained by expanding the left side and using the fact that i² = −1. One can multiply a complex number by a real number α according to

αz = αa + iαb.
The conjugate z̄ = a − ib of z can be obtained in MATLAB as

>> z1 = conj(z);

We call √(z z̄) the modulus, absolute value, or magnitude of z and write

|z| = |a + ib| = √(z z̄) = √(a² + b²).
1. The conjugate of z̄ is z.
2. The conjugate of z1 + z2 is z̄1 + z̄2.
3. The conjugate of z1 z2 is z̄1 z̄2.
4. If z2 ≠ 0, the conjugate of z1/z2 is z̄1/z̄2.
A complex vector space is defined in exactly the same manner as its real counterpart, the only difference being that we replace real scalars by complex scalars. The terms complex vector space and real vector space emphasize the set from which the scalars are chosen. The most basic example is the n-dimensional complex vector space Cⁿ consisting of all column vectors z = (z1, z2, . . . , zn)^T that have n complex entries z1, z2, . . . , zn. Note that Rⁿ ⊂ Cⁿ, since a real vector is a complex vector whose entries have zero imaginary parts.
For the polar form, write a = r cos θ and b = r sin θ, so

z = a + ib = r cos θ + ir sin θ.

Thus, any complex number can be written in the polar form

z = r(cos θ + i sin θ),

where

r = |z| = √(a² + b²) and tan θ = b/a.

The angle θ is called an argument of z and is denoted arg z. Observe that

1/z = (1/r)(cos θ − i sin θ).

In the following we give some well-known theorems concerning the polar form of a complex number. •
Using Euler's formula, we see that the polar form of a complex number can be written more compactly as

z = re^(iθ) and z̄ = re^(−iθ). •

The product of two complex numbers in polar form is

z1 z2 = r1 r2 e^(i(θ1 + θ2)).
If α1 and α2 are the roots of the quadratic

x² + ux + v = 0,

then α1 + α2 = −u and α1 α2 = v. •

Every polynomial f(x) of positive degree factors over the complex numbers as

f(x) = u(x − u1)(x − u2) · · · (x − un),

where u1, u2, . . . , un are the roots of f(x) (and need not all be distinct) and u is the coefficient of xⁿ. •
Theorem B.14 Every polynomial f (x) of positive degree with real coef-
ficients can be factored as a product of linear and irreducible quadratic
factors. •
Consider the matrices

A = [ 1+i   3i ]        B = [ 2−2i   4i ]
    [ 5+i   6i ],           [ 1+3i   2i ].

Then

A + B = [ 3−i    7i ]        A − B = [ −1+3i   −i ]
        [ 6+4i   8i ]  and           [ 4−2i    4i ].

Also, with

C = [ 1+i    2+i  ]
    [ 5+i    4+5i ]
    [ 2−4i   1+2i ],

CA = [ 9+9i     −9+15i  ]
     [ 19+35i   −33+39i ]
     [ 9+9i      12i    ]

and

3iA = 3i [ 1+i   3i ]  =  [ −3+3i    −9  ]
         [ 5+i   6i ]     [ −3+15i   −18 ].
There are special types of complex matrices, like Hermitian matrices, uni-
tary matrices, and normal matrices which we discussed in Chapter 3.
[ 3i  4 | 5 + 15i ]   ~   [ 3i  0 | −3 + 3i ]   ~   [ 1  0 | 1 + i  ]
[ 0   1 | 2 + 3i  ]       [ 0   1 | 2 + 3i  ]       [ 0  1 | 2 + 3i ]

Thus, the solution to the given system is x1 = 1 + i and x2 = 2 + 3i. •
can be obtained as

|A| = | 1+i   3i |
      | 5+i   6i |  = (1 + i)(6i) − (3i)(5 + i) = (6i + 6i²) − (15i + 3i²) = −3 − 9i.
•
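Both the solution of the reduced system and the determinant can be checked in the MATLAB Command Window (illustrative):

>> A = [3i 4; 0 1]; b = [5+15i; 2+3i];
>> x = A\b                     % [1+1i; 2+3i]
>> det([1+1i 3i; 5+1i 6i])     % -3-9i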
A nonzero vector x is an eigenvector of A with eigenvalue λ if

Ax = λx, (B.3)

i.e.,

(A − λI)x = 0, (B.4)

which has a nontrivial solution exactly when

det(A − λI) = 0.

For example, if

A = [ 0   1 ]
    [ −1  0 ],

then

det(A − λI) = λ² + 1 = 0

gives the eigenvalues λ1 = i and λ2 = −i of A. One can easily find the eigenvectors

x1 = [1, i]^T and x2 = [−1, i]^T,

associated with eigenvalues i and −i, respectively. •
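Assuming the matrix A = [0 1; −1 0] used above, MATLAB's eig returns the same eigenvalues (the eigenvectors come back normalized, so they are scalar multiples of those given here):

>> A = [0 1; -1 0];
>> [V, D] = eig(A)   % eigenvalues i and -i on the diagonal of D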
A vector space with an inner product is called an inner product space. The most basic example of an inner product is the familiar dot product

< u, v > = u.v = u1v1 + u2v2 + · · · + unvn = Σ_{j=1}^{n} uj vj.
Let v be a vector in an inner product space V. Then the length (or norm) of v is defined as

‖v‖ = √(< v, v >). •

Theorem B.17 (Inner Product Norm Theorem)
If V is a real vector space with an inner product < u, v >, then the function ‖u‖ = √(< u, u >) is a norm on V. •
Definition B.14 (Distance Between Vectors)
The distance between vectors u and v is d(u, v) = ‖u − v‖. •

The Cauchy–Schwarz inequality states that

|< u, v >| ≤ ‖u‖ ‖v‖,

with equality holding if and only if u and v are scalar multiples of each other. •

The triangle inequality states that

‖u + v‖ ≤ ‖u‖ + ‖v‖.
Theorem B.26 Let V be an inner product space. For any vectors u and
v of V , we have
•
Several additional properties follow immediately from the four inner product axioms, and an inner product can then be used to define the norm, orthogonality, and distance for a real vector space.
Let u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) be elements of Cⁿ. The most useful inner product for Cⁿ is

< u, v > = u1 v̄1 + u2 v̄2 + · · · + un v̄n,

where v̄j denotes the complex conjugate of vj. It can be shown that this definition satisfies the inner product axioms for a complex vector space.

This inner product leads to the following definitions of norm, distance, and orthogonality for Cⁿ:

1. ‖u‖ = √(u1 ū1 + u2 ū2 + · · · + un ūn).
2. d(u, v) = ‖u − v‖.
B.4 Problems

1. Compute u + v, u − v, and their norms for each of the following:

(a) u =< 4, −5 >, v =< 3, 4 > .
(b) u =< −3, −7 >, v =< −4, 5 > .
(c) u =< 4, 5, 6 >, v =< 1, −3, 4 > .
(d) u =< −7, 15, 26 >, v =< 11, −13, 24 > .
9. Find the value of α such that the following vectors are orthogonal:
(a) u =< 4, 5, −3 >, v =< α, 4, 0 > .
(b) u =< 5, α, −4 >, v =< 5, −3, 4 > .
(c) u =< 2 sin x, 2, − cos x >, v =< − sin x, α, 2 cos x > .
(d) u =< sin x, cos x, −2 >, v =< cos x, − sin x, α > .
11. Find the direction cosines and angles for the vector u = 1i + 2j + 3k,
and show that cos2 α + cos2 β + cos2 γ = 1.
12. Find the direction cosines and angles for the vector u = 9i−13j+22k,
and show that cos2 α + cos2 β + cos2 γ = 1.
20. Use the cross product to show that each of the following vectors are
parallel:
(a) u =< 2, −1, 4 >, v =< −6, 3, −12 > .
(b) u =< −3, −2, 1 >, v =< 6, 4, −2 > .
(c) u =< 3, 4, 2 >, v =< −6, −8, −4 > .
(d) u =< −6, −10, 4 >, v =< 3, 5, −2 > .
23. Find the area of the parallelogram made by P~Q and P~R, where P,
Q, and R are the points in the plane:
(a) P (4, −5, 2), Q(2, 4, 7), R(−4, −2, 6).
(b) P (2, 1, 1), Q(5, 2, −5), R(4, −2, 5).
(c) P (2, 1, 1), Q(4, 4, 5), R(5, −3, 4).
(d) P (2, 1, 1), Q(9, 5, 3), R(4, 7, 9).
26. Find the scalar triple product of each of the following vectors:
(a) u =< 2, 0, 1 >, v =< 3, −4, 2 >, w =< 3, −2, 0 > .
(b) u =< 5, 3, 0 >, v =< 1, −2, 5 >, w =< 3, −2, 7 > .
(c) u =< 5, −3, 2 >, v =< 2, 1, 6 >, w =< 4, 0, −5 > .
(d) u =< 5, −4, 9 >, v =< 7, −4, 4 >, w =< 6, 1, 5 > .
27. Find the scalar triple product of each of the following vectors:
(a) u = 3i − 5j + 2k, v = 6i + 3j + 4k, w = 3i − 8j + k.
(b) u = 4i − 3j + 6k, v = −4i + 3j + 5k, w = −3i + 9j + 2k.
(c) u = 17i − 25j + 10k, v = 5i + 9j + 13k, w = 4i + 5j + 8k.
(d) u = 25i+24j+15k, v = 13i−11j+17k, w = −9i+18j+27k.
31. Find the triple vector product by using each of the following vectors:
(a) u =< 3, 2, 1 >, v =< 3, −2, 1 >, w =< −2, −2, 1 > .
(b) u =< 5, 3, −4 >, v =< 3, −2, 4 >, w =< 3, −2, 2 > .
(c) u =< 5, 6, 7 >, v =< 6, 8, 9 >, w =< 14, 13, 17 > .
(d) u =< 17, 21, 18 >, v =< 15, 7, 12 >, w =< 14, −12, 15 > .
32. Find the triple vector product by using each of the following vectors:
(a) u = 2i − 2j + 2k, v = 3i + 5j + 4k, w = 4i − 3j + 2k.
(b) u = i − 3j + 2k, v = 4i + 6j + 5k, w = 2i + 4j + 3k.
(c) u = 3i + 4j + 7k, v = 4i + 9j + 11k, w = 5i + 7j + 12k.
(d) u = 8i+9j+15k, v = 10i−14j+16k, w = −16i−22j+15k.
Appendix B: Mathematical Preliminaries 997
33. Find the parametric equations for the line through point P parallel
to vector u :
(a) P (3, 2, −4), u = 2i + 2j + 2k.
(b) P (2, 0, 3), u = −2i − 3j + k.
(c) P (1, 2, 3), u = 2i + 4j + 6k.
(d) P (2, 2, −3), u = 4i + 5j − 6k.
34. Find the parametric equations for the line through points P and Q :
(a) P (4, −3, 5), Q(3, 5, 2).
(b) P (−2, 2, −3), Q(5, 8, 9).
(c) P (3, 2, 4), Q(−7, 2, 4).
(d) P (6, −5, 3), Q(3, −3, −4).
36. Determine whether the two lines l1 and l2 intersect, and if so, find
the point of intersection:
(a) l1 : x = 1 + 3t, y = 2 − 5t, z = 4 − t;
l2 : x = 1 − 6v, y = 2 + 3v, z = 1 + v.
(b) l1 : x = −3 + 3t, y = 2 − 2t, z = 4 − 4t;
l2 : x = 1 − v, y = 2 + 2v, z = 3 + 3v.
(c) l1 : x = 4 + 3t, y = 12 − 15t, z = 14 − 13t;
l2 : x = 11 − 16v, y = 12 + 13v, z = 10 + 10v.
(d) l1 : x = 9 + 5t, y = 10 − 11t, z = 9 − 21t;
l2 : x = 16 − 22v, y = 13 + 23v, z = 11 − 15v.
37. Find an equation of the plane through the point P with normal vector
u:
(a) P (4, −3, 5), u = 2i + 3j + 4k.
40. Show that the two planes are parallel and find the distance between
the planes:
(a) 8x − 4y + 12z − 6 = 0, −6x + 3y − 9z − 4 = 0.
(b) x + 2y − 2z − 3 = 0, 2x + 4y − 4z − 7 = 0.
(c) 2x − 2y + 2z − 4 = 0, x − y + z − 1 = 0.
(d) −4x + 2y + 2z − 1 = 0, 6x − 3y − 3z − 4 = 0.
43. Convert each of the following complex numbers to its polar form:

(a) 4 + 3i.
(b) 5√3 + 7i.
(c) −3 + √5 i.
(d) −7√3 − 5i.

44. Convert each of the following complex numbers to its polar form:

(a) 11 − √8 i.
(b) −15√7 + 22i.
(c) −45 + √19 i.
(d) 24√18 − 25i.
45. Compute the conjugate of each of the following:
(a) 5 − 3i.
(b) −7 + 9i.
(c) e−2πi .
(d) 11e5πi/4 .
46. Use the polar forms of the complex numbers z1 = −1 − √3 i and z2 = √3 + i to compute z1 z2 and z1/z2.
47. Find A + B, A − B, and CA using the following matrices:

A = [ 2+i   i  ]      B = [ 1−i    −2i ]      C = [ i      1−i  ]
    [ 3−i   2i ],         [ 1+2i   3i  ],         [ 4−i    3+2i ]
                                                  [ 2−3i   1−2i ].

48. Find 2A + 5B, 3A − 7B, and 4CA using the following matrices:

A = [ 1+i   3i   1+i  ]      B = [ −3i     4i     1−5i ]
    [ 4−i   2i   1−3i ]          [ −5+2i   8i     2+7i ]
    [ i     2i   i    ],         [ −i      1+4i   −7i  ],

C = [ 1+i      1−i      4i   ]
    [ 29−12i   1−44i    13i  ]
    [ 52−63i   31+21i   −42i ].
49. Solve each of the following systems:
(a)
2ix1 − 3x2 = 1 − i
4ix1 + (5 + i)x2 = 2i.
(b)
(1 − i)x1 + (2 + i)x2 = 3i
(3 + i)x1 + 4ix2 = 1 − 5i.
(c)
−6ix1 + 9x2 = 4 − i
(1 − 4i)x1 − 6ix2 = −7i.
(d)
5ix1 − 3ix2 = 1 − i
7ix1 + 8ix2 = 3i.
(b)
(c)
(d)

A = [ 13i   11i    15i   ]
    [ 23i   −25i   5−15i ]
    [ 33i   33i    18i   ].
54. Find the real and imaginary part of each of the following matrices:

(a) A = [ 3i     2   ]
        [ 2−5i   1−i ].

(b) A = [ 2−5i   i    ]
        [ 1−3i   2−3i ].

(c) A = [ 13−21i   8−25i  ]
        [ 15+27i   10+13i ].

(d) A = [ 10−33i   3−9i ]
        [ 41−17i   10   ].
55. Find the real and imaginary part of each of the following matrices:

(a) A = [ 4−2i   3i     2+3i ]
        [ 3−5i   2+7i   3−7i ]
        [ 2i     3i     1−5i ].

(b) A = [ 2+5i   11−15i   12−7i ]
        [ 5−7i   6−13i    3−15i ]
        [ 3−2i   9−12i    4−2i  ].

(c) A = [ 3i      3−4i   2−5i  ]
        [ 2−7i    4−8i   11−5i ]
        [ 11+2i   6      3+7i  ].

(d) A = [ i        4+14i    24−35i ]
        [ 11+13i   9−25i    25−17i ]
        [ 22+23i   12−31i   23−38i ].
56. Find the eigenvalues and the corresponding eigenvectors of each of the following matrices:
(a) A = [1 2 0; 4 5 6; 7 0 9].
(b) A = [0 2 0; 4 5 −6; 7 0 1].
(c) A = [1 0 1; 3 0 0; 1 1 1].
(d) A = [2 0 −5; 3 1 0; 1 5 1].
57. Find the eigenvalues and the corresponding eigenvectors of each of the following matrices:
(a) A = [7 2 1; 3 3 0; 1 2 1].
(b) A = [−2 12 1; −23 13 1; 6 −2 21].
(c) A = [11 −22 11; 12 −24 42; 13 −26 −63].
(d) A = [−29 18 −11; 6 7 12; 10 16 −6].
(d)
A = [1+5i, 7−3i, 1+i; 4−7i, 8−2i, 2−2i; 3i, 9−i, 3−3i],
B = [9−10i, 11+12i, 14−15i; 10+11i, 11−12i, 13+14i; 11−12i, 13+14i, 15−16i].
Appendix C
Introduction to MATLAB
C.1 Introduction
In this appendix, we discuss programming with the software package MAT-
LAB. The name MATLAB is an abbreviation for “Matrix Laboratory.”
MATLAB is an extremely powerful package for numerical computing and
programming. In MATLAB, we can give direct commands, as on a hand
calculator, and we can write programs. MATLAB is widely used in univer-
sities and colleges in introductory and advanced courses in mathematics,
science, and especially in engineering. In industry, the software is used in research, development, and design. The standard MATLAB program has tools (functions) that can be used to solve common problems. Until re-
cently, most users of MATLAB have been people who had previous knowl-
edge of programming languages such as FORTRAN or C and switched to
MATLAB as the software became popular.
MATLAB software exists as a primary application program and a large library of program modules called the standard toolbox. Most of the numerical methods described in this textbook are implemented in one form or another in the toolbox.

MATLAB works through several windows: the Command Window (where commands are entered and results are displayed), the Figure Window (contains output from graphics commands), the Editor Window (creates and debugs script and function files), the Help Window (gives help information), and the Workspace Window (gives information about the variables that are used). The Command Window in MATLAB is the main window and can be used for executing commands, opening other windows, running programs written by users, and managing the software. For example, a value is assigned to a variable directly at the command prompt:
>> a = 25;
The other format commands, format short g and format long g, use scientific notation for the output (the best of 5-digit fixed or floating point and the best of 15-digit fixed or floating point):

>> format short g
>> pi
ans =
3.1416

>> format long g
>> pi
ans =
3.14159265358979

We can also use another format command for the output, called format bank (to display 2 decimal digits):

>> format bank
>> pi
ans =
3.14

There are two other format commands that can be used for the output: format compact (which eliminates empty lines to allow more lines to be displayed on the screen) and format loose (which adds empty lines, the opposite of compact).

As part of its syntax and semantics, MATLAB provides for exceptional values. Positive infinity is represented by Inf, negative infinity by -Inf, and not-a-number by NaN. These exceptional values are carried through the computations in a logically consistent way.
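As a small illustration (these results follow the standard IEEE floating-point rules that MATLAB uses):

>> 1/0
ans =
Inf
>> 0/0
ans =
NaN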
Symbol   Effect
+        Addition
−        Subtraction
*        Multiplication
/, \     Division (right and left)
^        Power
'        Conjugate transpose
pi, e    Constants
>> (4 - 2 + 3*pi)/2
ans =
5.7124
>> a = 2; b = sin(a);
>> 2*b^2
ans =
1.6537
MATLAB’s arithmetic operations are actually much more powerful than
this. We shall see just a little of this extra power later.
There are some arithmetic operations that require great care. The order
in which multiplication and division operations are specified is especially
important. For example:
>> a = 2; b = 3; c = 4;
>> a/b*c

Here, the absence of any parentheses results in MATLAB executing the two operations from left to right, so that

ans =
2.6667

which is the same as (a/b)*c. Similarly, a/b/c yields the same result as (a/b)/c, that is, a/(bc), which could also be achieved with the MATLAB command:

>> a/(b*c)
When an arithmetic expression is entered at the prompt, MATLAB evaluates it and responds by displaying ans = and the numerical result of the expression in the next line. For example:
>> 25 + 10/5
ans =
27
>> (25 + 10)/5
ans =
7
>> 35^(1/2) + 12*4
ans =
53.9161
>> 115^(1/3) + (112*40)/12 - (0.87 + 3.25)/6
ans =
377.5096
>> cos(pi/3)
ans =
0.5000
>> x = 0.5;
>> z = sin(x) + cos(x)^2
z =
1.2496
C.2.5 Vectors
In MATLAB the word vector can really be interpreted simply as a 'list of numbers.' Strictly, it could be a list of objects other than numbers, but 'list of numbers' fits our needs for now.

There are two basic kinds of MATLAB vectors: row and column vectors. As the names suggest, a row vector stores its numbers in a long 'horizontal list' such as

1, 2, 3, 1.23, −10.3, 2.1,

which is a row vector with 6 components. A column vector stores its numbers in a vertical list such as

1
2
3
1.23
−10.3
2.1

which is a column vector with 6 components. In mathematical notation these arrays are usually enclosed in brackets [ ].
There are various convenient forms of these vectors for allocating values to them and accessing the values that are stored in them. The most basic method of accessing or assigning individual components of a vector is based on using an index, or subscript, which indicates the position of a particular component in the list. For example, if x0 is the row vector above with its third component reassigned to 7.3, transposing it produces the column vector:

>> x = x0'
x =
1.0000
2.0000
7.3000
1.2300
−10.3000
2.1000
MATLAB has several convenient ways of allocating values to a vector where
these values fit a simple pattern.
The colon : has a very special and powerful role in MATLAB. Basically,
it allows an easy way to specify a vector of equally spaced numbers. There
are two basic forms of the MATLAB colon notation.
In the first form, two arguments are separated by a colon, as in:

>> x = -2 : 4

which generates a row vector with first component −2, last component 4, and the others spaced at unit intervals. In the second form, three arguments are separated by two colons, specifying starting value : spacing : final value. For example:
>> x = −2 : 0.5 : 1
which generates
x =
−2.0000 −1.5000 −1.0000 −0.5000 0 0.5000 1.0000
Also, one can use MATLAB colon notation as follows:
>> y = x(2 : 6)
which generates
y=
−1.5000 −1.0000 −0.5000 0 0.5000
MATLAB has two other commands for conveniently specifying vectors. The first is the linspace function, which specifies a vector with a given number of equally spaced elements between specified start and finish points, as in linspace(a, b, n). The second is the logspace function, which does the same on a logarithmic scale. For example:

>> x = logspace(1, 4, 4)
x =
10   100   1000   10000
Note also that the standard MATLAB functions are defined to operate on vectors of inputs in an element-by-element manner, so colon notation and arithmetic may be used directly within the argument of a function, as in sin(0 : pi/4 : pi).
C.2.6 Matrices
A matrix is a two-dimensional array of numerical values that obeys the
rules of linear algebra as discussed in Chapter 3.
To enter a matrix, list the entries row by row, separating the entries within a row by blank spaces or commas, separating one row from the next by a semicolon, and enclosing the whole list in square brackets. For example, to enter a 3 × 4 matrix A, we do the following:
>> A = [1 2 3 4; 3 2 1 4; 4 1 2 3]
A=
1 2 3 4
3 2 1 4
4 1 2 3
There are also other options available when directly defining an array. To
define a column vector, we can use the transpose operation. For example:
>> [1 2 5]'
ans =
1
2
5
An individual entry of a matrix is accessed by its row and column indices. For example:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> A(2, 3)
ans =
6
or
>> A(1 : 2, 2 : 3)
ans =
2 3
5 6
An element, a row, or a column is deleted by assigning it the empty array [ ]. For example:

>> x = [1 2 3 4 5];
>> x(3) = []
x =
1 2 4 5

>> A = [1 2 3; 4 5 6; 7 8 9];
>> A(:, 1) = []
A =
2 3
5 6
8 9
One can interchange the rows of a matrix by listing the desired row order in the row subscript. For example, if the matrix A has three rows and we want to interchange rows 1 and 3, we type B = A([3, 2, 1], :). For example:

>> A = [1 2 3; 4 5 6; 7 8 9]
>> B = A([3, 2, 1], :)
B=
7 8 9
4 5 6
1 2 3
Note that the method can be used to change the order of any number of
rows.
Similarly, one can interchange the columns easily by typing, for a three-column matrix, B = A(:, [3, 2, 1]).

One can also replace a row of a matrix by new entries given in square brackets. For example, to change the second row of a 3 × 3 matrix A to [2, 2, 2], type the command:

>> A(2, :) = [2 2 2]
For example:
>> A = [1 2 3; 4 5 6; 7 8 9]
>> A(2, :) = [2 2 2]
A=
1 2 3
2 2 2
7 8 9
Similarly, one can set the kth column of a matrix A equal to the new entries of the column, given in square brackets and separated by semicolons; i.e., for the second column, type:

>> A(:, 2) = [2; 2; 2]
For example:
>> A = [1 2 3; 4 5 6; 7 8 9]
>> A(:, 2) = [2; 2; 2]
A=
1 2 3
4 2 6
7 2 9
• Create a zero matrix with m rows and n columns using zeros function
as follows:
>> A = zeros(m, n)

or, for an n × n zero matrix:

>> A = zeros(n)
For example:
>> A = zeros(3)
A=
0 0 0
0 0 0
0 0 0
• Similarly, create a matrix of ones using the ones function as:

>> A = ones(m, n)

For example:

>> A = ones(3, 3)
A=
1 1 1
1 1 1
1 1 1
>> A = ones(2, 4)
A=
1 1 1 1
1 1 1 1
Indeed, ones and zeros can be used to create row and column vectors:
>> u = ones(1, 4)
u=
1 1 1 1
and

>> v = ones(4, 1)
v =
1
1
1
1
• Create an n × n identity matrix using the eye function as:

>> I = eye(n)

For example:
>> I = eye(3)
I=
1 0 0
0 1 0
0 0 1
• Create a diagonal matrix with the diag function, which places a given vector on the main diagonal. For example:

>> v = [4 5 6];
>> A = diag(v)
A=
4 0 0
0 5 0
0 0 6
Applied to a matrix, diag extracts the main diagonal:

>> u = diag(A)
u=
4
5
6
• The length and size functions are used to determine the number of elements in vectors and matrices. These functions are useful when one is dealing with matrices of unknown or variable size, especially when writing loops. To use the length function, type:

>> u = 1 : 5
u =
1 2 3 4 5

Then

>> n = length(u)
n = 5

The size command returns two values, the number of rows and the number of columns, and has the syntax:

>> [m, n] = size(A)
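For example (a minimal sketch; the matrix here is only an illustration):

>> A = [1 2 3; 4 5 6];
>> [m, n] = size(A)
m =
2
n =
3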
• Apply a function to each element of a matrix. Most built-in functions, such as sqrt, operate element by element:

>> B = sqrt(A)
For example:
>> A = [1 4 5; 2 3 4; 4 7 8];
>> B = sqrt(A)
B=
1.0000 2.0000 2.2361
1.4142 1.7321 2.0000
2.0000 2.6458 2.8284
• Create an upper triangular matrix from a given matrix A using the triu function as:

>> U = triu(A)

For example:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> U = triu(A)
U =
1 2 3
0 5 6
0 0 9
Also, one can create an upper triangular matrix from a given matrix
A with a zero diagonal as:
>> W = triu(A, 1)
For example:
>> A = [1 2 3; 4 5 6; 7 8 9];
>> W = triu(A, 1)
W =
0 2 3
0 0 6
0 0 0
• Create a lower triangular matrix A for a given matrix using the tril
function as:
>> L = tril(A)
For example:
>> A = [1 2 3; 4 5 6; 7 8 9];
>> L = tril(A)
L =
1 0 0
4 5 0
7 8 9
Also, one can create a lower triangular matrix from a given matrix A with a zero diagonal as follows:

>> V = tril(A, -1)

For example:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> V = tril(A, -1)
V =
0 0 0
4 0 0
7 8 0
• Create an n × n matrix of random numbers using the rand function as:

>> R = rand(n)
For example:
>> R = rand(3)
R=
0.6038 0.0153 0.9318
0.2722 0.7468 0.4660
0.1988 0.4451 0.4186
• Reshape a matrix using the reshape function, which refills the new shape column by column. For example:

>> A = [1 2 3; 4 5 6; 7 8 9; 10 11 12]
>> B = reshape(A, 2, 6)
B =
1 7 2 8 3 9
4 10 5 11 6 12
• Create an n × n Hilbert matrix using the hilb function as:

>> H = hilb(n)
For example:
>> H = hilb(3)
H=
1.0000 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
• Create a Toeplitz matrix, which is constant along each diagonal, using the toeplitz function, where C is the first column and R the first row:

>> U = toeplitz(C, R)
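For example (a small sketch; under the standard toeplitz behavior the first element of C takes precedence over the first element of R):

>> C = [1 2 3]; R = [1 4 5];
>> U = toeplitz(C, R)
U =
1 4 5
2 1 4
3 2 1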
Matrix addition and subtraction are defined entrywise for matrices of the same size. For example:
>> A = [3 2 − 3; 4 5 6; 7 6 7];
>> B = [1 2 3; 4 − 2 1; 7 5 − 4];
>> C = A + B
C=
4 4 0
8 3 7
14 11 3
>> D = A - B
D=
2 0 −6
0 7 5
0 1 11
Matrix multiplication has the standard meaning as well. Given any two compatible matrix variables A and B, the MATLAB expression A*B evaluates the product of A and B as defined by the rules of linear algebra. For example:
>> A = [2 3; −1 4];
>> B = [5 − 2 1; 3 8 − 6];
>> C = A ∗ B
C=
19 20 −16
7 34 −25
Also,
>> A = [1 2; 3 4];
>> B = A0 ;
>> C = 3*(A*B)^3
C=
13080 29568
29568 66840
Similarly, if two vectors are the same size, they can be added to or subtracted from one another; they can be multiplied or divided by a scalar, and a scalar can be added to each of their components.

Mathematically, the operation of division by a vector does not make sense. To achieve the corresponding component-wise operation, we use the ./ operator. Similarly, for multiplication and powers we use .* and .^, respectively. For example:
>> a = [1 2 3];
>> b = [2 -1 4];
>> c = a.*b
c =
2 −2 12

Also,

>> c = a./b
c =
0.5000 −2.0000 0.7500

and

>> c = a.^3
c =
1 8 27

Similarly,

>> c = 2.^a
c =
2 4 8

and

>> c = b.^a
c =
2 1 64
Note that these operations apply to matrices as well as vectors. For exam-
ple:
>> A = [1 2 3; 4 5 6; 7 8 9];
>> B = [9 8 7; 6 5 4; 3 2 1];
>> C = A.*B
C =
9 16 21
24 25 24
21 16 9

Note that A.*B is not the same as A*B. Similarly,

>> C = A.^2
C =
1 4 9
16 25 36
49 64 81
The disp function is suitable for a simple printing task, while the fprintf function provides fine control over the displayed information as well as the capability of directing the output to a file. The disp function takes only one argument, which may be either a string matrix or a numerical matrix. More complicated strings can be printed using the fprintf function; this is essentially a C programming command that can be used to obtain a wide range of printing specifications. For example:
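As a minimal illustration of both commands (the message and the format string here are only examples):

>> disp('Solving the linear system')
Solving the linear system
>> fprintf('The solution is x = %6.4f\n', pi/4)
The solution is x = 0.7854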
A system of linear equations Ax = b is solved with the backslash operator, and the result can be assigned to a variable for display, as in:

>> x = A \ b

For example:
>> A = [1 1 1; 2 3 1; 1 − 1 − 2];
>> b = [2; 3; −6];
>> x = A \ b
x=
−1
1
2
There are a small number of functions that should be mentioned.
• Reduce a given matrix A to reduced row echelon form by using the rref function as:

>> rref(A)

For example:

>> A = [1 1 1; 2 3 1; 1 -1 -2];
>> rref(A)
ans =
1 0 0
0 1 0
0 0 1
• Compute the determinant of a square matrix A using the det function as:

>> det(A)
For example:
>> A = [1 2 − 1; 3 0 1; 4 2 1];
>> det(A)
ans = −6
• Compute the rank of a matrix A using the rank function as:

>> rank(A)
For example:
>> A = [1 4 5; 2 3 4; 4 7 8];
>> rank(A)
ans = 3
• Compute the inverse of a nonsingular matrix A using the inv function as:

>> inv(A)
For example:
>> A = [1 1 1; 1 2 4; 1 3 9];
>> inv(A)
ans =
3.0000 −3.0000 1.0000
−2.5000 4.0000 −1.5000
0.5000 −1.0000 0.5000
• Form the augmented matrix [A b] of the system Ax = b as:

>> C = [A b];
For example:
>> A = [1 1 1; 1 2 4; 1 3 9];
>> b = [2; 3; 4];
>> C = [A b]
C =
1 1 1 2
1 2 4 3
1 3 9 4
• Compute an LU factorization of A using the lu function as:

>> [L, U] = lu(A)

For example:
>> A = [1 4 5; 2 3 4; 4 7 8];
>> [L, U ] = lu(A)
L=
0.2500 1.0000 0.0000
0.5000 −0.2222 1.0000
1.0000 0.0000 0.0000
and
U=
4.0000 7.0000 8.0000
0.0000 2.2500 3.0000
0.0000 0.0000 0.6667
To obtain the factorization with an explicit permutation matrix P, so that LU = PA, use:

>> [L, U, P] = lu(A)

For example:
>> A = [1 4 5; 2 3 4; 4 7 8];
>> [L, U, P ] = lu(A)
L=
1 0 0
0.25 1 0
0.5 −0.2 1
and
U=
4 7 8
0 2.25 3
0 0 0.67
and
P =
0 0 1
1 0 0
0 1 0
• One can compute the various norms of the vectors and matrices by
using the norm function. The expression norm(A, 2) or norm(A)
gives the Euclidean norm or l2 -norm of A while norm(A,Inf) gives
the maximum or l∞ -norm. Here, A can be a vector or a matrix. The
l1 -norm of a vector or matrix can be obtained by norm(A,1). For
example, the different norms of a vector can be obtained as:

>> a = [6, 7, 9]
>> V1 = norm(a)
V1 = 12.8841
>> V2 = norm(a, 1)
V2 = 22
>> V3 = norm(a, Inf)
V3 = 9

Similarly, for a matrix:

>> A = [1 1 1; 1 2 4; 1 3 9]
>> M1 = norm(A)
M1 = 10.6496
>> M2 = norm(A, 1)
M2 = 14
>> M3 = norm(A, Inf)
M3 = 13

• The condition number of a matrix can be computed as the product of the norm of A and the norm of its inverse. For example:

>> A = [1 1 1; 1 2 4; 1 3 9]
>> B = inv(A)
B =
3.0000 −3.0000 1.0000
−2.5000 4.0000 −1.5000
0.5000 −1.0000 0.5000
>> N1 = norm(A, Inf)
N1 = 13
>> N2 = norm(B, Inf)
N2 = 8

so that cond(A, Inf) = N1 × N2 = 104.
• The roots of a polynomial p(x) can be obtained by using the roots function roots(p), where p is the vector of coefficients in descending powers of x. For example, if p(x) = 3x^2 + 5x − 6 is a polynomial, enter:
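One possible completion, entering the coefficients in descending powers:

>> p = [3 5 -6];
>> roots(p)
ans =
-2.4748
0.8081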
• Compute the eigenvalues and eigenvectors of a square matrix using the eig function in the form [U, D] = eig(A), where the columns of U are eigenvectors and the diagonal entries of D are the eigenvalues. For example:

>> A = [1 1 2; -1 2 1; 0 1 3];
>> [U, D] = eig(A)
U=
0.4082 −0.5774 0.7071
0.8165 0.5774 0.0000
−0.4082 −0.5774 0.7071
and
D=
1 0 0
0 2 0
0 0 3
To plot a function y = f(x) over an interval [a, b], define a vector x of points with spacing d, evaluate the function at those points, and call the plot function:

>> x = a : d : b;
>> y = f(x);
>> plot(x, y)

For example:
>> x = −2 : 0.1 : 2;
>> y = exp(x) + 10;
>> plot(x, y)
By default, the plot function connects the data with a solid line. The
markers used for points in a plot may be any of the following:
Symbol   Effect
.        Point
o        Circle
×        Cross
*        Star

For example, to put a marker at each data point of the above function plot, we use the following commands:

>> x = -2 : 0.1 : 2;
>> y = exp(x) + 10;
>> plot(x, y, 'o')
To plot several graphs using the hold on and hold off commands, one graph is plotted first with the plot command. Then the hold on command is typed. It keeps the Figure Window with the first plot open, including its axis properties and any formatting that was done. Additional graphs can be added with plot commands that are typed next; each plot command adds a graph to that figure. To stop this process, the hold off command is used.

Also, we can use the fplot command, which plots a function of the form y = f(x) between specified limits, as in fplot('f(x)', [a b]).
To make a mesh or surface plot of a function z = f(x, y), proceed as follows:

• Define the scaling vector for X. For example, to divide the interval [−2, 2] for x into subintervals of width 0.1, enter:

>> x = -2 : 0.1 : 2;

• Define the scaling vector for Y. In order to use the same scaling for y, enter:

>> y = x;

• Create the grid matrices X and Y from the scaling vectors with the meshgrid function, then define the matrix Z of function values and plot, e.g., for z = −3x + y:

>> [X, Y] = meshgrid(x, y);
>> Z = -3*X + Y;
>> mesh(X, Y, Z)
For example, to create a surface plot of z = sin(√(x² + y² + 1))/√(x² + y² + 1) on the domain −5 ≤ x ≤ 5, −5 ≤ y ≤ 5, we type the analogous commands; the result is shown in Figure C.4.

Figure C.4: Surface plot of z = sin(√(x² + y² + 1))/√(x² + y² + 1).
Subplots
Often, it is in our interest to place more than one plot in a single figure window. This is possible with the graphics command subplot, which is always called with three arguments, as in subplot(m, n, p): the figure window is divided into an m × n array of panes, and the following plot command draws into pane p.
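A minimal sketch, placing two plots side by side:

>> x = 0 : 0.1 : 2*pi;
>> subplot(1, 2, 1), plot(x, sin(x))
>> subplot(1, 2, 2), plot(x, cos(x))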
Similarly, one can use the subplot function when creating several surface plots in one figure window.
For example, in order to create the 1 × 4 row vector x with entries given by the formula x(i) = i, type:

>> for i = 1 : 4
x(i) = i
end

The action in this loop will be performed once for each value of the counter i, beginning with the initial value 1 and increasing by 1 each time, until the actions are executed for the last time with the final value i = 4.
>> x = 1;
while x > 0.01
x = x/2
end

which generates the successive values:

0.5000
0.2500
0.1250
0.0625
0.0313
0.0156
0.0078
For example, to create a 5 × 4 matrix A with entries A(i, j) = i + j, type:

>> for i = 1 : 5,
for j = 1 : 4,
A(i, j) = i + j;
end,
end
>> A
A =
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
6 7 8 9
C.3.5 Structure
Finally, we introduce the basic structure of MATLAB's logical branching commands. Frequently, in programs, we wish the computer to take different actions depending on the value of some variables. Strictly speaking, the tests are on logical variables or, more commonly, logical expressions similar to those we saw when defining while loops.
Two types of decision statements are possible in MATLAB, one-way
decision statements and two-way decision statements.
The syntax for the one-way decision statement is:

if condition
action
end

in which the statements in the action block are executed only if the condition is satisfied (true). If the condition is not satisfied (false), then the action block is skipped. The syntax for the two-way decision statement is:

if condition
action 1
else
action 2
end

in which the first action block is executed if the condition is satisfied, while the second action block is executed if the condition is not satisfied. For example, if x and y are two numbers and we want to display the larger of the two, we type:
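A minimal version of the example, assuming the intent is to display the larger of the two numbers:

if x > y
disp(x)
else
disp(y)
end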
Symbol Effect
& and
| or
∼ not
These operators apply not only to scalar variables; they will also work on vectors and matrices, element by element, when the operation is valid.
Symbol Effect
== is equal to
<= is less than or equal
>= is greater than or equal
∼= is not equal to
< is less than
> is greater than
For example, to define the function

f(x) = e^x − 2x/(1 + x^3),

type:

function y = fn1(x)
y = exp(x) - 2*x./(1 + x.^3);
Once this function is saved as an m-file named fn1.m, we can use the MATLAB Command Window to compute the function at any given point. For example, to tabulate the function at the points x = 0, 0.2, . . . , 2, one can type commands such as x = (0 : 0.2 : 2)'; [x fn1(x)], which produce:

ans =
0.0000 1.0000
0.2000 0.8246
0.4000 0.7399
0.6000 0.8353
0.8000 1.1673
1.0000 1.7183
1.2000 2.4404
1.4000 3.3073
1.6000 4.3251
1.8000 5.5227
2.0000 6.9446
A function can also be defined directly in the Command Window with the inline command:

name = inline('expression')

For example, the function f(x) = x^2/√(x^2 + 1) can be defined in the MATLAB Command Window as follows:

>> y = inline('x^2/sqrt(x^2 + 1)')
>> y(2)
ans =
1.7889
If x is expected to be an array and the function is to be calculated for each element, then the definition must be modified for element-by-element calculations. For example:
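A sketch of the vectorized definition evaluated on an array:

>> y = inline('x.^2./sqrt(x.^2 + 1)')
>> y([2 3])
ans =
1.7889 2.8460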
By default, MATLAB uses floating-point arithmetic for its calculations. But one can also do exact arithmetic with symbolic expressions, and we will give many examples of such exact arithmetic. The starting point for symbolic operations
is symbolic objects. Symbolic objects are made of variables and numbers
that, when used in mathematical expressions, tell MATLAB to execute
the expression symbolically. Typically, the user first defines the symbolic
variables that are needed and then uses them to create symbolic expressions
that are subsequently used in symbolic operations. If needed, symbolic
expressions can be used in numerical operations.
Many applications in mathematics, science, and engineering require
symbolic operations, which are mathematical operations with expressions
that contain symbolic variables. Symbolic variables are variables that don’t
have specific numerical values when the operation is executed. The result
of such operations is also a mathematical expression in terms of the sym-
bolic variables. Symbolic operations can be performed by MATLAB when
the Symbolic Math Toolbox is installed. The Symbolic Math Toolbox is
included in the student version of the software and can be added to the
standard program. The Symbolic Math Toolbox is a collection of MAT-
LAB functions that are used for execution of symbolic operations. The
commands and functions for the symbolic operations have the same style
and syntax as those for the numerical operations.
Symbolic computations are performed by computer programs such as Derive, Maple, and Mathematica. MATLAB also supports symbolic computation through the Symbolic Math Toolbox, which uses the symbolic routines of Maple. To check if the Symbolic Math Toolbox is installed on a computer, one can type:
>> ver
In general, numerical results are obtained much more quickly with numeri-
cal computation than with numerical evaluation of a symbolic calculation.
To perform symbolic computations, we must use syms to declare the vari-
ables we plan to use to be symbolic variables. For example, the quadratic
formula can be defined in terms of a symbolic expression by the following
kind of commands:
>> syms x a b c
>> sym(sqrt(a*x^2 + b*x + c))
ans =
(a*x^2 + b*x + c)^(1/2)
>> syms x y z
>> w = x^2 + y^2
w =
x^2 + y^2
>> u = x^2*y^2
u =
x^2*y^2
>> v = sqrt(w + u)
v =
(x^2 + y^2 + x^2*y^2)^(1/2)
The Collect Command

This command collects the terms in the expression that have the variable with the same power. In the new expression, the terms will be ordered in decreasing order of power. The forms of this command are:

>> collect(f)

or

>> collect(f, y)
For example, if f = (2x^2 + y^2)(x + y^2 + 3), then use the following commands:

>> syms x y
>> f = (2*x^2 + y^2)*(x + y^2 + 3)
>> collect(f)
ans =
2*x^3 + (2*y^2 + 6)*x^2 + y^2*x + y^2*(y^2 + 3)

But if we collect the terms with respect to y, then we do the following:

>> syms x y
>> f = (2*x^2 + y^2)*(x + y^2 + 3)
>> collect(f, y)
ans =
y^4 + (2*x^2 + x + 3)*y^2 + 2*x^2*(x + 3)
The Factor Command

This command factors the expression. The form of this command is:

>> factor(f)

For example, if f = x^3 − 3x^2 − 4x + 12, then use the following commands:

>> syms x
>> f = x^3 - 3*x^2 - 4*x + 12
>> factor(f)
ans =
(x - 2)*(x - 3)*(x + 2)
The Expand Command
This command multiplies out the factors of an expression. The form of this command is:

>> expand(f)

For example, if f = (x^3 − 3x^2 − 4x + 12)(x − 3)^3, then use the following commands:

>> syms x
>> f = (x^3 - 3*x^2 - 4*x + 12)*(x - 3)^3
>> expand(f)
ans =
x^6 - 12*x^5 + 50*x^4 - 60*x^3 - 135*x^2 + 432*x - 324
The Simplify Command

This command simplifies an expression. For example, if f = (x^3 − 3x^2 − 4x + 12)/(x − 3)^3, then use the following commands:

>> syms x
>> f = (x^3 - 3*x^2 - 4*x + 12)/(x - 3)^3
>> simplify(f)
ans =
(x^2 - 4)/(x - 3)^2
The Simple Command
This command finds a form of the expression with the fewest number of
characters. The form of this command is:
>> simple(f)

For example, if f = cos x cos y + sin x sin y, then using the simplify command we get:

>> syms x y
>> f = cos(x)*cos(y) + sin(x)*sin(y)
>> simplify(f)
ans =
cos(x)*cos(y) + sin(x)*sin(y)

But if we use the simple command, we get:

>> syms x y
>> f = cos(x)*cos(y) + sin(x)*sin(y)
>> simple(f)
ans =
cos(x - y)
The Pretty Command

This command displays a symbolic expression in an output form that resembles typeset mathematics. The form of this command is:

>> pretty(f)

For example, if f = √(x^3 − 3x^2 − 4x + 12), then use the following commands:

>> syms x
>> f = sqrt(x^3 - 3*x^2 - 4*x + 12)
>> pretty(f)

  3      2            1/2
(x  - 3 x  - 4 x + 12)
The findsym command lists the symbolic variables appearing in an expression. For example:

>> syms a b c x y z
>> f1 = a*x^2 + b*x + c
>> f2 = x*y*z
>> findsym(f1)
ans =
a, b, c, x
>> findsym(f2)
ans =
x, y, z
The subs command substitutes a numerical value for the default symbolic variable of an expression. For example:

>> syms x y
>> f = x^3*y + 12*x*y + 12
>> subs(f, 2)
ans =
32*y + 12
The command findsym(f, 1) returns the default symbolic variable of f (the variable closest to x in the alphabet). For example:

>> syms u v
>> f = u*v
>> findsym(f, 1)
ans =
v
Symbolic equations of the form f = 0 are solved with the solve command. For example:

>> syms x
>> f = x^3 - 2*x - 1
>> solve(f)
ans =
[-1]
[1/2*5^(1/2) + 1/2]
[1/2 - 1/2*5^(1/2)]
The numerical values of the solutions are obtained by typing double(ans):

>> double(ans)
ans =
−1.0000
1.6180
−0.6180
or type vpa(ans):
>> vpa(ans)
ans =
[−1.]
[1.6180339887498949025257388711907]
[−.61803398874989490252573887119070]
A system of equations can be solved by passing the equations as strings. For example:

>> syms x y
>> [x, y] = solve('3*x + 3*y = 2', 'x + 2*y^2 = 1')
x =
[5/12 - 1/12*33^(1/2)]
[5/12 + 1/12*33^(1/2)]
y =
[1/4 + 1/12*33^(1/2)]
[1/4 - 1/12*33^(1/2)]
Note that both solutions can be extracted with x(1), y(1), x(2), and y(2).
For example, type:
>> x(1)
ans =
5/12 - 1/12*33^(1/2)

and

>> y(1)
ans =
1/4 + 1/12*33^(1/2)
To solve an equation for a specified variable, pass that variable as the second argument. For example:

>> syms x y
>> solve('x + x*y^2 + 3*x*y = 3', 'y')
ans =
[1/2/x*(-3*x + (5*x^2 + 12*x)^(1/2))]
[1/2/x*(-3*x - (5*x^2 + 12*x)^(1/2))]
C.6.3 Calculus
Symbolic Differentiation

The diff command differentiates a symbolic expression with respect to its default variable:

>> diff(f)

or, with respect to a specified variable,

>> diff(f, y)

For example:

>> syms x
>> f = x^3 + 3*x^2 + 20*x - 12
>> diff(f)
ans =
3*x^2 + 6*x + 20
Note that if f = x^3 + x ln y + y e^(x^2) is taken, then MATLAB differentiates f with respect to x (the default symbolic variable) as:

>> syms x y
>> f = x^3 + x*log(y) + y*exp(x^2)
>> diff(f)
ans =
3*x^2 + log(y) + 2*y*x*exp(x^2)
If we want to differentiate f = x^3 + x ln y + y e^(x^2) with respect to y, then we use the MATLAB diff(f, y) command as:

>> syms x y
>> f = x^3 + x*log(y) + y*exp(x^2)
>> diff(f, y)
ans =
x/y + exp(x^2)
The numerical value of a symbolic expression is found by using the MATLAB subs command. For example, to find the derivative of f = x^3 + 3x^2 + 20x − 12 at x = 2, we do the following:

>> syms x
>> f = x^3 + 3*x^2 + 20*x - 12
>> df = diff(f)
>> subs(df, x, 2)
ans =
44
We can also find second and higher derivatives of expressions by using the command diff(f, 2), or diff(f, y, 2) for a variable other than the default. For example:

>> syms x y
>> f = x^3 + x*log(y) + y*exp(x^2)
>> diff(f, y, 2)
ans =
-x/y^2
Symbolic Integration

The indefinite integral of a symbolic expression is obtained with:

>> int(f)

or

>> int(f, y)

If in using the int(f) command the expression contains one symbolic variable, then the integration takes place with respect to that variable. But if the expression contains more than one variable, then the integration is performed with respect to the default symbolic variable. For example, to find the indefinite integral (antiderivative) of f = x^3 + x ln y + y e^(x^2) with respect to y, we use the MATLAB int(f, y) command as:

>> syms x y
>> f = x^3 + x*log(y) + y*exp(x^2)
>> int(f, y)
ans =
x^3*y + x*y*log(y) - x*y + 1/2*y^2*exp(x^2)
Similarly, for the case of a definite integral, we use the command:

>> int(f, a, b)

or

>> int(f, y, a, b)

where a and b are the limits of integration. Note that the limits a and b may be numbers or symbolic variables. For example, to determine the value of ∫₀¹ (x^2 + 3e^x + x ln y) dx, we use the following commands:

>> syms x y
>> f = x^2 + 3*exp(x) + x*log(y)
>> int(f, 0, 1)
ans =
-8/3 + 3*exp(1) + 1/2*log(y)
We can also use symbolic integration to evaluate an integral when f contains parameters. For example, to evaluate ∫₋∞^∞ e^(−ax^2) dx, we do the following:
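A minimal completion, declaring a positive so that the integral converges:

>> syms x
>> syms a positive
>> int(exp(-a*x^2), x, -inf, inf)
ans =
(pi/a)^(1/2)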
The Symbolic Math Toolbox provides the limit command, which allows us to obtain the limits of functions directly. For example, using the definition of the derivative,

f'(x) = lim_{h→0} [f(x + h) − f(x)]/h, provided the limit exists,

we can find the derivative of the function f(x) = x^2 with the following commands:
>> syms h x
>> f = (x + h)^2 - x^2
>> limit(f /h, h, 0)
ans =
2∗x
We can also find one-sided limits with the Symbolic Math Toolbox. To find the limit as x approaches a from the left, we use the command limit(f, x, a, 'left'), and from the right, limit(f, x, a, 'right'). For example, consider the one-sided limits

lim_{x→3⁻} |x − 3|/(x − 3)   and   lim_{x→3⁺} |x − 3|/(x − 3).

Since the limit from the left does not equal the limit from the right, the limit does not exist. This can be checked by using the following commands:
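A minimal version of the check:

>> syms x
>> limit(abs(x - 3)/(x - 3), x, 3, 'left')
ans =
-1
>> limit(abs(x - 3)/(x - 3), x, 3, 'right')
ans =
1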
The Symbolic Math Toolbox provides the taylor command, which allows us to obtain the analytical expression of the Taylor polynomial of a given function. In particular, having defined the function f, the command taylor(f, x, n + 1) returns the associated Taylor polynomial of degree n expanded about x₀ = 0. For example, to find the Taylor polynomial of degree three for f(x) = e^x sin x expanded about x₀ = 0, we use the following commands:
>> syms x
>> f = exp(x)*sin(x)
>> taylor(f, x, 4)
ans =
x + x^2 + 1/3*x^3
Ordinary differential equations are solved symbolically with the dsolve command; the general solution of an equation is obtained with:

>> dsolve('eq')

For example, for the first-order equation dy/dt = t + 3y/t:

>> dsolve('Dy = t + 3*y/t')
ans =
-t^2 + t^3*C1
For finding a particular solution of a first-order ordinary differential equation, we also supply the initial condition:

>> dsolve('Dy = t + 3*y/t', 'y(1) = 4', 't')
ans =
-t^2 + 5*t^3
Similarly, a higher-order ordinary differential equation can be solved symbolically using the same command. For example:

>> dsolve('D2y - 4*Dy - 5*y = 0', 'y(1) = 0', 'Dy(1) = 2', 'x')
ans =
1/3/exp(1)^5*exp(5*x) - 1/3*exp(1)*exp(-x)

that is, y = (1/3)e^(5(x−1)) − (1/3)e^(1−x).
Symbolic matrices are handled with the same functions as numerical matrices. For example:

>> syms x y
>> A = [3 2; x y]
>> det(A)
ans =
3*y - 2*x
>> inv(A)
ans =
[-y/(-3*y + 2*x), 2/(-3*y + 2*x)]
[ x/(-3*y + 2*x), -3/(-3*y + 2*x)]
>> b = [1; x]
>> A\b
ans =
[(2*x - y)/(-3*y + 2*x)]
[ -2*x/(-3*y + 2*x)]
The characteristic polynomial of a square matrix is obtained with the poly command. For example, for the matrix

A = [3 -1 0; -1 2 -1; 0 -1 3],

type:

>> A = [3 -1 0; -1 2 -1; 0 -1 3]
>> poly(sym(A))
ans =
x^3 - 8*x^2 + 19*x - 12
>> factor(ans)
ans =
(x - 1)*(x - 3)*(x - 4)
We can also get the eigenvalues and eigenvectors of a square matrix A sym-
bolically by using the eig(sym(A)) command. The form of this command
is as follows:
>> A = [3 -1 0; -1 2 -1; 0 -1 3]
>> [X, D] = eig(sym(A))
X =
[ 1, -1, 1]
[-1, 0, 2]
[ 1, 1, 1]
D =
[4, 0, 0]
[0, 3, 0]
[0, 0, 1]
A symbolic expression Z can be plotted with the ezplot command. The forms of this command are:

>> ezplot(Z)

or

>> ezplot(Z, [xmin, xmax])

For example:

>> syms x
>> Z = (2*x^2 + 2)/(x^2 - 64)
>> ezplot(Z)

and we obtain Figure C.7.
Note that ezplot can also be used to plot a function that is given in a
parametric form. For example, when x = cos 2t and y = sin 4t, we use the
following commands:
>> syms t
>> x = cos(2*t)
>> y = sin(4*t)
>> ezplot(x, y)
and we obtain Figure C.8.
Command   Description
diff      differentiate
int       integrate
limit     limit of an expression
symsum    summation of series
taylor    Taylor series expansion
det       determinant
diag      create or extract diagonals
eig       eigenvalues and eigenvectors
inv       inverse of a matrix
expm      exponential of a matrix
rref      reduced row echelon form
svd       singular value decomposition
C.9 Summary
MATLAB has a wide range of capabilities. In this book, we used only a
small portion of its features. We found that MATLAB’s command struc-
ture is very close to the way one writes algebraic expressions and linear
algebra operations. The names of many MATLAB commands closely par-
allel those of the operations and concepts of linear algebra. We gave de-
scriptions of commands and features of MATLAB that related directly to
this course. A more detailed discussion of MATLAB commands can be
found in the MATLAB user guide that accompanies the software and in
the following books:
Linear Algebra LABS with MATLAB, 2nd ed., by David R. Hill and David E. Zitarelli (Prentice–Hall, 1996).
For a very complete introduction to MATLAB graphics, one can use the
following book:
Graphics and GUIs with MATLAB, 2nd ed., by P. Marchand (CRC Press,
1999).
There are many websites to help you learn MATLAB, and you can locate many of them by using a web search engine. Alternatively, the MATLAB software provides immediate on-screen descriptions through the Help command, or one can contact The MathWorks at www.mathworks.com.
C.10 Problems
1. Evaluate each of the following expressions in the Command Window:
(a) (15 + 17)^2 + 165^(3/2)/4 + 765^2/24.
(b) (2.55 + ln(4))^3 + (√(45^3) + 23)^3/17 + (101/21)^2/15.
(c) (e^2 + 245)^3 + (165 + 2e^(3/2))^2/134 + (√876 + 234)^4/342.
(d) (e^(4/5) + ln(7))^2 + (√(788^5) + 120)^2/111 + (e^4 + 333)^4/254.
2. Evaluate each of the following expressions in the Command Window:
(a) (sin(4π/3) + 0.7757)^2 + cos^2(2π/3) + 2 cos(2π/3) sin(5π/3).
(b) (sin^2(7π/4) + ln(2.5))^3 + e^3 cos(12π/5) + cos(π/3) tan(π/4).
(c) (tan(π/3) + e^(0.5))^(1/2) + (sin^3(π/6) + 3 cos(π/6))/4 + sin^3(π/6).
(d) (e^2 + ln(3.5))^(3/2) + (√22 + 12 cos(4π))/24 + (e^2 + ln(4.5) sin(0.75))^2/12.
3. Evaluate each of the following expressions in the Command Window:
(b) sec(5π/4) tan^2(5π/5) + 2 cos^2(π/4) sin(7π/4).
(c) cot^2(5π/4) sin^2(3π/5) + 3[csc^2(π/4) sec^2(9π/4)]^2.
(d) [tan(π/6) sin(π/4)]^2 + cos^3(π/6)[csc^2(π/6) sec^2(π/6)]^3.
4. Define the variable x and calculate each of the following in the Command Window:
(a) f(x) = x^4 + 23x^3 + 19x^2 + 2x + 32; x = 2.5.
(b) f(x) = x^4 + ln(x^2 + 2) + (2√(x^3) + 23x^2 + 1.2x)/x + ((x^4 − 12)/13)^2/5; x = 4.5.
(c) f(x) = (e^(x^2+1) + 2x^3 + 5x)^4 + (15x^3 + 2e^(x/2))^4/x + (√(sin x) + x)^2/2; x = 12.5.
(d) f(x) = 5(e^(x+3) + ln(x^3 − 2))^3 + (√((x + 18)^6) + sin(x + 1))^3/33; x = 35.5.
5. Define the variables x, y, and z and calculate each of the following in the Command Window:
(c) w = √(cos(xy + z)) + ln(x^2 + y^3 + z^4) + tan(xy); x = 5.5, y = 6.5, z = 8.5.
(d) w = ln(√(x + y^3)) + cos(x^3 y) + 15x^2 yz^4; x = 11.0, y = 12.0, z = 13.0.
6. Create a vector that has the following elements using the Command
Window:
(a) 19, 4, 31, e^25, 63, cos(π/6), ln(3).
(b) π, 44, 101, e^2, 116, sin(11π/4), ln(2).
(c) 35, 40, 321, e^7, 406, cos^3(5π/4), 2 ln(7).
(d) 60, 4 ln(4), 17, 1 + e^3, 83 sin(5π/6).
7. Plot the function f(x) = (x^3 − 4x + 1)/(x^3 − x + 2) for the domain −5 ≤ x ≤ 5.
8. Plot the function f (x) = 4x cos x − 3x and its derivative, both on the
same plot, for the domain −2π ≤ x ≤ 2π.
9. Plot the function f (x) = 4x4 − 3x3 + 2x2 − x + 1, and its first and
second derivatives, for the domain −3 ≤ x ≤ 5, all in the same plot.
11. Use the fplot command to plot the function f(x) = 0.25x^4 − 0.15x^3 + 0.5x^2 − 1.5x + 3.5 for the domain −3 ≤ x ≤ 3.
12. Make a 3-D plot of the curve given by the parametric equations
x = (1 − 2 sin t) cos t, y = (1 − 2 sin t) sin t, z = 3t^2.

13. Make a 3-D plot of the curve given by the parametric equations
x = 1 + sin t, y = 1 + cos t, z = 3t^4.
14. Make a 3-D surface plot and contour plot (both in the same figure) of the function z = (x + 2)^2 + 3y^2 − xy in the domain −5 ≤ x ≤ 5 and −5 ≤ y ≤ 5.

15. Make a 3-D mesh plot and contour plot (both in the same figure) of the function z = (x − 2)^2 + (y − 2)^2 + xy in the domain −5 ≤ x ≤ 5 and −5 ≤ y ≤ 5.
16. Define x as a symbolic variable and create the two symbolic expressions

P1 = x^4 − 6x^3 + 12x^2 − 9x + 3 and
P2 = (x + 2)^4 + 5x^3 + 17(x + 3)^2 + 12x − 20.

Use symbolic operations to determine the simplest form of each of the following expressions:
(i) P1 · P2.
(ii) P1 + P2.
(iii) P1/P2.
(iv) Use the subs command to evaluate the numerical value of the results for x = 15.
17. Define x as a symbolic variable and create the two symbolic expressions

P1 = ln(sin(x + 2) + x^2) − 12√(x^3 + 12) − e^(x+2) − 15x + 10 and
P2 = (x − 2)^3 + 11x^2 + 9x − 15.

Use symbolic operations to determine the simplest form of each of the following expressions:
(i) P1 · P2.
(ii) P1 + P2.
(iii) P1/P2.
(iv) Use the subs command to evaluate the numerical value of the results for x = 9.
18. Define x as a symbolic variable.
(i) Show that the roots of the polynomial
f (x) = x5 − 12x4 + 15x3 + 200x2 − 276x − 1008
are −3, −2, 4, 6, and 7 by using the factor command.
(ii) Derive the equation of the polynomial that has the roots
x = −5, x = 4, x = 2, and x = 1.
19. Define x as a symbolic variable.
(i) Show that the roots of the polynomial
f (x) = x5 − 30x4 + 355x3 − 2070x2 + 5944x − 6720
are 4, 5, 6, 7, and 8 by using the factor command.
(ii) Derive the equation of the polynomial that has the roots
x = 3, x = 2, x = 1, and x = 0.
20. Find the fourth-degree Taylor polynomial for the function f (x) =
(x3 + 1)−1 , expanded about x0 = 0.
21. Find the fourth-degree Taylor polynomial for the function f (x) =
x + 2 ln(x + 2), expanded about x0 = 0.
22. Find the fourth-degree Taylor polynomial for the function f (x) =
(x + 1)ex + cos x, expanded about x0 = 0.
23. Find the general solution of the ordinary differential equation
y 0 = 2(y + 1).
Then find its particular solution by taking the initial condition y(0) =
1 and plot the solution for −2 ≤ x ≤ 2.
24. Find the general solution of the second-order ordinary differential
equation
y 00 + xy 0 − 3y = x2 .
Then find its particular solution by taking the initial conditions
y(0) = 3, y 0 (0) = −6 and plot the solution for −4 ≤ x ≤ 4.
25. Find the inverse and determinant of the matrix

A = [3 -1 5; p 4 2q; q 3p 5].

Then find the roots of the determinant by using the factor command. Also, find the eigenvalues and the eigenvectors of A.
D.0.1 Chapter 1
1. C = [1 −5 −1; −2 1 −3; −2 −1 −8].

3. |AB| = det([1 −5 −1; −2 1 −3; −2 −1 −8]) = 0.
5. x = −3, y = 2.
7. (a) B, (c) E.
9. (a) Row E.F. = [2 3 4; 0 1 2; 0 0 −3], x = [4/3, 1/3, −2/3]^T.
(c) Row E.F. = [3 0 1; 0 −1 0; 0 0 1], x = [−1/3, −3, 2]^T.
13. det(A) = a21 c21 + a22 c22 + a23 c23 = 0 + 3(−33) + 5(37) = 86.
det(B) = a13 c13 + a23 c23 + a33 c33 = 4(−152) + 6(−107) + 12(−8) = −1346.
det(C) = a12 c12 + a22 c22 + a32 c32 = −8(−52) + 1(−45) + 10(94) = 1311.
15. det(A) = det([3 1 −1; 0 −2/3 14/3; 0 0 −36]) = (3)(−2/3)(−36) = 72.
det(B) = det([4 1 6; 0 27/4 17/2; 0 0 83/27]) = (4)(27/4)(83/27) = 83.
det(C) = det([17 46 7; 0 −87/17 −4/17; 0 0 10]) = (17)(−87/17)(10) = −870.
23. Adj(A) = [12 16 −27; −5 9 13; −10 −11 19],
(Adj(A))^(−1) = [−314/707 1/101 −451/707; 5/101 6/101 3/101; −145/707 4/101 −188/707],
det(Adj(A)) = −707.

Adj(B) = [32 −10 26; 20 26 −16; −41 37 7],
(Adj(B))^(−1) = [1/86 2/129 −1/129; 1/129 5/258 2/129; 7/258 −1/86 2/129],
det(Adj(B)) = 66564.

Adj(C) = [4 2 −16; −1 −11 4; −11 5 2],
(Adj(C))^(−1) = [−1/42 −1/21 −2/21; −1/42 −2/21 0; −1/14 −1/42 −1/42],
det(Adj(C)) = 1764.
25. Adj(A) = [−2 11 −3; −8 −10 15; 7 2 −3], det(A) = 27,
A^(−1) = [−2/27 11/27 −1/9; −8/27 −10/27 5/9; 7/27 2/27 −1/9].

Adj(B) = [48 −8 34; 41 18 −2; −19 28 30], det(B) = 298,
B^(−1) = [24/149 −4/149 17/149; 41/298 9/149 −1/149; −19/298 14/149 15/149].
Adj(C) = [74 108 44 −80; −190 112 −128 −20; 4 −96 −208 176; 90 66 −232 68], det(C) = −1112,
C^(−1) = [−37/556 27/278 −11/278 10/139; 95/556 −14/139 16/139 5/278; −1/278 12/139 26/139 −22/139; −45/556 −8/139 29/139 −17/278].
27. (a) α = 1.
(c) α = 11.
29. (a) A^(−1) = [0 1/7 1/7; 1/5 −3/35 4/35; −2/5 −4/35 17/35], x = [1, 2, 3]^T.
(c) A^(−1) = [3/16 1/16 1/16; −1/64 5/64 −11/64; −17/192 7/64 5/192], x = [−1/2, 3/8, −5/24]^T.

31. (a) A^(−1) = [−157/5 203/5 2/5 36/5; 17 −22 0 −4; 13 −17 0 −3; 81/5 −104/5 −1/5 −18/5].
33. (a) det(A) = −50, det(A1) = −58, det(A2) = −44, det(A3) = −66,
x = [−7/2, 2, 3/2]T .
(c) U = [1 2 0; 0 2 −2; 0 0 2], x = [−1, 2, 3]^T.
41. (a) α 6= 3, α 6= 5.
(c) α 6= 0, α 6= ±1.
43. A^(−1) = [−1/7 3/7 −1/7; −4/7 3/14 3/7; 6/7 −4/7 −1/7], B^(−1) = [46/11 3/11 −2; −3/11 1/11 0; −2 0 1],
C^(−1) = [10/31 −2/31 −5/31; 7/31 11/31 −19/31; −11/31 −4/31 21/31].
55. (a) L = [1 0 0 0; −1 1 0 0; 2/3 −11/15 1 0; 7/3 1/3 −1/3 1],
U = [3 −2 1 1; 0 5 5 −2; 0 0 6 28/15; 0 0 0 133/45].

(c) L = [1 0 0 0; 5 1 0 0; 1 −3/8 1 0; 1/2 5/8 −9 1],
U = [2 2 3 −2; 0 −8 −2 21; 0 0 1/4 127/8; 0 0 0 551/4].

(e) L = [1 0 0 0; 12 1 0 0; 22 −53/5 1 0; 8 −32/5 628/1119 1],
U = [1 −1 10 8; 0 −5 −109 −74; 0 0 −6812/5 −4807/5; 0 0 0 2541/232],
y = [−2, 31, 1893/5, 727/105]^T,
x = [758/1851, 391/1723, −501/692, 342/541]^T.
57. (a) U = [2 3 −1; 0 1/2 3/2; 0 0 1], det(A) = (2)(1/2)(1) = 1.
(c) U = [2 4 1; 0 −3 1/2; 0 0 5/6], det(A) = (2)(−3)(5/6) = −5.
(e) U = [1 −1 0; 0 1 1; 0 0 1], det(A) = (2)(−3)(5/6) = −5.
59. (a) A = LDV = [1 0 0; 2/3 1 0; 1/3 5 1] [3 0 0; 0 1/3 0; 0 0 1] [1 4 3; 0 1 1; 0 0 1].
B = LDV = [1 0 0; 5/4 1 0; 1 10/9 1] [4 0 0; 0 9/2 0; 0 0 21/2] [1 −2 3; 0 1 −27/4; 0 0 1].

(c) A = LDV = [1 0 0; 2 1 0; 3 17/13 1] [1 0 0; 0 13 0; 0 0 126/13] [1 −5 4; 0 1 −12/13; 0 0 1].
B = LDV = [1 0 0; 5/4 1 0; 3/2 58/15 1] [4 0 0; 0 −15/4 0; 0 0 25/3] [1 7/4 −3/2; 0 1 −2/3; 0 0 1].
61. (a) L = [2 0 0; −3 5/2 0; 1 −1/2 3/5], U = [1 −1/2 1/2; 0 1 1/5; 0 0 1].

(c) L = [2 0 0; 1 1 0; 3 0 1], U = [1 1 1; 0 1 0; 0 0 1],
y = [0, −4, 1]^T, x = [3, −4, 1]^T.

(e) L = [1 0 0; 2 1 0; 2 7 −1], U = [1 −1 0; 0 1 1; 0 0 1],
y = [2, 0, 1]^T, x = [1, −1, 1]^T.
63. (a) L = [1 0 0; −1 2 0; 1 0 3], L^T = [1 −1 1; 0 2 0; 0 0 3],
y = [2, 2, 0]^T, x = [3, 1, 0]^T.

(c) L = [2 0 0; 1 4 0; 3/2 −1/8 253/153], L^T = [2 1 3/2; 0 4 −1/8; 0 0 253/153].
65. (a) L = [1 0 0; −1/3 1 0; 0 −3/8 1], U = [3 −1 0; 0 8/3 −1; 0 0 21/8].

(c) L = [1 0 0 0; −1/4 1 0 0; 0 −4/15 1 0; 0 0 −15/56 1],
U = [4 −1 0 0; 0 15/4 −1 0; 0 0 56/15 −1; 0 0 0 209/56].
83. (a) y = 4 − 3x + x^2. (c) y = 3 + 2x. (e) y = 5 + x + 2x^2.

85. (a) l1 = 3, l2 = 1, l3 = 2.

87. x2 = 0.

89. x1 = 10/7, x2 = 12/7, and x3 = 10/7.

91. (a) 4FeS2 + 11O2 → 2Fe2O3 + 8SO2.
(c) 2C4H10 + 13O2 → 8CO2 + 10H2O.

93. x = 55/8, y = 25/4, z = 5/8.

95. 80,000 yen, 900 francs, 1200 marks.
D.0.2 Chapter 2
D.0.3 Chapter 3
7. (a) [1 1 2; 1 2 3; 2 4 1], (c) [−5 1 0; −2 −8 1/2; 1 −10 1], (e) [0 1 1; 1 −1 0; 1 1 1].

17. (a) λ^3 − 2λ^2 − λ + 2 = 0, A^(−1) = [−5 −1/2 5/2; −4 −1/2 3/2; −12 −1 6].
(c) λ^3 + 2λ^2 − 11λ − 12 = 0, A^(−1) = [−3/4 −1/4 −1; −1/2 −1/2 1; 1/12 −1/12 1/3].
21. f1(t) = −(2/3)e^(4t) + (5/3)e^(−2t),
f2(t) = −(2/3)e^(4t) − (1/3)e^(−2t).

23. (a) f1(t) = −(1/2)e^(2t) + (3/2)e^(4t),
f2(t) = e^(2t),
f3(t) = −(1/2)e^(2t) + (3/2)e^(4t).

(c) f1(t) = −7e^(−3t) + 8e^t,
f2(t) = −3e^(−3t) + 4e^t,
f3(t) = 5e^(−3t) − 4e^t.

25. (a) a_n = (1/3)(2^(n+1) + 2^n), a_20 = 1048576.
(c) a_n = (−1)^n, a_15 = −1.
D.0.4 Chapter 4
X = [0.7000 −0.6414 −0.3140; 0.4357 0.7319 −0.5240; 0.5659 0.2300 0.7918].

X = [0.7933 −0.1710 −0.5843; 0.4934 0.7429 0.4525; 0.3567 −0.6472 0.6737].

X = [0.6068 0.4539 0.4156 −0.5031; −0.4738 0.5814 0.5323 0.3928; −0.4513 −0.4775 0.5215 −0.5444].

(c) T = [4.0000 4.5826 0.0000 0.0000; 4.5826 5.0476 −3.3873 0.0000; 0.0000 −3.3873 7.1666 −1.1701; 0.0000 0.0000 −1.1701 5.7858].

17. (a) A(15) = [5.4713 −0.7373 2.1419; −0.0097 −3.4713 1.9705; 0.0000 0.0000 1.0000],
(c) A(15) = [3.8284 −0.0001 0.0000; −0.0001 −1.8284 −0.0002; 0.0000 −0.0002 1.0000],
λ = 3.8284, −1.8284, 1.0000.

19. (a) λ = 4.3028, 0.70, 0, Iterations = 8.

The 42nd QR iteration is
[26.3046 −30.0959 76.9405 19.6782; 0.0000 −13.4367 53.4580 9.0694; 0.0000 −3.0615 5.7097 −2.2534; 0.0000 −0.0000 −0.0000 6.4224].
27. (a) A = UDV^T with U = [0 1; 1 0], D = [3 0; 0 2], V^T = [−1 0; 0 −1].

(c) A = UDV^T with U = [0.5774 0 0.8165; 0.5774 0.7071 −0.4082; −0.5774 0.7071 0.4082], D = [1.7321 0; 0 1.4142; 0 0], V^T = [0 1; 1 0].
D.0.5 Chapter 5
5. α = 1/2.
2.0000 0 0 0 0
2.2361 0.2361 0 0 0
2.4495 0.2134 −0.0113 0 0
2.6458 0.1963 −0.0086 0.0009 0
2.8284 0.1826 −0.0069 0.0006 −0.0001
43. (a) A⁺ = [−1/75 −2/15 43/75; 3/25 1/5 −4/25].
(c) A⁺ = [−13/159 −25/318 89/318; 29/159 34/159 −32/159].

45. (a) A⁺ = [11/17 −5/17; −1/17 2/17], x̂ = [19/17, 6/17]^T.
(c) A⁺ = [−38/29 −9/29 17/29; −5/29 −5/29 3/29; 41/29 12/29 −13/29], x̂ = [−30/29, −7/29, 40/29]^T.
D.0.6 Chapter 6
x1 ≥ 0, x2 ≥ 0.
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x4 ≥ 0, x5 ≥ 0.
x1 ≥ 0, x2 ≥ 0.
y1 + y2 ≥ −5
−y1 + 3y2 ≥ 2
y1 ≥ 0, y2 ≥ 0.
(c) maximize V = 2y1 + 5y2
subject to the constraints
6y1 + 3y2 ≤ 6
−3y1 + 4y2 ≤ 3
y1 + y2 ≤ 0
y1 ≥ 0, y2 ≥ 0.
D.0.7 Chapter 7
1. (a) 1. (c) √2/4.

3. (a) 15/(16x^(9/4)).
(c) 6 − 3 cos x + x sin x − 1/x^2.

5. (a) ∂²f/∂x² = y^2 e^(−xy) cos y.
(c) ∂²f/∂x² = (x^2 − 1)e^(−xy) cos y + 2x e^(−xy) sin y.

7. (a) [5, 1, 2]^T.
(c) [3/ln 36, 12/ln 36, 27/ln 36]^T.

9. (a) H(1, −1) = [12 5; 5 12], (c) H(1, 2, 2) = [6 0 0; 0 48 5; 0 5 24].
D.0.8 Appendix A
1. 10, 37, 99/128.

3. 8(10)^3 + 3(10)^2 + 8(10) + 3,
2(10)^2 + 8(10) + 5 + 6(10)^(−1) + 2(10)^(−2) + 5(10)^(−3),
4(10)^2 + (10) + 3 + (10)^(−1) + 4(10)^(−2) + (10)^(−3) + · · · .
13. (a) Absolute error = 7.346 × 10−6 , Relative error = 2.3383 × 10−6 .
(b) Absolute error = 7.346×10−4 , Relative error = 2.3383×10−4 .
D.0.9 Appendix B
9. (a) α = −5.
(c) α = 1.
13. 27 joules.
λ2 = 1.3014 + 0.6707i
x2 = [0.2163 + 0.1536i, −0.2378 − 0.3653i, −0.8600]T .
λ3 = 1.3014 − 0.6707i
x3 = [0.2163 − 0.1536i, −0.2378 + 0.3653i, −0.8600]T .
(c)
λ1 = 0.0000
x1 = [−0.8944, −0.4472, 0.0000]T .
λ2 = −38.0000 + 18.0000i
x2 = [0.4020 + 0.0783i, 0.7732, −0.3726 + 0.3090i]T .
λ3 = −38.0000 − 18.0000i
x3 = [0.4020 − 0.0783i, 0.7732, −0.3726 − 0.3090i]T .
D.0.10 Appendix C
3. (a) 2.4755.
(c) 24.9045.
7. 9.
11. 13.
15.
19. (i) (x − 5)*(x − 6)*(x − 7)*(x − 8)*(x − 4).
(ii) x^4 − 6*x^3 + 11*x^2 − 6*x.
25. det(A) = 60 − 18*q*p + 5*p + 15*p^2 − 2*q^2 − 20*q, and, writing d = 60 − 18*q*p + 5*p + 15*p^2 − 2*q^2 − 20*q,
[−2*(−10 + 3*q*p)/d, 5*(1 + 3*p)/d, −2*(q + 10)/d]
[−(5*p − 2*q^2)/d, −5*(−3 + q)/d,
29. [2]
[−2]
[−2]