$$x + y = (x_1 + y_1,\ x_2 + y_2,\ \ldots,\ x_n + y_n), \qquad cx = (cx_1,\ cx_2,\ \ldots,\ cx_n)$$
$$x \cdot y = \sum_{i=1}^{n} x_i y_i$$
$$|x| = (x \cdot x)^{1/2} = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$$
The structure now defined (the vector space Rn with the above inner product and norm) is called Euclidean n-space.
(a) |x| ≥ 0
$$c_1 x_1 + c_2 x_2 + \cdots + c_k x_k$$
Example 1.2.4. If x1 = (1, 2, 4) and x2 = (−1, 3, 1), then x = (1, 12, 14) is a linear combination of the vectors x1 and x2 , since x = (1, 12, 14) = 3x1 + 2x2 .
$$\sum_{j=1}^{i+1} a_j y_j + \sum_{k=1}^{r-i} b_k x_k = 0$$
If all bk 's were 0, the independence of Q would force all aj 's to be 0, a contradiction. It follows that some xk ∈ Si is a linear combination of the other members of Ti = Si ∪ {yi+1 }. Remove this xk from Ti and call the remaining set Si+1 . Then Si+1 spans the same set as Ti , namely X. Starting with S0 , we thus construct sets S1 , ..., Sr . The set Sr consists of y1 , ..., yr , and our construction shows that it spans X. But Q is independent; hence yr+1 is not in the span of Sr , which is a contradiction.
Proof. Since {e1 , ..., en } spans Rn , Theorem 1.2.11 shows that dim Rn ≤ n. Since {e1 , ..., en } is independent, dim Rn ≥ n by Definition 1.2.8.
Proof.
(a) Suppose E = {x1 , ..., xn }. Since dim X = n, the set {x1 , ..., xn , y} is dependent for every y ∈ X.
Let y ∈ X. If E is independent, the dependence of {x1 , ..., xn , y} gives scalars a1 , b1 , ..., bn , not all 0, such that a1 y + b1 x1 + ... + bn xn = 0, and the independence of E forces a1 ≠ 0. So y is in the span of E; hence E spans X.
Conversely, if E is dependent, one of its members can be removed without changing the span of E (by the same construction of Si+1 from Ti as in the proof of Theorem 1.2.11). Hence the span of E is the same as the span of E′ for some set E′ ⊂ E containing n − 1 elements. Hence, by Theorem 1.2.11, dim(span of E′) ≤ n − 1, so E cannot span X.
(b) Since dim X = n, X contains an independent set of n vectors, and (a) shows that every such set is a basis of X. Suppose that E′ is a basis of X. Then, by the definition of basis, E′ is an independent set whose span is X. Since dim X = n, every independent set has at most n elements by Definition 1.2.8, so E′ has at most n elements. If E′ had fewer than n elements, then by Theorem 1.2.11, dim X ≤ n − 1, which is not possible. So E′ has exactly n elements.
(c) Let {x1 , ..., xn } be a basis of X. The set
$$S = \{y_1, \ldots, y_r, x_1, \ldots, x_n\}$$
spans X and is dependent, since it contains more than n vectors. The argument used in the proof of Theorem 1.2.11 shows that one of the xi 's is a linear combination of the other members of S. If we remove this xi from S, the remaining set still spans X. This process can be repeated r times and leads to a basis of X which contains {y1 , ..., yr }, by (a).
for all x, x1 , x2 ∈ X and all scalars c. Note that one often writes Ax instead
of A(x) if A is linear.
Exercise 1.3.2. Prove that if the mapping A of a vector space X into a vector
space Y is linear then A(0) = 0.
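One way to see this uses only the additivity in the definition of linearity:
$$A(0) = A(0 + 0) = A(0) + A(0),$$
and adding −A(0) to both sides gives A(0) = 0.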
Observe that a linear transformation A of X into Y is completely deter-
mined by its action on any basis. If {x1 , ..., xn } is a basis of X, then every
x ∈ X has a unique representation of the form
$$x = \sum_{i=1}^{n} c_i x_i$$
and the linearity of A allows us to compute Ax from the vectors Ax1 , ..., Axn ,
and the coordinates c1 , . . . , cn by the formula
$$Ax = \sum_{i=1}^{n} c_i A x_i \tag{1.1}$$
(Hint: For any x ∈ Rn with x ≠ 0, let y = x/|x|; then |Ay| ≤ ‖A‖.)
Exercise 1.3.8. If λ is such that |Ax| ≤ λ|x| for all x ∈ Rn , then ‖A‖ ≤ λ.
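A short sketch of the argument, recalling that ‖A‖ = sup{|Ax| : |x| ≤ 1}: if |x| ≤ 1, the hypothesis gives |Ax| ≤ λ|x| ≤ λ, so
$$\|A\| = \sup_{|x| \leq 1} |Ax| \leq \lambda.$$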
$$|A(x)| = |y \cdot x| \leq |y|\,|x|$$
Theorem 1.3.10.
$$\|BA\| \leq \|B\|\,\|A\|$$
Proof. If |x| ≤ 1 and x = Σ ci ei , then |ci | ≤ |x| ≤ 1 for each i, so that
$$|Ax| = \Big|\sum_{i=1}^{n} c_i\, Ae_i\Big| \leq \sum_{i=1}^{n} |c_i|\,|Ae_i| \leq \sum_{i=1}^{n} |Ae_i|.$$
Hence
$$\|A\| \leq \sum_{i=1}^{n} |Ae_i| < \infty$$
(i) ‖A − B‖ ≥ 0, and ‖A − B‖ = 0 implies that A(x) = B(x) for every x ∈ Rn with |x| ≤ 1. So for 0 ≠ x ∈ Rn , ‖A − B‖ = 0 implies that A(x/|x|) = B(x/|x|), hence A(x) = B(x) by linearity. So A = B.
(ii) ‖A − B‖ = ‖B − A‖
(iii) We have the triangle inequality
$$\|A - C\| = \|(A - B) + (B - C)\| \leq \|A - B\| + \|B - C\|$$
So L(Rn , Rm ) is a metric space with the norm metric.
$$\|B - A\|\,\|A^{-1}\| < 1$$
Then B ∈ Ω.
$$B^{-1} - A^{-1} = B^{-1}(A - B)A^{-1}$$
$$\|B^{-1} - A^{-1}\| \leq \|B^{-1}\|\,\|A - B\|\,\|A^{-1}\| \leq \frac{\beta}{\alpha(\alpha - \beta)}$$
If B → A, then β → 0, which implies ‖B⁻¹ − A⁻¹‖ → 0; that is, f(B) → f(A).
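To see where the final bound comes from, assume (as in the standard statement of this result) that ‖A⁻¹‖ = 1/α and ‖B − A‖ = β < α. Then |Ax| ≥ α|x| for every x, so
$$|Bx| \geq |Ax| - |(A - B)x| \geq (\alpha - \beta)|x|,$$
which gives ‖B⁻¹‖ ≤ (α − β)⁻¹, and hence
$$\|B^{-1}\|\,\|A - B\|\,\|A^{-1}\| \leq \frac{1}{\alpha - \beta} \cdot \beta \cdot \frac{1}{\alpha} = \frac{\beta}{\alpha(\alpha - \beta)}.$$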
Matrices:
Suppose {x1 , ..., xn } and {y1 , ..., ym } are bases of vector spaces X and Y re-
spectively. Then every A ∈ L(X, Y ) determines a set of numbers aij such that
$$Ax_j = \sum_{i=1}^{m} a_{ij}\, y_i \qquad (1 \leq j \leq n) \tag{1.3}$$
Observe that the coordinates aij of the vector Axj (with respect to the basis {y1 , . . . , ym }) appear in the j th column of [A]. The vectors Axj are therefore
sometimes called the column vectors of [A]. With this terminology, the range
of A is spanned by the column vectors of [A].
If x = Σcj xj , the linearity of A, combined with 1.3, shows that
$$Ax = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} c_j \right) y_i \tag{1.4}$$
Thus the coordinates of Ax are Σj aij cj . Note that in 1.3 the summation ranges
over the first subscript of aij , but that we sum over the second subscript when
computing coordinates.
Suppose next that an m by n matrix is given, with real entries aij . If A
is then defined by 1.4, it is clear that A ∈ L(X, Y ) and that [A] is the given
matrix. Thus there is a natural 1-1 correspondence between L(X, Y ) and the
set of all real m by n matrices.
Example 1.3.12. Let A : R2 → R2 by A(x1 , x2 ) = (2x1 +3x2 , 4x1 −x2 ). Then
$$[A] = \begin{pmatrix} 2 & 3 \\ 4 & -1 \end{pmatrix}$$
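This illustrates the earlier remark that the columns of [A] are the images of the basis vectors:
$$Ae_1 = A(1, 0) = (2, 4), \qquad Ae_2 = A(0, 1) = (3, -1),$$
which are exactly the two columns of [A]; and for x = c1 e1 + c2 e2 , formula 1.4 gives
$$Ax = (2c_1 + 3c_2)\, e_1 + (4c_1 - c_2)\, e_2,$$
in agreement with the definition of A.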
1.4 Differentiation
Definition 1.4.1. Suppose E is an open set in Rn , f maps E into Rm , and x ∈ E. If there exists a linear transformation A of Rn into Rm such that
$$\lim_{h \to 0} \frac{|f(x + h) - f(x) - Ah|}{|h|} = 0 \tag{1.7}$$
then we say that f is differentiable at x, and we write
$$f'(x) = A \tag{1.8}$$
(c) Equation 1.9 shows that f is continuous at any point at which f is differ-
entiable.
(Hint: Every linear transformation is continuous by Theorem 1.3.10, and $\lim_{h \to 0} r(h) = 0$.)
(d) The derivative defined by 1.7 or 1.9 is often called the differential of f at
x, or the total derivative of f at x.
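For a simple worked instance of Definition 1.4.1, take f itself linear, say f(x) = Bx with B ∈ L(Rn , Rm ). Then, with A = B,
$$\frac{|f(x + h) - f(x) - Ah|}{|h|} = \frac{|Bx + Bh - Bx - Bh|}{|h|} = 0$$
for every h ≠ 0, so 1.7 holds and f′(x) = B at every x: a linear mapping is its own derivative.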
Theorem 1.4.2. Suppose E and f are as in Definition 1.4.1, x ∈ E, and equation 1.7 holds with A = A1 and with A = A2 . Then A1 = A2 .
Proof. Put B = A1 − A2 . The inequality
$$|Bh| \leq |f(x + h) - f(x) - A_1 h| + |f(x + h) - f(x) - A_2 h|$$
together with 1.7 shows that |Bh|/|h| → 0 as h → 0. For fixed h ≠ 0, the linearity of B makes |B(th)|/|th| independent of t, so Bh = 0 for every h; hence B = 0, that is, A1 = A2 .
$$F(x) = g(f(x))$$
is differentiable at x0 , and
for all h ∈ Rn and k ∈ Rm for which f(x0 + h) and g(y0 + k) are defined. Then
and
$$(D_j f_i)(x) = \lim_{t \to 0} \frac{f_i(x + te_j) - f_i(x)}{t} \tag{1.15}$$
provided the limit exists, is called the partial derivative of fi with respect to xj , where {e1 , ..., en } and {u1 , ..., um } are the standard bases of Rn and Rm .
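As a quick illustration of 1.15, take f : R2 → R2 with components (chosen here only for concreteness)
$$f_1(x_1, x_2) = x_1^2 x_2, \qquad f_2(x_1, x_2) = x_1 + x_2^2.$$
Then
$$(D_1 f_1)(x) = 2x_1 x_2, \quad (D_2 f_1)(x) = x_1^2, \quad (D_1 f_2)(x) = 1, \quad (D_2 f_2)(x) = 2x_2.$$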
where |r(tej )|/t → 0 as t → 0. The linearity of f′(x) shows that
$$\lim_{t \to 0} \frac{f(x + te_j) - f(x)}{t} = f'(x)e_j \tag{1.17}$$
If we now represent f in terms of its components, as in equation 1.14, then
equation 1.17 becomes
$$\lim_{t \to 0} \sum_{i=1}^{m} \frac{f_i(x + te_j) - f_i(x)}{t}\, u_i = f'(x)e_j \tag{1.18}$$
It follows that each quotient in this sum has a limit as t → 0, so that each (Dj fi )(x) exists, and then equation 1.16 follows from equation 1.18.
Let [f′(x)] be the matrix of the linear transformation f′(x) with respect to our standard bases.
Then f′(x)ej is the j th column vector of [f′(x)], and 1.16 therefore shows that the number (Dj fi )(x) occupies the spot in the i th row and j th column of [f′(x)]. Thus
$$[f'(x)] = \begin{pmatrix} (D_1 f_1)(x) & \cdots & (D_n f_1)(x) \\ \vdots & \ddots & \vdots \\ (D_1 f_m)(x) & \cdots & (D_n f_m)(x) \end{pmatrix}$$
If $h = \sum_j h_j e_j$ is any vector in Rn , then equation 1.16 implies that
$$f'(x)h = \sum_{i=1}^{m} \left\{ \sum_{j=1}^{n} (D_j f_i)(x)\, h_j \right\} u_i$$
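Continuing the illustrative f introduced after 1.15, its matrix of partial derivatives is
$$[f'(x)] = \begin{pmatrix} 2x_1 x_2 & x_1^2 \\ 1 & 2x_2 \end{pmatrix},$$
and for h = (h1 , h2 ) the displayed formula gives
$$f'(x)h = (2x_1 x_2 h_1 + x_1^2 h_2)\, u_1 + (h_1 + 2x_2 h_2)\, u_2.$$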
$$(D_2 f)(0, 0) = \lim_{t \to 0} \frac{f(0, t) - f(0, 0)}{t} = 0$$
However, f(x, y) (here f(x, y) = xy/(x² + y²) for (x, y) ≠ (0, 0), with f(0, 0) = 0) is not continuous at (0, 0): if (x, y) → (0, 0) along the line y = x, then f(x, y) = 1/2, while if (x, y) → (0, 0) along the x-axis, then f(x, y) = 0. So $\lim_{(x,y) \to (0,0)} f(x, y)$ does not exist.
So f is not differentiable at (0, 0).
Definition 1.4.7. Let f be a real-valued differentiable function with domain E, and let x ∈ E. The gradient of f at x is defined by
$$\nabla f(x) = \sum_{i=1}^{n} (D_i f)(x)\, e_i$$
If f and x are fixed, but u varies, then 1.19 shows that (Du f ) (x) attains
its maximum when u is a positive scalar multiple of (∇f )(x).
If u = Σui ei , then 1.19 shows that (Du f ) (x) can be expressed in terms of
the partial derivatives of f at x by the formula
$$(D_u f)(x) = \sum_{i=1}^{n} (D_i f)(x)\, u_i.$$
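For example, taking the real-valued function f(x1 , x2 ) = x1²x2 used above,
$$\nabla f(x) = 2x_1 x_2\, e_1 + x_1^2\, e_2,$$
and for the unit vector u = (e1 + e2 )/√2 the formula gives
$$(D_u f)(x) = \frac{2x_1 x_2 + x_1^2}{\sqrt{2}}.$$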
$$\|f'(x)\| \leq M$$
$$|f(b) - f(a)| \leq M|b - a|$$
for all a ∈ E, b ∈ E.
Proof. Fix a ∈ E, b ∈ E. Define
$$\gamma(t) = (1 - t)a + tb \qquad (0 \leq t \leq 1)$$
and put g(t) = f(γ(t)). Then
$$|g(1) - g(0)| \leq M|b - a|$$
But g(0) = f(a) and g(1) = f(b). This completes the proof.
Corollary 1.4.11. If f′(x) = 0 for all x ∈ E, then f is constant.
Proof. To prove this, note that the hypotheses of the above theorem hold with M = 0. So
$$|f(b) - f(a)| \leq 0 \cdot |b - a|$$
for all a ∈ E, b ∈ E. This implies that |f(b) − f(a)| = 0, that is, f(b) = f(a) for all a ∈ E, b ∈ E.
Definition 1.4.12. A differentiable mapping f of an open set E ⊂ Rn into Rm is said to be continuously differentiable in E if f′ is a continuous mapping of E into L(Rn , Rm ). More explicitly, to every x ∈ E and every ε > 0 there corresponds a δ > 0 such that
$$\|f'(y) - f'(x)\| < \varepsilon$$
if y ∈ E and |x − y| < δ. If this is so, we also say that f is a C′-mapping, or that f ∈ C′(E).
Result 1.4.13. Mean value theorem: If f is a real continuous function on [a, b] which is differentiable in (a, b), then there is a point x ∈ (a, b) at which
$$f(b) - f(a) = (b - a)\, f'(x)$$
and since |ui | = |ei | = 1, it follows from Result 1.1.3 (d) and Exercise 1.3.7 that
Since |vk | < |h| < r for 1 ≤ k ≤ n and since S is convex, the segments
with end points x + vj−1 and x + vj lie in S. Since vj = vj−1 + hj ej , the mean
value theorem (Result 1.4.13) shows that the j th summand in 1.21 is equal to
$$h_j (D_j f)(x + v_{j-1} + \theta_j h_j e_j)$$
for some θj ∈ (0, 1), and this differs from hj (Dj f)(x) by less than |hj |ε/n, using 1.20.
[To understand this, consider the case j = 2. Let x = (x1 , ..., xn ) and h = (h1 , ..., hn ). Then
f(x + v2 ) − f(x + v1 ) = f((x1 , x2 , ..., xn ) + (h1 , h2 , 0, ..., 0)) − f((x1 , ..., xn ) + (h1 , 0, 0, ..., 0)) = f(x1 + h1 , x2 + h2 , x3 , ..., xn ) − f(x1 + h1 , x2 , x3 , ..., xn ).
So consider the function g2 : [x2 , x2 + h2 ] → R given by g2 (t) = f(x1 + h1 , t, x3 , ..., xn ); then by the mean value theorem (Result 1.4.13) there exists c2 ∈ (x2 , x2 + h2 ) such that
$$g_2(x_2 + h_2) - g_2(x_2) = h_2\, g_2'(c_2) = h_2\, (D_2 f)(x_1 + h_1, c_2, x_3, \ldots, x_n).$$
]
$$x_{n+1} = \varphi(x_n) \qquad (n = 0, 1, 2, \ldots)$$
$$d(x_{n+1}, x_n) \leq c^n\, d(x_1, x_0) \qquad (n = 0, 1, 2, \ldots)$$
$$\leq (1 - c)^{-1} c^n\, d(x_1, x_0)$$
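The displayed tail estimate follows from the triangle inequality and a geometric series: for m > n,
$$d(x_m, x_n) \leq \sum_{k=n}^{m-1} d(x_{k+1}, x_k) \leq \sum_{k=n}^{m-1} c^k\, d(x_1, x_0) \leq (1 - c)^{-1} c^n\, d(x_1, x_0),$$
and the right side tends to 0 as n → ∞ since 0 ≤ c < 1, so (xn ) is a Cauchy sequence.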
$$g(f(x)) = x \qquad (x \in U)$$
then g ∈ C′(V ).
Proof. (a) Put f′(a) = A, and choose λ so that
$$2\lambda \|A^{-1}\| = 1 \tag{1.23}$$
$$f(a + h, b + k) = A(h, k) + r(h, k)$$
where r is the remainder that occurs in the definition of f′(a, b). Since
it follows that F′(a, b) is the linear operator on Rn+m that maps (h, k) to (A(h, k), k). If this image vector is 0, then A(h, k) = 0 and k = 0, hence A(h, 0) = 0, and Theorem 1.7.2 implies that h = 0. It follows that F′(a, b) is 1-1; hence it is invertible (Theorem 1.3.5).
The inverse function theorem can therefore be applied to F. It shows that
there exist open sets U and V in Rn+m , with (a, b) ∈ U, (0, b) ∈ V such that
F is a 1-1 mapping of U onto V .
We let W be the set of all y ∈ Rm such that (0, y) ∈ V . Note that b ∈ W .
Consider the function L : Rm → Rn+m by L(y) = (0, y). Then L is continuous
and L−1 (V ) = W . So W is open .
If y ∈ W , then (0, y) = F(x, y) for some (x, y) ∈ U. By 1.33, f (x, y) = 0
for this x.
Suppose, with the same y, that (x′, y) ∈ U and f(x′, y) = 0. Then
$$F(x', y) = (f(x', y), y) = (0, y) = F(x, y).$$
Since F is 1-1 in U , it follows that x′ = x. This proves the first part of the theorem.
For the second part, define g(y), for y ∈ W , so that (g(y), y) ∈ U and 1.31
holds. Then
$$F(g(y), y) = (0, y) \qquad (y \in W) \tag{1.34}$$
If G is the mapping of V onto U that inverts F, then G ∈ C′, by the inverse function theorem, and 1.34 gives
$$f'(\Phi(y))\,\Phi'(y) = 0$$
$$A\,\Phi'(b) = 0 \tag{1.37}$$
$$A_x\, g'(b) + A_y = 0$$
This is equivalent to 1.32, and completes the proof.
If a = (0, 1) and b = (3, 2, 7), then f(a, b) = 0. With respect to the standard
bases, the matrix of the transformation A = f′(a, b) is
$$[A] = \begin{pmatrix} 2 & 3 & 1 & -4 & 0 \\ -6 & 1 & 2 & 0 & -1 \end{pmatrix}$$
Hence
$$[A_x] = \begin{pmatrix} 2 & 3 \\ -6 & 1 \end{pmatrix}, \qquad [A_y] = \begin{pmatrix} 1 & -4 & 0 \\ 2 & 0 & -1 \end{pmatrix}.$$
We see that the column vectors of [Ax ] are independent. Hence Ax is invertible, and the implicit function theorem asserts the existence of a C′-mapping g, defined in a neighborhood of (3, 2, 7), such that g(3, 2, 7) = (0, 1) and f(g(y), y) = 0. We can use 1.32 to compute g′(3, 2, 7): since
$$(A_x)^{-1} = [A_x]^{-1} = \frac{1}{20}\begin{pmatrix} 1 & -3 \\ 6 & 2 \end{pmatrix},$$
1.32 gives
$$[g'(3, 2, 7)] = -\frac{1}{20}\begin{pmatrix} 1 & -3 \\ 6 & 2 \end{pmatrix}\begin{pmatrix} 1 & -4 & 0 \\ 2 & 0 & -1 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{4} & \tfrac{1}{5} & -\tfrac{3}{20} \\ -\tfrac{1}{2} & \tfrac{6}{5} & \tfrac{1}{10} \end{pmatrix}$$
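As a quick check of the inversion step, the 2 × 2 inverse formula applied to [Ax ] gives
$$\det[A_x] = (2)(1) - (3)(-6) = 20 \neq 0, \qquad [A_x]^{-1} = \frac{1}{20}\begin{pmatrix} 1 & -3 \\ 6 & 2 \end{pmatrix},$$
confirming the invertibility of Ax and the factor 1/20 above.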
1.8 Determinants
Definition 1.8.1. If (j1 , . . . , jn ) is an ordered n-tuple of integers, define
$$s(j_1, \ldots, j_n) = \prod_{p < q} \operatorname{sgn}(j_q - j_p) \tag{1.38}$$
$$s(1,1)a(1,1)a(2,1) + s(1,2)a(1,1)a(2,2) + s(2,1)a(1,2)a(2,1) + s(2,2)a(1,2)a(2,2)$$
$$= a(1,1)a(2,2) - a(1,2)a(2,1),$$
since s(1, 1) = s(2, 2) = 0, s(1, 2) = 1, and s(2, 1) = −1.
Theorem 1.8.4.
(b) det is a linear function of each of the column vectors xj , if the others are held fixed. That is,
$$\det(x_1, \ldots, cx_j, \ldots, x_n) = c \det(x_1, \ldots, x_j, \ldots, x_n)$$
and similarly for sums in the j th column. If [A]′ is obtained from [A] by interchanging two columns, then
$$\det[A]' = -\det[A]$$
Proof.
(b) By 1.38, s (j1 , . . . , jn ) = 0 if any two of the j’s are equal. Each of the
remaining n! products in 1.39 contains exactly one factor from each col-
umn. This proves (b).
By 1.42 and Theorem 1.8.4, ∆B also has properties 1.8.4(b) to (d). By (b) and
1.40,
$$\Delta_B[A] = \Delta_B\!\left(\sum_i a(i,1)\, e_i,\ x_2, \ldots, x_n\right) = \sum_i a(i,1)\, \Delta_B(e_i, x_2, \ldots, x_n)$$
the sum being extended over all ordered n-tuples (i1 , . . . , in ) with 1 ≤ ir ≤ n.
by (c) and (d)
for all n by n matrices [A] and [B]. Taking B = I, we see that the above sum
in braces is det[A]. This proves the theorem.
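As a numerical sanity check of the product formula det([B][A]) = det[B] det[A] proved here, take [A] from Example 1.3.12 and a convenient [B]:
$$[B] = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad [A] = \begin{pmatrix} 2 & 3 \\ 4 & -1 \end{pmatrix}, \quad [B][A] = \begin{pmatrix} 6 & 2 \\ 4 & -1 \end{pmatrix},$$
with det[B] = 1, det[A] = −14, and det([B][A]) = −6 − 8 = −14 = det[B] det[A].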
so that det[A] ≠ 0.
If A is not invertible, the columns x1 , . . . , xn of [A] are dependent (Theorem
1.3.5); hence there is one, say, xk , such that
$$x_k + \sum_{j \neq k} c_j x_j = 0 \tag{1.46}$$
and also to
$$ABe_j = A\!\left(\sum_k b_{kj}\, e_k\right) = \sum_i \left(\sum_k a_{ik} b_{kj}\right) e_i.$$
det[A]U = det[A].