Preface
These notes are intended as a rough guide to the course Further Linear Algebra which
is a part of the Oxford 2nd year undergraduate course in mathematics. Please do not
expect a polished account. They are lecture notes, not a carefully checked text-book.
Nevertheless, I hope they may be of some help.
The course is designed to build on 1st year Linear Algebra. The syllabus for that
course includes matrices, row reduction of matrices to echelon form, rank of a matrix,
solution of simultaneous linear equations, vector spaces and their subspaces, linear depen-
dence and independence, bases and dimension of a vector space, linear transformations
and their matrices with respect to given bases, and eigenvalues and eigenvectors of a
square matrix or of a linear transformation of a vector space to itself. In that first-
year work the coefficients of our matrices or linear equations are almost exclusively real
numbers; accordingly, the field over which our vector spaces are defined is R.
A significant change of attitude occurs now. Our first task is to examine what happens
when R is replaced by an arbitrary field F of coefficients. We find that the basic theory of
matrices, linear equations, vector spaces and their subspaces, linear transformations and
their matrices is unchanged. The theory of eigenvalues and eigenvectors does depend
on the field F , however. And when we come to the “metric” or “geometric” theory
associated with inner products, orthogonality, and the like, we find that it is natural to
return to vector spaces over R or C.
As a consequence the lecture course, and therefore also this set of notes, naturally
divides into four parts: the first is a study of vector spaces over arbitrary fields; then
we study linear transformations of a vector space to itself; third, a treatment of real or
complex inner product spaces; and finally the theory of adjoints of linear transformations
on inner product spaces.
It is a pleasure to acknowledge with warm thanks that these notes have benefitted
from comments and suggestions by Dr Jan Grabowski. Any remaining errors, infelicities
and obscurities are of course my own responsibility—I would welcome feedback.
CONTENTS
Fields 1
Vector spaces 1
Subspaces 3
Quotient spaces 4
Dimension (Revision) 4
Dual transformations 13
Further exercises I 17
Triangular form 26
Further exercises II 32
Bessel’s Inequality 40
Part IV: Adjoints of linear transformations on
finite-dimensional inner product spaces 47
Further exercises IV 58
Part I: Fields and Vector Spaces
As has been indicated in the preface, our first aim is to re-work the linear algebra
presented in the Mods course, generalising from R to an arbitrary field as domain from
which coefficients are to be taken. Fields are treated in the companion course, Rings
and Arithmetic, but to make this course and these notes self-contained we begin with a
definition of what we mean by a field.
A field is a set F with distinguished elements 0 and 1, a unary operation −, and two binary operations + and × (we write a b for a × b), satisfying the following axioms for all a, b, c ∈ F :
(1) a + (b + c) = (a + b) + c [+ is associative]
(2) a + b = b + a [+ is commutative]
(3) a + 0 = a
(4) a + (−a) = 0
(5) a (b c) = (a b) c [× is associative]
(6) a b = b a [× is commutative]
(7) a 1 = a
(8) a ≠ 0 ⇒ ∃ x ∈ F : a x = 1
(9) a (b + c) = a b + a c [× distributes over +]
(10) 0 ≠ 1
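For example, Z2 = {0, 1}, the integers modulo 2 with addition and multiplication carried out mod 2, satisfies all ten axioms: here −0 = 0, −1 = 1, and the only non-zero element 1 is its own multiplicative inverse, so (8) holds. More generally Zp is a field for every prime p; these finite fields reappear in the exercises below (for instance Exercises 20 and 29).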
Vector Spaces
Let F be a field. A vector space over F is a set V with distinguished element 0, with
a unary operation −, with a binary operation +, and with a function × : F × V → V
satisfying the axioms below. Conventionally: for the image of (a, b) under the function
+ : V × V → V we write a + b; for a + (−b) we write a − b; and for the image of (α, v)
under the function × : F × V → V we write α v .
Vector space axioms
(1) u + (v + w) = (u + v) + w [+ is associative]
(2) u+v =v+u [+ is commutative]
(3) u+0=u
(4) u + (−u) = 0
(5) α (β v) = (α β) v
(6) α (u + v) = α u + α v
(7) (α + β) v = α v + β v
(8) 1v = v
Examples:
• F n is a vector space over F ;
• the polynomial ring F [x] (see Notes on Rings and Arithmetic) is a vector space
over F ;
• Mm×n (F ) is a vector space over F ;
• if K is a field and F a subfield then K is a vector space over F ;
• etc., etc., etc.
Exercise 1. Let X be any set, F any field. Define F X to be the set of all functions
X → F with the usual point-wise addition and multiplication by scalars (elements of F ).
Show that F X is a vector space over F .
Exactly as for rings, or for vector spaces over R, one can prove important “triviali-
ties”. Of course, if we could not prove them then we would add further axioms until we
had captured all properties that we require, or at least expect, of the algebra of vectors.
But the fact is that this set of axioms, feeble though it seems, is enough. For example:
Proposition: Let V be a vector space over the field F . For any v ∈ V and any
α ∈ F we have
(i) 0 v = 0;
(ii) α 0 = 0;
(iii) if α v = 0 then α = 0 or v = 0;
(iv) α (−v) = −(α v) = (−α) v ;
Proof. For v ∈ V , from Field Axiom (3) and Vector Space Axiom (7) we have
0 v = (0 + 0) v = 0 v + 0 v.
Then adding −(0 v) to both sides of this equation and using Vector Space Axioms (4)
on the left and (1), (4), (3) on the right, we get that 0 = 0 v , as required for (i). The
reader is invited to give a proof of (ii).
For (iii), suppose that α v = 0 and α ≠ 0: our task is to show that v = 0. By Field
Axioms (8) and (6) there exists β ∈ F such that β α = 1. Then
v = 1 v = (β α) v = β (α v) = β 0
by Vector Space Axioms (8) and (5). But β 0 = 0 by (ii), and so v = 0, as required.
Clause (iv), like Clause (ii), is offered as an exercise:
Subspaces
Let V be a vector space over a field F . A subspace of V is a subset U such that
(1) 0∈U and u + v ∈ U whenever u, v ∈ U and −u ∈ U whenever u ∈ U ;
(2) if u ∈ U , α ∈ F then α u ∈ U .
Examples: (1) Let L1 , . . . , Lm be homogeneous linear expressions Σ_j cij xj with coefficients cij ∈ F , and let
U := {(x1 , . . . , xn ) ∈ F n | L1 = 0, . . . , Lm = 0}.
Then U ≤ F^n .
(2) Let F^{[n]}[x] := {f ∈ F [x] | f = 0 or deg f ≤ n}. Then F^{[n]}[x] ≤ F [x].
(3) Upper triangular matrices form a subspace of Mn×n (F ).
Exercise 5. Let U1 , U2 , . . . be proper subspaces of a vector space V over a field F
(recall that the subspace U is said to be proper if U ≠ V ).
Quotient spaces
Suppose that U ≤ V where V is a vector space over a field F . Define the quotient space V /U as follows: its elements are the cosets v + U := {v + u | u ∈ U } for v ∈ V ; the zero element is the coset U itself; and the operations are given by −(v + U ) := (−v) + U , (v + U ) + (w + U ) := (v + w) + U , and α (v + U ) := (α v) + U for α ∈ F .
Exercise 6 (worth doing carefully once in one’s life, but not more than once—unless
an examiner offers marks for it): Check that −, + and multiplication by scalars are
well defined, and that the vector space axioms hold in V /U .
Note: The notion of quotient space is closely analogous with the notion of quotient
of a group by a normal subgroup or of a ring by an ideal. It is not in the Part A syllabus,
nor will it play a large part in this course. Nevertheless, it is an important and useful
construct which is well worth becoming familiar with.
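A small example: take V := R² and U := {(x, 0) | x ∈ R}. Two vectors lie in the same coset of U precisely when they have the same second coordinate, so the elements of V /U are the horizontal lines, and v + U ↦ (second coordinate of v) is an isomorphism V /U → R. In particular dim V /U = dim V − dim U here, a formula which holds for every subspace U of a finite-dimensional space V .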
Dimension (Revision)
Although technically new, the following ideas and results translate so simply from
the case of vector spaces over R to vector spaces over an arbitrary field F that I propose
simply to list headers for revision:
(2) dimension;
(3) dim V = d ⇒ V ≅ F^d ;
(4) any linearly independent set may be extended (usually in many ways) to a basis;
(6) dim (U + W ) = dim U + dim W − dim (U ∩ W );
Linear transformations
Let V1 and V2 be vector spaces over the same field F . A map T : V1 → V2 is a linear transformation if
T 0 = 0, T (−x) = −T x, T (x + y) = T x + T y, and T (λ x) = λ (T x)
for all x, y ∈ V1 and all λ ∈ F .
Note 1: The definition is couched in terms which are intended to emphasise that
what should be required of a linear transformation (homomorphism of vector spaces) is
that it preserves all the ingredients, 0, +, − and multiplication by scalars, that go into
the making of a vector space. What we use in practice is the fact that T : V1 → V2 is
linear if and only if T (α x + β y) = α T x + β T y for all x, y ∈ V1 and all α, β ∈ F . The
proof is an easy exercise and is left to the reader.
Exercise 9. Let V be a finite dimensional vector space over a field F and let
T : V → V be a linear transformation. For λ ∈ F define Eλ := {v ∈ V | T v = λv}.
Direct sums and projection operators
The vector space V is said to be the direct sum of its subspaces U and W , and we
write V = U ⊕ W , if V = U + W and U ∩ W = {0}.
Proposition: V = U ⊕ W if and only if for every v ∈ V there exist unique vectors u ∈ U and w ∈ W such that v = u + w .
Proof. Suppose first that for every v ∈ V there exist unique vectors u ∈ U and
w ∈ W such that v = u + w . Certainly then V = U + W and what we need to prove
is that U ∩ W = {0}. So let x ∈ U ∩ W . Then x = x + 0 with x ∈ U and 0 ∈ W .
Equally, x = 0 + x with 0 ∈ U and x ∈ W . But the expression x = u + w with u ∈ U
and w ∈ W is, by assumption, unique, and it follows that x = 0. Thus U ∩ W = {0},
as required.
Now suppose that V = U ⊕ W . If v ∈ V then since V = U + W there certainly
exist vectors u ∈ U and w ∈ W such that v = u + w . The point at issue therefore is:
are u, w uniquely determined by v ? Suppose that u + w = u′ + w′ , where u, u′ ∈ U
and w, w′ ∈ W . Then u − u′ = w′ − w . This vector lies both in U and in W . By
assumption, U ∩ W = {0} and so u − u′ = w′ − w = 0. Thus u = u′ and w = w′ , so the
decomposition of a vector v as u + w with u ∈ U and w ∈ W is unique, as required.
Note: What we are discussing is sometimes (but rarely) called the “internal” direct
sum to distinguish it from the natural construction which starts from two vector spaces
V1 , V2 over the same field F and constructs a new vector space whose set is the product
set V1 × V2 and in which the vector space structure is defined componentwise—compare
the direct product of groups or of rings. This is (equally rarely) called the “external”
direct sum. These are two sides of the same coin: the external direct sum of V1 and V2
is the internal direct sum of its subspaces V1 × {0} and {0} × V2 ; while if V = U ⊕ W
then V is naturally isomorphic with the external direct sum of U and W .
Suppose now that V = U ⊕ W . Define P : V → V as follows: for v ∈ V write v = u + w with u ∈ U and w ∈ W , and set P v := u. We call P the projection of V onto U along W .
Observations:
(1) P is well-defined;
(2) P is linear;
(3) Im P = U , Ker P = W ;
(4) P² = P .
Proofs. That P is well-defined is an immediate consequence of the existence and uniqueness of the decomposition v = u + w with u ∈ U , w ∈ W .
To see that P is linear, let v1 , v2 ∈ V and α1 , α2 ∈ F (where, as always, F is the
field of scalars). Let v1 = u1 + w1 and v2 = u2 + w2 be the decompositions of v1 and v2 .
Then P v1 = u1 and P v2 = u2 . What about P (α1 v1 + α2 v2 )? Well,
α1 v1 + α2 v2 = α1 (u1 + w1 ) + α2 (u2 + w2 ) = (α1 u1 + α2 u2 ) + (α1 w1 + α2 w2 ).
Since α1 u1 + α2 u2 ∈ U and α1 w1 + α2 w2 ∈ W it follows that P (α1 v1 + α2 v2 ) =
α1 u1 + α2 u2 . Therefore
P (α1 v1 + α2 v2 ) = α1 P (v1 ) + α2 P (v2 ),
that is, P is linear.
For (3) it is clear from the definition that Im P ⊆ U ; but if u ∈ U then u = P u,
and therefore Im P = U . Similarly, it is clear that W ⊆ Ker P ; but if v ∈ Ker P then
v = 0 + w for some w ∈ W , and therefore Ker P = W .
Finally, if v ∈ V and we write v = u + w with u ∈ U and w ∈ W then
P 2 v = P (P v) = P u = u = P v ,
and this shows that P 2 = P , as required.
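To have a concrete instance of all four observations: in V := F² let U := {(x, 0) | x ∈ F } and W := {(x, x) | x ∈ F }, so that V = U ⊕ W . Writing (x, y) = (x − y, 0) + (y, y) shows that the projection of V onto U along W is P (x, y) = (x − y, 0). One checks directly that P is linear, that Im P = U and Ker P = W , and that P²(x, y) = P (x − y, 0) = (x − y, 0) = P (x, y), so P² = P .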
Our next aim is to characterise projection operators algebraically. It turns out that
Observation (4) above is the key:
Theorem: A linear transformation P : V → V is a projection (onto some subspace along another) if and only if P is idempotent, that is, P² = P .
Proof. We have seen already that every projection is idempotent, so the problem
is to prove that an idempotent linear transformation is a projection operator. Suppose
that P : V → V is linear and idempotent. Define
U := Im P, W := Ker P.
Our first task is to prove that V = U ⊕ W . Let v ∈ V . Then v = P v + (v − P v).
Now P v ∈ U (obviously), and P (v − P v) = P v − P 2 v = 0, so v − P v ∈ W . Thus
V = U + W . Now let x ∈ U ∩ W . Since x ∈ U there exists y ∈ V such that x = P y ,
and since x ∈ W also P x = 0. Then x = P y = P 2 y = P x = 0. Thus U ∩ W = {0},
and so V = U ⊕ W .
To finish the proof we need to convince ourselves (and others—such as examiners,
tutors and other friends) that P is the projection onto U along W . For v ∈ V write
v = u + w where u ∈ U and w ∈ W . Since u ∈ U there must exist x ∈ V such that
u = P x. Then
P v = P (u + w) = P u + P w = P² x + 0 = P x = u ,
so P is indeed the projection of V onto U along W .
Exercise 11. Let V be a vector space (over some field F ), and let E1 and E2 be
projections on V .
The next result is a theorem which turns out to be very useful in many situations.
It is particularly important in applications of linear algebra in Quantum Mechanics.
P T v = P T (u + w) = P (T u + T w) = T u ,
T P v = T P (u + w) = T u .
We turn now to direct sums of more than two subspaces. The vector space V is said to be the direct sum of subspaces U1 , . . . , Uk if for every v ∈ V there exist unique vectors ui ∈ Ui for 1 ≤ i ≤ k such that v = u1 + · · · + uk . We write V = U1 ⊕ · · · ⊕ Uk .
Note 1: If k = 2 then this reduces to exactly the same concept as we have just been
studying. Moreover, if k > 2 then U1 ⊕ U2 ⊕ · · · ⊕ Uk = (· · · ((U1 ⊕ U2 ) ⊕ U3 ) ⊕ · · · ⊕ Uk ).
Note 2: For k ≥ 3 it is not enough that the subspaces intersect pairwise in {0}. For example, any three distinct one-dimensional subspaces U1 , U2 , U3 of F² satisfy
U1 ∩ U2 = U1 ∩ U3 = U2 ∩ U3 = {0}
and yet it is clearly not true that F² is their direct sum.
Note 3: If V = U1 ⊕ U2 ⊕ · · · ⊕ Uk and Bi is a basis of Ui then B1 ∪ B2 ∪ · · · ∪ Bk is a basis of V . In particular, dim V = Σ_{i=1}^{k} dim Ui . The proof, which is not deep, is
offered as an exercise:
I v = v = P1 v + · · · + Pk v = (P1 + · · · + Pk ) v .
v = I v = (P1 + · · · + Pk ) v = P1 v + · · · + Pk v = u1 + · · · + uk .
Suppose that v = w1 + · · · + wk where wi ∈ Ui for 1 ≤ i ≤ k . Then Pi wi = wi since Pi is a projection onto Ui . And if j ≠ i then Pj wi = Pj (Pi wi ) = (Pj Pi ) wi , so Pj wi = 0
since Pj Pi = 0. Therefore
Pi v = Pi (w1 + · · · + wk ) = Pi w1 + · · · + Pi wk = wi ,
Dual spaces
Let V be a vector space over the field F . A linear functional on V is a linear transformation f : V → F . The dual space V ′ of V is the set of all linear functionals on V , made into a vector space over F by the pointwise operations: (f + g)(v) := f (v) + g(v), (α f )(v) := α f (v), (−f )(v) := −f (v), with the zero functional as zero element.
Note: It is a matter of important routine to check that the vector space axioms are
satisfied (see the exercise below). It is also important that, when invited (for example
by an examiner) to define the dual space of a vector space, you specify not only the set,
but also the operations which make that set into a vector space.
Exercise 13 (worth doing carefully once in one’s life, but not more than once—unless
an examiner offers marks for it). Check that the vector space axioms are satisfied, so
that V ′ defined as above really is a vector space over F .
Note: Some authors use V ∗ or Hom(V, F ) or HomF (V, F ) for the dual space V ′ .
Theorem. Let V be a finite-dimensional vector space over F with basis v1 , . . . , vn . Then there exist f1 , . . . , fn ∈ V ′ such that fi (vj ) = 1 if i = j and fi (vj ) = 0 if i ≠ j , and f1 , . . . , fn is a basis of V ′ (the dual basis). In particular, dim V ′ = dim V .
Proof. Define fi as follows. For v ∈ V we set fi (v) := αi where α1 , . . . , αn ∈ F are
such that v = α1 v1 + · · · + αn vn . This definition is acceptable because v1 , . . . , vn span
V and so such scalars α1 , . . . , αn certainly exist; moreover, since v1 , . . . , vn are linearly
independent the coefficients α1 , . . . , αn are uniquely determined by v . If w ∈ V , say
w = β1 v1 + · · · + βn vn , and λ, µ ∈ F then
fi (λ v + µ w) = fi (λ Σ_j αj vj + µ Σ_j βj vj )
= fi (Σ_j (λ αj + µ βj ) vj )
= λ αi + µ βi
= λ fi (v) + µ fi (w) ,
so fi ∈ V ′ . If Σ_j cj fj = 0 then evaluating at vi gives ci = 0 for each i, so f1 , . . . , fn are linearly independent. Now let f ∈ V ′ and define g := Σ_j f (vj ) fj . Since f and g are linear and agree on a basis of V we have f = g , that is, f = Σ_j f (vj ) fj . Thus f1 , . . . , fn is indeed a basis of V ′ , as the theorem states.
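As an illustration, take V := F² with basis v1 := (1, 0), v2 := (1, 1). Writing (x, y) = (x − y) v1 + y v2 shows that the dual basis consists of the functionals f1 (x, y) = x − y and f2 (x, y) = y ; indeed f1 (v1 ) = 1, f1 (v2 ) = 0, f2 (v1 ) = 0, f2 (v2 ) = 1, and every f ∈ V ′ equals f (v1 ) f1 + f (v2 ) f2 .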
Exercise 14. Let F be a field with at least 4 elements and let V be the vector
space of polynomials c0 + c1 x + c2 x² + c3 x³ of degree ≤ 3 with coefficients from F .
(i) Show that for a ∈ F the map fa : V → F given by evaluation of polynomial p
at a (that is, fa (p) = p(a) ) is a linear functional.
(ii) Show that if a1 , a2 , a3 , a4 are distinct members of F then {fa1 , fa2 , fa3 , fa4 }
is a basis of V ′ , and find the basis {p1 , p2 , p3 , p4 } of V of which this is the dual
basis.
(iii) Generalise to the vector space of polynomials of degree ≤ n over F .
Let V be a vector space over the field F . For a subset X of V the annihilator is
defined by
X ◦ := {f ∈ V ′ | f (x) = 0 for all x ∈ X } .
Note: X ◦ = {f ∈ V ′ | X ⊆ Ker f }.
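For example, if V = F³ with standard basis e1 , e2 , e3 and dual basis f1 , f2 , f3 , and X = {e1 }, then X° = {f ∈ V ′ | f (e1 ) = 0} = Span(f2 , f3 ), which has dimension 2 = dim V − dim Span(X).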
Therefore U1◦ + U2◦ = (U1 ∩ U2 )◦ , as required.
To finish this study of dual spaces we examine the second dual, that is, the dual of
the dual. It turns out that if V is a finite-dimensional vector space then the second dual
V ′′ can be naturally identified with V itself.
Dual transformations
Let V and W be vector spaces over a field F , and let T : V → W be a linear
transformation. The dual transformation T ′ : W ′ → V ′ is defined by
T ′ (f ) := f ◦ T for all f ∈ W ′ .
Proof. We need first to show that if f ∈ W ′ then T ′ f ∈ V ′ . But if f ∈ W ′ then
f : W → F and f is linear, so, since T : V → W is linear and composition of linear
maps produces a linear map, also f ◦ T : V → F is linear, that is, T ′ f ∈ V ′ . Now let
f1 , f2 ∈ W ′ and α1 , α2 ∈ F . Then for any v ∈ V we have
(T ′ (α1 f1 + α2 f2 ))(v) = (α1 f1 + α2 f2 )(T v) = α1 f1 (T v) + α2 f2 (T v) = α1 (T ′ f1 )(v) + α2 (T ′ f2 )(v),
so T ′ (α1 f1 + α2 f2 ) = α1 T ′ f1 + α2 T ′ f2 and T ′ is linear.
Proofs. Clause (1) is immediate from the definition since I ′ f = f ◦ I = f for all
f ∈ W ′ . Clauses (2), (3), (4) are straightforward routine and should be done as an
exercise (see below). For (5) we need to know what is meant by f (T ) when f is a
polynomial with coefficients from F and T is a linear transformation V → V . It means
precisely what you would expect it to mean: non-negative integral powers of T are
defined by T⁰ := I , T^{n+1} := T ◦ T^{n} for n ≥ 0, and then if f (x) = a0 + a1 x + · · · + an x^n
then f (T ) is the corresponding linear combination a0 I + a1 T + · · · + an T n of the powers
of T . The fact that f (T )′ = f (T ′ ) therefore follows from (1), (2) and (4) by induction
on the degree of f .
Exercise 16. Let U , V and W be vector spaces over a field F , and let S : U → V ,
T : V → W be linear transformations. Show that (T S)′ = S ′ T ′ . Deduce that if
T : V → V is an invertible linear transformation then (T −1 )′ = (T ′ )−1 .
Exercise 17. Let V and W be vector spaces over a field F and let T1 , T2 : V → W
be linear transformations. Show that if α1 , α2 ∈ F then (α1 T1 + α2 T2 )′ = α1 T1′ + α2 T2′
We ask now what can be said about the matrix of a dual transformation with respect
to suitable bases in W ′ and V ′ .
Theorem. Let V and W be finite-dimensional vector spaces over F with bases v1 , . . . , vm and w1 , . . . , wn respectively, let T : V → W be linear, and let A be the matrix of T with respect to these bases. Then the matrix B of T ′ : W ′ → V ′ with respect to the dual bases is A^tr .
Proof. Let f1 , . . . , fm and h1 , . . . , hn be the relevant dual bases in V ′ and W ′
respectively, so that
fp (vi ) = 1 if p = i and 0 if p ≠ i ,    hq (wj ) = 1 if q = j and 0 if q ≠ j .
Let ai,j , bp,q be the (i, j)- and (p, q)-coefficients of A and B respectively. By definition
then
T vi = Σ_{j=1}^{n} a_{j,i} wj    and    T ′ hq = Σ_{p=1}^{m} b_{p,q} fp .
Now
(T ′ hq )(vi ) = hq (T vi ) = hq (Σ_{j=1}^{n} a_{j,i} wj ) = a_{q,i} .
It follows that T ′ hq = Σ_p a_{q,p} fp (compare the proof on p. 11 that the dual basis spans).
Therefore b_{p,q} = a_{q,p} for 1 ≤ p ≤ m and 1 ≤ q ≤ n, that is, B = A^tr as the theorem states.
Corollary. rank T ′ = rank T .
For we know that a matrix and its transpose have the same rank. By studying the
kernel and image of a dual transformation we get a more “geometric” understanding of
the corollary, however:
and therefore Im T ′ = (Ker T )◦ , as required.
Response. Note that in fact it is irrelevant that the field of coefficients is R, so we’ll
do this for vector spaces over an arbitrary field F .
For the first part let v ∈ V and write v = v1 + v2 where v1 ∈ V1 and v2 ∈ V2 . Note
that P v2 = v2 since (IV − P ) v2 = 0. Then P v = P v1 + P v2 = 0 + v2 = v2 , and
P 2 v = P v2 = v2 . Thus P 2 v = P v for all v ∈ V and so P 2 = P .
Defining dual spaces and dual transformations is bookwork treated earlier in these
notes. Then for f ∈ V ′ we have that
(P ′ )2 f = P ′ (P ′ (f )) = P ′ (f ◦ P ) = (f ◦ P ) ◦ P = f ◦ P 2 = f ◦ P = P ′ f
and so (P ′ )2 = P ′ .
The proof of the theorem on p. 7 above that idempotent operators are projections
includes a proof that V ′ = U1 ⊕ U2 where U1 = Ker (P ′ ) and U2 = Ker (IV ′ − P ′ ) [and
this bookwork is what the examiner would have expected candidates to expound here].
Definition of dual basis and proof that it really is a basis is bookwork treated earlier in
these notes (see p. 10 above). Now suppose that E ⊆ V1 ∪V2 . Thus E = {v1 , v2 , . . . , vn },
where we may suppose that v1 , . . . , vk ∈ V1 and vk+1 , . . . , vn ∈ V2 . Let f1 , . . . , fn be
the dual basis E ′ of V ′ , so that fi (vj ) = 0 if i ≠ j and fi (vi ) = 1. Consider an index i such that 1 ≤ i ≤ k . Now (P ′ fi )(vj ) = fi (P vj ), and so if 1 ≤ j ≤ k then (P ′ fi )(vj ) = 0 since P vj = 0, while if k + 1 ≤ j ≤ n then (P ′ fi )(vj ) = 0 since P vj = vj and fi (vj ) = 0. Thus (P ′ fi )(vj ) = 0 for all relevant j and therefore P ′ fi = 0, that is, fi ∈ U1 . Similarly, if k + 1 ≤ i ≤ n then fi ∈ U2 . That is, E ′ ⊆ U1 ∪ U2 as required.
And now it is clear that the matrix of P ′ with respect to E ′ is the same as the matrix of P with respect to E , namely
( 0   0  )
( 0   I_r ),
where r := n − k , I_r is the r × r identity matrix and the three entries 0 represent k × k , k × r and r × k zero matrices respectively.
Further exercises I
Exercise 18. Let V be an n-dimensional vector space over a field F .
Show that B(U ) is a subspace of the space Hom(V, V ) of all linear transforma-
tions V → V and that dim B(U ) = n² − mn + m² .
(ii) A flag in V is an increasing sequence {0} = U0 < U1 < U2 < · · · < Uk = V of
subspaces beginning with {0} and ending with V itself. For such a flag F we
define
Exercise 19. Let V be a vector space over a field F such that char F ≠ 2, and let E1 , E2 , E3 be idempotent linear transformations V → V such that E1 + E2 + E3 = I , where I : V → V is the identity transformation. Show that Ei Ej = 0 when i ≠ j (that is, {E1 , E2 , E3 } is a partition of the identity on V ). [Hint: recall Exercise 11 on p. 8.]
Give an example of four idempotent operators E1 , E2 , E3 , E4 on V such that E1 + E2 + E3 + E4 = I but {E1 , E2 , E3 , E4 } is not a partition of the identity on V .
Exercise 20. Let F := Z2 , the field with just two elements 0 and 1, let
V := {f ∈ F^N | supp(f ) is finite},
the subspace of F^N defined in Exercise 4 (see p. 3). Thus V may be thought of as the vector space of sequences (x0 , x1 , x2 , . . .) where each coordinate xi is 0 or 1 and all except finitely many of the coordinates are 0. For each subset S ⊆ N define ϕS : V → F by ϕS (f ) = Σ_{n∈S} f (n).
(i) Show that ϕS ∈ V ′ for all S ⊆ N.
(ii) Show that in fact V ′ = {ϕS | S ⊆ N}. [Hint: for each n ∈ N let en ∈ V be the
function such that en (n) = 1 and en (m) = 0 when m ≠ n; then, given ϕ ∈ V ′
define S by S := {n ∈ N | ϕ(en ) = 1} and seek to show that ϕ = ϕS .]
(iii) Show that V is countable but V ′ is uncountable.
Part II: Some theory of a single linear transformation on
a finite-dimensional vector space
then A = (aij ). Note that A is the transpose of the array of coefficients that would be
written out on the page if we did not use summation notation. That might seem a little
artificial, but this convention ensures that if S : V → V and S has matrix B then S ◦ T
has matrix A B . The point to remember is that with respect to a given basis of V there
is a one-to-one correspondence between linear transformations T : V → V and n × n
square matrices over F , where n := dim V .
For most of this section we will use linear transformations or n × n matrices over F
interchangeably, whichever is the more convenient for the job in hand. Therefore you are
advised to recall your facility for calculating with matrices. Here is an exercise to help.
det A = Σ_{ρ∈Sym(n)} (−1)^{parity(ρ)} a_{1,1ρ} a_{2,2ρ} · · · a_{n,nρ}    and    trace A = Σ_{i=1}^{n} a_{ii} ,
where A = (aij ) is an n × n matrix whose entries come from the
field F (they could even come from a commutative ring). The basic properties are the
same, for example that det(AB) = (detA)(detB), that trace(AB) = trace(BA), and
that A is invertible (in the sense that there exists B such that A B = B A = I ) if and
only if det A ≠ 0.
Now for our linear transformation T : V → V we define the determinant of T by
detT := detA, and we define the trace by traceT := traceA where A is the matrix of
T with respect to some basis of V . On the face of it these definitions might depend on
the particular basis that is used—in which case they would be of doubtful value. But in
fact they do not:
Proof. We know from Mods that if B is the matrix representing T with respect to
another basis then B = U −1 A U where U is the matrix whose entries are the coefficients
needed to express the elements of one basis as linear combinations of the elements of the
other. Therefore
detB = (detU )−1 (detA) (detU ) = detA
and
traceB = trace((U −1 A) U ) = trace(U (U −1 A)) = traceA,
as required.
The characteristic polynomial of the n × n matrix A is defined by cA (x) := det(xI − A); for a linear transformation T we write cT (x) for the characteristic polynomial of any matrix representing T . Then
cT (x) = x^n − c1 x^{n−1} + c2 x^{n−2} − · · · + (−1)^n cn ,
where
c1 = trace T , cn = det T, etc.
Here ‘etc.’ hides a great deal of detailed information. The other coefficients cr are im-
portant but more complicated functions of A—in fact cr is the sum of the determinants
of all the so-called r × r principal submatrices of A, that is, square submatrices of A of
which the diagonals coincide with part of the diagonal of A. For example, c2 is the sum of the determinants of all the ½ n(n − 1) submatrices
( aii   aij )
( aji   ajj )
for 1 ≤ i < j ≤ n.
Just as in the case of linear algebra over R, we define eigenvalues and eigenvectors of
T as follows: a scalar λ ∈ F is said to be an eigenvalue of T if there exists a non-zero
vector v ∈ V such that T v = λ v ; and a vector v ∈ V is said to be an eigenvector of T
if v 6= 0 and there exists a scalar λ ∈ F such that T v = λ v .
Proof. That the set of all linear transformations V → V forms a vector space should
be clear since we can add linear transformations and multiply them by scalars, and the
vector space axioms can easily be checked. The correspondence of linear transformations
with matrices is obviously a vector-space isomorphism, and the space of n × n matrices
has dimension n2 since the matrices Ep q , where Ep q has 1 as its (p, q) entry and 0
elsewhere, form a basis.
Proof. Since the set of all linear transformations V → V forms a vector space of dimension n², the transformations I , T , T ², . . . , T^{n²}, of which there are n² + 1, must be linearly dependent. Therefore there exist ci ∈ F for 0 ≤ i ≤ n², not all 0, such that Σ_{i=0}^{n²} ci T^i = 0. So if f (x) := Σ_{i=0}^{n²} ci x^i then f (x) ∈ F [x] \ {0} and f (T ) = 0.
A minimal polynomial of T (or of the matrix A) is a monic polynomial in F [x] of least degree which annihilates T (or A). There is only one such polynomial. For, if f1 , f2 ∈ F [x] are minimal polynomials of T (or of A), then f1 , f2 are monic
and of the same degree, say m. Therefore if g := f1 − f2 then either g = 0 or deg g < m.
But g(T ) = f1 (T )−f2 (T ) = 0, and so since m is the least degree of non-zero polynomials
which annihilate T , it must be that g = 0: that is, f1 = f2 .
Notation. We’ll write mT (x) (or mA (x) when A is an n × n matrix over F ) for
the minimal polynomial of T (or of A). Note that if A ∈ Mn×n (F ) and A represents T
with respect to some basis of V then mT (x) = mA (x).
For, if f ∈ F [x], then f (S) = U −1 f (T ) U (see Exercise 21), and so f (S) = 0 if and
only if f (T ) = 0.
Examples: • mI (x) = x − 1;
• m0 (x) = x;
• if A = Diag(1, 2, 2), the 3 × 3 diagonal matrix with diagonal entries 1, 2, 2, then mA (x) = x² − 3x + 2 = (x − 1)(x − 2).
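Similarly, if A is the 2 × 2 matrix with rows (1, 1) and (0, 1) then cA (x) = (x − 1)²; since A ≠ I no polynomial of degree 1 annihilates A, while (A − I)² = 0, and so mA (x) = (x − 1)².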
Proof. Let f ∈ F [x]. Since F [x] has a Division Algorithm we can find q, r ∈ F [x]
such that f (x) = q(x) mT (x)+r(x) and either r = 0 or deg r < deg mT . Now mT (T ) = 0
and so f (T ) = 0 if and only if r(T ) = 0. It follows from the minimality of deg mT that
f (T ) = 0 if and only if r = 0. That is f (T ) = 0 if and only if mT (x) divides f (x) in
F [x], as required.
Exercise 24. Let Ann(T ) := {f ∈ F [x] | f (T ) = 0}, the so-called annihilator of
T in F [x]. Show that Ann(T ) is an ideal in F [x], and that mT is a generator—that is,
Ann(T ) is the principal ideal (mT ) in F [x].
Next we examine the roots of the minimal polynomial: it turns out that they are
precisely the eigenvalues of T in F :
Note. In fact mT (x) and cT (x) always have the same irreducible factors in F [x].
Exercise 25. For each of the matrices
( 0 1 0 0 )        ( 1 1 0 1 )
( 0 0 1 0 )        ( −2 −1 −1 0 )
( 1 0 0 0 )        ( 0 0 2 −5 )
( 0 0 0 1 ),       ( 0 0 1 −2 ),
find the characteristic polynomial and the minimal polynomial.
Exercise 26. [Part of a former FHS question.] (i) Let A be a 3×3 matrix whose
characteristic polynomial is x3 . Show that there are exactly three possibilities for the
minimal polynomial of A and give an example of matrices of each type.
(ii) Let V be a finite-dimensional vector space over some field F , and let T : V → V
be a linear transformation whose minimal polynomial is xk . Prove that
{0} < Ker T < Ker T 2 < · · · < Ker T k = V
The Primary Decomposition Theorem
The Primary Decomposition Theorem is a very useful result for understanding linear
transformations and square matrices. Like many of the best results in mathematics it is
more a theory than a single cut and dried theorem and I propose to present three forms
of it. The first is the basic, all-purpose model which contains the main idea; the second
gives some detail about the minimal polynomial; and the third is designed to extract a
considerable amount of detail from the prime factorisation of the minimal polynomial.
Throughout this section notation is as before: F is a field, V is a finite-dimensional
vector space over F , and T : V → V is linear. Recall that a subspace U is said to be
T -invariant if T (U ) ≤ U , that is, T u ∈ U for all u ∈ U . When this is the case we write T |U for the restriction of T to U . Thus T |U : U → U and (T |U ) u = T u for all u ∈ U .
Reminder: although two linear operators do not usually commute, if they are poly-
nomials in one and the same operator T then they certainly do commute. For, powers
of T obviously commute and therefore so do linear combinations of powers of T .
Primary Decomposition Theorem, Mark 1. Suppose that f (T ) = 0 where f = g h with g, h ∈ F [x] coprime. Then V = U ⊕ W where U and W are T -invariant subspaces such that g(T |U ) = 0 and h(T |W ) = 0.
Proof. Our problem is to find subspaces U and W of V that have the specified
properties. Those properties include that g(T ) u = 0 for all u ∈ U and h(T ) w = 0 for
all w ∈ W . Therefore we know that we must seek U inside Ker g(T ) and W inside
Ker h(T ). In fact we define
U := Ker g(T )    and    W := Ker h(T )
and prove that these subspaces do what is wanted. Certainly, if u ∈ U then g(T )(T u) =
T g(T ) u = T 0 = 0. Thus if u ∈ U then T u ∈ U , so U is T -invariant. Similarly, W
is T -invariant. And the facts that g(T |U ) = 0 and h(T |W ) = 0 are immediate from the
definitions of U and of W . It remains to prove therefore that V = U ⊕ W .
From the theory of polynomial rings over a field we know that, since g , h are coprime, there exist a, b ∈ F [x] such that
a(x) g(x) + b(x) h(x) = 1.
Then
a(T ) g(T ) + b(T ) h(T ) = I,
where I : V → V is the identity as usual. For v ∈ V define
Exercise 27. With the notation and assumptions of the Primary Decomposition
Theorem, let P be the projection of V onto U along W . Find p(x) ∈ F [x] such that
P = p(T ).
f (T ) v = mT |U (T ) mT |W (T ) (u + w)
= mT |W (T ) mT |U (T ) u + mT |U (T ) mT |W (T ) w
= 0 + 0 = 0.
V = V1 ⊕ V 2 ⊕ · · · ⊕ V k ,
Proof. We use induction on k . The result is trivially true if k = 1, that is, if mT (x)
is simply a power of some irreducible polynomial. Our inductive hypothesis is that if U
is a finite-dimensional vector space over F and if the minimal polynomial of S : U → U
factorises as a product of k − 1 powers of irreducible polynomials, then U decomposes
as a direct sum as described in the statement of the theorem (with S replacing T ).
So now suppose that mT (x) = f1 (x)^{a1} f2 (x)^{a2} · · · fk (x)^{ak}, where f1 , f2 , . . . , fk are distinct monic irreducible polynomials over F . Let
g(x) := Π_{i=1}^{k−1} fi (x)^{ai}    and    h(x) := fk (x)^{ak}.
By the Primary Decomposition Theorem, Mark 2, V = U ⊕ W , where U , W are T -invariant and m_{T |U} = g , m_{T |W} = h. Applying the induction hypothesis to U and T |U we see that U = U1 ⊕ · · · ⊕ Uk−1 , where the subspaces Ui are T |U -invariant and the minimal polynomial of the restriction of T |U to Ui is fi^{ai} for 1 ≤ i ≤ k − 1. But of course this simply means that Ui is T -invariant and the minimal polynomial of the restriction of T to Ui is fi^{ai} for 1 ≤ i ≤ k − 1. Define Vi := Ui for 1 ≤ i ≤ k − 1 and Vk := W to
complete the proof.
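For a small worked instance, let A be the 2 × 2 matrix with rows (0, 1) and (0, 1), acting on V := F². Then A² = A, so mA (x) = x² − x = x (x − 1), a product of coprime (indeed distinct linear) factors, and accordingly V = Ker A ⊕ Ker(A − I): here Ker A = Span{(1, 0)} and Ker(A − I) = Span{(1, 1)}, and the restriction of A to these subspaces has minimal polynomial x and x − 1 respectively.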
Theorem. T is diagonalisable if and only if mT (x) factorises as a product of distinct linear factors in F [x].
Proof. Suppose first that mT (x) may be factorised as a product of distinct lin-
ear factors in F [x]. This means that there exist distinct scalars λ1 , . . . , λk such that
mT (x) = (x − λ1 ) · · · (x − λk ). By the Primary Decomposition Theorem (Mark 3),
V = V1 ⊕ · · · ⊕ Vk where Vi is T -invariant for 1 ≤ i ≤ k and T |Vi − λi IVi = 0. Thus all
vectors v in Vi satisfy the equation T v = λi v . If Bi is a basis of Vi then Bi consists
of eigenvectors of T with eigenvalue λi , and so if B := B1 ∪ · · · ∪ Bk then B is a basis
of V consisting of eigenvectors. Thus T is diagonalisable.
Now suppose conversely that T is diagonalisable and let B be a basis of V consist-
ing of eigenvectors of T . Let λ1 , . . . , λk be the distinct members of F that occur as
eigenvalues for the vectors in B . Define f (x) := (x − λ1 ) · · · (x − λk ). We propose to
show that f (T ) = 0. Let v ∈ B . Then there exists i ∈ {1, . . . , k} such that T v = λi v .
Therefore (T − λi I) v = 0 and so f (T ) v = 0. Since f (T ) annihilates all members of a
basis of V its null-space (kernel) is V , that is, f (T ) = 0. It follows that mT divides
f in F [x]. That would be enough to show that mT (x) is a product of some of the
factors (x − λi ) and therefore factorises as a product of distinct linear factors in F [x].
But in fact we can go a little further—we know that every eigenvalue is a root of mT
and therefore in fact mT = f , that is, mT (x) = (x − λ1 ) · · · (x − λk ).
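Two quick illustrations: the 2 × 2 matrix with rows (1, 1), (0, 1) has minimal polynomial (x − 1)², which has a repeated factor, so it is not diagonalisable over any field. The matrix B with rows (0, 1), (1, 0) satisfies B² = I and B ≠ ±I , so mB (x) = x² − 1; over Q, R or C this is (x − 1)(x + 1), a product of distinct linear factors, and B is diagonalisable, whereas over Z2 we have x² − 1 = (x − 1)², so there B is not diagonalisable.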
Response. The first clause is the ‘bookwork’ we have just treated. For the second
part, let λ1 , . . . , λk be the distinct eigenvalues of T and for 1 ≤ i ≤ k define
Ui := {v ∈ V | T v = λi v}.
Since T is diagonalisable, V has a basis consisting of eigenvectors and this means that V = U1 ⊕ · · · ⊕ Uk . This is part of what
we have just proved; note that the fact that eigenvectors for distinct eigenvalues are
linearly independent was what you proved in Exercise 4 of the ‘Mods Revision’ Sheet.
Now we prove (see that same Exercise 4) that each subspace Ui is S -invariant. For, if v ∈ Ui then T (S v) = S(T v) = S(λi v) = λi S v , and so S v ∈ Ui . Now the minimal polynomial m_{S|Ui} divides mS , and mS may be factorised as a product of distinct linear factors in F [x], so the same is true of m_{S|Ui} . It follows that there is a basis u_{i 1} , . . . , u_{i di} of Ui consisting of eigenvectors of S . Note, however, that since these are non-zero vectors in Ui they are automatically eigenvectors also for T . And now if
B := {u_{i j} | 1 ≤ i ≤ k, 1 ≤ j ≤ di }
then B is a basis of V consisting of common eigenvectors of S and T , as required.
Exercise 28. For each of the matrices in Exercise 23, thought of as matrices with
coefficients in C, find an invertible matrix Y over C such that Y −1 XY is diagonal, or
prove that no such Y exists. Can such matrices be found, whose coefficients are real?
Exercise 29. Let
A := ( 0 1 0 )
     ( 1 1 0 )
     ( 1 1 1 ).
Is A diagonalisable when considered as a matrix over the following fields: (i) C ; (ii) Q ; (iii) Z2 ; (iv) Z5 ?
Triangular form
There are some linear transformations T and some square matrices A that cannot be diagonalised—for example the transformation T : F² → F² given by T : (x, y) ↦ (x + y, y), or the matrix
A := ( 1 1 )
     ( 0 1 ).
Nevertheless, one can choose bases with respect to which the matrices are particularly simple or particularly well-adapted to show the behaviour of T or of A. The most sophisticated of these give the so-called Rational
Canonical Form and the Jordan Normal Form of matrices, but these are quite a long
way beyond the scope of this course. Triangular form is a step in the direction of the
Jordan Normal Form, and although it is quite a small step, it is extremely useful.
As previously, throughout this section F is a field, V is a finite-dimensional vector
space over F , n := dim V , and T : V → V is a linear transformation.
Note the comparison with the diagonalisability theorem on p. 25. There it was the
minimal polynomial that mattered, and it had to factorise completely with distinct roots
in F . Here it is the characteristic polynomial that matters, and it must factorise comp-
letely over F but can have multiple roots.
Theorem. There is a basis of V with respect to which the matrix of T is upper triangular if and only if cT (x) may be written as a product of linear factors in F [x].
Proof. This is clear one way round: we have already seen that if the matrix of T is the upper triangular matrix (aij ) with respect to a suitable basis then cT (x) = det(x I − A) = Π_i (x − aii ).
For the converse we use induction on the dimension n. There is nothing to prove
if n = 0 or if n = 1. So suppose that n ≥ 2 and that the theorem is known to hold
for linear transformations of smaller-dimensional vector spaces. Suppose that cT (x) = Π_{i=1}^{n} (x − λi ), where λ1 , . . . , λn ∈ F . Let λ be any one of the λi . For the sake of
definiteness let’s define λ := λn . Let W := Im (T − λ I) and let m := dim W . Since λ is
an eigenvalue, dim (Ker (T − λ I)) ≥ 1 and so (by the Rank–Nullity Theorem) m ≤ n − 1.
Let w1 , . . . , wm be a basis of W and extend this to a basis w1 , . . . , wm , vm+1 , . . . , vn
of V . Now W is T -invariant because if w ∈ W then w = T v − λ v for some v ∈ V ,
and so T w = T 2 v − λ T v = (T − λ I) T v ∈ W . Therefore there is an m × m matrix
(aij ) such that
T wj = Σ_{i=1}^{m} aij wi    for 1 ≤ j ≤ m.
Also, since for m + 1 ≤ j ≤ n we know that T vj − λ vj ∈ W , there is an m × (n − m) matrix (bij ) such that
T vj = λ vj + Σ_{i=1}^{m} bij wi    for m + 1 ≤ j ≤ n.
The matrix A of T with respect to this basis has the partitioned form
( A1   B1 )
( 0    λ I_{n−m} ),
where A1 = (aij ) (the matrix of T |W with respect to the basis w1 , . . . , wm of W ), B1 = (bij ), and the matrix in the south-west corner is an (n − m) × m zero matrix. Now
cA (x) = det(xI − A) = det(xI − A1 ) (x − λ)^{n−m} = c_{T |W}(x) (x − λ)^{n−m},
and so c_{T |W}(x) divides cA (x) in F [x]. Therefore c_{T |W}(x) may be written as a product of linear factors in F [x] and so by inductive hypothesis there is a basis v1 , . . . , vm of W with respect to which the matrix A′ of T |W is upper triangular. Then the matrix of T with respect to the basis v1 , . . . , vm , vm+1 , . . . , vn of V is
( A′   B′ )
( 0    λ I_{n−m} ),
for some m × (n − m) matrix B′ , and this is upper triangular, as required.
The proof of the theorem gives a practical method for finding a basis with respect to
which the matrix of T is triangular. We find an eigenvalue λ of T ; then (T − λI)V is a
proper T -invariant subspace of V . Find a triangularising basis there, and extend to V .
Choose the vector (1, −1, −1)tr to span this one-dimensional space; then by inspec-
tion we find that W is spanned by this vector together with (0, 1, 0)tr , and so as
triangularising basis for V we can take
(1, −1, −1)^tr ,   (0, 1, 0)^tr ,   (0, 0, 1)^tr .
And in fact, with respect to this basis the matrix becomes
( 1   2   3 )
( 0   1  −2 )
( 0   0   1 ).
Exercise 32. Let
A := ( 4    9 )
     ( −1  −2 ),
construed as a matrix over an arbitrary field F . Find an invertible 2 × 2 matrix P over F for which P^{−1} A P is triangular. Are there any fields F over which P can be found such that P^{−1} A P is diagonal?
Exercise 33. For each of the matrices in Exercise 24, thought of as matrices with
coefficients in C, find an invertible matrix Y over C such that Y −1 XY is upper trian-
gular. Can such matrices be found, whose coefficients are real?
Note. Historically this was a theorem about square matrices rather than linear
transformations. Indeed, it was the second of the four assertions above. For reasons
both historical and practical, in these notes I propose to work with n × n matrices over
F . The translation to linear transformations T : V → V (for a finite-dimensional vector
space V over F ) is, however, absolutely routine.
There are many proofs of the theorem. It lies deeper than other parts of this course,
and some of those proofs give little insight into just why the theorem holds. They merely
prove it. In order to give you some insight into why it is true I propose to begin with three
simple but suggestive observations which lead directly to an easy proof of the theorem
over any subfield of C.
f (B) = f (P −1 A P ) = P −1 f (A) P
and so f (B) = 0 if and only if f (A) = 0. That is, cA (A) = 0 if and only if cB (B) = 0,
as required.
This can be thought of in another way. Choose a basis for an n-dimensional vector
space over V and let T : V → V be the linear transformation whose matrix is A with
respect to this basis. The correspondence between matrices and linear transformations
says that cA (A) = 0 if and only if cT (T ) = 0. But now if B = P −1 A P then B is
simply the matrix of the same linear transformation T with respect to a different basis,
so cB (B) = 0 if and only if cT (T ) = 0. Therefore cA (A) = 0 if and only if cB (B) = 0.
For, by the previous observation this is true if and only if it is true for diagonal
matrices. Suppose then that, in an obvious notation, A = Diag (λ1 , λ2 , . . . , λn ). Then
cA (x) = (x−λ1 ) (x−λ2 ) · · · (x−λn ), and so cA (A) = (A−λ1 I) (A−λ2 I) · · · (A−λn I).
Each factor (A − λi I) is a diagonal matrix, and its ith diagonal entry is 0. Therefore
cA (A) is a diagonal matrix and its ith diagonal entry is 0 for every i; that is, cA (A) = 0.
As it happens, there is a strong sense in which ‘almost all’ matrices are diagonalisable,
and therefore these simple arguments have already proved the Cayley–Hamilton Theorem
for ‘most’ matrices (and therefore for ‘most’ linear transformations). We can however
take one further step and prove the theorem for triangularisable matrices.
Proof. Again, by the first observation above, we may assume that in fact A is upper triangular. (This observation was Qn 6 on the preliminary exercise sheet, Mods Revision.) We think of A as partitioned in the form
( A1   B  )
( 0    λn ),
where A1 is an (n − 1) × (n − 1) upper triangular matrix, B is an (n − 1) × 1 column vector, and 0 denotes the 1 × (n − 1) zero row vector. Then
det(x I − A) = det(x I − A1 ) · (x − λn ),
that is, cA (x) = c_{A1}(x) (x − λn ). For any polynomial f ∈ F [x] we find that
f (A) = ( f (A1 )   C      )
        ( 0         f (λn ) ),
where C is some (n − 1) × 1 column vector, and 0 again denotes the 1 × (n − 1) zero row
vector. As inductive assumption we may assume that cA1 (A1 ) = 0. Then, for suitable
column vectors C , C ′ ,
cA (A) = cA1 (A) (A − λn I)
A1 − λn I C ′
µ ¶µ ¶
cA1 (A1 ) C
=
0 cA1 (λn ) 0 0
A1 − λn I C ′
µ ¶µ ¶
0 C
=
0 cA1 (λn ) 0 0
µ ¶
0 0
= .
0 0
30
Corollary. The Cayley–Hamilton Theorem holds for matrices over C and for
matrices over any subfield F of C, such as Q or R.
Proof. We know (see the note on p. 28) that every matrix over C is triangularisable,
and so the result follows immediately from Observation 3.
If A ∈ Mn×n (F ), where F is a subfield of C then we can think of A as a matrix
over C. As such it is known to be annihilated by its characteristic polynomial, which is
what we wanted to show.
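As a sanity check on the smallest interesting case: for the 2 × 2 matrix A with rows (1, 2) and (3, 4) we have cA (x) = det(xI − A) = (x − 1)(x − 4) − 6 = x² − 5x − 2, and indeed A² − 5A − 2I = 0, since A² has rows (7, 10) and (15, 22), 5A has rows (5, 10) and (15, 20), and their difference is 2I .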
It is worth a digression to take a brief look at the original source of the theorem. This
is A. Cayley, ‘A memoir on the theory of matrices’, Phil. Trans Roy. Soc. London, 148
(1858), 17–37, which is reprinted in The Collected Mathematical Papers of Arthur Cayley,
Vol. II, pp. 475–495. Hamilton’s connection with the theorem seems to have been more
tenuous and to have come from one of his theorems about his quaternions. Here is a
quotation from Cayley’s introduction to his paper:
I obtain the remarkable theorem that any matrix whatever satisfies an algeb-
raical equation of its own order, [ . . . ] viz. the determinant, formed out of the matrix
diminished by the matrix considered as a single quantity involving the matrix unity,
will be equal to zero.
And here is an extract from §§21–23 (I have tried to reproduce Cayley’s notation for
matrices faithfully, with the first row enclosed in round brackets, the rest enclosed in
vertical bars):
21. The general theorem before referred to will be best understood by a com-
plete development of a particular case. Imagine a matrix
( a, b ),
| c, d |
[ . . . ]
M² − (a + d) M¹ + (ad − bc) M⁰ ;
[ . . . ]
( a² + bc ,   b(a + d) ),     ( a, b ),     ( 1, 0 ),
| c(a + d),   d² + bc  |      | c, d |      | 0, 1 |
and substituting these values the determinant becomes equal to the matrix zero,
[ . . . ].
[. . .]
23. I have verified the theorem, in the next simplest case of a matrix of the
order 3 , viz. if M be such a matrix, suppose
( a, b, c ),
| d, e, f |
| g, h, i |
then the derived determinant vanishes, or we have
| a − M ,   b ,       c     |
| d ,       e − M ,   f     |  = 0,
| g ,       h ,       i − M |
or expanding
but I have not thought it necessary to undertake the labour of a formal proof of the
theorem in the general case of a matrix of any degree.
What a charming piece of 19th Century chutzpah: “I have not thought it necessary
to undertake the labour of a formal proof of the theorem in the general case of a matrix of
any degree.” It is very unlikely that the method which Cayley uses for 2×2 matrices, and
sketches for the 3 × 3 case could be practical for the n × n case: even if one could write
down explicitly the characteristic polynomial of an n × n matrix A, it seems unrealistic
to expect to write down the (i, j) coefficient of a general power Ak for all k up to n, and
then evaluate cA (A) in the way that it is possible in the 2×2 case. So the fact is, he can’t
really have had a proof. But there’s another fact: he may not have appreciated rigorous
thinking in the same way as Oxford students now do (the poor chap went to Cambridge
and missed out on the Oxford experience) but he did have a wonderful insight. He knew
that the theorem was right, and for him proof, though it would have been nice to have,
was less important than having an insight which he could use in all sorts of ways to solve
other mathematical problems.
Further exercises II
Exercise 35. Let V be a finite-dimensional vector space over a field F and let
T : V → V be a linear transformation.
(i) Suppose that F = C and that T 4 = T . Show that T is diagonalisable.
(ii) Now suppose that F = Z3 and that mT (x) = x4 − x. Is T diagonalisable?
Exercise 37 [FHS 1995, A1, 3]. Let V be a finite-dimensional vector space over a
field F , and let T : V → V be a linear transformation. Let v ∈ V , v 6= 0, and suppose
that V is spanned by {v, T v, T 2 v, . . . , T j v, . . .}.
(i) Show that there is an integer k > 1 such that v, T v, T 2 v, . . . , T k−1 v are linearly
independent but T k v = α0 v + α1 T v + · · · + αk−1 T k−1 v for some α0 , α1 , . . . ,
αk−1 ∈ F .
(ii) Prove that {v, T v, T 2 v, . . . , T k−1 v} is a basis for V .
0 0 0 1
vector v ∈ R4 such that R4 is spanned by {v, Av, A2 v, . . . , Aj v, . . .}.
Part III: Inner Product Spaces
In the first two parts of this course the focus has been on linear algebra—vector spaces,
linear transformations and matrices—over an arbitrary field F . We come now to that
part of linear algebra which is rather more geometric, the theory of inner products and
inner product spaces. Inner products are generalisations of the familiar dot product u · v of n-vectors with real coordinates, defined by u · v = Σ_i xi yi (where u has coordinates
x1 , . . . , xn and v has coordinates y1 , . . . , yn ). Although this theory can be extended
to arbitrary fields, or at least parts of it can, its most important and most natural
manifestations are over R and C. Therefore from now on we restrict to these fields.
Let V be a vector space over R. An inner product on V is a function B : V × V → R such that, for all u, u1 , u2 , v, v1 , v2 ∈ V and all α1 , α2 ∈ R,
(1) B(α1 u1 + α2 u2 , v) = α1 B(u1 , v) + α2 B(u2 , v);
(1′) B(u, α1 v1 + α2 v2 ) = α1 B(u, v1 ) + α2 B(u, v2 );
(2) B(u, v) = B(v, u);
(3) B(u, u) > 0 whenever u ≠ 0.
A function V × V → R satisfying (1) and (1′ ) is said to be bilinear and called a bilinear
form. Thus an inner product on a real vector space is a positive definite symmetric
bilinear form. A real inner product space is a vector space over R equipped with an
inner product.
Notation. Often we find ⟨u, v⟩ or ⟨u | v⟩ used to denote what here is B(u, v). I propose to use the former. In an inner product space we define
||u|| := ⟨u, u⟩^{1/2} .
Thus ||u|| ≥ 0 and ||u|| = 0 if and only if u = 0; this is known as the length or as the norm of the vector u.
Note. From condition (1′ ) for an inner product it follows that ⟨u, 0⟩ = 0 for any u ∈ V . Conversely, suppose that u ∈ V and ⟨u, v⟩ = 0 for all v ∈ V . Then in particular ⟨u, u⟩ = 0 and so u = 0. This property of an inner product is expressed by saying that it is a non-degenerate or non-singular bilinear form.
Let V be a real inner product space. If u, v ∈ V and ⟨u, v⟩ = 0 then we say that u, v are orthogonal. For X ⊆ V we define
X^⊥ := {v ∈ V | ⟨u, v⟩ = 0 for all u ∈ X }.
For u ∈ V we write u^⊥ for {u}^⊥ .
Lemma. An orthogonal set of non-zero vectors in a real inner product space is lin-
early independent.
Theorem. Let V be a finite-dimensional real inner product space, let n := dim V ,
and let u ∈ V \{0}. There is an orthonormal basis v1 , v2 , . . . , vn in which v1 = ||u||−1 u.
Exercise 38. Let V be a real inner product space and let U be a subspace of V .
Show that (U ⊥ )⊥ = U .
Theorem. Let v1 , . . . , vn be an orthonormal basis of the real inner product space V . If
u = x1 v1 + · · · + xn vn ,    w = y1 v1 + · · · + yn vn ,
then ⟨u, w⟩ = Σ_{i=1}^{n} xi yi and ||u||² = Σ_{i=1}^{n} xi² .
The significance of this theorem is twofold. First, it shows the value of an orthonormal
basis for computing norms and inner products. Secondly, it shows that if we use an
orthonormal basis to identify our inner product space with Rn then our abstract inner
product is identified with the familiar dot product:
Exercise 39. Let V be a vector space over R, and let hu, vi1 and hu, vi2 be
inner products defined on V . Prove that if hx, xi1 = hx, xi2 for all x ∈ V , then
hu, vi1 = hu, vi2 for all u, v ∈ V .
Now let V be a vector space over C. An inner product on V is a function B : V × V → C which is linear in its first variable, satisfies B(v, u) = the complex conjugate of B(u, v) for all u, v ∈ V , and satisfies B(u, u) > 0 whenever u ≠ 0. It follows that B(u, α1 v1 + α2 v2 ) = ᾱ1 B(u, v1 ) + ᾱ2 B(u, v2 ).
We express this by saying that B is semilinear or conjugate linear in its second variable,
or that B is sesquilinear (one-and-a-half linear). A complex inner product is often called
a Hermitian form in honour of Charles Hermite (1822–1901). A complex inner product
space is a complex vector space equipped with an inner product in this sense.
Notation. As in the real case ⟨u, v⟩ or ⟨u | v⟩ are often used for inner products. And as in the real case we define ||u|| := ⟨u, u⟩^{1/2} .
Let V be a complex inner product space. Orthogonality is defined just as in real inner
product spaces. A finite-dimensional complex inner product space has an orthonormal
basis. If U 6 V and U is finite-dimensional then V = U ⊕ U ⊥ . The proofs of these
simple facts are almost exactly the same as in the real case and are offered as exercises
for the reader:
Theorem. Let v1 , . . . , vn be an orthonormal basis of the complex inner product space V . If
u = x1 v1 + · · · + xn vn ,    w = y1 v1 + · · · + yn vn ,
then ⟨u, w⟩ = Σ_{i=1}^{n} xi ȳi and ||u||² = Σ_{i=1}^{n} |xi |² .
Isometry of complex inner product spaces is defined exactly as in the real case and
we have the following important consequence of the theorem:
Exercise 42. Let V be a vector space over C, and let hu, vi1 and hu, vi2 be
inner products defined on V . Is it true that if hx, xi1 = hx, xi2 for all x ∈ V , then
hu, vi1 = hu, vi2 for all u, v ∈ V ? [Compare Exercise 39.]
distinct entities, and although they are of course intimately related to each other it seems
best to keep them separate. Nevertheless, from now on I propose to treat their theories
together.
Theorem (the Gram–Schmidt process). Let u1 , . . . , un be linearly independent vectors in a real or complex inner product space V . Then there is an orthonormal set v1 , . . . , vn in V such that Span(v1 , . . . , vk ) = Span(u1 , . . . , uk ) for 1 ≤ k ≤ n.
Proof. Since u1 ≠ 0 we can define v1 := ||u1 ||^{−1} u1 . Then ||v1 || = 1 and Span(v1 ) = Span(u1 ). Suppose as inductive hypothesis that 1 ≤ k < n and we have found an orthonormal set v1 , . . . , vk in V such that Span(v1 , . . . , vk ) = Span(u1 , . . . , uk ). For 1 ≤ i ≤ k define αi := ⟨uk+1 , vi ⟩ and w := Σ_i αi vi . (We should think geometrically of w as the orthogonal projection of uk+1 into the subspace spanned by v1 , . . . , vk , which is the same as the subspace spanned by u1 , . . . , uk .) Now w ∈ Span(v1 , . . . , vk ) = Span(u1 , . . . , uk ) whereas uk+1 ∉ Span(u1 , . . . , uk ). Therefore if v := uk+1 − w then v ≠ 0 and we can define vk+1 := ||v||^{−1} v . Then ||vk+1 || = 1. Also, for 1 ≤ i ≤ k ,
⟨vk+1 , vi ⟩ = ||v||^{−1} (⟨uk+1 , vi ⟩ − ⟨w, vi ⟩) = ||v||^{−1} (αi − αi ) = 0,
so v1 , . . . , vk+1 is an orthonormal set, and Span(v1 , . . . , vk+1 ) = Span(u1 , . . . , uk+1 ), where the last equation comes from the fact that w ∈ Span(u1 , . . . , uk ).
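A small worked instance in R³ with the usual dot product: take u1 := (1, 1, 0) and u2 := (1, 0, 1). Then v1 = (1/√2)(1, 1, 0); next α1 = ⟨u2 , v1 ⟩ = 1/√2, so w = α1 v1 = (1/2, 1/2, 0), v = u2 − w = (1/2, −1/2, 1), and since ||v|| = √(3/2) we get v2 = (1/√6)(1, −1, 2). One checks that ⟨v1 , v2 ⟩ = 0 and that Span(v1 , v2 ) = Span(u1 , u2 ).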
{e1 , e2 , . . . , ek } is a basis for the subspace spanned by {b1 , b2 , . . . , bk }. Deduce that there are tij ∈ R with tii ≠ 0 such that
bj = Σ_{r=1}^{j} trj er    for 1 ≤ j ≤ n.
Response. The first three instructions ask for bookwork that has just been treated above. For the fourth note that since {e1 , e2 , . . . , en } is an orthonormal basis of V and bj = Σ_{r=1}^{j} trj er we have ⟨bj , ei ⟩ = tij . So we calculate as follows:
⟨bi , bj ⟩ = ⟨Σ_r tri er , Σ_s tsj es ⟩ = Σ_{r,s} tri tsj ⟨er , es ⟩ = Σ_{k=1}^{n} tki tkj ,
since ⟨er , es ⟩ is 0 if r ≠ s and is 1 if r = s = k . That is, ⟨bi , bj ⟩ = Σ_{k=1}^{n} ⟨bi , ek ⟩⟨bj , ek ⟩, as required.
Now let G be the matrix (⟨bi , bj ⟩) [which is known as the Gram matrix of the inner product with respect to the basis {b1 , b2 , . . . , bn }], and let T be the matrix (tij ). The formula that has just been proved tells us that G = T^tr T . Therefore det G = (det T )². But T is upper triangular and therefore det T = Π_i tii . Hence det G = Π_{i=1}^{n} tii² as required. Since tii ≠ 0 for all relevant i we see that det G ≠ 0 and so G is non-singular.
Exercise 43. Let V be the vector space of polynomials of degree ≤ 3 with real coefficients. Define ⟨f, g⟩ := ∫_{−1}^{1} f (t) g(t) dt. Show that this is an inner product on V . Use the Gram–Schmidt process to find an orthonormal basis for V .
Exercise 44. How does the answer to Exercise 43 change if ⟨f, g⟩ := ∫_{0}^{1} f (t) g(t) dt?
Bessel’s Inequality
Bessel’s Inequality. Let V be a real or complex inner product space and let
v1 , . . . , vm be an orthonormal set in V . If u ∈ V then
Σ_{i=1}^{m} |⟨u, vi ⟩|² ≤ ||u||² .
Proof. In the following calculation we shall use notation appropriate to the complex case. For u ∈ V define w := u − Σ_{i=1}^{m} ⟨u, vi ⟩ vi . Then ⟨w, w⟩ ≥ 0, but also
⟨w, w⟩ = ⟨u − Σ_i ⟨u, vi ⟩ vi , u − Σ_i ⟨u, vi ⟩ vi ⟩
= ⟨u, u⟩ − ⟨u, Σ_i ⟨u, vi ⟩ vi ⟩ − ⟨Σ_i ⟨u, vi ⟩ vi , u⟩ + ⟨Σ_i ⟨u, vi ⟩ vi , Σ_j ⟨u, vj ⟩ vj ⟩
= ⟨u, u⟩ − Σ_i ⟨vi , u⟩⟨u, vi ⟩ − Σ_i ⟨u, vi ⟩⟨vi , u⟩ + Σ_{i,j} ⟨u, vi ⟩⟨vj , u⟩⟨vi , vj ⟩,
using the facts that the inner product is conjugate linear in its second variable and that ⟨vi , u⟩ is the complex conjugate of ⟨u, vi ⟩. Since ⟨vi , vj ⟩ is 1 if i = j and 0 otherwise, and ⟨u, vi ⟩⟨vi , u⟩ = |⟨u, vi ⟩|², this reduces to
⟨w, w⟩ = ||u||² − Σ_{i=1}^{m} |⟨u, vi ⟩|² ,
and since ⟨w, w⟩ ≥ 0 the inequality follows.
Note that in the real case absolute values are not needed on the left side of the inequality. It states that Σ_{i=1}^{m} ⟨u, vi ⟩² ≤ ||u||² . Also, complex conjugation is not needed
in the proof. Nor is it harmful, though if a friend (such as a tutor or an examiner)
specifically asks for a proof of Bessel’s Inequality for real inner product spaces then one
should expound the proof without it.
which is of course obvious, but gives us some insight into what the theorem is saying in
its general and abstract setting.
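A trivial numerical check: in R³ with the usual dot product take the orthonormal set v1 = (1, 0, 0), v2 = (0, 1, 0) and u = (1, 2, 3). Then ⟨u, v1 ⟩² + ⟨u, v2 ⟩² = 1 + 4 = 5 ≤ 14 = ||u||², the deficit 9 being accounted for by the component of u orthogonal to Span(v1 , v2 ).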
Example. Let V be the inner product space of continuous complex-valued functions on [0, 1] with ⟨f, g⟩ := ∫_{0}^{1} f (t) ḡ(t) dt, and for k ∈ Z define vk ∈ V by vk (t) := e^{2πikt}. If r, s ∈ Z then
⟨vr , vs ⟩ = ∫_{0}^{1} e^{2πi(r−s)t} dt = 1 if r = s, and 0 if r ≠ s,
so {vk | k ∈ Z} is an orthonormal set in V . For f ∈ V define
ck := ⟨f, vk ⟩ = ∫_{0}^{1} f (t) e^{−2πikt} dt .
Then Bessel’s Inequality tells us that for m, n ∈ N
Σ_{k=−m}^{n} |ck |² ≤ ∫_{0}^{1} |f (t)|² dt,
Classic alternative proof for real inner product spaces. Suppose that V is a real
inner product space and let u, v ∈ V . If v = 0 the result holds for trivial reasons, so we
suppose that v ≠ 0. For x ∈ R define
f (x) := ⟨u + x v, u + x v⟩ = ||u||² + 2x ⟨u, v⟩ + x² ||v||² .
Thus f (x) is a quadratic function of x and it is positive semi-definite in the sense that f (x) ≥ 0 for all x ∈ R. Therefore its discriminant is ≤ 0, and so ⟨u, v⟩² ≤ ||u||² ||v||² , that is, |⟨u, v⟩| ≤ ||u|| · ||v||, as required.
Let V be a vector space over the complex numbers. Define what is meant by
the statement that V has an inner product over C , and explain how this can be
used to define the norm ||v|| of a vector v in V .
Prove Bessel’s Inequality, that if {u1 , . . . , un } is an orthonormal set in V and
v ∈ V then
Σ_{i=1}^{n} |⟨ui , v⟩|² ≤ ||v||² ,
where ⟨ , ⟩ denotes the inner product. Deduce, or prove otherwise, the Cauchy–
Schwarz Inequality,
(ii) Show that if a1 , . . . , an are strictly positive real numbers then
(Σ ai ) (Σ 1/ai ) ≥ n² .
Response. The first parts are ‘bookwork’ done above (but perhaps now is a good
moment for you to close this book and try to reconstruct the proofs for yourself).
For (i) take V := C^n , the space of n × 1 column vectors with its usual hermitian inner product. Take u := (a1 , . . . , an )^tr and v := (b1 , . . . , bn )^tr in the Cauchy–Schwarz Inequality. Since ⟨u, v⟩ = u^tr v̄ = Σ ai b̄i , ||u||² = Σ |ai |² and ||v||² = Σ |bi |², this theorem tells us that |Σ ai b̄i |² ≤ (Σ |ai |²) (Σ |bi |²), as required.
Now for (ii) replace ai and bi in (i) with ai^{1/2} and ai^{−1/2} respectively to see that
(Σ_{i=1}^{n} 1)² ≤ (Σ_{i=1}^{n} ai ) (Σ_{i=1}^{n} ai^{−1} ),
that is, (Σ ai ) (Σ 1/ai ) ≥ n² .
For (iii) take V to be the vector space of complex-valued continuous functions on [a, b] with inner product ⟨f, g⟩ := ∫_{a}^{b} f (t) ḡ(t) dt. It needs to be checked that this does define an inner product on V [bookwork]. Then the Cauchy–Schwarz inequality says that for any f, g ∈ V , (∫_{a}^{b} |f (x)|² dx) (∫_{a}^{b} |g(x)|² dx) ≥ |∫_{a}^{b} f (x) ḡ(x) dx|² . Specialising to the case where f , g are real-valued, we get the required inequality
(∫_{a}^{b} f (x)² dx) (∫_{a}^{b} g(x)² dx) ≥ (∫_{a}^{b} f (x) g(x) dx)² .
[Comment: note that in the statement of part (iii) of the question the modulus bars
are unnecessary. Also, it seems a bit odd to define V in this last part to consist of
complex-valued functions on [a, b] when the question asks about real-valued functions,
but since the first part asks for a proof of Bessel’s Inequality and the Cauchy–Schwarz
Inequality for complex inner product spaces, that is what we have available.]
Note 2. It follows that the isometries of our inner product space V form a group:
certainly I is an isometry; if P is an isometry then P −1 is an isometry; if P1 , P2 are
isometries then so is P2 ◦ P1 .
Note 3. In the real case an isometry is known as an orthogonal transformation;
the group is the orthogonal group denoted O(V ). The group O(Rn ) is often denoted
O(n), sometimes O(n, R) or On (R).
Note 5. Thus an n × n matrix A with real entries is orthogonal if and only if the
columns of A form an orthonormal basis for Rn .
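For instance, every rotation matrix
( cos θ   −sin θ )
( sin θ    cos θ )
is orthogonal: its columns (cos θ, sin θ)^tr and (−sin θ, cos θ)^tr each have length 1 and are orthogonal to one another, so A^tr A = I .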
Note 7. U(Cn ) = {A ∈ Mn×n (C) | A−1 = Ātr }. The proof is similar to the real
case and is offered as an exercise:
Note 8. Thus an n × n matrix A with complex entries is unitary if and only if the
columns of A form an orthonormal basis for Cn .
Observation. The group O(n) is a closed bounded subset of Mn×n (R). Similarly, U(n) is a closed bounded subset of Mn×n (C).
The group O(n) is closed in Mn×n (R) because if Am are orthogonal matrices and A = lim_{m→∞} Am then A^tr A = (lim Am )^tr (lim Am ) = lim(Am^tr Am ) = lim I = I , which shows that A is orthogonal. It is bounded because if A = (aij ) ∈ O(n) then for each i we have Σ_j aij² = 1 and so |aij | ≤ 1 for all relevant i, j . The proof for U(n) is almost the same.
infinite dimensional spaces—so-called Hilbert spaces). This will be the foundation for
our treatment of adjoint transformations in the fourth and final part of these notes.
Moreover, vf is unique.
Note. The map f 7→ vf from V ′ to V is linear in the real case, semilinear in the
complex case:
Exercise 47. Prove this. That is, show that if F = R then in the theorem above
vαf +βg = α vf + β vg for all f, g ∈ V ′ and all α, β ∈ R, and that if F = C then
vαf +βg = ᾱ vf + β̄ vg for all f, g ∈ V ′ and all α, β ∈ C.
if and only if u1 , . . . , ul is an orthonormal basis for V .
Exercise 49. Prove that if z1, . . . , zn ∈ C then
\[
\Bigl|\sum_{i=1}^{n} z_i\Bigr|^2 \le n \sum_{i=1}^{n} |z_i|^2 .
\]
When does equality hold?
Exercise 50 [Part of FHS 2005, AC2, 2]. Prove that for any positive real number a
\[
\Bigl(\sum_{i=1}^{k} a^i\Bigr)\Bigl(\sum_{i=1}^{k} a^{-i}\Bigr) \ge k^2 .
\]
Part IV: Adjoints of linear transformations
on finite-dimensional inner product spaces
In this fourth and final part of these notes we study linear transformations of finite-
dimensional real or complex inner product spaces. What we are aiming for is the so-called
Spectral Theorem for self-adjoint transformations.
Theorem. Let V be a finite-dimensional inner product space. For each linear trans-
formation T : V → V there is a unique linear transformation T* : V → V such that
⟨T u, v⟩ = ⟨u, T* v⟩ for all u, v ∈ V.

To see that T* is linear, note that for all v1, v2 ∈ V, all scalars α1, α2 and every u ∈ V,
\[
\begin{aligned}
\langle u, T^*(\alpha_1 v_1 + \alpha_2 v_2)\rangle
&= \langle T u, \alpha_1 v_1 + \alpha_2 v_2\rangle\\
&= \bar\alpha_1 \langle T u, v_1\rangle + \bar\alpha_2 \langle T u, v_2\rangle\\
&= \bar\alpha_1 \langle u, T^* v_1\rangle + \bar\alpha_2 \langle u, T^* v_2\rangle\\
&= \langle u, \alpha_1 T^* v_1 + \alpha_2 T^* v_2\rangle.
\end{aligned}
\]
(In the real case the bars have no effect.)
We have seen before that if hu, w1 i = hu, w2 i for all u ∈ V then w1 = w2 . It follows
that
T ∗ (α1 v1 + α2 v2 ) = α1 T ∗ v1 + α2 T ∗ v2 ,
and therefore T ∗ is linear, as required.
Example. Let V := R^n with its usual inner product ⟨u, v⟩ = u^tr v. Suppose
T : V → V is given by T : v ↦ Av where A ∈ M_{n×n}(R). Then T* : v ↦ A^tr v. For,
⟨T u, v⟩ = (Au)^tr v = u^tr A^tr v = ⟨u, A^tr v⟩ for all u, v ∈ V, and so, by the uniqueness
just proved, T* v = A^tr v.
Example. Let V := Cn with its usual inner product hu, vi = utr v̄ . Suppose
T : V → V , T : v 7→ Av where A ∈ Mn×n (C) . Then T ∗ : v 7→ Ā tr v . The proof
is very similar to that worked through in the real case.
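Both examples are easy to check numerically. The sketch below (my own illustration, in Python/numpy, with a randomly chosen matrix) verifies the complex case: with ⟨u, v⟩ = u^tr v̄, the map v ↦ Ā^tr v really does satisfy the defining property of the adjoint of v ↦ Av.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)

inner = lambda x, y: x @ np.conj(y)        # <x, y> = x^tr conj(y)
lhs = inner(A @ u, v)                      # <T u, v>
rhs = inner(u, np.conj(A).T @ v)           # <u, conj(A)^tr v>
assert np.isclose(lhs, rhs)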
Proof. We deal with (ii) and (iii), leaving (i) and (iv) as exercises. For (ii), for all
u, v ∈ V ,
Proof. The matrix A is (a_ij), say, where the coefficients are defined by the equations
T e_j = Σ_i a_ij e_i. Similarly, if A* = (b_ij) then the coefficients b_ij are defined by the
equations T* e_j = Σ_i b_ij e_i. Now ⟨T e_p, e_q⟩ = ⟨e_p, T* e_q⟩ for all relevant p, q by definition
of T*.
Next we investigate the kernel and the image of an adjoint transformation.
Theorem. Let V be a finite-dimensional real or complex inner product space and let
T : V → V be a linear transformation. Then Ker T ∗ = (Im T )⊥ and Im T ∗ = (Ker T )⊥ .
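In matrix terms (real case, T : v ↦ Av) the theorem says that the left null space of A is the orthogonal complement of its column space. The following sketch, using the singular value decomposition in Python/numpy, is my own illustration and not part of the notes.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
A[:, 4] = A[:, 0] + A[:, 1]              # make A singular so that the kernel is non-trivial

U, s, Vt = np.linalg.svd(A)
im_A   = U[:, s >= 1e-10]                # orthonormal basis of Im A
ker_At = U[:, s < 1e-10]                 # orthonormal basis of Ker A^tr
assert np.allclose(im_A.T @ ker_At, 0)   # Ker A^tr is orthogonal to Im A
assert im_A.shape[1] + ker_At.shape[1] == 5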
Exercise 53 [Part of an old FHS question]. Consider the vector space of real-
valued polynomials of degree ≤ 1 in a real variable t, equipped with the inner product
⟨f, g⟩ = ∫_0^1 f(t) g(t) dt. Let D be the operation of differentiation with respect to t. Find
the adjoint D*.
Exercise 54 [Compare FHS 2005, AC1, Qn 2]. In the previous exercise, how does
D* change if the inner product is changed to ∫_{-1}^{1} f(t) g(t) dt?
Proof. Let U be a T -invariant subspace of V and let v ∈ U ⊥ . Then for any u ∈ U
Exercise 56 [Part of FHS 1990, A1, Qn 4]. Let V be the vector space of all n × n
real matrices with the usual addition and scalar multiplication. For A, B in V , let
hA, Bi = Trace(AB tr ), where B tr denotes the transpose of B .
(i) Show that this defines an inner product on V .
(ii) Let P be an invertible n×n matrix and let θ : V → V be the linear transformation
given by θ(A) = P −1 AP . Find the adjoint θ∗ of θ .
(iii) Prove that θ is self-adjoint if and only if P is either symmetric or skew-symmetric
(P is skew-symmetric if P tr = −P ).
Therefore λ ||v||2 = λ̄ ||v||2 and since ||v||2 > 0 we must have λ = λ̄, that is, λ is real.
Proof. We have T u = λ u, T v = µ v and λ ≠ µ. Therefore
\[
\lambda \langle u, v\rangle = \langle T u, v\rangle = \langle u, T v\rangle = \langle u, \mu v\rangle = \bar{\mu}\, \langle u, v\rangle .
\]
We have just seen that µ̄ = µ and so we see that λ⟨u, v⟩ = µ⟨u, v⟩. Since λ ≠ µ we
must have ⟨u, v⟩ = 0, as the theorem states.
Vi := {v ∈ V | T v = λi v},
Theorem.
(1) If A ∈ Mn×n (R) and Atr = A then there exists U ∈ O(n) and there exists
a diagonal matrix D ∈ Mn×n (R) such that U −1 A U = D . (Recall: U ∈ O(n)
means that U −1 = U tr .)
(2) If A ∈ Mn×n (C) and Ā tr = A then there exists U ∈ U(n) and there exists a
diagonal matrix D ∈ Mn×n (R) such that U −1 A U = D . (Recall: U ∈ U(n) means
that U −1 = Ū tr .)
The force of this theorem is that real symmetric matrices can be diagonalised by an
orthogonal change of basis—that is, by a rotation or a reflection (though in fact one can
always do it with a rotation). Similarly, a conjugate-symmetric complex matrix can be
diagonalised by a unitary transformation. This is simply the special case of the previous
theorem where V is Rn or Cn and T : v 7→ A v .
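Part (1) can be seen numerically: np.linalg.eigh returns an orthonormal set of eigenvectors of a real symmetric matrix, so the matrix U it produces is orthogonal and diagonalises A. The example matrix and the use of Python/numpy are my own, not part of the notes.

import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                                  # a real symmetric matrix
evals, U = np.linalg.eigh(A)                       # columns of U: orthonormal eigenvectors
assert np.allclose(U.T @ U, np.eye(4))             # U is orthogonal
assert np.allclose(U.T @ A @ U, np.diag(evals))    # U^{-1} A U is diagonal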
As preparation for our third version of the diagonalisability theorem, the so-called
Spectral Theorem, we need to examine self-adjoint projection operators.
Response. Definition of adjoint and proof that it is unique is bookwork done above.
To see that Ker(A) = Ker(A*A) note first that if u ∈ Ker(A) then certainly (A*A) u =
A*(A u) = 0, so Ker(A) ≤ Ker(A*A). On the other hand, if u ∈ Ker(A*A) then
\[
\langle A u, A u\rangle = \langle u, A^*A\, u\rangle = \langle u, 0\rangle = 0
\]
and so A u = 0. This shows that Ker(A*A) ≤ Ker(A) and so Ker(A*A) = Ker(A).
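As a numerical companion to this argument (my own sketch, in Python/numpy, with a randomly chosen singular matrix): since the two kernels agree, the ranks of A and of A*A agree as well.

import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A[:, 3] = A[:, 0]                        # make A singular, so the kernels are non-trivial
S = np.conj(A).T @ A                     # S = A* A, self-adjoint
assert np.linalg.matrix_rank(S) == np.linalg.matrix_rank(A)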
The definition of self-adjoint is bookwork given above. So now suppose that S is
self-adjoint. We know (and the examiner lets us quote) that there is a basis of V
with respect to which the matrix of S is partitioned as
\[
\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}
\]
where D is a diagonal matrix which in a self-explanatory notation may be written Diag(λ1, . . . , λr),
where λ1, . . . , λr are non-zero real numbers (not necessarily distinct). Thus in this
notation r(S) = r. With respect to that same basis the matrix of S² is
\[
\begin{pmatrix} D^2 & 0 \\ 0 & 0 \end{pmatrix}.
\]
Since D² = Diag(λ1², . . . , λr²) we see that tr(S²) = Σ λ_i². [In fact, this is true for any linear
transformation, as one sees from consideration of its triangular form.] Taking vectors
u := (λ1, . . . , λr)^tr ∈ R^r and v := (1, . . . , 1)^tr ∈ R^r and applying the Cauchy–Schwarz
Inequality in the form ⟨u, v⟩² ≤ ||u||² ||v||² to them, we see that (Σ λ_i)² ≤ r Σ λ_i², that
is, (tr(S))² ≤ r(S) tr(S²), as required.
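One can check the inequality (tr S)² ≤ r(S) tr(S²) on random examples; here is a sketch of such a check for a self-adjoint S of the form A*A, with Python/numpy and the particular matrix being my own choices.

import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
S = np.conj(A).T @ A                        # self-adjoint (indeed positive semi-definite)
r = np.linalg.matrix_rank(S)
lhs = np.trace(S).real ** 2                 # (tr S)^2
rhs = r * np.trace(S @ S).real              # r(S) tr(S^2)
assert lhs <= rhs + 1e-8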
For the last part let A : V → V be any linear transformation and define S := A∗ A.
Then S is self-adjoint (since S ∗ = A∗ A∗∗ = A∗ A = S ) and from the first part of
the question Ker S = Ker A. By the Rank-Nullity Theorem therefore r(S) = r(A).
Applying what has just been proved we see that (tr(A∗ A))2 6 r(A) tr((A∗ A)2 ), as
required.
Exercise 57 [FHS 1988, A1, Qn 3]. Let V be a finite-dimensional real inner prod-
uct space, T : V → V a linear transformation and T ∗ its adjoint. Prove the following:
Note. If char F ≠ 2 then we may replace each of a_ij, a_ji by ½(a_ij + a_ji). Thus we
may (and we always do) assume that a_ij = a_ji.
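In coordinates over R this Note says that replacing A by its symmetric part ½(A + A^tr) leaves the quadratic form x^tr A x unchanged; a quick numerical check (my own, in Python/numpy) is below.

import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 4))               # not symmetric in general
A_sym = (A + A.T) / 2                         # symmetrised coefficients
x = rng.standard_normal(4)
assert np.isclose(x @ A @ x, x @ A_sym @ x)   # same quadratic form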
Note. Then Σ a_ij x_i x_j = x^tr A x, where x is the column vector (x1, x2, . . . , xn)^tr
and A is the matrix (a_ij), which, it is worth emphasising, now is symmetric.
There is a close connection between quadratic and bilinear forms (which, recall, were
defined on p. 34).
Real quadratic forms, that is to say, quadratic forms over R, are of particular impor-
tance in most branches of mathematics, both pure and applied.
λ_i > 0 if 1 ≤ i ≤ p, λ_i < 0 if p + 1 ≤ i ≤ r, λ_i = 0 if r + 1 ≤ i ≤ n.
Define
\[
\mu_i :=
\begin{cases}
\lambda_i^{-1/2} & \text{if } 1 \le i \le p,\\
(-\lambda_i)^{-1/2} & \text{if } p + 1 \le i \le r,\\
1 & \text{if } r + 1 \le i \le n,
\end{cases}
\]
and E := Diag(µ1, . . . , µn). Clearly, E is invertible and
\[
E^{tr} D E =
\begin{pmatrix}
I_p & 0 & 0\\
0 & -I_q & 0\\
0 & 0 & 0
\end{pmatrix},
\]
Note 1. The parameter r in the theorem is called the rank of Q. It is the matrix
rank of A.
then
\[
Q(x_1, \ldots, x_n) = Y_1^2 + \cdots + Y_{p'}^2 - Y_{p'+1}^2 - \cdots - Y_{r'}^2
\]
for all (x1, . . . , xn)^tr ∈ R^n. Define q := r − p (as above) and q′ := r′ − p′. Call a
subspace U of R^n positive if Q(u) > 0 for all u ∈ U \ {0}, and call a subspace W
non-positive if Q(u) ≤ 0 for all u ∈ W \ {0}. Clearly, if U is a positive subspace and W
is a non-positive subspace then U ∩ W = {0}, so dim U + dim W ≤ n. Define subspaces
U1, U2, W1, W2 by
\[
\begin{aligned}
U_1 &:= \{u \in R^n \mid X_{p+1} = \cdots = X_n = 0\},\\
W_1 &:= \{u \in R^n \mid X_1 = \cdots = X_p = 0\},\\
U_2 &:= \{u \in R^n \mid Y_{p'+1} = \cdots = Y_n = 0\},\\
W_2 &:= \{u \in R^n \mid Y_1 = \cdots = Y_{p'} = 0\}.
\end{aligned}
\]
(Note that here Xi = 0 and Yj = 0 are to be construed as linear equations in the
coordinates x1 , . . . , xn of u.) Then U1 and U2 are positive subspaces of dimensions
p, p′ respectively and W1 , W2 are non-positive subspaces of dimensions n − p, n − p′
respectively. It follows that p + (n − p′) ≤ n so that p ≤ p′, and p′ + (n − p) ≤ n so that
p′ ≤ p. Therefore p = p′.
Note that a similar argument with negative and non-negative subspaces will prove
that q = q ′ , and therefore r = p + q = p′ + q ′ = r′ . But of course the fact that r = r′
also comes from the fact that this is the rank of the matrix A of the quadratic form.
Note 3. The invariance of rank and signature is known as Sylvester’s Law of Inertia.
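Sylvester's Law of Inertia is also easy to observe numerically: congruent real symmetric matrices P^tr A P (with P invertible) have the same numbers of positive and of negative eigenvalues. The sketch below, with random matrices and Python/numpy, is my own illustration and not part of the notes.

import numpy as np

rng = np.random.default_rng(9)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # symmetric matrix of a quadratic form
P = rng.standard_normal((5, 5))          # invertible with probability 1
B = P.T @ A @ P                          # congruent to A

def signature(S, tol=1e-10):
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

assert signature(A) == signature(B)      # same (p, q), hence same rank and signature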
In geometry and in mechanics we often need to study two real quadratic forms si-
multaneously. The following theorem has a number of applications, in particular to the
study of small vibrations of a mechanical system about a state of stable equilibrium.
det(xA − B) = 0
Proof. Let A and B be the real symmetric matrices associated with Q and R
respectively, so that
C^tr = U^tr B^tr (U^tr)^tr = U^tr B U = C,
and
R(x1, . . . , xn) = (P v)^tr B (P v) = v^tr (P^tr B P) v = a_1 X_1^2 + · · · + a_n X_n^2,
as required.
We have seen that U^tr A U = I and U^tr B U = D = Diag(a1, . . . , an). Clearly,
det(xI − D) = Π(x − a_i). Thus det(x U^tr A U − U^tr B U) = Π(x − a_i). Therefore
det(xA − B) = c Π(x − a_i) where c := (det U)^{−2}. So a1, . . . , an are the roots of the
equation det(xA − B) = 0 and this completes the proof of the theorem.
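Numerically, the conclusion can be tested as follows: the a_i are the eigenvalues of A^{-1}B (the roots of det(xA − B) = 0), and they also appear when B is reduced by the congruence coming from a Cholesky factorisation A = L L^tr. The matrices below and the use of Python/numpy are my own choices, offered only as a sketch.

import numpy as np

rng = np.random.default_rng(10)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)                  # positive definite (matrix of Q)
N = rng.standard_normal((4, 4))
B = (N + N.T) / 2                            # symmetric (matrix of R)

roots = np.sort(np.linalg.eigvals(np.linalg.solve(A, B)).real)   # roots of det(xA - B) = 0

L = np.linalg.cholesky(A)                    # A = L L^tr
Linv = np.linalg.inv(L)
C = Linv @ B @ Linv.T                        # congruent reduction of B, with U = (L^tr)^{-1}
a = np.sort(np.linalg.eigvalsh(C))           # the diagonal entries a_1, ..., a_n
assert np.allclose(roots, a)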
Exercise 59 [An old FHS question]. (i) Let A, B be real symmetric n×n matrices.
Suppose that all the eigenvalues of A are positive and let T be an invertible n×n matrix
such that T tr AT = In . [It is a theorem of the course that such a matrix exists]. Show
that T tr BT is symmetric and deduce that there exists an invertible n × n matrix S
such that both S tr AS = In and S tr BS is diagonal. Show also that the diagonal entries
of S tr BS are the roots of the equation det(xA − B) = 0.
Further exercises IV
Exercise 61. Express the following quadratic forms as sums or differences of
squares of linearly independent linear forms in x, y , z : x2 + 2xy + 2y 2 − 2yz − 3z 2 ;
xy + yz + xz . [Note: it is probably quicker to use the method of ‘completing the
square’ than to use the method given by the proof of the theorem on diagonalisation of
real quadratic forms.]
Exercise 64 [Cambridge Tripos Part IA, 1995]. Let V be a real inner product
space, and let ||v|| := ⟨v, v⟩^{1/2}. Prove that
\[
\|x - y\|\,\|z\| \le \|y - z\|\,\|x\| + \|z - x\|\,\|y\|
\]
for all x, y, z ∈ V. Hence show that ||y|| ||z|| ≤ ||z − x + y|| ||x|| + ||z − x|| ||x − y||.
Exercise 66. [FHS 1996, A1, Qn 3.] Let V be a finite-dimensional real inner
product space.
(a) If v ∈ V show that there is a unique element θv in the dual space V ′ of V such
that
θv (u) = hu, vi for all u ∈ V.
(b) Show that the map θ : V → V ′ given by θ(v) = θv (for v ∈ V ) is an isomorphism.
[You may assume that dim V = dim V ′ .]
(c) Let W be a subspace of V , and let W ⊥ be the orthogonal complement of W in
V . Show that θ(W ⊥ ) = W ◦ , where W ◦ is the annihilator of W in V ′ .
Now let V be the space of polynomials in x of degree at most 2, with real coefficients.
Define an inner product on V by setting
\[
\langle f, g\rangle = \int_0^1 f(x)\, g(x)\, dx
\]
for f, g ∈ V. You may assume that {1, √3 (1 − 2x), √5 (6x^2 − 6x + 1)} is an orthonormal
basis for V . Show that the map φ : V → R given by φ(f ) = f ′′ (0) defines a linear
functional on V (i.e. show that φ ∈ V ′ ), and find v ∈ V such that θv = φ.
0 0 0 1
vector v ∈ R^4 such that R^4 is spanned by {v, A v, A^2 v, . . . , A^j v, . . .}.