
Math 354

Notes

Charlie Cruz
Contents

Chapter 1 Vector Spaces Page 6


1.1 Fields 6
1.2 Vector Spaces 10
1.3 Subspaces 12

Chapter 2 Finite-Dimensional Vector Spaces Page 18


2.1 Span and linear independence 18
2.2 Basis 23
2.3 Dimension 26

Chapter 3 Linear Transformations Page 29


3.1 Linear Maps 29
3.2 Null spaces and Ranges 32
3.3 Matrix of a linear map 38
3.4 Invertible Linear Maps 43
3.5 Products and quotients of Vector Spaces 57

Chapter 4 Polynomials Page 63

Chapter 5 Eigenvalues, Eigenvectors, and Invariant Subspaces Page 69


5.1 Invariant subspaces (5.A + 5.B) 69
5.2 Eigenspaces 75

Chapter 6 Inner Product Spaces Page 79


6.1 Inner product spaces 79
6.2 Orthonormal bases 86

Chapter 7 Operators on Inner Product Spaces Page 92

Chapter 8 Operators on Complex Vector Spaces Page 93

Chapter 9 Operators on Real Vector Spaces Page 94

Chapter 10 Determinants and Traces Page 95


10.1 Determinants and Permutations 95

Information about the class: Math 354 Honors Linear Algebra.
Professor Varilly-Alvarado (Dr. V.); email: [email protected].
Office: HBH 222
Office Hours: Monday 3:30 – 5:00 pm, F 3-4PM (to be confirmed)
We will be using the book Linear Algebra Done Right by Sheldon Axler.

Notation Definition
ℕ The set of natural numbers = {1, 2, 3, 4, . . .}
ℤ The set of integers = {. . . , −2, −1, 0, 1, 2 . . .}
ℚ The set of rational numbers = {𝑎/𝑏 | 𝑎, 𝑏 ∈ ℤ, 𝑏 ≠ 0}
ℝ The set of real numbers
ℂ The set of complex numbers = {𝑎 + 𝑏𝑖 | 𝑎, 𝑏 ∈ ℝ, 𝑖 2 = −1}
∈ The symbol ∈ means “is an element of” or “belongs to”
∉ The symbol ∉ means “is not an element of” or “does not belong to”
⊂ The symbol ⊂ means “is a subset of”
⊆ The symbol ⊆ means “is a subset of or equal to”
∩ The symbol ∩ means “intersection of”
∪ The symbol ∪ means “union of”
\ The symbol \ means “set difference of”
∅ The symbol ∅ means “the empty set”
∀ The symbol ∀ means “for all”
∃ The symbol ∃ means “there exists”
| The symbol | in {𝑎 | 𝑎 ∈ ℝ} means “such that”
=⇒ The symbol =⇒ means “implies”
⇐⇒ The symbol ⇐⇒ means “if and only if”
𝑎® The symbol 𝑎® means “the vector 𝑎”

Mathematical Induction: Set of Natural Numbers


(a) ℕ = {1, 2, 3, 4, . . .} Natural Numbers
(b) ℤ = {. . . , −2, −1, 0, 1, 2, . . .} Integers
Mathematical Induction is a technique of proof that allows you to verify statements
indexed by ℕ or a subset of ℤ.

Example 0.0.1
For all 𝑛 ∈ ℕ, it is true that:

1 + 2 + 3 + . . . + 𝑛 = 𝑛(𝑛 + 1)/2

e.g., for 𝑛 = 1: 1 = (1 · 2)/2, and so on for 𝑛 = 2, 3, . . ..

Induction: 𝑃(𝑛) is a statement depending on 𝑛 ∈ ℕ.


e.g. ”1 + 2 + 3 + . . . + 𝑛 = 𝑛(𝑛 + 1)/2”
Now suppose that
(i) 𝑃(1) is true (base case)
(ii) If 𝑃(𝑘) is true for some 𝑘 ∈ ℕ, then 𝑃(𝑘 + 1) is true (inductive step)
Then 𝑃(𝑛) is true for all 𝑛 ∈ ℕ.

Note:-
Think about this as a domino effect.
Example: Let 𝑃(𝑛) : 1 + 2 + . . . + 𝑛 = 𝑛(𝑛 + 1)/2.

(i) 𝑃(1) is true because 1 = 1(1 + 1)/2.
(ii) Suppose 𝑃(𝑘) is true for some 𝑘 ∈ ℕ. Then

1 + 2 + . . . + 𝑘 + (𝑘 + 1) = 𝑘(𝑘 + 1)/2 + (𝑘 + 1)
= (𝑘(𝑘 + 1) + 2(𝑘 + 1))/2
= (𝑘 + 1)(𝑘 + 2)/2
= (𝑘 + 1)((𝑘 + 1) + 1)/2
= 𝑛(𝑛 + 1)/2   where 𝑛 = 𝑘 + 1

Therefore, 𝑃(𝑘 + 1) is true.
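A quick numerical sanity check of the formula (a small Python sketch, not part of the lecture; the induction argument above is the actual proof):

# Spot-check 1 + 2 + ... + n = n(n+1)/2 for small n.  This is only a finite
# check, not a proof.
for n in range(1, 101):
    lhs = sum(range(1, n + 1))     # 1 + 2 + ... + n
    rhs = n * (n + 1) // 2         # closed form
    assert lhs == rhs, (n, lhs, rhs)
print("formula holds for n = 1, ..., 100")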

Note:-
Baby Version Let 𝐴 ⊆ ℕ be a subset. Suppose that
(i) 1 ∈ 𝐴
(ii) If 𝑘 ∈ 𝐴, then 𝑘 + 1 ∈ 𝐴
Then 𝐴 = ℕ
Baby version =⇒ PMI (let 𝐴 = {𝑛 ∈ ℕ : 𝑃(𝑛) is true})

Example 0.0.2 (Same proof in two different ways)


Let 𝑝(𝑛) : 1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑛(𝑛 + 1)) = 𝑛/(𝑛 + 1) for all 𝑛 ∈ ℕ.

Proof 1.0: For 𝑛 ∈ ℕ, let 𝑝(𝑛) be the statement:


1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑛(𝑛 + 1)) = 𝑛/(𝑛 + 1).

We show that 𝑝(𝑛) is true for all 𝑛 ∈ ℕ by induction.
Base Case: 𝑝(1) is true because 1/(1 · 2) = 1/2 = 1/(1 + 1).

Inductive Step: Assume that 𝑝(𝑘) is true for some 𝑘 ∈ ℕ. Then

1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑘(𝑘 + 1)) = 𝑘/(𝑘 + 1)    (1)

We want to deduce that 𝑝(𝑘 + 1) is true, i.e.

1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑘(𝑘 + 1)) + 1/((𝑘 + 1)(𝑘 + 2)) = (𝑘 + 1)/(𝑘 + 2)    (2)

To do this, add 1/((𝑘 + 1)(𝑘 + 2)) to both sides of (1):

1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑘(𝑘 + 1)) + 1/((𝑘 + 1)(𝑘 + 2)) = 𝑘/(𝑘 + 1) + 1/((𝑘 + 1)(𝑘 + 2)).
The RHS of this equation is:

𝑘/(𝑘 + 1) + 1/((𝑘 + 1)(𝑘 + 2)) = (𝑘(𝑘 + 2) + 1)/((𝑘 + 1)(𝑘 + 2)) = (𝑘² + 2𝑘 + 1)/((𝑘 + 1)(𝑘 + 2)) = (𝑘 + 1)²/((𝑘 + 1)(𝑘 + 2)) = (𝑘 + 1)/(𝑘 + 2)

This shows that 𝑝(𝑘 + 1) is true, completing the induction step.
Therefore, 𝑝(𝑛) is true for all 𝑛 ∈ ℕ.

Proof 2.0: We prove the statement by induction on 𝑛.


The base case, when 𝑛 = 1, is true because 1/(1 · 2) = 1/2 = 1/(1 + 1).
For the inductive step, assume the claim is true for some 𝑘 ∈ ℕ:

1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑘(𝑘 + 1)) = 𝑘/(𝑘 + 1)

Add 1/((𝑘 + 1)(𝑘 + 2)) to both sides of the equation to obtain

1/(1 · 2) + 1/(2 · 3) + . . . + 1/(𝑘(𝑘 + 1)) + 1/((𝑘 + 1)(𝑘 + 2)) = 𝑘/(𝑘 + 1) + 1/((𝑘 + 1)(𝑘 + 2))
= (𝑘(𝑘 + 2) + 1)/((𝑘 + 1)(𝑘 + 2))
= (𝑘² + 2𝑘 + 1)/((𝑘 + 1)(𝑘 + 2))
= (𝑘 + 1)²/((𝑘 + 1)(𝑘 + 2))
= (𝑘 + 1)/(𝑘 + 2)
Then, the claim is true for 𝑘 + 1, completing the inductive step.
We deduce that the claim is true for all 𝑛 ∈ ℕ by induction.

Chapter 1

Vector Spaces

1.1 Fields
Definition 1.1.1: Fields or ”sets of scalars”

We have a lot of experience with this, in fact:

(i) ℚ = {𝑎/𝑏 : 𝑎, 𝑏 ∈ ℤ, 𝑏 ≠ 0}, the rational numbers

(ii) ℝ, the real numbers: ℚ ⊆ ℝ, e.g. √2 ∉ ℚ

(iii) ℂ, the complex numbers: 𝑖 2 + 1 = 0

A field is a set 𝐹 with two operations + and · such that:

(i) Associativity: for all 𝑎, 𝑏, 𝑐 ∈ 𝐹, 𝑎 + (𝑏 + 𝑐) = (𝑎 + 𝑏) + 𝑐 and 𝑎(𝑏𝑐) = (𝑎𝑏)𝑐

(ii) Commutativity: for all 𝑎, 𝑏 ∈ 𝐹, 𝑎 + 𝑏 = 𝑏 + 𝑎 and 𝑎𝑏 = 𝑏𝑎


(iii) Identity: there are elements 0 and 1 in 𝔽 such that for all 𝑎 ∈ 𝐹, we have: 𝑎 + 0 = 𝑎 and 𝑎 · 1 = 𝑎
(iv) Inverses: for all 𝑎 ∈ 𝐹, there is an element 𝑏 such that 𝑎+𝑏 = 0. We write 𝑏 = −𝑎 and 𝑥−𝑦 : = 𝑥+(−𝑦).
Additionally, for any 𝑎 ≠ 0 in 𝐹, there is an element 𝑏 ≠ 0 such that 𝑎𝑏 = 1. We write 𝑏 = 𝑎 −1 .

(v) Distributivity: for all 𝑎, 𝑏, 𝑐 ∈ 𝐹, we have 𝑎(𝑏 + 𝑐) = 𝑎𝑏 + 𝑎𝑐


(vi) 0 ≠ 1

Example 1.1.1
𝐹2 = {0, 1} is a field.
If we write the addition table:
+ 0 1
0 0 1
1 1 0

Now for the multiplication table:

· 0 1
0 0 0
1 0 1
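A small Python sketch (my own illustration, not from lecture): arithmetic mod 2 reproduces both tables above.

# Model the field F_2 = {0, 1}: addition and multiplication are taken mod 2.
def f2_add(a, b):
    return (a + b) % 2

def f2_mul(a, b):
    return (a * b) % 2

for a in (0, 1):
    for b in (0, 1):
        print(f"{a} + {b} = {f2_add(a, b)}   {a} * {b} = {f2_mul(a, b)}")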

Example 1.1.2
Let’s define 𝔽 = {4, ƒ , ◦}
With the addition and multiplication tables as follows:

+ 4 ƒ ◦
4 4 ƒ ◦
ƒ ƒ ◦ 4
◦ ◦ 4 ƒ

· 4 ƒ ◦
4 4 4 4
ƒ 4 ƒ ◦
◦ 4 ◦ ƒ

𝔽 is a field, with 0𝔽 = 4 and 1𝔽 = ƒ.
Now, ”2𝔽 ” = 1𝔽 + 1𝔽 = ƒ + ƒ = ◦.
Let’s also define, −ƒ = ◦.

Theorem 1.1.1 The additive identity of a field 𝔽 is unique.

Proof: Suppose 0𝐹 and 0′𝐹 are additive identities of 𝔽. Then:

0𝐹 = 0𝐹 + 0′𝐹 (because 0′𝐹 is an additive identity) = 0′𝐹 (because 0𝐹 is an additive identity)

Thus, the additive identity is unique.

Theorem 1.1.2
Let 𝔽 be a field and let 𝑎 ∈ 𝔽. Then 𝑎 · 0𝐹 = 0𝐹 .
Note:-
Don't think of zero as nothing; think of its meaning and how important it is to a field.

Proof: Let −𝑎 · 0𝐹 be the additive inverse of 𝑎 · 0𝐹 .


Then:

0𝐹 + 0𝐹 = 0𝐹 as 0𝐹 is an additive identity
𝑎 · (0𝐹 + 0𝐹 ) = 𝑎 · 0𝐹
𝑎 · 0𝐹 + 𝑎 · 0𝐹 = 𝑎 · 0𝐹 by distributivity
(𝑎 · 0𝐹 + 𝑎 · 0𝐹 ) + (−𝑎 · 0𝐹 ) = 𝑎 · 0𝐹 + (−𝑎 · 0𝐹 )
(𝑎 · 0𝐹 + 𝑎 · 0𝐹 ) + −𝑎 · 0𝐹 = 0𝐹 by additive inverse
𝑎 · 0𝐹 + (𝑎 · 0𝐹 + −𝑎 · 0𝐹 ) = 0𝐹 by associativity
𝑎 · 0𝐹 + 0𝐹 = 0𝐹 by additive inverse
𝑎 · 0𝐹 = 0𝐹 by additive identity

Theorem 1.1.3
Let 𝔽 be a field and let 𝑎 ∈ 𝔽. Then −(−𝑎) = 𝑎.

Known: Additive inverse are unique.

(−𝑎) + 𝑎 = 0𝐹
This says that 𝑎 is an additive inverse of −𝑎.
Additive inverses are unique, so 𝑎 = −(−𝑎).

Note:-
you can try this at home:
(−1𝐹 ) · (𝑎) = −𝑎
Where −1 is the additive inverse of 1, and −𝑎 is the additive inverse of 𝑎.
Hint: (−1) + 1 = 0 𝑓

Building ℂ out of ℝ : A complex number is an ordered pair (𝑎, 𝑏) of real numbers.

ℂ = {(𝑎, 𝑏) : 𝑎, 𝑏 ∈ ℝ}

(i) Addition: (𝑎, 𝑏) + (𝑐, 𝑑) : = (𝑎 + 𝑐, 𝑏 + 𝑑), where (𝑎, 𝑏) = 𝑧1 and (𝑐, 𝑑) = 𝑧2 . As such, we use ℝ addition to
define ℂ addition.

(ii) Multiplication: (𝑎, 𝑏) · (𝑐, 𝑑) : = (𝑎𝑐 − 𝑏𝑑, 𝑎𝑑 + 𝑏𝑐), where (𝑎, 𝑏) = 𝑧1 and (𝑐, 𝑑) = 𝑧 2
Note:-
You might want to think of (𝑎, 𝑏) as 𝑎 + 𝑏𝑖, where 𝑖 2 = −1. As such:

(𝑎 + 𝑏𝑖) · (𝑐 + 𝑑𝑖) = (𝑎𝑐 − 𝑏𝑑) + (𝑎𝑑 + 𝑏𝑐)𝑖

(iii) Additive identity: (0ℝ , 0ℝ ) = 0ℂ .


(iv) Multiplicative identity: (1ℝ , 0ℝ ) = 1ℂ .
Note:-
We can check that 𝑖 : = (0ℝ , 1ℝ ).
Now, (0ℝ , 1ℝ ) · (0ℝ , 1ℝ ) = (−1ℝ , 0ℝ ) = −(1ℝ , 0ℝ ) = −1ℂ .
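A short Python sketch (an illustration, not from lecture) of ℂ as ordered pairs of reals with the addition and multiplication rules defined above; it checks that 𝑖 · 𝑖 = −1ℂ with 𝑖 = (0, 1).

# Complex numbers as ordered pairs (a, b) of reals, using the rules above.
def c_add(z1, z2):
    a, b = z1
    c, d = z2
    return (a + c, b + d)

def c_mul(z1, z2):
    a, b = z1
    c, d = z2
    return (a * c - b * d, a * d + b * c)

i = (0.0, 1.0)
print(c_mul(i, i))             # (-1.0, 0.0), i.e. -1_C
print(c_mul((1, 2), (3, 4)))   # (1+2i)(3+4i) = -5 + 10i  ->  (-5, 10)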

Definition 1.1.2: Lists / tuples

Let 𝐹 be a field (think: 𝐹2 , 𝐹3 , ℚ, ℝ, ℂ ).


A list of length 𝑛 is an ordered collection 𝑥1 , . . . , 𝑥 𝑛 with each 𝑥 𝑖 ∈ 𝔽.
Remember order matters! Note (2, 3) ≠ (3, 2)
Let’s define: 𝔽𝑛 = {(𝑥1 , . . . , 𝑥 𝑛 ) : 𝑥 𝑖 ∈ 𝔽, 𝑖 = 1, . . . , 𝑛}. For instance ℝ2 = {(𝑥, 𝑦) : 𝑥, 𝑦 ∈ ℝ}
Sometimes we write 𝑥 or 𝑥® ∈ 𝔽𝑛 for (𝑥 1 , . . . , 𝑥 𝑛 ).
𝐹 𝑛 is the archetype of a ”finite-dimensional vector space”.
This means that the following properties hold:

(i) 𝑥® + 𝑦® = (𝑥1 , . . . , 𝑥 𝑛 ) + (𝑦1 , . . . , 𝑦𝑛 ) ≔ (𝑥 1 + 𝑦1 , 𝑥 2 + 𝑦2 , . . . , 𝑥 𝑛 + 𝑦𝑛 ). Note that we are using the


addition of 𝔽 to define the addition of 𝔽𝑛 .
(ii) Addition has a neutral element: 0® = (0𝔽 , . . . , 0𝔽 ) (𝑛 times). Thus:

𝑥® + 0® = (𝑥 1 , . . . , 𝑥 𝑛 ) + (0𝔽 , . . . , 0𝔽 )
= (𝑥 1 + 0𝔽 , . . . , 𝑥 𝑛 + 0𝔽 )
= (𝑥 1 , . . . , 𝑥 𝑛 )
= 𝑥®

There are additive inverses: if 𝑥® = (𝑥1 , . . . , 𝑥 𝑛 ) then setting − 𝑥® = (−𝑥1 , . . . , −𝑥 𝑛 ) we get

𝑥® + (− 𝑥®) = 0®

(iii) Elements of 𝔽𝑛 can be scaled by elements of 𝔽. If 𝑥® = (𝑥1 , . . . , 𝑥 𝑛 ) and 𝜆 ∈ 𝔽, then 𝜆 · 𝑥® =


(𝜆 · 𝑥1 , . . . , 𝜆 · 𝑥 𝑛 ).

Note:-
Warning! In general, elements of 𝔽𝑛 cannot be multiplied with each other unless we define a multipli-
cation operation on 𝔽𝑛 .

1.2 Vector Spaces
Definition 1.2.1: Vector Space in general

Let 𝐹 be a field, where 𝐹 = (𝐹, +𝐹 , ·𝐹 ).


A vector space over 𝔽 is a set 𝑉 together with two operations:
Define Addition of Vectors as

+ : 𝑉 × 𝑉 ↦→ 𝑉 , (𝑢, 𝑣) ↦→ 𝑢 + 𝑣
And scalar multiplication as

· : 𝔽 × 𝑉 ↦→ 𝑉 , (𝜆, 𝑣) ↦→ 𝜆 · 𝑣
These operations satisfy the following properties:

(i) Commutativity: 𝑢 + 𝑣 = 𝑣 + 𝑢 for all 𝑢, 𝑣 ∈ 𝑉


(ii) Associativity of addition: (𝑢 + 𝑣) + 𝑤 = 𝑢 + (𝑣 + 𝑤) for all 𝑢, 𝑣, 𝑤 ∈ 𝑉. Also:

(𝜆1 ·𝔽 𝜆2 ) ·𝑉 𝑣 = 𝜆1 ·𝑉 (𝜆2 ·𝑉 𝑣) for all 𝜆1 , 𝜆2 ∈ 𝔽 and 𝑣 ∈ 𝑉

(iii) Additive identity: there is a vector 0𝑉 ∈ 𝑉 such that 0𝑉 + 𝑣 = 𝑣 for all 𝑣 ∈ 𝑉.


(iv) Additive inverse: for every 𝑣 ∈ 𝑉, there is a vector −𝑣 ∈ 𝑉 such that 𝑣 + (−𝑣) = 0𝑉 .
(v) Scalar Multiplicative identity: 1𝔽 · 𝑣 = 𝑣 for all 𝑣 ∈ 𝑉.

(vi) Distributivity: 𝜆 · (𝑢 + 𝑣) = 𝜆 · 𝑢 + 𝜆 · 𝑣 for all 𝜆 ∈ 𝔽 and 𝑢, 𝑣 ∈ 𝑉. Also, (𝜆1 +𝔽 𝜆2 ) · 𝑣 = 𝜆1 · 𝑣 + 𝜆2 · 𝑣 for all 𝜆1 , 𝜆2 ∈ 𝔽 and 𝑣 ∈ 𝑉.

Note:-
If 𝔽 = ℝ, call 𝑉 a real vector space.
If 𝔽 = ℂ, call 𝑉 a complex vector space.

Summary: To specify a vector space, we need 4 pieces of data:

(a) 𝑉 - the set of vectors

(b) 𝔽 - the ”numbers that we can scale by”


(c) +𝑉 - addition of vectors
(d) ·𝑉 - scalar multiplication

For a while, we will write (𝑉 , 𝔽, +, ·) for all this data.


In fact, (𝑉 , 𝔽, +, ·) = (𝑉 , (𝔽, +𝔽 , ·𝔽 ), +𝑉 , ·𝑉 ).
For instance. Take a field 𝔽, (𝔽𝑛 , 𝔽, +, ·).
Now given 𝑥®, 𝑦® ∈ 𝔽𝑛 , 𝜆 ∈ 𝔽. We can define:

𝑥® + 𝑦® = (𝑥 1 , . . . , 𝑥 𝑛 ) + (𝑦1 , . . . , 𝑦𝑛 )
= (𝑥 1 + 𝑦1 , . . . , 𝑥 𝑛 + 𝑦𝑛 )
𝜆 · 𝑥® = 𝜆 · (𝑥1 , . . . , 𝑥 𝑛 )
= (𝜆 · 𝑥1 , . . . , 𝜆 · 𝑥 𝑛 )

Example 1.2.1

(i) (a) (𝑉 , 𝔽, +, ·) = (ℝ2 , ℝ, +, ·) is a real vector space.
(b) (𝑥1 , 𝑦1 ) +ℝ2 (𝑥2 , 𝑦2 ) = (𝑥1 +ℝ 𝑥2 , 𝑦1 +ℝ 𝑦2 ).
(c) For 𝜆 ∈ ℝ, 𝜆 ·ℝ2 (𝑥, 𝑦) = (𝜆 ·ℝ 𝑥, 𝜆 ·ℝ 𝑦).

(ii) (a) (𝑉 , 𝔽, +, ·) = (ℂ2 , ℂ, +ℂ2 , ·ℂ2 ) is a complex vector space.
(b) (𝑧1 , 𝑧2 ) +ℂ2 (𝑧1′, 𝑧2′) = (𝑧1 +ℂ 𝑧1′, 𝑧2 +ℂ 𝑧2′).
(c) For 𝜆 ∈ ℂ, 𝜆 ·ℂ2 (𝑧1 , 𝑧2 ) = (𝜆 ·ℂ 𝑧1 , 𝜆 ·ℂ 𝑧2 ).

(iii) (a) (𝑉 , 𝔽, +, ·) = (ℂ2 , ℝ, +ℂ2 , ·) is a vector space over ℝ: the vectors are pairs of complex numbers, but we use real numbers to scale.
(b) Addition is the same as in (ii), with 𝑧1 , 𝑧2 ∈ ℂ, i.e. of the form 𝑎 + 𝑏𝑖.
(c) Scalar multiplication: for 𝜆 ∈ ℝ, 𝜆 · (𝑧1 , 𝑧2 ) = (𝜆 ·ℝ 𝑧1 , 𝜆 ·ℝ 𝑧2 ).

(iv) (𝑉 , 𝔽, +, ·) = (𝔽𝑛 , 𝔽, +, ·) is a vector space.


(v) Let 𝐹 be a field, and 𝑆 be a set. Let 𝑉 = 𝔽𝑠 ≔ {functions 𝑓 : 𝑆 ↦→ 𝔽}.

(i) Addition: 𝑓 , 𝑔 ∈ 𝑉, 𝑓 : 𝑆 ↦→ 𝐹, and 𝑔 : 𝑆 ↦→ 𝐹. Then ( 𝑓 + 𝑔) : 𝑆 ↦→ 𝐹 is defined by ( 𝑓 + 𝑔)(𝑠) =


𝑓 (𝑠) + 𝑔(𝑠) for all 𝑠 ∈ 𝑆. Or 𝑠 ↦→ 𝑓 (𝑠) + 𝑔(𝑠)
(ii) Scalar Multiplication: 𝜆 ∈ 𝐹, 𝑓 ∈ 𝑉, 𝑓 : 𝑆 → 𝐹. We need to show that 𝜆 𝑓 ∈ 𝑉, i.e. 𝜆 𝑓 : 𝑆 → 𝐹,
where 𝑠 ↦→ 𝜆 𝑓 (𝑠). That is, (𝜆 𝑓 )(𝑠) = 𝜆 𝑓 (𝑠) for all 𝑠 ∈ 𝑆.
(iii) Additive identity: 0®𝑉 : 𝑆 → 𝔽, 𝑠 → 0𝐹 .
Check: ( 𝑓 + 0®𝑉 )(𝑠) = 𝑓 (𝑠) + 0®𝑉 (𝑠) = 𝑓 (𝑠) + 0𝐹 = 𝑓 (𝑠) for all 𝑠 ∈ 𝑆.

Now, we want to talk about its relationship to 𝔽𝑛 . Take 𝑆 = {1, 2, . . . , 𝑛}


Now, let 𝑉 = 𝔽{1,...,𝑛} = {functions 𝑓 : {1, . . . , 𝑛} ↦→ 𝔽}.
We can create a function 𝐹 {1,...,𝑛} ↦→ 𝔽𝑛 by:

𝑓 : {1, . . . , 𝑛} ↦→ 𝔽 → ( 𝑓 (1), . . . , 𝑓 (𝑛))

Note:-
This is a bijection!

Example 1.2.2
Let 𝑆 = [0, 1] and 𝐹 = ℝ.
Now set 𝑉 = 𝔽𝑠 = ℝ[0,1] = {functions 𝑓 : [0, 1] → ℝ}.
Another Example.
Let 𝔽 = ℝ.
Now 𝑉 = {polynomials of degree ≤ 19 with coefficients in ℝ}.
Now, +𝑉 = usual addition of polynomials and ·𝑉 = usual scalar multiplication of polynomials.
For instance,

𝑥 19 + 𝑥 + 1 ∈ 𝑉
−𝑥 19 + 𝑥 17 − 𝑥 2 ∈ 𝑉
9 ∗ (𝑥 2 + 2) = 9𝑥 2 + 18 ∈ 𝑉

Note:-
Sometimes we denote that the degree of 0 (the zero polynomial) is −∞.

First properties of vector spaces: Let 𝑉 be a vector space over a field 𝔽.

(i) Additive identities are unique. Suppose 0® , 0®′ ∈ 𝑉 are additive identities. Then 0® = 0® + 0®′ = 0®′.
(ii) Additive inverses are unique:
Say 𝑤, 𝑤′ are additive inverses of 𝑣 ∈ 𝑉. Then
𝑤 = 𝑤 + 0® = 𝑤 + (𝑣 + 𝑤′) = (𝑤 + 𝑣) + 𝑤′ = 0® + 𝑤′ = 𝑤′.
(iii) 0 · 𝑣 = 0® , ∀𝑣 ∈ 𝑉.

0𝐹 = 0𝐹 + 0𝐹
=⇒ 0𝐹 · 𝑣 = (0𝐹 + 0𝐹 ) · 𝑣
=⇒ 0𝐹 · 𝑣 = 0𝐹 · 𝑣 + 0𝐹 · 𝑣
0𝐹 · 𝑣 + (−0𝐹 · 𝑣) = (0𝐹 · 𝑣 + 0𝐹 · 𝑣) + (−0𝐹 · 𝑣)
0® = 0𝐹 · 𝑣 + (−0𝐹 · 𝑣 + 0𝐹 · 𝑣)
0® = 0𝐹 · 𝑣 + 0®
0® = 0𝐹 · 𝑣

1.3 Subspaces
Definition 1.3.1: Subspaces

Let (𝑉 , 𝔽, +, ·) be a vector space. A subset 𝑈 ⊆ 𝑉 is a subspace if (𝑈 , 𝔽, +|𝑈 , ·|𝑈 ) — that is, 𝑈 with the addition and scalar multiplication restricted from 𝑉 — is a vector space in its own right.

Example 1.3.1
Let 𝑉 = ℝ3 and 𝔽 = ℝ.
𝑈 = {(𝑥1 , 𝑥 2 , 0) : 𝑥 1 , 𝑥 2 ∈ ℝ} ⊆ ℝ3 = 𝑉

1.34 in book / conditions for a subspace: To check that 𝑈 ⊆ 𝑉 is a subspace, it is enough to check:

(i) 0® ∈ 𝑈

(ii) 𝑈 is closed under addition: if 𝑢, 𝑣 ∈ 𝑈 then 𝑢 + 𝑣 ∈ 𝑈


(iii) 𝑈 is closed under scalar multiplication: if 𝑢 ∈ 𝑈 and 𝜆 ∈ 𝔽 then 𝜆 · 𝑢 ∈ 𝑈

Reason: These three conditions ensure that 𝑈 has an additive identity vector, and that addition and scalar
multiplication make sense in 𝑈.
The remaining axioms for 𝑈 to be a vector space are inherited from 𝑉.

Example 1.3.2
Let’s check associativity of addition: Let 𝑢, 𝑣, 𝑤 ∈ 𝑈.
But we know that 𝑢, 𝑣, 𝑤 ∈ 𝑉 as 𝑈 ⊆ 𝑉, so 𝑢 + (𝑣 + 𝑤) = (𝑢 + 𝑣) + 𝑤(★) in 𝑉.
Since 𝑈 is closed under addition, 𝑢 + 𝑣 ∈ 𝑈.
Again, since 𝑢 + 𝑣 ∈ 𝑈, and 𝑤 ∈ 𝑈, we know that (𝑢 + 𝑣) + 𝑤 ∈ 𝑈.
Likewise, 𝑢 + (𝑣 + 𝑤) ∈ 𝑈. This means that (★) is also true in 𝑈.
Ditto for the other axioms. Thus, we would be proving the same thing twice.

Example 1.3.3 (Charlie add the graphs)
𝑉 = ℝ2

(i) 𝑈 = {(𝑎, 𝑎) : 𝑎 ≥ 0} is not closed under scalar multiplication.

(ii) 𝑈 = {(𝑎, 𝑎) : 𝑎 ∈ ℝ} ∪ {(−𝑎, 𝑎) : 𝑎 ∈ ℝ} is not closed under addition.


(iii) 𝑈 = {(𝑎, 𝑎 + 1) : 𝑎 ∈ ℝ} does not contain the additive identity of ℝ2

Example 1.3.4
Let 𝔽 = ℝ and 𝑉 = ℝ(0,3) = {functions 𝑓 : (0, 3) → ℝ}.
Let 𝑈 ⊆ 𝑉 be the subset {functions 𝑓 : (0, 3) → ℝ | 𝑓 is differentiable and 𝑓 ′(2) = 0}.
Proof: Let's check that 𝑈 ⊆ 𝑉 is a subspace:

(i) Show that 0®𝑉 ∈ 𝑈: 0®𝑉 is the function 0®𝑉 : (0, 3) → ℝ, 𝑥 ↦→ 0ℝ .
0®𝑉 is differentiable and 0®𝑉 ′(2) = 0.
(ii) Show that 𝑈 is closed under addition: Let 𝑓 , 𝑔 ∈ 𝑈. We need to show that 𝑓 + 𝑔 ∈ 𝑈,
i.e. that 𝑓 + 𝑔 : (0, 3) → ℝ is differentiable and ( 𝑓 + 𝑔)′(2) = 0.
𝑓 + 𝑔 is differentiable as both 𝑓 and 𝑔 are differentiable.
Moreover, ( 𝑓 + 𝑔)′(2) = 𝑓 ′(2) + 𝑔′(2) = 0 + 0 = 0.
Thus, 𝑓 + 𝑔 ∈ 𝑈.
(iii) Show that 𝑈 is closed under scalar multiplication: Let 𝑓 ∈ 𝑈 and 𝜆 ∈ ℝ. We need to show that
𝜆 · 𝑓 ∈ 𝑈, i.e. that 𝜆 · 𝑓 : (0, 3) → ℝ is differentiable and (𝜆 · 𝑓 )′(2) = 0.
𝜆 · 𝑓 is differentiable as 𝑓 is differentiable.
Moreover, (𝜆 · 𝑓 )′(2) = 𝜆 · 𝑓 ′(2) = 𝜆 · 0 = 0.
Thus, 𝜆 · 𝑓 ∈ 𝑈.

All three conditions are satisfied, so 𝑈 ⊆ 𝑉 is a subspace.

Definition 1.3.2: Sums of Subsets

Let (𝑉 , 𝔽, +, ·) be a vector space over a field 𝔽.
Let 𝑈1 , . . . , 𝑈𝑚 be subsets of 𝑉. Their sum is

𝑈1 + . . . + 𝑈𝑚 = {𝑢1 + 𝑢2 + . . . + 𝑢𝑚 : 𝑢 𝑖 ∈ 𝑈 𝑖 for all 𝑖 = 1, . . . , 𝑚}

Example 1.3.5
Our field will be 𝔽 = ℝ, and our vector space will be 𝑉 = ℝ3 .
Let 𝑈1 = {(𝑥, 0, 0) : 𝑥 ∈ ℝ} , 𝑈2 = {(0, 𝑦, 0) : 𝑦 ∈ ℝ}.
Then 𝑈1 + 𝑈2 = {𝑢1 + 𝑢2 : 𝑢1 ∈ 𝑈1 , 𝑢2 ∈ 𝑈2 }.
This means that this is equal to {(𝑥, 0, 0) + (0, 𝑦, 0) : 𝑥, 𝑦 ∈ ℝ} = {(𝑥, 𝑦, 0) : 𝑥, 𝑦 ∈ ℝ}.

Theorem 1.3.1
If 𝑈1 , . . . , 𝑈𝑚 are subspaces of 𝑉, then 𝑈1 + . . . + 𝑈𝑚 is the smallest subspace of 𝑉 containing 𝑈1 , . . . , 𝑈𝑚 .
Have to prove that:

(i) 𝑈1 + . . . + 𝑈𝑚 is a subspace of 𝑉 (not just a subset).

(ii) 𝑈1 ⊆ 𝑈1 + . . . + 𝑈𝑚 , 𝑈2 ⊆ 𝑈1 + . . . + 𝑈𝑚 , . . . , 𝑈𝑚 ⊆ 𝑈1 + . . . + 𝑈𝑚 .
(iii) 𝑈1 + . . . + 𝑈𝑚 is the smallest subspace of 𝑉 containing 𝑈1 , . . . , 𝑈𝑚 .

Proof: We are given that each 𝑈 𝑖 is a subspace, meaning that 0® ∈ 𝑈 𝑖 , so

0® = 0® + . . . + 0® ∈ 𝑈1 + . . . + 𝑈𝑚 (taking the 𝑖th summand from 𝑈 𝑖 ).
Thus, we have shown that the additive identity is in 𝑈1 + . . . + 𝑈𝑚 .
Now, we want to show that this sum is closed under addition.
Let 𝑣® , 𝑤® ∈ 𝑈1 + . . . + 𝑈𝑚 . We need to show that 𝑣® + 𝑤® ∈ 𝑈1 + . . . + 𝑈𝑚 .
Write 𝑣® = 𝑣®1 + . . . + 𝑣®𝑚 and 𝑤® = 𝑤®1 + . . . + 𝑤®𝑚 with 𝑣®𝑖 , 𝑤®𝑖 ∈ 𝑈 𝑖 .
As such,

𝑤® + 𝑣® = (𝑤®1 + . . . + 𝑤®𝑚 ) + (𝑣®1 + . . . + 𝑣®𝑚 )
= (𝑤®1 + 𝑣®1 ) + . . . + (𝑤®𝑚 + 𝑣®𝑚 )
∈ 𝑈1 + . . . + 𝑈𝑚

since each 𝑈 𝑖 is closed under addition.


Now, we want to show that 𝑈1 + . . . + 𝑈𝑚 is closed under scalar multiplication.
Let 𝜆 ∈ 𝔽 and 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 . We need to show that 𝜆 · 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 .
Then,

𝜆 · 𝑣® = 𝜆 · (𝑣®1 + . . . + 𝑣®𝑚 )
= 𝜆 · 𝑣®1 + . . . + 𝜆 · 𝑣®𝑚
∈ 𝑈1 + . . . + 𝑈𝑚

since each 𝑈 𝑖 is closed under scalar multiplication.


Thus, 𝑈1 + . . . + 𝑈𝑚 is a subspace of 𝑉.
Now we need to show that each 𝑈 𝑖 is contained in 𝑈1 + . . . + 𝑈𝑚 .
Let 𝑢 ∈ 𝑈 𝑖 ; we want to show that 𝑢 ∈ 𝑈1 + . . . + 𝑈𝑚 .
Write 𝑢 = 0® + . . . + 0® + 𝑢 + 0® + . . . + 0®, taking the 𝑗th summand from 𝑈 𝑗 (only the 𝑖th summand, namely 𝑢, is nonzero).
This shows that 𝑢 ∈ 𝑈1 + . . . + 𝑈𝑚 .
Finally, we want to prove that 𝑈1 + . . . + 𝑈𝑚 is the smallest subspace of 𝑉 containing 𝑈1 , . . . , 𝑈𝑚 .
Let 𝑋 be a subspace of 𝑉 such that 𝑈 𝑖 ⊆ 𝑋 for all 𝑖 = 1, . . . , 𝑚.
We want to show that 𝑈1 + . . . + 𝑈𝑚 ⊆ 𝑋.
Let 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 , so 𝑣® = 𝑣®1 + . . . + 𝑣®𝑚 where 𝑣®𝑖 ∈ 𝑈 𝑖 for all 𝑖 = 1, . . . , 𝑚.
Since 𝑈 𝑖 ⊆ 𝑋 for all 𝑖 = 1, . . . , 𝑚, we know that 𝑣®𝑖 ∈ 𝑋 for all 𝑖 = 1, . . . , 𝑚.
Thus, 𝑣® = 𝑣®1 + . . . + 𝑣®𝑚 ∈ 𝑋 since 𝑋 is closed under vector addition.

Definition 1.3.3: Direct sum

The sum 𝑈1 + . . . + 𝑈𝑚 is a direct sum if

for each 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 , there is exactly one way to write 𝑣® = 𝑢®1 + . . . + 𝑢®𝑚 with 𝑢®𝑖 ∈ 𝑈 𝑖 .

Example 1.3.6
Let 𝑈1 = {(𝑥, 𝑦, 0) : 𝑥, 𝑦 ∈ ℝ}, 𝑈2 = {(0, 0, 𝑧) : 𝑧 ∈ ℝ}.

Claim: Then 𝑈1 + 𝑈2 is a direct sum.


Let’s prove our claim
Proof: Let 𝑣® ∈ 𝑈1 + 𝑈2 . We know 𝑣® = 𝑢®1 + 𝑢®2 for some 𝑢®1 ∈ 𝑈1 , 𝑢®2 ∈ 𝑈2 .
We want to show: if also 𝑣® = 𝑢®1′ + 𝑢®2′ for some 𝑢®1′ ∈ 𝑈1 , 𝑢®2′ ∈ 𝑈2 ,
then 𝑢®1 = 𝑢®1′ and 𝑢®2 = 𝑢®2′.
Now we know that 𝑢®1 = (𝑥, 𝑦, 0) and 𝑢®2 = (0, 0, 𝑧), and
the same for our primes (i.e. 𝑢®1′ = (𝑥′, 𝑦′, 0) and 𝑢®2′ = (0, 0, 𝑧′)) for some 𝑥, 𝑦, 𝑧, 𝑥′, 𝑦′, 𝑧′ ∈ ℝ.

𝑣® = 𝑢®1 + 𝑢®2 = 𝑢®1′ + 𝑢®2′
(𝑥, 𝑦, 0) + (0, 0, 𝑧) = (𝑥′, 𝑦′, 0) + (0, 0, 𝑧′)
=⇒ (𝑥, 𝑦, 𝑧) = (𝑥′, 𝑦′, 𝑧′)
=⇒ 𝑢®1 = (𝑥, 𝑦, 0) = (𝑥′, 𝑦′, 0) = 𝑢®1′
=⇒ 𝑢®2 = (0, 0, 𝑧) = (0, 0, 𝑧′) = 𝑢®2′

Non-example: Let 𝑈3 = {(0, 𝑦, 𝑦) : 𝑦 ∈ ℝ}.


Claim: Then 𝑈1 + 𝑈2 + 𝑈3 is not a direct sum.

Proof: The zero vector can be written in more than one way as a sum 𝑢®1 + 𝑢®2 + 𝑢®3 with 𝑢®𝑖 ∈ 𝑈 𝑖 :

0® = 0® + 0® + 0® = (0, −1, 0) + (0, 0, −1) + (0, 1, 1)

Theorem 1.3.2
𝑈1 + . . . + 𝑈𝑚 is a direct sum if and only if

the only way to write 0® as a sum 𝑢®1 + . . . + 𝑢®𝑚 with 𝑢®𝑖 ∈ 𝑈 𝑖 is to take every 𝑢®𝑖 = 0®.


We need to prove it both ways.
Proof of =⇒ : If 𝑈1 + . . . + 𝑈𝑚 is direct, then every vector 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 can be written uniquely
as a sum of vectors from 𝑈1 , . . . , 𝑈𝑚 .
In particular, if 𝑣® = 0®, then we can only write 0® in one way, namely 0® = 0® + . . . + 0® with the 𝑖th summand in 𝑈 𝑖 .
And we are done.

Proof of ⇐= : Suppose 0® can only be written in one way as

0® = 𝑢®1 + . . . + 𝑢®𝑚 with 𝑢®𝑖 ∈ 𝑈 𝑖 , namely with every 𝑢®𝑖 = 0®.
Let 𝑣® ∈ 𝑈1 + . . . + 𝑈𝑚 be arbitrary, and suppose

𝑣® = 𝑢®1 + . . . + 𝑢®𝑚 = 𝑢®1 ‘ + . . . + 𝑢®𝑚 ‘


We want to show that 𝑢®𝑖 = 𝑢®𝑖 ‘ for all 𝑖 = 1, . . . , 𝑚.

Then,

0® = 𝑣® − 𝑣® = (𝑢®1 + . . . + 𝑢®𝑚 ) − (𝑢®1′ + . . . + 𝑢®𝑚′)
=⇒ 0® = (𝑢®1 − 𝑢®1′) + . . . + (𝑢®𝑚 − 𝑢®𝑚′), where 𝑢®𝑖 − 𝑢®𝑖′ ∈ 𝑈 𝑖
=⇒ 𝑢®1 − 𝑢®1′ = 0®, . . . , 𝑢®𝑚 − 𝑢®𝑚′ = 0®

And we are done.

Thus, we have shown that 𝑈1 + . . . + 𝑈𝑚 is a direct sum if and only if the only way
to write 0® as a sum of vectors from 𝑈1 , . . . , 𝑈𝑚 is 0® = 0® + . . . + 0®.

Alternative proof that 𝑈1 + 𝑈2 (from our example) is direct, using the criterion from our previous theorem.

Alternative Proof: Let 𝑈1 = {(𝑥, 𝑦, 0) : 𝑥, 𝑦 ∈ ℝ} and 𝑈2 = {(0, 0, 𝑧) : 𝑧 ∈ ℝ}.


If 0® = 𝑢®1 + 𝑢®2 for some 𝑢®1 ∈ 𝑈1 , 𝑢®2 ∈ 𝑈2 , then

=⇒ 0® = 𝑢®1 + 𝑢®2 = (𝑥, 𝑦, 0) + (0, 0, 𝑧) = (𝑥, 𝑦, 𝑧)


=⇒ 𝑥 = 𝑦 = 𝑧 = 0
=⇒ 𝑢®1 = (𝑥, 𝑦, 0) = (0, 0, 0), 𝑢®2 = (0, 0, 𝑧) = (0, 0, 0)

Theorem 1.3.3
If 𝑈1 , 𝑈2 are subspaces of a vector space 𝑉, then

(𝑈1 + 𝑈2 is direct) ⇐⇒ (𝑈1 ∩ 𝑈2 = {0®})

Proof of =⇒ : Suppose 𝑈1 + 𝑈2 is direct; we want to show that 𝑈1 ∩ 𝑈2 = {0®}.
In other words, we need to prove subset inclusion in both directions.
⊇ : We have {0®} ⊆ 𝑈1 ∩ 𝑈2 since 0® ∈ 𝑈1 and 0® ∈ 𝑈2 .

⊆ : Let 𝑣® ∈ 𝑈1 ∩ 𝑈2 . We want to show that 𝑣® = 0®.

=⇒ 𝑣® ∈ 𝑈1 and 𝑣® ∈ 𝑈2
=⇒ −𝑣® ∈ 𝑈1 and −𝑣® ∈ 𝑈2 , as they are closed under scalar multiplication
=⇒ 0® = 𝑣® + (−𝑣®) with 𝑣® ∈ 𝑈1 and −𝑣® ∈ 𝑈2 , so by our previous theorem
=⇒ 𝑣® = 0® , −𝑣® = 0®

Thus, 𝑈1 ∩ 𝑈2 = {0®}.

Proof of ⇐= : Suppose 𝑈1 ∩ 𝑈2 = {0®}; we want to show that 𝑈1 + 𝑈2 is direct.
Suppose 0® = 𝑢®1 + 𝑢®2 for some 𝑢®1 ∈ 𝑈1 , 𝑢®2 ∈ 𝑈2 .
We want to show that 𝑢®1 = 𝑢®2 = 0®.

0® = 𝑢®1 + 𝑢®2 =⇒ 𝑢®1 = −𝑢®2 =⇒ 𝑢®1 ∈ 𝑈1 and 𝑢®1 ∈ 𝑈2 ,
so 𝑢®1 ∈ 𝑈1 ∩ 𝑈2 = {0®}
=⇒ 𝑢®1 = 0® =⇒ 𝑢®2 = −𝑢®1 = −0® = 0®

By our previous theorem, 𝑈1 + 𝑈2 is direct because we can only write 0® in one way as a sum of vectors
from 𝑈1 and 𝑈2 .
Thus, we have shown that 𝑈1 + 𝑈2 is direct if and only if 𝑈1 ∩ 𝑈2 = {0®}.

Third proof: Let 𝑣® ∈ 𝑈1 ∩ 𝑈2 , so 𝑣® = (𝑥, 𝑦, 0) = (0, 0, 𝑧).

Then 𝑥 = 𝑦 = 𝑧 = 0, so 𝑣® = 0®.
This means that 𝑈1 ∩ 𝑈2 = {0®}.

Chapter 2

Finite-Dimensional Vector Spaces

2.1 Span and linear independence


Definition 2.1.1

Let (𝑉 , 𝐹, +, ·) be a vector space.


A linear combination of 𝑣®1 , 𝑣®2 , . . . , 𝑣®𝑚 ∈ 𝑉 is a vector of the form:

𝛼1 · 𝑣®1 + 𝛼2 · 𝑣®2 + · · · + 𝛼𝑚 · 𝑣®𝑚 for some 𝛼1 , 𝛼 2 , . . . , 𝛼 𝑚 ∈ 𝐹

Example 2.1.1
Let 𝑉 = ℝ3 , and our field being 𝔽 = ℝ.

6 · (2, 1, −3) + 5 · (1, −2, 4) = (17, −4, 2)

So (17, −4, 2) is a linear combination of (2, 1, −3) and (1, −2, 4).
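The same linear combination computed numerically (a small Python sketch, just to double-check the arithmetic):

# Compute a linear combination of vectors componentwise.
def lin_comb(coeffs, vectors):
    n = len(vectors[0])
    out = [0] * n
    for c, v in zip(coeffs, vectors):
        for i in range(n):
            out[i] += c * v[i]
    return tuple(out)

print(lin_comb([6, 5], [(2, 1, -3), (1, -2, 4)]))   # (17, -4, 2)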

Definition 2.1.2

The span of 𝑣®1 , 𝑣®2 , . . . , 𝑣®𝑚 ∈ 𝑉 is the set of all linear combinations of 𝑣®1 , 𝑣®2 , . . . , 𝑣®𝑚 .

span(𝑣®1 , 𝑣®2 , . . . , 𝑣®𝑚 ) = {𝛼1 · 𝑣®1 + 𝛼2 · 𝑣®2 + · · · + 𝛼𝑚 · 𝑣®𝑚 | 𝛼1 , 𝛼 2 , . . . , 𝛼 𝑚 ∈ 𝐹 }

Note:-
We have a convention: span() ≔ {0®𝑣 }.

Proposition 2.1.1
The span (𝑣 1 , . . . , 𝑣 𝑚 ) is the smallest subspace of 𝑉 that contains 𝑣1 , . . . , 𝑣 𝑚 .
Proof: We have to show three things in 1.34.

(a) We know that 0®𝑣 = 0𝔽 · 𝑣®1 + 0𝔽 · 𝑣®2 + · · · + 0𝔽 · 𝑣®𝑚 ∈ span 𝑣®1 , . . . , 𝑣®𝑚 . Thus, we are done


(b) Closed under addition +𝑉 :

(𝑎1 · 𝑣®1 + . . . + 𝑎𝑚 · 𝑣®𝑚 ) + (𝑏1 · 𝑣®1 + . . . + 𝑏𝑚 · 𝑣®𝑚 ) = (𝑎1 + 𝑏1 ) · 𝑣®1 + . . . + (𝑎𝑚 + 𝑏𝑚 ) · 𝑣®𝑚 ,

and all of these expressions lie in span(𝑣®1 , . . . , 𝑣®𝑚 ).

(c) Now closed under scalar multiplication: for 𝜆 ∈ 𝔽,

𝜆 · (𝑎1 · 𝑣®1 + . . . + 𝑎𝑚 · 𝑣®𝑚 ) = (𝜆 · 𝑎1 ) · 𝑣®1 + . . . + (𝜆 · 𝑎𝑚 ) · 𝑣®𝑚 ∈ span(𝑣®1 , . . . , 𝑣®𝑚 ).


Now we have to show that this span contains 𝑣®1 , . . . , 𝑣®𝑚 :


In other words,

𝑣®2 = 0𝔽 · 𝑣®1 + 1𝔽 · 𝑣®2 + 0𝔽 · 𝑣®3 + . . . + 0𝔽 · 𝑣®𝑚 ∈ span 𝑣®1 , . . . , 𝑣®𝑚




Now, we must show it is the smallest.


Note:-
Draw some pics charlie

Suppose that 𝑈 ⊆ 𝑉 is a subspace  that contains 𝑣®1 , . . . , 𝑣®𝑚 .


Must show that span 𝑣®1, . . . , 𝑣®𝑚 ⊆ 𝑈.
Let 𝑣 ∈ span 𝑣®1 , . . . , 𝑣®𝑚 , and is arbitrary.
We want to show that 𝑣 ∈ 𝑈
We know some some things:

1. 𝑣 = 𝑎 1 · 𝑣®1 + . . . + 𝑎 𝑚 · 𝑣®𝑚 for some 𝑎1 , . . . , 𝑎 𝑚 ∈ 𝔽


2. 𝑣®1 , . . . , 𝑣®𝑚 ∈ 𝑈. Since 𝑣 𝑖 ∈ 𝑈 , then 𝑎 𝑖 · 𝑣®𝑖 ∈ 𝑈 for all 𝑖 = 1, . . . , 𝑚.
This is because 𝑈 is a subspace, and is closed under scalar multiplication.
But then 𝑎 1 · 𝑣®1 + . . . + 𝑎 𝑚 · 𝑣®𝑚 ∈ 𝑈 since 𝑈 is closed under addition.

Therefore 𝑣 ∈ 𝑈, and we are done.

Special Situation: If span (𝑣1 , . . . , 𝑣 𝑚 ) = 𝑉, we say that 𝑣1 , . . . , 𝑣 𝑚 spans 𝑉.

Example 2.1.2
Let 𝑉 = ℝ3 , and the field 𝔽 = ℝ.
Then span ((1, 0, 0), (0, 1, 0), (0, 0, 1)) = ℝ3 .

Proof: Let (𝑎, 𝑏, 𝑐) ∈ ℝ3 be arbitrary.


Then, (𝑎, 𝑏, 𝑐) = 𝑎 · (1, 0, 0) + 𝑏 · (0, 1, 0) + 𝑐 · (0, 0, 1).

Definition 2.1.3

We say that 𝑉 is finite-dimensional if V can be spanned by a finite list 𝑣1 , 𝑣 2 , . . . , 𝑣 𝑚 .

Example 2.1.3
𝑃𝑚 (𝐹) = {Polys of degree ≤ 𝑚 with coefficients in 𝐹}
And we claim that this is spanned by 1, 𝑥, 𝑥 2 , . . . , 𝑥 𝑚 .
Because any 𝑝(𝑥) ∈ 𝑃𝑚 (𝐹) has the form 𝑎 𝑚 · 𝑥 𝑚 + . . . + 𝑎 1 · 𝑥 + 𝑎 0 for some 𝑎 0 , . . . , 𝑎 𝑚 ∈ 𝐹.

Proposition 2.1.2
𝑃(𝐹) = {Polys with coefficients in 𝐹} is not finite-dimensional.

Proof: We proceed by contradiction.


Suppose, for a contradiction, that 𝑃(𝐹) is finite-dimensional.

Then, there exists a finite list 𝑝1 (𝑥), . . . , 𝑝 𝑚 (𝑥) that spans 𝑃(𝐹).
In other words, span (𝑝1 (𝑥), . . . , 𝑝 𝑚 (𝑥)) = 𝑃(𝐹).
Let 𝑛 = max (deg(𝑝1 (𝑥)), . . . , deg(𝑝𝑚 (𝑥))).
Then deg(𝑎1 · 𝑝1 (𝑥) + . . . + 𝑎𝑚 · 𝑝𝑚 (𝑥)) ≤ 𝑛 for all 𝑎1 , . . . , 𝑎𝑚 ∈ 𝐹.
So the degree of every element of span (𝑝1 (𝑥), . . . , 𝑝𝑚 (𝑥)) is at most 𝑛.
Hence, 1𝔽 · 𝑥 𝑛+1 ∉ span (𝑝1 (𝑥), . . . , 𝑝𝑚 (𝑥)).
This means that span (𝑝1 (𝑥), . . . , 𝑝𝑚 (𝑥)) ⊊ 𝑃(𝐹).
This is absurd!
So our assumption that 𝑃(𝐹) is finite-dimensional is false.

Definition 2.1.4

Linear (In)dependence.
Let (𝑉 , 𝔽, +, ·) be a vector space.
A list 𝑣®1 , . . . , 𝑣®𝑚 ∈ 𝑉 is linearly independent if the only way to write

0®𝑣 = 𝛼 1 · 𝑣®1 + . . . + 𝛼 𝑚 · 𝑣®𝑚 , 𝛼 1 , . . . , 𝛼 𝑚 ∈ 𝔽


is to take 𝛼1 = . . . = 𝛼𝑚 = 0𝔽 ; otherwise it is linearly dependent.

Example 2.1.4
We want to show that (1, 0, 0), (0, 1, 0), (0, 0, 1) are linearly independent in ℝ3 = 𝑉,
because if

0®ℝ3 = (0, 0, 0) = 𝑎1 · (1, 0, 0) + 𝑎 2 · (0, 1, 0) + 𝑎 3 · (0, 0, 1)


Then, (0, 0, 0) = (𝑎1 , 𝑎 2 , 𝑎 3 ), so 𝑎1 = 𝑎2 = 𝑎3 = 0.
Now suppose that 𝑣®1 , . . . , 𝑣®𝑚 ∈ 𝑉 is linearly independent and 𝑣 ∈ span 𝑣®1 , . . . , 𝑣®𝑛 .

This means: 𝑣® = 𝑎1 𝑣 1 + . . . + 𝑎 𝑚 𝑣 𝑚 for some 𝑎1 , . . . , 𝑎 𝑚 ∈ 𝔽.
Now, suppose that 𝑣® = 𝑏1 𝑣1 + . . . + 𝑏𝑚 𝑣𝑚 for some 𝑏1 , . . . , 𝑏𝑚 ∈ 𝔽 as well.
Now, let’s subtract:

0®𝑉 = 𝑣 − 𝑣 = (𝑎1 − 𝑏 1 )𝑣1 + . . . + (𝑎 𝑚 − 𝑏 𝑚 )𝑣 𝑚


Since 𝑣®1 , . . . , 𝑣®𝑚 is linearly independent (L.I.), we must have 𝑎 𝑖 − 𝑏 𝑖 = 0 for all 𝑖 = 1, . . . , 𝑚.
This implies that 𝑎 𝑖 = 𝑏 𝑖 for all 𝑖 = 1, . . . , 𝑚.
Thus, there is exactly one way to write 𝑣® as a linear combination of 𝑣®1 , . . . , 𝑣®𝑚 .

Key result: Let (𝑉 , 𝔽, +, ·) be a finite-dimensional vector space.


Then the length of any linearly independent list of vectors is at most the length of any spanning list of
vectors.

Example 2.1.5
We know that (1, 0, 0), (0, 1, 0), (0, 0, 1) spans ℝ3 .
This implies that the list (2, −1, 𝜋), (√3, −7, 𝑒), (√19, −1, √7), (0, −5, 2 + √3) is not linearly independent,
since the length of the first list is 3, and the length of the second list is 4.
Thus, this list cannot be linearly independent.
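A numerical way to test linear independence over ℝ (a Python sketch using numpy; a list is linearly independent exactly when the matrix whose rows are the vectors has rank equal to the number of vectors). The decimal entries in the second list below are arbitrary stand-ins for the 𝜋, 𝑒, and square-root entries above:

import numpy as np

def is_linearly_independent(vectors):
    # Rank of the row matrix equals the number of vectors iff the list is L.I.
    A = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(A) == len(vectors)

print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))   # True
# Four vectors in R^3 can never be linearly independent (key result: 4 > 3).
print(is_linearly_independent([(2, -1, 3.14), (1.73, -7, 2.72),
                               (4.36, -1, 2.65), (0, -5, 3.73)]))   # False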

Lemma 2.1.1 Linear Dependence Lemma (LDL)


We want to prove this, but let’s do some prep work first.

Prep work: Say 𝑣®1 , . . . , 𝑣®𝑚 ∈ 𝑉 is linearly dependent. Then there is a 𝑗 ∈ {1, . . . , 𝑚}
such that

 
(i) 𝑣𝑗 ∈ span(𝑣®1 , . . . , 𝑣®𝑗−1 )

(ii) span(𝑣®1 , . . . , 𝑣®𝑚 ) = span(𝑣1 , . . . , 𝑣ˆ𝑗 , . . . , 𝑣𝑚 ), where 𝑣ˆ𝑗 means that we remove 𝑣𝑗 from the list.
 

Now, let’s prove this.


Proof: Since 𝑣®1 , . . . , 𝑣®𝑚 are linearly dependent, there are 𝑎1 , . . . , 𝑎 𝑚 not all zero such that

0®𝑣 = 𝑎 1 · 𝑣®1 + . . . + 𝑎 𝑚 · 𝑣®𝑚 , 𝑎 1 , . . . , 𝑎 𝑚 ∈ 𝔽

(i) Let 𝑗 = max {𝑖 | 𝑎 𝑖 ≠ 0}, so that 𝑎1 𝑣 1 + . . . + 𝑎 𝑗 𝑣 𝑗 = 0®𝑣 and 𝑎 𝑗 ≠ 0.

=⇒ 𝑣𝑗 = −(1/𝑎𝑗 )(𝑎1 𝑣1 + . . . + 𝑎𝑗−1 𝑣𝑗−1 ) = (−𝑎1 /𝑎𝑗 )𝑣1 + . . . + (−𝑎𝑗−1 /𝑎𝑗 )𝑣𝑗−1
=⇒ 𝑣𝑗 ∈ span(𝑣1 , . . . , 𝑣𝑗−1 )


(ii) span 𝑣1 , . . . , 𝑣ˆ𝑗 , . . . , 𝑣 𝑚 ⊆ span (𝑣1 , . . . , 𝑣 𝑚 ).




Note:-
We have to do the one above as well.
Now, we want to show the other direction as well.

span 𝑣®1 , . . . , 𝑣®𝑚 ⊆ span 𝑣1 , . . . , 𝑣ˆ𝑗 , . . . , 𝑣 𝑚


 

Let 𝑣 ∈ span 𝑣®1 , . . . , 𝑣®𝑚 .




Then, 𝑣 = 𝑏1 𝑣1 + . . . + 𝑏 𝑚 𝑣 𝑚 for some 𝑏1 , . . . , 𝑏 𝑚 ∈ 𝔽.

=⇒ 𝑣 = 𝑏1 𝑣1 + . . . + 𝑏𝑗 ((−𝑎1 /𝑎𝑗 )𝑣1 + . . . + (−𝑎𝑗−1 /𝑎𝑗 )𝑣𝑗−1 ) + 𝑏𝑗+1 𝑣𝑗+1 + . . . + 𝑏𝑚 𝑣𝑚 , substituting for 𝑣𝑗 using (i)
=⇒ 𝑣 ∈ span(𝑣1 , . . . , 𝑣ˆ𝑗 , . . . , 𝑣𝑚 )


Thus, span 𝑣®1 , . . . , 𝑣®𝑚 = span 𝑣1 , . . . , 𝑣ˆ𝑗 , . . . , 𝑣 𝑚 .


 

Proof key result: Let 𝑣®1 , . . . , 𝑣®𝑚 ∈ 𝑉 be a linearly independence  list.


Let 𝑢®1 , . . . , 𝑢®𝑛 ∈ 𝑣 be a spanning list, 𝑉 = span 𝑢®1 , . . . , 𝑢®𝑛 .
We need to show that 𝑚 ≤ 𝑛 .
Step 1: 𝑣®1 ∈ span(𝑢®1 , . . . , 𝑢®𝑛 ), so by PSET 4

𝑣®1 , 𝑢®1 , . . . , 𝑢®𝑛 is linearly dependent.

With the linear dependence lemma, we know there exists 𝑢®𝑗1 such that

𝑢®𝑗1 ∈ span(𝑣®1 , 𝑢®1 , . . . , 𝑢®𝑗1 −1 )

And

span(𝑣®1 , 𝑢®1 , . . . , 𝑢®𝑛 ) = span(𝑣®1 , 𝑢®1 , . . . , 𝑢®ˆ𝑗1 , . . . , 𝑢®𝑛 )


Note:-
NB means nota bene, which means note well.
Notice that 𝑣1 is not plucked out from our list when we apply LDL.
If it were, then LDL would say 𝑣1 ∈ span() = {0®𝑣 }.
This implies that 𝑣1 = 0®𝑣 ,
But 𝑣®1 , . . . , 𝑣®𝑚 is linearly independent.
As 0®𝑣 = 1𝔽 · 𝑣®1 + 0𝔽 · 𝑣®2 + . . . + 0𝔽 · 𝑣®𝑚 is the only way to write 0®𝑣 as a linear combination of 𝑣®1 , . . . , 𝑣®𝑚 .
Thus, 𝑣1 ≠ 0®𝑣 .
 
Step 2: 𝑣®2 ∈ span(𝑢®1 , . . . , 𝑢®𝑛 ) = span(𝑣®1 , 𝑢®1 , . . . , 𝑢®𝑛 ) = span(𝑣®1 , 𝑢®1 , . . . , 𝑢®ˆ𝑗1 , . . . , 𝑢®𝑛 )

Again, with the result in PSET 4, we know that

𝑣®1 , 𝑣®2 , 𝑢®1 , . . . , 𝑢®ˆ𝑗1 , . . . , 𝑢®𝑛 is linearly dependent.

With the linear dependence lemma, we know there exists 𝑢®𝑗2 such that

span(𝑣®1 , 𝑢®1 , . . . , 𝑢®ˆ𝑗1 , . . . , 𝑢®𝑛 ) = span(𝑣®1 , 𝑣®2 , 𝑢®1 , . . . , 𝑢®ˆ𝑗1 , . . . , 𝑢®ˆ𝑗2 , . . . , 𝑢®𝑛 )

After 𝑚 steps: our spanning list is 𝑣®1 , . . . , 𝑣®𝑚 together with some of the 𝑢's; since each step swapped in a 𝑣® for a distinct 𝑢®, this is only possible if 𝑚 ≤ 𝑛.
Thus, we have shown that 𝑚 ≤ 𝑛.

2.2 Basis
Definition 2.2.1

Let (𝑉 , 𝔽, +, ·) be a vector space.


A basis for 𝑉 is a list 𝑣®1 , . . . , 𝑣®𝑛 that spans 𝑉.
(i.e., 𝑉 = span 𝑣®1 , . . . , 𝑣®𝑛 ) and is linearly independent.

Example 2.2.1

(i) Let 𝑉 = 𝔽𝑛 (think 𝑉 = ℝ𝑛 or ℂ𝑛 )


We can define the standard basis for 𝔽𝑛 as:

𝑣1 = (1, 0, . . . , 0)
𝑣2 = (0, 1, . . . , 0)
..
.
𝑣 𝑛 = (0, 0, . . . , 1)

e.g., 𝑉 = ℝ3 = span ((1, 0, 0), (0, 1, 0), (0, 0, 1)). This list is linearly independent.
(ii) 𝑉 = ℝ2 The list (1, 2), (2, 3) is a basis.

Linear independence: If

𝑎1 (1, 2) + 𝑎2 (2, 3) = (0, 0) = 0®ℝ2

=⇒ (𝑎1 + 2𝑎2 , 2𝑎1 + 3𝑎2 ) = (0, 0)
=⇒ 𝑎1 = 𝑎2 = 0

(iii) 𝑉 = 𝑃𝑚 (ℝ)
Thus, the list 1, 𝑥, 𝑥 2 , . . . , 𝑥 𝑚 is a basis for 𝑉

Proposition 2.2.1
𝑣®1 , . . . , 𝑣®𝑛 ∈ 𝑉 is a basis for 𝑉 if and only if every 𝑣® ∈ 𝑉 can be written uniquely as a linear combination
of 𝑣®1 , . . . , 𝑣®𝑛 .
Proof of =⇒ : Say 𝑣®1 , . . . , 𝑣®𝑛 ∈ 𝑉 is a basis.
Let 𝑣® ∈ 𝑉. Since 𝑉 = span 𝑣®1 , . . . , 𝑣®𝑛 ,
we know that 𝑣® = 𝑎 1 𝑣®1 + . . . + 𝑎 𝑛 𝑣®𝑛 for some 𝑎 1 , . . . , 𝑎 𝑛 ∈ 𝔽.
Since 𝑣®1 , . . . , 𝑣®𝑛 are linearly independent, we know this representation is unique.
Proof of ⇐= : Suppose that every 𝑣® ∈ 𝑉 can be written uniquely as 𝑣® = 𝑎1 𝑣®1 + . . . + 𝑎 𝑛 𝑣®𝑛 for some
𝑎1 , . . . , 𝑎 𝑛 ∈ 𝔽.
Then 𝑣® ∈ span 𝑣®1 , . . . , 𝑣®𝑛 , so 𝑉 ⊆ span 𝑣®1 , . . . , 𝑣®𝑛 . 
 
By the definition of span, we  know that span 𝑣®1 , . . . , 𝑣®𝑛 ⊆ 𝑉.
Thus, 𝑉 = span 𝑣®1 , . . . , 𝑣®𝑛 .
Next, let 𝑣® = 0®𝑣 .
We know that 0®𝑣 = 𝑎1 𝑣®1 + . . . + 𝑎 𝑛 𝑣®𝑛 for unique 𝑎 1 , . . . , 𝑎 𝑛 ∈ 𝔽.
On the other hand (OTOH): taking 𝑎1 = . . . = 𝑎 𝑛 = 0 works!
Therefore, the only way to write 0®𝑣 as a linear combination of 𝑣®1 , . . . , 𝑣®𝑛
is to take 𝑎1 = . . . = 𝑎 𝑛 = 0𝔽 .
The definition implies that 𝑣®1 , . . . , 𝑣®𝑛 is linearly independent.
Thus, we have shown that 𝑣®1 , . . . , 𝑣®𝑛 ∈ 𝑉 is a basis for 𝑉 if and only if every 𝑣® ∈ 𝑉 can be written uniquely
as a linear combination of 𝑣®1 , . . . , 𝑣®𝑛 .

Theorem 2.2.1
Let (𝑉 , 𝔽, +, ·) be a finite-dimensional vector space (fdvs).
Then every spanning list for 𝑉 can be trimmed to a basis.

Proof: Say that 𝑣®1 , . . . , 𝑣®𝑛 is a spanning list for 𝑉.


Algorithm 1: Trimming

𝐵 = {𝑣®1 , . . . , 𝑣®𝑛 }    /* Note that 𝐵 has no order. */
for 𝑗 = 1, . . . , 𝑛 do
    if 𝑣𝑗 ∈ span({𝑣®1 , . . . , 𝑣®𝑗−1 } ∩ 𝐵) then
        delete 𝑣𝑗 from 𝐵
end
When the loop is finished, the set 𝐵 gives rise to a basis (any order).

Example 2.2.2
𝑉 = ℝ3 .
Let 𝑣1 =(1, 0, 0), 𝑣2 = (1, 1, 1), 𝑣3 = (0, 1, 1), and 𝑣4 = (0, 0, 1).
Let 𝐵 = {𝑣®1 , 𝑣®2 , 𝑣®3 , 𝑣®4 }.
Step 1: Is 𝑣1 ∈ span(∅ ∩ 𝐵) = span() = {0®𝑣 }?
NO. Leave 𝐵 alone.

Step 2: Is 𝑣2 ∈ span ({𝑣 1 } ∩ 𝐵) = span (𝑣1 )?


Does 𝑣 2 = 𝑎 1 · 𝑣1 .
No!
Leave 𝐵 alone.
Step 3: Is 𝑣3 ∈ span ({𝑣 1 , 𝑣 2 } ∩ 𝐵) = span (𝑣 1 , 𝑣 2 )?
Does 𝑣 3 = 𝑎 1 · 𝑣1 + 𝑎2 · 𝑣2 ?
Yes!

𝑣3 = −𝑣1 + 𝑣2
New 𝐵 = {𝑣1 , 𝑣 2 , 𝑣 4 }
Step 4: Is 𝑣4 ∈ span ({𝑣 1 , 𝑣 2 , 𝑣 3 } ∩ 𝐵) = span (𝑣1 , 𝑣 2 )?
Does 𝑣 4 = 𝑎 1 · 𝑣1 + 𝑎2 · 𝑣2 ?
No!
Leave 𝐵 alone.
Thus, 𝐵 = {𝑣1 , 𝑣 2 , 𝑣 4 } is a basis for 𝑉 through trimming.
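A numerical version of the trimming algorithm (a Python sketch, assuming we work over ℝ and test span membership by comparing matrix ranks):

import numpy as np

def in_span(v, vectors):
    # Is v in the span of the given vectors?  span() = {0}.
    if not vectors:
        return np.allclose(v, 0)
    A = np.array(vectors, dtype=float)
    B = np.vstack([A, v])
    return np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A)

def trim_to_basis(vectors):
    # Keep a vector only if it is not already in the span of the kept ones.
    basis = []
    for v in vectors:
        if not in_span(v, basis):
            basis.append(v)
    return basis

vs = [(1, 0, 0), (1, 1, 1), (0, 1, 1), (0, 0, 1)]
print(trim_to_basis(vs))   # [(1, 0, 0), (1, 1, 1), (0, 0, 1)], matching the example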

Corollary 2.2.1
Any linearly independent list 𝑣®1 , . . . , 𝑣®𝑚 in 𝑉 can be extended to a basis.
Proof: Let 𝑢®1 , . . . , 𝑢®𝑛 be any basis for 𝑉 .
Trim the enlarged list 𝑣®1 , . . . , 𝑣®𝑚 , 𝑢®1 , . . . , 𝑢®𝑛 .

No 𝑣®𝑖 is deleted during trimming (LDL).

Semi-simplicity: Let (𝑉 , 𝔽, +, ·) be a finite-dimensional vector space.


Let 𝑈 ⊆ 𝑉 be a subspace.
Then, there is a subspace 𝑊 ⊆ 𝑉 (not necessarily unique) such that 𝑉 = 𝑈 ⊕ 𝑊.
Idea: Let 𝑢®1 , . . . , 𝑢®𝑛 be a basis for 𝑈.
Complete to a spanning list of 𝑉.
𝑢®1 , . . . , 𝑢®𝑛 , 𝑤®1 , . . . , 𝑤®𝑚 .
The space 𝑊 = span 𝑤®1 , . . . , 𝑤®𝑛 works!


Claim: 𝑈 itself is finite-dimensional.


Assume claim: Let 𝑢®1 , . . . , 𝑢®𝑛 be a basis for 𝑈 .
This implies that 𝑢®1 , . . . , 𝑢®𝑛 is linearly independent in 𝑈 , but also in 𝑉 .
Now, extend to a basis of 𝑉: 𝑢®1 , . . . , 𝑢®𝑛 , 𝑤®1 , . . . , 𝑤®𝑚 .
Take 𝑊 = span 𝑤®1 , . . . , 𝑤®𝑚 .
We want

(i) 𝑈 + 𝑊 ⊇ 𝑉 (the other direction is trivial).

(ii) 𝑈 ∩ 𝑊 = {0®𝑣 }

Ok, let’s start.

(i) Let 𝑣 ∈ 𝑉. Since 𝑉 = span(𝑢®1 , . . . , 𝑢®𝑛 , 𝑤®1 , . . . , 𝑤®𝑚 ), we know:

𝑣 = (𝑎1 𝑢®1 + . . . + 𝑎𝑛 𝑢®𝑛 ) + (𝑏1 𝑤®1 + . . . + 𝑏𝑚 𝑤®𝑚 ) ∈ 𝑈 + 𝑊 ,

where the first group lies in 𝑈 and the second lies in 𝑊 (𝑎 𝑖 , 𝑏 𝑖 ∈ 𝔽).

As such, 𝑉 = 𝑈 + 𝑊
(ii) Let 𝑣 ∈ 𝑈 ∩ 𝑊.

𝑣 = 𝑎1 𝑢®1 + . . . + 𝑎𝑛 𝑢®𝑛 (since 𝑣 ∈ 𝑈 = span(𝑢®1 , . . . , 𝑢®𝑛 ))
𝑣 = 𝑏1 𝑤®1 + . . . + 𝑏𝑚 𝑤®𝑚 (since 𝑣 ∈ 𝑊 = span(𝑤®1 , . . . , 𝑤®𝑚 ))

Now, let's subtract:

0®𝑣 = 𝑣 − 𝑣 = 𝑎1 𝑢®1 + . . . + 𝑎𝑛 𝑢®𝑛 − 𝑏1 𝑤®1 − . . . − 𝑏𝑚 𝑤®𝑚

Since the 𝑢's and 𝑤's are linearly independent in 𝑉, this forces the 𝑎's and 𝑏's to all be 0𝔽 .
This implies that 𝑣 = 0®𝑣 .
Thus, 𝑈 ∩ 𝑊 ⊆ {0®𝑣 }.
Thus, 𝑈 ∩ 𝑊 = {0®𝑣 } and 𝑈 + 𝑊 = 𝑉.
Therefore, 𝑉 = 𝑈 ⊕ 𝑊.
Proof of claim: If 𝑈 = {0®𝑣 } then we are done, because 𝑈 = span().
Otherwise, there is a 𝑣®1 ≠ 0®𝑣 in 𝑈.
If 𝑈 = span(𝑣®1 ), then 𝑈 is finite-dimensional and we are done.
Otherwise, there is a 𝑣®2 ∈ 𝑈 such that 𝑣®2 ∉ span(𝑣®1 ).


This implies that (𝑣1 , 𝑣 2 ) is a linearly independent list in 𝑈 .
Which means that the list is also linearly independent in 𝑉 .
If 𝑈 = span 𝑣®1 , 𝑣®2 , then we are done.
Otherwise there is a 𝑣®3 ∈ 𝑈 such that 𝑣®3 ∉ span 𝑣®1 , 𝑣®2 .

This implies that (𝑣1 , 𝑣 2 , 𝑣 3 ) is a linearly independent list in 𝑈 .
Which means that the list is also linearly independent in 𝑉 .
This process terminates:
𝑉 is finite dimensional, which implies 𝑉 = span 𝑥®1 , . . . , 𝑥®𝑝

At step 𝑚 we produce a linearly independent list 𝑣®1 , . . . , 𝑣®𝑚 of 𝑉 .
The key result we have proved in class: 𝑚 ≤ 𝑝.

2.3 Dimension
Theorem 2.3.1
Any two bases of a finite-dimensional vector space 𝑉 have the same length.
Proof: Say 𝑣®1 , . . . , 𝑣®𝑚 and 𝑢®1 , . . . , 𝑢®𝑛 are bases for 𝑉 .
The list 𝑣®1 , . . . , 𝑣®𝑚 is linearly independent in 𝑉,
and the list 𝑢®1 , . . . , 𝑢®𝑛 spans 𝑉.
By the key result 𝑚 ≤ 𝑛. Reverse roles to get 𝑛 ≤ 𝑚.
Thus, 𝑚 = 𝑛.
The length of any basis for 𝑉 is called the dimension of 𝑉.

Example 2.3.1

(i) 𝑉 = ℝ𝑛 standard basis 𝑒®1 , . . . , 𝑒®𝑛 .


These vectors look like (0, . . . , 0, 1, 0, . . . , 0) where the 1 is in the 𝑖th position for each 𝑖 = 1, . . . , 𝑛.
This implies that the dimension of ℝ𝑛 is 𝑛.
(ii) 𝑃𝑚 (ℝ) has basis 1, 𝑥, 𝑥 2 , . . . , 𝑥 𝑚 .
This implies that the dimension of 𝑃𝑚 (ℝ) is 𝑚 + 1.

Properties: (i) If 𝑈 ⊆ 𝑉 is a subspace, then dim 𝑈 ≤ dim 𝑉 .


Say 𝑉 is finite-dimensional, which implies that 𝑈 is finite-dimensional.
A basis 𝑢®1 , . . . , 𝑢®𝑛 for 𝑈 is a linearly independent in 𝑉 .
This means we can extend a basis 𝑢®1 , . . . , 𝑢®𝑛 , 𝑤®1 , . . . , 𝑤®𝑚 of 𝑉.
Thus, dim 𝑈 = 𝑛 ≤ 𝑛 + 𝑚 = dim 𝑉
(ii) Say that dim 𝑉 = 𝑛, and 𝑣®1 , . . . , 𝑣®𝑛 is a linearly independent list in 𝑉,
Then 𝑣®1 , . . . , 𝑣®𝑛 spans 𝑉 .

Proof: Extend 𝑣®1 , . . . , 𝑣®𝑛 to a basis of 𝑉.


Result is a basis for 𝑉. This basis has length dim 𝑉 = 𝑛.
This means the extension process didn’t add new vectors.
Which means that 𝑣®1 , . . . , 𝑣®𝑛 is already a basis.
Thus, 𝑣®1 , . . . , 𝑣®𝑛 spans 𝑉 .

(iii) Say that dim 𝑉 = 𝑛 and that 𝑣®1 . . . 𝑣®𝑛 spans 𝑉.


Then 𝑣®1 , . . . , 𝑣®𝑛 is a linearly independent list.

Note:-
Do this as an exercise
.

Example 2.3.2
Take 𝑉 = {𝑝(𝑥) ∈ 𝑃3 (ℝ) : 𝑝‘(5) = 0} ⊆ 𝑃3 (ℝ)
We know that 𝑃3 (ℝ) is 4-dimensional, with a basis 1, 𝑥, 𝑥 2 , 𝑥 3 .

Claim: dim 𝑉 < 4 and that 𝑉 is 3-dimensional.


Proof: Since 𝑉 ⊆ 𝑃3 (ℝ), we know that 𝑉 is finite-dimensional with dim 𝑉 ≤ 4.
We just need to rule out that dim 𝑉 = 4.
Suppose that dim 𝑉 = 4.
Then 𝑉 ⊆ 𝑃3 (ℝ) both have dimension 4.
Let 𝑝1 , 𝑝 2 , 𝑝 3 , 𝑝 4 be a basis for 𝑉
Then 𝑝1 , 𝑝 2 , 𝑝 3 , 𝑝 4 are linearly independent in 𝑃3 (ℝ).
But, the dim 𝑃3 (ℝ) = 4, so 𝑝 1 , 𝑝 2 , 𝑝 3 , 𝑝 4 also spans 𝑃3 (ℝ).
This means that 𝑉 = 𝑃3 (ℝ). This is as they both span (𝑝1 , . . . , 𝑝 4 ).
Let 𝑝(𝑥) = 𝑥.
Then 𝑝‘(5) = 1
Thus, 𝑝(𝑥) ∉ 𝑉.
Therefore, 𝑉 ≠ 𝑃3 (ℝ).

Definition 2.3.1: Dimension of a sum

Let 𝑈1 , 𝑈2 ⊆ 𝑉 be finite-dimensional subspaces.


Then dim(𝑈1 + 𝑈2 ) = dim 𝑈1 + dim 𝑈2 − dim(𝑈1 ∩ 𝑈2 ).
Proof: Let 𝑢®1 , . . . , 𝑢®𝑛 be a basis for 𝑈1 ∩ 𝑈2 .
Then, we can extend the basis in two ways:

(i) a basis 𝑢®1 , . . . , 𝑢®𝑛 , 𝑣®1 , . . . , 𝑣®𝑚 for 𝑈1


(ii) a basis 𝑢®1 , . . . , 𝑢®𝑛 , 𝑤®1 , . . . , 𝑤®𝑝 for 𝑈2

Claim: Let 𝑢®1 , . . . , 𝑢®𝑛 , 𝑣®1 , . . . , 𝑣®𝑚 , 𝑤®1 , . . . , 𝑤®𝑝 is a basis for 𝑈1 + 𝑈2 .
Assume claim true for now.

dim(𝑈1 + 𝑈2 ) = 𝑛 + 𝑚 + 𝑝 (our claim and the definition of dim)
= (𝑛 + 𝑚) + (𝑛 + 𝑝) − 𝑛 (algebra)
= dim 𝑈1 + dim 𝑈2 − dim(𝑈1 ∩ 𝑈2 ) (definition of dim)

For the claim we need to prove:


Proof of span: This is left for us.

Proof of linear independence: Suppose there are scalars 𝑎1 , . . . , 𝑎 𝑛 , 𝑏 1 , . . . , 𝑏 𝑚 , 𝑐 1 , . . . , 𝑐 𝑝 ∈ 𝔽 such


that:

𝑎1 𝑢®1 + . . . + 𝑎 𝑛 𝑢®𝑛 + 𝑏1 𝑣®1 + . . . + 𝑏 𝑚 𝑣®𝑚 + 𝑐 1 𝑤®1 + . . . + 𝑐 𝑝 𝑤®𝑝 = 0®𝑣


We want to show that 𝑎1 = . . . = 𝑎 𝑛 = 𝑏1 = . . . = 𝑏 𝑚 = 𝑐 1 = . . . = 𝑐 𝑝 = 0𝔽 .
Let's rearrange the equation:

𝑎1 𝑢®1 + . . . + 𝑎𝑛 𝑢®𝑛 + 𝑏1 𝑣®1 + . . . + 𝑏𝑚 𝑣®𝑚 = −(𝑐1 𝑤®1 + . . . + 𝑐𝑝 𝑤®𝑝 )

The left-hand side lies in 𝑈1 , while the right-hand side lies in 𝑈2 .
This shows that 𝑐1 𝑤®1 + . . . + 𝑐𝑝 𝑤®𝑝 ∈ 𝑈1 ∩ 𝑈2 = span(𝑢®1 , . . . , 𝑢®𝑛 ).

The 𝑢's are a basis for 𝑈1 ∩ 𝑈2 , so there are scalars 𝑑1 , . . . , 𝑑𝑛 ∈ 𝔽 such that:

𝑐1 𝑤®1 + . . . + 𝑐𝑝 𝑤®𝑝 = 𝑑1 𝑢®1 + . . . + 𝑑𝑛 𝑢®𝑛

This implies that 𝑐1 𝑤®1 + . . . + 𝑐𝑝 𝑤®𝑝 − 𝑑1 𝑢®1 − . . . − 𝑑𝑛 𝑢®𝑛 = 0®𝑣 .
The 𝑢's and 𝑤's are a basis for 𝑈2 .
This implies that they are linearly independent and 𝑐1 = . . . = 𝑐𝑝 = 𝑑1 = . . . = 𝑑𝑛 = 0𝔽 .
This shows that 𝑎1 𝑢®1 + . . . + 𝑎𝑛 𝑢®𝑛 + 𝑏1 𝑣®1 + . . . + 𝑏𝑚 𝑣®𝑚 = 0®𝑣 .
Next:
The 𝑢's and 𝑣's are a basis for 𝑈1 .
This implies that they are linearly independent,
which implies that 𝑎1 = . . . = 𝑎𝑛 = 𝑏1 = . . . = 𝑏𝑚 = 0𝔽 .
Thus, we have proven this basis is linearly independent.

Thus, we have proven the claim.


Thus, we have proven the theorem.
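A numerical illustration of the formula dim(𝑈1 + 𝑈2 ) = dim 𝑈1 + dim 𝑈2 − dim(𝑈1 ∩ 𝑈2 ) for two planes in ℝ3 (a Python sketch of my own; dimensions are computed as ranks of spanning matrices):

import numpy as np

U1 = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)    # the (x, y, 0) plane
U2 = np.array([[0, 1, 0], [0, 0, 1]], dtype=float)    # the (0, y, z) plane

dim_U1 = np.linalg.matrix_rank(U1)                    # 2
dim_U2 = np.linalg.matrix_rank(U2)                    # 2
dim_sum = np.linalg.matrix_rank(np.vstack([U1, U2]))  # 3, since U1 + U2 = R^3
# By the formula, dim(U1 ∩ U2) = 2 + 2 - 3 = 1, matching the fact that the
# intersection is the y-axis.
print(dim_U1, dim_U2, dim_sum, dim_U1 + dim_U2 - dim_sum)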

Chapter 3

Linear Transformations

3.1 Linear Maps


Definition 3.1.1: Linear Maps

Let 𝑉 , 𝑊 be vector spaces over the same field 𝐹 (= ℝ or ℂ).


Meaning that 𝑉 = (𝑉 , 𝔽, +𝑉 , ·𝑉 ) and 𝑊 = (𝑊 , 𝔽, +𝑊 , ·𝑊 ).
A linear map: 𝑇 : 𝑉 → 𝑊 is a function such that:

(i) 𝑇(𝑢 +𝑣 𝑣) = 𝑇(𝑢) +𝑤 𝑇(𝑣) for all 𝑢, 𝑣 ∈ 𝑉

(ii) 𝑇(𝜆 ·𝑣 𝑣) = 𝜆 ·𝑤 𝑇(𝑣) for all 𝑣 ∈ 𝑉 and 𝜆 ∈ 𝔽

In other words, they preserve the vector space structure.


Note:-
Observation: 𝑇(0®𝑣 ) = 0®𝑤 .
Reason:

𝑇(0®𝑣 ) = 𝑇(0®𝑣 +𝑣 0®𝑣 ) = 𝑇(0®𝑣 ) +𝑤 𝑇(0®𝑣 )


Adding −𝑇(0®𝑣 ) to both sides, we get:

0®𝑤 = 𝑇(0®𝑣 )

Example 3.1.1
We will be showing a lot of examples today!
(i) Zero map:

0: 𝑉 → 𝑊
𝑣 ↦→ 0𝑤

(ii) Identity map:

id𝑉 : 𝑉 → 𝑉
𝑣 ↦→ 𝑣

Note:-
Notation ℒ(𝑉 , 𝑊) = {𝑇 : 𝑉 → 𝑊 | 𝑇 is linear}.

(iii) Differentiation map: 𝐷 ∈ ℒ(𝑃(ℝ), 𝑃(ℝ))

𝐷 : 𝑃(ℝ) → 𝑃(ℝ)
𝑝(𝑥) ↦→ 𝑝0(𝑥)

Let’s check!
Linear:

(a) 𝐷(𝑝(𝑥) + 𝑞(𝑥)) = (𝑝(𝑥) + 𝑞(𝑥))0 = 𝑝0(𝑥) + 𝑞0(𝑥) = 𝐷(𝑝(𝑥)) + 𝐷(𝑞(𝑥))


(b) 𝐷(𝜆𝑝(𝑥)) = (𝜆𝑝(𝑥))0 = 𝜆𝑝0(𝑥) = 𝜆𝐷(𝑝(𝑥))

(iv) Integration: 𝐼 ∈ ℒ(𝑃(ℝ), 𝑃(ℝ))

𝐼 : 𝑃(ℝ) → 𝑃(ℝ)
∫ 1
𝑝(𝑥) ↦→ 𝑝(𝑥)𝑑𝑥
0

Let’s check!
Linear:
∫1 ∫1 ∫1
(a) 𝐼(𝑝(𝑥) +𝑃(ℝ) 𝑞(𝑥)) = 0
(𝑝(𝑥) + 𝑞(𝑥))𝑑𝑥 = 0
𝑝(𝑥)𝑑𝑥 + 0
𝑞(𝑥)𝑑𝑥 = 𝐼(𝑝(𝑥)) +ℝ 𝐼(𝑞(𝑥))
∫1 ∫1
(b) 𝐼(𝜆 ·𝑃(ℝ) 𝑝(𝑥)) = 0
(𝜆𝑝(𝑥))𝑑𝑥 = 𝜆 0
𝑝(𝑥)𝑑𝑥 = 𝜆𝐼(𝑝(𝑥))

(v) Shift: 𝑆 ∈ ℒ(𝔽∞ , 𝔽∞ )

𝑆 : 𝔽∞ → 𝔽∞
(𝑥1 , 𝑥 2 , . . .) ↦→ (0, 𝑥 1 , 𝑥 2 , . . .)

(vi) 𝑇 : ℝ3 → ℝ2

𝑇 : ℝ3 → ℝ2
(𝑥, 𝑦, 𝑧) ↦→ (5𝑥 + 7𝑦 − 𝑧, 2𝑥 − 𝑦)
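A quick spot check (a Python sketch, not from lecture) that the map in (vi) satisfies additivity and homogeneity on sampled inputs; a finite check only, not a proof:

import random

def T(v):
    x, y, z = v
    return (5 * x + 7 * y - z, 2 * x - y)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(lam, v):
    return tuple(lam * a for a in v)

random.seed(0)
for _ in range(100):
    u = tuple(random.randint(-10, 10) for _ in range(3))
    v = tuple(random.randint(-10, 10) for _ in range(3))
    lam = random.randint(-10, 10)
    assert T(add(u, v)) == add(T(u), T(v))       # T(u + v) = T(u) + T(v)
    assert T(scale(lam, u)) == scale(lam, T(u))  # T(lam * u) = lam * T(u)
print("additivity and homogeneity hold on all sampled inputs")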

Properties: Remember our notation:

ℒ(𝑉 , 𝑊) = {𝑇 : 𝑉 → 𝑊 | 𝑇 is linear}
The set ℒ(𝑉 , 𝑊) can be given the structure of a vector space over 𝔽.

(i) Addition: Let 𝑇, 𝑆 ∈ ℒ(𝑉 , 𝑊). Where 𝑇 : 𝑉 → 𝑊 and 𝑆 : 𝑉 → 𝑊.

(𝑇 + 𝑆) : 𝑉 → 𝑊
𝑣 ↦→ 𝑇(𝑣) +𝑊 𝑆(𝑣)

if and only if (𝑆 + 𝑇)(𝑣) = 𝑆(𝑣) +𝑊 𝑇(𝑣) for all 𝑣 ∈ 𝑉.


(ii) Multiplication: Let 𝑇 ∈ ℒ(𝑉 , 𝑊) and 𝜆 ∈ 𝔽. With 𝑇 : 𝑉 → 𝑊.

(𝜆𝑇) : 𝑉 → 𝑊
𝑣 ↦→ 𝜆 ·𝑊 𝑇(𝑣)

If and only if (𝜆𝑇)(𝑣) = 𝜆 ·𝑊 𝑇(𝑣) for all 𝑣 ∈ 𝑉.


(iii) Bonus structure!
We can also multiply linear maps using function composition: for 𝑇 : 𝑈 → 𝑉 and 𝑆 : 𝑉 → 𝑊 ,

we can define (𝑆 · 𝑇)(𝑢) = 𝑆(𝑇(𝑢)).


Propositions of composition:

(a) Associativity: for 𝑇1 : 𝑈 → 𝑉, 𝑇2 : 𝑉 → 𝑊, 𝑇3 : 𝑊 → 𝑋,

𝑇3 · (𝑇2 · 𝑇1 ) = (𝑇3 · 𝑇2 ) · 𝑇1
(b) Identities: 𝑇 : 𝑉 → 𝑊

id𝑉 : 𝑉 → 𝑉
𝑣 ↦→ 𝑣
id𝑊 : 𝑊 → 𝑊
𝑤 ↦→ 𝑤

Thus, id𝑊 · 𝑇 = 𝑇 = 𝑇 · id𝑉 .


(c) Distributivity: 𝑆1 , 𝑆2 : 𝑉 → 𝑊 and 𝑇 : 𝑊 → 𝑋

𝑇 · (𝑆1 + 𝑆2 ) = 𝑇 · 𝑆1 + 𝑇 · 𝑆2

Important: Say 𝑉 is a finite dimensional vector space over 𝔽, and 𝑣®1 , . . . , 𝑣®𝑛 is a basis for 𝑉 .
Then a linear map 𝑇 : 𝑉 → 𝑊 is determined by the values 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑛 ).
Reason: Let 𝑣® ∈ 𝑉 = span 𝑣®1 , . . . , 𝑣®𝑛 .

This implies that 𝑣® = 𝜆1 𝑣®1 + · · · + 𝜆𝑛 𝑣®𝑛 for some and unique 𝜆1 , . . . , 𝜆𝑛 ∈ 𝔽.
Then:

𝑇(®
𝑣 ) = 𝑇(𝜆1 𝑣®1 + · · · + 𝜆𝑛 𝑣®𝑛 )
= 𝑇(𝜆1 𝑣®1 ) + · · · + 𝑇(𝜆𝑛 𝑣®𝑛 )
= 𝜆1 𝑇(𝑣®1 ) + · · · + 𝜆𝑛 𝑇(𝑣®𝑛 )

Theorem 3.1.1 Axler 3.5


Now suppose that 𝑤®1 , . . . , 𝑤®𝑛 ∈ 𝑊, not necessarily a basis.
Then there is exactly one linear map 𝑇 : 𝑉 → 𝑊 mapping
the basis 𝑣®1 , . . . , 𝑣®𝑛 to the vectors 𝑤®1 , . . . , 𝑤®𝑛 respectively.
Meaning that 𝑇(𝑣®𝑖 ) = 𝑤®𝑖 for all 𝑖 = 1, . . . , 𝑛.
Again: 𝑣® ∈ 𝑉 , 𝑣® = 𝜆1 𝑣®1 + · · · + 𝜆𝑛 𝑣®𝑛 .
Then:

𝑇(®
𝑣 ) = 𝑇(𝜆1 𝑣®1 + · · · + 𝜆𝑛 𝑣®𝑛 ) = 𝜆1 𝑇(𝑣®1 ) + · · · + 𝜆𝑛 𝑇(𝑣®𝑛 ) = 𝜆1 𝑤®1 + · · · + 𝜆𝑛 𝑤®𝑛

3.2 Null spaces and Ranges
Definition 3.2.1: Kernels or null spaces

Let 𝑇 : 𝑉 → 𝑊 be a linear map.

The kernel (null space) of 𝑇 is ker 𝑇 ≔ {𝑣® ∈ 𝑉 : 𝑇(𝑣®) = 0®𝑊 }

Note:-
The image on our canvas page is this definition.

Note:-
We know that 𝑇(0®𝑉 ) = 0®𝑊 , so 0®𝑉 ∈ ker 𝑇.

Example 3.2.1

(a) ker(0) = 𝑉

0: 𝑉 → 𝑊
𝑣 ↦→ 0𝑊
(b) ker(id𝑉 ) = {0®𝑉 }

id𝑉 : 𝑉 → 𝑉
𝑣 ↦→ 𝑣

(c)

𝐷 : 𝑃(ℝ) → 𝑃(ℝ)
𝑝(𝑥) ↦→ 𝑝0(𝑥)

Then:

ker 𝐷 = {𝑝(𝑥) ∈ 𝑃(ℝ) : 𝑝0(𝑥) = 0}


= {𝑝(𝑥) ∈ 𝑃(ℝ) : 𝑝(𝑥) = 𝑎0 }
= {𝑎 0 : 𝑎0 ∈ ℝ}
=ℝ

(d) Shift

𝑆 : 𝔽∞ → 𝔽∞
(𝑥 1 , 𝑥 2 , . . .) ↦→ (𝑥 2 , 𝑥 3 , . . .)

Then:

ker 𝑆 = {(𝑥1 , 𝑥2 , . . .) ∈ 𝔽∞ | (𝑥2 , 𝑥3 , . . .) = 0®𝔽∞ }
= {(𝑥1 , 0, 0, . . .) ∈ 𝔽∞ | 𝑥1 ∈ 𝔽}

Proposition 3.2.1
In general, ker 𝑇 is a subspace of 𝑉.
Proof: Let 𝑇 : 𝑉 → 𝑊 be a linear map.
Now we want to check 1.34:

(i) 𝑇(0®𝑉 ) ∈ ker 𝑇 as 𝑇(0®𝑉 ) = 0®𝑊 .


(ii) Closed under addition: Let 𝑢® , 𝑣® ∈ ker 𝑇 ⊆ 𝑉.
We want to show that 𝑢® +𝑉 𝑣® ∈ ker 𝑇.

𝑇(𝑢® +𝑉 𝑣®) = 𝑇(𝑢®) +𝑊 𝑇(𝑣®)
= 0®𝑊 +𝑊 0®𝑊
= 0®𝑊

Thus, 𝑢® +𝑉 𝑣® ∈ ker 𝑇.

(iii) Closed under scalar multiplication: Let 𝑢® ∈ ker 𝑇 and 𝜆 ∈ 𝔽.


We want to show that 𝜆 ·𝑉 𝑢® ∈ ker 𝑇.

𝑇(𝜆 ·𝑉 𝑢®) = 𝜆 ·𝑊 𝑇(𝑢®)
= 𝜆 ·𝑊 0®𝑊
= 0®𝑊

Thus, 𝜆 ·𝑉 𝑢® ∈ ker 𝑇.

Therefore, ker 𝑇 is a subspace of 𝑉.

Definition 3.2.2: Injective

A linear map is injective if:

𝑇(𝑢®) = 𝑇(𝑣®) =⇒ 𝑢® = 𝑣® (equal outputs must come from equal inputs)

The contrapositive:

𝑢® ≠ 𝑣® =⇒ 𝑇(𝑢®) ≠ 𝑇(𝑣®) (unequal inputs give unequal outputs)

Proposition 3.2.2
Let 𝑇 : 𝑉 → 𝑊 be a linear map.
Then 𝑇 is injective if and only if ker 𝑇 = {0®𝑉 }.

Proof of =⇒ : Assume 𝑇 : 𝑉 → 𝑊 is injective.


We know that 0®𝑉 ∈ ker 𝑇, so we only need to show that ker 𝑇 ⊆ {0®𝑉 }.
Let 𝑣® ∈ ker 𝑇.
Then 𝑇(𝑣®) = 0®𝑊 = 𝑇(0®𝑉 ). Since 𝑇 is injective, 𝑣® = 0®𝑉 .
n o
Proof of ⇐= : We are given that ker 𝑇 = 0®𝑉 .
We want to show that 𝑇 is injective.
Suppose 𝑇(® 𝑢 ) = 𝑇(®𝑣 ).
Then 𝑇(® 𝑢 ) − 𝑇(®𝑣 ) = 0®𝑊 .
By linearity, 𝑇(® 𝑢 − 𝑣® ) = 0®𝑊 .
Thus, 𝑢® − 𝑣® ∈ ker 𝑇.
This means that 𝑢® − 𝑣® = 0®𝑉 .
Therefore, 𝑢® − 𝑣® = 0®𝑉 =⇒ 𝑢® = 𝑣® .

As we have proven both directions, we have proven the proposition.

Definition 3.2.3: Images

Let 𝑇 ∈ ℒ(𝑉 , 𝑊). Then the image of 𝑇 is 𝐼𝑚(𝑇) = {𝑤 ∈ 𝑊 | 𝑤 = 𝑇(𝑣) for some 𝑣 ∈ 𝑉 }.
Also denoted as Range (𝑇).
It is a subspace of 𝑊 (Axler 3.19).

Example 3.2.2
(i) 𝐼𝑚(0) = {0®𝑊 }

0: 𝑉 → 𝑊
𝑣 ↦→ 0®𝑊

(ii) 𝐼𝑚(id𝑉 ) = 𝑉

id𝑉 : 𝑉 → 𝑉
𝑣 ↦→ 𝑣

(iii) 𝐼𝑚(𝐷) = 𝑃(ℝ)

𝐷 : 𝑃(ℝ) → 𝑃(ℝ)
𝑝(𝑥) ↦→ 𝑝0(𝑥)

(iv) An example of polynomials with 𝑚 = 5

𝐷 : 𝑃5 (ℝ) → 𝑃5 (ℝ)
Note: 𝑥 5 ∉ 𝐼𝑚(𝐷5 )

Definition 3.2.4: Surjective

A map 𝑇 : 𝑉 → 𝑊 is surjective if
for any 𝑤 ∈ 𝑊 there is a 𝑣 ∈ 𝑉 such that 𝑇(𝑣) = 𝑤.
i.e., 𝑇 is surjective if (and only if) 𝐼𝑚(𝑇) = 𝑊.

Theorem 3.2.1 Rank-nullity Theorem (Fundamental Theorem of linear Maps)


Let 𝑉 be a finite dimensional vector space over 𝔽 and 𝑇 : 𝑉 → 𝑊 be a linear map.
Then 𝐼𝑚(𝑇) is a finite dimensional vector space, and

dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇)

Proof: Let 𝑉 be a finite dimensional vector space over 𝔽 and ker 𝑇 ⊆ 𝑉 be a subspace.
This means that ker 𝑇 is finite dimensional.
Let 𝑢®1 , . . . , 𝑢®𝑛 be a basis for ker 𝑇,
which means that 𝑢®1 , . . . , 𝑢®𝑛 is linearly independent in ker 𝑇.
Therefore, it also linearly independent in 𝑉.
We can extend this list to a full basis 𝑢®1 , . . . , 𝑢®𝑛 , 𝑣®1 , . . . , 𝑣®𝑚 for 𝑉.
Then dim 𝑉 = 𝑛 + 𝑚, and dim ker 𝑇 = 𝑛
Claim: 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) is a basis for 𝐼𝑚(𝑇).
Thus, if the claim is true, then 𝐼𝑚(𝑇) is finite dimensional and dim 𝐼𝑚(𝑇) = 𝑚.
Thus, dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇).

Proof of claim: We need to show that 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) is linearly independent


in 𝐼𝑚(𝑇) and spans 𝐼𝑚(𝑇).

(i) 𝐼𝑚(𝑇) = span 𝑇(𝑣®1 ) . . . , 𝑇(𝑣®𝑚 ) : ⊇ definition of span




(ii) We want to prove ⊆.


Let 𝑤 ∈ 𝐼𝑚(𝑇).
Then there is a 𝑣 ∈ 𝑉 such that 𝑇(𝑣) = 𝑤.
We know that 𝑣 = 𝑎1 𝑢®1 + · · · + 𝑎 𝑛 𝑢®𝑛 + 𝑏 1 𝑣®1 + · · · + 𝑏 𝑚 𝑣®𝑚 for some 𝑎1 , . . . , 𝑎 𝑛 , 𝑏 1 , . . . , 𝑏 𝑚 ∈ 𝔽.
Then:

𝑇(𝑣) = 𝑇(𝑎1 𝑢®1 + · · · + 𝑎𝑛 𝑢®𝑛 + 𝑏1 𝑣®1 + · · · + 𝑏𝑚 𝑣®𝑚 )
= 𝑎1 𝑇(𝑢®1 ) + · · · + 𝑎𝑛 𝑇(𝑢®𝑛 ) + 𝑏1 𝑇(𝑣®1 ) + · · · + 𝑏𝑚 𝑇(𝑣®𝑚 )
(and we know that 𝑇(𝑢®1 ) = · · · = 𝑇(𝑢®𝑛 ) = 0®𝑊 )
= 𝑏1 𝑇(𝑣®1 ) + · · · + 𝑏𝑚 𝑇(𝑣®𝑚 )
∈ span(𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ))


Thus, this shows that 𝐼𝑚(𝑇) ⊆ span 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) .




(iii) 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) are linearly independent in 𝐼𝑚(𝑇) :


Suppose that 𝑐1 𝑇(𝑣®1 ) + · · · + 𝑐 𝑚 𝑇(𝑣®𝑚 ) = 0®𝑊 for some 𝑐 1 , . . . , 𝑐 𝑚 ∈ 𝔽.
Thus, 𝑇(𝑐1 𝑣®1 ) + · · · + 𝑇(𝑐 𝑚 𝑣®𝑚 ) = 0®𝑊 .
Then 𝑇(𝑐 1 𝑣®1 + · · · + 𝑐 𝑚 𝑣®𝑚 ) = 0®𝑊 .
Hence, 𝑐1 𝑣®1 + · · · + 𝑐 𝑚 𝑣®𝑚 ∈ ker 𝑇 = span 𝑢®1 , . . . , 𝑢®𝑛 .


Thus, 𝑐 1 𝑣®1 + · · · + 𝑐 𝑚 𝑣®𝑚 = 𝑑1 𝑢®1 + · · · + 𝑑𝑛 𝑢®𝑛 for some 𝑑1 , . . . , 𝑑𝑛 ∈ 𝔽.


Then 𝑑1 𝑢®1 + · · · + 𝑑𝑛 𝑢®𝑛 − 𝑐 1 𝑣®1 − · · · − 𝑐 𝑚 𝑣®𝑚 = 0®𝑉 .

Since 𝑢®1 , . . . , 𝑢®𝑛 , 𝑣®1 , . . . , 𝑣®𝑚 are linearly independent in 𝑉,
it follows that 𝑑1 = · · · = 𝑑𝑛 = 𝑐 1 = · · · = 𝑐 𝑚 = 0.
Thus, 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) are linearly independent in 𝐼𝑚(𝑇).

As we have shown that 𝑇(𝑣®1 ), . . . , 𝑇(𝑣®𝑚 ) are linearly independent in 𝐼𝑚(𝑇) and span 𝐼𝑚(𝑇), we have proven
the claim.

Thus, we have proven the theorem.

Application: Suppose we have a system of linear equations:


Variables 𝑥1 , . . . , 𝑥𝑛 , coefficients 𝑎𝑖,𝑗 ∈ ℝ.
The system of equations is:

𝑎1,1 𝑥1 + · · · + 𝑎1,𝑛 𝑥𝑛 = 0ℝ
⋮
𝑎𝑚,1 𝑥1 + · · · + 𝑎𝑚,𝑛 𝑥𝑛 = 0ℝ

Thus, there are 𝑚 equations in 𝑛 unknowns.

One solution: Let 𝑥1 = · · · = 𝑥 𝑛 = 0ℝ .


Then the system is satisfied.
But are there others?
Rephrase: Let’s rephrase this in terms of linear maps:

𝑇 : ℝ𝑛 → ℝ𝑚

 𝑥1   𝑎1,1 𝑥1 + · · · + 𝑎1,𝑛 𝑥 𝑛 
 ..  ..
   
 .  ↦→
 
 . 
𝑥 𝑛   𝑎 𝑚,1 𝑥1 + · · · + 𝑎 𝑚,𝑛 𝑥 𝑛 
   
   
We can check 𝑇 is linear!
Thus, 𝑥1 = · · · = 𝑥𝑛 = 0 gives 0®ℝ𝑛 ∈ ker 𝑇.
Rank-nullity: By the theorem, we know that dim ℝ𝑛 = dim ker 𝑇 + dim 𝐼𝑚(𝑇), where dim ℝ𝑛 = 𝑛 and dim 𝐼𝑚(𝑇) ≤ 𝑚.
Thus, 𝑛 ≤ dim ker 𝑇 + 𝑚.
As such, dim ker 𝑇 ≥ 𝑛 − 𝑚.
Suppose that 𝑛 − 𝑚 > 0 (more variables than equations).
Therefore, dim ker 𝑇 > 0.
Meaning that there are non-zero solutions to the system of equations.
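A small numerical illustration (a Python sketch with an arbitrary 2 × 3 coefficient matrix, my own example) of this use of rank-nullity: with 𝑛 = 3 variables and 𝑚 = 2 equations, dim ker 𝑇 = 𝑛 − rank ≥ 1, so a nonzero solution exists.

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # m = 2 equations, n = 3 unknowns
m, n = A.shape
rank = np.linalg.matrix_rank(A)        # dim Im(T) = 2
nullity = n - rank                     # dim ker T = 1 by rank-nullity
print(rank, nullity)

# One explicit nonzero solution of A x = 0 for this particular A:
x = np.array([1.0, -2.0, 1.0])
print(A @ x)                           # [0. 0.]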
Note:-
Is ker 𝑇 = {0®ℝ𝑛 } = {(0, . . . , 0)}?
Or is there something else?

Theorem 3.2.2
Let 𝑉 , 𝑊 be finite dimensional vector spaces over 𝔽 with dim 𝑉 > dim 𝑊.
Then any linear map 𝑇 : 𝑉 → 𝑊 is not injective, i.e., ker 𝑇 ≠ {0®𝑉 }.

Proof:

dim ker 𝑇 = dim 𝑉 − dim 𝐼𝑚(𝑇) By Rank-nullity


≥ dim 𝑉 − dim 𝑊 Since 𝐼𝑚(𝑇) ⊆ 𝑊 =⇒ dim 𝐼𝑚(𝑇) ≤ dim 𝑊
>0 by hypothesis
Thus, ker 𝑇 ≠ {0®𝑉 }.

Note:-
Going back to systems of linear equations:

Theorem =⇒ if 𝑛 > 𝑚 then 𝑇 : ℝ𝑛 → ℝ𝑚 is not injective
=⇒ ker 𝑇 ≠ {0®ℝ𝑛 }
=⇒ there are non-zero solutions to the system of equations

Look at Axler 3.24 and 3.27 for more information.

3.3 Matrix of a linear map
Definition 3.3.1: Matrix of a linear map

Let 𝑉 , 𝑊 be finite dimensional vector spaces over 𝔽, and 𝑇 ∈ ℒ(𝑉 , 𝑊).


Choose basis:

𝑣®1 , . . . , 𝑣®𝑛 for 𝑉


𝑤®1 , . . . , 𝑤®𝑚 for 𝑊

Now, we can write:

𝑇(𝑣®1 ) ∈ 𝑊 = span 𝑤®1 , . . . , 𝑤®𝑚 =⇒ 𝑇(𝑣®1 ) = 𝑎1,1 𝑤®1 + · · · + 𝑎 𝑚,1 𝑤®𝑚 , 𝑎 𝑖,1 ∈ 𝔽


..
.
𝑇(𝑣®𝑛 ) ∈ 𝑊 = span 𝑤®1 , . . . , 𝑤®𝑚 =⇒ 𝑇(𝑣®𝑛 ) = 𝑎1,𝑛 𝑤®1 + · · · + 𝑎 𝑚,𝑛 𝑤®𝑚 , 𝑎 𝑖,𝑛 ∈ 𝔽


Recall: A linear map is determined by what it does to a basis.


This implies that the array of coefficients in 𝔽 determines 𝑇:
⎡ 𝑎1,1 · · · 𝑎1,𝑛 ⎤
⎢   ⋮    ⋱    ⋮  ⎥
⎣ 𝑎𝑚,1 · · · 𝑎𝑚,𝑛 ⎦
This is called the matrix of 𝑇, with respect to the bases 𝑣®1 , . . . , 𝑣®𝑛 and 𝑤®1 , . . . , 𝑤®𝑚 .
Where, the above is an 𝑚 × 𝑛 matrix, where 𝑚 is the number of rows and 𝑛 is the number of columns.
Note:-
Notation:

ℳ(𝑇, (𝑣®1 , . . . , 𝑣®𝑛 ),(𝑤®1 , . . . , 𝑤®𝑚 )) or


ℳ(𝑇)

Example 3.3.1
Let 𝑇 : ℝ2 → ℝ3 be a linear map.
With (𝑥, 𝑦) ↦→ (𝑥 + 3𝑦, 2𝑥 + 5𝑦, 7𝑥 + 9𝑦).
Choose standard bases:

𝑣1 = (1, 0) and 𝑣2 = (0, 1) for ℝ2

𝑤1 = (1, 0, 0), 𝑤2 = (0, 1, 0), 𝑤3 = (0, 0, 1) for ℝ3

Then, we can write:

𝑇(𝑣1 ) = 𝑇((1, 0)) = (1, 2, 7) = 1 ·ℝ 𝑤 1 + 2 ·ℝ 𝑤 2 + 7 ·ℝ 𝑤 3


𝑇(𝑣2 ) = 𝑇((0, 1)) = (3, 5, 9) = 3 ·ℝ 𝑤 1 + 5 ·ℝ 𝑤 2 + 9 ·ℝ 𝑤 3

1 3
Thus, ℳ(𝑇) = 2 5 .
 
7 9
 

Example 3.3.2
Differentiation:

𝐷 ∈ ℒ(𝑃3 (ℝ), 𝑃2 (ℝ))


𝐷(𝑝(𝑥)) = 𝑝‘(𝑥)

Check bases:

1, 𝑥, 𝑥 2 , 𝑥 3 for 𝑃3 (ℝ)
| {z }
𝑉1 ,𝑉2 ,𝑉3 ,𝑉4

1, 𝑥, 𝑥 2
for 𝑃2 (ℝ)
| {z }
𝑊1 ,𝑊2 ,𝑊3

Then:

𝐷(𝑣1 ) = 𝐷(1) = 0 = 0 ·ℝ 1 + 0 ·ℝ 𝑥 + 0 ·ℝ 𝑥 2
𝐷(𝑣2 ) = 𝐷(𝑥) = 1 = 1 ·ℝ 1 + 0 ·ℝ 𝑥 + 0 ·ℝ 𝑥 2
𝐷(𝑣3 ) = 𝐷(𝑥 2 ) = 2𝑥 = 0 ·ℝ 1 + 2 ·ℝ 𝑥 + 0 ·ℝ 𝑥 2
𝐷(𝑣4 ) = 𝐷(𝑥 3 ) = 3𝑥 2 = 0 ·ℝ 1 + 0 ·ℝ 𝑥 + 3 ·ℝ 𝑥 2

Thus,

ℳ(𝐷) =
⎡ 0 1 0 0 ⎤
⎢ 0 0 2 0 ⎥
⎣ 0 0 0 3 ⎦

Addition of Matrices: Let 𝑉 , 𝑊 be finite dimensional vector spaces over 𝔽.


If 𝑆, 𝑇 ∈ ℒ(𝑉 , 𝑊), then define 𝑆 + 𝑇 ∈ ℒ(𝑉 , 𝑊) by:

(𝑆 + 𝑇)(𝑣) ≔ 𝑆(𝑣) + 𝑇(𝑣)


What is the matrix of 𝑆 + 𝑇?
Choose bases 𝑣®1 , . . . , 𝑣®𝑛 for 𝑉 and 𝑤®1 , . . . , 𝑤®𝑚 for 𝑊.
Then:

𝑇(𝑣 𝑘 ) = 𝑎 1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚 1≤𝑘≤𝑛


𝑆(𝑣 𝑘 ) = 𝑏1,𝑘 𝑤®1 + · · · + 𝑏 𝑚,𝑘 𝑤®𝑚 1≤𝑘≤𝑛

Thus,
ℳ(𝑇, {𝑣’s} , {𝑤’s}) =
⎡ 𝑎1,1 · · · 𝑎1,𝑛 ⎤
⎢   ⋮    ⋱    ⋮  ⎥
⎣ 𝑎𝑚,1 · · · 𝑎𝑚,𝑛 ⎦
And,
ℳ(𝑆, {𝑣’s} , {𝑤’s}) =
⎡ 𝑏1,1 · · · 𝑏1,𝑛 ⎤
⎢   ⋮    ⋱    ⋮  ⎥
⎣ 𝑏𝑚,1 · · · 𝑏𝑚,𝑛 ⎦
Therefore:

(𝑆 + 𝑇)(𝑣 𝑘 ) = 𝑆(𝑣 𝑘 ) + 𝑇(𝑣 𝑘 )


= (𝑏1,𝑘 𝑤®1 + · · · + 𝑏 𝑚,𝑘 𝑤®𝑚 ) + (𝑎1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚 )
= (𝑎1,𝑘 + 𝑏 1,𝑘 )𝑤®1 + · · · + (𝑎 𝑚,𝑘 + 𝑏 𝑚,𝑘 )𝑤®𝑚

Thus,

ℳ(𝑆 + 𝑇, {𝑣’s} , {𝑤’s}) =
⎡ 𝑎1,1 + 𝑏1,1 · · · 𝑎1,𝑛 + 𝑏1,𝑛 ⎤
⎢      ⋮        ⋱        ⋮     ⎥
⎣ 𝑎𝑚,1 + 𝑏𝑚,1 · · · 𝑎𝑚,𝑛 + 𝑏𝑚,𝑛 ⎦
So we define addition of matrices so that:

ℳ(𝑆) + ℳ(𝑇) ≔ ℳ(𝑆 +ℒ(𝑉 ,𝑊) 𝑇)


|{z}
we defined this!

Scalar Multiplication: Let 𝑇 ∈ ℒ(𝑉 , 𝑊) with bases 𝑣®1 , . . . , 𝑣®𝑛 for 𝑉 and 𝑤®1 , . . . , 𝑤®𝑚 for 𝑊.
Remember that 𝑀(𝑇) = 𝑀(𝑇, (𝑣®1 , . . . , 𝑣®𝑛 ), (𝑤®1 , . . . , 𝑤®𝑚 )).
This looks like:
 𝑎1,1 ··· 𝑎1,𝑛 
 .. .. .. 

 . . . 
 𝑎 𝑚,1 𝑎 𝑚,𝑛 
 
 ···
i.e,. 𝑇(𝑣®𝑘 ) = 𝑎1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚 .
Then, for 𝜆 ∈ 𝔽, we define 𝜆𝑇 ∈ ℒ(𝑉 , 𝑊) by:

𝜆 · 𝑀(𝑇) ≔ 𝑀(𝜆𝑇)
We compute:

(𝜆 · 𝑇)(𝑣®𝑘 ) ≔ 𝜆 · 𝑇(𝑣®𝑘 )
= 𝜆 · (𝑎1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚 )
 𝜆𝑎1,1 ··· 𝜆𝑎 1,𝑛 
=⇒ 𝑀(𝜆 · 𝑇) =  ... .. .. 

.

. 
𝜆𝑎 𝑚,1 𝜆𝑎 𝑚,𝑛 
 
 ···

In other words:
 𝑎1,1 ··· 𝑎1,𝑛  𝜆𝜆𝑎1,1 ··· 𝜆𝑎1,𝑛 
𝜆 ·  ... ..  =
..  .. .. .. 
 
. .

.   . . 
 𝑎 𝑚,1 𝑎 𝑚,𝑛   𝜆𝑎 𝑚,1 𝜆𝑎 𝑚,𝑛 
   
 ···  ···

Notational shift: Let 𝐹 𝑚,𝑛 ≔ {𝑚 × 𝑛 matrices with entries in 𝔽}


Having addition + scalar multiplication implies that 𝐹 𝑚,𝑛 is a vector space over 𝔽.
Soon: 𝐹 𝑚,𝑛  ℒ(𝔽𝑛 , 𝔽𝑚 ).

Composition of maps:
𝑆 𝑇
𝑈→
− 𝑉→
− 𝑍
Now pick bases: 𝑢®1 , . . . , 𝑢®𝑝 for 𝑈, 𝑣®1 , . . . , 𝑣®𝑛 for 𝑉, and 𝑤®1 , . . . , 𝑤®𝑚 for 𝑊.
Let 𝑗 = 1, . . . , 𝑝 and 𝑘 = 1, . . . , 𝑛.
Then:

𝑆(𝑢®𝑗 ) = 𝑏 1,𝑗 𝑣®1 + · · · + 𝑏 𝑛,𝑗 𝑣®𝑛


𝑇(𝑣®𝑘 ) = 𝑎1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚

Now remember, 𝑀(𝑆, (𝑢®1 , . . . , 𝑢®𝑝 ), (𝑣®1 , . . . , 𝑣®𝑛 )) = 𝑀(𝑆)


𝑏1,1 ··· 𝑏 1,𝑝 
𝑀(𝑆) =  ... .. .. 

.

. 
𝑏 𝑛,1 𝑏 𝑛,𝑝 
 
 ···
And 𝑀(𝑇, (𝑣®1 , . . . , 𝑣®𝑛 ), (𝑤®1 , . . . , 𝑤®𝑚 )) = 𝑀(𝑇)
 𝑎1,1 ··· 𝑎1,𝑛 
𝑀(𝑇) =  ... .. .. 

.

. 
 𝑎 𝑚,1 𝑎 𝑚,𝑛 
 
 ···
Now, let’s define 𝑀(𝑇) · 𝑀(𝑆) ≔ 𝑀(𝑇 ◦ 𝑆)
What is 𝑀(𝑇 ◦ 𝑆)?
We know that 𝑇 ◦ 𝑆 ∈ ℒ(𝑈 , 𝑊) with bases 𝑢®1 , . . . , 𝑢®𝑝 for 𝑈 and 𝑤®1 , . . . , 𝑤®𝑚 for 𝑊.
Let’s look how the 𝑗 th column of 𝑀(𝑇 ◦ 𝑆) is determined by 𝑀(𝑆) and 𝑀(𝑇).

(𝑇 ◦ 𝑆)(𝑢®𝑗 ) = 𝑇(𝑆(𝑢®𝑗 ))
= 𝑇(𝑏 1,𝑗 𝑣®1 + · · · + 𝑏 𝑛,𝑗 𝑣®𝑛 )
= 𝑏 1,𝑗 𝑇(𝑣®1 ) + · · · + 𝑏 𝑛,𝑗 𝑇(𝑣®𝑛 ) by linearity
= 𝑏 1,𝑗 (𝑎 1,1 𝑤®1 + · · · + 𝑎 𝑚,1 𝑤®𝑚 ) + · · · + 𝑏 𝑛,𝑗 (𝑎1,𝑛 𝑤®1 + · · · + 𝑎 𝑚,𝑛 𝑤®𝑚 )
= (𝑎 1,1 · 𝑏 1,𝑗 + · · · + 𝑎 1,𝑛 · 𝑏 𝑛,𝑗 )𝑤®1 + · · · + (𝑎 𝑚,1 · 𝑏 1,𝑗 + · · · + 𝑎 𝑚,𝑛 · 𝑏 𝑛,𝑗 )𝑤®𝑚

All told:
𝑛 𝑛 𝑛
! ! !
Õ Õ Õ
(𝑇 ◦ 𝑆)(𝑢®𝑗 ) = 𝑎 1,𝑘 · 𝑏 𝑘,𝑗 𝑤®1 + 𝑎 2,𝑘 · 𝑏 𝑘,𝑗 𝑤®2 + · · · + 𝑎 𝑚,𝑘 · 𝑏 𝑘,𝑗 𝑤®𝑚
𝑘=1 𝑘=1 𝑘=1

So for the 𝑗 th column of 𝑀(𝑇 ◦ 𝑆), we have:

the 𝑗th column of 𝑀(𝑇 ◦ 𝑆, (𝑢®1 , . . . , 𝑢®𝑝 ), (𝑤®1 , . . . , 𝑤®𝑚 )) is the column vector

( Σ𝑘 𝑎1,𝑘 · 𝑏𝑘,𝑗 , Σ𝑘 𝑎2,𝑘 · 𝑏𝑘,𝑗 , . . . , Σ𝑘 𝑎𝑚,𝑘 · 𝑏𝑘,𝑗 ), with each sum over 𝑘 = 1, . . . , 𝑛.
Thus, the 𝑖𝑗-th entry of 𝑀(𝑇 ◦ 𝑆) is Σ𝑘 𝑎𝑖,𝑘 · 𝑏𝑘,𝑗 (sum over 𝑘 = 1, . . . , 𝑛).
So the matrix multiplication looks like:

[𝑎𝑖,𝑗 ] · [𝑏𝑖,𝑗 ] = [ Σ𝑘 𝑎𝑖,𝑘 · 𝑏𝑘,𝑗 ],

where the first factor is an 𝑚 × 𝑛 matrix, the second is an 𝑛 × 𝑝 matrix, and the product is an 𝑚 × 𝑝 matrix.

Theorem 3.3.1 Matrix multiplication is associative
Let 𝐴 = (𝑎𝑖,𝑗 ) be an 𝑚 × 𝑛 matrix with entries 𝑎𝑖,𝑗 ∈ ℝ, and define

𝑇𝐴 : ℝ𝑛 → ℝ𝑚
𝑒𝑘 = (0, . . . , 1, . . . , 0) (with the 1 in the 𝑘th place) ↦→ (𝑎1,𝑘 , 𝑎2,𝑘 , . . . , 𝑎𝑚,𝑘 )

so that 𝐴 = 𝑀(𝑇𝐴 , standard basis of ℝ𝑛 , standard basis of ℝ𝑚 ).

Let 𝐴, 𝐵, 𝐶 be matrices with 𝑚 × 𝑛, 𝑛 × 𝑝, 𝑝 × 𝑟 dimensions respectively with entries in ℝ.


Then:

𝐴 · (𝐵 · 𝐶) = (𝐴 · 𝐵) · 𝐶
Proof: Let

𝑇𝐴 : ℝ𝑛 → ℝ𝑚 with 𝐴 = 𝑀(𝑇𝐴 )
𝑇𝐵 : ℝ𝑝 → ℝ𝑛 with 𝐵 = 𝑀(𝑇𝐵 )
𝑇𝐶 : ℝ𝑟 → ℝ𝑝 with 𝐶 = 𝑀(𝑇𝐶 )

Then:

𝐴 · (𝐵 · 𝐶) = 𝑀(𝑇𝐴 ) · (𝑀(𝑇𝐵 ) · 𝑀(𝑇𝐶 ))


= 𝑀(𝑇𝐴 ) · 𝑀(𝑇𝐵 ◦ 𝑇𝐶 )
= 𝑀(𝑇𝐴 ◦ (𝑇𝐵 ◦ 𝑇𝐶 ))
= 𝑀((𝑇𝐴 ◦ 𝑇𝐵 ) ◦ 𝑇𝐶 )
= 𝑀(𝑇𝐴 ◦ 𝑇𝐵 ) · 𝑀(𝑇𝐶 )
= (𝑀(𝑇𝐴 ) · 𝑀(𝑇𝐵 )) · 𝑀(𝑇𝐶 )
= (𝐴 · 𝐵) · 𝐶
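A numerical sketch of my own (Python): matrix multiplication implemented directly from the entry formula above, followed by a spot check of associativity on random matrices.

import numpy as np

def matmul(A, B):
    # (A B)_{ij} = sum over k of A_{ik} B_{kj}
    m, n = A.shape
    n2, p = B.shape
    assert n == n2
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))
print(np.allclose(matmul(A, matmul(B, C)), matmul(matmul(A, B), C)))  # True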

3.4 Invertible Linear Maps
Definition 3.4.1: Invertible

A linear map 𝑇 ∈ ℒ(𝑉 , 𝑊) is invertible if there is an 𝑆 ∈ ℒ(𝑊 , 𝑉) such that 𝑇 ◦ 𝑆 = 𝐼𝑑𝑊 and 𝑆 ◦ 𝑇 = 𝐼𝑑𝑉 .


Then we declare 𝑆 ≔ the inverse of 𝑇, and write 𝑆 = 𝑇 −1 .
Inverses if they exist are unique.
Reason: Say 𝑆1 , 𝑆2 ∈ ℒ(𝑊 , 𝑉) are inverses for 𝑇 ∈ ℒ(𝑉 , 𝑊).
Then:

𝑆1 = 𝑆1 ◦ 𝐼𝑑𝑊 = 𝑆1 ◦ (𝑇 ◦ 𝑆2 ) (since 𝑆2 is an inverse)
= (𝑆1 ◦ 𝑇) ◦ 𝑆2 = 𝐼𝑑𝑉 ◦ 𝑆2 (since 𝑆1 is an inverse)
= 𝑆2

Theorem 3.4.1
A map 𝑇 ∈ ℒ(𝑉 , 𝑊) is invertible if and only if 𝑇 is bijective (injective and surjective).

Proof of =⇒ : Say that 𝑇 ∈ ℒ(𝑉 , 𝑊) is invertible. Let 𝑇 −1 : 𝑊 → 𝑉 be the inverse.

(i) 𝑇 is injective: Suppose that for some 𝑢, 𝑣 ∈ 𝑉, we have 𝑇(𝑢) = 𝑇(𝑣).


Then:

𝑢 = 𝑇 −1 (𝑇(𝑢)) = 𝑇 −1 (𝑇(𝑣)) = 𝑣

(ii) 𝑇 is surjective: Let 𝑤 ∈ 𝑊 be arbitrary.

𝑤 = 𝑇(𝑇 −1 (𝑤)) =⇒ 𝑤 ∈ 𝐼𝑚(𝑇)

So 𝑊 ⊆ 𝐼𝑚(𝑇).

Thus, 𝑇 is bijective.

Proof of ⇐= : Say that 𝑇 ∈ ℒ(𝑉 , 𝑊) is bijective i.e., 𝑇 is injective and surjective.


Let’s construct an inverse:

𝑆: 𝑊 → 𝑉
𝑤 ↦→ the unique 𝑣 ∈ 𝑉 such that 𝑇(𝑣) = 𝑤

Thus, the existence of 𝑣 is guaranteed by surjectivity.


And the uniqueness of 𝑣 is guaranteed by injectivity.

Check: We have three things to check:

(i) 𝑇 ◦ 𝑆 = 𝐼𝑑𝑊 i.e., 𝑇(𝑆(𝑤)) = 𝑤 for all 𝑤 ∈ 𝑊.


Then 𝑇(𝑆(𝑤)) = 𝑇(𝑣) where 𝑣 ∈ 𝑉 is the unique vector such that 𝑇(𝑣) = 𝑤.
Thus, 𝑇(𝑆(𝑤)) = 𝑤.

(ii) 𝑆 ◦ 𝑇 = 𝐼𝑑𝑣 . We want 𝑆(𝑇(𝑣)) = 𝑣 for all 𝑣 ∈ 𝑉.

𝑇(𝑆(𝑇(𝑣))) = (𝑇 ◦ (𝑆 ◦ 𝑇))(𝑣) 𝑇 injective


= ((𝑇 ◦ 𝑆) ◦ 𝑇)(𝑣) =⇒ 𝑆(𝑇(𝑣)) = 𝑣
= (𝑇 ◦ 𝑆)(𝑇(𝑣))
= 𝐼𝑑𝑊 (𝑇(𝑣))
= 𝑇(𝑣)

(iii) We need to check that 𝑆 is linear.

(a) Additivity:
On one hand we have:

𝑇(𝑆(𝑤 1 ) + 𝑆(𝑤 2 )) = 𝑇(𝑆(𝑤 1 )) + 𝑇(𝑆(𝑤 2 )) 𝑇 is linear


= 𝐼𝑑𝑊 (𝑤 1 ) + 𝐼𝑑𝑊 (𝑤2 )
= 𝑤1 + 𝑤2

On the other hand:

𝑇(𝑆(𝑤 1 ) + 𝑆(𝑤 2 )) = (𝑇 ◦ 𝑆)(𝑤 1 + 𝑤 2 )


= 𝐼𝑑𝑊 (𝑤 1 + 𝑤 2 )
= 𝑤1 + 𝑤2

As 𝑇 is injective, we know that 𝑆(𝑤1 ) + 𝑆(𝑤 2 ) = 𝑆(𝑤1 + 𝑤 2 ).


(b) Homogeneity: So on one hand we have:

𝑇(𝜆 · 𝑆(𝑤)) = 𝜆 · 𝑇(𝑆(𝑤))


= 𝜆 · (𝑇 ◦ 𝑆)(𝑤)
= 𝜆 · 𝐼𝑑𝑊 (𝑤)
=𝜆·𝑤

On the other hand:

𝑇(𝑆(𝜆 · 𝑤)) = (𝑇 ◦ 𝑆)(𝜆 · 𝑤)


= 𝐼𝑑𝑊 (𝜆 · 𝑤)
=𝜆·𝑤

Since 𝑇 is injective, we know that 𝑆(𝜆 · 𝑤) = 𝜆 · 𝑆(𝑤).

As we have proven both directions, we have proven the theorem.

Definition 3.4.2

An invertible linear map 𝑇 ∈ ℒ(𝑉 , 𝑊) is called an isomorphism between 𝑉 and 𝑊.


Notation: 𝑉  𝑊.

Proposition 3.4.1
Say 𝑉 , 𝑊 are finite dimensional vector spaces over 𝔽, and 𝑉  𝑊.
Then dim 𝑉 = dim 𝑊.
Proof: If 𝑉  𝑊, then there is an invertible linear map 𝑇 : 𝑉 → 𝑊.
By the rank-nullity theorem, we know that:

dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇)


Since 𝑇 is invertible, we know that dim ker 𝑇 = 0
= 0 + dim 𝐼𝑚(𝑇)
Since 𝑇 is surjective, we know that dim 𝐼𝑚(𝑇) = dim 𝑊
= 0 + dim 𝑊
= dim 𝑊

Converse is also true (Axler 3.5): If 𝑉 , 𝑊 are finite dimensional vector spaces over 𝔽 and dim 𝑉 = dim 𝑊,
then 𝑉  𝑊.

Proof: Let 𝑣®1 , . . . , 𝑣®𝑛 be a basis for 𝑉.


And let 𝑤®1 , . . . , 𝑤®𝑛 be a basis for 𝑊.
Define a linear map 𝑇 : 𝑉 → 𝑊 by setting 𝑇(𝑣®𝑖 ) = 𝑤®𝑖 , 1 ≤ 𝑖 ≤ 𝑛.
𝑇 is surjective: Let 𝑤
® ∈ 𝑊 be arbitrary.
Then:

𝑤
® = 𝑎1 𝑤®1 + · · · + 𝑎 𝑛 𝑤®𝑛 , 𝑎 𝑖 ∈ 𝔽
= 𝑎1 𝑇(𝑣®1 ) + · · · + 𝑎 𝑛 𝑇(𝑣®𝑛 )
= 𝑇(𝑎1 𝑣®1 ) + · · · + 𝑇(𝑎 𝑛 𝑣®𝑛 )
= 𝑇(𝑎1 𝑣®1 + · · · + 𝑎 𝑛 𝑣®𝑛 )

This implies that 𝑤 ∈ 𝐼𝑚𝑇, so 𝑊 ⊆ 𝐼𝑚𝑇.


Thus, 𝑇 is surjective.
𝑇 is injective: By rank-nullity, we know that:

dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇)
      = dim ker 𝑇 + dim 𝑊     (since 𝑇 is surjective, dim 𝐼𝑚(𝑇) = dim 𝑊)
=⇒ dim ker 𝑇 = 0              (since dim 𝑉 = dim 𝑊)
=⇒ ker 𝑇 = {0®𝑉 }

Thus, 𝑇 is injective.
Thus, we have shown that 𝑇 is bijective, and thus 𝑇 is an isomorphism.

Example 3.4.1
We know that 𝑃3 (ℂ) and ℂ4 are isomorphic.
Proof gives us:

𝑇 : 1 ↦→ (1, 0, 0, 0)
𝑥 ↦→ (0, 1, 0, 0)
𝑥 2 ↦→ (0, 0, 1, 0)
𝑥 3 ↦→ (0, 0, 0, 1)

Under this map, for some 𝑎0 , 𝑎 1 , 𝑎 2 , 𝑎 3 ∈ ℂ, we have:

𝑇(𝑎 0 + 𝑎 1 𝑥 + 𝑎 2 𝑥 2 + 𝑎3 𝑥 3 ) = 𝑎0 𝑇(1) + 𝑎1 𝑇(𝑥) + 𝑎2 𝑇(𝑥 2 ) + 𝑎3 𝑇(𝑥 3 )


= 𝑎0 · (1, 0, 0, 0) + 𝑎 1 · (0, 1, 0, 0) + 𝑎2 · (0, 0, 1, 0) + 𝑎 3 · (0, 0, 0, 1)
= (𝑎0 , 𝑎 1 , 𝑎 2 , 𝑎 3 )

Example 3.4.2
Let 𝑉 , 𝑊 be finite dimensional vector spaces over 𝔽.
Choose bases 𝑣®1 , . . . , 𝑣®𝑛 for 𝑉 and 𝑤®1 , . . . , 𝑤®𝑚 for 𝑊.
Let’s define:

𝑀 : ℒ(𝑉 , 𝑊) → 𝐹 𝑚,𝑛
𝑇 ↦→ 𝑀(𝑇, (𝑣®1 , . . . , 𝑣®𝑛 ), (𝑤®1 , . . . , 𝑤®𝑚 ))

Now, recall that 𝑀 is linear since:

𝑀(𝑇 +ℒ(𝑉 ,𝑊) 𝑆) = 𝑀(𝑇) +𝔽𝑚,𝑛 𝑀(𝑆)


𝑀(𝜆 ·ℒ(𝑉 ,𝑊) 𝑇) = 𝜆 ·𝔽𝑚,𝑛 𝑀(𝑇)

Now, by Axler 3.60, 𝑀 is an isomorphism.


By PSET 6, dim 𝐹 𝑚,𝑛 = 𝑚𝑛.
This implies that dim ℒ(𝑉 , 𝑊) = dim 𝑉 · dim 𝑊.

Definition 3.4.3: Endomorphisms (Linear operators)

A linear map 𝑇 : 𝑉 → 𝑉 is called an endomorphism, or a linear operator on 𝑉.

Notation: ℒ(𝑉) ≔ ℒ(𝑉 , 𝑉)

Example 3.4.3
Here are some examples:

(i)

𝑇 : 𝑃(ℝ) → 𝑃(ℝ)
𝑝(𝑥) ↦→ 𝑥 2 𝑝(𝑥)

Note, that this map is injective but not surjective.

(ii)

𝑆 : ℂ∞ → ℂ∞
(𝑥1 , 𝑥 2 , 𝑥 3 , . . .) ↦→ (𝑥2 , 𝑥 3 , . . .)

Note, that this map is surjective but not injective.

Theorem 3.4.2
Let 𝑉 be a finite dimensional vector space over 𝔽.
Let 𝑇 ∈ ℒ(𝑉).
Then the following are equivalent:

(i) 𝑇 is injective
(ii) 𝑇 is surjective
(iii) 𝑇 is invertible

We are going to prove (𝑖) =⇒ (𝑖𝑖) =⇒ (𝑖𝑖𝑖) =⇒ (𝑖).


Proof of (𝑖𝑖𝑖) =⇒ (𝑖) : We have already proven this in class.

Proof of (𝑖) =⇒ (𝑖𝑖) : Assume that 𝑇 is injective.
Then, we know that ker 𝑇 = {0®𝑉 }.
By rank-nullity, we know that dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇).
Thus, dim 𝑉 = 0 + dim 𝐼𝑚(𝑇), so dim 𝑉 = dim 𝐼𝑚(𝑇).
Since 𝑇 ∈ ℒ(𝑉), we know that 𝐼𝑚(𝑇) ⊆ 𝑉.
By Axler 2.C.1, we know that 𝐼𝑚(𝑇) = 𝑉.
Thus, 𝑇 is surjective.
Proof of (𝑖𝑖) =⇒ (𝑖𝑖𝑖) : Now assume that 𝑇 is surjective.
Then 𝐼𝑚(𝑇) = 𝑉
By rank-nullity, we know:

dim 𝑉 = dim ker 𝑇 + dim 𝐼𝑚(𝑇)


= dim ker 𝑇 + dim 𝑉
=⇒ dim ker 𝑇 = 0
=⇒ ker 𝑇 = {0®𝑉 }
=⇒ 𝑇 is injective =⇒ 𝑇 is bijective =⇒ 𝑇 is invertible

Thus, 𝑇 is invertible as desired.

As we have proven all three directions, we have proven the theorem.

Corollary 3.4.1
If 𝑇 : ℝ𝑛 → ℝ𝑛 is linear, then:

𝑇 is invertible ⇐⇒ 𝑇 is injective ⇐⇒ 𝑇 is surjective

Question 1

Show that, given 𝑞(𝑥) ∈ 𝑃(ℝ), there exists another polynomial 𝑝(𝑥) such that:
𝑞(𝑥) = [ (𝑥 2 + 2𝑥 + 3) · 𝑝(𝑥) ]″

Solution: First, make everything finite-dimensional. Say 𝑞(𝑥) has degree 𝑚.


Now let’s define:

𝑇 : 𝑃𝑚 (ℝ) → 𝑃𝑚 (ℝ)
𝑝(𝑥) ↦→ [ (𝑥 2 + 2𝑥 + 3) · 𝑝(𝑥) ]″

Exercise: Show that 𝑇 is linear.


We want to show that 𝑇 is surjective.
Claim: 𝑇 is injective
Proof of claim: The kernel consists of those 𝑝(𝑥) such that [ (𝑥 2 + 2𝑥 + 3) · 𝑝(𝑥) ]″ = 0.
A polynomial has zero second derivative exactly when its degree is at most 1,
so we would need (𝑥 2 + 2𝑥 + 3) · 𝑝(𝑥) to have the form 𝑎𝑥 + 𝑏.

deg((𝑥 2 + 2𝑥 + 3) · 𝑝(𝑥)) ≥ 2 as long as 𝑝(𝑥) ≠ 0
deg(𝑎𝑥 + 𝑏) ≤ 1

Thus, the only way for this to be true is if ker 𝑇 = {0𝑃𝑚 (ℝ) }.

This implies that 𝑇 is injective.
Then, by the previous theorem, we know that if 𝑇 is injective, then 𝑇 is surjective.
Thus, given 𝑞(𝑥) ∈ 𝑃(ℝ), there exists another polynomial 𝑝(𝑥) such that 𝑇(𝑝(𝑥)) = 𝑞(𝑥).
 00
Therefore, 𝑥 2 + 2𝑥 + 3 · 𝑝(𝑥) = 𝑞(𝑥).
 

Linear Maps as Matrix multiplication: Let 𝑉 be a finite dimensional vector space over 𝔽.
Let 𝑣®1 , . . . , 𝑣®𝑛 be a basis for 𝑉.
Now for any 𝑣 ∈ 𝑉, we can write for some scalars 𝑐 1 , . . . , 𝑐 𝑛 ∈ 𝔽:

𝑣 = 𝑐1 𝑣®1 + · · · + 𝑐 𝑛 𝑣®𝑛
Let’s define:
𝑀(𝑣) ≔ the column vector (𝑐1 , . . . , 𝑐 𝑛 ) ∈ 𝔽𝑛,1 .

Example 3.4.4
Let 𝑉 = 𝑃3 (ℝ) with basis 1, 𝑥, 𝑥 2 , 𝑥 3 .
Then,

𝑣 = 2 − 7𝑥 + 5𝑥 3 = 2 · 1 − 7 · 𝑥 + 0 · 𝑥 2 + 5 · 𝑥 3
Or in other words:

2
−7
 
𝑀(𝑣) =  
0
5
 

Note: 𝑀(𝑣0 + 𝑤 0 ) = 𝑀(𝑣 0 ) + 𝑀(𝑤0 ) and 𝑀(𝜆𝑣) = 𝜆𝑀(𝑣).
Say that 𝑇 ∈ ℒ(𝑉 , 𝑊).
Let 𝑤®1 , . . . , 𝑤®𝑚 be a basis for 𝑊.
Then, for any 𝑣 ∈ 𝑉, we can write:

𝑀(𝑇(𝑢)) = 𝑀(𝑇) · 𝑀(𝑢)


In other words, linear maps act like matrix multiplication.
We can say:
𝑀(𝑇, (𝑣®1 , . . . , 𝑣®𝑛 ), (𝑤®1 , . . . , 𝑤®𝑚 )) = (𝑎 𝑖,𝑗 ), the 𝑚 × 𝑛 matrix with entries 𝑎 𝑖,𝑗 ∈ 𝔽.
Then:

𝑇(𝑣) = 𝑇(𝑐 1 𝑣®1 + · · · + 𝑐 𝑛 𝑣®𝑛 )


= 𝑐1 𝑇(𝑣®1 ) + · · · + 𝑐 𝑛 𝑇(𝑣®𝑛 )

Which implies 𝑀(𝑇(𝑣)) = 𝑐 1 𝑀(𝑇(𝑣®1 )) + · · · + 𝑐 𝑛 𝑀(𝑇(𝑣®𝑛 )).


On the other hand, we have:

𝑇(𝑣 𝑘 ) = 𝑎1,𝑘 𝑤®1 + · · · + 𝑎 𝑚,𝑘 𝑤®𝑚


Now, 𝑀(𝑇(𝑣 𝑘 )) is the 𝑘-th column of 𝑀(𝑇), namely (𝑎1,𝑘 , . . . , 𝑎 𝑚,𝑘 ) as a column vector.
Thus, we have:

𝑀(𝑇(𝑣)) = 𝑐1 (𝑎1,1 , . . . , 𝑎 𝑚,1 ) + · · · + 𝑐 𝑛 (𝑎1,𝑛 , . . . , 𝑎 𝑚,𝑛 )
        = (𝑐1 𝑎1,1 + · · · + 𝑐 𝑛 𝑎1,𝑛 , . . . , 𝑐1 𝑎 𝑚,1 + · · · + 𝑐 𝑛 𝑎 𝑚,𝑛 )   (as column vectors)
        = 𝑀(𝑇) · 𝑀(𝑣)
Row Reduction I over 𝔽 : A system of 𝑚 linear equations in 𝑛 unknowns 𝑥1 , . . . , 𝑥 𝑛 ,

𝑎1,1 𝑥1 + . . . + 𝑎1,𝑛 𝑥 𝑛 = 𝑏1
      ⋮
𝑎 𝑚,1 𝑥1 + . . . + 𝑎 𝑚,𝑛 𝑥 𝑛 = 𝑏 𝑚 ,     𝑎 𝑖,𝑗 , 𝑏 𝑘 ∈ 𝔽,

can be written as a matrix equation 𝐴 · 𝑥® = 𝑏®, where 𝐴 = (𝑎 𝑖,𝑗 ) ∈ 𝔽𝑚,𝑛 , 𝑥® = (𝑥1 , . . . , 𝑥 𝑛 ) ∈ 𝔽𝑛,1 , and 𝑏® = (𝑏1 , . . . , 𝑏 𝑚 ) ∈ 𝔽𝑚,1 .

Then we can define:

𝑇𝐴 : 𝔽𝑛 ↦→ 𝔽𝑚 linear map
𝑥® ↦→ 𝐴 𝑥® = 𝑏®
Question: is 𝑏® = (𝑏1 , . . . , 𝑏 𝑚 ) ∈ image(𝑇𝐴 )?
Row operations are used on the augmented matrix:
[𝐴 | 𝐵] = the 𝑚 × (𝑛 + 1) matrix in 𝔽𝑚,𝑛+1 obtained by appending the column 𝐵 = (𝑏1 , . . . , 𝑏 𝑚 ) to 𝐴,

to simplify the original system of equations.
Need elementary matrices to express row operations: 𝐸 ∈ 𝔽𝑚,𝑚
Thus, we get three types:

(i) Where 𝑎 ∈ 𝔽 sits in position (𝑖, 𝑗), 𝑖 ≠ 𝑗: 𝐸 is the identity matrix with one extra entry 𝑎 in position (𝑖, 𝑗).

Then 𝐸 · 𝐴 : modify 𝐴 by adding 𝑎 · (row 𝑗) to row 𝑖.


(ii) 𝐸 is the identity matrix with the entries in positions (𝑖, 𝑖) and (𝑗, 𝑗) replaced by 0 and the entries in positions (𝑖, 𝑗) and (𝑗, 𝑖) replaced by 1.

Then 𝐸 · 𝐴 : modify 𝐴 by exchanging rows 𝑖 and 𝑗.


(iii) 𝐸 is the identity matrix with the entry in position (𝑖, 𝑖) replaced by 𝑐 ∈ 𝔽, 𝑐 ≠ 0.

Then 𝐸 · 𝐴 : modify 𝐴 by multiplying row 𝑖 by 𝑐.

Example 3.4.5

(i)
1 7 0 1 2 3 29 37 45
    
0 1 0 4 5 6 =  4 5 6 
    
0 0 1 7 8 9  7 8 9 
    
| {z }
𝐸(𝑖)

(ii)
0 0 1 1 2 3 7 8 9
    
0 1 0 4 5 6 = 4 5 6
    
1 0 0 7 8 9 1 2 3
    
| {z }
𝐸(𝑖𝑖)

(iii)
1 0 0 1 2 3  1 2 3 
𝐸 = 0
    
1 0 4 5 6 =  4 5 6 
   
0 0 3 7 8 9 21 24 27
    
| {z }
𝐸(𝑖𝑖𝑖)
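As an aside (my own sketch, not from lecture), the three types of elementary matrices are easy to build and test numerically; the helper names below are hypothetical, and numpy is only used for convenience:

import numpy as np

def E_add(n, i, j, a):      # type (i): add a * (row j) to row i (0-indexed)
    E = np.eye(n); E[i, j] = a; return E

def E_swap(n, i, j):        # type (ii): exchange rows i and j
    E = np.eye(n); E[[i, j]] = E[[j, i]]; return E

def E_scale(n, i, c):       # type (iii): multiply row i by c (c != 0)
    E = np.eye(n); E[i, i] = c; return E

A = np.arange(1, 10).reshape(3, 3)   # the matrix [[1,2,3],[4,5,6],[7,8,9]] from the example
print(E_add(3, 0, 1, 7) @ A)         # first row becomes [29, 37, 45], as in (i)
print(E_swap(3, 0, 2) @ A)           # rows 1 and 3 exchanged, as in (ii)
print(E_scale(3, 2, 3) @ A)          # last row becomes [21, 24, 27], as in (iii)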

Lemma 3.4.1
Elementary matrices are invertible:
if 𝐸 is an elementary matrix, then there exists a matrix 𝐸−1 such that 𝐸 · 𝐸−1 = 𝐸−1 · 𝐸 = 𝐼.

Proof: By direct computation, as in the examples above:


1 7 0 ⁻¹    1 −7 0
0 1 0    =  0  1 0
0 0 1       0  0 1

0 0 1 ⁻¹    0 0 1
0 1 0    =  0 1 0
1 0 0       1 0 0

1 0 0 ⁻¹    1 0  0
0 1 0    =  0 1  0
0 0 3       0 0 1/3

Upshot: Elementary row operations correspond one-to-one with elementary matrices.

Example 3.4.6

1 1 2 1 5  1 1 2 1 5
 −𝑅1 +𝑅2 ↦→𝑅2
𝐴 = 1 1 2 6 10 −−−−−−−−−−→
 
0 0 0 5 5

1 2 5 2 7  1 2 5 2 7
  
1 1 2 1 5
−𝑅1 +𝑅3 ↦→𝑅3 
−−−−−−−−−−→ 0 0 0 5 5
0 1 3 1 2

1 1 2 1 5
𝑅2 ↔𝑅 3 
−−−−−→ 0 1 3 1 2
0 0 0 5 5

1 1 2 1 5
5 𝑅 3 ↦→𝑅 3
1 
−−−−−−−→ 0 1 3 1 2
0 0 0 1 1

1 0 −1 0 3
−𝑅2 +𝑅1 ↦→𝑅1 
−−−−−−−−−−→ 0 1 3 1 2
0 0 0 1 1

1 0 −1 0 3
−𝑅3 +𝑅2 ↦→𝑅2 
−−−−−−−−−−→ 0 1 3 0 1 ≔ 𝐴0
0 0 0 1 1

In other words, 𝐴′ = 𝐸 𝑘 · . . . · 𝐸1 · 𝐴, where each 𝐸 𝑖 is the elementary matrix of the corresponding row operation
(for instance, the step −𝑅2 + 𝑅1 ↦→ 𝑅1 corresponds to the type (i) elementary matrix with −1 in position (1, 2)).

Note:-
The full product of elementary matrices was not written out above, but the two sides are equal.

Solving systems of linear equations:

𝐴 · 𝑥® = 𝐵, where 𝐴 ∈ 𝔽𝑚,𝑛 , 𝑥® ∈ 𝔽𝑛 , and 𝐵 ∈ 𝔽𝑚 .

Form the augmented matrix 𝑀 = [𝐴 | 𝐵] and row-reduce:

𝑀′ = 𝐸 𝑘 · . . . · 𝐸1 · 𝑀 = [𝐴′ | 𝐵′],    where 𝐸1 , . . . , 𝐸 𝑘 are elementary 𝑚 × 𝑚 matrices.

Important (★): { 𝑥® ∈ 𝔽𝑛 | 𝐴 · 𝑥® = 𝐵 } = { 𝑥® ∈ 𝔽𝑛 | 𝐴′ · 𝑥® = 𝐵′ }
 
Meaning, the solutions to our original system of equations are the same as the solutions to our modified
system of equations.
Proof: Let 𝑃 = 𝐸 𝑘 · . . . · 𝐸1 , which is invertible,
with 𝑃 ⁻¹ = 𝐸1 ⁻¹ · . . . · 𝐸 𝑘 ⁻¹ and 𝐼 = 𝑃 ⁻¹ · 𝑃 = 𝐸1 ⁻¹ · . . . · 𝐸 𝑘 ⁻¹ · 𝐸 𝑘 · . . . · 𝐸1 .
Say 𝑥® ∈ LHS of ★:
Say 𝑥® ∈ LHS of ★:

 
 
Here 𝑀′ = 𝑃 · 𝑀 = [ 𝑃 · 𝐴 | 𝑃 · 𝐵 ] = [ 𝐴′ | 𝐵′ ].

𝐴 · 𝑥® = 𝐵
=⇒ 𝑃 · 𝐴 · 𝑥® = 𝑃 · 𝐵
=⇒ 𝐴′ · 𝑥® = 𝐵′
=⇒ 𝑥® ∈ RHS of ★.

Use 𝑃 −1 to show the other direction.

(Reduced) Row-Echelon form: Notation: 𝑀 ∈ 𝔽𝑚,𝑛 , write 𝑀 𝑖 for the 𝑖th row of 𝑀.
Definition 3.4.4

𝑀 ∈ 𝔽𝑚,𝑛 is in (reduced ) row-echelon form if:

(i) If 𝑀 𝑖 = (0, . . . , 0) then 𝑀 𝑗 = (0, . . . , 0) for all 𝑗 > 𝑖.

(ii) If 𝑀 𝑖 ≠ (0, . . . , 0), then the left most nonzero entry is a 1 (pivot).
(iii) If 𝑀 𝑖+1 ≠ (0, . . . , 0) as well, then the pivot in 𝑀 𝑖+1 is to the right of the pivot in 𝑀 𝑖 .
(iv) The entries above and below a pivot are 0.

Example 3.4.7
Think 𝔽 = ℚ, ℝ, or ℂ.
1 0 −1 0 3
0 1 3 0 1
 
0 0 0 1 1


0 0 0 0 0

Theorem 3.4.3
Let 𝑀 ∈ 𝔽𝑚,𝑛 . There is a sequence of elementary row operations, 𝐸 𝑘 , . . . , 𝐸1 ,
such that 𝑀 0 = 𝐸 𝑘 · . . . · 𝐸1 · 𝑀 is in row-echelon form.
𝑀 0 is unique.

Solving systems of linear equations using Row-Echelon matrices : Say 𝐴 · 𝑥® = 𝐵, so 𝑀 = [𝐴 | 𝐵].


Suppose that the row-echelon form of 𝑀 is:
 1 6 0 1 3 
0 0 0
𝑀 = [𝐴 | 𝐵 ] =  0 0 2 0 0 
 
 0 0 0 0 1 
 
This would imply that:

𝐴0 · 𝑥® =𝐵0
𝑥1 + 6𝑥2 + 𝑥4 =0
𝑥 3 + 2𝑥4 =0
0 =1

Thus, there are no solutions.

If instead we had:
 1 6 0 1 1 
0 0 0
𝑀 = [𝐴 | 𝐵 ] =  0 0 1 2 3 
 
 0 0 0 0 0 
 
Thus would imply that:

𝑥1 + 6𝑥 2 + 𝑥4 =1
𝑥3 + 2𝑥4 =3
0 =0

Thus, we have solutions!


Let 𝑥2 = 𝑎 and 𝑥4 = 𝑏 be constants. Solve for pivot variables:

𝑥1 = 1 − 6𝑎 − 𝑏
𝑥3 = 3 − 2𝑏
=⇒ 𝑥® = (𝑥1 , 𝑥 2 , 𝑥 3 , 𝑥 4 ) = (1 − 6𝑎 − 𝑏, 𝑎, 3 − 2𝑏, 𝑏)

In general:
Let 𝑀 0 = [𝐴0 | 𝐵0] be in row-echelon form.

(i) 𝐴′ · 𝑥® = 𝐵′ has no solutions ⇐⇒ 𝐵′ contains a pivot.

(ii) If 𝐵′ has no pivot:

(a) Give the non-pivotal (free) variables constant values.
(b) Solve for the pivot variables, as in the sketch below.
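Here is a minimal Python sketch of this recipe for the row-echelon system above (my own illustration, not part of the notes): plug in values for the free variables 𝑥2 = 𝑎, 𝑥4 = 𝑏 and check that the resulting vector really solves 𝐴′ · 𝑥® = 𝐵′.

import numpy as np

A_prime = np.array([[1., 6., 0., 1.],
                    [0., 0., 1., 2.],
                    [0., 0., 0., 0.]])
B_prime = np.array([1., 3., 0.])

def solution(a, b):
    # pivot variables x1, x3 expressed via the free variables x2 = a, x4 = b
    return np.array([1 - 6*a - b, a, 3 - 2*b, b])

for a, b in [(0, 0), (1, 0), (2, -1)]:
    assert np.allclose(A_prime @ solution(a, b), B_prime)   # every choice of a, b works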

Lemma 3.4.2
Let 𝑇 : ℝ𝑛 → ℝ𝑚 be the linear map given by a matrix 𝐴, i.e. 𝑇( 𝑥®) = 𝐴 · 𝑥®, and let 𝑥®𝑠 be a solution to 𝑇( 𝑥®) = 𝑏®.
Then for every other solution 𝑥®★ of 𝑇( 𝑥®) = 𝑏® there exists an 𝑥®𝑘 ∈ ker 𝑇 such that

𝑥®★ = 𝑥®𝑠 + 𝑥®𝑘

Proof: We know 𝑇( 𝑥®𝑠 ) = 𝑏® and 𝑇( 𝑥®★) = 𝑏® as they are solutions to 𝑇( 𝑥®) = 𝑏.


Consider 𝑇( 𝑥®★ − 𝑥®𝑠 ).
Then, by the fact that 𝑇 is a linear map, the following is true:

𝑇( 𝑥®★ − 𝑥®𝑠 ) = 𝑇( 𝑥®★) − 𝑇( 𝑥®𝑠 ) = 𝑏® − 𝑏® = 0®


.
Therefore, 𝑥®★ − 𝑥®𝑠 ∈ ker 𝑇.
Thus, there exists 𝑥®𝑘 ∈ ker 𝑇 such that 𝑥®𝑘 = 𝑥®★ − 𝑥®𝑠 .
By definition, 𝑇( 𝑥®𝑘 ) = 0, meaning it is a solution to 𝑇( 𝑥®) = 0®.
Thus, every other solution to 𝑇( 𝑥®) = 𝑏® is given by a solution, 𝑥®𝑠 plus a solution to 𝑇( 𝑥®) = 0®, 𝑥®𝑘 .

Connection to linear maps: Let 𝐴 ∈ ℝ𝑚,𝑛 .


Then:

𝑇𝐴 : ℝ𝑛 → ℝ𝑚
𝑥® ↦→ 𝐴 · 𝑥®

Remember:

𝑒1 , . . . , 𝑒 𝑛 standard basis for ℝ𝑛


𝑓1 , . . . , 𝑓𝑚 standard basis for ℝ𝑚
𝑓1 = (1, 0, . . . , 0)
| {z }
𝑚

We can write:
𝐴 = 𝑀(𝑇𝐴 , (𝑒1 , . . . , 𝑒 𝑛 ), ( 𝑓1 , . . . , 𝑓𝑚 ))

n o
ker 𝑇𝐴 = 𝑥® ∈ ℝ𝑛 | 𝐴 · 𝑥® = 0®ℝ𝑚
n o
= 𝑥® ∈ ℝ𝑛 | 𝐴0 · 𝑥® = 0®ℝ𝑚 where 𝐴0 is in row-echelon form
= ker 𝑇𝐴0

Example 3.4.8
Let:
1 6 0 1
𝐴0 = 0

0 1 2 ∈ ℝ3,4
0 0 0 0

This implies:

𝐴0 · 𝑥® = 0®ℝ3
𝑥1 + 6𝑥 2 + 𝑥4 = 0
𝑥3 + 2𝑥4 = 0
0=0

Non-pivot variables 𝑥 2 and 𝑥4 are free variables.


Say 𝑥2 = 𝑎 and 𝑥4 = 𝑏 are constants.
Then solve for pivot variables:

𝑥1 = −6𝑎 − 𝑏
𝑥3 = −2𝑏

Solutions:

(𝑥1 , 𝑥 2 , 𝑥 3 , 𝑥 4 ) = (−6𝑎 − 𝑏, 𝑎, −2𝑏, 𝑏) = 𝑎(−6, 1, 0, 0) + 𝑏(−1, 0, −2, 1)


So ker 𝑇𝐴0 = ker 𝑇𝐴 = {(−6𝑎 − 𝑏, 𝑎, −2𝑏, 𝑏) | 𝑎, 𝑏 ∈ ℝ} ⊆ ℝ4 .

𝑎 = 1, 𝑏 = 0 =⇒ (−6, 1, 0, 0)
𝑎 = 0, 𝑏 = 1 =⇒ (−1, 0, −2, 1)

Thus:

ker 𝑇𝐴 = span ((−6, 1, 0, 0), (−1, 0, −2, 1))


= span (−6𝑒1 + 𝑒2 , −𝑒1 − 2𝑒3 + 𝑒4 ) .
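A quick numerical cross-check of this kernel computation (my own sketch; the notes do not use software). sympy's nullspace() returns a basis of ker 𝑇𝐴′ , which should span the same subspace as (−6, 1, 0, 0) and (−1, 0, −2, 1):

from sympy import Matrix

A = Matrix([[1, 6, 0, 1],
            [0, 0, 1, 2],
            [0, 0, 0, 0]])
print([list(v) for v in A.nullspace()])     # e.g. [-6, 1, 0, 0] and [-1, 0, -2, 1]
assert A * Matrix([-6, 1, 0, 0]) == Matrix([0, 0, 0])   # both claimed vectors are killed by A
assert A * Matrix([-1, 0, -2, 1]) == Matrix([0, 0, 0])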

Images: Given 𝑇𝐴 : ℝ𝑛 → ℝ𝑚 .
Compute a basis 𝑣®1 , . . . , 𝑣®𝑟 for the kernel.
Let 𝑖1 , . . . , 𝑖 𝑛−𝑟 be the indices of the pivot columns of 𝐴′.

Claim: 𝑣®1 , . . . , 𝑣®𝑟 , 𝑒 𝑖1 , . . . , 𝑒 𝑖 𝑛−𝑟 is a basis for ℝ𝑛 (see notes).

Assume the claim.
The proof of rank-nullity then shows that 𝑇𝐴 (𝑒 𝑖1 ), . . . , 𝑇𝐴 (𝑒 𝑖 𝑛−𝑟 ) is a basis for 𝐼𝑚(𝑇𝐴 ).

Example 3.4.9

1 1 2 1
𝐴 = 1 1 2 6
 
1 2 5 2
 
This implies: 𝑇𝐴 : ℝ4 → ℝ3 .
Row-echelon form of 𝐴 is:
1 0 −1 0
0
𝐴 = 0 1 3 0 , 𝑖 1 = 1, 𝑖 2 = 2, 𝑖 3 = 4
 
0 0 0 1
 

𝐼𝑚𝑇 = span (𝑇𝐴 (𝑒1 ), 𝑇𝐴 (𝑒2 ), 𝑇𝐴 (𝑒4 ))


= span (1 · 𝑓1 + 1 · 𝑓2 + 1 · 𝑓3 , 1 · 𝑓1 + 1 · 𝑓2 + 2 · 𝑓3 , 1 · 𝑓1 + 6 · 𝑓2 + 2 · 𝑓3 )

Definition 3.4.5: Elementary matrices + invertibility

𝐴 ∈ 𝔽𝑛,𝑛 is invertible if there is a 𝐵 ∈ 𝔽𝑛,𝑛 such that:

𝐴 · 𝐵 = 𝐵 · 𝐴 = 𝐼𝑛 (the 𝑛 × 𝑛 identity matrix).
Notation: 𝐵 = 𝐴−1 .
Note:-
𝐴 ∈ 𝔽𝑛,𝑛 implies:

𝑇𝐴 : 𝔽𝑛 → 𝔽𝑛
𝑥® ↦→ 𝐴 · 𝑥®

(a) 𝐴 is invertible ⇐⇒ 𝑇𝐴 is an isomorphism.


In this case: (𝑇𝐴 )−1 = 𝑇𝐴−1 .
(b) Elementary matrices are invertible.

Theorem 3.4.4
Let 𝐴 ∈ 𝔽𝑛,𝑛 . The following are equivalent (TFAE):

1. The reduced row-echelon form of 𝐴 is 𝐼𝑛 .


2. 𝐴 = 𝐸 𝑘 · . . . · 𝐸1 where 𝐸1 , . . . , 𝐸 𝑘 are elementary matrices.
3. 𝐴 is invertible.

Proof of 1 =⇒ 2: Let 𝐼𝑛 = 𝐴0 = 𝐸 𝑘 · . . . · 𝐸1 · 𝐴.
Since elementary matrices are invertible: (𝐸 𝑘 · . . . · 𝐸1 )⁻¹ = 𝐸1 ⁻¹ · . . . · 𝐸 𝑘 ⁻¹.
Then, 𝐴 = 𝐸1 ⁻¹ · . . . · 𝐸 𝑘 ⁻¹.
But each 𝐸 𝑖 ⁻¹ (1 ≤ 𝑖 ≤ 𝑘) is again an elementary matrix.
Thus, 𝐴 is a product of elementary matrices.

Proof of 2 =⇒ 3 : If 𝐴 = 𝐸1 · . . . · 𝐸 𝑘 , then 𝐴⁻¹ = 𝐸 𝑘 ⁻¹ · . . . · 𝐸1 ⁻¹.
Indeed, setting 𝐵 = 𝐸 𝑘 ⁻¹ · . . . · 𝐸1 ⁻¹, we get

𝐴 · 𝐵 = 𝐸1 · . . . · 𝐸 𝑘 · 𝐸 𝑘 ⁻¹ · . . . · 𝐸1 ⁻¹ = 𝐼𝑛 ,

and similarly 𝐵 · 𝐴 = 𝐼𝑛 . Thus, 𝐴 is invertible.
Proof of 3 =⇒ 1 : Assume 𝐴 is invertible.
Let 𝐴′ = 𝐸 𝑘 · . . . · 𝐸1 · 𝐴 be the reduced row-echelon form of 𝐴.
Either 𝐴′ = 𝐼𝑛 or the bottom row of 𝐴′ is (0, . . . , 0).
If the bottom row of 𝐴′ were all zeros, then 𝑇𝐴′ : 𝔽𝑛 → 𝔽𝑛 would not be surjective,
so 𝑇𝐴′ would not be an isomorphism, and 𝐴′ would not be invertible.
But 𝐴′ is a product of invertible matrices (the 𝐸 𝑖 and 𝐴), hence invertible — a contradiction.
Therefore 𝐴′ = 𝐼𝑛 .
Consequence: If 𝐴 is invertible, then row-reduce it to reduced row-echelon form:
𝐼𝑛 = 𝐸 𝑘 · . . . · 𝐸1 · 𝐴, so 𝐸 𝑘 · . . . · 𝐸1 = 𝐴⁻¹.
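This consequence is exactly the classical Gauss–Jordan recipe: row-reduce [𝐴 | 𝐼𝑛 ] and the right half becomes 𝐴⁻¹. A minimal Python sketch (my own, with no pivoting safeguards, assuming every pivot encountered is nonzero):

def gauss_jordan_inverse(A):
    n = len(A)
    # augmented matrix [A | I_n], working with floats
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        p = M[i][i]                     # assume pivot is nonzero (no row swaps here)
        M[i] = [x / p for x in M[i]]    # type (iii): scale row i so the pivot is 1
        for r in range(n):
            if r != i:
                f = M[r][i]             # type (i): clear the rest of column i
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [row[n:] for row in M]       # right half is A^{-1}

A = [[1.0, 7.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 3.0]]
print(gauss_jordan_inverse(A))   # [[1, -7, 0], [0, 1, 0], [0, 0, 1/3]], as in the lemma above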

Note:-
Notice that we started to talk about determinants after section 3.C.
I’ve moved this to chapter 10 to correspond with the textbook.
Click here to go to the determinants section: Determinants

3.5 Products and quotients of Vector Spaces


Definition 3.5.1

Let 𝑉1 , . . . , 𝑉𝑚 be vector spaces over 𝔽.


The the product of 𝑉1 , . . . , 𝑉𝑚 is:

𝑉1 × . . . × 𝑉𝑚 = {(𝑣1 , . . . , 𝑣 𝑚 ) | 𝑣 𝑖 ∈ 𝑉𝑖 for 1 ≤ 𝑖 ≤ 𝑚}
I.e., think of this in terms of a cartesian product.

Example 3.5.1
Elements of ℝ2 × ℝ3 look like:

((3, 5), (1, 0, −7.2)) ∈ ℝ2 × ℝ3

Example 3.5.2
Vectors in 𝑃2 (ℝ) × ℝ looks like:

(−3 + 𝑥 − 𝑥 2 , (2, 7))

Definition 3.5.2

Let’s define vector addition + scalar multiplication on 𝑉1 × . . . × 𝑉𝑚 .


They are defined component-wise:

(𝑣1 , . . . , 𝑣 𝑚 ) + (𝑤 1 , . . . , 𝑤 𝑚 ) = (𝑣1 + 𝑤 1 , . . . , 𝑣 𝑚 + 𝑤 𝑚 )
𝜆 · (𝑣1 , . . . , 𝑣 𝑚 ) = (𝜆 · 𝑣1 , . . . , 𝜆 · 𝑣 𝑚 )

Thus, the product of 𝑉1 , . . . , 𝑉𝑚 is a vector space over 𝔽.

Proposition 3.5.1
If 𝑉1 , . . . , 𝑉𝑚 are finite dimensional over 𝔽, then so is 𝑉1 × . . . × 𝑉𝑚 .
In fact, the dimension of 𝑉1 × . . . × 𝑉𝑚 is:

dim(𝑉1 × . . . × 𝑉𝑚 ) = dim(𝑉1 ) + . . . + dim(𝑉𝑚 )


Sketch of Proof: Say 𝑉𝑖 has basis {𝑣 𝑖,1 , . . . , 𝑣 𝑖,𝑚 } for 1 ≤ 𝑖 ≤ 𝑚.
Then 𝑉1 × . . . × 𝑉𝑚 has basis:

{(𝑣 1,1 , 0, . . . , 0), . . . , (𝑣 1,𝑚 , 0, . . . , 0), (0, 𝑣 2,1 , 0, . . . , 0), . . . , (0, 𝑣 𝑚,𝑚 )}

Example 3.5.3
𝑃2 (ℝ) × ℝ2 :

(i) 𝑃2 (ℝ) has basis 1, 𝑥, 𝑥 2 .




(ii) ℝ2 has basis {(1, 0), (0, 1)}.

Which means that 𝑃2 (ℝ) × ℝ2 has basis:

(1, (0, 0)), (𝑥, (0, 0)), (𝑥 2 , (0, 0)), (0, (1, 0)), (0, (0, 1))


Connection between products and direct sums: Let 𝑈1 , . . . , 𝑈𝑚 be subspaces of 𝑉 over 𝔽.


Let’s define:

Γ : 𝑈1 × . . . × 𝑈 𝑚 → 𝑈1 + . . . + 𝑈 𝑚
So, Γ(𝑢1 , . . . , 𝑢𝑚 ) = 𝑢1 + . . . + 𝑢𝑚 .
Is Γ a linear map?
Proof of linear map: Vector addition:

Γ((𝑣1 , . . . , 𝑣 𝑚 ) + (𝑢1 , . . . , 𝑢𝑚 )) = Γ((𝑣1 + 𝑢1 , . . . , 𝑣 𝑚 + 𝑢𝑚 ))


= (𝑣1 + 𝑢1 ) + . . . + (𝑣 𝑚 + 𝑢𝑚 )
= (𝑣1 + . . . + 𝑣 𝑚 ) + (𝑢1 + . . . + 𝑢𝑚 )
= Γ((𝑣1 , . . . , 𝑣 𝑚 )) + Γ((𝑢1 , . . . , 𝑢𝑚 ))
Thus, it is additive.
Now, we check for homogeneity:

Γ(𝜆 · (𝑣1 , . . . , 𝑣 𝑚 )) = Γ((𝜆 · 𝑣1 , . . . , 𝜆 · 𝑣 𝑚 ))


= 𝜆 · 𝑣1 + . . . + 𝜆 · 𝑣 𝑚
= 𝜆 · (𝑣1 + . . . + 𝑣 𝑚 )
= 𝜆 · Γ((𝑣1 , . . . , 𝑣 𝑚 ))

Thus, it is homogeneous.
Therefore, Γ is a linear map as desired.
Moreover:

(i) Γ is surjective:
if 𝑢1 + . . . + 𝑢𝑚 ∈ 𝑈1 + . . . + 𝑈𝑚 , then Γ((𝑢1 , . . . , 𝑢𝑚 )) = 𝑢1 + . . . + 𝑢𝑚 .

(ii) Γ is injective ⇐⇒ ker(Γ) = {0}


If Γ((𝑢1 , . . . , 𝑢𝑚 )) = 0, then 𝑢1 + . . . + 𝑢𝑚 = 0.
That means the only way to write 0 as a sum of vectors in 𝑈1 , . . . , 𝑈𝑚 is if 𝑢1 = . . . = 𝑢𝑚 = 0.
Or, if the sum is a direct sum: 𝑈1 ⊕ . . . ⊕ 𝑈𝑚 .

Rank nullity: We know that:

dim(𝑈1 × . . . × 𝑈𝑚 ) = dim 𝐼𝑚(Γ) + dim ker(Γ)


Since we know that Γ is surjective, dim 𝐼𝑚(Γ) = dim(𝑈1 + . . . + 𝑈𝑚 ).
Furthermore, we know that Γ is injective ⇐⇒ ker(Γ) = {0}.
Meaning that dim(𝑈1 × . . . × 𝑈𝑚 ) = dim(𝑈1 + . . . + 𝑈𝑚 )
Which means that:

dim(𝑈1 ) + . . . + dim(𝑈𝑚 ) = dim(𝑈1 ⊕ . . . ⊕ 𝑈𝑚 )


Thus, we have:
If 𝑈1 , . . . , 𝑈𝑚 are finite subspaces of 𝑉 over 𝔽, then:

𝑈1 + . . . + 𝑈𝑚 = 𝑈1 ⊕ . . . ⊕ 𝑈𝑚 ⇐⇒ dim(𝑈1 + . . . + 𝑈𝑚 ) = dim(𝑈1 ) + . . . + dim(𝑈𝑚 )


Definition 3.5.3

Let 𝑉 be a vector space over 𝔽.


With 𝑈 ⊆ 𝑉 is a subspace.
With 𝑣 ∈ 𝑉:

𝑣 + 𝑈 = {𝑣 + 𝑢 | 𝑢 ∈ 𝑈 }    (an “affine subset parallel to 𝑈”)


In other words, 𝑣 + 𝑈 is an affine subset parallel to 𝑈.

Example 3.5.4
𝑉 = ℝ2 , 𝑈 = {(𝑥, 2𝑥) | 𝑥 ∈ ℝ}.
Then 𝑣 + 𝑈 is the set of all lines parallel to 𝑈.
Let 𝑣1 = (3, 1) and 𝑣 2 = (4, 3).

𝑣 + 𝑈 = {(3, 1) + (𝑥, 2𝑥) | 𝑥 ∈ ℝ}
= {(3 + 𝑥, 1 + 2𝑥) | 𝑥 ∈ ℝ}
= {(4 + 𝑥, 3 + 2𝑥) | 𝑥 ∈ ℝ}

So even though 𝑣1 ≠ 𝑣2 but 𝑣1 + 𝑈 = 𝑣 2 + 𝑈.

Lemma 3.5.1
The following are equivalent:

(i) 𝑣1 + 𝑈 = 𝑣2 + 𝑈
(ii) 𝑣2 − 𝑣1 ∈ 𝑈
(iii) (𝑣1 + 𝑈) ∩ (𝑣2 + 𝑈) ≠ ∅

Proof of 𝑖𝑖 =⇒ 𝑖 : Let 𝑣 ∈ 𝑣1 + 𝑈, so 𝑣 = 𝑣1 + 𝑢 for some 𝑢 ∈ 𝑈.

𝑣 = 𝑣1 + 𝑢 = 𝑣2 + (𝑣1 − 𝑣2 ) + 𝑢 ∈ 𝑣2 + 𝑈,

since 𝑣1 − 𝑣2 ∈ 𝑈. Thus 𝑣1 + 𝑈 ⊆ 𝑣2 + 𝑈. Similarly, 𝑣2 + 𝑈 ⊆ 𝑣1 + 𝑈.

Last time we proved: (𝑖𝑖) =⇒ (𝑖) and (𝑖) =⇒ (𝑖𝑖𝑖). Clear


Proof of (𝑖𝑖𝑖) =⇒ (𝑖𝑖) : Take 𝑤 ∈ (𝑣1 + 𝑈) ∩ (𝑣2 + 𝑈).
Then 𝑤 = 𝑣1 + 𝑢1 , 𝑤 = 𝑣2 + 𝑢2 for some 𝑢1 , 𝑢2 ∈ 𝑈.

0®𝑉 = (𝑣1 + 𝑢1 ) − (𝑣2 + 𝑢2 )


= (𝑣1 − 𝑣2 ) + (𝑢1 − 𝑢2 )
=⇒ 𝑣 2 − 𝑣 1 = 𝑢1 − 𝑢2 ∈ 𝑈

Thus, we have shown that 𝑣2 − 𝑣1 ∈ 𝑈.

Example 3.5.5 (Quotient space)


We have 𝑉 \ 𝑈 ≔ {𝑣 + 𝑈 : 𝑣 ∈ 𝑉 }.
Set of affine parallel subsets to 𝑈.
E.g. Let 𝑉 = ℝ2 , 𝑈 = {(𝑥, 2𝑥) : 𝑥 ∈ ℝ}.
With 𝑉 \ 𝑈 = the set of all lines parallel to 𝑈.
Then that means that it is the set of all lines with slope 2.
Note:-
An element of ℝ2 \ 𝑈 is a whole line parallel to 𝑈.

This means that 𝑉 \ 𝑈 is an 𝔽−vector space!


Let’s check addition:

(𝑣 + 𝑈) +𝑉\𝑈 (𝑤 + 𝑈) = (𝑣 +𝑉 𝑤) + 𝑈
Scalar multiplication:

𝜆 · (𝑣 + 𝑈) ≔ 𝜆 · 𝑣 + 𝑈
We have to check that, e.g., addition is well defined:
Say 𝑣1 + 𝑈 = 𝑣2 + 𝑈 and 𝑤1 + 𝑈 = 𝑤2 + 𝑈.
Then we need to show that:

(𝑣1 + 𝑈) + (𝑤1 + 𝑈) =? (𝑣2 + 𝑈) + (𝑤2 + 𝑈)
⇐⇒ (𝑣1 + 𝑤1 ) + 𝑈 = (𝑣2 + 𝑤2 ) + 𝑈
⇐⇒ (𝑣1 + 𝑤1 ) − (𝑣2 + 𝑤2 ) ∈ 𝑈
⇐⇒ (𝑣1 − 𝑣2 ) + (𝑤1 − 𝑤2 ) ∈ 𝑈,

which holds since 𝑣1 − 𝑣2 ∈ 𝑈 and 𝑤1 − 𝑤2 ∈ 𝑈.


We also know that scalar multiplication is well defined:
Say 𝑣1 + 𝑈 = 𝑣2 + 𝑈.

=⇒ 𝑣1 − 𝑣2 ∈ 𝑈
=⇒ 𝜆 · (𝑣 1 − 𝑣 2 ) ∈ 𝑈
=⇒ 𝜆 · 𝑣 1 − 𝜆 · 𝑣2 ∈ 𝑈
=⇒ 𝜆 · 𝑣 1 + 𝑈 = 𝜆 · 𝑣2 + 𝑈
=⇒ 𝜆 · (𝑣 1 + 𝑈) = 𝜆 · (𝑣 2 + 𝑈)
Let’s give an example:

Example 3.5.6
Let:

𝜋 : 𝑉 → 𝑉 \ 𝑈
𝑣 ↦→ 𝑣 + 𝑈

Check that 𝜋 is linear.


Say 𝑉 is finite dimensional, then so is 𝑈.
By rank-nullity, dim 𝑉 = dim ker 𝜋 + dim 𝐼𝑚(𝜋).
𝜋 is surjective, which means that dim 𝐼𝑚(𝜋) = dim 𝑉 \ 𝑈.
But 𝐼𝑚(𝜋) is finite dimensional, which means that 𝑉 \ 𝑈 is also finite dimensional.
Let’s prove it.

ker 𝜋 = { 𝑣 ∈ 𝑉 : 𝜋(𝑣) = 0®𝑉\𝑈 }
      = { 𝑣 ∈ 𝑉 : 𝑣 + 𝑈 = 0®𝑣 + 𝑈 }
      = { 𝑣 ∈ 𝑉 : 𝑣 + 𝑈 = 𝑈 }
      = { 𝑣 ∈ 𝑉 : 𝑣 ∈ 𝑈 }
      = 𝑈

Hence dim(𝑉 \ 𝑈) = dim 𝐼𝑚(𝜋) = dim 𝑉 − dim ker 𝜋 = dim 𝑉 − dim 𝑈.

Theorem 3.5.1 1st isomorphism theorem
Let 𝑇 ∈ ℒ(𝑉 , 𝑊) with 𝐼𝑚(𝑇) ⊆ 𝑊, ker 𝑇 ⊆ 𝑉.
Which means that 𝑉 → 𝑉 \ ker 𝑇.
Let’s define the following:

𝑇
e : 𝑉 \ ker 𝑇 → 𝐼𝑚(𝑇)
𝑣 + ker 𝑇 ↦→ 𝑇(𝑣)

Claims: We have the following claims:

(i) 𝑇
e is well defined.
If 𝑣 1 + ker 𝑇 = 𝑣2 + ker 𝑇, then we want to show that 𝑇(𝑣
e 1 + ker 𝑇) = 𝑇(𝑣
e 2 + ker 𝑇).
We have:

𝑣1 − 𝑣2 ∈ ker 𝑇
=⇒ 𝑇(𝑣1 − 𝑣2 ) = 0®𝑊
=⇒ 𝑇(𝑣1 ) − 𝑇(𝑣2 ) = 0®𝑊 , i.e. 𝑇(𝑣1 ) = 𝑇(𝑣2 ).

Thus, it is well defined.

(ii) 𝑇
e is linear.
e.g., 𝑇((𝑣
e + ker 𝑇) + (𝑤 + ker 𝑇)) = 𝑇((𝑣
e + 𝑤) + ker 𝑇).
This is equal to 𝑇(𝑣 + 𝑤) = 𝑇(𝑣) + 𝑇(𝑤)
Meaning that 𝑇(𝑣
e + ker 𝑇) + 𝑇(𝑤
e + ker 𝑇).
We leave homogeneity as an exercise.

(iii) 𝑇
e is injective!
Say 𝑇(𝑣
e + ker 𝑇) = 0®𝑊
This means that 𝑇(𝑣) = 0®𝑊 .
Which implies that 𝑣 ∈ ker 𝑇.
Hence, 0® + ker 𝑇 = 𝑣 + ker 𝑇.
This is 0𝑉\ker
® 𝑇

(iv) 𝑇
e is surjective!
Let 𝑤 ∈ 𝐼𝑚(𝑇).
Then 𝑤 = 𝑇(𝑣) for some 𝑣 ∈ 𝑉.
Which means that 𝑤 = 𝑇(𝑣
e + ker 𝑇).

Thus, 𝑇
e ∈ ℒ(𝑉 \ ker 𝑇, 𝐼𝑚(𝑇)) is an isomorphism of 𝔽 vector spaces.
i.e.,

𝑉 \ ker 𝑇  𝐼𝑚(𝑇)

Chapter 4

Polynomials

Definition 4.0.1

Let 𝑧 = 𝑎 + 𝑏𝑖 where 𝑎, 𝑏 ∈ ℝ. Then:

(i) The real part of 𝑧 is 𝑎, denoted <(𝑧) or Re(𝑧).


(ii) The imaginary part of 𝑧 is 𝑏, denoted =(𝑧) or Im(𝑧).

Hence, 𝑧 = <(𝑧) + 𝑖=(𝑧).

Definition 4.0.2

Let 𝑧 ∈ ℂ, then
The complex conjugate of 𝑧 is 𝑧̄ = Re(𝑧) − Im(𝑧) 𝑖.
The absolute value of 𝑧 is |𝑧| = √( Re(𝑧)2 + Im(𝑧)2 ).

Properties of Complex numbers: Let 𝑤, 𝑧 ∈ ℂ, where:

𝑧 = 𝑎 + 𝑏𝑖
𝑤 = 𝑐 + 𝑑𝑖
𝑧 = 𝑎 − 𝑏𝑖
𝑤 = 𝑐 − 𝑑𝑖
(i) Sum of 𝑧 and 𝑧: 𝑧 + 𝑧 = 2<(𝑧)

Proof:
𝑧 + 𝑧 = (𝑎 + 𝑏𝑖) + (𝑎 − 𝑏𝑖)
= 2𝑎
= 2<(𝑧)

(ii) Difference of 𝑧 and 𝑧: 𝑧 − 𝑧 = 2𝑖=(𝑧)

Proof:
𝑧 − 𝑧 = (𝑎 + 𝑏𝑖) − (𝑎 − 𝑏𝑖)
= 2𝑏𝑖
= 2=(𝑧)𝑖

(iii) Product of 𝑧 and 𝑧: 𝑧𝑧 = |𝑧| 2

Proof:
𝑧𝑧 = (𝑎 + 𝑏𝑖)(𝑎 − 𝑏𝑖)
= 𝑎 2 − 𝑎𝑏𝑖 + 𝑎𝑏𝑖 − 𝑏 2 𝑖 2
= 𝑎2 + 𝑏2
= |𝑧| 2

(iv) Additivity of complex conjugate: 𝑤 + 𝑧 = 𝑤 + 𝑧

Proof:
𝑧 + 𝑤 = (𝑎 − 𝑏𝑖) + (𝑐 − 𝑑𝑖)
= (𝑎 + 𝑐) − (𝑏 + 𝑑)𝑖
=𝑤+𝑧

(v) Multiplicativity of complex conjugate: 𝑤𝑧 = 𝑤 · 𝑧

Proof:
𝑤 · 𝑧 = (𝑐 − 𝑑𝑖)(𝑎 − 𝑏𝑖)
= 𝑎𝑐 − 𝑎𝑑𝑖 − 𝑏𝑐𝑖 − 𝑏𝑑𝑖 2
= 𝑎𝑐 − 𝑎𝑑𝑖 − 𝑏𝑐𝑖 + 𝑏𝑑
= (𝑎𝑐 + 𝑏𝑑) − (𝑎𝑑 + 𝑏𝑐)𝑖
= 𝑤𝑧

(vi) Conjugate of a conjugate: 𝑧 = 𝑧

Proof:

𝑧 = 𝑎 − 𝑏𝑖
= 𝑎 + 𝑏𝑖
=𝑧

(vii) Real and imaginary parts are bounded by |𝑧|:

Proof:
|𝑧| 2 = 𝑧𝑧
= (𝑎 + 𝑏𝑖)(𝑎 − 𝑏𝑖)
= 𝑎2 + 𝑏2
|𝑧| 2 ¾ 𝑎 2
|𝑧| 2 ¾ 𝑏 2
|𝑧| ¾ 𝑎
|𝑧| ¾ 𝑏

(viii) Absolute value of the complex conjugate: |𝑧| = |𝑧|

Proof:

|𝑧| = |𝑎 − 𝑏𝑖|

= 𝑎2 + 𝑏2
= |𝑧|

(ix) Multiplicativity of absolute value: |𝑤𝑧| = |𝑤||𝑧|

Proof:

|𝑤𝑧| 2 = (𝑤𝑧)(𝑤𝑧)
p
|𝑤𝑧| = (𝑤𝑧)(𝑤𝑧)
p
= (𝑤𝑤)(𝑧𝑧)
√ √
= 𝑤𝑤 𝑧𝑧
= |𝑤||𝑧|

(x) Triangle Inequality: |𝑤 + 𝑧| ≤ |𝑤| + |𝑧|

Proof:

|𝑤 + 𝑧| 2 = (𝑤 + 𝑧)(𝑤 + 𝑧)
= 𝑤𝑤 + 𝑤𝑧 + 𝑧𝑤 + 𝑧𝑧
= |𝑤| 2 + 𝑤𝑧 + 𝑤𝑧 + |𝑧| 2
= |𝑤| 2 + |𝑧| 2 + 2<(𝑤𝑧)
¶ |𝑤| 2 + |𝑧| 2 + 2 |𝑤𝑧|
¶ |𝑤| 2 + |𝑧| 2 + 2 |𝑤| |𝑧|
= (|𝑤| + |𝑧|)2

Definition 4.0.3

Geometric interpretation of complex numbers:


Let 𝑤, 𝑧 ∈ ℂ, 𝜃, 𝜙 ∈ ℝ.
Let’s write 𝑧 = |𝑧| (cos(𝜃) + 𝑖 sin(𝜃)),
And 𝑤 = |𝑤| (cos(𝜙) + 𝑖 sin(𝜙)).
Then:

𝑧𝑤 = |𝑧| |𝑤| (cos(𝜃 + 𝜙) + 𝑖 sin(𝜃 + 𝜙))


Proof: Let’s use trig identities:

𝑧𝑤 = (𝑟(cos(𝜃) + 𝑖 sin(𝜃)))(𝑠(cos(𝜙) + 𝑖 sin(𝜙)))


= 𝑟𝑠(cos(𝜃) cos(𝜙) − sin(𝜃) sin(𝜙) + 𝑖(cos(𝜃) sin(𝜙) + sin(𝜃) cos(𝜙)))
= 𝑟𝑠(cos(𝜃 + 𝜙) + 𝑖 sin(𝜃 + 𝜙))

We used the following trig identities:

cos(𝛼 + 𝛽) = cos(𝛼) cos(𝛽) − sin(𝛼) sin(𝛽)


sin(𝛼 + 𝛽) = cos(𝛼) sin(𝛽) + sin(𝛼) cos(𝛽)
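A quick numerical check of this fact using Python's built-in complex numbers (my own sketch, not part of the notes): moduli multiply and arguments add.

import cmath

z = complex(1, 2)
w = complex(-3, 0.5)
rz, theta = cmath.polar(z)           # |z| and its angle
rw, phi = cmath.polar(w)             # |w| and its angle
lhs = z * w
rhs = cmath.rect(rz * rw, theta + phi)   # |z||w| (cos(theta+phi) + i sin(theta+phi))
print(abs(lhs - rhs) < 1e-12)            # True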

Theorem 4.0.1
Let 𝑎 0 , . . . , 𝑎 𝑚 ∈ 𝔽. If:

𝑎0 + 𝑎1 𝑥 + . . . + 𝑎 𝑚 𝑥 𝑚 = 0
For every 𝑥 ∈ 𝔽, then 𝑎0 = . . . = 𝑎 𝑚 = 0.
Proof: We prove the contrapositive. Let our polynomial be given by

𝑝(𝑥) = 𝑎0 + 𝑎1 𝑥 + · · · + 𝑎 𝑚 𝑥 𝑚
If this polynomial is not the zero function, then there exists some coefficient 𝑎 𝑘 ≠ 0.
Without loss of generality, let’s assume that 𝑎 𝑚 is that coefficient.
We want to show that there exists some value 𝑥 = 𝑧 for which the polynomial does not evaluate to zero.
Specifically, we’ll show that the term 𝑎 𝑚 𝑧 𝑚 will dominate all other terms for a sufficiently large 𝑧,
such that the polynomial cannot evaluate to zero.
To do this, let’s choose a real number 𝑧 ≥ 1 such that

𝑧 > ( Σ_{𝑗=0}^{𝑚−1} |𝑎 𝑗 | ) / |𝑎 𝑚 |.
Given this choice of 𝑧, the magnitude of the term 𝑎 𝑚 𝑧 𝑚 will exceed the combined magnitudes of all the
other terms:

|𝑎 𝑚 𝑧 𝑚 | > |𝑎0 | + |𝑎1 𝑧| + · · · + |𝑎 𝑚−1 𝑧 𝑚−1 |


Now, when we evaluate 𝑝(𝑧):

𝑝(𝑧) = 𝑎0 + 𝑎1 𝑧 + · · · + 𝑎 𝑚−1 𝑧 𝑚−1 + 𝑎 𝑚 𝑧 𝑚


Given our choice of 𝑧, it’s clear that 𝑝(𝑧) ≠ 0.
This completes the proof by contrapositive.
Thus, if a polynomial is the zero function, all of its coefficients must be zero.

Question 2

Fix a real number 𝑐.

(a) Show that if 𝑝 has degree 𝑛 > 0, then there is some monomial 𝑞 such that 𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial
of degree less than 𝑛. (A monomial is a polynomial that has only one non-zero term.)
(b) Suppose that 𝑝 is a polynomial with a root at 𝑥 = 𝑐, i.e., 𝑝(𝑐) = 0. Show that (𝑥 − 𝑐) is a factor of 𝑝
(that is, there is some polynomial 𝑟 such that 𝑝 = (𝑥 − 𝑐)𝑟 ).

Proof of 𝑎: Given polynomial 𝑝 with degree 𝑛 > 0, and the form 𝑝(𝑥) = 𝑎0 + 𝑎1 𝑥 + 𝑎2 𝑥 2 + . . . + 𝑎 𝑛 𝑥 𝑛 , for 𝑎 𝑖 ∈ ℝ.
Let’s fix 𝑐, now, we want to show that there is some monomial 𝑞
such that 𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial of degree less than 𝑛.
Let’s proceed by induction on 𝑛 ∈ ℕ,
Base Case: Let 𝑛 = 1, which means that 𝑝(𝑥) has a degree of 1.

𝑝(𝑥) = 𝑎0 + 𝑎1 𝑥
Clearly, we can pick 𝑞 = 𝑎1 (as it is a monomial).
Moreover, if we solve for 𝑝 − (𝑥 − 𝑐)𝑞, we get

𝑝 − (𝑥 − 𝑐)𝑞 = (𝑎 0 + 𝑎1 𝑥) − (𝑥 − 𝑐)(𝑎1 )
= 𝑎0 + 𝑎1 𝑥 − 𝑎1 𝑥 + 𝑎1 𝑐
= 𝑎0 + 𝑎1 𝑐

Notice, that 𝑎0 + 𝑎1 𝑐 is a constant polynomial, meaning that its degree is 0, which is less than 1.
Hence, the base case holds.

Inductive Step: Assume the statement holds for all polynomials 𝑝 with degree less than 𝑛.
Thus, for all 𝑘 < 𝑛, 𝑘 ∈ ℕ, we have a monomial 𝑞 s.t. 𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial of degree less than 𝑘.
Now, we want to show that the statement holds for 𝑛.
Let’s consider a polynomial 𝑝 with degree 𝑛, then we can write:

𝑝(𝑥) = 𝑎 𝑛 𝑥 𝑛 + 𝑝 𝑛−1 (𝑥), where 𝑝 𝑛−1 (𝑥) is a polynomial of degree less than 𝑛
By our inductive hypothesis, we know that there is some monomial 𝑞 𝑛−1 (𝑥)
such that 𝑝 𝑛−1 − (𝑥 − 𝑐)𝑞 𝑛−1 is a polynomial of degree less than 𝑛 − 1.
Combining this information, let’s pick 𝑞(𝑥) = 𝑎 𝑛 𝑥 𝑛−1 . Clearly, 𝑞(𝑥) is a monomial.
Thus, we have:

𝑝 − (𝑥 − 𝑐)𝑞 = (𝑎 𝑛 𝑥 𝑛 + 𝑝 𝑛−1 (𝑥)) − (𝑥 − 𝑐)(𝑎 𝑛 𝑥 𝑛−1 )


= 𝑎 𝑛 𝑥 𝑛 + 𝑝 𝑛−1 (𝑥) − 𝑎 𝑛 𝑥 𝑛 + 𝑎 𝑛 𝑐𝑥 𝑛−1 leading term cancels out
𝑛−1
= 𝑝 𝑛−1 (𝑥) + 𝑎 𝑛 𝑐𝑥

Notice that the degree of 𝑝 𝑛−1 (𝑥) + 𝑎 𝑛 𝑐𝑥 𝑛−1 is less than 𝑛.


This holds as the degree of 𝑝 𝑛−1 (𝑥) is less than 𝑛 − 1 and 𝑎 𝑛 𝑐𝑥 𝑛−1 is a term of degree 𝑛 − 1.
Which means the polynomial 𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial of degree less than 𝑛.
Therefore, the inductive step holds.
Thus, by the principle of mathematical induction, we have shown that if 𝑝 has degree 𝑛 > 0,
then there is some monomial 𝑞 such that 𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial of degree less than 𝑛 for all 𝑛 ∈ ℕ.

Proof of 𝑏: Assume that 𝑝 is a polynomial with a root at 𝑥 = 𝑐, i.e., 𝑝(𝑐) = 0.


We want to show that (𝑥 − 𝑐) is a factor of 𝑝, i.e., there is some polynomial 𝑟 such that 𝑝 = (𝑥 − 𝑐)𝑟.
Let’s proceed by induction on 𝑛 ∈ ℕ for the degree of 𝑝.
Note that if 𝑛 = 0, then it is trivially true that 𝑝 = (𝑥 − 𝑐)𝑟.
Base Case: If 𝑛 = 1, then 𝑝(𝑥) = 𝑎0 + 𝑎1 𝑥.
As 𝑝(𝑐) = 0, this implies that 𝑎0 + 𝑎1 𝑐 = 0 and 𝑎0 = −𝑎1 𝑐.
Hence, 𝑝(𝑥) = 𝑎 1 (𝑥 − 𝑐) and (𝑥 − 𝑐) is a factor of 𝑝.
Notice that 𝑟 = 𝑎 1 , so the base case holds.
Inductive Step: Assume that the statement holds for some 𝑘 ∈ ℕ, then
for any polynomial 𝑝 of degree 𝑘 with 𝑝(𝑐) = 0, (𝑥 − 𝑐) is a factor of 𝑝.
Now, let’s consider a polynomial 𝑝 of degree 𝑘 + 1.
By part (a), there exists a monomial 𝑞 such that

𝑝 − (𝑥 − 𝑐)𝑞 is a polynomial of degree less than 𝑘 + 1


Hence, we can write 𝑝 as:

𝑝(𝑥) = (𝑥 − 𝑐)𝑞(𝑥) + 𝑠(𝑥)


Where 𝑠(𝑥) is the difference of the two polynomials with degree less than 𝑘 + 1.
Now substituting 𝑥 = 𝑐 into 𝑝(𝑥), we get:

𝑝(𝑐) = (𝑐 − 𝑐)𝑞(𝑐) + 𝑠(𝑐)


= 0 + 𝑠(𝑐)
=0
=⇒ 𝑠(𝑐) = 0

Thus, by our inductive hypothesis, we know that (𝑥 − 𝑐) is a factor of 𝑠(𝑥).


Which means we can write:

𝑠(𝑥) = (𝑥 − 𝑐)𝑡(𝑥)
Substituting this into our original equation, we get:

𝑝(𝑥) = (𝑥 − 𝑐)𝑞(𝑥) + 𝑠(𝑥)


= (𝑥 − 𝑐)𝑞(𝑥) + (𝑥 − 𝑐)𝑡(𝑥)
= (𝑥 − 𝑐)(𝑞(𝑥) + 𝑡(𝑥))

Thus, (𝑥 − 𝑐) is a factor of 𝑝 with some polynomial 𝑟 = 𝑞(𝑥) + 𝑡(𝑥).


Completing the inductive step.
Thus, through the principle of mathematical induction,
we have shown that if 𝑝 is a polynomial with a root at 𝑥 = 𝑐, i.e., 𝑝(𝑐) = 0,
then (𝑥 − 𝑐) is a factor of 𝑝, i.e., there is some polynomial 𝑟 such that 𝑝 = (𝑥 − 𝑐)𝑟.

Chapter 5

Eigenvalues, Eigenvectors, and


Invariant Subspaces

5.1 Invariant subspaces (5.A + 5.B)


Note:-
Goal: understand the building blocks / internal structure of 𝑇 ∈ ℒ(𝑉), especially when 𝑉 is finite-dimensional.
Idea: Maybe 𝑉 = 𝑈1 ⊕ . . . ⊕ 𝑈𝑚 .
Restrict attention to 𝑇 |𝑈𝑖 : 𝑈 𝑖 → 𝑉.

Definition 5.1.1

Let 𝑈 ⊆ 𝑉 is an invariant subspace under 𝑇 if

𝑢 ∈ 𝑈 =⇒ 𝑇(𝑢) ∈ 𝑈
in other words, if 𝐼𝑚(𝑇 |𝑈 ) ⊆ 𝑈,
or 𝑇 |𝑈 : 𝑈 → 𝑈, i.e., 𝑇 | 𝑢 ∈ ℒ(𝑈) where 𝑇 : 𝑉 → 𝑉.

Example 5.1.1
What does a 1 dimensional invariant subspace under 𝑇 look like?
𝑈 = span (𝑣). Then 𝑇(𝑣) ∈ 𝑈, so 𝑇(𝑣) = 𝜆𝑣 for some 𝜆 ∈ 𝔽.
Conversely, if 𝑣 ≠ 0®𝑣 and 𝑇(𝑣) = 𝜆𝑣 for some 𝜆 ∈ 𝔽,
then 𝑈 = span (𝑣) is 1-dimensional invariant subspace under 𝑇.
We call 𝜆 an eigenvalue of 𝑇.
If 𝑣 ≠ 0®𝑣 , then 𝑣 is an eigenvector for the eigenvalue 𝜆.

Proposition 5.1.1
Suppose that 𝑉 is a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).
Then the following are equivalent:

(a) 𝜆 ∈ 𝔽 is an eigenvalue of 𝑇.
(b) 𝑇 − 𝜆𝐼𝑑 is not injective.
(c) 𝑇 − 𝜆𝐼𝑑 is not surjective.
(d) 𝑇 − 𝜆𝐼𝑑 is not invertible.

Where 𝑇 − 𝜆𝐼𝑑 ∈ ℒ(𝑉):

(𝑇 − 𝜆𝐼𝑑)(𝑣) = 𝑇(𝑣) − 𝜆𝐼𝑑(𝑣)
= 𝑇(𝑣) − 𝜆𝑣

And given 𝑇, 𝑆 ∈ ℒ(𝑉 , 𝑊), (𝑇 + 𝑆)(𝑣) = 𝑇(𝑣) + 𝑆(𝑣) for all 𝑣 ∈ 𝑉.


Thus, 𝑇, 𝜆𝐼𝑑 ∈ ℒ(𝑉 , 𝑉), so 𝑇 − 𝜆𝐼𝑑 ∈ ℒ(𝑉 , 𝑉).
Proof of 1 ⇐⇒ 2 : 𝜆 is an eigenvalue of 𝑇 ⇐⇒ there is some 𝑣 ≠ 0®𝑣 with 𝑇(𝑣) = 𝜆𝑣 ⇐⇒ (𝑇 − 𝜆𝐼𝑑)(𝑣) = 0®𝑣 for some 𝑣 ≠ 0®𝑣 ⇐⇒ ker(𝑇 − 𝜆𝐼𝑑) ≠ {0®𝑣 } ⇐⇒ 𝑇 − 𝜆𝐼𝑑 is not injective.

Before we continue, let’s prove a claim:

Claim 5.1.1
Eigenvectors corresponding to distinct eigenvalues are linearly independent.
Let 𝑇 ∈ ℒ(𝑉), and let 𝜆1 , . . . , 𝜆𝑚 be distinct eigenvalues of 𝑇,
with eigenvectors 𝑣 1 , . . . , 𝑣 𝑚 respectively.
Then 𝑣1 , . . . , 𝑣 𝑚 are linearly independent.
Proof: Suppose for contradiction that 𝑣1 , . . . , 𝑣 𝑚 are linearly dependent.
Then by the linear dependence lemma, there exists a (smallest) 𝑘 ∈ {1, . . . , 𝑚} such that

𝑣 𝑘 ∈ span (𝑣1 , . . . , 𝑣 𝑘−1 ) (so 𝑣1 , . . . , 𝑣 𝑘−1 are linearly independent)


This implies that 𝑣 𝑘 = 𝑎 1 𝑣1 + . . . + 𝑎 𝑘−1 𝑣 𝑘−1 for some 𝑎1 , . . . , 𝑎 𝑘−1 ∈ 𝔽.
Now apply 𝑇:

𝑇(𝑣 𝑘 ) = 𝑇(𝑎1 𝑣1 + . . . + 𝑎 𝑘−1 𝑣 𝑘−1 )


= 𝑎1 𝑇(𝑣1 ) + . . . + 𝑎 𝑘−1 𝑇(𝑣 𝑘−1 )
= 𝑎1 𝜆1 𝑣1 + . . . + 𝑎 𝑘−1 𝜆 𝑘−1 𝑣 𝑘−1
𝜆 𝑘 · 𝑣 𝑘 = 𝑎1 𝜆1 𝑣1 + . . . + 𝑎 𝑘−1 𝜆 𝑘−1 𝑣 𝑘−1

Now take 𝜆 𝑘 · (𝑎1 𝑣1 + . . . + 𝑎 𝑘−1 𝑣 𝑘−1 ) − (𝑎1 𝜆1 𝑣1 + . . . + 𝑎 𝑘−1 𝜆 𝑘−1 𝑣 𝑘−1 ):

0®𝑣 = 𝑎1 (𝜆 𝑘 − 𝜆1 )𝑣1 + . . . + 𝑎 𝑘−1 (𝜆 𝑘 − 𝜆 𝑘−1 )𝑣 𝑘−1


Since 𝑣1 , . . . , 𝑣 𝑘−1 are linearly independent, this implies that:

𝑎1 (𝜆 𝑘 − 𝜆1 ) = . . . = 𝑎 𝑘−1 (𝜆 𝑘 − 𝜆 𝑘−1 ) = 0
Since we are given that 𝜆1 , . . . , 𝜆𝑚 are distinct, we have that 𝜆 𝑘 − 𝜆 𝑖 ≠ 0 for all 𝑖 ∈ {1, . . . , 𝑘 − 1}.
This means that 𝑎1 = . . . = 𝑎 𝑘−1 = 0𝔽 .
Thus, 𝑣 𝑘 = 0®𝑣 thus 𝑣 𝑘 is not an eigenvector.
Which is a contradiction!

Thus, the claim holds.

Last Time: Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).


i.e., 𝑇 : 𝑉 → 𝑉 is linear.
We know that 𝑉 ≅ 𝔽𝑛 for some 𝑛 ∈ ℕ.
Think 𝑇 : 𝔽𝑛 → 𝔽𝑛 .
Let 𝐴 = ℳ(𝑇, (𝑒1 , . . . , 𝑒 𝑛 )), where 𝑒1 , . . . , 𝑒 𝑛 is the standard basis for 𝑉.
We defined the characteristic polynomial of 𝑇 to be:

det(𝐴 − 𝑥𝐼𝑛 ) ∈ 𝑃𝑛 (𝔽)


We showed 𝜆 ∈ 𝔽 is an eigenvalue for 𝑇 ⇐⇒ det(𝐴 − 𝜆𝐼𝑛 ) = 0.

The first part shows that there exists 𝑣 ≠ 0®𝑣 , 𝑇(𝑣) = 𝜆 · 𝑣

Theorem 5.1.1
Let 𝑉 ≠ {0®𝑣 } be a finite-dimensional vector space over ℂ.
Let 𝑇 ∈ ℒ(𝑉).
Then 𝑇 has at least one eigenvalue.

Proof: Let 𝑛 = dim 𝑉. Note 𝑛 ≥ 1, since 𝑉 ≠ {0®𝑣 }.
Then det(𝐴 − 𝑥𝐼𝑛 ) is a polynomial of degree 𝑛 with complex coefficients.
By the fundamental theorem of algebra (proved in Math 427),
a non-constant polynomial with complex coefficients has a root in ℂ.
Thus, there exists 𝜆 ∈ ℂ such that det(𝐴 − 𝜆𝐼𝑛 ) = 0.

Example 5.1.2
Let

1 4 5
𝐴 = ­0 2 6®
© ª

«0 0 3¬
Let:

1 4 5 𝑥 0 0
det(𝐴 − 𝑥 · 𝐼3 ) = det ­­0 2 6® − ­ 0 𝑥 0 ®®
©© ª © ªª

« «0 0 3¬ « 0 0 𝑥 ¬¬
Thus, we get the following:

1−𝑥 4 5
det ­­ 0 2−𝑥 6 ®® = (1 − 𝑥)(2 − 𝑥)(3 − 𝑥)
©© ªª

«« 0 0 3 − 𝑥 ¬¬
Roots of the characteristic polynomial are 𝑥 = 1, 2, or 3.
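A numerical cross-check of this example (my own sketch, not part of the notes): the eigenvalues of an upper-triangular matrix are its diagonal entries.

import numpy as np

A = np.array([[1., 4., 5.],
              [0., 2., 6.],
              [0., 0., 3.]])
print(sorted(np.linalg.eigvals(A)))   # approximately [1.0, 2.0, 3.0]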

Change of basis: Does the characteristic polynomial depend on 𝐴, or does it depend only on 𝑇?

𝑇 : 𝔽𝑛 → 𝔽𝑛
We can have:

𝐴 = ℳ(𝑇, (𝑒1 , . . . , 𝑒 𝑛 ))
𝐴0 = ℳ(𝑇, ( 𝑓1 , . . . , 𝑓𝑛 )) 𝑓 𝑗 = 𝑎 1,𝑗 𝑒1 + . . . + 𝑎 𝑛,𝑗 𝑒 𝑛

The change-of-basis matrix is

𝑃 = (𝑎 𝑖,𝑗 ) ∈ 𝔽𝑛,𝑛 ,

whose 𝑗-th column expresses the new basis vector 𝑓 𝑗 in terms of the old basis 𝑒1 , . . . , 𝑒 𝑛 .
To get from 𝐴 to 𝐴0:

𝐴0 = 𝑃 −1 · 𝐴 · 𝑃
|{z} |{z} |{z}
converts e’s to f’s apply T WRT e’s converts f’s to e’s

What does this mean for our characteristic polynomial?: We have that:

𝑃 −1 (𝐴 − 𝑥𝐼𝑛 )𝑃 = 𝑃 −1 𝐴𝑃 − 𝑃 −1 𝑥𝐼𝑛 𝑃 = 𝑃 −1 𝐴𝑃 − 𝑥𝐼𝑛


| {z }
𝑥·𝐼𝑛
Thus, we can write our characteristic polynomial with respect to 𝑓 ’s as:

det(𝐴0 − 𝑥𝐼𝑛 ) = det(𝑃 −1 𝐴𝑃 − 𝑥𝐼𝑛 )


= det(𝑃 −1 𝐴𝑃 − 𝑥𝑃 −1 𝐼𝑛 𝑃)
= det(𝑃 −1 (𝐴 − 𝑥𝐼𝑛 )𝑃)
= det(𝑃 ⁻¹) · det(𝐴 − 𝑥𝐼𝑛 ) · det(𝑃)
1
= · det(𝐴 − 𝑥𝐼𝑛 ) · det(𝑃)
det(𝑃)
= det(𝐴 − 𝑥𝐼𝑛 ) which is our characteristic polynomial with respect to E’s
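A symbolic check of this computation (my own sketch using sympy; the matrix 𝑃 below is an arbitrary invertible example, not one from the notes):

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[1, 4, 5], [0, 2, 6], [0, 0, 3]])
P = sp.Matrix([[1, 1, 0], [0, 1, 1], [1, 0, 1]])      # any invertible change-of-basis matrix
A_prime = P.inv() * A * P
I3 = sp.eye(3)
p1 = sp.expand((A - x * I3).det())
p2 = sp.expand((A_prime - x * I3).det())
print(sp.simplify(p1 - p2) == 0)    # True: the characteristic polynomial is basis-independent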
Our next goal is to find a basis of 𝑉 such that ℳ(𝑇) has many zeros!
This makes computing determinants and eigenvalues easier.
Definition 5.1.2

ℳ(𝑇) is upper triangular if every entry below the diagonal is 0.


For instance:

1 2 3
­ 0 4 5®
© ª

« 0 0 6¬

Proposition 5.1.2
Let 𝑇 ∈ ℒ(𝑉). 𝑉 is a finite-dimensional vector space over 𝔽.
Let 𝑣®1 , . . . , 𝑣®𝑛 be a basis for 𝑉.
The following are equivalent:
1. ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) is upper triangular.
2. 𝑇(𝑣 𝑗 ) ∈ span 𝑣1 , . . . , 𝑣 𝑗 for all 𝑗 ∈ {1, . . . , 𝑛}.


3. For all 𝑗 ∈ {1, . . . , 𝑛} , span 𝑣 1 , . . . , 𝑣 𝑗 is invariant subspace for 𝑇.




This means that 𝑇(span 𝑣1 , . . . , 𝑣 𝑗 ) ⊆ span 𝑣1 , . . . , 𝑣 𝑗 .


 

Proof of 1 ⇐⇒ 2 : Definition of ℳ(𝑇).


Proof of 3 =⇒ 2 : Immediate from the definition of invariant subspace.
Proof of 2 =⇒ 3 : Fix 𝑗 ≥ 1.
Then:

𝑇(𝑣1 ) ∈ span (𝑣1 ) ⊆ span 𝑣 1 , . . . , 𝑣 𝑗




𝑇(𝑣2 ) ∈ span (𝑣1 , 𝑣 2 ) ⊆ span 𝑣1 , . . . , 𝑣 𝑗




..
.
𝑇(𝑣 𝑗 ) ∈ span 𝑣1 , . . . , 𝑣 𝑗


Since 𝑇 is linear.
Then this implies that 𝑇(𝑎1 𝑣1 + . . . + 𝑎 𝑗 𝑣 𝑗 ) = 𝑎1 𝑇(𝑣1 ) + . . . + 𝑎 𝑗 𝑇(𝑣 𝑗 ) ∈ span 𝑣1 , . . . , 𝑣 𝑗 .

Hence, 𝑇(span 𝑣1 , . . . , 𝑣 𝑗 ) ⊆ span 𝑣1 , . . . , 𝑣 𝑗 .
 

Theorem 5.1.2
Let 𝑉 be a finite-dimensional vector space over ℂ and 𝑇 ∈ ℒ(𝑉).
Then there exists a basis 𝑣®1 , . . . , 𝑣®𝑛 of 𝑉 such that ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) is upper triangular (UT).
For this, we need the above proposition.
Proof: Let’s proceed on induction on 𝑛 = dim 𝑉.
Base Case: Let 𝑛 = 1, then clearly every 1 × 1 matrix is upper triangular.

Inductive Step: Assume that the statement holds for all 𝑆 ∈ ℒ(𝑊) with dim 𝑊 < dim 𝑉.
Let 𝜆 ∈ ℂ be an eigenvalue for 𝑇 (one exists by Theorem 5.1.1). This means there exists 𝑣 ≠ 0®𝑣 with 𝑇(𝑣) = 𝜆𝑣.
Now consider (𝑇 − 𝜆 · 𝐼𝑑 : 𝑉 → 𝑉).
Set 𝑊 = 𝐼𝑚(𝑇 − 𝜆 · 𝐼𝑑) ⊆ 𝑉.

Claim: 𝑊 is an invariant subspace under 𝑇.


Proof of claim: Let 𝑤 ∈ 𝑊. Then:

𝑇(𝑤) = 𝑇(𝑤) − 𝜆 · 𝑤 + 𝜆 · 𝑤
= (𝑇 − 𝜆 · 𝐼𝑑)(𝑤) + 𝜆 · 𝑤
| {z } |{z}
∈𝑊 by definition ∈𝑊

∈ 𝑊 since 𝑊 is closed under addition

Since 𝜆 is an eigenvalue, we know that 𝑇 − 𝜆 · 𝐼𝑑 is not surjective.


Which implies that 𝑊 ⊊ 𝑉, thus dim 𝑊 < dim 𝑉.
With the claim, we can write:

𝑇 |𝑊 : 𝑊 → 𝑊
This implies that 𝑇 |𝑊 ∈ ℒ(𝑊).
By our inductive hypothesis, there exists a basis 𝑤®1 , . . . , 𝑤®𝑚 of 𝑊 such that
ℳ(𝑇 |𝑊 , (𝑤 1 , . . . , 𝑤 𝑚 )) is upper triangular. 
By our proposition, 𝑇(𝑤 𝑗 ) ∈ span 𝑤1 , . . . , 𝑤 𝑗 for some 𝑗.
Extend to a basis for 𝑉: 𝑤®1 , . . . , 𝑤®𝑚 , 𝑣®1 , . . . , 𝑣 𝑛−𝑚
® .
Then, for 𝑘 = 1, . . . , 𝑛 − 𝑚, we have:

𝑇(𝑣 𝑘 ) = 𝑇(𝑣 𝑘 ) − 𝜆 · 𝑣 𝑘 + 𝜆 · 𝑣 𝑘
= (𝑇 − 𝜆 · 𝐼𝑑)(𝑣 𝑘 ) +𝜆 · 𝑣 𝑘
| {z }
∈𝑊=span 𝑤®1 ,..., 𝑤®𝑚


∈ span (𝑤 1 , . . . , 𝑤 𝑚 , 𝑣 𝑘 ) ⊆ span (𝑤 1 , . . . , 𝑤 𝑚 , 𝑣 1 , . . . , 𝑣 𝑘 )

By the proposition, ℳ(𝑇, (𝑣 1 , . . . , 𝑣 𝑛 )) is upper triangular.


Thus, through the principle of strong mathematical induction, the statement holds for all finite-dimensional
vector spaces over ℂ.

Claim 5.1.2 Upper triangular + invertibility


Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).
Now suppose there exists a basis for 𝑉 such that ℳ(𝑇) is UT (e.g., 𝔽 = ℂ).
Then 𝑇 is invertible ⇐⇒ all diagonal entries of ℳ(𝑇) are non-zero.
Proof: By hypothesis, there exists a basis 𝑣®1 , . . . , 𝑣®𝑛 of 𝑉 such that:

ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) = the upper-triangular matrix with diagonal entries 𝜆1 , . . . , 𝜆 𝑛 (and ★ above the diagonal).

Now we prove both directions of the biconditional.

⇐= : Suppose 𝜆 𝑖 ≠ 0 for all 𝑖 ∈ {1, . . . , 𝑛}.


Then 𝑇(𝑣1 ) = 𝜆1 𝑣1 .
Since 𝜆1 ≠ 0,

𝑣1 = (1/𝜆1 ) 𝑇(𝑣1 ) = 𝑇( (1/𝜆1 ) 𝑣1 ),

which means that 𝑣1 ∈ 𝐼𝑚(𝑇).
Next, 𝑇(𝑣2 ) = 𝑎1,2 𝑣1 + 𝜆2 𝑣2 . If 𝜆2 ≠ 0, then

𝑣2 = (1/𝜆2 ) 𝑇(𝑣2 ) − (𝑎1,2 /𝜆2 ) 𝑣1 = 𝑇( (1/𝜆2 ) 𝑣2 ) − (𝑎1,2 /𝜆2 ) 𝑣1 ∈ 𝐼𝑚(𝑇),

since 𝑣1 ∈ 𝐼𝑚(𝑇) and 𝐼𝑚(𝑇) is a subspace.

Now, induct on 𝑛, to show 𝑣 𝑛 ∈ 𝐼𝑚(𝑇).


This means that span (𝑣1 , . . . , 𝑣 𝑛 ) ⊆ 𝐼𝑚(𝑇).
Which means that 𝑉 ⊆ 𝐼𝑚(𝑇) ⊆ 𝑉.
Thus, 𝑇 is surjective.
Which means that 𝑇 is invertible since we are working in a finite dimensional vector space.

=⇒ : We prove the contrapositive. Suppose ∃ 𝑗 ∈ {1, . . . , 𝑛} such that 𝜆 𝑗 = 0.


Then 𝑇(𝑣 𝑗 ) = 𝑎 1,𝑗 𝑣1 + . . . + 𝑎 𝑗−1,𝑗 𝑣 𝑗−1 + 𝜆 𝑗 𝑣 𝑗 .
Notice that the last term is 0, so 𝑇(𝑣 𝑗 ) ∈ span 𝑣 1 , . . . , 𝑣 𝑗−1 .

Thus, 𝑇(span 𝑣1 , . . . , 𝑣 𝑗 ) ⊆ span 𝑣1 , . . . , 𝑣 𝑗−1 .
 
But span (𝑣1 , . . . , 𝑣 𝑗 ) has dimension 𝑗 while span (𝑣1 , . . . , 𝑣 𝑗−1 ) has dimension 𝑗 − 1.
Which means that 𝑇 | span(𝑣1 ,...,𝑣 𝑗 ) : span (𝑣1 , . . . , 𝑣 𝑗 ) → span (𝑣1 , . . . , 𝑣 𝑗−1 ) cannot be injective.
Hence 𝑇 is not injective, and thus not invertible.


Hence, the converse holds.

Theorem 5.1.3
If ℳ(𝑇) is UT then the eigenvalues of 𝑇 are the diagonal entries of ℳ(𝑇).

Proof: Say:

ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) is upper triangular with diagonal entries 𝜆1 , . . . , 𝜆 𝑛 .
Let 𝜆 ∈ 𝔽, then ℳ(𝑇 − 𝜆 · 𝐼𝑑) is UT.

ℳ(𝑇 − 𝜆 · 𝐼𝑑, (𝑣1 , . . . , 𝑣 𝑛 )) is upper triangular with diagonal entries 𝜆1 − 𝜆, . . . , 𝜆 𝑛 − 𝜆.
Hence:

𝜆 ∈ 𝔽 is an eigenvalue for 𝑇 ⇐⇒ 𝑇 − 𝜆 · 𝐼𝑑 not invertible


⇐⇒ ℳ(𝑇 − 𝜆 · 𝐼𝑑) not invertible
⇐⇒ 𝜆 𝑖 − 𝜆 = 0 for some 𝑖

5.2 Eigenspaces
Definition 5.2.1: Eigenspaces

Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).


With 𝜆 ∈ 𝔽.
Then the eigenspace corresponding to 𝜆 is:

𝐸(𝜆, 𝑇) ≔ ker(𝑇 − 𝜆 · 𝐼𝑑)


As:

(𝑇 − 𝜆 · 𝐼𝑑)(𝑣) = 0
𝑇(𝑣) − (𝜆 · 𝐼𝑑)(𝑣) = 0
𝑇(𝑣) = 𝜆 · 𝑣

i.e., this is the set of eigenvectors corresponding to 𝜆 together with 0®𝑣 .


Note:-
n o
𝜆 is an eigenvalue for 𝑇 ⇐⇒ 𝐸(𝜆, 𝑇) ≠ 0®𝑣 .

Proposition 5.2.1
Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).
Let 𝜆1 , . . . , 𝜆𝑚 be distinct eigenvalues.
Then 𝐸(𝜆1 , 𝑇), . . . , 𝐸(𝜆𝑚 , 𝑇) ⊆ 𝑉 is a direct sum.
Moreover:

dim 𝐸(𝜆1 , 𝑇) + . . . + dim 𝐸(𝜆𝑚 , 𝑇) ¶ dim 𝑉


Proof: Let 𝑢𝑖 ∈ 𝐸(𝜆 𝑖 , 𝑇) for 𝑖 = 1, . . . , 𝑚.
Suppose that 𝑢1 + . . . + 𝑢𝑚 = 0®𝑣 .
Recall that eigenvectors for distinct eigenvalues are linearly independent.
Which implies that 𝑢𝑖 = 0®𝑣 for all 𝑖.
Thus, 𝐸(𝜆1 , 𝑇), . . . , 𝐸(𝜆𝑚 , 𝑇) is a direct sum.
Thus:

dim 𝐸(𝜆1 , 𝑇) + . . . + dim 𝐸(𝜆𝑚 , 𝑇) = dim (𝐸(𝜆1 , 𝑇) + . . . + 𝐸(𝜆𝑚 , 𝑇)) ¶ dim 𝑉

Definition 5.2.2

Diagonal matrix: 𝐴 ∈ 𝔽𝑛,𝑛 :

𝐴 = the diagonal matrix with diagonal entries 𝜆1 , . . . , 𝜆 𝑛 (all other entries 0).
Then, 𝑇 ∈ ℒ(𝑉) is diagonalizable if there is a basis 𝑣1 , . . . , 𝑣 𝑛 of 𝑉 such that:
ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) is diagonal.

Example 5.2.1
𝑇 : ℝ3 → ℝ3 with ℳ(𝑇, (𝑒1 , 𝑒2 , 𝑒3 ) is:

5 0 0
­ 0 8 0®
© ª

« 0 0 8¬
Then 𝑇(𝑥, 𝑦, 𝑧) = (5𝑥, 8𝑦, 8𝑧).
With 𝑇(𝑒1 ) = 5𝑒1 , 𝑇(𝑒2 ) = 8𝑒2 , 𝑇(𝑒3 ) = 8𝑒3 .
Then 𝐸(5, 𝑇) = span (𝑒1 ), 𝐸(8, 𝑇) = span (𝑒2 , 𝑒3 ), and 𝐸(5, 𝑇) ⊕ 𝐸(8, 𝑇) = ℝ3 .
Thus:

dim 𝐸(5, 𝑇) + dim 𝐸(8, 𝑇) ≤ dim ℝ3

Example 5.2.2
Let 𝑇 : ℝ2 → ℝ2 with (𝑥, 𝑦) ↦→ (41𝑥 + 7𝑦, −20𝑥 + 74𝑦).
Use standard basis: 𝑒1 = (1, 0), 𝑒2 = (0, 1)

𝑇(𝑒1 ) = (41, −20) = 41𝑒1 − 20𝑒2


𝑇(𝑒2 ) = (7, 74) = 7𝑒1 + 74𝑒2

Thus:
 
41 7
ℳ(𝑇, (𝑒1 , 𝑒2 )) =
−20 74
Now try 𝑣1 = (1, 4), 𝑣 2 = (7, 5).
We claim that 𝑣1 , 𝑣 2 is a basis for ℝ2 .

𝑇(𝑣1 ) = (69, 276) = 69𝑣1 + 0𝑣 2


𝑇(𝑣2 ) = (322, 230) = 0𝑣1 + 46𝑣2

Thus, we get:
 
69 0
ℳ(𝑇, (𝑣 1 , 𝑣 2 )) =
0 46
So 𝑇 is diagonalizable.

𝐸(69, 𝑇) ⊇ span (𝑣1 )


𝐸(46, 𝑇) ⊇ span (𝑣2 )

Hence:

1 + 1 ≤ dim 𝐸(69, 𝑇) + dim 𝐸(46, 𝑇) ≤ dim ℝ2 = 2


Which implies that 𝐸(69, 𝑇) = span (𝑣 1 ) and 𝐸(46, 𝑇) = span (𝑣2 ).
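A quick numerical check of this example (my own sketch, not part of the notes):

import numpy as np

M = np.array([[41., 7.],
              [-20., 74.]])
v1 = np.array([1., 4.])
v2 = np.array([7., 5.])
print(M @ v1, 69 * v1)   # both (69, 276)
print(M @ v2, 46 * v2)   # both (322, 230)
print(sorted(np.linalg.eigvals(M)))   # approximately [46.0, 69.0]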

Theorem 5.2.1
Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝑇 ∈ ℒ(𝑉).
Let 𝜆1 , . . . , 𝜆𝑚 be a complete list of the distinct eigenvalues of 𝑇.
The following are equivalent:

1. 𝑇 is diagonalizable.
2. 𝑉 has a basis consisting of eigenvectors of 𝑇.

3. 𝑉 = 𝐸(𝜆1 , 𝑇) ⊕ . . . ⊕ 𝐸(𝜆𝑚 , 𝑇).


4. dim 𝑉 = dim 𝐸(𝜆1 , 𝑇) + . . . + dim 𝐸(𝜆𝑚 , 𝑇).

Proof: We want to show:

1 ⇐⇒ 2
2 =⇒ 3
3 =⇒ 4
4 =⇒ 2

Let’s start:
Proof of 1 ⇐⇒ 2 : This is trivial.
If ℳ(𝑇, (𝑣1 , . . . , 𝑣 𝑛 )) is diagonal, then

𝑇(𝑣 𝑖 ) = 𝜇𝑖 𝑣 𝑖 for some 𝜇𝑖 ∈ 𝔽, 𝑖 = 1, . . . , 𝑛

Proof of 2 =⇒ 3 : Say 𝑉 has a basis consisting of eigenvectors of 𝑇.


This means that all 𝑣 ∈ 𝑉 are linear combinations of eigenvectors.
Which means that 𝑉 ⊆ 𝐸(𝜆1 , 𝑇) + . . . + 𝐸(𝜆𝑚 , 𝑇) ⊆ 𝑉.
Thus, 𝑉 = 𝐸(𝜆1 , 𝑇) + . . . + 𝐸(𝜆𝑚 , 𝑇), and this sum is direct by Proposition 5.2.1.
Proof of 3 =⇒ 4 : We showed this in 3.E: the dimension of a direct sum equals the sum of the dimensions of the summands.
Proof of 4 =⇒ 2 : Choose bases for each 𝐸(𝜆 𝑖 , 𝑇) for 𝑖 = 1, . . . , 𝑚.
Concatenate to get a list 𝑣1 , . . . , 𝑣 𝑛 of 𝑉
Claim: 𝑣1 , . . . , 𝑣 𝑛 is a basis for 𝑉.

Proof of claim: We need to show span and linear independence.


Linearly independence:
Suppose that 𝑎1Í 𝑣1 + . . . + 𝑎 𝑛 𝑣 𝑛 = 0®𝑣 .
In other words, 𝑛𝑘=1 𝑎 𝑘 𝑣 𝑘 = 0®𝑣 .
Reorganize as 𝑢Í 1 + . . . + 𝑢𝑚 = 0®𝑣 where 𝑢 𝑖 ∈ 𝐸(𝜆 𝑖 , 𝑇).
By taking 𝑢𝑖 = 𝑘∈𝐾 𝑖 𝑎 𝑘 𝑣 𝑘 ,
where 𝐾 𝑖 = {𝑘 | 𝑣 𝑘 ∈ 𝐸(𝜆 𝑖 , 𝑇)}.
Note: 𝑢𝑖 ∈ 𝐸(𝜆 𝑖 , 𝑇).
The 𝑢𝑖 are either 0, or eigenvectors for distinct eigenvalues.
Such eigenvectors would be LI, as otherwise it would contradict 𝑢1 + . . . + 𝑢𝑚 = 0®𝑣 .

Hence, 𝑢𝑖 Í= 0®𝑣 for 𝑖 = 1, . . . , 𝑚.
But 𝑢𝑖 = 𝑘∈𝐾 𝑖 𝑎 𝑘 𝑣 𝑘 .
Since these 𝑣 𝑘 ’s are LI (they are all in 𝐸(𝜆 𝑖 , 𝑇)).
Which implies that 𝑎 𝑘 = 0 for 𝑘 ∈ 𝐾 𝑖 and 𝑖 = 1, . . . , 𝑚.
Thus, 𝑣1 , . . . , 𝑣 𝑛 is linearly independent.
Now, our condition says that dim 𝑉 = dim 𝐸(𝜆1 , 𝑇) + . . . + dim 𝐸(𝜆𝑚 , 𝑇).
Let’s denote this as 𝑛, which is the dimension of 𝑉.
Hence, it’s a linearly independent list with an appropriate dimension, which implies that it is a basis for
𝑉.

Thus, the implication holds.


Hence, we have shown all the implications; thus the statement holds.

Example 5.2.3
Let 𝑇 : ℂ2 → ℂ2 with 𝑇(𝑤, 𝑧) = (𝑧, 0).  
0 1
Standard basis 𝑒1 , 𝑒2 , then ℳ(𝑇, (𝑒1 , 𝑒2 )) = .
0 0

𝑇(𝑒1 ) = (0, 0) = 0𝑒1 + 0𝑒2


𝑇(𝑒2 ) = (1, 0) = 𝑒1 + 0𝑒2

We know the eigenvalues are 0 and 0. What is 𝐸(0, 𝑇)?

𝐸(0, 𝑇) = 𝑣 ∈ ℂ2 | 𝑇(𝑣) = 0


= (𝑤, 𝑧) ∈ ℂ2 | (𝑧, 0) = (0, 0)




= span ((1, 0))


=⇒ dim 𝐸(0, 𝑇) = 1

Thus, you will never be able to find a basis of eigenvectors for 𝑇 that makes ℳ(𝑇) diagonal.
Since 2 = dim ℂ2 ≠ dim 𝐸(0, 𝑇) = 1, we conclude that 𝑇 is not diagonalizable.

Chapter 6

Inner Product Spaces

6.1 Inner product spaces


Definition 6.1.1: Inner Products

Let 𝑉 be a vector space over 𝔽, with 𝔽 = ℝ or 𝔽 = ℂ.


The inner product:

h, i : 𝑉 ×𝑉 → 𝔽
(𝑣, 𝑤) ↦→ h𝑣, 𝑤i

Such that:

1. h𝑣, 𝑣i ∈ ℝ and h𝑣, 𝑣i ≥ 0 for all 𝑣 ∈ 𝑉

2. h𝑣, 𝑣i = 0 ⇐⇒ 𝑣 = 0®𝑣
3. h𝑢 + 𝑤, 𝑣i = h𝑢, 𝑣i + h𝑤, 𝑣i for all 𝑢, 𝑤, 𝑣 ∈ 𝑉
4. h𝜆 ·𝑣 𝑣, 𝑤i = 𝜆 ·𝔽 h𝑣, 𝑤i for all 𝜆 ∈ 𝔽 and 𝑣, 𝑤 ∈ 𝑉

5. h𝑢, 𝑣i = h𝑣, 𝑢i for all 𝑢, 𝑣 ∈ 𝑉

Example 6.1.1

(i) Let 𝑉 = ℝ𝑛 with 𝔽 = ℝ and h , i = the dot product.


Then,

h(𝑥1 , . . . , 𝑥 𝑛 ), (𝑦1 , . . . , 𝑦𝑛 )i ≔ 𝑥 1 𝑦1 + . . . + 𝑥 𝑛 𝑦𝑛

And:

h(𝑥 1 , . . . , 𝑥 𝑛 ), (𝑥1 , . . . , 𝑥 𝑛 )i = 𝑥12 + . . . + 𝑥 𝑛2 = 0 ⇐⇒ (𝑥1 , . . . , 𝑥 𝑛 ) = (0, . . . , 0)

(ii) Let 𝑉 = ℂ𝑛 with 𝔽 = ℂ and h , i = the dot product.


Then,

h(𝑧1 , . . . , 𝑧 𝑛 ), (𝑤 1 , . . . , 𝑤 𝑛 )i ≔ 𝑧 1 𝑤 1 + . . . + 𝑧 𝑛 𝑤 𝑛

And:

h(𝑧1 , . . . , 𝑧 𝑛 ), (𝑧1 , . . . , 𝑧 𝑛 )i = 𝑧1 𝑧1 + . . . + 𝑧 𝑛 𝑧 𝑛 = |𝑧1 | 2 + . . . + |𝑧 𝑛 | 2 ≥ 0

(iii) 𝑉 = 𝑃(ℝ) with 𝔽 = ℝ and h , i = the integral.


Then,
∫ ∞
h𝑝, 𝑞i ≔ 𝑝(𝑥)𝑞(𝑥) · 𝑒−𝑥 𝑑𝑥
0

(iv) 𝑉 = { 𝑓 : [−1, 1] → ℝ | 𝑓 is continuous} with 𝔽 = ℝ and h , i = the integral.


Then,
∫ 1
h 𝑓 , 𝑔i ≔ 𝑓 (𝑥)𝑔(𝑥)𝑑𝑥
−1

Definition 6.1.2: Inner product space

Vector space with an inner product is called an inner product space.


Consequences of axioms:

(i) Fix 𝑢 ∈ 𝑉. Define:

𝑇𝑢 : 𝑉 → 𝔽
𝑣 ↦→ h𝑣, 𝑢i

Then 𝑇𝑢 is a linear map.


Additivity:

𝑇𝑢 (𝑣 + 𝑤) = h𝑣 + 𝑤, 𝑢i = h𝑣, 𝑢i + h𝑤, 𝑢i = 𝑇𝑢 (𝑣) + 𝑇𝑢 (𝑤)

Homogeneity:

𝑇𝑢 (𝜆 ·𝑣 𝑣) = h𝜆 ·𝑣 𝑣, 𝑢i = 𝜆 ·𝔽 h𝑣, 𝑢i = 𝜆 ·𝔽 𝑇𝑢 (𝑣)
D E D E
(ii) 0®𝑣 , 𝑣 = 0𝔽 : 0®𝑣 , 𝑣 = 𝑇𝑣 (0®𝑣 ) = 0𝔽

(iii) (h𝑣, 𝑢 + 𝑤i = h𝑣, 𝑢i + h𝑣, 𝑤i):


Reasoning:

h𝑣, 𝑢 + 𝑤i = h𝑢 + 𝑤, 𝑣i
= h𝑢, 𝑣i + h𝑤, 𝑣i
= h𝑢, 𝑣i + h𝑤, 𝑣i

= h𝑣, 𝑢i + h𝑣, 𝑤i
= h𝑣, 𝑢i + h𝑣, 𝑤i

D E D E D E
(iv) 𝑣, 0®𝑣 = 0𝔽 : 𝑣, 0®𝑣 = 0®𝑣 , 𝑣 = 0𝔽 = 0𝔽

(v) h𝑣, 𝜆 · 𝑤i = 𝜆 · h𝑣, 𝑤i:


Reason:

h𝑣, 𝜆 · 𝑤i = h𝜆 · 𝑤, 𝑣i
= 𝜆 · h𝑤, 𝑣i
= 𝜆 · h𝑤, 𝑣i
= 𝜆 · h𝑣, 𝑤i

Definition 6.1.3

The inner product gives a notion of the “size” of a vector.

Norm:

k𝑣 k ≔ √h𝑣, 𝑣i

Example 6.1.2

1. 𝑉 = ℝ𝑛 , 𝔽 = ℝ, h , i = dot product.
k(𝑥1 , . . . , 𝑥 𝑛 )k = √( 𝑥12 + . . . + 𝑥 𝑛2 )
2. 𝑉 = { 𝑓 : [−1, 1] → ℝ | 𝑓 is continuous}, 𝔽 = ℝ, h 𝑓 , 𝑔i = ∫_{−1}^{1} 𝑓 (𝑥)𝑔(𝑥) 𝑑𝑥, and

k 𝑓 k = √( ∫_{−1}^{1} 𝑓 (𝑥)2 𝑑𝑥 ).

Meaning k 𝑓 − 𝑔𝑛 k → 0 as 𝑛 → ∞.

For norms:
k𝑣 k = 0 ⇐⇒ √h𝑣, 𝑣i = 0
⇐⇒ h𝑣, 𝑣i = 0𝔽
⇐⇒ 𝑣 = 0®𝑣

Definition 6.1.4: orthogonal

Two vectors 𝑢, 𝑣 are orthogonal if h𝑢, 𝑣i = 0𝔽 .

Example 6.1.3
Let 𝑉 = ℝ2 :

h(𝑥 1 , 𝑦1 ), (𝑥2 , 𝑦2 )i = 0 ⇐⇒ 𝑥 1 𝑥2 + 𝑦1 𝑦2 = 0
𝑥2 𝑦1
⇐⇒ =−
𝑦2 𝑥1
⇐⇒ (𝑥 1 , 𝑦1 ) and (𝑥2 , 𝑦2 ) are perpendicular

Theorem 6.1.1 Pythagoras


Suppose 𝑢, 𝑣 ∈ 𝑉 are orthogonal. Then:

k𝑢 + 𝑣k 2 = k𝑢k 2 + k𝑣k 2
Proof:

k𝑢 + 𝑣 k 2 = h𝑢 + 𝑣, 𝑢 + 𝑣i
= h𝑢, 𝑢 + 𝑣i + h𝑣, 𝑢 + 𝑣i
= h𝑢, 𝑢i + h𝑢, 𝑣i + h𝑣, 𝑢i + h𝑣, 𝑣i
= h𝑢, 𝑢i + 0𝔽 + 0𝔽 + h𝑣, 𝑣i
= k𝑢 k 2 + k𝑣 k 2

Loose End:
k𝜆𝑣 k = |𝜆| · k𝑣k
As both are non-negative.
Proof:
p
k𝜆𝑣k = h𝜆𝑣, 𝜆𝑣i
q
= 𝜆 · 𝜆 · h𝑣, 𝑣i
q
= |𝜆| 2 · h𝑣, 𝑣i
p
= |𝜆| · h𝑣, 𝑣i
= |𝜆| · k𝑣 k

Orthogonal Decompositions: Given 𝑢, 𝑣 ∈ 𝑉 with 𝑣 ≠ 0®𝑣 , find 𝑐 ∈ 𝔽 and 𝑤 ∈ 𝑉 such that we complete the triangle, i.e. 𝑢 = 𝑐 · 𝑣 + 𝑤 with 𝑤 orthogonal to 𝑣.
(Picture: the triangle with 𝑢 as hypotenuse, 𝑐 · 𝑣 along 𝑣, and 𝑤 perpendicular to 𝑣.)
Want:

h𝑤, 𝑣i = 0 ⇐⇒ h𝑢 − 𝑐 · 𝑣, 𝑣i = 0
⇐⇒ h𝑢, 𝑣i − 𝑐 · h𝑣, 𝑣i = 0
⇐⇒ h𝑢, 𝑣i − 𝑐 · k𝑣 k 2 = 0
h𝑢, 𝑣i
=⇒ 𝑐 =
k𝑣 k 2

Note:-
h𝑢, 𝑣i
𝑤 = 𝑢 − 𝑐𝑣 = 𝑢 − ·𝑣
k𝑣 k 2
Where 𝑣 and 𝑤 are orthogonal.

Theorem 6.1.2 Cauchy-Schwarz inequality


Suppose that 𝑉 is an inner product space and 𝑢, 𝑣 ∈ 𝑉. Then:

|h𝑢, 𝑣i| ≤ k𝑢k · k𝑣k


With equality if and only if 𝑢 and 𝑣 are linearly dependent.
Proof: If 𝑣 = 0®𝑣 , then both sides are 0𝔽 .
Note:-
In this case equality holds and 𝑢, 𝑣 are linearly dependent.

If 𝑣 ≠ 0®𝑣 , then 𝑣 and 𝑤 ≔ 𝑢 − (h𝑢, 𝑣i / k𝑣k 2 ) · 𝑣 are orthogonal.
Which means so are 𝛼 · 𝑣 and 𝑤 for any 𝛼 ∈ 𝔽.
Recall that Pythagoras:

k𝑤 + 𝛼 · 𝑣 k 2 = k𝑤k 2 + k𝛼𝑣 k 2 = k𝑤 k 2 + |𝛼| 2 · k𝑣 k 2

Pick 𝛼 = h𝑢, 𝑣i / k𝑣k 2 , so that 𝑤 + 𝛼𝑣 = 𝑢.
This implies that:

k𝑢k 2 = k𝑤k 2 + |𝛼| 2 · k𝑣k 2 = k𝑤k 2 + ( |h𝑢, 𝑣i| 2 / k𝑣k 4 ) · k𝑣k 2 = k𝑤k 2 + |h𝑢, 𝑣i| 2 / k𝑣k 2 ≥ |h𝑢, 𝑣i| 2 / k𝑣k 2 .

So we get:

k𝑢k 2 · k𝑣k 2 ≥ |h𝑢, 𝑣i| 2 =⇒ k𝑢k · k𝑣k ≥ |h𝑢, 𝑣i|.
Notice that equality holds if and only if:

k𝑤 k = 0 ⇐⇒ 𝑤 = 0®𝑣
⇐⇒ 𝑢 − (h𝑢, 𝑣i / k𝑣k 2 ) · 𝑣 = 0®𝑣
⇐⇒ 𝑢 = (h𝑢, 𝑣i / k𝑣k 2 ) · 𝑣
⇐⇒ 𝑢, 𝑣 are linearly dependent

Example 6.1.4

1. Let 𝑉 = ℝ𝑛 , 𝔽 = ℝ, h , i = dot product.

𝑥® = (𝑥 1 , . . . , 𝑥 𝑛 )
𝑦® = (𝑦1 , . . . , 𝑦𝑛 )

C.S. tells us:

|h 𝑥®, 𝑦®i| 2 ≤ k 𝑥®k 2 · k 𝑦®k 2
(𝑥1 𝑦1 + . . . + 𝑥 𝑛 𝑦𝑛 )2 ≤ (𝑥12 + . . . + 𝑥 𝑛2 ) · (𝑦12 + . . . + 𝑦𝑛2 )

∫1
2. Let 𝑉 = { 𝑓 : [−1, 1] → ℝ | 𝑓 is continuous} , 𝔽 = ℝ, h 𝑓 , 𝑔i = −1
𝑓 (𝑥)𝑔(𝑥)𝑑𝑥.
By C.S., we know that:

|h 𝑓 , 𝑔i| 2 ¶ k 𝑓 k 2 · k 𝑔 k 2

Thus:

( ∫_{−1}^{1} 𝑓 (𝑥)𝑔(𝑥) 𝑑𝑥 )2 ≤ ( ∫_{−1}^{1} 𝑓 (𝑥)2 𝑑𝑥 ) · ( ∫_{−1}^{1} 𝑔(𝑥)2 𝑑𝑥 )

Theorem 6.1.3 Triangle Inequality


Given 𝑢, 𝑣 in an inner product space 𝑉 (picture: a triangle with sides 𝑢, 𝑣, and 𝑢 + 𝑣).
The triangle inequality states that:

k𝑢 + 𝑣k ¶ k𝑢 k + k𝑣 k
Proof:

k𝑢 + 𝑣 k 2 = h𝑢 + 𝑣, 𝑢 + 𝑣i
= h𝑢, 𝑢i + h𝑢, 𝑣i + h𝑣, 𝑢i + h𝑣, 𝑣i
= k𝑢 k 2 + h𝑢, 𝑣i + h𝑢, 𝑣i + k𝑣 k 2
= k𝑢 k 2 + 2 · Re h𝑢, 𝑣i + k𝑣k 2 as (𝑎 + 𝑏𝑖) + (𝑎 − 𝑏𝑖) = 2𝑎
≤ k𝑢k 2 + 2 · |h𝑢, 𝑣i| + k𝑣k 2 by ★
≤ k𝑢k 2 + 2 · k𝑢 k · k𝑣k + k𝑣k 2 by C.S.
= (k𝑢 k + k𝑣 k)2

Thus, we get:

k𝑢 + 𝑣 k 2 ¶ (k𝑢 k + k𝑣 k)2
k𝑢 + 𝑣k ¶ k𝑢 k + k𝑣 k

So why is ★ true?


h𝑢, 𝑣i = 𝑎 + 𝑏𝑖 =⇒ |h𝑢, 𝑣i| = √(𝑎 2 + 𝑏 2 ) and Re h𝑢, 𝑣i = 𝑎.

But why is 𝑎 ≤ √(𝑎 2 + 𝑏 2 )?
True since:

𝑎 ≤ |𝑎| = √(𝑎 2 ) ≤ √(𝑎 2 + 𝑏 2 )

Which means the triangle inequality holds.

Example 6.1.5
Let 𝑉 = ℝ𝑛 , 𝑥® = (𝑥1 , . . . , 𝑥 𝑛 ), 𝑦® = (𝑦1 , . . . , 𝑦𝑛 ).
We have:
k 𝑥® + 𝑦®k 2 = k 𝑥®k 2 + 2 · h 𝑥®, 𝑦®i + k 𝑦®k 2

Thus the triangle inequality says:

√( Σ_{𝑖=1}^{𝑛} (𝑥 𝑖 + 𝑦 𝑖 )2 ) ≤ √( Σ_{𝑖=1}^{𝑛} 𝑥 𝑖 2 ) + √( Σ_{𝑖=1}^{𝑛} 𝑦 𝑖 2 ),

i.e.

Σ_{𝑖=1}^{𝑛} (𝑥 𝑖 + 𝑦 𝑖 )2 ≤ Σ_{𝑖=1}^{𝑛} 𝑥 𝑖 2 + 2 · √( ( Σ_{𝑖=1}^{𝑛} 𝑥 𝑖 2 ) · ( Σ_{𝑖=1}^{𝑛} 𝑦 𝑖 2 ) ) + Σ_{𝑖=1}^{𝑛} 𝑦 𝑖 2 .

6.2 Orthonormal bases


Definition 6.2.1

Let 𝑉 be an inner product space over 𝔽.


Let 𝑒1 , . . . , 𝑒 𝑛 be a list of vectors in 𝑉.
Then we say {𝑒1 , . . . , 𝑒 𝑛 } is an orthonormal list if:

h𝑒 𝑖 , 𝑒 𝑗 i = 𝛿 𝑖𝑗 ≔ 1𝔽 if 𝑖 = 𝑗, and 0𝔽 if 𝑖 ≠ 𝑗.

E.g.

( 1/√3 , 1/√3 , 1/√3 ) , ( −1/√2 , 1/√2 , 0 ) ∈ ℝ3

We normalize the vectors to get a length of 1.


If an orthogonal list is also a basis, then the following holds:

k𝑎 1 𝑒1 + . . . + 𝑎 𝑛 𝑒 𝑛 k 2 = h𝑎1 𝑒1 + . . . + 𝑎 𝑛 𝑒 𝑛 , 𝑎 1 𝑒1 + . . . + 𝑎 𝑛 𝑒 𝑛 i
= h𝑎1 𝑒1 , 𝑎 1 𝑒1 i + . . . + h𝑎 𝑛 𝑒 𝑛 , 𝑎 𝑛 𝑒 𝑛 i
= 𝑎1 · 𝑎1 h𝑒1 , 𝑒1 i + . . . + 𝑎 𝑛 · 𝑎 𝑛 h𝑒 𝑛 , 𝑒 𝑛 i
= |𝑎1 | 2 + . . . + |𝑎 𝑛 | 2

A list of orthonormal vectors is linearly independent, but might not span.

Definition 6.2.2

𝑉 is an inner product space over ℝ or ℂ.


Then we say {𝑉1 , . . . , 𝑉𝑛 } is orthonormal if:

h𝑉𝑖 , 𝑉𝑗 i = 1𝔽 if 𝑖 = 𝑗, and 0𝔽 if 𝑖 ≠ 𝑗.

Claim 6.2.1
Suppose {𝑉1 , . . . , 𝑉𝑛 } is orthonormal, then {𝑉1 , . . . , 𝑉𝑛 } is linearly independent.

Proof: Suppose there are some scalars 𝑐 1 , . . . , 𝑐 𝑛 ∈ 𝔽 such that:

𝑐1 𝑣1 + . . . 𝑐 𝑛 𝑣 𝑛 = 0
Then it follows:

h𝑐1 𝑣1 + . . . 𝑐 𝑛 𝑣 𝑛 , 𝑐 1 𝑣1 + . . . 𝑐 𝑛 𝑣 𝑛 i = k𝑐 1 𝑣1 + . . . 𝑐 𝑛 𝑣 𝑛 k 2 = 0
Which means:

|𝑐 1 | 2 + . . . + |𝑐 𝑛 | 2 = 0 =⇒ |𝑐1 | 2 = . . . = |𝑐 𝑛 | 2 = 0
Thus,

𝑐 1 = . . . = 𝑐 𝑛 = 0𝔽

Fact: Suppose {𝑒1 , . . . , 𝑒 𝑛 } is an orthonormal basis for 𝑉, and let 𝑣 ∈ 𝑉. Then 𝑣 = h𝑣, 𝑒1 i 𝑒1 + . . . + h𝑣, 𝑒 𝑛 i 𝑒 𝑛 .


Algorithm 2: Gram-Schmidt Process
Input: 𝑣®1 , . . . , 𝑣®𝑛 ∈ 𝑉, a linearly independent list.
Output: 𝑒1 , . . . , 𝑒 𝑛 ∈ 𝑉 orthonormal with span (𝑒1 , . . . , 𝑒 𝑛 ) = span (𝑣1 , . . . , 𝑣 𝑛 ).

1  𝑒®1 = 𝑣®1 / k 𝑣®1 k ;
2  𝑒®2 = ( 𝑣®2 − h 𝑣®2 , 𝑒®1 i · 𝑒®1 ) / k 𝑣®2 − h 𝑣®2 , 𝑒®1 i · 𝑒®1 k ;      /* so that h𝑒1 , 𝑒2 i = 0𝔽 */
3  𝑒®𝑗 = ( 𝑣®𝑗 − h 𝑣®𝑗 , 𝑒®1 i · 𝑒®1 − . . . − h 𝑣®𝑗 , 𝑒®𝑗−1 i · 𝑒®𝑗−1 ) / k 𝑣®𝑗 − h 𝑣®𝑗 , 𝑒®1 i · 𝑒®1 − . . . − h 𝑣®𝑗 , 𝑒®𝑗−1 i · 𝑒®𝑗−1 k ;
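The same procedure in a short numpy sketch (my own illustration; the function name gram_schmidt is hypothetical), here for the dot product on ℝⁿ — for the polynomial example below one would replace the dot product by the integral inner product:

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list with respect to the dot product."""
    es = []
    for v in vectors:
        w = v.astype(float)
        for e in es:
            w = w - np.dot(w, e) * e        # subtract the projection onto each earlier e_j
        es.append(w / np.linalg.norm(w))    # normalize
    return es

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([0., 1., 1.])]
es = gram_schmidt(vs)
G = np.array([[np.dot(a, b) for b in es] for a in es])
print(np.allclose(G, np.eye(3)))   # True: the output list is orthonormal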

Example 6.2.1
∫1
Let 𝑉 = 𝒫2 (ℝ), h𝑝, 𝑞i = −1 𝑝(𝑥)𝑞(𝑥)𝑑𝑥.
Start with 𝑣®1 = 1, 𝑣®2 = 𝑥, 𝑣®3 = 𝑥 2 .

1.

k 𝑣®1 k 2 = ∫_{−1}^{1} 12 𝑑𝑥 = 2, so k 𝑣®1 k = √2 and

𝑒®1 = 𝑣®1 / k 𝑣®1 k = 1/√2

2.
𝑣2 − h𝑣2 , 𝑒1 i · 𝑒1 = 𝑥 − ( ∫_{−1}^{1} 𝑥 · (1/√2) 𝑑𝑥 ) · (1/√2) = 𝑥,

since the integral is 0 (the integrand is odd). Remember that:

𝑒®2 = ( 𝑣®2 − h 𝑣®2 , 𝑒®1 i · 𝑒®1 ) / k 𝑣®2 − h 𝑣®2 , 𝑒®1 i · 𝑒®1 k = 𝑥 / k𝑥 k ,

and

k𝑥 k 2 = h𝑥, 𝑥i = ∫_{−1}^{1} 𝑥 2 𝑑𝑥 = 2/3 =⇒ k𝑥 k = √(2/3) =⇒ 𝑒®2 = 𝑥 / √(2/3) = √(3/2) · 𝑥.

3.
𝑣3 − h𝑣3 , 𝑒1 i · 𝑒1 − h𝑣3 , 𝑒2 i · 𝑒2
= 𝑥 2 − ( ∫_{−1}^{1} 𝑥 2 · (1/√2) 𝑑𝑥 ) · (1/√2) − ( ∫_{−1}^{1} 𝑥 2 · √(3/2) 𝑥 𝑑𝑥 ) · √(3/2) 𝑥
= 𝑥 2 − 1/3,

since the second integral is 0 (its integrand is odd). Hence,

k 𝑥 2 − 1/3 k = √( ∫_{−1}^{1} (𝑥 2 − 1/3)2 𝑑𝑥 ) = √(8/45) =⇒ 𝑒®3 = ( 𝑥 2 − 1/3 ) / √(8/45).

This week: (i) Inner product spaces


(ii) Some properties:

𝑢 = 0 ⇐⇒ h𝑣, 𝑢i = 0 for all 𝑣 ∈ 𝑉

In particular:

𝑢 = 𝑢0
⇐⇒ 𝑢 − 𝑢 0 = 0
⇐⇒ ∀𝑣 ∈ 𝑉 , h𝑣, 𝑢 − 𝑢 0i = 0

Goal: Study linear operators between inner product spaces

Definition 6.2.3
A linear functional on 𝑉 is a linear map 𝜙 : 𝑉 → 𝔽.
i.e., 𝜙 ∈ ℒ(𝑉 , 𝔽)

Theorem 6.2.1 Riesz Representation Theorem (RRT)


Let 𝑉 be a finite-dimensional vector space over 𝔽 and 𝜙 ∈ ℒ(𝑉 , 𝔽).
Then there exists a unique 𝑢 ∈ 𝑉 such that:

𝜙(𝑣) = h𝑣, 𝑢i for all 𝑣 ∈ 𝑉


Proof of part 1: Find 𝑢.

𝜙(𝑣) = 𝜙(h𝑣, 𝑒1 i 𝑒1 + . . . + h𝑣, 𝑒 𝑛 i 𝑒 𝑛 )


For some orthonormal basis {𝑒1 , . . . , 𝑒 𝑛 } of 𝑉.
This means we get:

= h𝑣, 𝑒1 i 𝜙(𝑒1 ) + . . . + h𝑣, 𝑒 𝑛 i 𝜙(𝑒 𝑛 )


D E D E
= 𝑣, 𝜙(𝑒1 )𝑒1 + . . . + 𝑣, 𝜙(𝑒 𝑛 )𝑒 𝑛
D E
= 𝑣, 𝜙(𝑒1 )𝑒1 + . . . + 𝜙(𝑒 𝑛 )𝑒 𝑛

Which is 𝑢!
Thus,

𝜙(𝑣) = h𝑣, 𝑢i for all 𝑣 ∈ 𝑉

Uniqueness:
𝜙(𝑣) = h𝑣, 𝑢i = h𝑣, 𝑢 0i for all 𝑣 ∈ 𝑉
Show 𝑢 = 𝑢 0 ⇐⇒ show h𝑣, 𝑢 − 𝑢 0i = 0 for all 𝑣 ∈ 𝑉.

h𝑣, 𝑢 − 𝑢 0i = h𝑣, 𝑢i − h𝑣, 𝑢 0i


= 𝜙(𝑣) − 𝜙(𝑣)
=0

Thus, 𝑢 = 𝑢 0.
Note:-
Because of uniqueness the 𝑢 in the proof cannot / doesn’t depend on the choice of basis.

Example 6.2.2
∫1
Let 𝒫2 (ℝ) with h𝑝, 𝑞i = −1 𝑝(𝑥)𝑞(𝑥)𝑑𝑥.
This has an orthonormal basis:
r r r
1 3 45 2 1
𝑒1 = , 𝑒2 = 𝑥, 𝑒3 = (𝑥 − )
2 2 8 3

Let 𝜙 ∈ ℒ(𝒫2 (ℝ), ℝ) be defined by:
∫ 1
𝜙(𝑝) = 𝑝(𝑥) cos(𝜋𝑥)𝑑𝑥 ∈ ℒ(𝒫2 (ℝ), ℝ)
−1

Note:-
We have:

h𝑝, cos(𝜋𝑥)i = 𝜙(𝑝)


but cos(𝜋𝑥) ∉ 𝒫2 (ℝ).

Thus, by using RRT,

𝜙(𝑝) = h𝑝, 𝑢i

Where 𝑢 = 𝜙(𝑒1 )𝑒1 + 𝜙(𝑒2 )𝑒2 + 𝜙(𝑒3 )𝑒3 .


Notice that 𝜙(𝑒1 ) = 0 and 𝜙(𝑒2 ) = ∫_{−1}^{1} √(3/2) 𝑥 cos(𝜋𝑥) 𝑑𝑥 = 0 (odd integrand), so only the third term survives.
Computing 𝜙(𝑒3 )𝑒3 gives us:

𝑢 = − ( 45 / (2𝜋 2 ) ) · ( 𝑥 2 − 1/3 )
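A numerical sanity check of this example (my own sketch; the grid-based integration below is only approximate):

import numpy as np

xs = np.linspace(-1.0, 1.0, 20001)

def inner(p, q):                      # <p, q> = integral of p*q over [-1, 1]
    return np.trapz(p(xs) * q(xs), xs)

u = lambda x: -45.0 / (2 * np.pi**2) * (x**2 - 1.0/3.0)
phi = lambda p: np.trapz(p(xs) * np.cos(np.pi * xs), xs)

for p in (lambda x: 1.0 + 0*x, lambda x: x, lambda x: x**2):
    print(phi(p), inner(p, u))        # matching values: phi(p) = <p, u> on P_2(R)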

Now, let (𝑉 , h , i𝑉 ), (𝑊 , h , i𝑊 ) be inner product spaces over 𝔽.


Let 𝑇 ∈ ℒ(𝑉 , 𝑊).
For each 𝑤 ∈ 𝑊, create: 𝜙𝑤 ∈ ℒ(𝑉 , 𝔽) by:

𝜙𝑤 (𝑣) = h𝑇(𝑣), 𝑤i𝑊


By RRT, for all 𝑤 ∈ 𝑊, there exists a unique 𝑢𝑤 ∈ 𝑉 such that:

𝜙𝑤 (𝑣) = h𝑣, 𝑢𝑤 i𝑉
Now, notice:

h𝑣, 𝑢𝑤 i𝑉 = h𝑇(𝑣), 𝑤i𝑊


By uniqueness of 𝑢𝑤 , we can define:

𝑇 ∗ : 𝑊 → 𝑉 , 𝑤 ↦→ 𝑢𝑤 ≔ 𝑇 ∗ (𝑤)
Definition 6.2.4

The adjoint of a linear map 𝑇 : 𝑉 → 𝑊 between inner product spaces is the linear map 𝑇 ∗ : 𝑊 → 𝑉
characterized by:

h𝑇(𝑣), 𝑤i𝑊 = h𝑣, 𝑇 ∗ (𝑤)i𝑉

Example 6.2.3
Let ℝ3 , ℝ2 with the standard inner product i.e., dot product.

𝑇 : ℝ3 → ℝ2 , 𝑇(𝑥1 , 𝑥 2 , 𝑥 3 ) = (𝑥1 + 𝑥2 , 2𝑥2 + 𝑥3 )


What is 𝑇 ∗ : ℝ2 → ℝ3 ?

h𝑇(𝑥1 , 𝑥2 , 𝑥3 ), (𝑦1 , 𝑦2 )i ℝ2 = h(𝑥1 + 𝑥2 , 2𝑥2 + 𝑥3 ), (𝑦1 , 𝑦2 )i ℝ2
= (𝑥1 + 𝑥2 )𝑦1 + (2𝑥2 + 𝑥3 )𝑦2
= 𝑥1 𝑦1 + 𝑥2 (𝑦1 + 2𝑦2 ) + 𝑥3 𝑦2
= h(𝑥1 , 𝑥2 , 𝑥3 ), (𝑦1 , 𝑦1 + 2𝑦2 , 𝑦2 )i ℝ3 ,
and this must equal h(𝑥1 , 𝑥2 , 𝑥3 ), 𝑇 ∗ (𝑦1 , 𝑦2 )i ℝ3 .

Thus, 𝑇 ∗ (𝑦1 , 𝑦2 ) = (𝑦1 , 𝑦1 + 2𝑦2 , 𝑦2 ).
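A quick numerical check of this example (my own sketch): with the standard dot products, h𝑇𝑥, 𝑦i = h𝑥, 𝑇∗𝑦i, and the matrix of 𝑇∗ here is the transpose of the matrix of 𝑇.

import numpy as np

T = np.array([[1., 1., 0.],     # T(x1, x2, x3) = (x1 + x2, 2 x2 + x3)
              [0., 2., 1.]])
T_star = T.T                    # matches T*(y1, y2) = (y1, y1 + 2 y2, y2)

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
y = rng.standard_normal(2)
print(np.isclose(np.dot(T @ x, y), np.dot(x, T_star @ y)))   # True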

Note:-
Is 𝑇 ∗ linear?
Adjoints are linear:
If 𝑇 : 𝑉 → 𝑊 is linear, then 𝑇 ∗ : 𝑊 → 𝑉 is linear.

Chapter 7

Operators on Inner Product Spaces

Chapter 8

Operators on Complex Vector Spaces

Chapter 9

Operators on Real Vector Spaces

Chapter 10

Determinants and Traces

10.1 Determinants and Permutations


Definition 10.1.1: Determinants

det : 𝔽𝑛,𝑛 → 𝔽

(a) If 𝑛 = 1, then det(𝑎) = 𝑎.

(b) If 𝑛 = 2, then det of the matrix with rows (𝑎, 𝑏) and (𝑐, 𝑑) is 𝑎𝑑 − 𝑏𝑐.

(c) If 𝑛 ≥ 3, then we need a recursive definition.


If 𝐴 ∈ 𝔽𝑛,𝑛 , then the 𝑖𝑗-th minor of 𝐴 is 𝐴 𝑖,𝑗 .
Where 𝐴 𝑖,𝑗 means you take 𝐴 and delete the 𝑖th row and 𝑗th column.

Example 10.1.1
Let
1 2 3
𝐴 = ­4 5 6®
© ª

«7 8 9¬
.
Then:
 
2 3
𝐴2,1 =
8 9

Thus, given 𝐴 ∈ 𝔽𝑛,𝑛 , define its determinant as:

det 𝐴 = Σ_{𝑖=1}^{𝑛} (−1)^{𝑖+1} 𝑎 𝑖,1 · det 𝐴 𝑖,1

Example 10.1.2
Let:

1 0 3 𝑎 𝑎 12 𝑎 13
ª © 11
𝐴 = ­2 1 2® = ­ 𝑎21 𝑎 22 𝑎 23 ®
© ª

«0 5 1¬ « 𝑎31 𝑎 32 𝑎 33 ¬
Thus,

det 𝐴 = 𝑎1,1 · det 𝐴1,1 − 𝑎2,1 · det 𝐴2,1 + 𝑎3,1 · det 𝐴3,1
     
1 2 0 3 0 3
= 1 · det − 2 · det + 0 · det
5 1 5 1 1 2
= 1 · (−9) − 2 · (−15) + 0 · (−3)
= 21
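A direct Python transcription of this recursive definition (my own sketch; fine for small matrices only, since it runs in factorial time):

def minor(A, i, j):
    """Delete row i and column j (0-indexed)."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    # expansion along the first column: sum_i (-1)^(i+1) a_{i,1} det A_{i,1}
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(n))

print(det([[1, 0, 3],
           [2, 1, 2],
           [0, 5, 1]]))   # 21, as computed above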

Theorem 10.1.1 Det 1


There exists a unique function 𝛿 : 𝔽𝑛,𝑛 → 𝔽 with the following properties:

1. 𝛿(𝐼𝑛 ) = 1
2. 𝛿 is row-linear.

3. If 𝐴 has two identical rows, then 𝛿(𝐴) = 0.

Point: we will show that det = 𝛿.

Row-linear: This means that:

𝛿 applied to the matrix with rows (1, 2, 3), (4𝜆 + 2𝜇, 5𝜆 + 5𝜇, 6𝜆 + 8𝜇), (7, 8, 9)
equals 𝜆 · 𝛿 of the matrix with rows (1, 2, 3), (4, 5, 6), (7, 8, 9)
plus 𝜇 · 𝛿 of the matrix with rows (1, 2, 3), (2, 5, 8), (7, 8, 9).
Assume the previous theorem is true for now.
What is the value of 𝛿 on elementary matrices?

Theorem 10.1.2 Det 2


𝐸 elementary matrix. Then:

𝛿(𝐸 · 𝐴) = 𝛿(𝐴) if 𝐸 is type (i),
𝛿(𝐸 · 𝐴) = −𝛿(𝐴) if 𝐸 is type (ii),
𝛿(𝐸 · 𝐴) = 𝑐 · 𝛿(𝐴) if 𝐸 is type (iii).

So 𝛿 is determined on elementary matrices.

Corollary 10.1.1 Related to thm 2

 +1


 if 𝐸 is type (i)
if 𝐸 is type (ii)

𝛿(𝐸) = −1
𝑐 if 𝐸 is type (iii)



Proof: Take 𝐴 = 𝐼𝑛 in theorem 2.

Proof of Det 2: For 𝐸 of type (𝑖𝑖𝑖) this is just row-linearity of 𝛿.
Let 𝐴 𝑖 be the 𝑖th row of 𝐴.

Here E = diag(1, . . . , 1, c, 1, . . . , 1), with c in the i-th diagonal position, so E · A is A with its i-th row scaled by c:

δ(E · A) = δ \begin{pmatrix} A_1 \\ \vdots \\ cA_i \\ \vdots \\ A_n \end{pmatrix} = c · δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ \vdots \\ A_n \end{pmatrix} = c · δ(A)
Since we did not require c ≠ 0, this holds for all c ∈ 𝔽; thus δ(E · A) = c · δ(A) for all c ∈ 𝔽.
In particular (taking c = 0), if a row of A contains only zeros, then δ(A) = 0.
For types (𝑖) and (𝑖𝑖), we do the special case when 𝐸 acts on consecutive rows.

(Type: i):

Here E is the identity matrix with one extra off-diagonal entry a_{i,j}, placed so that multiplying by E replaces row j of A by a_{i,j} A_i + A_j:

E · A = \begin{pmatrix} A_1 \\ \vdots \\ A_{j−1} \\ a_{i,j} A_i + A_j \\ A_{j+1} \\ \vdots \\ A_n \end{pmatrix}

Special case, 𝑗 = 𝑖 + 1

δ(E · A) = δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ aA_i + A_{i+1} \\ \vdots \\ A_n \end{pmatrix} = a · δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ A_i \\ \vdots \\ A_n \end{pmatrix} + δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ A_{i+1} \\ \vdots \\ A_n \end{pmatrix}

by row-linearity in row i + 1.

But, the first matrix’s determinant is 0 since it has two identical rows.
Thus, 𝛿(𝐸 · 𝐴) = 𝑎 · 0 + 𝛿(𝐴) = 𝛿(𝐴).
(Type: ii): Swap rows.
Again, this assumes theorem 1.
Let’s assume that we swap row 𝑖 with row 𝑖 + 1

δ(A) = δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ A_{i+1} \\ \vdots \\ A_n \end{pmatrix}

by part one (adding (−1) times row i + 1 to row i does not change δ):

= δ \begin{pmatrix} A_1 \\ \vdots \\ A_i − A_{i+1} \\ A_{i+1} \\ \vdots \\ A_n \end{pmatrix}

by part one again (adding row i to row i + 1):

= δ \begin{pmatrix} A_1 \\ \vdots \\ A_i − A_{i+1} \\ A_{i+1} + (A_i − A_{i+1}) \\ \vdots \\ A_n \end{pmatrix} = δ \begin{pmatrix} A_1 \\ \vdots \\ A_i − A_{i+1} \\ A_i \\ \vdots \\ A_n \end{pmatrix}

by row linearity (in row i):

= δ \begin{pmatrix} A_1 \\ \vdots \\ A_i \\ A_i \\ \vdots \\ A_n \end{pmatrix} − δ \begin{pmatrix} A_1 \\ \vdots \\ A_{i+1} \\ A_i \\ \vdots \\ A_n \end{pmatrix}

but the first matrix has two identical rows, so its δ is zero:

= −δ \begin{pmatrix} A_1 \\ \vdots \\ A_{i+1} \\ A_i \\ \vdots \\ A_n \end{pmatrix}
= −δ(E · A)

Thus, 𝛿(𝐴) = −𝛿(𝐸 · 𝐴), which implies

𝛿(𝐸 · 𝐴) = −𝛿(𝐴)

In general, for part 2, we want to swap 𝑖 with row 𝑗.


Assume 𝑖 < 𝑗.

(i) We bubble down row i to row j; this takes j − i exchanges of consecutive rows.

Now row i is in the right place, but row j sits in position j − 1.
(ii) Bubble up row j (currently in position j − 1) to position i.
This involves (j − 1) − i exchanges.

Which means that the total number of exchanges is:

(j − i) + ((j − 1) − i) = 2(j − i) − 1

Which is odd!
This means that δ(E · A) = (−1)^{2(j−i)−1} · δ(A) = −δ(A).

Note:-
We can also handle non-consecutive rows for type (i) in the same way; do this as an exercise.
As such, we have proven Theorem 2.

Corollary 10.1.2
δ(A · B) = δ(A) · δ(B) for any A, B ∈ 𝔽^{n,n}.
We know that 𝛿(𝐸) · 𝛿(𝐴) = 𝛿(𝐸 · 𝐴) if 𝐸 is an elementary matrix.
Let A′ = E_k · · · E_1 · A be a reduced row echelon form of A.
Then either:

(i) A′ = I_n, or
(ii) the last row of A′ is all zeros (there could be more zero rows than just the last one).

Then:

(i) If A′ = I_n,

A′ = I_n  =⇒  A = E_1^{−1} · · · E_k^{−1} · I_n
          =⇒  δ(A) = δ(E_1^{−1}) · · · δ(E_k^{−1}),

since each E_i^{−1} is again an elementary matrix and Theorem 2 applies repeatedly.
On the other hand:

δ(A · B) = δ(E_1^{−1} · · · E_k^{−1} · B)
         = δ(E_1^{−1}) · · · δ(E_k^{−1}) · δ(B)
         = δ(A) · δ(B)

(ii) If A′ has a row of zeros, then δ(A′) = 0.
Since δ(A′) = δ(E_k) · · · δ(E_1) · δ(A), where δ(E_i) ≠ 0 for all i, this implies that δ(A) = 0.
Exercise: δ(A · B) = 0 as well, so the formula also holds in this case.

Proof of det 1: Proof of uniqueness: Suppose there are two functions δ : 𝔽^{n,n} → 𝔽 and δ′ : 𝔽^{n,n} → 𝔽, each satisfying the three desired properties.
Let A ∈ 𝔽^{n,n} and let A′ = E_k · · · E_1 · A be a reduced row echelon form of A.
Either A′ = I_n or its last row is all zeros.
In the first case δ(A′) = δ′(A′) = 1, and in the second case δ(A′) = δ′(A′) = 0.
In either case, δ(E_k · · · E_1 · A) = δ′(E_k · · · E_1 · A).
Thus,

δ(E_k) · · · δ(E_1) · δ(A) = δ′(E_k) · · · δ′(E_1) · δ′(A)

But by Theorem 2, δ(E_i) = δ′(E_i), and these values are nonzero.
Cancelling them, δ(A) = δ′(A) for all A ∈ 𝔽^{n,n}.
Proof of existence: We’ll show det : 𝔽𝑛,𝑛 → 𝔽 satisfies the three properties to be 𝛿.
Let’s proceed by induction on 𝑛 ∈ ℕ
Base Case: Let 𝑛 be 1.
Then det : 𝔽1,1 → 𝔽 is defined by det(𝑎) = 𝑎.
Thus, det(𝐼1 ) = 1.
Now, for row linear:

det(𝜆𝑎 + 𝜇𝑏) = 𝜆 · det(𝑎) + 𝜇 · det(𝑏)


Part 3 is trivial since there is only one row.
Inductive Step: Assume that det : 𝔽𝑛−1,𝑛−1 → 𝔽 satisfies the three properties.
We show the 𝑛 × 𝑛 case!
We need to show the three properties.
(i) δ(I_n):

δ(I_n) = 1 · δ(I_{n−1})    (expanding along the first column)
       = 1 · 1             (by the inductive hypothesis)
       = 1

(ii) Let 𝐴, 𝐵, 𝐷 ∈ 𝔽𝑛,𝑛 be identical matrices except for row 𝑘.


Where 𝐷 𝑘 = 𝜆𝐴 𝑘 + 𝜇𝐵 𝑘 .
We want to show that 𝛿(𝐷) = 𝜆 · 𝛿(𝐴) + 𝜇 · 𝛿(𝐵).

Claim 10.1.1
𝑑 𝑖,1 · det(𝐷𝑖,1 ) = 𝜆 · 𝑎 𝑖,1 · det(𝐴 𝑖,1 ) + 𝜇 · 𝑏 𝑖,1 · det(𝐵 𝑖,1 ) is true for all 𝑖 ∈ {1, . . . , 𝑛}.
If claim is true then we can:

(a) Multiply equation by (−1)𝑖+1


(b) Add from 𝑖 = 1 to 𝑛 to get 𝛿(𝐷) = 𝜆 · 𝛿(𝐴) + 𝜇 · 𝛿(𝐵).

Proof of claim: We have two cases:

Case (i): i = k. Then the minors A_{k,1}, B_{k,1}, and D_{k,1} are equal, since the k-th row of A, B, D (the only row in which they differ) is deleted.
Which means:
the claim is true ⇐⇒ d_{k,1} = λ · a_{k,1} + μ · b_{k,1},
and this is true by our construction of D.
Case (ii): i ≠ k. The minors A_{i,1}, B_{i,1}, D_{i,1} agree in every row except the one coming from row k of the original matrices; call those rows (each with n − 1 entries) A′_k, B′_k, D′_k.
Then D′_k = λ · A′_k + μ · B′_k, and all other rows of A_{i,1}, B_{i,1}, D_{i,1} are equal.
Thus, by inductive hypothesis:

det 𝐷𝑖,1 = 𝜆 · det 𝐴 𝑖,1 + 𝜇 · det 𝐵 𝑖,1

But also if 𝑖 ≠ 𝑘, 𝑎 𝑖,1 = 𝑏 𝑖,1 = 𝑑 𝑖,1 .


Thus,

𝑑 𝑖,1 · det 𝐷𝑖,1 = 𝜆 · 𝑎 𝑖,1 · det 𝐴 𝑖,1 + 𝜇 · 𝑏 𝑖,1 · det 𝐵 𝑖,1

Thus, the claim is true in this case as well.

(iii) The final part of the proof (the third property) appears a bit further on in these notes.

Note:-
In Monday’s class we showed that:

(a) 𝛿 is unique, if it exists


(b) det : 𝔽^{n,n} → 𝔽, defined by A ↦ Σ_{i=1}^{n} (−1)^{i+1} a_{i,1} · det A_{i,1}, is row-linear and satisfies det I_n = 1.
We showed this by induction on n.

Proof of Det 1.3: Let’s proceed by induction on 𝑛.


Suppose rows 𝑘 and 𝑘 + 1 of 𝐴 are equal.
Then, if i ∉ {k, k + 1}, the (n − 1) × (n − 1) minor A_{i,1} still has two consecutive equal rows.
By the inductive hypothesis, det A_{i,1} = 0. Then:

det(𝐴) = (−1) 𝑘+1 · 𝑎 𝑘,1 · det 𝐴 𝑘,1 + (−1) 𝑘+2 · 𝑎 𝑘+1,1 · det 𝐴 𝑘+1,1
Since A_k = A_{k+1}, we have a_{k,1} = a_{k+1,1} and A_{k,1} = A_{k+1,1}.
This implies that:

det 𝐴 = (−1) 𝑘+1 [𝑎 𝑘,1 · det 𝐴 𝑘,1 + (−1) · 𝑎 𝑘,1 · det 𝐴 𝑘,1 ] = 0
Thus, det A = 0.
Therefore, by the principle of mathematical induction, det A = 0 for all A with two consecutive identical rows; two equal rows in arbitrary positions can then be moved next to each other by swaps of consecutive rows, which only changes det by a sign, so det A = 0 in that case as well.

Corollary 10.1.3
These follow for free from the theorem Det 1 (a numerical spot-check appears after the list):

(a) det(𝐴 · 𝐵) = det(𝐴) · det(𝐵)


(b) det(𝐴) = 0 if 𝐴 has a row of zeros.

(c) det(𝐴) = 0 if 𝐴 𝑗 = 𝜆 · 𝐴 𝑖 for some 𝑖 ≠ 𝑗 and 𝜆 ∈ 𝔽.
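For instance, (a) is easy to spot-check numerically with numpy (a sketch; the matrices are random and the agreement is only up to floating-point rounding):

```python
import numpy as np

# Sketch: check det(A B) = det(A) det(B) on random 4x4 matrices.
rng = np.random.default_rng(1)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
print(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))  # agree up to rounding
```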

Other formulas:

(a) General column expansion, along the j-th column:

det(A) = Σ_{i=1}^{n} (−1)^{i+j} · a_{i,j} · det(A_{i,j})

(b) General row expansion, along the i-th row:

det(A) = Σ_{j=1}^{n} (−1)^{i+j} · a_{i,j} · det(A_{i,j})
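These expansions are easy to check numerically; the sketch below (the function name row_expansion is ad hoc) compares expansion along a chosen row with numpy's built-in determinant on the matrix from Example 10.1.2.

```python
import numpy as np

# Sketch of expansion along row i (0-indexed): det A = sum_j (-1)^{i+j} a_{i,j} det(A_{i,j}).
def row_expansion(A, i):
    n = A.shape[0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)   # delete row i and column j
        total += (-1) ** (i + j) * A[i, j] * np.linalg.det(minor)
    return total

A = np.array([[1., 0., 3.], [2., 1., 2.], [0., 5., 1.]])
print(row_expansion(A, 1), np.linalg.det(A))   # both 21 (up to rounding)
```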

Definition 10.1.2: Permutations



A permutation of S is a bijection σ : S → S.
e.g. S = {1, 2, 3, 4, 5} with

S       1 2 3 4 5
σ(S)    3 5 4 1 2

Then:

S_n ≔ { permutations σ : {1, . . . , n} → {1, . . . , n} }

Notice that this is the symmetric group on 𝑛 elements.


We can see that its size is:

|𝑆𝑛 | = 𝑛!
We can compose permutations: given

τ : {1, . . . , n} → {1, . . . , n}
σ : {1, . . . , n} → {1, . . . , n},

the composition τ ∘ σ is also a bijection (the “group law”).

Cycle Notation: Take the explicit 𝜎 above.


Given: 1 ↦ 3 ↦ 4 ↦ 1 (draw a 3-cycle), and 2 ↦ 5 ↦ 2 (draw a 2-cycle).
We can write:

𝜎 = (1, 3, 4)(2, 5) this is cycle notation.


= (52)(341) cycle notation is not unique

Example:

𝑆 1 2 3 4
𝜎(𝑆) 4 1 3 2
Thus:

𝜎 = (142)(3)
= (142)

Where (3) is a fixed point.


Thus, we can compute compositions in cycle notation:

σ = (134)(25)
τ = (1452)

τ ∘ σ = (1452) ∘ (134)(25)        (apply σ first, then τ)
      = (135)(2)(4)
      = (135)

e.g. (τ ∘ σ)(3) = τ(σ(3)) = τ(4) = 5.
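The bookkeeping above is easy to automate; here is a small sketch (the names compose and cycles are ad hoc) that represents permutations as Python dictionaries, composes them, and prints the cycle decomposition.

```python
# Sketch: permutations of {1, ..., n} as dicts, composition, and cycle decomposition.
def compose(tau, sigma):
    """Return tau composed with sigma, i.e. first apply sigma, then tau."""
    return {x: tau[sigma[x]] for x in sigma}

def cycles(perm):
    """Cycle decomposition, omitting fixed points."""
    seen, result = set(), []
    for x in perm:
        if x in seen or perm[x] == x:
            continue
        cycle, y = [], x
        while y not in seen:
            seen.add(y)
            cycle.append(y)
            y = perm[y]
        result.append(tuple(cycle))
    return result

sigma = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}   # (1 3 4)(2 5)
tau   = {1: 4, 2: 1, 3: 3, 4: 5, 5: 2}   # (1 4 5 2)
print(cycles(compose(tau, sigma)))        # [(1, 3, 5)]
```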

Question 3

Problem 5. The trace of a square matrix 𝐴 is the sum of its diagonal entries:

tr(𝐴) := 𝑎11 + 𝑎22 + · · · + 𝑎 𝑛𝑛

Show that
(a) tr(𝐴 + 𝐵) = tr(𝐴) + tr(𝐵);
(b) tr(𝐴𝐵) = tr(𝐵𝐴);
(c) if 𝐵 is invertible, then tr(𝐴) = tr 𝐵𝐴𝐵−1 .


Proof of 𝑎 : Let 𝐴, 𝐵 be two matrices size 𝑛 × 𝑛 with entries in 𝔽.


Let 𝐶 = 𝐴 + 𝐵, which means that it looks like:

C = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} + b_{n1} & a_{n2} + b_{n2} & \cdots & a_{nn} + b_{nn} \end{pmatrix}
Let’s take the trace of 𝐶:

tr(C) = Σ_{i=1}^{n} c_{ii}
      = (a_{11} + b_{11}) + . . . + (a_{nn} + b_{nn})
      = a_{11} + . . . + a_{nn} + b_{11} + . . . + b_{nn}

Now let’s take the trace of 𝐴 and 𝐵 separately:

tr(A) + tr(B) = Σ_{i=1}^{n} a_{ii} + Σ_{i=1}^{n} b_{ii}
             = a_{11} + . . . + a_{nn} + b_{11} + . . . + b_{nn}

As both sides are equal, we have shown that 𝑡𝑟(𝐴 + 𝐵) = 𝑡𝑟(𝐴) + 𝑡𝑟(𝐵).
Proof of b: Let A, B be two matrices of size n × n with entries in 𝔽.
(We take them to be square and of the same size so that both AB and BA are defined and have a trace.)

Note:-
Don’t worry, I have a program that generates these matrices for me.

Let C = AB, whose entries are

c_{i,j} = Σ_{k=1}^{n} a_{ik} b_{kj}

Let D = BA, whose entries are

d_{i,j} = Σ_{k=1}^{n} b_{ik} a_{kj}

Let’s take the trace of 𝐶 and show it is equal to the trace of 𝐷:

tr(C) = Σ_{i=1}^{n} c_{ii}
      = Σ_{i=1}^{n} Σ_{k=1}^{n} a_{ik} b_{ki}
      = Σ_{i=1}^{n} Σ_{k=1}^{n} b_{ki} a_{ik}        by commutativity of · in 𝔽
      = Σ_{k=1}^{n} Σ_{i=1}^{n} b_{ki} a_{ik}        swapping the order of summation
      = Σ_{k=1}^{n} d_{kk}
      = tr(D)

Thus, we have shown that 𝑡𝑟(𝐴𝐵) = 𝑡𝑟(𝐵𝐴).

Proof of 𝑐: This follows directly from part 𝑏.
Let 𝐴, 𝐵 be two matrices size 𝑛 × 𝑛 with entries in 𝔽.
Remember that we prove in part 𝑏, that 𝑡𝑟(𝐴𝐵) = 𝑡𝑟(𝐵𝐴). Thus:

tr(B A B^{−1}) = tr(A B^{−1} B) = tr(A I), as B^{−1} B = I


Remember that multiplying a matrix by the identity matrix does not change the matrix.
Thus,

𝑡𝑟(𝐵𝐴𝐵−1 ) = 𝑡𝑟(𝐴𝐼) = 𝑡𝑟(𝐴)


This means that 𝑡𝑟(𝐴) = 𝑡𝑟(𝐵𝐴𝐵−1 ), as desired.
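All three parts are easy to spot-check numerically (a sketch; the matrices are random, and the agreement is only up to floating-point rounding):

```python
import numpy as np

# Sketch: check tr(A + B) = tr(A) + tr(B), tr(AB) = tr(BA), and tr(B A B^{-1}) = tr(A).
rng = np.random.default_rng(2)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
print(np.trace(A + B), np.trace(A) + np.trace(B))
print(np.trace(A @ B), np.trace(B @ A))
print(np.trace(B @ A @ np.linalg.inv(B)), np.trace(A))
```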
Consider the group:

Γ = ⟨ \begin{pmatrix} 0 & −1 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} −1 & −\sqrt{2} \\ \sqrt{2} + 1 & \sqrt{2} + 1 \end{pmatrix} ⟩

