LectureNotes LinearAlgebra
Strang, G., 2016. Introduction to linear algebra, 5th Edition. Wellesley-Cambridge Press.
Searle, S. R., 1982. Matrix algebra useful for statistics. Wiley, New York.
Schott, J. R., 2005. Matrix analysis for statistics, 2nd Edition. Wiley, Hoboken, N.J.
Golub, G. H., Van Loan, C. F., 1996. Matrix computations, 3rd Edition. Johns Hopkins University Press, Baltimore.
• Developing mathematical foundations for many courses and areas, in particular “Data Science”
• Building intuition.
Multivariate Statistics
Statistics
Discrete Mathematics
• Some prerequisites are not so strict; others are possible, e.g., GPU, Algorithms, etc.
• The emphasis differs between researchers and practitioners; see the pattern-recognition course and the big-picture talk.
Contents

1 Introduction 1
1.0 Back to School: visual space! 2
1.1 Angle, Lengths, and Dot Products (visual space and school again) 6
1.2 Extension and Abstraction: Vectors and Linear Combinations 10
Bibliography
Introduction
• the 3-tuple $v = (v_1, v_2, v_3)$;
• the point $v = (v_1, v_2, v_3)$;
• the arrow connecting $O$ to $v$, i.e., the vector $\vec{v} = \overrightarrow{Ov} = (v_1, v_2, v_3)$.
• Addition: $u + v$
$$a + b = (a_1 + b_1,\; a_2 + b_2,\; a_3 + b_3), \qquad ca = (ca_1,\; ca_2,\; ca_3).$$
$$\begin{aligned}
a + b &= b + a\\
a + (b + c) &= (a + b) + c\\
a + 0 &= a\\
a + (-a) &= 0\\
c(a + b) &= ca + cb\\
(c + d)a &= ca + da
\end{aligned}$$
$$a = (a_1, a_2, a_3) = a_1 i + a_2 j + a_3 k, \qquad i = (1, 0, 0),\; j = (0, 1, 0),\; k = (0, 0, 1).$$
$$c^2 = a^2 + b^2 - 2ab\cos\theta$$
$$\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\|\cos\theta$$
$$2\|u\|\|v\|\cos\theta = (u_1^2 + u_2^2) + (v_1^2 + v_2^2) - (u_1 - v_1)^2 - (u_2 - v_2)^2 = 2u_1 v_1 + 2u_2 v_2$$
$$\cos\theta = \frac{u_1 v_1 + u_2 v_2}{\|u\|\,\|v\|}.$$
• If $u = v$, then $\theta = 0$ and $u'u = u_1 u_1 + u_2 u_2 = \|u\|^2$.
• Basic properties:
$$\begin{aligned}
u'v &= v'u\\
\|au\| &= |a|\,\|u\|\\
a(u'v) &= (au)'v = au'v\\
(au + bv)'w &= au'w + bv'w\\
(u + v)'(u + v) &= u'u + 2u'v + v'v.
\end{aligned}$$
• Then, we can generalize this definition to higher dimensions and define the angle between two vectors for $p > 3$.
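The generalized definition can be sanity-checked numerically; a minimal sketch with plain Python lists (the helper names `dot`, `norm`, and `angle` are ours, not from the notes):

```python
import math

def dot(u, v):
    """Dot product u'v = sum of u_j * v_j."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Length ||u|| = sqrt(u'u)."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle theta from cos(theta) = u'v / (||u|| ||v||)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

# In R^4 (p > 3), where geometric intuition ends, the formula still works:
u, v = (1, 0, 0, 0), (1, 1, 0, 0)
print(math.degrees(angle(u, v)))  # 45 degrees
```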
Example 7 (3D) .
Hint: To save space, we write, e.g., v = (4, 2)′ . Sometimes, we drop the prime if there is no confusion.
8 Copyright © 2016, 2019 Waleed A. Yousef. All Rights Reserved.
Example 8 (Linear Combination) .
• What is the picture for ALL linear combinations? “spanning” the space, independence ...
1.2 Extension and Abstraction: Vectors and Linear Combinations
Extension in both: meaning and number of components to treat applications.
Now, we have to re-prove the Cauchy-Schwarz inequality; the triangle inequality then follows directly!
Proof. $\forall \lambda \in \mathbb{R}$, consider $\|u - \lambda v\|^2 \ge 0$ (we will later set $\lambda = \frac{u'v}{\|v\|^2}$).
$$c\begin{pmatrix} v_1\\ \vdots\\ v_p \end{pmatrix} + d\begin{pmatrix} w_1\\ \vdots\\ w_p \end{pmatrix} = \begin{pmatrix} cv_1 + dw_1\\ \vdots\\ cv_p + dw_p \end{pmatrix}$$
$$\begin{aligned} x - 2y &= 1\\ 3x + 2y &= 11 \end{aligned} \quad\equiv\quad \begin{pmatrix} 1 & -2\\ 3 & 2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 1\\ 11 \end{pmatrix} \quad\equiv\quad Ax = b$$
Column picture (linear combination):
$$x\begin{pmatrix} 1\\ 3 \end{pmatrix} + y\begin{pmatrix} -2\\ 2 \end{pmatrix} = \begin{pmatrix} 1\\ 11 \end{pmatrix}, \qquad 3\begin{pmatrix} 1\\ 3 \end{pmatrix} + 1\begin{pmatrix} -2\\ 2 \end{pmatrix} = \begin{pmatrix} 1\\ 11 \end{pmatrix}$$
Row picture (vector equation of line intersection):
$$\begin{pmatrix} 1 & -2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 1, \qquad \begin{pmatrix} 3 & 2 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 11$$
Projection $= (x, y)\,(\alpha/\|\alpha\|) = b/\|\alpha\| = 11/\sqrt{13}$.
$$\begin{aligned} x + 2y + 3z &= 6\\ 2x + 5y + 2z &= 4\\ 6x - 3y + z &= 2 \end{aligned} \quad\equiv\quad \begin{pmatrix} 1 & 2 & 3\\ 2 & 5 & 2\\ 6 & -3 & 1 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 6\\ 4\\ 2 \end{pmatrix} \quad\equiv\quad Ax = b \quad\equiv\quad C_1 x + C_2 y + C_3 z = \begin{pmatrix} R_1 x\\ R_2 x\\ R_3 x \end{pmatrix} = b,$$
• Back-substitution.
Example 12 (2 equations)
$$\begin{aligned} x - 2y &= 1\\ 3x + 2y &= 11 \end{aligned} \quad\text{(Before)} \qquad\qquad \begin{aligned} x - 2y &= 1\\ 8y &= 8 \end{aligned} \quad\text{(After)}$$
Example 13 (3 equations)
$$\underbrace{\begin{aligned} 2x + 4y - 2z &= 2\\ 4x + 9y - 3z &= 8\\ -2x - 3y + 7z &= 10 \end{aligned}}_{\text{Step 0}} \qquad \underbrace{\begin{aligned} 2x + 4y - 2z &= 2\\ 1y + 1z &= 4\\ 1y + 5z &= 12 \end{aligned}}_{\text{Step 1}} \qquad \underbrace{\begin{aligned} 2x + 4y - 2z &= 2\\ 1y + 1z &= 4\\ 4z &= 8 \end{aligned}}_{\text{Step 2}}$$
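The steps of Example 13 can be reproduced by a small forward-elimination sketch (the helper name `eliminate` is ours; it assumes no zero pivot appears):

```python
def eliminate(A, b):
    """Forward (Gauss) elimination: reduce [A|b] to upper-triangular form
    by the row operation R_i <- R_i - (A_ij / A_jj) R_j."""
    n = len(A)
    A = [row[:] for row in A]
    b = b[:]
    for j in range(n - 1):            # pivot column
        for i in range(j + 1, n):     # rows below the pivot
            l = A[i][j] / A[j][j]
            A[i] = [a - l * p for a, p in zip(A[i], A[j])]
            b[i] -= l * b[j]
    return A, b

# The system of Example 13:
U, c = eliminate([[2, 4, -2], [4, 9, -3], [-2, -3, 7]], [2, 8, 10])
# Matching Step 2: pivots 2, 1, 4 and right-hand side (2, 4, 8).
print(U)
print(c)
```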
$$\begin{aligned} x - 2y &= 1\\ 0y &= 8 \end{aligned} \quad\text{(After: no solution)}$$
Interpretation from two perspectives.
$$\begin{aligned} x - 2y &= 1\\ 0y &= 0 \end{aligned} \quad\text{(After: infinitely many solutions)}$$
Interpretation from two perspectives.
• Languages handle matrices differently; e.g., Matlab and Fortran store column-wise, C stores row-wise, etc.
Example 16
$$A = \begin{pmatrix} 18 & 17 & 11\\ 19 & -4 & 0 \end{pmatrix}, \qquad A' = \begin{pmatrix} 18 & 19\\ 17 & -4\\ 11 & 0 \end{pmatrix}$$
Notice:
• $(A')' = A$.
• For vectors:
$$x = \begin{pmatrix} 19\\ -4\\ 0 \end{pmatrix}, \qquad x' = \begin{pmatrix} 19 & -4 & 0 \end{pmatrix};$$
we usually write $x = \begin{pmatrix} 19 & -4 & 0 \end{pmatrix}'$, or $x' = \begin{pmatrix} 19 & -4 & 0 \end{pmatrix}$, to save vertical space.
Definition 17 (Symmetric Matrices (around diagonal)) A square matrix $A_{m\times m}$ is called symmetric if $A_{ij} = A_{ji}$; i.e., $A = A'$.
Example 18 (write a SW to check the symmetry of):
$$A = \begin{pmatrix} 18 & 17 & 11\\ 17 & -4 & 0\\ 11 & 0 & 2 \end{pmatrix}$$
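Example 18 asks for a small piece of software; a minimal Python sketch (the function name `is_symmetric` is ours):

```python
def is_symmetric(A, tol=0.0):
    """Check A_ij == A_ji for a square matrix given as a list of rows.
    Only entries above the diagonal need to be compared with their mirrors."""
    m = len(A)
    if any(len(row) != m for row in A):
        return False  # not square, hence not symmetric
    return all(abs(A[i][j] - A[j][i]) <= tol
               for i in range(m) for j in range(i + 1, m))

A = [[18, 17, 11], [17, -4, 0], [11, 0, 2]]  # the matrix of Example 18
print(is_symmetric(A))  # True
```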
It is quite easy to show that the transpose of a partitioned matrix (2.1) is given by
$$A' = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1c}\\ A_{21} & A_{22} & \cdots & A_{2c}\\ \vdots & \vdots & & \vdots\\ A_{r1} & A_{r2} & \cdots & A_{rc} \end{pmatrix}' = \begin{pmatrix} A_{11}' & A_{21}' & \cdots & A_{r1}'\\ A_{12}' & A_{22}' & \cdots & A_{r2}'\\ \vdots & \vdots & & \vdots\\ A_{1c}' & A_{2c}' & \cdots & A_{rc}' \end{pmatrix}$$
Example 21
$$A = \begin{pmatrix} 2 & 8 & 9\\ 3 & 7 & 4 \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \end{pmatrix}, \qquad A' = \begin{pmatrix} A_{11}'\\ A_{12}' \end{pmatrix} = \begin{pmatrix} 2 & 3\\ 8 & 7\\ 9 & 4 \end{pmatrix}$$
Corollary 23
$$\operatorname{tr}(A) = \operatorname{tr}(A'), \qquad \operatorname{tr}(x) = x \;\;\forall x \in \mathbb{R}.$$
Proof.
$$\operatorname{tr}(A) = \sum_i A_{ii} = \sum_i (A')_{ii} = \operatorname{tr}(A').$$
Example 24
$$A = \begin{pmatrix} 1 & 7 & 6\\ 8 & 3 & 9\\ 4 & -2 & -8 \end{pmatrix} \implies \operatorname{tr}(A) = -4.$$
• We say that $A = B$ if $A_{ij} = B_{ij} \;\forall i, j$.
• Of course, $A + 0 = A$, and
$$(A + B)' = A' + B', \qquad \operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B).$$
Proof. Show that the general element $ij$ of the LHS equals that of the RHS:
$$\big((A + B)'\big)_{ij} = (A + B)_{ji} = A_{ji} + B_{ji} = (A')_{ij} + (B')_{ij} = (A' + B')_{ij}.$$
$$\operatorname{tr}(A + B) = \sum_i (A + B)_{ii} = \sum_i (A_{ii} + B_{ii}) = \sum_i A_{ii} + \sum_i B_{ii} = \operatorname{tr}(A) + \operatorname{tr}(B).$$
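Both identities are easy to check numerically; a short sketch with plain lists (the helper names `transpose`, `add`, and `tr` are ours), reusing the matrix of Example 24 and an arbitrary second matrix of our choosing:

```python
def transpose(A):
    """Rows of A' are the columns of A."""
    return [list(col) for col in zip(*A)]

def add(A, B):
    """Elementwise (A + B)_ij = A_ij + B_ij."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def tr(A):
    """Trace: sum of diagonal elements."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 7, 6], [8, 3, 9], [4, -2, -8]]   # Example 24
B = [[2, 0, 1], [0, 5, 0], [3, 0, 4]]     # arbitrary test matrix

assert transpose(add(A, B)) == add(transpose(A), transpose(B))
assert tr(add(A, B)) == tr(A) + tr(B)
print(tr(A))  # -4, as in Example 24
```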
$$C_{ik} = a_i' b_k = \sum_{j=1}^{n} a_{ij} b_{jk} = a_{i1} b_{1k} + a_{i2} b_{2k} + \ldots + a_{in} b_{nk}.$$
However, we can partition either (or both) $A_{m\times n}$ and $B_{n\times p}$ as rows and/or columns to see the multiplication differently. This has great value in mathematical treatments and semantics. We have only 4 ways to do that:
1. $A_{m\times 1}$, $B_{1\times p}$.
2. $A_{1\times n}$, $B_{n\times p}$.
3. $A_{1\times n}$, $B_{n\times 1}$.
4. $A_{m\times n}$, $B_{n\times 1}$.
$$C = \begin{pmatrix} a_1'\\ \vdots\\ a_m' \end{pmatrix}\begin{pmatrix} b_1 & \cdots & b_p \end{pmatrix} \qquad (A_{m\times 1}\, B_{1\times p} \text{ partitioning})$$
$$= \begin{pmatrix} a_1' b_1 & \cdots & a_1' b_p\\ \vdots & \ddots & \vdots\\ a_m' b_1 & \cdots & a_m' b_p \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^{n} a_{1j} b_{j1} & \cdots & \sum_{j=1}^{n} a_{1j} b_{jp}\\ \vdots & \ddots & \vdots\\ \sum_{j=1}^{n} a_{mj} b_{j1} & \cdots & \sum_{j=1}^{n} a_{mj} b_{jp} \end{pmatrix}$$
$$C_{ik} = a_i' b_k = \sum_{j=1}^{n} a_{ij} b_{jk} = a_{i1} b_{1k} + \cdots + a_{in} b_{nk}$$
Example 27
$$\begin{pmatrix} 1 & 4\\ 1 & 5 \end{pmatrix}\begin{pmatrix} 3 & 2 & 0\\ 1 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 3+4 & 2+16 & 0-4\\ 3+5 & 2+20 & 0-5 \end{pmatrix} = \begin{pmatrix} 7 & 18 & -4\\ 8 & 22 & -5 \end{pmatrix}$$
$$C = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}\begin{pmatrix} b_{11} & \cdots & b_{1p}\\ \vdots & \ddots & \vdots\\ b_{n1} & \cdots & b_{np} \end{pmatrix} \qquad (A_{1\times n}\, B_{n\times p} \text{ partitioning})$$
$$= \begin{pmatrix} b_{11} a_1 + \cdots + b_{n1} a_n & \cdots & b_{1p} a_1 + \cdots + b_{np} a_n \end{pmatrix} = \begin{pmatrix} \sum_j b_{j1} a_j & \cdots & \sum_j b_{jp} a_j \end{pmatrix} = \begin{pmatrix} c_1 & \cdots & c_p \end{pmatrix}$$
$$C_{ik} = (c_k)_i = \Big(\sum_j b_{jk} a_j\Big)_i = \sum_j b_{jk} (a_j)_i = \sum_j b_{jk} a_{ij}.$$
Example 28
$$C = \begin{pmatrix} 1 & 4\\ 1 & 5 \end{pmatrix}\begin{pmatrix} 3 & 2 & 0\\ 1 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 3\begin{pmatrix}1\\1\end{pmatrix} + 1\begin{pmatrix}4\\5\end{pmatrix} & 2\begin{pmatrix}1\\1\end{pmatrix} + 4\begin{pmatrix}4\\5\end{pmatrix} & 0\begin{pmatrix}1\\1\end{pmatrix} + (-1)\begin{pmatrix}4\\5\end{pmatrix} \end{pmatrix}$$
$$= \begin{pmatrix} 3+4 & 2+16 & 0-4\\ 3+5 & 2+20 & 0-5 \end{pmatrix} = \begin{pmatrix} 7 & 18 & -4\\ 8 & 22 & -5 \end{pmatrix}$$
$$C = \begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{pmatrix}\begin{pmatrix} b_1'\\ \vdots\\ b_n' \end{pmatrix} \qquad (A_{m\times n}\, B_{n\times 1} \text{ partitioning})$$
$$= \begin{pmatrix} a_{11} b_1' + \cdots + a_{1n} b_n'\\ \vdots\\ a_{m1} b_1' + \cdots + a_{mn} b_n' \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^{n} a_{1j} b_j'\\ \vdots\\ \sum_{j=1}^{n} a_{mj} b_j' \end{pmatrix} = \begin{pmatrix} c_1'\\ \vdots\\ c_m' \end{pmatrix}$$
$$C_{ik} = (c_i')_k = \Big(\sum_j a_{ij} b_j'\Big)_k = \sum_j a_{ij} (b_j')_k = \sum_j a_{ij} b_{jk}.$$
Example 29
$$C = \begin{pmatrix} 1 & 4\\ 1 & 5 \end{pmatrix}\begin{pmatrix} 3 & 2 & 0\\ 1 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 1\begin{pmatrix} 3 & 2 & 0 \end{pmatrix} + 4\begin{pmatrix} 1 & 4 & -1 \end{pmatrix}\\ 1\begin{pmatrix} 3 & 2 & 0 \end{pmatrix} + 5\begin{pmatrix} 1 & 4 & -1 \end{pmatrix} \end{pmatrix} = \begin{pmatrix} 3+4 & 2+16 & 0-4\\ 3+5 & 2+20 & 0-5 \end{pmatrix} = \begin{pmatrix} 7 & 18 & -4\\ 8 & 22 & -5 \end{pmatrix}$$
$$C = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}\begin{pmatrix} b_1'\\ \vdots\\ b_n' \end{pmatrix} \qquad (A_{1\times n}\, B_{n\times 1} \text{ partitioning})$$
$$= a_1 b_1' + \cdots + a_n b_n' = \sum_{j=1}^{n} a_j b_j', \qquad\text{where}\qquad a_j b_j' = \begin{pmatrix} a_{1j}\\ \vdots\\ a_{mj} \end{pmatrix} b_j' = \begin{pmatrix} a_{1j} b_j'\\ \vdots\\ a_{mj} b_j' \end{pmatrix},$$
$$C_{ik} = \Big(\sum_j a_j b_j'\Big)_{ik} = \sum_j (a_j b_j')_{ik} = \sum_j a_{ij} b_{jk}.$$
Example 30
$$C = \begin{pmatrix} 1 & 4\\ 1 & 5 \end{pmatrix}\begin{pmatrix} 3 & 2 & 0\\ 1 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 1\\ 1 \end{pmatrix}\begin{pmatrix} 3 & 2 & 0 \end{pmatrix} + \begin{pmatrix} 4\\ 5 \end{pmatrix}\begin{pmatrix} 1 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 3 & 2 & 0\\ 3 & 2 & 0 \end{pmatrix} + \begin{pmatrix} 4 & 16 & -4\\ 5 & 20 & -5 \end{pmatrix} = \begin{pmatrix} 7 & 18 & -4\\ 8 & 22 & -5 \end{pmatrix}$$
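The four partitioning views above all compute the same product; a sketch in plain Python (helper names are ours), using the matrices of Examples 27-30:

```python
A = [[1, 4], [1, 5]]          # the matrices of Examples 27-30
B = [[3, 2, 0], [1, 4, -1]]
m, n, p = len(A), len(B), len(B[0])

def vec_add(u, v):   return [x + y for x, y in zip(u, v)]
def vec_scale(c, u): return [c * x for x in u]

# View 1 (rows of A times columns of B): C_ik = a_i' b_k.
C1 = [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
      for i in range(m)]

# View 2: column k of C is a combination of the columns of A: sum_j b_jk a_j.
cols = []
for k in range(p):
    col = [0] * m
    for j in range(n):
        col = vec_add(col, vec_scale(B[j][k], [A[i][j] for i in range(m)]))
    cols.append(col)
C2 = [[cols[k][i] for k in range(p)] for i in range(m)]

# View 3: row i of C is a combination of the rows of B: sum_j a_ij b_j'.
C3 = []
for i in range(m):
    row = [0] * p
    for j in range(n):
        row = vec_add(row, vec_scale(A[i][j], B[j]))
    C3.append(row)

# View 4: C is a sum of outer products a_j b_j'.
C4 = [[0] * p for _ in range(m)]
for j in range(n):
    for i in range(m):
        for k in range(p):
            C4[i][k] += A[i][j] * B[j][k]

assert C1 == C2 == C3 == C4 == [[7, 18, -4], [8, 22, -5]]
```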
$$\begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1c}\\ A_{21} & A_{22} & \cdots & A_{2c}\\ \vdots & \vdots & & \vdots\\ A_{r1} & A_{r2} & \cdots & A_{rc} \end{pmatrix}_{r\times c} \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1k}\\ B_{21} & B_{22} & \cdots & B_{2k}\\ \vdots & \vdots & & \vdots\\ B_{c1} & B_{c2} & \cdots & B_{ck} \end{pmatrix}_{c\times k},$$
where the column widths $n_1, n_2, \ldots, n_c$ of the blocks of $A$ match the row heights $n_1, n_2, \ldots, n_c$ of the blocks of $B$, and $n_1 + \cdots + n_c = n$.
Since there is no confusion, we subscript $d_i$ instead of $d_{ii}$. We also, for short, write $D = \operatorname{diag}(d_1, \ldots, d_n)$.
Row scaling:
$$D_{m\times m} A_{m\times n} = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_m \end{pmatrix}\begin{pmatrix} a_1'\\ \vdots\\ a_m' \end{pmatrix} = \begin{pmatrix} d_1 a_1'\\ \vdots\\ d_m a_m' \end{pmatrix}$$
Column scaling:
$$A_{m\times n} D_{n\times n} = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}\begin{pmatrix} d_1 & 0 & 0\\ 0 & \ddots & 0\\ 0 & 0 & d_n \end{pmatrix} = \begin{pmatrix} a_1 d_1 & \cdots & a_n d_n \end{pmatrix}$$
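Row and column scaling never need a full matrix product; a sketch of both shortcuts in plain Python (the helper names are ours):

```python
def diag_times(d, A):
    """D A: row i of A is scaled by d_i (row scaling)."""
    return [[d[i] * a for a in row] for i, row in enumerate(A)]

def times_diag(A, d):
    """A D: column j of A is scaled by d_j (column scaling)."""
    return [[a * d[j] for j, a in enumerate(row)] for row in A]

A = [[1, 2], [3, 4]]
print(diag_times([10, 100], A))  # [[10, 20], [300, 400]]  (rows scaled)
print(times_diag(A, [10, 100]))  # [[10, 200], [30, 400]]  (columns scaled)
```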
Definition 32 The identity matrix $I$ is a special case of a diagonal matrix, with $d_1 = \cdots = d_n = 1$.
It is obvious that $IA = AI = A$.
Transpose of a Product
$$(AB)' = B'A',$$
since
$$\big((AB)'\big)_{ik} = (AB)_{ki} = \sum_{j=1}^{n} A_{kj} B_{ji} = \sum_{j=1}^{n} (B')_{ij} (A')_{jk} = (B'A')_{ik}.$$
The trace is defined only for a square matrix; hence, for a product to have a trace it must be $A_{m\times n} B_{n\times m}$. Then
$$\operatorname{tr}(AB) = \operatorname{tr}(BA);$$
i.e., it is the sum of products of each element of $A$ multiplied by the corresponding element of $B'$. And if $B = A'$,
$$\operatorname{tr}(AA') = \operatorname{tr}(A'A) = \sum_{j=1}^{n}\sum_{i=1}^{m} A_{ij} A_{ij} = \sum_{j=1}^{n}\sum_{i=1}^{m} A_{ij}^2,$$
• The traffic is represented as a matrix $T$, where a path from $S_i$ to $S_j$ exists if $T_{ij} = 1$:
$$T = \begin{pmatrix} 0&1&1&0&0\\ 0&0&1&1&0\\ 0&0&0&0&1\\ 0&0&0&0&1\\ 0&0&1&0&0 \end{pmatrix}, \qquad T^2 = \begin{pmatrix} 0&0&1&1&1\\ 0&0&0&0&2\\ 0&0&1&0&0\\ 0&0&1&0&0\\ 0&0&0&0&1 \end{pmatrix}, \qquad T^3 = \begin{pmatrix} 0&0&1&0&2\\ 0&0&2&0&0\\ 0&0&0&0&1\\ 0&0&0&0&1\\ 0&0&1&0&0 \end{pmatrix}$$
• The number of ways of getting from $S_i$ to $S_k$ in exactly $r$ steps is
$$\sum_j (T^{r-1})_{ij}\, T_{jk} = (T^r)_{ik}.$$
• There is no path from $S_i$ to $S_k$ only if $\sum_{r=1}^{\infty} (T^r)_{ik} = 0$.
• What is $\sum_{r=1}^{\infty} T^r$?
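The path-counting claim can be checked by raising $T$ to powers; a sketch in plain Python (the helper name `matmul` is ours):

```python
def matmul(A, B):
    """Plain matrix product C_ik = sum_j A_ij B_jk."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# T_ij = 1 iff there is a direct path from S_i to S_j.
T = [[0, 1, 1, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0]]

T2 = matmul(T, T)    # (T^2)_ik = number of 2-step routes S_i -> S_k
T3 = matmul(T2, T)   # (T^3)_ik = number of 3-step routes
print(T2[0])  # [0, 0, 1, 1, 1]
print(T3[0])  # [0, 0, 1, 0, 2]: two distinct 3-step routes from S_1 to S_5
```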
2.3.6 The Laws of Algebra
Theorem 37 $\forall A_{m\times n}, B_{m\times n}, C_{m\times n}$, $c$ scalar, we have
$$\begin{aligned} A + B &= B + A & &\text{(commutative)}\\ c(A + B) &= cA + cB & &\text{(distributive)}\\ A + (B + C) &= (A + B) + C, & &\text{(associative)} \end{aligned}$$
and
$$(AB)_{ij} = \sum_{r=1}^{n} A_{ir} B_{rj}$$
$$\begin{aligned}
((AB)C)_{ik} &= \sum_{j=1}^{p} (AB)_{ij} C_{jk}\\
&= \sum_{j=1}^{p} \sum_{r=1}^{n} A_{ir} B_{rj} C_{jk}\\
&= \sum_{r=1}^{n} \sum_{j=1}^{p} A_{ir} B_{rj} C_{jk}\\
&= \sum_{r=1}^{n} A_{ir} (BC)_{rk}\\
&= (A(BC))_{ik}
\end{aligned}$$
$$Y = XPX + QX^2 + X = XPX + QXX + X = (XP + QX + I)X$$
Back to Definition 2: it is very important, sometimes, to make sure of conforming even for scalars; i.e., we write
$$y_{m\times 1}\, a_{1\times 1} \quad\text{NOT}\quad a\,y.$$
This is because, sometimes, $a_{1\times 1}$ itself is a matrix multiplication that, if disassembled, should conform with the rest of the equation:
$$a_{1\times 1} = x_{1\times m}'\, A_{m\times m}\, x_{m\times 1}$$
$$y_{n\times 1}\, a_{1\times 1} = y_{n\times 1}\, x_{1\times m}'\, A_{m\times m}\, x_{m\times 1}$$
$$a_{1\times 1}\, y_{n\times 1} = x_{1\times m}'\, A_{m\times m}\, x_{m\times 1}\, y_{n\times 1} \qquad \text{(WRONG!)}$$
Ver. 2 takes half the number of multiplications of Ver. 1; the speed is almost doubled by a simple trick.
For the following quadratic form $y$, expand column-wise $\big(\sum_j \sum_i\big)$:
$$y = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3\\ 4 & 7 & 6\\ 2 & -2 & 0 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$$
$$\begin{aligned}
&= x_1^2 + 4x_2 x_1 + 2x_3 x_1 + 2x_1 x_2 + 7x_2^2 - 2x_3 x_2 + 3x_1 x_3 + 6x_2 x_3\\
&= x_1^2 + (2 + 4)x_1 x_2 + (3 + 2)x_1 x_3 + 7x_2^2 + (6 - 2)x_2 x_3\\
&= x_1^2 + 6x_1 x_2 + 5x_1 x_3 + 7x_2^2 + 4x_2 x_3.
\end{aligned}$$
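The collected expansion can be verified against the raw double sum $\sum_{ij} x_i A_{ij} x_j$; a small Python sketch (the helper names are ours):

```python
A = [[1, 2, 3], [4, 7, 6], [2, -2, 0]]  # the matrix of the quadratic form

def quad_form(A, x):
    """y = x'Ax = sum_ij x_i A_ij x_j."""
    n = len(A)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def collected(x1, x2, x3):
    """The collected expansion derived above."""
    return x1**2 + 6*x1*x2 + 5*x1*x3 + 7*x2**2 + 4*x2*x3

x = (2, -1, 3)  # arbitrary test point
assert quad_form(A, x) == collected(*x)
print(quad_form(A, x))  # 17
```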
$$\sigma_{ii} = (a_{ii} + a_{ii})/2 = a_{ii}$$
$$\sigma_{ij} + \sigma_{ji} = (a_{ij} + a_{ji})/2 + (a_{ji} + a_{ij})/2 = a_{ij} + a_{ji}$$
$$x'Ax = x'\Sigma x$$
$$\begin{aligned}
y &= (x - \mu)'\,\Sigma\,(x - \mu)\\
&= (x' - \mu')\,\Sigma\,(x - \mu)\\
&= x'\Sigma x - x'\Sigma\mu - \mu'\Sigma x + \mu'\Sigma\mu\\
&= x'\Sigma x - x'\Sigma\mu - \big(\mu_{1\times p}'\,\Sigma_{p\times p}\, x_{p\times 1}\big)' + \mu'\Sigma\mu \qquad (\text{scalar}' = \text{scalar})\\
&= x'\Sigma x - x'\Sigma\mu - x'\Sigma\mu + \mu'\Sigma\mu\\
&= x'\Sigma x - 2x'\Sigma\mu + \mu'\Sigma\mu
\end{aligned}$$
Definition 42 The elimination matrix $E_{ij}(l)$ is an identity matrix except for the element $e_{ij} = l$, to perform $R_i^{\text{new}} = R_i + l \times R_j$. If not ambiguous, we write $E_{ij}$.
$$E_{31}(1) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 0 & 1 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ -2 & -3 & 7 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2\\ 4\\ 10 \end{pmatrix}$$
$$\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 2\\ 4\\ 12 \end{pmatrix}$$
Finally,
$$E_{32}(-1) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 1 & 5 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} 2\\ 4\\ 12 \end{pmatrix}$$
$$\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 2\\ 4\\ 8 \end{pmatrix};$$
then the whole process is $E_{32} E_{31} E_{21} (A\,|\,b)$.
Definition 43 The permutation matrix $P_{ij}$ is an identity matrix except that in Rows $i$ and $j$ (to be permuted) the ones are located in $p_{ij}$, $p_{ji}$ respectively; e.g.,
$$P_{23} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{pmatrix}.$$
Of course, $P_{ij} = P_{ji}$.
$$\begin{pmatrix} 1 & 0 & 0\\ -4 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\left(\begin{array}{ccc|c} 1 & 2 & 2 & 1\\ 4 & 8 & 9 & 3\\ 0 & 3 & 2 & 1 \end{array}\right) = \left(\begin{array}{ccc|c} 1 & 2 & 2 & 1\\ 0 & 0 & 1 & -1\\ 0 & 3 & 2 & 1 \end{array}\right) \qquad (E_{21}(-4))$$
$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{pmatrix}\left(\begin{array}{ccc|c} 1 & 2 & 2 & 1\\ 0 & 0 & 1 & -1\\ 0 & 3 & 2 & 1 \end{array}\right) = \left(\begin{array}{ccc|c} 1 & 2 & 2 & 1\\ 0 & 3 & 2 & 1\\ 0 & 0 & 1 & -1 \end{array}\right). \qquad (P_{23})$$
This is called Gauss elimination, and by back-substitution, z = −1, y = 1, x = 1. Jordan would go further to
get pivots on diagonal and zeros elsewhere.
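Elimination with a row exchange on a zero pivot, followed by back-substitution, can be sketched in Python (the helper name `gauss_solve` is ours; it assumes a nonzero pivot can always be found below):

```python
def gauss_solve(A, b):
    """Gauss elimination with a row exchange (P_ij) when a zero pivot
    appears, then back-substitution."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]    # augmented (A|b)
    for j in range(n):
        if M[j][j] == 0:                            # zero pivot: permute rows
            i = next(i for i in range(j + 1, n) if M[i][j] != 0)
            M[j], M[i] = M[i], M[j]
        for i in range(j + 1, n):                   # eliminate below (E_ij)
            l = M[i][j] / M[j][j]
            M[i] = [a - l * p for a, p in zip(M[i], M[j])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                  # back-substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# The system above: after E_21(-4) the second pivot is zero, forcing P_23.
A = [[1, 2, 2], [4, 8, 9], [0, 3, 2]]
b = [1, 3, 1]
print(gauss_solve(A, b))  # [1.0, 1.0, -1.0]: x = 1, y = 1, z = -1
```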
$$E = E_{31} E_{21} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -a_{31}/a_{11} & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0\\ -a_{21}/a_{11} & 1 & 0\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ -a_{21}/a_{11} & 1 & 0\\ -a_{31}/a_{11} & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0\\ -A_{21}/a_{11} & I \end{pmatrix}$$
This gives
$$\left(\begin{array}{ccc|c} 1 & 2 & 2 & 1\\ 0 & 0 & 1 & -1\\ 0 & 3 & 2 & 1 \end{array}\right),$$
which is, of course, what was obtained by multiplying by $E_{21}(-4)$.
$$A_l^{-1} A = A\, A_r^{-1} = I_{p\times p}$$
$$\begin{aligned}
aX &= b & AX &= b\\
a^{-1} a X &= a^{-1} b & A^{-1} A X &= A^{-1} b\\
1\,X &= a^{-1} b & I\,X &= A^{-1} b\\
X &= a^{-1} b & X &= A^{-1} b,
\end{aligned}$$
although finding $A^{-1}$ is more computationally expensive than solving by elimination, as we will see.
Lemma 47 If both left and right inverses exist, they are equal.
This Lemma is different from the last two statements in Lemma 51 (to be proven shortly), from which we can say:
1. If the left (or right) inverse exists, the right (or left) exists and equals it. Stated differently, if $AB = I$ then $BA = I$.
Therefore,
Either: the square matrix A has no inverse
Or: the left and right inverses are identical and unique.
1. Any $2 \times 2$ matrix:
$$\begin{pmatrix} a & b\\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}$$
Proof. The proof is by direct multiplication from both sides; it is obvious for 1 and 2. For 3,
$$E_{ij}(-l)\, E_{ij}(l) = I,$$
since the product is the identity everywhere except possibly at entry $(i, j)$, which equals $-l \times 1 + l = 0$.
Proving 4 follows exactly the same line. In a few words, since $P_{ij}$ is $I$ with rows $i$ and $j$ swapped, then $P_{ij} P_{ij}$ swaps the same rows again to bring it back to $I$.
Example 49 Consider $E_{21}(-5)$; then
$$E_{21}(-5)\,A = \begin{pmatrix} 1 & 0 & 0\\ -5 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} R_1\\ R_2\\ R_3 \end{pmatrix} = \begin{pmatrix} R_1\\ R_2 - 5R_1\\ R_3 \end{pmatrix},$$
$$E_{21}(5)\big(E_{21}(-5)\,A\big) = \begin{pmatrix} 1 & 0 & 0\\ 5 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} R_1\\ R_2 - 5R_1\\ R_3 \end{pmatrix} = \begin{pmatrix} R_1\\ R_2\\ R_3 \end{pmatrix}.$$
$$Ax_i = e_i, \qquad i = 1, \ldots, n.$$
2. If either inverse exists, $A$ has $n$ pivots and hence $A^{-1}$ exists. (This means if $AB = I$ then $BA = I$.)
3. If the inverse exists, then it is unique, along with the pivots and the solution to $Ax = b$.
Proof. 1: If the pivots exist, then they have been produced to initially solve the problem $AX = I$, and the solution $X$ went to the right side; therefore the solution $X$ is $A_r^{-1}$. In parallel, the solution is nothing but a series of matrix multiplications:
2: If $A_r^{-1}$ exists ($AX = I$), we will prove $A$ has $n$ pivots by contradiction. Assume that $A$ does not have $n$ pivots (the elimination matrices $M$ applied to $A$ produce a matrix with a zero row):
$$A_1 A = I, \qquad A_1 A A_2 = A_2, \qquad A_1 = A_2.$$
Since the inverse is unique, the elimination process cannot produce different pivots; hence they are unique too, and the solution to $Ax = b$ $\forall b$ will be unique as well and equal to $A^{-1} b$.
$$\begin{aligned}
BA &= I\\
A'B' &= I\\
AB' &= I\\
BAB' &= B\\
B' &= B.
\end{aligned}$$
$$AX = 0 \implies A^{-1} A X = A^{-1} 0 \implies X = 0,$$
$$A = M_L^{-1} U = \big(E_{32}(-1)\, E_{31}(1)\, E_{21}(-2)\big)^{-1} U = \big(E_{21}(2)\, E_{31}(-1)\, E_{32}(1)\big)\, U$$
$$A = LU:$$
$$\begin{pmatrix} 2 & 4 & -2\\ 4 & 9 & -3\\ -2 & -3 & 7 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 0 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} 1 & 2 & -1\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{pmatrix}$$
$$A = L\,D\,U.$$
Remark: $L$ stores the Gauss-elimination steps on $A$, which ends up as $U$.
$$AX = b \;\equiv\; L(UX) = b$$
$$\begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 2\\ 8\\ 10 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} c_1\\ c_2\\ c_3 \end{pmatrix} = \begin{pmatrix} 2\\ 8\\ 10 \end{pmatrix} \qquad \text{(Gauss-elimination for } b\text{)}$$
$$c_1 = 2, \qquad 2c_1 + c_2 = 8 \longrightarrow c_2 = 4, \qquad -c_1 + c_2 + c_3 = 10 \longrightarrow c_3 = 8$$
$$\begin{pmatrix} 2 & 4 & -2\\ 0 & 1 & 1\\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 2\\ 4\\ 8 \end{pmatrix} \qquad \text{(same as obtained with augmenting)}$$
$$4z = 8 \longrightarrow z = 2, \qquad y + z = 4 \longrightarrow y = 2, \qquad 2x + 4y - 2z = 2 \longrightarrow x = -1$$
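The two triangular solves above can be sketched in Python (the helper names `forward_sub` and `back_sub` are ours), using the $L$, $U$, and $b$ of this example:

```python
L = [[1, 0, 0], [2, 1, 0], [-1, 1, 1]]
U = [[2, 4, -2], [0, 1, 1], [0, 0, 4]]
b = [2, 8, 10]

def forward_sub(L, b):
    """Solve L c = b (lower triangular) top-down."""
    c = []
    for i in range(len(b)):
        c.append((b[i] - sum(L[i][j] * c[j] for j in range(i))) / L[i][i])
    return c

def back_sub(U, c):
    """Solve U x = c (upper triangular) bottom-up."""
    n = len(c)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

c = forward_sub(L, b)   # [2, 4, 8]: the Gauss-elimination of b
x = back_sub(U, c)      # [-1, 2, 2]: i.e., x = -1, y = 2, z = 2
print(c, x)
```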
where: (1) both $M_L$ and $M_L^{-1} (= L)$ are LTMs, and (2) $L$ has $L_{ij}$ equal directly to the element of $E_{ij}$, as opposed to $M_L$. The proof is immediate from the following two more general lemmas. Hint: to prove that the elements of $M_L$ are not directly the elements of $E_{ij}$, a single counterexample is enough.
Lemma 57 Multiplication of two lower (or upper) triangular matrices is a lower (or upper) triangular matrix. The diagonal will be one if $A_{ii} B_{ii} = 1$ ($A_{ii} = B_{ii} = 1$ is a special case).
Proof. Suppose $A, B$ are LTMs; i.e., $A_{ij} = B_{ij} = 0 \;\forall i < j$. Then, the element $C_{ij}$, $i \le j$, will be
$$C_{ij} = \sum_k A_{ik} B_{kj} = \sum_{k<i} A_{ik} B_{kj} + A_{ii} B_{ij} + \sum_{k>i} A_{ik} B_{kj} = \sum_{k<i} A_{ik}\,0 + A_{ii} B_{ij} + \sum_{k>i} 0\, B_{kj} = A_{ii} B_{ij},$$
which is $0$ for $i < j$ and $A_{ii} B_{ii}$ for $i = j$. Hence, it is obvious that $M_L$ is an LTM with ones on the diagonal.
$$\begin{aligned}
A_{ij} &= B_{ij} = 0 &&\forall i < j\\
A_{ii} &= B_{ii} = 1\\
A_j &= e_j &&\forall j > J\\
B_j &= e_j &&\forall j < J\\
A_{iJ} &= 0 &&\forall I < i\\
B_{iJ} &= 0 &&\forall J < i \le I.
\end{aligned}$$
Since $C_j = \sum_i B_{ij} A_i$, we get:
$$C_j = \sum_{i \ne j} B_{ij} A_i + B_{jj} A_j = 0 + A_j = A_j, \qquad (\forall j < J)$$
$$C_j = \sum_i B_{ij} A_i = \sum_{i < j} B_{ij} A_i + \sum_{j \le i} B_{ij} A_i = 0 + \sum_{j \le i} B_{ij} e_i = B_j, \qquad (\forall j > J)$$
$$C_J = \sum_{i < J} B_{iJ} A_i + B_{JJ} A_J + \sum_{J < i \le I} B_{iJ} A_i + \sum_{I < i} B_{iJ} A_i = 0 + A_J + 0 + \sum_{I < i} B_{iJ} e_i. \qquad (j = J)$$
$$\begin{aligned}
(L^{-1})_{ij} &= -L_{ij}, & i &> j\\
(L^{-1})_{ij} &= L_{ij} = 1, & i &= j\\
(L^{-1})_{ij} &= L_{ij} = 0, & i &< j.
\end{aligned}$$
Lemma 60 If a row of $A$ starts with zero, so does the same row in $L$; and when a column in $A$ starts with zero, so does the same column in $U$.
Also, it is immediate from the fact that if a row in $A$ starts with zero, it does not need elimination, and hence the element of its $E$ matrix will be zero. This saves computer time.
On the other hand, if $A_{1j} = 0$, then $U_{1j} = 0$ is immediate from
$$0 = A_{1j} = \sum_k L_{1k} U_{kj} = 1\,U_{1j} + \sum_{k>1} 0\,U_{kj},$$
LAPACK (Linear Algebra PACKage): stands on EISPACK and LINPACK and heavily on BLAS (all written in Fortran) to make them run efficiently on shared-memory vector and parallel processors.
Matlab: is a commercial SW:
• late 1970s: written to access EISPACK and LINPACK without learning Fortran.
• Then was written in C.
• Then, in 2000, rewritten to use LAPACK.
Sage:
• It builds on top of many existing open-source packages: NumPy, SciPy, matplotlib, Sympy, Maxima, GAP, FLINT, R and many more.
• Access their combined power through a common, Python-based language or directly via interfaces or wrappers.
• Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab.
• Examples and Sage cheat sheet:
2.7.2 Issues on Complexity*
To measure algorithm complexity we need to define a step; we adopt the definition of FLOP (Floating Point Operation) from the great and very mature reference for matrix computations (Golub and Van Loan, 1996, Sec. 1.2.4): $\square \times \square + \square$ (almost the inner loop).
For Gauss elimination, the number of multipliers (one division each) is
$$(n-1) + (n-2) + \cdots + 1 = \frac{1}{2}(n-1)n = \frac{1}{2}n^2 - \frac{1}{2}n = O(n^2),$$
while the number of FLOPs is
$$(n)(n-1) + (n-1)(n-2) + \cdots + 2 \cdot 1 = \sum_{i=1}^{n} (n-i+1)(n-i) = \sum_i \big(i^2 - (2n+1)i + n(n+1)\big)$$
$$= \Big(\frac{1}{3}n^3 + \frac{1}{2}n^2 + \frac{1}{6}n\Big) - (2n+1)\Big(\frac{1}{2}n(n+1)\Big) + n^2(n+1) = \frac{1}{3}n^3 - \frac{1}{3}n = O(n^3).$$
    sage: var('i,j,k,n');
    sage: sum((n-i+1)*(n-i), i, 1, n)
    1/3*n^3 - 1/3*n
Example 62 (Elaboration on Lemma 57 and looping over LT (or UT)).
Multiplication $A_{m\times n} B_{n\times p}$, $C_{ij} = \sum_k A_{ik} B_{kj}$, takes $mnp$ (or $n^3$) steps:

    C = 0
    for i=1:m
      for j=1:p
        for k=1:n
          C(i,j) = A(i,k)*B(k,j) + C(i,j)

If both $A, B$ are LT: $C_{ij} = 0 \;\forall i < j$, and $C_{ij} = \sum_{k=j}^{i} A_{ik} B_{kj} \;\forall j \le i$:

    C = 0
    for i=1:n
      for j=1:i      // (to access the UT: j=i:n)
        for k=j:i    // B=0 for k<j, A=0 for i<k
          C(i,j) = A(i,k)*B(k,j) + C(i,j)

$$\text{no. of steps} = \sum_{i=1}^{n}\sum_{j=1}^{i}\sum_{k=j}^{i} 1 = \sum_{i=1}^{n}\sum_{j=1}^{i} (i + 1 - j) = \sum_{i=1}^{n}\Big((i+1)i - \frac{1}{2}i(i+1)\Big) = \frac{1}{2}\sum_{i=1}^{n} (i + i^2)$$
$$= \frac{1}{2}\Big(\frac{1}{2}n(n+1) + \frac{1}{3}n^3 + \frac{1}{2}n^2 + \frac{1}{6}n\Big) = \frac{1}{6}n^3 + \frac{1}{2}n^2 + \frac{1}{3}n.$$

    sage: var('i,j,k,n');
    sage: sum(sum(sum(1,k,j,i), j, 1, i), i, 1, n)
    1/6*n^3 + 1/2*n^2 + 1/3*n
Example 63 (Matrix round-off error and LU partial permutation) (Golub and Van Loan, 1996, Sec. 3.3).
Suppose the PC has a floating point arithmetic with $t = 3$ digits; what is the LU factorization/solution to:
$$\begin{pmatrix} .001 & 1.00\\ 1.00 & 2.00 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 1.00\\ 3.00 \end{pmatrix}$$
3-digit precision:
$$L = \begin{pmatrix} 1.00 & 0\\ 1000 & 1.00 \end{pmatrix}, \quad U = \begin{pmatrix} .001 & 1.00\\ 0 & -1000 \end{pmatrix}, \quad LU = \begin{pmatrix} .001 & 1.00\\ 1.00 & 0.00 \end{pmatrix} \ne A, \quad \begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 0.00\\ 1.00 \end{pmatrix}$$
some calculation steps:
$$-1000 \times 1 + 2 = -1.00 \times 10^{3} + 0.002 \times 10^{3} = (-1.00 + 0.00) \times 10^{3} = -1000.$$
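The same mechanism can be reproduced in ordinary IEEE double precision (not the 3-digit machine of the example) by shrinking the pivot further, so that elimination without pivoting wipes out all information from row 2, while a row exchange (partial pivoting) keeps it; a sketch:

```python
# Eliminating with the tiny pivot eps loses row 2 entirely in float64.
eps = 1e-20
# System: eps*x1 + x2 = 1;  x1 + 2*x2 = 3  (true solution is near x1 = x2 = 1)

# Without pivoting: multiplier l21 = 1/eps = 1e20.
l21 = 1.0 / eps
u22 = 2.0 - l21 * 1.0          # fl(2 - 1e20) = -1e20: the "2" is swallowed
c2  = 3.0 - l21 * 1.0          # fl(3 - 1e20) = -1e20: the "3" is swallowed
x2  = c2 / u22                 # 1.0
x1  = (1.0 - 1.0 * x2) / eps   # 0.0, grossly wrong

# With the rows exchanged (partial pivoting), the multiplier is eps, not 1/eps.
l21p = eps / 1.0
x2p  = (1.0 - l21p * 3.0) / (1.0 - l21p * 2.0)  # ~1.0
x1p  = (3.0 - 2.0 * x2p) / 1.0                  # ~1.0, accurate
print(x1, x1p)  # 0.0 versus ~1.0
```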
commutativity u + v = v + u ∈ V ∀u, v ∈ V .
multiplicative identity 1v = v ∀v ∈ V .
Hint:
• Informally: it is the set of vectors for which all additions and scalar multiples lie in the set as well.
• Any linear combination lies in the subspace (from the first and last identity).
Example 65 ($\mathbb{R}^2 = \{(x_1, x_2) \,|\, x_1, x_2 \in \mathbb{R}\}$ vs. $V = \{(x_1, x_2) \,|\, -a \le x_1, x_2 \le a\}$, which is NOT an example).
• We can expand from the set $\mathbb{R}$ to any field $F$; the vector space will then be defined over this $F$.
• Edwin A. Abbott, 1884, “Flatland: a romance of many dimensions”: can help creatures living in three-
dimensional space, such as ourselves, imagine a physical space of four or more dimensions.
Polynomial: $p(z) = a_0 + a_1 z + \cdots + a_m z^m$.
3. $0v = 0 \;\;\forall v \in V$.
4. $a0 = 0 \;\;\forall a \in F$.
$$0' = 0' + 0 = 0$$
$$w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w'.$$
$$0v = (0 + 0)v = 0v + 0v \;\longrightarrow\; 0v = 0$$
$$a0 = a(0 + 0) = a0 + a0 \;\longrightarrow\; a0 = 0$$
$$v + (-1)v = (1)v + (-1)v = (1 - 1)v = 0v = 0 \;\longrightarrow\; (-1)v \text{ is the additive inverse of } v$$
Proposition 70 For any space $V$ and a subset $U \subset V$, $U$ is a space (or a subspace) if the following hold:
additive identity: $0 \in U$;
closure under addition: $u + v \in U \;\;\forall u, v \in U$;
closure under scalar multiplication: $au \in U \;\;\forall a \in F,\, u \in U$.
Proof. The proof is obvious since the other properties are satisfied immediately on the subset as long as they are satisfied on the whole set.
• $U = \{(x_1, x_2) \,|\, 0 \le x_1, x_2\}$.
• $U = \{(x_1, x_2) \,|\, x_1 \in \mathbb{R},\; x_2 = a x_1^2,\; a \ne 0\}$:
1. $(0, 0) \in U$.
2. $(x_1, a x_1^2) + (x_2, a x_2^2) = \big((x_1 + x_2),\; a(x_1^2 + x_2^2)\big) \ne \big((x_1 + x_2),\; a(x_1 + x_2)^2\big)$
Remark 2 This recalls: when does a solution of $Ax = b$ exist? $b$ must be in the column space of $A$.
Example 74 What is the column space of the matrix $A = \begin{pmatrix} 1 & 0\\ 4 & 3\\ 2 & 3 \end{pmatrix}$?
It is the set $C(A) = \Big\{ Ax = \begin{pmatrix} 1\\ 4\\ 2 \end{pmatrix} x_1 + \begin{pmatrix} 0\\ 3\\ 3 \end{pmatrix} x_2 \;\;\forall x_1, x_2 \Big\}$, which is actually a plane passing through zero.
Now: it is natural to define a space from ONLY x (not Ax or x ′ A), under some constraint.
Definition 77 (Null Space $N(A) \subseteq \mathbb{R}^n$)
$$N(A) = \{x \,|\, Ax = 0,\; x \in \mathbb{R}^n\},$$
and it is constructed such that $N(A) \perp R(A)$.
Proof that $N(A)$ is really a subspace:
$$x = 0 \;\longrightarrow\; A0 = 0$$
$$x_1, x_2 \in N(A) \;\longrightarrow\; Ax_1 = Ax_2 = 0 \;\longrightarrow\; A(x_1 + x_2) = Ax_1 + Ax_2 = 0$$
$$x_1 \in N(A) \;\longrightarrow\; Ax_1 = 0 \;\longrightarrow\; A(a x_1) = a A x_1 = 0.$$
Remark 3 .
{ }
1. It is impossible to have a subspace of x | Ax = b, x ∈ R n except for b = 0; why?
Example 79 What is the null space of the matrix: $x_1 + 2x_2 + 3x_3 = 0$. Here: $A = (1, 2, 3)$. No pivot cancellation:
$$x = \begin{pmatrix} -2x_2 - 3x_3\\ x_2\\ x_3 \end{pmatrix} = x_2 \begin{pmatrix} -2\\ 1\\ 0 \end{pmatrix} + x_3 \begin{pmatrix} -3\\ 0\\ 1 \end{pmatrix}$$
The solution is the set of all linear combinations of these two ($2 = 3 - 1$) simple vectors; A PLANE: let's draw it.
Example 80 Suppose
$$A = \begin{pmatrix} 1 & 2 & 3\\ 1 & 1 & -4 \end{pmatrix} \;\longrightarrow\; \begin{pmatrix} 1 & 2 & 3\\ 0 & -1 & -7 \end{pmatrix} \;\longrightarrow\; x_2 = -7x_3,\; x_1 = 11x_3 \;\longrightarrow\; x = x_3 \begin{pmatrix} 11\\ -7\\ 1 \end{pmatrix}.$$
So, the solution is the set of all linear combinations of this single ($1 = 3 - 2$) vector; A LINE: let's see Sage.
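The special solution of Example 80 can be re-derived and checked by hand in Python (the variable names are ours):

```python
A = [[1, 2, 3], [1, 1, -4]]

# One elimination step (R2 <- R2 - R1) gives U = [[1, 2, 3], [0, -1, -7]]:
# x3 is the free variable; set x3 = 1 and back-substitute.
x3 = 1
x2 = -7 * x3             # from -x2 - 7*x3 = 0
x1 = -2 * x2 - 3 * x3    # from x1 + 2*x2 + 3*x3 = 0
s = [x1, x2, x3]
print(s)                 # [11, -7, 1]

# Every vector in N(A) is a multiple of s; verify A s = 0:
assert all(sum(a * x for a, x in zip(row, s)) == 0 for row in A)
```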
Example 81 (Motivation from data science) :
• Data Interpretation.
$$x_1 = -2x_3 + 6x_4$$
$$x_2 = 0x_3 - 5x_4$$
$$x = \begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix} = x_3 \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix} + x_4 \begin{pmatrix} 6\\ -5\\ 0\\ 1 \end{pmatrix}$$
Proof.
For columns:
$$\begin{pmatrix} 1 & 0\\ -A_1/a_{11} & I \end{pmatrix}\begin{pmatrix} a_{11} & a_{21} & \cdots & \alpha a_{11} + \beta a_{21} & \cdots\\ A_1 & A_2 & \cdots & \alpha A_1 + \beta A_2 & \cdots \end{pmatrix} = \begin{pmatrix} a_{11} & a_{21} & \cdots & \alpha a_{11} + \beta a_{21} & \cdots\\ 0 & A_2 - A_1 a_{21}/a_{11} & \cdots & \beta(A_2 - A_1 a_{21}/a_{11}) & \cdots \end{pmatrix};$$
the second pivot cancellation will not provide pivots in the linear-combination column.
For rows:
$$\begin{pmatrix} a_{11} & R_1\\ a_{21} & R_2\\ \vdots & \vdots\\ \alpha a_{11} + \beta a_{21} & \alpha R_1 + \beta R_2 \end{pmatrix} \longrightarrow \begin{pmatrix} a_{11} & R_1\\ 0 & R_2 - R_1(a_{21}/a_{11})\\ \vdots & \vdots\\ 0 & (\alpha R_1 + \beta R_2) - R_1(\alpha a_{11} + \beta a_{21})/a_{11} \end{pmatrix} = \begin{pmatrix} a_{11} & R_1\\ 0 & R_2 - R_1(a_{21}/a_{11})\\ \vdots & \vdots\\ 0 & \beta\big(R_2 - R_1(a_{21}/a_{11})\big) \end{pmatrix}$$
Proof. We arrange:
2. Since we have just proven that $\alpha \notin N(A)$, then $A\alpha \ne 0$. Therefore, the corresponding columns of $A$ are linearly independent as well.
• $N(A) = N(U) = N(R)$, of course, since pivot cancellation will not change the $0$ vector at the R.H.S.
• The number of vectors in $N(A)$ is itself the number of linear combinations of columns of $A$ that give $0$.
Golub, G. H., Van Loan, C. F., 1996. Matrix computations, 3rd Edition. Johns Hopkins University Press, Baltimore.
Schott, J. R., 2005. Matrix analysis for statistics, 2nd Edition. Wiley, Hoboken, N.J.
Searle, S. R., 1982. Matrix algebra useful for statistics. Wiley, New York.
Strang, G., 2016. Introduction to linear algebra, 5th Edition. Wellesley-Cambridge Press.