Cholesky Decomposition, Linear Algebra Libraries and Matlab Routine
Course Website
https://sites.google.com/view/kporwal/teaching/mtl107
Symmetric, positive definite systems
Definition
A symmetric matrix A ∈ Rn×n is positive definite if the corresponding
quadratic form is positive definite, i.e., if x⊤Ax > 0 for all x ∈ Rn \ {0}.
Theorem
If a symmetric matrix A ∈ Rn×n is positive definite, then the
following conditions hold.
1. aii > 0 for i = 1, ..., n.
2. aik² < aii akk for i ≠ k, i, k = 1, ..., n.
3. There is a k with max_{i,k} |aik| = akk, i.e., the largest element
   in modulus lies on the diagonal.
Proof.
1. aii = ei⊤Aei > 0.
2. (ξei + ek)⊤A(ξei + ek) = aii ξ² + 2aik ξ + akk > 0. This quadratic
   equation has no real zero ξ, therefore its discriminant
   4aik² − 4aii akk must be negative.
3. Clear.
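The three conditions are easy to test numerically. A minimal pure-Python sketch (the function name is illustrative, not from the lecture; note the conditions are necessary but not sufficient for positive definiteness):

```python
def satisfies_spd_conditions(A):
    """Check the three necessary conditions from the theorem."""
    n = len(A)
    # 1. positive diagonal entries
    if any(A[i][i] <= 0 for i in range(n)):
        return False
    # 2. a_ik^2 < a_ii * a_kk for i != k
    for i in range(n):
        for k in range(n):
            if i != k and A[i][k] ** 2 >= A[i][i] * A[k][k]:
                return False
    # 3. the largest entry in modulus lies on the diagonal
    m = max(abs(A[i][k]) for i in range(n) for k in range(n))
    return any(abs(A[k][k]) == m for k in range(n))

A_spd = [[4.0, 1.0], [1.0, 3.0]]   # symmetric positive definite
A_bad = [[1.0, 2.0], [2.0, 1.0]]   # indefinite: violates condition 2
print(satisfies_spd_conditions(A_spd))  # True
print(satisfies_spd_conditions(A_bad))  # False
```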
Gaussian elimination works w/o pivoting. Since a11 > 0 we have
\[
A = \begin{pmatrix} a_{11} & a_1^\top \\ a_1 & A_1 \end{pmatrix}
  = \begin{pmatrix} 1 & 0^\top \\ a_1/a_{11} & I \end{pmatrix}
    \begin{pmatrix} a_{11} & a_1^\top \\ 0 & A_1 - a_1 a_1^\top/a_{11} \end{pmatrix}
  = \begin{pmatrix} 1 & 0^\top \\ a_1/a_{11} & I \end{pmatrix}
    \begin{pmatrix} a_{11} & 0^\top \\ 0 & A^{(1)} \end{pmatrix}
    \begin{pmatrix} 1 & a_1^\top/a_{11} \\ 0 & I \end{pmatrix}
\]
with A(1) = A1 − a1 a1⊤/a11. But for any x ∈ Rn−1 \ {0} we have
\[
\begin{pmatrix} 0 \\ x \end{pmatrix}^\top
\begin{pmatrix} a_{11} & 0^\top \\ 0 & A^{(1)} \end{pmatrix}
\begin{pmatrix} 0 \\ x \end{pmatrix}
=
\begin{pmatrix} 0 \\ x \end{pmatrix}^\top
\begin{pmatrix} 1 & 0^\top \\ -a_1/a_{11} & I \end{pmatrix}
\begin{pmatrix} a_{11} & a_1^\top \\ a_1 & A_1 \end{pmatrix}
\begin{pmatrix} 1 & -a_1^\top/a_{11} \\ 0 & I \end{pmatrix}
\begin{pmatrix} 0 \\ x \end{pmatrix}
= y^\top A y > 0
\]
with y = (−a1⊤x/a11, x⊤)⊤ ≠ 0. Hence A(1) is again symmetric positive
definite, and the elimination can be continued without pivoting.
Cholesky decomposition
A symmetric positive definite matrix A can be factored as A = LL⊤ with
L lower triangular. The system Ax = b is then solved by two triangular
solves:
Lc = b (forward substitution)
L⊤x = c (backward substitution)
The complexity is half that of the LU factorization:
(1/3)n³ + O(n²) flops.
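A minimal pure-Python sketch of the factorization and the two triangular solves (illustrative code, not the course's Matlab routine; production software would call LAPACK's dpotrf/dpotrs instead):

```python
import math

def cholesky(A):
    """Factor symmetric positive definite A as A = L L^T, L lower triangular."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)  # fails if A is not SPD
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_spd(A, b):
    """Solve Ax = b via Lc = b (forward), L^T x = c (backward)."""
    n = len(b)
    L = cholesky(A)
    c = [0.0] * n
    for i in range(n):                    # forward substitution
        c[i] = (b[i] - sum(L[i][k] * c[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):          # backward substitution
        x[i] = (c[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0], [2.0, 3.0]]
print(solve_spd(A, [10.0, 9.0]))   # ≈ [1.5, 2.0]
```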
Efficient implementation
See http://www.netlib.org/blas
BLAS-1: vector operations (real, double, complex variants)
▶ Swap two vectors, copy two vectors, scale a vector
▶ AXPY operation: y = αx + y
▶ 2-norm, 1-norm, dot product
▶ IAMAX: index of the largest vector element in modulus:
first i such that |xi | ≥ |xk | for all k.
▶ O(1) flops per byte of memory accessed.
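For illustration, three of these kernels can be sketched in a few lines of pure Python (the names only loosely mirror the BLAS routines; real codes call an optimized BLAS from netlib or a vendor):

```python
def axpy(alpha, x, y):
    """y <- alpha*x + y (the AXPY operation)."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def dot(x, y):
    """Dot product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def iamax(x):
    """First index i with |x_i| >= |x_k| for all k."""
    return max(range(len(x)), key=lambda i: (abs(x[i]), -i))

x = [1.0, -3.0, 3.0]
print(axpy(2.0, x, [1.0, 1.0, 1.0]))  # [3.0, -5.0, 7.0]
print(dot(x, x))                      # 19.0
print(iamax(x))                       # 1 (first of the two entries with modulus 3)
```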
BLAS (cont.)
Error estimation
Let
\[
A = \begin{pmatrix} 1.2969 & 0.8648 \\ 0.2161 & 0.1441 \end{pmatrix},
\qquad
b = \begin{pmatrix} 0.8642 \\ 0.1440 \end{pmatrix}
\]
Suppose somebody came up with the approximate solution
\[
\tilde{x} = \begin{pmatrix} 0.9911 \\ -0.4870 \end{pmatrix}
\]
Then,
\[
\tilde{r} = b - A\tilde{x} = \begin{pmatrix} -10^{-8} \\ 10^{-8} \end{pmatrix}
\implies \|\tilde{r}\|_\infty = 10^{-8}
\]
Since x = (2, −2)⊤, the error z̃ = x − x̃ satisfies ∥z̃∥∞ = 1.513, which
is ≈ 10⁸ times larger than the residual.
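The computation can be reproduced in double precision (pure Python; the slide's residual refers to exact arithmetic, so the computed value only approximates it):

```python
# Residual vs. error for the 2x2 example: a tiny residual does not
# imply a small error when the matrix is ill-conditioned.
A = [[1.2969, 0.8648], [0.2161, 0.1441]]
b = [0.8642, 0.1440]
x_tilde = [0.9911, -0.4870]
x_exact = [2.0, -2.0]

r = [b[i] - sum(A[i][j] * x_tilde[j] for j in range(2)) for i in range(2)]
err = [x_exact[i] - x_tilde[i] for i in range(2)]

res_norm = max(abs(v) for v in r)     # about 1e-8
err_norm = max(abs(v) for v in err)   # 1.513
print(res_norm, err_norm, err_norm / res_norm)
```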
Error estimation (cont.)
Definition
The quantity κ(A) = ∥A∥ ∥A⁻¹∥ is called the condition number of A.
For the distance of A to the nearest singular matrix one has
\[
\min\left\{ \frac{\|E\|_2}{\|A\|_2} : A + E \text{ singular} \right\}
= \frac{1}{\kappa_2(A)}
\]
Error estimation (cont.)
This yields
\[
\|A^{-1}\|_\infty = 1.513 \times 10^8
\implies \kappa_\infty(A) = 2.162 \times 1.513 \times 10^8 \approx 3.27 \times 10^8.
\]
The numbers ∥z̃∥∞ /∥x∥∞ = 1.513/2 ≈ 0.76 and
κ∞(A) ∥r̃∥∞ /∥b∥∞ = 3.27×10⁸ · 10⁻⁸/0.8642 ≈ 3.78 confirm the estimate
\[
\frac{\|\tilde{z}\|_\infty}{\|x\|_\infty}
\le \kappa_\infty(A)\, \frac{\|\tilde{r}\|_\infty}{\|b\|_\infty}
\]
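These numbers can be checked directly from the explicit 2×2 inverse (a pure-Python sketch):

```python
# kappa_inf for the example matrix, via the closed-form 2x2 inverse.
A = [[1.2969, 0.8648], [0.2161, 0.1441]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # ~1e-8: A is nearly singular
Ainv = [[ A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det,  A[0][0] / det]]

def norm_inf(M):
    """Infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

kappa = norm_inf(A) * norm_inf(Ainv)
print(kappa)   # about 3.27e8
```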
Error estimation (cont.)
So, indeed,
∥E∥/∥A∥ ≈ 1/κ(A).
(This estimate holds in l2 and l∞ norms.)
Sensitivity on matrix coefficients
Consider the perturbed system (A + δA)(x + δx) = b + δb. Subtracting
Ax = b gives
\[
\delta x = A^{-1}(\delta b - \delta A\, x - \delta A\, \delta x)
\]
For compatible norms we have
\[
\|\delta x\| \le \|A^{-1}\| \left( \|\delta b\| + \|\delta A\|\, \|x\| + \|\delta A\|\, \|\delta x\| \right)
\]
Then, provided ∥A⁻¹∥ ∥δA∥ < 1,
\[
\|\delta x\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\, \|\delta A\|}
\left( \|\delta b\| + \|\delta A\|\, \|x\| \right)
\]
Since we are interested in an estimate of the relative error we use
\[
\|b\| = \|Ax\| \le \|A\|\, \|x\|
\implies \frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|}.
\]
Therefore, we have
\[
\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\, \|\delta A\|}
\left( \frac{\|\delta b\|}{\|x\|} + \|\delta A\| \right)
\]
Sensitivity on matrix coefficients (cont.)
\[
\frac{\|\delta x\|}{\|x\|} \le
\frac{\kappa(A)}{1 - \kappa(A)\, \|\delta A\|/\|A\|}
\left( \frac{\|\delta b\|}{\|b\|} + \frac{\|\delta A\|}{\|A\|} \right)
\]
Rule of thumb
If the data are accurate to about d decimal digits,
\[
\frac{\|\delta b\|}{\|b\|} \le 5 \cdot 10^{-d}, \qquad
\frac{\|\delta A\|}{\|A\|} \le 5 \cdot 10^{-d},
\]
and κ(A) ≈ 10^α, then
\[
\frac{\|\delta x\|}{\|x\|} \lesssim 10^{\alpha - d + 1}.
\]
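A quick numerical sanity check of the perturbation bound on a small, well-conditioned system (all values below are made up for illustration):

```python
# Verify ||dx||/||x|| <= kappa/(1 - kappa*||dA||/||A||) * (||db||/||b|| + ||dA||/||A||)
# for one concrete 2x2 system with small perturbations.

def solve2(M, r):
    """Solve a 2x2 system by Cramer's rule."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [( M[1][1] * r[0] - M[0][1] * r[1]) / d,
            (-M[1][0] * r[0] + M[0][0] * r[1]) / d]

def norm_inf_mat(M):
    return max(sum(abs(v) for v in row) for row in M)

def norm_inf_vec(v):
    return max(abs(x) for x in v)

A  = [[3.0, 1.0], [1.0, 2.0]]
b  = [5.0, 5.0]
dA = [[1e-6, -1e-6], [2e-6, 1e-6]]   # illustrative perturbations
db = [1e-6, -2e-6]

x  = solve2(A, b)
Ap = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]
xp = solve2(Ap, [b[i] + db[i] for i in range(2)])
dx = [xp[i] - x[i] for i in range(2)]

d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]
kappa = norm_inf_mat(A) * norm_inf_mat(Ainv)

rel_dA = norm_inf_mat(dA) / norm_inf_mat(A)
rel_db = norm_inf_vec(db) / norm_inf_vec(b)
lhs = norm_inf_vec(dx) / norm_inf_vec(x)
rhs = kappa / (1 - kappa * rel_dA) * (rel_db + rel_dA)
print(lhs <= rhs)   # True: the bound holds
```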
Rule of thumb (cont.)
Scaling: instead of Ax = b one solves the scaled system
(D1 A D2) y = D1 b with nonsingular diagonal matrices D1, D2, and
recovers x = D2 y.
Example
\[
\begin{pmatrix} 10 & 100000 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 100000 \\ 2 \end{pmatrix}
\]
The row-equivalent scaled problem is
\[
\begin{pmatrix} 0.0001 & 1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \end{pmatrix}
\]
The solutions with 3 decimal digits are x̃ = (0, 1.00)T for the
unscaled system and x̃ = (1.00, 1.00)T for the scaled system.
The correct solution is x = (1.0001, 0.9999)T .
See Example 5.11 in Ascher and Greif
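The effect can be reproduced by simulating 3-significant-digit decimal arithmetic in Python (a rough model of the textbook's finite-precision computation; the rounding helper is illustrative):

```python
import math

def rnd(v):
    """Round v to 3 significant decimal digits."""
    if v == 0.0:
        return 0.0
    e = math.floor(math.log10(abs(v)))
    return round(v, 2 - e)

def ge_3digit(A, b):
    """Gaussian elimination with partial pivoting on a 2x2 system,
    rounding every intermediate result to 3 significant digits."""
    A = [row[:] for row in A]
    b = b[:]
    if abs(A[1][0]) > abs(A[0][0]):            # partial pivoting: row swap
        A[0], A[1] = A[1], A[0]
        b[0], b[1] = b[1], b[0]
    m = rnd(A[1][0] / A[0][0])                 # multiplier
    A[1][1] = rnd(A[1][1] - rnd(m * A[0][1]))
    b[1] = rnd(b[1] - rnd(m * b[0]))
    x2 = rnd(b[1] / A[1][1])                   # back substitution
    x1 = rnd(rnd(b[0] - rnd(A[0][1] * x2)) / A[0][0])
    return [x1, x2]

print(ge_3digit([[10.0, 100000.0], [1.0, 1.0]], [100000.0, 2.0]))
# [0.0, 1.0]  -- wrong: the exact solution is (1.0001, 0.9999)
print(ge_3digit([[0.0001, 1.0], [1.0, 1.0]], [1.0, 2.0]))
# [1.0, 1.0]  -- scaling makes pivoting pick the right pivot
```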
Theorem on scaling
Theorem
Let A ∈ Rn×n be nonsingular. Let the diagonal matrix Dz be
defined by
\[
d_{ii} = \left( \sum_{j=1}^{n} |a_{ij}| \right)^{-1}, \qquad i = 1, \ldots, n.
\]
Then
κ∞ (Dz A) ≤ κ∞ (DA)
for all nonsingular diagonal D.
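For the earlier 2×2 example, this choice of Dz (row equilibration) reduces κ∞ dramatically; a pure-Python check:

```python
# Row equilibration with d_ii = 1 / (row sum of |a_ij|):
# kappa_inf drops from about 1e5 to about 3 for the example matrix.

def norm_inf(M):
    return max(sum(abs(v) for v in row) for row in M)

def kappa_inf(M):
    """Condition number of a 2x2 matrix via its closed-form inverse."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Minv = [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]
    return norm_inf(M) * norm_inf(Minv)

A = [[10.0, 100000.0], [1.0, 1.0]]
Dz = [1.0 / sum(abs(v) for v in row) for row in A]      # d_ii = 1/row sum
DzA = [[Dz[i] * A[i][j] for j in range(2)] for i in range(2)]

print(kappa_inf(A))    # about 1e5
print(kappa_inf(DzA))  # about 3
```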
See: Dahmen & Reusken: Numerik für Ingenieure und
Naturwissenschaftler. 2nd ed. Springer 2008.
Remark on determinants