0% found this document useful (0 votes)
6 views

lesson4

Uploaded by

isabon7044
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views

lesson4

Uploaded by

isabon7044
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

INVERTIBLE MATRICES

1. The inverse of a square matrix


Definition 1.1. We say that a square n × n matrix A is invertible or regular if
there exists an n × n matrix B such that
AB = In and BA = In .
The matrix B is then unique. It is called the inverse of A and is denoted by A−1 .
Proof. We need to prove the unicity of B for this definition to make sense. Suppose
we have two matrices B and C such that:
AB = In , BA = In and AC = In , CA = In .
Then, since by associativity we have C(AB) = (CA)B, we obtain
(CA)B = In B = B = C(AB) = CIn = C
thus B = C. □
Example 1.2. The 2 × 2 matrix
( )
3 0
A=
0 1
is invertible and ( )
1
−1 3 0
A = .
0 1
Indeed, we can easily check that AA−1 = I2 and A−1 A = I2 . However, the following
matrix: ( )
5 2
B=
0 0
is not invertible. Indeed, suppose there existed a matrix C such that BC = I2 with
( )
a b
C= .
c d
Then we would have
( ) ( )
5a + 2c 5b + 2d 1 0
BC = = .
0 0 0 1
But looking at the last entry of the second row, this would imply 0 = 1, which is
impossible.
Proposition 1.3. The inversion of matrices satisfies the following important prop-
erties.
• It is involutive: (A−1 )−1 = A
• It is anti-commutative: (AB)−1 = B −1 A−1
for all n × n invertible matrices A and B.
1
Proof. The first property is obvious by the definition of the inverse. The second
property follows from the two equations:
(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 In B = B −1 B = In

(AB)(B −1 A−1 ) = A(BB −1 )A−1 = AIn A−1 = AA−1 = In


Remark 1.4. If A is an n × n invertible matrix and λ is a nonzero scalar, it is also


clear that the inverse of the matrix λA is the matrix: λ1 A−1 .

2. Finding the inverse by direct computation


Given a square matrix A, we would like now to answer the following two impor-
tant questions:
(1) Is the matrix A invertible or not?
(2) If A is invertible, how can we compute the inverse A−1 ?
The first method is to do it by direct computation. We take a n × n matrix B
with unknown coefficients and we solve the matrix equation:
AB = In
If we come up with a contradiction, it means that B cannot be computed and that
A is thus not invertible. If we find a solution for B, we check that
BA = AB = In
and we deduce from this that B is the inverse of A. The matrix B has n2 coefficients,
so this method amounts to solving n systems of linear equations with n variables
each. It is a very basic, but not a very efficient method. We give a few examples
below when n = 3.
Example 2.1.
 
1 −2 2
A= 1 −1 3 
2 −4 5
Let B be the matrix  
a d g
B= b e h 
c f i
and suppose that AB = I3 , that is
   
a − 2b + 2c d − 2e + 2f g − 2h + 2i 1 0 0
 a − b + 3c d − e + 3f g − h + 3i  =  0 1 0 .
2a − 4b + 5c 2d − 4e + 5f 2g − 4h + 5i 0 0 1
We obtain three systems of linear equations in three variables each:
  
 a − 2b + 2c = 1  d − 2e + 2f = 0  g − 2h + 2i = 0
a − b + 3c = 0 d − e + 3f = 1 g − h + 3i = 0
  
2a − 4b + 5c = 0 2d − 4e + 5f = 0 2g − 4h + 5i = 0
2
which we solve by row reduction. Here, all three systems have a unique solution,
and we can compute all the coefficients a, b, c, d, e, f, g, h, i. We obtain:
 
7 2 −4
B =  1 1 −1  .
−2 0 1
Now we check by direct computation that:
AB = I3 and BA = I3 .
From all this we conclude that A is invertible and that:
 
7 2 −4
A−1 =  1 1 −1  .
−2 0 1
Example 2.2.
 
1 2 3
A =  −1 0 −2 
0 2 1
Let B be the matrix  
a d g
B= b e h 
c f i
and suppose that AB = I3 , that is
   
a + 2b + 3c d + 2e + 3f g + 2h + 3i 1 0 0
 −a − 2c −d − 2f −g − 2i  =  0 1 0 .
2b + c 2e + f 2h + i 0 0 1
We obtain three systems of linear equations in three variables each:
  
 a + 2b + 3c = 1  d + 2e + 3f = 0  g + 2h + 3i = 0
−a − 2c = 0 −d − 2f = 1 −g − 2i = 0
  
2b + c = 0 2e + f = 0 2h + i = 0
which we solve by row reduction. But here, we see that the first system becomes:
  
 a + 2b + 3c = 1  a + 2b + 3c = 1  a + 2b + 3c = 1
−a − 2c = 0 ⇐⇒ 2b + c = 1 ⇐⇒ 2b + c = 1
  
2b + c = 0 2b + c = 0 0 = −1
which is impossible. Therefore, the existence of a matrix B such that AB = I3
would imply a contradiction. Such a B cannot exist. The matrix A is not invertible.

3. The coefficient matrix of a system of linear equations


Let A be a square n × n matrix. In this section, we will investigate the set of
solutions of the system of linear equations:


A−→x = b


where b is an n-dimensional column vector with coordinates b1 , b2 , ..., bn . We will
see that the situation is very different when A is invertible and when A is not
invertible.
3
Theorem 3.1. Let A be an n × n matrix. The following three conditions are
equivalent:
(1) The matrix A is invertible.


(2) For each n-dimensional column vector b , the system of linear equations


A−

x = b
has a unique solution.
(3) The reduced row-echelon form of A is the unit matrix In .
Proof. • First we prove that 1 =⇒ 2. Suppose that A is invertible and fix


a column vector b . Then

→ −
→ −

A−→x = b =⇒ A−1 A− →x = A−1 b =⇒ − →
x = A−1 b
since A−1 A = I and I −
n

x =−
n

x . Furthermore,

→ −
→ −
→ −

x = A−1 b =⇒ A− →
x = AA−1 b =⇒ A− →
x = b

→ − →
since AA−1 = In and In b = b . Therefore we have proved that the system

→ −

of linear equations A−

x = b has a unique solution, which is −

x = A−1 b .


• Next we prove that 2 =⇒ 3. Take the zero vector for b . The homogeneous

→ −

system A− →
x = 0 has a unique solution, which is necessarily − →
x = 0 . Let
B be the augmented matrix of this system. The first n columns of B are
the columns of A, and the last column of B is the zero column. Now we
transform B by elementary row operations into its reduced row-echelon
form B ′ . Notice that the last column doesn’t change during this process: it
stays equal to the zero column. The system of linear equations associated
with B ′ has only one solution, so it is necessarily:


 x1 = 0

x2 = 0

 ...

xn = 0
B ′ is necessarily equal to:
 
1 0 0 ... 0 0 0
 0 1 0 ... 0 0 0 
 
B′ =   ... ... ... ... ... ... ... .

 ... ... ... ... ... ... ... 
0 0 0 ... 0 1 0
The first n columns of B ′ are exactly the columns of In and the last column
of B ′ is the zero column. B ′ is row equivalent to B , so In is row equivalent
to A.

• Now, we prove that 3 =⇒ 2 =⇒ 1. Suppose that A is row equivalent




to the identity matrix In . Let b be a column vector and let B be the


augmented matrix of the system A− →
x = b . We perform on B the elemen-
tary row operations that reduce A to In and we obtain a matrix B ′ . It is
4
then clear that the first n columns of B ′ are the columns of In and the last
column of B ′ is of the type:
 
c1,1 b1 + c1,2 b2 + ... + c1,n bn
 c2,1 b1 + c2,2 b2 + ... + c2,n bn 
 
 ... 
cn,1 b1 + cn,2 b2 + ... + cn,n bn
where the numbers ci,j are real coefficients, for 1 ≤ i ≤ n and 1 ≤ j ≤ n.
B ′ is clearly the reduced row-echelon form of B and the system of linear


equations A− →
x = b has a unique solution which is:
 
c1,1 b1 + c1,2 b2 + ... + c1,n bn

→  c2,1 b1 + c2,2 b2 + ... + c2,n bn 
x =  ...
.

cn,1 b1 + cn,2 b2 + ... + cn,n bn
Now let C be the matrix
 
c1,1 c1,2 ... ...
c1,n
 c2,1 c2,2 c2,n 
... ...
C=  .
... ... ... ...
cn,1 cn,2 ... ...
cn,n


We have proved that for each column vector b , the system of linear equa-


tions A−→
x = b has a unique solution, which is

→ −

x =Cb.


So for all n-dimensional vectors b , we get:

→ − →
AC b = b .
from which we can conclude:
AC = In .


Furthermore, for any column vector − →
y , if we set b = A−

y , then the system

→ −

of linear equations A x = b has a unique solution which is −

→ →x =− →
y =Cb.
So for all vectors −

y in Rn , we get:
CA−

y =−

y
and therefore we can prove that:
CA = In .
The matrix A is invertible and A−1 = C.

We had stated that an n × n matrix A is invertible if there exists an n × n matrix
B such that both conditions:
AB = In and BA = In
are satisfied. However, in practice, the reader may have noticed that both conditions
are not necessary: once one of the two conditions is satisfied, the other one is
also automatically satisfied. We will prove this fact in the following proposition:
actually, we do not need to check both conditions, only one of the two conditions
is sufficient.
5
Proposition 3.2. Let A and B be two n × n matrices. If either AB = In or
BA = In then A is invertible and B = A−1 .
Proof. Suppose BA = In and let − →
x be a solution of the homogeneous system

→ − →
A x = 0. Then we have BA x = B 0 = 0 and BA−

→ −
→ →
x = In −

x =− →x which implies

→ −
→ −

that x = 0 . So the homogeneous system A x = 0 has a unique solution −

→ −
→ →
x = 0.
By the previous discussion, this implies that the reduced row-echelon form of A is
In so A is invertible. Furthermore,
BA = In =⇒ BAA−1 = In A−1 =⇒ B = A−1 .
If AB = In we reverse the roles of A and B: B is invertible and B −1 = A which is
equivalent to A−1 = B. □

Now, we would like to deal with the case where the matrix A is not invertible.
The following theorem is an analog of Theorem 3.1, rephrased for non-invertible
matrices.

Theorem 3.3. Let A be an n × n matrix. The following three conditions are


equivalent:
(1) The matrix A is not invertible.


(2) For each n-dimensional column vector b , the system of linear equations


A−→x = b
has either no solution or infinitely many solutions.
(3) In any row-echelon form of the matrix A, the bottom row is zero.
Proof. • First, we will prove that 1 =⇒ 3. Suppose that A is not invertible
and consider the homogeneous system


A−→
x = 0.
By the results of the previous section, this system has infinitely many solu-
tions. Let B be the augmented matrix of this system: the first n columns
of B are the columns of A and the last column of B is the zero column.
We transform this B, by elementary row operations, into its reduced row-
echelon form B ′ . Notice that the last column stays constantly equal to zero
during this process. The system of linear equations associated with B ′ has
infinitely many solutions. So there are k free variables, for some k ≥ 1, and
n − k leading variables. There is one leading variable on each row, from row
1 to row n − k. The rows below row n − k do not have a leading variable:
they are all zero rows. Since n − k ≤ n − 1, there is at least one zero row,
the bottom one.
• Now, we will prove that 3 =⇒ 2. Suppose that in all row-echelon forms


of A the bottom row is zero. Let b be a column vector and let B be the
augmented matrix of the system
−→
A−→
x = b.
We transform B by elementary row operations into its reduced row-echelon
form B ′ . Now the first n columns of B ′ form a row-echelon form of A, so
6
their bottom k rows are zero for some integer k ≥ 1. Then the bottom k
rows of B ′ are necessarily of the type:

(0, 0, ..., 0, αi )


for some constant αi depending on b . If αi ̸= 0 for some i then the system
is inconsistent: it has no solution. If αi = 0 for all i with n − k + 1 ≤ i ≤ n,
then the system is consistent and it has k free variables: it has infinitely
many solutions.
• Finally, 2 =⇒ 1 follows directly from the result of Theorem 3.1. So we
are done.

4. Finding the inverse by solving a system of linear equations


This gives us a new method to compute the inverse of a matrix A: solve the
system


A−

x = b
with the coefficients of
 
b1
 b2 
→ 
− 
b =
 ... 

 ... 
bn
seen as parameters.

More precisely, two cases can occur:




• When A is invertible, the system A−

x = b has a unique solution which is

→ −

x = A−1 b .

By solving the system, at the end, we will get the entries of the matrix
A−1 = C in the formula:
 
c1,1 b1 + c1,2 b2 + ... + c1,n bn

→  c2,1 b1 + c2,2 b2 + ... + c2,n bn 
x =  ...


cn,1 b1 + cn,2 b2 + ... + cn,n bn

as in the proof of Theorem 3.1.




• When A is not invertible, when we solve the system A− →
x = b , at some
point the bottom row of the augmented matrix will become

(0, 0, ..., 0, αn )

where αn is a linear combination of the coefficients b1 , b2 , ..., bn . The system


has either no solution or infinitely many solutions, depending on the vector


b . By Theorem 3.3, we can stop here and conclude that A is not invertible.
7
Example 4.1 (Example 2.1).
 
1 −2 2
A= 1 −1 3 
2 −4 5


We solve the system A−

x = b with the coefficients of
 
b

→  1 
b = b2
b3
seen as parameters. By Gaussian elimination:
     
1 −2 2 b1 1 −2 2 b1 1 −2 0 5b1 − 2b3
 1 −1 3 b2  ∼  0 1 1 b2 − b1  ∼  0 1 0 b1 + b2 − b3 
2 −4 5 b3 0 0 1 b3 − 2b1 0 0 1 b3 − 2b1
 
1 0 0 7b1 + 2b2 − 4b3
∼ 0 1 0 b1 + b2 − b3 
0 0 1 −2b1 + b3
Therefore, we get a unique solution
 
7b1 + 2b2 − 4b3

→ −

x = A−1 b =  b1 + b2 − b3 
−2b1 + 0b2 + b3
and thus the coefficients of the matrix
 
7 2 −4
A−1 = 1 1 −1  .
−2 0 1
We see in this example that this method is not very simple and not very effective.
However, it is in some way essential, because it captures the real meaning of the


inverse of a matrix. When A is invertible, the equation A−→x = b is equivalent to

→ −
→ −

the equation −→x = A−1 b . That is, if −→
x is transformed into b by A, then b is
transformed into − →
x by A−1 and conversely.
Example 4.2 (Example 2.2).
 
1 2 3
A =  −1 0 −2 
0 2 1
We proceed similarly to the previous example, and:
     
1 2 3 b1 1 2 3 b1 1 2 3 b1
 −1 0 −2 b2  ∼  0 2 1 b1 + b2  ∼  0 2 1 b1 + b2 
0 2 1 b3 0 2 1 b3 0 0 0 b3 − b1 − b2
At this point, the last row has become:
(0, 0, 0, α3 )
with α3 = b3 − b1 − b2 . The system has either no solution or infinitely many
solutions: A is not invertible.
8
5. Finding the inverse by row reduction
In this section, we want to discuss a new way to compute the inverse of an n × n
matrix A. This is also probably the quickest and the most effective way to do
it. This method is heavily based on the method used in the previous section, but


instead of using the vectors b and their coordinates b1 , b2 , ..., bn , we perform the
elementary row operations directly on the matrices A and In .

Definition 5.1. An elementary matrix is a square n × n matrix obtained by


performing one elementary row operation on the identity matrix In .
There are three types of elementary matrices, corresponding to the three ele-
mentary row operations.
(1) Transpositions or exchanges are elementary matrices obtained by ex-
changing two rows of In .
(2) Dilatations are elementary matrices obtained by multiplying one row of
In by a constant k ̸= 0.
(3) Transvections are elementary matrices obtained by adding to one row of
In another row multiplied by a constant k.
Example 5.2. Here are a few examples of elementary matrices when n = 3.
     
0 1 0 1 0 0 1 0 0
 1 0 0   0 3 0   0 1 0 
0 0 1 0 0 1 0 5 1
The first one is a transposition: it is obtained from I3 by exchanging row 1 and row
2. The second one is a dilatation: row 2 of I3 is multiplied by 3. The third one is
a transvection: we added row 2 multiplied by 5 to row 3.


Proposition 5.3. Let b be an n-dimensional column vector. Performing one ele-

→ −
→ −

mentary row operation on b is equivalent to replacing b by the vector E b which


is obtained by multiplying b on the left by the corresponding n × n elementary
matrix E.


Proof. If E is a transposition that exchanges row i and row j, then row i of E b is

→ −

bj and row j of E b is bi . The other rows of b remain the same. If E is a dilatation


that multiplies row i by a constant k ̸= 0, then row i of E b is kbi . The other rows


of b remain unchanged. If E is a transvection that adds to row i another row j

→ −

multiplied by a constant k, then row i of E b is bi + kbj . The other rows of b
remain unchanged, which completes the proof. □

Because of this, we see that instead of solving the system of linear equations


A−

x = b


where the coefficients of b are considered as parameters, and perform the ele-
mentary row operations on the usual augmented matrix B, we can consider the
following ”super-augmented” matrix with n rows and 2n columns:
(A In )
9
obtained by writing the matrices A and In side by side, and perform the elementary
row operations directly on this matrix.

If A is invertible, after a finite number of operations, we will obtain the matrix


(In A−1 )
whose the last n columns are exactly the inverse A−1 . If A is not invertible, at
one point the bottom row of the first n columns will become the zero row. At that
point, we can stop the process and conclude that A is not invertible, without going
all the way to computing the reduced row-echelon form of A.
Example 5.4 (Examples 2.1 and 4.1 ).
 
1 −2 2
A =  1 −1 3 
2 −4 5
We obtain:
   
1 −2 2 1 0 0 1 −2 2 1 0 0
 1 −1 3 0 1 0 ∼ 0 1 1 −1 1 0 
2 −4 5 0 0 1 0 0 1 −2 0 1
   
1 −2 0 5 0 −2 1 0 0 7 2 −4
∼ 0 1 0 1 1 −1  ∼  0 1 0 1 1 −1 
0 0 1 −2 0 1 0 0 1 −2 0 1
and so  
7 2 −4
A−1 =  1 1 −1  .
−2 0 1
Note that in this case, we have performed 5 operations, that is we multiplied on
the left by the following elementary matrices:
     
1 0 0 1 0 0 1 0 0
E1 =  −1 1 0  E2 =  0 1 0  E3 =  0 1 −1 
0 0 1 −2 0 1 0 0 1
   
1 0 −2 1 2 0
E4 =  0 1 0  E5 =  0 1 0 
0 0 1 0 0 1
which happen to be only transvections. The reader can check that I3 is equal to
the product of matrices:
I3 = E5 E4 E3 E2 E1 A
and that
A−1 = E5 E4 E3 E2 E1 .
Notice the similarities between examples 4.1 and 5.4. Actually, what we are doing
in both cases amounts to the same thing. Only, by performing the operations on


the matrix I3 instead of performing them on the column vector b , we are avoiding
writing all over again the coefficients b1 , b2 and b3 and this makes our computations
more simple and more elegant.

In the following example, we see what happens when A is not invertible.


10
Example 5.5 (Examples 2.2 and 4.2).
 
1 2 3
A =  −1 0 −2 
0 2 1
We obtain:
     
1 2 3 1 0 0 1 2 3 1 0 0 1 2 3 1 0 0
 −1 0 −2 0 1 0  ∼  0 2 1 1 1 0 ∼ 0 2 1 1 1 0 
0 2 1 0 0 1 0 2 1 0 0 1 0 0 0 −1 −1 1
Here, the bottom row of the first three columns has become zero, so we can conclude
that A is not invertible.
This method for computing the inverse of a matrix A is very efficient when A is
a larger matrix. We give a last example with a 4 × 4 matrix.
Example 5.6.
 
1 −1 0 2
 2 1 1 1 
A=
 −1

1 3 0 
0 2 2 −1
We obtain:
   
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
 2 1 1 1 0 1 0 0   3 1 −3 −2 1 0 0 
 ∼ 0 
 −1 1 3 0 0 0 1 0   0 0 3 2 1 0 1 0 
0 2 2 −1 0 0 0 1 0 2 2 −1 0 0 0 1
   
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
 0 1 −1 −2 −2 1 0 −1   1 −1 −2 −2 1 0 −1 
∼ ∼ 0 
 0 0 3 2 1 0 1 0   0 0 3 2 1 0 1 0 
0 2 2 −1 0 0 0 1 0 0 4 3 4 −2 0 3
   
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
 0 1 −1 −2 −2 1 0 −1   1 −1 −2 −2 0 −1 
∼ ∼ 0 1 
 0 0 3 2 1 0 1 0   0 0 1 1 3 −2 −1 3 
0 0 1 1 3 −2 −1 3 0 0 3 2 1 0 1 0
   
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
 0 1 −1 −2 −2 1 0 −1   1 −1 −2 −2 0 −1 

∼ ∼ 0 1 
0 0 1 1 3 −2 −1 3   0 0 1 1 3 −2 −1 3 
0 0 0 −1 −8 6 4 −9 0 0 0 1 8 −6 −4 9
   
1 −1 0 0 −15 12 8 −18 1 −1 0 0 −15 12 8 −18
 0 1 −1 0 14 −11 −8 17   9 −7 −5 11 
∼ ∼ 0 1 0 0 
 0 0 1 0 −5 4 3 −6   0 0 1 0 −5 4 3 −6 
0 0 0 1 8 −6 −4 9 0 0 0 1 8 −6 −4 9
 
1 0 0 0 −6 5 3 −7
 0 1 0 0 9 −7 −5 11 
∼ 0 0 1 0 −5

4 3 −6 
0 0 0 1 8 −6 −4 9
11
and it follows that:  
−6 5 3 −7
 9 −7 −5 11 
A−1 =   −5
.
4 3 −6 
8 −6 −4 9
Notice that, because of the size of the matrix, this example would have been quite
defficult to solve using the previous methods.

12

You might also like