
A First Course
in Module Theory

M E Keating
Imperial College, London

Imperial College Press


Published by
Imperial College Press
203 Electrical Engineering Building
Imperial College
London SW7 2BT

Distributed by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data


Keating, M. E., 1941-
A first course in module theory / M. E. Keating.
p. cm.
Includes bibliographical references and index.
ISBN 1-86094-096-X (alk. paper)
1. Modules (Algebra)
QA247.K43 1998
512'.4--dc21 98-9963
CIP

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 1998 by Imperial College Press


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

This book is printed on acid-free paper.

Printed in Singapore.


To
Valerie
and
Christopher
Introduction
The purpose of this book is to provide an introduction to module theory
for a reader who knows something of linear algebra and elementary ring
theory.
There is a very natural theme for a first course in module theory, namely
the structure theory of modules over Euclidean domains. This theory is very
explicit, and it has interesting and surprisingly disparate interpretations.
An abelian group can be regarded as a module over the ring of integers Z, while a matrix with entries in a field F defines a module over the polynomial ring F[X]. As both Z and F[X] are examples of Euclidean domains, the general theory of modules over Euclidean domains leads to specific results about abelian groups and about matrices. In the former, we obtain a classification of finitely generated abelian groups, and, in the latter, a description of the rational canonical form and the Jordan normal form of a matrix.
Although the structure theory for modules over Euclidean domains is
the core of this text, we also consider modules over more general, even
noncommutative, rings of coefficients. This extra generality allows us to
discuss the limitations and some of the extensions of our main results.
The contents of this text are based on a final year undergraduate course
that I gave a number of times at Imperial College, London, with some
additional material. In the lecture course, I assumed that everyone was
familiar with the elementary properties of rings, ideals and Euclidean domains. Here I have provided an introduction to ring theory in the first two
chapters, so that the text is more self-contained than the lectures, and a
greater variety of rings can be used.
Chapters 3 to 7 expound the basics of module theory, including methods of comparing, constructing and decomposing modules. The results in
these chapters are rather general and do not depend much on the ring of
coefficients. Chapters 8 to 12 are the heart of this text, since it is here
that we obtain the strong results that are special to Euclidean domains.
Chapter 12 also contains two applications of the theory, to abelian groups


and to lattices.
In Chapter 13, we use the module theory to find two standard forms for
a square matrix, namely, the rational canonical form and the Jordan normal
form of a matrix. In addition to the usual version of the Jordan normal
form of a matrix over the complex numbers, we give two further versions
that apply to a matrix whose entries are taken from a field other than the
complex numbers. The second of these variations is used, without proof,
in a fundamental paper on representation theory by J A Green [Green],
and a proof has not been published before at an elementary level, as far
as I know. I am grateful to my colleague Gordon James for drawing my
attention to these versions of the Jordan normal form.
In the final chapter, we go beyond the bounds of Euclidean domains to
look at some basic results on projective modules over rings in general. Here,
we establish some basic facts about group algebras and the relationship
between module theory and the representation theory of groups.
As befits a book that is intended as a first course for undergraduates,
arguments are given in considerable detail, at least in the earlier part of the text. There is an index entry "proofs in full detail" to help the reader to locate these arguments. There are many explicit illustrations and exercises,
and some hints and partial solutions to the exercises are provided in an
appendix.
Material that was not covered in the original lecture course is indicated by a "supplementary" symbol in the margin. This material is not essential for the core results on modules over Euclidean domains, and it can be omitted if the reader wishes.
Each chapter has a section headed "Further developments" in which
the definitions and results of the chapter are placed in a wider setting.
References are provided to enable the interested reader to follow up the
topics that are introduced in these sections. I hope that these sections will
provide a useful source of projects for students who are in the final year of
a four-year undergraduate MMath or MSci course at a UK university, or
who are taking the new one-year preliminary research degree, the "MRes".
The text is divided into chapters and sections, which are numbered as
you would expect: 1.1, 1.2, and so on. For ease of reference, results are
numbered consecutively within each section, and so they appear in the form
"12.1.1 Theorem", "12.1.2 Lemma", etc.
This book was written while I was also working on the more advanced
texts [B & K: IRM] and [B & K: CM] with Jon Berrick of the National University of Singapore. My collaboration with Jon has had many influences,
both on my teaching of the lecture course and on the composition of this
text, which it is impossible to acknowledge individually.

Finally, I should thank the Mathematics Department at Imperial for allowing me the time and space in which to write textbooks. A special thanks also to my colleagues at Imperial for their assistance in mastering LaTeX.
Phillip Kent of the METRIC Project ran an introductory course which got me started, and Oliver Pretzel provided me with the LaTeX software, and kindly and patiently explained why it sometimes did not do what I hoped.
Contents

Introduction vii

1 Rings and Ideals 1


1.1 Groups 1
1.2 Rings 3
1.3 Commutative domains 4
1.4 Units 4
1.5 Fields 5
1.6 Polynomial rings 5
1.7 Ideals 7
1.8 Principal ideals 8
1.9 Sum and intersection 9
1.10 Residue rings 10
1.11 Residues of integers 12
Exercises 14

2 Euclidean Domains 17
2.1 The definition 17
2.2 The integers 18
2.3 Polynomial rings 18
2.4 The Gaussian integers 19
2.5 Units and ideals 20
2.6 Greatest common divisors 21
2.7 Euclid's algorithm 22
2.8 Factorization 23
2.9 Standard factorizations 25
2.10 Irreducible elements 26
2.11 Residue rings of Euclidean domains 28
2.12 Residue rings of polynomial rings 29
2.13 Splitting fields for polynomials 31


2.14 Further developments 31


Exercises 32

3 Modules and Submodules 35


3.1 The definition 35
3.2 Additive groups 37
3.3 Matrix actions 37
3.4 Actions of scalar matrices 39
3.5 Submodules 40
3.6 Sum and intersection 41
3.7 k-fold sums 43
3.8 Generators 43
3.9 Matrix actions again 45
3.10 Eigenspaces 46
3.11 Example: a triangular matrix action 47
3.12 Example: a rotation 48
Exercises 48

4 Homomorphisms 51
4.1 The definition 51
4.2 Sums and products 53
4.3 Multiplication homomorphisms 54
4.4 F[X]-modules in general 55
4.5 F[X]-module homomorphisms 56
4.6 The matrix interpretation 57
4.7 Example: p = 1 57
4.8 Example: a triangular action 58
4.9 Kernel and image 58
4.10 Rank & nullity 60
4.11 Some calculations 61
4.12 Isomorphisms 62
4.13 A submodule correspondence 64
Exercises 65

5 Free Modules 69
5.1 The standard free modules 70
5.2 Free modules in general 71
5.3 A running example 73
5.4 Bases and isomorphisms 74
5.5 Uniqueness of rank 76
5.6 Change of basis 77
5.7 Coordinates 79

5.8 Constructing bases 80


5.9 Matrices and homomorphisms 81
5.10 Illustration: the standard case 83
5.11 Matrices and change of basis 84
5.12 Determinants and invertible matrices 85
Exercises 88

6 Quotient Modules and Cyclic Modules 91


6.1 Quotient modules 92
6.2 The canonical homomorphism 92
6.3 Induced homomorphisms 93
6.4 Cyclic modules 94
6.5 Submodules of cyclic modules 96
6.6 The companion matrix 99
6.7 Cyclic modules over polynomial rings 100
6.8 Further developments 102
Exercises 102

7 Direct Sums of Modules 107


7.1 Internal direct sums 107
7.2 A diagrammatic interpretation 109
7.3 Indecomposable modules 111
7.4 Many components 112
7.5 Block diagonal actions 113
7.6 External direct sums 114
7.7 Switching between internal & external 115
7.8 The Chinese Remainder Theorem 116
Exercises 118

8 Torsion and the Primary Decomposition 123


8.1 Torsion elements and modules 124
8.2 Annihilators of modules 125
8.3 Primary modules 126
8.4 The p-primary component 128
8.5 Cyclic modules 130
8.6 Further developments 130
Exercises 131

9 Presentations 133
9.1 The definition 134
9.2 Relations 135
9.3 Defining a module by relations 136

9.4 The fundamental problem 136


9.5 The presentation matrix 139
9.6 The presentation homomorphism 140
9.7 F[X]-module presentations 141
9.8 Further developments 142
Exercises 143

10 Diagonalizing and Inverting Matrices 145


10.1 Elementary operations 145
10.2 The effect on defining relations 146
10.3 A matrix interpretation 149
10.4 Row & column operations in general 150
10.5 The invariant factor form 152
10.6 Equivalence of matrices 155
10.7 A computational technique 156
10.8 Invertible matrices 158
10.9 Further developments 160
Exercises 161

11 Fitting Ideals 163


11.1 The definition 163
11.2 Elementary properties 165
11.3 Uniqueness of invariant factors 166
11.4 The characteristic polynomial 167
11.5 Further developments 168
Exercises 169

12 The Decomposition of Modules 171


12.1 Submodules of free modules 171
12.2 Invariant factor presentations 174
12.3 The invariant factor decomposition 176
12.4 Some illustrations 178
12.5 The primary decomposition 179
12.6 The illustrations, again 181
12.7 Reconstructing the invariant factors 182
12.8 The uniqueness results 183
12.9 A summary 185
12.10 Abelian groups 186
12.11 Lattices 187
12.12 Further developments 190
Exercises 190

13 Normal Forms for Matrices 193


13.1 F[X]-modules and similarity 194
13.2 The minimum polynomial 195
13.3 The rational canonical form 197
13.4 The Jordan normal form: split case 200
13.5 A comparison of computations 202
13.6 The Jordan normal form: nonsplit case 203
13.7 The Jordan normal form: separable case 206
13.8 Nilpotent matrices 209
13.9 Roots of unity 209
13.10 Further developments 211
Exercises 211

14 Projective Modules 215


14.1 The definition 215
14.2 Split homomorphisms 216
14.3 Semisimple rings 220
14.4 Representations of groups 222
14.5 Hereditary rings 224
Exercises 225

Hints and Solutions 229

Bibliography 243

Index 245
Chapter 1

Rings and Ideals

Each module has a ring of scalars associated with it. Therefore, we must
discuss rings before we can begin to tackle modules. In this chapter, we
collect together the basic definitions and properties of rings that we will
need in subsequent chapters. We consider ideals, which tell us about the
internal structure of a ring, and we look at some special types of ring,
particularly fields and polynomial rings. We also give the construction of
residue rings, which is an important method for obtaining new rings from
old.
A reader who has already met rings, ideals and Euclidean domains may
prefer to go directly to the start of our discussion of module theory in
Chapter 3, using this chapter and the next for reference.
We precede the definition of a ring with the definition of a more fundamental structure, namely, a group.

1.1 Groups
We encounter two notations for groups, additive and multiplicative, which
are used according to the context in which the group arises. First, we
introduce the notation that is most often met in ring theory and module
theory.
An additive group is a (nonempty) set A together with a law of composition +, called addition, which behaves as you might expect. Thus for any a, b ∈ A, there is an element a + b ∈ A which is called the sum of a and b, and the following axioms must be satisfied.


A1: Associativity.

(a + b) + c = a + (b + c) for all a, b and c ∈ A.

A2: Commutativity.

a + b = b + a for all a, b ∈ A.

A3: Zero.
There is a zero element 0 in A with a + 0 = a for all a in A.
A4: Negatives.
Each element a of A has a negative −a so that a + (−a) = 0.
It is usual to omit the bracket around a negative and write a + (−b) = a − b and (−a) + b = −a + b.
When studying groups in their own right, the law of composition is
usually written in multiplicative notation instead of additive notation. In
the multiplicative notation, a group is a (nonempty) set G in which any
elements g, h in G have a product g · h, often written simply as gh, and the
axioms are as follows.
G1: Associativity.

(f · g) · h = f · (g · h) for all f, g and h ∈ G.

G2: Identity.
There is an identity element 1 in G with g · 1 = g = 1 · g for all g in
G.
G3: Inverses.
Each element g of G has an inverse g⁻¹ so that

g · g⁻¹ = 1 = g⁻¹ · g.

A multiplicative group is abelian or commutative if

g · h = h · g for all g, h ∈ G.

Notice that our list of axioms for a multiplicative group is not simply
a translation of the list for an additive group. The difference is found in
axiom A2, which demands that an additive group must be abelian.
As groups are not our main concern in this text, we will not pause here
to give examples of them. We will meet additive groups in profusion in our
study of rings and modules, while multiplicative groups will be invoked to
construct some examples of rings in Chapter 14.

1.2 Rings
Informally, a ring is a set R in which arithmetic can be performed, that
is, the members of R can be added and multiplied, in much the same way
as integers or real numbers. However, there are two common properties of
multiplication that do not hold in a general ring.
The first of these properties is that a nonzero element of a ring need not
have a multiplicative inverse in that ring. For example, the integer 2 has
no inverse within the ring of integers Z, although it does have an inverse in
the rational numbers Q.
The second property is that multiplication need not be commutative,
that is, we can have rs ≠ sr for two elements r and s of a ring.
Now for our formal definition of a ring. A ring is a nonempty set R,
on which there are two laws of composition, addition and multiplication.
Addition is indicated by the symbol +, so that for each pair r and s of
members of R there is a sum r + s in R. Multiplication is usually indicated
by simply writing the elements next to each other: for each pair r,s € R,
there is a product rs in R. When it is more convenient to have a symbol
for multiplication, we use a dot "•", so the product appears as r ■ s. We use
the dot while writing out the axioms.
Under addition, R must be an additive group as in the preceding section.
The properties of multiplication, and the interaction between addition and
multiplication, are given by the following axioms.
RM 1: Associativity.

(r · s) · t = r · (s · t) for all r, s and t ∈ R.

RM 2: Identity.
There is an identity element 1 in R with

r · 1 = r = 1 · r for all r in R.

RM 3: Distributivity.
For all r, s and t in R,

(r + s) · t = r · t + s · t and r · (s + t) = r · s + r · t.

We allow the possibility that 0 = 1 in a ring. In that event,

r = r · 1 = r · 0 = 0 for every r ∈ R,
so R must be the trivial or zero ring 0 that has only one element. Many
statements about rings or modules have trivial exceptional cases when they

are interpreted for the zero ring; as a rule, we will not state these exceptions
separately.
Familiar examples of rings are the ring of integers Z, the ring of rational
numbers Q, the ring of real numbers R and the ring of complex numbers
C. We shall assume the basic properties of these rings when we need to, as
any attempt to establish them in full detail would take us too far from the
point of this text. The more leisurely introduction to ring theory given by
Allenby [Allenby] does cover these topics.
Given a ring R, the set M_n(R) of n × n matrices with entries in R is
also a ring under the usual addition and multiplication of matrices. The
verification of this assertion is a worthy but lengthy exercise.

1.3 Commutative domains


We now introduce two properties which will be satisfied by many of the
rings that we consider in this text.
C: A ring R is commutative if

rs = sr for all r, s ∈ R.

D: A ring R is a domain (alternative names are an integral domain or


an entire ring) if R is not the zero ring, and whenever rs = 0 for r, s ∈ R,
then either r = 0 or s = 0 already.
Familiar examples of domains are the rings Z, Q and R.
The standard examples of rings which fail to be commutative or to be
domains arise as matrix rings. Take R to be any (nontrivial) ring, Z for
instance, and let M2{R) be the set of all 2 x 2 matrices over R. Then
M2(R) is neither commutative nor a domain, for if we take r —
o o
and s = ( ), then rs =£ 0 but sr — 0.

1.4 Units
Let R be a ring. An element u of R is said to be a unit (or invertible) in R
if the following condition holds.
U: There is an element w in R so that

uw = 1 and wu = 1.

Such an element w is unique, and it is called the inverse of u. It is usually


written u⁻¹ or 1/u.

The inverse u⁻¹ of a unit is itself a unit, with inverse u, and the product of two units u, v is again a unit, with inverse v⁻¹u⁻¹. This means that the
set U(R) of units in R is a multiplicative group as in section 1.1.
Thus U(Z) = {+1, −1}, while U(Q) is the set of all nonzero rational
numbers. Notice that an element that is not a unit in one ring may become
a unit in a bigger ring.
An important type of ring is defined by the requirement that the zero
element is the only non-unit.

1.5 Fields
F. A field is a nonzero commutative ring F in which every nonzero
element is a unit of F.
The rational numbers Q, the real numbers R and the complex numbers
C are all examples of fields.
It is a fact that any commutative domain R is contained in a field Q
whose elements are of the form r · s⁻¹ = r/s for r, s ∈ R, s ≠ 0. The field
Q is called the field of fractions or quotient field of R. For instance, Q is
the field of fractions of Z.
As the existence of the field of fractions is extremely believable in all
the concrete examples that we consider, we will take it for granted. The
technical details of the construction are given in several texts - for instance,
section 3.10 of [Allenby] or [B & K: IRM] (1.1.12).
A discussion of the existence and construction of rings of fractions for
more general types of ring can be found in [Rowen] and [B & K: CM],
among other texts.

1.6 Polynomial rings


We now introduce rings of polynomials, which play an important role in
this text. For the moment, let F be any ring - later F will usually be a
field. A polynomial with coefficients in F, or "over F", is an expression

f = f(X) = f_0 + f_1 X + f_2 X^2 + ··· + f_m X^m

with f_0, f_1, ..., f_m ∈ F, where X is an "indeterminate" or "variable". The


polynomials f and

g = g_0 + g_1 X + g_2 X^2 + ··· + g_n X^n

in F[X], with, say, m ≤ n, are equal if f_i = g_i for i = 0, ..., m and
g_{m+1} = ··· = g_n = 0.

We prefer the notation f to f(X) unless there is a special reason to


mention the variable X.
If f_i = 0 for all i, then f is the zero polynomial 0. If f_i = 0 for i ≥ 1, then f is a constant polynomial.
When f is not the zero polynomial, the degree of f is the largest index m with f_m ≠ 0, and we write deg(f) = m. Then f_m is the leading term of f. For convenience, the zero polynomial is allocated the degree −∞.
We write F[X] for the set of polynomials over F. Addition and multiplication of polynomials are defined by the standard rules: given

f = f_0 + f_1 X + f_2 X^2 + ··· + f_m X^m

and
g = g_0 + g_1 X + g_2 X^2 + ··· + g_n X^n
in F[X], with m ≤ n, their sum f + g is given by

f + g = (f_0 + g_0) + (f_1 + g_1)X + ··· + (f_m + g_m)X^m + g_{m+1}X^{m+1} + ··· + g_n X^n

and their product fg = f(X)g(X) is the polynomial

h_0 + h_1 X + ··· + h_k X^k + ··· + h_{m+n} X^{m+n}

where

h_k = f_0 g_k + ··· + f_i g_{k−i} + ··· + f_k g_0 for k = 0, ..., m + n.

In particular, h_0 = f_0 g_0, h_1 = f_0 g_1 + f_1 g_0, and h_{m+n} = f_m g_n.


The verification that F[X] is actually a ring is a matter of careful calculation. The zero element of F[X] is the zero polynomial, while the identity element is the constant polynomial with f_0 = 1.
Note: our definition of a polynomial is rather informal; for instance we have
not defined the "variable" X. A more thorough construction of polynomials
is given in section 1.6 of [Allenby].
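The multiplication rule is easy to express in code. Here is a minimal sketch of ours (not from the text) in Python, representing a polynomial by its coefficient list [f_0, f_1, ..., f_m] and computing the coefficients h_k = f_0 g_k + ··· + f_k g_0 directly; exact rational coefficients stand in for the field F.

    from fractions import Fraction

    def poly_mul(f, g):
        # Product of polynomials given as coefficient lists [f_0, ..., f_m];
        # the empty list represents the zero polynomial.
        if not f or not g:
            return []
        h = [Fraction(0)] * (len(f) + len(g) - 1)
        for i, fi in enumerate(f):
            for j, gj in enumerate(g):
                h[i + j] += fi * gj   # contributes f_i g_{k-i} with k = i + j
        return h

    # (1 + X)(1 - X) = 1 - X^2:
    print(poly_mul([Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]))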
Our first listed result is elementary but crucial.

1.6.1 Lemma
(i) If a ring F is commutative, so also is the polynomial ring F[X].
(ii) If a ring F is a domain, so also is the polynomial ring F[X].

Proof
Let
f(X) = f_0 + f_1 X + f_2 X^2 + ··· + f_m X^m

and
g(X) = g_0 + g_1 X + g_2 X^2 + ··· + g_n X^n
be in F[X], so that f_0, f_1, ..., f_m and g_0, g_1, ..., g_n are in F; then their product fg is the polynomial
h_0 + h_1 X + ··· + h_k X^k + ··· + h_{m+n} X^{m+n}
with
h_k = f_0 g_k + ··· + f_i g_{k−i} + ··· + f_k g_0.
When F is commutative, the k-th coefficient of fg is the same as the k-th coefficient of gf for k = 0, ..., m + n, since both are the sums of all possible terms f_i g_{k−i}, i = 0, ..., k, but in different orders. Thus fg = gf for all polynomials f, g, which gives the first assertion.
To prove the second, suppose that f and g are both nonzero. We may as
well assume that the polynomials have been written so that the coefficients
f_m and g_n are both nonzero. Then the product h_{m+n} = f_m g_n is also
nonzero, so fg is nonzero. □

1.7 Ideals
Ideals are crucial to the investigation of modules. At an elementary level,
they provide the first examples of modules, and, at a much deeper level, a
knowledge of all the ideals of a ring sometimes enables us to describe the
modules over the ring. Ideals come in three types.
A left ideal of a ring R is a subset I of R with the following properties.
Id1: Zero. The zero element 0 of R is in I.
Id2: Additive closure. If x, y ∈ I, then x + y ∈ I.
IdL3: Multiplicative closure. If r ∈ R and x ∈ I, then rx ∈ I.
A right ideal I satisfies axioms Id1 and Id2, but instead of IdL3 we have
IdR3: If r ∈ R and x ∈ I, then xr ∈ I.
If I is simultaneously both a left and right ideal of R, we say that I is
a two-sided ideal.
If R is a commutative ring, then rx = xr for any x and r in R, so that
conditions IdL3 and IdR3 are the same. Thus every ideal of R is both left
and right and therefore two-sided. In this case, we refer simply to an ideal
of R. We will soon see an example in which left and right ideals differ from
one another.
For any ring R, R itself is a two-sided ideal of R. A (left, right or two-sided) ideal I of R is called proper if I ≠ R.
The subset {0} of R is a two-sided ideal, called, naturally enough, the
zero ideal of R. We usually write 0 for the zero ideal.
Next, we see how ideals of a ring arise from the elements of the ring.

1.8 Principal ideals


For any fixed element a of a ring R, let Ra = {ra | r ∈ R}. Then Ra is a
left ideal of R, called the principal left ideal generated by a. The element a
is called a generator of Ra.
The principal right ideal generated by a is aR = {ar | r ∈ R}.
When R is commutative, Ra = aR is the (two-sided) principal ideal
generated by a.
The concept of a principal ideal is one of the fundamental notions in this
text, since we will usually impose conditions on the ring R which guarantee
that all its ideals are principal. To make sure that we get off on the right
foot, we give the almost trivial verification that Ra is actually a left ideal.
Id1: 0 is in Ra since 0 = 0a.
Id2: Suppose that ra and sa are in Ra, where r, s are in R. Then

ra + sa = (r + s)a ∈ Ra.

Id3: Suppose that r ∈ R and sa ∈ Ra, where s ∈ R. Then r(sa) = (rs)a is also in Ra.
Examples.
1. The principal (left or right) ideal generated by the zero element 0 of R is always the zero ideal, since r0 = 0 = 0r for every r in R.
2. The principal (left or right) ideal generated by the identity element 1 of R is always R itself, since r · 1 = r = 1 · r for every r in R.
3. In the ring of integers Z, there are ideals

2Z, 3Z, 4Z,....

By Lemma 1.8.1, these ideals are all distinct, and, in the next chapter, we
see that they are the only proper nonzero ideals of Z (Theorem 2.5.3).
Note: our usual rule is that we write a two-sided principal ideal as a left
ideal Ra. However, it looks unnatural to write Z2 in place of 2Z, etc.
4. For an example in which left and right ideals differ, we take R to be the ring M_2(R) of 2 × 2 matrices over the field R of real numbers. Let e11 = [ 1 0 ; 0 0 ]. Then
Re11 = { [ a11 0 ; a21 0 ] | a11, a21 ∈ R }
and
e11R = { [ a11 a12 ; 0 0 ] | a11, a12 ∈ R }.

In the case of greatest interest to us in this text, it is straightforward to


determine whether or not two elements of a ring generate the same principal
ideal.

1.8.1 Lemma
Let R be a commutative domain, and let a and b be nonzero elements
of R. Then Ra = Rb if and only if a = ub where u is a unit of R.
In particular, Ra = R if and only if a is a unit of R.

Proof Suppose that Ra = Rb. Then a = 1 · a is in Rb, so a = ub for some u in R. Similarly, b = wa for some w. Then a = uwa, and so a(1 − uw) = 0. As R is a domain and a ≠ 0, we have 1 = uw, which shows that u is a unit with inverse w.
Conversely, suppose that a = ub with u a unit. Then ra = (ru)b is in Rb for all r in R, and so Ra ⊆ Rb. But b = u⁻¹a, so the reverse inclusion also holds.
The final assertion is obvious. □

1.9 Sum and intersection


Next we introduce some useful operations on ideals. Suppose that I and
J are both left ideals, or both right ideals, or both two-sided ideals of R.
Their sum I + J is defined as

I + J = {x + y | x ∈ I, y ∈ J},
while their intersection I ∩ J is the usual intersection of sets:
I ∩ J = {x | x ∈ I and x ∈ J}.

1.9.1 Lemma
If I and J are both left ideals, right ideals or two-sided ideals of a ring
R, then I + J and I ∩ J are also correspondingly left, right or two-sided
ideals of R.

Proof
Suppose that I and J are both left ideals. First, we have 0 ∈ I and 0 ∈ J, giving 0 + 0 = 0 ∈ I + J.
Next, let x + y and x′ + y′ be members of I + J, with x, x′ ∈ I and y, y′ ∈ J. Then
(x + y) + (x′ + y′) = (x + x′) + (y + y′),

with x + x′ ∈ I and y + y′ ∈ J, so I + J is closed under addition.
For any r in R, we have rx ∈ I and ry ∈ J. Thus r(x + y) = rx + ry also belongs to I + J, which verifies the final condition.
The verification for the intersection is even easier and we leave it to the reader, along with the remaining cases. □

1.10 Residue rings


The construction of a residue ring is a very useful method of obtaining a
new ring with interesting properties. We will use this technique to construct
some finite fields (Lemma 1.11.2), and to extend a field to a larger field
which contains the roots of a given polynomial (Proposition 2.13.1).
Let R be a ring and let I be a two-sided ideal of R. Informally, the idea behind the construction of the ring R/I of residues of R modulo I is that two elements r, s of R which differ by an element x in I should give the same element of R/I. Thus, if r = s + x in R, then r̄ = s̄ in R/I.
The point of such a construction can be illustrated by the special case
where we take the ideal 2Z in the ring of integers Z. Then two integers r, s differ by an element 2a of 2Z precisely when 2 | r − s, that is, either r and s are both odd or r and s are both even. Thus we expect there to be two members of the residue ring Z/2Z, namely 0̄ and 1̄, corresponding in turn to the set of even integers and the set of odd integers.
This illustration also explains the use of the term 'residue ring': when
an integer is divided by 2, there are two possible 'residues' or 'remainders',
0 or 1. (However, some authors prefer the terms factor ring or quotient
ring to residue ring.)
Now we turn to the formal construction of the residue ring. Given a
two-sided ideal I of a ring R, we define a relationship between the elements of R by
r ≡ s mod I ⟺ r − s ∈ I.
The above expression is read "r is congruent to s modulo I", and the relationship is called congruence modulo I.
When I = Ra is a principal ideal, we usually write r ≡ s mod a rather than r ≡ s mod Ra. Thus for integers r, s,
r ≡ s mod 2 ⟺ r − s ∈ 2Z ⟺ 2 | r − s.
We record the basic property of congruence.

1.10.1 Lemma
Let I be a two-sided ideal of a ring R. Then congruence modulo I is an equivalence relation on R.

Proof We need to check the three properties which define an equivalence relation, namely, that the given relation is reflexive, symmetric and transitive.
For the first, we need to verify that r ≡ r always, which is obvious.
The requirement for symmetry is that if r ≡ s, then s ≡ r. But r − s ∈ I implies s − r ∈ I. For transitivity, we assume that r ≡ s and s ≡ t for three elements r, s, t ∈ R, and we have to show that r ≡ t. But
r − t = (r − s) + (s − t) ∈ I. □
Given an element r of R, we define the residue class of r mod I to be
r̄ = {s ∈ R | s ≡ r mod I}.

Since congruence is an equivalence relation on R, the residue class of r is its equivalence class under this equivalence relation. Thus, by the general properties of equivalence relations, the ring R is partitioned into disjoint residue classes, that is, for r, s ∈ R, either r̄ = s̄ or r̄ ∩ s̄ = ∅.
The residue ring R/I is defined to be the set of all residue classes r̄ mod I of elements of R. Thus, as promised, Z/2Z = {0̄, 1̄}.
To make R/I into a ring, we must define addition and multiplication of residue classes. For r̄, s̄ ∈ R/I, we put r̄ + s̄ equal to the residue class of r + s, and r̄ · s̄ equal to the residue class of rs.
The hardest point in the verification that these laws of composition make
R/I into a ring is to check that they are well-defined. This problem arises
since one element of R/I can be expressed as the residue class of many
different elements from R, and we need to know that the sum and product
in R/I are not affected by such variations. We record the statement and
proof as follows.

1.10.2 Proposition
Let I be a two-sided ideal of a ring R. Then
(i) addition and multiplication in R/I are well-defined;
(ii) R/I is a ring with zero element 0̄ and identity element 1̄;
(iii) if R is commutative, then R/I is also a commutative ring.
Remark: Exercise 1.8 gives a nontrivial example in which R/I is commutative although R is not.

Proof
(i) Suppose that r̄ = r̄_1 and s̄ = s̄_1, so that r = r_1 + x and s = s_1 + y for elements x, y ∈ I.
Then r + s = (r_1 + s_1) + (x + y) with x + y ∈ I, and hence r + s and r_1 + s_1 have the same residue class, that is, the two alternative methods for computing the sum r̄ + s̄ give the same result.
Also, rs = r_1 s_1 + (r_1 y + x s_1 + xy) with r_1 y + x s_1 + xy ∈ I, so the two computations of r̄ · s̄ have the same outcome.
(ii) This is a matter of checking that the addition and multiplication in R/I
satisfy the axioms as given in section 1.2, granted that the axioms hold in
R already. We give two sample checks to show how easy it is.
For any r ∈ R, the sum r̄ + 0̄ is the residue class of r + 0 = r, that is, r̄ + 0̄ = r̄, which shows that 0̄ is indeed the zero element of R/I.


For any r,s,t 6 R,

(r + s) -t = r + s -t
= (r + s)t
= rt + st
= ri + li
= r -t + s -t
which establishes that the distributive law holds.
(iii) Suppose R is commutative. Then, for any elements r̄, s̄ of R/I, the product r̄ · s̄ is the residue class of rs = sr, and so r̄ · s̄ = s̄ · r̄. □

1.11 Residues of integers


Our first explicit examples of residue rings arise from the various ideals of
the ring of integers Z. The calculations depend on some well-known facts
about factorization and division of integers which we take as granted in this
chapter. In the next chapter, these facts will be established as special cases
of results that hold for Euclidean domains in general.
Let m be a positive integer and let I = mZ be the principal ideal generated by m. We introduce the special notation Z_m for the ring Z/mZ, since these rings appear frequently in this text.

To describe Z_m, we use the fact that, for any integer s, there are integers q and r with s = qm + r and 0 ≤ r ≤ m − 1.
Now, for integers r, s,
s̄ = r̄ in Z_m ⟺ s ≡ r mod m ⟺ s − r ∈ mZ ⟺ m | s − r.
Thus the set of residue classes mod m is
Z_m = {0̄, 1̄, ..., (m − 1)‾},
and these are all distinct since m cannot divide r − s if 0 ≤ r, s ≤ m − 1 unless r = s.
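For readers who like to compute, here is a minimal sketch of ours (not part of the text) of the ring Z_m in Python, with each residue class stored via its representative in {0, ..., m − 1}; the reduction r % m is exactly what makes the sum and product well-defined.

    class Zmod:
        # The residue class of r in Z_m, represented by r mod m.
        def __init__(self, r, m):
            self.m = m
            self.r = r % m
        def __add__(self, other):      # the class of r + s
            return Zmod(self.r + other.r, self.m)
        def __mul__(self, other):      # the class of rs
            return Zmod(self.r * other.r, self.m)
        def __repr__(self):
            return f"{self.r} mod {self.m}"

    print(Zmod(5, 7) + Zmod(4, 7))     # 2 mod 7
    print(Zmod(2, 4) * Zmod(2, 4))     # 0 mod 4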
Note that the ring Z_m need not be a domain even though Z is a domain - for example 2̄ · 2̄ = 0̄ in Z_4, although 2̄ ≠ 0̄. However, in the important case when p is a prime number, the ring Z_p is a field, which we deduce from a more general result that is our next theorem.
Before we can state this theorem, we need a preliminary definition. A
two-sided ideal I of a ring R is maximal if I is a proper ideal of R, and there is no two-sided ideal J of R with I ⊂ J ⊂ R. Here, we use the symbol "⊂" to indicate strict containment, that is, I ≠ J and J ≠ R. Maximal left ideals and maximal right ideals are defined in the obvious way.
Granted the unique factorization of integers, and that every ideal in Z
is principal, the maximal ideals in Z have the form pZ where p is prime.

1.11.1 Theorem
Let I be an ideal of a commutative ring R. Then the following statements are equivalent.
(i) I is a maximal ideal of R.
(ii) The residue ring R/I is a field.

Proof
(i) ⇒ (ii): By Proposition 1.10.2, R/I is a commutative ring, so we only need to find an inverse for a typical nonzero element r̄ ∈ R/I. By Lemma 1.9.1, Rr + I is an ideal of R, and, as I is maximal, either Rr + I = I or Rr + I = R.
But if Rr + I = I, then r = 1 · r + 0 ∈ I and so r̄ = 0̄, contradicting our assumption that r̄ is nonzero.
Thus Rr + I = R, and so there is an element s of R and an element x of I with sr + x = 1. This equality gives s̄ · r̄ = 1̄ in R/I, that is, r̄ has an inverse.
(ii) ⇒ (i): Suppose that J is an ideal of R with I ⊂ J (so I ≠ J). Then there is some element r ∈ J with r ∉ I. Since r̄ ≠ 0̄ in R/I, s̄ · r̄ = 1̄ for some s ∈ R.

But then 1 = sr + x in R for some element x of I, and so 1 ∈ J, which implies that J = R. Thus I is a maximal ideal of R, as desired. □

1.11.2 Corollary
Let p be a prime number. Then Z_p is a field. □

Exercises
1.1 A nonzero element r of a ring R is called a proper zero divisor if rs = 0 for some nonzero element s ∈ R. Show that a proper zero divisor cannot be a unit of R.
1.2 Show that a commutative ring R is a field if and only if 0 is the only
proper ideal of R.
1.3 Let R be a commutative ring. An ideal I of R is said to be prime if the following holds.
P: If r, s ∈ R and rs ∈ I, then either r ∈ I or s ∈ I (possibly both are in I).
Show that
(i) 0 is a prime ideal of R ⟺ R is a domain.
(ii) I is prime ⟺ R/I is a domain.
(iii) A maximal ideal of R must be a prime ideal.
1.4 Let m ≥ 2 be an integer. Show that Z_m is a domain ⟺ m is a prime number.
Show by direct calculation that U(Z_6) = {1, 5} and that the set of proper zero divisors in Z_6 is {2, 3, 4}.
Find the corresponding results for Z_9 and Z_10.
1.5 Let A be a commutative domain and let R = A[X] be the polynomial
ring over A. Prove the following assertions.
(a) f = f(X) = f_0 + f_1 X + ··· + f_n X^n is a unit of R ⟺ n = 0 and f_0 is a unit of A.
(b) X divides a product fg in R ⟺ X divides either f or g.
(c) X = fg with f, g ∈ R ⟺ either f or g is a unit of R.
1.6 Let F be a field and let R = F[X, Y] (= F[X][Y]) be the polynomial ring in two variables over F. Show the following.
(a) R is a domain.
(b) RX is a prime ideal of R.
(c) RX + RY is a maximal ideal of R but RX + RY is not principal.
1.7 Let F be a field and let D be the set of all diagonal matrices [ r 0 ; 0 s ] with r, s ∈ F. Verify that D is a commutative ring under the usual sum and product of matrices.

Show also that the only proper nonzero ideals of D are De and Df, where e = [ 1 0 ; 0 0 ] and f = [ 0 0 ; 0 1 ], and that De + Df = D and De ∩ Df = 0.
Show that there is a bijective map θ : D/De → F which sends the residue class of [ r 0 ; 0 s ] to s, and that
θ(d̄ + d̄′) = θ(d̄) + θ(d̄′) and θ(d̄ · d̄′) = θ(d̄) · θ(d̄′)
for all d, d′ in D. Thus we can identify the ring D/De with F.
Remark: in other words, θ is an isomorphism of rings - see Exercise
4.4.
1.8 Let F be a field and let T be the set of all upper triangular matrices [ r t ; 0 s ] with r, s, t in F. Verify that T is a ring under the usual sum and product of matrices. Show that T is not commutative.
Let
H = { [ r t ; 0 0 ] | r, t ∈ F },
I = { [ 0 t ; 0 s ] | t, s ∈ F },
J = { [ 0 t ; 0 0 ] | t ∈ F }.
Show that H, I and J are all two-sided ideals in T. Using the methods of the preceding exercise, show that T/H = T/I = F, while T/J = D. (Thus, as promised earlier, a noncommutative ring can have a commutative residue ring.)
1.9 Let F be a field and let R be the ring of all 2 × 2 matrices over F.
Show that R has no two-sided ideals except 0 and R. (See Exercise
7.8.)
Chapter 2

Euclidean Domains

We now introduce the type of ring which is of greatest interest to us in this


text, namely, a Euclidean domain. Such a ring shares with the integers Z
the property that long division is possible, that is, given elements a, b of
the ring, then there are elements q, r of the ring with a = qb + r where the
remainder r is 'smaller' than b in some sense. Consequently, many results
that hold for the ring of integers can be extended to Euclidean domains in
general.
Apart from the integers themselves, most of the Euclidean domains that we encounter in this text are polynomial rings of the form F[X] for a field F. We briefly consider the Gaussian integers Z[i], and some further examples
are mentioned in the exercises.
This chapter also contains a detailed analysis of the residue rings of
polynomial rings. This analysis is used immediately to give an algebraic
method for constructing roots of polynomials, and later, in Chapter 13, to
find normal forms for matrices.

2.1 The definition


A Euclidean domain is a commutative domain R together with a function φ : R → Z that has the following properties.

ED 1: φ(a) ≥ 0 for all a ∈ R, and φ(a) = 0 ⟺ a = 0.

ED 2: φ(ab) = φ(a)φ(b) for all a, b ∈ R.
ED 3: Let a, b be nonzero elements of R. Then there are elements q, r with

a = qb + r where φ(r) < φ(b).


The element q is called the quotient and r the remainder. Either of them
may be 0, and they may not be uniquely determined by the pair a, b.
Axiom ED 3 is called the division algorithm, or, less formally, long
division.
Any field F is a trivial Euclidean domain in which φ(x) = 1 for any
nonzero element x of F. We next consider some nontrivial examples.

2.2 The integers


The ring Z of integers, with φ(a) = |a|, is the original example of a Euclidean domain.
We shall simply assume that the ring Z has all the properties that we
have listed, since an attempt to derive them would take us too far away
from the direction of this text. Further details can be found in Chapter 1
of [Allenby].
Note that we used the division algorithm for Z when we found the
residue rings of the integers in section 1.11:

Z_m = {0̄, 1̄, ..., (m − 1)‾}.

Note also that axiom ED 3 does not define the residue uniquely - we are permitted to write 7 = 3 · 2 + 1 or 7 = 4 · 2 + (−1) when we divide 7 by 2, for instance. However, custom dictates that we choose the smallest non-negative residue.
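In Python, for instance, the built-in divmod makes exactly this customary choice when the divisor is positive (an illustration of ours, not the book's):

    q, r = divmod(7, 2)
    print(q, r)    # 3 1, that is, 7 = 3*2 + 1 with 0 <= r < 2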

2.3 Polynomial rings


Next we show that for any field F, the polynomial ring F[X] is a Euclidean domain. As there are infinitely many distinct fields (Q, R, C, Z_p for p a prime, ...) we obtain an infinite supply of Euclidean domains.
Since a field is a commutative domain, so is the polynomial ring F[X] (Lemma 1.6.1). The function φ is defined in terms of the degree of a polynomial.
Let f = f_0 + f_1 X + f_2 X^2 + ··· + f_m X^m with f_0, f_1, ..., f_m ∈ F be a polynomial in F[X]. If f ≠ 0, we assume that f has been written with highest coefficient f_m ≠ 0, so that m is the degree deg(f) of f.
Then the function φ is given by

φ(f) = 2^deg(f) for f ≠ 0, and φ(0) = 0.

Axiom ED 1 holds by definition, and axiom ED 2 is clearly satisfied. The verification of axiom ED 3 requires a bit of work.
Suppose that f = f_0 + ··· + f_m X^m and g = g_0 + ··· + g_n X^n are nonzero polynomials in F[X], of degrees m and n respectively. We have to show that f = qg + r where q and r belong to F[X] and either deg(r) < deg(g) or r = 0. We use induction on deg(f).
If m < n then we can take q = 0 and r = f. This remark also covers the initial case in the induction, when m = 0, for if m = n = 0 then both f and g are units in F[X] and the assertion is trivial.
Suppose now that m ≥ n, and put f̃ = f − f_m g_n⁻¹ X^{m−n} g. By construction, the coefficient of X^m in f̃ is 0, so deg(f̃) < m. By induction hypothesis, f̃ = qg + r with deg(r) < deg(g) (or r = 0), and a rearrangement gives the desired form for f.
Unlike the situation for Z, the quotient and remainder are unique in polynomial rings. To see this, suppose that f = qg + r and f = q_1 g + r_1 with deg(r) < deg(g) and deg(r_1) < deg(g). Then (q − q_1)g = r_1 − r. But r_1 − r cannot be a nonzero multiple of g, since deg(r_1 − r) < deg(g), so r_1 − r = 0 and q − q_1 = 0.
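The inductive step of the proof is already an algorithm: repeatedly cancel the leading term. The following Python sketch (ours, with exact rationals standing in for a field F) returns the unique quotient and remainder:

    from fractions import Fraction

    def poly_divmod(f, g):
        # f, g are coefficient lists [f_0, ..., f_m] with no trailing zeros;
        # g must be nonzero. Returns (q, r) with f = q*g + r, deg r < deg g.
        f = list(f)
        q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
        n = len(g) - 1
        while len(f) >= len(g):
            m = len(f) - 1
            c = f[m] / g[n]              # the factor f_m g_n^{-1}
            q[m - n] = c
            for i in range(len(g)):      # subtract c X^{m-n} g from f
                f[m - n + i] -= c * g[i]
            while f and f[-1] == 0:      # the X^m term has been cancelled
                f.pop()
        while q and q[-1] == 0:
            q.pop()
        return q, f

    # divide X^2 - 1 by X - 1: quotient X + 1, remainder 0
    print(poly_divmod([Fraction(-1), Fraction(0), Fraction(1)],
                      [Fraction(-1), Fraction(1)]))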

2.4 The Gaussian integers


A Gaussian integer is a complex number of the form a + bi where a, b are ordinary integers. The set of all Gaussian integers is written Z[i]. If c + di is also a Gaussian integer, the sum and product,
(a + bi) + (c + di) = (a + c) + (b + d)i
and
(a + bi) · (c + di) = (ac − bd) + (ad + bc)i
respectively, are again Gaussian integers.
The fact that Z[i] is a ring can now be confirmed, in one of two ways,
according to the reader's inclination. The hard but secure route is to check
all the axioms, one by one. A simpler method, which tends to arouse
suspicion at first sight, is to observe that the set C of complex numbers is a
ring (which we take for granted), and that properties such as associativity,
distributivity, ..., are therefore inherited by Z[i]. Similarly, there are two
ways to verify that Z[i] is a commutative domain.
For each Gaussian integer a + bi, define

φ(a + bi) = (a + bi)(a − bi) = a² + b².

A direct check shows that axioms ED 1 and ED 2 hold in Z[i].



To establish ED 3, suppose that a + bi and c + di are nonzero Gaussian integers, and put x + yi = (a + bi)(c + di)⁻¹ ∈ C. Choose integers u and v with |x − u| ≤ 1/2 and |y − v| ≤ 1/2, and write s = (x − u) + (y − v)i. Extending the definition of φ to C in the obvious way, we see that

φ(s) ≤ 1/4 + 1/4 < 1.

Put r = s(c + di) and q = u + vi. By construction, r and q are Gaussian integers, and we have

a + bi = q(c + di) + r and φ(r) < φ(c + di).
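This recipe translates directly into exact integer arithmetic. In the Python sketch below (our illustration, not the book's), a + bi is stored as the pair (a, b), and the quotient is found by rounding the exact rational quotient to a nearest Gaussian integer:

    def gauss_divmod(a, b, c, d):
        # Divide a+bi by the nonzero c+di: returns (u, v, e, f) with
        # a+bi = (u+vi)(c+di) + (e+fi) and phi(e+fi) < phi(c+di).
        n = c * c + d * d                    # phi(c + di)
        p, q = a * c + b * d, b * c - a * d  # (a+bi)(c-di) = p + qi
        u = (2 * p + n) // (2 * n)           # nearest integer to x = p/n
        v = (2 * q + n) // (2 * n)           # nearest integer to y = q/n
        e = a - (u * c - v * d)              # real part of the remainder
        f = b - (u * d + v * c)              # imaginary part of the remainder
        return u, v, e, f

    print(gauss_divmod(7, 2, 2, -1))         # (2, 2, 1, 0): 7+2i = (2+2i)(2-i) + 1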

2.5 Units and ideals


We now return to the investigation of the properties of a general Euclidean
domain R. First, we note that the units of R are characterized easily in
terms of the function φ.

2.5.1 Lemma
Let R be a Euclidean domain and let u be in R. Then u is a unit of R
if and only if φ(u) = 1.

Proof Consider the identity element 1 of R. Since 1² = 1, axiom ED 2 shows that (φ(1))² = φ(1) in Z and so that φ(1) = 1 or 0. But a Euclidean domain is, by definition, a nonzero ring, so 1 ≠ 0 and φ(1) = 1.
Now suppose that u is a unit. Then uv = 1 for some v ∈ R, and

φ(u)φ(v) = 1 in Z.

As the value of φ(u) is positive, this gives φ(u) = 1.
Conversely, suppose φ(u) = 1. By axiom ED 3, 1 = qu + r for q, r ∈ R with φ(r) < φ(u). This forces the equalities φ(r) = 0 and r = 0, and so u has an inverse. □

2.5.2 Corollary
(i) In the Gaussian integers Z[i], the units are {±1, ±i}.
(ii) In a polynomial ring F[X], F a field, the unit polynomials are the nonzero constant polynomials f = f_0. □
Next we establish the key result about the ideals of a Euclidean domain.
Recall that an ideal I of R is principal if I = Ra = {ra | r ∈ R} for some element a of R.

2.5.3 Theorem
Let R be a Euclidean domain. Then every ideal of R is principal.

Proof It is obvious that the zero ideal 0 is principal, its unique generator being the zero element 0. Now suppose that I ≠ 0, and choose an element a of I so that
0 < φ(a) ≤ φ(x) for all nonzero x ∈ I.
Clearly, Ra ⊆ I. To prove equality, take any x ∈ I and write x = qa + r with φ(r) < φ(a). Since r = x − qa, r belongs to I. But the inequality for φ(a) now forces the equality φ(r) = 0 and so r = 0. □

2.6 Greatest common divisors


Next we show that any two elements of a Euclidean domain have a greatest
common divisor. First, we must define this term, and we choose the defi­
nition which is most useful for our applications, although it is perhaps not
the most transparent extension of the usual definition in Z.
Let a and b be elements of a Euclidean domain R (one or both of a, b may be 0). Then a greatest common divisor (or GCD) of the pair a, b is any element d of R which satisfies the following requirements.

GCD 1: d | a and d | b,
GCD 2: d = sa + tb for some elements s,t of R.

To see that this definition is reasonable, notice that if x ∈ R divides both a and b, then, by axiom GCD 2, x divides d also and so φ(x) | φ(d) in Z. Thus the value of φ(d) is maximal for common divisors of a and b.
Next, we show that greatest common divisors exist. Recall from section 1.9 that any two ideals I, J have a sum I + J which is also an ideal.

2.6.1 Lemma
Let R be a Euclidean domain and let a, b ∈ R. Then d ∈ R is a greatest common divisor of a and b ⟺ Rd = Ra + Rb.
In particular, a, b always have a greatest common divisor, which can be written d = sa + tb for some s, t ∈ R.

Proof ⇒: By GCD 2, d ∈ Ra + Rb and so Rd ⊆ Ra + Rb. For the reverse inclusion, suppose that r ∈ Ra + Rb, with r = xa + yb for some x, y ∈ R. By GCD 1, a = a′d and b = b′d for a′, b′ ∈ R, and so r = (xa′ + yb′)d belongs to Rd.

⇐: Obviously, GCD 2 is satisfied. For GCD 1, note that a = 1·a + 0·b ∈ Ra + Rb = Rd, hence a = a′d for some a′, and likewise for b.
By Theorem 2.5.3, the ideal Ra + Rb must be principal, so a, b have a greatest common divisor d, which can be written as claimed. □

Choosing a GCD. In a general Euclidean domain, there are many different greatest common divisors of a pair a, b, since if d is one GCD, then so is ud for any unit u of R. Thus, in Z, −2 and 2 are both greatest common divisors of the pair 6, 8, while in a polynomial ring F[X] over a field F, a greatest common divisor can be multiplied by any nonzero constant.
In practice, we will make standard choices for greatest common divisors in the integers and in polynomial rings F[X]. In Z we will always choose the unique positive greatest common divisor of a nonzero pair of integers.
In a polynomial ring F[X], any nonzero polynomial f = f_0 + ··· + f_m X^m with f_m ≠ 0 can be written f = f_m g with g a unique monic polynomial, that is, g = g_0 + g_1 X + ··· + g_{m−1} X^{m−1} + X^m. We always choose the greatest common divisor of a nonzero pair of polynomials to be monic.
The notation (a, b) will be used for the standard choice of the greatest
common divisor of a and b in the ring of integers or a polynomial ring, and
for some arbitrary choice of a greatest common divisor in other rings, such
as the Gaussian integers, where there is no evident "standard" choice.

2.7 Euclid's algorithm


At this point, we should explain the origin of the term Euclidean domain.
In Book VII, Propositions 1 and 2, of his Elements [Euclid], Euclid gives an
algorithmic procedure for the computation of the greatest common divisor
of a pair of integers. His method uses long division with successively reducing remainders. The properties of Z that Euclid requires are just those listed in Axioms ED 1-3, and so the algorithm can be implemented in any
ring which satisfies these conditions.
Here is the method. Suppose that we are given elements a, b of a Euclidean domain R. If one (or both) is 0, say b = 0, then the greatest common divisor (a, b) is trivially a. If both are nonzero and b | a, then the computation is again trivial: (a, b) = b.
In the remaining case, we have a, b ≠ 0 and a = qb + r with r ≠ 0 and φ(r) < φ(b). Since a ∈ Rb + Rr and r ∈ Ra + Rb, we have Ra + Rb = Rb + Rr and so (a, b) = (b, r).
If r divides b, we are done. If not, we write b = q_1 r + r_1 with φ(r_1) < φ(r) and note that (b, r) = (r, r_1). We can evidently repeat the argument, and after a finite number of steps we must find a remainder r_k with r_k | r_{k−1}.

Then
r_k = (r_{k−1}, r_k) = ··· = (a, b).
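In code, the method is a short loop. The Python sketch below (ours, not the book's) runs Euclid's algorithm in Z and, by carrying two extra sequences along, also produces the elements s, t with d = sa + tb promised by Lemma 2.6.1:

    def extended_gcd(a, b):
        # Returns (d, s, t) with d = gcd(a, b) and d = s*a + t*b.
        r0, r1 = a, b
        s0, s1 = 1, 0
        t0, t1 = 0, 1
        while r1 != 0:
            q = r0 // r1                 # long division: r0 = q*r1 + r
            r0, r1 = r1, r0 - q * r1     # replace (a, b) by (b, r)
            s0, s1 = s1, s0 - q * s1     # keeps the invariant r_i = s_i*a + t_i*b
            t0, t1 = t1, t0 - q * t1
        return r0, s0, t0

    d, s, t = extended_gcd(6, 8)
    print(d, s, t, s * 6 + t * 8)        # 2 -1 1 2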

2.8 Factorization
Our aim now is to show that, in an arbitrary Euclidean domain, there is a
unique factorization theorem analogous to that for the integers. We must
first extend some definitions from Z to Euclidean domains.
An element p of a commutative domain R is said to be irreducible if p
is neither zero nor a unit, and whenever p = ab for a, b in R, then either
a or b is a unit. When we are working in Z, we defer to tradition and say
'prime number' in preference to 'irreducible number'.
A nonzero, nonunit element which is not irreducible is called reducible.
If p is irreducible, so also is up for any unit u of R. Two elements x, y of
R are said to be associates if x = uy for some unit u of R, and such elements
are not regarded as being genuinely distinct when computing factorizations.
Thus, when the irreducible elements p, q are said to be distinct irreducible
elements, it is to be understood that they are not associates, rather than
simply being unequal.
In Z, prime numbers appear in pairs of associates {2, −2}, {3, −3}, ..., and custom dictates that we always choose the positive prime. In a polynomial ring F[X] over a field F, irreducibles occur in families of associates {up | u ∈ F, u ≠ 0}, since the unit polynomials are the nonzero constants.
Here, the custom is to choose the unique monic polynomial in each family
of associates.
Two elements a, b of R are said to be coprime if their greatest common
divisor (a, b) is 1 (or a unit, which can always be replaced by 1). The
following sequence of results is the key to unique factorization, and also to
some other important results that we will meet later.

2.8.1 Proposition
Let R be a Euclidean domain and suppose that a, b ∈ R. Then a, b are coprime ⟺ 1 = sa + tb for some s, t ∈ R.

Proof Suppose a, b are coprime. By Lemma 2.6.1, R = R·1 = Ra + Rb, giving 1 = sa + tb for some s, t ∈ R. Conversely, if 1 can be written in this form, any common divisor x of a, b must divide 1, and so x must be a unit. □


2.8.2 Proposition
Let R be a Euclidean domain and suppose that a, b ∈ R are coprime. If a | bc with c ∈ R, then a | c.

Proof By the previous result, we can write 1 = sa + tb for some s, t ∈ R. Since bc = xa for some x ∈ R, c = sac + tbc = (sc + tx)a. □

2.8.3 Corollary
Let R be a Euclidean domain and suppose that a, b ∈ R are coprime. Then Ra ∩ Rb = Rab. □

2.8.4 Corollary
Suppose that p ∈ R is irreducible and that p | bc where b and c are in R. Then either p | b or p | c.

Proof Since (p, b) is a divisor of p, either (p, b) = up for some unit u or (p, b) = 1. In the first case, p | b, and in the second, p | c by Proposition 2.8.2 above. □
We come to the main result on factorization.

2.8.5 The Unique Factorization Theorem


Suppose that a is a nonzero element of a Euclidean domain R. Then there are irreducible elements p_1, ..., p_s of R and a unit u of R with

a = u p_1 ··· p_s.

Furthermore, if a = w q_1 ··· q_t, with q_1, ..., q_t irreducible and w a unit, then s = t and there is a permutation π of {1, ..., s} so that p_i and q_{π(i)} are associates for i = 1, ..., s.

Proof The existence of a factorization is proved by induction on φ(a). The initial case is that φ(a) = 1. Then a is a unit (Lemma 2.5.1), and a has a trivial "factorization" with s = 0.
Suppose that φ(a) > 1. If a is already irreducible, it is its own one-term factorization. If a is reducible, a = bc with neither b nor c a unit. Then 1 < φ(b) < φ(a) and 1 < φ(c) < φ(a), so, by induction hypothesis, both b and c already have factorizations into irreducible elements, which can be multiplied together to give a factorization of a.
The uniqueness is established by induction on the number s of irreducible factors. Suppose first that s = 1, so that a is irreducible. By Corollary 2.8.4, a divides q_j for some index j. But then a and q_j are associates and φ(a) = φ(q_j), which means that t = 1 also.

In the case s > 1, we have p_1 | w q_1 ··· q_t. By the above argument, p_1 and q_j are associates for some index j, and, since R is a domain, we have

u p_2 ··· p_s = v q_1 ··· q_{j−1} q_{j+1} ··· q_t

where v is a unit of R.
By induction hypothesis, s − 1 = t − 1 and we can pair off the sets

{p_2, ..., p_s}

and

{q_1, ..., q_{j−1}, q_{j+1}, ..., q_t}

as required. □

2.9 Standard factorizations


In the irreducible factorization a = u p_1 ··· p_k that we obtained in the preceding theorem, it is possible that two or more of the irreducible factors are associates of one another. In applications, it is often more convenient to ensure that a given irreducible element can appear in one form only. We do this by selecting a single member from each set {up | u a unit} of associated irreducible elements of R. The resulting irreducible elements are called the standard irreducible elements of R. By construction, no two standard irreducible elements are associates. As in section 2.8, in the integers Z we always take the positive primes as the standard primes, and in a polynomial ring F[X] we choose the monic irreducible polynomials.
We can now rewrite the irreducible factorization of an element a of R so that each irreducible term is a standard irreducible and the occurrences of each standard irreducible are grouped together. Thus the factorization takes the form

a = u p_1^{n(1)} ··· p_k^{n(k)},

where u is a unit and p_1, ..., p_k are distinct (that is, non-associated) irreducibles.
Such a factorization is called a standard factorization of a. The uniqueness part of the theorem above tells us that the set of irreducibles p_1, ..., p_k is uniquely determined by a (apart from the order in which it is written), and that the exponents n(1), ..., n(k) are uniquely determined by a once an order of listing for irreducibles is fixed.
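For the integers, a standard factorization is easy to compute by trial division. The sketch below (our code, not the book's) returns the unit u ∈ {1, −1} together with the list of pairs (p_i, n(i)) for the standard (positive) primes in increasing order:

    def standard_factorization(a):
        # Standard factorization of a nonzero integer:
        # a = u * p1^n(1) * ... * pk^n(k).
        u, a = (1, a) if a > 0 else (-1, -a)
        factors, p = [], 2
        while p * p <= a:
            if a % p == 0:
                n = 0
                while a % p == 0:
                    a //= p
                    n += 1
                factors.append((p, n))
            p += 1
        if a > 1:                        # whatever is left is itself prime
            factors.append((a, 1))
        return u, factors

    print(standard_factorization(-360))  # (-1, [(2, 3), (3, 2), (5, 1)])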

2.10 Irreducible elements


As the Unique Factorization Theorem will be one of our main tools in
the description of modules, it will be both interesting and useful to know
something about the irreducible elements in the various Euclidean domains
that we encounter.

Prime numbers
We are obliged to assume that we know a prime number when we see
it. Although we know that there are infinitely many prime numbers (see
Exercise 2.1), we cannot list any infinite subset of them, and the problem
of determining very large primes is an active area of research.

Irreducible Gaussian integers


As noted in Corollary 2.5.2, there are four unit Gaussian integers, ±1, ±i,
which means that each irreducible Gaussian integer comes in four associated
disguises. From the definition of φ (section 2.4), any Gaussian integer z is
a factor of φ(z). It follows that every irreducible Gaussian integer can be
found as a factor, in Z[i], of a prime number p ∈ Z. The factorizations of
small primes can be found by direct calculation.
Thus 2 = (1 + i)(1 − i), but 1 − i = −i(1 + i), so 2 = −i(1 + i)^2 is
effectively a square in Z[i].
If 3 has a proper factor z = a + bi in Z[i], then φ(z) = a^2 + b^2 = 3 in Z,
which is impossible. Thus 3 is irreducible in Z[i].
Obviously, 5 = (2 + i)(2 − i), and both factors must be irreducible in Z[i]
since φ(2 ± i) = 5. They are also distinct, that is, not associated, because
the ratio (2 + i)/(2 − i) is not a Gaussian integer.
Similar calculations show that 7, 11, … are irreducible in Z[i], while
13, 17, … each split into two irreducible factors. The general result is that
an odd prime number p remains irreducible in Z[i] if p ≡ 3 mod 4, and
splits into distinct irreducible factors if p ≡ 1 mod 4 - see [H & W], §15.1.
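
These calculations are easy to mechanize: an odd prime p splits in Z[i]
exactly when φ(a + bi) = a^2 + b^2 = p has a solution in integers. A small
Python search of ours (not from the book), consistent with the quoted
result from [H & W]:

    def split_in_gaussian_integers(p):
        # look for a, b with a^2 + b^2 = p; then p = (a + bi)(a - bi) in Z[i]
        for a in range(1, int(p**0.5) + 1):
            b2 = p - a * a
            b = round(b2**0.5)
            if b * b == b2:
                return (a, b)
        return None                          # no solution: p stays irreducible

    for p in [2, 3, 5, 7, 11, 13, 17]:
        print(p, split_in_gaussian_integers(p))
    # 2 -> (1, 1); 5 -> (1, 2); 13 -> (2, 3); 17 -> (1, 4); 3, 7, 11 -> None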

Irreducible complex polynomials


The Fundamental Theorem of Algebra assures us that any polynomial with
complex coefficients has a complex root. This means that the only irre-
ducible members of the polynomial ring C[X] are the linear polynomials
X − λ for λ ∈ C - see Exercise 2.2. The Fundamental Theorem cannot be
established by purely algebraic methods since analytic properties of func-
tions must be used at some point. A proof can be found in §7.4 of [Cohn 2].

Irreducible real polynomials


There are two types of irreducible polynomial over the field of real num-
bers. These are the linear polynomials X − λ, λ ∈ ℝ, and the quadratic
polynomials

X^2 + bX + c with b, c ∈ ℝ and b^2 − 4c < 0.

Such a quadratic polynomial has a pair of complex conjugate roots λ, λ̄ in
C with λ = (−b + √(b^2 − 4c))/2.
To see this, take an irreducible polynomial f in ℝ[X], of degree at least
2. Over C,

f = (X − λ_1) ··· (X − λ_n).

Since the coefficients of f are real, they are unchanged by complex conju-
gation, and so f must also have the factorization

f = (X − λ̄_1) ··· (X − λ̄_n).

By the uniqueness of the factorization, λ̄_1 = λ_h for some h.
We can't have h = 1, since then λ_1 would be real and X − λ_1 a factor
of f, contrary to the irreducibility of f; so h > 1 and we may as well
renumber the indices so that h = 2. But then (X − λ_1)(X − λ_2) has real
coefficients and it is a factor of f. Thus

f = (X − λ_1)(X − λ_2).

Writing b = −(λ_1 + λ_2) and c = λ_1λ_2 gives the required form for f.

Irreducible rational polynomials


There is no general description of the irreducible polynomials in Q[X]. (If
there were, algebraic number theory would be an easier subject!) We quote
two useful results for handling rational polynomials; the first is proved in
several of our references: [Allenby], [Cohn 1] and [Marcus], and the second
is a not-too-difficult consequence - see Exercise 2.3.
Gauss' Lemma. Suppose that f is a monic polynomial with integer
coefficients and that f = gh with g, h ∈ Q[X] both monic. Then g and h
also have integer coefficients.
Eisenstein's Criterion. Let f = X^n + f_{n-1}X^{n-1} + ··· + f_1X + f_0 be
a monic polynomial with integer coefficients, and suppose that there is a
prime number p such that p | f_i for i = 0, …, n − 1 but p^2 does not divide
f_0. Then f is irreducible over Q.
These results remain true if Q is replaced by the field of fractions Q of
a Euclidean domain R and p is taken to be an irreducible element of R.
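
Eisenstein's Criterion is mechanical to check. A short Python sketch of
ours (not the book's), with the polynomial stored as its coefficient list
[f_0, f_1, …, f_{n-1}, 1]:

    def eisenstein(coeffs, p):
        # True if p divides f_0, ..., f_{n-1} but p^2 does not divide f_0
        assert coeffs[-1] == 1, "f must be monic"
        return (all(c % p == 0 for c in coeffs[:-1])
                and coeffs[0] % (p * p) != 0)

    # X^4 - 10X + 5 is irreducible over Q, by Eisenstein at p = 5:
    print(eisenstein([5, -10, 0, 0, 1], 5))  # True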

Irreducible polynomials over finite fields


Although there are irreducible polynomials of every degree over a finite field
([Allenby] p. 163), the argument that predicts their existence is indirect
and does not tell us what they look like. For small fields, it is possible
to list the irreducible polynomials of a given small degree by enumerating
those that are reducible.
For example, take F = Z_2 = {0, 1} - here, it is convenient to omit
the "bars" that indicate we are working with residue classes. There are
two linear monic polynomials in Z_2[X], namely X and X + 1, and four
quadratic monic polynomials, X^2, X^2 + 1, X^2 + X and X^2 + X + 1. The
first two are squares (note that 2 = 0 in Z_2) and the third is a product.
Thus X^2 + X + 1 is, by elimination, the only possible irreducible polynomial
of degree 2 over Z_2. It actually is irreducible, since neither element of Z_2
is a root.
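
This enumeration is easily mechanized. The following Python sketch of ours
(not from the text) finds the monic irreducibles of a given degree over Z_q,
for small q, by striking out every product of two monic polynomials of
lower degree:

    from itertools import product

    def monic_polys(deg, q):
        # coefficient tuples (f_0, ..., f_{deg-1}, 1), constant term first
        return [c + (1,) for c in product(range(q), repeat=deg)]

    def poly_mul(f, g, q):
        h = [0] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                h[i + j] = (h[i + j] + a * b) % q
        return tuple(h)

    def irreducibles(deg, q=2):
        reducible = set()
        for d in range(1, deg // 2 + 1):     # factor degrees d and deg - d
            for f in monic_polys(d, q):
                for g in monic_polys(deg - d, q):
                    reducible.add(poly_mul(f, g, q))
        return [f for f in monic_polys(deg, q) if f not in reducible]

    print(irreducibles(2))                   # [(1, 1, 1)], i.e. X^2 + X + 1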

2.11 Residue rings of Euclidean domains


The fact that every ideal of a Euclidean domain is principal, together
with the Unique Factorization Theorem, leads to a good description of
the residue rings of a Euclidean domain. We will postpone the general
treatment of this topic until we discuss cyclic modules in Chapter 6, since
the discussion can be simplified when we have some machinery from mod­
ule theory at our disposal. In the remainder of this chapter, we will show
how fields arise as residue rings of Euclidean domains and we will give an
explicit description of the residue rings of polynomial rings. Combining
these, we obtain an algebraic construction for roots of polynomials.
First, we show how fields arise.

2.11.1 Proposition
Let R be a Euclidean domain, I an ideal of R. Then the residue ring
R/I is a field ⟺ I = Rp with p irreducible.

Proof
By Theorem 1.11.1, R/I is a field if and only if I is a maximal ideal of
R. But I = Ra for some element a of R, and it is clear that Ra is maximal
precisely when a is irreducible. □

2.12 Residue rings of polynomial rings


Let F[X] be the polynomial ring over a field F, and let

f = f_0 + f_1X + ··· + f_{n-1}X^{n-1} + X^n,  n = deg(f) ≥ 1,

be a monic polynomial in F[X]. Our aim is to give an explicit description
of the residue ring F[X]/F[X]f that we need in subsequent applications.
We show that each element of F[X]/F[X]f can be written as a polynomial

g_0 + g_1ε + g_2ε^2 + ··· + g_{n-1}ε^{n-1}

where ε is a root of f. The addition and multiplication of such polynomials
follows the expected rules, with the relation f(ε) = 0 being used to eliminate
the powers ε^n, ε^{n+1}, … from products.
Before we get into the technicalities, here is a familiar example. Take F
to be the field ℝ of real numbers, and let f = X^2 + 1. Then the elements
of ℝ[X]/ℝ[X]f have the form g_0 + g_1ε, where ε^2 = −1. Apart from nota-
tion, this is the classical definition of the complex numbers as pairs of real
numbers.
Now we start our formal analysis of the residue ring F[X]/F[X]f. First
we note that we have not lost any generality by imposing the requirement
that f is monic, since the ideal F[X]f is unchanged if we replace f by uf
for any nonzero constant u (see Lemma 1.8.1 and Corollary 2.5.2).
There is no loss either in requiring that deg(f) ≥ 1, since when f is
a nonzero constant polynomial, F[X]f = F[X] and so F[X]/F[X]f is the
trivial ring.
Now let g ∈ F[X] be arbitrary. As we saw in section 2.3, we can write

g = qf + r

with the remainder r having degree less than n. Since ḡ = r̄ in F[X]/F[X]f,
the elements of F[X]/F[X]f can be taken to have the form ḡ with g ∈ F[X],
deg(g) < n.
If h is also a polynomial with deg(h) < n and ḡ = h̄, then g − h ∈ F[X]f,
so f divides g − h. But deg(g − h) < n, so the division is possible only
if g − h = 0. It follows that each element of F[X]/F[X]f can be written
uniquely in the form ḡ with deg(g) < n.
Suppose then that deg(g) < n and write

g = g_0 + g_1X + g_2X^2 + ··· + g_{n-1}X^{n-1},  g_0, …, g_{n-1} ∈ F,

where any (or indeed all) of the coefficients g_i can be 0 - the notation is
not to be interpreted to imply that deg(g) is n − 1. Then, in F[X]/F[X]f,

ḡ = ḡ_0 + ḡ_1 X̄ + ḡ_2 X̄^2 + ··· + ḡ_{n-1} X̄^{n-1}.  (2.1)

To make this expression tidier, we notice that for scalars k, k' ∈ F, we
have

k̄ = k̄' in F[X]/F[X]f ⟺ k = k' in F

and furthermore

k̄ + k̄' = (k + k')¯ and k̄ · k̄' = (k · k')¯.

We can therefore regard F as being contained in F[X]/F[X]f by agreeing
to identify k with k̄ for each element k ∈ F.
For k ∈ F and ḡ ∈ F[X]/F[X]f, we interpret k · ḡ as (kg)¯, so that
F[X]/F[X]f becomes a vector space over F.
Put ε = X̄. We can now rewrite Equation 2.1 in the friendlier form

ḡ = g_0 + g_1ε + g_2ε^2 + ··· + g_{n-1}ε^{n-1}.  (2.2)

The scalars g_0, g_1, …, g_{n-1} are uniquely determined by the element ḡ of
F[X]/F[X]f, since otherwise we could find another polynomial h of degree
at most n − 1 with h̄ = ḡ, contrary to the uniqueness of g. It follows that
the set

{1, ε, ε^2, …, ε^{n-1}},  n = deg(f),

is a basis of F[X]/F[X]f as a vector space over F. This basis is called the
canonical basis of F[X]/F[X]f.
It is clear that the addition of elements of F[X]/F[X]f follows the rule
for addition of polynomials. By construction,

f(ε) = f̄ = 0̄,

so that the multiplication in F[X]/F[X]f follows from polynomial multi-
plication, together with the rule

ε^n = −f_0 − f_1ε − f_2ε^2 − ··· − f_{n-1}ε^{n-1}  (2.3)

where

f = f_0 + f_1X + ··· + f_{n-1}X^{n-1} + X^n,  n = deg(f).
Note: the "root" ε that we have constructed for the polynomial f is an
algebraic entity. Even if f has real coefficients, it may not be possible to
identify ε as a real or complex number. For example, if f = X^2, then
ε^2 = 0 but ε ≠ 0. This observation illustrates the fact that the residue ring
F[X]/F[X]f will not be a field if f is not irreducible.
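
The reduction that rule (2.3) describes is just division with remainder, and
it is easy to program. A minimal Python sketch of ours (not part of the
text), storing a residue as its coefficient list with constant term first, so
that ε is the list [0, 1]:

    from fractions import Fraction

    def reduce_mod(g, f):
        # reduce g modulo the monic f = [f_0, ..., f_{n-1}, 1] of degree n
        g, n = list(g), len(f) - 1
        while len(g) > n:
            top = g.pop()                    # leading coefficient of g
            for i in range(n):               # subtract top * X^(deg g - n) * f
                g[len(g) - n + i] -= top * f[i]
        return g

    def mult_mod(g, h, f):
        # multiply two residues of degree < n, then reduce
        prod = [0] * (len(g) + len(h) - 1)
        for i, a in enumerate(g):
            for j, b in enumerate(h):
                prod[i + j] += a * b
        return reduce_mod(prod, f)

    # f = X^2 + 1 over Q, so epsilon^2 = -1 and residues behave like
    # complex numbers: (1 + 2*epsilon)(3 + 4*epsilon) = -5 + 10*epsilon
    f = [Fraction(1), Fraction(0), Fraction(1)]
    print(mult_mod([Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)], f))
    # [Fraction(-5, 1), Fraction(10, 1)]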

2.13 Splitting fields for polynomials


Given a polynomial f with coefficients in a field F, it is possible to construct
a bigger field E in which f(X) has "all its roots". More precisely, f is said
to split in E if, in E[X], every irreducible factor p of f is linear, that is,
p = X − λ for some λ in E. The field E is then called a splitting field for
f. Each linear factor of f corresponds to a root λ of f - see Exercise 2.2 -
so a splitting field of f is indeed a field in which f has all its roots.

2.13.1 Proposition
Let F be a field and let f be a polynomial with coefficients in F. Then
there is a field E containing F in which f is split.
Proof Write the irreducible factorization of f(X) in F[X] as

f(X) = (X − λ_1) ··· (X − λ_k) p_1(X) ··· p_s(X)

where we have gathered all the linear factors of f(X) at the start. We allow
the possibilities k = 0 or s = 0, and the λ_i need not be distinct.
Let n = deg(f) and put d = d(f, F) = n − k. We induce on d. The
initial case is that d = 0. Then k = n, so we must have s = 0, that is, F is
already a splitting field for f.
Suppose that d > 0, so that s ≥ 1. Let F' = F[X]/F[X]p_1 be the ring
constructed in section 2.12. Then F' contains F. Further, by Proposition
2.11.1, F' is a field, and Equation 2.2 shows that F' contains a root of p_1.
Thus the factorization of f over F' has more than k linear terms, so that
d(f, F') < d(f, F).
By our induction hypothesis, there is a field E containing F' in which
f splits. □
Remarks If F is contained in the field C of complex numbers, the Fundamen­
tal Theorem of Algebra shows that C is a splitting field for any polynomial
over F. However, the algebraic construction for the splitting field works for
any field F.
A stronger result can be proved, that a polynomial has an essentially
unique smallest splitting field ([Allenby], p. 149), but we do not need this
for our purposes.

2.14 Further developments


• Some authors replace axiom ED 2 by a weaker statement:

φ(ab) ≥ φ(a) for all a, b ∈ R.

This weaker axiom has the advantage that it allows more rings to be
considered to be Euclidean domains than does ED 2, while causing only
a small increase in the complexity of proofs. However, all the Euclidean
domains that we meet in this text satisfy the stronger axiom.
• The technique used to show that the Gaussian integers are a Euclidean
domain can be extended to some similar types of ring - see Exercise 2.8
below and [Allenby] §3.7 for some of the easier cases. Comprehensive dis­
cussion of the limits of the technique are given in Chapter 14 of [H & W]
and in [E, L & S].
• A commutative domain in which every ideal is principal is called a prin­
cipal ideal domain. By Theorem 2.5.3, a Euclidean domain is a principal
ideal domain. The Unique Factorization Theorem holds over a principal
ideal domain, as do most of the results that we subsequently obtain for
modules over Euclidean domains, but the arguments are more technical.
As genuine (that is, non-Euclidean) principal ideal domains are rarely en­
countered in undergraduate mathematics, and not too often at any level,
this text keeps to Euclidean domains. The proof of unique factorization
in a principal ideal domain can be found in section 2.15 of [Jacobson] or
section 10.5 of [Cohn 1], among others.
The most accessible non-Euclidean principal ideal domain is the ring
Z[(1 + √−19)/2]. The fact that this ring is not Euclidean is proved in
the sources mentioned in the previous paragraph, while some algebraic
number theory is needed to show that it is nevertheless a principal ideal
domain. [Marcus] gives a nice account of the calculations needed; Exer-
cise 9 of Chapter 5 is particularly relevant.
• The results of this chapter (and book) can be extended to noncommu-
tative versions of Euclidean domains ([B & K: IRM], Chapter 3) and
to noncommutative principal ideal domains ([Cohn: FRTR], Chapter 8;
[Rowen]).

Exercises
2.1 Suppose that p_1, …, p_k are distinct prime numbers. Show that the
product

p_1 ··· p_k + 1

has a prime factor q with q ≠ p_i for any i. Deduce that there are
infinitely many prime numbers.
2.2 Let f(X) and X − λ be polynomials over a field F. Show that f(X) =
q(X)(X − λ) + r where r ∈ F. Deduce that λ is a root of f(X) if
and only if X − λ | f(X). Prove further that f(X) can have at most
deg(f) distinct roots in F.

2.3 Let p ∈ Z be prime. Arguing directly from Gauss's Lemma, show that
the polynomials X^n − p are all irreducible for n ≥ 2.
Generalize your argument to a proof of Eisenstein's Criterion.
2.4 Let R be a Euclidean domain and let a in R be neither a unit nor
irreducible. Show that the ring R/Ra contains a nontrivial divisor of
0.
2.5 Let f̃(Y) = f(Y + 1) be the polynomial obtained by the change of
variable X = Y + 1 from f(X) ∈ F[X], F a field. Show that (fg)~ = f̃g̃.
Deduce that f(X) is irreducible if and only if f̃(Y) is irreducible.
Let p ∈ Z be a prime number. Prove that the polynomial X^{p-1} +
··· + X + 1 = (X^p − 1)/(X − 1) is irreducible in Q[X].
2.6 Show that the polynomial ring F[X, Y] in two variables over a field is
not a Euclidean domain.
2.7 Let R = Z[√−5]. Show that 3, 2 + √−5 and 2 − √−5 are all irreducible
elements of R and that no two of them are associates. Verify that

3^2 = (2 + √−5)(2 − √−5).

This means that unique factorization does not hold in R and hence
that R cannot be a Euclidean domain. Confirm this by showing that
the ideal 3R + (2 + √−5)R is not principal.
2.8 Some more Euclidean domains. Here are some rings which can be
shown to be Euclidean domains using mild variations of the technique
for the Gaussian integers.
i: Z[√−2], φ(a + b√−2) = a^2 + 2b^2.
ii: Z[ω], where ω = (−1 + √−3)/2 is a cube root of 1, and

φ(a + bω) = a^2 − ab + b^2.

iii: Z[√2], φ(a + b√2) = |a^2 − 2b^2|, the absolute value.

In Z[√2], verify that 1 + √2 is a unit of infinite order.
2.9 Using the method of (2.10), draw up a list of "standard" irreducibles
in the Gaussian integers Z[i] which contains all the distinct irreducible
factors of the (ordinary) primes 2,3,5,7. Hence give a standard fac­
torization of 4200 in Z[i].
Remark It is not too hard to compute the irreducible factorizations
of small prime integers p = 2, 3, 5, … in the rings Z[√−2] and Z[ω].
The same computations in Z[√2] are already quite tough with the
elementary methods at our disposal, as is the proof that

U(Z[√2]) = {±(1 + √2)^t | t ∈ Z}.

Chapter XV of [H & W] contains a good account of these topics.


Chapter 3

Modules and Submodules

Now that we have finished our introductory survey of rings, we can intro­
duce the main objects of our enquiries in this text, namely, modules. This
chapter is concerned with the definitions and general properties of mod­
ules and their submodules, together with concrete examples arising from
additive groups and from matrices acting on vector spaces. The latter type
of module will prove to be crucial to our investigation of normal forms for
matrices in Chapter 13.
Although we are mainly interested in modules over commutative do­
mains, we allow the ring of scalars to be arbitrary in our basic definitions.

3.1 The definition


Let R be a ring. A left R-module M is given by two sets of data.
First, M is to be an additive group. Thus for m, n ∈ M, there is a sum
m + n ∈ M, and the addition satisfies the requirements listed in section
1.1.
Second, the elements of the ring R must act by left multiplication on
the members of M, so that for m ∈ M and r ∈ R, there is an element
rm ∈ M. This action is called scalar multiplication, and it must satisfy the
following axioms.

SML 1: (rs)m = r(sm) for all m ∈ M and r, s ∈ R.

SML 2: r(m + n) = rm + rn and (r + s)m = rm + sm for all m, n ∈ M
and all r, s ∈ R.

SML 3: 1m = m for all m in M, where 1 is the identity element in R.


The last axiom can be stated as "M is a unital module". Sometimes it is
convenient to allow non-unital modules, but we shall not do so in this text.
The ring R is called the ring of scalars for M.
If the elements of R act on the right of M, then we obtain a right module,
and the rules for scalar multiplication are changed accordingly:
SMR 1: m(rs) = (mr)s for all m ∈ M and r, s ∈ R.
SMR 2: (m + n)r = mr + nr and m(r + s) = mr + ms for all m, n ∈ M
and all r, s ∈ R.
SMR 3: m1 = m for all m in M, where 1 is the identity element in R.
Extremely important examples of R-modules arise from the ring R itself,
which can be regarded either as a left module or as a right module. By
the definition of a ring, R is an additive group. To obtain a left scalar
multiplication, we simply view the multiplication in R in a new way: a
product rs, r, s ∈ R, is interpreted as the result of r acting on s. The
axioms for scalar multiplication hold since they are just re-interpretations
of the axioms for ring multiplication listed in section 1.2.
On the other hand, we can interpret the product rs as the result of s
acting on r and so turn R into a right R-module.
When R is viewed as a left or right R-module in this way, it is often
called the (left or right) regular R-module.
Equally ubiquitous are the zero modules. For any ring R, the set {0} is
both a left and a right R-module, with r0 = 0 = 0r always. We denote any
zero module by 0 - thus we use the same notation for a zero module as we
do for a zero element, but this should not cause any confusion in practice.
Suppose that the ring of scalars R is commutative. Then we can convert
any right module into a left module by the rule
rm = mr for all r ∈ R, m ∈ M,
and likewise, any left module is equally a right module. Thus, in the case
of main interest in this text, we need not distinguish between left and
right modules. We will use the left-handed notation for a module over a
commutative ring.
In general, a careful distinction must be made between left and right
modules, as we can see from Exercise 3.10 below, and from Chapter 14.
The modules over one special kind of ring are familiar from elementary
linear algebra. When F is a field, an F-module is the same thing as a
vector space over F . Since a field can be regarded as a trivial Euclidean
domain (2.1), the theory of modules over Euclidean domains includes the
theory of vector spaces. However, we need to assume a prior knowledge of
vector space theory so that we can introduce some of our basic examples
of modules.

3.2 Additive groups


Next, we show how an additive group can be viewed as a Z-module. Since a
multiplicative abelian group can be re-written as an additive group (section
12.10), this observation will allow us to obtain results about abelian groups
from our general theory of modules over Euclidean domains.
Let A be an additive group (1.1). Intuitively, the action of a positive
integer n on an element a in A is given by na = a + ··· + a, where there
are n a's in the sum. A more formal inductive definition runs as follows.
To start, define 0a = 0, where the first "0" is the zero in Z and the
second "0" is the zero in A. Then, for n > 0, put na = (n − 1)a + a, and
for n < 0, put na = −((−n)a).
Using the natural notation m, n for integers and a, b for members of the
abelian group A, the scalar multiplication axioms now read

(mn)a = m(na),
(m + n)a = ma + na,
m(a + b) = ma + mb

and

1a = a.
These are the expected rules for computing multiples and we will take them
for granted. Formal proofs by induction are not hard, and so they are left
to the reader.
Aside on notation. From time to time, we will use the symbol m in two
ways. Sometimes it will denote a typical element of a module M and at
others it will indicate an integer. This should not cause any confusion in
practice, since the context will make it clear which is meant.

3.3 Matrix actions


Perhaps the most important examples of modules, at least as far as this
text is concerned, are modules that arise through the action of a matrix on
a vector space. The relationship between matrices with entries in a field F
and modules over the polynomial ring F[X] is the key to the derivation of
normal forms of matrices in Chapter 13.
Let F be a field and let F^n be the vector space over F consisting of all
column vectors v = (v_1, v_2, …, v_n)^T with each v_i ∈ F.

Addition and scalar multiplication in F^n are given by the expected rules:

(v_1, …, v_n)^T + (w_1, …, w_n)^T = (v_1 + w_1, …, v_n + w_n)^T

and

k · (v_1, …, v_n)^T = (kv_1, …, kv_n)^T,

where k is in F.
Now suppose we have an n × n matrix A = (a_ij) with entries a_ij belonging
to F. Then for each v ∈ F^n, we can form the vector Av whose i-th entry is

(Av)_i = a_i1 v_1 + a_i2 v_2 + ··· + a_in v_n,

which again belongs to F^n; this is what we mean when we say that "the
matrix A acts on the space F^n".
Careful calculation shows that for any such matrix A, any vectors v, w
in F^n and any scalar k in F, we have

A(v + w) = Av + Aw

and

A(kv) = k(Av),

that is, A acts as an F-linear transformation on F^n.

Since the powers A^2, A^3, …, A^i, … of A are also n × n matrices over F,
they all act on F^n, giving vectors A^2v, A^3v, …, A^iv, … for v ∈ F^n.
This allows us to define a scalar multiplication in which polynomials f
in F[X] act on vectors in F^n. Given

f = f_0 + f_1X + ··· + f_iX^i + ··· + f_nX^n ∈ F[X],

put

fv = f_0v + f_1Av + ··· + f_iA^iv + ··· + f_nA^nv for v ∈ F^n.

With the convention that A^0 = I, the n × n identity matrix, we can write
fv as the sum

fv = Σ_{i=0}^{n} f_i A^i v.
A great deal of checking (which is left to the reader as a very good exercise)
confirms that F^n becomes an F[X]-module M with this scalar multiplica-
tion.
We use the expression "M is given by X acting as A" to indicate that
the F[X]-module M is F^n made into an F[X]-module by this construction.
The value of n, the choice of F and the fact that A is an n × n matrix over
F will usually be clear from the context.
Note that each choice of an n × n matrix gives a different module struc-
ture on F^n. For this reason, we sometimes use the notation M(A), M(B), …
to indicate the modules given by X acting as the matrices A, B, ….
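
As a concrete check on the definition, here is a short numpy sketch of ours
(not part of the text) computing the scalar multiplication fv = Σ f_i A^i v:

    import numpy as np

    def poly_action(f, A, v):
        # f is the coefficient list [f_0, ..., f_s]; returns sum of f_i A^i v
        result = np.zeros_like(v)
        power = np.eye(len(v), dtype=v.dtype)    # A^0 = I
        for c in f:
            result = result + c * (power @ v)
            power = power @ A
        return result

    A = np.array([[0, 1], [0, 0]])
    v = np.array([1, 1])
    print(poly_action([2, 3], A, v))             # f = 2 + 3X gives [5 2]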
Here are some uncomplicated examples.

3.4 Actions of scalar matrices


A scalar matrix is an n × n matrix A = λI, where I is the n × n identity
matrix and λ is in F. (We write λ for the scalar in this context since in
most applications λ will be an eigenvalue of some matrix.)
Because I^i = I always, A^i = λ^i I for any i ≥ 0. Also, Iv = v for
all v in F^n, which means that, with X acting as A, the result of scalar
multiplication on F^n by

f = f_0 + f_1X + ··· + f_iX^i + ··· + f_nX^n ∈ F[X]

is

fv = f_0v + f_1λv + ··· + f_iλ^iv + ··· + f_nλ^nv, for v ∈ F^n.

An important special case occurs when λ = 0, that is, A is the zero n × n
matrix. Then Xv = 0 always, and fv = f_0v for any polynomial f. In this

case, the corresponding F[X]-module is called the trivial F[X]-module, on


which X acts as 0. Be careful not to confuse the trivial module with the
zero module; the latter has only one element 0.
When n = 1, the vector space F^1 is simply the field F regarded as a
vector space over itself. (This is a special case of the fact that any ring
R can be thought of as an R-module.) A 1 × 1 matrix is, for all practical
purposes, the same thing as an element λ in F, so we find that for each
choice of λ, F can be made into an F[X]-module with X acting as λ. The
action of a polynomial f is given explicitly by

fk = f_0k + f_1λk + ··· + f_iλ^ik + ··· + f_nλ^nk,  k ∈ F.

For example, if λ = −1, then

fk = f_0k − f_1k + f_2k − ··· + (−1)^i f_ik + ··· + (−1)^n f_nk.

3.5 Submodules
Before we discuss some further explicit examples of modules, we take a first
look at the internal structure of a general module.
Let M be a left module over an arbitrary ring R. An R-submodule of
M is a subset L of M which satisfies the following requirements.
SubM 1: 0 ∈ L.
SubM 2: If l, l' ∈ L, then l + l' ∈ L also.
SubM 3: If l ∈ L and r ∈ R, then rl ∈ L also.
A submodule L of a left R-module is itself a left R-module, since the
axioms for addition and scalar multiplication already hold in the larger
module M.
If the ring R of scalars can be taken for granted, we omit it from the ter-
minology and say that L is a submodule of M rather than an R-submodule.
An R-submodule of a right module M is defined by making the obvious
modification to axiom SubM 3:
SubMR 3: If l ∈ L and r ∈ R, then lr ∈ L also.
Clearly, a submodule of a right R-module is again a right module.
General statements and definitions about submodules of left modules
have obvious counterparts for submodules of right modules. As our main
interest is with modules over commutative rings, for which we use the left-
handed notation, we will as a rule give only the left-handed versions of
such statements and definitions. The reader should have no problem in
providing the right-handed versions where desired.

When the ring of scalars R is commutative, then any left module M can
be regarded as a right module, and any submodule of M as a left module
is equally a submodule of M as a right module.
A left module M is always a submodule of itself. A submodule L of M
with L ≠ M is called a proper submodule of M.
At the other extreme, any module has a zero submodule {0}, which we
usually write simply as 0. The zero submodule is a proper submodule of
M unless M is itself the zero module.
A left module S is said to be a simple module if S is nonzero and it has
no submodules except 0 and itself. The description of the simple modules
over a ring R is one of the fundamental tasks in ring theory.
Next, we look at some situations where submodules are already known
to us under different names.

• Z-modules. As we noted in section 3.2 above, a Z-module A is the


same thing as an abelian group, written additively. A subgroup B of A
is, by definition, a subset of A which satisfies conditions SubM 1 and 2.
However, if B does satisfy these conditions, then it also satisfies SubM 3,
since scalar multiplication by an integer is essentially repeated addition
or subtraction. Thus, a Z-submodule of A is the same thing as a subgroup
of A.
• Vector spaces. Let F be a field. Then an F-module is a vector space V
over F, and an F-submodule W of V is more familiarly called a subspace
of V.
• Ideals. As we noted when we first defined modules in section 3.1, a ring
R can be considered to be both a left R-module and a right R-module.
If we compare the definition of a submodule with that of an ideal (1.7),
we find that a left ideal of R is the same thing as an R-submodule of
the left regular R-module R, while a right ideal of R is an R-submodule
of R when it is considered to be a right R-module. This distinction is
illustrated in Exercise 3.10.

3.6 Sum and intersection


A fundamental problem in module theory is the description of a given
module in terms of a collection of submodules, each of these submodules
being in some sense "simpler" than the original module. As a first step
toward this goal, we give two basic methods of constructing new submodules
from old. We assume throughout that our modules are left modules; the
modifications for right modules are straightforward.

Suppose that L and N are both submodules of a module M. Their sum
is

L + N = {l + n | l ∈ L, n ∈ N}

and their intersection is

L ∩ N = {x | x ∈ L and x ∈ N},

which is the intersection of L and N in the usual sense.


The elementary properties of the sum and intersection are given in the
following lemma, which we prove in great detail as it is our first use of the
definitions.

3.6.1 Lemma
(i) Both L + N and L ∩ N are submodules of M.
(ii) L + N = L ⟺ N ⊆ L.
(iii) L ∩ N = L ⟺ L ⊆ N.

Proof (i) We check the submodule conditions one by one.


SubM 1: 0 is in L + N since 0 = 0 + 0 where the first 0 belongs to L and the
second 0 belongs to N - both zeroes are, of course, the zero element
of M.
SubM 2: Suppose m, m' ∈ L + N. Then m = l + n and m' = l' + n' where
l, l' ∈ L and n, n' ∈ N, so that

m + m' ∈ L + N

since

l + l' ∈ L and n + n' ∈ N.

SubM 3: Carrying on the same notation, if m = l + n is in L + N and
r ∈ R, then rm = rl + rn is in L + N since rl ∈ L and rn ∈ N.
The argument for L ∩ N is even easier, so it is left to the reader.
(ii) Suppose L + N = L. Given n ∈ N, n = 0 + n is also in L + N, so
that n ∈ L, that is, N ⊆ L.
Conversely, if N ⊆ L, then we must have l + n ∈ L for any l ∈ L and
n ∈ N.
(iii) This assertion is a result in set theory rather than module theory. □

3.7 k-fold sums


It will be necessary to extend the definition of a sum to allow for an arbitrary
number of submodules, rather than just two. Let L_1, …, L_k be any set of
submodules of a left module M, where k ≥ 0 is an integer. Their sum is
defined to be

L_1 + ··· + L_k = {l_1 + ··· + l_k | l_1 ∈ L_1, …, l_k ∈ L_k}.

When k = 0, this sum is to be interpreted as the zero submodule 0 of M,
and for k = 1 the "sum" is simply the submodule L_1 itself. For k = 2 we
regain the previous definition in different notation.
The proof that L_1 + ··· + L_k is a submodule of M is similar to that
given above and is left to the reader.

3.8 Generators
A convenient way to specify a module or one of its submodules is in terms
of generators. In fact, our investigation of the structure of modules over
Euclidean domains will be based on an analysis of the generators of the
modules. Again, we work only with left modules, and leave right-handed
definitions to the reader.
First, we consider a single generator. Let M be a left R-module and
let x be an element of M. The cyclic submodule of M generated by x is
defined as

Rx = {rx | r ∈ R}.

The left module M itself is said to be cyclic if M = Rx for some x ∈ M,
and the element x is called a generator of M.
The confirmation that Rx is actually a submodule of M is a trivial
extension of the argument given for principal ideals in section 1.8; also, in
a moment we will give a more general calculation which includes the cyclic
case.
Notice that a principal left ideal Rx is a special type of cyclic submodule,
where we take M = R and x in R.
We next consider finite sets of generators. Let X = {x_1, …, x_t} be a
finite subset of M, and put

L(X) = {r_1x_1 + ··· + r_tx_t | r_1, …, r_t ∈ R},

which is the same as saying that

L(X) = Rx_1 + ··· + Rx_t,



the sum of the cyclic submodules Rx_1, …, Rx_t.


Then L(X) is called the submodule of M generated by the set X. The
fact that L(X) is a submodule follows from the following equations, which
are easy consequences of the axioms for a module.
SubM 1: 0 = 0x_1 + ··· + 0x_t ∈ L(X).
SubM 2:

(r_1x_1 + ··· + r_tx_t) + (s_1x_1 + ··· + s_tx_t) = (r_1 + s_1)x_1 + ··· + (r_t + s_t)x_t

for all r_1, …, r_t and s_1, …, s_t ∈ R.
SubM 3:

r · (r_1x_1 + ··· + r_tx_t) = (r · r_1)x_1 + ··· + (r · r_t)x_t

for all r, r_1, …, r_t ∈ R.


If M = L(X), then we say that X is a set of generators for M, or that
"X generates M". A finitely generated module is one that does have a finite
set of generators - these are the modules which interest us in this text.
The most familiar examples of generating sets occur in linear algebra,
as bases of vector spaces. We review the definitions briefly.
Let V be a vector space over a field F. In elementary linear algebra,
a generating set for V as an F-space is more often called a spanning set for
V; it has the property that for any v ∈ V, there are scalars k_1, …, k_t ∈ F
so that

v = k_1x_1 + ··· + k_tx_t.
A basis of V is a spanning set X which is linearly independent, that is,
if

k_1x_1 + ··· + k_tx_t = 0 with k_1, …, k_t ∈ F,

then

k_1 = ··· = k_t = 0.
If V has a finite generating set X, then a finite basis of V can be obtained
from the generating set by successively omitting elements. Moreover, any
linearly independent subset Y of V can be extended to a basis by adding
suitable members of X, and any two bases of V have the same number of
members, this number being the dimension of V.
Naturally enough, we refer to a finitely generated vector space as a finite
dimensional vector space.
The problem of extending these definitions and results to modules over
Euclidean domains in general will occupy us in later chapters. It suffices
for the moment to warn the reader that we will encounter phenomena that
do not occur in vector spaces.

For example, the sets X = {1} and Y = {2,3} are both generating sets
of Z, considered as a module over itself. The set X is linearly independent
in an obvious sense, but Y is not. Further, neither element of Y can be
omitted to give a generating set with one member.

3.9 Matrix actions again


Let A be an n x n matrix over a field F and let M be the F[X]-module
obtained from the vector space Fn with X acting as A (3.3). We will see
that the F[X]-submodules of M are determined by the action of A on the
subspaces of Fn.
Suppose first that L is an F[X]-submodule of M. Then L is a subset
of Fn, and, by axioms SubM 1 and 2, L must contain the zero vector and
it must be closed under addition. Since the elements of the field F can be
regarded as constant polynomials and L is closed under scalar multiplica­
tion by polynomials (SubM 3), L is closed under scalar multiplication by
elements of the coefficient field F. These remarks show that L must be a
subspace of the space Fn.
Appealing to axiom SubM 3 again, we have Xl ∈ L for any l ∈ L. Since
Xl = Al, we have A·L ⊆ L, which means that the subspace L is invariant
under A.
Conversely, if U is a subspace of F^n which is invariant under A, then for
any u ∈ U, we have Au ∈ U and hence A^2u ∈ U, A^3u ∈ U, and A^iu ∈ U
for all i. Thus, for any polynomial

f = f_0 + f_1X + ··· + f_iX^i + ··· + f_nX^n ∈ F[X],

and any vector u ∈ U,

f · u = f_0u + f_1Au + ··· + f_iA^iu + ··· + f_nA^nu is in U,

which shows that U is closed under scalar multiplication by polynomials
and so defines a submodule of M.
The correspondence between submodules and subspaces will be used
frequently, so we state it formally as a theorem.

3.9.1 Theorem
Let F be a field, let A be an n × n matrix over F, and let M be the
F[X]-module obtained from the vector space F^n with X acting as A. Then
there is a bijective correspondence between
(i) F[X]-submodules L of M,
and

(ii) F-subspaces U of F^n which are invariant under A, that is, AU ⊆ U. □



Now that we have given the general description of submodules of modules
defined by matrix actions, we look at some increasingly specific calculations.

3.10 Eigenspaces
Given an n × n matrix A over a field F, an eigenspace for A is a nonzero
subspace U of F^n with the property that there is a scalar λ ∈ F so that

Au = λu for all u ∈ U.

The scalar λ is the eigenvalue of A corresponding to U, and a nonzero
vector u ∈ U is an eigenvector. Notice that we allow the possibility that
λ = 0.
The remarks in section 3.4 show that any eigenspace gives an F[X]-
submodule of the F[X]-module that arises from Fn with X acting as A.
The converse is far from true - usually, there are invariant subspaces which
are not eigenspaces (see Exercise 3.6). However, a reasonable first approach
to the problem of determining invariant subspaces is to compute the eigen­
values and then the eigenspaces of A. We recall from elementary linear
algebra how this is done.
Let I = I_n be the n × n identity matrix. We have

Au = λu for some u ≠ 0 ⟺ (λI − A)u = 0
⟺ det(λI − A) = 0,

where det(B) denotes the determinant of a matrix B. It is well known that
the expression det(XI − A) is a polynomial in the variable X, of degree n.
It is in fact the characteristic polynomial of A, which will play an important
role later in these notes.
If we can find a root λ of det(XI − A) in F (there is no guarantee that
this can be done), then we can find the eigenvectors u and the eigenspace
U by solving the system of linear equations (λI − A)u = 0.
In one elementary but important special case, a submodule must be
given by an eigenvector.

3.10.1 Lemma
Let F be a field, let A be an n × n matrix over F, and let M be the F[X]-
module obtained from the vector space F^n with X acting as A. Further,
suppose that the subspace U of F^n is one-dimensional over F.

Then U gives an F[X]-submodule of M if and only if U = Fu is an
eigenspace of A, where u is an eigenvector for some eigenvalue λ of A.

Proof Suppose that U does give a submodule, so that U is invariant under
A. Since U has dimension 1, U = Fu for some vector u ≠ 0. But Au ∈ U,
so we have Au = λ·u for some λ ∈ F.
The converse is clear from the preceding discussion. □
Here are some concrete examples to illustrate the theory.

3.11 Example: a triangular matrix action


Let F be any field and put

A = ( 1 1 )
    ( 0 1 ).

Let M be F^2 regarded as an F[X]-module with X acting as A, so that for

m = (x, y)^T ∈ M,    Xm = (x + y, y)^T.

A proper subspace of F^2 must have dimension 1, and hence a proper sub-
module L of M must be given by an eigenvector of A. The eigenvalues of
A are the roots of

det(XI − A) = (X − 1)^2,

so the only eigenvalue is 1. The eigenvectors are found by solving the
equations

(I − A)(x, y)^T = (0, 0)^T,

which give y = 0, and so have the form (x, 0)^T for x ≠ 0.
It follows that there is exactly one proper submodule of M, given by
the subspace U = F·(1, 0)^T of F^2.
Notice that this result is not influenced by any specific properties of the
field F - it holds if we take F to be ℝ, C, Z_p where p is a prime, or any
other field.
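
A quick numerical check of this example over ℝ (a numpy sketch of ours,
not from the text):

    import numpy as np

    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    print(np.poly(A))                            # [ 1. -2.  1.], i.e. (X - 1)^2
    values, _ = np.linalg.eig(A)
    print(values)                                # [1. 1.]: 1 is the only eigenvalue
    # the eigenspace Ker(A - I) is one-dimensional, spanned by (1, 0)^T:
    print(np.linalg.matrix_rank(A - np.eye(2)))  # 1, so its nullity is 2 - 1 = 1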

3.12 Example: a rotation


In this example, the nature of the field of coefficients F is important. Let

A = ( 0 −1 )
    ( 1  0 )

and let M be F^2 regarded as an F[X]-module with X acting as A. Thus for

m = (x, y)^T ∈ M,    Xm = (−y, x)^T.

Again, any proper submodule L of M must be given by an eigenvector of
A. The eigenvalues of A are the roots of

det(XI − A) = X^2 + 1.

Now suppose that F = ℝ, the field of real numbers. Then A has no
eigenvalues, so M cannot have any proper submodules, that is, it is a simple
ℝ[X]-module. This result is also intuitively true geometrically, since A
corresponds to a rotation of the plane through π/2.
However, if we take instead F = C, the complex numbers, then there
are two eigenvalues, +i, −i, with respective eigenvectors

v_+ = (1, −i)^T and v_− = (1, i)^T.

We obtain two one-dimensional C[X]-submodules of M, namely L_+ = Cv_+
and L_− = Cv_−.
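
Working numerically (a numpy sketch of ours; numpy computes complex
eigendata, and the order in which the eigenvalues appear may vary):

    import numpy as np

    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    values, vectors = np.linalg.eig(A)
    print(values)                         # [0.+1.j 0.-1.j]: +i and -i, nothing real
    # rescale the first eigenvector to have first entry 1; it is one of v_+, v_-
    print(vectors[:, 0] / vectors[0, 0])  # e.g. [1.+0.j 0.-1.j], i.e. (1, -i)^T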

Exercises
3.1 Show that the set {6, 10, 15} generates Z as a Z-module, but that no
proper subset of {6, 10, 15} generates Z.
For each k ≥ 4, find a set A_k = {a_1, …, a_k} of integers which
generates Z, so that no proper subset of A_k generates Z.
3.2 Let p ∈ Z be a prime number. Show that if x̄ is a nonzero element
of the additive group Z_p, then x̄ is a generator of Z_p as a Z-module.
Deduce that Z_p is a simple Z-module. Hint: Corollary 1.11.2.
Remark: this result is more frequently given in the form "a cyclic
group of prime order has no nontrivial subgroups".

3.3 Let A be an abelian group and let r ∈ Z. Show that

rA = {ra | a ∈ A}

is a subgroup of A.
Take A = Z_m. Prove that rA = A ⟺ (r, m) = 1 and that
rA = 0 ⟺ m divides r.
More generally, let R be a Euclidean domain and let M be an
R-module. Given an element r ∈ R, show that

rM = {rm | m ∈ M}

is a submodule of M.
Suppose that M = R/Rx for some x ∈ R. Determine conditions
on r so that (i) rM = M and (ii) rM = 0.
3.4 Repeat Example 3.12 for the finite fields Z_p, p prime. (Note that
√−1 ∈ Z_p ⟺ p ≡ 1 mod 4 for odd p.)
3.5 Let M be the C[X]-module given by an n × n matrix A acting on C^n.
Show that M is simple if and only if n = 1.
3.6 Let M be the C[X]-module given by the matrix

A = ( 0 1 1 )
    ( 0 0 1 )
    ( 0 0 0 )

acting on C^3. For each vector v ∈ C^3, let L(v) be the cyclic C[X]-
submodule of M generated by v, and write L_0 for L(e_1) where
e_1 = (1, 0, 0)^T.
Show that Ce_1 is the only eigenspace of A in C^3. Deduce that
L_0 ⊆ L(v) for any v ≠ 0. Find all v with
(a) dim L(v) = 2,
(b) dim L(v) = 3.
Do your answers change if the field C is replaced by an arbitrary
field F?
3.7 Let N be the C[X]-module given by the matrix

B = ( 0 1 0 )
    ( 0 0 1 )
    ( 1 0 0 )

acting on C^3.
(a) Prove that there are exactly 3 submodules of N that are one-
dimensional as vector spaces over C.
(b) Show that N is a cyclic C[X]-module.

(c) Show that the submodule generated by (1, −1, 0)^T is two-dimensional.



(d) Investigate what happens if the field of complex numbers is replaced
by the real numbers ℝ, or by the finite fields Z_2, Z_3, or Z_7.
3.8 Let R be an arbitrary ring and let M be an R-module. Suppose that
I is a two-sided ideal of R with the property that IM = 0, that is,

xm = 0 for all x ∈ I and m ∈ M.

Show that the rule

r̄m = rm for all r̄ ∈ R/I and m ∈ M

gives a well-defined action of R/I on M and that M is an R/I-module
with this action as scalar multiplication.
3.9 Let R be a ring and let M be a simple left R-module. Show that any
nonzero element of M is a generator of M.
3.10 Let R be the ring of 2 × 2 matrices over a field F. Let I be the set of
matrices of the form

( r_11 r_12 )
(  0    0  )

and let J be the set of matrices of the form

( r_11 0 )
( r_21 0 ).

Show that, under the usual rules for matrix multiplication, I is a
right ideal of R (and hence a right R-module) but that I is not a left
ideal. Show also that J is a left ideal but not a right ideal in R.
Prove that I is simple as a right module and that J is simple as a
left module.
Generalize these results to the ring of n × n matrices over F. (See
also Exercise 7.8.)
Chapter 4

Homomorphisms

We next introduce a fundamental concept in module theory, that of a ho-


momorphism, which is a map from one module to another that respects the
addition and scalar multiplication. A knowledge of the homomorphisms be­
tween two modules allows us to compare their internal structures. In later
chapters, we use an analysis of homomorphisms to obtain the fundamental
results on the structure of a module over a Euclidean domain.
Just as a vector space over a field F is a special type of module, a linear
transformation between vector spaces is another name for a homomorphism
between them. We will show how to describe the homomorphisms between
modules over a polynomial ring F[X] in terms of linear transformations
between the vector spaces that underlie the modules. We also show that a
general F[X]-module arises through "X acting as a linear transformation"
on an underlying vector space, which need not be a standard column space
F^n.

4.1 The definition


Let R be a ring and let M and N be left R-modules. An R-module homo-
morphism from M to N is a map θ : M → N which respects the addition
and scalar multiplication for these modules. More formally, θ must satisfy
the following axioms.

HOM 1: θ(m + n) = θ(m) + θ(n) for all m, n ∈ M.

HOM 2: θ(rm) = rθ(m) for all m ∈ M and all r ∈ R.
If M and N are both right modules, the second condition is replaced
by


HOMR 1: θ(mr) = (θ(m))r for all m ∈ M and all r ∈ R.

We will work only with left modules and their homomorphisms in this
chapter.
Alternative terms are module homomorphism, when there is no doubt
about the choice of coefficient ring R, or simply homomorphism if it is
obvious that we are dealing with modules rather than groups, rings, or
some other mathematical structure.
When the ring of scalars is a field F, so that M and N are then vector
spaces over F, an F-module homomorphism is more familiarly known as
an F-linear transformation or an F-linear map, or simply a linear transfor­
mation or linear map.
Some authors extend the vector space terminology to modules in gen­
eral, and speak of ".R-linear transformations" or ".R-linear maps". However,
as we shall be concerned with the relationship between F-linear maps and
F[X]-module homomorphisms when we analyse the structure of modules
over polynomial rings, it will be convenient to limit the use of the term
"linear" to vector spaces.
When M and N are Z-modules, that is, additive groups, the second
axiom HOM 2 follows automatically from the first. Thus a Z-module ho­
momorphism is another name for a group homomorphism from M to N.
(The reader who has not studied group theory can take this as a definition.)
Here are three homomorphisms that are always present.

• Given any module M over any ring R, the identity homomorphism

id_M : M → M

is defined by

id_M(m) = m for all m ∈ M.

• Given a submodule L of M, there is an inclusion homomorphism

inc : L → M,

defined by

inc(l) = l for all l ∈ L.

At first sight, it may seem pointless to give names to these "do nothing"
maps, but there are circumstances where it is very useful to be able to
distinguish between an element of L regarded as an element of L and the
same element regarded as an element of M.

• If N is also an R-module (the possibility N = M is allowed), the zero
homomorphism

0 : M → N

is defined by

0(m) = 0 ∈ N for all m ∈ M.

Notice that the symbol "0" is used in two ways in this expression: once for
the zero map that we are defining, and again for the zero element of N.
An attempt to introduce a separate label for every zero that we encounter
would lead to overcomplicated notation.

4.2 Sums and products


Suppose that θ : M → N and ψ : M → N are both R-module homomor-
phisms. Their sum

θ + ψ : M → N

is defined by

(θ + ψ)(m) = θ(m) + ψ(m) for all m ∈ M.

That the sum is again an R-module homomorphism is confirmed by routine
checking: for m, m' ∈ M and r ∈ R, we have

(θ + ψ)(m + m') = θ(m + m') + ψ(m + m')
                = θ(m) + θ(m') + ψ(m) + ψ(m')
                = (θ + ψ)(m) + (θ + ψ)(m')

and

(θ + ψ)(rm) = θ(rm) + ψ(rm)
            = r·θ(m) + r·ψ(m)
            = r·((θ + ψ)(m)).

If we are given R-module homomorphisms θ : M → N and φ : N → P,
their product

φθ : M → P

is defined by

(φθ)(m) = φ(θ(m)) for all m ∈ M.

Another routine verification shows that φθ is also an R-module homomor-
phism. Further properties of sums and products are developed in Exercises
4.1 and 4.2 below.

4.3 Multiplication homomorphisms


The scalar multiplication between a module and its ring of scalars can be
used to define some homomorphisms, the multiplication homomorphisms,
which turn out to be surprisingly useful, despite their elementary nature.
Let R be any ring and let M be an R-module. Choose an element x in
M, and define a map τ(x) : R → M by

τ(x)r = rx for all r ∈ R.

It is easy to confirm that τ(x) satisfies HOM 1 by using the distributive
property of scalar multiplication (3.1), and HOM 2 follows from the iden-
tities

τ(x)(rs) = (rs)x = r(sx) = r(τ(x)s),  r, s ∈ R, x ∈ M.
For the next definition, we must suppose that R is commutative. Fix
an element a of R and define a map σ(a) : M → M by

σ(a)(m) = am for all m ∈ M.

Again, axiom HOM 1 is immediate from the distributive property of scalar
multiplication (3.1). To verify that σ(a) satisfies HOM 2, we need to confirm
the identity

σ(a)(rm) = r(σ(a)m) for all r ∈ R,

that is,

a(rm) = r(am) for all r ∈ R.

But this equality holds for any choice of a, r and m since R is commutative.
Aside: it is possible to refine the definition of σ(a) so that R is not required
to be commutative, but to do this we would need to use a more careful
system of notation which is unnecessary for our study of modules over
commutative rings.
Various properties of the multiplication homomorphisms will be estab-
lished in detail as we progress through this text. As a first illustration, we
give an example to show why we must distinguish σ(a) from the element a
of R.
Take R to be the ring of integers Z and let M be Z_6, regarded as a
Z-module. An element m in M has the form m = z̄ for an integer z, and,
given a ∈ Z, we have σ(a)(z̄) = (az)¯. Since

M = {0̄, 1̄, 2̄, 3̄, 4̄, 5̄},

we have

σ(a)M = {0̄, ā, (2a)¯, (3a)¯, (4a)¯, (5a)¯}.

Thus different integers can give the same homomorphism of M. By
direct observation, σ(1) = id_M = σ(7), and it is fairly obvious that
σ(1 + 6k) = id_M for any k ∈ Z.
We can also see that σ(0) = σ(6) = 0, the zero homomorphism, and
that σ(6k) = 0 for any k ∈ Z.

4.4 F[X]-modules in general


So far, our examples of modules over a polynomial ring F[X] have been
constructed from the action of a matrix A on a vector space F^n. This is
not quite the full story of how F[X]-modules can arise.
Suppose that M is an F[X]-module, where F is a field. Since the
elements of F can be regarded as constant polynomials, there is a scalar
multiplication of F on M, and clearly M is then a vector space over F. We
call this space the underlying space of M, and denote it by a new symbol,
V.
The multiplication homomorphism σ(X) : M → M has the property
that

σ(X)(km) = k·σ(X)(m) for all m ∈ M and k ∈ F,

and so

σ(X) : V → V

is an F-linear transformation.
is an F-linear transformation.
Conversely, suppose that V is any vector space over F and that

α : V → V

is an F-linear transformation. Then

α^i : V → V

is an F-linear transformation for any i ≥ 1, and we can define a scalar
multiplication of polynomials

f(X) = f_0 + f_1X + ··· + f_sX^s in F[X]

on vectors v in V by

f(X)·v = f_0v + f_1αv + ··· + f_iα^iv + ··· + f_sα^sv.

A great deal of checking, which is left to the reader, confirms that V has
become an F[X]-module. We give this a new name, M, and say that M is
defined by X acting as α on V.

Notice that V is the underlying space of M and that σ(X) = α.


Suppose now that V = F^n and that A is an n × n matrix. Then the
map α : F^n → F^n defined by α·v = Av is an F-linear transformation, and
so the F[X]-modules given by matrix actions as in section 3.3 are special
cases of the general construction.
In the next chapter, we will see that a linear transformation of a finite
dimensional space V can be represented by a square matrix A once we have
chosen a basis of V. Thus any F[X]-module with a finite dimensional un­
derlying space can be regarded as arising through the action of a matrix.
However, it is essential to use the more general description of F[X]-modules
in terms of actions of linear transformations, since a given linear transfor­
mation can be represented by many different matrices, depending on the
bases we choose for V. One of our prime objectives in these notes is to find
normal forms for matrices by discovering the bases that are best adapted
to a given transformation.
Notice that the underlying space V of an F[X]-module need not be
finite dimensional; for example, F[X] itself must be infinite dimensional
since the set {1, X, X^2, …} is linearly independent over F - in other words,
a polynomial is 0 only if all its coefficients are 0.

4.5 F[X]-module homomorphisms


We now wish to describe the F[X]-module homomorphisms θ : M → N
between two F[X]-modules, M and N. We write V for the underlying
space of M and α : V → V for the F-linear transformation that defines M,
and W and β for the corresponding data for N.
Suppose that such a homomorphism θ is given. Since the elements of F
are constant polynomials, axiom HOM 2 gives

θ(kv) = kθ(v) for all v ∈ V,

so that θ is also an F-linear transformation from V to W. Using axiom
HOM 2 again, we must have θ(X·v) = X·θ(v) for all v, that is,

θα(v) = βθ(v)

always. Thus we have obtained the fundamental equality

θα = βθ.

Conversely, suppose that θ : V → W is an F-linear transformation
which satisfies this equality. Retracing our steps, we have

θ(X·m) = X·θ(m) for all m ∈ M

and so

θ(X^i·m) = X^i·θ(m) for all m and all i ≥ 1.

Thus, for any polynomial f(X) = f_0 + f_1X + ··· + f_sX^s and any m ∈ M,

θ(f(X)·m) = θ(f_0m) + θ(f_1X·m) + ··· + θ(f_sX^s·m)
          = f_0θ(m) + f_1θ(Xm) + ··· + f_sθ(X^sm)
          = f_0θ(m) + f_1Xθ(m) + ··· + f_sX^sθ(m)
          = f(X)·θ(m),

which shows that θ is an F[X]-module homomorphism from M to N.


In view of the importance of this discussion, we summarize it as a formal
theorem.

4.5.1 Theorem
Let F be a field, and suppose that the F[X]-module M is given by the
action of the F-linear transformation α on the F-space V and that N is
given by the action of β on W.
Then there is a bijective correspondence between
(i) F[X]-module homomorphisms θ : M → N
and
(ii) F-linear transformations θ : V → W such that θα = βθ. □

4.6 The matrix interpretation


Suppose that we are in the special case where M is F^p made into an F[X]-
module by the action of a p × p matrix A over F and N is F^n made into
an F[X]-module using an n × n matrix B.
Any F-linear map θ from F^p to F^n is given by an n × p matrix T such
that θ(v) = Tv for all v ∈ F^p (this is another fact from elementary linear
algebra that we will re-establish in the next chapter). Thus the F[X]-
module homomorphisms from M to N are given by those matrices T which
satisfy the equality

TA = BT.
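
Since TA = BT is a homogeneous linear condition on the np entries of T,
the homomorphisms can be computed as a null space. A numpy sketch of
ours (not from the text), using the standard identities vec(TA) = (A^T ⊗ I)vec(T)
and vec(BT) = (I ⊗ B)vec(T), where vec stacks the columns of T:

    import numpy as np

    def homomorphism_space(A, B):
        # K @ vec(T) = 0 encodes TA - BT = 0, with vec stacking columns of T
        p, n = A.shape[0], B.shape[0]
        K = np.kron(A.T, np.eye(n)) - np.kron(np.eye(p), B)
        _, s, vh = np.linalg.svd(K)
        null_rows = vh[s < 1e-10]      # right-singular vectors for singular value 0
        return [row.reshape((n, p), order='F') for row in null_rows]

    A = np.array([[0.0, 1.0], [0.0, 0.0]])    # the matrix of section 4.8 below
    for T in homomorphism_space(A, A):
        print(np.round(T, 3))
    # prints a basis of a 2-dimensional solution space; every solution
    # has the shape [[t11, t12], [0, t11]] found in section 4.8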

4.7 Example: p = 1
As an illustration, we look at the elementary (but sometimes confusing)
case in which p = 1 but n is arbitrary. The space F = F^1 has dimension 1,
and we can take the single element 1 ∈ F to be a basis. The action of the

variable X on F is given by the constant λ in F such that X·(1) = λ·1
(see 3.4). Thus F becomes an F[X]-module in a different way for every
choice of λ in F.
An F-linear transformation θ from F to F^n is given by an n × 1 matrix,
that is, a vector w in F^n so that θ(k) = kw for all "vectors" k ∈ F.
Choose some λ, let M be the corresponding F[X]-module, and suppose
that F^n is an F[X]-module N through an n × n matrix B. Then θ defines
an F[X]-homomorphism from M to N if and only if wλ = Bw.
Thus a nonzero vector w in F^n gives a homomorphism precisely when
it is an eigenvector of B. If we take w to be the zero vector, θ is evidently
the zero homomorphism.

4.8 Example: a triangular action


Let F be any field and let M be the F[X]-module on F^2 given by the
matrix

A = ( 0 1 )
    ( 0 0 ).

We describe all the 2 × 2 matrices T that give F[X]-homomorphisms from
M to itself. Write

T = ( t_11 t_12 )
    ( t_21 t_22 ).

An easy computation shows that

TA = ( 0 t_11 )
     ( 0 t_21 )

and

AT = ( t_21 t_22 )
     (  0    0  ),

so T gives a homomorphism precisely when t_11 = t_22 and t_21 = 0, that is,

T = ( t_11 t_12 )
    (  0   t_11 )

with t_11 and t_12 arbitrary.
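
A quick numerical sanity check of this conclusion (ours, not the book's):

    import numpy as np

    A = np.array([[0, 1], [0, 0]])
    T = np.array([[5, 7], [0, 5]])           # t11 = 5, t12 = 7, chosen arbitrarily
    print(np.array_equal(T @ A, A @ T))      # True, so T is an F[X]-homomorphism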

4.9 Kernel and image


Let R be an arbitrary ring and let θ : M → N be a homomorphism of left
R-modules. We define two submodules, one of M and one of N, that measure
the failure of θ to be injective or surjective.
The kernel Ker(θ) of θ is

Ker(θ) = {m ∈ M | θ(m) = 0}

and the image Im(θ) of θ is

Im(θ) = {n ∈ N | n = θ(m) for some m ∈ M}.

The corresponding definitions for right modules are left to the reader.

4.9.1 Lemma
(i) Ker(θ) is a submodule of M.
(ii) θ is injective ⟺ Ker(θ) = 0.
(iii) Im(θ) is a submodule of N.
(iv) θ is surjective ⟺ Im(θ) = N.
Remark. Some texts use the terms one-to-one for an injective map and
onto for a surjective map.

Proof (i) We prove this claim in full detail to provide a model for
arguments of this type. We have to check the axioms listed in (3.5). First,
we need 0 ∈ Ker(θ). In M, 0 + 0 = 0. Thus, in N, θ(0) = θ(0) + θ(0). But
θ(0) has a negative in N, so we get the equations

0 = θ(0) − θ(0)
  = (θ(0) + θ(0)) − θ(0)
  = θ(0) + (θ(0) − θ(0))
  = θ(0) + 0
  = θ(0).

Next, suppose that m, n ∈ Ker(θ). Then

θ(m + n) = θ(m) + θ(n)
         = 0 + 0
         = 0,

which shows that m + n ∈ Ker(θ). Finally, let m ∈ Ker(θ) and let r ∈ R
be arbitrary. We have

θ(rm) = r·θ(m)
      = r·0
      = 0,

so that rm ∈ Ker(θ).
(ii) ⇐: Suppose that θ(m) = θ(n) for elements m, n of M. Then θ(m − n) = 0,
so m − n ∈ Ker(θ), giving m − n = 0 and m = n as required.

⇒: If m ∈ Ker(θ), then θ(m) = 0 = θ(0), which forces m = 0.
(iii) This follows from the identities

0 = θ(0),
θ(m) + θ(n) = θ(m + n),
r·θ(m) = θ(rm).

(iv) There is nothing to prove as the claim is simply a restatement of
the definition of a surjective map. □

4.10 Rank & nullity

When θ is a linear transformation from Fᵖ to Fⁿ given by an n × p matrix T, the kernel and image have a more familiar interpretation.
The kernel of θ is

Ker(θ) = {v ∈ Fᵖ | Tv = 0},

which is the set of solutions of a system of n linear equations in p unknowns. We will write Ker(T) for this space; it is sometimes called the null space of T.
The dimension dim(Ker(T)) of Ker(T) is the nullity null(T) of T or null(θ) of θ.
Suppose w ∈ Im(θ). Then w = Tv for some v ∈ Fᵖ, and

w = v₁Te₁ + ··· + vₚTeₚ,

where

e₁ = (1, 0, ..., 0)ᵀ, e₂ = (0, 1, ..., 0)ᵀ, ..., eₚ = (0, ..., 0, 1)ᵀ

is the standard basis of Fᵖ. A direct calculation, to be spelt out in a wider context in the next chapter, shows that the vectors {Te₁, ..., Teₚ} are simply the columns of T, so that Im(θ) is the subspace of Fⁿ spanned by the columns of T. This space is familiarly known as the column space of T.
The dimension of Im(θ) is called the rank of θ or of T, and written rank(θ) or rank(T).
The rank and nullity are connected by the following well known result, which is called variously the Rank and Nullity Theorem or the Kernel and Image Theorem:

rank(θ) + null(θ) = p.
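For readers who want to see the theorem in action, here is a small sketch in Python using sympy; the matrix T is chosen arbitrarily for illustration.

from sympy import Matrix

T = Matrix([[1, 2, 3],
            [2, 4, 6]])          # theta : F^3 -> F^2 over F = Q
rank = T.rank()                  # dim Im(theta), the column space
nullity = len(T.nullspace())     # dim Ker(theta), the null space
print(rank, nullity, rank + nullity == T.cols)   # 1 2 True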
4.11 Some calculations

Here are some computations of kernels and images. First, we look at the multiplication homomorphisms (4.3).
(i) Let R be commutative. Given an R-module M and a fixed element a ∈ R, the homomorphism σ(a) : M → M was defined by σ(a)m = am for all m ∈ M. Then

Ker(σ(a)) = {m | am = 0}

and

Im(σ(a)) = {n ∈ M | n = am for some m ∈ M} = aM.

When R = Z and M = Z₆, we have for instance

Ker(σ(3)) = {0, 2, 4}

and

Im(σ(3)) = {0, 3}.
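These two sets can also be found by a brute-force search; a short sketch in Python, computing directly from the definitions:

# kernel and image of sigma(3) on Z_6
ker = [m for m in range(6) if (3 * m) % 6 == 0]
im = sorted({(3 * m) % 6 for m in range(6)})
print(ker, im)    # [0, 2, 4] [0, 3]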
(ii) Let R be arbitrary. For a fixed x in M, τ(x) : R → M is given by τ(x)r = rx. The image Im(τ(x)) is simply the cyclic submodule Rx of M generated by x (as in section 3.8), so that τ(x) is surjective precisely when M is cyclic with x as a generator.
The kernel of τ(x) is called the annihilator Ann(x) of x:

Ann(x) = {r ∈ R | rx = 0}.

Ann(x) is a left ideal of R that will play an important part in our discussions in later chapters.
(iii) Finally, we take M to be the module of section 4.8, which is given by the matrix

A = ( 0  1 )
    ( 0  0 )

over an arbitrary field F, and we compute the kernel and image of each F[X]-homomorphism from M to itself. Recall that such a homomorphism is given by a 2 × 2 matrix

T = ( t₁₁  t₁₂ )
    ( 0    t₁₁ )

with t₁₁ and t₁₂ arbitrary, so that, for v = (x, y)ᵀ ∈ F², we have

Tv = ( t₁₁x + t₁₂y )
     (     t₁₁y    )

There are two approaches, one by direct assault and one that uses a little subtlety. We give both for illustration. The direct calculation of the kernel, those v with Tv = (0, 0)ᵀ, falls into three cases.
If t₁₁ ≠ 0, then t₁₁y = 0 gives y = 0 and then x = 0, so that Ker(T) = 0.
On the other hand, if t₁₁ = 0, we require only t₁₂y = 0. If t₁₂ ≠ 0, then x is arbitrary and y = 0, so that Ker(T) = L, the subspace spanned by (1, 0)ᵀ. Finally, if t₁₂ = 0 also, then T gives the zero map and Ker(T) = M.
The calculation of the image falls into three cases as well. We have to find all w = (a, b)ᵀ ∈ F² for which we can solve the equations

t₁₁x + t₁₂y = a
t₁₁y = b.

If t₁₁ ≠ 0, we can solve for y and x in succession, so T is then surjective. If t₁₁ = 0, we can only obtain those w with b = 0. If t₁₂ ≠ 0, we can always solve the equation t₁₂y = a, which shows that Im(T) = L. Finally, T = 0 implies Im(T) = 0.
Now we take a more intellectual line. To start, we notice that any proper, nonzero submodule of M must have dimension 1 as an F-space and so it must be given by an eigenspace of A - see Theorems 3.9.1 and 3.10.1. But the unique eigenspace of A is L, the span of (1, 0)ᵀ, so M has exactly one proper nonzero submodule. Thus the only possibilities for Ker(T) and Im(T) are 0, L and M, whatever T.
By the Rank and Nullity Theorem (4.10), these possibilities are not independent; we must have

Ker(T) = M and Im(T) = 0,

or

Ker(T) = L and Im(T) = L,

or

Ker(T) = 0 and Im(T) = M.

The first combination corresponds only to T = 0. For the second, we must have t₁₁ = 0, by direct observation, and t₁₂ ≠ 0. Thus if t₁₁ ≠ 0, we must be in the third case.
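As a check on this analysis, one can enumerate every homomorphism T over a small field. The following sketch (pure Python, working over the field with two elements, a choice made purely for illustration) lists Ker(T) and Im(T) for each T = (t₁₁ t₁₂; 0 t₁₁) and confirms that only 0, L and M occur.

# exhaustive check of 4.11(iii) over F = Z_2
from itertools import product

def image(T, v, p=2):
    return ((T[0][0]*v[0] + T[0][1]*v[1]) % p,
            (T[1][0]*v[0] + T[1][1]*v[1]) % p)

vectors = list(product(range(2), repeat=2))
for a, b in product(range(2), repeat=2):
    T = ((a, b), (0, a))                       # t11 = a, t12 = b
    ker = [v for v in vectors if image(T, v) == (0, 0)]
    im = sorted({image(T, v) for v in vectors})
    print(f"t11={a}, t12={b}: Ker has {len(ker)} elements, Im = {im}")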

4.12 Isomorphisms
In our study of the structure of modules, it will be important to know when two superficially different modules are in essence the same. This is the case when there is an isomorphism between them.
The definition is as follows. An R-module homomorphism θ : M → N is said to be an isomorphism if θ is bijective, that is, it is both injective and surjective. If there is an isomorphism from the module M to the module N, then M and N are said to be isomorphic; the notation is

M ≅ N.

It is often convenient to have an alternative description of isomorphisms in terms of invertibility. An R-module homomorphism θ : M → N is invertible if there is an R-module homomorphism φ : N → M such that φθ = id_M, the identity map on M, and θφ = id_N, the identity map on N.
The equivalence of invertibility and isomorphism, and a little more, is given in the next result. As with Lemma 4.9.1, the proof is given in full detail to ensure that the reader has a model for arguments with modules and their homomorphisms.

4.12.1 Proposition
Let R be a ring, let M and N be R-modules, and let θ : M → N be an R-module homomorphism. Then the following statements are equivalent.
(i) θ is an isomorphism.
(ii) θ is invertible.
(iii) Ker(θ) = 0 and Im(θ) = N.

Proof (i) ⟹ (ii): We have to construct an inverse map φ : N → M. Take an element n ∈ N. Since θ is surjective, there is some m ∈ M with θ(m) = n, and, since θ is injective, m is uniquely determined by n. We can therefore define a map φ : N → M by φ(n) = m, and the method of construction guarantees that, as a map, φ is an inverse of θ.
However, we also need to check that φ is an R-module homomorphism. Suppose that φ(n) = m and that φ(n′) = m′. Then

n + n′ = θ(m) + θ(m′) = θ(m + m′),

which shows that φ(n + n′) = m + m′, as desired. Similarly, φ(rn) = rφ(n) for all r ∈ R.
(ii) ⟹ (iii): If m ∈ Ker(θ), then

m = id_M(m) = φ(θ(m)) = φ(0) = 0,

so Ker(θ) = 0.
If n ∈ N, then

n = id_N(n) = θ(φ(n)) ∈ Im(θ),

so that Im(θ) = N.
(iii) ⟹ (i): Immediate from Lemma 4.9.1. □
4.13 A submodule correspondence

Given a homomorphism θ : M → N of left R-modules, we can find a relationship between the submodules of M and the submodules of N. In applications, the submodules of one module will be known to us, so we can then describe some of the submodules of the other.
For a submodule L ⊆ M of M, we put

θ_*(L) = {θ(l) | l ∈ L},

the image of L, and for a submodule P ⊆ N of N, we define

θ^*(P) = {m ∈ M | θ(m) ∈ P},

the inverse image of P.
A routine verification confirms that θ_*(L) is a submodule of N and that θ^*(P) is a submodule of M.

4.13.1 Proposition
Let R be a ring, let M and N be left R-modules, and let θ : M → N be a surjective R-module homomorphism. Then the following assertions hold.
(i) Let L be a submodule of M with Ker(θ) ⊆ L. Then

θ^*θ_*(L) = L.

(ii) Let P be a submodule of N. Then

θ_*θ^*(P) = P.

(iii) The maps θ_* and θ^* are mutually inverse bijections between

the set of submodules L of M that contain Ker(θ)

and

the set of submodules P of N.

Explicitly,

L ↦ θ_*(L)

and

P ↦ θ^*(P).

Proof
(i) It is obvious that L ⊆ θ^*θ_*(L). To prove equality, take any m ∈ θ^*θ_*(L). Then θ(m) = θ(l) for some l ∈ L, so that

m - l = k ∈ Ker(θ) ⊆ L.

Thus m ∈ L.
(ii) It is clear from the definition that θ_*θ^*(P) ⊆ P. But if p ∈ P, then p = θ(m) for some m ∈ M, as θ is surjective. Thus m ∈ θ^*(P), again by the definition, so we have equality.
The final assertion is now obvious. □

Exercises
4.1 Let R be a commutative ring and let M and N be R-modules. Let Hom(M, N) be the set of all R-module homomorphisms θ : M → N. Show that the addition defined in section 4.2 makes Hom(M, N) into an additive group, with zero element the zero map, and with -θ defined by (-θ)(m) = -(θ(m)) for all m ∈ M.
For any r ∈ R, define rθ by

(rθ)(m) = r(θ(m))

for all m. Verify that rθ ∈ Hom(M, N) and hence that Hom(M, N) is also an R-module.
4.2 Let R be a ring, let M, N and P be R-modules and suppose that θ, ψ ∈ Hom(M, N) and φ, ρ ∈ Hom(N, P).
Verify that

φ(θ + ψ) = φθ + φψ

and that

(φ + ρ)θ = φθ + ρθ.

Let Q be another R-module and let ω ∈ Hom(P, Q). Show that

ω(φθ) = (ωφ)θ.

4.3 Combining the results of the above exercises, show that Hom(M, M) is a ring.
Aside: this ring is called the endomorphism ring of M. Usually, the endomorphism ring of a module is noncommutative (see Exercise 5.9). Thus endomorphism rings do not play an explicit role in this text, in marked contrast to their fundamental importance in ring theory in general.
4.4 Let R be a commutative ring, let M be an R-module and write S = Hom(M, M). For a in R, let

σ(a) : M → M, σ(a)m = am,

be the multiplication map defined in section 4.3. Verify that for any elements a, b ∈ R,

σ(a + b) = σ(a) + σ(b)

and

σ(ab) = σ(a)σ(b)

and that

σ(1_R) = 1_S = id_M.

Aside: in general, a map σ from a ring R to a ring S that satisfies the above equalities is, by definition, a ring homomorphism. A bijective ring homomorphism is called a ring isomorphism. Ring homomorphisms play only a minor role in these notes, although we have used them implicitly in our discussion of residue rings of polynomials in section 2.12 - when we regard a scalar k ∈ F as the same as its image k̄ ∈ F[X]/F[X]f, we are using the fact that the map k ↦ k̄ is an injective ring homomorphism.
4.5 Take M = R in the preceding exercise. Show that σ(a) is an injective homomorphism on R for all nonzero a ∈ R ⟺ R is a domain.
Show also that σ(a) is surjective for all nonzero a in R if and only if R is a field.
4.6 Let R be a commutative ring and let M be an R-module. For each x in M, let

τ(x) : R → M, τ(x)r = rx,

be as in section 4.3. Show that the map τ : M → Hom(R, M) given by x ↦ τ(x) is an isomorphism of R-modules.
4.7 Let F be a field, let M be Fᵖ made into an F[X]-module through a p × p matrix A, and let N be F made into an F[X]-module through a constant λ (so we are in the reverse situation to the example in section 4.7).
Show that the F[X]-module homomorphisms from M to N are given by the row vectors w of length p with wA = λw.
Note: a square matrix A has two sets of eigenvectors, those satisfying Av = λv and those satisfying wA = λw. These are the right and left eigenvectors of A, respectively. As we nearly always regard a matrix as a left operator, we use the term "eigenvector" to mean a right eigenvector.
4.8 Let F be a field and let M and N be F[X]-modules given by the F-linear transformations α and β respectively. Let M(h) denote the F[X]-module given by αʰ for h ≥ 1 and define N(h) similarly.
Show that if θ : M → N is an F[X]-module homomorphism, then θ : M(h) → N(h) is also an F[X]-module homomorphism for all h ≥ 1.

4.9 Here are 5 modules over ℝ[X], listed with the 2 × 2 matrices which define them.

L : A = ( 0  1 )
        ( 1  0 )

M : B = ( 0  -1 )   (see the example in 3.12)
        ( 1   0 )

N : C = ( 1  1 )   (see the example in 3.11)
        ( 0  1 )

P : D = ( 0  1 )   (see the example in 4.8)
        ( 0  0 )

Q : E = ( 0   1 )
        ( -1  0 )

Using the fact that an isomorphism ℝ² → ℝ² must be given by an invertible 2 × 2 matrix, determine which of these modules are isomorphic to one another. Hint: the preceding exercise helps!
4.10 Let L be the ℝ[X]-module given by

( 0  1 )
( 1  0 )

as in Exercise 4.9, and let Z be the ℝ[X]-module given by the 3 × 3 matrix

( 0  1  0 )
( 0  0  0 )
( 0  0  1 )

Show that Hom(L, Z) has dimension 1 as an ℝ-space, and that there are no injective or surjective homomorphisms from L to Z.
Discuss Hom(Z, L).
4.11 Let θ : M → N be a homomorphism, and define θ_* and θ^* as in section 4.13.
Describe θ^*θ_*(L) when the submodule L of M need not contain Ker(θ).
For a submodule P of N, describe θ_*θ^*(P) when θ is not necessarily surjective.
Deduce that θ_* and θ^* need not be inverse bijections when the conditions of Proposition 4.13.1 are relaxed.
Chapter 5

Free Modules

A fundamental result about vector spaces is that any finite dimensional vector space has a basis. In contrast, a module over an arbitrary ring of scalars need not have a basis, and so we must give a special place to those modules that do have a basis, namely, the free modules. As we shall see, the theory of free modules and their bases is a generalization of the familiar theory of vector spaces. In this chapter we give the definition of a free module in terms of bases, and we show how the alternative bases of a free module are related by matrices. We also give the matrix description of the homomorphisms between free modules.
The fact that results about free modules can be re-interpreted in terms of matrices is crucial to our analysis of modules over Euclidean domains in subsequent chapters.
This chapter also contains a brief survey of the properties of determinants, up to the computation of the inverse of a matrix through its adjoint.
The "supplementary topic" sign that adorns the margin refers not to the individual topics that are covered in this chapter, but to the treatment of them. In the lecture course on which these notes are based, time did not permit me to spell out the details of the derivation of the properties of change of basis matrices, or of matrices of transformations. Instead, the students were assured that everything was essentially the same as it was in a previous course on vector spaces. However, I have provided the extra details in this text so that it is more self-contained, and hopefully easier for the reader to follow.

5.1 The standard free modules

Let R be a ring. The standard free left R-module of rank k is the set Rᵏ of all k-tuples

m = (r₁, r₂, ..., r_k)ᵀ,

where r₁, r₂, ..., r_k are arbitrary elements of the ring R. The R-module structure on Rᵏ is given by the expected rules for addition and scalar multiplication: if

m = (r₁, r₂, ..., r_k)ᵀ and n = (s₁, s₂, ..., s_k)ᵀ

are in Rᵏ, then

m + n = (r₁ + s₁, r₂ + s₂, ..., r_k + s_k)ᵀ,

and for r in R,

rm = (rr₁, rr₂, ..., rr_k)ᵀ.

A routine verification confirms that Rᵏ is a left R-module. When the ring of coefficients is a field F, Fᵏ is familiar to us as the standard column space of dimension k.
The standard basis of Rᵏ is the set

e₁ = (1, 0, ..., 0)ᵀ, e₂ = (0, 1, ..., 0)ᵀ, ..., e_k = (0, ..., 0, 1)ᵀ,

where, for j = 1, ..., k, e_j has the entry 1 in the j-th place and zeroes elsewhere.
For any element m of Rᵏ as above, we can write

m = r₁e₁ + ··· + r_je_j + ··· + r_ke_k.

Note that the coefficients r₁, ..., r_k are uniquely determined by m, since members m, n of Rᵏ are the same precisely when r_j = s_j for all j.
When k = 0, the convention is that R⁰ = 0, the zero module, whose standard basis is taken to be the empty set ∅. For k = 1, we have R¹ = R and e₁ = 1, the identity element of R.

5.2 Free modules in general

We extend the definition of a basis from vector spaces to modules in a straightforward way.
Let R be a ring and let M be an R-module. Recall from (3.8) that a subset B = {b₁, ..., b_k} of M generates M as an R-module if for each m ∈ M there is a set of coefficients {r₁, ..., r_k} ⊆ R with

m = r₁b₁ + ··· + r_kb_k.

We say that the subset B is linearly independent over R if the equality

r₁b₁ + ··· + r_kb_k = 0

holds only if

r₁ = ··· = r_k = 0.

Then B is a basis of M if it is linearly independent over R and it generates M as an R-module.
A free R-module is defined to be an R-module M that has a basis. The number of elements in the basis is called the rank of M, and written rank(M).
It is clear that the standard basis of Rᵏ, as defined in the preceding section, is actually a basis of the standard free module Rᵏ, and consequently Rᵏ is indeed free, of rank k. By our conventions, the zero module 0 = R⁰ is free of rank 0 since its basis is the empty set.
Before we commence a detailed discussion of bases, here are some points to bear in mind.
• When the ring of scalars is a field F, a finitely generated F-module is the same thing as a finite dimensional vector space over F. By elementary linear algebra, such a vector space V always has a basis, and the rank of V is usually referred to as the dimension of V.
• Two basic results in elementary linear algebra are that any linearly independent subset of a finite dimensional vector space V over a field F can be extended to a basis of V, and that any generating set of V contains a basis of V.
These results do not hold for modules over more general rings. For example, the subset {2} of Z is linearly independent but Z has no basis of the form {2, a, ...}, whatever the choice of a, ... in Z. The set {2, 3} generates Z as a Z-module, since 1 = 2 · 2 - 3, but no subset of {2, 3} is a basis.
• A consequence of the results quoted above is that every finite dimensional vector space over a field F is a free F-module. When the coefficient ring R is not a field, we expect to find R-modules that are not free (Exercise 5.1). For example, the Z-modules Z_m, m > 0 (1.11), contain no linearly independent subsets, since mx = 0 for any x ∈ Z_m.
• Warning! Some authors define bases in a different way which allows the possibility that Z_m has a basis as a Z-module; however, the definition of a free module must then be altered.
• Whether or not a module is free depends on the ring of scalars, as do the concepts of linear independence and generation. Consider the residue ring Z_p where p is a prime. This is a field (1.11.2) and so free of rank 1 as a Z_p-module, but it is not free as a Z-module.
• Another illustration of the same type is provided by the standard vector space Fᵏ over a field F, made into an F[X]-module M with X acting as 0. Then M is not free as an F[X]-module as it has no linearly independent subsets over F[X].
• According to our definition, the rank of a free module depends on the basis B. We shall see soon (Theorem 5.5.2) that the rank is in fact independent of the choice of basis (at least, for the rings of most interest to us in these notes). However, there are rings for which the rank of a free module can vary with the choice of basis - see section 2.3 of [B & K: IRM].
• Our definition of a free module requires that the rank is finite. Since free modules of infinite rank play only a minor role in this text, we have relegated their definition to Exercise 7.10. The extension of the definitions and results of this chapter to modules with infinite ranks is discussed in [B & K: IRM], (2.2.13).

The following restatement of the definition of a basis is very useful.

5.2.1 Lemma
Let R be a ring, let M be a left R-module, and let B = {b₁, ..., b_k} be a subset of M.
Then the following assertions are equivalent.
(i) B is a basis of M as an R-module.
(ii) Given m ∈ M, there is a unique set of coefficients {r₁, ..., r_k} in R with

m = r₁b₁ + ··· + r_kb_k.

Proof (i) ⟹ (ii): Suppose that B is a basis. Since B generates M, we can write m = r₁b₁ + ··· + r_kb_k for some r₁, ..., r_k in R. If also m = s₁b₁ + ··· + s_kb_k with s₁, ..., s_k in R, then

0 = (r₁ - s₁)b₁ + ··· + (r_k - s_k)b_k

and so r_i = s_i for all i by linear independence, which shows that the coefficients are unique.
(ii) ⟹ (i): Our assumption obviously implies that B generates M. To prove linear independence, suppose that

0 = r₁b₁ + ··· + r_kb_k

for some coefficients r₁, ..., r_k. Since

0 = 0b₁ + ··· + 0b_k

also, the uniqueness of the coefficients guarantees that r_i = 0 for all i. □

5.3 A running example

Our general discussion of bases is necessarily rather formal, so we will analyse an example to provide concrete illustrations of the various concepts that we encounter.
We let b₁ = (1, -1)ᵀ and b₂ = (2, a)ᵀ be elements of Z², where, for the moment, a is any integer, and we ask if the set B = {b₁, b₂} is a basis of Z².
To see if B is linearly independent, we must try to solve the equation

0 = r₁b₁ + r₂b₂ with r₁, r₂ ∈ Z.

This gives the pair of equations

r₁ + 2r₂ = 0
-r₁ + ar₂ = 0

and hence the equation

(2 + a)r₂ = 0.

Thus the set B is linearly independent provided that a ≠ -2.
To see if B generates Z², we try to write

e₁ = r₁b₁ + r₂b₂ for some r₁, r₂ ∈ Z.

The pair of equations is now

r₁ + 2r₂ = 1
-r₁ + ar₂ = 0

which give the equation

(2 + a)r₂ = 1.

This can be solved in Z only if 2 + a = ±1, that is, a = -1 or a = -3. In the case a = -1, we have

e₁ = -b₁ + b₂
e₂ = -2b₁ + b₂

and so, for an arbitrary element

x = (x₁, x₂)ᵀ,

we have

x = (-x₁ - 2x₂)b₁ + (x₁ + x₂)b₂.

This confirms that, for a = -1, B is a basis of Z²; the reader is recommended to make the corresponding calculation for a = -3.
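A quick machine check of this calculation is possible; the sketch below (Python with sympy) anticipates the determinant criterion of section 5.12: B is a basis of Z² exactly when the matrix with columns b₁, b₂ has determinant ±1.

from sympy import Matrix

for a in range(-6, 4):
    P = Matrix([[1, 2],
                [-1, a]])        # columns b1 = (1, -1) and b2 = (2, a)
    if abs(P.det()) == 1:        # invertible over Z
        print(a, P.inv())        # only a = -3 and a = -1 are printed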

5.4 Bases and isomorphisms

Now we give a useful characterization of the bases of a free module in terms of the isomorphisms between the given module and a standard free module.

5.4.1 Theorem
Let R be a ring and let M be a left R-module. Then there is a bijective correspondence between
(i) the set of all R-module isomorphisms θ : Rᵏ → M
and
(ii) the set of all bases B = {b₁, ..., b_k} of M.
Under this correspondence, an isomorphism θ corresponds to the basis θ(e₁), ..., θ(e_k), where e₁, ..., e_k is the standard basis of Rᵏ.
In particular, an R-module M is free if and only if M ≅ Rᵏ for some k.

Proof Suppose that θ : Rᵏ → M is given, put θ(e_j) = b_j for j = 1, ..., k and define B = {b₁, ..., b_k}. We show that B is a basis by verifying in turn that B generates M and that B is linearly independent.
Given m ∈ M, we have m = θ(x) for some x ∈ Rᵏ, since θ is surjective. But x = x₁e₁ + ··· + x_ke_k for some x₁, ..., x_k ∈ R, and therefore

m = θ(x)
  = θ(x₁e₁) + ··· + θ(x_ke_k)
  = x₁θ(e₁) + ··· + x_kθ(e_k)
  = x₁b₁ + ··· + x_kb_k,

which shows B generates M.
If 0 = x₁b₁ + ··· + x_kb_k, then 0 = θ(x₁e₁ + ··· + x_ke_k). But θ is injective, so 0 = x₁e₁ + ··· + x_ke_k in Rᵏ, from which x₁ = ··· = x_k = 0.
Conversely, suppose a basis B is given. By Lemma 5.2.1, each element m in M can be written m = r₁b₁ + ··· + r_kb_k with unique coefficients. We can therefore define maps θ : Rᵏ → M by

θ(r₁e₁ + ··· + r_ke_k) = r₁b₁ + ··· + r_kb_k

and ψ : M → Rᵏ by

ψ(r₁b₁ + ··· + r_kb_k) = r₁e₁ + ··· + r_ke_k.

A direct verification confirms that θ and ψ are homomorphisms of R-modules, and they are obviously inverses of one another, which shows that θ is an isomorphism - see Proposition 4.12.1.
The correspondence must be a bijection since θ and B determine one another uniquely.
The final assertion is now clear. □
For an illustration, take the basis

B = {b₁ = (1, -1)ᵀ, b₂ = (2, -1)ᵀ}

of Z² which we constructed in section 5.3. The theorem predicts that there is a corresponding isomorphism θ : Z² → Z² with θ(e₁) = b₁ and θ(e₂) = b₂.
For a general element x = (x₁, x₂)ᵀ of Z², we have

θ(x) = x₁b₁ + x₂b₂ = (x₁ + 2x₂, -x₁ - x₂)ᵀ.

Since x = (-x₁ - 2x₂)b₁ + (x₁ + x₂)b₂, the inverse of θ is the homomorphism

ψ(x) = (-x₁ - 2x₂, x₁ + x₂)ᵀ.

5.5 Uniqueness of rank

Our aim now is to show that the rank of a free module is unambiguously defined as the number of elements of any basis. To simplify the discussion, we assume the fact that a commutative domain R has a field of fractions Q whose elements can be written in the form r/s with r, s ∈ R, s ≠ 0 - see §3.10 of [Allenby]. The benefits of our assumption are contained in the following technical lemma.

5.5.1 Lemma
Let R be a commutative domain with field of fractions Q. Then the following statements hold.
(i) If q_i = r_i/s_i, i = 1, ..., k, is any finite set of elements in Q, there is an element s ∈ R and elements a_i ∈ R with q_i = a_i/s for all i.
Note: s is called a common denominator of q₁, ..., q_k, and this rewriting process is known as "placing over a common denominator".
(ii) Rᵏ ⊆ Qᵏ for any k ≥ 1.
(iii) If v ∈ Qᵏ, then v = (1/s)m for some s ∈ R and m ∈ Rᵏ.
(iv) If B is an R-basis of Rᵏ, then B is also a Q-basis of Qᵏ.

Proof (i) We have q_i = r_i/s_i ∈ Q, r_i, s_i ∈ R for each i = 1, ..., k, with all s_i ≠ 0. Let s = s₁...s_k and put a_i = (s₁ ... s_{i-1}s_{i+1} ... s_k)r_i for i = 1, ..., k.
(ii) This is obvious since R ⊆ Q.
(iii) Let v ∈ Qᵏ. Then

v = (q₁, ..., q_k)ᵀ = (a₁/s, ..., a_k/s)ᵀ = (1/s)m,

where q₁, ..., q_k are in Q, s is a common denominator as in part (i), and

m = (a₁, ..., a_k)ᵀ ∈ Rᵏ.

(iv) Write B = {b₁, ..., b_h}. (We do not assume k = h!) First, we show that B remains linearly independent in Qᵏ. Suppose that q₁b₁ + ··· + q_hb_h = 0 with q_i ∈ Q. Keeping the above style of notation, we have (1/s)(a₁b₁ + ··· + a_hb_h) = 0 and hence a₁b₁ + ··· + a_hb_h = 0 in Rᵏ. Thus all a_i = 0 and hence all q_i = 0.
To see that B generates Qᵏ, take v ∈ Qᵏ, write v = (1/s)m with m ∈ Rᵏ, and note that m = r₁b₁ + ··· + r_hb_h for some elements r_i ∈ R. □
The uniqueness of rank follows easily.

5.5.2 Theorem
Let R be a domain and let M be a free R-module. Then any two bases of M have the same number of elements.

Proof Suppose that M has two bases, one with h elements and one with k elements. By Theorem 5.4.1, there are isomorphisms θ : Rʰ → M and φ : Rᵏ → M, and hence an isomorphism θ⁻¹φ : Rᵏ → Rʰ. Thus the standard free module Rʰ itself has a basis B with k elements, again by Theorem 5.4.1.
But the dimension of a vector space over the field of fractions Q is unique, and, as B is also a basis of Qʰ, we have h = k. □
Remark: the rank of a free R-module is unique for any commutative ring R (Exercise 5.2).

5.6 Change of basis

We next explore the relationships between the various possible bases of a free module over a commutative domain R. To set the scene, we consider the trivial but instructive case of the standard free module of rank 1, that is, R itself.
The standard basis of R is the set {1}, and any other basis must consist of one element, say b. The submodule of R generated by b is the principal ideal Rb, so that {b} generates R precisely when Rb = R, that is, b is a unit (see Lemma 1.8.1). If b is a unit, then the equation rb = 0 holds only if r = 0, so that {b} is linearly independent as well. Thus the bases of R are the sets {b} with b a unit. If {c} is another basis, the two bases are related by the innocuous equations b = (bc⁻¹)c and c = (cb⁻¹)b, in which both "change of basis coefficients" bc⁻¹ and cb⁻¹ are themselves units.
Now consider a general free R-module, with bases B = {b₁, ..., b_k} and C = {c₁, ..., c_k} - by Theorem 5.5.2, the bases must contain the same number of elements. To relate the bases, we use the fact that a member of a free module can be written uniquely as a linear combination of the elements of a given basis (Lemma 5.2.1).
We can therefore write

b₁ = p₁₁c₁ + p₂₁c₂ + ··· + p_{k1}c_k
b₂ = p₁₂c₁ + p₂₂c₂ + ··· + p_{k2}c_k
⋮
b_k = p_{1k}c₁ + p_{2k}c₂ + ··· + p_{kk}c_k     (5.1)

and

c₁ = q₁₁b₁ + q₂₁b₂ + ··· + q_{k1}b_k
c₂ = q₁₂b₁ + q₂₂b₂ + ··· + q_{k2}b_k
⋮
c_k = q_{1k}b₁ + q_{2k}b₂ + ··· + q_{kk}b_k     (5.2)

in which the coefficients p_{ij} and q_{ij}, i, j = 1, ..., k, are uniquely determined elements of R. The order of the suffices may be unexpected, but it is best suited to computations, which fact will become apparent as we progress (see Theorem 12.11.1 in particular). The underlying reason for this choice of ordering of suffices is that we write both scalars and transformations on the left-hand side of module elements.
Substituting, we obtain an expression for the basis B in terms of itself: for each h = 1, ..., k, we have

b_h = Σ_{i=1}^{k} p_{ih}c_i = Σ_{i=1}^{k} p_{ih} ( Σ_{j=1}^{k} q_{ji}b_j ) = Σ_{j=1}^{k} ( Σ_{i=1}^{k} q_{ji}p_{ih} ) b_j.     (5.3)

However, the unique way of expressing B in terms of itself must be

b₁ = 1b₁ + 0b₂ + ··· + 0b_k
b₂ = 0b₁ + 1b₂ + ··· + 0b_k
⋮
b_k = 0b₁ + 0b₂ + ··· + 1b_k     (5.4)

which gives the identities

Σ_{i=1}^{k} q_{ji}p_{ih} = { 1 if h = j
                           { 0 if h ≠ j.     (5.5)

The last equation can be summarized conveniently in matrix form. Put

P_{C,B} = (p_{ih}) and P_{B,C} = (q_{ji}),

so that both P_{C,B} and P_{B,C} are k × k matrices over R, the change of basis matrices for the pair B and C. Then Eq. (5.5) reads

P_{B,C}P_{C,B} = I,     (5.6)

where I is the k × k identity matrix.
Reversing the roles of B and C, we find that

P_{C,B}P_{B,C} = I     (5.7)

also, so that P_{C,B} and P_{B,C} are mutually inverse matrices over R.
We illustrate these calculations with the basis B = {b₁, b₂} of Z²,

b₁ = (1, -1)ᵀ, b₂ = (2, -1)ᵀ,

which we considered in sections 5.3 and 5.4, taking C = E, the standard basis.
It is clear that

P_{E,B} = ( 1   2 )
          ( -1 -1 )

that is, the columns of P_{E,B} are simply the vectors b₁, b₂ themselves.
To compute P_{B,E}, recall from section 5.3 that

e₁ = -b₁ + b₂
e₂ = -2b₁ + b₂;

thus

P_{B,E} = ( -1 -2 )
          (  1  1 )
If we are given a basis B of a free module and an invertible matrix Q of the correct size, we can construct a new basis C so that Q = P_{B,C}; but, before we do so, it is convenient to discuss coordinates.

5.7 Coordinates
Given a free R-module M with basis B = {b₁, ..., b_k}, we know from Lemma 5.2.1 that for m ∈ M, there is a unique set of coefficients {r₁, ..., r_k} in R with

m = r₁b₁ + ··· + r_kb_k.

The element

(m)_B = (r₁, ..., r_k)ᵀ ∈ Rᵏ

is called the coordinate vector of m with respect to B. (Strictly speaking, (m)_B isn't a vector unless R is a field, but it is convenient to extend the use of the word "vector" to the more general situation.) Now suppose that C is another basis of M and that B and C are related as in Eqs. (5.1) and (5.2). Substituting, we obtain

m = r₁(p₁₁c₁ + ··· + p_{k1}c_k) + ··· + r_k(p_{1k}c₁ + ··· + p_{kk}c_k)
  = (p₁₁r₁ + ··· + p_{1k}r_k)c₁ + ··· + (p_{k1}r₁ + ··· + p_{kk}r_k)c_k,

which can be summarized as

(m)_C = P_{C,B}(m)_B.     (5.8)

Notice that when E is the standard basis of the standard free module Rᵏ, we have

(x)_E = x for all x ∈ Rᵏ.

In our running example, the calculations in section 5.3 show that

(x)_B = (-x₁ - 2x₂, x₁ + x₂)ᵀ

for our illustrative basis B, while (with C = E)

P_{E,B} = ( 1   2 )
          ( -1 -1 )

by the computation in section 5.6. An easy verification confirms Formula 5.8.

5.8 Constructing bases

Suppose that we have a basis B of the free module M of rank k, and that we have an invertible k × k matrix Q = (q_{ij}) over R, with inverse P = (p_{ij}). We now interpret Eq. (5.2) as the definition of a set C = {c₁, ..., c_k} of elements of M, and we claim that C is a basis of M.
First, notice that the identities in Eq. (5.1) still hold - this is confirmed by substituting for c₁, ..., c_k in the right-hand side of the equation and using the fact that P is given to be the inverse of Q. It follows that C generates M, for any m ∈ M has the form

m = r₁b₁ + ··· + r_kb_k,

and so can be expressed in terms of c₁, ..., c_k by substitution.
Suppose next that 0 = s₁c₁ + ··· + s_kc_k for some scalars s₁, ..., s_k in R. Substituting for each c_i in terms of the b_i's and calculating coefficients as in section 5.7 above, we find that

0 = (0)_B = Qs,

where

s = (s₁, ..., s_k)ᵀ.

Then s = PQs = 0, so that C is linearly independent and hence a basis for M.
Notice that Q = P_{B,C} and P = P_{C,B}.

5.9 Matrices and homomorphisms

Suppose that M and N are free modules over a commutative domain R. Our aim in this section is to show that an R-module homomorphism θ : M → N can be represented by a matrix T, and conversely, that a matrix of the correct size defines a homomorphism from M to N.
Let B = {b₁, ..., b_k} and C = {c₁, ..., c_l} be bases of M and N respectively, so that M has rank k and N has rank l. Suppose that θ : M → N is an R-module homomorphism. The image θ(b_i) of a member of B is an element of N, and so, by Lemma 5.2.1, can be written as a linear combination

θ(b_i) = t_{1i}c₁ + ··· + t_{li}c_l

with unique coefficients t_{1i}, ..., t_{li} in the ring of scalars R. We can thus associate with θ an l × k matrix

T = (θ)_{C,B} = ( t₁₁    t₁₂    ···  t_{1k} )
                (  ⋮      ⋮           ⋮    )
                ( t_{h1}  t_{h2} ···  t_{hk} )     (5.9)
                (  ⋮      ⋮           ⋮    )
                ( t_{l1}  t_{l2} ···  t_{lk} )

with entries in R. The matrix T is called the matrix of the homomorphism θ with respect to the pair of bases B, C.
Notice that the i-th column of T is the coordinate vector (θ(b_i))_C of θ(b_i) with respect to C.
A computation very similar to that used to derive Eq. (5.8) shows that for any element m in M, the coordinate vector of θ(m) is related to that of m by the formula

(θ(m))_C = (θ)_{C,B}(m)_B.     (5.10)

(In fact, (5.8) can be obtained as a special case of (5.10); see Exercise 5.4.)
Conversely, if an l × k matrix T is given, then Eq. (5.10) can be used to define an R-module homomorphism θ from M to N, since the image θ(m) of an element m ∈ M is uniquely determined by specifying its coordinate vector (θ(m))_C.
The fact that the entries t_{hi} of (θ)_{C,B} are uniquely determined by θ confirms that, for fixed bases B and C, the correspondence

θ ↔ (θ)_{C,B}     (5.11)

defines a bijection between R-module homomorphisms from M to N and l × k matrices over R.
The following results are often useful.

5.9.1 Proposition
Let R be a commutative domain and let M, N and P be free R-modules, with bases B, C and D respectively. Then the following hold.
(i) (id_M)_{B,B} = I, where id_M is the identity map on M and I is a k × k identity matrix, k = rank(M).
(ii) If θ : M → N and φ : N → P are R-module homomorphisms, then

(φ)_{D,C}(θ)_{C,B} = (φθ)_{D,B}.

Proof Let B = {b₁, ..., b_k}, C = {c₁, ..., c_l} and D = {d₁, ..., d_m}.
(i) This is obvious, since id_M(b_i) = b_i for all i.
(ii) Write (θ)_{C,B} = T = (t_{ij}), an l × k matrix, and (φ)_{D,C} = S = (s_{hi}), an m × l matrix. Then

φθ(b_j) = φ( Σ_{i=1}^{l} t_{ij}c_i )
        = Σ_{i=1}^{l} t_{ij}φ(c_i)
        = Σ_{i=1}^{l} t_{ij} ( Σ_{h=1}^{m} s_{hi}d_h )
        = Σ_{h=1}^{m} ( Σ_{i=1}^{l} s_{hi}t_{ij} ) d_h,

which shows that, for all h = 1, ..., m and j = 1, ..., k, the (h, j)-entry of the m × k matrix (φθ)_{D,B} is the same as that of the product matrix ST. □
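In the standard case of section 5.10 below, where homomorphisms and matrices coincide, part (ii) is just the statement that composition of maps corresponds to matrix multiplication; a small sympy illustration (the matrices are invented for the purpose):

from sympy import Matrix

T = Matrix([[1, 2, 0],
            [0, 1, 1]])     # theta : R^3 -> R^2
S = Matrix([[1, 0],
            [2, 1],
            [0, 3]])        # phi : R^2 -> R^3

v = Matrix([1, 1, 1])
print(S * (T * v) == (S * T) * v)   # True: phi(theta(v)) is given by ST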

5.9.2 Corollary
Let M and N be free R-modules, with bases B and C respectively, and let θ : M → N be an R-module homomorphism.
Then θ is an isomorphism if and only if (θ)_{C,B} is an invertible matrix, in which case ((θ)_{C,B})⁻¹ = (θ⁻¹)_{B,C}.

Proof ⟹: Suppose that θ is an isomorphism. By Proposition 4.12.1, θ has an inverse, and by the preceding result

(θ)_{C,B}(θ⁻¹)_{B,C} = (id_N)_{C,C} = I

and

(θ⁻¹)_{B,C}(θ)_{C,B} = (id_M)_{B,B} = I.

⟸: If (θ)_{C,B} has an inverse S = (s_{hi}), take φ : N → M to be the homomorphism with matrix S. Then both products θφ and φθ have as matrix the identity matrix, so they are both identity maps, on N and M respectively. Thus φ = θ⁻¹. □

5.10 Illustration: the standard case

Suppose that M and N are the standard free modules Rᵏ and Rˡ respectively, and that we take as bases the standard bases, which we denote E(k) and E(l). An element m of M can then be identified with its coordinate vector (m)_{E(k)}. Likewise, if θ is a homomorphism from Rᵏ to Rˡ, we have θ(m) = (θ(m))_{E(l)}. Thus, Eq. (5.10) appears as

θ(m) = (θ)_{E(l),E(k)}m     (5.12)

so that the homomorphism θ is given by the matrix

T = (θ)_{E(l),E(k)},

which is an l × k matrix whose i-th column is simply the image θ(e_i) of the i-th standard basis vector e_i in E(k).
Thus for many purposes the homomorphism θ can be viewed as being effectively the same as the matrix T it defines. However, this identification of a homomorphism and a matrix is dependent on the fact that we use the standard bases, and so it can be misleading when we use nonstandard bases of standard free modules.
When the ring of scalars is a field F and the free modules are the standard vector spaces Fᵏ and Fˡ with their standard bases, we recover the correspondence between F-linear transformations and matrices that is familiar from elementary linear algebra, as promised in sections 4.6 and 4.10.

5.11 Matrices and change of basis

Next, we show how the correspondence between homomorphisms and matrices depends on the choice of bases for the free modules M and N. This relationship will be important for our subsequent analysis of modules in general. Let B = {b₁, ..., b_k} and B′ = {b′₁, ..., b′_k} be bases of M, and let C = {c₁, ..., c_l} and C′ = {c′₁, ..., c′_l} be bases of N. Then an R-module homomorphism θ : M → N has two associated l × k matrices,

T_{C,B} = (t_{hi})

and

T_{C′,B′} = (t′_{gj}).

We claim that these matrices are related by the formula

T_{C′,B′} = P_{C′,C}T_{C,B}P_{B,B′},     (5.13)

where P_{C′,C} and P_{B,B′} are change of basis matrices. There are two approaches to the verification of this formula. One is simply to expand the matrix on the right and check that we have the desired equality. Although this method is elementary, it does require a lot of detailed calculation. A more sophisticated approach is to use the calculations that we have already performed, together with a nice observation.
If we compare the formulas given in Eqs. (5.1) and (5.9), we see that the change of basis matrix P_{B,B′} is just the matrix of the identity transformation on M with respect to the pair of bases B′, B, so that

P_{B,B′} = (id_M)_{B,B′}.     (5.14)

Likewise

P_{C′,C} = (id_N)_{C′,C}.

Since

θ = id_N · θ · id_M,

we obtain the relation 5.13 immediately from the product formula for matrices of homomorphisms that we obtained in Proposition 5.9.1.

5.12 Determinants and invertible matrices

Invertible matrices play an important role in the theory of modules, since they represent both changes of bases within free modules and also isomorphisms between free modules. It will therefore be useful to develop some criteria that allow us to decide if a given matrix is invertible. In this section we give one such criterion in terms of the determinant - later we will give a more constructive approach based on row and column operations (section 10.8).
We adopt an inductive definition of the determinant. Let A = (a_{ij}) be a k × k matrix with entries in a commutative ring R. If k = 1, we have det(A) = a₁₁, and for k = 2,

det(A) = a₁₁a₂₂ - a₁₂a₂₁.

For general k, we assume we know how to calculate the determinant of a (k-1) × (k-1) matrix. For each pair i, j of indices, let m_{ij} be the matrix formed from A by eliminating row i and column j of A, and define the (i, j)-cofactor of A to be

A_{ij} = (-1)^{i+j} det(m_{ij}).

We can then take as our working definition the formula

det(A) = a₁₁A₁₁ + a₁₂A₁₂ + ··· + a_{1k}A_{1k},

that is, expansion from the first row. For example, when k = 2, A₁₁ = a₂₂ and A₁₂ = -a₂₁, so we recover the familiar formula.
We assume the basic properties of the determinant, which we list below for future reference. These results hold for matrices with entries in any commutative ring. Proofs can be found in Chapter 7 of [Cohn 1].
Det 1: Let A be a k × k matrix over a commutative ring R. Then det(A) ∈ R.
Det 2: If B is also a k × k matrix over R, then

det(AB) = det(A) det(B).

Det 3: det(Aᵀ) = det(A), where Aᵀ is the transpose of A.
Det 4: If A has two identical rows or columns, det(A) = 0.
Det 5: If B is formed from A by adding a scalar multiple of one row (or column) to a different row (or column), then det(B) = det(A).
Det 6: The determinant is an additive function of the rows of a matrix. More precisely, let a_i be the i-th row of A, which is a row vector of length k, and suppose that

a_i = b_i + c_i

for row vectors b_i and c_i. Write B for the matrix which is formed from A by replacing row a_i by b_i, and define C similarly. Then

det(A) = det(B) + det(C).

There is a corresponding result for columns.
Det 7: If

A = ( a₁₁        0          ···  0            0    )
    ( a₂₁        a₂₂        ···  0            0    )
    (  ⋮                                      ⋮    )
    ( a_{k-1,1}  a_{k-1,2}  ···  a_{k-1,k-1}  0    )
    ( a_{k1}     a_{k2}     ···  a_{k,k-1}    a_{kk} )

is a lower triangular matrix, then

det(A) = a₁₁a₂₂ ··· a_{kk}.

In particular, det(I) = 1.
Det 8: The following relations hold:

a_{h1}A_{i1} + ··· + a_{hk}A_{ik} = { det(A) for h = i
                                    { 0      for h ≠ i

and

A_{1i}a_{1j} + ··· + A_{ki}a_{kj} = { det(A) for i = j
                                    { 0      for i ≠ j.

In the case h = i, the first relation tells us that the determinant can be expanded from row h for any h = 1, ..., k. Similarly, for i = j, the second tells us that we can expand from the j-th column.
Det 9: The adjoint of A is adj(A) = (A_{ij})ᵀ, the transpose of the matrix of cofactors of A. Then the formulas of Det 8 can be re-interpreted as the matrix product formulas

A · adj(A) = det(A)I = adj(A) · A,

where the middle term is simply the diagonal matrix with all diagonal terms equal to det(A).
This last property leads to the invertibility criterion that we have been seeking.

5.12.1 Theorem
Let A be a k × k matrix over a commutative ring R. Then A has an inverse with entries in R if and only if det(A) is a unit in R.
Furthermore, if A is invertible, then

A⁻¹ = (det(A))⁻¹ adj(A).

Proof Suppose A has inverse A⁻¹ with entries in R. Then det(A⁻¹) ∈ R also. Since

det(A) det(A⁻¹) = det(A · A⁻¹) = det(I) = 1,

we see that det(A) is a unit in R.
Conversely, suppose det(A) is a unit of R. Each cofactor A_{ij} is the determinant of a matrix with entries in R and so belongs to R. Thus adj(A) has entries in R, as does (det(A))⁻¹ adj(A), which is therefore the inverse of A by the relations in Det 9. □
Remark: the strength of the above result is that it tells us when a matrix has an inverse with entries in the given ring, rather than some larger ring.
For example, let A be a square matrix over the ring of integers Z. The condition that A have an inverse with entries in Z is the stringent requirement that det(A) = ±1, since the only units in Z are ±1. On the other hand, A will have an inverse in Q provided only that det(A) ≠ 0.
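Both halves of the theorem can be seen in a short computation; the following sympy sketch (matrices chosen for illustration) uses the built-in adjugate, which is the adjoint adj(A) of Det 9.

from sympy import Matrix, eye

A = Matrix([[2, 1],
            [1, 1]])                   # det(A) = 1, a unit in Z
print(A.det(), A.adjugate())           # 1 and Matrix([[1, -1], [-1, 2]])
print(A * A.adjugate() == A.det() * eye(2))   # True: the formula of Det 9

B = Matrix([[2, 0],
            [0, 1]])                   # det(B) = 2, not a unit in Z
print(B.inv())                         # inverse exists over Q only: entry 1/2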
For a final illustration, we return to the problem considered first in section 5.3, that of determining the values of a in Z for which the elements b₁ = (1, -1)ᵀ and b₂ = (2, a)ᵀ form a basis of Z². If B = {b₁, b₂} is a basis, then

P = P_{E,B} = ( 1   2 )
              ( -1  a )

must be invertible, by the results of section 5.6. On the other hand, if P is invertible, then B is a basis by section 5.8.
Now det(P) = a + 2, which is a unit in Z when a = -1 or a = -3, thus confirming the calculation made in section 5.3.

Exercises
5.1 Let R be a commutative ring and suppose that the nonzero element r ∈ R is not a unit of R. Show that the R-module R/Rr is not a free R-module. (Hence a ring which is not a field has non-free modules.)
5.2 (a) Let P be an invertible matrix over a commutative domain R. Using Theorem 5.5.2, show that P must be a square matrix.
(b) Let R now be any commutative ring. Show that if θ : Rᵏ → Rˡ is an isomorphism, then there is an invertible l × k matrix T with Tv = θ(v) for all v ∈ Rᵏ.
(c) If T is not square, add zero rows or columns to obtain a square matrix, and so obtain a contradiction, using Det 8. Deduce that Theorem 5.5.2 holds for any commutative ring.
5.3 Let P be a k × k matrix over a commutative domain R. Show that P is invertible if and only if the columns of P form a basis B of Rᵏ, in which case P = P_{E,B}.
5.4 Let B and C be bases of a free R-module M. Use the fact that

(id_M)_{C,B} = P_{C,B}

to obtain Formula 5.8:

(m)_C = P_{C,B}(m)_B

from Formula 5.10:

(θ(m))_C = (θ)_{C,B}(m)_B.

5.5 Let B, C and D be bases of a free R-module M. Show that

P_{D,C} · P_{C,B} = P_{D,B}.

5.6 Let

B = {(1, -1)ᵀ, (2, -1)ᵀ}

and

D = {(1, -1)ᵀ, (2, -3)ᵀ}

be bases of Z² (see section 5.3).
Compute P_{D,B}.
5.7 Find all values of a in the Gaussian integers Z[i] for which the elements b₁ = (1, -1)ᵀ and b₂ = (2, a)ᵀ form a basis of Z[i]².
5.8 Let g(X) be a polynomial over ℝ. Find all f(X) in ℝ[X] for which the elements b₁ = (1, g(X))ᵀ and b₂ = (2, f(X))ᵀ form a basis of ℝ[X]².
5.9 Let R be a commutative ring and let E be the standard basis of Rⁿ. For each θ in Hom(Rⁿ, Rⁿ), put α(θ) = (θ)_{E,E}. Show that α is a ring isomorphism from Hom(Rⁿ, Rⁿ) to the ring Mₙ(R) of n × n matrices over R.
Let M be any free R-module with rank(M) > 1. Deduce that the ring Hom(M, M) is not commutative.
5.10 Let

A = ( a₁₁        0          ···  0            0    )
    ( a₂₁        a₂₂        ···  0            0    )
    (  ⋮                                      ⋮    )
    ( a_{k-1,1}  a_{k-1,2}  ···  a_{k-1,k-1}  0    )
    ( a_{k1}     a_{k2}     ···  a_{k,k-1}    a_{kk} )

be a lower triangular matrix over a commutative ring R.
Show that A is invertible if and only if all the diagonal terms a₁₁, a₂₂, ..., a_{kk} are units in R. Hint: Det 7.
Show further that if A is invertible, the inverse of A is also lower triangular.
Chapter 6

Quotient Modules and Cyclic Modules

Let R be a ring and let θ be an R-module homomorphism from an R-module M to an R-module N. In Chapter 4 we associated to θ a submodule Ker(θ) of M, the kernel of θ, which measures the failure of θ to be injective - θ is injective precisely when Ker(θ) = 0. Our first construction in this chapter is in a way a reverse procedure. Given a submodule L of M, we find a module M/L, the quotient of M by L, and a homomorphism π from M to M/L which has kernel L. We then use this construction to manufacture an injective homomorphism from a non-injective homomorphism θ; the new homomorphism, θ̄, is the mapping from M/Ker(θ) to N that is induced by θ. This construction leads us to a crucial result, the First Isomorphism Theorem, which is a very useful tool for the production of isomorphisms and hence, ultimately, for the description of modules.
We illustrate this approach by examining the cyclic modules over a ring R, that is, the modules of the form M = Rx for an element x in M. We prove that a cyclic module is isomorphic to a quotient module R/I for a left ideal I of R. Thus the structure of the cyclic R-modules is determined by the nature of the ideals in R. In particular, when R is Euclidean we are able to give a complete description of the submodules of a cyclic module in terms of the factorization of the elements of R.
Finally, we make a detailed analysis of the action of a polynomial variable X on a cyclic module F[X]/F[X]f over the polynomial ring F[X], which leads to a first result on normal forms of matrices.

6.1 Quotient modules

Let R be an arbitrary ring and let L be a submodule of a left R-module M. We construct the quotient module (sometimes called the factor module) M/L of M by L in much the same way as we constructed the residue ring R/I from a ring R and an ideal I in section 1.10.
Define a relation on M by the rule that

m ≡ n ⟺ m - n ∈ L.

The verification that ≡ is an equivalence relation is a matter of routine checking, similar to that performed in detail in the proof of Lemma 1.10.1. The equivalence class of an element m of M is the set

m̄ = m + L = {m + l | l ∈ L};

we usually prefer the notation m̄. The quotient module M/L is defined to be the set of all such classes, with addition given by the rule

m̄ + n̄ = (m + n) + L for m̄, n̄ ∈ M/L,

and scalar multiplication given by

r · m̄ = rm + L for r ∈ R and m̄ ∈ M/L.

More routine verifications, very similar to those made in the proof of Proposition 1.10.2, show that these operations are well-defined and make M/L into a left R-module, with zero element 0̄.

6.2 The canonical homomorphism

The map π : M → M/L defined by π(m) = m̄ is called the canonical homomorphism from M to M/L. The fact that π is an R-module homomorphism is immediate from the definition, and π is surjective since every element of M/L has the form m̄ for some m in M.
In Lemma 4.9.1, we showed that the kernel Ker(θ) of an R-module homomorphism θ : M → N is a submodule of M. Now we obtain a converse.

6.2.1 Lemma
Let L be a submodule of M and let π : M → M/L be the canonical homomorphism. Then

Ker(π) = L.

Proof

m ∈ Ker(π) ⟺ m̄ = 0̄
           ⟺ m - 0 ∈ L
           ⟺ m ∈ L. □

6.3 Induced homomorphisms

An important use of the quotient module construction is that it enables us to construct new homomorphisms from old. The new homomorphism is often an isomorphism, which observation is a key contribution to the task of describing an arbitrary module in terms of a collection of standard modules.
Suppose that we are given an R-submodule L of a left R-module M and an R-module homomorphism θ from M to a left R-module N. Suppose also that L ⊆ Ker(θ). We can then define a mapping

θ̄ : M/L → N

by

θ̄(m̄) = θ(m) for all m̄ ∈ M/L;

θ̄ is called the homomorphism induced by θ, or sometimes, the induced mapping.
First, we must check that θ̄ is actually well-defined. Suppose that m, n are elements of M with m̄ = n̄ in M/L. Then m = n + l for some l ∈ L, so that

θ̄(m̄) = θ(m) = θ(n) + θ(l) = θ(n) = θ̄(n̄),

since θ(l) = 0 for l ∈ L ⊆ Ker(θ).
Next, we note that θ̄ is an R-module homomorphism because, for any m̄, n̄ ∈ M/L and any r ∈ R, we have

θ̄(m̄ + n̄) = θ̄((m + n) + L)
          = θ(m + n)
          = θ(m) + θ(n)
          = θ̄(m̄) + θ̄(n̄)

and

θ̄(r · m̄) = θ̄(rm + L)
          = θ(rm)
          = r · θ(m)
          = r · θ̄(m̄).
We summarize the basic properties of induced mappings in the following theorem.

6.3.1 The Induced Mapping Theorem

Let R be an arbitrary ring and let M and N be left R-modules. Suppose that L is a submodule of M, that θ : M → N is an R-module homomorphism and that L ⊆ Ker(θ).
Then the induced homomorphism θ̄ : M/L → N has the following properties.
(i) Ker(θ̄) = {m̄ | m ∈ Ker(θ)} ⊆ M/L.
(ii) If Ker(θ) = L, then θ̄ is injective.
(iii) If θ is surjective, so also is θ̄.
(iv) If Ker(θ) = L and θ is surjective, then θ̄ is an isomorphism.

Proof (i) This follows from the implications

m̄ ∈ Ker(θ̄) ⟺ θ(m) = 0
            ⟺ m ∈ Ker(θ).

(ii) If Ker(θ) = L, then Ker(θ̄) = 0 by the first part, so θ̄ is injective by part (ii) of Lemma 4.9.1.
(iii) If θ is surjective, each element n ∈ N has the form n = θ(m) for some m ∈ M, and so n = θ̄(m̄) also, showing θ̄ to be surjective.
(iv) This is clear from the preceding results. □
The last part of this result is important enough to be stated separately as a theorem in its own right.

6.3.2 The First Isomorphism Theorem

Let θ : M → N be a surjective R-module homomorphism. Then the induced homomorphism θ̄ : M/Ker(θ) → N is an isomorphism. □

6.4 Cyclic modules
As an application of the First Isomorphism Theorem, we show how to describe cyclic modules and their submodules.
Recall from section 3.8 that an R-module M is cyclic if M = Rx for some element x in M. This is equivalent to saying that the multiplication homomorphism of section 4.3,

τ = τ(x) : R → M, τ(r) = rx for all r ∈ R,

is a surjective R-module homomorphism. From the definitions, the kernel of τ is the left ideal

Ker(τ) = {r ∈ R | rx = 0},

which is the annihilator ideal Ann(x) of x (sections 3.5 and 4.11). Write I = Ker(τ). Then, by the First Isomorphism Theorem, τ̄ is an isomorphism from R/I to M.
In the other direction, suppose we are given a left ideal I of R. Then the quotient module R/I is cyclic with generator 1̄, since r̄ = r · 1̄ for any r ∈ R.
The left ideal I that we have associated with a cyclic module M depends on the generator x of M that we have chosen. To complete the classification of cyclic modules, we need to show that I depends only on M. In fact, we prove an apparently stronger result, that, when R is commutative, the ideal I is uniquely determined by the isomorphism class of the cyclic module. Suppose then that R is commutative, that M and N are cyclic R-modules, and that there is an R-module isomorphism

σ : M → N.

Let I and J be ideals of R so that there are isomorphisms

α : R/I → M and β : R/J → N.

Then there is a composite isomorphism

γ = β⁻¹σα : R/I → R/J.

Now suppose that x ∈ J, and write γ(1̄) = c + J with c ∈ R. Then, in R/J,

x · γ(1̄) = xc + J = cx + J = 0̄,

because cx ∈ J. On the other hand, x · γ(1̄) = γ(x · 1̄) = γ(x̄), so γ(x̄) = 0̄. As γ is injective, x̄ = 0̄ in R/I, that is, x ∈ I. Thus J ⊆ I, and by symmetry I = J. (Notice that the step xc = cx uses the commutativity of R.)
We record our findings as a theorem.

6.4.1 Theorem
Let R be a ring and let M be a cyclic left R-module. Then there is a left ideal I of R such that M ≅ R/I as an R-module.
If R is commutative and N is also a cyclic R-module with N ≅ R/J for some ideal J, then M ≅ N as an R-module if and only if I = J.
In particular, for commutative R, the ideal I is uniquely determined by the cyclic module M. □
Remarks
(i) These results include the two extreme cases of cyclic modules, which we have not mentioned explicitly until now. If I = 0, the zero ideal, then R/I ≅ R. If I = R, then R/I = 0, the zero module.
(ii) Suppose that the ideal I is two-sided (as is always the case when R is commutative). Then the quotient module R/I is, in essence, the same as the residue ring R/I of section 1.10 (which is the reason that we can use the same notation for these two constructions). Both have the same rule of addition, and the scalar multiplication is related to the residue ring multiplication by the formula r · x̄ = r̄x̄.
(iii) As we have seen in our discussion of matrix actions on vector spaces, a single additive group can usually be viewed as a module in many different ways. Any module whose underlying group A is cyclic is itself necessarily cyclic. Then A must be isomorphic to Z or to Z_n for some n.
The converse is far from true. For example, the ring of Gaussian integers Z[i] is cyclic as a module over itself, but not as a module over Z, since Z[i] ≅ Z² as a Z-module. Other examples are provided by the C[X]-modules of Exercises 3.6 and 3.7; both these modules are cyclic, but the underlying vector space C³ is not cyclic over C, nor does it give a cyclic C[X]-module when X acts as 0.

6.5 Submodules of cyclic modules
We now combine the description of cyclic modules given above with the submodule correspondence that we obtained in section 4.13 to find the submodules of a cyclic module.
Let R be an arbitrary ring. We make repeated use of the observation that the left ideals of R are precisely the R-submodules of the left regular R-module R. Choose a left ideal I of R and write π : R → R/I for the canonical homomorphism of left R-modules. If P ⊆ R/I is an R-submodule of R/I, then the inverse image of P is

H = π^*(P) = {r ∈ R | π(r) ∈ P},

which is a left ideal of R that contains I. Conversely, if we have a left ideal H of R, the image π_*(H) of H is a submodule of R/I. We can therefore restate Proposition 4.13.1 as follows.

6.5.1 Proposition
Let I be a left ideal of R and let M = R/I be the cyclic left R-module defined by I. Then there is a bijective correspondence between
(i) left ideals H of R with I ⊆ H
and
(ii) submodules P of M,
in which

H ↦ π_*(H)

and

P ↦ π^*(P). □

For a general ring of scalars R, there is no reason why a submodule of a cyclic module should itself be cyclic. The assertion that every submodule of the left R-module R itself is cyclic is the same as saying that every left ideal of R is principal, which is a very strong condition, although it happens to hold for Euclidean domains. (An example of a domain with a non-principal ideal was given in Exercise 1.6.)
When R is Euclidean, we obtain a complete result.

6.5.2 Theorem
Let R be a Euclidean domain and let R/I be a cyclic R-module, where the ideal I is neither 0 nor R. Then the following statements are true.
(i) I = Ra, where a has a standard factorization of the form

a = p₁^{n(1)} ··· p_k^{n(k)}

for distinct irreducible elements p₁, ..., p_k of R and unique positive exponents n(1), ..., n(k).
(ii) The submodules of R/Ra are the cyclic modules of the form Rd/Ra where

d = p₁^{m(1)} ··· p_k^{m(k)}, 0 ≤ m(i) ≤ n(i), i = 1, ..., k.

(iii) Rd/Ra ≅ R/Rd′ where dd′ = a.

Proof (i) By Theorem 2.5.3, the ideal I is principal, say I = Ra′ where

a′ = u p₁^{n(1)} ··· p_k^{n(k)}, u a unit,

is a standard factorization of a′ (section 2.9), and a = u⁻¹a′ is also a generator of I (Lemma 2.5.3), with the desired standard form.
(ii) By the preceding result, the submodules of R/Ra are given by the ideals H of R with

Ra ⊆ H ⊆ R.

But then H = Rd for some d, and since a ∈ Rd, d is a divisor of a. Discarding unit factors as before, we can take d to have a factorization as claimed.
(iii) Define θ : R → Rd/Ra by

θ(r) = rd + Ra for all r ∈ R.

It is easily checked that θ is an R-module homomorphism, and θ is evidently surjective. We have

r ∈ Ker(θ) ⟺ rd + Ra = 0̄
           ⟺ rd ∈ Ra
           ⟺ r ∈ Rd′,

so that Ker(θ) = Rd′. The First Isomorphism Theorem (6.3.2) now shows that R/Rd′ ≅ Rd/Ra. □

Diagrams. If the element a has a straightforward factorization, the submodules of R/Ra can be described by drawing a diagram.
Suppose that a = pq, where p, q are distinct irreducible elements of the Euclidean domain R, and write M = R/Ra. Then the diagram is as follows.

        M
       /  \
     pM    qM
       \  /
        0

If a = pⁿ, it is more convenient to turn the diagram on its side:

0 = pⁿM ⊂ p^{n-1}M ⊂ p^{n-2}M ⊂ ··· ⊂ p²M ⊂ pM ⊂ M
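For a concrete instance of the theorem, take R = Z and a = 12 = 2² · 3; the sketch below (pure Python) lists the submodules dM of M = Z/12Z, one for each divisor d = 2^{m(1)}3^{m(2)} of 12.

a = 12
for d in range(1, a + 1):
    if a % d == 0:                     # d runs through 1, 2, 3, 4, 6, 12
        dM = sorted({(d * m) % a for m in range(a)})
        print(f"d = {d}: dM = {dM}")   # dM = Rd/Ra, with a/d elements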
6.6. The companion matrix 99

6.6 The companion matrix


Let F[X] be the polynomial ring over a field F, and let / = f(X) be a
polynomial in F[X|. As we remarked in section 6.4, the quotient mod­
ule F[X]/F[X]f is the residue ring of F[X] modulo F[X]f, but with a
different interpretation of the multiplication. In section 2.12, we found a
canonical basis for F[X]/F[X]f as a vector space over the field F, which
we can now use to find a nice matrix representatation for the action of X on
F[X]/F\X]f. This is a first step towards finding normal forms of matrices,
a problem we consider in depth in Chapter 13.
To avoid trivialities, we suppose that / is not a constant polynomial.
Since the nonzero constants are the unit polynomials (Lemma 1.4) and the
quotient module is unchanged if we multiply / by any unit, we can take

f = fo + fiX + f2X2 + ■■■ + / n - i X " " 1 + Xn

to be monic, with n = deg(/) the degree of / .


By (2.12), the canonical F-basis of F[X]/F[X]f is

1̄, e, e^2, …, e^{n-1},

where 1̄ is the residue of 1, e is the residue of X, and in general e^i is the
residue of X^i. The multiplication in F[X]/F[X]f is derived from the relation

e^n = -f_0·1̄ - f_1 e - ⋯ - f_{n-1} e^{n-1}.

The F[X]-module structure of F[X]/F[X]f is completely determined
by the action of the variable X on F[X]/F[X]f, which we now describe.
By definition,

X · ḡ is the residue of Xg, for any g ∈ F[X].

Thus X acts as a linear transformation on the space F[X]/F[X]f, and we
wish to find the matrix of this linear transformation relative to the canonical
basis.
It is easy to see that we have the equations

X · 1̄ = e
X · e = e^2
⋮                                                (6.1)
X · e^{n-2} = e^{n-1}
X · e^{n-1} = -f_0·1̄ - f_1 e - f_2 e^2 - ⋯ - f_{n-1} e^{n-1}

Thus, by the definition in section 5.9, the matrix of the linear
transformation corresponding to X is

           ( 0 0 ⋯ 0 0  -f_0     )
           ( 1 0 ⋯ 0 0  -f_1     )
    C(f) = ( 0 1 ⋯ 0 0  -f_2     )               (6.2)
           ( ⋮ ⋮   ⋮ ⋮   ⋮       )
           ( 0 0 ⋯ 1 0  -f_{n-2} )
           ( 0 0 ⋯ 0 1  -f_{n-1} )

The matrix C(f) is called the companion matrix of the polynomial f, or the
rational canonical block matrix associated to f, since it is a typical building
block for the rational canonical matrices that we encounter later.
Here are some special cases. A linear polynomial f = X - a has
C(f) = (a), a 1 × 1 matrix. Since a ∈ F is arbitrary, we see that any
method of turning F into an F[X]-module must result in a cyclic module,
which fact is obvious anyway.
For a quadratic polynomial f = X^2 + aX + b, we have

    C(f) = ( 0 -b )
           ( 1 -a )

It is far from the case that an action of X on F^2 necessarily gives a cyclic
F[X]-module; a trivial example is given by allowing X to act as 0.
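For experimentation, the companion matrix is easily built from the
coefficient list f_0, …, f_{n-1} of a monic polynomial. The following Python
sketch (ours; the function name build_companion is our own choice) follows
Eq. (6.2):

    def build_companion(coeffs):
        # coeffs = [f_0, f_1, ..., f_{n-1}] for the monic polynomial
        # f = f_0 + f_1 X + ... + f_{n-1} X^(n-1) + X^n
        n = len(coeffs)
        C = [[0] * n for _ in range(n)]
        for i in range(1, n):
            C[i][i - 1] = 1           # subdiagonal 1s: X.e^(i-1) = e^i
        for i in range(n):
            C[i][n - 1] = -coeffs[i]  # last column encodes X.e^(n-1)
        return C

    # f = X^2 + 3X + 5 gives the 2 x 2 matrix above with a = 3, b = 5
    print(build_companion([5, 3]))    # [[0, -5], [1, -3]]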

6.7 Cyclic modules over polynomial rings


We can reverse the above construction. Given a rational canonical block
matrix C, that is, a matrix which has the form exhibited in Eq. (6.2), we
work backwards to obtain a cyclic F[X]-module.
First, note that it is evident that there is a unique monic polynomial f
such that C = C(f). Next, let M be the F[X]-module given by X acting
as C on the standard F-space F^n in the usual way (section 3.3). Then the
action of X on the elements of the standard basis {e_1, …, e_n} is given by
the equations

Xe_1 = e_2, Xe_2 = e_3, …, Xe_{n-1} = e_n
                                                 (6.3)
Xe_n = -f_0 e_1 - f_1 e_2 - ⋯ - f_{n-1} e_n,

which mimic those in Eq. (6.1).
Let θ : M → F[X]/F[X]f be the F-linear transformation defined by

θ(e_1) = 1̄, θ(e_2) = e, …, θ(e_n) = e^{n-1}.


Comparing Eqs. (6.1) and (6.3), we see that θ is an isomorphism of
F[X]-modules.
We summarize our discussion, and a little more, as a theorem.

6.7.1 Theorem
Let B be an n × n matrix over a field F, and let M be F^n made into
an F[X]-module with X acting as B. Then the following assertions are
equivalent.
(i) M is isomorphic to a cyclic F[X]-module F[X]/F[X]f for some
monic polynomial f ∈ F[X].
(ii) There is an invertible n × n matrix T so that TBT^{-1} is a rational
canonical block matrix C.
When these assertions hold, C = C(f) for a unique monic polynomial f in
F[X].

Proof
Before we can get started, we need to set up some notation. Given B,
the action of X as B on F^n defines an F-linear transformation

β : F^n → F^n,  β(v) = Xv = Bv for v ∈ F^n,

and the matrix of β is (β)_{E,E} = B relative to the standard basis E of F^n,
by the results of (5.10).
On the other hand, given a monic polynomial f, the preceding
discussion shows that the action of X on F[X]/F[X]f defines an F-linear
transformation

γ : F[X]/F[X]f → F[X]/F[X]f

which has matrix (γ)_{Z,Z} = C(f) with respect to the canonical basis Z of
F[X]/F[X]f.
(i) ⇒ (ii): Suppose that θ : M → F[X]/F[X]f is an F[X]-module
isomorphism. By Theorem 4.5.1,

θβ = γθ,

so that

(θ)_{Z,E}(β)_{E,E} = (γ)_{Z,Z}(θ)_{Z,E}

by Proposition 5.9.1. Put T = (θ)_{Z,E}; then T is invertible (Corollary 5.9.2),
and

TB = C(f)T

as desired.

(ii) ⇒ (i): Given T and C, define θ by the relation (θ)_{Z,E} = T, and let f
be the monic polynomial with C = C(f). Reversing the above argument,
we see that θ is an F[X]-module homomorphism from M to F[X]/F[X]f,
and θ is an isomorphism since T is invertible.
Finally, we show that the monic polynomial f is uniquely determined by
M. If M ≅ F[X]/F[X]f and also M ≅ F[X]/F[X]h, then F[X]/F[X]f ≅
F[X]/F[X]h, so that F[X]f = F[X]h by Theorem 6.4.1 and therefore
f = h. □
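The uniqueness of f can also be seen computationally: f turns out to be
the characteristic polynomial of C(f), which is Exercise 6.5 below. A quick
sketch with the sympy library (ours, not part of the original text):

    import sympy as sp

    X = sp.symbols('X')
    coeffs = [5, -2, 0]          # f = 5 - 2X + X^3
    n = len(coeffs)
    C = sp.zeros(n, n)
    for i in range(1, n):
        C[i, i - 1] = 1
    for i in range(n):
        C[i, n - 1] = -coeffs[i]
    # det(XI - C(f)) recovers f
    assert sp.expand((X * sp.eye(n) - C).det()) == X**3 - 2*X + 5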

6.8 Further developments


Theorem 6.5.2 holds if R is a principal ideal domain, since it depends only on
the fact of unique factorization ([Cohn 1], §10.5). It can even be extended to
noncommutative principal ideal domains - see Chapter 8 of [Cohn: FRTR].
The description of the action of X on a cyclic F[X]-module in terms of
the companion matrix does not really depend on the fact that the
multiplication in a field F is commutative. Thus it can be extended to the case
that F is a division ring - that is, F is a field except that multiplication
may not be commutative. (Thus every field is also a division ring; a
noncommutative example of a division ring is given in Exercise 6.9.) Details of
the extended result are to be found in [Cohn: FRTR], §8.4. A polynomial
ring F[X] over a division ring is an example of a noncommutative Euclidean
domain (see [B & K: IRM], §3.2).
When the ring R is not a field or division ring, the classification of
cyclic R[X]-modules, that is, the description of the ideals of R[X], can be
a difficult problem.

Exercises
6.1 Let R be a commutative ring and let M be an R-module. Recall
from Exercise 4.6 that every R-module homomorphism ρ : R → M is
determined uniquely by x = ρ(1) ∈ M; then ρ(r) = rx for all r ∈ R.
Let a be a fixed element of R. Show that

M(a) = {x ∈ M | ax = 0}

is a submodule of M.
Show that if ζ : R/Ra → M is an R-module homomorphism, then
ζ is induced by a homomorphism ρ : R → M whose corresponding
element x = ρ(1) belongs to M(a). Show conversely that if x ∈ M(a),
then x gives rise to a homomorphism from R/Ra to M.
Deduce that there is a bijective correspondence between the set
Hom(R/Ra, M) of all R-module homomorphisms from R/Ra to M
and the set M(a).
Show further that this bijection is itself an isomorphism of
R-modules (see Exercise 4.1).
Now take M = R/Rb for some b ∈ R. Verify that

(R/Rb)(a) = {r̄ ∈ R/Rb | ar ∈ Rb}.

Hence (or otherwise) show that Hom(Z_p, Z_q) = 0 if p, q are distinct
prime numbers.
Compute Hom(Z_p, Z_p), Hom(Z_{p^2}, Z_p), and Hom(Z_p, Z_{p^2}).
6.2 The Second Isomorphism Theorem.
This and the following exercise give two important consequences
of the Induced Mapping Theorem (6.3.1). They are given only as
exercises since they are not used explicitly in these notes.
Let R be any ring and let K and L be submodules of a left
R-module M. Recall from section 3.6 that

K + L = {k + l | k ∈ K, l ∈ L}.

Define ρ : K → (K + L)/L by ρ(k) = k̄. Verify that ρ is a surjective
homomorphism of R-modules and that Ker(ρ) = K ∩ L. Deduce that

K/(K ∩ L) ≅ (K + L)/L.

6.3 The Third Isomorphism Theorem.
(In this exercise, we use the notation "x̄" to denote the image of x in
any quotient module.)
Let R be any ring and let M be a left R-module. Suppose that K ⊆
L ⊆ M is a chain of submodules of M. Show that the canonical map
ι : L/K → M/K, ι(l̄) = l̄, is an injective R-module homomorphism.
Regard ι as the inclusion map (that is, think of L/K as a submodule
of M/K), and define σ : M/K → M/L by σ(m̄) = m̄.
Prove that σ is a surjective R-module homomorphism with Ker(σ) =
L/K, and deduce that there is an isomorphism of R-modules

(M/K)/(L/K) ≅ M/L.

This result is also known as the Idiot's Cancellation Lemma, for
obvious reasons.
6.4 Let a and b be nonzero elements of a Euclidean domain R.
Combining the results of Proposition 2.8.2 with the Second and Third
Isomorphism Theorems, prove

(a) Ra/Rab ≅ R/Rb (see Theorem 6.5.2);
(b) (R/Rab)/(Ra/Rab) ≅ R/Ra.
6.5 Let

    C = ( 0 0 ⋯ 0 0  -f_0     )
        ( 1 0 ⋯ 0 0  -f_1     )
        ( 0 1 ⋯ 0 0  -f_2     )
        ( ⋮ ⋮   ⋮ ⋮   ⋮       )
        ( 0 0 ⋯ 1 0  -f_{n-2} )
        ( 0 0 ⋯ 0 1  -f_{n-1} )

be a rational canonical block matrix, and put f = f_0 + f_1 X + ⋯ +
f_{n-1}X^{n-1} + X^n, so that C = C(f).
Let I be the n × n identity matrix. Show that

det(XI - C) = f.

Remark. Thus f is the characteristic polynomial of C, which we meet
again in section 9.7.
Hint: use row operations - see section 5.12.
6.6 Let R be a Euclidean domain and let l, p, q be distinct irreducible
elements of R. Draw up diagrams that illustrate the submodules of
R/Ra (as in section 6.5) when
(a) a = p;
(b) a = p^2;
(c) a = lpq;
(d) a = p^2 q;
(e) a = p^2 q^2.
6.7 Let R be any ring. A composition series for a left R-module M is a
finite ascending chain

0 = M_0 ⊂ M_1 ⊂ ⋯ ⊂ M_{k-1} ⊂ M_k = M

in which each quotient M_i/M_{i-1}, i = 1, …, k, is a simple left
R-module.
Suppose that R is a Euclidean domain and that M = R/Ra is
cyclic, a ≠ 0. Find a composition series for M, and show that the
set {M_i/M_{i-1} | i = 1, …, k} corresponds bijectively to the set of
irreducible factors of a (counting multiplicities).
Does R itself have a composition series?
Remark. If a module has a composition series, then the set of simple
quotient modules associated to the series is essentially unique. This
classical result is the Jordan-Hölder Theorem, a proof of which can
be found in [B & K: IRM] (4.1.10).

6.8 Let F be a field and let R be the ring of n × n matrices over F. For
k = 1, …, n, let I_k be the set of all matrices of the form

    ( a_{11} ⋯ a_{1k} 0 ⋯ 0 )
    ( a_{21} ⋯ a_{2k} 0 ⋯ 0 )
    ( ⋮          ⋮          )
    ( a_{n1} ⋯ a_{nk} 0 ⋯ 0 )

Show that each I_k is a left ideal of R, that I = I_1 is a simple left
R-module and that I_{k+1}/I_k ≅ I for k = 1, …, n-1.
Deduce that

0 ⊂ I_1 ⊂ ⋯ ⊂ I_k ⊂ I_{k+1} ⊂ ⋯ ⊂ R

is a composition series of R (as a left module).


6.9 The quaternions.
Let Q be a four-dimensional vector space over the real numbers
with basis 1, i, j, k. Introduce a multiplication on Q by the rules that
1 is the identity element and that

i^2 = j^2 = k^2 = -1 and ij = k,

the multiplication being extended to arbitrary elements of Q by
distributivity and associativity. Much checking confirms that Q is a ring.
Verify that ij = -ji, so that Q is not commutative.
For an element v = α·1 + ai + bj + ck of Q, put

T(v) = 2α and N(v) = α^2 + a^2 + b^2 + c^2.

Show that N(vw) = N(v)N(w) for any two elements v, w of Q and
that v satisfies the polynomial equation

X^2 - T(v)X + N(v) = 0.

Deduce that Q is a division ring, the quaternion algebra.


Chapter 7

Direct Sums of Modules

In this chapter, we introduce the direct sum construction, which is a very


useful tool for analysing the structure of a module, and for making new
modules out of old. It comes in two varieties - internal and external.
Internal direct sums arise when we wish to express a given module in
terms of its submodules, this decomposition being "internal" to the module.
The ultimate aim of this approach to module structure is to describe
the modules that cannot be expressed as a direct sum - these are the
"indecomposable" modules - and then to show how a general module can be
assembled from indecomposable component submodules.
The other version of the direct sum construction arises when we wish to
find a module that contains a given set of modules as components. There
is no reason why two modules should both appear naturally as submodules
of a third module, so we need an "external" construction for the larger
module.
At the end of the chapter, we show that the two types of direct sum are,
to an extent, interchangeable, and we give an interpretation of a classical
result from number theory, the Chinese Remainder Theorem, in terms of
direct sums.
The definitions and the formal properties of direct sums are valid for
modules over any ring, but our illustrations and applications require that
the ring of scalars is a Euclidean domain.

7.1 Internal direct sums


To begin, we consider the simplest examples of direct sums, in which there
are only two components.


Let M be a left R-module, where R is an arbitrary ring, and let L and
N be submodules of M. Then M is the internal direct sum of L and N if
the following conditions hold.

IDSM 1: L + N = M;
IDSM 2: L ∩ N = 0.

The notation M = L ⊕ N indicates that M is the internal direct sum of its
submodules L, N, which are then called the components or summands of
M. We also say that N is the complement of L in M, and vice versa.
When we have expressed a module M as a direct sum of two (or more)
components, we sometimes say that we have decomposed M or that we have
found a decomposition of M. Before giving any examples, we reformulate
the definition in a useful way.

7.1.1 Proposition
Let L and N be submodules of a left R-module M. Then the following
assertions are equivalent.
(i) M = L ⊕ N.
(ii) Let m ∈ M. Then there are unique elements l ∈ L and n ∈ N with
m = l + n.

Proof
(i) ⇒ (ii): Suppose that m ∈ M is given. Since M = L + N, we have
m = l + n for some l ∈ L, n ∈ N. If also m = l' + n' with l' ∈ L, n' ∈ N,
then l - l' = n' - n belongs to L ∩ N, which is 0. Thus l and n are uniquely
determined by m.
(ii) ⇒ (i): The fact that there are elements l and n with m = l + n for
each m in M shows that M = L + N. Suppose that x ∈ L ∩ N. Then
x = x + 0 with x ∈ L and 0 ∈ N, and also x = 0 + x with 0 ∈ L and x ∈ N.
By uniqueness, we must have x = 0. □

Comments & examples.

(i) The order of the components is not important; if M = L ⊕ N, then
equally M = N ⊕ L.
(ii) We allow trivial direct sums in which one component L or N is the zero
module 0; the other component must then be equal to M.
(iii) Let F be a field and let V = F^2, the two-dimensional vector space over F.
The standard basis {e_1, e_2} leads to an internal direct sum decomposition
V = U ⊕ W of V in which U = Fe_1 and W = Fe_2. More generally, any
basis {f_1, f_2} of V gives V = Ff_1 ⊕ Ff_2 - see Lemma 5.2.1.
The above comments hold when the field F is replaced by any ring R.
(iv) The previous example illustrates that a module can be expressed as a
direct sum in many ways, since a vector space has many bases.
It also shows that the choice of one component of a direct sum need
not determine the other. To see this, consider the bases of the form
{e_1, f(a)} with f(a) = (a, 1), written as a column, where a ∈ F is
arbitrary. Then Ff(a) ≠ Ff(b) if a ≠ b, and V = Fe_1 ⊕ Ff(a) for every
choice of a.
(v) Let M = Z_6, considered as a Z-module. Then 2M = {0̄, 2̄, 4̄} and 3M =
{0̄, 3̄} are submodules of M with 2M ∩ 3M = 0. The fact that M =
2M + 3M follows either by direct calculation or, more intellectually, from
the identity 1 = 2·2 - 3 (a computational check appears after these
examples).
In contrast to the previous example, 2M and 3M are the only
nontrivial summands of M.
This example foreshadows a general technique for constructing
decompositions of modules over a Euclidean domain.
(vi) Let F be a field, let A = diag(b, c) be a 2 × 2 diagonal matrix over F,
and let M be the F[X]-module defined by X acting as A on F^2. Then the
subspaces L = Fe_1 and N = Fe_2 are F[X]-submodules of M, the action
of X on L being given by multiplication by b and on N by multiplication
by c. We have M = L ⊕ N.
Notice that we already know that M is the direct sum of L and N as
a vector space over the field F; the real meat of this example lies in the
fact that L and N are invariant under the action of X.
(vii) More generally, suppose that A = diag(B, C) is a k × k block diagonal
matrix over F, with diagonal blocks B and C of sizes r × r and s × s
respectively. (Hence r + s = k.)
Let M be the F[X]-module defined by X acting as A on F^k, and let
L = Fe_1 + ⋯ + Fe_r and N = Fe_{r+1} + ⋯ + Fe_k, where, as usual, e_1, …, e_k
is the standard basis of F^k. Both L and N are F[X]-submodules of M,
the action of X on L being given by the matrix B and on N by C. Then
M = L ⊕ N. (A more general version of this construction is given in
section 7.5 below.)
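A brute-force check of example (v) takes only a few lines of Python (ours,
not part of the original text):

    M = set(range(6))                    # Z_6 as residues 0, ..., 5
    twoM = {(2 * m) % 6 for m in M}      # {0, 2, 4}
    threeM = {(3 * m) % 6 for m in M}    # {0, 3}
    assert {(x + y) % 6 for x in twoM for y in threeM} == M   # 2M + 3M = M
    assert twoM & threeM == {0}                               # 2M ∩ 3M = 0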

7.2 A diagrammatic interpretation


If L and N are arbitrary submodules of a left R-module M, there is no
reason why we should have L + N = M or L ∩ N = 0. The configuration
of the four submodules L ∩ N, L, N and L + N can be represented
by the diagram
         M
         |
       L + N
        /  \
       L    N
        \  /
       L ∩ N
         |
         0

On the other hand, when M is the direct sum of L and N, the equations
L ∩ N = 0 and L + N = M lead to the simpler diagram

         M
        / \
       L   N
        \ /
         0

(If one of the submodules L, N is zero, the diagrams become even simpler.)

7.3 Indecomposable modules


A module is called indecomposable if it is nonzero, and if it cannot be
expressed as an internal direct sum except trivially. Thus, when M is
indecomposable, any expression M = L ⊕ N must have either L = 0 or
N = 0.
Recall from section 3.5 that a simple module is a nonzero module that
has no proper submodules. Clearly, a simple module must be
indecomposable. The reverse is far from true. For instance, let R be a Euclidean
domain considered as the regular R-module. Since a submodule of R is an
ideal (section 3.5) and any ideal is principal (Theorem 2.5.3), the nonzero
submodules of R are the principal ideals Ra with a ≠ 0. An intersection
Ra ∩ Rb of nonzero ideals contains the product Rab, which is again nonzero
if a and b are nonzero. Thus R ≠ Ra ⊕ Rb for any nontrivial choice of a, b.
An example of a different kind occurs in Example 3.11. There, we
constructed an F[X]-module M that has only one proper nonzero submodule,
which means that M must be indecomposable.
We record a more general version of the above argument for future
reference.

7.3.1 Theorem
Let p be an irreducible element in a Euclidean domain R. Then the
cyclic R-module R/Rp^n is indecomposable for any integer n ≥ 1.

Proof By Theorem 6.5.2, we know that the nontrivial submodules of R/Rp^n
have the form Rp^i/Rp^n for i = 1, …, n-1. Thus every proper submodule
of R/Rp^n is contained in Rp/Rp^n, from which we see that no two proper
nonzero submodules L and N can satisfy the requirement that L + N =
R/Rp^n. (A diagram of the submodules of R/Rp^n is given in section 6.5.) □

Remark. It turns out that the only indecomposable modules over a
Euclidean domain are those of the form R or R/Rp^n for an irreducible element
p of R and positive integer n. In Corollary 7.8.2 below, we confirm this fact
for cyclic modules, but we are some distance from being able to prove it
without the prior information that the module is cyclic.
Our next task is to consider internal direct sums with more than two
components.

7.4 Many components


Let R be any ring and suppose that L_1, …, L_k are R-submodules of a
left R-module M. Then M is the internal direct sum of L_1, …, L_k if the
following hold.

IDSMk 1: L_1 + ⋯ + L_k = M;
IDSMk 2: L_i ∩ (L_1 + ⋯ + L_{i-1} + L_{i+1} + ⋯ + L_k) = 0 for i = 1, …, k.

The notation for such an internal direct sum is M = L_1 ⊕ ⋯ ⊕ L_k. The
submodules L_i, i = 1, …, k are the components or summands of M, and
the complement of L_i is the submodule

L_i' = L_1 + ⋯ + L_{i-1} + L_{i+1} + ⋯ + L_k.

The order of the terms is unimportant, and we allow the possibility that
some components are the zero module.
It is convenient to allow the trivial cases k = 0, where M = 0, and
k = 1, where M = L_1. For k = 2 we regain the definition of the direct sum
of two submodules.
Notice also that if M = L_1 ⊕ ⋯ ⊕ L_k, then M = L_i ⊕ L_i' for each i.
Internal direct sums with many components will occur frequently later
in these notes. An immediate example is provided by the standard left free
module R^k. For any basis {b_1, …, b_k} of R^k,

R^k = Rb_1 ⊕ ⋯ ⊕ Rb_k

by Lemma 5.2.1.
The proof of the following useful but straightforward extension of
Proposition 7.1.1 is left to the reader.

7.4.1 Proposition
Let L_1, …, L_k be submodules of a left R-module M. Then the following
assertions are equivalent.
(i) M = L_1 ⊕ ⋯ ⊕ L_k;
(ii) Let m ∈ M. Then there are unique elements

l_1 ∈ L_1, …, l_k ∈ L_k

with

m = l_1 + ⋯ + l_k.



7.5 Block diagonal actions


Let F be a field and let

    D = ( D_1  0  ⋯  0  )
        ( 0   D_2 ⋯  0  )
        ( ⋮        ⋱  ⋮ )
        ( 0    0  ⋯ D_k )

be a block diagonal matrix over F, with k blocks on the diagonal, and
suppose that D is an s × s matrix. The action of D on F^s defines an
F[X]-module M which can be expressed very naturally as an internal direct sum.
Since this type of decomposition is very important in future applications,
particularly in Chapter 13, we give the full details, although they appear
rather gruesome at first sight.
As a first illustration, we consider the case that

    D = ( d_1  0  ⋯  0  )
        ( 0   d_2 ⋯  0  )
        ( ⋮        ⋱  ⋮ )
        ( 0    0  ⋯ d_s )

is an s × s diagonal matrix over F (so that k = s).
Put L_i = Fe_i for i = 1, …, s. As mentioned in the previous section, we
already know that M = L_1 ⊕ ⋯ ⊕ L_s as a vector space over F. However,
each L_i is also an F[X]-submodule of M, with X acting as d_i, and so we
have a decomposition of M as an internal direct sum of F[X]-submodules.
For a general block diagonal matrix D as above, we have to overcome
some notational complications to describe F-bases of the components L_i of
M. Suppose that the i-th block D_i is an n(i) × n(i) matrix, where n(i) ≥ 1
is an integer. Since D is an s × s matrix, we have

s = n(1) + ⋯ + n(k).

Let {e_1, …, e_s} be the standard basis of F^s, and put

L_1 = Fe_1 + ⋯ + Fe_{n(1)},
L_2 = Fe_{n(1)+1} + ⋯ + Fe_{n(1)+n(2)},
⋮
L_i = Fe_{n(1)+⋯+n(i-1)+1} + ⋯ + Fe_{n(1)+⋯+n(i-1)+n(i)},
⋮
L_k = Fe_{n(1)+⋯+n(k-1)+1} + ⋯ + Fe_{n(1)+⋯+n(k-1)+n(k)}.

Because of the block form of D, each L_i is an F[X]-submodule of M on
which X acts as D_i. To see this explicitly, write out the submatrix D_i as
D_i = (d_{uv}^{(i)}) where 1 ≤ u, v ≤ n(i). The (u, v)-entry of D_i is the
(n(1) + ⋯ + n(i-1) + u, n(1) + ⋯ + n(i-1) + v)-entry of the matrix D, and
the remaining entries in row n(1) + ⋯ + n(i-1) + u of D must all be zero,
since they lie outside D_i. Thus

X·e_{n(1)+⋯+n(i-1)+v} = d_{1v}^{(i)} e_{n(1)+⋯+n(i-1)+1} + ⋯ + d_{n(i),v}^{(i)} e_{n(1)+⋯+n(i-1)+n(i)}

for v = 1, …, n(i).
It follows that M = L_1 ⊕ ⋯ ⊕ L_k as an F[X]-module, as desired.
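The invariance of the components is easy to test numerically. Here is a
small sketch (ours, not part of the original text) using the numpy library,
with two diagonal blocks of sizes n(1) = 2 and n(2) = 1:

    import numpy as np

    D = np.array([[1, 2, 0],
                  [3, 4, 0],
                  [0, 0, 5]])    # block diagonal: D_1 is 2 x 2, D_2 is 1 x 1
    blocks = [[0, 1], [2]]       # indices of the basis vectors spanning L_1, L_2
    for block in blocks:
        for j in block:
            image = D @ np.eye(3)[:, j]   # X acting on e_j, computed as D e_j
            outside = [i for i in range(3) if i not in block]
            assert all(image[i] == 0 for i in outside)   # image stays in L_i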

7.6 External direct sums


When we express a module as an internal direct sum, we break it down into
component submodules. The construction of an external direct sum solves
the converse problem: given a collection of modules, find a larger module
that has the given modules as its direct summands.
There is a complication in building up a module from specified
components, since, given an arbitrary pair of R-modules L and N, there is no
reason why there should be any module M that contains both L and N
as submodules. For instance, there is no obvious candidate for a Z-module
that contains both Z and Z_2. To escape from this difficulty, we must be
content with a construction that causes the given modules to be replaced
by isomorphic modules.
Suppose that P_1, …, P_k are left R-modules over some ring R. Recall
from elementary set theory that the Cartesian product of P_1, …, P_k is the
set

P_1 × ⋯ × P_k = {(p_1, …, p_k) | p_1 ∈ P_1, …, p_k ∈ P_k},

where

(p_1, …, p_k) = (p_1', …, p_k') ⟺ p_1 = p_1', …, p_k = p_k'.

Then the external direct product of P_1, …, P_k is the set P_1 × ⋯ × P_k made
into a module by the rules

(p_1, …, p_k) + (p_1', …, p_k') = (p_1 + p_1', …, p_k + p_k')

and

r·(p_1, …, p_k) = (rp_1, …, rp_k),

where p_i, p_i' ∈ P_i for each i and r ∈ R. An easy but long-winded
verification of the axioms (3.1) confirms that P_1 × ⋯ × P_k is an R-module.

In the case that P_1 = ⋯ = P_k = R, the external direct product is
simply the standard free left module R^k as defined in section 5.1, except
that its members appear as rows rather than columns.
On ordering. The ordering of the factors is important in an external direct
sum, as opposed to the situation with an internal direct sum, where it does
not concern us. The reason for our concern is that the same module may
appear more than once as a component of an external direct sum, and so
we must rely on the order of terms to distinguish elements that have the
same unordered set of entries. For example, we must not confuse (0, 1)
with (1, 0) in R^2. (See also Exercise 7.5 below.)
The axioms for an internal direct sum rule out any repetition of
summands, save for 0 terms, since the different components can only have zero
intersection.

7.7 Switching between internal & external


Next we see how an external direct sum can be rewritten as an internal
direct sum, and vice versa.
Suppose that M = P_1 × ⋯ × P_k, and define

L_1 = {(p_1, 0, …, 0) | p_1 ∈ P_1},
⋮
L_i = {(0, …, 0, p_i, 0, …, 0) | p_i ∈ P_i},
⋮
L_k = {(0, …, 0, p_k) | p_k ∈ P_k}.

Then each L_i is a submodule of M.
Since

(p_1, …, p_k) = (p_1, 0, …, 0) + ⋯ + (0, …, 0, p_k)

for any element of M, we see that

M = L_1 + ⋯ + L_k.

The other requirement for an internal direct sum, that

L_i ∩ (L_1 + ⋯ + L_{i-1} + L_{i+1} + ⋯ + L_k) = 0 for i = 1, …, k,

is satisfied since the entries of a member of the Cartesian product are
uniquely determined. Thus the axioms in section 7.4 hold, and we can
write M = L_1 ⊕ ⋯ ⊕ L_k.

Although the original modules P_i are not themselves contained in the
external direct sum, each is isomorphic to its corresponding submodule L_i
by the R-module isomorphism

θ_i : P_i → L_i,  θ_i(p_i) = (0, …, 0, p_i, 0, …, 0).

In the reverse direction, any internal direct sum is isomorphic to an
external direct sum. To see this, suppose that M = L_1 ⊕ ⋯ ⊕ L_k is an
internal direct sum, and put P = L_1 × ⋯ × L_k, the external direct sum of
the submodules L_i of M.
Since each element m of M can be written in the form m = l_1 + ⋯ + l_k,
where the elements l_1 ∈ L_1, …, l_k ∈ L_k are uniquely determined by m
(Proposition 7.4.1), there is a well-defined map ψ : M → P given by ψ(m) =
(l_1, …, l_k). An easy verification shows that ψ is an isomorphism of
R-modules.
We summarize the preceding discussion as a proposition.

7.7.1 Proposition
The following assertions hold.
(i) If a left R-module M can be expressed as an internal direct sum
M = L_1 ⊕ ⋯ ⊕ L_k, then M is isomorphic to the external direct sum
L_1 × ⋯ × L_k.
(ii) If P = P_1 × ⋯ × P_k is an external direct sum of left R-modules,
then P is an internal direct sum P = L_1 ⊕ ⋯ ⊕ L_k of submodules
L_1, …, L_k with L_i ≅ P_i for i = 1, …, k.

7.8 The Chinese Remainder Theorem
Our aim now is to show how a classical result from number theory,
namely the Chinese Remainder Theorem, leads to direct sum
decompositions of cyclic modules over Euclidean domains.
In its most familiar form, the theorem reads as follows. Suppose we
are given a pair of coprime positive integers m and n, and an arbitrary
pair of integers y, z. Then there is an integer x which satisfies both the
congruences

x ≡ y mod m and x ≡ z mod n.

We reformulate this assertion in the language of rings and modules.
First, notice that the congruence x ≡ y mod m is equivalent to the equality
x̄ = ȳ in the cyclic Z-module Z_m (see 1.11), and similarly x ≡ z mod n
means that x̄ = z̄ in Z_n. (We allow the meaning of the notation "x̄"
to vary according to context, which is more convenient than introducing
several notations for residue classes.)
Next, observe that there is a canonical Z-module homomorphism

α : Z → Z_m × Z_n,  α(x) = (x̄, x̄).

Thus the Chinese Remainder Theorem asserts that α is a surjection.
This algebraic formulation prompts us to ask for the kernel of α, which,
as we shall see, is the ideal mnZ. By Theorem 6.3.2, we then have an
isomorphism

ᾱ : Z_{mn} → Z_m × Z_n,

which identifies the direct sum Z_m × Z_n as a cyclic module.
In classical language, the interpretation of the fact that ᾱ is an
isomorphism is that the integer x is unique modulo mn.
With this preamble, we now give the proof of the algebraic form of the
Chinese Remainder Theorem, working over an arbitrary Euclidean domain
rather than the integers.

7.8.1 The Chinese Remainder Theorem
Let R be a Euclidean domain and let b and c be coprime elements of R.
Then the canonical R-module homomorphism

α : R → R/Rb × R/Rc

is a surjection, with

Ker(α) = Rbc.

Furthermore, there is an induced isomorphism

ᾱ : R/Rbc → R/Rb × R/Rc

of R-modules.

Proof Denoting residue classes in either R/Rb or R/Rc by x̄, the map α is
given by α(x) = (x̄, x̄), which is a homomorphism since it is composed of
two canonical homomorphisms.
Now suppose we are given an element (ȳ, z̄) in R/Rb × R/Rc. Since b
and c are coprime, we can write 1 = sb + tc for some elements s, t of R (see
Lemma 2.8.1). Put x = zsb + ytc.
Then x ≡ ytc ≡ y mod b, so that x̄ = ȳ in R/Rb, and similarly x̄ = z̄ in
R/Rc. Thus α is surjective.
Clearly, x ∈ Ker(α) if and only if x is divisible by both b and c, which
means (Lemma 2.8.1 again) that x is divisible by bc, that is, x ∈ Rbc.
The final assertion follows from the First Isomorphism Theorem 6.3.2. □

Remark. At this point, the reader might expect to find a description of the
corresponding internal direct sum decomposition of R/Rbc. However, this
description requires some machinery that we are going to develop in a more
general setting in the next chapter, so we postpone a statement until more
tools are at our disposal - see Corollary 8.3.2.

7.8.2 Corollary
Let M = R/Ra be a cyclic module over a Euclidean domain R. Then M
is indecomposable if and only if a = up^n, where p is an irreducible element
of R, u is a unit of R and n ≥ 1.

Proof Suppose that M is indecomposable. By the preceding result, a
cannot have two nontrivial coprime factors, and so the irreducible factorization
of a can involve only one irreducible element of R. Thus a = up^n as claimed.
Conversely, if a = up^n, then M ≅ R/Rp^n is indecomposable by Theorem
7.3.1. □

Exercises
7.1 Let l, p and q be distinct irreducible elements of a Euclidean domain
R. Using the diagrams you found in Exercise 6.6, suggest internal
direct sum decompositions of the modules
(a) R/Rlpq;
(b) R/Rp^2 q;
(c) R/Rp^2 q^2.
(The next chapter contains a systematic method for obtaining such
decompositions.)
7.2 Let N be the C[X]-module given by the matrix

    S = ( 0 1 0 )
        ( 0 0 1 )
        ( 1 0 0 )

acting on C^3, which we considered in Exercise 3.7.
Find a direct sum decomposition of N into three one-dimensional
components, and show that this decomposition is unique.
Investigate what happens if the field of complex numbers is
replaced by the real numbers ℝ, or by the finite fields Z_2, Z_3, or Z_7.
7.3 Let M_1, …, M_k be a set of R-modules, and let L_i be a submodule of
M_i for i = 1, …, k.
Show that there is an R-module isomorphism

(M_1 × ⋯ × M_k)/(L_1 × ⋯ × L_k) ≅ M_1/L_1 × ⋯ × M_k/L_k.

7.4 Let D be the ring of 2 × 2 diagonal matrices over a field F, and let
e = e_{11} and f = e_{22}. Using Exercise 1.7, show that D = De ⊕ Df as
a left D-module.
Prove that the only D-module homomorphism from De to Df is
the zero homomorphism and hence that the D-modules De and Df
are not isomorphic. (This contrasts with part (e) of Exercise 7.8.)
Generalize these results to the ring of n × n diagonal matrices over F.
7.5 Let P_1 × P_2 be an external direct sum of R-modules. Define ω : P_1 ×
P_2 → P_2 × P_1 by ω(p_1, p_2) = (p_2, p_1). Show that ω is an isomorphism
of R-modules.
Given a set of modules {P_1, …, P_k}, k ≥ 2, and any permutation
α of the integers 1, …, k, prove that

P_1 × ⋯ × P_k ≅ P_{α(1)} × ⋯ × P_{α(k)}.

7.6 This exercise and the next anticipate some results that are developed
further in section 14.2. Let R be a ring and let M be a left R-module
with M = L ⊕ N.
(a) Show that the canonical homomorphism π : M → M/L induces an
isomorphism τ : N → M/L.
(b) Let ι = inc : N → M be the inclusion map and let

σ = τ^{-1}π : M → M/L → N.

Show that σι = id_N, the identity map on N.
(c) Conversely, suppose that there is a left R-module Q, and
homomorphisms σ : M → Q and ε : Q → M with σε = id_Q. Verify that σ is
surjective.
Show that for any m ∈ M, we have m - εσ(m) ∈ Ker(σ). Deduce
that M = Ker(σ) ⊕ ε(Q).
Show also that Q ≅ M/Ker(σ).
7.7 Let P = P_1 × P_2 be an external direct sum of R-modules. Write id_i
for the identity map on P_i, i = 1, 2, and define maps as follows:

π_1 : P → P_1,  π_1(p_1, p_2) = p_1
π_2 : P → P_2,  π_2(p_1, p_2) = p_2
ε_1 : P_1 → P,  ε_1(p_1) = (p_1, 0)
ε_2 : P_2 → P,  ε_2(p_2) = (0, p_2).

Verify that these are all homomorphisms, and that the relations

π_1ε_1 = id_1,  π_2ε_2 = id_2,  π_2ε_1 = 0,  π_1ε_2 = 0

and

ε_1π_1 + ε_2π_2 = id_P

hold.
Conversely, suppose we are given a collection of modules P, P_1, P_2
and homomorphisms as above. Show that P ≅ P_1 × P_2.
Generalize this exercise from 2 to k terms.
7.8 Let R be the ring of all n × n matrices over a field F. For each pair
of integers i, j = 1, …, n, let e_{ij} be the matrix with entry 1 in the
(i, j)-th place and all other entries 0. The set of all such matrices is
sometimes called a set of standard matrix units for R (despite the fact
that the matrices e_{ij} are not units of R).
(a) Show that the set of standard matrix units is a basis for R as a
vector space over F.
(b) Prove that

e_{hi}e_{jk} = e_{hk} if i = j, and e_{hi}e_{jk} = 0 if i ≠ j.

(c) For each i, j, let I_j = Re_{ij}. Deduce that I_j is the set of all matrices
A = (0, …, 0, a_j, 0, …, 0), where the j-th column a_j is an arbitrary
vector in the column space F^n and all other columns of A are zero
vectors. (Thus I_j does not depend on the value of i.)
(d) Show that R = I_1 ⊕ ⋯ ⊕ I_n as a left R-module.
(e) For each pair of suffices j, k, define θ_{jk} : I_j → I_k by θ_{jk}(x) = xe_{jk}.
Verify that θ_{jk} is an isomorphism of left R-modules.
(f) Show that R has no two-sided ideals apart from 0 and itself. (This
result generalizes Exercise 1.9.)
7.9 Direct products of rings.
Exercise 7.8 can be generalized by introducing the direct product
of a set of rings. This is the construction for rings that corresponds
to the direct sum for modules. Let R_1, …, R_k be a set of rings, and
let R = R_1 × ⋯ × R_k be the external direct sum of R_1, …, R_k as
additive groups. Define the product by

(r_1, …, r_k)(s_1, …, s_k) = (r_1s_1, …, r_ks_k)

and confirm that R is a ring, with identity element 1_R = (1_1, …, 1_k),
where 1_i is the identity element of R_i.
Show that R is commutative if and only if each R_i is commutative,
but that R is not a domain (save in trivial cases).
For each i, let I_i = {(0, …, 0, r_i, 0, …, 0) | r_i ∈ R_i}. Show that
each I_i is a two-sided ideal of R and that R = I_1 ⊕ ⋯ ⊕ I_k as an
additive group.
Conversely, suppose that we have a set I_1, …, I_k of two-sided
ideals of R and that R = I_1 ⊕ ⋯ ⊕ I_k as an additive group. Write
1 = e_1 + ⋯ + e_k with e_i in I_i. Show that e_i^2 = e_i and that e_ie_j = 0 if
i ≠ j. (So e_1, …, e_k is a set of orthogonal idempotents for R.)
Verify that I_i = Re_i for each i, that I_i is a ring with identity e_i,
and that R ≅ I_1 × ⋯ × I_k as a ring.
Show also that the only R-module homomorphism from I_i to I_j is
0 if i ≠ j.
7.10 Infinite direct sums.
Infinite direct sums of modules are a useful source of
counterexamples. We also need them for our treatment of projective modules
in Chapter 14. Here, we sketch the definition.
First, we must define an ordered set. This is a set I so that, for
any two distinct elements i, j of I, either i < j or j < i, but not both.
We require also that if i < j and j < k, then i < k. The finite sets
{1, …, n} of integers are ordered in the obvious way, which is why we
can avoid any explicit mention of ordered sets in the main part of this
text.
Let R be any ring and let {M_i | i ∈ I} be an infinite set of left
R-modules indexed by an ordered set I. The external direct sum
M = ⊕_I M_i of these modules is defined to be the set of all infinite
sequences

m = (m_i), m_i ∈ M_i for all i,

which satisfy the restriction that only a finite number of terms m_i of
m can be nonzero. Thus, there is an index s(m), which depends on m,
so that m_i = 0 for all i > s(m). (The fact that the order is important
in an external direct sum explains why I must be ordered.)
Define addition and scalar multiplication in M by the obvious
analogy with the finite case. Verify that M is an R-module and that
each M_i is isomorphic to a submodule of M.
If we take M_i = R for all i, we obtain the free module R^I. For each
i ∈ I, let e_i be the element of R^I that has entry 1 in the i-th place and
zeroes elsewhere. Verify that {e_i | i ∈ I} is a basis of R^I, the standard
basis - you will need to formulate the generalization of "basis" to
infinite sets.

7.11 Let R be the set of all infinite matrices A = (a_{ij}) over a field F,
with rows and columns indexed by the positive integers, subject to
the condition that each row and column of A has only a finite number
of nonzero entries. Verify that R is a ring under the expected rules of
addition and multiplication. Find a set of left ideals I_1, I_2, … so that R
is an internal direct sum ⊕ I_i as a left R-module.
Let mR be the set of all matrices A in R that have only a finite
number of nonzero entries. Show that mR is a two-sided ideal of R.
(This provides a contrast to Exercise 7.8 above.)
Chapter 8

Torsion and the Primary Decomposition
Now that we have the language of direct sums at our disposal, we can start
the task of expressing a general module M over a Euclidean domain R as a
direct sum of simpler submodules. The first step is to isolate a submodule
T(M) of M, the torsion submodule of M, which is in a sense the "non-free"
part of M. In a later chapter, we shall see that M = T(M) ⊕ P with P a
free module. If a module is equal to its torsion submodule, that is, it has
no free component, then the module is called a torsion module.
We will show that a finitely generated torsion module can be annihilated
by a nonzero element a of the ring of scalars R. If a can be chosen to be a
power p^n of an irreducible element of R, then the torsion module is called
a p-primary module. The main result in this chapter is that any finitely
generated torsion module can be decomposed into a direct sum of p-primary
submodules, one for each irreducible divisor of a.
An immediate consequence of this result is that a cyclic module can
be decomposed into a direct sum of p-primary indecomposable cyclic
submodules. It will take us several more chapters before we can obtain the
corresponding result for non-cyclic modules in section 12.5.
In this chapter we take the ring of scalars R to be a Euclidean
domain, apart from some preliminary definitions that need R to be only a
commutative domain.


8.1 Torsion elements and modules


Let R be a commutative domain and let M be an R-module. By definition
(section 4.11), the annihilator of an element m ∈ M is the ideal

Ann(m) = {r ∈ R | rm = 0} ⊆ R.

An element m ∈ M is said to be a torsion element of M if

Ann(m) ≠ 0,

that is, there is some nonzero element a ∈ R with am = 0.


The zero element of any module is always a torsion element, since
Ann(0) = R (remember that we insist that a domain is a nonzero ring).
Thus we say that a module is torsion-free if the only torsion element in M
is 0. At the other extreme, a module is a torsion module if all its elements
are torsion.
An example of a torsion-free module is provided by the ring R itself -
since R is a domain, the equation ar = 0 has no solution apart from the
trivial ones with either a = 0 or r = 0. More generally, the standard free
modules R^k are all torsion-free.
On the other hand, any cyclic module of the form R/I with I ≠ 0 is
torsion, since I·(R/I) = 0.
The zero module 0 is allowed to be both torsion and torsion-free. Some
of our results have trivial exceptional cases caused by the presence of
superfluous zero submodules or summands; we shall ignore these.
Notice that the definition of torsion depends on the coefficient ring R.
For example, a field F is always torsion-free when considered to be an
F-module, that is, a one-dimensional space over itself. On the other hand, if
F is viewed as a module over the polynomial ring F[X] with X acting as
λ for some λ in F, then every element of F is annihilated by X - λ and so
F is a torsion F[X]-module.
To handle modules which may contain both torsion and non-torsion
elements, we introduce the torsion submodule T(M) of M:

T(M) = {m € M | m is torsion},

which consists of all the torsion elements in M. Thus M is torsion-free if and


only if T{M) = 0, while M is a torsion module precisely when M = T(M).
The next result justifies the use of the word "submodule", and gives an
important property of T(M).

8.1.1 Proposition
(i) T(M) is a submodule of M.
(ii) M/T(M) is torsion-free.

Proof (i) Suppose that m, n ∈ T(M). Then am = 0 and bn = 0
for nonzero elements a, b of R. Since R is a domain, ab ≠ 0, and clearly
ab(m + n) = 0. Thus m + n ∈ T(M).
If r ∈ R, then a(rm) = r(am) = 0, so rm ∈ T(M), confirming that
T(M) is a submodule.
(ii) Let x̄ ∈ T(M/T(M)), so that there is a nonzero element b of R with
bx̄ = 0. By definition of the quotient module (section 6.1), x̄ = m̄ for some
m in M. Since the zero element of M/T(M) is 0̄, we have bm̄ = bx̄ = 0̄,
and hence bm ∈ T(M).
But then a(bm) = 0 for some a ≠ 0 in R; since ab ≠ 0 and (ab)m = 0, we
have m ∈ T(M) and so x̄ = 0̄ in M/T(M). □
An example. Here is an example of a module which is neither torsion nor
torsion-free.
Let n be a nonzero positive integer and put M = Z × Z_n, the external
direct sum of Z-modules. To describe M as an internal direct sum, take
L = {(y, 0) | y ∈ Z} and N = {(0, z̄) | z̄ ∈ Z_n}. As in section 7.7, we have
M = L ⊕ N.
Let x = (y, z̄) ∈ M, and suppose that a ≠ 0 in Z and that ax = 0.
Then ay = 0 and az̄ = 0̄, which means that y = 0, and so x ∈ N. Thus
T(M) ⊆ N; but in fact we have the equality T(M) = N, since n·(0, z̄) = 0
for all z̄ in Z_n.
The quotient module M/T(M) can be identified with L ≅ Z by using
the First Isomorphism Theorem as in Exercise 7.6.

8.2 Annihilators of modules


Next we extend the definition of annihilators from elements to modules.
Let M be an R-module. The annihilator of M is

Ann(M) = {r ∈ R | rm = 0 for all m ∈ M}.

An easy verification shows that Ann(M) is an ideal of R.
When the module M is cyclic, say M = Rx, then Ann(M) = Ann(x), so
the annihilator of an element is a special case of the annihilator of a module.
If the module M is not torsion, then Ann(M) = 0, since there is some
element m of M which is not in T(M) and so Ann(M) ⊆ Ann(m) = 0.
In the other direction, it is possible for the annihilator of a torsion module
to be 0 - see Exercise 8.6 below. However, the next result shows that this
does not happen in the cases of most interest to us.

8.2.1 Proposition
Suppose that M is a finitely generated R-module. Then M is a torsion
R-module if and only if Ann(M) ≠ 0.

Proof Suppose first that M is torsion. Since M is finitely generated, we
have M = Rx_1 + ⋯ + Rx_s for a finite set of generators x_1, …, x_s of M.
Since M is torsion, each generator x_i has a nonzero annihilator ideal, so we
can choose a nonzero element a_i of Ann(x_i) for each i.
Put a = a_1 ⋯ a_s. Then a ≠ 0, and ax_i = 0 for all i, which implies that
a(r_1x_1 + ⋯ + r_sx_s) = 0 for any element of M. Thus a ∈ Ann(M).
The converse argument is obvious. □

8.2.2 Corollary
Let F be a field, let A be an n × n matrix over F and suppose that M
is the F[X]-module obtained from F^n with X acting as A.
Then M is a torsion F[X]-module.

Proof Since the vector space of all n × n matrices over F has dimension n^2,
the n^2 + 1 powers I, A, …, A^{n^2} must be linearly dependent. Thus there is
a nonzero polynomial g(X) of degree at most n^2 with g(A) = 0. It follows
that g(X) is a nonzero element of Ann(M). □

Remark: the annihilator of M actually contains a polynomial of degree n,
namely, the characteristic polynomial of A - see Exercise 9.2.
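The remark can be illustrated with sympy (our sketch, not part of the
original text): the characteristic polynomial g of a sample matrix A satisfies
g(A) = 0, so g lies in Ann(M) and has degree n rather than n^2.

    import sympy as sp

    A = sp.Matrix([[2, 1], [0, 3]])
    X = sp.symbols('X')
    g = A.charpoly(X).as_expr()            # X**2 - 5*X + 6
    # evaluate g at the matrix A; Cayley-Hamilton predicts the zero matrix
    gA = sum((g.coeff(X, i) * A**i for i in range(3)), sp.zeros(2, 2))
    assert gA == sp.zeros(2, 2)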

8.3 Primary modules


For the remainder of this chapter, we shall assume that the coefficient ring
R is a Euclidean domain.
We know from section 2.9 that a nonzero element a of R has a standard
factorization

a = u p_1^{n(1)} ⋯ p_k^{n(k)},

where u is a unit of R, p_1, …, p_k are distinct (that is, nonassociated)
irreducible elements of R, and n(1), …, n(k) are positive integers. The set
p_1, …, p_k is uniquely determined by the element a, apart from its ordering,
and the integers n(1), …, n(k) are unique once an ordering of the
irreducible factors has been chosen.

Let M be a finitely generated torsion R-module and write Ann(M) =
Ra. Then M is said to be a p-primary module if there is a single irreducible
element p of R with a = up^n for some unit u of R; in this case, Ra = Rp^n,
so we can omit the unit u.
A cyclic module of the form R/Rp^n is evidently p-primary, with
annihilator Rp^n, and we will eventually prove that any p-primary module is a
direct sum of such cyclic modules with varying exponents n.
For the moment, we will show that any torsion module can be expressed
as a direct sum of primary components. The following lemma provides the
key tool.

8.3.1 Lemma
Suppose that R is a Euclidean domain and that M is an R-module with
Ann(M) = Ra, a ≠ 0. Suppose also that a = bc for coprime elements b, c
in R. Then the following hold.
(i) M = bM ⊕ cM, where

bM = {bm | m ∈ M} and cM = {cm | m ∈ M}.

(ii) Ann(bM) = Rc and Ann(cM) = Rb.

Proof (i) Since b and c are coprime, we have 1 = xb + yc for some
elements x, y of R. Then for any m in M,

m = 1·m = b(xm) + c(ym),

and so

M = bM + cM.

If m ∈ bM ∩ cM, we have m = bl = cn for some l, n ∈ M and hence
cm = cbl = al = 0 and similarly bm = 0. Expanding 1·m again, it follows
that m = 0, which gives bM ∩ cM = 0 and hence M = bM ⊕ cM.
(ii) Suppose that x ∈ Ann(bM). Then xbm = 0 for all m in M, and so
xb ∈ Ann(M) = Rbc. Since R is a domain, this means that x ∈ Rc, which
shows that Ann(bM) ⊆ Rc. But clearly Rc(bM) = 0, so that Ann(bM) =
Rc. The other equality follows by symmetry. □
Combining the above result with Theorem 6.5.2, we obtain the "internal"
version of the Chinese Remainder Theorem (7.8.1) that was promised
in the preceding chapter.

8.3.2 Corollary
Let M = R/Rbc be a cyclic R-module where b and c are coprime
elements in R. Then

M = bM ⊕ cM

with

bM ≅ R/Rc and cM ≅ R/Rb.

8.4 The p-primary component


Let M be a finitely generated torsion R-module, where R is a Euclidean
domain, and let p be an irreducible element in R. The p-primary submodule
or component of M is

T_p(M) = {m ∈ M | p^r m = 0 for some r ≥ 1}.

An argument similar to that in the proof of Proposition 8.1.1 shows that
T_p(M) is a submodule of M.
It is not quite obvious that a module is p-primary if and only if M =
T_p(M) - there is a slight problem with the question of whether or not a
submodule of a finitely generated module need itself be finitely generated.
This is in fact always the case when R is Euclidean, but we have to wait
until Theorem 9.4.2 is available before we can use this fact. For a similar
reason, we have not defined the p-primary submodule of an arbitrary finitely
generated R-module - we do not know yet that the torsion submodule is
again finitely generated.
The next result shows how the nontrivial p-primary components of a
module are determined by its annihilator.

8.4.1 Theorem
Suppose that R is a Euclidean domain and that M is a finitely generated
torsion R-module with Ann(M) = Ra, a ≠ 0. Let p be an irreducible
element of R and write a = p̃p^n with p̃ coprime to p.
(i) If p is not a factor of a (so that a = p̃), then T_p(M) = 0.
(ii) In general, the p-primary component T_p(M) of M is p̃M.
(iii) There is a direct sum decomposition

M = p̃M ⊕ p^n M.

Proof
(i) Since p does not divide a, the elements a and p^r are coprime in R
for any r ≥ 1. Thus, given m in T_p(M) with p^r m = 0, we can write
1 = xa + yp^r for some x, y ∈ R, and then m = 1·m = x(am) + y(p^r m) = 0.
(ii) By Lemma 8.3.1, we have M = p̃M ⊕ p^n M, and Ann(p̃M) = Rp^n.
If m_1, …, m_t is a finite set of generators for M, then p̃m_1, …, p̃m_t is a
finite set of generators for p̃M, which is thus a p-primary module according
to our definition. This shows that p̃M ⊆ T_p(M).
To prove the reverse inclusion, suppose that m is an element of T_p(M).
Then m = x + y with x ∈ p̃M and y ∈ p^n M. But y must be annihilated by
a power of p, since m and x both are, and by p̃, since p^n M is; as p̃ is coprime
to any power of p, it follows that y = 0, and so m is in p̃M,
which gives the desired equality.
(iii) The decomposition is now immediate from Lemma 8.3.1. □
We can now give the complete primary decomposition of a torsion module.

8.4.2 Theorem
Suppose that R is a Euclidean domain and that M is a finitely generated
torsion R-module with Ann(M) = Ra, a ≠ 0.
Let a = u p_1^{n(1)} ⋯ p_k^{n(k)} be a standard factorization of a, where u is a
unit of R and p_1, …, p_k are distinct irreducible elements of R. Choose p̃_i
so that a = p_i^{n(i)} p̃_i for i = 1, …, k.
Then, for each i = 1, …, k, the p_i-primary component T_{p_i}(M) of M is
p̃_i M, and M has the direct sum decomposition

M = T_{p_1}(M) ⊕ ⋯ ⊕ T_{p_k}(M).

Proof We induce on k. If k = 1, then M is p_1-primary by definition, and
the direct sum "decomposition" has only one term.
Assume now that k ≥ 2. By the previous result, Theorem 8.4.1, we
have T_{p_1}(M) = p̃_1 M and

M = T_{p_1}(M) ⊕ p_1^{n(1)} M.    (8.1)

By Lemma 8.3.1, p_1^{n(1)} M has annihilator Rc with c = p_2^{n(2)} ⋯ p_k^{n(k)}.
Choose c̃_i so that c = p_i^{n(i)} c̃_i for i = 2, …, k. Then, for each i, the
induction hypothesis tells us that the p_i-primary submodule of p_1^{n(1)} M is
c̃_i p_1^{n(1)} M, which is simply p̃_i M. We also know, by the induction hypothesis
again, that

p_1^{n(1)} M = T_{p_2}(p_1^{n(1)} M) ⊕ ⋯ ⊕ T_{p_k}(p_1^{n(1)} M)
           = p̃_2 M ⊕ ⋯ ⊕ p̃_k M.    (8.2)

Combining Eqs. (8.1) and (8.2), we see that

M = p̃_1 M ⊕ ⋯ ⊕ p̃_k M.

Since Theorem 8.4.1 gives the equalities p̃_i M = T_{p_i}(M) for all i, the result
follows. □

8.5 Cyclic modules


Our results enable us to find the primary decomposition of a cyclic module
over a Euclidean domain R. By Theorem 6.4.1, such a module has the form
R/I for a unique ideal I of R. If I = 0, the cyclic module is R itself, which
is not torsion.
We therefore assume that I is not zero, so that I = Ra for some nonzero
element a of R. The element a can be multiplied by a unit of R (Lemma
1.8.1) without changing the ideal I, so we can assume that a has a standard
factorization a = p_1^{n(1)} ⋯ p_k^{n(k)}.
By our main theorem above,

T_{p_i}(R/Ra) = p̃_i(R/Ra) = Rp̃_i/Ra.

Since a = p_i^{n(i)} p̃_i, we have Rp̃_i/Ra ≅ R/Rp_i^{n(i)} (Theorem 6.5.2). Thus we
have isomorphisms

T_{p_i}(R/Ra) ≅ R/Rp_i^{n(i)} for i = 1, …, k,

and so, using Proposition 7.7.1, we can express R/Ra as an external direct
sum of primary modules:

R/Ra ≅ R/Rp_1^{n(1)} × ⋯ × R/Rp_k^{n(k)}.

Note that, by Theorem 7.3.1, the modules R/Rp_i^{n(i)} are all
indecomposable, so we cannot split R/Ra into smaller components, at least, not by
this technique. A uniqueness theorem to be proved later shows that any
other method of decomposing R/Ra into indecomposable modules must
give essentially the same result.
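For R = Z the decomposition is simply the prime factorization of the
modulus; a short check with sympy (ours, not part of the original text):

    import math
    import sympy

    a = 360
    factors = sympy.factorint(a)                      # {2: 3, 3: 2, 5: 1}
    components = [p**n for p, n in factors.items()]   # [8, 9, 5]
    assert math.prod(components) == a                 # Z_360 = Z_8 x Z_9 x Z_5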

8.6 Further developments


Our treatment of torsion depends heavily on the hypotheses that R is
commutative and that it is a domain. The primary decompositions given in
Theorems 8.4.1 and 8.4.2, for which we assume R to be Euclidean, hold
over principal ideal domains in general.
The theory of torsion can be extended to modules over arbitrary
commutative rings, where it is usually treated in tandem with the theory of
localization, which does not make an appearance in these notes. There is a
good theory of primary decomposition for modules over commutative rings,
which can be found in Chapter 4 of [A & McD] or of [Sharp]. There is also
a useful definition of torsion over noncommutative domains provided a
further requirement, the "Ore condition", is satisfied - see §2.1 of [McC & R].
The construction of satisfactory primary decomposition theories for classes
of noncommutative rings is a difficult problem; again the reader should look
at [McC & R] and the references therein.

Exercises
8.1 (In these exercises, R is a Euclidean domain and all modules are
finitely generated R-modules, unless otherwise stated.)
Suppose that L is a submodule of M. Show that T(L) is a
submodule of T(M) and that T_p(L) is a submodule of T_p(M) for any
irreducible element p of R.
More generally, if θ : L → M is an R-module homomorphism,
show that θ induces a homomorphism T(θ) from T(L) to T(M), and
likewise for T_p.
If θ is a surjection, does it follow that T(θ) is a surjection?
8.2 Suppose there is an isomorphism θ : L → M. Show that T(θ) :
T(L) → T(M) and T_p(θ) : T_p(L) → T_p(M) are also isomorphisms.
8.3 Show that every free R-module is torsion-free.
8.4 Suppose that M = L ⊕ N. Prove that
(a) T(M) = T(L) ⊕ T(N);
(b) T_p(M) = T_p(L) ⊕ T_p(N) for any irreducible element p of R.
Generalize these results to the case that M has k > 2 components.
8.5 Suppose that P = L × N. Prove that
(a) T(P) ≅ T(L) × T(N);
(b) T_p(P) ≅ T_p(L) × T_p(N) for any irreducible element p of R.
Generalize these results to external direct sums with k > 2
components.
8.6 Let the Z-module M = ⊕_{i≥2} Z_i be the infinite direct sum of the finite
cyclic modules Z_i (Exercise 7.10).
Verify that every element of M is torsion, but that Ann(M) = 0.

8.7 Suppose that b and c are nonzero elements of R which are not coprime.
In this example, we outline an "elementary" argument which shows
that the direct product R/Rb × R/Rc is not cyclic. This should be
contrasted with the Chinese Remainder Theorem 7.8.1.
The fact that R/Rb × R/Rc is not cyclic can also be deduced
from the general uniqueness results which we shall obtain in Theorem
12.8.1 and section 12.9.
The argument is by contradiction, using the fact that a module
M is cyclic if and only if there is a surjection from R to M (section
4.11).
Verify the following statements, in which M = R/Rb × R/Rc.
(a) There is an irreducible element p of R which divides both b and c.
(b) There are positive integers m and n with T_p(R/Rb) ≅ R/Rp^m,
T_p(R/Rc) ≅ R/Rp^n, and

T_p(M) ≅ R/Rp^m × R/Rp^n.

(c) If M is cyclic, so is T_p(M) (see Exercises 7.6 and 7.7).
(d) T_p(M)/pT_p(M) ≅ R/Rp × R/Rp.
(e) If M is cyclic, there is an R-module surjection β : R → R/Rp ×
R/Rp.
(f) Rp ⊆ Ker(β), and so there is an induced surjection β̄ : R/Rp →
R/Rp × R/Rp.
(g) β̄ is also an R/Rp-module homomorphism.
(h) This cannot happen, since R/Rp is a field (Proposition 2.11.1)
and so there can be no surjective R/Rp-linear transformation from
R/Rp to (R/Rp)^2.
Chapter 9

Presentations
Our results to date have given us a reasonable hold on the theory of cyclic
modules over a Euclidean domain R. Combining Theorem 6.4.1 with the
factorization theorems for elements of R, we know that a cyclic R-module
has the form R/Ra for an element a of R which is unique up to multiplication
by a unit, and we can calculate the primary decomposition of R/Ra
as in section 8.5. The problem we now face is to extend these results to
arbitrary finitely generated R-modules. To provide some motivation for
what follows, we take another look at how we recognize a cyclic module as
being isomorphic to one of the form R/Ra.
The statement that a module M is cyclic tells us that it has a single
generator, say m; then M = Rm. The generator will satisfy some "relations",
that is, equations of the form rm = 0 for r ∈ R. If the only such
equation is the trivial one, 0m = 0, then {m} is a basis for M, which means
that M is free and isomorphic to R.
If there are nontrivial relations, then there is a "fundamental" relation
am = 0 with the property that all other relations are consequences of this
fundamental relation, that is, they take the form (xa)m = 0 for x ∈ R. This
assertion is simply a reinterpretation of the fact that the set of coefficients r
with rm = 0 forms the annihilator ideal Ann(m) of m, which is a principal
ideal Ra. The First Isomorphism Theorem now assures us that M ≅ R/Ra.
It is the analysis of generators and relations that provides the key to
our description of R-modules in general. Suppose that M has a finite set
{m_1, …, m_t} of generators, which means that each element m in M can be
written

m = r_1m_1 + ⋯ + r_tm_t for some r_1, …, r_t ∈ R.

Then there are usually some relations between the generators, that is,
identities of the form

r_1m_1 + ⋯ + r_tm_t = 0.


J"ITTII H 4- rtm t = 0.
If there is no relation except the trivial one with all coefficients r; = 0,
then the generating set is, by definition, a basis of M, and M = i?* is a free
module as in Chapter 5.
When there are nontrivial relations, we have two tasks. The first is to
isolate a fundamental set of such relations, that is, a set of relations from
which all others can be derived. The next is to reshape the fundamental
relations so that the structure of M becomes transparent.
Our general definitions in this chapter require only that the coefficient
ring R is a commutative domain, but we need to take R to be a Euclidean
domain to obtain some results.

9.1 The definition


Let M be an R-module. A presentation of M is a surjective R-module
homomorphism

θ : R^t → M,

from a standard free module R^t to M.
The free module R^t has the standard basis {e_1, ..., e_t}, so that each
element x ∈ R^t can be written

x = r_1 e_1 + ... + r_t e_t

for some unique members r_1, ..., r_t of R (section 5.1). Thus, given a presentation
θ of a module M, each element m of M has the form

m = θ(x)
  = r_1 θ(e_1) + ... + r_t θ(e_t),

which shows that {θ(e_1), ..., θ(e_t)} is a finite set of generators for M.
Conversely, given a finite set of generators {m_1, ..., m_t} for M, we can
define a presentation θ by setting

θ(x) = r_1 m_1 + ... + r_t m_t  for all x ∈ R^t,

and then

θ(e_1) = m_1, ..., θ(e_t) = m_t.

Notice that a module has many different presentations, since each choice of
a set of generators gives rise to a presentation.

Examples.
(i) A cyclic module R/Ra has a presentation π : R → R/Ra, π(r) = r̄ = r · 1̄,
and also a presentation −π : r ↦ r · (−1̄).
(ii) A module will have generating sets of different sizes, and so it will have
presentations involving free modules of different ranks. For example, take
M = R/Ra × R/Rb, the external direct product of two cyclic modules,
and put m_1 = (1̄, 0) and m_2 = (0, 1̄). Then M = Rm_1 ⊕ Rm_2, and there
is a presentation ρ : R² → M given by

ρ(r_1, r_2) = (r̄_1, r̄_2) = r_1 m_1 + r_2 m_2.

If a, b are coprime, then M is cyclic by the Chinese Remainder Theorem
7.8.1 and so M has a presentation as in (i) above; on the other hand,
if a, b are not coprime, then M is not cyclic (Exercise 8.7).

9.2 Relations
Suppose that θ : R^t → M is a presentation of M, and write

m_1 = θ(e_1), ..., m_t = θ(e_t)

for the corresponding generators. Informally, a relation between the generators
of M is an expression of the form

r_1 m_1 + ... + r_t m_t = 0,  r_1, ..., r_t ∈ R.

However, this informal definition does not lend itself to calculation, the
problem being that it is not clear when one relation is to be regarded as a
consequence of others. To overcome this difficulty, we notice that

r_1 m_1 + ... + r_t m_t = 0  ⟺  r_1 e_1 + ... + r_t e_t ∈ Ker(θ),

and so we make the formal definition that a relation on M is to be an
element of the kernel Ker(θ) of θ. The module Ker(θ) is called the relation
module for M with respect to θ.
Properly speaking, we should talk of relations on a particular set of
generators of M rather than on M itself, but it will be clear from the
context which set of generators is being used at any particular time.
Suppose that we can find a finite set of generators {ρ_1, ..., ρ_s} for
Ker(θ), say

ρ_1 = γ_11 e_1 + ... + γ_t1 e_t
 ⋮                                  (9.1)
ρ_s = γ_1s e_1 + ... + γ_ts e_t.

Then any element of Ker(θ) is a linear combination of {ρ_1, ..., ρ_s}, which
fact can be interpreted as meaning that any relation among the generators
{m_1, ..., m_t} of M is a consequence of the "basic" relations

0 = γ_11 m_1 + ... + γ_t1 m_t
 ⋮                                  (9.2)
0 = γ_1s m_1 + ... + γ_ts m_t.

We therefore say that the set of relations {ρ_1, ..., ρ_s} is a set of defining
relations for M.
For example, the cyclic module R/Ra has one defining relation, ρ_1 =
ae_1, while the direct sum R/Ra × R/Rb has two defining relations, ρ_1 = ae_1
and ρ_2 = be_2.

9.3 Defining a module by relations


So far, we have discussed presentations of a given module. Next, we look
at the reverse procedure, that is, the construction of a module with a given
set of defining relations.
Suppose that we are given a set of elements {ρ_1, ..., ρ_s} in R^t as in Eq.
(9.1). We form the submodule

K = Rρ_1 + ... + Rρ_s

of R^t generated by the given elements and let M = R^t/K. There is a
canonical surjection π : R^t → M which is a presentation for M, with
relation module K = Ker(π). Thus M has defining relations {ρ_1, ..., ρ_s},
and so, naturally enough, M is called the module defined by the given
relations.
Less formally, we say that M is defined by a set of generators and
relations as in Eq. (9.2) if M is defined by the corresponding relations
{ρ_1, ..., ρ_s} in R^t.

9.4 The fundamental problem


As we have noted, a particular module will have many presentations and
so many sets of defining relations. Conversely, two sets of defining relations
may or may not lead to the same module. Our fundamental problem then
is to give a procedure that allows us to determine whether or not two sets
of defining relations do give the same module.

To illustrate the problem, here is a collection of sets of defining relations
for various Z-modules, together with a calculation of the modules so defined.
(i) Defining relations 3m_1 = 0, 5m_2 = 0.
There is a presentation θ : Z² → M with kernel K generated by 3e_1
and 5e_2. It's clear that

K = 3Z × 5Z ⊆ Z × Z,

so that M ≅ Z_3 × Z_5 (see Exercise 7.3).
By the Chinese Remainder Theorem 7.8.1, M ≅ Z_15 also.
(ii) Defining relation 3m_1 − 5m_2 = 0.
Here, the relation module K has one generator ρ_1 = 3e_1 − 5e_2. Since K
has rank 1 and Z² has rank 2, it is reasonable to suspect that M = Z²/K
contains a module of rank 1, that is, a module isomorphic to Z.
Now notice that 1 = 2·3 − 1·5 in Z, and put n_1 = 5 and n_2 = 3.
Then n_1, n_2 generate Z, and there is a presentation ω : Z² → Z, ω(y, z) =
yn_1 + zn_2, with kernel K. Hence M ≅ Z.
Observe that the relation in this example is a linear combination of the
relations in the first example.
(iii) Defining relations 3m_1 − 5m_2 = 0, 5m_2 = 0.
These relations are equivalent to those in the first example, since 3m_1 =
0 obviously. So we get the same module.
(iv) Defining relations 3m_1 − 5m_2 = 0, 6m_1 = 0.
Clearly, 10m_2 = 0. Put n_1 = 2m_1 and n_2 = 2m_2, so that n_1, n_2
generate the submodule 2M of M. But n_1, n_2 satisfy the same relations as
the generators in the first example, which gives 2M ≅ Z_15.
We need to calculate M/2M. This module has generators p_1, p_2, which
are the images of m_1, m_2 respectively. These generators must satisfy the
original relations together with the relations 2p_1 = 0 = 2p_2. Thus p_1 = p_2,
so that M/2M ≅ Z_2.
It is now not hard to show that

M ≅ Z_30 ≅ Z_2 × Z_3 × Z_5

by using the Chinese Remainder Theorem together with the results in section 8.5.
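This structure can also be confirmed by a direct computation. Since 30e_1 = 5(6e_1) and 30e_2 = 3(6e_1) − 6(3e_1 − 5e_2) both lie in the relation module K, the quotient M = Z²/K can be computed inside (Z/30)². The Python sketch below (ours) closes the images of the defining relations under addition, counts the cosets, and finds a coset of order 30.

    # Brute-force check of example (iv): M = Z^2/K, with K generated
    # by (3, -5) and (6, 0).  Both 30e1 and 30e2 lie in K, so we may
    # work in (Z/30)^2.
    N = 30
    gens = [(3, (-5) % N), (6, 0)]

    S = {(0, 0)}                      # image of K in (Z/30)^2
    frontier = [(0, 0)]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = ((x[0] + g[0]) % N, (x[1] + g[1]) % N)
            if y not in S:
                S.add(y)
                frontier.append(y)

    def coset_order(v):
        # order of v + S in the quotient (Z/30)^2 / S
        n, cur = 1, v
        while cur not in S:
            cur = ((cur[0] + v[0]) % N, (cur[1] + v[1]) % N)
            n += 1
        return n

    size_M = (N * N) // len(S)
    max_ord = max(coset_order((x, y)) for x in range(N) for y in range(N))
    print(size_M, max_ord)            # 30 30: M is cyclic of order 30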
The next result tells us that a set of defining relations can always be
taken to be finite.

9.4.1 Theorem
Let R be a Euclidean domain and let K be an R-submodule of a standard
free R-module R^t of rank t. Then K is itself free, of rank s with s ≤ t.

Proof We argue by induction on t. If t = 0, then R^t = 0 = K trivially. If
t = 1, then K is an ideal of R and so K = Ra is principal; the rank of K
is 0 or 1, depending on whether a is zero or nonzero.
Now suppose that t > 1, and let σ : R^t → R be "projection to the last
term", that is,

σ(r_1 e_1 + ... + r_t e_t) = r_t.

Put I = Im(σ). Since σ is an R-module homomorphism, I is an ideal of R.
If I = 0, then K ⊆ R^{t−1}, so we are finished by our induction hypothesis.
So suppose that I = Ra is not 0. By definition of I, there is an element w
in K with σ(w) = a, that is,

w = w_1 e_1 + ... + w_{t−1} e_{t−1} + a e_t,

where e_1, ..., e_t is the standard basis of R^t. For any other element

x = x_1 e_1 + ... + x_{t−1} e_{t−1} + x_t e_t ∈ K,

we have x_t = ba for some b in R. Then

x − bw ∈ Ker(σ) ∩ K.

Write L = Ker(σ) ∩ K. We have

x = (x − bw) + bw ∈ L + Rw,

which gives

K = L + Rw.

If x ∈ L ∩ Rw, then x = bw with ba = 0. Thus b = 0 and so x = 0. This
establishes that

K = L ⊕ Rw.

Now L is a submodule of R^{t−1}, so the induction hypothesis tells us that
L is free of rank h say, with h ≤ t − 1. Let {c_1, ..., c_h} be a basis of L.
Since Rw has basis the single element w (and so has rank 1), K is free with
basis {c_1, ..., c_h, w} and rank s = h + 1 ≤ t. □

Remarks.
(i) This theorem assures us that a finitely generated R-module with t generators
has a presentation that requires s ≤ t defining relations.
(ii) In vector space theory, a generating set can be reduced to a basis by
omitting elements. Such easy arguments cannot be expected to work for
modules over Euclidean domains: consider the generating set {2, 3} of Z.

(iii) The argument in the proof of the theorem is, in principle, constructive.
Given an explicit set of generators

{ρ_1, ..., ρ_u}

for K, we can compute a generator a of Im(σ) as an R-linear combination
of

{σ(ρ_1), ..., σ(ρ_u)}

by Euclid's algorithm. Thus we can find a linear combination w of the given
generators so that σ(w) = a, and hence a set of generators of K ∩ R^{t−1} as
in the theorem.
However, this method is not so easy to implement if we are faced with
a large number of generators and relations, and the technique that we will
give in the next chapter is better for computations.
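For R = Z, the Euclid's-algorithm step in remark (iii) is easy to mechanize. A short sketch (ours): the extended algorithm returns gcd(a, b) together with Bézout coefficients, and folding it over a list expresses a generator of the ideal (σ(ρ_1), ..., σ(ρ_u)) as an explicit Z-linear combination.

    def ext_gcd(a, b):
        # returns (g, x, y) with g = gcd(a, b) = x*a + y*b
        if b == 0:
            return (a, 1, 0)
        g, x, y = ext_gcd(b, a % b)
        return (g, y, x - (a // b) * y)

    def ideal_generator(values):
        # a generator g of the ideal (values) in Z, together with
        # coefficients expressing g as a Z-linear combination
        g, coeffs = 0, []
        for v in values:
            g2, x, y = ext_gcd(g, v)
            coeffs = [x * c for c in coeffs] + [y]
            g = g2
        return g, coeffs

    g, coeffs = ideal_generator([12, 9, 12])
    print(g, coeffs)   # 3 [1, -1, 0]
    assert sum(c * v for c, v in zip(coeffs, [12, 9, 12])) == g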
We digress from the main topic of the chapter to give an important
result that follows directly from the preceding theorem.

9.4.2 Theorem
Let R be a Euclidean domain and let M be a finitely generated R-module.
Then every R-submodule of M is also finitely generated as an R-module.

Proof Let L be a submodule of M, and let π : R^t → M be a presentation
of M. Then the inverse image π*(L) of L in R^t is a submodule of R^t
(Proposition 4.13). By the theorem above, π*(L) is finitely generated, and
so also is L = π_*(π*(L)). □

9.5 The presentation matrix


Suppose that the module M has defining relations

ρ_1 = γ_11 e_1 + ... + γ_i1 e_i + ... + γ_t1 e_t,
 ⋮
ρ_j = γ_1j e_1 + ... + γ_ij e_i + ... + γ_tj e_t,
 ⋮
ρ_s = γ_1s e_1 + ... + γ_is e_i + ... + γ_ts e_t

as in Eq. (9.1). The corresponding presentation matrix for M is defined to
be the t × s matrix

      ( γ_11 ... γ_1j ... γ_1s )
      (  ⋮        ⋮        ⋮   )
Γ =   ( γ_i1 ... γ_ij ... γ_is )
      (  ⋮        ⋮        ⋮   )
      ( γ_t1 ... γ_tj ... γ_ts );

notice that the coordinate vector of ρ_j gives the j-th column of the matrix.
Despite Theorem 9.4.1, we allow the possibility that s > t. The reason for
this is that a module may be presented to us with surplus relations, but it
may be far from obvious which relations can be discarded.
We can reverse our point of view, obtaining a module from a matrix.
Given any t × s matrix Γ over R, we define the submodule K of R^t to
be that with generators ρ_1, ..., ρ_s as above, and then define M to be the
quotient module M = R^t/K. By construction, Γ is a presentation matrix
for M.

9.6 The presentation homomorphism


It is sometimes useful to interpret the presentation matrix as the matrix of
a homomorphism γ : R^s → R^t between free modules.
Define the presentation homomorphism γ by γ(x) = Γx for each column
vector x ∈ R^s. If we write E = {e_1, ..., e_t} for the standard basis of R^t
and B = {b_1, ..., b_s} for the standard basis of R^s, then

γ(b_1) = ρ_1 = γ_11 e_1 + ... + γ_t1 e_t,
 ⋮                                        (9.3)
γ(b_s) = ρ_s = γ_1s e_1 + ... + γ_ts e_t.

Thus Γ = (γ)_{E,B} as in section 5.9.
Notice that the image Im(γ) of γ is the relation module Ker(θ) of the
corresponding presentation θ : R^t → M of M. In general, γ will not be an
injective homomorphism. We record the circumstances in which it is.

9.6.1 Theorem
The presentation homomorphism γ is injective if and only if the set
{ρ_1, ..., ρ_s} of defining relations forms a basis of the relation module K =
Ker(θ).

Proof We know that the defining relations generate K, so they form a basis
precisely when they are linearly independent. But ρ_i = γ(b_i) for all i by
definition of γ, and since b_1, ..., b_s is the standard basis of R^s, the defining
relations are linearly independent if and only if γ is injective. □

9.7 F[X]-module presentations


Suppose that A is an n × n matrix with coefficients in a field F, and that M
is the F[X]-module obtained from F^n with X acting as A, in the familiar
way. Then there is a canonical presentation matrix for M as an F[X]-module,
which we now describe.
Let {e_1, ..., e_n} be the standard F[X]-basis of F[X]^n, let {ē_1, ..., ē_n}
be the standard F-basis of F^n, and define

π : F[X]^n → F^n

to be the F[X]-module homomorphism with

π(e_1) = ē_1, ..., π(e_n) = ē_n.

The action of X on the standard basis of F^n is given by the products
Aē_1, ..., Aē_n, which on expansion read

Xē_1 = a_11 ē_1 + ... + a_j1 ē_j + ... + a_n1 ē_n,
 ⋮
Xē_j = a_1j ē_1 + ... + a_jj ē_j + ... + a_nj ē_n,
 ⋮
Xē_n = a_1n ē_1 + ... + a_jn ē_j + ... + a_nn ē_n,

from which we see that the defining relations for M are

ρ_1 = (X − a_11)e_1 − ... − a_j1 e_j − ... − a_n1 e_n,
 ⋮
ρ_j = −a_1j e_1 − ... + (X − a_jj)e_j − ... − a_nj e_n,
 ⋮
ρ_n = −a_1n e_1 − ... − a_jn e_j − ... + (X − a_nn)e_n,



and that the presentation matrix for M is

      ( X − a_11   −a_12   ...   −a_1n   )
Γ =   (  −a_21   X − a_22  ...   −a_2n   )     (9.4)
      (    ⋮          ⋮              ⋮    )
      (  −a_n1    −a_n2    ...  X − a_nn ),

which is the characteristic matrix of the matrix A.
Let XI be the product of X with the n × n identity matrix I, that is,
the n × n diagonal matrix with diagonal entries all equal to X. Then we
can write the characteristic matrix more succinctly as

XI − A.     (9.5)

The determinant

det(Γ) = det(XI − A)

is the characteristic polynomial of A, which plays an important role both
in elementary linear algebra and in our analysis of the structure of F[X]-modules.
A straightforward expansion shows that the characteristic polynomial
is monic, with

det(XI − A) = X^n − (a_11 + ... + a_nn)X^{n−1} + ... + (−1)^n det(A).
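The expansion can be checked symbolically. A small sketch using SymPy (assumed available; ours, not part of the text) for a general 2 × 2 matrix:

    from sympy import symbols, Matrix, eye, expand

    X, a, b, c, d = symbols('X a b c d')
    A = Matrix([[a, b], [c, d]])

    char_matrix = X * eye(2) - A       # the characteristic matrix XI - A
    print(expand(char_matrix.det()))
    # a monic quadratic: X**2 - (a + d)*X + (a*d - b*c), that is,
    # X^2 minus the trace times X, plus det(A)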

9.8 Further developments

The attempt to describe modules in terms of presentations leads to some
of the most fruitful areas of mathematics, in that it provides a method of
characterizing various types of rings and modules.
The method can be illustrated by the key result of the chapter, Theorem
9.4.1, which assures us that any finitely generated module M over a Euclidean
domain R has a free relation module K, and that the rank of K is
at most the number of generators of M. This result holds for commutative
principal ideal domains ([Cohn 1], §10.5) and, in fact, it provides a characterization
of such rings. Suppose that the theorem holds for a commutative
ring R. Then a nonzero ideal I of R must be a free R-module with one
generator and so I = Ra is principal. Moreover a cannot be a zero divisor
in R, since the obvious surjection from R to I cannot have a nonzero free
kernel. Thus R must be a principal ideal domain.
If the ring R is not a principal ideal domain, then the R-modules that
do have a presentation with a free relation module form an interesting type
of R-module.

Another variation on this theme is to look in turn at a presentation
φ : R^s → K of the relation module K of M if K is not free. Perhaps the
relation module K′ of this new presentation is free; if not, we can repeat
the operation. Proceeding in this way, it is possible to build what is called
a free resolution of M. An illustration is given in Exercise 9.3 below.
The study of such resolutions, and the characterizations of rings and
modules that arise through them, is the subject matter of homological algebra.
Introductions to this subject can be found in many texts, for example,
[Rotman] and [Mac Lane].
A ring in which every submodule of a finitely generated left module
is again finitely generated is called a left Noetherian ring. Theorem 9.4.2
therefore assures us that a Euclidean domain is a Noetherian ring. The
class of Noetherian rings is very large; it includes most of the types of
ring of current interest to algebraists, which may be seen by consulting
[McC & R]. An example of a non-Noetherian ring is given in Exercise 9.4.

Exercises
9.1 Let R be a commutative domain and suppose that the R-module M
has a presentation π : R^t → M with square presentation matrix Γ
and that det(Γ) ≠ 0. Using the formulas given in Det 8 of section
5.12, show that the relation module Ker(π) for M contains

det(Γ)e_1, ..., det(Γ)e_t.

Deduce that M is a torsion module with det(Γ) ∈ Ann(M).


9.2 The Cayley-Hamilton Theorem.
Let A be a square matrix over a field F and let M be F^n made
into an F[X]-module with X acting as A. Show that if h(X) =
h_0 + h_1 X + ... + h_k X^k is a polynomial in Ann(M), then

h(A) = h_0 I + h_1 A + ... + h_k A^k = 0.

In particular, if h(X) = det(XI − A), then h(A) = 0.
This result is the Cayley-Hamilton Theorem: "a square matrix
satisfies its characteristic polynomial".
9.3 Let R = F[X, Y] be the polynomial ring in two variables over a field
F, and let M be F regarded as an R-module with both X and Y acting
as 0. The obvious presentation π : R → M has Ker(π) = XR + YR,
which is not a principal ideal of R (Exercise 1.6), and so Theorem
9.4.1 does not hold over R.

Define α : R² → XR + YR by α(r, s) = Xr + Ys. Verify that

τ : R → R²,  τ(w) = (Yw, −Xw),

induces an isomorphism R ≅ Ker(α). Thus we have constructed a
free resolution of M.
Generalize this result to a polynomial ring in three variables.
(Warning: the computations rapidly become rather complicated;
the corresponding resolution for a polynomial ring in k variables has
k terms.)
9.4 Let R = F[X_1, X_2, ...] be the polynomial ring in an infinite set of
variables X_1, X_2, .... (This means that each element f of R is a polynomial
in a finite set of variables X_1, ..., X_k, but there is no bound
on the number k of variables allowed. Addition and multiplication in
R are as expected.)
Show that the ideal I = RX_1 + RX_2 + ..., generated by all the
variables, cannot have a finite set of generators.
Let M be F regarded as an R-module with each variable acting
as 0, and let π : R → M be the evident presentation. Deduce that
any set of relations for M which arises from π must be infinite.
Remark: it can be shown that any presentation of M must have an
infinite set of relations, which is a stronger result than the above.
9.5 Let p_1, ..., p_k be distinct irreducible elements of a Euclidean domain
R, and put

q_i = p_1 ⋯ p_{i−1} p_{i+1} ⋯ p_k,  i = 1, ..., k.

Show that 1 = w_1 q_1 + ... + w_k q_k for some elements w_1, ..., w_k of R,
and deduce that q_1, ..., q_k generate R as an R-module, but that no
proper subset of q_1, ..., q_k generates R.
Define π : R^k → R by π(e_i) = q_i for each i. Show that this
presentation leads to a set of (k − 1)k/2 relations for R.
Find an element w ∈ R^k so that R^k = Ker(π) ⊕ Rw, and show
that Ker(π) has generators

z_1 = e_1 − q_1 w, ..., z_k = e_k − q_k w.

Show that the corresponding presentation matrix is

Γ = I − w·q,

where w is the column vector (w_1, ..., w_k)^T and q is the obvious row
vector.
Prove also that w_1 z_1 + ... + w_k z_k = 0.
Chapter 10

Diagonalizing and
Inverting Matrices

Let R be a Euclidean domain. The discussion in the previous chapter shows
that a finitely generated R-module is specified by a presentation, which in
turn is determined by a presentation matrix Γ with entries in R. We now
show that any matrix Γ over R can be reduced to a standard diagonal
matrix Δ, the invariant factor form of Γ, by elementary row and column
operations. This reduction will allow us to find very nice presentations for
modules.
The diagonalization technique is an algorithm which provides us with
explicit invertible matrices P and Q so that PΓQ = Δ. The algorithm also
allows us to determine whether or not a matrix is invertible, and gives a
method of computing the inverse. A further application is to the equivalence
problem for matrices: given a pair of matrices Γ and Γ′ over R, are there
invertible matrices P and Q with Γ′ = PΓQ?
Throughout this chapter we take the coefficient ring R to be a Euclidean
domain.

10.1 Elementary operations


Our diagonalization technique is to perform a sequence of elementary row
and column operations on a given matrix until we have reduced it to the
desired form. These operations are very nearly the same as those encountered
in elementary linear algebra, but we must be a little more careful as
we are working over a Euclidean domain rather than a field.


Let Γ be a t × s matrix over R. The elementary row operations which
can be performed on Γ are as follows.

EROP 1 Interchange two rows of Γ.
EROP 2 Add to any one row of Γ a multiple of a different row.
EROP 3 Multiply a row of Γ by a unit of R.

The elementary column operations are defined analogously.

ECOP 1 Interchange two columns of Γ.
ECOP 2 Add to any one column of Γ a multiple of a different column.
ECOP 3 Multiply a column of Γ by a unit of R.

Notice that each type of elementary operation has, as a special case,
the identity operation, which does not change any matrix. For example, the
multiplication of a row or column by the identity element 1 of R is a type 3
manifestation of the identity operation. On the other hand, a non-identity
operation may well have no effect on a particular matrix; for example, the
zero matrix is not changed by any elementary operation.
Some explanation is needed for the emphasis in the definitions of type 3
operations. The point here is that we want our elementary operations to be
reversible, that is, we wish to be able to regain the original matrix by using
another elementary operation that is also defined over R. If we multiply a
row by a unit u, we can undo the effect by multiplying by its inverse u⁻¹,
but if we multiply by a non-unit a, we cannot reverse the operation inside
R since a has no inverse in R.
It is obvious that operations of types 1 and 2 can always be reversed.
If we are working over a field F, as is the case in elementary linear
algebra, then a unit of F is the same as a nonzero member of F, so the
definitions usually speak of "multiplication by a nonzero element".
The effect of multiplying a relation by a non-unit is easily illustrated.
The single relation 2m_1 = 0 on one generator m_1 defines the Z-module Z_2,
but the relation 4m_1 = 0 defines instead Z_4.

10.2 The effect on defining relations


Next we consider the correspondence between elementary operations on the
presentation matrix and the corresponding changes in the set of defining
relations.

Recall that the entries of Γ are given by the coefficients γ_ij in the defining
relations of Eq. (9.2):

0 = γ_11 m_1 + ... + γ_t1 m_t,
 ⋮                                  (10.1)
0 = γ_1s m_1 + ... + γ_ts m_t.

The effect of column operations is transparent. Each column of Γ is
made up of the coefficients from an individual relation, so an elementary
column operation on Γ corresponds to the "same" operation on the relations.
Here is a detailed list.

exchanging columns i and j:
    exchanging relations i and j

adding r times column i to column j for j ≠ i:
    adding r times relation i to relation j

multiplying column i by a unit u ∈ R:
    multiplying relation i by u.

The result of row operations is less obvious, since it involves a change
in the generators that occur in the relations. Write Γ′ = (γ′_ij) for the
matrix obtained by performing an elementary row operation on Γ, so that
we obtain a new set of relations on some generators m′_1, ..., m′_t as follows:

0 = γ′_11 m′_1 + ... + γ′_t1 m′_t,
 ⋮                                    (10.2)
0 = γ′_1s m′_1 + ... + γ′_ts m′_t.

These relations must be equivalent to the original set in Eq. (10.1) above,
which requirement determines the new generators. We consider cases.
(1) Suppose that Γ′ is obtained from Γ by exchanging row h and row i.
Then γ′_hj = γ_ij and γ′_ij = γ_hj for all j, while γ′_gj = γ_gj for all g ≠ h, i. In
this case, the change in generators is clear:

m′_h = m_i,  m′_i = m_h,  and  m′_g = m_g for g ≠ h, i.

(2) This is the trickiest case. Suppose that Γ′ is obtained from Γ by
adding r times row h to row i, h ≠ i. Manipulating the new j-th relation,
we obtain

0 = γ′_1j m′_1 + ... + γ′_hj m′_h + ... + γ′_ij m′_i + ... + γ′_tj m′_t
  = γ_1j m′_1 + ... + γ_hj m′_h + ... + (γ_ij + rγ_hj) m′_i + ... + γ_tj m′_t
  = γ_1j m′_1 + ... + γ_hj (m′_h + r m′_i) + ... + γ_ij m′_i + ... + γ_tj m′_t.

Thus, to recover the original j-th relation, we must have

m′_g = m_g for all g ≠ h  and  m′_h = m_h − r m_i.

(3) Suppose that we multiply row i by the unit u of R. The new j-th
relation is

0 = γ′_1j m′_1 + ... + γ′_ij m′_i + ... + γ′_tj m′_t
  = γ_1j m′_1 + ... + uγ_ij m′_i + ... + γ_tj m′_t,

so it is clear that we must have

m′_g = m_g for all g ≠ i  and  m′_i = u⁻¹ m_i.

An example. Here is a numerical example. Consider the following defining
relations for a Z-module:

0 = 2m_1 + 4m_2,
0 = 3m_1 + 7m_2.     (10.3)

The associated matrix is

Γ = ( 2 3 )
    ( 4 7 ).

Using column operations (and omitting some steps), we make the transformations

Γ → ( 1 0 )
    ( 3 2 ),

giving the relations

0 = m_1 + 3m_2,
0 = 2m_2.

Subtracting 3 times the first row from the second gives the matrix

( 1 0 )
( 0 2 )

and relations

0 = m′_1,
0 = 2m′_2,

with

m′_1 = m_1 + 3m_2,  m′_2 = m_2.

Thus the Z-module defined by the relations (10.3) is Z_2.
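This answer is easy to confirm mechanically: a point (x, y) of Z² lies in the relation module K generated by the columns (2, 4) and (3, 7) exactly when the corresponding 2 × 2 linear system has an integer solution. The following Python sketch (ours) counts the cosets of K among small representatives.

    from fractions import Fraction
    from itertools import product

    def in_K(x, y):
        # solve 2a + 3b = x, 4a + 7b = y exactly over Q;
        # (x, y) lies in K iff the solution is integral (det = 2)
        a = Fraction(7 * x - 3 * y, 2)
        b = Fraction(-4 * x + 2 * y, 2)
        return a.denominator == 1 and b.denominator == 1

    reps = []
    for x, y in product(range(-5, 6), repeat=2):
        if not any(in_K(x - u, y - v) for u, v in reps):
            reps.append((x, y))
    print(len(reps))   # 2: the module defined by (10.3) is Z_2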

10.3 A matrix interpretation


Before we give the promised diagonalization method, it will be useful to
have an interpretation of elementary operations in terms of multiplication
by invertible matrices.
Suppose that p is one of the elementary row operations, acting on a t x s
matrix T, and denote the resulting matrix by pT. Then we can perform
the same row operation on a t x t identity matrix /, obtaining pi, and then
take the product (pI)T.
Let x be one of the elementary column operations. We denote the result
of applying x to T by Fx- In this case x operates on an s x s identity matrix,
also called / (there should be no genuine chance of getting confused!), and
we can compute the product T(Ix)-
The operations and matrices are related by the following basic lemma.

10.3.1 Lemma
(i) ρΓ = (ρI)Γ.
(ii) Γχ = Γ(Iχ).
(iii) Let ρ⁻¹ and χ⁻¹ be the elementary operations which reverse the
effects of ρ and χ respectively. Then the matrices ρI and Iχ are
both invertible, with inverses ρ⁻¹I and Iχ⁻¹ respectively.
(iv) (ρΓ)χ = ρ(Γχ).

Proof
(i) The proof is by direct calculation. Since only two rows of I and Γ
are altered by the row operation, it is enough to make the verification when
Γ is 2 × s and I is 2 × 2. (The doubtful reader can fill in the unchanging
rows for reassurance.) Suppose, for example, that ρ is of type 2, "add r
times row i to row j, i ≠ j". Ignoring the unaffected rows (and assuming i < j),
we have

ρI = ( 1 0 )
     ( r 1 )

and

(ρI)Γ = ( γ_i1          γ_i2          ...  γ_is         )
        ( γ_j1 + rγ_i1  γ_j2 + rγ_i2  ...  γ_js + rγ_is ),

which is obviously ρΓ. The calculations for the other types of elementary
row operation are easier.
(ii) The argument for column operations is similar, working with t × 2
matrices.

(iii) We have

(ρ⁻¹I)(ρI) = ρ⁻¹(ρI)
           = (ρ⁻¹ρ)I
           = I

and similarly (ρI)(ρ⁻¹I) = I, confirming that (ρI)⁻¹ = ρ⁻¹I.
The calculation for χ is much the same, working on the right rather
than the left.
(iv) In matrix terms, we require (ρI)(Γ · Iχ) = (ρI · Γ)(Iχ), which is
true since matrix multiplication is associative. □
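As a concrete illustration (our sketch, over Z): the operation "add 5 times row 1 to row 2" applied to the identity gives ρI, and left multiplication by ρI performs the same operation on any Γ.

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    def add_row(A, src, dst, r):
        # the elementary row operation: row dst += r * row src
        B = [row[:] for row in A]
        B[dst] = [B[dst][j] + r * B[src][j] for j in range(len(B[0]))]
        return B

    G = [[2, 3], [4, 7]]
    I = [[1, 0], [0, 1]]
    print(add_row(G, 0, 1, 5))             # the operation applied to Gamma
    print(matmul(add_row(I, 0, 1, 5), G))  # (rho I) Gamma: the same matrix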

10.4 Row & column operations in general


We will need to use many elementary row and column operations to
diagonalize a matrix, and so we must expand our definitions to encompass
sequences of operations.
A row operation ρ on a t × s matrix Γ is defined to be the product of
a sequence ρ_1, ..., ρ_h of elementary row operations performed on Γ in the
given order; thus ρ is a product of operators

ρ = ρ_h ⋯ ρ_1,

and

ρΓ = ρ_h(...(ρ_1 Γ)...).

Notice that the elementary operations are listed from the right to the left
in the product since row operations operate on the left of matrices.
Likewise, a column operation χ on Γ is the product of a sequence
χ_1, ..., χ_k of elementary column operations, again performed in the given
order. We can write χ as

χ = χ_1 ⋯ χ_k,

where the terms are ordered from left to right, since column operations act
on the right of matrices.
The ordering of the elementary row operations is very important, since
the same operations in a different order will give a different row operation.
For example, let ρ_1 be "exchange row 1 and row 2", and let ρ_2 be "add row
1 to row 2". Then

(ρ_2ρ_1)I = ( 0 1 )
            ( 1 1 )

while

(ρ_1ρ_2)I = ( 1 1 )
            ( 1 0 ).

Similarly, the ordering of the elementary column operations is crucial in
defining their product column operation.
On the other hand, if we are performing both column and row operations
on a matrix, then the identity (ρΓ)χ = ρ(Γχ) of Lemma 10.3.1 shows that
it does not matter if we perform all the row operations first, and then the
column operations, or vice versa; we can go back and forth between row
and column operations as we wish.
Here is an extension of Lemma 10.3.1.

10.4.1 Lemma
(i) Let ρ = ρ_h ⋯ ρ_1 be a row operation, where each ρ_i is an elementary
row operation. Then ρ is invertible, with inverse

ρ⁻¹ = ρ_1⁻¹ ⋯ ρ_h⁻¹.

(ii) There is a matrix equation

ρI = (ρ_h I) ⋯ (ρ_1 I).

(iii) The matrix ρI is invertible over R, with inverse (ρ_1⁻¹ I) ⋯ (ρ_h⁻¹ I).
(iv) Let χ = χ_1 ⋯ χ_k be a column operation, where each χ_i is an elementary
column operation. Then χ is invertible, with inverse

χ⁻¹ = χ_k⁻¹ ⋯ χ_1⁻¹.

(v) There is a matrix equation

Iχ = (Iχ_1) ⋯ (Iχ_k).

(vi) The matrix Iχ is invertible over R, with inverse (Iχ_k⁻¹) ⋯ (Iχ_1⁻¹).
(vii) (ρΓ)χ = ρ(Γχ).

Proof
(i) Multiplying ρ_h ⋯ ρ_1 by ρ_1⁻¹ ⋯ ρ_h⁻¹, on either side, we see that all
the terms cancel, so we are left with the identity row operation, which does
nothing to any matrix that it operates on. So the product of the inverses
is indeed ρ⁻¹.
(ii) This follows by induction on h from part (i) of Lemma 10.3.1. We
have

ρI = (ρ_h(ρ_{h−1} ⋯ ρ_1))I
   = (ρ_h I)((ρ_{h−1} ⋯ ρ_1)I)
   = (ρ_h I)(⋯(ρ_1 I)).

(iii) Immediate from the above, using the fact that, for invertible matrices
A, B, we have (AB)⁻¹ = B⁻¹A⁻¹.
The assertions about column operations have similar proofs, and the
final claim follows from the associativity of matrix multiplication. □

10.5 The invariant factor form


We now come to the main computational result of this text, which shows
how a matrix over a Euclidean domain can be reduced to a convenient
standard form. Before we give the technique, we must describe this desired
form.
A matrix Δ over R is in invariant factor form or Smith normal form if
it is a t × s diagonal matrix

Δ = ( δ_1  0   ...  0    0  ...  0 )
    ( 0    δ_2 ...  0    0  ...  0 )
    ( ⋮        ⋱                 ⋮ )
    ( 0    0   ...  δ_r  0  ...  0 )
    ( 0    0   ...  0    0  ...  0 )
    ( ⋮                          ⋮ )
    ( 0    0   ...  0    0  ...  0 )

whose nonzero entries δ_1, ..., δ_r satisfy the relations

δ_1 | δ_2,  δ_2 | δ_3,  ...,  δ_{r−1} | δ_r.

Notation: we will write a diagonal matrix as

Δ = diag(δ_1, ..., δ_r, 0, ..., 0)

when convenient.
Now let Γ be any t × s matrix over a Euclidean domain R. An invariant
factor form or Smith normal form for Γ is any diagonal matrix
Δ = diag(δ_1, ..., δ_r, 0, ..., 0) that is itself in invariant factor form and which
is related to Γ by an equation

PΓQ = Δ

in which P and Q are invertible R-matrices.
The nonzero entries δ_1, ..., δ_r are called the invariant factors of Γ, and
the integer r is the rank of Γ.

The invariant factors are not unique, since any of them can be multiplied
by any unit of R. However, this is the only type of change permitted. We
will prove this assertion in the next chapter, Theorem 11.3.1, together with
the uniqueness of the rank. Our immediate task is to show that any matrix
does have an invariant factor form.

10.5.1 The Diagonalization Theorem

Let Γ be a t × s matrix with entries in a Euclidean domain R. Then we
can perform a sequence of elementary row and column operations of types
1 and 2, all of which are defined over R, on Γ, with the result that Γ is
transformed into a t × s matrix Δ that is in invariant factor form.
Furthermore, there are matrices P and Q, both of which are invertible
over R, so that

PΓQ = Δ,

and hence Δ is an invariant factor form for Γ.
Note: we require not only that the matrices P, Q have entries in R but also
that their inverses have entries in R.

Proof
We start by observing that the second assertion is a consequence of
the first. Suppose that ρΓχ = Δ where ρ is a product of elementary row
operations and χ is a product of elementary column operations, and put
P = ρI and Q = Iχ for suitable identity matrices. Then, by Lemma 10.4.1,
both P and Q are invertible over R and PΓQ = Δ.
The proof of the first assertion is by double induction. We induce on
both the size of the matrix and the size of its minimum nonzero entry. We
present the argument in a series of steps; to avoid over-elaborate notation,
we behave as programmers and allow the meaning of Γ and some other
matrices to vary during the argument.
The fact that we use operations only of types 1 and 2 will be clear from
the method. There are two starting cases in which we need not do anything.
If Γ = 0, then it is in the desired form with r = 0, and if Γ is 1 × 1 it is
obviously diagonal.
We define the "size" Φ(Γ) of the minimum entry by using the function
φ : R → Z (section 2.1). Suppose that Γ = (γ_ij) is nonzero. Then we put

Φ(Γ) = min{φ(γ_ij) | γ_ij ≠ 0}.

Now for the procedure. We assume Γ ≠ 0 and that Γ is not 1 × 1.


Step 1. Choose an entry γ_hk with Φ(Γ) = φ(γ_hk) and then exchange
row h with row 1 and column k with column 1. If by chance h = 1, there is
no need to exchange rows, that is, we perform the identity row operation,
and likewise if k = 1.
and likewise if k = 1.
Step 2. We can now assume that 3>(r) = <£>(7n). If s = 1, that is, T
has only one column, go straight to step 3. If not, proceed as follows.
For j = 2 , . . . , s, use the division algorithm to write

7ij = Qijln + 7y with qijs 7 y € R, </?(7i.j) < v(7ll)i

and then let Xj be the column operation "subtract q\j times column 1 from
column j " for j = 2,..., s.
If some 7£ ■ ^ 0, then we have a new matrix T' with 3>(r') < 3>(r). By
induction hypothesis, we can reduce T' and hence T to the desired diagonal
form.
If j[- = 0 for all j , we turn to row operations.
Step 3. If there is only one row, the preceding arguments will diagonalize
Γ, so we may assume that t ≥ 2. For i = 2, ..., t, write

γ_i1 = q′_i1 γ_11 + γ′_i1,  with q′_i1, γ′_i1 ∈ R,  φ(γ′_i1) < φ(γ_11).

We then subtract q′_i1 times row 1 from row i for i = 2, ..., t. If some γ′_i1 ≠ 0,
we have reduced Φ(Γ) and we are home by the induction hypothesis. If
γ′_i1 = 0 for all i, then we have transformed Γ into a block form, which we
again call Γ to save notation:

Γ = ( γ_11  0 ... 0 )
    (  0           )
    (  ⋮     Γ′    )
    (  0           ).
Step 4. If by chance Γ′ = 0, we are finished. If not, we perform long
divisions to see whether or not γ_11 divides the entries γ_ij of the submatrix Γ′.
It will be convenient to index the entries of Γ′ according to their positions
in the larger matrix Γ rather than their positions in Γ′ itself.
For i, j ≥ 2, write

γ_ij = q_ij γ_11 + γ″_ij  with q_ij, γ″_ij ∈ R,  φ(γ″_ij) < φ(γ_11).

If some γ″_ij ≠ 0, we first add row i to row 1 in Γ and then subtract q_ij
times column 1 from column j to obtain a new matrix, say Γ″, with 1, j-th
term γ″_ij. We now have Φ(Γ″) < Φ(Γ), so we can replace Γ by Γ″ and start
again.

After a finite number of trips around this loop, reducing Φ(Γ) each trip,
we must arrive at the stage where

Γ = ( γ_11  0 ... 0 )
    (  0           )
    (  ⋮     Γ′    )
    (  0           )

and γ_11 divides γ_ij for all i, j ≥ 2.
Step 5. We can now finish the argument by induction. If Γ′ = 0 or Γ′
is 1 × 1, there is nothing to do. Otherwise, Γ′ is (t − 1) × (s − 1), and so we
can reduce it to invariant factor form Δ′ = diag(δ_2, ..., δ_r, 0, ..., 0), with
δ_2 | δ_3, ..., δ_{r−1} | δ_r, by a series of elementary row and column operations.
Since none of these operations can have any effect on the first row and
column of Γ (other than adding, subtracting and permuting 0's), we have,
on putting γ_11 = δ_1, transformed Γ to

Δ = diag(δ_1, δ_2, ..., δ_r, 0, ..., 0).

Finally, we have to show that δ_1 divides δ_2. Since Δ′ is obtained from Γ′
by applying elementary row and column operations, the entries of Δ′ are
R-linear combinations of the entries of Γ′. But we have arranged matters
so that δ_1 divides all entries of Γ′, and so δ_1 must divide δ_2. □

10.6 Equivalence of matrices


Let R be any ring. Two matrices Γ and Γ′ over R are said to be equivalent
or associated if there are invertible matrices P and Q with Γ′ = PΓQ. The
equivalence problem for matrices requires that we determine precisely when
two matrices of the same size are equivalent. The preceding result gives a
partial solution to this problem for matrices over a Euclidean domain, since
we now know that any matrix is equivalent to its invariant factor form, and
in the next chapter we will obtain a complete solution (Corollary 11.3.2).
A special case is well known from elementary linear algebra. A field
F can be regarded as a degenerate Euclidean domain in which φ(r) = 1
whenever r ≠ 0. Since any nonzero element of F is a unit, we can write
the invariant factor form of an F-matrix Γ as

( I_r  0 )
(  0   0 )

where I_r is the r × r identity matrix. Since the rank of a matrix is unchanged
by row and column operations, r is the rank of Γ. Thus the equivalence
class of a matrix over a field is determined by its rank.

10.7 A computational technique


The procedure for computing the invariant factor form is an algorithm, that
is, it can be done by a computer provided that the computer knows how to
do arithmetic in the Euclidean domain R.
Here is a way of setting out the calculations so that we record the
matrices P and Q as well as finding the invariant factor matrix Δ.
Given a t × s matrix Γ over a Euclidean domain R, form the augmented
(t + s) × (s + t) array

( Γ  I )
( I    )

in which the top right matrix is a t × t identity matrix and the bottom left
matrix is an s × s identity matrix.
If we perform a row operation ρ (elementary or not) on Γ, we can record
its effect on the appropriate identity matrix I by performing it on the whole
array:

( Γ  I )  →  ( ρΓ  ρI )
( I    )     ( I       )

Similarly, the array will record the effect of a column operation on the s × s
identity matrix:

( Γ  I )  →  ( Γχ  I )
( I    )     ( Iχ     )

Continuing this way, we can record the effect of any sequence of row and
column operations, so that when ρΓχ = Δ, the array will have become

( Δ   ρI )
( Iχ      )

from which we can read off P = ρI and Q = Iχ.
Here is a numerical example. Not all the steps of the algorithm are
needed - a numerical computation in which they all appeared would be too
lengthy - but enough steps are used to illustrate the ideas, I hope. This
calculation is also long-winded in that the method is followed slavishly; in
"real life" computation, it's usually easy to spot alternative operations that
shorten the calculation. It should be remembered that any sequence of row
and column operations is legitimate provided that the result is in invariant
factor form.
We work over the ring of integers Z. Let

Γ = ( 12  9 12 )
    (  3  2  4 )
    ( 12  8 22 ).

The augmented array is a 6 × 6 array, which we manipulate as follows.

    12   9  12    1   0   0
     3   2   4    0   1   0
    12   8  22    0   0   1
     1   0   0
     0   1   0
     0   0   1

→    3   2   4    0   1   0
    12   9  12    1   0   0
    12   8  22    0   0   1
     1   0   0
     0   1   0
     0   0   1

→    2   3   4    0   1   0
     9  12  12    1   0   0
     8  12  22    0   0   1
     0   1   0
     1   0   0
     0   0   1

→    2   1   0    0   1   0
     9   3  -6    1   0   0
     8   4   6    0   0   1
     0   1   0
     1  -1  -2
     0   0   1

→    1   2   0    0   1   0
     3   9  -6    1   0   0
     4   8   6    0   0   1
     1   0   0
    -1   1  -2
     0   0   1

→    1   0   0    0   1   0
     3   3  -6    1   0   0
     4   0   6    0   0   1
     1  -2   0
    -1   3  -2
     0   0   1

→    1   0   0    0   1   0
     0   3  -6    1  -3   0
     0   0   6    0  -4   1
     1  -2   0
    -1   3  -2
     0   0   1

→    1   0   0    0   1   0
     0   3   0    1  -3   0
     0   0   6    0  -4   1
     1  -2  -4
    -1   3   4
     0   0   1
Thus the invariant factor form of Γ is

Δ = ( 1  0  0 )
    ( 0  3  0 )
    ( 0  0  6 ),

the row operations are recorded as

P = ρI = ( 0   1  0 )
         ( 1  -3  0 )
         ( 0  -4  1 ),

and the column operations as

Q = Iχ = (  1  -2  -4 )
         ( -1   3   4 )
         (  0   0   1 ).
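The whole reduction is easily mechanized. Below is a compact Python sketch (ours; it works over R = Z only, normalizes signs, and makes no attempt at efficiency) returning P, Δ, Q with PΓQ = Δ. Note that P and Q are far from unique, so the matrices it produces need not agree with those found by hand above, though Δ itself must, by the uniqueness theorem of the next chapter.

    def smith_normal_form(G):
        # returns (P, D, Q) with P*G*Q = D = diag(d1, ..., dr, 0, ..., 0)
        # and d1 | d2 | ... | dr, using only elementary operations over Z
        t, s = len(G), len(G[0])
        D = [row[:] for row in G]
        P = [[int(i == j) for j in range(t)] for i in range(t)]
        Q = [[int(i == j) for j in range(s)] for i in range(s)]

        def add_row(src, dst, r):          # row dst += r * row src
            D[dst] = [x + r * y for x, y in zip(D[dst], D[src])]
            P[dst] = [x + r * y for x, y in zip(P[dst], P[src])]

        def add_col(src, dst, r):          # column dst += r * column src
            for M in (D, Q):
                for row in M:
                    row[dst] += r * row[src]

        for k in range(min(t, s)):
            while True:
                entries = [(abs(D[i][j]), i, j) for i in range(k, t)
                           for j in range(k, s) if D[i][j]]
                if not entries:
                    break
                _, i, j = min(entries)     # smallest nonzero entry -> pivot
                D[i], D[k] = D[k], D[i]; P[i], P[k] = P[k], P[i]
                for M in (D, Q):           # swap columns j and k
                    for row in M:
                        row[j], row[k] = row[k], row[j]
                for i in range(k + 1, t):  # division algorithm down column k
                    add_row(k, i, -(D[i][k] // D[k][k]))
                for j in range(k + 1, s):  # and along row k
                    add_col(k, j, -(D[k][j] // D[k][k]))
                if any(D[i][k] for i in range(k + 1, t)):
                    continue               # nonzero remainders: smaller pivot next pass
                if any(D[k][j] for j in range(k + 1, s)):
                    continue
                bad = [i for i in range(k + 1, t)
                       if any(D[i][j] % D[k][k] for j in range(k + 1, s))]
                if not bad:
                    break
                add_row(bad[0], k, 1)      # force divisibility, then loop again
            if D[k][k] < 0:                # multiply by the unit -1
                D[k] = [-x for x in D[k]]
                P[k] = [-x for x in P[k]]
        return P, D, Q

    P, D, Q = smith_normal_form([[12, 9, 12], [3, 2, 4], [12, 8, 22]])
    print(D)   # [[1, 0, 0], [0, 3, 0], [0, 0, 6]]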

10.8 Invertible matrices


The diagonalization method also provides a technique for constructing
invertible matrices over a Euclidean domain. The key observation is the
following very easy lemma.

10.8.1 Lemma
Let Δ = diag(δ_1, ..., δ_r, 0, ..., 0) be a t × s diagonal matrix with nonzero
entries δ_1, ..., δ_r contained in a Euclidean domain R.
Then the following statements are equivalent.
(i) Δ is invertible over R.

(ii) Δ is a square matrix with r = t = s, and each of the diagonal terms
δ_1, ..., δ_r is a unit in R.

Proof Suppose that Δ is invertible. If, say, t > r, then Δ has a row of
zeroes, and so does Δ · Δ⁻¹ = I, a contradiction. Thus t = r and likewise
s = r. The fact that all the terms δ_i are units now follows by direct
calculation, or alternatively, from the fact that det(Δ) = δ_1 ⋯ δ_r is a unit
in R (Theorem 5.12.1).
The converse is obvious. □
This lemma gives several characterizations of invertible matrices which
we list in the next theorem.

10.8.2 Theorem
Let Γ be a t × s matrix with entries in a Euclidean domain R.
Then the following statements are equivalent.
(i) Γ is invertible over R.
(ii) Γ is a square matrix and the invariant factor form of Γ can be taken
to be the identity matrix.
(iii) Γ = ρI for some row operation ρ.
(iv) Γ = Iχ for some column operation χ.

Proof (i) ⇒ (ii): By Theorem 10.5.1, there are invertible matrices P, Q so
that the invariant factor form of Γ is Δ = PΓQ. Since Γ is invertible, so
is Δ. Thus Δ and hence Γ are square matrices. Also, δ_1, ..., δ_r are units
in R, and so, by multiplying column i by δ_i⁻¹ for each i, we can take the
invariant factor form to be the identity matrix I.
(ii) ⇒ (iii): We have PΓQ = I, where P and Q are invertible square
matrices, say r × r. Thus Γ = P⁻¹Q⁻¹.
By construction, P is obtained from the identity matrix I by applying
a sequence of elementary row operations, and P⁻¹ is obtained by applying
their inverses - see Lemma 10.4.1. Thus P⁻¹ = ρ′I for some row operation
ρ′.
By the same lemma, Q⁻¹ = (Iχ_1) ⋯ (Iχ_k) for some sequence of elementary
column operations χ_1, ..., χ_k. But for each such elementary column
operation, there is an elementary row operation ρ_i so that ρ_i I = Iχ_i; this
is easily seen by evaluating Iχ_i for each of the three types of elementary
column operation. Thus, writing ρ″ = ρ_1 ⋯ ρ_k, we have Γ = ρ′ρ″I.
(iii) ⇒ (iv): Arguing as in the previous paragraph, we can replace each
elementary row operation in ρ by an elementary column operation, and so
find a column operation χ with ρI = Iχ.
(iv) ⇒ (i): Already proved. □

The computational form of the diagonalization method set out in section
10.7 also enables us to determine whether or not a given matrix Γ is
invertible, and to compute the inverse when Γ is invertible.
If Γ is not square, then it has no inverse anyway. If it is square, we
reduce it to invariant factor form. If the invariant factor form has a non-unit
diagonal term, again Γ has no inverse over R. If all the diagonal terms
in the invariant factor form are units, then we can convert the invariant
factor form to the identity matrix.
At the end of the calculation, we have found invertible matrices P, Q so
that

PΓQ = I,

from which

Γ⁻¹ = QP.

If we have used only one kind of operation, say column operations, then
P = I and Q = Γ⁻¹. However, usually both row and column operations
will be used, although, in principle, only one kind need be.
If the inverse of P and/or Q is required, then at each step in the calculation
we should record the elementary row or column operations used, as
appropriate. We obtain a sequence ρ_1, ..., ρ_h of elementary row operations
and a sequence χ_1, ..., χ_k of elementary column operations so that

P = (ρ_h ⋯ ρ_1)I  and  Q = I(χ_1 ⋯ χ_k),

from which

P⁻¹ = (ρ_1⁻¹ ⋯ ρ_h⁻¹)I  and  Q⁻¹ = I(χ_k⁻¹ ⋯ χ_1⁻¹).
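For R = Z there is also a quick direct test, resting on Lemma 10.8.1 together with Theorem 5.12.1: a square integer matrix is invertible over Z exactly when its determinant is a unit, that is, ±1, and the inverse is then the adjugate divided by the determinant, whose entries stay in Z. A small Python sketch (ours):

    def det(M):
        # cofactor expansion along the first row (fine for small matrices)
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] *
                   det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))

    def inverse_over_Z(M):
        d = det(M)
        if d not in (1, -1):
            return None                  # det is not a unit in Z
        n = len(M)
        def minor(i, j):
            return [row[:j] + row[j + 1:]
                    for k, row in enumerate(M) if k != i]
        # adjugate divided by det; as d = +-1, division is multiplication
        return [[(-1) ** (i + j) * det(minor(j, i)) * d
                 for j in range(n)] for i in range(n)]

    print(inverse_over_Z([[2, 3], [1, 2]]))    # [[2, -3], [-1, 2]]
    print(inverse_over_Z([[2, 0], [0, 3]]))    # None: det = 6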

10.9 Further developments


Elementary row and column operations can be performed on matrices with
entries in an arbitrary ring. However, when the ring is noncommutative,
scalars must act as left multipliers on rows but as right multipliers on
columns.
The Diagonalization Theorem holds for matrices over (commutative)
principal ideal domains, but the proof is different - in essence, elementary
row and column operations no longer suffice to achieve the diagonalization,
another type of operation being needed. Details can be found in §3.7 of
[Jacobson] or §10.5 of [Cohn 1].
A consequence is that the "constructive" description of invertible matrices
given in Theorem 10.8.2 does not extend to principal ideal domains
- for some principal ideal domains there are invertible matrices which cannot
be reduced to an identity matrix by any sequence of row and column
operations. However, examples of this phenomenon are surprisingly hard
to find; some are given in [Grayson] and [Ischebeck].
The diagonalization results also extend to noncommutative Euclidean
domains ([B & K: IRM], §3.3) and to noncommutative principal ideal domains
([Cohn: FRTR], Chapter 8).
When the coefficient ring is not a principal ideal domain, we cannot
expect that a matrix will have a standard form that is as simple as the
invariant factor form. Even if the ring of scalars is close to being a Euclidean
domain, the analogue of the invariant factor form can be rather complicated,
as can be seen from [L & S].

Exercises
10.1 Let M be a Z-module with generators m_1, m_2 and relations

m_1 + 2m_2 = 0,
2m_1 + 3m_2 = 0.

Show directly that M = 0.
Prove also that the presentation matrix for M has invariant factor
form I.
10.2 Let the Z-module M have generators m_1, m_2, m_3 and relations

2m_1 + 2m_2 + 3m_3 = 0,
4m_1 + 4m_2 + 7m_3 = 0,
10m_1 + 11m_2 = 0.

Find the invariant factor form of the presentation matrix of M.
Hence make an educated guess for the structure of M. (A formal
technique for determining module structures will be given in Chapter
12.)
10.3 Let M = L ⊕ N, and suppose that L = R/Rb and N = R/Rc are
cyclic, with generators l, n respectively.
Given that b and c are coprime (and that R is a Euclidean domain),
find a generator for M in terms of l and n.
Write down the 2 × 2 relation matrix for M, and show that it can
be transformed into diag(1, bc).
10.4 Let M be a Z-module with generators m_1, m_2 and relations

mfiP&R&&& 4fefcfl#i + m 2 = 0.

Write down the presentation matrix Γ of M. Find the invariant factor
form Δ of Γ, giving explicitly the invertible matrices P, Q with PΓQ =
Δ. Find the bases of Z² corresponding to P⁻¹ and Q. Deduce that
M is a cyclic module, and find a single generator of M in terms of
the original generators.
10.5 (This is three problems in one - keep a arbitrary for as long as you
can.)
A Z-module M is defined by generators m_1, m_2, m_3 and relations

3m_1 + 3m_2 + 2m_3 = 0 = 2m_1 + 6m_2 + am_3,

where a ∈ Z is a parameter. Write down the presentation matrix Γ for
M. For a = 0, 1, 2, find the invariant factor form Δ of Γ, together
with the matrices P, P⁻¹ and Q.
Chapter 11

Fitting Ideals

We have shown that a matrix over a Euclidean domain has an invariant


factor form, but we have not yet shown that this form is unique. To prove
the uniqueness, we introduce an alternative method for finding the invariant
factors of a matrix, through the computation of its Fitting ideals. These
ideals are defined in terms of the determinants of the square submatrices of
the given matrix, and so they can be calculated directly from the matrix,
without recourse to row or column operations. However, the Fitting ideals
are unchanged by row and column operations, and so the Fitting ideals of a
matrix are the same as those of its invariant factor form, which leads to the
desired uniqueness result. Another consequence is that we can complete
the solution of the equivalence problem for matrices.
Throughout this chapter, the ring of scalars R is taken to be a Euclidean
domain, although the basic definitions require only that R is commutative.

11.1 The definition


Let Γ = (γ_ij) be a t × s matrix with entries in a Euclidean domain R. For
any integer h ≤ s, t, a minor of Γ of order h is the determinant of an h × h
submatrix of Γ.
The Fitting ideal of level or order h for Γ is the ideal Fit_h(Γ) of R
generated by all the minors of Γ of order h.
For example, take the matrix

Γ = (   3    2   1    0 )
    (  10    5  -5   15 )
    ( -20  -10  10  -30 ),

with entries in the ring of integers Z.
The order 1 minors are the entries

3, 2, 1, 0, 10, 5, -5, 15, -20, -10, 10, -30

themselves; the order 2 minors are

-5, -25, 45, -15, 30, 15,
10, 50, -90, 30, -60, -30,
0, 0, 0, 0, 0, 0,

and all order 3 minors are 0. Thus

Fit_1(Γ) = Z,  Fit_2(Γ) = 5Z,  Fit_3(Γ) = 0,     (11.1)

and Γ has no other Fitting ideals.
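Since every ideal of Z is generated by the greatest common divisor of any set of generators, the Fitting ideals of an integer matrix can be computed mechanically. A Python sketch (ours):

    from itertools import combinations
    from math import gcd

    def det(M):
        # cofactor expansion along the first row
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] *
                   det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))

    def fitting_generator(G, h):
        # over Z, Fit_h(G) is generated by the gcd of all h x h minors
        g = 0
        for rows in combinations(range(len(G)), h):
            for cols in combinations(range(len(G[0])), h):
                g = gcd(g, det([[G[i][j] for j in cols] for i in rows]))
        return g

    G = [[3, 2, 1, 0], [10, 5, -5, 15], [-20, -10, 10, -30]]
    print([fitting_generator(G, h) for h in (1, 2, 3)])   # [1, 5, 0]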
When a matrix is in invariant factor form, its Fitting ideals are easy to
find.

11.1.1 Lemma
Let Δ = diag(δ_1, ..., δ_r, 0, ..., 0) be a matrix in invariant factor form.
Then the Fitting ideals of Δ are

Fit_h(Δ) = Rδ_1 ⋯ δ_h  if h ≤ r,
Fit_h(Δ) = 0           if h > r.

Proof First, take h = 1. The nonzero 1 × 1 subdeterminants are simply the
nonzero entries δ_1, ..., δ_r of Δ, so Fit_1(Δ) is generated by these elements.
But, as the matrix is in invariant factor form, we have

δ_1 | δ_2 | ... | δ_r,

and so Fit_1(Δ) = Rδ_1.
For arbitrary h ≤ r, an h × h submatrix of Δ will contain a row or
column of zeroes unless it is a diagonal submatrix of the nonzero r × r
diagonal submatrix of Δ. Thus the nonzero generators of Fit_h(Δ) are all
the products of the form

δ_{i(1)} ⋯ δ_{i(h)},

where

1 ≤ i(1) < ... < i(h) ≤ r

is an ascending sequence of integers. The divisibility condition on the entries
of Δ now shows that δ_1 ⋯ δ_h divides any such product and so must
generate Fit_h(Δ).
Finally, if h > r, then any h × h submatrix contains a row of zeroes and
so has zero determinant. □

11.2 Elementary properties


Next we show that the Fitting ideals are unchanged if we perform an
elementary row or column operation on a matrix.

11.2.1 Lemma
Let Γ be a t × s matrix, let ρ be an elementary row operation on Γ and let
χ be an elementary column operation on Γ. Then for h = 1, ..., min{t, s},

Fit_h(ρΓ) = Fit_h(Γχ) = Fit_h(Γ).

Proof We give the argument only for a row operation ρ. Consider the effect
of ρ on the determinant of an h × h submatrix E of Γ. If ρ does not involve
any rows of E, nothing happens to the determinant. If ρ involves two rows
of E, then ρ is effectively a row operation on E and, by standard properties
of determinants (section 5.12), it changes the determinant as follows.
• If ρ exchanges two rows,

det(ρE) = -det(E).

• If ρ adds a multiple of one row to another,

det(ρE) = det(E).

• If ρ multiplies a row by a unit u of R,

det(ρE) = u·det(E).

Finally, we have the situation when ρ involves two rows, one of which,
say row i, contributes to E and the other, row j, does not. There are now
two cases in which the determinant might be changed.
• Suppose ρ exchanges rows i and j. Then ρE is an h × h matrix,
formed from the same columns as E, in which the entries of E from
row i of Γ are replaced by the corresponding entries from row j of
Γ. Thus there is an h × h submatrix E′ of Γ so that ρE and E′ have
the same rows, but perhaps in a different order, which gives

det(ρE) = ±det(E′).

• Suppose that ρ adds r times row j to row i. Then

det(ρE) = det(E) ± r·det(E′),

where E′ is the h × h submatrix of Γ formed from the same columns
as E but with the entries from row i replaced by those from row j.

Since the Fitting ideal Fit_h(ρΓ) of order h is generated by the determinants
det(ρE), as E runs through the h × h submatrices of Γ, we can now see that
Fit_h(ρΓ) ⊆ Fit_h(Γ). But ρ is invertible, so we also have Fit_h(Γ) ⊆ Fit_h(ρΓ). □

11.2.2 Corollary
Let Γ be a t × s matrix.
(i) Given a product ρ of elementary row operations on Γ and a product
χ of elementary column operations on Γ, we have

Fit_h(ρΓχ) = Fit_h(Γ)

for h = 1, ..., min{t, s}.
(ii) Given invertible matrices P and Q, t × t and s × s respectively, we
have

Fit_h(PΓQ) = Fit_h(Γ)

for h = 1, ..., min{t, s}.

Proof (i) This is immediate from the Lemma together with part (iv)
of Lemma 10.3.1.
(ii) By Theorem 10.8.2, we can write P = ρI and Q = Iχ, so the claim
follows. □

11.3 Uniqueness of invariant factors


We can now show that the invariant factors of a matrix are essentially
unique.

11.3.1 Theorem
Let Γ be a matrix over a Euclidean domain R and suppose that

Δ = diag(δ_1, ..., δ_r, 0, ..., 0)

and

Δ′ = diag(δ′_1, ..., δ′_{r′}, 0, ..., 0)

are both invariant factor forms of Γ.
Then r = r′, and there are units u_1, ..., u_r of R with

δ′_1 = u_1 δ_1, ..., δ′_r = u_r δ_r.

Proof By Theorem 10.5.1, we have Δ = PΓQ and Δ′ = P′ΓQ′ for invertible
matrices P, P′, Q and Q′, so that Δ′ = P′P⁻¹ΔQ⁻¹Q′.
Thus Δ and Δ′ have the same Fitting ideals by the preceding corollary,
which means that r = r′ and that

Rδ_1 ⋯ δ_h = Rδ′_1 ⋯ δ′_h

for h = 1, ..., r.
Taking h = 1 we get δ′_1 = u_1δ_1 for some unit u_1 of the domain R, and
the rest follows easily. □
Standard choices. In the ring Z of integers, we take an invariant factor
to be positive, and in a polynomial ring F[X], we take an invariant factor
to be monic. With these choices, the invariant factor form of a matrix is
unique.
The preceding results yield the solution to the equivalence problem that
we posed in section 10.6.

11.3.2 Corollary
Let Γ and Γ′ be t × s matrices over a Euclidean domain R. Then the
following statements are equivalent.
(i) Γ and Γ′ are equivalent.
(ii) Fit_h(Γ) = Fit_h(Γ′) for h = 1, ..., min(t, s).
(iii) Γ and Γ′ have the same invariant factor forms (up to multiplication
of their entries by units).

11.4 The characteristic polynomial
Let A be an n × n matrix over a field F, and let M be the module over the
polynomial ring F[X] which is defined by A acting on the space F^n. By
section 9.7, the presentation matrix of M has the form Γ = XI − A, where
I is an n × n identity matrix. For matrices of this type, it is often more
convenient to compute the invariant factors through the Fitting ideals.
It is clear that the n-th Fitting ideal of XI − A is

Fit_n(XI − A) = det(XI − A)F[X],

where det(XI − A) is, by definition, the characteristic polynomial of A. It
follows that the invariant factors δ_1, ..., δ_n of XI − A satisfy the equation

δ_1 ⋯ δ_n = det(XI − A).     (11.2)

Since the invariant factors are themselves polynomials, we can see, just
by counting their degrees, that many of them are likely to be constant
polynomials. Indeed, suppose that no invariant factor is a constant. Then
all must have degree 1, and by the divisibility condition δ_1 | ⋯ | δ_n, they
are all equal to X − a for a fixed constant a. Thus A = aI.
At the other extreme, consider the rational canonical block matrix

C = ( 0  0  ...  0  0  -f_0     )
    ( 1  0  ...  0  0  -f_1     )
    ( 0  1  ...  0  0  -f_2     )
    ( ⋮  ⋮       ⋮  ⋮   ⋮       )
    ( 0  0  ...  1  0  -f_{n-2} )
    ( 0  0  ...  0  1  -f_{n-1} )

associated to the polynomial f = f_0 + f_1 X + ... + f_{n-1} X^{n-1} + X^n.
Then det(XI − C) = f (Exercise 6.5), and XI − C has an (n − 1) × (n − 1)
submatrix with determinant ±1, which shows that δ_1 ⋯ δ_{n-1} = 1 and hence

δ_1 = ⋯ = δ_{n-1} = 1  and  δ_n = f.
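This is easy to verify for a particular f with a computer algebra system; a short SymPy sketch (ours, assuming SymPy) for f = X³ − 2X² + 3X − 5:

    from sympy import symbols, Matrix

    X = symbols('X')
    f0, f1, f2 = -5, 3, -2               # f = X**3 + f2*X**2 + f1*X + f0
    C = Matrix([[0, 0, -f0],
                [1, 0, -f1],
                [0, 1, -f2]])
    print(C.charpoly(X).as_expr())       # X**3 - 2*X**2 + 3*X - 5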

To see some more possibilities, we analyse what happens when we take

A = ( a  b )
    ( c  d )

to be an arbitrary 2 × 2 matrix over F.
We have

XI − A = ( X − a   −b    )
         (  −c    X − d  ),

so that the characteristic polynomial is

X² − (a + d)X + (ad − bc) = δ_1δ_2.

If either b ≠ 0 or c ≠ 0, then Fit_1(XI − A) = F[X] and δ_1 = 1.
If b = c = 0, then Fit_1(XI − A) is generated by X − a and X − d.
There are now two cases. If a ≠ d, then a − d = (X − d) − (X − a) is in
Fit_1(XI − A), so δ_1 = 1 again.
On the other hand, if a = d, then δ_1 = X − a. Since δ_1δ_2 = (X − a)²,
we find that δ_2 = X − a as well.
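These cases are easy to reproduce symbolically: δ_1 generates Fit_1(XI − A), the gcd of the entries, and δ_1δ_2 is the characteristic polynomial. A sketch using SymPy (ours, assuming SymPy; note that SymPy normalizes gcds of polynomials to be monic, in line with the standard choice of section 11.3):

    from functools import reduce
    from sympy import symbols, Matrix, eye, gcd, factor

    X = symbols('X')

    def invariant_factors_2x2(A):
        G = X * eye(2) - Matrix(A)
        # delta1 = gcd of the entries of XI - A; delta1*delta2 = charpoly
        d1 = reduce(gcd, [G[i, j] for i in range(2) for j in range(2)])
        return d1, factor(G.det() / d1)

    print(invariant_factors_2x2([[3, 1], [0, 3]]))  # b != 0: delta1 = 1
    print(invariant_factors_2x2([[3, 0], [0, 5]]))  # a != d: delta1 = 1
    print(invariant_factors_2x2([[3, 0], [0, 3]]))  # a = d: delta1 = X - 3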

11.5 Further developments


It is easy to see that Fitting ideals can be defined for a matrix Γ over
any commutative ring R, and that Lemma 11.2.1 still holds. A direct
calculation with determinants confirms that the Fitting ideals of Γ are the
same as those of PΓQ for invertible matrices P, Q over R.

Thus the existence of an invariant factor form for a matrix Γ is equivalent
to the assertion that all Fitting ideals of Γ are principal. Further, if every
R-matrix has an invariant factor form, then every finitely generated ideal
of R must be principal, since we can arrange the generators as a 1 × t
matrix for some t. Thus, if we are given that a commutative domain is
Noetherian and that every R-matrix has an invariant factor form, then R
is a principal ideal domain. (Exercise 11.7 shows that we must assume that
R is Noetherian.)
In the reverse direction, if R is a principal ideal domain, then any R-matrix
has a (unique) invariant factor form ([Cohn 1], §10.5). Whether or
not this assertion holds for principal ideal rings that are not domains is a
point on which the literature seems to maintain a silence.
In the noncommutative case, there is no analogue of the determinant
which would make it possible to define Fitting ideals with any useful properties,
at least, as far as I know.

Exercises
11.1 Use the Fitting ideals to compute the invariant factors of the integer
matrix discussed in section 10.7:

( 12  9 12 )
(  3  2  4 )
( 12  8 22 ).

11.2 Compute the Fitting ideals of the following matrices, and hence confirm
your calculations in the exercises to Chapter 10.

(a) ( 1  2 )
    ( 2  3 )

(b) ( 2  4  10 )
    ( 2  4  11 )
    ( 3  7   0 )

(c) ( 1  · )
    ( 2  1 )

(d) ( 3  2 )
    ( 3  6 )
    ( 2  a )
11.3 In the last part of the above exercise, show that the three cases a =
0, 1, 2 cover all the possibilities for the invariant factors of the matrix.

11.4 Let

A = ( 1  0  0 )
    ( 0  1  a )
    ( 0  0  b )

be a matrix over a field F. Discuss the possibilities for the invariant
factors of XI − A as a and b vary.
11.5 For those who relish a challenge!
Let M be a Z[i]-module with generators m₁, m₂, m₃ and relations

    (1 + i)m₂ = 6m₁ + 4m₂ + 2(1 + i)m₃ = 6m₁ + (5 - i)m₂ + 2im₃ = 0.

Find the invariant factor form of the presentation matrix of M.
Hint: note that 1 + i, 3 and 2 - 3i are irreducible in Z[i] and that
5 - i = (1 + i)(2 - 3i).
s 11.6 Prove the assertions made in section 11.5.
Now take R = F[X, Y] to be the polynomial ring in two variables
over a field F, and let

    Γ = ( X Y )
        ( Y X ).

Compute Fit₁(Γ) and Fit₂(Γ), and conclude that Γ has no invariant
factor form (see Exercise 1.6).
s 11.7 Let R = F[X₁, X₂, …] be a "polynomial ring" in an infinite set
X₁, X₂, … of variables over a field F, but with the variables subject
to the relations Xᵢ = Xᵢ₊₁² for all i. Show that any finite set of
members of R belongs to a polynomial ring F[Xₖ] for some k, and
deduce that any finitely generated ideal of R is principal.
Show also that the ideal generated by all the variables is not finitely
generated (and hence not principal).
Chapter 12

The Decomposition of Modules

Now that we have the invariant factor form of a matrix at our disposal,
we can exploit it to obtain a very nice presentation for a module M over
a Euclidean domain. This presentation leads to a decomposition of the
module as a direct sum of cyclic modules, which in turn allows us to find
the torsion submodule T(M) and the p-primary components Tₚ(M) of M,
and to show that the quotient module M/T(M) is in fact a free module.
We also show that the descriptions which we obtain are unique, subject
to some conditions on the way in which we arrange the cyclic summands of
M.
As applications of our results, we obtain the structure theory of finitely
generated abelian groups and some properties of lattices in Rⁿ.
Throughout this chapter, R is a Euclidean domain and all R-modules
are taken to be finitely generated.
We start with a description of the submodules of a free module.

12.1 Submodules of free modules


Let K be an R-submodule of the standard free R-module Rᵗ of rank t. By
Theorem 9.4.1, K is also free, of rank r with r ≤ t, but the proof of that
theorem did not reveal a basis for K. We now exploit our calculations with
matrices to find a basis of K from a given finite set of generators ρ₁, …, ρₛ
of K.


Write

    ρ₁ = γ₁₁e₁ + ⋯ + γₜ₁eₜ,
    ⋮                          (12.1)
    ρₛ = γ₁ₛe₁ + ⋯ + γₜₛeₜ,

as in Eq. (9.1), where {e₁, …, eₜ} is the standard basis of Rᵗ, and put

    Γ = ( γ₁₁ ⋯ γ₁ₛ )
        (  ⋮  ⋱  ⋮  )
        ( γₜ₁ ⋯ γₜₛ ).

As shown in section 9.6, Γ can be viewed as the matrix of a homomorphism
γ : Rˢ → Rᵗ, where γ is defined by γ(x) = Γx for each column vector
x ∈ Rˢ. Denote the standard basis of Rᵗ by E and the standard basis of
Rˢ by B = {b₁, …, bₛ}. Then

    γ(b₁) = ρ₁, …, γ(bₛ) = ρₛ

and Γ = (γ)_{E,B} (section 5.9). Furthermore, the image Im(γ) of γ is the
module K.
Suppose now that there is a t × s matrix Δ with Δ = PΓQ for invertible
matrices P and Q, of sizes t × t and s × s respectively. Exploiting the
calculations in Chapter 5, we can find new bases E′ of Rᵗ and B′ of Rˢ so
that Δ = (γ)_{E′,B′}. To accomplish this, we note that the relation 5.13 of
section 5.11 tells us that

    Δ = (γ)_{E′,B′} = P_{E′,E} · (γ)_{E,B} · P_{B,B′}

provided we choose the new bases so that P_{E′,E} = P and P_{B,B′} = Q.
Using the formula given in Eq. (5.2), we see that the basis B′ is given
explicitly in terms of B and the matrix Q by the equations

    b′₁ = q₁₁b₁ + q₂₁b₂ + ⋯ + qₛ₁bₛ,
    b′₂ = q₁₂b₁ + q₂₂b₂ + ⋯ + qₛ₂bₛ,
    ⋮                                   (12.2)
    b′ₛ = q₁ₛb₁ + q₂ₛb₂ + ⋯ + qₛₛbₛ.

On the other hand, the basis E′ is given implicitly by the formula of
Eq. (5.1):

    e₁ = p₁₁e′₁ + p₂₁e′₂ + ⋯ + pₜ₁e′ₜ,
    e₂ = p₁₂e′₁ + p₂₂e′₂ + ⋯ + pₜ₂e′ₜ,
    ⋮                                   (12.3)
    eₜ = p₁ₜe′₁ + p₂ₜe′₂ + ⋯ + pₜₜe′ₜ.

To find E′ in terms of E, we must calculate the inverse P⁻¹ = (p̄ᵢⱼ). Then
P⁻¹ = P_{E,E′} by the results of section 5.6, and

    e′₁ = p̄₁₁e₁ + p̄₂₁e₂ + ⋯ + p̄ₜ₁eₜ,
    e′₂ = p̄₁₂e₁ + p̄₂₂e₂ + ⋯ + p̄ₜ₂eₜ,
    ⋮                                   (12.4)
    e′ₜ = p̄₁ₜe₁ + p̄₂ₜe₂ + ⋯ + p̄ₜₜeₜ.
We can now obtain a very useful description of a submodule of a free
module.

12.1.1 Theorem
Let K be a submodule of the standard free R-module Rᵗ. Then K is free,
and we can find a basis E′ = {e′₁, …, e′ₜ} of Rᵗ and elements δ₁, …, δᵣ of
R so that

    δ₁e′₁, …, δᵣe′ᵣ

is a basis of K, with δ₁ | ⋯ | δᵣ.

Proof We keep the above notation. By the Diagonalization Theorem 10.5.1,
there are invertible matrices P and Q with

    PΓQ = Δ = diag(δ₁, …, δᵣ, 0, …, 0),

a matrix in invariant factor form, and Δ is the matrix of γ with respect to
bases E′ of Rᵗ and B′ of Rˢ.
Since K is the image Im(γ) of γ, we see that

    γ(b′₁) = δ₁e′₁, …, γ(b′ᵣ) = δᵣe′ᵣ

is a set of generators for K. But this set must also be linearly independent,
since the set e′₁, …, e′ᵣ is already linearly independent. □
A computation. The techniques given in Chapter 10 enable us to carry
out numerical calculations. For example, suppose that K is the submodule
of Z³ with three generators

    ρ₁ = 12e₁ + 3e₂ + 12e₃,
    ρ₂ = 9e₁ + 2e₂ + 8e₃,      (12.5)
    ρ₃ = 12e₁ + 4e₂ + 22e₃.

The corresponding matrix is

    Γ = ( 12  9 12 )
        (  3  2  4 )
        ( 12  8 22 )

and the calculations in section 10.7 show that Γ has invariant factor form

    PΓQ = Δ = ( 1 0 0 )
              ( 0 3 0 )
              ( 0 0 6 )

with

    P = ( 0  1 0 )
        ( 1 -3 0 )
        ( 0 -4 1 )

and

    Q = (  1 -2 -4 )
        ( -1  3  4 )
        (  0  0  1 ).

We find that

    P⁻¹ = ( 3 1 0 )
          ( 1 0 0 )
          ( 4 0 1 ).

Thus the basis

    e′₁ = 3e₁ + e₂ + 4e₃,
    e′₂ = e₁,                  (12.6)
    e′₃ = e₃

of Z³ gives the basis

    e′₁, 3e′₂, 6e′₃

of K.
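As a check (this sketch is not part of the text), the diagonal form can be
recomputed with SymPy, assuming that smith_normal_form is available in
sympy.matrices.normalforms; it returns a diagonal matrix equivalent to
PΓQ, though not the matrices P and Q themselves.

    import sympy as sp
    from sympy.matrices.normalforms import smith_normal_form

    Gamma = sp.Matrix([[12, 9, 12],
                       [ 3, 2,  4],
                       [12, 8, 22]])
    print(smith_normal_form(Gamma, domain=sp.ZZ))   # diag(1, 3, 6)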

12.2 Invariant factor presentations


Let us see how the preceding discussion leads to a nice presentation of a
module. Let M be an R-module which is defined by generators m₁, …, mₜ
and relations

    0 = γ₁₁m₁ + ⋯ + γₜ₁mₜ,
    ⋮                          (12.7)
    0 = γ₁ₛm₁ + ⋯ + γₜₛmₜ,

as in section 9.3. The corresponding presentation is the surjective R-module
homomorphism θ : Rᵗ → M, given on the standard basis of Rᵗ by

    θ(e₁) = m₁, …, θ(eₜ) = mₜ,

and the relation module for M is the kernel Ker(θ), which has generators

    ρ₁ = γ₁₁e₁ + ⋯ + γₜ₁eₜ,
    ⋮                          (12.8)
    ρₛ = γ₁ₛe₁ + ⋯ + γₜₛeₜ.

Thus the presentation matrix of M is the t × s matrix Γ over R.
By Theorem 12.1.1, there is a basis E′ for Rᵗ so that Ker(θ) has basis

    ρ′₁ = δ₁e′₁, …, ρ′ᵣ = δᵣe′ᵣ.

Put

    m′₁ = θ(e′₁), …, m′ₜ = θ(e′ₜ).

Since θ is surjective, the set {m′₁, …, m′ₜ} is a set of generators of M with
the particularly convenient relations

    δ₁m′₁ = 0, …, δᵣm′ᵣ = 0, where δ₁ | ⋯ | δᵣ.

Such a presentation of M will be called an invariant factor presentation
of M, although the less accurate term diagonal presentation is sometimes
used. (Strictly speaking, a diagonal presentation is one in which the δᵢ's
need not satisfy any divisibility condition.)
If r < t, then some of the new generators are not constrained by any
relations, a phenomenon which will be interpreted soon; see (iii) of Theorem
12.3.1 below.
As we have remarked, the method given in the Diagonalization Theorem
10.5.1 is a computational technique which enables us to find an explicit
invariant factor presentation for M. However, if we are not seeking an
explicit set of generators of M, then it may be more convenient to calculate
the invariant factors through the Fitting ideals of Γ.
The computation again. Let M be the Z-module which has generators
m₁, m₂, m₃ and relations

    12m₁ + 3m₂ + 12m₃ = 0,
    9m₁ + 2m₂ + 8m₃ = 0,       (12.9)
    12m₁ + 4m₂ + 22m₃ = 0.

The calculations in the preceding section show that we can find new
generators for M, given by

    m′₁ = 3m₁ + m₂ + 4m₃,
    m′₂ = m₁,                  (12.10)
    m′₃ = m₃,

so that the relations have become

    m′₁ = 0, 3m′₂ = 0 and 6m′₃ = 0.    (12.11)

Next, we see how the structure of a module is determined by its invariant
factor presentation.

12.3 The invariant factor decomposition


Suppose now that the R-module M is given by an invariant factor presentation
with generators {m₁, …, mₜ} and relations

    δ₁m₁ = 0, …, δᵣmᵣ = 0.

It may happen that the presentation starts with some terms whose
coefficients δᵢ, i = 1, …, h, are units in R, since a presentation matrix for M
may well have unit invariant factors. As a unit can always be replaced by
the identity element of R, the presentation begins

    m₁ = ⋯ = mₕ = 0.

But these generators obviously contribute nothing to the module M, so
we can omit them and renumber the generators and relations so that each
coefficient δᵢ is a non-unit in R.
We state the main result formally as a theorem.
We state the main result formally as a theorem.

12.3.1 Theorem
Let R be a Euclidean domain and let M be a finitely generated R-module.
Then the following statements are true.
(i) M has a set of generators m₁, …, mₜ so that M is the internal direct
sum

    M = Rm₁ ⊕ ⋯ ⊕ Rmᵣ ⊕ Rmᵣ₊₁ ⊕ ⋯ ⊕ Rmₜ

with

    Rmᵢ ≅ R/Rδᵢ for i = 1, …, r

and

    Rmⱼ ≅ R for j = r + 1, …, t,

where δ₁, …, δᵣ are non-units in R with δ₁ | ⋯ | δᵣ.
(ii) The torsion submodule of M is

    T(M) = Rm₁ ⊕ ⋯ ⊕ Rmᵣ,

which has annihilator Ann(T(M)) = Rδᵣ.
(iii) Put F(M) = Rmᵣ₊₁ ⊕ ⋯ ⊕ Rmₜ. Then F(M) ≅ Rʷ is free, and

    M = T(M) ⊕ F(M) with F(M) ≅ M/T(M).

(iv) The integer w is uniquely determined by M.

Proof (i) By the previous remarks, we can assume that M is given by
an invariant factor presentation with generators {m₁, …, mₜ} and relations
δ₁m₁ = 0, …, δᵣmᵣ = 0, and that the coefficients satisfy the conditions
stated above. Thus there is a presentation θ : Rᵗ → M of M and a basis
{b₁, …, bₜ} of Rᵗ with θ(bᵢ) = mᵢ for each i and

    Ker(θ) = Rδ₁b₁ ⊕ ⋯ ⊕ Rδᵣbᵣ.

Define a map φ from Rᵗ to the external direct product

    N = R/Rδ₁ × ⋯ × R/Rδᵣ × Rʷ,  w = t - r,

by

    φ(a₁b₁ + ⋯ + aᵣbᵣ + aᵣ₊₁bᵣ₊₁ + ⋯ + aₜbₜ) = (ā₁, …, āᵣ, aᵣ₊₁, …, aₜ)

where a₁, …, aₜ are in R and āᵢ is the image of aᵢ in R/Rδᵢ for i = 1, …, r.
Then φ is a surjective R-module homomorphism with Ker(φ) = Ker(θ).
By the First Isomorphism Theorem 6.3.2, φ induces an isomorphism φ̄
from M = Rᵗ/Ker(θ) to N, and it is clear that the restriction of φ̄ to
the component Rmᵢ gives an isomorphism Rmᵢ ≅ R/Rδᵢ if 1 ≤ i ≤ r or
Rmᵢ ≅ R if r < i ≤ t.
(ii) By Proposition 8.2.1, a finitely generated R-module P is torsion
provided that aP = 0 for some nonzero element a of R. If an element

    y = a₁m₁ + ⋯ + aᵣmᵣ + aᵣ₊₁mᵣ₊₁ + ⋯ + aₜmₜ ∈ M

has ay = 0 for some a ≠ 0, we must have

    aᵣ₊₁ = ⋯ = aₜ = 0

and

    aaᵢ ∈ Rδᵢ for i = 1, …, r.

Thus T(M) ⊆ Rm₁ ⊕ ⋯ ⊕ Rmᵣ. However, the divisibility conditions on
δ₁, …, δᵣ show that

    δᵣ(Rm₁ ⊕ ⋯ ⊕ Rmᵣ) = 0,

which gives the result.
(iii) This is now clear.
(iv) The integer w is the rank of the free module M/T(M), which is
independent of any choices by Theorem 5.5.2. □
- - - n

The elements δ₁, …, δᵣ are called the invariant factors of M and the
integer w is called the rank of M. The direct sum decomposition of M
into cyclic summands is known as the invariant factor decomposition of M.
The uniqueness of the invariant factors of M will be established in Corollary
12.8.2.
Notice that the uniqueness theorem for the invariant factors of a matrix
does not lead directly to the corresponding result for modules, the point
being that a given module can have many unrelated presentation matrices.
The invariant factor decomposition may also be less precisely referred
to as the cyclic decomposition, although it contains more information than
simply telling us that there is a cyclic decomposition of a module.

12.4 Some illustrations


Here are some computations to illustrate the theory.
(i) Suppose that M = R/Rp²q × R/Rpq², where p, q are distinct irreducible
elements of R. The presentation matrix of M is Γ = diag(p²q, pq²),
which, although diagonal, is not in invariant factor form. The Fitting
ideals of Γ are easily seen to be Rpq and Rp³q³, so the invariant factor
form of Γ is Δ = diag(pq, p²q²) and

    M ≅ R/Rpq × R/Rp²q².

(ii) Let M be the Z-module with presentation matrix

    ( 12  9 12 )
    (  3  2  4 )
    ( 12  8 22 ).

By section 12.2, M has the invariant factor presentation

    m′₁ = 0, 3m′₂ = 0 and 6m′₃ = 0.

Omitting the trivial term with coefficient 1, we see that r = 2, w = 0
and that

    M ≅ Z₃ × Z₆.
(iii) Next, we consider the integer matrix

    Γ = (   1   3   2   0 )
        (  10   5  -5  15 )
        ( -20 -10  10 -30 )

that we introduced in section 11.1. The Fitting ideals of this matrix are

    Fit₁(Γ) = Z,
    Fit₂(Γ) = 5Z,      (12.12)
    Fit₃(Γ) = 0,

and the invariant factor form of Γ is

    Δ = diag(1, 5, 0).

Thus the module with presentation matrix Γ is isomorphic to

    Z₅ × Z,

so we have r = 1 and w = 1.
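The passage from a diagonal form to the module structure is routine enough
to automate; here is a small sketch (not from the text) in Python, where the
inputs are the nonzero diagonal entries of the invariant factor form and the
number t of rows, that is, of generators.

    def module_structure(nonzero_diagonal, t):
        # unit entries give trivial summands; nonzero non-units give
        # torsion factors Z_delta; the remaining t - r generators are free
        torsion = [d for d in nonzero_diagonal if abs(d) > 1]
        rank = t - len(nonzero_diagonal)
        return torsion, rank

    print(module_structure([1, 3, 6], t=3))   # ([3, 6], 0): Z_3 x Z_6
    print(module_structure([1, 5], t=3))      # ([5], 1):    Z_5 x Z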
The next result, although also an illustration, is important enough to
be recorded separately.

12.4.1 Proposition
Let F be a field, let C be the rational canonical block matrix associated
to the polynomial f = f₀ + f₁X + ⋯ + fₙ₋₁Xⁿ⁻¹ + Xⁿ, and let M be the
module over the polynomial ring F[X] which is defined by the action of C
on Fⁿ.
Then

    M ≅ F[X]/F[X]f(X).

Proof The presentation matrix of M is Γ = XI - C with

    C = ( 0 0 ⋯ 0 0  -f₀   )
        ( 1 0 ⋯ 0 0  -f₁   )
        ( 0 1 ⋯ 0 0  -f₂   )
        ( ⋮ ⋮    ⋮ ⋮   ⋮   )
        ( 0 0 ⋯ 1 0  -fₙ₋₂ )
        ( 0 0 ⋯ 0 1  -fₙ₋₁ ).

As in section 11.4, the invariant factors of Γ are

    δ₁ = ⋯ = δₙ₋₁ = 1 and δₙ = f,

which gives the result. □

12.5 The primary decomposition


In Theorem 8.4.2, we proved that a finitely generated torsion module over
a Euclidean domain has a primary decomposition, that is, it can be
expressed as a direct sum of its p-primary components for certain irreducible
elements p. We also found the structure of these primary components when
the module is cyclic (section 8.5). We now put these results together to

obtain the structure of the primary components of a general torsion module,
which we can do since the invariant factor decomposition expresses a
torsion module as a direct sum of cyclic modules.
Let M be a finitely generated module over a Euclidean domain R. Recall
from section 8.4 that, for an irreducible element p of R, the p-primary
component of M is the submodule Tₚ(M) of M consisting of those elements
m of M which have pᵏm = 0 for some k > 0. Evidently, the primary
components of M are the same as those of its torsion submodule T(M), so
we may assume that M is a torsion module.
First, we review the results for the case that M = R/Rδ is a cyclic
module. Let δ = u p₁^{n(1)} ⋯ pₖ^{n(k)} be a standard factorization of δ. The
discussion in section 8.5 shows that R/Rδ has the primary decomposition

    R/Rδ = T_{p₁}(R/Rδ) ⊕ ⋯ ⊕ T_{pₖ}(R/Rδ)

with

    T_{pⱼ}(R/Rδ) ≅ R/Rpⱼ^{n(j)} for j = 1, …, k.
Now we return to the general case. Suppose that M = R/Rδ₁ ⊕ ⋯ ⊕
R/Rδᵣ is the invariant factor decomposition of M, and let

    δᵣ = uᵣ p₁^{n(r,1)} ⋯ pₖ^{n(r,k)}

be a standard factorization of δᵣ. Then for i < r we can write

    δᵢ = uᵢ p₁^{n(i,1)} ⋯ pₖ^{n(i,k)}.

These factorizations need no longer be "standard" since some exponents
n(i,j) may be 0, but, for fixed j, the exponents of the irreducible element
pⱼ form a nondecreasing sequence

    0 ≤ n(1,j) ≤ ⋯ ≤ n(r,j).

Since the p-component of a direct sum is the direct sum of the p-
components of the summands (Exercise 8.1), we find that the nontrivial
p-primary components of M are

    T_{pⱼ}(M) ≅ R/Rpⱼ^{n(1,j)} × ⋯ × R/Rpⱼ^{n(r,j)} for j = 1, …, k.   (12.13)

Notice that some of the summands may be zero modules, corresponding to
the possibility that n(i,j) = 0.
The collection of nontrivial powers

    { pⱼ^{n(i,j)} | i = 1, …, r, j = 1, …, k, n(i,j) ≠ 0 }


that occur in the primary decomposition of M is called the set of elementary
divisors of M.
We also note that we can find an explicit set of generators g_{1,1}, …, g_{r,k}
of the torsion module M which gives the primary decomposition of M as
an internal direct sum; that is,

    M = Rg_{1,1} ⊕ ⋯ ⊕ Rg_{r,k}

with

    Rg_{i,j} ≅ R/Rpⱼ^{n(i,j)} for all i, j.

Suppose that we have already found a set of generators m₁, …, mᵣ which
give the invariant factor decomposition of M:

    M = Rm₁ ⊕ ⋯ ⊕ Rmᵣ with Rmᵢ ≅ R/Rδᵢ for all i.

For each irreducible factor pⱼ of δᵣ, let εᵢ(pⱼ) be the complement of pⱼ in
δᵢ:

    εᵢ(pⱼ) = p₁^{n(i,1)} ⋯ pⱼ₋₁^{n(i,j-1)} pⱼ₊₁^{n(i,j+1)} ⋯ pₖ^{n(i,k)}.

(If pⱼ does not genuinely occur in δᵢ, take εᵢ(pⱼ) = δᵢ.)
Then the results of section 8.5 show that the required generators, some
of which may be zero, are given by

    g_{i,j} = εᵢ(pⱼ)mᵢ for i = 1, …, r, j = 1, …, k.
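For R = Z, the passage from invariant factors to elementary divisors is
just integer factorization. A sketch (not in the text), assuming SymPy's
factorint:

    import sympy as sp

    def elementary_divisors(invariant_factors):
        divisors = []
        for delta in invariant_factors:
            for p, n in sp.factorint(delta).items():   # standard factorization
                divisors.append(p**n)                  # one prime power per p
        return sorted(divisors)

    print(elementary_divisors([3, 6]))   # [2, 3, 3], as in illustration (ii) below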

12.6 The illustrations, again


Here are the elementary divisors of the modules whose invariant factor
forms were computed in section 12.4 above.
(i) Let M = R/Rp2q x R/Rpq2 where p, q are distinct irreducible elements
of R as in (i) of 12.4. Then M = R/Rpq x R/Rp2q2 and so the primary
components of M are

TP(M) « R/Rp x R/Rp2 and Tq(M) S fl/ity x R/Rq2.

The list of elementary divisors of M is {p,p 2 , q,q2}-


(ii) The Z-module of 12.4(ii) is M = Z 3 x Z 6 . The primary decomposition of
M is therefore
M = T2(M)®T3(M)
with
T2(mpyfight&tflM&yM®l= Z 3 x z 3 .

The set of elementary divisors of M is {2, 3, 3}.
We also showed that M has generators m₁, m₃ with relations 3m₁ =
6m₃ = 0, where m₁, m₃ occur in the original generating set for M. To
find generators of M adapted to its primary decomposition as in section
12.5 above, we take p₁ = 2, p₂ = 3, and write

    δ₁ = 3 = 2⁰3¹ and δ₂ = 6 = 2¹3¹,

so that

    n(1,1) = 0, n(1,2) = 1, n(2,1) = 1, n(2,2) = 1.

The generators are then

    g_{1,1} = 3m₁, g_{1,2} = m₁, g_{2,1} = 3m₃, g_{2,2} = 2m₃

with relations

    g_{1,1} = 0, 3g_{1,2} = 0, 2g_{2,1} = 0, 3g_{2,2} = 0.

(iii) The module of 12.4(iii) is Z₅ × Z, whose torsion submodule Z₅ is already
5-primary.

12.7 Reconstructing the invariant factors


The invariant factors of a module can be reassembled from its elementary
divisors. We give the general argument first, and then an example.
Suppose that p₁, …, pₖ are distinct irreducible elements of R and that
we are told that the elementary divisors of M are

    p₁^{z(1,1)}, …, p₁^{z(1,y(1))}; …; pₖ^{z(k,1)}, …, pₖ^{z(k,y(k))},

where the exponents are all nonzero and are listed in nondecreasing order
for each pⱼ, that is,

    z(j,1) ≤ ⋯ ≤ z(j,y(j)).

Essentially, the method is to construct the "largest" invariant factor by
taking the product of the largest powers of each irreducible element that occur
in the list, then to find the next largest in terms of the remaining powers,
and so on.
More formally, we proceed as follows. The number of terms in which
pⱼ occurs is y(j). Let the integer r be the maximum value of y(j) for
j = 1, …, k, and put

    n(r,1) = z(1,y(1)), …, n(r,k) = z(k,y(k))

and

    δᵣ = p₁^{n(r,1)} ⋯ pₖ^{n(r,k)}.

Next, write

    n(r-1,1) = z(1,y(1)-1), …, n(r-1,k) = z(k,y(k)-1)

and

    δᵣ₋₁ = p₁^{n(r-1,1)} ⋯ pₖ^{n(r-1,k)},

where the exponent z(j,y(j)-1) is to be interpreted as 0 if it happens that
y(j) = 1 for any j.
Continuing in this way, or, more properly, arguing by induction on r,
we obtain the set

    δ₁, …, δᵣ

of invariant factors of M.
An example. Here is a concrete illustration of the reconstruction argument.
Suppose that the Z-module M has elementary divisors

    2, 2², 2⁴, 2⁶; 5²; 7, 7, 7³; 11, 11², 11⁴.

In this case, r = 4, and collecting the highest powers gives

    δ₄ = 2⁶ · 5² · 7³ · 11⁴.

We then find

    δ₃ = 2⁴ · 7 · 11², δ₂ = 2² · 7 · 11, δ₁ = 2.
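The reconstruction is easily coded; the following sketch (not part of the
text) groups the elementary divisors by prime with SymPy's factorint, pads
each exponent list with zeros to the common length r, and multiplies across
the layers. Run on the example above it returns δ₁, …, δ₄.

    from collections import defaultdict
    from sympy import factorint

    def invariant_factors(elementary_divisors):
        exps = defaultdict(list)
        for q in elementary_divisors:
            (p, z), = factorint(q).items()          # each q is a prime power p**z
            exps[p].append(z)
        r = max(len(v) for v in exps.values())
        deltas = [1] * r
        for p, zs in exps.items():
            zs = [0] * (r - len(zs)) + sorted(zs)   # pad with zero exponents
            for i, z in enumerate(zs):
                deltas[i] *= p**z
        return deltas                               # delta_1 | ... | delta_r

    eds = [2, 2**2, 2**4, 2**6, 5**2, 7, 7, 7**3, 11, 11**2, 11**4]
    print(invariant_factors(eds))   # [2, 308, 13552, 2**6 * 5**2 * 7**3 * 11**4]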

12.8 The uniqueness results


We next turn to the question of the uniqueness of the invariant factors and
elementary divisors of a module. We first consider the case of a p-primary
module, that is, M = Tₚ(M) for some irreducible element p of R.

12.8.1 Theorem
Let M be a p-primary R-module. Then the following hold.
(i)

    M ≅ R/Rp^{n(1)} × ⋯ × R/Rp^{n(y)},

where the nondecreasing sequence of positive integers

    n(1) ≤ ⋯ ≤ n(y)

is uniquely determined by M.



(ii) The set {p^{n(1)}, …, p^{n(y)}} is both the set of elementary divisors of M
and the set of invariant factors of M, and it is uniquely determined
by M.

Proof Since M is annihilated by some power of p, the primary decomposition
of M cannot have any terms of the form R/Rqᵗ if q ≠ p. It follows
that the primary decomposition of M must be as stated, and that this
must be the same as the invariant factor form of M. Clearly, the uniqueness
assertion in part (ii) will follow immediately from the uniqueness of the
exponents n(1), …, n(y), which we establish by induction on the exponent
n(y). Notice that p^{n(y)} is the smallest power of p such that p^{n(y)}M = 0.
First, suppose that n(y) = 1, in which case we must have n(j) = 1 for
j = 1, …, y and pM = 0. Thus M can be regarded as a vector space over
the field R/Rp (see Theorem 2.11.1 and Exercise 3.8), and y is simply the
dimension of M over R/Rp, which we know to be unique by elementary
linear algebra.
Now suppose that n(y) > 1, and define z to be the number of terms
which are isomorphic to R/Rp; that is, z is the largest integer so that

    n(1) = ⋯ = n(z) = 1.

If all the exponents n(i) are greater than 1, we take z = 0, and, in any case,
z < y.
For any exponent n > 1, we have Rp/Rpⁿ ≅ R/Rpⁿ⁻¹ (Exercise 6.4),
while p(R/Rp) = 0. Hence

    pM ≅ R/Rp^{n(z+1)-1} × ⋯ × R/Rp^{n(y)-1}.

By our induction hypothesis, the sequence of integers

    n(z+1) - 1, …, n(y) - 1

is uniquely determined by pM, which shows that the sequence

    n(z+1), …, n(y)

is uniquely determined by M; in particular, the number y - z of terms in
this sequence is unique.
Next, consider the quotient module M/pM. If n ≥ 1, then

    (R/Rpⁿ)/p(R/Rpⁿ) ≅ R/Rp,

and so M/pM is a vector space over R/Rp, of unique dimension y. Thus
the number of cyclic components of M of the form R/Rp is

    z = y - (y - z),

which is again unique as it is the difference of two numbers themselves
uniquely determined by M. □
Since the invariant factors of a module can be reconstructed from its
elementary divisors as in section 12.7, the next corollary needs no further
proof.

12.8.2 Corollary
Let M be an R-module. Then the invariant factors of M are uniquely
determined by M, except that each of them can be multiplied by a unit of
R. □

12.9 A summary
As our description of the structure of a module has emerged piecemeal,
here is a brief summary of the main results.
Let M be a finitely generated module over a Euclidean domain. We
start by decomposing M as a direct sum T(M) ⊕ F(M), where T(M) is the
torsion submodule of M and F(M) is a free complement, as in Theorem
12.3.1. The submodule T(M) is absolutely unique, as we can see from its
definition as a subset of M. Exercise 12.1 shows that the free complement
F(M) is not usually unique as a subset of M, but it does have a unique
rank.
The next step is to express T(M) in invariant factor form

    T(M) ≅ R/Rδ₁ × ⋯ × R/Rδᵣ;

as we have just seen, the invariant factors are unique apart from changes
of the form δ′ᵢ = uᵢδᵢ for units uᵢ of R. The corresponding internal direct
sum decomposition of T(M) is usually far from unique, as can be seen by
considering all possible bases of the vector space R/Rp × R/Rp: any such
basis gives an internal direct sum decomposition of R/Rp × R/Rp as an
R-module.
The torsion submodule can further be decomposed into p-primary
components Tₚ(M) for irreducible elements p of R. These components are
absolutely unique since they are specific subsets of M. There is a unique
finite set p₁, …, pₖ of irreducibles for which T_{pᵢ}(M) ≠ 0, namely those
occurring as factors of the annihilator Ann(T(M)) of T(M).

Finally, each Tₚ(M) has a decomposition R/Rp^{n(1)} × ⋯ × R/Rp^{n(y)},
where the set of invariant factors {p^{n(1)}, …, p^{n(y)}} of Tₚ(M) is unique
provided that we take the exponents in non-decreasing order. Again, the
corresponding internal direct decomposition of Tₚ(M) is not absolutely unique.
The set of elementary divisors of M is the set of powers {p^{n(1)}, …, p^{n(y)}}
where p ranges over all the irreducible factors of the annihilator of T(M);
this set is again unique apart from the order in which it is listed.

12.10 Abelian groups


As we saw in section 3.2, a Z-module is an additive abelian group under
another name. This observation means that our results on modules can be
interpreted in the language of group theory. Here is a brief sketch of this
interpretation.
A group theorist will usually prefer to write a group multiplicatively, as
in section 1.1. A cyclic group C will then appear as the set of powers

    C = { xⁱ | i ∈ Z }

of a generator x instead of the set of multiples that is expected when we
use additive notation.
There is an infinite cyclic group C∞ in which all the powers xⁱ are
distinct. The finite cyclic group of order n is

    Cₙ = {1, x, …, xⁿ⁻¹} with xⁿ = 1, xⁱ ≠ 1 for 1 ≤ i < n.

To see that these groups are really Z and Zₙ in disguise, note that there
are bijective maps

    α∞ : Z → C∞, given by α∞(i) = xⁱ,

and

    αₙ : Zₙ → Cₙ, given by αₙ(ī) = xⁱ,

which are isomorphisms of groups since the equations

    α∞(i + j) = α∞(i)α∞(j)

and

    αₙ(ī + j̄) = αₙ(ī)αₙ(j̄)

hold for all i, j and n.
Given a prime p of the ring of integers Z, the p-primary component of
a finitely generated multiplicative abelian group A takes the form

    Tₚ(A) = C_{p^{n(1)}} × ⋯ × C_{p^{n(y)}}

for a unique set of integers

    n(1) ≤ ⋯ ≤ n(y).

Each p-primary component of A is a finite group, and so the torsion
subgroup T(A) of A is also a finite group.
The free component of A will have the form C∞ʷ for a unique integer w,
the rank of A. Thus we recover a classical result, namely, that a finitely
generated torsion-free abelian group is in fact free.
When A is a finite group, the p-primary component of A is the same as
the Sylow p-subgroup of A. Further details can be found in [Allenby].

12.11 Lattices
Our results on additive groups have a geometric interpretation when we
consider finitely generated additive groups that are subgroups of a real
vector space Rⁿ. Such an additive group L is called a lattice in Rⁿ. It
is clear that L must be torsion-free as a Z-module. Thus, by Theorem
12.3.1, L is isomorphic to Zʳ for some integer r, and hence L has a basis
{a₁, …, aᵣ}. We say that L is a full lattice in Rⁿ if r = n, in which case
{a₁, …, aₙ} is also a basis of Rⁿ.
The reason for the use of the term lattice can be seen from the following
diagram in R²:

[Diagram: the points of the sublattice M, shown as solid circles •, among
the points of L, shown as open circles ◦, in R², with the vectors b₁, b₂
and 6e₁ marked and the fundamental parallelograms Π(L) (dotted) and
Π(M) (solid) outlined.]
Here, we take L to be the lattice given by the standard basis {e₁, e₂}, and
M is a lattice with basis {b₁, b₂}, where b₁ = (3, 1)ᵀ and b₂ = (3, -1)ᵀ.
The points belonging to M are indicated by solid circles • and those in L
by open circles ◦, except where they are hidden under a point of M.
A basis {a₁, …, aₙ} of a lattice L defines a fundamental parallelepiped
Π(L) whose vertices are the origin, the vectors a₁, …, aₙ, and all the sums
a_{i₁} + ⋯ + a_{iₖ} with i₁ < ⋯ < iₖ and 1 < k ≤ n. We then associate with
L the volume

    vol(L) = |det(a₁ … aₙ)|

of Π(L), which is (by definition) the absolute value of the determinant of
the matrix whose columns are the (column) vectors a₁, …, aₙ.
For n = 2, we prefer to speak of parallelograms and area. In our
illustration, a fundamental parallelogram Π(L) of L is indicated by dotted lines
and a fundamental parallelogram Π(M) of M by solid lines. It is easy to
see that L has area vol(L) = 1, while vol(M) = 6.
Our main result in this section shows how the volume of a lattice relates

to the volume of a sublattice. We write |G| for the order of a finite group
G.

12.11.1 Theorem
Suppose that L and M are both full lattices in the real space Rⁿ, and
that M is contained in L. Then the following statements hold.
(i) vol(L) does not depend on the choice of a basis for the lattice L.
(ii) The quotient group L/M is finite, and

    vol(M) = vol(L) · |L/M|.

Proof
(i) Let {a₁, …, aₙ} and {b₁, …, bₙ} be bases of L as a Z-module, write

    bⱼ = a₁p₁ⱼ + ⋯ + aₙpₙⱼ for j = 1, …, n,

and put P = (pᵢⱼ), an n × n matrix with entries in Z. Then P is a change
of basis matrix, as in section 5.6, and so it is invertible as an integer matrix
(Eq. (5.7)). By Theorem 5.12.1, we see that

    det(P) = ±1.

Now let A = (a₁ … aₙ) and B = (b₁ … bₙ) be the matrices whose columns
are the vectors in the respective bases. A careful check shows that

    B = AP,

which explains why our scalars have suddenly appeared on the right, and
also why we had an unexpected transposition of suffices in our original
definition of the change of basis matrix.
Thus

    det(B) = ±det(A),

which proves the assertion.
(ii) By Theorem 12.1.1, there are bases of L and M of the forms {b₁, …, bₙ}
and {δ₁b₁, …, δᵣbᵣ} respectively, where δ₁, …, δᵣ are the invariant factors
of L/M. But a basis of M is also a basis of Rⁿ, so we must have r = n. By
Theorem 12.3.1, we have

    L/M ≅ Z_{δ₁} × ⋯ × Z_{δₙ},

so that L/M is finite, with order

    |L/M| = δ₁ ⋯ δₙ.

Clearly,

    vol(M) = |det(δ₁b₁ … δₙbₙ)|
           = δ₁ ⋯ δₙ |det(b₁ … bₙ)|
           = |L/M| vol(L).    □
Thus, in the example above, we have L/M ≅ Z₆. It is easy to see that
{b₂, e₁} is a basis of L and that {b₂, 6e₁} is a basis of M.
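A numerical sketch of the theorem (not in the text), taking the basis
vectors b₁ = (3, 1)ᵀ and b₂ = (3, -1)ᵀ read off from the illustration; the
index |L/M| is recovered from the diagonal of a Smith normal form, which
is assumed to be available in SymPy.

    import sympy as sp
    from sympy.matrices.normalforms import smith_normal_form

    L = sp.Matrix([[1, 0], [0, 1]])     # columns e1, e2
    M = sp.Matrix([[3, 3], [1, -1]])    # columns b1, b2
    vol_L, vol_M = abs(L.det()), abs(M.det())

    S = smith_normal_form(M, domain=sp.ZZ)               # diag(1, 6)
    index = abs(S[0, 0] * S[1, 1])                       # |L/M| = delta_1 * delta_2
    print(vol_L, vol_M, index, vol_M == vol_L * index)   # 1 6 6 True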

12.12 Further developments


Our arguments in this chapter depend on the fact that the presentation
matrix of a module can be put into invariant factor form. Thus the results
of this chapter can be extended to modules over principal ideal domains.
They also hold for noncommutative Euclidean and principal ideal domains,
but with less satisfactory versions of the uniqueness results. Details can be
found in Chapter 8 of [Cohn: FRTR] and in [G, L & O].
Results on modules over some rings that are "nearly" Euclidean can
be found in [A & L], and an investigation of presentations of modules over
rings which are not principal ideal domains is given in [G & L].

Exercises
12.1 Let M = Zₐ × Z with a ≠ 0. Write m₁ = (1̄, 0) and m₂ = (0, 1). Show
that M = Zm₁ ⊕ Zm₂ and that Zₐ ≅ Zm₁ and Z ≅ Zm₂.
Find all elements x = x₁m₁ + x₂m₂ ∈ M so that Z ≅ Zx.
For which of these x is there an element y ∈ M so that M =
Zx ⊕ Zy?
Hint: consider intersections first.
12.2 Describe the invariant factor and elementary divisor forms of the Z-
modules with the following presentation matrices.

    (a) ( 2 4 10 )
        ( 2 4 11 )
        ( 3 7  0 )

    (b) ( 1 2 )
        ( 3 2 )

    (c) ( 3 2 )
        ( 3 6 )    (See Exercises 11.2 and 11.3.)
        ( 2 a )

12.3 Let

    A = ( 1 0 a )
        ( 0 1 a )
        ( 0 0 b )

be a matrix over a field F, and let M be the F[X]-module defined by
the action of A on F³.
Discuss the possibilities for the invariant factor and elementary
divisor forms of M as a and b vary. (See Exercise 11.4.)
12.4 The challenge again! Using Exercise 11.5, find the invariant factor
and elementary divisor forms of the Z[i]-module with generators
m₁, m₂, m₃ and relations

    (1 + i)m₂ = 6m₁ + 4m₂ + 2(1 + i)m₃ = 6m₁ + (5 - i)m₂ + 2im₃ = 0.

(Note that 1 + i and 3 are irreducible.)


12.5 Let L be a lattice in Rⁿ, and let b₁, …, bₙ be any set of members of
L. Show that b₁, …, bₙ is a basis of L as a Z-module if and only if

    vol(L) = |det(b₁ … bₙ)|.

12.6 Let R be a Euclidean domain and let φ : R → Z be the function of
section 2.1. We define (very unofficially!) the order |M| of a finitely
generated torsion R-module M by first setting |R/Rδ| = φ(δ) for
a cyclic module, and then using the invariant factor decomposition
(Theorem 12.3.1) to extend the definition to general modules.
Verify that this definition coincides with the usual one when R is
the ring of integers Z.
Show that Lagrange's Theorem still holds, that is, if N is a
submodule of M, then |N| divides |M|, with quotient |M/N|.
Let K be the field of fractions of R. Extend φ to a function
φ : K → Q (hint: Axiom ED 2). Define an R-lattice L in Kⁿ
and a "volume" vol(L), and show that, after obvious modifications,
Theorem 12.11.1 is still true.
Chapter 13

Normal Forms for Matrices

Our aim in this chapter is to describe some normal forms for a square
matrix A over a field F. Before we can begin, we must describe what we
are seeking. A normal form for A is a matrix C whose entries conform to
some standard pattern and which is similar to A, that is, C = PAP⁻¹ for
an invertible matrix P over F.
The first normal form that we find is the rational canonical form. This
form can be computed from the invariant factors of the characteristic matrix
XI - A, and so it can be found by purely algebraic calculations, by which
we mean computations that involve only addition, multiplication and long
division of polynomials. The calculation of rational canonical forms enables
us to solve the similarity problem, that is, we can determine precisely when
two matrices are similar.
The second form is the Jordan normal form. This form is more elegant
when it can be obtained, but it can be found only when all the irreducible
factors of the characteristic polynomial det(XI - A) are linear polynomials
X - λ. Thus, if we wish to ensure every matrix over F has a Jordan normal
form, we must impose the condition that every irreducible polynomial over
F is a linear polynomial, or, in other words, that F is algebraically closed.
Even when this requirement is satisfied, there is in general no purely
algebraic method for determining the Jordan normal form, since the roots of a
polynomial cannot be determined algebraically.
We also discuss the versions of the Jordan normal form that can be
found when the irreducible factors of the characteristic polynomial need
not be linear.


The Jordan normal form is useful for solving polynomial equations in
matrices, which we illustrate in a couple of examples, and we hint at how
such calculations occur in the representation theory of groups.
The margins of this chapter are liberally scattered with the
"supplementary" material indicator. In the original lecture course, I was able to
treat only the rational canonical form and the Jordan normal form over
the complex numbers. The extensions of the Jordan normal form and the
applications are supplementary material.

13.1 F[X]-modules and similarity

Our results on normal forms are derived from the structure theory of
modules over the polynomial ring F[X], using the correspondence between
F[X]-modules and matrix actions that we first discussed in section 4.4.
Let A be an n × n matrix over the field F and let M(A) be the space
Fⁿ made into an F[X]-module with X acting as A, in the usual manner.
Then there are two types of change we can make to M(A) that result in
the replacement of A by a similar matrix, that is, a matrix of the form
A′ = PAP⁻¹. (Following the practice in group theory, we will sometimes
say that A′ is a conjugate of A.)
The first type of change corresponds to isomorphism between F[X]-
modules. If we are given that A′ = PAP⁻¹, then the linear map π : x ↦ Px
on Fⁿ is an F[X]-module homomorphism from M(A) to M(A′) (section
4.6) and moreover it is an isomorphism since P is invertible. Conversely, if
we are given an F[X]-module isomorphism π from M(A) to M(A′), then
π must be given by an invertible matrix P and A′ and A will be similar
through P.
The second type of change is to change the basis of Fⁿ. Let B be the
standard basis of Fⁿ, let B′ be another basis for Fⁿ, and write A′ for the
matrix of the linear transformation v ↦ Xv with respect to B′. Then

    A′ = PAP⁻¹

where P = P_{B′,B} is the change of basis matrix (Eq. (5.13) of section 5.11).
Conversely, if we are given that A′ is similar to A through P, then we
choose B′ so that P = P_{B′,B}. Thus A′ is again the matrix of the action of
X with respect to the basis B′.
Of course, these two types of change are really two interpretations of
one phenomenon. If we are given a basis B = {b₁, …, bₙ} of Fⁿ and an
isomorphism π : Fⁿ → Fⁿ, then B′ = {π(b₁), …, π(bₙ)} will be another
basis of Fⁿ. Vice versa, if B′ = {b′₁, …, b′ₙ} is another basis, we can define
an isomorphism π by π(bᵢ) = b′ᵢ.

These remarks suggest our strategy for finding a normal form for A: we
analyse the structure of M(A) as an F[X]-module and hence choose a basis
of Fⁿ so that the action of X is represented by a matrix in some desirable
form. We start by reviewing the results that we found in previous chapters.
As we saw in section 9.7, the characteristic matrix

    Γ = XI - A

is a presentation matrix for M(A). The n-th Fitting ideal of XI - A is
generated by the characteristic polynomial det(XI - A) of A, which is a
monic polynomial of degree n (section 11.4). The invariant factors of XI - A
are polynomials δ₁(X), …, δₙ(X) that satisfy the relations

    δ₁(X) ⋯ δₙ(X) = det(XI - A),  δ₁(X) | ⋯ | δₙ(X),

and, by Corollary 11.2.2,

    δ₁(X) ⋯ δₕ(X) generates Fitₕ(XI - A) for h = 1, …, n.

Since the nonzero constants are units in F[X], we can take the invariant
factors to be monic polynomials, and then, by Theorem 11.3.1, they are
uniquely determined by the matrix XI - A and hence by the matrix A itself.
We also know from Corollary 12.8.2 that the invariant factors are uniquely
determined by the module M(A), that is, no alternative presentation of
M(A) can give different invariant factors.
Thus, by Theorem 12.3.1, M(A) has an invariant factor decomposition

    M(A) ≅ F[X]/F[X]δ₁(X) × ⋯ × F[X]/F[X]δₙ(X)    (13.1)

as an F[X]-module.
Note that the rank w of M(A) must be 0, that is, M(A) cannot have a
nontrivial free component F[X]ʷ. The simplest way to see this is to note
that M(A) is finite dimensional as a vector space over F and F[X] is not.
A more sophisticated approach is to recall that Ann(M(A)) ≠ 0 since it
contains the characteristic polynomial det(XI - A) of A (see Exercises 9.1
and 9.2).
In general, some of the invariant factors of XI - A will be the constant
polynomial 1. As these invariant factors give a zero component of M(A),
we say that they are trivial invariant factors. We sometimes omit such
terms from expressions such as that in Eq. (13.1) above.

13.2 The minimum polynomial

We now have a better understanding of the annihilator Ann(M(A)) of
M(A).

13.2.1 Theorem
(i) Ann(M(A)) = F[X]δₙ(X).
(ii) δₙ(A) = 0.
(iii) If h(X) is any other polynomial with h(A) = 0, then δₙ(X) divides
h(X).

Proof The divisibility conditions δ₁ | ⋯ | δₙ show that δₙ(X) annihilates
M(A). But Ann(F[X]/F[X]δₙ(X)) = F[X]δₙ(X), which gives (i).
Thus the matrix δₙ(A) acts as the zero linear transformation on the
underlying space Fⁿ, and so must be the zero matrix.
Finally, note that if h(A) = 0, then h(X) ∈ Ann(M(A)). □

The polynomial δₙ(X) is called the minimum (or sometimes minimal)
polynomial of the matrix A, as it is the (unique) monic polynomial of
smallest degree that is satisfied by A. The Cayley-Hamilton Theorem (Exercise
9.2) shows that A also satisfies its characteristic polynomial det(XI - A).
Examples. Here are some illustrations based on the calculations in
section 11.4. Let

    C = ( 0 0 ⋯ 0 0  -f₀   )
        ( 1 0 ⋯ 0 0  -f₁   )
        ( 0 1 ⋯ 0 0  -f₂   )
        ( ⋮ ⋮    ⋮ ⋮   ⋮   )
        ( 0 0 ⋯ 1 0  -fₙ₋₂ )
        ( 0 0 ⋯ 0 1  -fₙ₋₁ )

be the companion matrix of the polynomial f = f₀ + f₁X + ⋯ + fₙ₋₁Xⁿ⁻¹ +
Xⁿ.
Then, as in Proposition 12.4.1,

    δ₁(X) = ⋯ = δₙ₋₁(X) = 1 and δₙ(X) = f(X),

so that the minimum polynomial of C is the same as the characteristic
polynomial of C; the F[X]-module corresponding to C is F[X]/F[X]f(X).
Let

    A = ( a b )
        ( c d )

be a 2 × 2 matrix, which has characteristic polynomial

    f = X² - (a + d)X + (ad - bc).

The calculations in section 11.4 show that there are two possibilities for
M(A).
(i) If any of the inequalities

    b ≠ 0, c ≠ 0, a ≠ d

holds, then δ₁(X) = 1 and δ₂(X) = f, so that M(A) is isomorphic to
F[X]/F[X]f(X) and the characteristic and minimum polynomials of A
coincide.
(ii) If b = c = 0 and a = d, then δ₁(X) = X - a = δ₂(X), so that M(A) ≅
N × N, where N = F[X]/F[X](X - a) is the field F regarded as an
F[X]-module with X acting as the scalar a. The minimum polynomial
of A is X - a and the characteristic polynomial is (X - a)².
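Since δ₁ ⋯ δₙ₋₁ generates Fitₙ₋₁(XI - A), the minimum polynomial δₙ can
be computed as det(XI - A) divided by the gcd of the (n-1) × (n-1) minors,
that is, of the entries of the adjugate of XI - A. Here is a sketch (not from
the text) with SymPy:

    import sympy as sp
    from functools import reduce

    def minimum_polynomial(A, X=sp.Symbol('X')):
        n = A.shape[0]
        XIA = X * sp.eye(n) - A
        char = sp.expand(XIA.det())                # delta_1 ... delta_n
        g = reduce(sp.gcd, list(XIA.adjugate()))   # delta_1 ... delta_{n-1}
        return sp.quo(char, g, X)                  # exact division

    A = sp.Matrix([[2, 0], [0, 2]])
    print(minimum_polynomial(A))   # X - 2, although det(XI - A) = (X - 2)**2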

13.3 The rational canonical form

We obtain our first canonical form for a matrix. Let A be an n × n matrix
over a field F, and let M = M(A) be the F[X]-module defined by the
action of A on Fⁿ in the usual way. For any F[X]-submodule N of M, the
action of X on N defines the F-linear transformation σ(X) : n ↦ Xn. We
sometimes refer to σ(X) informally as "the linear transformation X".
We write the invariant factor form of M as an internal direct sum
(omitting trivial components)

    M = M₁ ⊕ ⋯ ⊕ Mᵣ

with

    Mᵢ ≅ F[X]/F[X]δᵢ(X) for i = 1, …, r.

Choose a basis Bᵢ of each summand Mᵢ as a vector space over F, and let
Dᵢ be the matrix representing the action of X on Mᵢ for each i. Then the
union B = B₁ ∪ … ∪ Bᵣ is a basis of M, and, as in section 7.5, the matrix
of the linear transformation X with respect to this basis is a block diagonal
matrix

    D = ( D₁ 0  ⋯ 0  )
        ( 0  D₂ ⋯ 0  )  = diag(D₁, …, Dᵣ).
        ( ⋮  ⋮  ⋱ ⋮  )
        ( 0  0  ⋯ Dᵣ )

Fix an index i. We exploit the fact that we have an isomorphism

    θᵢ : Mᵢ ≅ F[X]/F[X]δᵢ(X)

to choose Bᵢ to correspond to the canonical basis of F[X]/F[X]δᵢ(X) that
we constructed in section 2.12: explicitly,

    Bᵢ = {b_{i,1}, …, b_{i,n(i)}}

where

    θᵢ(b_{i,1}) = 1̄, θᵢ(b_{i,2}) = X̄, …, θᵢ(b_{i,n(i)}) = X̄^{n(i)-1}

and n(i) is the degree of δᵢ. Then the corresponding matrix Dᵢ for X is the
companion matrix C(δᵢ) of δᵢ. Put

    δᵢ = δ₀⁽ⁱ⁾ + δ₁⁽ⁱ⁾X + ⋯ + δ_{n(i)-1}⁽ⁱ⁾X^{n(i)-1} + X^{n(i)},

so that

    C(δᵢ) = ( 0 0 ⋯ 0 0  -δ₀⁽ⁱ⁾       )
            ( 1 0 ⋯ 0 0  -δ₁⁽ⁱ⁾       )
            ( 0 1 ⋯ 0 0  -δ₂⁽ⁱ⁾       )
            ( ⋮ ⋮    ⋮ ⋮    ⋮         )
            ( 0 0 ⋯ 1 0  -δ_{n(i)-2}⁽ⁱ⁾ )
            ( 0 0 ⋯ 0 1  -δ_{n(i)-1}⁽ⁱ⁾ ).

We summarize these remarks, and a little more, as a theorem.
We summarize these remarks, and a little more, as a theorem.

13.3.1 The Rational Canonical Form.


Let A be an n x n matrix over a field F. Then there is an invertible
matrix P such that

PAP-1 = C(A) = diag(C(<5i),..., C{5r))

is a block diagonal matrix over F, in which the diagonal terms are the
companion matrices C(Si) of the nontrivial invariant factors Si of the char­
acteristic matrix XI — A of A, and

<5il...|<5r.

Furthermore, the nontrivial invariant factors of XI — C(A) are also


Si,... ,Sr.
Definition: the matrix C(A) is called the rational canonical form of A.

Proof The points not covered by the preceding discussion are the existence
of the invertible matrix P and the claim about the invariant factors.
In the notation of section 5.11, A = (X)E,E is the matrix of the linear
transformation X of Fn with respect to the standard basis E of F n , while
C(A) = (X)B,B is the matrix of X with respect to B. Let P = PB,E be
the corresponding change of basis matrix, which is invertible, with inverse
P~l = PB,B- Then Eq. (5.13) gives C(A) = PAP'1 as required.
Since A is similar to C(A), we see that XI — A is similar to XI — C(A),
so both matrices have the^sa^JnyMianl.faGtprs by Theorem 11.3.1. □

13.3.2 Corollary
The rational canonical form of A is unique.

Proof
Suppose that a matrix A has two rational canonical forms, say

    PAP⁻¹ = C = diag(C(δ₁), …, C(δᵣ))

as above and also

    QAQ⁻¹ = C′ = diag(C(δ′₁), …, C(δ′_{r′}))

with

    δ′₁ | ⋯ | δ′_{r′}.

Since the matrices C and C′ are similar, their characteristic matrices
XI - C and XI - C′ must have the same invariant factors, again using
Theorem 11.3.1. We know that the nontrivial invariant factors of XI - C
are the same as those of XI - A, namely δ₁, …, δᵣ, the remainder all being
the identity 1. However, a direct computation of Fitting ideals shows that
the nontrivial invariant factors of XI - C′ must be δ′₁, …, δ′_{r′}. It follows
that r = r′ and, since all these polynomials are monic, that δᵢ = δ′ᵢ for
i = 1, …, r. □

We can now determine when two n × n matrices A and A′ over the field
F are similar.

13.3.3 Theorem
Two n × n matrices A and A′ over a field F are similar if and only if
their characteristic matrices XI - A and XI - A′ have the same invariant
factors.

Proof
If A and A′ are similar, so are their rational canonical forms. By
Corollary 13.3.2, these rational canonical forms must be the same, so that the
invariant factors of the matrices XI - A and XI - A′ must also be the
same.
Conversely, if XI - A and XI - A′ have the same invariant factors, A
and A′ are both similar to the same matrix in rational canonical form and
so are themselves similar. □

13.4 The Jordan normal form: split case

The second canonical form for a matrix that we exhibit is the Jordan normal
form. First, we give it in its most familiar version, which arises when
the characteristic polynomial of A splits into linear factors. The standard
factorizations of the invariant factors of XI - A will then be products of
powers (X - λ)ᵗ for various scalars λ and exponents t, and each such power
will give rise to a component matrix of the Jordan normal form, in the
following way.
Let M = F[X]/F[X](X - λ)ᵗ and regard X and X - λ as linear
transformations of M, viewed as a vector space over F. Define a set of elements
W = {w₁, …, wₜ} in M by the equations

    w₁ = 1̄,
    w₂ = (X - λ)w₁,
    w₃ = (X - λ)w₂ = (X - λ)²w₁,        (13.2)
    ⋮
    wₜ = (X - λ)wₜ₋₁ = (X - λ)ᵗ⁻¹w₁.

Then W must be a basis of M as an F-space (or otherwise (X - λ)ᵗ would
not be the minimum polynomial of the linear transformation X of M), and
the action of X on this basis is given by

    Xw₁ = λw₁ + w₂,
    Xw₂ = λw₂ + w₃,
    ⋮                         (13.3)
    Xwₜ₋₁ = λwₜ₋₁ + wₜ,
    Xwₜ = λwₜ.

It follows that the matrix representing X with respect to the basis W is

    J(λ, t) = ( λ 0 0 ⋯ 0 0 0 )
              ( 1 λ 0 ⋯ 0 0 0 )
              ( 0 1 λ ⋯ 0 0 0 )
              ( ⋮ ⋮ ⋮    ⋮ ⋮ ⋮ )      (13.4)
              ( 0 0 0 ⋯ λ 0 0 )
              ( 0 0 0 ⋯ 1 λ 0 )
              ( 0 0 0 ⋯ 0 1 λ ).

Such a matrix is called an elementary Jordan matrix.
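A sketch (not in the text) of the elementary Jordan matrix in the
convention of Eq. (13.4), with the 1's on the subdiagonal; note that many texts
and computer algebra systems place them above the diagonal instead.

    import sympy as sp

    def elementary_jordan(lam, t):
        J = lam * sp.eye(t)
        for i in range(1, t):
            J[i, i - 1] = 1     # X w_i = lam*w_i + w_{i+1}
        return J

    print(elementary_jordan(sp.Symbol('lambda'), 3))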

13.4.1 The Jordan Normal Form



Let A be an n × n matrix over a field F. Suppose that the irreducible
factors of the characteristic polynomial det(XI - A) of A are all linear, so
that

    det(XI - A) = (X - λ₁)^{z(1)} ⋯ (X - λₛ)^{z(s)}

for some scalars λ₁, …, λₛ in F and exponents z(1), …, z(s).
Then there is an invertible matrix P such that

    PAP⁻¹ = J(A) = diag(J(λ₁, t(1,1)), …, J(λᵢ, t(i,j)), …, J(λₛ, t(s,r)))

is a block diagonal matrix over F, in which the diagonal matrices are
elementary Jordan matrices.
The size t(i,j) of the entry J(λᵢ, t(i,j)) is given by the exponent of
X - λᵢ in the factorization

    δⱼ(X) = (X - λ₁)^{t(1,j)} ⋯ (X - λₛ)^{t(s,j)}

of the j-th nontrivial invariant factor δⱼ(X) of XI - A.

NB: It may happen that some of the exponents t(i,j) are zero when j < r. In
this case, the term J(λᵢ, 0) is to be interpreted as a phantom 0 × 0 submatrix
of J(A).
Definition. The matrix J(A) is called the Jordan normal form of A.

Proof By the results in section 12.5, the primary decomposition of the
F[X]-module corresponding to A is

    M = M_{1,1} ⊕ ⋯ ⊕ M_{i,j} ⊕ ⋯ ⊕ M_{s,r}

in which

    M_{i,j} ≅ F[X]/F[X](X - λᵢ)^{t(i,j)},

the polynomials

    (X - λᵢ)^{t(i,j)} for i = 1, …, s, j = 1, …, r,

being the elementary divisors of M.
We take an F-basis of each summand M_{i,j} so that X is represented
by the elementary Jordan matrix J(λᵢ, t(i,j)); if some t(i,j) = 0, the
corresponding summand is 0, which has the empty set as its basis. The
union of all these bases gives a basis of M and the action of X on this basis
is represented by the matrix J(A) as above.
The matrix P is then the change of basis matrix from the standard basis
of M to the new basis, as in the proof of Theorem 13.3.1 above. □

13.4.2 Corollary
Let A be an n × n matrix over a field F and suppose that A has a Jordan
normal form. Then the Jordan normal form of A is unique, apart from the
order in which the elementary blocks are written.

Proof It is clear that we can always permute the order of the diagonal
blocks of the Jordan normal form, since this amounts to a renumbering of
the roots of the characteristic polynomial of A. The argument to show that
no other change is possible parallels that given for the rational canonical
form in Corollary 13.3.2 above. If J = J(A) is a Jordan normal form of
A as constructed in the theorem, then the nontrivial invariant factors and
hence elementary divisors of XI - J must be the same as those of XI - A.
However, direct calculation shows that the invariant factors
corresponding to an elementary Jordan matrix J(λ, t) are 1, …, 1, (X - λ)ᵗ, which are
also the elementary divisors of XI - J(λ, t). Thus, if

    J′ = diag(J(λ′₁, t′(1,1)), …, J(λ′ᵢ, t′(i,j)), …, J(λ′_{s′}, t′(s′,r′)))

is an alternative Jordan normal form for A, we find that the nontrivial
elementary divisors of XI - A are (X - λ′ᵢ)^{t′(i,j)} for i = 1, …, s′ and
j = 1, …, r′.
This forces the equalities s = s′, r = r′, and λᵢ = λ′ᵢ for all i and
t(i,j) = t′(i,j) for all i, j. □

13.5 A comparison of computations

The methods for computing the two normal forms of a matrix, the rational
canonical form and the Jordan normal form, are rather different, in that
the calculation of the rational canonical form is purely algebraic, while the
calculation of the Jordan normal form is not.
Our theory shows that to find the rational canonical form of a matrix A,
we must compute the invariant factors of the characteristic matrix XI - A.
We can do this by finding the Fitting ideals of XI - A and then using
Lemma 11.1.1, or, alternatively, we can use elementary row and column
operations as in Theorem 10.5.1. In either case, it will be arduous to
perform the calculations by hand unless the matrix A is small, or has some
special form, since we must work in the polynomial ring F[X].
However, the calculations are purely algebraic in the sense that we need
only use the operations of addition, multiplication and long division in
the ring F[X]. They are also algorithmic in that a computer can be
programmed to perform them.

In contrast, it may happen that a matrix with entries in a given field F
does not have a Jordan normal form over that field, since the irreducible
factors in F[X] of the characteristic polynomial det(XI - A) may not be
linear. For example, the real matrix

    A = ( 0 -1 )
        ( 1  0 )

has characteristic polynomial X² + 1, which is irreducible over the real
numbers.
This failing can be remedied in two ways. Given a polynomial f(X) over
a field F, it is always possible to embed F in a splitting field E in which
all the irreducible factors of f(X) are linear (Proposition 2.13.1). We can
therefore construct a field over which A does have a Jordan normal form
by adjoining roots of the characteristic polynomial of A to F.
Thus if we adjoin i = √-1 to R, obtaining the complex numbers C,
then the matrix A above has the Jordan normal form

    ( i  0 )
    ( 0 -i ).

The second, more drastic, method is to embed F in an algebraically
closed field E. By definition, E has the property that the only irreducible
polynomials over E are the linear polynomials X - λ. This property can be
restated as "every nonconstant polynomial over E has a root in E". It is
a fact that any field can be embedded in an algebraically closed field; see
§6.1 of [Cohn 2].
The field C of complex numbers is the most familiar example of an
algebraically closed field, but some analytic tools are required to establish
this fact. A proof is given in §7.4 of [Cohn 2]. Moreover, the evaluation in C
of the roots of a polynomial over Q (for example) cannot always be carried
out algebraically, as is evidenced by the existence of quintic polynomials
that cannot be solved by radicals.
Thus the Jordan normal form cannot be computed purely by algebraic
computations in the polynomial ring F[X] save in special circumstances.
Notice that the rational canonical form of a matrix A is computed
through the invariant factors of the module M(A), but that the Jordan
normal form requires instead the elementary divisors of M(A).

13.6 The Jordan normal form: nonsplit case

We next derive the version of the Jordan normal form which can be found
even when the characteristic polynomial of A does not split into linear
factors over the coefficient field F. We call this form the nonsplit Jordan
normal form to distinguish it from the standard version of the Jordan
normal form.
normal form.
As usual, let M(A) be the F[X]-module associated to a square matrix
A. The primary decomposition of M(A) as an F[X]-module expresses
M(A) as a direct sum of components which are (isomorphic to) cyclic
modules F[X]/F[X]p(X)ᵏ, where p(X) varies through the irreducible factors
of det(XI - A).
On general principles, A is similar to a block diagonal matrix J⁺(A)
whose diagonal terms correspond to the action of X on the various cyclic
components of M(A). We shall give a description of a typical diagonal term,
which we designate J⁺(p, k).
It will be convenient to use a double-suffix notation to describe the basis
that we construct. Write p(X) = p₀ + p₁X + ⋯ + Xʰ, so that deg(p) = h,
and put

    w_{1,1} = 1̄, w_{1,2} = Xw_{1,1}, …, w_{1,h} = Xʰ⁻¹w_{1,1}.

Notice that if we reduce mod p, these elements map to the canonical F-
basis 1̄, X̄, …, X̄ʰ⁻¹ of F[X]/F[X]p(X) as constructed in section 2.12.
Now for i = 2, …, k we define

    w_{i,1} = p(X)ⁱ⁻¹w_{1,1}, …, w_{i,h} = p(X)ⁱ⁻¹w_{1,h}.

Since the elements in "layer" i map to the canonical F-basis of

    F[X]p(X)ⁱ⁻¹/F[X]p(X)ⁱ ≅ F[X]/F[X]p(X),

the collection W = {w_{i,j}} spans M as an F-space. Since W has kh
members, it is therefore an F-basis of M.
The action of X on the basis elements is as follows. For j < h,

    Xw_{i,j} = w_{i,j+1}

whatever the value of i. For j = h and i < k we have

    Xw_{i,h} = Xʰw_{i,1} = -p₀w_{i,1} - p₁w_{i,2} - ⋯ - p_{h-1}w_{i,h} + w_{i+1,1},

and for j = h and i = k, we have

    Xw_{k,h} = Xʰw_{k,1} = -p₀w_{k,1} - p₁w_{k,2} - ⋯ - p_{h-1}w_{k,h}.

To write down the corresponding matrix J⁺(p, k), we take the basis
elements in the order in which we constructed them; that is, we give the
set of suffices

    (i, j), i = 1, …, k, j = 1, …, h,
13.6. The Jordan normal form: nonsplit case 205

the lexicographical ordering

    (1,1), (1,2), …, (1,h); (2,1), (2,2), …, (2,h); …; (k,1), (k,2), …, (k,h).

Thus the rows and the columns of J⁺(p, k) must also be labelled with
double indices (i, j) arranged in this order.
The column (i, j) of J⁺(p, k) gives the effect of X on w_{i,j}. For j < h,
we see that the (i, j)-th column is

    (…, 0; 0, 0, …, 0, 1, 0, …, 0; 0, …)ᵀ

where we exhibit the entries in the rows labelled (i,1), …, (i,h) (we have
transposed the column for convenience). The only nonzero entry is in the
(i, j+1), (i, j)-place.
For j = h and i < k, the (i, h)-th column is

    (…, 0; -p₀, -p₁, …, -p_{h-1}; 1, 0, …, 0; 0, …)ᵀ;

the entries -p₀, …, -p_{h-1} occur in the rows (i,1), …, (i,h), while the
entry 1 is in row (i+1, 1).
Finally, for j = h and i = k, column (k, h) is

    (…, 0; -p₀, -p₁, …, -p_{h-1})ᵀ;

the entries -p₀, …, -p_{h-1} occur in the rows (k,1), …, (k,h).


We can now describe the matrix J⁺(p, k) as a block matrix. Let

    C = C(p) = ( 0 0 ⋯ 0  -p₀   )
               ( 1 0 ⋯ 0  -p₁   )
               ( ⋮ ⋮    ⋮   ⋮   )
               ( 0 0 ⋯ 0  -p_{h-2} )
               ( 0 0 ⋯ 1  -p_{h-1} )

be the companion matrix of p and let Y be the h × h matrix

    Y = ( 0 0 ⋯ 0 1 )
        ( 0 0 ⋯ 0 0 )
        ( ⋮ ⋮    ⋮ ⋮ )
        ( 0 0 ⋯ 0 0 ).

Then

    J⁺(p, k) = ( C 0 0 ⋯ 0 0 0 )
               ( Y C 0 ⋯ 0 0 0 )
               ( 0 Y C ⋯ 0 0 0 )
               ( ⋮ ⋮ ⋮    ⋮ ⋮ ⋮ )      (13.5)
               ( 0 0 0 ⋯ C 0 0 )
               ( 0 0 0 ⋯ Y C 0 )
               ( 0 0 0 ⋯ 0 Y C ).
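A sketch (not part of the text) that builds J⁺(p, k) from the coefficients of
p, following Eq. (13.5); the companion block C and the matrix Y are
constructed directly, and SymPy's matrix slicing places the subdiagonal Y
blocks.

    import sympy as sp

    X = sp.Symbol('X')

    def J_plus(p, k):
        q = sp.Poly(p, X)
        h = q.degree()
        coeffs = q.all_coeffs()[::-1]        # p_0, ..., p_{h-1}, 1
        C = sp.zeros(h, h)
        for i in range(1, h):
            C[i, i - 1] = 1
        for i in range(h):
            C[i, h - 1] = -coeffs[i]         # companion matrix of p
        Y = sp.zeros(h, h)
        Y[0, h - 1] = 1                      # sends w_{i,h} to w_{i+1,1}
        J = sp.diag(*([C] * k))              # k copies of C on the diagonal
        for b in range(1, k):
            J[b*h:(b+1)*h, (b-1)*h:b*h] = Y  # subdiagonal Y blocks
        return J

    print(J_plus(X**2 + 1, 2))               # the 4 x 4 matrix J+(X^2 + 1, 2)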

The uniqueness of the nonsplit Jordan normal form is proved in the
same way as in the case of the ordinary Jordan normal form. The question
of computability amounts to that of the computability of the irreducible
factors of the characteristic polynomial, which again is not usually possible
by algebraic methods.

13.7 The Jordan normal form: separable case

Finally, we give a variation on the nonsplit Jordan form which is used in
[Green] to calculate the characters of elements of linear groups. This
variation does not appear to be discussed in any textbooks, at least, not at
a relatively elementary level.
We assume again that we are given an n × n matrix A with coefficients in
a field F and that some of the irreducible polynomials p(X) which occur in
the factorization of the characteristic polynomial det(XI - A) of A are not
linear. (If the irreducible factors are all linear, we simply regain the ordinary
Jordan normal form.)
The existence of this alternative form depends on a hypothesis about
the roots of the irreducible polynomials p(X). As we proved in Proposition
2.13.1, we can extend the field of coefficients F to a bigger field E in which
all the irreducible factors of the characteristic polynomial are linear, which
amounts to the same thing as saying that each polynomial p(X) has all its
roots in E.
Our hypothesis is that each irreducible polynomial p(X) has distinct
roots in E. The technical expression for this condition is that each p(X) is
a separable polynomial.
Notice that different irreducible factors p(X), q(X) of det(XI - A) are
permitted to have roots in common in E. An example of a non-separable
polynomial is given in Exercise 13.7.
The claim is that A is similar to a matrix Jˢ(A) whose diagonal blocks
13.7. The Jordan normal form: separable case 207

are matrices of the form

cI 0 0 • 0
■■ 0 0
c 0 •• • 0 0 0
0 I c ■■ ■ 0 0 0
J°(p,k) (13.6)
0 0 0 •• • c 0 0
0 0 0 •• ■ I c 0
\ 0 0 0 •• ■ 0
cC ) I
where C is the companion matrix of p and I is an h x h identity matrix,
h = deg(p). Such a block has the same form as the matrix J+(p, k) of Eq.
(13.5) above except that the matrix Y is replaced by an identity matrix I
throughout.
The matrix J s (^4) is called the separable Jordan form for A.
The claim will be established by showing that Js(p,k) is similar to
+
J (p, k) for any p and k. Since the polynomialp has distinct roots Ax,..., A/,
in the extension field E, we know that the Jordan normal form of the matrix
Cis
/Ax 0 ... 0 \
0 A2 ■•• 0
-l
A SCS
\ 0 0 ■■■ A,, /
where S is an h x h invertible matrix having entries in E.
Let T = diag(5 S ■ ■ ■ S) be a block diagonal matrix with k copies of S
on the diagonal. Then
A 0 0 • • 0 0 0
(
J A 0 • ■ 0 0 0
0 I A • ■ 0 0 0
1
J' = TJ°(p,k)T- = (13.7)
0 0 0 ■ ■ A 0 0
0 0 0 ■ • / A 0
^o 0 0 ■ • 0 I A /

We write the row and column indices of J' in the form


(i — \)h + j where j = I,... ,h and i = 1 , . . . , k
to conform with the partition of J' into h x h blocks. Thus, for fixed i,
the rows and columns labelled with (i — l)h + j , j varying, give the i-th
diagonal block of J', and the nonzero entries in J ' are

Aj in the ^Qpyh^hfedMate^ + J) " t h P l a c e


208 Chapter 13. Normal Forms for Matrices

and

1 in the (ih + j , (i - l)h + j ) - t h place for i = 1 , . . . ,k - 1.

Next, we permute the rows and columns of J' by moving row (i — l)h+j
to row (J — l)k + i and likewise moving column (i — l)h + j to (j — l)fc + i,
obtaining a new matrix J".
Let P be the hk x hk permutation matrix corresponding to the permuta­
tion of the rows, that is, P is the result of performing the row permutations
on an hk x hk identity matrix. Then the matrix P ~ x corresponds to the ef­
fect of performing the column permutations on an hk x hk identity matrix,
and
J" = PJ'p-\
We now observe that the nonzero entries of J " are

\j in the ((j: — l)k + i, (j - l)k + i)-th place, j = 1 , . . . ,h, i = l,...,k

and

1 in the ((j - l)fc + (i+ 1), {j - \)k + i) -th place for i = 1 , . . . , k - 1.

Thus J" is a diagonal block matrix, having h diagonal blocks corresponding


to the range of values of j , and the j-th block is the k x k matrix

A 0 0 • • 0 0 0
( J \
1 \j 0 • ■ 0 0 0
0 1 A, • • 0 0 0
Ai =
0 0 0 • A, 0 0
0 0 0 • 1 Xj 0
Vo 0 0 ■ ■• 0 1 ^ 1
which is evidently the elementary Jordan matrix associated to the poly­
nomial (X — \j)h- Hence J " is in Jordan normal form, associated to the
polynomial
(X-\1)k...(X-\h)k=p(X)k.
But then J" must be the Jordan normal form of the matrix J+(p,k),
+
since the characteristic matrices of both J" and J (p,k) have the same
invariant factors. It follows that J " and J+ (p, k) are similar, and, trac­
ing through the various similarities that we have used, that J" (p, k) and
J+(p, k) are similar as ™%mm®<fitiaterial
13.8. Nilpotent matrices 209

Finally, we have to prove that Js(p,k) and J+(p,k) are in fact similar
as matrices over F, that is, we can find a matrix Q with entries in F so
that QJs{p,k)Q~l = J+(p,k).
As matrices over E, the characteristic matrices of both J"{p,k) and
J+(p,k) have the same set of invariant factors. However, these invariant
factors already belong to F, since both matrices in fact have entries in F,
which shows that Js(p,k) and J+(p,k) are similar over F, by Theorem
13.3.3.

13.8 Nilpotent matrices


We use the Jordan normal form to show how some elementary matrix equa­
tions can be solved by listing the possible Jordan normal forms of a solution.
A square matrix A over a field F is nilpotent of exponent k if Ak = 0
for some integer k > 1, but Ak~l ^ 0.
Suppose that A is nilpotent. By Theorem 13.2.1, we know that the
minimum polynomial of A is the "highest" invariant factor 5n (X) of XI — A
and that Sn(X) divides Xk. Thus the nontrivial invariant factors of XI — A
take the form Xii for exponents t\,..., tr with t\ < ■ ■ ■ < tT, and so A has
Jordan normal form
J( J 4) = d i a g ( J ( 0 , t 1 ) , . . . , J ( 0 , t r ) ) .
An easy calculation confirms that each block J(0,ti) is nilpotent of expo­
nent ti, so J{A) is nilpotent of exponent tr and the exponent of its conjugate
A must also be tr.
Conversely, if J(A) has the form above, then J{A) and hence A are
nilpotent.

13.9 Roots of unity


Next, we seek matrix solutions of the equation Xk = I, I an identity
matrix. In the next chapter, we shall see how such solutions can be used
in determining the representations of a cyclic group (section 14.4; Exercise
14.3). For simplicity, we assume that A has a Jordan normal form J(A).
Since each invariant factor of XI- A is a divisor of the minimum polynomial
5n(X) of XI - A and, in turn, 5n(X) divides Xk - 1 (Theorem 13.2.1),
this hypothesis will be satisfied if Xk — 1 splits into linear factors in the
coefficient field F, that is, if F contains the fc-th roots of unity.
Clearly, the matrix A is a root of unity if and only if its conjugate J(A)
is a root of unity, so we can reduce the problem to that of determining the
elementary Jordan m a t r i © § p $ 7 Y $ ^ e j ' ^ £ f / a /
210 Chapter 13. Normal Forms for Matrices

Suppose J is such an elementary Jordan matrix, of size txt with t > 1.


An easy calculation shows that
/ Xk 0 •■
k
J = kXk~l Xk
\ : : '
so we must have
Afc = 1 and kX^1 = 0. (13.8)
The analysis now separates into two cases. These equations are incom­
patible if the characteristic of F is either 0 or a prime p which does not
divide k, since then k ^ 0 in F. Thus, J(A) must contain only l x l blocks,
that is, it is a diagonal matrix, and each "block" must be a fc-th root of
unity in F.
s Suppose, on the other hand, that F has nonzero characteristic p which
does divide k. We compute the powers of the elementary Jordan matrix J
in two steps. Write k = psh with h coprime to p and put
J = XI + L,
where XI is a scalar matrix and L is the obvious lower triangular matrix.
Since XI commutes with any matrix, and p divides all the binomial coeffi­
cients except the first and last, we have
JP = XPI + Lp,
and inductively
JP' = Xp'l + Lp'.
p 3
Notice that L ' = 0 <£> p > t.
Next, consider a matrix of the form p,I + W with W strictly lower
triangular. Then
{Hi + W)h = nhI + hp,h~lW + ■■■,
from which we see that
(/j,I + W)h = / o / = l a n d l V = 0,
since h ^ 0 in F.
Thus
(XI + L)p'h = 1^ Xp'h = 1 and ps > t.
Finally, observe that, in field of characteristic p,
Xp'h-l = (Xh - l ) p
\
from which we conclude that an elementary Jordan matrix with Jk = I is
a t x t matrix with t < g g A B / l i J J ^ ^ U ^ ^ / ^ - t n root of unity in F.
13.10. Further developments 211

13.10 Further developments


The results of this chapter depend very much on the fact that we work
with matrices over a field F. Section 8.5 of [Cohn: FRTR] gives some
results when F is a division ring; beyond this, it is very difficult to find
normal forms for matrices, even when the coefficient ring is commutative,
as can be seen from [G & L], [G, L & O] and [L & S].

Exercises
13.1 Let
-3-2 4 \
'■[ 4
0 - 1
1-4
1 /
be a complex matrix. Find the rational canonical form and Jordan
normal form of A, and show that they are the same as for
-1 0 0
B 0 0 -1
0 1 -2 )
Notice that B is not in rational canonical form, although it is a
diagonal block matrix made up of companion matrices. Explain.
13.2 Find the rational canonical form and Jordan normal form for each of
the following matrices:
2 0 0 0 \
1 0 1

A=| 0 0 M
1 ; B = ! o o 1
0
\
C
3 2 0 -2
0 0 2 0
Vo 0 0/ l l 0 o/ V0 0 2 2

and
/ 1 1 1 • • 1 1
0 1 1 • • 1 1
D 0 0 1 • • 1 1

V\00 00 00 ■■•■■ 00 11 / /
where D is an n x n matrix.
13.3 Repeat Exercises 13.1 and 13.2, but regarding the matrices as having
entries in the following fields in turn: R, Z2, Z3, Z5. (So this is 6 x
4 problems in one. You should always find a rational canonical form,
but there may be no Jordan normal form.)
Find the nonspliC<^9/dgfi^6rfiMa/6gJS/there is no Jordan form.
212 Chapter 13. Normal Forms for Matrices

13.4 Let A be a square matrix over a field F, and let AT be its transpose.
Show that the invariant factors of the characteristic matrices XI - A
and XI — AT axe the same. Deduce
(a) A is similar to AT;
(b) the F[X]-modules M(A) and M(AT) are isomorphic.
13.5 Let F be any field. Find the possible Jordan normal forms of an n x n
matrix A which satisfies the equation A2 = A.
s 13.6 Let p(X) be a separable irreducible polynomial over a field F , with

p(X) = (X-X1)...(X-Xh)

in some extension field E of F, and write

?i{X) = (X - Xx)... (X - A i - i ) ( * - A i+1 ) ...(X-Xh)

for i = 1 , . . . ,k.
Let J' be the matrix in Eq. (13.7), and let Mi be the (hk - 1) x
(hk - l)-minor of XI - J' formed by eliminating row 1 and column
h+1 of J'.
Show that Mi = p1.
Find (hk - 1) X (hk - l)-minors Mi of XI - J' with M{ = pi for
each i.
Deduce that Fithk-i(J') = 1, and devise an argument to show
that J' is similar to J+ (p, k) without using the ancillary matrix J " of
section 13.7.
s 13.7 This exercise shows that the hypothesis of separability is essential for
the equivalence of the separable Jordan normal form with the nonsplit
version of the Jordan normal form.
Let Q be any field and let Q[t] be the ring of polynomials over Q
in an indeterminant t. Further, let F = Q(t) be the field of rational
functions over Q, that is, the field of fractions of Q[t\. Verify that the
polynomial p(X) = X2 — t is irreducible over F.
Let E = F(y/t) be a splitting field for p(X). If the characteristic
of Q is not 2, that is, 2 ^ 0 in Q, then p(X) = (X + Vt)(X - yfi)
is separable, but if Q has characteristic 2, say Q = Z2, then p(X) =
(X — v^) 2 and p(X) is not separable.
/ 0 t 0 0 \
1 0 0 0
Let A which is the matrix Js(p,2) of Eq.
l o o t
\ 0 1 1 0J
(13.6).
Exercises 213

Show that

F i t i p f l - A) = Fit2(XI - A) = 1,

that Fit 3 (.X7 - A) has generators 2X, X2 + £ and X 2 - t, and that


Fit4(X/-J4)=p(X)2.
Confirm that if the characteristic of Q is not 2, then

Fit 3 (XZ -A) = l

and hence that A is similar to the nonsplit Jordan matrix J+(p, 2) of


Eq. (13.5).
Suppose that the characteristic of Q is 2. Show that

F i t 3 ( X I - A) = p(X)

and that the invariant factors of XI — A are 1, l,p(X),p(X). Deduce


that A is similar to the rational canonical form matrix
/ 0 t 0 0 \
l
R- ° °°
0 0 0 t '
V0 0 1 0 /
Chapter 14

Projective Modules
In this final chapter, our aim is to provide some contrast to the results that
we have obtained for the structure of modules over Euclidean rings. We do
this by taking a brief look at those modules which occur as a direct sum-
mand of a free module - these are the projective modules. If the coefficient
ring is Euclidean, then a projective module must be free, since any sub-
module of a free module is free. For rings in general, a projective module
need not be free, nor need a submodule of a free module be projective.
We also discuss the types of ring that are defined by imposing "pro-
jectivity" conditions on modules, and we show that one such class, the
Artinian semisimple rings, occurs naturally in the representation theory of
groups.
The material in this chapter is all supplementary to the original lee- g
ture course on which these notes are based. Besides the aim of placing
the Euclidean results in a wider context, the results and references given
here provide an introduction to some topics that might be included in an
"enhanced" MMath or MSci version of the course.

14.1 The definition


Let R be any ring. A left .R-module P is said to be projective if it is a direct
summand of a free module R1, where I is some index set that need not be
finite (Exercise 7.10). Thus there is a left R-module Q so that

P x Q S R1.

Since an external direct sum can be rewritten as an internal provided that


we replace the given modules by isomorphic modules (section 7.7), we see

215
216 Chapter 14. Projective Modules

that a module P is projective if (and only if) there is an internal decompo­


sition P' 8 Q' = R1 for some P' with P' = P as an R-module.

.EaiampZes.
(i) It is immediate from the definition that the ring R is itself a projective
left (and right) R-module, as is any free module R1.
(ii) The zero module is projective.
(hi) Let R be the ring of all n x n matrices over a field F , and for j = 1 , . . . ,n,
let Ij be the set of matrices whose entries are all 0 except in column j .
Then (Exercise 7.8) each Ij is a left H-module and
R = h ©-•■«!„.
Thus each summand Ij is projective. Note that the summands are all
isomorphic to one another as left .R-modules.
No Ij is a free .R-module, since dim(Jj) = n as a vector space over F,
while the dimension of any free R-module is a multiple of n 2 .
(iv) Let D be the ring of 2 x 2 diagonal matrices over a field F, and let e = en
and / = / n (see Exercises 1.7 and 7.4. Then D = De®Df', which shows
that De and Df are (nonisomorphic) projective D-modules. It is easy
to see that neither is a free D-module.
(v) The cyclic Z-module Z a , a > 1, is not projective. To establish this
fact, we argue by contradiction. Suppose that there is an isomorphism
6 : Z a x Q = Z1 for some module Q and index set I. The image of (1,0)
is a torsion element in Z 7 , since it is annihilated by a, and it is nonzero
since 6 is an injection. But Z 7 contains no nonzero torsion elements.
(vi) Let A be an n x n matrix over a field F, and let M be the F[X]-module
defined by X acting as A on Fn. Arguing as above, we see that M is not
projective as an .F[X]-module.
Before we give more examples, it will be useful to reformulate the defi­
nition in terms of the splitting of homomorphisms.

14.2 Split homomorphisms


Let M and P be left .R-modules and let 7r : M —► P be a homomorphism
of left .R-modules. We say that 7r is split by the .R-module homomorphism
cr : P -4 M if
7T<7 = idp,

where idp is the identity map on P. Note that a split homomorphism 7r


must be a surjection, since p = 7r(u(p) for all p in P. The existence of a
splitting leads to a direct sum decomposition of M with P as a summand.
14.2. Split homomorphisms 217

14.2.1 Lemma
Let M and P be left R-modules and suppose that n : M —► P is split by
a : P -> M. Then

M = Ker(7r) © a(P) with a(P) S P


and
M 3 Ker(7r) x P.
Conversely, if M = K x P for some module K, then there is a split
homomorphism IT : M —> P with K = Ker(7r).

Proof Let m e M. Then

m = [idM — air){m) + air(m),


and
■n{idM — cr7r)(m) = 7r(m) — 7r<77r(m) = 0,
so that
M = Ker(7r)+cr(P).
If m S Ker(7r) n a(P), then m = cr(p) for some p, and then
0 = 7r(m) = 7r<7(p) = p,

which gives m = 0 and hence


M = Ker(7r)© ( r(M).
It is clear that the map m —> cr(m) is an isomorphism from P to tr(P) and
that the map
a : M -)■ Ker(7r) x M
given by
a(m) = ((idM — o-Tr)(m),ir(m)),
is also an isomorphism of R-modules.
For the converse, let a : M = K x P be the given isomorphism. By the
definition of the direct sum, for each m in M we can write a(m) = (k,p)
for unique elements k of K and p of P . Define 7r by 7r(m) = p and <7 by
a(p)=a"1(0,p). □
We now obtain a very powerful characterization of projective modules.

14.2.2 Theorem
Let P be a left R-module over a ring R. Then the following statements
are equivalent.
218 Chapter 14. Projective Modules

(i) P is a projective R-module.


(ii) If 7T : M -> P is any surjective homomorphism of R-modules, then
■K is split.

Proof (ii) => (i). Let {pt \ i E 1} be any set of generators of P, where the
index set I may not be finite. An element x = (XJ) of the free module R1 is
a sequence of members Xi of R, indexed by / , and with only a finite number
of nonzero terms. Thus we can define a surjective .R-module homomorphism

9 : R1 —> P

by
6{x) = ^XiPi.
iel
Since 9 is split, P is projective.
(i) => (ii). By definition, we have
R1 ^PxQ

for some index set / and module Q. Let 9 : R1 —> P be the corresponding
surjective homomorphism and let ui : P —> R1 split 9. Let {ej} be the
standard basis of R1 (Exercise 7.10) and put pi = 9(ei) for each i, so that
{pi} is a set of generators of P.
Since n : M —> P is surjective, we can choose a set of elements {rrii} of
M so that
7r(m;) = pi for all i £ I.
Now define A : R1 —> M by the requirement that
A(ej) = rrii for all i e i ,
and write
a = \u>.
Then
7Tcr = TTXU) = OUJ = idp,

which shows that we have split 7r. □

14.2.3 Corollary
Let P be a left R-module over a ring R. Then P is a finitely generated
projective module if and only if there is a split surjection

R* —>P

where Rl is a free left moffefeyrfgfiflift ffffikrial



14.2. Split homomorphisms 219

14.2.4 Corollary
Suppose that R is a Euclidean domain. Then every finitely generated
projective R-module P is free.

Proof By the preceding result, we can view P as a submodule of a free


.R-module of finite rank, so the assertion follows from Theorem 12.1.1. □
E x a m p l e : a projective nonfree ideal.
Next, we look at one of the basic examples in number theory. Let
R = Z[\/^5], and let / be the ideal 3R + (2 + v7—5)R- Unique factorization
does not hold in R, since 2, 3, 1 + \/—5 and 1 — A/—5 are distinct irreducible
elements of R with

2 • 3 = (1 + V^5) • (1 - v 7 ^ ) ;

furthermore, the ideal I is not principal (Exercise 2.7). However, if / were


free as an .R-module, it would have to be a principal ideal, since any R-
basis of / would also be a A'-basis of the field of fractions A = Q(v/—5) of
R and so could have only one element. It follows that / cannot be a free
R-module. Nevertheless, / is a projective R-module, as we will show.
We define 9 : R2 -» I by

fl(j)=3x + (2+ ^5)y,

so that 9 is a surjective .R-module homomorphism. To split 9, we define


to : I -> R2 by

u>(z) --
z I
1
An easy calculation confirms that the image of a? is in R?2 and that 9ui = idj.

E x a m p l e : a non-projective ideal.
Let R = F[X, Y] be the polynomial ring in two variables over a field F,
and let I = RX + RY be the ideal generated by the variables. It is easy to
show by direct calculation that / is not principal, and we now show that /
is not projective. This result provides a contrast to the fact that any ideal
of a Euclidean domain is principal, and hence free.
There is an evident presentation

9:R2^I

with
9(f,g)Dep^figftt9^IVfate^klf,9 € R.
220 Chapter 14. Projective Modules

By Theorem 14.2.2, it is enough to show that 6 is not split. We argue by


contradiction.
Suppose that 9 does have a splitting u. Then u{X) = (a, b) for two
elements a, b of R, and we must have
X = 8u(a,b) = aX + bY.

Write a = a0(X) + ai(X)Y + ■■■ + am{X)Ym, a polynomial in Y with


coefficients in F[X}. Comparing the coefficients of each term Yl in our
expression for X, we find that
m m-1
a=l + J2ai(x)Yl and
b=-Yiai+i(X)XYi.
i=l i=0

Similarly,
w(F) = (c,d)
with
n n—1
c = ^ C j ( X ) y ^ and d = I - d{X)X ~^2cj+1{X)XY:>.
3=1 3=1

Computing UJ{XY) in two ways, we see that

Yb = Xd

and hence that


X 6 fiX2 + RY,
which contradicts the fact that X and Y are independent variables.

14.3 Semisimple rings


An obvious question in the investigation of rings and modules is to ask what
can be said about a ring if we insist that all its modules are projective. We
need some definitions before we can state the results.
A left ideal / of a ring R is minimal if / is nonzero and there is no left
ideal J with
0 C 3 C /.
Then / is a simple .R-module.
A ring R is left semisimple if R is a direct sum

CopyrightedMaterial
14.3. Semisimple rings 221

where each I\ is a minimal left ideal of R, and A is some index set that
need not be finite.
The first result is as follows.

14.3.1 Theorem
A ring R is left semisimple if and only if every left R-module is projec-
tive.

Proof [Rotman], Theorem 4.13. □


If we impose a further condition on the ring, we obtain a very concrete
description of its structure. A ring R is left Artinian if any descending
chain
R D h D ■ ■ ■ D Ii D Ii+i D ■■■DO
of left ideals in R must have only a finite number of terms.

14.3.2 The Wedderburn-Artin Theorem


The following assertions are equivalent.
(i) R is a (left) Artinian semisimple ring,
(ii) There are division rings D%,..., D\. and integers n i , . . . ,Bfc so that
R = Mni(D1)x---xMnk(Dk),
a direct product of matrix rings. The division rings Di and the
integers rii are uniquely determined by R, apart from the order in
which they are listed.

Proof In one direction, the argument is not too difficult. A matrix ring
Mn(F) over a field F is Artinian since it is a finite dimensional vector space
over F, and any ideal is a subspace. Exercises 3.10 and 7.8 combine to show
that Mn(F) is semisimple. It is not hard to see that essentially the same
arguments work when F is replaced by a noncommutative division ring D
and that a direct product of rings is Artinian semisimple if its components
are.
The proof in the reverse direction is much harder. Full details can be
found in [Cohn 2], §4.6 or [B & K: IRM], §4.2. □
The structure of modules over an Artinian semisimple ring is transpar­
ent. For each matrix ring Mi(Di) above, let Si be its "first column", that
is, the set of matrices whose entries must be zero outside the first column.
Then Si is a left Mi(Di)-module and we can make Si into left R-module
by stipulating that the other components of R act trivially on Si. Proofs
of the following result c a ^ b e ^ u ^ ^ j h ^ ^ f ^ r e n c e s given above.
222 Chapter 14. Projective Modules

14.3.3 T h e o r e m
Let M be a finitely generated left module over an Artinian semisimple
ring R. Then there are non-negative integers ai,...»ftfe and an R-module
isomorphism
M = oiSi x • • • x akSk,
where a^Sj denotes the external direct sum of a^ copies of Si. (If some
Oi = 0, we take this to be the zero module.)
If N is a finitely generated left R-module with
TV^feiS1! x ••• xbkSk,
then
M = N <=s>
^=> ai = bi for i = \,...,k.
U

14.4 Representations of groups


We take a brief look at the connection between the representation theory
of groups and module theory. We show that that a representation of a
group corresponds to a module over a certain type of ring, namely a group
ring, and that a group ring is an Artinian semisimple ring provided that
the order of the group satisfies a certain condition.
First, we must make our definitions. Let G be a finite (multiplicative)
group and let F be a field. A representation of G over F is a map
p : G -y Mn(F)
from G to the ring oin x n matrices over F, with the following properties.
GRep 1: p ( l c ) = / „ , where 1Q is the identity element of the group and
I„ is the identity matrix.
GRep 2: p(gh) = p(g)p(h) for all g,heG.
Clearly, each matrix p(g) is invertible, with inverse p(g~x).
The group ring FG of G over F is defined as follows. Let k be the
order of G, and list the elements of G as 1 = gi,<fy,• • -,3k- Then FG is a
fc-dimensional vector space over F with basis {l,g2, ■ ■ ■ ,gk}, made into a
ring by using the multiplication in G together with the distributive laws.
Thus, for

x = xi+ x2g2 H V xkgk and y = j/i + j/ 2 02 H h yk9k in FG,


we have
k

xy
x
y== J2( 22 xhxxhXi)gj.
^2( Yl i)9i-
3= 1
24.4. Representations of groups 223

The identity element of FG is the element 1 = lp ■ (?i, where 1^ is the


identity element of F . For example, suppose that

C = ( c ) = {l,c,...,Cfc-1}

is the cyclic group of order k, with generator c. An element of FG has the


form
X = X i l + X2C-\ hXfcC f c _ 1

and the multiplication is derived from the rule ck = 1. If we put r? =


1+ cH + c fc_1 , then r?2 = fcr? and (1 - c)rj = 0.
Suppose we are given a representation p : G —► Mn(G). Each element <?
of G acts on the vector space F n by the rule

g ■ v = p(g)v for all v 6 Fn,

and we extend this action to all of FG by linearity. It is easy to check that


Fn has become an FG-module, with "G acting through p".
Conversely, suppose that M is an FG-module and that M has finite
dimension n as a vector space over F. For each g in G, we let 9(g) be the
linear transformation of M that is given by

9(g) : m —> gm for all m e M,

and we define a representation by choosing a basis E of M as an F-space,


and taking
P(9) = (0(9))E,E,

the matrix of 9(g) with respect to E. Condition GRep 1 is satisfied because


the identity element of FG acts as the identity operator. For condition
GRep 2, we note that for any g, h in G, 9(gh) = 9(g)9(h) since (gh)m =
g(hm) for all m in M, and then that

(0(gh))E,E = (6(g))E,E(9(h))E,E

by the multiplicativity formula in Proposition 5.9.1.


A representation is said to be irreducible if the corresponding module is
irreducible, that is, simple.
A representation of a cyclic group C = (c) of order k is completely
determined by specifying a single invertible matrix A with Ak — In, since
it is enough to know the matrix p(c). The action of A also makes F n into a
module over the polynomial ring F[X] as in section 3.3, and so calculations
such as those in section 13.9 can be exploited to reveal the representation
theory of cyclic groups, ffiopyragfeted AteferJBfe considered in the exercises
224 Chapter 14. Projective Modules

to this chapter. A detailed introduction to representation theory can be


found in [J & L].
The promised connection between representation theory and Artinian
semisimple rings is given by the next result.

14.4.1 Maschke's Theorem


Let G be a finite group of order k and let F be a field in which k ^ 0
(that is, the characteristic of F does not divide k). Then the group ring
FG is Artinian semisimple.
Remark: Thus the complex group ring CG is Artinian semisimple for any
finite group G.

Proof A left ideal of FG is also an F-subspace of FG, and, since FG has


finite dimension, any descending chain of left ideals must therefore be finite.
Hence FG is Artinian.
Let M be any left FG-modu\e. As in the proof of Theorem 14.2.2, there
is a surjective FG-module homomorphism
9 : {FG)1 -> M
for some index set /. We now invoke the fact that any vector space over
a field has a basis, even the space is not finite dimensional ([Conn 2], §1.4;
[B & K: IRM], Theorem 1.2.20). Since 9 is, in particular, an F-linear trans­
formation, we can define an F-linear transformation w : M —> {FG)1 that
splits 9 simply by making an appropriate choice for the values of w on the
members of some basis of M.
Now define <j> : M -> {FG)1 by

<t>{m) = - 2 J /i -1 (w(/im)) for each m € M.

For each g in G, we have

<t>{gm) = T X] (^rV((%) m )) = ^(m)>


gh€G

so that <f> is an FG-homomorphism, and it is easy to see that p<j> = zdjvf-



14.5 Hereditary rings
The success of the structure theory of modules over a Euclidean domain
stems from two facts. OQQffir$0e§ /f^gjj^g/generated module M has a
Exercises 225

presentation, that is, there is a surjective .ft-module homomorphism p from


a free module R* to M, whose kernel K is, by definition, the relation module
(section 9.1). The other is that a submodule of a free module is again free,
with a convenient basis (Theorem 12.1.1), so that we have a tight control
over the relations for M. There is no difficulty in making the definition of
a presentation for a module over an arbitrary ring, but, as the examples in
section 14.2 show, we cannot expect that the corresponding relation module
will be a free module, or even projective.
The following question then arises: "Are there interesting rings for which
each submodule of a free module is projective, even if not necessarily free?"
The answer is a resounding yes. A ring R is said to be left hereditary if
every left ideal of R is a projective .R-module. It can then be shown that ev-
ery submodule of a projective left R-module is again projective ([Rotman],
Corollary 4.18).
Artinian semisimple rings are hereditary, as are Euclidean domains.
If a commutative domain is hereditary, then the ring is, by definition, a
Dedekind domain. There are several alternative definitions of a Dedekind
domain, some of which can be found in [Rotman], Chapter 4. The impor-
tance of Dedekind domains stems from the fact that they arise in algebraic
number theory as rings of integers, a very special example being the ring
Z[V—5]. A discussion of Dedekind domains and their module theory can
be found in Chapters 5 and 6 of [B k K: IRM].
In the noncommutative case, the theory of Artinian hereditary rings is
very complicated, as can be seen from Chapter 8 of [A, R & S]. There are
noncommutative generalizations of Dedekind domains, from the point of
view of ring theory in [McC & R], Chapter 5, and from the point of view
of number theory in [Reiner].
There are many questions of a similar type, for instance: "If a relation
module K is not projective, we can take a presentation of K and look at
its relation module, K\ say. Is K\ projective?" "What can be said about
rings for which K\ is always projective?"
The answers to these questions require the tools of homological algebra,
and so they are beyond the scope of this first course. There are many
projects here for the enthusiastic investigator. A good introduction is given
by [Rotman].

Exercises
14.1 Let R = F[X, Y] be the polynomial ring in two variables over a field
F, and regard F as an .R-module with both X and Y acting as 0.
Show that the obvious presentation R—>F has kernel / = RX + RY,
226 Chapter 14. Projective Modules

which is not projective by the calculation in section 14.2.


Using Exercise 9.3, verify that the presentation

R2 ->L
(0 ^Xf + Yg,

has a projective kernel.


Remark: it can be shown that, for any R-module M which is not pro­
jective, we must reach a projective module after at most two iterations
of the process of taking a presentation module of a relation module.
In the language of homological algebra, R has global dimension 2.
More generally, if R is a polynomial ring in 771 variables over a field
F, then R has global dimension m.
14.2 Let F be a field and let F[e] be the residue ring F[T]/(T2), where
e = T; F[e] is called the ring of dual numbers over F. We can view F
as an F[£]-module with e acting as 0.
Show that F is not projective. Prove also that the obvious pre­
sentation 9 : F[e] -> F has kernel K = F.
Show further that if M is an F[T]-module on which T 2 acts as
0, then M is an F[e]-module, and vice versa. Hence find all finitely
generated F[e]-modules.
Remark: it can be shown that no matter which presentation of F
we take, the relation module is not projective; nor do we ever obtain
a projective module by taking a presentation of the relation module
and iterating this construction. In contrast to the polynomial rings
of the previous question, the ring of dual numbers has infinite global
dimension.
14.3 Let C = (c) be a cyclic group of finite order k. Verify that an FC-
module M is given by a module over the polynomial ring F[X] on
which the variable X acts as a fc-th root of unity, as in section 13.9.
Deduce that the irreducible (that is, simple) FC-modules correspond
to the irreducible factors of the polynomial Xk — 1.
Here are two extreme cases.
(a) Let F be the field of complex numbers C and let w be a primitive
fc-th root of unity in C. For i = 0,... ,k - I, let St be C regarded
as an CC-module with c acting as u>1.
Show that each S{ is a simple CC-module, and that S% and Sj are
not isomorphic if i jt j . Deduce that

CC^So x •■■ x Sk-i


as a CC-module.
Exercises 227

(b) Take k = p, a prime number, and put F = Z p , the field of p


elements. Write e = 1 — c in FC. Show that ep = 0 and hence that

FC = F[e] S F[X]/F[X]X p .

Deduce that C has only one irreducible representation over F.


14.4 Maschke's Theorem is false if the order k of G is 0 in F, that is, if the
characteristic p of K divides k.
Let rj = 5Z{5 I 9 S G}. Verify that n2 = kr) = 0.
Now consider the surjection 7r : F G -* FT] given by 7r(x) = r\x for
all x in FG. Suppose that 7r can be split by an FG-module homomor-
phism UJ, and let e = u){rf). Show that ge = e for all g in G and hence
that e = arj for some a in F . Deduce that irur = 0, a contradiction.
Hints and Solutions for
the Exercises
Here are the answers for the exercises that involve numerical calculations,
and some hints for the less obvious theoretical problems. If an answer is not
provided, then either the problem contains its own hints, or the solution is
a routine matter of verifying axioms.
1.4 Zg : Units 1,3,5,7, the remaining elements being zero divisors.
Zio: Units 1,3,7,9, the remaining elements being zero divisors.
1.6
(a) Note that F is a domain and use Lemma 1.6.1 twice.
(b) Use the preceding exercise, with A = F[Y}.
(c) Suppose / is an ideal which contains RX + RY properly. Then there
is an element / = /oo + Xh + Yk in I with /oo ^ 0. Since / — Xh — Yk
is in / , I contains the unit /oo, and so I = R. A generator / of RX + RY
would have to be divisible by both X and Y.
1.9 Write ey for the matrix with entry 1 in place i,j and entries 0
elsewhere, i,j = 1,2. Then, for any nonzero element x of the matrix ring
R = M2(F) and any pair of indices, ey = axb for some a, b in R. So every
element of R is in RxR and so R has no two-sided ideals except 0, R.
2.5 Let / be the given polynomial. Then
2.5 Let / be the given polynomial. Then
/ ( y ) = ((y + i ) " - i ) / ( ( y + i ) - i ) ,
/ ( y ) = ((y + i ) " - i ) / ( ( y + i ) - i ) ,
which has the form
which has the form
Y"-11 + fp-2Ypp-22 + ■ ■ ■ + fiY + p
Y"- + fP-2Y - + ■ ■ ■ + fiY + p
with p\fi for i = 1 , . . . ,p~ 2. Thus / is irreducible (Eisenstein) and so also
is/.
2.6 FfJf,
F[X, Y] has a non-principal ideal.

229
230 Hints and Solutions

2.7 An element r of R has the form r = a + fei/-5 for integers a, b.


Attempting to argue as with the Gaussian integers (section 2.4), we put
tp(y) = a2 + 5b2, and verify that for r,s 6 R,

<f(r) ■ <fi(s) = <f(rs)-

Now consider 3. We have <p(3) = 9, so any factor r of 3 in R must have


tp(r) = 1,3 or 9. If ip{r) = 1, then r = ±1 is a unit, and if tp(r) = 9, the
other factor of 3 is a unit. Since it is clearly impossible to have <p(r) = 3, we
see that 3 is irreducible. Similarly, 2 + \/5 and 2 - \/b are both irreducible.
Since the only units in R are ± 1 , no two of 3, 2 + y/b and 2 - \/5 are
associates. Thus 9 has two genuinely different factorizations in R, hence R
is not Euclidean.
2.9 2 = t(l — z)2 is essentially a square in Z[i], while 3 and 7 are
irreducible since there are no integer solutions of a2 + b2 = 3, 7. On the
other hand, 5 = (2 + i)(2 — t) is a product of two irreducible elements of
Z[i], which are not associates - their ratio is not a Gaussian integer. So the
list is { l - i , 3 , 2 + i,2 -i,7}, and

4200 = 2 3 • 3 • 5 2 ■ 7 = -i • (1 - if ■ 3 ■ (2 + i) 2 ■ (2 - i)2 • 7.

3.1 Let { p i , . . . ,pfc} be the first k primes, and let

a% ~Pi- ■ -Pi-iPi+i ■ ■ -Pk, i = 1, • ■ -,k.

Use induction on k. The result for k — 1 tells us that o i , . . . , a/t_i generate


PfcZ; since pfc and a^ are coprime, pjtZ + aj^Z = Z. If we omit a, for any i,
the remaining aj's can only generate pjL.
3.4 For any field F, a proper, nonzero, submodule of M must be one-
dimensional as a vector space over F, and so must be given by an eigenvector
of A. The eigenvalues of A are the roots of X2 + 1. Now take F = Z p , p
prime.
If p = 1 mod 4, Z p contains two square roots of —1. If we call them ±i
again, there are two one-dimensional submodules as in the complex case.
If p = 3 mod 4, Z p contains no square root of —1, so M has no submodules
apart from 0 and M, as in the real case.
If p = 2 (I bet you missed this!), X 2 + 1 = (X + l ) 2 = (X - l ) 2 . So there
is only one eigenvalue, A = 1 with eigenvector I ), and so there is a
unique one-dimensional submodule of M.
3.5 An n x n complex matrix A has at least one eigenvector and hence at
least one one-dimensional eigenspace. Thus M has at least one submodule
which is one-dimensiona]Q0isyrfc}ht(£hM3MriM is simple, it must itself be
Hints and Solutions 231

one- dimensional, that is, n = 1. Conversely, if n = 1, then M can have no


proper nonzero subspaces and hence no proper nonzero submodules.
3.6 Unchanged if we replace the complex numbers by an arbitrary field
of coefficients.
3.7
(a) The eigenvalues of B are 1,LO,U>2 where a; = exp(27ri/3) is a complex
cube root of 1. This gives three independent eigenvectors and three distinct
/ I \ / i \ / i \
1
one- dimensional subspaces C 1 1 ,C u> and C 1 J 1
2
V1/ V- 1 \ - /
(b) Almost any vector works; try e%, a standard unit vector.
Almost any vector works; try e%, a standard unit vector.
(c) Write w = I — 1 I. By direct calculation, B2w = —Bw — w, so
Write w = 1 —1 1. By direct calculation, B2w = —Bw — w, so
C{X 1 is spanned over C bv the vectors w, Bw, which are obviouslv linearlv
independent.
"] is spanned over C by the vectors w, Bw, which are obviously linearly
(d) The existence of one-dimensional subspaces depends on the exis-
tence of a nontrivial cube root of 1 in C, and will carry over to any field
which also has a nontrivial cube root of 1. Z7 has three roots of 1, namely
1,2, 4, so we get the "same" answer. Over R, Z2, or Z3, there is only one
one-■dimensional subspace.
4.9 Suppose L = M. There is invertible 2 x 2 matrix T giving the
isomorphism, so TA = BT. But then TA2 = B2T, hence T = -T, hence
I = —J, since an invertible matrix can be cancelled. But / ^ —I, so L is
not isomorphic to M.
Similar calculations rule out isomorphisms between all pairs except M and
Q; as B2 = E2 = —I, this approach gives no information.
So we compute matrices T with TB = ET, and we find that 1 )
x 0 -1
is such a matrix, and hence M = Q.
4.10 Direct computation. A homomorphism from L to Z is given by a
3 x 2 matrix with
/0 1 0 \
0 1
i 0 0 0 1 T,
- ( s l 0
\o 0 1 J
0 0 \
and so T = ( 0 0 I with t e R arbitrary. Thus Hom(L, Z) has dimen-
t t )
sion 1 as an K-space.
Clearly, the column space of T has dimension 1 or 0, so the nullity dim(Ker(T))
of T is 1 or 2. Hence T is jC^ttjeiQ^eg/tftfetef^urjective - see section 4.10.
232 Hints and Solutions

Hom(Z, L): similar.


4.11 0*6,(L) = L + Ker(0) and 0*0*(P) = P D lm(0). So 0*0.(0)
0*0„(Ker#), whether or not 0 is injective.
5.5 Since (idjvf)2 = id.M i
(idM)D,c ■ (idM)c,B = (idM)D,B >

which gives PD,c • Pc,B = PD,B by the result above. (Alternative proof:
calculate the product explicitly!)
5.6 Prom the above, PD,B = -PD.B-PE.BJ where

PE,B = ( _\ \ J and PE,D = ( _\ _23 J


Hence

PD I E = (PS.D)- 1 = ( _3X _ \ ) and P D , B = ( J _\ ) .

To confirm the calculation, note that b\ = di and 62 = 4di — c?2-


5.7 By Exercise 5.3, we have to determine when the matrix P = I 1
is invertible over the Gaussian integers Z[i\. By Theorem 5.12.1, P is in-
vertible if and only if det(P) = 2a - (2 + i) is a unit in Z[i], that is,

2a — 2 — i = 1, — l , i o r — i.

This gives
a = (3 + i)/2, (1 + i)/2,1 + i or 1.
But the first two of these are not allowed, since they are not Gaussian
integers, so the permitted values are a = 1 + i, 1.
5.8 Similar to the preceding exercise: we determine the polynomials
f(X) for which the matrix P = ( _ yiA j is invertible over R[X]. The
determinant is f(X) — 2g(X), and the units in R[X] are the nonzero real
numbers, so the permitted values of f{X) are r + 2g(X), r / 0 e R; g(X)
is arbitrary.
6.1 The initial parts are routine checking.
An element of Hom(Z p , Z q ) is determined by an integer r such that pr € Zq.
This means that q divides pr, and so q divides r since p, q are distinct prime
numbers (Proposition 2.8.2). Hence f = 0 in Z q , so the corresponding
homomorphism must be 0.
Since Z p (p) = Z p (p 2 ) = Z p , we have
Hints and Solutions 233

Z p2 (p) = {x | px e Zp 2 } = {py | y e Z} = pZ p 2 , thus Hom(Z p ,Z p 2) S


pZ p 2. (By Exercise 6.4, p.Z p2 S Z p .)
6.5 We have

( X 0 0 0 /o \
X 0 0 /i
0 o h
XI-A

0 0 -1 ^ /n-2
Vo 0 0 -1 /n_!+X /

Adding X times row n to row n — 1, then X times row n — 1 to n — 2, etc.,


does not change the value of the determinant, but converts XI — A to

0 0 0 f(X) \
f °
-1 0 0 0 /i + - + /„-iI"- 2 +X'- 1
0 -1 0 0 f2 + ■ ■ ■ + fn-lXn~3 + Xn~2
B =

0 0 -: 0 / n _2 + / „ - l X + X 2
0 0 -1 fn-l+X
Io
Expanding from the first row, det(B) = ( - l ) " " 1 / • ( - l ) n _ 1 = / .
6.6 In these pictures, "x" denotes the submodule xM of M.
(c): R/Rlpq •

Ip

0 •
234 Hints and Solutions

(d): R/Rp2q*

0 •

(e): R/Rp2q2 „ 2 ,

pq

p2q pq2 •

0 •
Hints and Solutions 235

6.7 Write a = p\...pk where each pi is irreducible - repetitions are


allowed. By Exercise 6.4,

RPi...pk-i/Ra^R/Rpk,

where R/Rpk is simple. So proceed inductively, taking M% = Rpi.. .pl/Ra


for i = 1 , . . . , k.
R itself has no composition series, since it has no simple i?-submodule - a
submodule of R is a principal ideal Rb for some fr / 0, and p ■ Rb C Rb for
any irreducible p of R.
7.1
(a): R/Rlpq = l(R/lpq) © pq(R/Rlpq), among others.
(b): R/Rp2q = p2(R/p2q2) © q(R/Rp2q2) - unique.
(c): R/Rp2q2 = p2{R/p2q2) © q2(R/Rp2q2) - also unique.
7.2 As we have seen in the solution to Exercise 3.7, TV has three one-
dimensional submodules, namely the eigenspaces of B corresponding to
the three cube roots of 1 in C. Thus TV must be the direct sum of these
submodules. Further, the decomposition is unique, since there are no other
one-dimensional submodules.
The answer is similar over Z7, since this field contains three distinct cube
roots of 1.
In the remaining cases, we cannot decompose TV into one-dimensional sub-
modules, since B does not have enough eigenspaces. Over R and Z2, we
can obtain a decomposition of TV by observing that

N^F[X}/F[X](XZ-1)

for any field F, which can be seen by a slight modification of the argument
in section 6.6.
Since X3 — l = (X — l)(X2+X+l) with the second term irreducible, we have
N = P®Q with P = (X2 + X + 1)TV one-dimensional and Q = {X - 1)TV
two-dimensional.
Over Z 3 , X 3 - 1 = (X - l ) 3 and so TV has no direct sum decomposition -
its submodules form a chain.
7.3 Define

a : Mi x ■ ■ • x Mfc ->■ M1/L1 x • ■ • x Mk/Lk

by
a(mi,...,mfe) = ( m l r . . , m t ) ,
and check that a is surjective with kernel L\ x ■ • • x Lk, so that the claim
follows from the Induced Sag^fi^epfcMsfefl^/
236 Hints and Solutions

7.4 Let 6 : De -> Df be a homomorphism, and note that

0(e) = 9(e2) = e6(e) = exf


for some x i n D.
7.10 An infinite set {&» | i € / } is a basis of a module M if
(i) for any m £ M, we have m = XZie/ ri^i> where at most a finite
number of scalars r* are nonzero;
(ii) J2i£i ribi = 0 if and only if all r, are 0.
The rest of the question is routine checking.
8.1 T(L) = L n T(M) and TP(L) = L D r p ( M ) .
T(0) need not be a surjection even if 9 is. Take 6 : Z -> Z 2 to be the
canonical surjection. Then T(Z) = 0 but T(Z 2 ) = Z 2 .
8.6 Let m € M. Then there is an index s with m* = 0 for all i > s.
Since i-rrii = 0 for each component m* of m, (s!)m = 0. Thus every element
is torsion.
On the other hand, given a nonzero integer a, let ea+i be the element of
M which has (a + l)-th component T € Z a + i and all other components 0.
Then aea+i =£ 0, hence a e" Ann(M). Thus Ann(M) = 0.
9.1 Multiply each relation

Pi = 7 i i « i + - - - + 7 t j e t
by the cofactor Fjk, k a fixed index, and then sum to obtain the relation

Tupi H h TtfcPt = det(r)e fc ,


using the formulas from Det 8, section 5.12.
Thus the generators m^ = ir(ek) have det(r)mjt = 0 for all k, which shows
that det(T)M = 0.
9.2 The Cayley-Hamilton Theorem.
If B is an n x n matrix over F with Bv = 0 for every v £ F*, then B = 0.
Now h(X) 6 Ann(M) means that h(A)v — 0 for all such v, hence h(A) = 0.
By the preceding exercise, if we take h(X) = det(XI — A), we have h(X) e
Ann(M), so that h(A) = 0.
9.5 Any irreducible divisor of the greatest common divisor of the set
q\,... ,qk would be a divisor of each of the q^s, which is impossible by
construction. Hence their GCD must be 1 and so 1 = wiqi H 1- Wkqk for
some elements w\,..., Wk of R.
Since r = r ■ 1 for any r in R, we see that qi,..., qk generate R. The
members of any proper subset oiqi,...,qk will all be divisible by some p*,
and so cannot generate R.
The (k - l)k/2 relations for R are the differences p^e* —pieu, h <i, written
in some convenient order
Hints and Solutions 237

Takew = w1eiH hwjtefc e Rh. Then7r(u;) 1, giving Rk = Ker(7r)©Rw.


The generators
z\ = ei -qiw,...,zk = ek qkw
of Ker(7r) are those suggested by Theorem 9.4.1.
The form of the presentation matrix follows simply by writing out w, and
the identity W\Zi + ■ ■ ■ + w^Zk = 0 is immediate.
/ I 0 0\
10.2 Invariant factor form 0 1 0 I, giving M = Z 2 .
\ 0 0 2 /
10.4 There are many correct answers for P and Q since you can perform
many different sequences of row and column operations. Here is one such
sequence.
r P Q
1 2
I I
2 1
1 2 1 0
I
0 -3 -2 1
1 2 1 0
I
0 3 2 -1
1 0 1 0 1 -2
0 3 2 -1 0 1

Then P l = P (pure chance), and e\ = ei + e 2 , e'2 = e 2 and /{ = / i , / 2 =


-2/i+/2.
The single generator is simply m 2 , with 4m 2 = 0.
2^i
/3
10.5 Again, many possible answers. In general, r = 3 6 ■ This
\2 ay 1
/ 1 0 -1 1
1 1\ "M /3
reduces to r" 0 4
3a/
with P ■■ -1 1
V-3 1
°3 / P-I = 3
V2
0 1
-1 1 )
Vo
2
- M S °7 )-
a = 0 : the matrix I" is in invariant factor form, so we can take P as above
- anticipating results from the next chapter, we have M = Z4 x Z, nonzero
generators m'1 = -mi - m 2 and m 2 = mi + m 2 + 7713.
1 0 / 1 0 -1 \ / 3 -1 0
a = l:A= 0 1 ,withP= 2 0 -3 ,P~1= 3 3 1
\ 0 0 / V - 9 1 1 2 / \ 2 - 1 0 )
' 1 1
and Q
0 1
238 Hints and Solutions

Then M = Z, generator m!, = m 2 .


/ 1 0 \ / 1 0 -1 \ / 3 -1 0 \
a = 2:A = 0 2 , with P = 2 0 -3 , P"1 = 3 -3 1 ,
\ 0 0 / \ 3 1 -6 / \ 2 -1 0/

- « - ( S -.)■
Then M = Z 2 x Z, generators mj = - m i - 3m 2 - m 3 and m 2 = ma-
ll.1 Fitj = 1, Fits = 3, Fit 3 = 18, confirming that St = 1, 52 =
3, 53 = 6.
11.2
(a) Fit 2 (r) = ± d e t ( r ) = 1; <5i<52 = 1, hence 5t = 52 = 1.
(b) Fiti = <5i = 1. Fit 2 has among its generators 2 x 1 1 - 2 x 1 0 = 2 and
4 x 0 - 7 x 1 1 so Fit 2 = 52 = 1 also. Expanding, say, from row 3, we get
det = - 2 so 53 = 2.
(c) Fiti = 1, Fit 2 = 3.
(d) Fiti = 1 and Fit2 is generated by 12,3o - 4 and 3a - 12, so that Fit 2
is given by the GCD of 3a, 4.
For a = 0, this is 4, for a = 1, this is 1, and for a = 2, this is 2.
11.3 Since the GCD of 3a, 4 must divide 4, the only possible values of
Fit 2 are 1, 2,4, so the values a = 0,1, 2 cover all the possibilities.
11.4 Fiti = 1 always. If a ^ 0, a(X — 1) is a 2 x 2 minor, and X — 1
is a factor of any 2 x 2 minor. So Fit 2 = X — 1, therefore 52 = X — 1, $3 =
(X - l)(X - b).
If a = 0, the distinct l x l minors are X — 1 and X — b, and the distinct
2 x 2 minors are {X - l ) 2 and (X - 1)(X - b).
If 6 ^= 1, we find that Fitj = 1, Fit 2 = X - 1 = 82 and 63 = {X - 1)(X - b)
again.
lib = l,61 = 52 = 53 = X-l.
11.5 The presentation matrix is

/ 0 6 6 0 \
T = l + i 4 5-i 2i \
\ 0 2(1 + *) 2£ 0 /

which has invariant factor form

/1+i 0 0 0\
A= 0 2 0 0 ) .
\ 0 0 6 0/

This can be found by using the elementary row operations "row 1 -H- row
2", "row 2 «-> row 3" cleverly, or by finding the Fitting ideals of T.
Hints and Solutions 239

11.6 Fiti = RX + RY is not principal and so there is no invariant


factor form. Fit 2 = (X 2 - Y2)R.
12.1 Let x = (xi,x2) £ M. Z = Zx precisely when Ann(x) = 0.
Suppose z 6 Ann(x). Then 2x1 = 0 € Z/Za and zx2 = 0 G Z. If x2 ^ 0,
then z = 0 and Ann(x) = 0, so Zx = Z. If x 2 = 0, then ax = 0, so Zx ^ Z.
Internal direct sum. First, we must have 0 = Zx n Zy. Write y = (2/1,2/2)-
Then ay2x = (0,ax2y2) = ax2y is in the intersection, so 0x22/2 = 0 in Z.
But 0x2 ^ 0, so 2/2 = 0 (which guarantees that the intersection is zero).
Next we must have M = Zx + Zy. In particular, m2 = bx + cy for some
integers b, c. Looking at the second components, we get 1 = CX2 in Z, so
x2 = ± 1 .
We also have to be able to write mi = dx + ey for some integers d, e. This
gives dx2 + ey2 = 0 and d = ±ey2, so that

1 = ±ey2xi + eyi € Z/Za.

But for any given values of Xi and y2 we can find a value of 2/1 with ±2/2^1 +
2/i invertible in Z/Za. Thus x = (xi, ±1) is the general form for x.
12.2
(a) M = Z2 is 2-primary.
(b) M = Z 3 is 3-primary.
(c) For a = 0, M = Z 4 is 2-primary; for a = 1, M = 0; and for a = 2,
M is 2-primary again.
12.3
• If a / 0, the invariant factor decomposition is

M S F[X]/F[X](X - 1) x F[X]/F[Jf](X - 1)(X - 6).

If also 6 = 1 , this is the (X - l)-primary form of M and the elementary


divisors are (X — 1), (X — l ) 2 .
if M i ,
F[X]/F[X](X - 1)(X - 6) S F[X]/F(X]{X - 1) x F[X}/F[X}(X - b),

so M has (X — l)-primary component


F[X]/F[X](X - 1) x F[X]/F[X](X - 1)

and (X - 6)-primary component F[X]/F[X](X - b).


• If a = 0 and b ^ 1, we find that the invariant factor decomposition is as
before, so we get a similar analysis.
• If a = 0 and 6 = 1 ,
M = F[X]/F[X](X - C ? ^ S r f | ^ / M ^ ( « / - 1) x F[X]/F[X](X - 1).
240 Hints and Solutions

12.4 By the calculations for Exercise 11.5, the invariant factor form of
M is
Z [ t ] / ( l + t ) Z [ i ] x Z[i]/2Z[t] x Z[i]/6Z[i].
We have irreducible factorizations 2 = —t(l + i)2 and 6 = - 1 ( 1 + i)2 ■ 3, so
the 1 + i-primary component of M is
Z[t]/(1 + t)Z[i] x Z[t]/2Z[t] x Z[i]/2Z[t]
and the 3-primary component is
Z[t]/3Z[i]
all other primary components being 0. The elementary divisors are
1 + *, 2,2,3.
In the following solutions, "InFs" means invariant factors, "RCF" rational
canonical form, and "JNF" Jordan normal form.
13.1 By direct calculation, the InFs of
/ X +3 2 -4 \
XI-A = 4 X -1 4
-
V o 1 X-l I
are 1,1 and {X - l)(X + I)2 = X3 + X2 - X - I.
/ 0 0 1 \ / 1 0 0 \
RCF: 1 0 1 , JNF: 0 - 1 0 .
\ 0 1 -1 ) \ 0 1 -1 /
B turns out to have the same InFs, so the same RCF, JNF. But B itself
is not in RCF: the l x l block is the companion matrix of the polynomial
X — 1 and the 2 x 2 block is the companion matrix of X2 + 2X + 1, which
is not a multiple of X — 1.
13.2
0 0 0 \
A: InFs 1,1, X 3 , RCF & JNF: : 0 0 I
\
\o i o)
B: InFs 1,1,Xs - 1, RCF:
( 0 0 1\
1 0 0 JNF:
1 0
w
°\ where
b 0 0

u) = exp(27ri/3).
Vo i o v° 0 2
- y
C: I n F s l , l , X - 2 , ( X - 2 ) 3
/ 2 0 0 0 \ / 2 1 0 0 \
0 0 0 1 0 1 0 1
RCF: JNF:
0 1 0 - 4 0 2 1 0
\ 0 0 1 4 ) Copyrigntefl IVfategiali
Hints and Solutions 241

D: InFs 1 , . , . , 1, (X - l ) n . Put (X - l)n = Xn - fn^Xn~l /o;


then the RCF is and JNF are respectively

(° 0 0. .0 /o
h
\ / 1 0 0...0 0 \
1 0 0 . .0 1 1 0...0 0
0 1 0 . .0 h 0 1 1 ...0 0
and

0 0 0 . .0 fn-2 0 0 0 ... 1 0
0 0 . .. 1 fn-1 ) \ 0 0 0 ... 1 1 /

13.3 The calculations of the minors involve only integers, so we get


the "same" minors whatever the field of coefficients, but we must interpret
integers as their residues in Z 2 , Z 3 and Z 5 in turn. The Fitting ideals may
change.
For the matrix A of 13.1, the answer changes only over the field Z2, since
nowFit 2 (A) = X - l a n d F i t 3 ( J 4 ) = (X-l)3. The InFs are 1,-X-l, (X-l)2,
/ I 0 0\
the RCF is 0 0 1 , and the JNF is the "same".
\0 1 l )
For B of 13.1, the answers are all the "same". Note that B has a 2 x 2
identity submatrix, ensuring that Fit2(i?) = 1 for any field of coefficients.
For A,C,D of 13.2, the InFs, RCF and JNF are unchanged apart from
interpretation. Note that A, C have 2 x 2 identity submatrices, guaranteeing
Fit2 = 1 for any coefficient field, and D has an (n — 1) x (n — 1) submatrix
with determinant 1 over Z, giving Fit„_i(Z?) = 1 always.
For B of 13.2, the InFs are 1,1, X 3 - 1 whatever the field, so the RCF
is unchanged. The JNF will depend on the factorization of X3 — 1 in
the field of coefficients, which will vary. Over R,Z 2 and Z 5 , X3 - 1 =
(X - l)(X2 + X + 1) with the second factor irreducible. So B has an RCF
(left to the reader) but no JNF.
Over Z 3 , X3 - 1 = {X - l ) 3 , so B has an RCF and a JNF.
13.4 Since (XI - A)T = XI - AT, the h x h submatrices of XI - AT
are all transposes of h x h submatrices of XI — A, and vice versa.
Since determinants are not changed by transposition, XI — AT and XI — A
have the same Fitting ideals and so the same invariant factors. By (13.3.3),
this means that A is similar to AT and hence that the F[X]-modules M(A)
and M(AT) are isomorphic.
13.8 It is enough to a^,fMM^?&M^fM, with the desired property.
242 Hints and Solutions

Suppose J 2 = J: displaying 3 rows and columns, we have

/ A2 0 0 / A 0 0
2A A2 0 " ^ 1 A 0
2
1 2A A • 0 1 A

giving
A2 = A and 2A = 1.
These equations are incompatible unless the matrix J is 1 x 1, in which
case A = 0,1 work.
Ir 0
So, collecting all the l's together, the possible JNF's are for
0 0
identity matrices of size r = 0 , l , . . . , n .
No hints for Chapter 14-'
Bibliography

[Allenby] R. B. J. T. Allenby, Rings, Fields and Groups, 2nd edition, Ed­


ward Arnold, London 1991.
[A k L] D. M. Arnold k R. C. Laubenbacher, Finitely generated modules
over pull-back rings, J. Algebra 184, (1996) 304-332.
[A k McD] M. F. Atiyah k I. G. Macdonald, Introduction to Commutative
Algebra, Addison-Wesley, Reading, Mass., 1969.
[A, R & S] M. Auslander, I. Reitun & S. O. Smal0, Representation Theory
of Artin Algebras, Cambridge studies in advanced mathematics 36,
Cambridge University Press, Cambridge, 1995.
[B k K: IRM] A. J. Berrick k M. E. Keating, An Introduction to Rings
and Modules, Cambridge University Press, to appear.
[B k K: CM] A. J. Berrick k M. E. Keating, Categories and Modules,
Cambridge University Press, to appear.
[Cohn 1] P. M. Cohn, Algebra, volume 1, 2nd edition, John Wiley k Sons,
Chichester, 1982.
[Cohn 2] P. M. Cohn, Algebra, volume 2, John Wiley k Sons, Chichester,
1979.
[Cohn: FRTR] P. M. Cohn, Free Rings and their Relations, 2nd edition,
Academic Press, London, 1985.
[E, L k S] R. B. Eggleton, C. B. Lacampagne k J. L. Selfridge, Euclidean
Quadratic Fields, Amer. Math. Monthly 99 (1992) 829-837.
[Euclid] Euclid's Elements, volume 2, Dover, New York, 1956.
[Grayson] D. R. Grayson, SK\ of an interesting principal ideal domain, J.
Pure Appl. Algebra 20 (1981) 157-163.
[Green] J. A. Green, The characters of the general linear group, Transac­
tions Amer. Math. Soc. 80 (1955) 402-447.
[G k L] R. M. Guralnick k L. S. Levy, Presentations of modules when
ideals need not be principal, Illinois J. Math. 32 (1988) 593-653.

243
244 Bibliography

[G, L & O] R. M. Guralnick, L. S. Levy & C. Odenthal, Elementary divisor


theorem for noncommutative PID's, Proc. Amer. Math Soc. 103 (1988)
1003-1012.
[H & W] G. H. Hardy & E. M.Wright, An Introduction to the Theory of
Numbers, 5th edition, Oxford University Press, Oxford, 1979.
[Ischebeck] F. Ischebeck, Hauptidealringe mit nichttrivialer Si^-gruppe,
Arch. Math. (Basel) 35 (1980), no. 1-2 138-139.
[Jacobson] N. Jacobson, Basic Algebra I, 2nd edition, W. H. Freeman, New
York 1985.
[J & L] G. D. James & M. W. Liebeck, Representations and Characters of
Groups, Cambridge University Press, Cambridge, 1993.
[L k. S] R. C. Laubenbacher k. B. Sturmfels, A normal form algorithm for
modules over k[x,y]/(xy), J. Algebra 184 (1996) 1001-1024.
[McC &: R] J. C. McConnell & J. C. Robson, Noncommutative Noetherian
Rings, Wiley-Interscience, John Wiley, Chichester, 1987.
[Mac Lane] S. Mac Lane, Homology, 3rd corrected printing, Springer-
Verlag, Berlin, 1975.
[Marcus] D. A. Marcus, Number Fields, Universitext, Springer-Verlag,
Berlin, 1977.
[Reiner] I. Reiner, Maximal Orders, Academic Press, London, 1975
[Rotman] J. J. Rotman, An Introduction to Homological Algebra, Academic
Press, Boston, Mass., 1979.
[Rowen] L. H. Rowen, Ring Theory, volume I, Academic Press, Boston,
Mass., 1988.
[Sharp] R. Y. Sharpe, Steps in Commutative Algebra, London Math. Soc.
Student Texts 19, Cambridge University Press, Cambridge, 1990.
Index

Index of Symbols. These are grouped according to the type of object to which they relate, as far as possible.

Ideals.

I ∩ J: intersection, 9
(a, b): GCD, 22
Ann(M): annihilator, 125
Ann(x): annihilator, 61
I + J: sum of ideals, 9
Ra, aR: principal ideals, 8

Rings.

R, S: rings, 3
F[X]: polynomial ring, 6
M_n(R): n × n matrices over R, 4
C: the complex numbers, 4
Q: the rational numbers, 4
R: the real numbers, 4
Z: ring of integers, 4
Z[i]: Gaussian integers, 19
Z_m: residue ring mod m, 12

Modules.

L + N: sum of submodules, 42
L_1 ⊕ ... ⊕ L_k: internal direct sum, 112
M = L ⊕ N: internal direct sum, 108
M(A): module given by matrix A, 39
M/L: factor module, 92
R^I: free module on I, 121
T_p(M): p-primary component, 128
P_1 × ... × P_k: external direct sum, 114
rank(M): rank of module, 71

Homomorphisms.

≅: isomorphism, 63
Hom(M, N): set of homomorphisms, 65
Im(θ): image, 59
Ker(θ): kernel, 58
rank(T): rank, 60
a(a): mult. homomorphism, 54
T(X): mult. homomorphism, 54
id_M: identity homomorphism, 52
inc: inclusion homomorphism, 52
null(T): nullity, 60

Matrices.

P_{C,B}: change of basis matrix, 79
C(A): rational canonical form of A, 198
J(A): Jordan normal form of A, 201
J_+(p,k): nonsplit Jordan block matrix, 205
J_s(A): separable Jordan form, 207
J_s(p,k): separable Jordan block matrix, 207
diag(d_1, ..., d_n): diagonal matrix, 152

Others.

⊂: strict containment, 13
U(R): unit group, 5
|G|: order of group, 189
vol(L): volume, 188

Index of terms.

abelian group, 2, 37, 186
action
  of a linear transformation, 55
  of a matrix, 38
addition, 1
additive group, 1, 37
adjoint, 86
algebraically closed field, 203
algorithm
  for diagonalization, 156
annihilator, 124
  of an element, 61
  of module, 125
Artinian, 221
associated matrices, 155
associates, 23
basis, 71
  change of, 77
  infinite, 236
  of vector space, 44
  standard, 70
block diagonal action, 113
canonical basis, 30
canonical homomorphism, 92
Cartesian product, 114
Cayley-Hamilton Theorem, 143
characteristic matrix, 142
characteristic polynomial, 46, 104, 142, 143, 167, 195
Chinese Remainder Theorem, 116
cofactor, 85
column operations
  elementary, 146
column space, 60
common denominator, 76
commutative
  group, 2
  ring, 4
companion matrix, 100, 196
complement, 108, 112
component, 108, 112
composition series, 104
congruence, 10
conjugate matrix, 194
coordinate vector, 80
coprime, 23
cyclic decomposition, 178
cyclic group
  representations of, 226
cyclic module, 43, 61, 94
  over Euclidean domain, 97
  over polynomial ring, 100
decomposition, 108
Dedekind domain, 225
defining relations, 136
degree, 6, 18
determinant, 85
diagonal matrix, 152
diagram, 98
direct product of rings, 120
direct sum
  k components, 112
  external, 114
  infinite, 121
  internal, 108
distinct irreducible elements, 23
division algorithm, 18
division ring, 102
domain, 4
eigenspace, 46
eigenvalue, 46
eigenvector, 46, 58
  left vs. right, 66
Eisenstein's Criterion, 27
elementary divisors, 181
elementary Jordan matrix, 200
elementary operations
  matrix interpretation, 149
elementary row & column operations, 145
endomorphism ring, 65
entire ring, 4
equivalence problem, 145, 155, 167
equivalence relation, 11
equivalent matrices, 155
Euclid's algorithm, 22
Euclidean domain, 17
external direct sum, 114
factor module, 92
factorization
  standard, 25
  unique, 24
field, 5
  as Euclidean domain, 18
  of fractions, 5
  quotient, 5
finite dimensional, 44
First Isomorphism Theorem, 94
Fitting ideal, 163
free module, 71
  infinite, 121
  standard, 70
free resolution, 143
fundamental parallelepiped, 188
Fundamental Theorem of Algebra, 26
Gauss' Lemma, 27
Gaussian integers, 19, 26, 33
GCD, 21
generator, 43
  of ideal, 8
generators, 43
greatest common divisor, 21
group
  abelian, 2
  additive, 1
  commutative, 2
  multiplicative, 2
  representation, 222
group ring, 222
hereditary ring, 225
homological algebra, 143, 225
homomorphism
  canonical, 92
  identity, 52
  inclusion, 52
  induced, 93
  multiplication, 54
  of F[X]-module, 56
  of groups, 52
  of modules, 51
  of rings, 66
  product, 53
  product of, 53
  sum, 53
ideal, 7
  left, 7
  maximal, 13
  prime, 14
  principal, 8, 21
  proper, 7
  right, 7
  two-sided, 7
identity operation, 146
Idiot's Cancellation Lemma, 103
image, 59, 64
  inverse, 64
indecomposable module, 111
independent vectors, 44
induced homomorphism, 93
Induced Mapping Theorem, 94
infinite matrices, 122
injective, 59
integral domain, 4
intersection
  of ideals, 9
  of submodules, 42
invariant factor, 152
  standard choice, 167
  trivial, 195
invariant factor decomposition, 178
invariant factor form, 152
invariant factors
  of XI - A, 195
  of a module, 178
  uniqueness, 166
invariant subspace, 45
inverse, 4
inverse image, 64
invertible, 4, 63
invertible matrix, 158
irreducible, 23
  standard, 25
irreducible elements
  distinct, 23
irreducible representation, 223
isomorphism
  of modules, 63
  of rings, 15, 66
Jordan normal form, 201
  nonsplit, 203
  separable, 207
  split case, 200
Jordan-Hölder Theorem, 104
kernel, 58
Kernel and Image Theorem, 60
Lagrange's Theorem, 191
lattice, 187
  full, 187
leading term, 6
left ideal, 7
left module, 35
linear independence, 71
linear map, 52
linear transformation, 38, 52
long division, 18
Maschke's Theorem, 224, 227
matrix
  action, 38
  adjoint, 86
  change of basis, 79
  invertible, 85, 158
  of homomorphism, 82
  ring, 4
  root of unity, 209
  scalar, 39
minimal ideal, 220
minimum polynomial, 196
module
  p-primary, 127
  cyclic, 43, 61, 94
  defined by relations, 136
  factor, 92
  finitely generated, 44
  homomorphism, 51
  indecomposable, 111
  isomorphism, 63
  left, 35
  over Z, 37
  over commutative ring, 36
  over polynomial ring, 39, 55
  projective, 215
  quotient, 92
  regular, 36
  relation, 135
  right, 36
  simple, 41, 50
  sum
    k-fold, 43
  torsion, 124
  torsion-free, 124
  trivial, 40
  zero, 36
monic, 22
multiplicative group, 2
nilpotent matrix, 209
Noetherian ring, 143, 169
normal form, 193
null space, 60
nullity, 60
one-to-one, 59
onto, 59
order
  of a module, 191
ordered set, 121
orthogonal idempotents, 121
p-primary module, 127
  component, submodule, 128
polynomial
  constant, 6
  degree, 6
  monic, 22
  root of, 30
  split, 31
  zero, 6
polynomial ring, 5
presentation, 134
  diagonal, 175
  invariant factor, 175
presentation homomorphism, 140
presentation matrix, 140
primary decomposition, 179
prime ideal, 14
principal ideal domain, 32
product, 2
product of homomorphisms, 53
proofs in full detail, 8, 9, 42, 59, 63, 93
proper submodule, 41
purely algebraic, 193
quaternion algebra, 105
quotient module, 92
rank, 60, 70
  of a module, 178
  of free module, 71
Rank & Nullity Theorem, 60
rational canonical block matrix, 100
rational canonical form, 198
reducible, 23
reflexive, 11
regular module, 36
relation, 135
relation module, 135
representation of a group, 222
residue class, 11
residue ring, 10
residues
  of integers, 12
right ideal, 7
right module, 36
ring, 3
  commutative, 4, 36
  domain, 4
  factor, 10
  homomorphism, 66
  isomorphism, 66
  of diagonal matrices, 14, 216
  of dual numbers, 226
  of Gaussian integers, 19, 26, 33
  of integers
    as Euclidean domain, 18
  of matrices, 4, 8, 50
    infinite, 122
  of polynomials, 5, 29
    as ED, 18
    complex, 26
    over a finite field, 28
    rational, 27
    real, 27
  of scalars, 36
  of triangular matrices, 15
  quotient, 10
  residue, 10, 28, 29
  trivial, 3
  zero, 3
row & column operations
  general, 150
row operations
  elementary, 146
scalar matrix, 39
scalar multiplication, 35
Second Isomorphism Theorem, 103
semisimple, 220
separable Jordan normal form, 207
separable polynomial, 206
set of generators, 44
similar matrices, 194, 199
similarity problem, 193
simple module, 41
Smith normal form, 152
spanning set, 44
split homomorphism, 216
split polynomial, 31
splitting field, 31
standard basis, 60
standard factorization, 25
standard irreducible, 25
standard matrix units, 120
subgroup, 41
submodule, 40
  generated by set, 44
  intersection, 42
  of cyclic module, 96
  proper, 41
  sum, 42
  torsion, 124
  zero, 41
sum
  of homomorphisms, 53
  of ideals, 9
  of submodules, 42
summand, 108, 112
surjective, 59
Sylow subgroup, 187
symmetric, 11
Third Isomorphism Theorem, 103
torsion, 124
  module, 124
  submodule, 124
torsion-free, 124
transitive, 11
transpose matrix is similar, 212
trivial invariant factor, 195
trivial module, 40
two-sided ideal, 7
underlying space, 55
Unique Factorization Theorem, 24
unit, 4
unit group, 5
vector space, 36
  basis, 44
  spanning set, 44
volume, 188
Wedderburn-Artin Theorem, 221
zero ideal, 7
zero module, 36
zero submodule, 41