MATH325/425

Representation Theory of Finite Groups


Course Notes
Jan E. Grabowski
Department of Mathematics and Statistics
Lancaster University

[email protected]
http://www.maths.lancs.ac.uk/~grabowsj

31st January 2023

Contents
0 Categories

1 Group actions

2 Group representations

3 Linear representations
3.1 Equivalence of representations
3.2 Subrepresentations
3.3 Quotient representations
3.4 Direct sums

4 Algebras
4.1 Interlude—tensor products of vector spaces
4.2 Algebras, continued

5 The group algebra and group of units functors

6 Modules
6.1 Modules for group algebras

7 Character theory

A Burnside's p^α q^β theorem

B Groups, rings and vector spaces

Introduction
This is a first course on representation theory, with a particular aim of
describing the representation theory of finite groups. We will introduce
the notion of group representations, to set the context for the shift from
representations—which have certain disadvantages—to modules. We give
the basics of the theory of modules for a general algebra over a field, and
then specialise to the group algebra of a group, this being an algebra that
carries all the information of the group. We will finish with the theory of
characters of a finite group, which is a more concrete way of studying specific
representations of groups.
We will introduce and use the language of categories, but not very much
of the theory of these. This language is that of modern representation theory
and allows us to make more precise statements.
The learning outcomes for this module are as follows. On completion of
the module you should

(a) understand the basics of representation theory, such as the concept of
module, representation and, in the case of finite groups, character.

(b) understand the use of matrix groups in the study of representations
and the correspondence between group representations and modules
for the group algebra.

(c) know and be able to apply the main results on modules for group
algebras, such as Maschke’s theorem and Schur’s lemma, in particular
in the case of finite Abelian groups.

(d) know and be able to apply the main results on module homomorphisms,
in particular in the case of the decomposition of the group algebra
of a finite group.

(e) know and be able to apply the main results on characters, such as
the orthogonality relations and construction of the character table of
a finite group.

0 Categories
In this preliminary section, we will introduce some new language that will
help us talk about representation theory. Categories are (as I hope you
will see) a very natural way to express relationships among collections of
algebraic objects, in a way that in particular respects functions. We will not
need very much actual category theory but using the terminology will allow
us to make better, more precise statements.

Definition 0.1. A category C consists of

• a collection¹ of objects Obj(C) and

• for each pair of objects x, y ∈ Obj(C) a collection of morphisms C(x, y)

such that

(a) for each pair of morphisms f ∈ C(x, y) and g ∈ C(y, z), there exists a
composite morphism g ◦ f ∈ C(x, z) such that composition is associative:

h ◦ (g ◦ f ) = (h ◦ g) ◦ f

for all f ∈ C(w, x), g ∈ C(x, y) and h ∈ C(y, z); and

(b) for each x ∈ Obj(C), there exists an identity morphism idx ∈ C(x, x)
such that the left and right unit laws hold:

idy ◦ f = f = f ◦ idx

for all x, y and f ∈ C(x, y).

As you might expect, there are lots of examples.


Examples 0.2.

(a) There is a category with one object ∗ and one morphism id∗ .

(b) Let Q be the directed graph with two vertices v, w and a single edge
e : v → w. Then there is a category P(Q) (the path category of the graph)
with two objects, v and w, and P(v, v) = {idv }, P(v, w) = {e} and
P(w, w) = {idw }. If Q were any arbitrary directed graph, we would
take Obj(P) to be the set of vertices of Q and P(v, w) to be the set
of paths in Q from v to w. (Note that idv corresponds to the path of
length 0 from v to itself.)
¹ Here and elsewhere, we will say “collection” due to what mathematicians call “size
issues”. These occur because the collection of all things we might be interested in might
not form a set, depending on our logical foundations. You might recall that issues such as
Russell’s paradox can occur if care is not taken. We will not dwell on this but will signal
that care is needed, by using “collection” instead of “set”.

(c) Let G be a group. Let G be the category with one object ∗ and
G(∗, ∗) = G. Note that eG = id∗ .

But, I hear you say, these examples are all a bit... unnatural. Yes: this
makes the point that the definition of a category includes a huge range of
structures.
As for some more familiar examples:

(d) Let Set be the category with objects Obj(Set) the collection of all
sets and morphisms Set(X, Y ) = Fun(X, Y ), that is, the collection of
morphisms between two sets X and Y is the collection of all functions
from X to Y . Then you have known for a long time that composition
of functions exists, is associative and that we have identity functions.

(e) Let Grp be the category with objects Obj(Grp) the collection of all
groups and morphisms Grp(G, H) = Hom(G, H) the collection of all
group homomorphisms from G to H. Composition and its associ-
ativity are inherited from Set and the identity function is a group
homomorphism.

(f) Let Rng be the category with objects Obj(Rng) the collection of all
(not necessarily unital) rings and morphisms Rng(R, S) = Hom(R, S)
the collection of all (not necessarily unital) ring homomorphisms from
R to S. Composition and its associativity are inherited from Set and
the identity function is a ring homomorphism.

(g) Let Ring be the category with objects Obj(Ring) the collection of all
unital rings and morphisms Ring(R, S) = Homu (R, S) the collection
of all unital ring homomorphisms from R to S. Composition and its
associativity are inherited from Set and the identity function is a ring
homomorphism.

(h) Let K be a field and let VectK be the category with objects Obj(VectK )
the collection of all K-vector spaces and morphisms VectK (V, W ) =
HomK (V, W ) the collection of all K-linear maps from V to W . Com-
position and its associativity are inherited from Set and the identity
function is a K-linear map.

(i) Let Top be the category with objects Obj(Top) the collection of all
topological spaces and morphisms Top(X, Y ) = C 0 (X, Y ) the collec-
tion of all continuous functions from X to Y . Some work is needed
to prove that this is a category! (Don’t worry if you don’t know this
example: it won’t appear again in this course, and this is why it was
included, i.e. to illustrate that there are examples that are natural but
not algebraic.)

Most of our categories will be like these, with morphisms being some
structure-preserving map. Going forward, our usual convention for
morphism sets will be to write HomC (X, Y ) for morphisms from X to Y in the
category C, to remind us to think of homomorphisms. When X = Y ,
we have a special name for the resulting morphisms; namely, we write
EndC (X) := HomC (X, X).
Just as sets, groups, rings, vector spaces etc. have structure-preserving
maps between them, so do categories. Indeed, it was claimed by Mac Lane,
one of the founders of category theory, that being able to define and study
these was the whole point. Indeed, just as bijections or group or ring iso-
morphisms or invertible linear maps encode for us when two sets, groups,
rings or vector spaces are “the same”, so certain maps of categories will do
this too.

Definition 0.3. Let C and D be categories. A functor F : C → D is a map
that sends every x ∈ C to an object F x ∈ D and every morphism f ∈ C(x, y)
to a morphism F f ∈ D(F x, F y) such that

(a) F preserves composition: F (g ◦ f ) = F g ◦ F f for all composable f, g;
and

(b) F preserves identity morphisms: F idx = idF x for all x ∈ Obj(C).

It should be clear by comparing with the definition of a category that
this is the² natural definition that preserves the structure within a category.
The preservation conditions in the definition of a functor could be gathered
together into a slicker statement: that F preserves all commuting diagrams.
We will use commuting diagrams regularly in our definitions, so let us
expand on this a little.

Definition 0.4. A diagram in a category C is a collection of objects and
morphisms, some of which may be composable. We say that a diagram
commutes, or is a commuting diagram, if the compositions of the morphisms
along any two paths with the same start and end objects are equal.

Then saying that F preserves composition is equivalent to F preserving
all commuting triangles:

       f                                F f
  x ------> y                    F x ------> F y
   \        |          ⇒           \         |
    \       | g                     \        | F g
 g◦f \      v                F (g◦f) \       v
      `---> z                         `---> F z
² In fact, it is one of two such, though arguably the other looks less natural at first
sight. Namely, what we have defined is called a covariant functor; its counterpart, a
contravariant functor, satisfies F (g ◦ f ) = F f ◦ F g.

Similarly, preserving identity morphisms means preserving loops.
Then the collection of all categories together with functors between them
forms a category. Unsurprisingly, we are running headlong into logical and
set-theoretical issues here again, but we can avoid some of this, as follows.

Definition 0.5. A category C is called locally small if for all x, y ∈ Obj(C),
C(x, y) is a set. A category C is called small if Obj(C) is a set and it is
locally small.

Exercise 0.6. Which of the above examples of categories are small? locally
small?

Definition 0.7. The category Cat has as objects all small categories and
as morphisms Cat(C, D) all functors from C to D; this is indeed a category.

An important class of functors, especially given our examples above, is
that of forgetful functors. There is not a precise definition but the examples
give the idea: when an algebraic structure consists of a set with additional
data, there is a functor to the category Set sending every object to itself but
forgetting the extra structure. We also send morphisms, which we typically
take to be functions that preserve the additional structure, to themselves,
forgetting that they have extra properties. We don’t have to forget all the
structure, either. In this way we have forgetful functors as follows:

• F : Ring → Set

• F : Rng → Set

• F : Grp → Set

• F : VectK → Set

• F : Ring → Rng

• F : Ring → Grp

• F : Rng → Grp

• F : Top → Set

Most functors actually “do” something and we will see more non-trivial
examples later.
The first four of these and the last express that the categories in the
domain of F (i.e. our more familiar examples of categories) are what is called
concrete. The existence of these forgetful functors to Set is the precise way
to say that these categories consist of sets and functions with extra structure.
Especially in these examples, but also when we are generally feeling lazy,
we will often write x ∈ C to mean “x is an object of C” (i.e. x ∈ Obj(C)).

So for example we might say “let V ∈ VectK ” as a shorthand for “let V be
a K-vector space”.
We will conclude our whirlwind tour of “elementary category theory”,
if there is such a thing, by addressing the issue mentioned above of when
two categories should be considered to be “the same”. Perhaps surprisingly³,
there are several levels of sameness, and the most direct analogue of
isomorphism is not the right one.

Definition 0.8. Let C be a category. A morphism f : x → y is an isomorphism
if it is invertible, i.e. there exists g : y → x such that g ◦ f = idx and
f ◦ g = idy .
Given x, y ∈ C, if there exists an isomorphism f : x → y between them,
we say x and y are isomorphic and write x ≅ y. An isomorphism f : x → x
is called an automorphism of x and we write AutC (x) for the collection of
automorphisms of x ∈ C.

So the first idea we might have is to say that two categories C, D are the
same if there is an isomorphism F : C → D in Cat between them. That is, if
there are two functors F : C → D and G : D → C such that G ◦ F = idC and
F ◦ G = idD , where idC is the identity functor on C (sending every object to
itself and every morphism to itself).
However this is too strict a notion and isomorphisms of categories are
very rare. Instead, we should relax the conditions a little and say that G ◦ F
and F ◦ G should be almost the identity functors. To do this definition
properly requires the notion of a natural transformation of functors. For
expediency, we will give an equivalent definition that has the advantages of
both being easier to check and also more like the “injective and surjective”
condition we are used to in Set.

Definition 0.9. Let C and D be (small) categories and F : C → D a functor.
For x, y ∈ C denote by Fxy : C(x, y) → D(F x, F y) the function defined by
Fxy (f ) = F f .

(a) We say that F is faithful if Fxy is injective for all x, y ∈ C.

(b) We say that F is full if Fxy is surjective for all x, y ∈ C.

(c) We say that F is fully faithful if Fxy is bijective for all x, y ∈ C.

(d) We say that F is essentially surjective if for every object d ∈ D there
exists c ∈ C such that F c ≅ d.

(e) We say that F is an equivalence of categories if F is fully faithful and
essentially surjective.
³ Until one has spent a bit more time moving around the hierarchy of categories.

There is a special sort of equivalence, when the two functors have a
particular relationship, as follows.

Definition 0.10. Let C and D be categories and F : C ⇄ D : G be functors⁴.
We say F and G form an adjoint pair if for all objects c ∈ C and d ∈ D
there is a bijection

α : C(c, Gd) → D(F c, d)

which is natural in c and d.
If α−1 (idF c ) : c → GF c and α(idGd ) : F Gd → d are natural isomorphisms
in c and d, we say F and G form an adjoint equivalence.

An adjoint equivalence is, in particular, an equivalence. A technique
one may employ is to find adjoint pairs (using various general methods and
one may employ is to find adjoint pairs (using various general methods and
theorems) and then try to show they are actually adjoint equivalences. This
is helpful to identify candidate equivalences, just as it is helpful to know
some homomorphisms to be able to test for being isomorphisms; without
any candidates, it is often not clear what to do.
There is much more we could say about categories and their theory but
the above will suffice for now. So, we turn to motivating our main interest,
representation theory.

1 Group actions
When you were introduced to groups, you were probably told that groups
often arise as symmetries. The subtle but important shift from “what group
encodes the symmetries of this object?” to “which objects does this group
give symmetries of?” moves us from the structure theory of groups (which
is what you have studied before) to their representation theory.
The word “symmetry” needs a little unpacking before we can make much
progress—we need some formal definitions in order to state and prove things.
Informally, a symmetry of an object should move the points of that object
around, there should be a symmetry that leaves the object unchanged and
if we have two symmetries, we should be able to apply one and then the
other to obtain another symmetry. Very often, we want our symmetry to
be reversible; asking for this is what puts us in the domain of groups.
The most common way to encode this formally is in the notion of a group
acting on a set, or a group action. We will also do this first, very shortly, but
we will then show how this can be transformed into different formulations
that are better suited for posing the questions we will want to ask.
⁴ This notation is shorthand for F : C → D, G : D → C and emphasises that we are
interested in F and G as a pair. By itself, it makes no claims about G ◦ F or F ◦ G,
however.

Definition 1.1. Let G be a group and X a set. A left action of G on X is
a function α : G × X → X such that

(GA1) α(e, x) = x for all x ∈ X; and

(GA2) α(g1 g2 , x) = α(g1 , α(g2 , x)) for all g1 , g2 ∈ G, x ∈ X.

Exercise 1.2. Write down the definition of a right action.


Exercise 1.3. Write down and prove (using the axioms) a mathematical
statement that formalises the statement “we want a symmetry to be reversible”.
For ease of notation, it is common to write α(g, x) as g · x. However this
can be confusing—the dot · suggests multiplication but this might not make
sense for arbitrary G and X, and the visibility of the “leftness” of the action
is reduced. Instead we will use ., i.e. we will write g . x for α(g, x), so that
our axioms become

(GA1′) e . x = x for all x ∈ X; and

(GA2′) (g1 g2 ) . x = g1 . (g2 . x) for all g1 , g2 ∈ G, x ∈ X.

We will abuse notation a little by naming our action function . too, by
writing . : G × X → X, .(g, x) = g . x.
Examples 1.4. Important examples of groups acting on sets are:
 
(a) GL2 (R) acting on R2 via ( a b ; c d ) . ( x ; y ) = ( ax+by ; cx+dy )
(here and below, matrices and column vectors are written row by row, with
rows separated by semicolons);

(b) Sn acting on {1, 2, . . . , n}, σ . r = σ(r);

(c) G acting on itself via g . h = gh;

(d) G acting on itself via g . h = ghg −1 ;

(e) G acting on the left cosets of a subgroup H via g . kH = gkH.

Exercise 1.5. Check that these are actions.
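
As a quick computational sanity check (a minimal Python sketch, not part
of the original notes), one can verify (GA1) and (GA2) for example (b),
with a permutation σ ∈ S3 modelled as a tuple p satisfying p[r − 1] = σ(r):

from itertools import permutations

X = (1, 2, 3)
G = list(permutations(X))        # all 6 elements of S3, as tuples
e = (1, 2, 3)                    # the identity permutation

def compose(p, q):
    # (p q)(r) = p(q(r)), matching the convention sigma . r = sigma(r)
    return tuple(p[q[r - 1] - 1] for r in X)

def act(p, r):
    return p[r - 1]

# (GA1): e . x = x for all x in X
assert all(act(e, x) == x for x in X)

# (GA2): (g1 g2) . x = g1 . (g2 . x) for all g1, g2 in G and x in X
assert all(act(compose(g1, g2), x) == act(g1, act(g2, x))
           for g1 in G for g2 in G for x in X)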


Although we will not have so much direct use of the following in this
course, it would be remiss not to briefly introduce orbits and stabilizers and
the theorem that links them.

Definition 1.6. Let . : G × X → X be a left action of G on X. The orbit
of x ∈ X is the set

O(x) := {g . x | g ∈ G}.

The stabilizer of x ∈ X is the set

Stab(x) := {g ∈ G | g . x = x}.

The orbit of an element x ∈ X is a subset of X and the orbits partition
X, i.e. X = ⋃x∈X O(x) and, for any x, y ∈ X, either O(x) = O(y) or
O(x) ∩ O(y) = ∅. The stabilizer of x is a subgroup of G.
Exercise 1.7. Prove the claims in the preceding paragraph.
For a group G and subgroup H, we will write (G : H) = {gH | g ∈ G}
for the set of left cosets of H in G.

Theorem 1.8 (Orbit–stabilizer theorem). Let . : G × X → X be a left
action of G on X. For x ∈ X, there is a bijection

ϕ : O(x) → (G : Stab(x))

defined by ϕ(g . x) = g Stab(x) such that

h . ϕ(y) = ϕ(h . y)

for all y ∈ O(x).

Proof. We have

g . x = g′ . x
⇐⇒ (g′)−1 . g . x = (g′)−1 . g′ . x
⇐⇒ ((g′)−1 g) . x = x
⇐⇒ (g′)−1 g ∈ Stab(x)
⇐⇒ g Stab(x) = g′ Stab(x)

so that ϕ is well-defined (two representatives of y in O(x), g . x = g′ . x =
y, map to the same coset) and injective. Since given g Stab(x), we have
ϕ(g . x) = g Stab(x), ϕ is surjective too.
Then

h . ϕ(g . x) = h . (g Stab(x)) = hg Stab(x) = ϕ(hg . x) = ϕ(h . (g . x)).
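
(A small illustration in Python, not part of the original notes: for S3
acting on {1, 2, 3} one can check directly that |O(x)| equals the index of
Stab(x), i.e. |O(x)| · |Stab(x)| = |G|.)

from itertools import permutations

X = (1, 2, 3)
G = list(permutations(X))            # S3, with g[r - 1] = g(r)

def act(g, x):
    return g[x - 1]

for x in X:
    orbit = {act(g, x) for g in G}
    stab = [g for g in G if act(g, x) == x]
    # orbit-stabilizer: |O(x)| * |Stab(x)| = |G|
    assert len(orbit) * len(stab) == len(G)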

Extension 1.9. How would you define a morphism between two G-actions?
That is, given α : G × X → X and β : G × Y → Y , what would be an
appropriate notion of a morphism f : α → β?
Can you show that the collection Act(G, X) of (left) actions of G on X
with your notion of morphism is a category?

2 Group representations
Now we make our first reformulation, expressing group actions as group
representations.

Definition 2.1. Let G be a group. A G-representation on a set X is a
group homomorphism ρ : G → AutSet (X).
Definition 2.2. Let G be a group and let ρ, σ be G-representations on
sets X and Y respectively. A morphism of G-representations f : ρ → σ is a
function f : X → Y such that

f (ρ(g)(x)) = σ(g)(f (x))

for all g ∈ G.
The condition here is best understood as the commuting of the following
diagram:

          ρ(g)
    X ------------> X
    |               |
  f |               | f
    v               v
    Y ------------> Y
          σ(g)

Let Act(G, X) denote the collection of group actions of G on a set X
and let Rep(G, X) denote the collection of G-representations on X.
Extension 2.3. Show that Rep(G, X) with objects G-representations and
morphisms as above is a category.
The next proposition says that actions and representations are in one-
to-one correspondence.
Proposition 2.4. There is a bijection κ̂ : Act(G, X) → Rep(G, X).
Proof. We first sketch what is needed:
(a) note that Act(G, X) ⊆ HomSet (G × X, X);
(b) there is a very general operation κ called “currying” that takes a func-
tion f : A × B → C to a function κ(f ) : A → HomSet (B, C), defined
by κ(f )(a) = f (a, −);
(c) the special properties of group actions mean that we can tweak this to
replace Hom by Aut;
(d) the resulting functions κ̂(α) are group homomorphisms, again because
we have a group action, so we can regard κ̂ as a function taking actions
to group representations;
(e) κ̂ is a bijection.
It is also worth saying that the reason why what follows is quite technical
is because an action is not a group homomorphism: G × X is not even a
group.
Now, for the proof proper:

(a) Note that every left action of G on X is, by definition, given by a
function α : G × X → X, so that Act(G, X) ⊆ HomSet (G × X, X).

(b) By the principle of currying, we may define

κ : Act(G, X) → HomSet (G, HomSet (X, X))

by κ(α)(g) = α(g, −).


The “blank” notation, α(g, −), means “think of the blank as a place
we can put an element” (in this case x) so that we have a function
α(g, −) : X → X given by α(g, −)(x) = α(g, x). So we could (and
perhaps should) write κ(α)(g)(x) = α(g, x). You need to trace through
carefully to see why this is the right thing to do—lots of things here
are functions and you just need to keep evaluating them on the right
elements! (Arguably, in the same spirit we should have just written
κ(α) = α(−, −) when defining κ, but more than one blank can become
confusing, so we went for the middle ground.)

(c) Unpacking a little further, we see that if we write . rather than α as


before, then κ(α)(g) ∈ HomSet (X, X) sends x to g . x. Notice(!) that

(κ(α)(g −1 ) ◦ κ(α)(g))(x) = κ(α)(g −1 )(g . x) = g −1 . (g . x) = x




and similarly with g −1 and g interchanged. So κ(α)(g) is actually a


bijection, i.e. for all g ∈ G, κ(α)(g) ∈ AutSet (X). This only works
because α is a group action; it is not a formal consequence of currying
arbitrary functions.
So by restricting the codomain of κ, we can define a function

κ̂ : Act(G, X) → HomSet (G, AutSet (X)), κ̂ = κ|AutSet (X) .

(d) We claim that κ̂(α) is a group homomorphism. We have κ̂(α)(e)(x) =
α(e, x) = x for all x ∈ X so κ̂(α)(e) = idX . Also,

κ̂(α)(gh)(x) = gh . x = g . (h . x) = (κ̂(α)(g) ◦ κ̂(α)(h))(x)

for all x ∈ X, so κ̂(α) is a group homomorphism. Note that the axioms
of a group action are exactly what is needed for this.
Hence, Im κ̂ ⊆ HomGrp (G, AutSet (X)) = Rep(G, X).

(e) It remains to prove that κ̂ is a bijection, which we will do by showing
that it is invertible.
Given a G-representation ρ : G → AutSet (X), define αρ : G × X → X
by αρ (g, x) = ρ(g)(x). Then for all x ∈ X

αρ (e, x) = ρ(e)(x) = idX (x) = x

and

αρ (g1 g2 , x) = ρ(g1 g2 )(x)


= (ρ(g1 ) ◦ ρ(g2 ))(x)
= ρ(g1 )(αρ (g2 , x))
= αρ (g1 , αρ (g2 , x))

for all g1 , g2 ∈ G and so we see that αρ ∈ Act(G, X).


Now define

λ̂ : HomGrp (G, AutSet (X)) → Act(G, X), λ̂(ρ) = αρ .

Then

κ̂(λ̂(ρ))(g)(x) = κ̂(αρ )(g)(x)


= αρ (g, x)
= ρ(g)(x)

and

λ̂(κ̂(α))(g)(x) = λ̂(α(g, −))(x)


= αα(g,−) (x)
= α(g, x)

and we see that κ̂ and λ̂ are inverse to each other.
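
(To make the currying step concrete, here is a Python sketch, not part of
the original notes: from an action α one builds κ(α), and the group action
axioms make each κ(α)(g) a bijection and κ(α) a homomorphism.)

from itertools import permutations

X = (1, 2, 3)
G = list(permutations(X))                    # S3, with g[r - 1] = g(r)
e = (1, 2, 3)

def alpha(g, x):                             # a left action of S3 on X
    return g[x - 1]

def curry(a):
    # kappa(a)(g) = a(g, -), a function X -> X
    return lambda g: (lambda x: a(g, x))

rho = curry(alpha)

def compose(p, q):
    return tuple(p[q[r - 1] - 1] for r in X)

inv = {g: next(h for h in G if compose(g, h) == e) for g in G}

# each rho(g) is a bijection: rho(g^{-1}) undoes it
assert all(rho(inv[g])(rho(g)(x)) == x for g in G for x in X)
# rho is a homomorphism: rho(gh) = rho(g) o rho(h)
assert all(rho(compose(g, h))(x) == rho(g)(rho(h)(x))
           for g in G for h in G for x in X)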


Extension 2.5 (Hard!). Show that if f : α → β is a morphism of left actions
of G on X, then f induces a morphism of G-representations κ̂(f ) : κ̂(α) →
κ̂(β). Furthermore κ̂ respects composition: if f, g are morphisms of left
actions, κ̂(g ◦ f ) = κ̂(g) ◦ κ̂(f ). That is, κ̂ is a functor from the category
Act(G, X) to the category Rep(G, X).
By similarly enhancing λ̂ to a functor in the opposite direction, prove
that Act(G, X) and Rep(G, X) are isomorphic categories.
Examples 2.6. Let us revisit the examples of actions, expressing them as
representations via the proposition.
(a) For G = GL2 (R) acting on R2 , the representation corresponding to
( a b ; c d ) . ( x ; y ) = ( ax+by ; cx+dy ) is ρ : GL2 (R) → AutSet (R2 ) with ρ(M ) the
function sending v ∈ R2 to M v. But then ρ(M ) is the (invertible)
linear transformation M represents; this is the natural representation
of GL2 (R).
Note (as this is relevant later!) that AutSet (R2 ) contains GL2 (R) but
is not equal to it; there are set bijections of R2 with itself that are not
linear, let alone invertible.

(b) For G = Sn acting on X = {1, 2, . . . , n} with σ .r = σ(r), we have that
AutSet (X) = Sn and the corresponding representation ρ : Sn → Sn is
the identity map. This is the natural representation of Sn .
Furthermore, any subgroup H of Sn gives rise to a representation
ρ : H ,→ Sn in the same way. Since any finite group G is isomorphic to
a subgroup of SG (this is Cayley’s theorem, B.82), every finite group
has a permutation representation ρ : G ,→ SG .

(c) For G acting on itself via g . h = gh, the associated representation is


ρ : G → AutSet (G) with ρ(g)(h) = Lg (h) = gh, i.e. G is represented by
its left multiplication maps. However AutSet (G) = SG by definition,
so in fact this is just the permutation representation!
Note that in general we may have AutGrp (G) ⊊ AutSet (G). In this
particular example, each ρ(g) is actually a group homomorphism and
Im ρ ⊆ AutGrp (G).

(d) Exercise.

(e) Exercise.

3 Linear representations
As the first example above might suggest, AutSet (X) can be unwieldy to
work with. Also, for many group actions, there is additional structure
around that is ignored by looking at just the underlying sets and functions
on them. By restricting the class of representations we care about, often we
are able to say more.
With this in mind, from this point onward we will concentrate on linear
representations. That is, we will require the set G acts on to be a vector space
V over some field K and ask that representations take values in GL(V ) :=
AutVectK (V ) (by definition of the latter).

Definition 3.1. Let K be a field, V a K-vector space and G a group. A K-
linear representation of G on V is a group homomorphism ρ : G → GL(V ).
Denote the collection of K-linear representations of G on V by RepK (G, V ).

Definition 3.2. Let ρ : G → GL(V ) be a K-linear representation of G on
V . We say ρ is finite-dimensional if V is finite-dimensional and, in that
situation, we define the degree of ρ to be deg ρ := dim V .

Example 3.3. The key first example of a linear representation arises from
idGL2 (R) : GL2 (R) → GL2 (R) with idGL2 (R) (T ) = T (T : R2 → R2 being a
linear transformation, sending v to T v).

Indeed, if H is any subgroup of GL2 (R) then ρ : H ,→ GL2 (R) (given by
the inclusion) is a representation of H, which we call the natural representation.
Clearly neither 2 nor R are special: any subgroup H of GL(V ) has a
natural representation on V , ρ : H ,→ GL(V ).
Other groups do not come to us as groups of linear transformations (or
equivalently matrices), though. So linear representation theory is precisely
about finding out in what ways the elements of our given group can be
represented by linear transformations (matrices), in a way that is compatible
with the group structure.
Example 3.4. For any group G and any vector space V , there is a representation
given by ρ : G → GL(V ), ρ(g) = I for all g ∈ G, where I : V → V
is the identity linear transformation. This is the trivial representation of G
on V .
The trivial representation of degree 1 is an important special case: this
is defined by 1G : G → GL1 (K) = K \ {0}, 1G (g) = 1K for all g ∈ G.
Example 3.5. Let G = C2 be cyclic of order 2, generated by g, with g 2 = e.
Let K = C and V = C2 . Define ρ : G → GL2 (C) by

ρ(e) = ( 1 0 ; 0 1 )

and

ρ(g) = ( −5 −12 ; 2 5 ).

Since ( −5 −12 ; 2 5 )² = ( 1 0 ; 0 1 ), this is a homomorphism.
Observe two things:
(a) the choices of V and ( −5 −12 ; 2 5 ) are not very significant. For any field K
and any K-vector space V , if we can find T ∈ GL(V ) such that T 2 = I,
we have a representation ρ : G → GL(V ), ρ(g i ) = T i for i = 0, 1. (In
particular, ( −5 −12 ; 2 5 ) being a 2 × 2 matrix is a red herring: the 2-ness
comes from picking V ≅ C2 not G being cyclic of order 2.)
(b) Implicit was that it was enough to find a matrix (or linear transformation)
satisfying the relations in G: this is indeed valid, since if G has
a presentation G = ⟨X | R⟩ it is enough to check that ρ(r) = I for
any r ∈ R to define a homomorphism. Note that ρ(e) = I is always
required, by the definition of a group homomorphism.
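
(A quick numerical check, not part of the original notes, assuming NumPy
is available: the matrix in Example 3.5 squares to the identity, so the
relation g² = e is respected.)

import numpy as np

rho_g = np.array([[-5, -12],
                  [ 2,   5]])
assert np.array_equal(rho_g @ rho_g, np.eye(2, dtype=int))
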
Exercise 3.6. Let G = ⟨a | an = 1⟩ be cyclic of order n, let ζ = e2πi/n ∈ C
and θ = 2π/n. Show that

ρ : G → GL1 (C) = C× , ρ(aj ) = ζ j

and

τ : G → GL2 (C), τ (aj ) = ( cos θ − sin θ ; sin θ cos θ )j

are C-linear representations of G.

Exercise 3.7. Show that the dihedral group of order 8,

D8 = ⟨a, b | a4 = b2 = e, bab = a−1 ⟩,

has a Q-linear representation ρ : D8 → GL2 (Q) with ρ(a) = ( 0 −1 ; 1 0 ) and
ρ(b) = ( 1 0 ; 0 −1 ).
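
(Again a numerical check in Python/NumPy, not part of the original notes:
the proposed matrices satisfy all three defining relations of D8 , which by
observation (b) of Example 3.5 is exactly what is needed.)

import numpy as np

A = np.array([[0, -1],
              [1,  0]])          # candidate rho(a)
B = np.array([[1,  0],
              [0, -1]])          # candidate rho(b)
I = np.eye(2, dtype=int)

assert np.array_equal(np.linalg.matrix_power(A, 4), I)             # a^4 = e
assert np.array_equal(B @ B, I)                                    # b^2 = e
assert np.array_equal(B @ A @ B, np.linalg.matrix_power(A, 3))     # bab = a^{-1}
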
An important property of a representation is that of being faithful.

Definition 3.8. A K-linear representation ρ : G → GL(V ) is faithful if
Ker ρ = {e}.

By the First Isomorphism Theorem (B.96), Ker ρ = {e} implies Im ρ ≅ G;
that is, a faithful representation puts a “faithful” copy of G into GL(V ).
Every group has a faithful representation, by the following construction.

Definition 3.9. Let G be a group. Let K[G] be the K-vector space⁵ with
basis G. Let Lg ∈ GL(K[G]) be the linear map defined on the basis of K[G]
by Lg (h) = gh.
The regular representation of G is ρreg : G → GL(K[G]), ρreg (g) = Lg .

That this is a representation follows from observing that

(Lg ◦ Lh )(k) = Lg (hk) = ghk = Lgh (k).

Furthermore, Ker ρreg = {g ∈ G | ρreg (g) = IK[G] } but we can see that
Lg = IK[G] if and only if g = e, so ρreg is faithful.
Example 3.10. Let G = C4 = ⟨a | a4 = e⟩ = {e, a, a2 , a3 }. Then an element
in K[C4 ] has the form αe + βa + γa2 + δa3 for α, β, γ, δ ∈ K. We know that
Le is the identity, since Le (h) = eh = h for all h ∈ G.
Now,
La (αe + βa + γa2 + δa3 ) = αa + βa2 + γa3 + δe
(since a4 = e), from which we see that La is represented by the matrix
( 0 0 0 1 ; 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ).

By the definition of a representation, ρ(aj ) = ρ(a)j , from which we can
deduce the representing matrices for a2 and a3 .
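
(A short Python/NumPy sketch, not part of the original notes, computing
all the matrices of the regular representation of C4 with respect to the
basis {e, a, a2 , a3 }:)

import numpy as np

def L_matrix(k):
    # matrix of L_{a^k}: left multiplication sends basis vector a^j
    # to a^{(j + k) mod 4}
    M = np.zeros((4, 4), dtype=int)
    for j in range(4):
        M[(j + k) % 4, j] = 1
    return M

for k in range(4):
    # rho(a^k) = rho(a)^k, as the definition of a representation requires
    assert np.array_equal(L_matrix(k), np.linalg.matrix_power(L_matrix(1), k))
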
⁵ This slightly esoteric notation will be reused and expanded on later, so we use it now
for consistency.

3.1 Equivalence of representations

Let ϕ : V → W be an isomorphism of K-vector spaces. Since GL(V ) encodes
the (linear) symmetries of V , we would expect a close relationship with
GL(W ) and indeed, these are isomorphic groups. For defining

f : GL(V ) → GL(W ), f (T ) = ϕ ◦ T ◦ ϕ−1

and
g : GL(W ) → GL(V ), g(U ) = ϕ−1 ◦ U ◦ ϕ
it is straightforward to check that these are inverse homomorphisms (exer-
cise).
Then a representation ρ : G → GL(V ) has a counterpart representation
f ◦ ρ : G → GL(W ) and vice versa, but these are not really “different”
representations—we have just changed the underlying vector space for an
isomorphic one. So let us say two such representations are equivalent.

Definition 3.11. We say that two K-linear representations ρ : G → GL(V )
and σ : G → GL(W ) are equivalent if there exists a K-vector space isomorphism
ϕ : V → W such that

            ρ
    G ------------> GL(V )
    |                 |
idG |                 | ϕ̂
    v                 v
    G ------------> GL(W )
            σ

commutes, where ϕ̂ : GL(V ) → GL(W ) is defined by ϕ̂(T ) = ϕ ◦ T ◦ ϕ−1 .

Since sometimes we prefer more concrete conditions to check, let us
unpack this a bit. Choose a basis BV for V and a basis BW for W .
First, ϕ : V → W is equivalent to considering (V, BV ) and (V, BV′ :=
ϕ−1 (BW )). Then ϕ̂ is represented by the change of basis matrix P from BV
to BV′ = ϕ−1 (BW ) (the identity map written with respect to these two bases),
since a matrix [T ]BV then becomes P [T ]BV P −1 .
In other words, two group representations are equivalent if for every
g ∈ G, ρ(g) and σ(g) are similar matrices, with respect to the same similarity
matrix for every g.
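
(An illustration in Python/NumPy, not part of the original notes:
conjugating every ρ(g) by one fixed invertible matrix P produces an
equivalent representation σ, which satisfies the same relations. The
particular P here is an arbitrary choice.)

import numpy as np

rho_g = np.array([[-5.0, -12.0],
                  [ 2.0,   5.0]])        # rho(g) from Example 3.5
P = np.array([[1.0, 2.0],
              [0.0, 1.0]])               # any invertible change of basis

sigma_g = P @ rho_g @ np.linalg.inv(P)   # sigma(g) = P rho(g) P^{-1}
# sigma is still a representation of C2, with the SAME P for every g
assert np.allclose(sigma_g @ sigma_g, np.eye(2))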

3.2 Subrepresentations
We can try to relax the conditions in the definition of equivalence. Let us
assume we have an injective linear map i : W ,→ V . Then we cannot simply
construct î as we did ϕ̂, since elements of V do not necessarily have a pre-
image in W (if i is not surjective). However we can insist that the elements
we care about do.

In order to make the following construction, we have to restrict to W
actually being a subspace of V (unless we want a mountain of notation and
complications). Since an injective map i : W ,→ V has W ≅ Im i ≤ V and
we have just dealt with isomorphism in the previous section, this is not a
major issue.

Definition 3.12. Let ρ : G → GL(V ) and W ≤ V . We say that W is a
G-invariant subspace if ρ(G)(W ) ≤ W , i.e. for all g ∈ G, w ∈ W we have
ρ(g)(w) ∈ W .

Here, ρ(G) is convenient alternative notation for Im ρ.

Definition 3.13. Let ρ : G → GL(V ) be a representation of G and let
W ≤ V be a G-invariant subspace. Let i : W ,→ V be the associated
injective linear map given by i(w) = w for all w ∈ W .
We define î : Im ρ → GL(W ), î(T ) = i−1 ◦ T ◦ i and ρW : G → GL(W ),
ρW = î ◦ ρ. We say that ρW is the subrepresentation of ρ associated to W .

Note that the condition of W being G-invariant is taken with respect
to the representation ρ : G → GL(V ); it does not only depend on V , in
particular. For a given representation, not every subspace of V gives rise
to a subrepresentation. Conversely, a subspace W of V might give rise to
a subrepresentation with respect to ρ but not with respect to a different
representation τ .
In particular, some representations only have the subrepresentations
coming from W = 0 and W = V (noting that any representation has these
subrepresentations; exercise). Such representations are called irreducible. If
this seems complicated, it is. Part of what we are aiming for is to swap
representations for objects where these notions are more natural.

3.3 Quotient representations


These should be associated to surjections V ↠ W . But rather than work
this out in detail for representations, we will wait until we have our better
language to do this.

3.4 Direct sums


Given V and W , we may form their direct sum, V ⊕ W . We have a natural
map δ : GL(V ) × GL(W ) → GL(V ⊕ W ), given by δ(T, U ) = T ⊕ U . The
direct sum of linear maps is (hopefully) what you would expect: it sends
(v, w) ∈ V ⊕ W to (T v, U w); on matrices this is the “block diagonal sum”
operation, with M ⊕ N = ( M 0 ; 0 N ).
Also, let ∆ : G → G × G be the homomorphism ∆(g) = (g, g); this is
often called the diagonal map.

Definition 3.14. Let ρ : G → GL(V ), τ : G → GL(W ) be representations
of G. The direct sum ρ ⊕ τ is the representation ρ ⊕ τ : G → GL(V ⊕ W )
given by ρ ⊕ τ = δ ◦ (ρ × τ ) ◦ ∆.

That is,

(ρ⊕τ )(g) = (δ◦(ρ×τ )◦∆)(g) = δ◦(ρ×τ )(g, g) = δ(ρ(g), τ (g)) = ρ(g)⊕τ (g).
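
(A minimal sketch in Python/NumPy, not part of the original notes, of the
block diagonal sum on matrices and the fact that ρ ⊕ τ is again a
representation; here τ is the degree 1 representation of C2 sending g to −1.)

import numpy as np

def direct_sum(M, N):
    m, n = M.shape[0], N.shape[0]
    out = np.zeros((m + n, m + n), dtype=M.dtype)
    out[:m, :m] = M                # top-left block
    out[m:, m:] = N                # bottom-right block
    return out

rho_g = np.array([[-5, -12], [2, 5]])    # rho(g) from Example 3.5
tau_g = np.array([[-1]])                 # tau(g) = -1

block = direct_sum(rho_g, tau_g)
assert np.array_equal(block @ block, np.eye(3, dtype=int))   # (rho + tau)(g)^2 = I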

At this point, we will stop our study of representations (or at least, stop
doing it like this!).

4 Algebras
In this section, we will take a step back, to introduce some very general
notions, before specialising again to the context we are interested in.
When we looked at group actions, we passed to linear representations
by putting extra structure on the set being acted on. However we could
equally ask what would be the right thing to do if we had more than just
a group. In this section we will introduce modules over rings and algebras.
The motivation for doing so—beyond its intrinsic value—will have to wait
a short while, until we are ready to bring our two pictures together again.
Recall that a ring R = (R, +, ×) is an Abelian group equipped with a
compatible associative multiplication. The notion corresponding to a group
acting on a set for rings is that of an R-module. To emphasise the similarity,
we will again use the . symbol; as a map, . will take a pair (r, m) to r . m.

Definition 4.1. Let R be a ring and M an Abelian group. We say that the
pair (M, .) is a left R-module if . : R × M → M satisfies

(a) (r + s) . m = (r . m) + (s . m) for all r, s ∈ R, m ∈ M ;

(b) r . (m + n) = (r . m) + (r . n) for all r ∈ R, m, n ∈ M ; and

(c) r . (s . m) = (r × s) . m for all r, s ∈ R, m ∈ M .

If R is a unital ring with multiplicative identity 1R , we ask in addition that
1R . m = m for all m ∈ M .

An algebra is then an algebraic structure which is both a module and a
ring in its own right, with the two structures being compatible. We will make
this our definition, in the spirit of giving the most categorical characteriza-
tion as the definition, but immediately give some more concrete equivalent
formulations. We will also restrict ourselves to algebras over commutative
rings, not least since very shortly we will specialise further and work over a
field.

Definition 4.2. Let R be a commutative ring. We say that A = (A, +, ×)
is an R-algebra if A is an R-module, A is a ring with respect to + and ×
and the two structures are compatible.

The centre of a ring A, Z(A), is the set of elements of A that commute
(multiplicatively) with every element of A:

Z(A) = {z ∈ A | za = az ∀a ∈ A}.

Then, for R commutative, an equivalent formulation of the definition of an
R-algebra A is to specify a pair (A, ϕ) where A is a ring and ϕ : R → Z(A)
is a ring homomorphism.
Either of these definitions works if R is in fact a field, K. However, K-
modules are actually something familiar: looking carefully at the respective
lists of axioms, we see that M is a K-module if and only if M is a K-vector
space. Indeed, there is a stronger categorical assertion to be made, when we
have the language.
Then we see that a K-algebra is a K-vector space that is also a ring, in
a compatible way. (Or, if you prefer, a ring that is also a K-vector space.)
In order to express this more explicitly, in part to help line up with other
authors’ definitions, we take a brief detour to introduce tensor products.

4.1 Interlude—tensor products of vector spaces


Our definition of the tensor product of two vector spaces will be given by
formalising the following idea: given V , W K-vector spaces, their tensor
product V ⊗W should be “the” linearization of the Cartesian product V ×W
(as sets or Abelian groups), to form a new K-vector space.

Definition 4.3. Let V and W be K-vector spaces. The pair (X, ⊗) is said
to be the tensor product of V and W if

(a) X is a K-vector space;

(b) ⊗ : V × W → X is a bilinear map; and

(c) (universal property) for every vector space Z and bilinear map h : V ×
W → Z, there exists a unique linear map h̃ : X → Z such that h =
h̃ ◦ ⊗.

           ⊗
  V × W ------> X
       \        |
      h \       | h̃
         \      v
          `---> Z
That is, every bilinear map h : V × W → Z factors through ⊗. A sig-
nificant advantage of this definition is that it immediately follows that the

vector space X is uniquely determined up to isomorphism. For if (X′ , ⊗′ ) is
another pair satisfying the conditions of the definition, putting Z = X′ we
have ⊗′ = ⊗̃′ ◦ ⊗ and conversely, via the universal property for ⊗′ , ⊗ = ⊗̃ ◦ ⊗′ .
Hence, by the uniqueness clause of the universal property, ⊗̃ ◦ ⊗̃′ = idX and
⊗̃′ ◦ ⊗̃ = idX′ , so ⊗̃ and ⊗̃′ are mutually inverse isomorphisms.
˜ ◦ ⊗ = idX and ⊗ ◦ ⊗0 ˜ = idX 0 are inverse isomorphisms.
A significant disadvantage is that it is not immediate that such a pair
(X, ⊗) exists. This is frequently the case with universal property definitions:
one also needs to construct a model. (One suffices since all models will be
isomorphic.)
The standard model of the tensor product is given by taking the vector
space spanned by all symbols v ⊗ w with v ∈ V and w ∈ W , and then
imposing on this the relations which give ⊗ the bilinearity properties we
want. More formally let T denote the vector space spanned by the set of
symbols {v ⊗ w | v ∈ V, w ∈ W }. Let I be the subspace of T spanned by
the following elements:

(v1 + v2 ) ⊗ w − v1 ⊗ w − v2 ⊗ w
v ⊗ (w1 + w2 ) − v ⊗ w1 − v ⊗ w2
(λv) ⊗ w − λ(v ⊗ w)
v ⊗ (µw) − µ(v ⊗ w)

where v, v1 , v2 ∈ V , w, w1 , w2 ∈ W and λ, µ ∈ K. Then define V ⊗W = T /I,


the quotient vector space. We abuse/overload notation by writing v ⊗ w for
v ⊗ w + I.
It is not hard to show that if BV is a basis for V and BW a basis for W ,
a basis for V ⊗ W is given by {b ⊗ c | b ∈ BV , c ∈ BW }.
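
(In coordinates, the tensor product of vectors can be modelled by the
Kronecker product; the following Python/NumPy sketch, not part of the
original notes, checks the dimension count and the bilinearity relations
above.)

import numpy as np

v = np.array([1, 2])          # coordinates w.r.t. a basis of V, dim V = 2
w = np.array([3, 0, 5])       # coordinates w.r.t. a basis of W, dim W = 3

assert np.kron(v, w).shape == (6,)      # dim(V ⊗ W) = 2 · 3 = 6

v1, v2 = np.array([1, 0]), np.array([0, 2])
# (v1 + v2) ⊗ w = v1 ⊗ w + v2 ⊗ w
assert np.array_equal(np.kron(v1 + v2, w), np.kron(v1, w) + np.kron(v2, w))
# (λ v) ⊗ w = λ (v ⊗ w)
assert np.array_equal(np.kron(7 * v, w), 7 * np.kron(v, w))
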
There is much more one could say about the tensor product but we will
end the interlude by remarking that we have (deliberately) given a definition
that extends, essentially without modification, from K-modules (i.e. vector
spaces) to R-modules.

4.2 Algebras, continued


We now return to the alternative description of a K-algebra. In this setup,
(A, m, u) is a K-algebra if A is a K-vector space (i.e. a K-module) and
m : A ⊗ A → A and u : K → A are K-linear maps and these are such that
the following diagrams commute:

(a) (associativity)
              m ⊗ id
  A ⊗ A ⊗ A ---------> A ⊗ A
      |                  |
   id ⊗ m                | m
      v                  v
    A ⊗ A -----m------>  A

(b) (unitarity)
         u ⊗ id           id ⊗ u
  K ⊗ A --------> A ⊗ A <-------- A ⊗ K
        \           |            /
         \ ≅        | m       ≅ /
          \         v          /
           `------> A <-------'

Here, the maps marked “≅” are the canonical maps sending λ ⊗ a and a ⊗ λ
to λa.
The map m is the linear map encoding multiplication, ×, and u is a map
encoding the existence of a multiplicative identity 1A (by “importing” 1K
into A).
We also have maps between algebras.

Definition 4.4. Let (A, mA , uA ) and (B, mB , uB ) be K-algebras. A homomorphism
of K-algebras is a K-linear map f : A → B such that the following
diagrams commute:

(a) (multiplication preserved)

            f ⊗ f
    A ⊗ A --------> B ⊗ B
      |               |
   mA |               | mB
      v               v
      A ------f-----> B

(b) (unit preserved)


      A ------f-----> B
       ^             ^
        \           /
      uA \         / uB
          \       /
              K

Algebras have representations, analogous to the case of groups.

Definition 4.5. Let V be a K-vector space. The endomorphism algebra of
V , EndK (V ), is the K-algebra with vector space EndK (V ) = HomVectK (V, V ),
m = ◦ (composition of linear maps) and u(λ) = λIV for all λ ∈ K.

Exercise 4.6. Verify the claim implicit in the definition.

Definition 4.7. Let A = (A, m, u) be a K-algebra and V a K-vector space.
A (K-linear) representation of A on V is defined to be an algebra homomorphism
ρ : A → EndK (V ). Denote the collection of representations of A
on V by RepK (A, V ).

We can define equivalence of representations, subrepresentations and
direct sums completely analogously to the group case. However we take a
slightly different route at this point, heading towards our main theoretical
results, tying all our definitions together.
Since a K-algebra A = (A, m, u) is a ring, we can already talk about
A-modules. So, a left A-module (M, .) is an Abelian group M with a map
. : A × M → M satisfying the conditions of Definition 4.1. Now, since we
require u(1K ) = 1A to satisfy 1A . m = m, we may extend this by defining
λm = u(λ).m and hence make M a K-vector space. The additivity required
of . means that it becomes a K-linear map . : A ⊗ M → M .
Thus, in a similar spirit to above, we may give an equivalent definition
of an A-module as (M, .) with M a K-vector space and . : A ⊗ M → M
being such that the following diagrams commute:

(a) (. compatible with m)

              m ⊗ id
  A ⊗ A ⊗ M ---------> A ⊗ M
      |                  |
   id ⊗ .                | .
      v                  v
    A ⊗ M -----.------>  M

(b) (. compatible with u)

         u ⊗ id
  K ⊗ M --------> A ⊗ M
        \           |
         \ ≅        | .
          \         v
           `------> M

where the map marked “≅” is that given by s : K ⊗ M → M , s(λ ⊗ m) =
u(λ) . m = λm.
We denote by A-Mod the collection of left A-modules.
Example 4.8. An important example is A A = (A, m), the regular (left) A-
module.
Exercise 4.9. Give a definition in the above style of a right module.

Definition 4.10. Let A = (A, m, u) be a K-algebra and let (M, .M ) and
(N, .N ) be left A-modules. A homomorphism of A-modules is a K-linear
map f : M → N such that the following diagram commutes:

            id ⊗ f
    A ⊗ M --------> A ⊗ N
      |               |
   .M |               | .N
      v               v
      M ------f-----> N

Then A-Mod is a category; indeed A-Mod is a subcategory of K-Mod,
which we know by another name.

Proposition 4.11. Let K be a field. The category of K-modules, K-Mod,
is isomorphic to the category of K-vector spaces, VectK .

Proof (sketch). Every K-vector space V is canonically a K-module (V, s),
where s : K ⊗ V → V is given by s(λ ⊗ m) = λm (as above). This is com-
patible with K-linear maps, making the latter into module homomorphisms,
giving rise to a functor from VectK to K-Mod.
Conversely there is a forgetful functor from K-Mod to VectK sending
(M, .) to M . One may check that these two functors yield an isomorphism
of categories.

We claim that representations of a K-algebra A and A-modules are “the
same thing”.

Proposition 4.12. Let A = (A, m, u) be a K-algebra and M a K-vector
space.

(a) An A-module structure (M, .) on M gives rise to a canonical representation
of A on M .

(b) A representation ρ of A on M gives rise to a canonical A-module
structure on M .

(c) The constructions in (a) and (b) are mutually inverse to each other.

Proof. This uses a much more general technical result, known as Hom-⊗
adjunction. In the case we are interested in, this says that the function

ϕ : HomVectK (A ⊗ M, M ) → HomVectK (A, HomVectK (M, M ))

given by
. ↦ ρ. := (a ↦ La )
is an isomorphism, where La : M → M is the map La (m) = a . m. (We have
used the same name, La , as for the “left multiplication by a” map we have
seen for groups before; this is appropriate because the former generalises the
latter.) Note too that we have written the “long” name, HomVectK (M, M ),
for EndK (M ), so that elements of the right-hand side are indeed linear maps
A → EndK (M ), i.e. could be representations, if they have the right proper-
ties.
Elements of the left-hand side are linear maps A ⊗ M → M : we claim
that if . : A ⊗ M → M gives an A-module structure, then ρ. will actually
be a representation.

To check this, we track through the definitions. For example, ρ. (ab) =
Lab so

ρ. (ab)(m) = Lab (m) = (ab).m = a.(b.m) = (La ◦Lb )(m) = (ρ. (a)◦ρ. (b))(m)

as required.
For the converse direction, let ρ ∈ HomVectK (A, HomVectK (M, M )). Then
define .ρ : A ⊗ M → M by .ρ (a ⊗ m) = ρ(a)(m). Here, ρ(a) is a map
M → M , so we can apply this to m to get an element of M . Similarly to
above,

a .ρ (b .ρ m) = a .ρ (ρ(b)(m)) = (ρ(a) ◦ ρ(b))(m) = ρ(ab)(m) = ab .ρ m

so that if ρ is a representation, then .ρ becomes a left action.


It is straightforward to check (so do it!) that ρ ↦ .ρ is inverse to ϕ
(which sends . to ρ. ).

Definition 4.13. Let A = (A, m, u) be a K-algebra and let ρ and ρ′ be representations
of A on V and V ′ respectively. A homomorphism of representations
is a K-linear map f : V → V ′ such that for all a ∈ A, ρ′ (a) ◦ f = f ◦ ρ(a).
Denote the set of homomorphisms from ρ to ρ′ by HomA (ρ, ρ′ ).
Let Rep(A) be the category with objects representations of A and
morphisms homomorphisms of representations.

Theorem 4.14. The categories Rep(A) and A-Mod are equivalent.

Proof. We claim that there are functors

F : A-Mod → Rep(A), F (M, .) = ϕ(.), F f = f

and
G : Rep(A) → A-Mod, Gρ = (M, ϕ−1 (ρ)), Gf = f
where ρ : A → EndK (M ).
Proposition 4.12 asserts that F G = IRep(A) and GF = IA-Mod on objects
and this is clearly the case on morphisms.

It is worth unpacking the perhaps odd claim that F f = f and Gf =
f are valid choices. If f : M → N is an A-module homomorphism, then
f : M → N is a K-linear map such that f ◦ .M = .N ◦ (id ⊗ f ). We should

check that ρ.N (a) ◦ f = f ◦ ρ.M (a):

(ρ.N (a) ◦ f )(m) = ρ.N (a)(f (m))
                 = a .N f (m)
                 = .N (a ⊗ f (m))
                 = (.N ◦ (id ⊗ f ))(a ⊗ m)
                 = (f ◦ .M )(a ⊗ m)        (∗)
                 = f (a .M m)
                 = f (ρ.M (a)(m))
                 = (f ◦ ρ.M (a))(m)
as required. The check for Gf = f (showing that the homomorphism of
representations condition translates to the homomorphism of modules con-
dition) is achieved by breaking the sequence of equalities at (∗) and noting
that then the assumption is the equality of the two ends. That is, one turns
the proof inside out!
In fact we have shown isomorphism of the categories—it is unusual that
one can say something so strong.
This is the mathematical sense in which representations of A are the
same as A-modules—but even more, homomorphisms of representations are
the same as homomorphisms of modules.
To bring this back to the representation theory of groups, we next show
that there is an algebra whose representations correspond to linear repres-
entations of the group, and vice versa.

5 The group algebra and group of units functors


We are going to describe two recipes, one for turning a group into an algebra
and the other an algebra into a group. (Strictly speaking, this isn’t correct
but the formal statements coming soon will make more sense if we have this
as the rough idea in our heads.)
The easier direction is getting a group from an algebra, as follows.
Definition 5.1. Let A = (A, m, u) be an algebra. The group of units of A,
A× , is the group with underlying set
A× = {a ∈ A | ∃ b ∈ A such that ab = 1A = ba}
(i.e. the set of elements of A having a two-sided multiplicative inverse) and
group operation m|A× ×A× (i.e. the algebra multiplication, restricted to ele-
ments of A× ).
That this is a group is straightforward: associativity is inherited from
m, 1A = u(1K ) is the identity element and we have inverses by construction.
In fact, this construction respects homomorphisms:

Proposition 5.2. There is a functor −× : AlgK → Grp given on objects
by −× (A) = A× and on morphisms by −× (f ) = f |A× for f : A → B.
Proof. We first check that −× is well-defined: this follows since if f : A → B
and a ∈ A× , f (a) ∈ B × because f is an algebra map. In more detail, we
may apply f to ab = 1A = ba to obtain
f (a)f (b) = f (ab) = f (1A ) = 1B = f (1A ) = f (ba) = f (b)f (a)
as required.
Clearly −× (idA ) = idA |A× = idA× so it remains to check composition.
Let f : A → B and g : B → C. Since (g ◦ f )|A× = g|B × ◦ f |A× , −× is a
functor.
To produce an algebra from a group, we first think about how we would
produce a vector space from a set. Well, we know how to do this: take the
K-linear span of the set (“freely”). Then the set we started with is (by con-
struction) a basis for the vector space. We could call this the “linearisation”
of the set.
Definition 5.3. Let K be a field and S a set. The linearisation of S, K[S],
is the K-vector space of K-linear combinations of elements of S.
If we wanted to be fancy (and why not?), we could go further and talk
about a functor K[−] : Set → BVectK from sets to the category of based
vector spaces whose objects are pairs (V, B) of a vector space V and a basis
B for V . The reason for using based vector spaces is that there is then a
natural forgetful functor b : BVectK → Set with b(V, B) = B, which has a
special relationship with K[−], namely forming an adjoint pair.
To do this carefully we would need to check how functions transform
under K[−], but instead we just note that the fact that linear maps are
determined by their values on a basis is precisely what is needed; we need
this fact shortly so let us give the construction a name.
Definition 5.4. Let f : S → T be a function. We define f lin : K[S] → K[T ]
to be the unique K-linear extension of f .
The next step is to show that if G is a group and not just a set, then K[G]
can be made into an algebra. We only have one bit of unused information—
the group operation, which we know is associative—so to do something
natural and general, we will have to use this to define multiplication. (We
said K[G] is a vector space so it has that as addition.)
Proposition 5.5. Let G = (G, ·) be a group. Then the maps
mG : K[G] ⊗ K[G] → K[G], mG (g ⊗ h) = g · h
and
uG : K → K[G], uG (λ) = λeG
give K[G] the structure of a K-algebra.

Notice that we have been lazy in defining mG , only giving its values on
a basis. As we will want this later, let us write the most general version.
First, recall that elements of K[G] have the form ∑g∈G αg g with αg ∈ K
(where if G is infinite, we insist that all but finitely many of the αg are zero).
We then have

mG (( ∑g∈G αg g) ⊗ ( ∑h∈G βh h)) = ∑g,h∈G αg βh (g · h) = ∑k∈G ( ∑g·h=k αg βh ) k.

However, working on the basis will suffice for this proof, as long as we are
happy to accept that multiplication on the basis can be extended to linear
combinations in such a way that distributivity holds (it can).
Proof. Associativity follows from associativity in G:
(mG ◦ (mG ⊗ id))(g ⊗ h ⊗ k) = mG ((g · h) ⊗ k)
= (g · h) · k
= g · (h · k)
= mG (g ⊗ (h · k))
= (mG ◦ (id ⊗ mG ))(g ⊗ h ⊗ k).
For the unitarity,
(mG ◦ (uG ⊗ id))(λ ⊗ g) = mG (λeG ⊗ g)
= λmG (eG ⊗ g)
= λ(eG · g)
= λg
and similarly for mG ◦ (id ⊗ uG ).
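
(To make the multiplication formula above concrete: the following Python
sketch, not part of the original notes, stores an element of K[G] as a
dictionary from group elements to coefficients, with G = C4 written as Z/4.)

def multiply(x, y, n=4):
    # product in K[Z/n]: the coefficient of k is the sum of a_g b_h over g + h = k
    out = {}
    for g, a in x.items():
        for h, b in y.items():
            k = (g + h) % n
            out[k] = out.get(k, 0) + a * b
    return out

x = {0: 1, 1: 2}              # e + 2a
y = {1: 1, 2: 3}              # a + 3a^2
# (e + 2a)(a + 3a^2) = a + (3 + 2)a^2 + 6a^3
assert multiply(x, y) == {1: 1, 2: 5, 3: 6}
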
Definition. Let G be a group and K a field. Then K[G] = (K[G], mG , uG )
is called the group algebra of G over K.
Proposition 5.6. There is a functor K[−] : Grp → AlgK given on objects
by K[−](G) = K[G] and on morphisms by K[−](f ) = f lin .
Proof. We need to show that if f : G → H is a group homomorphism, then
f lin : K[G] → K[H] is an algebra homomorphism: we have linearity by con-
struction and
f lin (gh) = f (g ·G h) = f (g) ·H f (h) = f lin (g)f lin (h).
Clearly (idG )lin = idK[G] since

(idG )lin (g) = idG (g) = g = idK[G] (g)

and G ⊆ K[G] is a basis. We can also check compositionality on the basis.

Now we have the results we have been aiming at.
Theorem 5.7. The functors K[−] : Grp ⇄ AlgK : −× form an adjoint pair.
Proof. The claim is that for any G ∈ Grp and A ∈ AlgK , we have a bijection
of sets
αG,A : HomAlgK (K[G], A) → HomGrp (G, A× )
that is functorial in G and A, i.e.
(a) for f : G → H,
                        αH,A
  HomAlgK (K[H], A) ----------> HomGrp (H, A× )
          |                           |
 − ◦ K[f] |                           | − ◦ f
          v                           v
  HomAlgK (K[G], A) ----------> HomGrp (G, A× )
                        αG,A

(b) for p : A → B,
                        αG,A
  HomAlgK (K[G], A) ----------> HomGrp (G, A× )
          |                           |
    p ◦ − |                           | p× ◦ −
          v                           v
  HomAlgK (K[G], B) ----------> HomGrp (G, B × )
                        αG,B

The key idea is to try to define αG,A (f : K[G] → A) = f |G . To help keep


track of things we will write this slightly differently. Recall that if X ⊆ Y
we have the injective function ιX : X → Y , ιX (x) = x and then if f : Y → Z
is a function, f |X = f ◦ ιX .
However, f ◦ ιG : G → A has the wrong codomain—A rather than A× .
Fortunately(!), G ⊆ K[G]× (and in fact, G is a subgroup⁶ of K[G]× ) so
f (G) ⊆ A× , since f is an algebra homomorphism. So we can restrict the
codomain of f ◦ ιG to A× .
That is, define

αG,A (f ) = (f ◦ ιG )|A× : G → A× .

Then ιG and f being algebra homomorphisms and the definition of mG
imply that αG,A (f ) ∈ HomGrp (G, A× ).
We claim that αG,A is a bijection. First assume αG,A (f ) = αG,A (f ′ ) for
f, f ′ ∈ HomAlgK (K[G], A). Then (f ◦ ιG )|A× = (f ′ ◦ ιG )|A× , i.e. for all g ∈ G,
f (g) = f ′ (g). Since f and f ′ then coincide on a basis for K[G], they are
equal. So αG,A is injective.
⁶ There is very deep mathematics around here. A long-standing conjecture of Higman
from 1940 (popularized by Kaplansky and so known—erroneously—as Kaplansky’s unit
conjecture) asserted that K[G]× = K×G. However a counterexample was given by Gardam
in 2021.

Now, if h : G → A× , we have K[h] : K[G] → K[A× ]. But K[A× ] ≤ A
since A× ⊆ A and A is a vector space, so K[A× ] = spanK (A× ) ≤ A (as
vector spaces). Let ιA denote the corresponding map ιA : K[A× ] → A. We
claim that αG,A (ιA ◦ K[h]) = h: we have that for all g ∈ G,

αG,A (ιA ◦ K[h])(g) = ιA ◦ K[h]|G (g) = ιA (h(g)) = h(g),

as required. So αG,A is surjective and hence bijective.


Equivalently, let

βG,A : HomGrp (G, A× ) → HomAlgK (K[G], A), βG,A (f ) = ιA ◦ K[f ]

Then one may check that αG,A and βG,A are inverse to each other.
For (a), we compute: for all g ∈ G, f : G → H and h ∈ HomAlgK (K[H], A),

αG,A ((− ◦ K[f ]))(h)(g) = αG,A (h ◦ K[f ])(g)


= (h ◦ K[f ])|G (g)
= (h ◦ f )(g)

and

(− ◦ f )(αH,A (h))(g) = (− ◦ f )(h|G (g))


= (− ◦ f )(h(g))
= (h ◦ f )(g).

For (b), for all g ∈ G, p : A → B and f ∈ HomAlgK (K[G], A),

αG,B (p ◦ −)(f )(g) = αG,B (p ◦ f )(g)


= (p ◦ f )|G (g)
= (p ◦ f )(g)

and

(p× ◦ −)(αG,A (f ))(g) = (p× ◦ f |G )(g)


= (p ◦ f )(g)

as required.

Corollary 5.8. For any G ∈ Grp and V ∈ VectK , there is a natural
bijection

HomAlgK (K[G], EndK (V )) ≅ HomGrp (G, GL(V )).

Proof. Simply recall that EndK (V )× = GL(V ).

The left-hand side here is (by definition) RepK (K[G], V ) and the right-
hand side is (again by definition) RepK (G, V ). The above bijection between
these exists for any V and is natural, i.e. if V and W are two different vector
spaces, the corresponding bijections are compatible with homomorphisms
between V and W . So these bijections “fit together” to give an isomorphism
of categories between RepK (K[G]) and RepK (G).
Since we have also seen that, as categories, RepK(K[G]) ≅ K[G]-Mod, we have:

    Linear representations of G correspond to K[G]-modules.

This motivates our next chapter, studying modules and their structure.
We will begin by doing this for general algebras A and then prove some
important results that tell us that special things happen when A = K[G] for
G a finite group.
Example 5.9. Let G be a group acting on a set Ω via .. Then we can linearize
the action to a linear representation: let K[Ω] be the K-vector space with
basis Ω and define ρ : G → GL(K[Ω]) by ρ(g)(x) = g . x, extended linearly,
i.e. ρ(g) = (g . −)lin .
This representation corresponds to the K[G]-module (K[Ω], ρlin ). Fol-
lowing through the definitions, we see that the module structure map here
is really just the linearization of the original action, ..
This module is known as the permutation module associated to the action
of G on Ω.
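To make the linearisation concrete, here is a small computational sketch (in Python, chosen purely for illustration and not part of the original notes, assuming numpy is available) of the permutation representation of S3 on Ω = {0, 1, 2}: each ρ(g) is the permutation matrix of g, and ρ is a group homomorphism.

    import numpy as np
    from itertools import permutations

    def perm_matrix(p):
        # rho(g) e_i = e_{g(i)}: column i has a 1 in row p[i]
        P = np.zeros((len(p), len(p)))
        for i, j in enumerate(p):
            P[j, i] = 1
        return P

    for p in permutations(range(3)):
        for q in permutations(range(3)):
            pq = tuple(p[q[i]] for i in range(3))  # composition p then q applied first
            assert np.allclose(perm_matrix(p) @ perm_matrix(q), perm_matrix(pq))
    print("rho(pq) = rho(p)rho(q) for all p, q in S3")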
Any algebra has a centre,

Z(A) = {z ∈ A | za = az ∀ a ∈ A},

consisting of all the elements z of A that commute with every element of A.


Exercise 5.10. Show that Z(A) is a subalgebra of A.
When G is a finite group, the centre of its group algebra has a particularly
nice description, as follows. First, recall that a group G is partitioned into
its conjugacy classes: these are precisely the orbits for the action of G on
itself via g . h = ghg −1 , or more explicitly, the conjugacy class containing
h ∈ G is
C(h) = {ghg −1 | g ∈ G}.

Proposition 5.11. Let G be a finite group and enumerate the conjugacy classes of G as {Ci | 1 ≤ i ≤ r}. Define C̄i := Σ_{g∈Ci} g, the class sum, as an element of K[G]. Then {C̄i | 1 ≤ i ≤ r} is a basis for Z(K[G]), which in particular has dimension r.

Proof. We first show that the class sums are central. Fixing g ∈ Ci, we may choose elements yj ∈ G (1 ≤ j ≤ m = |Ci|) so that we may write Ci = {yj⁻¹gyj | 1 ≤ j ≤ m}, and hence C̄i = Σ_{j=1}^m yj⁻¹gyj. Then for all h ∈ G,
    h⁻¹C̄ih = Σ_{j=1}^m h⁻¹yj⁻¹gyjh = Σ_{j=1}^m (yjh)⁻¹g(yjh) = C̄i
since Ci is a conjugacy class, so as j runs from 1 to m, (yjh)⁻¹g(yjh) will run through Ci, since yj⁻¹gyj does. So C̄ih = hC̄i and C̄i is central in K[G]: by linearity, it suffices for C̄i to commute with elements of G.
Now {C̄i | 1 ≤ i ≤ r} is a linearly independent set, by the disjointness of conjugacy classes. This set also spans Z(K[G]): let z = Σ_{g∈G} λg g ∈ Z(K[G]). Then for any h ∈ G, h⁻¹zh = z so Σ_{g∈G} λg h⁻¹gh = Σ_{g∈G} λg g. Hence for every h ∈ G the coefficient λg of g in z is the same as λ_{h⁻¹gh}, i.e. that of h⁻¹gh. So the function g ↦ λg is constant on conjugacy classes and so z = Σ_{i=1}^r λi C̄i, with λi the coefficient of any g ∈ Ci.

Example 5.12. A basis for Z(K[S3 ]) is

{ι, (1 2) + (1 3) + (2 3), (1 2 3) + (1 3 2)}

and dim Z(K[S3 ]) = 3 (compared to dim K[S3 ] = |S3 | = 6).
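Centrality of class sums can also be checked by direct machine computation. The following Python sketch (an illustration only, using just the standard library) stores elements of K[S3] as coefficient dictionaries and verifies that the class sum of the transpositions commutes with every basis element.

    from itertools import permutations

    def compose(p, q):  # (p . q)(i) = p(q(i)); permutations stored as tuples
        return tuple(p[q[i]] for i in range(len(q)))

    def mult(x, y):
        # product in the group algebra: convolution of coefficient dicts
        z = {}
        for g, a in x.items():
            for h, b in y.items():
                gh = compose(g, h)
                z[gh] = z.get(gh, 0) + a * b
        return z

    # class sum of the transpositions (1 2), (1 3), (2 3), written 0-indexed
    C = {(1, 0, 2): 1, (2, 1, 0): 1, (0, 2, 1): 1}
    for g in permutations(range(3)):
        assert mult({g: 1}, C) == mult(C, {g: 1})
    print("the class sum is central in K[S3]")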

6 Modules
In this section, A = (A, mA , uA ) will be a K-algebra.
Recall from Section 4.2 that M = (M, .) is a left A-module if M is a K-vector space and . : A ⊗ M → M is such that . ◦ (mA ⊗ id) = . ◦ (id ⊗ .) and . ◦ (uA ⊗ id) = s, where s : K ⊗ M → M is the scalar multiplication map.
We will take a categorical approach and express as many concepts in
terms of homomorphisms of modules as we can. So recall too that if (M, .M )
and (N, .N ) are A-modules, f : M → N is a homomorphism of modules if
f is a K-linear map such that f ◦ .M = .N ◦ (id ⊗ f ). An isomorphism of
modules is a bijective homomorphism.
We start with the notion of a submodule.
Definition 6.1. Let M = (M, .) be an A-module. Let N be a subspace
of M such that .(A ⊗ N ) ≤ N (that is, for all a ∈ A, n ∈ N , we have
a . n ∈ N ). Then (N, .|A⊗N ) is called a submodule of M .
We will mildly abuse notation by writing . for .|A⊗N .
One should check that (N, .) is an A-module, i.e. that the required prop-
erties for .|A⊗N follow from them being satisfied by .; essentially, we can
just observe that the conditions can be checked element-wise.
Example 6.2. The zero module is the module ({0}, . = 0) with a . 0 = 0
for all a ∈ A. We will usually save ourselves some writing by calling this
module 0 rather than {0}. Any other (sub)module is called non-zero.

Every non-zero module M has at least two submodules, 0 and M itself.
(The zero module has just one, of course.) Any submodule of M not equal
to M is called proper.
Example 6.3. The trivial module (not to be confused with the zero module!)
is the module (K, τ ) with τ : A ⊗ K → K, τ (a ⊗ λ) = λ for all a ∈ A, λ ∈ K.
Exercise 6.4. Check that the above definition of the trivial module is indeed
an A-module structure.
Note that under the correspondence between representations and mod-
ules, subrepresentations correspond to submodules.
This definition is “not very categorical”: if N′ ≅ N, N being a submodule of M does not imply that N′ is. The reason is that N is actually a subset of M, whereas we only know that N′ is in bijection with one. Sometimes
we really care about actual subobjects but within A-Mod, we should relax
this and consider morphisms that are monic. In A-Mod, this is equivalent
to being an injective homomorphism.

Lemma 6.5. Let ι : N → M be an injective A-module homomorphism.


Then N is isomorphic to a submodule of M , namely Im ι.
Conversely, if N is a submodule of M , there is an injective A-module
homomorphism ι : N → M given by ι(n) = n.

We said that quotient representations were awkward to define. One


reason for studying modules instead is that quotients are much easier to
treat, as follows.

Definition 6.6. Let M = (M, .) be an A-module and N = (N, .) a sub-


module of M . The quotient module M/N has underlying vector space M/N
and module structure map .̄ : A⊗(M/N ) → M/N given by .̄(a⊗(m+N )) =
(a . m) + N .

Notice that we do not need to ask for anything extra on N —just that it
is a submodule—unlike groups or rings, where we need a normal subgroup
and an ideal respectively. This is mainly due to the fact that any subgroup
of an Abelian group is normal, and so quotients are defined for any subspace
of a vector space.
We show the first module property for .̄:

    (.̄ ◦ (mA ⊗ id))(a ⊗ b ⊗ (m + N)) = .̄(ab ⊗ (m + N))
                                      = (ab . m) + N
                                      = (a . (b . m)) + N
                                      = .̄(a ⊗ ((b . m) + N))
                                      = (.̄ ◦ (id ⊗ .̄))(a ⊗ b ⊗ (m + N))

As with submodules, being a quotient is not invariant under isomorph-
ism. This time, the “right” generalisation is to consider epis, which in
A-Mod is equivalent to being a surjective homomorphism.
Lemma 6.7. Let π : M → N be a surjective A-module homomorphism.
Then N is isomorphic to a quotient module of M , namely M/ Ker π.
Conversely, if M/N is a quotient module of M by a submodule N , there
is a surjective A-module homomorphism π : M → M/N given by π(m) =
m + N.
Since Ker π is a submodule of M (exercise), there is an associated in-
jective homomorphism ι so that we could write
    Ker π --ι--> M --π--> N
At M , we have π ◦ ι = 0 (everything in Im ι = Ker π is sent to 0 by π). Now
ι being injective means that if we write 0 : 0 → Ker π for the (unique) map
sending 0 to 0, we have Ker ι = 0 = Im 0. Similarly, writing 0 : N → 0 for the
(unique) map defined by 0(n) = 0 for all n ∈ N , we have Im π = N = Ker 0;
this is equivalent to π being surjective. We usually suppress the labels for
zero maps and write all of this as
    0 --> Ker π --ι--> M --π--> N --> 0
noting that at each object X, Im( → X) = Ker(X → ). We call this
equality of the image of the incoming map with the kernel of the outgoing
map exactness at X. Replacing Ker π by anything isomorphic to it, we
obtain the following definition.
Definition 6.8. Let K, M and N be A-modules.
If f : K → M and g : M → N are module homomorphisms such that f
is injective, g is surjective and Im f = Ker g, then we say that

    0 --> K --f--> M --g--> N --> 0
is a short exact sequence of A-modules.
It is called “short” because, well, there are long ones too. Short exact
sequences are sometimes called extensions, because they describe a way to
“extend” K by N to form a bigger module M , having a submodule iso-
morphic to K with quotient isomorphic to N .
Some short exact sequences are special: this is when we can actually find
a submodule of M isomorphic to N too.
Definition 6.9. A short exact sequence
    0 --> K --f--> M --g--> N --> 0
of A-modules is split if there exists an A-module homomorphism u : N → M
such that g ◦ u = idN .

35
Note that u is necessarily injective. In nice categories—the correct name
is Abelian categories—such as A-Mod, split short exact sequences arise in a
particularly natural way.
Definition 6.10. Let M = (M, .M ) and N = (N, .N ) be A-modules. Then
the direct sum of vector spaces M ⊕ N is given an A-module structure via

.⊕ : A ⊗ (M ⊕ N ) → M ⊕ N, .⊕ (a ⊗ (m + n)) = a .M m + a .N n

Then (M ⊕ N, .⊕ ) is called the direct sum of M and N .


We have two canonical injective maps ιM : M → M ⊕ N , ιM (m) = m + 0
and ιN : N → M ⊕ N , ιN (n) = 0 + n, and two canonical surjective maps
πM : M ⊕ N → M , πM (m + n) = m and πN : M ⊕ N → N , πN (m + n) = n.
Proposition 6.11. The following are equivalent:
(a) 0 --> K --f--> M --g--> N --> 0 is a split short exact sequence of A-modules;

(b) 0 --> K --f--> M --g--> N --> 0 is a short exact sequence of A-modules and there exists a module homomorphism t : M → K such that t ◦ f = idK (such a map t is necessarily surjective); and

(c) there is an isomorphism h : M → K ⊕ N such that h ◦ f = ιK is the


canonical injective map and g ◦ h−1 = πN is the canonical surjective
map.
Proof. Exercise. Hint: for (c) implies (a), consider the diagram

    0 --> K --f-->   M   --g--> N --> 0
          |          | h        |
    0 --> K --ιK--> K ⊕ N --πN--> N --> 0

(with the identity on K and N as the outer vertical maps, and ιN : N → K ⊕ N, πK : K ⊕ N → K the remaining canonical maps).

Then show that setting u = h−1 ◦ ιN we have a map showing that the
sequence splits.

This is the starting point for the construction of an object that classi-
fies extensions of modules. The details would require too much time and
technicalities beyond us at this point, but the idea is accessible: take all
short exact sequences starting with K and ending with N , define an equi-
valence relation on these and hence construct a group Ext1 (K, N ), in which
[0 → K → K ⊕ N → N → 0] is the identity element. In particular,
Ext1 (K, N ) = {0} if and only if every extension of K by N is split (i.e. the
only extension is the trivial one, K ⊕ N ).

In some sense, one major goal of representation theory is to understand
Hom(K, N ), Ext1 (K, N ) and the higher extension groups Exti (K, N ) for all
modules K, N for some given algebra. In fact, there is usually a subsidiary
goal before this, namely to understand all of the “fundamental pieces” from
which we can build other modules via extensions. This is what we now turn
our attention to.

Definition 6.12. An A-module M is simple if the only submodules of M


are 0 and M .

Remark 6.13 (Important!). Simple modules are also called irreducible7 , al-
though this term tends to be used more for the corresponding representation.
That is, we say a representation is irreducible if it has only the zero subrep-
resentation and itself as subrepresentations. Then it is common to shorten
“irreducible representation” to “irrep”. As we are taking a module-theoretic
approach, we will say “simple module”.
We can glue simples together:

Definition 6.14. A module M is called semisimple if it is isomorphic to a


(not necessarily finite8 ) direct sum of simple modules.

This notion is also called “completely reducible” (usually when the term
irreducible is being used, rather than simple). This deserves a name because
in general not every module is semisimple—this is not meant to be obvious,
but it is (very much) true.

Lemma 6.15. A left A-module M is semisimple if and only if every sub-


module of M is a direct summand. Hence, every non-zero submodule and
non-zero quotient module of a semisimple module is again semisimple.

Proof. We will omit this proof, as in full generality (to deal with the “not
necessarily finite” cases) it requires some more advanced techniques. Those
who are interested will find this result in [Rot, Section 4.1] or [EH, Sec-
tion 4.1].

An algebra A is called semisimple if the regular module A A is semisimple.


An Abelian category in which every object is semisimple is itself called
semisimple.
Simple modules deserve to be considered as building blocks for arbitrary
modules for the following reason.
7 For example, in previous iterations of this course.
8 An infinite direct sum ⊕_{i∈I} Vi has as elements sequences (v1, v2, ...) with vi ∈ Vi and all but finitely many vi zero. The vector space operations are defined component-wise, as for finite direct sums.

Definition 6.16. Let M be an A-module. A composition series for M is a
sequence of submodules

0 = M0 ≤ M1 ≤ M2 ≤ · · · ≤ Mr−1 ≤ Mr = M

such that Mi+1 /Mi is simple for all 0 ≤ i ≤ r − 1. We call r the length of
the composition series.

Lemma 6.17. Every finite-dimensional A-module has a composition series.

Proof (sketch). Work by (strong) induction on dimension. If dim M = 0, there is nothing to do. If dim M > 0 and M is simple, 0 ≤ M is a composition series (M/0 ≅ M).
Otherwise let N be a maximal proper submodule of M (i.e. N < M
and if there exists P such that N ≤ P ≤ M then P = N or P = M );
finite-dimensionality of M ensures such an N exists. Then M/N is simple.
But dim N < dim M so by the inductive hypothesis N has a composition
series 0 ≤ N1 ≤ · · · ≤ Nr = N , meaning that 0 ≤ N1 ≤ · · · ≤ Nr = N < M
is a composition series for M .

The fundamental theorem relating to composition series is the following.

Theorem 6.18 (Jordan–Hölder). Let M be an A-module. If M has two


composition series

0 ≤ M1 ≤ · · · ≤ Mr−1 ≤ Mr = M
0 ≤ N1 ≤ · · · ≤ Ns−1 ≤ Ns = M

then r = s and there exists a permutation σ ∈ Sr such that Mi/Mi−1 ≅ Nσ(i)/Nσ(i)−1 for all i.

Since then any two composition series for a module M have the same
length, we may say that M has (well-defined) length, the length of any
composition series for it.
Now, if M has a composition series

0 ≤ M1 ≤ · · · ≤ Mr−1 ≤ Mr = M

we have

• M1 is simple;

• 0 → M1 → M2 → M2 /M1 → 0 is a short exact sequence with M2 /M1


also simple;

• 0 → M2 → M3 → M3 /M2 → 0 is a short exact sequence with M3 /M2
also simple;

• ...

• 0 → Mr−1 → Mr = M → M/Mr−1 → 0 is a short exact sequence with


M/Mr−1 also simple.

In other words, M is an iterated extension by simple modules.


The following is a foundational result—it is called a lemma only because
its proof is relatively straightforward, rather than because it is insignificant.
Recall that a division ring is a ring such that R× = R \ {0}, i.e. one in which
every non-zero element has a multiplicative inverse. If A is a K-algebra, we
say A is a division algebra if as a ring, it is a division ring (i.e. A× = A\{0}).

Lemma 6.19 (Schur's lemma). Let S be a simple A-module. Then every non-zero homomorphism in HomA-Mod(S, M) is injective and every non-zero homomorphism in HomA-Mod(M, S) is surjective. Hence EndA(S) := HomA-Mod(S, S) is a division K-algebra.

Proof. If f ∈ HomA-Mod(S, M) and f ≠ 0, then Ker f ≤ S and Ker f ≠ S so, as S is simple, we must have Ker f = 0 and f is injective.
Similarly, if g ∈ HomA-Mod(M, S) and g ≠ 0, then Im g ≤ S and Im g ≠ 0 so, again as S is simple, we must have Im g = S and g is surjective.
Then any non-zero map in EndA(S) := HomA-Mod(S, S) is both injective and surjective and hence is an isomorphism. That is, every non-zero element of EndA(S) is invertible, i.e. EndA(S) is a division algebra.

The next closely-related result is also sometimes called “Schur’s lemma”:


its assumptions are stronger but its conclusion is also stronger. It is “mor-
ally” a corollary of Schur’s lemma, but our formulation makes this a little
less clear and our proof is independent of Schur’s lemma, so we just refer to
it as an extra lemma.
The proof uses the fact that HomA-Mod (M, N ) is a K-vector space, which
we leave as an exercise.

Lemma 6.20. If K is algebraically closed and S is a finite-dimensional simple module, EndA(S) ≅ K. In particular, any homomorphism f : S → S is a scalar multiple of the identity, i.e. f = λ idS for some λ ∈ K.

Proof. Let f ∈ EndA(S). Then since K is algebraically closed and S is finite-dimensional, there exists 0 ≠ s ∈ S and λ ∈ K such that f(s) = λs (i.e. f has an eigenvalue on S; this is what algebraic closure of K does for us). Since λ idS is also an A-module homomorphism, so is f − λ idS.
Now Ker(f − λ idS) ≤ S and s ∈ Ker(f − λ idS), so this kernel is non-zero. Since S is simple, Ker(f − λ idS) = S, i.e. f = λ idS on all of S.

Since λ idS ∈ EndA(S) for all λ ∈ K, {λ idS | λ ∈ K} ⊆ EndA(S). Conversely we have shown above that EndA(S) ⊆ {λ idS | λ ∈ K}. Hence these vector spaces are equal; note finally that {λ idS | λ ∈ K} ≅ K as algebras.

Our primary goal shortly will therefore be to classify the simple K[G]-
modules.
Before that, though, we take what appears to be a diversion via another
approach to “building blocks”.

Definition 6.21. An A-module M is indecomposable if it cannot be writ-


ten as a direct sum of two non-zero submodules. Otherwise M is called
decomposable.

That is, we look at M and ask if we can decompose it as a direct sum in a


non-trivial way (0⊕M and M ⊕0 don’t count). If not, M is indecomposable.
If so, do it and continue in the same way. Then every module is a direct
sum of indecomposable submodules—right??
Wrong! Much more care is needed: why should this process terminate?
Indeed, for infinite-dimensional modules it need not. Fortunately, we do
have:

Proposition 6.22. Let M be a non-zero finite-dimensional A-module. Then


M has a decomposition as a finite direct sum of indecomposable submodules.

Proof (barely even a sketch). Induction on dimension.

Remark 6.23. The result remains true if we replace “finite-dimensional” by


“finite length”. (Proof: induction on length!)
This leads us to one of the cornerstone theorems9 of representation the-
ory:

Theorem 6.24 (Krull–Remak–Schmidt theorem). Let M be a non-zero


finite-dimensional A-module and let

M = M1 ⊕ M2 ⊕ · · · ⊕ Mr = N1 ⊕ N2 ⊕ · · · ⊕ Ns

be two decompositions of M into indecomposable submodules. Then r = s and there exists σ ∈ Sr such that Mi ≅ Nσ(i) for all i.
This should remind you of the Jordan–Hölder theorem! However—in
general (but hold the thought!)—these are not the same result in two guises.
Before we start focusing in on finite groups, we will look at three final
general notions that are important in general representation theory. First is
the definition of a free module.
9 See [Sha] for an explanation of the somewhat complicated history of the name of this result.

Definition 6.25. Let M be a left A-module. We say M is a finitely generated free module if M ≅ (A A)⊕n for some n ∈ N.

We say a module M is finitely generated if it is the homomorphic image of a finitely generated free module, i.e. there exists a surjection π : (A A)⊕n ↠ M for some n. (Note that every module is the homomorphic image of a free module (A A)⊕I with I not necessarily finite; take I = M, for example. So the operative part of the definition is the finiteness.)

Definition 6.26. Let P be a left A-module. We say P is projective if the


functor HomA-Mod (P, −) : A-Mod → K-Mod is exact, i.e. for all short exact
sequences 0 → X → Y → Z → 0 in A-Mod,

0 → HomA-Mod (P, X) → HomA-Mod (P, Y ) → HomA-Mod (P, Z) → 0

is a short exact sequence.

Dually,

Definition 6.27. Let I be a left A-module. We say I is injective if the


functor HomA-Mod (−, I) : A-Mod → K-Mod is exact, i.e. for all short exact
sequences 0 → X → Y → Z → 0 in A-Mod,

0 → HomA-Mod (X, I) → HomA-Mod (Y, I) → HomA-Mod (Z, I) → 0

is a short exact sequence.

Proposition 6.28. The following are equivalent:

(a) P is projective;

(b) every epimorphism f : M ↠ P splits, i.e. there exists s : P → M such that f ◦ s = idP (s is a section of f);

(c) P is isomorphic to a direct summand of a free module: there exist A-modules P′, Q and a set I such that P′ ⊕ Q = (A A)⊕I and P ≅ P′.

Proposition 6.29. The following are equivalent:

(a) I is injective;

(b) every monomorphism f : I ↪ M splits, i.e. there exists r : M → I such that r ◦ f = idI (r is a retraction of f);

(c) I has a direct complement whenever it exists as a submodule: if I ≤ M, there exists Q ≤ M such that M = I ⊕ Q.

These classes of modules are particularly important when studying (finite-


dimensional) algebras.

Proposition 6.30. The following are equivalent:

(a) the K-algebra A is semisimple;

(b) every module in A-Mod is injective;

(c) every module in A-Mod is projective;

(d) the category A-Mod is semisimple.

Proof. We first prove (a) if and only if (d), then (b) if and only if (c) and finally (a) if and only if (c).

• (a)⇐⇒(d)
Assume (a). Then every free module A A⊕I is also semisimple. Since
every module M ∈ A-Mod is a quotient of a free module (at worst, we
may take I to be an indexing set for a basis of M ) and quotients of
semisimple modules are semisimple, we have (d).
Conversely (d) implies (a), by definition.

• (b)⇐⇒(c)
Assume (b) and let f : M ↠ P be a surjection. Then Ker f ≤ M is injective so M = Ker f ⊕ Q for some Q. But by the first isomorphism theorem, Im f = P ≅ Q. Then this isomorphism splits f and P is projective.
Assume (c) and let f : I ↪ M be an injection. Then Coker f = M/I is projective so π : M → M/I splits, via r : M/I → M, and M = Ker π ⊕ Im r. But Ker π = I so I has a complement in M and is injective.

• (a)⇐⇒(c)
Assume (a). Then as above, any M ∈ A-Mod is a quotient of a free module (A A)⊕I via π : (A A)⊕I ↠ M and this free module is semisimple. Hence (A A)⊕I = Ker π ⊕ Q with Q ≅ M, so M is projective.
Assume (c). Let M be a submodule of A A. Then since A A/M is projective, π : A A → A A/M splits and M has a direct complement. By Lemma 6.15, A A is semisimple.

Proposition 6.31. Let A be a semisimple K-algebra. Then

(a) every simple module for A is a direct summand of A A.

(b) A A ≅ ⊕_{i=1}^n Si is a finite direct sum of simple modules.

Proof.

(a) Let S ∈ A-Mod be a simple A-module and let s ∈ S \ {0}. Then there
is an A-module homomorphism f : A A → S given by f (a) = a .S s for
all a ∈ A; f is linear by the properties of actions and a .S (f (b)) =
a .S (b .S s) = (ab) .S s = f (ab). Now since S is simple and s 6= 0,
Im f = S and f is surjective.
Then since A semisimple implies S is projective, f splits, i.e. S is
(isomorphic to) a direct summand of A A.
(b) We have that A A = ⊕_{i∈I} Si. Now 1A ∈ A A so there exists a finite subset J ⊆ I and elements sj ∈ Sj for all j ∈ J such that 1A = Σ_{j∈J} sj.
But then for all a ∈ A, a = a1A = Σ_{j∈J} a sj ∈ ⊕_{j∈J} Sj. So ⊕_{j∈J} Sj ⊆ ⊕_{i∈I} Si = A A ⊆ ⊕_{j∈J} Sj and hence A A = ⊕_{j∈J} Sj.

Note that if A is semisimple, the simple modules generate all of A-Mod


under taking direct sums, there are no non-trivial extensions between them
and we have Schur’s lemma to tell us that any non-zero homomorphism
between two simples is an isomorphism. Thus, in a semisimple category, we
know all the representation-theoretic information as soon as we can describe
the simple modules.
For completeness, we include the following theorem, which is another of
the fundamental theorems of representation theory.

Theorem 6.32 (Artin–Wedderburn). If A is a semisimple K-algebra then
    A ≅ ∏_{i=1}^r Mni(Di),
a product of matrix rings of size ni over division K-algebras Di.

Corollary 6.33. If K is algebraically closed and A is a semisimple K-algebra then
    A ≅ ∏_{i=1}^r Mni(K).

Proof. The proof of the Artin–Wedderburn theorem works by defining Di = EndA(Si), where A = ⊕_{i=1}^r Si for Si simple. Then by Schur's lemma (or more precisely Lemma 6.20), since Si is simple and K algebraically closed, EndA(Si) ≅ K.

6.1 Modules for group algebras


We now return to the special case of K[G], the group algebra of a finite
group. By our previous remarks about semisimple algebras, the following
theorem tells us (essentially) everything we need to know.

For any field K, there is a (non-zero) ring homomorphism u : Z → K, u(1Z) = 1K (extended linearly: u(n1Z) = n1K). The kernel Ker u is a proper ideal of Z, therefore10 either Ker u = 0 or Ker u = nZ for some n ∈ Z.
Now since K is a field, Im u is an integral domain and Im u ≅ Z/nZ by the First Isomorphism Theorem (B.144). But for n ≥ 1, Z/nZ is an integral domain if and only if n is prime. So Ker u = 0 or Ker u = pZ for some prime p.
If Ker u = 0, we say K has characteristic zero; if Ker u = pZ with p prime, we say K has characteristic p. We denote the characteristic of K by char K.
Theorem 6.34 (Maschke). Let G be a finite group. Then the group algebra
K[G] is semisimple if and only if char K does not divide |G|.
Note that 0 never divides |G| (since |G| ≥ 1), i.e. over fields of characteristic zero—such as R and C—the group algebra of a finite group is always semisimple.
Before we prove Maschke’s theorem, we need a lemma.
Lemma 6.35. Let G be a finite group and K a field. Let M and N be
K[G]-modules and let f : M → N be a K-linear map. Then
    T(f) : M → N,   T(f)(m) = Σ_{g∈G} g . (f(g⁻¹ . m))

is a K[G]-module homomorphism.
Proof. Exercise.
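The averaging operator T is easy to experiment with. Here is a Python sketch (an illustration assuming numpy is available; G = C3 acts on M = N = C³ by cyclically permuting coordinates) showing that T(f) commutes with the action even when the arbitrary linear map f does not.

    import numpy as np

    # matrices of e, g, g^2 for the coordinate-cycling action of C3 on C^3
    rho = [np.roll(np.eye(3), k, axis=0) for k in range(3)]
    f = np.random.rand(3, 3)  # an arbitrary, typically non-equivariant, linear map
    # T(f)(m) = sum_g g.(f(g^{-1}.m)), i.e. sum_g rho(g) f rho(g)^{-1}
    T = sum(R @ f @ np.linalg.inv(R) for R in rho)
    for R in rho:
        assert np.allclose(R @ T, T @ R)  # T(f) is a C[C3]-module homomorphism
    print("T(f) commutes with the G-action")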

Proof of Maschke's theorem. Assume char K ∤ |G|. Let M be a submodule of K[G] K[G] and let N be a vector space complement to M, i.e. we have K[G] K[G] =v.s. M ⊕ N. Let f : K[G] K[G] → M be the linear map with kernel N. Then char K ∤ |G| implies |G| ≡ |G|1K is invertible in K and so
    γ : K[G] K[G] ↠ M,   γ := (1/|G|) T(f)
is well-defined. By the lemma, γ is a K[G]-module homomorphism. Let K = Ker γ.
We claim that i : M ↪ K[G] K[G] splits γ. Since f(m) = m for all m ∈ M, f(g⁻¹ . m) = g⁻¹ . m and hence g . (f(g⁻¹ . m)) = g . g⁻¹ . m = m for all m ∈ M. Hence
    (γ ◦ i)(m) = (1/|G|) Σ_{g∈G} m = m
as required. Then K[G] K[G] ≅ M ⊕ K as K[G]-modules, so by Lemma 6.15 K[G] K[G] is semisimple, hence K[G] is semisimple.
10 Since Z is a principal ideal domain; if you don't know about this, it is fine to take the statement in the main text on trust.

Now assume K[G] is semisimple. Consider w := Σ_{g∈G} g ∈ K[G]. Then hw = w for all h ∈ G so Kw ≤ K[G] K[G] is a 1-dimensional submodule of K[G] K[G]. Since K[G] is semisimple, there exists C such that as modules we have K[G] K[G] ≅ Kw ⊕ C.
Then there exists c ∈ C such that 1K[G] = λw + c for λ ∈ K, c ∈ C. We have λ ≠ 0, else K[G] K[G] = K[G]C ⊆ C ≠ K[G] K[G], a contradiction.
We see that w² = |G|w and
    w = w1K[G] = w(λw + c) = λ|G|w + wc.
Now wc ∈ C so w − λ|G|w ∈ C and w − λ|G|w = (1 − λ|G|)w ∈ Kw. But Kw ∩ C = {0} and w ≠ 0, λ ≠ 0, so we conclude that 1 − λ|G| = 0 in K, i.e. |G| = λ⁻¹ ≠ 0 in K. That is, char K ∤ |G|.

Remark 6.36. With a little more technology, one can prove the stronger form of Maschke's theorem: K[G] is semisimple if and only if G is finite and char K does not divide |G|.

Theorem 6.37. Let G be a finite group and K an algebraically closed field


of characteristic zero. Then
(a) K[G] ≅ ∏_{i=1}^r Mni(K) as algebras

(b) K[G] has r pairwise non-isomorphic simple modules {Si | 1 ≤ i ≤ r} with dimension dim Si = ni, and
    K[G] K[G] ≅ ⊕_{i=1}^r Si^{⊕ni}

(c) |G| = Σ_{i=1}^r ni²

(d) there exists a complete set of central orthogonal idempotents


{ei | 1 ≤ i ≤ r}

(e) r = dim Z(K[G]) is equal to the number of conjugacy classes of G

(f) G is Abelian if and only if dim Si = 1 for all i; then |G| = r.

Proof.

(a) The assumptions of the theorem allow us to apply Maschke’s theorem,


so that K[G] is semisimple. Then the claim follows from the Artin–
Wedderburn theorem (specifically, from Corollary 6.33).

(b) The matrix algebra Mni(K) has exactly one simple module, the natural module, isomorphic to K⊕ni, of dimension ni. Set Si = K⊕ni. It is straightforward to check that all simples for ∏_{i=1}^r Mni(K) are of the form
    0 × 0 × ··· × 0 × Si × 0 × ··· × 0.
Furthermore Mni(K) is a simple algebra of dimension ni² so
    Mni(K) Mni(K) ≅ Si^{⊕ni},
from which the last claim follows.

(c) This follows by comparing dimensions in the isomorphism in part (b).

(d) Define Bi = Si^{⊕ni}, so K[G] K[G] ≅ ⊕_{i=1}^r Bi. Then there exist elements {ei ∈ Bi | 1 ≤ i ≤ r} such that 1K[G] = Σ_{i=1}^r ei.
Furthermore, it is a general fact that submodules of the left regular module are left ideals of the algebra, so the Bi are left ideals. Consider ei and ej. Since Bj is a left ideal, eiej ∈ Bj. On the other hand,
    ei1K[G] = ei(e1 + ··· + er) = eie1 + ··· + eier.
Then since ei ∈ Bi (left-hand side), eiej ∈ Bj and the sum K[G] K[G] ≅ ⊕_{i=1}^r Bi is direct, we must have eiej = 0 for i ≠ j and hence ei = eiei. That is, ei is an idempotent and the set {ei} is a complete set of orthogonal idempotents.
For b ∈ Bi, b1K[G] = b = 1K[G]b, 1K[G] = Σ_{i=1}^r ei and the orthogonality of the idempotents imply that bei = b = eib so that ei is the multiplicative identity of Bi, hence identified with the identity matrix in Mni(K) in the algebra decomposition K[G] ≅ ∏_{i=1}^r Mni(K). The product structure implies that each ei commutes with the other factors Mnj(K) for j ≠ i and hence the ei are central.

(e) Now the ei span a subspace of Z(K[G]) of dimension r. By Schur's lemma, any z ∈ Z(K[G]) acts on Si by a scalar λi and zei = λiei. Then
    z = z1K[G] = z(Σ_{i=1}^r ei) = Σ_{i=1}^r λiei ∈ spanK{ei | 1 ≤ i ≤ r}.
So dim Z(K[G]) ≤ r and hence dim Z(K[G]) = r (and each ei spans Z(Bi)). By Proposition 5.11, this is the number of conjugacy classes.

(f) Note first that Mni (K) is commutative if and only if ni = 1. So if G


is Abelian, and hence K[G] is commutative, we must have ni = 1 for
all i and the result follows from (b).

If all simples are 1-dimensional, then ni = 1 for all i so K[G] ≅ K⊕r is commutative, hence G is Abelian. Then by (c),
    |G| = Σ_{i=1}^r ni² = Σ_{i=1}^r 1² = r.

The take-away from this theorem is that for finite groups and over algeb-
raically closed fields of characteristic zero, we have very detailed information
on their representation theory. Several of these details will re-emerge shortly,
when we look at characters.
First, let us specialise one step further and look at what this means for
the simplest groups, namely finite Abelian groups. To avoid field-theoretic
complications, let us take K = C.
If G is a finite Abelian group, then
    G ≅ Cn1 × Cn2 × ··· × Cnr
where Cni = ⟨gi | gi^{ni} = e⟩. In particular, if g ∈ G we can write (by a mild abuse of notation) g = g1^{a1} ··· gr^{ar} with ai ∈ Z.
Then letting ζi ∈ C be an ni-th root of unity and ζ = (ζ1, ..., ζr), define Vζ to be the 1-dimensional module Vζ = Cv with
    g1^{a1} ··· gr^{ar} . v = ζ1^{a1} ··· ζr^{ar} v.
These are 1-dimensional and hence necessarily simple.
There are ni ni-th roots of unity, so ∏_{i=1}^r ni tuples ζ = (ζ1, ..., ζr). But ∏_{i=1}^r ni = |G| so we have the right number of (1-dimensional) simple modules. With a little more work, one can check that these are pairwise non-isomorphic and hence are exactly the simple modules.
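These one-dimensional modules are easy to enumerate by machine. The following Python sketch (an illustration assuming numpy; the group C2 × C3 is an arbitrary choice) builds all |G| linear characters from tuples of roots of unity and confirms that they are pairwise distinct, indeed orthonormal for the inner product introduced in the next section.

    import numpy as np
    from itertools import product

    n = (2, 3)  # G = C2 x C3, so |G| = 6
    elements = list(product(range(n[0]), range(n[1])))  # g = g1^a1 g2^a2
    roots = list(product(*[[np.exp(2j * np.pi * k / m) for k in range(m)]
                           for m in n]))

    # chi_zeta(g1^a1 g2^a2) = zeta1^a1 zeta2^a2
    chars = [np.array([z1**a1 * z2**a2 for (a1, a2) in elements])
             for (z1, z2) in roots]
    gram = np.array([[np.vdot(d, c) / len(elements) for d in chars]
                     for c in chars])
    assert np.allclose(gram, np.eye(len(chars)))  # <chi_i, chi_j> = delta_ij
    print(len(chars), "pairwise orthonormal linear characters")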
Let us continue our journey by asking how might we achieve similar con-
crete understanding of the representation theory of arbitrary finite groups.
We will do this through the use of a very powerful tool, character theory.

7 Character theory
From this point, unless otherwise specified, G will be a finite group, all
vector spaces will be C-vector spaces and all C[G]-modules will be finite-
dimensional left C[G]-modules.
A complex matrix representation of a finite group ρ : G → GLn (C) con-
tains a large amount of data: |G|n2 complex numbers, in fact. Passing from
a representation to its character discards the majority of this but retains
enough to be useful, as we will see. The character of ρ is the map χ : G → C
given by tr ◦ ρ, taking each group element to the trace of the representing
matrix, and so keeps only |G| pieces of data. One may alternatively prefer
to define the character of a module:

Definition 7.1. Let M = (M, .) be a C[G]-module with basis B. The
character of M is the function χM : G → C defined by χM (g) = tr(MBB (Lg )),
where Lg : M → M is the endomorphism of M given by Lg (m) = g . m.

(You might want to remind yourself of Proposition 4.12: here, Lg =


ρ. (g), for ρ. the canonical representation of A obtained from ..)
Since tr(P −1 AP ) = tr(A) for any invertible matrix P , we see that χM
does not depend on the choice of basis B. It also follows that isomorphic
C[G]-modules have the same character.
It also follows from the construction of direct sums and the additivity of
trace that for modules M and N , χM ⊕N = χM + χN .

Definition 7.2. We say that a map χ : G → C is a character of G if it is the


character of some C[G]-module. We say χ is irreducible if χ is the character
of some simple C[G]-module.

Examples 7.3.

(a) Let C be the trivial C[G]-module, with g . z = z for all z ∈ C and


g ∈ G. The character of this module, χ1 : G → C, is given by χ1 (g) = 1
for all g ∈ G and is called the trivial (or principal) character of G. It
is (almost) always enumerated as χ1 , both since its values are all 1
and since it is the “first”, most natural, character in some sense.

(b) If G acts on a set Ω and C[Ω] is the corresponding permutation module, then the permutation character χΩ : G → C has χΩ(g) = |FixΩ(g)| for all g ∈ G, where FixΩ(g) := {x ∈ Ω | g . x = x}.

(c) For G = Sn, the sign of a permutation gives rise to a character: χsign(σ) := sign(σ) ∈ {±1}. (The “right” way to show this will take quite a bit of work!)

(d) The regular C[G]-module C[G] C[G] has character
    χreg(g) = |G| if g = e,   χreg(g) = 0 if g ≠ e,
and this character is called the regular character. (This formula for χreg is a consequence of (b) with Ω = G, g . h = gh, as no element other than e fixes an element of G: g . h = gh = h ⇒ g = e.)

Characters are examples of a special type of function from a group to


the complex numbers.

Definition 7.4. A function ϕ : G → C is a class function if ϕ(g) = ϕ(h−1 gh)


for all g, h ∈ G. That is, a class function is constant on conjugacy classes.

Proposition 7.5. Let χ be a character of G and thus the character of some
C[G]-module M . Then
(a) χ(e) = dim M ;

(b) χ is a class function; and

(c) χ(g⁻¹) = χ(g)‾ (the complex conjugate of χ(g)) for all g ∈ G.


Proof. Let n = dim M .
(a) The endomorphism Le is just the identity map so MBB (Le ) = In and
hence χ(e) = n = dim M .

(b) Let g, h ∈ G. As endomorphisms, L_{h⁻¹gh} = Lh⁻¹ ◦ Lg ◦ Lh so
    tr(MBB(L_{h⁻¹gh})) = tr(MBB(Lh)⁻¹ MBB(Lg) MBB(Lh)) = tr(MBB(Lg))
and χ(g) = χ(h⁻¹gh).

(c) Since G is finite, g ∈ G has some finite order m. Then M = MBB(Lg) satisfies x^m − 1, which splits into distinct linear factors over C; hence the minimum polynomial of M does too and so M is diagonalisable. This implies that χ(g) = tr(M) is the sum of the eigenvalues of M and since these are roots of x^m − 1, they are m-th roots of unity.
Now the eigenvalues of M⁻¹ = MBB(L_{g⁻¹}) are inverses of roots of unity, but if ω is a root of unity then ω⁻¹ = ω̄ and hence χ(g⁻¹) = χ(g)‾.
The proof of (c) also tells us that |χ(g)| ≤ χ(e) = n and if g has order 2
then χ(g) ∈ Z and χ(g) ≡ χ(e) (mod 2) (exercise).
Definition 7.6. The degree of a character χ of G is χ(e). Characters of
degree 1 are called linear characters.
The trivial character is linear; the regular character has degree |G|. The
map sign : Sn → C× given by taking the sign of a permutation is a linear
character of Sn .
Lemma 7.7. Let G be a group and χ : G → C a character of G. The codomain restriction χ|C× : G → C× is a group homomorphism if and only if χ is a linear character.
Proof. We sketch the implication “linear ⇒ homomorphism”. Since χ is linear, i.e. is the character of a module M of dimension 1, we may identify M with C and take the basis B = {1} for it.
Then for any g ∈ G, g . 1 = λg ∈ C. Then Lg is the map λg idC, or equivalently MBB(Lg) = (λg), a 1 × 1 matrix. Now tr((λg)) = λg, from which it is straightforward to check that χ|C× is a homomorphism.
We leave the converse as an exercise.

Remark 7.8. Let ρ : G → GLn(C) be a representation of G and χ : G → C its character. Then for g ∈ G, |χ(g)| = χ(e) if and only if ρ(g) = λIn for some λ ∈ C. Hence the kernel of ρ is equal to the set {g ∈ G | χ(g) = χ(e)}, so that the character retains full information on the kernel of its originating representation. We then set Ker χ := {g ∈ G | χ(g) = χ(e)} = Ker ρ.

Definition 7.9. Let χ be a character of G. Define χ̄ : G → C by χ̄(g) = χ(g)‾, the complex conjugate of χ(g), for all g ∈ G.

Lemma 7.10. If χ is a character of G then so is χ̄.

Proof. Defining Ā ∈ GLn(C) to be the entrywise complex conjugate of A, we have (AB)‾ = ĀB̄, so if ρ is a representation having character χ then ρ̄, defined by ρ̄(g) := ρ(g)‾, is a representation of G with character χ̄.

Furthermore, χ̄ is irreducible if and only if χ is.


We illustrate some of the above results by looking at characters of C[C4 ].
There are precisely four non-isomorphic 1-dimensional C[C4 ]-modules and
they form a complete set of irreducible C[C4 ]-modules (see Theorem 6.37 and
the discussion immediately afterwards). The corresponding representations
are ρ1, ρ−1, ρi and ρ−i, given by ρz(g^k) = z^k for C4 = ⟨g⟩, z ∈ {±1, ±i}. Since C4 is Abelian, its conjugacy classes are in bijection with its elements and the characters corresponding to the above representations are as follows:

           e    g    g²   g³
    χ1     1    1    1    1
    χ−1    1   −1    1   −1
    χi     1    i   −1   −i
    χ−i    1   −i   −1    i

Note that χi(g⁻¹) = χi(g³) = −i = χ−i(g) and moreover χ̄i = χ−i.


We have actually constructed the character table of C4 . Before we see
more character tables, we will prove a few more properties of characters,
to help us build our examples. These are the orthogonality relations, for
which we of course need an inner product. The set of functions from G to
C, Fun(G, C), has a natural vector space structure and the subset of class
functions ClassFun(G, C) is a subspace of dimension r = dim Z(C[G]). (A
basis for ClassFun(G, C) is given by {δi | 1 ≤ i ≤ r}, δi |Cj = δij .)

Definition 7.11. There is a complex inner product ⟨ , ⟩ on Fun(G, C) defined by
    ⟨ϕ, ψ⟩ = (1/|G|) Σ_{g∈G} ϕ(g) ψ(g)‾
for all ϕ, ψ ∈ Fun(G, C).
Since for characters, ψ(g)‾ = ψ(g⁻¹) and {g⁻¹ | g ∈ G} = G, it is easy to see that when we restrict this inner product to the set of characters of G, we have ⟨ϕ, ψ⟩ = ⟨ψ, ϕ⟩, so ⟨ϕ, ψ⟩ is in fact real and ⟨ , ⟩ is symmetric, rather than just conjugate-symmetric.
Recall that we may write C[G] C[G] = ⊕_{i=1}^r Bi where each pair of submodules Bi ≅ Si^{⊕ni} and Bj ≅ Sj^{⊕nj} have no common composition factor. Recall also that there exist {ei | 1 ≤ i ≤ r} a complete set of orthogonal central idempotents, with each ei spanning Z(Bi).

Proposition 7.12. Let ϕi denote the character of the C[G]-submodule Bi of C[G] C[G]. Then ⟨ϕi, ϕi⟩ = ϕi(e) = dim Bi.

Proof. We claim that ei = (1/|G|) Σ_{g∈G} ϕi(g⁻¹) g. To see this, fix h ∈ G and consider the C[G]-endomorphism θ : C[G] → C[G], θ(k) = k ei h⁻¹. Now for bj ∈ Bj we have bj ei = δij bj so θ|Bi is a Bi-endomorphism with trace ϕi(h⁻¹) (the map θ|Bi acts on Bi by right multiplication by h⁻¹) and θ|Bj = 0 for i ≠ j. Hence tr(θ) = ϕi(h⁻¹).
Since ei ∈ C[G], there exist scalars λg, g ∈ G, such that ei = Σ_{g∈G} λg g. From our analysis of the regular character, the endomorphism k ↦ kgh⁻¹ has trace zero if gh⁻¹ ≠ e and trace |G| if gh⁻¹ = e. So the map θ, which has θ(k) = k Σ_{g∈G} λg gh⁻¹, must have tr(θ) = λh|G|. Hence ϕi(h⁻¹) = λh|G| so for any h ∈ G, λh = ϕi(h⁻¹)/|G|. Thus ei = (1/|G|) Σ_{g∈G} ϕi(g⁻¹) g, as claimed.
It follows that the coefficient of the identity element e in ei² is equal to both (1/|G|²) Σ_{g∈G} ϕi(g⁻¹)ϕi(g) = (1/|G|) ⟨ϕi, ϕi⟩ and ϕi(e)/|G| (since ei² = ei) and so ⟨ϕi, ϕi⟩ = ϕi(e) = dim Bi.

From this, we derive our first orthogonality theorem.

Theorem 7.13 (Row orthogonality relations).


Let Irr(G) = {χi | 1 ≤ i ≤ r} be the set of irreducible characters of G. Then ⟨χi, χj⟩ = δij.

Proof. Let Si and Sj be simple C[G]-submodules of C[G] C[G] with characters χi and χj. Set nk = dim Sk (k ∈ {i, j}). Then since Bi = Si^{⊕ni}, the character ϕi of Bi is equal to niχi and so by the previous proposition
    ⟨niχi, niχi⟩ = dim Bi = (dim Si)² = ni².
Hence ⟨χi, χi⟩ = 1.
For the case i ≠ j, set Y = Bi ⊕ Bj and Z = ⊕_{k≠i,j} Bk, so Y and Z have no common composition factors and C[G] C[G] = Y ⊕ Z. Now letting {ei} be as above, we see from the proof of the previous proposition that
    ei + ej = (1/|G|) Σ_{g∈G} (ϕi(g⁻¹) + ϕj(g⁻¹)) g
and the coefficient of e in ei + ej is (ϕi(e) + ϕj(e))/|G|. Denote by λg the coefficient of g in |G|(ei + ej), so λg = ϕi(g⁻¹) + ϕj(g⁻¹). Then
    (ei + ej)² = (1/|G|²) (Σ_{g∈G} λg g)² = (1/|G|²) Σ_{g∈G} (Σ_{h∈G} λh λ_{h⁻¹g}) g
has coefficient of e equal to
    (1/|G|²) Σ_{h∈G} λh λ_{h⁻¹} = (1/|G|²) Σ_{h∈G} (ϕi(h⁻¹) + ϕj(h⁻¹))(ϕi(h) + ϕj(h)).
Also
    ⟨ϕi + ϕj, ϕi + ϕj⟩ = (1/|G|) Σ_{h∈G} (ϕi(h) + ϕj(h))(ϕi(h⁻¹) + ϕj(h⁻¹))
and so, comparing coefficients of e in (ei + ej)² = ei + ej, we see that ⟨ϕi + ϕj, ϕi + ϕj⟩ = ϕi(e) + ϕj(e). Since ϕi = niχi and ϕj = njχj, and ϕi(e) = ni² and ϕj(e) = nj², we see that on the one hand
    ⟨ϕi + ϕj, ϕi + ϕj⟩ = ⟨niχi + njχj, niχi + njχj⟩ = ni²⟨χi, χi⟩ + 2ninj⟨χi, χj⟩ + nj²⟨χj, χj⟩ = ni² + 2ninj⟨χi, χj⟩ + nj²
and on the other
    ⟨ϕi + ϕj, ϕi + ϕj⟩ = ϕi(e) + ϕj(e) = ni² + nj².
Hence 2ninj⟨χi, χj⟩ = 0 and so ⟨χi, χj⟩ = 0.

Many important results follow from this.

Theorem 7.14. The set Irr(G) of irreducible characters is an orthonormal


basis for ClassFun(G, C).

Proof. Orthonormality is the content of the preceding theorem and linear


independence is immediate from this. Since the number of irreducible char-
acters is equal to the number of conjugacy classes and this is also equal to
dim ClassFun(G, C) as noted above, we have the result.

Indeed, χ = Σ_{i=1}^r ⟨χ, χi⟩ χi for any character χ.
Note that the following theorems use the semisimplicity of K[G] in a
strong way.

Theorem 7.15. Two C[G]-modules are isomorphic if and only if their char-
acters are equal.

Proof. We discussed the forward direction earlier.
If V, W are finite-dimensional C[G]-modules with equal character χ then V ≅ ⊕_{i=1}^r Si^{⊕αi}, W ≅ ⊕_{i=1}^r Si^{⊕βi} and αi = ⟨χ, χi⟩ = βi, so V ≅ W.

Theorem 7.16. A C[G]-module M is simple if and only if its character χ satisfies ⟨χ, χ⟩ = 1.

Proof. By the above, if M is simple, ⟨χ, χ⟩ = 1. Conversely, if ⟨χ, χ⟩ = 1 then since χ = Σ_{i=1}^r miχi for some non-negative integers mi, we have 1 = ⟨χ, χ⟩ = Σ_{i=1}^r mi² and so exactly one mi is non-zero, mk say, and mk = 1. So χ = χk and M ≅ Sk is simple.

Theorem 7.17. Let V and W be C[G]-modules with characters ϕ and ψ. Then ⟨ϕ, ψ⟩ = dim HomC[G]-Mod(V, W).

Proof. This follows straightforwardly from C[G] being semisimple (and hence all its modules being semisimple), orthogonality and Schur's lemma, which tells us that dim HomC[G]-Mod(Si, Sj) = δij.

There is a second orthogonality result, called column orthogonality for


reasons we will see very shortly.

Theorem 7.18. Let G be a group and {gi | 1 ≤ i ≤ r} a complete set of conjugacy class representatives. Let Irr(G) = {χi | 1 ≤ i ≤ r} be the irreducible characters of G. Then

(a) Σ_{k=1}^r χi(gk) χj(gk)‾ / |CG(gk)| = δij (row orthogonality relations);

(b) Σ_{k=1}^r χk(gi) χk(gj)‾ = δij |CG(gi)| (column orthogonality relations).

Proof. The first claim is simply a restatement of Theorem 7.13, making use of the fact that characters are class functions and that |Ci||CG(gi)| = |G|.
For the second part, let {δj | 1 ≤ j ≤ r} be the aforementioned basis for ClassFun(G, C), with δj taking value 1 on Cj and 0 elsewhere. Since Irr(G) is a basis for ClassFun(G, C), δj = Σ_{k=1}^r λ^j_k χk for some λ^j_k ∈ C. Then
    λ^j_k = ⟨δj, χk⟩ = (1/|G|) Σ_{g∈G} δj(g) χk(g)‾ = (1/|G|) |Cj| χk(gj)‾ = χk(gj)‾ / |CG(gj)|.
Hence δij = δj(gi) = Σ_{k=1}^r λ^j_k χk(gi) = Σ_{k=1}^r χk(gi) χk(gj)‾ / |CG(gj)|.

We can now turn properly to character tables.

Definition 7.19. Let Irr(G) = {χi | 1 ≤ i ≤ r} be the irreducible characters


of G and {gi | 1 ≤ i ≤ r} a complete set of conjugacy class representatives,
with χ1 the trivial character and g1 = e (by convention). The r × r matrix
with (i, j)-entry χi (gj ) is called the character table of G.

Note that the character table is an invertible matrix: by orthogonality, its rows—the irreducible characters—are linearly independent. Also since
    Σ_{i=1}^r χi(e)² = Σ_{i=1}^r (dim Si)² = |G|
we see that the sum of the squares of the entries in the first column equals |G|. We will refer to this as “the degree sum property”.
Examples 7.20.
(a) G = C2 = ⟨a | a² = e⟩; C1 = {e}, C2 = {a} so set g1 = e, g2 = a.

gi e a
|CG (gi )| 2 2
χ1 1 1
χ2 1 −1
• χ1 is the trivial character
• χ2 (e) = 1 since |G| = 2 = 12 + 12
• χ2 (a) = −1 by row orthogonality: 1 · 1 + 1 · χ2 (a) = 0 implies
χ2 (a) = −1
Note that χreg = χ1 + χ2 .
(b) G = S3
gi ι (1 2 3) (1 2)
|CG (gi )| 6 3 2
χ1 1 1 1
χ2 1 1 −1
χ3 2 −1 0
• χ1 is the trivial character; χ2 is the sign character
• χ3 (ι) is obtained from the degree sum property: 22 = 6 − 12 − 12
• χ3 ((1 2 3)) is obtained by column orthogonality:

1 · 1 + 1 · 1 + 2 · χ3 ((1 2 3)) = 0 ⇒ χ3 ((1 2 3)) = −1

• χ3 ((1 2)) is obtained by row orthogonality:


    (1 · 2)/6 + (1 · (−1))/3 + (1 · χ3((1 2)))/2 = 0 ⇒ χ3((1 2)) = 0
For the natural permutation module on Ω = {1, 2, 3}, χΩ = χ1 + χ3
(recall that χΩ (g) = |FixΩ (g)|). This is a general phenomenon: for
any subgroup H of Sn with permutation character χΩ , ν = χΩ − χ1 is
a character of H.
Here χreg = χ1 + χ2 + 2χ3 .

(c) G = C2 × C2 = ⟨a, b | a² = e, b² = e, ab = ba⟩

gi e a b ab
|CG (gi )| 4 4 4 4
χ1 1 1 1 1
χ2 1 1 −1 −1
χ3 1 −1 1 −1
χ4 1 −1 −1 1
Here χreg = χ1 + χ2 + χ3 + χ4 .
In order to further aid our search for characters of a group, we return to
linear characters; that is, characters of degree 1.
Proposition 7.21. Let λ be a linear character of G and χ any character.
Then the product χλ : G → C defined by (χλ)(g) = χ(g)λ(g) for all g ∈ G
is a character of G. Furthermore, χλ is irreducible if and only if χ is
irreducible.
Proof. Exercise.

Corollary 7.22. Let Ĝ be the set of linear characters of G. Define a binary operation · on Ĝ by (λ1 · λ2)(g) = λ1(g)λ2(g) for all g ∈ G. Then (Ĝ, ·) is an Abelian group, called the dual group, with identity element the trivial character.
Proof. Exercise.

Now G is Abelian if and only if every irreducible character is linear, as follows easily from the theorem that the number of characters equals the number of conjugacy classes and the degree sum property. In fact, if A is a finite Abelian group then Â ≅ A.
To find linear characters, we use a special case of a process called lifting,
which produces characters of G from those of quotients of G. Proper quo-
tients are smaller, so finding their characters is in principle easier. We will
also see that if we know all characters of G, we may identify the possible
normal subgroups of G and so we can use character theory as a way to see
if a group is simple (that is, has no non-trivial proper normal subgroups).
Proposition 7.23. Let N be a proper normal subgroup of a group G and let χ̃ be a character of G/N. Define χ : G → C by χ(g) = χ̃(gN) for all g ∈ G. Then χ is a character of G of the same degree as χ̃ and χ is irreducible if and only if χ̃ is. The character χ is called the lift of χ̃ to G.
Proof. Let ρ̃ : G/N → GLn(C) be a representation of G/N with character χ̃ and let π : G → G/N be the natural surjective homomorphism. Then their composition ρ = ρ̃ ◦ π is clearly a representation of G with character values
    χ(g) = tr(ρ̃(π(g))) = tr(ρ̃(gN)) = χ̃(gN),
and thus character χ. Then χ(e) = χ̃(eN) = χ̃(e_{G/N}) so the degrees are equal.

Theorem 7.24. Let N be a proper normal subgroup of G. There is a


bijection between the set of characters of G/N and the set of characters of
G having N in their kernel. This bijection preserves irreducibility.

Proof. Exercise (hard).

Given a group G, its derived group is the subgroup
    G′ = ⟨ghg⁻¹h⁻¹ | g, h ∈ G⟩
generated by all commutators of pairs of elements of G. It is the unique minimal normal subgroup of G such that G/G′ is Abelian.

Corollary 7.25. The number of linear characters of G is equal to |G/G′| where G′ is the derived group of G.

Proof. Exercise.

So from knowledge of the group-theoretic structure of Sn we may find its linear characters: Sn′ = An so there are two linear characters, χ1 (trivial) and χ2. We know the sign character is linear, so this is χ2, and there are no others.
For the other direction, finding normal subgroups from character tables, recall that Ker χ = {g ∈ G | χ(g) = χ(e)} ⊴ G. So any intersection of kernels of characters is a normal subgroup. In fact, every normal subgroup arises in this way.

Proposition 7.26. If N ⊴ G then there exist irreducible characters χ_{i1}, ..., χ_{is} of G such that N = ∩_{k=1}^s Ker χ_{ik}.

Proof. If g ∈ Ker χ for all χ ∈ Irr(G) then g ∈ Ker δ1 (where δ1 is the class function with value 1 on C1 = {e} and 0 otherwise), since Irr(G) is a basis for the space of class functions. So g = e and ∩_{χ∈Irr(G)} Ker χ = {e}.
Let Irr(G/N) = {χ̃_{i1}, ..., χ̃_{is}}, so {N} = ∩_{k=1}^s Ker χ̃_{ik} by the above (applied to G/N). Let χ_{ik} be the lift of χ̃_{ik} to G and note that g ∈ Ker χ_{ik} implies gN ∈ Ker χ̃_{ik}. So if g ∈ ∩_{k=1}^s Ker χ_{ik} then gN ∈ ∩_{k=1}^s Ker χ̃_{ik} = {N} and so g ∈ N. Hence N = ∩_{k=1}^s Ker χ_{ik}.

From this we see how to detect simplicity.

Proposition 7.27. The group G is simple if and only if G is non-trivial


and Ker χi = {e} for all χi ∈ Irr(G) \ {χ1 }.

Proof. We prove that G is not simple if and only if Ker χi 6= {e} for some
non-trivial irreducible character χi . If N is a proper non-trivial normal
subgroup of G, then by the previous proposition there exists an irreducible
character χi whose kernel is neither {e} nor G, as required. Conversely,
given a non-trivial χi with non-trivial kernel, this kernel is a proper non-
trivial normal subgroup.

We see from its character table that S3 is not simple: χ2 ((1 2 3)) = 1 =
χ2 (ι). Indeed the sign character shows that Sn is not simple for all n ≥ 3.
We return to giving some more examples, beginning with A4 .
Examples 7.28.

(a) G = A4
    gi         ι    (1 2)(3 4)   (1 2 3)   (1 3 2)
    |CG(gi)|   12   4            3         3
    χ1         1    1            1         1
    χ2         1    1            ω         ω²
    χ3         1    1            ω²        ω
    χ4         3   −1            0         0

• A4′ = V4 ≅ C2 × C2 and A4/V4 ≅ C3 so there are three linear characters, lifting those of C3. Set ω = e^{2πi/3}. Then the linear characters are χ1, χ2, χ3 as above.
• one irreducible character remains: ν = χΩ − χ1 is a character and ⟨ν, ν⟩ = 9/12 + 1/4 + 0 + 0 = 1 so χ4 = ν is irreducible.

(b) G = S4

gi ι (1 2) (1 2 3) (1 2)(3 4) (1 2 3 4)
|CG (gi )| 24 4 3 8 4
χ1 1 1 1 1 1
χ2 1 −1 1 1 −1
χ3 2 0 −1 2 0
χ4 3 1 0 −1 −1
χ5 3 −1 0 −1 1

• the linear characters are χ1 (trivial) and χ2 (sign) and so there


are three irreducible non-linear characters to find.
• χ4 = χΩ − χ1 is a character and ⟨χ4, χ4⟩ = 1, as is easily checked,
so χ4 is irreducible.
• hence χ5 = χ4 χ2 is another irreducible character.
• V4 ⊴ S4 and S4/V4 ≅ S3. The trivial and sign characters of S3 lift to those of S4. The lift χ3 of the degree 2 character χ̃3 of S3 is as in the table, since (1 2)(3 4) ∈ V4 implies χ3((1 2)(3 4)) = χ̃3(ι) = 2 and V4(1 2 3 4) = V4(1 3) implies
    χ3((1 2 3 4)) = χ̃3((1 3)) = χ̃3((1 2)) = 0.

(c) G = D14 = ⟨a, b | a⁷, b², b⁻¹ab = a⁻¹⟩

    gi         e    a        a²       a³       b
    |CG(gi)|   14   7        7        7        2
    χ1         1    1        1        1        1
    χ2         1    1        1        1       −1
    χ3         2    ε+ε⁻¹    ε²+ε⁻²   ε³+ε⁻³   0
    χ4         2    ε²+ε⁻²   ε⁴+ε⁻⁴   ε⁶+ε⁻⁶   0
    χ5         2    ε³+ε⁻³   ε⁶+ε⁻⁶   ε²+ε⁻²   0

(where ε = e^{2πi/7}, as defined below).

• a set of conjugacy class representatives is {e, a, a2 , a3 , b} so we


need to find five irreducible characters.
• D14′ = ⟨a⟩ and D14/⟨a⟩ ≅ C2 so we obtain χ1 and χ2 by lifting the (linear) characters of C2.
• the remaining characters of D14 all have degree 2 and are obtained from the following natural representations of D14. Set ε = e^{2πi/7} and for each integer 1 ≤ j < 7/2 define

    Aj = [ ε^j    0    ]        Bj = [ 0  1 ]
         [ 0      ε^−j ]             [ 1  0 ]

It is straightforward to check that Aj⁷ = Bj² = I and Bj⁻¹AjBj = Aj⁻¹, so ρj : D14 → GL2(C), ρj(a^r b^s) = Aj^r Bj^s, is a representation of D14 for each j. If i ≠ j, Ai has different eigenvalues from Aj so ρi and ρj cannot be equivalent. For 1 ≤ j ≤ 3 define χ_{j+2} to be the character of ρj. An easy calculation shows that ⟨χi, χi⟩ = 1 for 3 ≤ i ≤ 5, so these characters are irreducible. Alternatively, it is clear from the definitions that the representations ρj are irreducible.
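The relations and the character values above can be confirmed directly; a quick Python sketch (an illustration assuming numpy, shown for j = 1):

    import numpy as np

    eps = np.exp(2j * np.pi / 7)
    A = np.diag([eps, eps**(-1)])                   # A_1
    B = np.array([[0, 1], [1, 0]], dtype=complex)   # B_1
    assert np.allclose(np.linalg.matrix_power(A, 7), np.eye(2))      # a^7 = e
    assert np.allclose(B @ B, np.eye(2))                             # b^2 = e
    assert np.allclose(np.linalg.inv(B) @ A @ B, np.linalg.inv(A))   # b^{-1}ab = a^{-1}
    print(np.trace(A).real)  # chi_3(a) = eps + eps^{-1} = 2cos(2pi/7), approx 1.247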

So we see that character tables encode a lot of information about groups.


However they do not capture everything.

Proposition 7.29. Define
    Q = ⟨a, b | a⁴, a² = b², b⁻¹ab = a⁻¹⟩,
the quaternion group of order 8. The character tables of Q and D8, the dihedral group of order 8, are equal but Q ≇ D8, so non-isomorphic groups may have the same character table.

Proof. Exercise.

At this point, we will remark that one can gain information about the
representation theory of more general algebras than just group algebras by
the use of characters. However, the group setting is where it is most effective
and for arbitrary algebras, especially finite-dimensional ones, there are other
techniques.
These include quivers: on the one hand, we can associate a quiver11
to an algebra from its structure, and then use the representation theory of
quivers to study it. On the other, there are natural quivers appearing in
the category of modules, the most important being the Auslander–Reiten
quiver. This requires a whole other course in representation theory, but for
those interested, [EH], [Sch] and [ASS] are recommended reading.

11 This is another name for a directed graph.

A Burnside's p^α q^β theorem
There are many other important and interesting constructions to help find
characters but we cannot describe them all here. Some of the most significant
include tensor products, restriction and induction and Frobenius reciprocity
but in this appendix we explain one of the most significant applications of
representation theory to structure theory: Burnside's p^α q^β theorem.
To appreciate this example fully one needs some advanced group theory.
In particular, this appendix is recommended to be read alongside a course
on Galois Theory.
We will begin with some properties of algebraic integers necessary for our
proof of Burnside's p^α q^β theorem and, along the way, demonstrate a pair of
very useful theorems, namely that rational character values are necessarily
integers and that the degree of any irreducible character divides the group
order.
Definition A.1. A complex number λ is said to be an algebraic integer if
it is the eigenvalue of an integer matrix.
Equivalently, λ is an algebraic integer if it is the root of some monic polynomial in Z[x]. So every integer is an algebraic integer, as is the positive square root √m of any natural number m (consider the matrix ( 0 1 ; m 0 )). If λ is an algebraic integer so are −λ and λ̄. Also every root of unity is an algebraic integer (consider x^n − 1). Indeed, if λ and µ are algebraic integers then so are λµ and λ + µ, so the set of algebraic integers A is a subring of C. It follows that every character value χ(g) is an algebraic integer, as we showed earlier that character values are sums of roots of unity.
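As a quick illustration of the definition (a Python sketch assuming numpy, not part of the printed notes):

    import numpy as np

    # sqrt(2) is an algebraic integer: it is an eigenvalue of the integer
    # matrix [[0, 1], [2, 0]], whose characteristic polynomial is x^2 - 2.
    ev = np.linalg.eigvals(np.array([[0, 1], [2, 0]]))
    print(np.sort(ev.real))  # approximately [-1.41421356, 1.41421356]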
The next theorem tells us even more about character values.
Theorem A.2. Any rational algebraic integer is an integer: Q ∩ A = Z.
Proof. Suppose λ ∈ Q \ Z. Write λ = r/s with (r, s) = 1 and s ≠ ±1. Let p be a prime dividing s. Then if A ∈ Mn(Z), the off-diagonal entries of sA − rI are divisible by s and hence by p. So det(sA − rI) = (−r)^n + mp for some m ∈ Z. Since p ∤ r we conclude that det(sA − rI) ≠ 0. Then det(A − λI) = (1/s^n) det(sA − rI) ≠ 0 for all A ∈ Mn(Z) and λ is not an algebraic integer.

Corollary A.3. Every rational character value is an integer.



Corollary A.4. √2 is irrational.
The following lemmas yield algebraic integers naturally coming from char-
acters.
Lemma A.5. Let r = Σ_{g∈G} αg g ∈ C[G] with αg ∈ Z for all g ∈ G. If there exists a non-zero u ∈ C[G] such that ur = λu with λ ∈ C then λ is an algebraic integer.

Proof. The existence of u gives us that λ is an eigenvalue of the integer matrix A = (α_{gi⁻¹gj})i,j, where {g1, ..., gn} is an enumeration of the elements of G.
Lemma A.6. If χ is an irreducible character of G and g ∈ G then (|G|/|CG(g)|)(χ(g)/χ(e)) is an algebraic integer.
Proof. Let S be a simple C[G]-module
P with character χ, C be the conjugacy
class containing g and C = h∈C h the class sum. We claim that uC = λu
for all u ∈ S, where λ = |C|G| χ(g)
G (g)| χ(e)
.
Since C ∈ Z(C[G]) there exists some
P λ ∈B C such that uC = λu for all
u
P ∈ V . If B is a basis for S then h∈C MB (Lh ) = λI so taking traces,
h∈C χ(h) = λχ(e). But χ is constant on conjugacy classes, hence on C,
so |C|χ(g) = λχ(e) and hence λ = |C|G| χ(g)
G (g)| χ(e)
. We conclude that λ is an
algebraic integer, by the previous lemma.
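As a quick aside, Lemma A.6 is easy to check numerically for S3, whose
character table we take as known here:

    from fractions import Fraction

    # Assumed data from the character table of S3: one (class size, chi(g))
    # pair per conjugacy class (e, transpositions, 3-cycles) for the
    # degree-2 irreducible character chi; |G|/|C_G(g)| is the class size.
    chi_e = 2
    classes = [(1, 2), (3, 0), (2, -1)]
    for size, value in classes:
        lam = Fraction(size * value, chi_e)   # |C| * chi(g)/chi(e)
        assert lam.denominator == 1           # each lambda is an integer here
        print(lam)                            # 1, 0, -1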

Therefore we have:
Theorem A.7. If χ is an irreducible character of G then χ(e) divides |G|.
Proof. Let {g_i | 1 ≤ i ≤ r} be a complete set of conjugacy class
representatives. Then for all i, (|G|/|C_G(g_i)|)(χ(g_i)/χ(e)) and
\overline{χ(g_i)} are algebraic integers, so

    Σ_{i=1}^{r} (|G|/|C_G(g_i)|) (χ(g_i) \overline{χ(g_i)} / χ(e))

is an algebraic integer. But this algebraic integer is equal to |G|/χ(e)
by the row orthogonality relations. So the rational algebraic integer
|G|/χ(e) is an integer and hence χ(e) | |G|.

Hence, for example, the degree of every irreducible character of a p-group
is a power of p (and, for a non-trivial group, is not equal to |G| by the
degree sum property). So groups of order p^2 have all irreducible characters
of degree 1 or p. But the degree cannot be p for any irreducible character,
since there is the trivial character of degree 1 and a degree p irreducible
character would then violate the degree sum property. Hence all irreducible
characters of a group of order p^2 are linear and we recover the fact that
such groups are Abelian.
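Again as an aside, Theorem A.7 and the degree sum property are easy to
check against known data; the snippet below uses the standard fact that the
irreducible character degrees of S4 are 1, 1, 2, 3, 3.

    # Degrees of the irreducible characters of S4 (a standard fact).
    degrees, order = [1, 1, 2, 3, 3], 24
    assert sum(d * d for d in degrees) == order   # the degree sum property
    assert all(order % d == 0 for d in degrees)   # each degree divides |G|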
Corollary A.8. No simple group has an irreducible character of degree two.
Proof. Exercise.

In fact, we can tell when a character value is an integer if we fully
understand conjugacy in our group.
Theorem A.9. Let g ∈ G have order n and suppose that, for all 1 ≤ i ≤ n
with (i, n) = 1, g is conjugate to g^i. Then χ(g) is an integer for any
character χ of G. Conversely, if χ(g) ∈ Z for all characters χ of G, then g
is conjugate to g^i whenever (i, n) = 1.

Proof. Omitted. The converse requires Galois theory.

Corollary A.10. All character values for symmetric groups are integers.
Now we can turn directly towards our main goal. For this, we need an
extension of the notion of an algebraic integer, namely algebraic numbers.
A complex number is said to be algebraic (or an algebraic number) if it is
the root of some non-zero polynomial in Q[x]. An algebraic number α has
a minimum polynomial m(x) and we will refer to the elements of the set of
roots of m(x) as conjugates of α. We will need the following facts about
algebraic numbers.

Lemma A.11. Let α and β be algebraic numbers. Then every conjugate of
α + β is of the form α′ + β′, where α′ (respectively, β′) is a conjugate of α
(respectively, β). If r ∈ Q then every conjugate of rα is of the form rα′
with α′ conjugate to α.

Next, we can see which character values give algebraic integers.

Lemma A.12. Let χ be a character of G and let g ∈ G. Then
|χ(g)/χ(e)| ≤ 1, and if 0 < |χ(g)/χ(e)| < 1 then χ(g)/χ(e) is not an
algebraic integer.

Proof. Let χ(e) = d. Then χ(g) is a sum of d roots of unity ω_1, . . . , ω_d,
so χ(g)/χ(e) = (1/d) Σ_{i=1}^{d} ω_i. Since |χ(g)| ≤ Σ_{i=1}^{d} |ω_i| = d,
we must have |χ(g)/χ(e)| ≤ 1.
Suppose χ(g)/χ(e) is an algebraic integer with |χ(g)/χ(e)| < 1. We claim
that χ(g) = 0. Set γ = χ(g)/χ(e) and let m(x) be the minimum polynomial
of γ, so m(x) = Σ_{i=0}^{n} α_i x^i for some n, with α_n = 1 and α_i ∈ Z
for all i. By the previous lemma, every conjugate of γ is of the form
(1/d) Σ_{i=1}^{d} ω_i′ with ω_i′ conjugate to ω_i and hence also a root of
unity. So every conjugate of γ also has modulus at most 1. Then the product
λ of all the conjugates of γ, including γ itself, has modulus strictly less
than 1. But the product of the roots of m(x) is equal to ±α_0 and since
α_0 ∈ Z and |λ| < 1, we have α_0 = 0. Since m(x) is the minimum polynomial
it is irreducible, so we must have m(x) = x and γ = 0. Hence χ(g) = 0.

Theorem A.13. Let G be a group with a conjugacy class C of size p^r, with
p prime and r ≥ 1. Then G is not simple.

Proof. Let g ∈ C. Since |C| > 1 by assumption, G is not Abelian and g ≠ e.
Let Irr(G) = {χ_i | 1 ≤ i ≤ k} with χ_1 the trivial character. From the
column orthogonality relations we see that 1 + Σ_{i=2}^{k} χ_i(g)χ_i(e) = 0.
So Σ_{i=2}^{k} χ_i(g)χ_i(e)/p = −1/p ∉ A. So for some i ≥ 2,
χ_i(g)χ_i(e)/p is not an algebraic integer; in particular χ_i(g) ≠ 0, and
χ_i(e)/p is not an algebraic integer either, since χ_i(g) is. That is,
p ∤ χ_i(e). Since |C| = p^r, the integers χ_i(e) and |C| are coprime, so
there exist integers a and b such that a|C| + bχ_i(e) = 1, and so

    (a|G|/|C_G(g)|)(χ_i(g)/χ_i(e)) + bχ_i(g) = χ_i(g)/χ_i(e).

Now we have seen that the left-hand side is an algebraic integer and
χ_i(g) ≠ 0, so it is non-zero. So χ_i(g)/χ_i(e) ∈ A and hence, by
Lemma A.12, |χ_i(g)/χ_i(e)| = 1.
Let ρ be a representation of G with character χ_i. Then by Exercise 7.8,
there exists λ ∈ C such that ρ(g) = λI. Let K = Ker ρ ⊴ G, a proper
normal subgroup since χ_i ≠ χ_1. If K ≠ {e} then G is not simple. If
K = {e} then ρ(g) = λI commutes with ρ(h) for all h ∈ G and so g commutes
with every h ∈ G, since K = {e} means that ρ is faithful. So g ∈ Z(G) ⊴ G
and Z(G) ≠ {e}, so G is not simple.
Recall that a group G has a derived subgroup, G′. We can iterate this
process: let G^(0) = G and G^(i) = (G^(i−1))′ for i ≥ 1. A group is said to
be soluble if there exists n such that G^(n) = {e_G}. (There are other
equivalent formulations.)

Theorem A.14 (Burnside's p^α q^β theorem). If p and q are primes then any
group of order p^α q^β, α, β ∈ N, is soluble.

Proof. By direct (group-theoretic, structural) arguments one may deal with
a number of special cases straightforwardly:

• α = β = 0: G = {e} is soluble

• α + β = 1: G is cyclic of prime order and is soluble (and simple)

• p = q, α + β ≥ 2: G is a p-group and is soluble

• p ≠ q, α = β = 1: G has order pq and is soluble

We consider the remaining cases: p ≠ q, α ≥ 2, β ≥ 1 (swapping p and q if
necessary). We may assume G is not Abelian, as finitely generated Abelian
groups are polycyclic and so are certainly soluble. By Sylow's theorems,
G has a Sylow q-subgroup Q of order q^β and Z(Q) ≠ {e}, since Q is a
non-trivial q-group. Let g ∈ Z(Q) \ {e}, so Q ≤ C_G(g) and if C is the
conjugacy class of G containing g then |C| = |G : C_G(g)| = p^c for some c.
If p^c = 1 then g ∈ Z(G) and Z(G) is a non-trivial proper normal subgroup,
so G is not simple. If p^c > 1 then the previous theorem applies and G is
again not simple.
Hence G has a proper non-trivial normal subgroup N and both |N| and
|G/N| divide p^α q^β. By induction on α + β, N and G/N are soluble and
hence G is soluble.
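Burnside's theorem asserts solubility; for a specific group one can of
course verify this directly by computing the derived series. The following
sketch (an aside, with permutations modelled as tuples) computes the derived
series of S4, a group of order 24 = 2^3 · 3, by brute force.

    from itertools import permutations

    # Permutations as tuples p with p[i] the image of i (0-indexed).
    def compose(p, q):                  # (p q)(i) = p(q(i))
        return tuple(p[i] for i in q)

    def inverse(p):
        q = [0] * len(p)
        for i, pi in enumerate(p):
            q[pi] = i
        return tuple(q)

    def generated(gens, n):
        """Subgroup generated by gens; in a finite group, closure under
        multiplication by the generators suffices."""
        e = tuple(range(n))
        elems, queue = {e}, [e]
        while queue:
            x = queue.pop()
            for g in gens:
                y = compose(x, g)
                if y not in elems:
                    elems.add(y)
                    queue.append(y)
        return elems

    def derived(G, n):
        """The derived subgroup G', generated by all commutators [g, h]."""
        comms = {compose(compose(inverse(g), inverse(h)), compose(g, h))
                 for g in G for h in G}
        return generated(comms, n)

    n = 4
    series = [set(permutations(range(n)))]
    while len(series[-1]) > 1:
        series.append(derived(series[-1], n))
    print([len(G) for G in series])     # [24, 12, 4, 1]: S4 is soluble

The printed orders correspond to the chain S4 > A4 > V4 > {e}.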

B Groups, rings and vector spaces
An algebraic structure is just a set with some operations. We might ask that
the operations have certain properties or that different operations interact
in a particular way. In principle there are many different types of algebraic
structure but in practice some have much richer theories than others and
more (interesting) examples “in nature”.
An algebraic structure consisting of a set S and n operations will be
denoted by (S, ∗1 , ∗2 , . . . , ∗n ). Sometimes we will use different letters for
the set S, to help us remember which type of algebraic structure we are
thinking about, and we might use established or more suggestive notation
for the operations, such as + or ×.
Definition B.1. A group is an algebraic structure (G, ∗) with ∗ an associ-
ative binary operation, having an identity and such that every element of G
has an inverse with respect to ∗.
Definition B.2. A group (G, ∗) is called Abelian if ∗ is commutative.
Definition B.3. A ring is an algebraic structure (R, +, ×) such that (R, +)
is an Abelian group, × is associative and we have

a × (b + c) = (a × b) + (a × c)

and

(a + b) × c = (a × c) + (b × c)

for all a, b, c ∈ R. The identity element for the group (R, +) will be denoted
0R .
The two equations given here are called the first and second (or left and
right) distributive laws. We could be more general and say that × distributes
over + from the left if a × (b + c) = (a × b) + (a × c), and correspondingly
for the right-handed version. Notice that these are not symmetric in + and
×.
Definition B.4. A field is an algebraic structure (F, +, ×) such that (F, +, ×)
is a ring, × is commutative and has an identity 1F 6= 0F and every element
of F \ {0F } has an inverse with respect to ×.
Notice that the definition of a field (F, +, ×) could also be written
equivalently as saying that (F, +) is an Abelian group with identity 0F ,
(F \ {0F }, ×) is an Abelian group and that + and × satisfy the distributive
laws.
From now on, we will generally prefer the symbol K for a field; this will
avoid clashes of notation as we have another concept for which we will prefer
F.

64
We see from these definitions that rings are groups with extra structure,
and fields are rings with extra properties. So every field is in particular a
ring, and every ring is a group.
Groups, rings and fields appear in various different contexts and the
following list of examples is not in any way comprehensive. Rather, they are
examples that should be somewhat familiar to you following a first course
in abstract algebra.

Number systems
• Q, R and C are fields.

• Z is a ring (but not a field); in many ways, Z is the prototype ring and
many ring-theoretic questions are inspired by the structure of Z.

• For each n, the set of integers modulo n, Zn (with addition and mul-
tiplication modulo n) is a ring.

• The set of polynomials in n variables with coefficients from a ring R,


R[x1 , . . . , xn ], is again a ring, with respect to the usual addition and
multiplication of polynomials. This includes examples such as R[x],
C[x, y] and so on.

• Since Q, R and C are fields, as we said in the remarks after the
definition of a field, we have associated groups Q× = Q \ {0},
R× = R \ {0} and C× = C \ {0} with respect to multiplication.
More generally, the set of invertible elements in a ring R forms a group
under multiplication (called the group of units, R×). For a field K, we
have K× = K \ {0_K}, because every non-zero element is invertible (a
small computation with units is sketched just below).
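As promised above, here is a small illustration of the group of units (an
aside): the units of Z12 are exactly the residues coprime to 12.

    from math import gcd

    n = 12
    units = [a for a in range(n) if gcd(a, n) == 1]   # invertible mod 12
    print(units)   # [1, 5, 7, 11]: the group Z_12^x under multiplication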

Symmetry groups
• Natural examples of symmetry groups are the set of rotations and
reflections of a square, an n-gon, a circle or other 2-dimensional figures,
as well as rotations and reflections of a cube, tetrahedron, etc.
A particularly important example that occurs often is the dihedral
group D_2n (of order 2n), defined to be the group of isometries of R^2
preserving a regular n-gon. The dihedral group D_2n is generated by
two elements α and β and has the following presentation:

    D_2n = ⟨ α, β | α^n = e, β^2 = e, α^i β = βα^{−i} ∀ i ∈ Z ⟩

It follows that D_2n = {α^i β^j | 0 ≤ i ≤ n − 1, 0 ≤ j ≤ 1} and hence
that D_2n has 2n elements. Here, the elements α^i are the n rotations
of the n-gon and the elements α^i β are the n reflections. (A small
computational model of this multiplication rule is sketched at the end
of this list.)

• The “symmetries of a set” S are given by the set of bijections
Bij(S) = {f : S → S | f is a bijection}. This set satisfies all the
conditions to be a group with respect to composition of functions.
Permutations are exactly bijections from the set {1, . . . , n} to itself.
These groups are called the symmetric groups.
They (and the symmetry groups of geometric objects) are usually not
Abelian, in contrast to all the previous examples.
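As promised in the dihedral example above, here is a toy model of D_2n (an
aside; the encoding is hypothetical): the pair (i, j) stands for α^i β^j
with 0 ≤ i < n and j ∈ {0, 1}, and the relation α^i β = βα^{−i} drives the
multiplication rule.

    def dihedral_mult(x, y, n):
        (i, j), (k, l) = x, y
        if j == 0:
            return ((i + k) % n, l)            # alpha^(i+k) beta^l
        return ((i - k) % n, (j + l) % 2)      # alpha^(i-k) beta^(1+l)

    n = 4
    D8 = [(i, j) for i in range(n) for j in range(2)]
    assert len(D8) == 2 * n                    # D_2n has exactly 2n elements
    alpha, beta = (1, 0), (0, 1)
    assert dihedral_mult(beta, beta, n) == (0, 0)       # beta^2 = e
    assert dihedral_mult(beta, alpha, n) == (n - 1, 1)  # beta alpha = alpha^(-1) beta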

Matrices
• We can add m×n matrices and multiply n×n matrices, we know what
the zero and identity matrices are and we know that sometimes we can
invert matrices. So it should not be a surprise that sets of matrices
can have either group or ring structures, though they are almost never
fields.
More precisely, Mn (Z), Mn (Q), Mn (R) and Mn (C) are all rings; in fact
Mn (R) is one for any ring R. We call these matrix rings. (In linear
algebra, we usually work over a field, because we want to be able to
divide by scalars as well as multiply by them.)

• Inside any matrix ring over a field K, the subset of invertible matrices
(those with non-zero determinant) forms a group, the group of units,
as defined above. We have a special name and notation for the group
of units of matrix rings: we call them general linear groups and write
GLn (K). The group operation is matrix multiplication.

Sets of functions
• The set of functions from R to R, denoted F(R, R), is a ring with
respect to pointwise addition and multiplication, that is, for functions
f, g and x ∈ R, (f + g)(x) = f (x) + g(x) and (f g)(x) = f (x)g(x).

• We can generalise this to functions from any set Y to R (denoted
F(Y, R)) or from any set Y to C (denoted F(Y, C)) or indeed from
any set Y to any ring R (denoted F(Y, R)). Each of these is a ring,
essentially because the codomain (R, C or the ring R) is.

This list might give the (erroneous) impression that there are a lot more
rings than groups. This is not the case: since every ring is a group and not
every group is a ring, there are definitely more groups than rings. It just so
happens that the examples you are familiar with are mostly rings or fields.
One important observation about the above list is that lots of these
examples, notably the symmetric groups, the matrix rings and general
linear groups, are naturally symmetries of something. The description of
symmetry groups we gave says this explicitly. One of the main results in
Linear Algebra is that matrices are the same thing as linear transformations
of vector spaces, and hence GLn(K) is the symmetry group of K^n, the
n-dimensional vector space over K. This idea precisely leads to
representation theory.
Having mentioned vector spaces, let us briefly say where they fit in terms
of the definitions so far. Every vector space V over a field K is an Abelian
group under addition, (V, +). Vector spaces are not rings (or fields): we do
not multiply vectors by vectors.
However vector spaces have more structure than just addition, namely
they have scalar multiplication. Scalar multiplication is a slightly different
type of operation: it is not a binary operation on V . It takes two inputs, a
scalar λ ∈ K and a vector v ∈ V , and produces a new vector λv ∈ V . We
can formalise this as a function · : K × V → V , λ · v = λv, and say that a
vector space is an algebraic structure (V, +, ·) where + and · satisfy some
compatibility conditions, namely those that give linearity.

B.1 Groups
Definition B.5. A group (G, ∗) is called finite if the set G is finite.

Definition B.6. The order of a group (G, ∗) is the cardinality of G, |G|.

Recall that the set of bijections from a set S to itself is a group with
respect to composition of functions, (Bij(S), ◦). Let us fix S = {1, 2, . . . , n}
for some n ∈ N.

Definition B.7. The symmetric group of degree n is defined to be the group


of bijections (Bij({1, 2, . . . , n}), ◦) and is denoted Sn .

Proposition B.8. Let (G, ∗) be a group.

(i) The identity element of G, e, is unique.

(ii) Every element of G has a unique inverse with respect to ∗.

(iii) (Cancellation) For a, b, c ∈ G, if either a ∗ b = a ∗ c or b ∗ a = c ∗ a


then b = c.

(iv) For a, b ∈ G, the equation a ∗ x = b has the unique solution x = a^{−1} ∗ b.

(v) For a, b ∈ G, the equation x ∗ a = b has the unique solution x = b ∗ a^{−1}.

(vi) If a ∗ b = e then b = a^{−1} and a = b^{−1}.

Corollary B.9. Let (G, ∗) be a group and let a ∈ G. Then there is a


bijection between the set G and the sets {a ∗ g | g ∈ G} and {g ∗ a | g ∈ G}.

Remark B.10. In fact, a little more is true: La is actually surjective, as we
may easily show since we have already done all the work needed. Let b ∈ G.
Then by part (iv), there exists a solution to the equation a ∗ x = b, namely
x = a^{−1} ∗ b.
So for any fixed a ∈ G, La : G → G is in fact a bijection, that is, it is a
permutation of the elements of G. Of course, the same is true of Ra .
Proposition B.11. Let (G, ∗) be a group and let a, b ∈ G. Then

    (a ∗ b)^{−1} = b^{−1} ∗ a^{−1}.

Proposition B.12. Let (G, ∗) and (H, ◦) be groups. Then the Cartesian
product G × H is a group with respect to the binary operation • defined by

(g1 , h1 ) • (g2 , h2 ) = (g1 ∗ g2 , h1 ◦ h2 )

for all g1 , g2 ∈ G, h1 , h2 ∈ H.
We briefly recap the notion of the sign of a permutation.
Let π ∈ Sn and suppose that π = γ_1 γ_2 . . . γ_k with the γ_i disjoint
cycles. Set

    ν(π) = Σ_{i=1}^{k} (|γ_i| − 1),

where |γ_i| denotes the length of the cycle γ_i. Note that ν(π) is
well-defined, although this requires some care (in particular, cycles of
length 1 contribute 0 to the sum, and so may be ignored). Note that π^{−1}
has the same cycle lengths as π, so we have ν(π^{−1}) = ν(π).
Clearly ν(π) = 1 if and only if π is a transposition. In fact we may
interpret ν(π) as follows: if all possible 1-cycles are included in the
expression π = γ_1 γ_2 . . . γ_k, then Σ_{i=1}^{k} |γ_i| = n; thus
ν(π) = n − k, i.e. ν(π) is the difference between n and the number of
cycles in the full cycle notation.
Definition B.13. Given π ∈ Sn, the sign of π is defined by

    sign(π) = (−1)^{ν(π)}.

Thus sign(π) = ±1; if sign(π) = 1 we call π an even permutation, while if
sign(π) = −1 we call π an odd permutation.
Note that because ν(π^{−1}) = ν(π), we have sign(π^{−1}) = sign(π).
Remark B.14. It is possible to define the sign of a permutation π in another
way, as (−1)^{n(π)} where n(π) is the number of ‘inversions’ of π, i.e. the
number of pairs (i, j) with i < j and π(i) > π(j); we shall not go into this
here.
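The two descriptions of the sign (via ν, and via inversions as in
Remark B.14) are easy to compare by machine; the sketch below (an aside,
with permutations as 0-indexed tuples) checks that they agree on all of S4.

    from itertools import permutations

    def nu(p):
        """nu(pi) = sum over cycles of (length - 1) = n - (number of cycles)."""
        seen, cycles = set(), 0
        for start in range(len(p)):
            if start not in seen:
                cycles += 1
                i = start
                while i not in seen:
                    seen.add(i)
                    i = p[i]
        return len(p) - cycles

    def sign_by_cycles(p):
        return (-1) ** nu(p)

    def sign_by_inversions(p):          # the alternative of Remark B.14
        inv = sum(1 for i in range(len(p))
                  for j in range(i + 1, len(p)) if p[i] > p[j])
        return (-1) ** inv

    assert all(sign_by_cycles(p) == sign_by_inversions(p)
               for p in permutations(range(4)))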
We now consider how the sign of a permutation is affected by multiplic-
ation by a transposition.

Lemma B.15. Let π ∈ Sn and let τ ∈ Sn be a transposition. Then
sign(πτ ) = (−1) · sign(π) = − sign(π).

Corollary B.16. If τ_1, . . . , τ_r ∈ Sn are transpositions, then

    sign(τ_1 · · · τ_r) = (−1)^r.

Theorem B.17. Let π, σ ∈ Sn . Then sign(π) sign(σ) = sign(πσ).

This result shows that if we define a function Σ : Sn → {1, −1} by setting


Σ(π) = sign(π) for all π ∈ Sn , then Σ is a homomorphism.
Notice too that if we take the product of two even permutations, we
obtain another even permutation (1 · 1 = 1!), so the subset of Sn consisting
of all the even permutations is closed with respect to the group operation.
This particular subgroup of Sn is important enough to have its own name.

Definition B.18. The subgroup of Sn consisting of the even permutations


is called the alternating group of degree n and is denoted An .

It can be shown that any element of An can be written as a product of


3-cycles (the ‘smallest’ elements of An , since every transposition is odd).
Remark B.19. Another way to describe the sign of a permutation is via
matrices. For a given permutation σ ∈ Sn, let A_σ be the n × n matrix with
1 in the (σ(i), i) entry (1 ≤ i ≤ n) and 0 everywhere else. One way to
write this is: A_σ = Σ_{i=1}^{n} e_{σ(i),i}, where e_{ji} is the n × n
matrix with 1 in the (j, i) position and 0 everywhere else.
Then for any row vector (x_1, . . . , x_n) ∈ R^n,

    (x_1, x_2, . . . , x_n) A_σ = (x_{σ(1)}, . . . , x_{σ(n)})

Thus (or by a direct calculation), A_σ A_τ = A_{στ}.
What we have shown is that σ ↦ A_σ defines a function Sn → GLn(R)
that respects multiplication. In this setting, the sign of a permutation σ is
just det A_σ. The matrices A_σ are called permutation matrices or sometimes
monomial matrices.
As an explicit example, let σ = (1 2 3 4) and τ = (1 3 2) be permutations
in S4. We have

    A_σ = [ 0 0 0 1 ]        A_τ = [ 0 1 0 0 ]
          [ 1 0 0 0 ]              [ 0 0 1 0 ]
          [ 0 1 0 0 ]              [ 1 0 0 0 ]
          [ 0 0 1 0 ],             [ 0 0 0 1 ],

    A_{στ} = A_{(1 4)} = [ 0 0 0 1 ]
                         [ 0 1 0 0 ]
                         [ 0 0 1 0 ]
                         [ 1 0 0 0 ]    and

    A_σ A_τ = [ 0 0 0 1 ] [ 0 1 0 0 ]   [ 0 0 0 1 ]
              [ 1 0 0 0 ] [ 0 0 1 0 ]   [ 0 1 0 0 ]
              [ 0 1 0 0 ] [ 1 0 0 0 ] = [ 0 0 1 0 ]
              [ 0 0 1 0 ] [ 0 0 0 1 ]   [ 1 0 0 0 ]  = A_{στ}.
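The same computation can be done by machine. The following sketch (an
aside, assuming numpy, with permutations written 0-indexed) rebuilds the
matrices above and confirms both A_σ A_τ = A_{στ} and sign(σ) = det A_σ.

    import numpy as np

    def perm_matrix(p):
        # Matrix with 1 in position (p[i], i), as in Remark B.19 (0-indexed).
        n = len(p)
        A = np.zeros((n, n), dtype=int)
        for i in range(n):
            A[p[i], i] = 1
        return A

    sigma = (1, 2, 3, 0)   # the 4-cycle (1 2 3 4), written 0-indexed
    tau   = (2, 0, 1, 3)   # the 3-cycle (1 3 2), written 0-indexed
    sigma_tau = tuple(sigma[tau[i]] for i in range(4))   # equals (1 4)
    assert (perm_matrix(sigma) @ perm_matrix(tau)
            == perm_matrix(sigma_tau)).all()
    assert round(np.linalg.det(perm_matrix(sigma))) == -1   # a 4-cycle is odd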

B.2 Subgroups
Definition B.20. Let (G, ∗) be a group. A subset H of G is said to be a
subgroup if (H, ∗) is a group. If this is the case, we write H ≤ G.

Remarks B.21.

(a) From the definition, we see that “H is a subgroup of G” means not


just that “H is a group with respect to some operation”, but it means
that H is a group with respect to the same operation as G.

(b) It is immediate from this definition that every subgroup of an Abelian


group is Abelian: if a∗b = b∗a for all a, b ∈ G, then certainly a∗b = b∗a
for all a, b ∈ H ⊆ G.

(c) Remember that ∗ : G × G → G is a function sending pairs of elements
of G to an element in G. In order for (H, ∗) to be a group, when we
restrict this function to H × H, to obtain ∗|_{H×H} : H × H → G, we
have to have Im ∗|_{H×H} ⊆ H so that there is a well-defined function
∗|_{H×H}^{H} : H × H → H, restricting the domain to H × H and the
codomain to H, and (H, ∗|_{H×H}^{H}) is a group. Unsurprisingly, we tend
to brush this under the carpet a bit and just write (H, ∗).

Examples B.22.

(a) (Z, +) is a subgroup of (Q, +), and both are subgroups of (R, +).

(b) (Q∗ , ×) is a subgroup of (R∗ , ×).

(c) (Q∗ , ×) is not a subgroup of (Q, +), since the two binary operations
are different.

(d) For n ∈ N let ζ = e^{2πi/n} ∈ C. The set Cn = {ζ^k | k ∈ Z} is a
subgroup of (C∗, ×). Notice that |Cn| = n (since e^{2πi} = 1), so that Cn
is a finite group, while C∗ is certainly not.
(e) In the group GLn (R) (where the operation is matrix multiplication),
the set of matrices with determinant 1 is a subgroup; this follows
from properties of determinants you proved in Linear Algebra. This
subgroup has a special name: it is known as the special linear group,
SLn (R). (As before, one may replace R by any field F and obtain
SLn (F ).)

(f) We can consider Sn−1 as a subgroup of Sn : if σ ∈ Sn−1 then we think


of σ as a permutation of {1, 2, . . . , n}, by setting σ(n) = n. More
generally, the set of permutations fixing every element of some subset
of {1, 2, . . . , n} is a subgroup: if two permutations fix some i so does
their product. Also if a permutation fixes i, so does its inverse.

(g) We claimed that the set of even permutations An is a subgroup of Sn ;


this claim will be justified more rigorously very shortly.

(h) In any group G, {e} and G are subgroups.

This last pair of examples is “uninteresting” (in the sense that they don’t
tell us anything we didn’t already know) and on occasion we will want to
exclude these from claims about subgroups. So we introduce the following
terminology.

Definition B.23. Let H be a subgroup of G.


The subgroup {e} is called the trivial subgroup of G. Considered as a
group in its own right, it is called the trivial group. If H 6= {e}, we say that
H is a non-trivial subgroup of G.
If H 6= G (so that H is a proper subset of G), we say H is a proper
subgroup of G and write H  G or more simply H < G.

In principle, to show that a subset H of a group (G, ∗) is itself a group—


that is, to check the definition of a subgroup above—one should show that
H has a well-defined binary operation that is associative, has an identity
element and where every element has an inverse. However this is a lot to
check and in fact one can check rather less.
We will give two lists of properties to check and we will refer to these as
the “subgroup tests”.

Proposition B.24 (Subgroup test I). Let H be a subset of a group (G, ∗)


and denote the identity element of G by eG . Then H is a subgroup of G if
and only if the following three conditions are satisfied:

(SG1) e_G ∈ H;

(SG2) for all h_1, h_2 ∈ H, we have h_1 ∗ h_2 ∈ H;

(SG3) for all h ∈ H, we have h^{−1} ∈ H.
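For finite groups given concretely, Subgroup test I can be run by machine.
The sketch below (an aside, with (Z12, +) modelled by arithmetic mod 12)
checks the three conditions directly.

    def is_subgroup(H, op, e, inv):
        """Subgroup test I for a finite subset H of a group."""
        return (e in H                                         # (SG1)
                and all(op(a, b) in H for a in H for b in H)   # (SG2)
                and all(inv(a) in H for a in H))               # (SG3)

    # The even residues form a subgroup of (Z_12, +); the odd ones do not.
    assert is_subgroup({0, 2, 4, 6, 8, 10},
                       lambda a, b: (a + b) % 12, 0, lambda a: (-a) % 12)
    assert not is_subgroup({1, 3, 5, 7, 9, 11},
                           lambda a, b: (a + b) % 12, 0, lambda a: (-a) % 12)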

Be alert here: it is not enough for a subset to be closed under the group
operation for it to be a subgroup. It is possible to find examples of sub-
sets satisfying (SG2) but not one or other of (SG1) and (SG3), as we will
see shortly. But first, let us start off positively and see some examples of
subgroups and the subgroup test in action.
Examples B.25.
(a) The set of even integers 2Z = {2r | r ∈ Z} is a subgroup of (Z, +), the
integers under addition.

(SG1) The identity of (Z, +) is 0, which is even.


(SG2) The sum of two even integers is even: 2r +2s = 2(r +s) ∈ 2Z.
(SG3) The negative of an even integer is even: if m = 2r then
−m = −2r = 2(−r) ∈ 2Z.

Similarly, for any m ∈ Z, mZ is a subgroup of (Z, +). If m = 0,
mZ = 0Z is trivial; if m = ±1, mZ = ±1Z = Z; so mZ is proper
provided m ≠ ±1.

(b) For n ∈ N and ζ = e^{2πi/n} ∈ C, the set Cn = {ζ^k | k ∈ Z} is a
subgroup of (C∗, ×).

(SG1) We have 1 = ζ^0 ∈ Cn.
(SG2) By the rules for products of powers, ζ^r ζ^s = ζ^{r+s} ∈ Cn.
(SG3) Clearly ζ^{−r} is the inverse of ζ^r.

We have C1 = {1} trivial and proper, and Cn non-trivial and proper
for n ≥ 2.

(c) As asserted previously, the set of even permutations An is a subgroup


of Sn .

(SG1) The identity permutation ι is even: sign(ι) = (−1)^{ν(ι)} = (−1)^0 = 1.
(SG2) By Theorem B.17, if π, σ are even, so sign(π) = sign(σ) = 1,
then sign(πσ) = 1 · 1 = 1 and πσ is also even.
(SG3) Since ν(π^{−1}) = ν(π), the inverse of an even permutation is
again even.

For n = 1, S1 = A1 = {ι} so A1 is trivial and not proper. For n = 2,


A2 = {ι} < S2 is trivial and proper. For n > 2, An is both proper and
non-trivial.

(d) The special linear group SLn (R) (matrices of determinant 1) is a sub-
group of GLn (R) (invertible matrices, with operation matrix multi-
plication).

(SG1) The identity matrix I_n has determinant 1.
(SG2) We know that det(AB) = det(A) det(B), so if A, B have
determinant 1, so does AB.
(SG3) We also know that det(A^{−1}) = det(A)^{−1}, so if A has
determinant 1, so does A^{−1}.

For n = 1, GL1(R) = R∗ = R \ {0}; SL1(R) = {1} is trivial and proper.
For any n > 1, SLn(R) is a proper non-trivial subgroup of GLn(R).
Sometimes it is instructive to see non-examples too, as in this case, to
see why all three conditions in the subgroup test are necessary.
Examples B.26.
(a) The set of odd integers, 2Z + 1 = {2r + 1 | r ∈ Z}, is not a subgroup
of (Z, +).

(SG1) The identity of (Z, +) is 0 but 0 is not odd.


(SG2) We have that 1 + 3 = 4: 1 and 3 are odd but 1 + 3 is not.
(SG3) 2Z + 1 is closed under negation: −(2r + 1) = 2(−r − 1) + 1.

While it is admittedly somewhat unnatural, we could consider the


subset S = (2Z + 1) ∪ {0}; this subset would satisfy (SG1) and (SG3)
but not (SG2).

(b) The set of positive integers N = {r ∈ Z | r > 0} is not a subgroup of


(Z, +).

(SG1) The identity element 0 is not a positive integer.


(SG2) N is closed under addition.
(SG3) N is not closed under negation.

So N0 = {r ∈ Z | r ≥ 0} = N ∪ {0} satisfies (SG1) and (SG2) but not


(SG3); N0 is not a group with respect to addition.
You might have been expecting an example in which we have (SG2) and
(SG3) but not (SG1). However this is not possible, since h ∗ h^{−1} = e_G
would then be an element of H. So there is some redundancy in the subgroup
test above and we can give a more "efficient" version, as follows.
Proposition B.27 (Subgroup test II). Let H be a subset of a group (G, ∗).
Then H is a subgroup of G if and only if the following two conditions are
satisfied:
(SG1′) H ≠ ∅;

(SG2′) for all h_1, h_2 ∈ H, we have h_1 ∗ h_2^{−1} ∈ H, where h_2^{−1} is
the inverse of h_2 in G.
Exercise B.28. Let (G, ∗) be a group and consider the set

Z(G) = {z ∈ G | z ∗ g = g ∗ z for all g ∈ G}.

Prove that Z(G) is a subgroup of G and that Z(G) is Abelian.


This subgroup is important enough to be given a name, the centre of G.
If I is a set, we say that a collection of sets A is an I-indexed family of
sets if there is a one-to-one correspondence between the sets in the
collection A and I. Then we may write A = {A_i | i ∈ I}, i.e. denoting by
A_i the member of A corresponding to i ∈ I.
The intersection of the sets in A is ∩_{i∈I} A_i = {x | (∀ i ∈ I)(x ∈ A_i)}.
Proposition B.29. The intersection of any indexed family of subgroups of
G is again a subgroup of G.
Exercise B.30 (Not essential, but good practice with abstract arguments).
We do not need to index the subgroups: for any collection H of subgroups
of G, we may define ∩H = {x | (∀ H ∈ H)(x ∈ H)}. Prove that ∩H is a
subgroup of G.
Remark B.31. The union of subgroups need not be a subgroup: let H_1 = 2Z,
H_2 = 3Z be subgroups of (Z, +). Then 2, 3 ∈ H_1 ∪ H_2 but
2 + 3 = 5 ∉ H_1 ∪ H_2.
Example B.32. In (Z, +), the intersection of 2Z and 3Z is the set of integers
that are divisible by 2 and divisible by 3, and these are precisely the integers
divisible by 6. So 2Z ∩ 3Z = 6Z.
More generally, mZ ∩ nZ = lcm(m, n)Z and

m1 Z ∩ m2 Z ∩ · · · ∩ mr Z = lcm(m1 , m2 , . . . , mr )Z.

The collection H = {mZ | m ∈ N} is an N-indexed family of subgroups of
Z. The intersection ∩_{m∈N} mZ consists of all integers that are divisible
by every m ∈ N. Only 0 has this property, so ∩_{m∈N} mZ = {0} is trivial.
Sometimes we have a subset S of a group G that is not a subgroup, but
where we would like to find a subgroup of G that contains S. Obviously we
could take all of G but this won’t tell us much: we really want the smallest
subgroup of G containing S.
“Smallest” here might mean “smallest size”, but for infinite sets it is
better to think of smallest as “smallest under inclusions of sets”. But we
know how to find the smallest set: it is the intersection. (The intersection is
a subset of every set in the collection and is the unique smallest such under
inclusion.) This leads us to make the following definition.
Definition B.33. Let G be a group and let S ⊆ G be a subset of G. Let
S = {H ≤ G | S ⊆ H} be the family of subgroups of G that contain S. The
intersection ∩S is a subgroup of G containing S, called the subgroup of G
generated by S. We write ⟨S⟩ = ∩S.

Notice that by Proposition B.29 (or rather the slightly more general
version in the exercise that follows) ⟨S⟩ is a subgroup of G. It is also the
smallest subgroup of G containing S, being the intersection of all such. If
S is actually a subgroup already, then ⟨S⟩ = S.
Example B.34. In (Z, +), we have ⟨m⟩ = mZ (and in particular ⟨1⟩ = Z)
and ⟨{m, n}⟩ = hcf(m, n)Z, where hcf denotes the highest common factor.
This key example leads us to the following definition.

Definition B.35. We say a group G is cyclic if there exists g ∈ G such that
⟨g⟩ = G.

That is, cyclic groups are 1-generator, i.e. can be generated by a single
element. With a little work, one can show that a finite cyclic group is
necessarily isomorphic to Zn for some n and any infinite cyclic group is
isomorphic to Z.
We will often use a "generic" cyclic group Cn, given by

    Cn = ⟨ g | g^n = e ⟩

To simplify notation, if S = {g_1, . . . , g_r}, we will write ⟨g_1, . . . , g_r⟩
rather than ⟨{g_1, . . . , g_r}⟩ for the subgroup generated by S. So for
example, we would write ⟨m, n⟩ = hcf(m, n)Z.
The above description of the subgroup generated by S as an intersection
is good from the point of view of proving that it is actually a subgroup, but
bad from the point of view of knowing what ⟨S⟩ looks like as a subset of G.
So we are now going to try to tie down a more precise description of
the subgroup generated by S, as the set of elements of G which can be
"expressed in terms of" the elements of S. For example, if x, y ∈ S (and
hence in ⟨S⟩ since S ⊆ ⟨S⟩) then xy^2 x^{−1} y^2 x^3 y^{−1} x is expressed
in terms of x and y and is in ⟨S⟩, by repeated application of the subgroup
test.
The following proposition expresses this idea in more formal language.

Proposition B.36. Let S be a subset of G. Let H be the set of all elements
of G of the form s_1^{a_1} s_2^{a_2} · · · s_r^{a_r}, where r is a non-negative
integer, s_1, . . . , s_r ∈ S and a_1, . . . , a_r ∈ Z. (When r = 0, this is
the identity element of G.) Then H is a subgroup of G and H = ⟨S⟩.

More than once we have used the notion of a presentation of a group, for
example when we described the dihedral groups and most recently the cyclic
groups. We can now say a little more precisely what this means, although
we will not check all the details.
Giving a presentation for a group G, written ⟨X | R⟩, is the claim that G
is isomorphic to a quotient of the free group generated by X, F(X) = ⟨X⟩.
The free group on a set is analogous to forming polynomials in a set of
variables: the elements of the free group are words in the set X, which
simply means finite expressions of the form x_1^{±1} x_2^{±1} · · · x_r^{±1}
with x_i ∈ X. This becomes a group by the operation of concatenation
(sticking two such expressions together), where we understand that the
expression of length 0 is the identity e and x_i x_i^{−1} = e = x_i^{−1} x_i.
The particular quotient we take to form ⟨X | R⟩ is the one where we take
R to be a set of elements of F(X), form the subgroup ⟨R⟩ they generate
and then take the normal closure of this, i.e. the smallest normal subgroup
⟨⟨R⟩⟩ of F(X) containing ⟨R⟩. Then ⟨X | R⟩ is defined to be F(X)/⟨⟨R⟩⟩.
Elements of X are called generators of ⟨X | R⟩ and elements of R are
called relators. In practice, we often give elements of R as relations rather
than relators. Defining by way of an example, if we want a relation such as
b^{−1}ab = a^{−1} (imposing a relation meaning that in the group we want
the two sides to be equal) then formally we should make this into the
relator b^{−1}aba and put this element in R. Then in the quotient group
b^{−1}aba will become equal to the identity element of the quotient, so
b^{−1}aba = e, i.e. b^{−1}ab = a^{−1}. (We are being sloppy and using the
same letters for the elements a, b of F(X) and their images in the quotient,
but this is common practice.) So rather than Cn = ⟨g | g^n = e⟩ we should
write ⟨g | g^n⟩, but the former is easier to understand and we do that
instead.

B.3 Cosets
Throughout this section, we will consider a group G and a subgroup H
of G. As previously, we will mostly write the group operation in G by
concatenation: if g, h ∈ G, gh will denote the element obtained from g and
h (in that order) under the binary operation of the group. The exception
will be in groups where the binary operation is addition (such as (Z, +)),
where we will write g + h.

Definition B.37. For any g ∈ G, the set gH = {gh | h ∈ H} will be called


the left coset of H in G determined by g.

Example B.38. In S3, let

    H = ⟨(1 2 3)⟩ = {ι, (1 2 3), (1 3 2)}.

The coset (1 2)H in S3 is


{ (1 2)ι = (1 2),
(1 2)(1 2 3) = (1)(2 3) = (2 3),
(1 2)(1 3 2) = (1 3)(2) = (1 3) }

First of all, we see that gH ⊆ G. Since e ∈ H, we have g = ge ∈ gH.


Indeed, every element of gH is, by definition, a “multiple” of g by an element
of H; the left coset gH is precisely all the elements of G that can be written
as gh for some h ∈ H.
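Cosets are also easy to enumerate by machine. The sketch below (an aside,
with permutations as 0-indexed tuples and the same right-to-left composition
convention as above) recomputes the cosets of H = ⟨(1 2 3)⟩ in S3 from
Example B.38 and confirms that there are exactly two of them.

    from itertools import permutations

    def compose(p, q):                  # (p q)(i) = p(q(i))
        return tuple(p[i] for i in q)

    S3 = set(permutations(range(3)))
    r = (1, 2, 0)                       # the 3-cycle (1 2 3), 0-indexed
    H = {(0, 1, 2), r, compose(r, r)}
    cosets = {frozenset(compose(g, h) for h in H) for g in S3}
    print(len(cosets))   # 2: H itself and the set of all three transpositions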

If H = {e, h1 , h2 , . . . , hr } is finite, this is clear: gH = {g, gh1 , gh2 , . . . , ghr }
and by Proposition B.8(iii) (the cancellation property for groups) ghi = ghj
would imply hi = hj , so the elements g, gh1 , . . . , ghr are distinct. In partic-
ular, |gH| = |H|. It is natural to think of gH as the “translations by g” of
the elements of H.
In fact, the claims of the above paragraph still hold if H is infinite, if we
work a little more abstractly.

Lemma B.39. Let H be a subgroup of G. For any g ∈ G, there is a bijection


Lg : H → gH given by Lg (h) = gh.

So any two cosets of H have the same size. Notice too that eH = H so
H is a coset of itself (the “translation” of H that does nothing).
However it is very much not true that if g_1 ≠ g_2 then g_1 H ≠ g_2 H;
shortly we will see the correct condition for when two cosets are equal. This
condition will also tell us when a coset can be a subgroup (sneak preview:
only the "trivial" coset eH = H is).
First, let us look at a very important example.
Example B.40. We consider (Z, +) and its (cyclic) subgroup 4Z. Since we
are working in an additive group, we write the group operation in our coset
notation, so the cosets of 4Z are r + 4Z = {r + m | m ∈ 4Z} = {r + 4n | n ∈
Z}.
We see that the left cosets of 4Z are precisely the congruence classes of
integers modulo 4. The congruence class modulo 4 containing 3, for example,
is the set of integers {. . . , −9, −5, −1, 3, 7, 11, . . . }, which is exactly the set
of integers of the form 3 + 4n, namely {3 + 4n | n ∈ Z} = 3 + 4Z. So cosets
in particular describe congruence classes of integers.
Notice that every integer belongs to some left coset and that no integer
belongs to more than one left coset. This is because the congruence classes
modulo n for some fixed n partition Z, since congruence modulo n is an
equivalence relation. The following theorem is a vast generalisation of this
result, which corresponds to the specific group (Z, +).

Theorem B.41. Let G be a group and let H be a subgroup of G.

(i) The relation R_H on G defined by

    g R_H k ⟺ g^{−1} k ∈ H

is an equivalence relation.

(ii) The equivalence classes for RH are precisely the left cosets gH, g ∈ G.

Corollary B.42. For H a subgroup of G, the set of left cosets


LH = {gH | g ∈ G} is a partition of G.

That is, for g_1, g_2 ∈ G, g_1 H ∩ g_2 H is either empty or g_1 H = g_2 H
(and the latter happens if and only if g_1^{−1} g_2 ∈ H), and furthermore
∪_{g∈G} gH = G.

Corollary B.43. For a subgroup H of G, we have gH = kH if and only if
g^{−1} k ∈ H.

The number of cosets of a given subgroup H of a group G then provides


some measure of the relative size of H in G.

Definition B.44. Let G be a group and let H be a subgroup of G. The


index of H in G is defined to be the cardinality of the set of left cosets of H
in G. We denote the index of H in G by |G : H|.

Example B.45. For n ≥ 2, the set of even permutations An in Sn has index
|Sn : An| = 2. By Theorem B.17, if π ∉ An and τ = (1 2) then

    sign(τπ) = sign(τ) sign(π) = (−1)(−1) = 1

so π = (1 2)(1 2)π ∈ (1 2)An. That is, for any π ∈ Sn, either π ∈ An or
π ∈ (1 2)An and the two cosets An and (1 2)An of An in Sn partition Sn.
Therefore the index of An in Sn is at most 2. But transpositions are odd,
so (1 2) ∉ An and so there are at least two cosets of An and the index is
exactly 2.
For n = 1, A1 = S1 = {ι} so the index is 1.
When we think about the set of left cosets of a subgroup, it is very
natural to want a complete list of the cosets, without repetitions. But we
have already seen that a coset of H in G has |H| representatives, so clearly
such a list will not be unique in general. Still, we can simply make some
choice.

Definition B.46. Let H be a subgroup of a group G. A transversal for H


in G is a set TH ⊆ G such that for every coset gH, there exists a unique
element tg ∈ TH such that gH = tg H.

That is, a transversal is a complete set of representatives of the left cosets


of H in G, such that any two distinct cosets have distinct representatives.
If we are looking at a specific group and subgroup, there might be one
or more “natural” choices of transversal. This happens in particular for the
integers under addition and the subgroup nZ. As we saw for 4Z above, the
cosets correspond to the different integers modulo n, so a natural transversal
for nZ in (Z, +) is
TnZ = {r | 0 ≤ r ≤ n − 1}.
Explicitly, for n = 4, we can choose T4Z = {0, 1, 2, 3} as the set of left cosets
is L4Z = {0 + 4Z, 1 + 4Z, 2 + 4Z, 3 + 4Z}.
Other choices are also perfectly valid: {1, 2, 3, 4}, {0, 5, 18, 1003} and
{−3, −2, −1, 0} are all transversals for 4Z. However, it is natural to choose

{0, 1, 2, 3} as these are the minimal positive representatives, and we will
usually make this choice and take TnZ as above.
Note however that {−2, −1, 0, 1, 2} is another natural choice for a trans-
versal of 5Z, and a similar choice can be made for (2m + 1)Z (i.e. odd n).
Since this works less well for even n, it is less commonly used, until one
wants to consider pZ for p prime, when in all but one case p is odd.
You will probably have noticed that we have been saying “left cosets”
and that the definition of gH is asymmetric. Unsurprisingly, one can make
the corresponding definition of a right coset.

Definition B.47. For any g ∈ G, the set Hg = {hg | h ∈ H} will be called


the right coset of H in G determined by g.

Analogously to Definition B.44, define the right index of H in G to be
the number of right cosets of H in G and denote it by |H : G|.
Note that, in general, the left cosets of a subgroup are not the same as
the right cosets; certain special conditions have to hold for this to be the
case. For example, with K = ⟨(1 2 3 4)⟩ ≤ S4, the left coset (1 2)K is
{(1 2), (2 3 4), (1 3 2 4), (1 4 3)}, which we see is not equal to K(1 2).
So we must be careful to specify whether we mean left cosets or right
cosets. Fortunately, while we need to make a choice, some key properties
are independent of which choice we make. The size of a left or a right coset
is the same: every coset, whether left or right, has the same size as H (there
is a natural right-handed version of Lemma B.39).
A little more work also shows that the number of left cosets of H in G is
equal to the number of right cosets of H in G; that is, |G : H| = |H : G|. In
fact, this is why we just say “index” and not “left index” or “right index”.
This is justified by the following exercise.
Exercise B.48. Let H be a subgroup of G. Let L_H = {gH | g ∈ G} and
R_H = {Hg | g ∈ G} be the sets of left and right cosets of H respectively.
Prove that the function ψ : L_H → R_H, ψ(gH) = Hg^{−1} is a bijection.
This function ψ is a function between two sets: in general, the set of
cosets of a subgroup has no algebraic structure.
Remark B.49. Subgroups such that gH = Hg for all g ∈ G are very im-
portant, so much so that they have a special name. They are called normal
subgroups and we will examine them and their importance in detail in the
following sections.
But, as a glimpse of what is to come, we will see that the set of (left)
cosets of a normal subgroup can be given an algebraic structure. Specifically,
the group operation descends^12 to a group operation on the set of cosets: if
gH = Hg for all g ∈ G, we can define a binary operation on cosets by

    (gH)(kH) = g(Hk)H = g(kH)H = gkH^2 = gkH.

12 This almost certainly seems an odd choice of word; there are reasons why we use it
and this will become clearer later. For now, think that the original group is "bigger" and
the set of cosets is "smaller" and the operation "goes down" from the former to the latter.
B.4 Lagrange’s theorem


Theorem B.50 (Lagrange’s theorem). Let G be a group and H a subgroup
of G. Then |G| = |G : H||H|.
Corollary B.51. For any subgroup H of G, both |H| and |G : H| divide
|G|.
Conversely, if m does not divide |G|, G cannot have a subgroup of order m.
Remark B.52. Strictly speaking, the above proof only works for finite groups.
However the result is true as stated for infinite groups, provided the claim
is interpreted properly:
Exercise B.53. Let LH denote the set of left cosets of H in G. Choose TH
a transversal for H, that is, TH ⊆ G such that for all gH ∈ LH , there exists
a unique tg ∈ TH such that gH = tg H (this was Definition B.46). Then for
all g ∈ G, there exists a unique hg ∈ H such that g = tg hg .
Prove that the function ϕ : G → LH ×H, ϕ(g) = (tg H, hg ) is well-defined
and a bijection.
Our mental picture should be the elements of G laid out in a grid, with
each row being a list of the elements of H or a coset of it, and the first
column consisting of (some fixed choice of) representatives for the cosets
(the transversal T_H). As a specific example, let us take
G = C30 = ⟨g | g^30 = e⟩, H = ⟨g^5⟩ and T_H = {g^0 = e, g, g^2, g^3, g^4},
as follows.

    g^4 H :  g^4  g^9  g^14  g^19  g^24  g^29
    g^3 H :  g^3  g^8  g^13  g^18  g^23  g^28
    g^2 H :  g^2  g^7  g^12  g^17  g^22  g^27
    g^1 H :  g^1  g^6  g^11  g^16  g^21  g^26
    g^0 H :  g^0  g^5  g^10  g^15  g^20  g^25

(The first column is the transversal T_H.)

Notice that this picture suggests something interesting: if |G| cannot
be written as a product in a non-trivial way, we must have either one row
or one column. Remarkably, this visual aid is actually telling us a genuine
result, which we will get to in two steps.

Corollary B.54. Let G be a finite group of order |G| = n. Then for all
g ∈ G, o(g) | n and g^n = e.
Corollary B.55. A group of prime order is cyclic and has no proper non-
trivial subgroups. Any non-identity element generates the group.
Groups with no proper non-trivial normal subgroups are called simple groups.
At the end of our work on groups, we will see in what sense these groups are
“simple” (they can be extremely large and complicated by other measures)
and why they are the building blocks for all groups. Cyclic groups of prime
order are very simple: they have no proper non-trivial subgroups at all!
Proposition B.56. If H, K ≤ G and hcf(|H|, |K|) = 1, then H ∩ K = {e}.

B.5 Homomorphisms
When we have two groups G and H and we want to relate the group struc-
ture on G to that on H, to compare properties between them, the right
mathematical thing to do is to start with a function ϕ : G → H. This gives
us a relationship between the underlying sets but this need not relate the
group structures: we need ϕ to be compatible with the binary operations
on G and H.
This leads us to the following definition.
Definition B.57. Let (G, ∗) and (H, ◦) be groups. A function ϕ : G → H
is called a group homomorphism if

ϕ(g1 ∗ g2 ) = ϕ(g1 ) ◦ ϕ(g2 )

for all g1 , g2 ∈ G.
Notice in particular that the operation in G, ∗, is being used on the left-hand
side of this equation (on g1 , g2 ∈ G), and that the operation in H, ◦ is being
used on the right-hand side (on ϕ(g1 ), ϕ(g2 ) ∈ H). This is an instance when
it is helpful to write the group operations explicitly.
First, let us deal with some elementary properties of homomorphisms.
Proposition B.58. Let ϕ : G → H be a group homomorphism. Then
(i) ϕ(e_G) = e_H;
(ii) ϕ(g^{−1}) = ϕ(g)^{−1} for all g ∈ G;
(iii) ϕ(g^n) = ϕ(g)^n for all g ∈ G, n ∈ Z and
(iv) o(ϕ(g)) | o(g) for all g ∈ G.
The composition of two functions is again a function and we would expect
that if two composable functions preserved group structure, their composi-
tion would too. Indeed this is the case.

Proposition B.59. Let ϕ : G → H and σ : H → K be group homomorph-
isms. Then σ ◦ ϕ : G → K is a group homomorphism.

Examples B.60.

(a) The function ϕ : Z → Z, ϕ(n) = 2n for all n ∈ Z is a group homo-


morphism from (Z, +) to itself:

ϕ(m + n) = 2(m + n) = 2m + 2n = ϕ(m) + ϕ(n)

for all m, n ∈ Z. (Note the additive notation.)

(b) The function ϕ : R∗ → R∗, ϕ(x) = x^2 for all x ∈ R∗ = (R \ {0}, ×) is
a group homomorphism since

    ϕ(xy) = (xy)^2 = x^2 y^2 = ϕ(x)ϕ(y)

for all x, y ∈ R∗.

(c) The function ϕ : R → C∗, ϕ(x) = e^{2πix} for all x ∈ R is a group
homomorphism from (R, +) to (C \ {0}, ×) since

    ϕ(x + y) = e^{2πi(x+y)} = e^{2πix} e^{2πiy} = ϕ(x)ϕ(y)

for all x, y ∈ R. (Note the two different notations for the group
operations!)

(d) Recall from Remark B.19 that we have a function ϕ : Sn → GLn(R),
ϕ(σ) = A_σ where

    (A_σ)_{ij} = 1 if σ(j) = i, and 0 otherwise,

is the permutation matrix associated to σ.
Then A_{στ} = A_σ A_τ, where the left-hand side involves the composition
of permutations and the right-hand side is the usual multiplication of
matrices. (The definition of A_σ, with a 1 when σ(j) = i, might seem
to be the "wrong way round"; however if we made the more natural-
looking definition with σ(i) = j, we would have an order reversal in
the products.) It follows that ϕ : Sn → GLn(R), ϕ(σ) = A_σ is a group
homomorphism. For an explicit example, look back to Remark B.19.

Exercise B.61. Let (G, ∗) and (H, ◦) be groups with identity elements eG and
eH respectively. Show that the function ϕ : G → H defined by ϕ(g) = eH
for all g ∈ G is a group homomorphism. We call this the trivial group
homomorphism.

Recall that certain sorts of functions are special, namely injective, sur-
jective and bijective functions; also, a function is bijective if and only if
it is invertible. If a group homomorphism has one of these properties, we
(sometimes) use a special name for it. So an injective group homomorphism
is also known as a monomorphism and a surjective group homomorphism is
called an epimorphism. This comes from the wider category theory context,
but in Grp being mono is equivalent to being injective and being epi is the
same as being surjective.
Examples B.62.

(a) The homomorphism ϕ : Z → Z, ϕ(n) = 2n for all n ∈ Z is injective
but not surjective:

    ϕ(m) = ϕ(n) ⇒ 2m = 2n ⇒ m = n

but 3 ∉ Im ϕ.

(b) The homomorphism ϕ : R∗ → R∗, ϕ(x) = x^2 for all x ∈ R∗ is not
injective or surjective: ϕ(−1) = 1 = ϕ(1) and −1 ∉ Im ϕ.

(c) The homomorphism ϕ : R → C∗, ϕ(x) = e^{2πix} for all x ∈ R is not
injective and not surjective:

    ϕ(1) = e^{2πi} = 1 = ϕ(0)

and Im ϕ = {z ∈ C | |z| = 1} ⊊ C∗.

(d) The homomorphism ϕ : Sn → GLn(R), ϕ(σ) = A_σ is injective but not
surjective: given A_σ, we can easily recover σ (uniquely) but
2I_n ∉ Im ϕ.

(e) The function Σ : Sn → {1, −1}, Σ(π) = sign(π) for all π ∈ Sn is
a homomorphism by Theorem B.17, where {−1, 1} is a group under
multiplication.
Then Σ is not injective (there are many different permutations having
sign 1; indeed these are the even permutations belonging to An, of
which there are n!/2) but Σ is surjective, since there exist both even
and odd permutations.

Definition B.63. A group homomorphism ϕ : G → H that is bijective is


called a group isomorphism.

Note that the identity function id_G : G → G, id_G(g) = g is a group
homomorphism^13 and is a bijection, so is a group isomorphism. This is no
surprise: every group should have the same group structure as itself!
A group isomorphism is a particular kind of homomorphism—a function
with certain properties. We can use these to talk about how two groups
might be related, as follows.

Definition B.64. We say that two groups G and H are isomorphic if and
only if there exists a group isomorphism ϕ : G → H. If G and H are
isomorphic, we often write G ≅ H, for short.

Lemma B.65. Let G and H be groups. Then the following are equivalent:

(i) G and H are isomorphic;

(ii) there exists a bijective homomorphism ϕ : G → H; and

(iii) there exist group homomorphisms ϕ : G → H and ψ : H → G such that
ψ ◦ ϕ = id_G and ϕ ◦ ψ = id_H.

The point of this lemma is that, depending on the situation at hand, it


might be easier to check (ii) or (iii). Indeed, we see from a careful examin-
ation of the proof that (iii) could be weakened further to

(iii′) there exist functions ϕ : G → H and ψ : H → G such that
ψ ◦ ϕ = id_G and ϕ ◦ ψ = id_H and either ϕ or ψ is a group homomorphism.

That is, if we know that a homomorphism is invertible, it is not necessary


to check separately that the inverse is also a group homomorphism: this
holds automatically. In particular, the inverse of an isomorphism is an
isomorphism.
Since a group isomorphism ϕ : G → H is a bijection, we must have that
|G| = |H|, i.e. isomorphic groups have the same order. The converse is very
definitely false in general but the contrapositive tells us that if two groups
have different orders, they cannot be isomorphic.
As for graphs, this leads us to seek invariants that help us identify when
two given groups are isomorphic or not. If we can prove that a particular
property is preserved under isomorphism, then if one group has the property
but the other does not, they cannot be isomorphic.
Examples of invariants are the order of the group (as above), the set of
natural numbers that are the orders of the elements of the group (see the
next example), being Abelian and being cyclic.
13 This, along with properties proved above, shows that the collection of all groups
together with all group homomorphisms between them forms a category.
Example B.66. The groups C6 and S3 are not isomorphic. They have the
same order (|C6 | = 6 = 3! = |S3 |) but C6 is Abelian and S3 is not. Indeed,
C6 is cyclic and S3 is not (if it were, every permutation in S3 would have to
be an r-cycle for some r, but S3 has both transpositions and 3-cycles).
However, sharing some properties is not (usually) enough to prove that
the groups are isomorphic: almost always you will need to find an explicit
isomorphism between them. That said, if two groups share lots of properties,
this might reasonably lead you to conjecture that they are isomorphic; this
would not be a proof, though.
Example B.67. Recall that the subset C12 = {e^{2kπi/12} | 0 ≤ k < 12} of C∗
is a cyclic subgroup, so is a cyclic group in its own right. We may write it
as C12 = ⟨ζ⟩ for ζ = e^{2πi/12}.
Let G = ⟨g⟩ be the (generic) cyclic group of order 12. So
G = {e, g, g^2, . . . , g^{11}} with group operation g^r g^s = g^{r+s}. We
claim that C12 is isomorphic to G.
Let ϕ : C12 → G be defined by ϕ(ζ^k) = g^k for all 0 ≤ k < 12. Clearly ϕ
is a bijection and

    ϕ(ζ^k ζ^l) = ϕ(ζ^{k+l}) = g^{k+l} = g^k g^l = ϕ(ζ^k)ϕ(ζ^l)

so ϕ is a homomorphism.

Now consider (Z12, +), the integers modulo 12, with addition (modulo
12). So Z12 = {0̂, 1̂, . . . , 1̂1}, with â + b̂ = ĉ if and only if
a + b ≡ c mod 12. We claim that this group is also isomorphic to G, and
hence is isomorphic to C12 too.
The function we need is ψ : Z12 → G, ψ(â) = g^a. Then ψ is a bijection:
g^a = g^b if and only if a ≡ b mod 12. It is a group homomorphism since

    ψ(â + b̂) = g^{a+b} = g^a g^b = ψ(â)ψ(b̂).

So ψ is an isomorphism.

The observant among you will have noticed that the number 12 is irrel-
evant in this example: everything works for a general n ∈ N. We can extend
it further, as follows.

Proposition B.68. Let G and H be cyclic groups. Then G is isomorphic


to H if and only if they have the same order, |G| = |H|.

Note that neither the statement nor the proof assumes that the cyclic
groups are finite.
As we said before, two groups having the same order is far from being
enough to tell us that the groups are isomorphic. The message we take from
this proposition is that being cyclic is a very strong condition: same order
plus cyclic is enough to give us isomorphic.

Previously we noted that the identity map provides an isomorphism of
a group with itself. In fact, a group can have other self-isomorphisms and
it is often an important question to know how many. In some sense, these
are the symmetries of the group (which itself may be the symmetries of
something!).
Definition B.69. Let G be a group. A group isomorphism ϕ : G → G is
called an automorphism of G.
Lemma B.70. Let G be a group. The set of automorphisms of G forms a
group under composition.
This justifies the following definition.
Definition B.71. Let G be a group. The group of automorphisms of G,
Aut_Grp(G) = {ϕ : G → G | ϕ is an isomorphism}, is called the
automorphism group of G.
Example B.72. One can show that Aut_Grp(Cn) ≅ (Z_n^×, ×), the group of
units of the ring of integers modulo n. In particular, we have
Aut_Grp(C2) = {e} and Aut_Grp(C3) ≅ C2.
We will not justify these claims here: finding the automorphism group
of even a small group usually requires more sophisticated technology than
we currently have to hand. We will simply note that the above has an
important relationship with number theory: n 7→ |AutGrp (Cn )| is called the
Euler totient function14 and is a very important function about which much
more is said in MATH328 Number Theory.
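As an aside, the order |Aut_Grp(Cn)| is easy to compute: an automorphism
of Cn must send a generator to a generator, and Cn has exactly φ(n) of
those. The sketch below counts them directly.

    from math import gcd

    def aut_cn_order(n):
        """|Aut(C_n)| = number of generators of C_n = Euler's phi(n)."""
        return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

    print([aut_cn_order(n) for n in (2, 3, 12)])   # [1, 2, 4]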
Instead, we will move on and look at some important subsets of a group
associated to homomorphisms.

B.6 Kernels and images


We already said that injective and surjective homomorphisms are particu-
larly important; those that are both are very special, being isomorphisms.
However most homomorphisms are neither injective nor surjective, so we
would like a way to measure how far from being injective or surjective a
given homomorphism is.
This is done by means of the kernel and image of the homomorphism.
The image is just the image of the function: recall that being surjective
precisely means that the image is all of the codomain. So the size of the
image measures how close the map is to being surjective—the larger, the
better.
The kernel is a subset of the domain whose size measures how close to
being injective the map is: if the kernel is as small as possible, then the map
is injective.
14 Also known as Euler's phi function, since it is often denoted ϕ(n).
Definition B.73. Let ϕ : G → H be a group homomorphism. The kernel
of ϕ is defined to be the subset of G given by

Ker ϕ = {g ∈ G | ϕ(g) = eH }.

Definition B.74. Let ϕ : G → H be a group homomorphism. The image


of ϕ is defined to be the subset of H given by

Im ϕ = {h ∈ H | (∃ g ∈ G)(ϕ(g) = h)}.

Notice that the definition of the kernel would not be possible without the
identity element eH and that the image is exactly the image of ϕ as a
function. Be very clear in your mind that Ker ϕ is a subset of the domain,
G, and Im ϕ is a subset of the codomain, H.
You might find the following picture helpful:

[Diagram: ϕ : G → H; inside G sits Ker ϕ, containing e_G and sent by ϕ to
e_H, and inside H sits Im ϕ, the set of values actually taken by ϕ.]

Very shortly we will come to examples but first we will prove some basic
facts about kernels and images that will be very helpful in working out the
examples.

Proposition B.75. Let ϕ : G → H be a group homomorphism.

(i) The kernel of ϕ, Ker ϕ, is a subgroup of G.

(ii) The image of ϕ, Im ϕ, is a subgroup of H.

(iii) The homomorphism ϕ is injective if and only if Ker ϕ = {eG }.
(iv) The homomorphism ϕ is surjective if and only if Im ϕ = H.
(v) The homomorphism ϕ is an isomorphism if and only if Ker ϕ = {eG }
and Im ϕ = H.
Remark B.76. Let ϕ : G → H be an injective homomorphism. Then we can
consider the function ψ = ϕ|Im ϕ : G → Im ϕ, the codomain restriction to
Im ϕ of ϕ. (This is the function that takes the same values as ϕ but where
we “throw away” any elements of H not in the image of ϕ, and so just take
Im ϕ as the codomain.) Since Ker ψ = Ker ϕ = {eG } and Im ψ = Im ϕ, ψ is
both injective and surjective. Hence ψ is an isomorphism of G with Im ϕ.
Since Im ϕ is a subgroup of H, in this situation we say that G is iso-
morphic to a subgroup of H.
That is, to show that a group G is isomorphic to a subgroup of a group
H, we show that there exists an injective homomorphism from G to H. If
that homomorphism is also surjective, we have that G is isomorphic to H
itself.
Let us revisit the examples from Examples B.60 and B.62.
Examples B.77.
(a) The homomorphism ϕ : Z → Z, ϕ(n) = 2n for all n ∈ Z is injective, so
Ker ϕ = {eZ } = {0}.
The image of ϕ is all integers of the form 2n, so Im ϕ = 2Z.
(b) The homomorphism ϕ : R∗ → R∗ , ϕ(x) = x² for all x ∈ R∗ is neither
injective nor surjective. Its kernel is all non-zero real numbers whose
square is equal to 1 (which is the identity element in R∗ = (R\{0}, ×)).
So Ker ϕ = {1, −1}.
The image of ϕ is the set of all non-zero real numbers that are squares;
there is no “nice” description of this, so we simply have

Im ϕ = {y ∈ R∗ | y = x² for some x ∈ R∗ }.

(c) The homomorphism ϕ : R → C∗ , ϕ(x) = e^{2πix} for all x ∈ R has as


kernel all integers, Ker ϕ = Z.
As we saw before, the image of ϕ is

Im ϕ = {z ∈ C | |z| = 1} ⊊ C∗ .

(d) The homomorphism ϕ : Sn → GLn (R), ϕ(σ) = Aσ is injective, so


Ker ϕ = {eSn } = {ι}. Again the image of ϕ has no particularly nice
description: it is just

Im ϕ = {A ∈ GLn (R) | A = Aσ for some σ ∈ Sn }.

(e) The homomorphism Σ : Sn → {1, −1}, Σ(π) = sign(π) for all π ∈ Sn
is not injective. Its kernel is

Ker Σ = {σ ∈ Sn | sign(σ) = 1}

since 1 is the identity element in ({1, −1}, ×). By definition, these are
the even permutations, An , so Ker Σ = An . Since there exist both
even and odd permutations, Σ is surjective and Im Σ = {1, −1}.
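For finite groups, kernels and images can also be computed by brute force.
The following small Python sketch (the helper names and the naive encoding
are our own) takes the homomorphism ϕ : Z12 → Z12 , ϕ(x) = 3x mod 12, of
additive groups.

    def kernel(phi, G, e_H):
        # The elements of G that phi sends to the identity of H.
        return {g for g in G if phi(g) == e_H}

    def image(phi, G):
        # All values attained by phi on G.
        return {phi(g) for g in G}

    G = range(12)                     # Z_12 under addition mod 12
    phi = lambda x: (3 * x) % 12      # a homomorphism Z_12 -> Z_12

    print(sorted(kernel(phi, G, 0)))  # [0, 4, 8]
    print(sorted(image(phi, G)))      # [0, 3, 6, 9]

Note that |Ker ϕ| · |Im ϕ| = 3 · 4 = 12 = |G|, in accordance with Corollary
B.79 below.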

In the picture above, we saw that given a homomorphism ϕ : G → H,


we can split G up into pieces labelled by the element of H that the elements
of that piece map to under ϕ. The next result tells us that this partition of
G, coming from the equivalence relation Rϕ defined by g1 Rϕ g2 if and only
if ϕ(g1 ) = ϕ(g2 ), is a particularly nice partition: it is the same one as we
obtain from RKer ϕ as described in Theorem B.41.

Proposition B.78. Let ϕ : G → H be a group homomorphism and let K =


Ker ϕ. Then for all g ∈ G,

gK = {l ∈ G | g Rϕ l} = {l ∈ G | ϕ(g) = ϕ(l)}.

Hence the function ψ : LK → Im ϕ, ψ(gK) = ϕ(g) is a bijection, where


LK = {gK | g ∈ G} is the set of left cosets of K = Ker ϕ.

Since the index |G : Ker ϕ| is defined to be the cardinality of the set of


left cosets of Ker ϕ, as an immediate corollary of this result and Lagrange’s
theorem we have:

Corollary B.79. Let ϕ : G → H be a group homomorphism. We have that


|G : Ker ϕ| = |Im ϕ| and hence |G| = |Ker ϕ||Im ϕ|.

This is the analogous result for groups to the Dimension Theorem for vector
spaces, which asserts that for a linear transformation T : V → W we have
dim V = dim Ker T + dim Im T .
In fact, both of these are “numerical shadows” of stronger statements:
shortly we will introduce quotient groups and prove the stronger version for
groups.
Since Ker ϕ ≤ G and Im ϕ ≤ H, we already knew that |Ker ϕ| divides |G|
and |Im ϕ| divides |H|, by Lagrange’s theorem. However this corollary tells
us that |Im ϕ| also divides |G|, which can be useful to know, in applications
such as the following.
Example B.80. Let G be a group of order 16 and H a group of order 9.
Then the only homomorphism ϕ : G → H is the trivial homomorphism with
ϕ(g) = eH for all g ∈ G. For if ϕ : G → H is a homomorphism, |Im ϕ| must
divide |G| and |H|, but hcf(16, 9) = 1 so |Im ϕ| = 1 and so Im ϕ = {eH }.

B.7 Cayley’s theorem
Sometimes one needs to calculate in particular explicit examples. This might
be done by hand or, more commonly now, by a computer package. But in
order to do so, we need a “nice” way to represent the elements of our group,
to store them and to calculate the result of the binary operation on them. In
this section, we will see two theoretical results that can be used to provide
such representations. The first is called Cayley’s theorem.
Definition B.81. Let X be a set. The symmetric group on X is defined to
be the group SX = (Bij(X), ◦) of bijections between X and itself.
Theorem B.82 (Cayley’s theorem). Let G be a group. There is an injective
homomorphism λ : G → SG .
Corollary B.83. Every group G is isomorphic to a subgroup of SG .
In particular, every finite group G of order n is isomorphic to a subgroup
of the symmetric group of degree n, Sn .
Corollary B.84. Let n ∈ N. There are finitely many non-isomorphic
groups of order n.
The qualifier “non-isomorphic” here is very important: there are infinitely
many groups with one element, for example, but they are all isomorphic15 .
We can also join two previous results together: Cayley’s theorem and the
homomorphism from the symmetric group to the general linear group given
by taking permutation matrices. This gives us, for free, a matrix version of
Cayley’s theorem.
Theorem B.85. Let G be a finite group. There is an injective homomorph-
ism µ : G → GL|G| (R).
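Cayley's theorem is easy to watch in action computationally. The sketch
below (Python again, with our own naive encoding of the group Z4 ; its
elements 0, 1, 2, 3 double as indices) builds λ(g) as the permutation
h ↦ g ∗ h of the underlying set and checks that λ is injective and a
homomorphism.

    from itertools import product

    G = list(range(4))               # Z_4
    op = lambda a, b: (a + b) % 4    # addition mod 4

    def left_translation(g):
        # lambda(g): the permutation h |-> g*h of G, as a tuple of images.
        return tuple(op(g, h) for h in G)

    perms = {g: left_translation(g) for g in G}

    # Injectivity: distinct group elements give distinct permutations.
    assert len(set(perms.values())) == len(G)

    # Homomorphism property: lambda(g1 * g2) = lambda(g1) o lambda(g2).
    for g1, g2 in product(G, G):
        composed = tuple(perms[g1][perms[g2][h]] for h in G)
        assert composed == perms[op(g1, g2)]

Replacing G and op by any other finite group encoded this way, the same
checks go through.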

B.8 Quotient groups


Shortly we will see that the above result about kernels of homomorphisms
being normal subgroups is only half of the story: in fact, every normal
subgroup is the kernel of some homomorphism. To prove this will take some
more preparation, however. Firstly, we will, as promised, show that if we
have a normal subgroup N of a group G, we can put an algebraic structure
on the set of cosets of N in G. Specifically, the set of cosets inherits a group
structure from G and this new group is called the quotient group.
Although the construction may seem abstract, we will soon see that a
well-known group is most properly understood as a quotient, namely the
integers modulo n.
15 For each x ∈ R, let Ex = ({x}, ∗) be the group with binary operation x ∗ x = x (note
that ∗ is neither + nor ×!). Then if x ≠ y, Ex ≠ Ey but Ex ≅ Ey for all x, y ∈ R, via
x ↦ y.

The ingredients for the construction are:
• a group (G, ∗) (to try to avoid confusion, we will revert to using an
explicit symbol to denote the binary operation in G);

• a normal subgroup N of G; and

• the set of (left) cosets LN = {gN | g ∈ G} of N in G.


Proposition B.86. Let G be a group and N a normal subgroup of G. Then
• : LN × LN → LN given by gN • hN = (g ∗ h)N defines a binary operation
on LN .
Definition B.87. Let (G, ∗) be a group and let N be a normal subgroup
of G. The quotient group of G by N , denoted G/N , is the group (LN , •)
where LN is the set {gN | g ∈ G} of left cosets of N in G and • is the binary
operation gN • hN = (g ∗ h)N .
Proposition B.88. The binary operation gN • hN = (g ∗ h)N defines a
group structure on LN = {gN | g ∈ G}.
From now on, we will use the notation G/N in preference to (LN , •), so
that when we write G/N it is understood that G/N as a set is the set of
left cosets of a normal subgroup N of G and that the binary operation is •.
Corollary B.89. We have |G/N | = |G|/|N |.
Definition B.90. Let n ∈ Z. The group of integers modulo n is defined
to be the quotient group (Z/nZ, +n ). We also often denote this group by
(Zn , +n ).
By the same argument given above for n = 4, Zn ≅ Cn is cyclic of order
n, with one choice of generator being 1 + nZ.
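The construction can be checked by machine for small examples. In the
Python sketch below (our encoding), G = Z12 and N = {0, 4, 8} (normal,
since G is Abelian), and we verify both |G/N | = |G|/|N | and the
well-definedness of the coset operation, i.e. that (g + N ) • (h + N ) =
(g + h) + N does not depend on the representatives chosen.

    G = range(12)      # Z_12 under addition mod 12
    N = {0, 4, 8}      # a (normal) subgroup

    def coset(g):
        # The coset g + N, as a frozenset so cosets can be compared.
        return frozenset((g + n) % 12 for n in N)

    cosets = {coset(g) for g in G}
    print(len(cosets))  # 4 = |G|/|N|, cf. Corollary B.89

    # Well-definedness: the product coset depends only on the cosets,
    # not on the representatives g and h chosen from them.
    for g in G:
        for h in G:
            for g2 in coset(g):
                for h2 in coset(h):
                    assert coset(g + h) == coset(g2 + h2)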

B.9 The First Isomorphism Theorem for Groups


First we will establish the aforementioned claim that every normal subgroup
is the kernel of a homomorphism. Then in the First Isomorphism Theorem
we will relate kernels, images and quotients.
Lemma B.91. Let G be a group and N a normal subgroup of G. The
function π : G → G/N defined by π(g) = gN is a group homomorphism.
Definition B.92. Let G be a group and N a normal subgroup of G. The
homomorphism π : G → G/N defined by π(g) = gN is called the quotient
homomorphism associated to G and N .
Proposition B.93. Let G be a group and N a normal subgroup of G. The
quotient homomorphism π : G → G/N is surjective and its kernel is Ker π =
N.

Corollary B.94. Let G be a group and N a normal subgroup of G. Then N
is the kernel of a group homomorphism with domain G, namely the quotient
homomorphism π : G → G/N .

Theorem B.95 (Universal property of the quotient group). Let G and H


be groups. Let N be a normal subgroup of G and let π : G → G/N be the
associated quotient homomorphism.
Then for every homomorphism ϕ : G → H such that N ⊆ Ker ϕ, there
exists a unique homomorphism ϕ̄ : G/N → H such that ϕ = ϕ̄ ◦ π.
Furthermore, Ker ϕ̄ = (Ker ϕ)/N and Im ϕ̄ = Im ϕ.

We can illustrate the maps in this theorem via the following diagram,
known as a “commutative diagram” (the “commutative” refers to the fact
that following along the arrows in either possible way gives the same result).
            ϕ
       G ────────→ H
        \         ↗
       π \       / ∃! ϕ̄
          ↘     /
           G/N
A consequence of this theorem is that there is a one-to-one correspond-
ence between homomorphisms ϕ : G → H with N ⊆ Ker ϕ and homomorph-
isms ϕ̄ : G/N → H, the correspondence being given by composing with π.
Since quotient groups can be complicated to understand, this tells us that
to find homomorphisms whose domain is G/N , we can instead look for ho-
momorphisms whose domain is G and just check if N is a subgroup of the
kernel of these. (Depending on the situation, the converse direction can be
very helpful too, of course.)
The above theorem has done much of the heavy lifting to enable us to
state and prove in a nice, clean fashion the First Isomorphism Theorem.
(Arguably, it should be called the “First Isomorphism Corollary” since it
follows essentially immediately from the previous theorem, but often the-
orems are called theorems for their significance rather than their outright
difficulty.)

Theorem B.96 (First Isomorphism Theorem for Groups). Let G and H be


groups and let ϕ : G → H be a group homomorphism. Then

G/ Ker ϕ ≅ Im ϕ.
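For example, by Examples B.77(e), for n ≥ 2 the sign homomorphism
Σ : Sn → {1, −1} has Ker Σ = An and Im Σ = {1, −1}, so the theorem gives
Sn /An ≅ ({1, −1}, ×), a cyclic group of order 2.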

B.10 The definition of a ring and some examples


We recall the definition of a ring.

Definition B.97. A ring is an algebraic structure (R, +, ×) such that (R, +)


is an Abelian group, × is associative and we have

a × (b + c) = (a × b) + (a × c) and (a + b) × c = (a × c) + (b × c)

for all a, b, c ∈ R. The identity element for the group (R, +) will be denoted
0R . The inverse of a with respect to + will be denoted by −a.
Remark B.98. The distributive laws can seem a little like they appear out
of thin air. However, if one stares at them for a while, one sees that in fact
they say that left and right multiplication are group homomorphisms. More
precisely, given a ∈ R, the function µ^L_a : R → R given by µ^L_a (b) = a × b is a
homomorphism from the Abelian group R = (R, +) to itself, and similarly
for µ^R_c : R → R, µ^R_c (b) = b × c. From this point of view, the conditions above
are not so unnatural.
Definition B.99. Let R be a ring. The set S(R) of (N0 -indexed) sequences
of elements of R, with the operations +S(R) and ×S(R) defined by

(a +S(R) b)n = an +R bn

and
(a ×S(R) b)n = ∑_{i+j=n} ai ×R bj

is a ring, called the ring of formal power series (in one variable) over R. We
denote this ring by R[[x]] and the ring R is called the base ring of R[[x]].
Remark B.100. If R has a multiplicative identity 1R (see Definition B.107)
then the “variable” x may be identified with the sequence defined by
xn = 1R for n = 1 and xn = 0R otherwise.

This covers most familiar situations but has the perhaps surprising implic-
ation that if R does not have a multiplicative identity, then x = 0R + 1R x +
0R x² + 0R x³ + · · · ∉ R[[x]].
Now that we have the ring of formal power series, it is straightforward to
define the ring of polynomials. Polynomials are just “special” power series,
namely they are the power series that are “eventually zero”, i.e. after some
point, every element of the sequence is 0R . Another way to say this is that
only finitely many elements of the corresponding sequence are non-zero. We
say that a sequence a is finitely supported if it has this property: a is finitely
supported if |{i | ai ≠ 0R }| is finite.
Definition B.101. Let R be a ring. The set Sfs (R) of finitely supported
(N0 -indexed) sequences of elements of R, with the operations +Sfs (R) and
×Sfs (R) defined by
(a +Sfs (R) b)n = an +R bn
and
(a ×Sfs (R) b)n = ∑_{i+j=n} ai ×R bj

is a ring, called the ring of polynomials (in one variable) over R. We denote
this ring by R[x] and the ring R is called the base ring of R[x].
Remark B.102. As in Remark B.100, if R has a multiplicative identity 1R
then the “variable” x may be identified with the (finitely supported) sequence
defined by xn = 1R for n = 1 and xn = 0R otherwise.
Again this leads to the counter-intuitive observation that if R does not have
a multiplicative identity, then x ∉ R[x], i.e. x is not a polynomial—the
point being that it is not a polynomial with coefficients in R. For a concrete
example, consider the ring 2Z[x] of polynomials in one variable x with even
integer coefficients: x ∉ 2Z[x].
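The multiplication rule is just the familiar “collecting of coefficients”, and
is easily implemented. In the Python sketch below (our encoding: a
polynomial over Z is a list of coefficients, constant term first) we compute
products by the formula of Definition B.101.

    def poly_mul(p, q):
        # (p*q)_n is the sum of p_i * q_j over all i + j = n.
        if not p or not q:
            return []
        out = [0] * (len(p) + len(q) - 1)
        for i, a in enumerate(p):
            for j, b in enumerate(q):
                out[i + j] += a * b
        return out

    # (1 + x)(1 - x) = 1 - x^2:
    print(poly_mul([1, 1], [1, -1]))  # [1, 0, -1]

The same double loop computes any single coefficient of a product of formal
power series, since only finitely many terms contribute to each coefficient;
this is why the product in Definition B.99 makes sense.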
Definition B.103. Let R be a ring and R[x] the polynomial ring in one
variable over R. Let p = (pn ) be a non-zero element of R[x], i.e. there exists
i such that pi ≠ 0R .
The degree of p is defined to be

deg p = max{k | pk ≠ 0R }.

For m = max{k | pk ≠ 0R } we say that pm is the leading coefficient of p
and pm xᵐ the leading term of p. If m = 0, we say that p is a constant
polynomial.
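For example, the element p = 3 + 0x + 2x² of Z[x], i.e. the sequence
(3, 0, 2, 0, 0, . . . ), has deg p = 2, leading coefficient 2 and leading term 2x²,
while q = 3 is a constant polynomial with deg q = 0.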
Proposition B.104. Let R be a ring such that for any two elements a, b ∈ R
with a, b ≠ 0R , we have ab ≠ 0R . Let R[x] be the ring of polynomials
in one variable over R and let p, q ∈ R[x]. Then
(i) deg(p + q) ≤ max{deg p, deg q}, and

(ii) if p, q ≠ 0R[x] then pq ≠ 0R[x] and

deg pq = deg p + deg q.

Proposition B.105 (The division algorithm for polynomials). Let F be a


field and let f, g ∈ F [x], with g 6= 0. Then there exist unique polynomials
q, r ∈ F [x] such that f = qg + r and either r = 0 or deg r < deg g.
The polynomial q is called the quotient, and r the remainder, of f divided
by g.
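The division algorithm is entirely constructive and can be carried out
mechanically: repeatedly cancel the leading term of the running remainder.
A Python sketch over Q (our encoding: coefficient lists with constant term
first, exact arithmetic via fractions.Fraction, and g is assumed to have
non-zero last entry, its leading coefficient):

    from fractions import Fraction

    def poly_divmod(f, g):
        # Return (q, r) with f = q*g + r and r = 0 or deg r < deg g.
        f = [Fraction(c) for c in f]
        g = [Fraction(c) for c in g]
        q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
        r = f[:]
        while len(r) >= len(g) and any(r):
            shift = len(r) - len(g)
            c = r[-1] / g[-1]           # cancel the leading term of r
            q[shift] = c
            for i, b in enumerate(g):
                r[i + shift] -= c * b
            while r and r[-1] == 0:     # drop trailing zero coefficients
                r.pop()
        return q, r

    # x^2 - 1 = (x - 1)(x + 1) + 0:
    print(poly_divmod([-1, 0, 1], [1, 1]))  # ([-1, 1], [])

Each pass through the loop strictly decreases deg r, which is exactly the
induction behind the proof of Proposition B.105.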

B.11 Basic properties of rings


Proposition B.106. Let (R, +, ×) be a ring and 0R the identity element
for +. Recall that we denote the inverse of a with respect to + by −a.
(i) (additive cancellation) For all a, b, c ∈ R, if a + b = a + c then b = c.

(ii) For all a ∈ R, 0R × a = 0R = a × 0R .

(iii) For all a, b ∈ R, a×(−b) = (−a)×b = −(a×b) and (−a)×(−b) = a×b.


Definition B.107. Let (R, +, ×) be a ring. An element 1R ∈ R is called a
multiplicative identity if 1R is an identity for the binary operation ×, that
is, for all a ∈ R,
a × 1R = a = 1R × a.
A ring having a multiplicative identity is said to be unital.
Notice that a multiplicative identity is unique if it exists: an identity for
a binary operation is always unique.
Many of the examples of rings given in Section B.10 have multiplicative
identities.
Examples B.108.
(a) The trivial ring ({e}, +, ×) with e + e = e and e × e = e has multiplic-
ative identity e, as we see from the definition of ×. One can show that
the only ring R with a multiplicative identity 1R in which 1R = 0R is
the trivial ring.

(b) The ring (Z, +, ×), with the usual addition and multiplication of in-
tegers, has multiplicative identity 1; similarly for Q, R and C, with
their usual operations.

(c) The integers modulo n, Zn , with addition and multiplication modulo


n has a multiplicative identity 1̂ = 1 + nZ. Later we will see why this
should not be a surprise.

(d) The matrix rings Mn (Z), Mn (Q), Mn (R) and Mn (C) all have multi-
plicative identities, namely the n × n identity matrix In , defined by
(In )ij = 1 if i = j and (In )ij = 0 otherwise.

The matrix ring Mn (R), for R an arbitrary ring, need not have a
multiplicative identity; shortly we will see precisely when it does.

(e) The ring of functions F(Y, R) (respectively F(Y, C)) (where Y is any
set) has a multiplicative identity, the constant function 1 : Y → R
(respectively 1 : Y → C) defined by 1(y) = 1 for all y ∈ Y .

(f) The polynomial rings Z[x], Q[x], R[x] and C[x] have multiplicative
identities, namely the constant polynomial 1 = 1 + 0x + 0x² + · · · .
The polynomial ring R[x], for R an arbitrary ring, need not have a
multiplicative identity; again we shall shortly see precisely when it
does.

Just as in the case of groups, rings with a commutative multiplication
operation are particularly special. Remember that a ring (R, +, ×) is by
definition an Abelian group with respect to +, so a + b = b + a for all a, b
always. The issue at hand for rings is whether or not the multiplication
operation × is commutative.
Definition B.109. Let (R, +, ×) be a ring. We say that R is a commutative
ring if the multiplication operation × is commutative. That is, a × b = b × a
for all a, b ∈ R.
If R is not commutative, we say R is non-commutative.
Note that the negation of (∀a, b ∈ R)(a × b = b × a) is
¬ ((∀a, b ∈ R)(a × b = b × a)) = (∃ a, b ∈ R)(a × b ≠ b × a),
so that to show a ring R is non-commutative we must find two distinct
(specific) elements of R whose products a × b and b × a are not equal. (We
must have a ≠ b, since a × a = a × a, of course.)
We have already seen a number of examples of commutative and non-
commutative rings.
Examples B.110.
(a) As you have known for a long time, Z, Q, R and C are commutative
rings.
(b) The integers modulo n, Zn , are a commutative ring.
(c) Matrix rings are usually non-commutative; Lemma B.111 below makes
this precise.
(d) The rings of functions F(Y, R) and F(Y, C) are commutative: multi-
plication is defined pointwise so if f, g ∈ F(Y, R),
(f g)(y) = f (y)g(y) = g(y)f (y) = (gf )(y)
for all y ∈ Y ; similarly for f, g ∈ F(Y, C). A little careful thought
shows that the key issue here for F(Y, R) with R an arbitrary ring is
whether or not R is commutative.
(e) Polynomial rings R[x] can be commutative or not, depending on R.
(f) The ring of even integers 2Z is commutative, because its multiplica-
tion is that of Z, restricted to the even integers, and multiplication of
integers is commutative.
Lemma B.111. Let R be a ring such that there exist r, s ∈ R with rs ≠ 0R .
Then the matrix ring Mn (R) is commutative if and only if n = 1 and R is
commutative. That is, Mn (R) is non-commutative if n ≥ 2, and M1 (R) is
non-commutative if and only if R is.
Lemma B.112. Let R be a ring. Then the polynomial ring R[x] is com-
mutative if and only if R is.

B.12 Subrings
Just as with groups and subgroups, it is natural to study subsets of rings
that are themselves rings. Indeed, in a few places above it would have been
natural to say “subring”, so it is not before time that we give the definition.
Definition B.113. Let (R, +, ×) be a ring. A subset S of R is said to be
a subring if (S, +, ×) is a ring. If this is the case, we write S ≤ R.
Proposition B.114 (Subring test). Let S be a subset of a ring (R, +, ×)
and denote the additive identity element of R by 0R . Then S is a subring of
R if and only if the following two conditions are satisfied:
(SR1) S is a subgroup of (R, +);
(SR2) for all s1 , s2 ∈ S, we have s1 × s2 ∈ S.
Lemma B.115. Every subring of a commutative ring is commutative.
Exercise B.116. Let R and S be rings, and let T denote their Cartesian
product, that is,

T = R × S = {(a, b) | a ∈ R, b ∈ S}.

Equip T with the coordinate-wise defined addition and multiplication oper-


ations

(a, b) +T (c, d) = (a +R c, b +S d)

and

(a, b) ×T (c, d) = (a ×R c, b ×S d)

for a, c ∈ R and b, d ∈ S.
(a) Show that T is a ring with respect to these operations.
(b) Show that T is commutative if and only if R and S are commutative.

B.13 Homomorphisms
Earlier, in Section B.5, we saw that the correct way to relate two groups
(G, ∗) and (H, ◦) is to define a group homomorphism, this being a function
ϕ : G → H such that ϕ(g1 ∗ g2 ) = ϕ(g1 ) ◦ ϕ(g2 ) for all g1 , g2 ∈ G. A
ring (R, +, ×) is by definition an Abelian group (R, +) with a compatible
multiplication ×, so a ring homomorphism must be a group homomorphism
that also preserves the multiplication operations.
Definition B.117. Let (R, +R , ×R ) and (S, +S , ×S ) be rings. A function
ϕ : R → S is called a ring homomorphism if ϕ is a group homomorphism
and preserves the multiplication operations. Explicitly,

(H1) ϕ(r1 +R r2 ) = ϕ(r1 ) +S ϕ(r2 ) and

(H2) ϕ(r1 ×R r2 ) = ϕ(r1 ) ×S ϕ(r2 ),

for all r1 , r2 ∈ R.

Let us again list some elementary properties of ring homomorphisms,


the first two of which are immediate from a ring homomorphism being in
particular a group homomorphism, so are restatements of parts of Proposi-
tion B.58. The third is easily proved by induction.

Proposition B.118. Let ϕ : R → S be a ring homomorphism. Then

(i) ϕ(0R ) = 0S ;

(ii) ϕ(−r) = −ϕ(r) for all r ∈ R;

(iii) ϕ(rn ) = ϕ(r)n for all r ∈ R, n ∈ N0 .

Proposition B.119. Let ϕ : R → S and σ : S → T be ring homomorphisms.


Then σ ◦ ϕ : R → T is a ring homomorphism.

Example B.120. Next we give the ring analogue of the trivial group homo-
morphism defined in Exercise B.61.
Let R and S be rings and let ϕ : R → S be the function defined by
ϕ(a) = 0S for all a ∈ R. Then ϕ is a ring homomorphism because for all
a, b ∈ R, we have

(H1) ϕ(a + b) = 0S = 0S + 0S = ϕ(a) + ϕ(b) and

(H2) ϕ(ab) = 0S = 0S · 0S = ϕ(a)ϕ(b).

We call ϕ the zero homomorphism; all other homomorphisms are said to be


non-zero.
Example B.121. Recall that for any set B and any subset A ⊆ B, there is
an (injective) function ι : A → B defined by ι(a) = a for all a ∈ A. This is
called the inclusion map, as it precisely encodes A being “included” in B,
as a subset.
Let S be a subring of a ring R, so that in particular S ⊆ R, and let
ι : S → R be the inclusion map. Then ι is a ring homomorphism, as is easily
checked. (The analogous statement for groups is also true, but since we did
not need it earlier, it was not stated previously.)

Definition B.122. A ring homomorphism ϕ : R → S that is bijective is


called a ring isomorphism.

Again, the identity function idR : R → R, idR (r) = r is a ring homo-
morphism16 and is a bijection, so is a ring isomorphism.

Definition B.123. We say that two rings R and S are isomorphic if and
only if there exists a ring isomorphism ϕ : R → S. If R and S are isomorphic,
we often write R ≅ S.

The following lemma follows immediately from Lemma B.65, since we


simply add the property (H2) to all three conditions in the corresponding
statement for group homomorphisms.

Lemma B.124. Let R and S be rings. Then the following are equivalent:

(i) R and S are isomorphic;

(ii) there exists a bijective homomorphism ϕ : R → S; and

(iii) there exist ring homomorphisms ϕ : R → S and ψ : S → R such that


ψ ◦ ϕ = IR and ϕ ◦ ψ = IS .

Definition B.125. Let R and S be rings. If S contains a subring T such


that T is isomorphic to R, then we say that S contains a subring isomorphic
to R.

In practice we use the following criterion to decide whether or not a ring


contains a subring isomorphic to some other ring.

Proposition B.126. Let R and S be rings. Then S contains a subring


isomorphic to R if and only if there is an injective ring homomorphism
from R to S.

If R and S are unital rings, we may or may not have that a ring homo-
morphism sends 1R to 1S . If we want to be sure of this, we need to ask for
it explicitly:

Definition B.127. Let R and S be unital rings. We say that a ring homo-
morphism ϕ : R → S is unital if ϕ(1R ) = 1S .

B.14 Kernels and images


Definition B.128. Let ϕ : R → S be a ring homomorphism. The kernel of
ϕ is defined to be the subset of R given by

Ker ϕ = {r ∈ R | ϕ(r) = 0S }.
16 The collection of all rings together with all ring homomorphisms between them also
forms a category.

Definition B.129. Let ϕ : R → S be a ring homomorphism. The image of
ϕ is defined to be the subset of S given by

Im ϕ = {s ∈ S | (∃r ∈ R)(ϕ(r) = s)}.

Proposition B.130. Let ϕ : R → S be a ring homomorphism.

(i) The kernel of ϕ, Ker ϕ, is a subring of R.

(ii) The image of ϕ, Im ϕ, is a subring of S.

(iii) The homomorphism ϕ is injective if and only if Ker ϕ = {0R }.

(iv) The homomorphism ϕ is surjective if and only if Im ϕ = S.

(v) The homomorphism ϕ is an isomorphism if and only if Ker ϕ = {0R }


and Im ϕ = S.
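For example, for n ∈ N the reduction map ϕ : Z → Zn , ϕ(a) = a + nZ, is a
surjective ring homomorphism with Ker ϕ = nZ and Im ϕ = Zn .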

Proposition B.131. Let R be a ring. There is an injective ring homo-


morphism ϕ : R → R[x] whose image is the set of constant polynomials.

Proposition B.132. Let R and S be rings and let ϕ : R → S be a non-


zero homomorphism. Suppose that R has a multiplicative identity 1R . Then
ϕ(1R ) is a multiplicative identity in Im ϕ.

B.15 Quotient rings


Definition B.133. Let (R, +, ×) be a ring. A subset I ⊆ R is said to be
an ideal if

(I1) I is an (additive) subgroup of (R, +); and

(I2) (Closure under multiplication by an arbitrary element of R) for all


a ∈ I and r ∈ R, a × r ∈ I and r × a ∈ I.

If I is an ideal of R, we write I ⊴ R; we write I ◁ R if I is a proper ideal,
that is, I ⊴ R and I ⊊ R.
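For example, for any n ∈ Z the subset nZ is an ideal of Z: it is an additive
subgroup, and any integer multiple of a multiple of n is again a multiple of
n. By contrast, Z is a subring of Q but not an ideal of Q: for instance,
(1/2) × 1 = 1/2 ∉ Z.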

Proposition B.134. Let (R, +, ×) be a ring and I an ideal of R.


Let LI = {r + I | r ∈ R} denote the set of left cosets of I in R. Then

+I : LI × LI → LI , (r + I) +I (s + I) = (r + s) + I

and
×I : LI × LI → LI , (r + I) ×I (s + I) = (r × s) + I
define binary operations on LI .

Proposition B.135. Let (R, +, ×) be a ring and I an ideal.

(i) The operation +I defines a group structure on LI , the set of left cosets
of I in R (with respect to addition).

(ii) The operation ×I is associative and distributes over +I .


Definition B.136. Let (R, +, ×) be a ring and let I be an ideal of R. The
quotient ring of R by I, denoted R/I, is the ring (LI , +I , ×I ) where LI is
the set {r + I | r ∈ R} of left cosets of I in R and +I and ×I are the binary
operations
(r + I) +I (s + I) = (r + s) + I
and
(r + I) ×I (s + I) = (r × s) + I
The additive identity of the quotient ring R/I is 0R/I = 0R + I.
Definition B.137. Let n ∈ Z. The ring of integers modulo n is defined
to be the quotient ring (Z/nZ, +n , ×n ). We also often denote this ring by
(Zn , +n , ×n ).
Proposition B.138. Let I be an ideal in a ring R.
(i) Suppose that R is commutative. Then the quotient ring R/I is com-
mutative.

(ii) Suppose that R has a multiplicative identity 1R . Then 1R + I is a mul-


tiplicative identity in the quotient ring R/I.

B.16 The First Isomorphism Theorem for Rings


Lemma B.139. Let R be a ring and I an ideal of R. The function
π : R → R/I defined by π(r) = r + I is a ring homomorphism.
Definition B.140. Let R be a ring and I an ideal of R. The ring ho-
momorphism π : R → R/I defined by π(r) = r + I is called the quotient
homomorphism associated to R and I.
Proposition B.141. Let R be a ring and I an ideal of R. The quotient
homomorphism π : R → R/I is surjective and its kernel is Ker π = I.
Corollary B.142. Let R be a ring and I an ideal of R. Then I is the kernel
of a ring homomorphism with domain R, namely the quotient homomorph-
ism π : R → R/I.
Theorem B.143 (Universal property of the quotient ring). Let R and S be
rings. Let I be an ideal of R and let π : R → R/I be the associated quotient
homomorphism.
Then for every ring homomorphism ϕ : R → S such that I ⊆ Ker ϕ,
there exists a unique ring homomorphism ϕ̄ : R/I → S such that ϕ = ϕ̄ ◦ π.
Furthermore, Ker ϕ̄ = (Ker ϕ)/I and Im ϕ̄ = Im ϕ.

Theorem B.144 (First Isomorphism Theorem for Rings). Let R and S be
rings and let ϕ : R → S be a ring homomorphism. Then

R/ Ker ϕ ≅ Im ϕ.
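For example, let ϕ : R[x] → C be the evaluation homomorphism ϕ(p) = p(i),
where i² = −1. Then ϕ is surjective, since a + bx ↦ a + bi, and Ker ϕ
consists of the real polynomials with i as a root, which are exactly those
divisible by x² + 1. The theorem then identifies the quotient of R[x] by this
ideal with C.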

B.17 Vector spaces


Above, we said that a vector space over a field K is an Abelian group V
together with a scalar multiplication · : K × V → V , suitably compatible
with the Abelian group structure.
Then we simply “enhance” the group-theoretic concepts to vector spaces
by asking or checking at each stage that the idea in question can be made
compatible with the extra scalar multiplication.
Specifically, a subspace W of a vector space V is a subgroup of V closed
under scalar multiplication.
A homomorphism of vector spaces or, to give it its more common name,
a linear transformation (or map) T : V → W is a homomorphism of Abelian
groups such that T (λv) = λT (v) for all λ ∈ K and v ∈ V .
Two important quantities associated to a linear endomorphism T : V →
V of a finite-dimensional vector space are the determinant and the trace.
To define both, we choose a basis B = {b1 , . . . , bn } for V . Then we may
write down the matrix [T ]_B^B = (tij ) whose entries are determined by
T bi = ∑_j tij bj where bi ∈ B. The trace of T , tr(T ) or tr([T ]_B^B ), is the
sum of the diagonal entries of [T ]_B^B , i.e. ∑_i tii ; one should show that
this does not depend on the choice of basis.
The determinant is a little more complicated, but we have introduced
what we need above:

det(T ) := ∑_{σ∈Sn} sign(σ) t1σ(1) t2σ(2) · · · tnσ(n)

where [T ]_B^B = (tij ).
Eigenvectors and eigenvalues will be important for us, too. An eigen-
vector for a linear map T : V → V is a vector v ∈ V such that there exists
λ ∈ K such that T v = λv. The scalar λ is called the eigenvalue associated
to the eigenvector v.
To find eigenvalues, we may use the following result.
Theorem B.145. Let T : V → V and set M = [T ]_B^B for some choice of
basis B for V . Then for any λ ∈ K there exists an eigenvector v for T with
eigenvalue λ if and only if λ is a root of the characteristic polynomial of T ,
det(M − tI).
Then to find eigenvectors, we solve the simultaneous linear equations
M v = λv. (Fast computational methods to compute eigenvectors and
eigenvalues exist, but it is also important to know how to do this by hand.)

The subset of eigenvectors for a fixed eigenvalue λ forms a subspace of
V , which we denote Vλ and call the λ-eigenspace.
If a vector space V has a basis consisting of eigenvectors for a linear
transformation T : V → V , then we say T is diagonalisable, for then with
respect to this basis, T is represented by a diagonal matrix with the eigen-
values on the diagonal. It follows that if T is diagonalisable, its trace is equal
to the sum of its eigenvalues and its determinant is equal to the product of
its eigenvalues.
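These facts are easy to verify numerically in small cases. A throwaway
Python computation for the 2 × 2 case (our own sketch, using that the
characteristic polynomial of a 2 × 2 matrix M is t² − tr(M )t + det(M )):

    import cmath

    M = [[2.0, 1.0],
         [1.0, 2.0]]   # symmetric, hence diagonalisable

    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

    # Roots of the characteristic polynomial t^2 - tr*t + det:
    disc = cmath.sqrt(tr * tr - 4 * det)
    eig1, eig2 = (tr + disc) / 2, (tr - disc) / 2

    print(eig1, eig2)         # (3+0j) (1+0j)
    print(eig1 + eig2 - tr)   # 0j: trace = sum of the eigenvalues
    print(eig1 * eig2 - det)  # 0j: determinant = product of the eigenvalues

Here the eigenvalues 3 and 1 have eigenvectors (1, 1) and (1, −1)
respectively, which form a basis of R², so this matrix is indeed
diagonalisable.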

B.18 Quotient vector spaces


Since vector spaces are Abelian groups and every subgroup of an Abelian
group is a normal subgroup, we may take the quotient group and show that
this inherits a vector space structure.
Let W be a subspace of V . Then the quotient vector space V /W is the
Abelian group V /W = {v + W | v ∈ V } with (v + W ) + (v′ + W ) = (v + v′ ) + W
and λ · (v + W ) = λv + W . We leave it as an exercise to check the necessary
properties; they all “descend” from those for V . (The well-definedness of the
construction has already been handled by the theory of quotient groups.)
Note that if V is a finite-dimensional vector space and W a subspace of
V , then dim V /W = dim V − dim W . Indeed this follows from proving the
stronger result that V ≅ W ⊕ V /W .
The point here is that due to the “freeness” of vector spaces (essentially,
this boils down to the existence of bases), in the category of vector spaces
every short exact sequence of vector spaces splits. The correct, sophisticated
way to say this is that K-Mod is a semisimple category.
