Joy Morris
University of Lethbridge
Combinatorics (Morris)
This text is disseminated via the Open Education Resource (OER) LibreTexts Project (https://LibreTexts.org) and like the hundreds
of other texts available within this powerful platform, it is freely available for reading, printing and "consuming." Most, but not all,
pages in the library have licenses that may allow individuals to make changes, save, and print this book. Carefully
consult the applicable license(s) before pursuing such effects.
Instructors can adopt existing LibreTexts texts or Remix them to quickly build course-specific resources to meet the needs of their
students. Unlike traditional textbooks, LibreTexts’ web based origins allow powerful integration of advanced features and new
technologies to support learning.
The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform
for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our
students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-
access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource
environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being
optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are
organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields)
integrated.
The LibreTexts libraries are Powered by NICE CXOne and are supported by the Department of Education Open Textbook Pilot
Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions
Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120,
1525057, and 1413739.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptions contact [email protected]. More information on our
activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog
(http://Blog.Libretexts.org).
This text was compiled on 12/01/2023
TABLE OF CONTENTS
Licensing
1: Introduction
1: What is Combinatorics?
1.1: Enumeration
1.2: Graph Theory
1.3: Ramsey Theory
1.4: Design Theory
1.5: Coding Theory
1.6: Summary
2: Enumeration
2: Basic Counting Techniques
2.1: The Product Rule
2.2: The Sum Rule
2.3: Putting Them Together
2.4: Summing Up
2.5: Summary
3: Permutations, Combinations, and the Binomial Theorem
3.1: Permutations
3.2: Combinations
3.3: The Binomial Theorem
3.4: Summary
4: Bijections and Combinatorial Proofs
4.1: Counting via Bijections
4.2: Combinatorial Proofs
4.3: Summary
5: Counting with Repetitions
5.1: Unlimited Repetition
5.2: Sorting a Set that Contains Repetition
5.3: Summary
6: Induction and Recursion
6.1: Recursively-Defined Sequences
6.2: Basic Induction
6.3: More Advanced Induction
6.4: Summary
7: Generating Functions
7.1: What is a Generating Function?
7.2: The Generalized Binomial Theorem
7.3: Using Generating Functions To Count Things
7.4: Summary
8: Generating Functions and Recursion
8.1: Partial Fractions
8.2: Factoring Polynomials
8.3: Using Generating Functions to Solve Recursively-Defined Sequences
8.4: Summary
9: Some Important Recursively-Defined Sequences
9.1: Derangements
9.2: Catalan Numbers
9.3: Bell Numbers and Exponential Generating Functions
9.4: Summary
10: Other Basic Counting Techniques
10.1: The Pigeonhole Principle
10.2: Inclusion-Exclusion
10.3: Summary
3: Graph Theory
11: Basics of Graph Theory
11.1: Background
11.2: Basic Definitions, Terminology, and Notation
11.3: Deletion, Complete Graphs, and the Handshaking Lemma
11.4: Graph Isomorphisms
11.5: Summary
12: Moving Through Graphs
12.1: Directed Graphs
12.2: Walks and Connectedness
12.3: Paths and Cycles
12.4: Trees
12.5: Summary
13: Euler and Hamilton
13.1: Euler Tours and Trails
13.2: Hamilton Paths and Cycles
13.3: Summary
14: Graph Coloring
14.1: Edge Coloring
14.2: Ramsey Theory
14.3: Vertex Colouring
14.4: Summary
15: Planar Graphs
15.1: Planar Graphs
15.2: Euler’s Formula
15.3: Map Colouring
15.4: Summary
4: Design Theory
16: Latin Squares
16.1: Latin Squares and Sudokus
16.2: Mutually Orthogonal Latin Squares (MOLS)
16.3: Systems of Distinct Representatives
16.4: Summary
17: Designs
17.1: Balanced Incomplete Block Designs (BIBD)
17.2: Constructing Designs and Existence of Designs
17.3: Fisher’s Inequality
17.4: Summary
18: More Designs
18.1: Steiner and Kirkman Triple Systems
18.2: t-Designs
18.3: Affine Planes
18.4: Projective Planes
18.5: Summary
19: Designs and Codes
19.1: Introduction
19.2: Error-Correcting Codes
19.3: Using the Generator Matrix For Encoding
19.4: Using the Parity-Check Matrix For Decoding
19.5: Codes From Designs
19.6: Summary
Index
List of Notation
Appendix A: Solutions To Selected Exercises
Detailed Licensing
Licensing
A detailed breakdown of this resource's licensing can be found in Back Matter/Detailed Licensing.
SECTION OVERVIEW
1: Introduction
Combinatorics is a subfield of “discrete mathematics,” so we should begin by asking what discrete mathematics means. The
boundaries of the field are to some extent a matter of opinion, and various mathematicians might classify specific topics differently.
“Discrete” should not be confused with “discreet,” which is a much more commonly-used word. They share the same Latin root,
“discretio,” which has to do with wise discernment or separation. In the mathematical “discrete,” the emphasis is on separateness,
so “discrete” is the opposite of “continuous.” If we are studying objects that can be separated and treated as a (generally countable)
collection of units rather than a continuous structure, then this study falls into discrete mathematics.
In calculus, we deal with continuous functions, so calculus is not discrete mathematics. In linear algebra, our matrices often have
real entries, so linear algebra also does not fall into discrete mathematics.
Textbooks on discrete mathematics often include some logic, as discrete mathematics is often used as a gateway course for upper-
level math. Elementary number theory and set theory are also sometimes covered. Algorithms are a common topic, as algorithmic
techniques tend to work very well on the sorts of structures that we study in discrete mathematics.
In Combinatorics, we focus on combinations and arrangements of discrete structures. There are five major branches of
combinatorics that we will touch on in this course: enumeration, graph theory, Ramsey Theory, design theory, and coding theory.
(The related topic of cryptography can also be studied in combinatorics, but we will not touch on it in this course.) We will focus
on enumeration, graph theory, and design theory, but will briefly introduce the other two topics.
1: What is Combinatorics?
1.1: Enumeration
1.2: Graph Theory
1.3: Ramsey Theory
1.4: Design Theory
1.5: Coding Theory
1.6: Summary
This page titled 1: Introduction is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
1.1: Enumeration
Enumeration is a big fancy word for counting. If you’ve taken a course in probability and statistics, you’ve already covered some
of the techniques and problems we’ll be covering in this course. When a statistician (or other mathematician) is calculating the
“probability” of a particular outcome in circumstances where all outcomes are equally likely, what they usually do is enumerate all
possible outcomes, and then figure out how many of these include the outcome they are looking for.
Example 1.1.1
That was an example that you could probably figure out without having studied enumeration or probability, but it nonetheless
illustrates the basic principle behind many calculations of probability. The object of enumeration is to enable us to count outcomes
in much more complicated situations. This sometimes has natural applications to questions of probability, but our focus will be on
the counting, not on the probability.
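The principle described above (enumerate every equally likely outcome, then count those with the desired property) can be sketched in a few lines of Python. The experiment used here, rolling two fair dice and asking for a total of 7, is our own illustration and is not taken from the text.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of rolling two fair six-sided dice
# (an illustrative experiment chosen for this sketch, not one from the text).
outcomes = list(product(range(1, 7), repeat=2))

# The outcomes that have the property we are looking for: the dice sum to 7.
favourable = [roll for roll in outcomes if sum(roll) == 7]

# With equally likely outcomes, the probability is a ratio of two counts.
probability = Fraction(len(favourable), len(outcomes))
print(len(outcomes), len(favourable), probability)
```

Listing outcomes explicitly like this only works for small problems, which is exactly why we want counting techniques that avoid the listing.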
After studying enumeration in this course, you should be able to solve problems such as:
If you are playing Texas Hold’em poker against a single opponent, and the two cards in your hand are a pair, what is the
probability that your opponent has a higher pair?
How many distinct Shidokus (4-by-4 Sudokus) are there?
How many different orders of five items can be made from a bakery that makes three kinds of cookies?
Male honeybees come from a queen bee’s unfertilised eggs, so have only one parent (a female). Female honeybees have two
parents (one male, one female). Assuming all ancestors were distinct, how many ancestors does a male honeybee have from 10
generations ago?
Although all of these questions (and many others that arise naturally) may be of interest to you, the reason we begin our study with
enumeration is that we’ll be able to use many of the techniques we learn to count the other structures we’ll be studying.
1.2: Graph Theory
When a mathematician talks about graph theory, she is not referring to the “graphs” that you learn about in school, that can be
produced by a spreadsheet or a graphing calculator. The “graphs” that are studied in graph theory are models of networks.
Any network can be modeled by using dots to represent the nodes of the network (the cities, computers, cell phones, or whatever is
being connected) together with lines to represent the connections between those nodes (the roads, wires, wireless connections, etc.).
This model is called a graph. It is important to be aware that only at a node may information, traffic, etc. pass from one edge
of a graph to another edge. If we want to model a highway network using a graph, and two highways intersect in the middle of
nowhere, we must still place a node at that intersection. If we draw a graph in which edges cross over each other but there is no
node at that point, you should think of it as if there is an overpass there with no exits from one highway to the other: the roads
happen to cross, but they are not connecting in any meaningful sense.
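One common way to store such a model on a computer is to record, for each node, the set of nodes it is joined to. The sketch below assumes a small made-up highway network; the node names and the helper `build_graph` are hypothetical and serve only as an illustration.

```python
# Hypothetical highway network: the node names are made up for illustration.
edges = [
    ("A", "B"),
    ("B", "C"),
    ("C", "D"),
    ("A", "C"),
]

def build_graph(edge_list):
    """Return a dict mapping each node to the set of its neighbours."""
    graph = {}
    for u, v in edge_list:
        # Each line joins two nodes, so record the connection in both directions.
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

graph = build_graph(edges)
for node in sorted(graph):
    print(node, sorted(graph[node]))
```

Notice that this representation records only which nodes are joined. Whether two lines happen to cross in a particular drawing is not stored at all, which matches the "overpass" interpretation described above.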
Example 1.2.1
[Figure: drawing of a graph, not reproduced here]
is a graph.
Many questions that have important real-world applications can be modeled with graphs. These are not always limited to questions
that seem to apply to networks. Some questions can be modeled as graphs by using lines to represent constraints or some other
relationship between them (e.g. the nodes might represent classes, with a line between them if they cannot be scheduled at the same
time, or some nodes might represent students and others classes, with a line between a student and each of the classes he or she is
taking).
After studying graph theory in this course, you should be able to solve problems such as:
How can we find a good route for garbage trucks to take through a particular city?
Is there a delivery route that visits every city in a particular area, without repetition?
Given a collection of project topics and a group of students each of whom has expressed interest in some of the topics, is it
possible to assign each student a unique topic that interests him or her?
A city has bus routes all of which begin and end at the bus terminal, but with different schedules, some of which overlap. What
is the least number of buses (and drivers) required in order to be able to complete all of the routes according to the schedule?
Create a schedule for a round-robin tournament that uses as few time slots as possible.
Some of these questions you may only be able to answer for particular kinds of graphs.
Graph theory is a relatively “young” branch of mathematics. Although some of the problems and ideas that we will study date back
a few hundred years, it was not until the 1930s that these individual problems were gathered together and a unified study of the
theory of graphs began to develop.
1.3: Ramsey Theory
Ramsey theory takes its name from Frank P. Ramsey, a British mathematician who died in 1930 at the tragically young age of 26,
when he developed jaundice after an operation.
Ramsey was a logician. A result that he considered a minor lemma in one of his logic papers now bears the name “Ramsey’s
Theorem” and was the basis for this branch of mathematics. Its statement requires a bit of graph theory: given $c$ colours and fixed
sizes $n_1, \ldots, n_c$, there is an integer $r = R(n_1, \ldots, n_c)$ such that for any colouring of the edges of a complete graph on $r$ vertices
with the $c$ colours, there must be some $i$ between $1$ and $c$ such that there is a complete subgraph on $n_i$ vertices, all of whose edges
are coloured with colour $i$.
In addition to requiring some graph theory, that statement was a bit technical. In much less precise terms that don’t require so much
background knowledge (but could be misleading in specific situations), Ramsey Theory asserts that if a structure is big enough and
contains a property we are interested in, then no matter how we cut it into pieces, at least one of the pieces should also have that
property. One major theorem in Ramsey Theory is van der Waerden’s Theorem, which states that for any two constants $c$ and $n$,
there is a constant $V(c, n)$ such that if we take $V(c, n)$ consecutive numbers and colour them with $c$ colours, there must be an
arithmetic progression of length $n$ all of whose members have been coloured with the same colour.
Example 1.3.1
Here is a small example of van der Waerden’s Theorem. With two colours and a desired length of 3 for the arithmetic
progression, we can show that $V(2, 3) > 8$ using the following colouring:
3 4 5 6 7 8 9 10
(In case it is difficult to see, we point out that 3, 4, 7, and 8 are black, while 5, 6, 9, and 10 are grey, a different colour.) Notice
that with eight consecutive integers, the difference in a three-term arithmetic progression cannot be larger than three. For every
three-term arithmetic progression with difference of one, two, or three, it is straightforward to check that the numbers have not
all received the same colour.
In fact, $V(2, 3) = 9$, but proving this requires exhaustive testing.
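The exhaustive testing mentioned here is small enough to hand to a computer. The following is a minimal sketch, with function names of our own choosing: it checks the colouring from the example, and then tries every colouring of eight and of nine consecutive integers with two colours.

```python
from itertools import product

def has_mono_ap(colouring, length=3):
    """True if some arithmetic progression of the given length is one colour.

    colouring[j] is the colour of the j-th of the consecutive integers.
    """
    n = len(colouring)
    for start in range(n):
        for diff in range(1, n):
            last = start + (length - 1) * diff
            if last >= n:
                break  # larger differences run past the end as well
            terms = {colouring[start + k * diff] for k in range(length)}
            if len(terms) == 1:
                return True
    return False

def every_colouring_has_mono_ap(n, colours=2, length=3):
    """Exhaustively test all colourings of n consecutive integers."""
    return all(has_mono_ap(c, length) for c in product(range(colours), repeat=n))

# The colouring of 3, 4, ..., 10 from the example (B = black, G = grey).
example = ("B", "B", "G", "G", "B", "B", "G", "G")
print(has_mono_ap(example))            # False: no one-colour progression
print(every_colouring_has_mono_ap(8))  # False: eight integers are not enough
print(every_colouring_has_mono_ap(9))  # True: nine integers always suffice
```

With two colours and nine integers there are only $2^9 = 512$ colourings to test, so the search finishes immediately; the number of colourings grows quickly for larger parameters.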
We will touch lightly on Ramsey Theory in this course, specifically on Ramsey’s Theorem itself, in the context of graph theory.
1.4: Design Theory
Like graph theory, design theory is probably not what any non-mathematician might expect from its name.
When researchers conduct an experiment, errors can be introduced by many factors (including, for example, the timing or the
subject of the experiment). It is therefore important to replicate the experiment a number of times, to ensure that these unintended
variations do not account for the success of a particular treatment. If a number of different treatments are being tested, replicating
all of them numerous times becomes costly and potentially infeasible. One way to reduce the total number of trials while still
maintaining the accuracy is to test multiple treatments on each subject, in different combinations.
One of the major early motivations for design theory was this context: given a fixed number of total treatments, and a fixed number
of treatments we are willing to give to any subject, can we find combinations of the possible treatments so that each treatment is
given to some fixed number of subjects, and any pair of treatments is given together some fixed number of times (often just once)?
This is the basic structure of a block design.
Example 1.4.1
Suppose that we have seven different fertilisers and seven garden plots on which to try them. We can organise them so that
each fertiliser is applied to three of the plots, each garden plot receives 3 fertilisers, and any pair of fertilisers is used together
on precisely one of the plots. If the different fertilisers are numbered one through seven, then a method for arranging them is to
place fertilisers 1, 2, and 3 on the first plot; 1, 4, and 5 on the second; 1, 6, and 7 on the third; 2, 4, and 6 on the fourth; 2, 5,
and 7 on the fifth; 3, 4, and 7 on the sixth; and 3, 5, and 6 on the last. Thus,
123 145 167
246 257 347
356
is a design.
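The three properties claimed in this example can be checked mechanically. Here is a short sketch that counts how often each fertiliser, and each pair of fertilisers, appears among the seven plots listed above; the variable names are our own.

```python
from collections import Counter
from itertools import combinations

# The seven plots from the example, each given as its set of fertilisers.
blocks = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
]

# How many plots each fertiliser is applied to.
fertiliser_counts = Counter(f for block in blocks for f in block)

# How many plots each pair of fertilisers shares.
pair_counts = Counter(
    pair for block in blocks for pair in combinations(sorted(block), 2)
)

all_pairs = list(combinations(range(1, 8), 2))
print(all(len(block) == 3 for block in blocks))                  # 3 per plot
print(all(fertiliser_counts[f] == 3 for f in range(1, 8)))       # 3 plots each
print(all(pair_counts[pair] == 1 for pair in all_pairs))         # pairs once
```

All three checks print `True`. A check like this confirms that a given arrangement is a design, but it does not tell us how to find one, which is the question design theory addresses.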
This basic idea can be generalised in many ways, and the study of structures like these is the basis of design theory. In this course,
we will learn some facts about when designs exist, and how to construct them.
After studying design theory in this course, you should be able to solve problems such as:
Is it possible for a design to exist with a particular set of parameters?
What methods might we use in trying to construct a design?
1.5: Coding Theory
In many people’s minds “codes” and “cryptography” are inextricably linked, and they might be hard-pressed to tell you the
difference. Nonetheless, the two topics are vastly different, as is the mathematics involved in them.
Coding theory is the study of encoding information into different symbols. When someone uses a code in an attempt to make a
message that only certain other people can read, this becomes cryptography. Cryptographers study strategies for ensuring that a
code is difficult to “break” for those who don’t have some additional information. In coding theory, we ignore the question of who
has access to the code and how secret it may be. Instead, one of the primary concerns becomes our ability to detect and correct
errors in the code.
Codes are used for many purposes in which the information is not intended to be secret. For example, computer programs are
transformed into long strings of binary data, that a computer reinterprets as instructions. When you text a photo to a friend, the
pixel and colour information are converted into binary data to be transmitted through radio waves. When you listen to an .mp3 file,
the sound frequencies of the music have been converted into binary data that your computer decodes back into sound frequencies.
Electronic encoding is always subject to interference, which can cause errors. Even when a coded message is physically etched
onto a device (such as a DVD), scratches can make some parts of the code illegible. People don’t like it when their movies, music, or
apps freeze, crash, or skip over something. To avoid this problem, engineers use codes that allow our devices to automatically
detect, and correct, minor errors that may be introduced.
In coding theory, we learn how to create codes that allow for error detection and correction, without requiring excessive memory or
storage capacity. Although coding theory is not a focus of this course, designs can be used to create good codes. We will learn how
to make codes from designs, and what makes these codes “good.”
Example 1.5.1
Suppose we have a string of binary information, and we want the computer to store it so we can detect if an error has arisen.
There are two symbols we need to encode: 0 and 1. If we just use 0 for 0 and 1 for 1, we’ll never know if a bit has been flipped
(from 0 to 1 or vice versa). If we use 00 for 0 and 01 for 1, then if the first bit gets flipped we’ll know there was an error
(because the first bit should never be 1), but we won’t notice if the second was flipped. If we use 00 for 0 and 11 for 1, then we
will be able to detect an error, as long as at most one bit gets flipped, because flipping one bit of either code word will result in
either 01 or 10, neither of which is a valid code word. Thus, this code allows us to detect an error. It does not allow us to
correct an error, because even knowing that a single bit has been flipped, there is no way of knowing whether a 10 arose from a
00 with the first bit flipped, or from a 11 with the second bit flipped. We would need a longer code to be able to correct errors.
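The reasoning in this example can be tested by flipping each bit of each code word in turn. The sketch below uses helper names of our own, and the final code {000, 111} is our own choice of a longer code for comparison; it does not appear in the example.

```python
def flip(word, position):
    """Return the word with the bit at the given position flipped."""
    bits = list(word)
    bits[position] = "1" if bits[position] == "0" else "0"
    return "".join(bits)

def single_flip_sources(code):
    """Map each one-bit corruption to the code words that could produce it."""
    sources = {}
    for word in code:
        for position in range(len(word)):
            sources.setdefault(flip(word, position), set()).add(word)
    return sources

def detects_single_errors(code):
    """True if no single flip turns a code word into another valid code word."""
    return all(corrupted not in code for corrupted in single_flip_sources(code))

def corrects_single_errors(code):
    """True if every single flip is detected and has a unique possible source."""
    sources = single_flip_sources(code)
    return detects_single_errors(code) and all(
        len(words) == 1 for words in sources.values()
    )

for code in [{"0", "1"}, {"00", "01"}, {"00", "11"}, {"000", "111"}]:
    print(sorted(code), detects_single_errors(code), corrects_single_errors(code))
```

The first two codes fail the detection test, {00, 11} detects but cannot correct (10 could have come from either code word), and the three-bit code passes both tests under the assumption that at most one bit is flipped.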
After our study of coding theory, you should be able to solve problems such as:
Given a code, how many errors can be detected?
Given a code, how many errors can be corrected?
Construct some small codes that allow detection and correction of small numbers of errors.
Exercise 1.5.1
Can you come up with an interesting counting problem that you wouldn’t know how to solve?
1.6: Summary
Enumeration
Graph theory
Ramsey theory
Design theory
Coding theory
SECTION OVERVIEW
2: Enumeration
2: Basic Counting Techniques
2.1: The Product Rule
2.2: The Sum Rule
2.3: Putting Them Together
2.4: Summing Up
2.5: Summary
7: Generating Functions
7.1: What is a Generating Function?
7.2: The Generalized Binomial Theorem
7.3: Using Generating Functions To Count Things
7.4: Summary
9: Some Important Recursively-Defined Sequences
9.1: Derangements
9.2: Catalan Numbers
9.3: Bell Numbers and Exponential Generating Functions
9.4: Summary
CHAPTER OVERVIEW
Thumbnail: The abacus is a calculating tool that has been in use since ancient times and is still in use today. The abacus consists of
a number of rows of movable beads or other objects, which represent digits. One of two numbers is set up, and the beads are
manipulated to implement an operation involving a second number (e.g., addition), or rarely a square or cube root. (Unsplash
License; Crissy Jarvis via Unsplash)
2.1: The Product Rule
The product rule is a rule that applies when there is more than one variable (i.e. thing that can change) involved in determining
the final outcome.
Example 2.1.1
Consider the example of buying coffee at a coffee shop that sells four varieties and three sizes. When you are choosing your
coffee, you need to choose both variety and size. One way of figuring out how many choices you have in total, would be to
make a table. You could label the columns with the sizes, and the rows with the varieties (or vice versa, it doesn’t matter). Each
entry in your table will be a different combination of variety and size:
As you can see, a different combination of variety and size appears in each space of the table, and every possible combination of
variety and size appears somewhere. Thus the total number of possible choices is the number of entries in this table. Although in a
small example like this we could simply count all of the entries and see that there are twelve, it will be more useful to notice that
elementary arithmetic tells us that the number of entries in the table will be the number of rows times the number of columns,
which is four times three.
In other words, to determine the total number of choices you have, we multiply the number of choices of variety (that is, the
number of rows in our table) by the number of choices of size (that is, the number of columns in our table). This is an example of
what we’ll call the product rule.
We’re now ready to state the product rule in its full generality.
Suppose that when you are determining the total number of outcomes, you can identify two different aspects that can vary. If
there are $n_1$ possible outcomes for the first aspect, and for each of those possible outcomes, there are $n_2$ possible outcomes for
the second aspect, then the total number of possible outcomes will be $n_1 n_2$.
In the above example, we can think of the aspects that can change as being the variety of coffee, and the size. There are four
outcomes (choices) for the first aspect, and three outcomes (choices) for the second aspect, so the total number of possible
outcomes is 4 ⋅ 3 = 12 .
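The table argument from the coffee example translates directly into code: pairing every variety with every size produces one entry per cell of the table. In this sketch the variety and size names are placeholders of our own, since all that matters for the count is that there are four varieties and three sizes.

```python
from itertools import product

# Placeholder names: only the numbers of varieties and sizes matter here.
varieties = ["variety 1", "variety 2", "variety 3", "variety 4"]
sizes = ["small", "medium", "large"]

# Each (variety, size) pair corresponds to one entry of the table.
choices = list(product(varieties, sizes))

print(len(choices))                 # number of table entries
print(len(varieties) * len(sizes))  # rows times columns
```

Both lines print 12, in agreement with the product rule.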
Sometimes it seems clear that there are more than two aspects that are varying. If this happens, we can apply the product rule more
than once to determine the answer, by first identifying two aspects (one of which may be “all the rest”), and then subdividing one
or both of those aspects. An example of this is the problem posed earlier of buying a doughnut to go with your coffee.
Example 2.1.2
Kyle wants to buy coffee and a doughnut. The local doughnut shop has five kinds of doughnuts for sale and sells four varieties
of coffee in three sizes (as in Example 2.1.1). How many different orders could Kyle make?
Solution
A natural way to divide Kyle’s options into two aspects that can vary is to consider separately his choice of doughnut, and his
choice of coffee. There are five choices for the kind of doughnut he orders, so $n_1 = 5$. For choosing the coffee, we have
already used the product rule in Example 2.1.1 to determine that the number of coffee options is $n_2 = 12$. Thus, by the product
rule, the total number of different orders Kyle could make is $n_1 n_2 = 5 \cdot 12 = 60$.
Let’s go through an example that more clearly involves repeated applications of the product rule.
Example 2.1.3
Chlöe wants to manufacture children’s t-shirts. There are generally three sizes of t-shirts for children: small, medium, and
large. She wants to offer the t-shirts in eight different colours (blue, yellow, pink, green, purple, orange, white, and black). The
shirts can have an image on the front, and a slogan on the back. She has come up with three images, and five slogans.
To stock her show room, Chlöe wants to produce a single sample of each kind of shirt that she will be offering for sale. The
shirts cost her $4 each to produce. How much are the samples going to cost her (in total)?
Solution
To solve this problem, observe that to determine how many sample shirts Chlöe will produce, we can consider the size as one
aspect, and the style (including colour, image, and slogan) as the other. There are $n_1 = 3$ sizes. So the number of samples will
be three times $n_2$, where $n_2$ is the number of possible styles.
We now break $n_2$ down further: to determine how many possible styles are available, you can divide this into two aspects: the
colour, and the decoration (image and slogan). There are $n_{2,1} = 8$ colours. So the number of styles will be eight times $n_{2,2}$,
where $n_{2,2}$ is the number of possible decorations (combinations of image and slogan) that are available.
We can break $n_{2,2}$ down further: to determine how many possible decorations are available, you divide this into the two
aspects of image and slogan. There are three images and five slogans, so $n_{2,2} = 3 \cdot 5 = 15$.
Putting all of this together, we see that Chlöe will have to create $3(8(3 \cdot 5)) = 360$ sample t-shirts. As each one costs $4, her
total cost will be $1440.
Notice that finding exactly two aspects that vary can be quite artificial. Example 2.1.3 serves as a good demonstration for a
generalisation of the product rule as we stated it above. In that example, it would have been more natural to have considered from
the start that there were four apparent aspects to the t-shirts that can vary: size, colour, image, and slogan. The total number of t-
shirts she needed to produce was the product of the number of possible outcomes of each of these aspects: 3 ⋅ 8 ⋅ 3 ⋅ 5 = 360 .
Suppose that when you are determining the total number of outcomes, you can identify $k$ different aspects that can vary. If for
each $i$ between $1$ and $k$ there are $n_i$ possible outcomes for the $i$th aspect, then the total number of possible outcomes will be
$n_1 n_2 \cdots n_k$.
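As a check on the generalised rule, the following sketch counts the t-shirts of Example 2.1.3 in two ways: by multiplying the number of outcomes for each aspect, and by listing every combination. The colour names come from the example; the image and slogan labels are placeholders of our own.

```python
from itertools import product
from math import prod

sizes = ["small", "medium", "large"]
colours = ["blue", "yellow", "pink", "green",
           "purple", "orange", "white", "black"]
images = ["image 1", "image 2", "image 3"]        # placeholder labels
slogans = [f"slogan {i}" for i in range(1, 6)]    # placeholder labels

aspects = [sizes, colours, images, slogans]

# Generalised product rule: multiply the number of outcomes of each aspect.
count_by_rule = prod(len(aspect) for aspect in aspects)

# Direct check: list every combination of one outcome per aspect.
count_by_listing = sum(1 for _ in product(*aspects))

cost_per_shirt = 4  # dollars, as given in the example
print(count_by_rule, count_by_listing, cost_per_shirt * count_by_rule)
```

Both counts are 360, and the total cost is 1440 dollars, matching the solution above.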
Now let’s look at an example where we are trying to evaluate a probability. Since this course is about counting rather than
probability, we’ll restrict our attention to examples where all outcomes are equally likely. Under this assumption, in order to
determine a probability, we can count the number of outcomes that have the property we’re looking for, and divide by the total
number of outcomes.
Example 2.1.4
Peter and Mary have two daughters. What is the probability that their next two children will also be girls?
Solution
To answer this, we consider each child as a different aspect. There are two possible sexes for their third child: boy or girl. For
each of these, there are two possible choices for their fourth child: boy or girl. So in total, the product rule tells us that there are
2 ⋅ 2 = 4 possible combinations for the sexes of their third and fourth children. This will be the denominator of the probability.
To determine the numerator (that is, the number of ways in which both children can be girls), we again consider each child as a
different aspect. There is only one possible way for the third child to be a girl, and then there is only one possible way for the
fourth child to be a girl. So in total, only one of the four possible combinations of sexes involves both children being girls.
The probability that their next two children will also be girls is $\frac{1}{4}$.
Notice that in this example, the fact that Peter and Mary’s first two children were girls was irrelevant to our calculations, because it
was already a known outcome, over and done with, so is true no matter what may happen with their later children. If Peter and
Mary hadn’t yet had any children and we asked for the probability that their first four children will all be girls, then our calculations
would have to include both possible options for the sex of each of their first two children. In this case, the final probability would
be $\frac{1}{16}$ (there are 16 possible combinations for the sexes of four children, only one of which involves all four being female).
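Both probabilities can be confirmed by listing the outcomes, under the same assumption used in the example that all combinations of sexes are equally likely. The function name below is our own.

```python
from fractions import Fraction
from itertools import product

def probability_all_girls(number_of_children):
    """Probability that every one of the given number of children is a girl,
    assuming all combinations of sexes are equally likely."""
    outcomes = list(product(["girl", "boy"], repeat=number_of_children))
    favourable = [o for o in outcomes if all(sex == "girl" for sex in o)]
    return Fraction(len(favourable), len(outcomes))

print(probability_all_girls(2))  # next two children: 1/4
print(probability_all_girls(4))  # first four children: 1/16
```

The denominators 4 and 16 are exactly the counts that the product rule gives for two and four children respectively.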
Exercise 2.1.1
2.2: The Sum Rule
The sum rule is a rule that can be applied to determine the number of possible outcomes when there are two different things that
you might choose to do (and various ways in which you can do each of them), and you cannot do both of them. Often, it is applied
when there is a natural way of breaking the outcomes down into cases.
Example 2.2.1
Recall the example of buying a bagel or a doughnut at a doughnut shop that sells five kinds of doughnuts and three kinds of
bagels. You are only choosing one or the other, so one way to determine how many choices you have in total would be to write
down all of the possible kinds of doughnut in one list, and all of the possible kinds of bagel in another list:
Doughnuts Bagels
custard-filled
original glazed
The total number of possible choices is the number of entries that appear in the two lists combined, which is five plus three.
In other words, to determine the number of choices you have, we add the number of choices of doughnut (that is, the number of
entries in the first list) and the number of choices of bagel (that is, the number of entries in the second list). This is an example
of the sum rule.
We’re now ready to state the sum rule in its full generality.
Suppose that when you are determining the total number of outcomes, you can identify two distinct cases with the property that
every possible outcome lies in exactly one of the cases. If there are $n_1$ possible outcomes in the first case, and $n_2$ possible
outcomes in the second case, then the total number of possible outcomes will be $n_1 + n_2$.
It’s hard to do much with the sum rule by itself, but we’ll cover a couple more examples and then in the next section, we’ll get into
some more challenging examples where we combine the two rules.
Sometimes the problem naturally splits into more than two cases, with every possible outcome lying in exactly one of the cases. If
this happens, we can apply the sum rule more than once to determine the answer. First we identify two cases (one of which may be
“everything else”) and then subdivide one or both of the cases. Let’s look at an example of this.
Example 2.2.2
Mary and Peter are planning to have no more than three children. What are the possible combinations of girls and boys they
might end up with, if we aren’t keeping track of the order of the children? (By not keeping track of the order of the children, I
mean that we’ll consider having two girls followed by one boy as being the same as having two girls and one boy in any other
order.)
Solution
To answer this question, we’ll break the problem into cases. First we’ll divide the problem into two possibilities: Mary and
Peter have no children; or they have at least one child. If Mary and Peter have no children, this can happen in only one way (no
boys and no girls). If Mary and Peter have at least one child, then they have between one and three children. We’ll have to
break this down further to find how many outcomes are involved.
We break the case where Mary and Peter have between one and three children down into two cases: they might have one child,
or they might have more than one child. If they have one child, that child might be a boy or a girl, so there are two possible
outcomes. If they have more than one child, again we’ll need to further subdivide this case.
The case where Mary and Peter have either two or three children naturally breaks down into two cases: they might have two
children, or they might have three children. If they have two children, the number of girls they have might be zero, one, or two,
so there are three possible outcomes (the remaining children, if any, must all be boys). If they have three children, the number
of girls they have might be zero, one, two, or three, so there are four possible outcomes (again, any remaining children must be
boys).
Now we put all of these outcomes together with the sum rule. We conclude that in total, there are 1 + (2 + (3 + 4)) = 10
different combinations of girls and boys that Mary and Peter might end up with.
Notice that it was artificial to repeatedly break this example up into two cases at a time. Thus, Example 2.2.2 serves as a good
demonstration for a generalisation of the sum rule as we stated it above. It would have been more natural to have broken the
problem of Mary and Peter’s kids up into four cases from the beginning, depending on whether they end up with zero, one, two, or
three kids. The total number of combinations of girls and boys that Mary and Peter might end up with is the sum of the
combinations they can end up with in each of these cases; that is, 1 + 2 + 3 + 4 = 10.
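The four-case count for Mary and Peter can be reproduced by listing the combinations. This is a small Python sketch under the same assumptions as the example (order of children ignored); the function name `family_combinations` is made up for illustration.

```python
def family_combinations(max_children):
    """List (girls, boys) pairs for families of at most max_children,
    ignoring birth order."""
    combos = []
    for total in range(max_children + 1):   # one case per family size
        for girls in range(total + 1):      # the remaining children are boys
            combos.append((girls, total - girls))
    return combos

combos = family_combinations(3)
# Number of combinations in each case: zero, one, two, or three children.
per_case = [sum(1 for g, b in combos if g + b == size) for size in range(4)]
print(per_case)     # [1, 2, 3, 4]
print(len(combos))  # 10
```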
Suppose that when you are determining the total number of outcomes, you can identify k distinct cases with the property that
every possible outcome lies in exactly one of the cases. If for each i between 1 and k there are $n_i$ possible outcomes in the
$i$th case, then the total number of possible outcomes will be $\sum_{i=1}^{k} n_i$ (that is, the sum as i goes from 1 to k of the $n_i$).
There is one other important way to use the sum rule. This application is a bit more subtle. Suppose you know the total number of
outcomes, and you want to know the number of outcomes that don’t include a particular event. The sum rule tells us that the total
number of outcomes is comprised of the outcomes that do include that event, together with the ones that don’t. So if it’s easy to
figure out how many outcomes include the event that interests you, then you can subtract that from the total number of outcomes to
determine how many outcomes exclude that event. Here’s an example.
Example 2.2.3
There are 216 different possible outcomes from rolling a white die, a red die, and a yellow die. (You can work this out using
the product rule.) How many of these outcomes involve rolling a one on two or fewer of the dice?
Solution
Tackling this problem directly, you might be inclined to split it into three cases: outcomes that involve rolling no ones, those
that involve rolling exactly one one, and those that involve rolling exactly two ones. If you try this, the analysis will be long
and fairly involved, and will include both the product rule and the sum rule. If you are careful, you will be able to find the
correct answer this way.
We’ll use a different approach, by first counting the outcomes that we don’t want: those that involve getting a one on all three
of the dice. There is only one way for this to happen: all three of the dice have to roll ones! So the number of outcomes that
involve rolling ones on two or fewer of the dice will be 216 − 1 = 215.
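To see that the direct count and the subtraction approach agree, we can list all of the rolls. The following Python sketch is only a check; representing each outcome as a (white, red, yellow) triple is an assumption made for illustration.

```python
from itertools import product

# Every outcome is a (white, red, yellow) triple of faces.
outcomes = list(product(range(1, 7), repeat=3))
total = len(outcomes)  # 6 * 6 * 6 by the product rule

# Direct count: outcomes with a one on two or fewer of the dice.
direct = sum(1 for roll in outcomes if roll.count(1) <= 2)

# Complement count: total minus the outcomes with ones on all three dice.
three_ones = sum(1 for roll in outcomes if roll.count(1) == 3)
by_complement = total - three_ones

print(total, direct, by_complement)  # 216 215 215
```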
Exercise 2.2.1
This page titled 2.2: The Sum Rule is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
2.2.2 https://math.libretexts.org/@go/page/60099
2.3: Putting Them Together
When we combine the product rule and the sum rule, we can explore more challenging questions.
Example 2.3.1
Grace is staying at a bed-and-breakfast. In the evening, she is offered a choice of menu items for breakfast in bed, to be
delivered the next morning. There are three kinds of items: main dishes, side dishes, and beverages. She is allowed to choose
up to one of each, but some of them come with optional extras. From the menu below, how many different breakfasts could she
order?
Solution
We see that the number of choices Grace has available depends partly on whether or not she orders an item or items that
include optional extras. We will therefore divide our consideration into four cases:
1. Grace does not order any pancakes, waffles, or toast.
2. Grace orders pancakes or waffles, but does not order toast.
3. Grace does not order pancakes or waffles, but does order toast.
4. Grace orders toast, and also orders either pancakes or waffles.
In the first case, Grace has three possible choices for her main dish (oatmeal, omelette, or nothing). For each of these, she has
two choices for her side dish (fruit cup, or nothing). For each of these, she has four choices for her beverage (coffee, tea,
orange juice, or nothing). Using the product rule, we conclude that Grace could order 3 ⋅ 2 ⋅ 4 = 24 different breakfasts that do
not include pancakes, waffles, or toast.
In the second case, Grace has two possible choices for her main dish (pancakes, or waffles). For each of these, she has two
choices for her side dish (fruit cup, or nothing). For each of these, she has four choices for her beverage. In addition, for each
of her choices of pancakes or waffles, she can choose to have maple syrup, or not (two choices). Using the product rule, we
conclude that Grace could order 2 ⋅ 2 ⋅ 4 ⋅ 2 = 32 different breakfasts that include pancakes or waffles, but not toast.
In the third case, Grace has three possible choices for her main dish (oatmeal, omelette, or nothing). For each of these, she has
only one possible side dish (toast), but she has four choices for what to put on her toast (marmalade, lemon curd, blackberry
jam, or nothing). For each of these choices, she has four choices of beverage. Using the product rule, we conclude that Grace
could order 3 ⋅ 4 ⋅ 4 = 48 different breakfasts that include toast, but do not include pancakes or waffles.
In the final case, Grace has two possible choices for her main dish (pancakes, or waffles). She has two choices for what to put
on her main dish (maple syrup, or only butter). She is having toast, but has four choices for what to put on her toast. Finally,
she again has four choices of beverage. Using the product rule, we conclude that Grace could order 2 ⋅ 2 ⋅ 4 ⋅ 4 = 64 different
breakfasts that include toast as well as either pancakes or waffles.
Using the sum rule, we see that the total number of different breakfasts Grace could order is 24 + 32 + 48 + 64 = 168.
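The four case counts can be checked by listing every possible order. In the Python sketch below, the menu is an assumption inferred from the solution (the dishes and extras named there), `None` stands for ordering nothing or taking no extra, and the helper name `case` is made up for illustration.

```python
# Menu inferred from the solution text (an assumption).
# None means "order nothing" or "no extra".
mains = {
    None: [None],
    "oatmeal": [None],
    "omelette": [None],
    "pancakes": [None, "maple syrup"],
    "waffles": [None, "maple syrup"],
}
sides = {
    None: [None],
    "fruit cup": [None],
    "toast": [None, "marmalade", "lemon curd", "blackberry jam"],
}
beverages = [None, "coffee", "tea", "orange juice"]

breakfasts = [
    (main, main_extra, side, side_extra, drink)
    for main, main_extras in mains.items()
    for main_extra in main_extras
    for side, side_extras in sides.items()
    for side_extra in side_extras
    for drink in beverages
]

def case(order):
    """Return which of the four cases (1 to 4) an order falls into."""
    main, _, side, _, _ = order
    has_pancakes_or_waffles = main in ("pancakes", "waffles")
    has_toast = side == "toast"
    return 1 + has_pancakes_or_waffles + 2 * has_toast

counts = [sum(1 for b in breakfasts if case(b) == c) for c in (1, 2, 3, 4)]
print(counts, sum(counts))  # [24, 32, 48, 64] 168
```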
Example 2.3.2
The types of license plates in Alberta that are available to individuals (not corporations or farms) for their cars or motorcycles,
fall into one of the following categories:
vanity plates
regular car plates
veteran plates
motorcycle plates.
None of these license plates use the letters I or O.
Regular car plates have one of two formats: three letters followed by three digits; or three letters followed by four digits (in the
latter case, none of the letters A, E, I, O, U, or Y is used).
Veteran plates begin with the letter V, followed by two other letters and two digits. Motorcycle plates have two letters followed
by three digits.
Setting aside vanity plates and ignoring the fact that some three-letter words are avoided, how many license plates are available
to individuals in Alberta for their cars or motorcycles?
Solution
To answer this question, there is a natural division into four cases: regular car plates with three digits; regular car plates with
four digits; veteran plates; and motorcycle plates.
For a regular car plate with three digits, there are 24 choices for the first letter, followed by 24 choices for the second letter, and
24 choices for the third letter. There are 10 choices for the first digit, 10 choices for the second digit, and 10 choices for the
third digit. Using the product rule, the total number of license plates in this category is $24^3 \cdot 10^3$ = 13,824,000.
For a regular car plate with four digits, there are 20 choices for the first letter, followed by 20 choices for the second letter, and
20 choices for the third letter. There are 10 choices for the first digit, 10 choices for the second digit, 10 choices for the third
digit, and 10 choices for the fourth digit. Using the product rule, the total number of license plates in this category is
$20^3 \cdot 10^4$ = 80,000,000.
For a veteran plate, there are 24 choices for the first letter, followed by 24 choices for the second letter. There are 10 choices
for the first digit, and 10 choices for the second digit. Using the product rule, the total number of license plates in this category
is $24^2 \cdot 10^2$ = 57,600.
Finally, for a motorcycle plate, there are 24 choices for the first letter, followed by 24 choices for the second letter. There are
10 choices for the first digit, 10 choices for the second digit, and 10 choices for the third digit. Using the product rule, the total
number of license plates in this category is $24^2 \cdot 10^3$ = 576,000.
Using the sum rule, we see that the total number of license plates is
13,824,000 + 80,000,000 + 57,600 + 576,000 = 94,457,600.
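The arithmetic in this example is easy to get wrong by hand, so here is a short Python sketch of the same calculation. The constant and dictionary names are illustrative assumptions; the letter counts follow the rules stated in the example.

```python
# Letter counts follow the example: I and O are never used (26 - 2 = 24);
# four-digit regular plates also avoid A, E, U and Y (24 - 4 = 20).
LETTERS = 24
LETTERS_FOUR_DIGIT = 20
DIGITS = 10

# Product rule within each case.
plates = {
    "regular, three digits": LETTERS**3 * DIGITS**3,
    "regular, four digits": LETTERS_FOUR_DIGIT**3 * DIGITS**4,
    "veteran": LETTERS**2 * DIGITS**2,   # the leading V is fixed
    "motorcycle": LETTERS**2 * DIGITS**3,
}

for kind, count in plates.items():
    print(f"{kind}: {count:,}")

# Sum rule over the four cases.
print(f"total: {sum(plates.values()):,}")
```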
It doesn’t always happen that the sum rule is applied first to break the problem down into cases, followed by the product rule within
each case. In some problems, these might occur in the other order. Sometimes there may seem to be one “obvious” way to look at
the problem, but often there is more than one equally effective analysis, and different analyses might begin with different rules.
In Example 2.3.1, we could have begun by noticing that no matter what else she may choose, Grace has four possible options for
her beverage. Thus, the total number of possible breakfast orders will be four times the number of possible orders of main and side
(with optional extras). Then we could have proceeded to analyse the number of possible choices for her main dish and her side dish
(together with the extras). Breaking down the choices for her main and side dishes into the same cases as before, we could see that
there are 3 ⋅ 2 = 6 choices in the first case; 2 ⋅ 2 ⋅ 2 = 8 choices in the second case; 3 ⋅ 4 = 12 choices in the third case; and
2 ⋅ 2 ⋅ 4 = 16 choices in the fourth case. Thus she has a total of 6 + 8 + 12 + 16 = 42 choices for her main and side dishes. The
product rule now tells us that she has 4 ⋅ 42 = 168 possible orders for her breakfast.
Let’s run through one more (simpler) example of using both the sum and product rules, and work out the answer in two different
ways.
Example 2.3.3
Kathy plans to buy her Dad a shirt for his birthday. The store she goes to has three different colours of short-sleeved shirts, and
six different colours of long-sleeved shirts. They will gift-wrap in her choice of two wrapping papers. Assuming that she wants
the shirt gift-wrapped, how many different options does she have for her gift?
Solution
Let’s start by applying the product rule first. There are two aspects that she can vary: the shirt, and the wrapping. She has two
choices for the wrapping, so her total number of options will be twice the number of shirt choices that she has. For the shirt, we
break her choices down into two cases: if she opts for a short-sleeved shirt then she has three choices (of colour), while if she
opts for a long-sleeved shirt then she has six choices. In total she has 3 + 6 = 9 choices for the shirt. Using the product rule,
we see that she has 2 ⋅ 9 = 18 options for her gift.
Alternatively, we could apply the sum rule first. We will consider the two cases: that she buys a short-sleeved shirt; or a long-
sleeved shirt. If she buys a short-sleeved shirt, then she has three options for the shirt, and for each of these she has two options
for the wrapping, making (by the product rule) 3 ⋅ 2 = 6 options of short-sleeved shirts. If she buys a long-sleeved shirt, then
she has six options for the shirt, and for each of these she has two options for the wrapping, making (by the product rule)
6 ⋅ 2 = 12 options of long-sleeved shirts. Using the sum rule, we see that she has 6 + 12 = 18 options for her gift.
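The two analyses agree because multiplication distributes over addition. A minimal Python sketch of both orders of applying the rules follows; the variable names are illustrative assumptions.

```python
short_sleeved = 3   # colours of short-sleeved shirts
long_sleeved = 6    # colours of long-sleeved shirts
wrappings = 2       # wrapping papers

# Product rule first: (wrapping choices) times (total shirt choices).
product_first = wrappings * (short_sleeved + long_sleeved)

# Sum rule first: wrapped short-sleeved options plus wrapped long-sleeved ones.
sum_first = short_sleeved * wrappings + long_sleeved * wrappings

print(product_first, sum_first)  # 18 18
```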
Exercise 2.3.1
Exercise 2.3.2
1. There are 8 buses a day from Toronto to Ottawa, 20 from Ottawa to Montreal, and 9 buses directly from Toronto to
Montreal. Assuming that you do not have to complete the trip in one day (so the departure and arrival times of the buses are
not an issue), how many different schedules could you use in travelling by bus from Toronto to Montreal?
2. How many 7-bit ternary strings (that is, strings whose only entries are 0, 1, or 2) begin with either 1 or 01?
This page titled 2.3: Putting Them Together is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
2.3.3 https://math.libretexts.org/@go/page/60100
2.4: Summing Up
Very likely you’ve used the sum rule or the product rule when counting simple things, without even stopping to think about what
you were doing. The reason we’re going through each of them very slowly and carefully, is because when we start looking at more
complicated problems, our uses of the sum and product rules will become more subtle. If we don’t have a very clear understanding
in very simple situations of what we are doing and why, we’ll be completely lost when we get to more difficult examples.
It’s dangerous to try to come up with a simple guideline for when to use the product rule and when to use the sum rule, because
such a guideline will often go wrong in complicated situations. Nonetheless, a good question to ask yourself when you are trying to
decide which rule to use is, “Would I describe this with the word ‘and,’ or the word ‘or’?” The word “and” is generally used in
situations where it’s appropriate to use the product rule, while “or” tends to go along with the sum rule.
Let’s see how this applies to each of the examples we’ve looked at in this chapter.
In Example 2.1.1, you needed to choose the size and the variety for your coffee. In Example 2.1.2, Kyle wanted to choose a
doughnut and coffee. In Example 2.1.3, Chlöe needed to determine the size and the colour and the image and the slogan for each t-
shirt. In Example 2.1.4, we wanted to know the sex of Peter and Mary’s third and fourth children. So in each of these examples, we
used the product rule.
In Example 2.2.1, you needed to choose a bagel or a doughnut. In Example 2.2.2, Mary and Peter could have zero or one or two or
three children. So in each of these examples, we used the sum rule.
You definitely have to be careful in applying this guideline, as problems can be phrased in a misleading way. We could have said
that in Example 2.2.1, we want to know how many different kinds of doughnuts and of bagels there are, altogether. The important
point is that you aren’t choosing both of these things, though; you are choosing just one thing, and it will be either a doughnut, or a
bagel.
In Example 2.3.1, Grace was choosing a main dish and a side dish and a beverage, so we used the product rule to put these aspects
together. Whether or not she had extra options available for her main dish depended on whether she chose pancakes or waffles or
oatmeal or omelette or nothing, so the sum rule applied here. (Note that we didn’t actually consider each of these four things
separately, since they naturally fell into two categories. However, we would have reached the same answer if we had considered
each of them separately.) Similarly, whether or not she had extra options available for her side dish depended on whether she chose
toast or not, so again the sum rule applied.
In Example 2.3.2, the plates can be regular (in either of two ways) or veteran or motorcycle plates, so the sum rule was used. In
each of these categories, we had to consider the options for the first character and the second character (and so on), so the product
rule applied.
Finally, in Example 2.3.3, the shirt Kathy chooses can be short-sleeved or long-sleeved, so the sum rule applies to that distinction.
Since she wants to choose a shirt and gift wrap, the product rule applies to that combination.
Exercise 2.4.1
For each of the following problems, do you need to use the sum rule, the product rule, or both?
1. Count all of the numbers that have exactly two digits, and the numbers that have exactly four digits.
2. How many possible outcomes are there from rolling a red die and a yellow die?
3. How many possible outcomes are there from rolling three dice, if you only count the outcomes that involve at most one of
the dice coming up as a one?
This page titled 2.4: Summing Up is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
2.4.1 https://math.libretexts.org/@go/page/60101
2.5: Summary
Product rule.
Sum rule.
Combining the product and sum rules.
This page titled 2.5: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
2.5.1 https://math.libretexts.org/@go/page/60102
CHAPTER OVERVIEW
This page titled 3: Permutations, Combinations, and the Binomial Theorem is shared under a CC BY-NC-SA license and was authored, remixed,
and/or curated by Joy Morris.
3.1: Permutations
We begin by looking at permutations, because these are a straightforward application of the product rule. The word “permutation”
means a rearrangement, and this is exactly what a permutation is: an ordering of a number of distinct items in a line. Sometimes
even though we have a large number of distinct items, we want to single out a smaller number and arrange those into a line; this is
also a sort of permutation.
Definition: Permutation
A permutation of n distinct objects is an arrangement of those objects into an ordered line. If 1 ≤ r ≤ n (and r is a natural
number) then an r-permutation of n objects is an arrangement of r of the n objects into an ordered line.
So a permutation involves choosing items from a finite population in which every item is uniquely identified, and keeping track of
the order in which the items were chosen.
Since we are studying enumeration, it shouldn’t surprise you that what we’ll be asking in this situation is how many permutations
there are, in a variety of circumstances. Let’s begin with an example in which we’ll calculate the number of 3-permutations of ten
objects (or in this case, people).
Example 3.1.1
Ten athletes are competing for Olympic medals in women’s speed skating (1000 metres). In how many ways might the medals
end up being awarded?
Solution
There are three medals: gold, silver, and bronze, so this question amounts to finding the number of 3-permutations of the ten
athletes (the first person in the 3-permutation is the one who gets the gold medal, the second gets the silver, and the third gets
the bronze).
To solve this question, we’ll apply the product rule, where the aspects that can vary are the winners of the gold, silver, and
bronze medals. We begin by considering how many different athletes might get the gold medal. The answer is that any of the
ten athletes might get that medal. No matter which of the athletes gets the gold medal, once that is decided we move our
consideration to the silver medal. Since one of the athletes has already been awarded the gold medal, only nine of them remain
in contention for the silver medal, so for any choice of athlete who wins gold, the number of choices for who gets the silver
medal is nine. Finally, with the gold and silver medalists out of contention for the bronze, there remain eight choices for who
might win that medal. Thus, the total number of ways in which the medals might be awarded is 10 ⋅ 9 ⋅ 8 = 720.
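We can confirm this count by generating every possible podium. The sketch below uses Python's standard library (`itertools.permutations` and `math.perm`); the athlete labels are hypothetical placeholders.

```python
from itertools import permutations
from math import perm

# Hypothetical labels for the ten athletes.
athletes = [f"athlete {i}" for i in range(1, 11)]

# Each 3-permutation is an ordered (gold, silver, bronze) triple.
podiums = list(permutations(athletes, 3))

print(len(podiums))  # 720
print(10 * 9 * 8)    # 720, by the product rule
print(perm(10, 3))   # 720, using the standard library
```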
We can use the same reasoning to determine a general formula for the number of r-permutations of n objects:
Theorem 3.1.1
The number of r-permutations of n objects is $n(n-1) \cdots (n-r+1)$.
Proof
There are n ways in which the first object can be chosen (any of the n objects). For each of these possible choices, there
remain n − 1 objects to choose for the second object, etc.
Note
We use n! to denote the number of permutations of n objects, so
$n! = n(n-1) \cdots 1$. (3.1.1)
By convention, we define 0! = 1.
Definition: Factorial
We read n! as “n factorial,” so n factorial is $n(n-1) \cdots 1$. Thus, the number of r-permutations of n objects can be re-written
as $\frac{n!}{(n-r)!}$. When n = r this gives $\frac{n!}{0!} = n!$, making sense of our definition that 0! = 1.
Example 3.1.2
There are 36 people at a workshop. They are seated at six round tables of six people each for lunch. The Morris family (of
three) has asked to be seated together (side-by-side). How many different seating arrangements are possible at the Morris
family’s table?
Solution
First, there are 3! = 6 ways of arranging the order in which the three members of the Morris family sit at the table. Since the
tables are round, it doesn’t matter which specific seats they take, only the order in which they sit matters. Once the Morris
family is seated, the three remaining chairs are uniquely determined by their positions relative to the Morris family (one to
their right, one to their left, and one across from them). There are 33 other people at the workshop; we need to choose three of
these people and place them in order into the three vacant chairs. There are $\frac{33!}{(33-3)!} = \frac{33!}{30!}$ ways of doing this. In total,
there are $6 \cdot \frac{33!}{30!}$ = 196,416 different seating arrangements possible at the Morris family’s table.
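The final number is quick to verify. This is a minimal Python sketch of the calculation, with variable names chosen for illustration; `math.perm(33, 3)` computes the number of 3-permutations of 33 objects.

```python
from math import factorial, perm

family_orders = factorial(3)   # orders in which the three Morrises can sit
others = perm(33, 3)           # ordered choice of 3 of the other 33 people

# perm(33, 3) is the same quantity as 33!/30!.
assert others == factorial(33) // factorial(30)

print(family_orders * others)  # 196416
```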
Adjusting the details of the preceding example can require quite different thought processes to find the answer.
Example 3.1.3
At the same workshop, there are three round dinner tables, seating twelve people each. The Morris family members (Joy, Dave,
and Harmony) still want to sit at the same table, but they have decided to spread out (so no two of them should be side-by-side)
to meet more people. How many different seating arrangements are possible at the Morris family’s table now?
Solution
Let’s begin by arbitrarily placing Joy somewhere at the table, and seating everyone else relative to her. This effectively
distinguishes the other eleven seats. Next, we’ll consider the nine people who aren’t in Joy’s family, and place them (standing)
in an order clockwise around the table from her. There are $\frac{33!}{(33-9)!}$ ways to do this. Before we actually assign seats to these
nine people, we decide where to slot in Dave and Harmony amongst them.
(In the above diagram, the digits 1 through 9 represent the nine other people who are sitting at the Morris family’s table, and
the J represents Joy’s position.) Dave can sit between any pair of non-Morrises who are standing beside each other; that is, in
any of the spots marked by small black dots in the diagram above. Thus, there are eight possible choices for where Dave will
sit. Now Harmony can go into any of the remaining seven spots marked by black dots. Once Dave and Harmony are in place,
everyone shifts to even out the circle (so the remaining black dots disappear), and takes their seats in the order determined.
We have shown that there are $\frac{33!}{24!} \cdot 8 \cdot 7$ possible seating arrangements at the Morris table. That’s a really big number, and it’s
quite acceptable to leave it in this format. However, in case you find another way to work out the problem and want to check
your answer, the total number is 783,732,837,888,000.
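One way to check the factor of 8 ⋅ 7 is to count the seat choices for Dave and Harmony directly. The Python sketch below assumes a seat numbering (Joy in seat 0, the others numbered clockwise) and a helper `adjacent`, both introduced only for illustration. It counts seat pairs rather than gaps between standing people, which is a different route to the same number.

```python
from itertools import permutations
from math import perm

SEATS = 12  # Joy sits in seat 0; the other seats are numbered clockwise

def adjacent(a, b):
    """True if seats a and b are side by side at the round table."""
    return (a - b) % SEATS in (1, SEATS - 1)

# Ordered (Dave, Harmony) seat pairs with no two Morrises side by side.
family_placements = [
    (dave, harmony)
    for dave, harmony in permutations(range(1, SEATS), 2)
    if not adjacent(dave, 0)
    and not adjacent(harmony, 0)
    and not adjacent(dave, harmony)
]
print(len(family_placements))  # 56, matching 8 * 7

# The nine remaining seats are filled, in order, from the other 33 people.
total = len(family_placements) * perm(33, 9)
print(f"{total:,}")  # 783,732,837,888,000
```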
Exercise 3.1.1
Use what you have learned about permutations to work out the following problems. The sum and/or product rule may also be
required.
1. Six people, all of whom can play both bass and guitar, are auditioning for a band. There are two spots available: lead guitar,
and bass player. In how many ways can the band be completed?
2. Your friend Garth tries out for a play. After the auditions, he texts you that he got one of the parts he wanted, and that
(including him) nine people tried out for the five roles. You know that there were two parts that interested him. In how
many ways might the cast be completed (who gets which role matters)?
3. You are creating an 8-character password. You are allowed to use any of the 26 lowercase characters, and you must use
exactly one digit (from 0 through 9) somewhere in the password. You are not allowed to use any character more than once.
How many different passwords can you create?
4. How many 3-letter “words” (strings of characters, they don’t actually have to be words) can you form from the letters of
the word STRONG? How many of those words contain an S? (You may not use a letter more than once.)
5. How many permutations of {0, 1, 2, 3, 4, 5, 6} have no adjacent even digits? For example, a permutation like 5034216 is
not allowed because 4 and 2 are adjacent.
This page titled 3.1: Permutations is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
3.1.3 https://math.libretexts.org/@go/page/60200
3.2: Combinations
Sometimes the order in which individuals are chosen doesn’t matter; all that matters is whether or not they were chosen. An
example of this is choosing a set of problems for an exam. Although the order in which the questions are arranged may make the
exam more or less intimidating, what really matters is which questions are on the exam, and which are not. Another example would
be choosing shirts to pack for a trip (assuming all of your shirts are distinguishable from each other). We call a choice like this a
“combination,” to indicate that it is the collection of things chosen that matters, and not the order.
Definition: r-Combination
Let n be a positive natural number, and 0 ≤ r ≤ n . Assume that we have n distinct objects. An r-combination of the n objects
is a subset consisting of r of the objects.
So a combination involves choosing items from a finite population in which every item is uniquely identified, but the order in
which the choices are made is unimportant.
Again, you should not be surprised to learn (since we are studying enumeration) that what we’ll be asking is how many
combinations there are, in a variety of circumstances. One significant difference from permutations is that it’s not interesting to ask
how many n-combinations there are of n objects; there is only one, as we must choose all of the objects.
Let’s begin with an example in which we’ll calculate the number of 3-combinations of ten objects (or in this case, people).
Example 3.2.1
Of the ten athletes competing for Olympic medals in women’s speed skating (1000 metres), three are to be chosen to form a
committee to review the rules for future competitions. How many different committees could be formed?
Solution
We determined in Example 3.1.1 that there are $\frac{10!}{7!}$ ways in which the medals can be assigned. One easy way to choose the
committee would be to make it consist of the three medal-winners. However, notice that if (for example) Wong wins gold,
Sajna wins silver, and Andersen wins bronze, we will end up with the same committee as if Sajna wins gold, Andersen wins
silver, and Wong wins bronze. In fact, what we’ve learned about permutations tells us that there are 3! different medal
outcomes that would each result in the committee being formed of Wong, Sajna, and Andersen.
In fact, there’s nothing special about Wong, Sajna, and Andersen – for any choice of three people to be on the committee, there
are 3! = 6 ways in which those individuals could have been awarded the medals. Therefore, when we counted the number of
ways in which the medals could be assigned, we counted each possible 3-member committee exactly 3! = 6 times. So the
number of different committees is $\frac{10!}{7!\,3!} = 10 \cdot 9 \cdot \frac{8}{6} = 120$.
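The claim that every committee is counted exactly 3! times can be tested by grouping the medal outcomes. The Python sketch below is a check under stated assumptions: the athletes are given hypothetical labels 0 to 9, and each committee is stored as a `frozenset` of the three medalists.

```python
from itertools import combinations, permutations
from math import comb, factorial

athletes = range(10)  # hypothetical labels 0..9 for the ten athletes

# Group the ordered medal outcomes by the set of people involved.
outcomes_per_committee = {}
for podium in permutations(athletes, 3):
    committee = frozenset(podium)
    outcomes_per_committee[committee] = outcomes_per_committee.get(committee, 0) + 1

# Every committee arises from exactly 3! = 6 medal outcomes.
assert set(outcomes_per_committee.values()) == {factorial(3)}

print(len(outcomes_per_committee))           # 120
print(len(list(combinations(athletes, 3))))  # 120
print(comb(10, 3))                           # 120
```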
We can use the same reasoning to determine a general formula for the number of r-combinations of n objects:
Theorem 3.2.1
The number of r-combinations of n objects is $\frac{n!}{r!(n-r)!}$.
Proof
By Theorem 3.1.1, there are $\frac{n!}{(n-r)!}$ r-permutations of n objects. Suppose that we knew there are k unordered r-subsets
of n objects (i.e. r-combinations). For each of these k unordered subsets, there are r! ways in which we could order the
elements. This tells us that $k \cdot r! = \frac{n!}{(n-r)!}$. Rearranging the equation, we obtain $k = \frac{n!}{r!(n-r)!}$.
It will also prove extremely useful to have a short form for the number of r-combinations of n objects.
Note
We use $\binom{n}{r}$ to denote the number of r-combinations of n objects, so
$\binom{n}{r} = \frac{n!}{r!(n-r)!}$ (3.2.2)
Definition
We read $\binom{n}{r}$ as “n choose r,” so n choose r is $\frac{n!}{r!(n-r)!}$. When r = n, this gives $\binom{n}{n} = \frac{n!}{n!\,0!} = 1$,
coinciding with our earlier observation that there is only one way in which all of the n objects can be chosen. Similarly,
$\binom{n}{0} = \frac{n!}{0!(n-0)!} = 1$ (3.2.4)
Example 3.2.2
Jasmine is holding three cards from a regular deck of playing cards. She tells you that they are all hearts, and that she is
holding at least one of the two highest cards in the suit (Ace and King). If you wanted to list all of the possible sets of cards she
might be holding, how long would your list be?
Solution
We’ll consider three cases: that Jasmine is holding the Ace (but not the King); that she is holding the King (but not the Ace), or
that she is holding both the Ace and the King.
If Jasmine is holding the Ace but not the King, of the eleven other cards in the suit of hearts she must be holding two. There are
$\binom{11}{2}$ possible choices for the cards she is holding in this case.
Similarly, if Jasmine is holding the King but not the Ace, of the eleven other cards in the suit of hearts she must be holding
two. Again, there are $\binom{11}{2}$ possible choices for the cards she is holding in this case.
Finally, if Jasmine is holding the Ace and the King, then she is holding one of the other eleven cards in the suit of hearts. There
are $\binom{11}{1}$ possible choices for the cards she is holding in this case.
In total, there are
$$\binom{11}{2} + \binom{11}{2} + \binom{11}{1} = \frac{11!}{2!\,9!} + \frac{11!}{2!\,9!} + \frac{11!}{1!\,10!} = \frac{11 \cdot 10}{2} + \frac{11 \cdot 10}{2} + 11 = 55 + 55 + 11 = 121$$
possible sets of cards.
Here is another analysis that also works: Jasmine has at least one of the Ace and the King, so let’s divide the problem into two
cases: she might be holding the Ace, or she might be holding the King but not the Ace. If she is holding the Ace, then of the
twelve other hearts, she is holding two; these can be chosen in $\binom{12}{2} = 66$ ways. If she is holding the King but not the Ace, then
as before, her other two cards can be chosen in $\binom{11}{2} = 55$ ways, for a total (again) of 121.
A common mistake in an example like this is to divide the problem into the cases that Jasmine is holding the Ace, or that she is
holding the King, and to determine that each of these cases includes $\binom{12}{2} = 66$ possible combinations of cards, for a total of
132. The problem with this analysis is that we’ve counted the combinations that include both the Ace and the King twice: once
as a combination that includes the Ace, and once as a combination that includes the King. If you do this, you need to
compensate by subtracting at the end the number of combinations that have been counted twice: that is, those that include the
Ace and the King. As we worked out in the example, there are $\binom{11}{1} = 11$ of these, making a total of 132 − 11 = 121
combinations.
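The case analysis in this example can be confirmed by listing every possible hand. The sketch below is a brute-force check in Python; the rank labels and the helper name `hands_with_ace_or_king` are illustrative choices, not part of the text.

```python
from itertools import combinations

# The thirteen hearts, identified by rank only.
RANKS = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]

def hands_with_ace_or_king():
    """All 3-card sets of hearts containing the Ace, the King, or both."""
    return [h for h in combinations(RANKS, 3) if "A" in h or "K" in h]

hands = hands_with_ace_or_king()
ace_only = sum(1 for h in hands if "A" in h and "K" not in h)
king_only = sum(1 for h in hands if "K" in h and "A" not in h)
both = sum(1 for h in hands if "A" in h and "K" in h)
print(ace_only, king_only, both, len(hands))  # 55 55 11 121
```

The three disjoint cases add up to the full list, whereas counting "hands with the Ace" plus "hands with the King" would count the 11 hands in `both` twice.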
Exercise 3.2.1
Use what you have learned about combinations to work out the following problems. Permutations and other counting rules
we’ve covered may also be required.
1. For a magic trick, you ask a friend to draw three cards from a standard deck of 52 cards. How many possible sets of cards
might she have chosen?
2. For the same trick, you insist that your friend keep replacing her first draw until she draws a card that isn’t a spade. She can
choose any cards for her other two cards. How many possible sets of cards might she end up with? (Caution: choosing 5♣,
6 ♦, 3 ♠ in that order, is not different from choosing 6 ♦, 5 ♣, 3 ♠ in that order. You do not need to take into account that some
This page titled 3.2: Combinations is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
3.3: The Binomial Theorem
Here is an algebraic example in which “n choose r” arises naturally.
Example 3.3.1
Consider:
$$(a + b)^4 = (a + b)(a + b)(a + b)(a + b)$$
If you try to multiply this out, you must systematically choose the a or the b from each of the four factors, and make sure that
you make every possible combination of choices sooner or later.
One way of breaking this task down into smaller pieces is to separate it into five parts, depending on how many of the factors
you choose $a$s from (4, 3, 2, 1, or 0). Each time you choose 4 of the $a$s, you will obtain a single contribution to the coefficient
of the term $a^4$; each time you choose 3 of the $a$s, you will obtain a single contribution to the term $a^3 b$; each time you choose 2
of the $a$s, you will obtain a single contribution to the term $a^2 b^2$; each time you choose 1 of the $a$s, you will obtain a single
contribution to the term $a b^3$; and each time you choose 0 of the $a$s, you will obtain a single contribution to the term $b^4$. In
other words, the coefficient of a particular term $a^i b^{4-i}$ will be the number of ways in which you can choose i of the factors
from which to take $a$s.
If you want to take $a$s from all four of the factors, there is only $\binom{4}{4} = 1$ way in which to choose the factors from which you
take the $a$s. (Clearly, you must choose an $a$ from every one of the four factors.) Thus, the coefficient of $a^4$ will be 1.
If you want to take $a$s from three of the four factors, Theorem 3.2.1 tells us that there are $\binom{4}{3} = 4$ ways in which to choose the
factors from which you take the $a$s. (Specifically, these four ways consist of taking the $b$ from any one of the four factors, and
the $a$s from the other three factors). Thus, the coefficient of $a^3 b$ will be 4.
If you want to take $a$s from two of the four factors, and $b$s from the other two, Theorem 3.2.1 tells us that there are $\binom{4}{2} = 6$
ways in which to choose the factors from which you take the $a$s (then take $b$s from the other two factors). This is a small
enough example that you could easily work out all six ways by hand if you wish. Thus, the coefficient of $a^2 b^2$ will be 6.
If you want to take $a$s from one of the four factors, Theorem 3.2.1 tells us that there are $\binom{4}{1} = 4$ ways in which to choose the
factors from which you take the $a$s. (Specifically, these four ways consist of taking the $a$ from any one of the four factors, and
the $b$s from the other three factors). Thus, the coefficient of $a b^3$ will be 4.
Finally, by Theorem 3.2.1, there is $\binom{4}{0} = 1$ way to choose zero factors from which to take $a$s. (Clearly, you must choose a $b$
from every one of the four factors.) Thus, the coefficient of $b^4$ will be 1.
In fact, if we leave the coefficients in the original form in which we worked them out, we see that
$$(a + b)^4 = \binom{4}{4} a^4 + \binom{4}{3} a^3 b + \binom{4}{2} a^2 b^2 + \binom{4}{1} a b^3 + \binom{4}{0} b^4$$
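The "choose an a or a b from each factor" procedure in this example can be carried out mechanically. The Python sketch below (the function name `expand_binomial_power` is an illustrative choice) lists every way of picking one letter per factor and tallies the picks by how many a's they contain.

```python
from collections import Counter
from itertools import product
from math import comb

def expand_binomial_power(n):
    """Expand (a + b)^n by picking 'a' or 'b' from each of the n factors.

    Returns a Counter mapping i (the number of a's picked) to the number
    of ways of making that pick, i.e. the coefficient of a^i b^(n-i).
    """
    return Counter(choice.count("a") for choice in product("ab", repeat=n))

coeffs = expand_binomial_power(4)
for i in range(4, -1, -1):
    print(f"a^{i} b^{4 - i}: {coeffs[i]}")

# Each tally agrees with the corresponding "4 choose i".
assert all(coeffs[i] == comb(4, i) for i in range(5))
```

The sixteen picks split as 1, 4, 6, 4, 1, matching the coefficients worked out above.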
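Theorem (Binomial Theorem)
For any a and b, and any natural number n, we have
$$(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^r b^{n-r}.$$
One special case of this is that
$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r.$$
Proof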
3.3.1 https://math.libretexts.org/@go/page/60202
As in Example 3.3.1, we see that the coefficient of $a^r b^{n-r}$ in $(a + b)^n$ will be the number of ways of choosing r of the n
factors from which we’ll take the $a$ (taking the $b$ from the other n − r factors). By Theorem 3.2.1, there are $\binom{n}{r}$ ways of
making this choice. For the special case, begin by observing that $(1 + x)^n = (x + 1)^n$; then take a = x and b = 1 in the
general formula. Use the fact that $1^{n-r} = 1$ for any integers n and r.
The numbers $\binom{n}{r}$ are the coefficients of the terms in the Binomial Theorem.
Numbers of the form $\binom{n}{r}$ are referred to as binomial coefficients.
Corollary 3.3.1
For any natural number n, we have $\sum_{r=0}^{n} \binom{n}{r} = 2^n$.
Proof
This is an immediate consequence of substituting a = b = 1 into the Binomial Theorem.
Corollary 3.3.2
For any natural number n, we have $\sum_{r=0}^{n} r \binom{n}{r} = n\,2^{n-1}$.
Proof
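From the special case of the Binomial Theorem, we have
$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r \tag{3.3.5}$$
If we differentiate both sides with respect to x, we obtain
$$n(1 + x)^{n-1} = \sum_{r=0}^{n} r \binom{n}{r} x^{r-1} \tag{3.3.6}$$
Substituting x = 1 gives $n\,2^{n-1} = \sum_{r=0}^{n} r \binom{n}{r}$.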
Exercise 3.3.1
Use the Binomial Theorem to evaluate the following.
1) $\sum_{i=1}^{n} \binom{n}{i} 2^i$.
2) the coefficient of $a^2 b^3 c^2 d^4$ in $(a + b)^5 (c + d)^6$.
3) the coefficient of $a^2 b^6 c^3$ in $(a + b)^5 (b + c)^6$.
4) the coefficient of $a^3 b^2$ in $(a + b)^5 + (a + b)^2$.
This page titled 3.3: The Binomial Theorem is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
3.4: Summary
The number of r-permutations of n objects is $\frac{n!}{(n-r)!}$.
The number of r-combinations of n objects is $\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$.
r-combination
n choose r
binomial coefficients
Notation:
$n!$
$\binom{n}{r}$
This page titled 3.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
3.4.1 https://math.libretexts.org/@go/page/60203
CHAPTER OVERVIEW
with a problem Y that is already known to be in NP. Then the scientist uses some clever ideas to show that problem Y can be
related to problem X in such a way that if problem X could be solved in polynomial time, that solution would produce a solution
to problem Y, still in polynomial time. Thus the fact that Y is in NP forces X to be in NP also. The same ideas may sometimes
relate the number of solutions of problem X to the number of solutions of problem Y.
4.1: Counting via Bijections
4.2: Combinatorial Proofs
4.3: Summary
This page titled 4: Bijections and Combinatorial Proofs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
4.1: Counting via Bijections
It can be hard to figure out how to count the number of outcomes for a particular problem. Sometimes it will be possible to find a
different problem, and to prove that the two problems have the same number of outcomes (by finding a bijection between their
outcomes). If we can work out how to count the outcomes for the second problem, then we’ve also solved the first problem! This
may seem blatantly obvious intuitively, but this technique can provide simple solutions to problems that at first glance seem very
difficult.
This technique of counting a set (or the number of outcomes to some problem) indirectly, via a different set or problem, is the
bijective technique for counting. We begin with a classic example of this technique.
Example 4.1.1
How many subsets does a set of n elements have? As a small case, take the set {x, y, z}. The following table lists each of its subsets, with a 1 under an element if the subset contains it and a 0 otherwise.
Subset       x  y  z
∅            0  0  0
{x}          1  0  0
{y}          0  1  0
{z}          0  0  1
{x, y}       1  1  0
{x, z}       1  0  1
{y, z}       0  1  1
{x, y, z}    1  1  1
As you can see, the pattern of 1s and 0s is different in each row of the table, since the elements of each subset are different.
Furthermore, any pattern of 1s and 0s that has length 3, appears in some row of this table.
This is not a coincidence. In general, we can define a bijection between the binary strings of length n , and the subsets of a set
of n elements, as follows. We already know by the definition of cardinality, that there is a bijection between our set of n
elements, and the set {1, . . . , n}, so we’ll actually define a bijection between the subsets of {1, . . . , n} and the binary strings
of length n . Since the composition of two bijections is a bijection, this will indirectly define a bijection between our original
set, and the binary strings of length n .
Given a subset of {1, . . . , n}, the binary string that corresponds to this subset will be the binary string that has a 1 in the ith
position if and only if i is in the subset. This tells us how to determine the binary string from the subset. We can also reverse
(invert) the process. Given a binary string of length n , the corresponding subset of {1, . . . , n} will be the subset whose
elements are the positions of the 1s in the binary string.
Although we haven’t directly proven that this map from subsets to binary strings is both one-to-one and onto, an invertible
function must be a bijection, so the fact that we were able to find an inverse function does prove that this map is a bijection.
(You should check that you agree that the function we’ve claimed as an inverse really does invert the original function.)
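The map and its inverse described above are easy to write down explicitly. The following Python sketch uses illustrative function names (`subset_to_string`, `string_to_subset`, `all_subsets`) that are not from the text, and checks for a small n that the two maps undo each other and that every binary string is reached exactly once.

```python
from itertools import combinations

def subset_to_string(subset, n):
    """Binary string with a 1 in position i exactly when i is in the subset."""
    return "".join("1" if i in subset else "0" for i in range(1, n + 1))

def string_to_subset(bits):
    """Inverse map: the set of (1-based) positions that hold a 1."""
    return frozenset(i for i, bit in enumerate(bits, start=1) if bit == "1")

def all_subsets(n):
    """Every subset of {1, ..., n}, listed by size."""
    elements = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1) for c in combinations(elements, r)]

n = 3
subsets = all_subsets(n)
strings = [subset_to_string(s, n) for s in subsets]
# Invertibility: going to a string and back returns the original subset.
assert all(string_to_subset(subset_to_string(s, n)) == s for s in subsets)
# Every binary string of length n is reached exactly once.
assert len(set(strings)) == len(strings) == 2 ** n
print(subset_to_string({1, 3}, 3))  # 101
```

With the elements x, y, z numbered 1, 2, 3, the string `101` is the row of the table for the subset {x, z}.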
Now, our imaginary table wouldn’t be much use if we actually had to write it out. In order to write it out, we would need to
know all of the subsets of our set already; and if we knew them all, we could certainly count them! Fortunately, we do not need
to write it out. Instead, we use the bijection we have just defined. Rather than count the number of subsets of an n -set, we
count the number of binary strings of length n . We can do this using just the multiplication rule! In each position, there is
either a zero or a one, so there are 2 choices for each of the n positions. Hence, there are $2^n$ binary strings of length n.
In some ways, we’ve actually been using this idea pretty much every time we’ve come up with more than one way to solve a
problem. Implicitly, finding a different way of thinking of the problem is equivalent to finding a bijection between the solutions to
these different approaches.
Example 4.1.2
How many ways are there to choose ten people from a group of 30 men and 30 women, if the group must include at least one
woman?
Solution
Attacking this problem directly will get ugly. We would have to consider separately the cases of including one woman, two
women, etc., all the way up to ten women, in our group, and add all of the resulting terms together. Instead, we note that there
is an obvious bijection (the identity map) between groups that do include at least one woman, and groups that do not include
exactly zero women.
The latter is relatively easy to figure out: there are $\binom{60}{10}$ possible groups of ten people that could be chosen from the 60 people.
Of these, there are $\binom{30}{10}$ groups that include zero women (since the members of any such group must be chosen entirely from
the 30 men). Therefore, the number of groups that do not include exactly zero women, is $\binom{60}{10} - \binom{30}{10}$.
Thanks to our bijection, we conclude that the number of groups that can be chosen, that will include at least one woman, is also
$\binom{60}{10} - \binom{30}{10}$.
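To see that the shortcut really does agree with the "ugly" direct approach, both counts can be computed. This is a minimal Python sketch; the two function names are illustrative, and the direct version simply sums over the number of women in the group.

```python
from math import comb

def at_least_one_woman_direct(men, women, size):
    """Sum over the number k of women in the group (k = 1, ..., size)."""
    return sum(comb(women, k) * comb(men, size - k) for k in range(1, size + 1))

def at_least_one_woman_complement(men, women, size):
    """All groups, minus the groups made up entirely of men."""
    return comb(men + women, size) - comb(men, size)

direct = at_least_one_woman_direct(30, 30, 10)
complement = at_least_one_woman_complement(30, 30, 10)
print(direct == complement)  # True
```

The direct version needs ten products and a sum, while the complement version needs only two binomial coefficients.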
Exercise 4.1.1
The following problems should help you in working with the bijective technique for counting.
1. We define a structure that is like a subset, except that any element of the original set may appear 0, 1, or 2 times in the
structure. How many of these structures can we form from the set {1, . . . , n}?
2. Find a bijection between the coefficient of $x^r$ in $(1 + x)^n$, and the number of r-combinations of an n-set.
3. Find a bijection between the number of ways in which three different dolls can be put into ten numbered cribs, and the
number of ways in which ten Olympic contenders can win the medals in their event.
This page titled 4.1: Counting via Bijections is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
4.1.2 https://math.libretexts.org/@go/page/60205
4.2: Combinatorial Proofs
As we said in the previous section, thinking about a problem in two different ways implicitly creates a bijection, telling us that the
number of solutions we obtain will be the same either way. When we looked at bijections, we were using this idea to find an easier
way to count something that seemed difficult. But if we actually can find a (possibly messy) formula that counts the answer to our
problem correctly in some “difficult” way, and we can also find a different formula that counts the answer to the same problem
correctly by looking at it in a different way, then we know that the values of the two formulas must be equal, no matter how
different they may look.
This is the idea of a “combinatorial proof”.
Theorem 4.2.1
If f(n) and g(n) are functions that count the number of solutions to some problem involving n objects, then f(n) = g(n) for
every n.
In the statement of this theorem and definition, we’ve made f and g functions of a single variable, n , but the same ideas hold if f
and g are functions of more than one variable. Our first example demonstrates this.
Example 4.2.1
Prove that for every natural number n and every integer r between 0 and n , we have
$$\binom{n}{r} = \binom{n}{n-r}$$
Solution
By the definition of $\binom{n}{r}$, this is the number of ways of choosing r objects from a set of n distinct objects. Any time we choose
r of the objects, the other n − r objects are being left out of the set we are choosing. So equivalently, instead of choosing the r
objects to include in our set, we could choose the n − r objects to leave out of our set. By the definition of the binomial
coefficients, there are $\binom{n}{n-r}$ ways of making this choice.
Therefore, it must be the case that for every natural number n and every integer r between 0 and n, we have
$$\binom{n}{r} = \binom{n}{n-r}$$
as desired.
Of course, this particular identity is also quite easy to prove directly, using the formula for $\binom{n}{r}$, since
$$\binom{n}{n-r} = \frac{n!}{(n-r)!\,(n-(n-r))!} = \frac{n!}{(n-r)!\,r!} = \binom{n}{r}$$
Many identities that can be proven using a combinatorial proof can also be proven directly, or using a proof by induction. The
nice thing about a combinatorial proof is it usually gives us rather more insight into why the two formulas should be equal, than
we get from many other proof techniques.
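The insight behind this combinatorial proof is that "take the complement" pairs each r-subset with exactly one (n − r)-subset. The Python sketch below (with the illustrative helper name `complement_map`) builds that pairing for n = 6 and r = 2 and checks that it is one-to-one and onto.

```python
from itertools import combinations

def complement_map(n, r):
    """Send each r-subset of {1, ..., n} to the (n - r)-subset left out."""
    universe = frozenset(range(1, n + 1))
    return {frozenset(c): universe - frozenset(c)
            for c in combinations(sorted(universe), r)}

pairs = complement_map(6, 2)
images = set(pairs.values())
# Distinct r-subsets have distinct complements, each of size n - r ...
assert len(images) == len(pairs) and all(len(s) == 4 for s in images)
# ... and every (n - r)-subset occurs as a complement.
assert images == {frozenset(c) for c in combinations(range(1, 7), 4)}
print(len(pairs), len(images))  # 15 15
```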
In Example 4.1.1, we noted that one way to figure out the number of subsets of an n-element set would be to count the number of
subsets of each possible size, and add them all up. We then followed a bijective approach to prove that the answer is in fact $2^n$. If
we actually carry through on the first idea, this leads to another combinatorial identity (one that we already observed via the
Binomial Theorem):
4.2.1 https://math.libretexts.org/@go/page/60206
Example 4.2.2
Prove that for every natural number n, $\sum_{r=0}^{n} \binom{n}{r} = 2^n$.
Solution
We have seen in Example 4.1.1 that the number of subsets of a set of n elements is $2^n$. We will count the same problem in a
different way, to obtain the other side of the equality.
To determine the number of subsets of a set of n elements, we break the problem down into n + 1 cases, and use the sum rule.
The cases into which we will divide the problem are the different possible cardinalities for the subsets: everything from 0
through n. There are $\binom{n}{r}$ ways to choose a subset of r elements from the set of n elements, so the number of subsets that
contain r elements is $\binom{n}{r}$. Thus, the total number of subsets of our original set must be $\sum_{r=0}^{n} \binom{n}{r}$.
Since we have counted the same problem in two different ways and obtained different formulas, Theorem 4.2.1 tells us that the
two formulas must be equal; that is,
$$\sum_{r=0}^{n} \binom{n}{r} = 2^n$$
as desired.
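The two ways of counting used in this solution can be compared directly for small n. The following Python sketch (the helper name `subsets_by_size` is an illustrative choice) lists the subsets one cardinality at a time and checks the total against $2^n$.

```python
from itertools import combinations
from math import comb

def subsets_by_size(n):
    """Count the subsets of an n-element set, one cardinality at a time."""
    elements = range(n)
    return [sum(1 for _ in combinations(elements, r)) for r in range(n + 1)]

for n in range(6):
    counts = subsets_by_size(n)
    assert counts == [comb(n, r) for r in range(n + 1)]
    assert sum(counts) == 2 ** n

print(subsets_by_size(4), sum(subsets_by_size(4)))  # [1, 4, 6, 4, 1] 16
```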
We can also produce an interesting combinatorial identity from a generalisation of the problem studied in Example 4.1.2.
Example 4.2.3
Suppose we have a collection of n men and n women, and we want to choose r of them for a focus group, but we must include
at least one woman. In how many ways can this be done? Use two different methods to count the solutions, and deduce a
combinatorial identity.
Solution
Using the same reasoning that we applied in Example 4.1.2, we see that the number of ways of choosing a group that includes
at least one woman is the total number of ways of choosing a group of r people from these 2n people, less the number of ways
that include only men; that is: $\binom{2n}{r} - \binom{n}{r}$.
Alternatively, we can divide the problem up into r cases depending on how many women are to be included in the group (there
must be i women, for some 1 ≤ i ≤ r). There are $\binom{n}{i}$ ways to choose i women for the group, and for each of these, there are
$\binom{n}{r-i}$ ways to choose r − i men to complete the group. Thus, the total number of ways of choosing a group that includes at
least one woman is $\sum_{i=1}^{r} \binom{n}{i} \binom{n}{r-i}$.
Since both formulas count the same groups, we deduce the combinatorial identity
$$\sum_{i=1}^{r} \binom{n}{i} \binom{n}{r-i} = \binom{2n}{r} - \binom{n}{r}.$$
One context in which combinatorial proofs arise very naturally is when we are counting ordered pairs that have some property.
That is, for some subset of X × Y , we may wish to count all of the ordered pairs (x, y), where x ∈ X and y ∈ Y , such that (x, y)
has some property. We can do this by first considering every possible value of x ∈ X , and for each such value, counting the
number of y ∈ Y such that (x, y) satisfies the desired property, or by first considering every possible value of y ∈ Y , and for each
such value, counting the number of x ∈ X such that (x, y) satisfies the desired property.
Although this idea may not seem very practical, it is actually the context in which many of the combinatorial proofs in later
chapters will arise. We will be looking at a set X of elements, and a set Y that is actually a collection of subsets of elements of X,
and counting pairs (x, y) for which the element x appears in the subset y . By counting these pairs in two ways, we will find a
combinatorial identity.
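The idea of counting the same set of pairs "by x" and "by y" can be illustrated on a tiny made-up instance. In the Python sketch below, the set `X`, the collection `Y`, and both function names are hypothetical data and helpers chosen only for illustration.

```python
def count_by_elements(X, Y):
    """For each x in X, count the subsets y in Y that contain x."""
    return sum(sum(1 for y in Y if x in y) for x in X)

def count_by_subsets(X, Y):
    """For each subset y in Y, count the elements x of X that lie in y."""
    return sum(sum(1 for x in X if x in y) for y in Y)

# Hypothetical data: five elements and four subsets of them.
X = {1, 2, 3, 4, 5}
Y = [{1, 2, 3}, {2, 4}, {1, 4, 5}, {3}]
print(count_by_elements(X, Y), count_by_subsets(X, Y))  # 9 9
```

Both functions count the pairs (x, y) with x in y; they only differ in which coordinate is fixed first, which is why the two totals must agree.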
Example 4.2.4
Let B be the set of city blocks in a small city, and let S be the set of street segments in the city (where a street segment means
a section of street that lies between two intersections). Assume that each block has at least three sides. Count the number of
pairs (s, b) with s ∈ S and b ∈ B such that the street segment s is adjacent to the block b in two ways. Use this to deduce a
combinatorial inequality.
Solution
Let |S| = t . Each street segment is adjacent to two blocks: the blocks that lie on either side of the street. Therefore, for any
given street segment s , there are two pairs (s, b) such that s is adjacent to the block b . Multiplying this count by t (the number
of street segments) tells us that the total number of pairs (s, b) ∈ S × B with s adjacent to b is 2t.
Let |B| = c . Each block is adjacent to at least 3 street segments: the street segments that surround the block. Therefore, for any
given block b in the city, there are at least 3 pairs (s, b) such that b is adjacent to the street segment s . Multiplying this count
by c (the number of blocks) tells us that the total number of pairs (s, b) ∈ S × B with s adjacent to b is at least 3c.
We deduce that 2t ≥ 3c .
Exercise 4.2.1
Let P be the set of people in a group, with |P | = p . Let C be a set of clubs formed by the people in this group, with |C | = c .
Suppose that each club contains exactly g people, and each person is in exactly j clubs. Use two different ways to count the
number of pairs (b, h) ∈ P × C such that person b is in club h , and deduce a combinatorial identity.
Exercise 4.2.2
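Prove the following combinatorial identities, using combinatorial proofs.
1. $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$. [Hint: Consider the number of ways to choose a team
of r people from a group of n people. Then break the problem into two cases depending on whether or not one specific
person is chosen for the team.]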
2. For any natural numbers k, r, n, with 0 ≤ k ≤ r ≤ n, $\binom{n}{r} \binom{r}{k} = \binom{n}{k} \binom{n-k}{r-k}$. [Hint: Consider the number of ways to choose r
dogs who will enter a competition, from a set of n dogs, and to choose k of those r dogs to become the finalists. Then
choose the finalists first, followed by the other dogs who entered the competition.]
3. For any natural number n, $\sum_{r=0}^{n} \binom{n}{r}^2 = \binom{2n}{n}$.
4. For n ≥ 1 and k ≥ 1, $\frac{n!}{(n-k)!} = n \cdot \frac{(n-1)!}{(n-1-(k-1))!}$.
5. For n ≥ 1, $3^n = \sum_{k=0}^{n} \binom{n}{k} 2^{n-k}$.
Exercise 4.2.3
Sometimes the hardest part of a combinatorial proof can be figuring out what problem the given formula provides a solution to.
For each of the following formulas, state a counting problem that can be solved by the formula.
1. $n\,2^{n-1}$.
2. $\sum_{r=0}^{n} r \binom{n}{r}$.
3. $\sum_{k=r}^{n} \binom{n}{k} \binom{k}{r}$.
4. $2^{n-r} \binom{n}{r}$.
This page titled 4.2: Combinatorial Proofs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
4.3: Summary
Counting via bijections
Combinatorial identities
Combinatorial proofs
This page titled 4.3: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
4.3.1 https://math.libretexts.org/@go/page/60207
CHAPTER OVERVIEW
This page titled 5: Counting with Repetitions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
5.1: Unlimited Repetition
For many practical purposes, even if the number of indistinguishable elements in each class is not actually infinite, we will be
drawing a small enough number that we will not run out. The bagel shop we visited in Example 2.2.1 is not likely to run out of one
variety of bagel before filling a particular order. In standard card games, we never deal enough cards to a single player that they
might have all of the cards of one suit and still be getting more cards.
This is the sort of scenario we’ll be studying in this section. The set-up we’ll use is to assume that there are n different “types” of
item, and there are enough items of each type that we won’t run out. Then we’ll choose items, allowing ourselves to repeatedly
choose items of the same type as many times as we wish, until the total number of items we’ve chosen is r. Notice that (unlike in
Chapter 3), in this scenario r may exceed n .
We’ll consider two scenarios: the order in which we make the choice matters, or the order in which we make the choice doesn’t
matter.
Example 5.1.1
Chris has promised to bring back bagels for three friends he’s studying with (as well as one for himself). The bagel shop sells
eight varieties of bagel. In how many ways can he choose the bagels to give to Jan, Tom, Olive, and himself?
Solution
Here, it matters who gets which bagel. We can model this by assuming that the first bagel Chris orders will be for himself, the
second for Jan, the third for Tom, and the last for Olive. Thus, the order in which he asks for the bagels matters.
We actually saw back in Chapter 2 how to solve this problem. It’s just an application of the product rule! Chris has eight
choices for the first bagel; for each of these, he has eight choices for the second bagel; for each of these, he has eight choices
for the third bagel; and for each of these, he has eight choices for the fourth bagel. So in total, he has $8^4$ ways to choose the
bagels.
OK, so if the order in which we make the choice matters, we just use the multiplication rule. What about if order doesn’t matter?
Example 5.1.2
When Chris brought back the bagels, it turned out that he’d done a poor job of figuring out what his friends wanted. They all
traded around. Later that night, they sent him to the doughnut store, but this time they told him to just bring back eight
doughnuts and they’d figure out who should get which. If the doughnut store has five varieties, how many ways are there for
Chris to fill this order?
Solution
Let’s call the five varieties chocolate, maple, boston cream, powdered, and jam-filled. One way to describe Chris’ order would
be to make a list in which we first write one c for each chocolate doughnut, then one m for each maple doughnut, then one b
for each boston cream doughnut, then one p for each powdered doughnut, and finally one j for each jam-filled doughnut. Since
Chris is ordering eight doughnuts, there will be eight letters in this list. Notice that there’s more information provided by this
list than we actually need. We know that all of the first group of letters are cs, so instead of writing them all out, we could
simply put a dividing marker after all of the cs and before the first m. Similarly, we can put three more dividing markers in to
separate the ms from the bs, the bs from the ps, and the ps from the js. Now we have a list that might look something like this:
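cc||bbb|ppp|
Here Chris ordered two chocolate doughnuts, no maple doughnuts, three boston cream doughnuts, three powdered doughnuts,
and no jam-filled doughnuts. In fact, the letters are no longer needed: the dividing markers already tell us which variety each
symbol stands for, so we can replace every letter with an underline. For example, from the list
|_ _|_ _|_ _ _|_
we understand that Chris ordered no chocolate doughnuts; two maple doughnuts; two boston cream doughnuts; three powdered
doughnuts; and one jam-filled doughnut.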
So an equivalent problem is to count the number of ways of arranging eight underlines and four dividing markers in a line.
This is something we already understand! We have twelve positions that we need to fill, and the problem is: in how many ways
can we fill eight of the twelve positions with underlines (placing dividing markers in the other four positions). We know that
this can be done in $\binom{12}{8}$ ways.
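The encoding of an order as underlines and dividing markers can be tried out directly. The Python sketch below is only an illustration: the variety letters follow the example, while the function name `to_markers` and the decision to write the underlines without spaces are choices made here.

```python
from itertools import combinations_with_replacement
from math import comb

VARIETIES = "cmbpj"  # chocolate, maple, boston cream, powdered, jam-filled

def to_markers(order):
    """Encode an order (a string of variety letters) as underlines and dividers."""
    return "|".join("_" * order.count(v) for v in VARIETIES)

# Every unordered selection of eight doughnuts from the five varieties.
orders = ["".join(o) for o in combinations_with_replacement(VARIETIES, 8)]
codes = {to_markers(o) for o in orders}

print(to_markers("mmbbpppj"))                # |__|__|___|_
print(len(orders), len(codes), comb(12, 8))  # 495 495 495
```

Each code has twelve symbols, four of which are dividing markers, and different orders always receive different codes, which is the bijection the solution relies on.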
This technique can be used to give us a general formula for counting the number of ways of choosing r objects from n types of
objects, where we are allowed to repeatedly choose objects of the same type.
Theorem 5.1.1
The number of ways of choosing r objects from n types of objects (with replacement or repetition allowed) is
$$\binom{n+r-1}{r} \tag{5.1.1}$$
Proof
We use the same idea as in the solution to Example 5.1.2, above. Since there are n different types of objects, we will need
n − 1 dividing markers to keep them apart. Since we are choosing r objects, we will need r underlines, for a total of
n + r − 1 positions to be filled. We can choose the r positions in which the objects will go in $\binom{n+r-1}{r}$ ways, and then (in
each case) put dividing markers into the remaining n − 1 positions. Thus, there are $\binom{n+r-1}{r}$ ways to choose r objects from
n types of objects, if repetition or replacement is allowed.
Note
We use $\left(\!\binom{n}{r}\!\right)$ to denote the number of ways of choosing r objects from n types of objects (with replacement or repetition
allowed), so
$$\left(\!\!\binom{n}{r}\!\!\right) = \binom{n+r-1}{r} \tag{5.1.2}$$
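The general formula can be compared against a direct listing for small values of n and r. In this Python sketch the function names `multichoose` and `multichoose_by_listing` are illustrative; the listing uses the standard library's `combinations_with_replacement`, which generates unordered selections with repetition allowed.

```python
from itertools import combinations_with_replacement
from math import comb

def multichoose(n, r):
    """Number of ways to choose r objects from n types, repetition allowed."""
    return comb(n + r - 1, r)

def multichoose_by_listing(n, r):
    """The same count, obtained by listing every unordered selection."""
    return sum(1 for _ in combinations_with_replacement(range(n), r))

# Note that r is allowed to exceed n here.
for n in range(1, 6):
    for r in range(0, 7):
        assert multichoose(n, r) == multichoose_by_listing(n, r)

print(multichoose(5, 8))  # 495
```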
The reason we say “replacement or repetition” is because there is another natural model for this type of problem. Suppose that
instead of choosing eight doughnuts from five varieties, Chris is asked to put his hand into a bag that contains five different-coloured
pebbles, and draw one out; then replace it, repeatedly (with eight draws in total). If he keeps count of how many times he draws
each of the rocks, the number of possible tallies he’ll end up with is exactly the same as the number of doughnut orders in Example
5.1.2.
The following table summarises some of the key things we’ve learned about counting so far:
Table 5.1.1: The number of ways to choose r objects from n objects (or types of objects)
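                        Repetition allowed        Repetition not allowed
Order matters           $n^r$                     $\frac{n!}{(n-r)!}$
Order doesn't matter    $\binom{n+r-1}{r}$        $\binom{n}{r}$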
Exercise 5.1.1
1. Each of the ten sections in your community band (trombones, flutes, and so on) includes at least four people. The conductor
needs a quartet to play at a school event. How many different sets of instruments might end up playing at the event?
2. The prize bucket at a local fair contains six types of prizes. Kim wins four prizes, Jordan wins three prizes, and Finn wins six.
Each of the kids plans to give one of the prizes he has won to his teacher, and keep the rest. In how many ways can their
prizes (including the gifts to the teacher) be chosen? (It is important which gift comes from which child.)
3. There are three age categories in the local science fair: junior, intermediate, and senior. The judges can choose nine projects
in total to advance to the next level of competition, and they must choose at least one project from each age group. In how
many ways can the projects that advance be distributed across the age groups?
Exercise 5.1.2
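1. $\left(\!\binom{n}{k}\!\right) = \left(\!\binom{n-1}{k}\!\right) + \left(\!\binom{n}{k-1}\!\right)$.
2. For k, n ≥ 1, $\left(\!\binom{n}{k}\!\right) = \binom{n+k-1}{k}$.
3. For k, n ≥ 1, $\left(\!\binom{k+1}{n-1}\!\right) = \binom{n+k-1}{k}$.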
4. For 1 ≤ n ≤ k , (( )) = ( n
k−1
k−1
k−n
).
Exercise 5.1.3
$x_1 + x_2 + x_3 + x_4 + 3x_5 = 12$.
2. We will buy 3 pies (not necessarily all different) from a store that sells 4 kinds of pie. How many different orders are
possible? List all of the possibilities (using A for apple, B for blueberry, C for cherry, and D for the other one).
3. Suppose lacrosse balls come in 3 colours: red, yellow, and blue. How many different combinations of colours are possible
in a 6-pack of lacrosse balls?
4. After expanding $(a + b + c + d)^7$ and combining like terms, how many terms are there? [Justify your answer without
This page titled 5.1: Unlimited Repetition is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
5.1.3 https://math.libretexts.org/@go/page/60210
5.2: Sorting a Set that Contains Repetition
In the previous section, the new work came from looking at combinations where repetition or replacement is allowed. For our
purposes, we assumed that the repetition or replacement was effectively unlimited; that is, the store might only have 30 cinnamon
raisin bagels, but since Chris was only ordering four bagels, that limit didn’t matter.
In this section, we’re going to consider the situation where there is a fixed number of objects in total; some of them are “repeated”
(that is, indistinguishable from one another), and we want to determine how many ways they can be arranged (permuted). This can
arise in a variety of situations.
Example 5.2.1
When Chris gets back from the doughnut store run, he discovers that Mohammed, Jing, Karl, and Sara have joined the study
session. He has bought two chocolate doughnuts, three maple doughnuts, and three boston cream doughnuts. In how many
ways can the doughnuts be distributed so that everyone gets one doughnut?
Solution
Initially, this looks a lot like a permutation question: we need to figure out the number of ways to arrange the doughnuts in
some order, and give the first doughnut to the first student, the second doughnut to the second student, and so on.
The key new piece in this problem is that, unlike the permutations we’ve studied thus far, the two chocolate doughnuts are
indistinguishable (as are the three maple doughnuts and the three boston cream doughnuts). This means that there is no
difference between giving the first chocolate doughnut to Tom and the second to Mohammed, and giving the first chocolate
doughnut to Mohammed and the second to Tom.
One way to solve this problem is to look at it as a series of combinations of the people, rather than as a permutation question
about the doughnuts. Instead of arranging the doughnuts, we can first choose which two of the eight people will receive the
two chocolate doughnuts. Once that is done, from the remaining six people, we choose which three will receive maple
doughnuts. Finally, the remaining three people receive boston cream doughnuts. Thus, the solution is $\binom{8}{2}\binom{6}{3}$.
Another approach is more like the approach we used to figure out how many r-combinations there are of n objects. In this
approach, we begin by noting that we would be able to arrange the eight doughnuts in 8! orders if all of them were distinct. For
any fixed choice of two people who receive the chocolate doughnuts, there are 2! ways in which those two chocolate
doughnuts could have been distributed to them, so in the 8! orderings of the doughnuts, each of these choices for who gets the
chocolate doughnuts has been counted 2! times rather than once. Similarly, for any fixed choice of three people who receive
the maple doughnuts, there are 3! ways in which these three maple doughnuts could have been distributed to them, and each of
these choices has been counted 3! times rather than once. The same holds true for the three boston cream doughnuts. Thus, the solution is $\frac{8!}{2!\,3!\,3!}$.
Since
$$\binom{8}{2}\binom{6}{3} = \frac{8!}{2!\,6!} \cdot \frac{6!}{3!\,3!} = \frac{8!}{2!\,3!\,3!},$$
we see that these solutions are in fact identical although they look different.
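As a quick sanity check, the two counts from Example 5.2.1 can be compared against a brute-force enumeration. This is only an illustrative sketch: the string `"CCMMMBBB"` (one letter per doughnut) and the variable names are assumptions made for the example, not part of the text.

```python
from itertools import permutations
from math import comb, factorial

# One letter per doughnut: 2 chocolate, 3 maple, 3 boston cream.
doughnuts = "CCMMMBBB"

# First approach: choose who gets chocolate, then who gets maple.
by_choosing_people = comb(8, 2) * comb(6, 3)

# Second approach: order all 8 doughnuts, then divide out the repeats.
by_dividing = factorial(8) // (factorial(2) * factorial(3) * factorial(3))

# Brute force: generate all 8! orderings and keep the distinct ones.
by_brute_force = len(set(permutations(doughnuts)))

print(by_choosing_people, by_dividing, by_brute_force)  # 560 560 560
```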
This technique can be used to give us a general formula for counting the number of ways of arranging n objects some of which are
indistinguishable from each other.
Theorem 5.2.1
Suppose that:
there are n objects;
for each i with 1 ≤ i ≤ m, $r_i$ of them are of type i (indistinguishable from each other); and
$r_1 + \cdots + r_m = n$.
Then the number of arrangements of these n objects is
$$\frac{n!}{r_1!\, r_2! \cdots r_m!} \tag{5.2.1}$$
Proof
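We use the same idea as in the solution to Example 5.2.1, above. Either approach will work, but we’ll use the first. There will be n positions in the final ordering of the objects. We begin by choosing $r_1$ of these to hold the objects of type 1. Then we choose $r_2$ of the remaining $n - r_1$ positions to hold the objects of type 2, and so on. Ultimately, we choose the final $r_m$ locations (in $\binom{r_m}{r_m} = 1$ way) to hold the objects of type m. The total number of arrangements is therefore
$$\begin{aligned}
\binom{n}{r_1}\binom{n - r_1}{r_2} \cdots \binom{r_m}{r_m}
&= \frac{n!}{r_1!\,(n - r_1)!} \cdot \frac{(n - r_1)!}{r_2!\,(n - r_1 - r_2)!} \cdots \frac{(n - r_1 - \cdots - r_{m-1})!}{r_m!\,0!} \\
&= \frac{n!}{r_1!\, r_2! \cdots r_m!}
\end{aligned} \tag{5.2.2}$$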
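The two sides of the computation in the proof can be compared numerically. The sketch below is illustrative only; the function names `multinomial_by_factorials` and `multinomial_by_choices` are hypothetical names chosen for this example.

```python
from math import comb, factorial, prod

def multinomial_by_factorials(counts):
    """n! / (r_1! r_2! ... r_m!) where n is the sum of the counts."""
    n = sum(counts)
    return factorial(n) // prod(factorial(r) for r in counts)

def multinomial_by_choices(counts):
    """Product of binomial coefficients: choose positions one type at a time."""
    remaining = sum(counts)
    ways = 1
    for r in counts:
        ways *= comb(remaining, r)  # positions for this type
        remaining -= r              # positions left for later types
    return ways

print(multinomial_by_factorials([2, 3, 3]))  # 560
print(multinomial_by_choices([2, 3, 3]))     # 560
```

Both functions should agree for any list of non-negative counts, which is exactly what the telescoping cancellation in the proof shows.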
Note
We use $\binom{n}{r_1, \ldots, r_m}$ to denote the number of arrangements of $n = r_1 + \cdots + r_m$ objects where for each i with 1 ≤ i ≤ m we have $r_i$ indistinguishable objects of type i. Thus,
$$\binom{n}{r_1, \ldots, r_m} = \frac{n!}{r_1! \cdots r_m!} \tag{5.2.3}$$
Example 5.2.2
Cathy, Akos, and Dagmar will be going into a classroom of 30 students. They will each be pulling out four students to work
with in a small group setting. In how many ways can the groups be chosen?
Solution
Even though all of the students in the class are distinct, the order in which they get chosen for the group they end up in doesn’t
matter. One way of making the selection would be to put the names Cathy, Akos, and Dagmar into a hat (four times each)
along with 18 blank slips of paper. Each student could choose a slip of paper and would be assigned to the group
corresponding to the name they chose. The four slips with Cathy’s name on them are identical, as are the four with Akos’
name, the four with Dagmar’s name, and the 18 blank slips.
Thus, the solution to this problem is
$$\binom{30}{4, 4, 4, 18} = \frac{30!}{4!\,4!\,4!\,18!}$$
We could also work this out more directly, by allowing each of Cathy, Akos, and Dagmar to choose four students; Cathy’s choice can be made in $\binom{30}{4}$ ways; then Akos’ in $\binom{26}{4}$ ways; then Dagmar’s in $\binom{22}{4}$ ways, and the product of these is $\frac{30!}{4!\,4!\,4!\,18!}$.
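The agreement between the two approaches in Example 5.2.2 can be checked directly. This is a small sketch with variable names chosen for illustration.

```python
from math import comb, factorial

# Direct approach: Cathy, then Akos, then Dagmar each pick four students.
by_successive_choices = comb(30, 4) * comb(26, 4) * comb(22, 4)

# Slips-of-paper approach: arrange 4 + 4 + 4 + 18 slips among 30 students.
by_formula = factorial(30) // (factorial(4) ** 3 * factorial(18))

print(by_successive_choices == by_formula)  # True
```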
Exercise 5.2.1
Exercise 5.2.2
$x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m}$ that comes from the product on the left-hand side of the equation.
This page titled 5.2: Sorting a Set that Contains Repetition is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
5.3: Summary
The number of ways of choosing r objects from n types of objects (with replacement or repetition allowed) is $\left(\!\binom{n}{r}\!\right) = \binom{n+r-1}{r}$.
The number of ways of arranging n objects where $r_i$ of them are of type i (indistinguishable), is $\binom{n}{r_1, r_2, \ldots, r_m}$.
Notation:
$\left(\!\binom{n}{r}\!\right)$
$\binom{n}{r_1, r_2, \ldots, r_m}$
This page titled 5.3: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
This page titled 6: Induction and Recursion is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
6.1: Recursively-Defined Sequences
You may be familiar with the term “recursion” as a programming technique. It comes from the same root as the word “recur,” and
is a technique that involves repeatedly applying a self-referencing definition until we reach some initial terms that are explicitly
defined, and then going back through the applications to work out the result we want. If you didn’t follow that, it’s okay, we’ll go
through the definition and some specific examples that should give you the idea.
as the initial conditions for the recursively-defined sequence. The equation that defines $r_n$ from $r_1, \ldots, r_{n-1}$ is called the recursive relation.
Probably the best-known example of a recursively-defined sequence is the Fibonacci sequence. It is named for an Italian
mathematician who introduced the sequence to western culture as an example in a book he wrote in 1202 to advocate for the use of
Arabic numerals and the decimal system. The sequence was known to Indian mathematicians as early as the 6th century.
So in the Fibonacci sequence, $f_0 = f_1 = 1$ are the initial conditions, and $f_n = f_{n-1} + f_{n-2}$ for all n ≥ 2 is the recursive relation.
The usual problem associated with recursively-defined sequences is to find an explicit formula for the nth term that does not
require calculating all of the previous terms. Clearly, if we want to be able to determine terms that arise later in the sequence, this is
critical. If we try to find the millionth term of a recursively-defined sequence directly, it will require a great deal of computing time
and might also require a lot of memory.
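To make the recursive definition concrete, here is a minimal sketch that computes Fibonacci terms directly from the initial conditions $f_0 = f_1 = 1$ and the recursive relation. The function name `fibonacci` is an assumption for this example. Note that it has to step through every earlier term, which is the cost an explicit formula would avoid.

```python
def fibonacci(n):
    """Return f_n, using f_0 = f_1 = 1 and f_n = f_{n-1} + f_{n-2}."""
    prev, curr = 1, 1  # f_0 and f_1
    for _ in range(n):
        # Slide the window forward by one term.
        prev, curr = curr, prev + curr
    return prev

print([fibonacci(n) for n in range(8)])  # [1, 1, 2, 3, 5, 8, 13, 21]
```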
Every time you were asked in school to look at a sequence of numbers, find a pattern, and give the next number in the sequence,
you were probably working out a recurrence relation and applying it.
Example 6.1.1
Consider the sequence 5, 8, 11, 14. What number should come next?
Solution
We consider the differences between successive pairs: 8 − 5 = 3 ; 11 − 8 = 3 ; 14 − 11 = 3 . This appears to be an arithmetic
sequence, with the constant difference of 3 between successive terms. So the sequence can be defined by $a_1 = 5$ and $a_n = a_{n-1} + 3$, for every n ≥ 2. We were asked for $a_5$, and we know that $a_4 = 14$, so $a_5 = a_4 + 3 = 14 + 3 = 17$.
Example 6.1.2
Consider the sequence 3, 6, 11, 18, 27. What number should come next?
Solution
Again, consider the differences between successive terms: 6 − 3 = 3 ; 11 − 6 = 5 ; 18 − 11 = 7 ; 27 − 18 = 9 . These
differences aren’t constant, but do follow a predictable pattern: they are the odd numbers (starting at 3 and increasing). So the
sequence can be defined by $a_1 = 3$ and $a_n = a_{n-1} + (2n - 1)$, for every n ≥ 2. We were asked for $a_6$, and we know that $a_5 = 27$, so $a_6 = a_5 + 2(6) - 1 = 27 + 11 = 38$.
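Both of the last two examples follow the same pattern: a first term plus an increment that may depend on n. A small sketch, with the hypothetical helper `build_sequence`, reproduces both sequences from their recursive definitions.

```python
def build_sequence(first, increment, count):
    """Return a_1, ..., a_count where a_n = a_{n-1} + increment(n) for n >= 2."""
    terms = [first]
    for n in range(2, count + 1):
        terms.append(terms[-1] + increment(n))
    return terms

# Example 6.1.1: a_1 = 5, a_n = a_{n-1} + 3.
print(build_sequence(5, lambda n: 3, 5))          # [5, 8, 11, 14, 17]

# Example 6.1.2: a_1 = 3, a_n = a_{n-1} + (2n - 1).
print(build_sequence(3, lambda n: 2 * n - 1, 6))  # [3, 6, 11, 18, 27, 38]
```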
This example shows that the recurrence relation can depend on n, as well as on the values of the preceding terms. (Although we
didn’t state this explicitly in our definition, it is implicit because n − 1 is the number of previous terms on which $r_n$ depends; we could calculate n as $a_1^0 + a_2^0 + \cdots + a_{n-1}^0 + 1$.)
Let’s look at one more example.
Example 6.1.3
Stavroula’s bank pays 1% interest (compounded annually), and charges her a service fee of $10 per year to maintain the
account. The fee is charged at the start of the year, and the interest is calculated on the balance at the end of the year. If she
starts with a balance of $2000, is she making money or losing money? If this account is set up for her by her parents and she’s
not allowed to touch it, how much money will be in the account after seven years?
Solution
We see that the initial term is $r_0 = 2000$. We’re going to use $r_0$ as the first term, because then the value of her account after 1 year will be $r_1$; after two years will be $r_2$; and after seven years will be $r_7$. This just makes it a little easier to keep track of
$10, so if she makes money in the first year, she will continue to make money; while if she loses money in the first year, she
will continue to lose money after that. So to answer the first question, we’ll work out
$r_1 = 1.01(r_0 - 10) = 1.01(1990) = 2009.9$. Stavroula is making money.
To answer the second question, unless we’ve managed to figure out an explicit formula for $r_n$ (which we don’t yet know how to do), we need to calculate $r_2$, $r_3$, $r_4$, $r_5$, $r_6$, and $r_7$. It would be reasonable to assume that the bank rounds its calculations to
the nearest penny every year, and carries forward with the rounded value, but because this will create an error that will be
compounded in comparison with solving our recurrence relation explicitly (which we’ll learn later how to do), we’ll keep track
of the exact values instead. We have
$r_2 = 1.01(2009.9 - 10) = 2019.899$;
$r_3 = 1.01(2009.899) = 2029.99799$;
$r_4 = 1.01(2019.99799) = 2040.1979699$;
$r_5 = 1.01(2030.1979699) = 2050.499949599$;
$r_6 = 1.01(2040.499949599) = 2060.90494909499$;
$r_7 = 1.01(2050.90494909499) = 2071.4139985859399$
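The exact values above can be reproduced with rational arithmetic, which avoids floating-point rounding. This sketch assumes the recursive relation $r_n = 1.01(r_{n-1} - 10)$ used in the calculations; the function name `balances` and its parameters are chosen for illustration.

```python
from fractions import Fraction

def balances(start, years, rate=Fraction(101, 100), fee=10):
    """Return [r_0, ..., r_years] with r_n = rate * (r_{n-1} - fee)."""
    r = [Fraction(start)]
    for _ in range(years):
        # Fee comes off at the start of the year, interest at the end.
        r.append(rate * (r[-1] - fee))
    return r

for n, value in enumerate(balances(2000, 7)):
    # float() is used only for display; the stored values are exact.
    print(n, float(value))
```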
Exercise 6.1.1
3. If the annual fee on Stavroula’s bank account from Example 6.1.3 is $20 instead of $10, is she making money or losing
money?
This page titled 6.1: Recursively-Defined Sequences is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
6.2: Basic Induction
Suppose we want to show that n! is at least $2^n - 2$, for every n ≥ 1 (where n must be an integer). We could start verifying this fact for each of the possible values for n:
$$\begin{aligned}
1! = 1 &\ge 2^1 - 2 = 0; \\
2! = 2 &\ge 2^2 - 2 = 2; \\
3! = 6 &\ge 2^3 - 2 = 6; \\
4! = 24 &\ge 2^4 - 2 = 14.
\end{aligned} \tag{6.2.1}$$
We could continue verifying the values one at a time, but the process would go on forever, so we’d never be able to complete the
proof.
Instead, think about the following method. We know that the inequality holds for n = 1 . Let’s suppose that the inequality holds for
some value n = k , i.e. that
$$k! \ge 2^k - 2. \tag{6.2.2}$$
Now let’s use the fact that we can easily calculate (k + 1)! from k! together with our supposition, to deduce that the inequality
holds when n = k + 1 , i.e. that
$$(k + 1)! \ge 2^{k+1} - 2. \tag{6.2.3}$$
This is enough to prove the inequality for every integer n ≥ 1 because applying our supposition and deduction enough times will
prove the inequality for any value at all that interests us! For example, if we wanted to be sure that the inequality holds for
n = 100 , we could take the fact that we know it holds for 1 , to deduce that it holds for 2 , then the fact that it holds for 2 allows us
to deduce that it holds for 3. By repeating this 97 more times, eventually we see that since it holds for 99, we can deduce that it
holds for 100.
2. For any integer $k \ge n_0$, if P(k) is true then P(k + 1) must also be true, then P(n) is true for every integer $n \ge n_0$.
conditional statement that P(k) ⇒ P(k + 1) for every $k \ge n_0$ is called the inductive step. The assumption we make in the inductive step, that P(k) is true for some arbitrary $k \ge n_0$, is called the inductive hypothesis, and can be referred to by (IH)
Now that we’ve gone through the formalities, let’s write a proper proof by induction for the inequality we used to introduce this
idea.
Example 6.2.1
Prove by induction that $n! \ge 2^n - 2$, for every integer n ≥ 2. (This inequality is actually true for every n ≥ 0, but the proof is
Certainly 2 ≥ 2 , so the inequality holds for n = 2 . This completes the proof of the base case.
Inductive step: We begin with the inductive hypothesis. Let k ≥2 be arbitrary, and suppose that the inequality holds for
n = k; that is, assume that $k! \ge 2^k - 2$.
Now we want to deduce that
$$(k + 1)! \ge 2^{k+1} - 2.$$
Let’s start from the left-hand side of this inequality. By the definition of factorial, we know that
(k + 1)! = (k + 1)k!
Now that we have k! in the expression, we’re in a position to apply the inductive hypothesis; that is,
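$$(k + 1)! = (k + 1)k! \ge (k + 1)(2^k - 2).$$
Since k ≥ 2, we have k + 1 ≥ 3, so
$$(k + 1)(2^k - 2) \ge 3(2^k - 2) = 2(2^k) + 2^k - 6 = 2^{k+1} + 2^k - 6.$$
Since k ≥ 2, we also have $2^k \ge 4$, so $2^k - 6 \ge -2$, and therefore
$$(k + 1)! \ge 2^{k+1} - 2,$$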
which is what we wanted to deduce. This completes the proof of the inductive step.
By the Principle of Mathematical Induction, $n! \ge 2^n - 2$ for every integer n ≥ 2.
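A finite check is no substitute for the proof, but it is a useful way to catch a mis-stated inequality before trying to prove it. The sketch below tests the claim for a range of small n; the upper limit of 50 is an arbitrary choice for illustration.

```python
from math import factorial

# Test n! >= 2^n - 2 for n = 0, 1, ..., 50.
# This only confirms finitely many cases; induction covers all n >= 2.
checked = [(n, factorial(n), 2 ** n - 2) for n in range(51)]
holds = all(lhs >= rhs for _, lhs, rhs in checked)

print(holds)        # True
print(checked[:5])  # [(0, 1, -1), (1, 1, 0), (2, 2, 2), (3, 6, 6), (4, 24, 14)]
```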
Proofs by induction work very naturally with recursively-defined sequences, since the recurrence relation gives us information
about the (k + 1)st term of the sequence, based on previous terms.
Example 6.2.2
Consider the sum of the first n integers. We can think about this as a recursively-defined sequence, by defining $s_1 = 1$, and $s_n = s_{n-1} + n$, for every n ≥ 2. Thus, $s_2 = 1 + 2$; $s_3 = s_2 + 3 = 1 + 2 + 3$, and so on. Prove by induction that $s_n = \frac{n(n + 1)}{2}$, for every n ≥ 1.
Solution
Base case: n = 1. We have $s_n = s_1 = 1$, and
$$\frac{n(n + 1)}{2} = \frac{1(2)}{2} = 1,$$
so the equality holds for n = 1 . This completes the proof of the base case.
Inductive step: We begin with the inductive hypothesis. Let k ≥ 1 be arbitrary, and suppose that the equality holds for n = k; that is, assume that $s_k = \frac{k(k + 1)}{2}$.
Now we want to deduce that $s_{k+1} = \frac{(k + 1)(k + 2)}{2}$. Using the recursive relation, we have $s_{k+1} = s_k + (k + 1)$ since k + 1 ≥ 2, and using the inductive hypothesis, we have $s_k = \frac{k(k + 1)}{2}$, so putting these together, we see that
$$s_{k+1} = \frac{k(k + 1)}{2} + (k + 1) = \frac{k(k + 1) + 2(k + 1)}{2} = \frac{(k + 1)(k + 2)}{2},$$
which is what we wanted to deduce. This completes the proof of the inductive step.
By the Principle of Mathematical Induction, $s_n = \frac{n(n + 1)}{2}$ for every n ≥ 1.
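The relationship between the recursive definition of $s_n$ and the closed formula can be illustrated numerically. In this sketch the helper name `partial_sums` and the cut-off of 100 terms are assumptions for the example.

```python
def partial_sums(count):
    """Return [s_1, ..., s_count] with s_1 = 1 and s_n = s_{n-1} + n."""
    s = [1]
    for n in range(2, count + 1):
        s.append(s[-1] + n)
    return s

sums = partial_sums(100)
# Compare each recursively computed term with n(n + 1)/2.
matches = all(s_n == n * (n + 1) // 2 for n, s_n in enumerate(sums, start=1))

print(sums[:5])  # [1, 3, 6, 10, 15]
print(matches)   # True
```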
Note
The steps of a proof by induction are precisely defined, and if you leave any of them out, or forget the conditions required,
things can go badly wrong. The base case may seem obvious, but can’t be left out; also, the hypothesis that $k \ge n_0$ may be
Let’s look at an example where, by forgetting to include the base case, we can give a “proof by induction” of something that is
clearly false.
Example 6.2.3
Here is a “proof by induction” (without a base case) that every integer n is at least 1000.
Solution
Inductive step: We begin with the inductive hypothesis. Let k be arbitrary, and suppose that k ≥ 1000 .
Now we want to deduce that k + 1 ≥ 1000 . But clearly,
k + 1 ≥ k ≥ 1000
(by our inductive hypothesis), which is what we wanted to deduce. This completes the proof of the inductive step.
By the Principle of Mathematical Induction, n ≥ 1000 for every integer n .
Now it’s your turn to try a few, but don’t leave out any of the steps!
Exercise 6.2.1
Conjecture a closed formula for tn based on the values you have calculated, and use induction to prove that your formula is
correct.
6) Prove that for every integer n ≥ 1,
$$\sum_{j=1}^{n} j! \le \frac{1}{2}(n + 1)!$$
This page titled 6.2: Basic Induction is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
6.3: More Advanced Induction
Now that we’ve reviewed the basic form of induction, it’s important to consider some more advanced forms that are often used.
The first form we’ll look at is strong induction. When we have a recursively-defined sequence that depends on the previous terms,
sometimes we need to know not just about the single term that comes immediately before the nth term, but about other previous terms. Only by putting all of this information together will we be able to deduce the result we need about the nth term.
Example 6.3.1
$\sum_{i=1}^{n-1} a_i$.
Thus, $a_2 = a_1 = 2$;
$a_3 = a_1 + a_2 = 2 + 2 = 4$;
$a_4 = a_1 + a_2 + a_3 = 2 + 2 + 4 = 8$,
inductive hypothesis, we let k ≥ 2 be arbitrary, and suppose that the equality is true for n = k, so $a_k = 2^{k-1}$. Now when n = k + 1, we have
$$a_n = a_{k+1} = \sum_{i=1}^{k} a_i,$$
by the recursive relation for this sequence. We know what $a_1$ is, by our initial condition; we know that $a_k = 2^{k-1}$, but what about the values in between? The Principle of Mathematical Induction as we’ve learned it so far, doesn’t allow us to assume anything about (for example) $a_{k-1}$.
Actually, though, the way the concept of induction works, by the time we’re trying to prove something about $a_n$, we’ve actually already deduced it for every value between $n_0$ and n − 1 (inclusive). So there is nothing wrong with assuming that P(i) is true for every value between $n_0$ and k, rather than just for k, in order to deduce that P(k + 1) is true. More concretely, this is saying the
following. Suppose that by knowing P (0) we can deduce P (1), and then by knowing P (0) and P (1) we can deduce P (2), and so
on, so that eventually by knowing that everything from P (0) through P (k) is true, we can deduce that P (k + 1) is true.
Then P(n) is true for every integer n ≥ 0. Of course, we don’t have to start with 0; we can start with any integer $n_0$. This is the
strong form of mathematical induction:
2. For any integer $k \ge n_0$, if every P(i) is true for $n_0 \le i \le k$, then P(k + 1) must also be true,
Using this, we can complete the example we started above. We have $a_1 = 2$ (by the initial condition), and the strong induction hypothesis allows us to assume that $a_i = 2^{i-1}$ for every integer i with 2 ≤ i ≤ k. So using the recursive relation
$$a_{k+1} = \sum_{i=1}^{k} a_i \tag{6.3.1}$$
we see that
$$a_{k+1} = 2 + \sum_{i=2}^{k} 2^{i-1} \tag{6.3.2}$$
You probably learned in high school how to add up geometric sequences like this; in particular, that
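$$\sum_{j=0}^{k} 2^j = 2^{k+1} - 1 \tag{6.3.3}$$
Substituting j = i − 1 and applying this formula with k − 1 in place of k, we get
$$a_{k+1} = 2 + \sum_{j=1}^{k-1} 2^j = 1 + \sum_{j=0}^{k-1} 2^j = 1 + (2^k - 1) = 2^k,$$
which is the claimed formula for n = k + 1.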
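The sequence in this example can be generated straight from its recursive relation, assuming (as the computed values $a_2 = 2$, $a_3 = 4$, $a_4 = 8$ indicate) that $a_1 = 2$ and each later term is the sum of all earlier terms. The function name `sum_of_previous` is hypothetical.

```python
def sum_of_previous(count):
    """Return [a_1, ..., a_count] with a_1 = 2 and a_n = a_1 + ... + a_{n-1}."""
    a = [2]
    for _ in range(count - 1):
        a.append(sum(a))  # each new term needs every earlier term
    return a

terms = sum_of_previous(12)
# Compare with 2^(n-1) for n >= 2 (the formula does not apply to a_1 = 2).
matches = all(a_n == 2 ** (n - 1) for n, a_n in enumerate(terms, start=1) if n >= 2)

print(terms[:6])  # [2, 2, 4, 8, 16, 32]
print(matches)    # True
```

The call to `sum(a)` mirrors why strong induction is needed here: the new term depends on all of the earlier ones, not just the most recent.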
Example 6.3.2
Shawna is building a tower with lego. Prove that if she has n pieces of lego (where n ≥ 1 ), and a “move” consists of sticking
two smaller towers together into one (where a tower may consist of one or more pieces of lego), then it will take her n − 1
moves to complete the tower.
Solution
Base case: n = 1 . Shawna’s “tower” is already complete after n − 1 = 0 moves. This completes the proof of the base case.
Induction step: we begin with the induction hypothesis, that when 1 ≤ i ≤ k, it takes Shawna i − 1 moves to build a tower
that contains i pieces of lego.
Now we want to deduce that when Shawna has k + 1 pieces of lego, it takes her k moves to stick them together into a single
tower. Notice that when she makes her final move, it must consist of sticking together two smaller towers, one of which
contains j pieces of lego, and the other of which contains the remaining k + 1 − j pieces. Both j and k + 1 − j must lie
between 1 and k (if either of the smaller towers had k + 1 pieces then the tower would already be complete), so the induction
hypothesis applies to each of them. Thus, it has taken Shawna j − 1 moves to build the tower that contains j pieces, and k − j moves to build the tower that contains k − j + 1 pieces. Together with her final move, then, it must take Shawna
$$(j - 1) + (k - j) + 1 = k$$
moves to build the tower of k + 1 pieces, which is what we wanted to deduce.
This is pretty amazing. If we tried to go through the full argument for how many moves it takes her to build a tower with four
blocks, it would go something like this. First, to build a tower with one block clearly takes 0 moves; to build a tower with two
blocks clearly takes 1 move (stick the two blocks together). To build a tower with three blocks, we must use 1 move to stick
together a tower of two blocks (which took 1 move to create) with a tower of one block (which took 0 moves to create), meaning
that we use 2 moves altogether. Now, a tower of four blocks can be built in two ways: by using 1 move to stick together two towers
of two blocks, each of which took 1 move to make, for a total of 3 moves; or by using 1 move to stick together a tower of one
block (which took 0 moves to make) with a tower of three blocks (which took 2 moves to make), for a total of 3 moves. So under
either method, building a tower of four blocks takes 3 moves. You can see that the argument will get more and more complicated as
n increases, but it will always continue to work.
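The claim that the number of moves does not depend on the order of assembly can be explored with a small simulation. This is only a sketch: the function `moves_to_build` and the use of a seeded random generator to pick which two towers to join are assumptions made for the example.

```python
import random

def moves_to_build(n, rng):
    """Join randomly chosen pairs of towers until one remains; count the moves."""
    towers = [1] * n  # start with n single pieces
    moves = 0
    while len(towers) > 1:
        i, j = rng.sample(range(len(towers)), 2)  # two distinct towers
        merged = towers[i] + towers[j]
        towers = [t for idx, t in enumerate(towers) if idx not in (i, j)]
        towers.append(merged)
        moves += 1
    return moves

# Try many random assembly orders for towers of 1 to 20 pieces.
always_n_minus_1 = all(
    moves_to_build(n, random.Random(seed)) == n - 1
    for n in range(1, 21)
    for seed in range(25)
)
print(always_n_minus_1)  # True
```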
We won’t need strong induction as such very much until later in the course, but the idea is useful background for the next kind of
induction we’ll look at, which is very important when dealing with recurrence relations: induction with multiple base cases.
Induction with multiple base cases is very important for dealing with recursively defined sequences such as the Fibonacci
sequence, where each term depends on more than one of the preceding terms.
Suppose you were asked to prove that the nth term of the Fibonacci sequence, $f_n$, is at least $2^{n-2}$. If we try to follow our basic inductive strategy, we’d begin by observing that this is true for $f_0$:
$$f_0 = 1 \ge 2^{-2} = \frac{1}{4}.$$
Then we’d make the inductive hypothesis that our inequality is true for some arbitrary k ≥ 0, so $f_k \ge 2^{k-2}$. Now to deduce the inequality for n = k + 1, the natural approach is to use the recursive relation, which tells us that
$$f_{k+1} = f_k + f_{k-1}.$$
We can use our inductive hypothesis to make a substitution for $f_k$, but what about $f_{k-1}$? You might (reasonably) argue at this point that we should use strong induction, which will allow us to assume that the result is true for both $f_k$ and $f_{k-1}$, but actually, this doesn’t work! Why not? Well, the trouble is that everything we know about the Fibonacci sequence starts with $f_0$, but if k = 0 (which is the first time we try to use induction) then $f_{k-1} = f_{-1}$, which we haven’t even defined! It is very important to ensure that in the inductive step, we never make our assumption go back too far, i.e. to a value below $n_0$.
So, how can we deal with this problem? The solution is to add another base case, for n = 1. When n = 1, we have
$$f_1 = 1 \ge 2^{1-2} = \frac{1}{2}.$$
Now if we try induction, at the first step we will be using the fact that the statement is true for $f_0$ and $f_1$ to prove it for $f_2$; then the fact that it’s true for $f_1$ and $f_2$ will allow us to deduce it for $f_3$, and so on. The final argument will look like the following.
Example 6.3.3
Prove by induction that the nth term of the Fibonacci sequence, $f_n$, is at least $\left(\frac{3}{2}\right)^{n-1}$, for every n ≥ 0.
Solution
Since the recursive relation for the Fibonacci sequence requires the two immediately preceding terms, we will require two base
cases.
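Base cases: When n = 0, we have
$$f_0 = 1 \ge \left(\frac{3}{2}\right)^{-1} = \frac{2}{3},$$
so the inequality holds for n = 0. When n = 1, we have
$$f_1 = 1 \ge \left(\frac{3}{2}\right)^{0} = 1,$$
so the inequality holds for n = 1. This completes the proof of the base cases.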
Inductive step: We begin with the (strong) inductive hypothesis. Let k be an arbitrary integer at least as big as our biggest base case, so k ≥ 1. Assume that for every integer i with 0 ≤ i ≤ k, we have $f_i \ge \left(\frac{3}{2}\right)^{i-1}$.
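Using the recursive relation, we know that $f_{k+1} = f_k + f_{k-1}$. Since k ≥ 1, we have k − 1 ≥ 0, so both k and k − 1 satisfy the bounds on i (that 0 ≤ i ≤ k), so that we can apply our inductive hypothesis to both $f_k$ and $f_{k-1}$. We therefore have
$$f_{k+1} = f_k + f_{k-1} \ge \left(\frac{3}{2}\right)^{k-1} + \left(\frac{3}{2}\right)^{k-2} = \left(\frac{3}{2}\right)^{k-2}\left(\frac{3}{2} + 1\right) = \left(\frac{3}{2}\right)^{k-1} \cdot \frac{5}{3} > \left(\frac{3}{2}\right)^{k-1} \cdot \frac{3}{2} = \left(\frac{3}{2}\right)^{k},$$
since $\frac{5}{3} > \frac{3}{2}$. This is what we wanted to deduce. This completes the proof of the inductive step.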
By the Principle of Mathematical Induction, $f_n \ge \left(\frac{3}{2}\right)^{n-1}$ for every n ≥ 0.
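As with any induction proof, it can be reassuring to test the statement on actual values. The sketch below compares Fibonacci terms (with $f_0 = f_1 = 1$) against $(3/2)^{n-1}$ using exact fractions; the helper name `fibonacci_terms` and the choice of 40 terms are assumptions for illustration.

```python
from fractions import Fraction

def fibonacci_terms(count):
    """Return [f_0, ..., f_{count-1}] with f_0 = f_1 = 1."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

bound = Fraction(3, 2)
terms = fibonacci_terms(40)
# Exact comparison of f_n with (3/2)^(n-1); no floating-point rounding.
holds = all(f_n >= bound ** (n - 1) for n, f_n in enumerate(terms))

print(holds)  # True
```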
Exercise 6.3.1
1. Prove by induction that for every n ≥ 0, the nth term of the Fibonacci sequence is no greater than $2^n$.
2. The machine at the coffee shop isn’t working properly, and can only put increments of $4 or $5 on your gift card. Prove by
induction that you can get any amount of dollars that is at least $12. [Hint: You should have four base cases.]
3. Define a recurrence relation by $a_0 = a_1 = a_2 = 1$, and $a_n = a_{n-1} + a_{n-2} + a_{n-3}$ for n ≥ 3. Prove by induction that $a_n \le 2^n$ for all n ≥ 0.
This page titled 6.3: More Advanced Induction is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
6.4: Summary
Important Definitions:
Recursively-defined Sequence
Initial Conditions
Recursive Relation
Fibonacci Sequence
Proof by Induction
Base Case
Inductive Step
Inductive Hypothesis
Strong Induction
Induction with Multiple Base Cases
Notation:
(IH)
This page titled 6.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
7: Generating Functions
Recall that the basic goal with a recursively-defined sequence, is to find an explicit formula for the nth term of the sequence.
Generating functions will allow us to do this.
7.1: What is a Generating Function?
7.2: The Generalized Binomial Theorem
7.3: Using Generating Functions To Count Things
7.4: Summary
This page titled 7: Generating Functions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
7.1: What is a Generating Function?
A generating function is a formal structure that is closely related to a numerical sequence, but allows us to manipulate the sequence
as a single entity, with the goal of understanding it better. Here’s the formal definition.
$$f(x) = a_0 + a_1 x + \cdots + a_n x^n + \cdots = \sum_{i=0}^{\infty} a_i x^i \tag{7.1.1}$$
So $a_n$, the nth term of the sequence, is the coefficient of $x^n$ in f(x).
Example 7.1.1
3) $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}, 0, 0, 0, \ldots$
has generating function
$$\binom{n}{0} + \binom{n}{1}x + \cdots + \binom{n}{n}x^n = (1 + x)^n \tag{7.1.4}$$
$$f(x) = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i \tag{7.1.5}$$
These generating functions can be manipulated. For example, if f (x) is as in Example 7.1.2 (4), suppose we take the product
(1 − x)f (x). We have
$$\begin{aligned}
(1 - x)f(x) &= (1 - x)(1 + x + x^2 + x^3 + x^4 + \cdots) \\
&= (1 + x + x^2 + x^3 + x^4 + \cdots) - (x + x^2 + x^3 + x^4 + x^5 + \cdots) \\
&= 1
\end{aligned} \tag{7.1.6}$$
Dividing through by 1 − x, we see that $f(x) = \frac{1}{1 - x}$.
This may seem artificial and rather nonsensical since the generating function was defined as a formal object whose coefficients are
a sequence that interests us. In fact, although we won’t delve into the formalities in this course, algebraic manipulation of
generating functions can be formally defined, and gives us exactly these results.
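One way to get a feel for this kind of manipulation is to represent a generating function by a list of its first few coefficients and multiply the lists as polynomials, discarding high powers. The sketch below does this for $(1 - x)f(x)$; the function name `multiply` and the truncation at 8 terms are assumptions for the example, and truncation is only a stand-in for the formal treatment mentioned above.

```python
def multiply(a, b, terms):
    """Multiply two coefficient lists, keeping only powers x^0 .. x^(terms-1)."""
    out = [0] * terms
    for i, a_i in enumerate(a[:terms]):
        # Only pair a_i with b_j when i + j stays below the cut-off.
        for j, b_j in enumerate(b[:terms - i]):
            out[i + j] += a_i * b_j
    return out

terms = 8
geometric = [1] * terms   # 1 + x + x^2 + ... (truncated)
one_minus_x = [1, -1]     # 1 - x

print(multiply(one_minus_x, geometric, terms))  # [1, 0, 0, 0, 0, 0, 0, 0]
```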
A reasonable question at this point might be, what use is this? Even if we agree that $f(x) = \frac{1}{1 - x}$, what we really want is the coefficient of $x^n$ (in order to retrieve $a_n$, the nth term of our sequence). If we have an expression like $\frac{1}{1 - x}$, how can we work
Exercise 7.1.1
For each of the following sequences, give the corresponding generating function.
1. 1, 3, 5, 0, 0, 0, . . .
2. $1, 2, 2^2, 2^3, 2^4, \ldots$
3. 1, 5, 10, 15, 10, 5, 1, 0, 0, 0, . . .
4. 1, 5, 10, 10, 5, 1, 0, 0, 0, . . .
This page titled 7.1: What is a Generating Function? is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
7.2: The Generalized Binomial Theorem
We are going to present a generalised version of the special case of Theorem 3.3.1, the Binomial Theorem, in which the exponent is
allowed to be negative. Recall that the Binomial Theorem states that
$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r \tag{7.2.1}$$
So if we were allowed negative exponents in the Binomial Theorem, then a change of variable y = −x would allow us to calculate the coefficient of $x^n$ in f(x).
Of course, if n is negative in the Binomial Theorem, we can’t figure out anything unless we have a definition for what $\binom{n}{r}$ means under these circumstances.
$$\binom{n}{r} = \frac{n(n - 1) \cdots (n - r + 1)}{r!} \tag{7.2.3}$$
Notice that this coincides with the usual definition for the binomial coefficient when n is a positive integer, since
$$\frac{n!}{(n - r)!} = n(n - 1) \cdots (n - r + 1) \tag{7.2.4}$$
in this case.
Example 7.2.1
$$\binom{-2}{5} = \frac{(-2)(-3)(-4)(-5)(-6)}{5!} = -6$$
r
) .
Proposition 7.2.1
Proof
We have
$$\binom{-n}{r} = \frac{-n(-n - 1) \cdots (-n - r + 1)}{r!}.$$
Taking a factor of (−1) out of each term on the right-hand side gives
$$(-1)^r \frac{n(n + 1) \cdots (n + r - 1)}{r!}.$$
Now,
$$(n + r - 1)(n + r - 2) \cdots n = \frac{(n + r - 1)!}{(n - 1)!}$$
so
$$(-1)^r \frac{n(n + 1) \cdots (n + r - 1)}{r!} = (-1)^r \frac{(n + r - 1)!}{r!\,(n - 1)!} = (-1)^r \binom{n + r - 1}{r},$$
as claimed.
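The generalised binomial coefficient in (7.2.3) is easy to compute directly, and the identity just proved can then be spot-checked for small values. This is a sketch only; the function name `generalized_binomial` is an assumption, and exact fractions are used so that non-integer upper arguments would also work.

```python
from fractions import Fraction
from math import comb, factorial

def generalized_binomial(n, r):
    """n(n-1)...(n-r+1) / r!, for any rational n and integer r >= 0."""
    numerator = Fraction(1)
    for i in range(r):
        numerator *= Fraction(n) - i
    return numerator / factorial(r)

print(generalized_binomial(-2, 5))  # -6

# Spot-check: C(-n, r) = (-1)^r C(n + r - 1, r) for small positive n.
identity_holds = all(
    generalized_binomial(-n, r) == (-1) ** r * comb(n + r - 1, r)
    for n in range(1, 7)
    for r in range(9)
)
print(identity_holds)  # True
```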
With this definition, the binomial theorem generalises just as we would wish. We won’t prove this.
For any n ∈ R ,
$$(1 + x)^n = \sum_{r=0}^{\infty} \binom{n}{r} x^r \tag{7.2.6}$$
Example 7.2.2
Let’s check that this gives us the correct values for the coefficients of f (x) in Example 7.1.2 (4), which we already know.
Solution
We have
$$f(x) = (1 - x)^{-1} = (1 + y)^{-1}$$
where y = −x. The Generalised Binomial Theorem tells us that the coefficient of $y^r$ will be
$$\binom{-1}{r} = (-1)^r \binom{1 + r - 1}{r} = (-1)^r$$
since $\binom{r}{r} = 1$. But we want the coefficient of $x^r$, not of $y^r$, and
$$y^r = (-x)^r = (-1)^r x^r$$
so we have
$$(-1)^r y^r = (-1)^{2r} x^r = 1^r x^r = x^r$$
Thus, the coefficient of $x^r$ in f(x) is 1. This is, indeed, precisely the sequence we started with in Example 7.1.2 (4).
Example 7.2.3
$\binom{-3}{r}$ gives, for various values of r. By Proposition 7.2.1, we have
$$\binom{-3}{r} = (-1)^r \binom{3 + r - 1}{r} = (-1)^r \binom{r + 2}{r} = (-1)^r \frac{(r + 2)(r + 1)}{2}$$
When r = 0, this is $(-1)^0 \frac{2 \cdot 1}{2} = 1$. When r = 1, this is $(-1)^1 \frac{3 \cdot 2}{2} = -3$. When r = 2, this is $(-1)^2 \frac{4 \cdot 3}{2} = 6$. In general, we see that
$$(1 + x)^{-3} = 1 - 3x + 6x^2 - + \cdots + (-1)^n \frac{(n + 2)(n + 1)}{2} x^n + \cdots$$
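One informal way to check this expansion is to multiply the claimed series by $(1 + x)^3 = 1 + 3x + 3x^2 + x^3$ and confirm that every coefficient beyond the constant term cancels. The sketch below does this for the first ten coefficients; the variable names and the cut-off are assumptions for the example.

```python
terms = 10

# Claimed coefficients of (1 + x)^(-3): (-1)^r (r + 2)(r + 1) / 2.
series = [(-1) ** r * (r + 2) * (r + 1) // 2 for r in range(terms)]

# Coefficients of (1 + x)^3.
cube = [1, 3, 3, 1]

# Coefficient of x^k in the product, for k below the cut-off.
product = [
    sum(cube[i] * series[k - i] for i in range(min(k, 3) + 1))
    for k in range(terms)
]

print(series[:4])  # [1, -3, 6, -10]
print(product)     # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```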
Exercise 7.2.1
2. The coefficient of $x^4$ in $(1 - x)^{-2}$.
3. The coefficient of $x^n$ in $(1 + x)^{-4}$.
4. The coefficient of $x^{k-1}$ in $\frac{1 + x}{(1 - 2x)^5}$.
Hint: Notice that $\frac{1 + x}{(1 - 2x)^5} = (1 - 2x)^{-5} + x(1 - 2x)^{-5}$. Work out the coefficient of $x^n$ in $(1 - 2x)^{-5}$ and in $x(1 - 2x)^{-5}$, substitute n = k − 1, and add the two coefficients.
5. The coefficient of $x^k$ in $\frac{1}{(1 - x^j)^n}$, where j and n are fixed positive integers. Hint: Think about what conditions will make
This page titled 7.2: The Generalized Binomial Theorem is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
7.3: Using Generating Functions To Count Things
As you might expect of something that has come up in our study of enumeration, generating functions can be useful in solving
problems about counting. We’ve already seen from the Binomial Theorem, that the coefficient of $x^r$ in $(1 + x)^n$, is $\binom{n}{r}$, so the generating function for the binomial coefficients is $(1 + x)^n$. In fact, the argument we used to prove the Binomial Theorem explained why this works: if we want the coefficient of $x^r$ in $(1 + x)^n$, it must be the number of ways of choosing the x from r of the n factors, while choosing the 1 from the other factors. We can use similar reasoning to solve other counting questions.
Example 7.3.1
The grocery store sells paper plates in packages of 1, 5, 20 , or 75 . In how many different ways can Jiping buy a total of 95
paper plates?
Solution
We model this with generating functions. The exponent of x will represent the number of paper plates, and the coefficient of $x^n$ will represent the number of ways in which he can buy n paper plates.
We begin by considering the single paper plates that he buys. He could buy 0, or 1, or any other number of these, so we
represent this by the generating function.
∞ 1
2 3 4 i
1 +x +x +x + x +. . . = ∑ x =
i=0
(1 − x)
There is exactly one way of choosing any particular number of single paper plates (we are assuming the plates are
indistinguishable).
Now, he could also buy any number of packages of 5 paper plates, but the difference is that each package he buys contributes 5 to the exponent, since it represents 5 plates. We represent this by the generating function
$$1 + x^5 + x^{10} + x^{15} + \dots = \sum_{i=0}^{\infty} x^{5i} = \frac{1}{1 - x^5}.$$
We could multiply this all out to get our answer. We could be a bit more clever, recognising that we only really care about the coefficient of $x^{95}$, and break the problem down into cases depending on how many of the bigger packages he buys. It should be noted that the generating function hasn’t really saved us any work. This approach involves saying, “Well, if he takes the $x^{75}$ from the final factor, then there are only six ways to contribute to the coefficient of $x^{95}$: he could choose an $x^{20}$ from the previous factor and 1s from both of the other factors; or he could choose 1 from the third factor and any of 1, $x^5$, $x^{10}$, $x^{15}$, or $x^{20}$ from the second factor, in each case, choosing whichever term from the first factor brings the exponent up to 95.” This is exactly equivalent to saying, “Well, if he buys a package of 75 plates, then there are only six ways to buy 95 plates in total: he could buy a package of 20 plates and be done; or he could buy 0, 1, 2, 3, or 4 packages of 5 plates, in each case, buying as many single plates as are needed to bring the total up to 95.”
So what’s the advantage of the generating function approach? It comes in a couple of ways. First, it solves multiple problems at
once: if we actually multiply out the generating function above, we will be able to read off not only how many ways there are
of buying 95 plates, but also how many ways there are of buying every number of plates up to 95. (If we hadn’t cut the factors
off as we did, we could also work out the answers for any number of plates higher than 95.) So by doing a bunch of
multiplication once (and it’s easy to feed into a computer algebra system if you don’t want to do it by hand), we can
simultaneously find out the answer to a lot of closely-related questions.
The other advantage is that the generating function approach can help us solve problems that we don’t see how to solve
without it, such as finding an explicit formula for the $n$th term of a recursively-defined sequence.
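The "feed it into a computer" remark in this example can be sketched as follows. This is only an illustration, assuming one factor $1 + x^s + x^{2s} + \dots$ per package size $s$ and cutting every factor off at degree 95; the helper names (`multiply_truncated`, `package_factor`, `ways_to_buy`) are invented for this sketch and are not from the text.

```python
def multiply_truncated(p, q, max_degree):
    """Multiply two polynomials given as coefficient lists (index = exponent),
    discarding every term whose degree exceeds max_degree."""
    result = [0] * (max_degree + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > max_degree:
                break
            result[i + j] += a * b
    return result


def package_factor(size, max_degree):
    """Coefficients of 1 + x^size + x^(2*size) + ..., cut off at max_degree."""
    return [1 if k % size == 0 else 0 for k in range(max_degree + 1)]


def ways_to_buy(package_sizes, max_degree):
    """Coefficient list of the product of one factor per package size."""
    product = [1] + [0] * max_degree
    for size in package_sizes:
        factor = package_factor(size, max_degree)
        product = multiply_truncated(product, factor, max_degree)
    return product


ways = ways_to_buy([1, 5, 20, 75], 95)
print(ways[95])  # number of ways to buy exactly 95 plates
```

The same list `ways` also holds the answer for every smaller total, which is the "multiple problems at once" advantage described above; for instance `ways[20]` counts the six ways of buying 20 plates that appear in the case analysis.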
Here’s an example that involves working out the coefficient of a term in a generating function in two different ways.
Example 7.3.2
Consider the generating function $\left(\frac{1}{1 - x}\right)^4 = (1 + x + x^2 + x^3 + \dots)^4$. As usual, we want to determine the coefficient of $x^r$ in this product.
Solution
We must choose a power of $x$ from each of the four factors, in such a way that the sum of the powers we choose is $r$. This is the same as choosing a total of $r$ items, when the items come in four distinct types (recall, for example, Example 5.1.2). The types are represented by the factor the term is chosen from, and the exponent chosen from that factor is the number of items (xes) chosen of that type. So we know that the number of ways of doing this is $\left(\!\!\binom{4}{r}\!\!\right)$.
The Generalised Binomial Theorem gives the same coefficient a second way: the coefficient of $x^r$ in $(1 - x)^{-4}$ is
$$(-1)^r \binom{-4}{r} = (-1)^r (-1)^r \binom{r+3}{r} = \left(\!\!\binom{4}{r}\!\!\right).$$
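As a quick numerical check of this example, the sketch below multiplies out four copies of $1 + x + x^2 + \dots$ (cut off at an arbitrary degree chosen for illustration) and compares each coefficient with $\binom{r+3}{r}$. The cut-off `MAX_DEGREE` and the helper `truncated_product` are assumptions of the sketch, not part of the text.

```python
from math import comb

MAX_DEGREE = 12  # illustrative cut-off; any non-negative integer works


def truncated_product(p, q, max_degree):
    """Product of two coefficient lists, keeping terms up to max_degree."""
    result = [0] * (max_degree + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= max_degree:
                result[i + j] += a * b
    return result


geometric = [1] * (MAX_DEGREE + 1)      # 1 + x + x^2 + ... up to x^MAX_DEGREE
power = [1] + [0] * MAX_DEGREE          # the constant polynomial 1
for _ in range(4):
    power = truncated_product(power, geometric, MAX_DEGREE)

# Multichoose(4, r) written as an ordinary binomial coefficient.
closed_form = [comb(r + 3, r) for r in range(MAX_DEGREE + 1)]
print(power == closed_form)
```

Truncating is harmless here because terms of degree above `MAX_DEGREE` can never contribute to a coefficient of degree at most `MAX_DEGREE`.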
We’ll use the above example to work out a counting question, but first we need an observation.
Proposition 7.3.1
For any non-negative integer $k$,
$$1 + x + \dots + x^k = \frac{1 - x^{k+1}}{1 - x}.$$
You can prove this by induction on $k$ (this is one of the exercises below), or by multiplying through by $1 - x$.
Example 7.3.3
Trent is playing a dice game, using 12-sided dice. How many ways are there for him to roll a total of 24 on his four dice?
Answer
Each die can roll any number between 1 and 12, and there are four dice, so the appropriate generating function is
$$(x + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^8 + x^9 + x^{10} + x^{11} + x^{12})^4.$$
Rolling an $i$ on one of the dice corresponds to choosing $x^i$ from the corresponding factor of this generating function. We are looking for the coefficient of $x^{24}$; this will tell us the number of ways of rolling a total of 24.
It turns out that by manipulating the generating function, we can work this out a bit more easily than by multiplying this out. By taking a common factor of $x$ out of each of the four factors, our generating function can be re-written as
$$x^4 (1 + x + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^8 + x^9 + x^{10} + x^{11})^4.$$
The coefficient of $x^{24}$ in this is the coefficient of $x^{20}$ in $(1 + x + \dots + x^{11})^4$. By Proposition 7.3.1, $1 + x + \dots + x^{11} = \frac{1 - x^{12}}{1 - x}$, so
$$(1 + x + \dots + x^{11})^4 = (1 - x^{12})^4 (1 - x)^{-4},$$
and by the Binomial Theorem,
$$(1 - x^{12})^4 = 1 - 4x^{12} + 6x^{24} - 4x^{36} + x^{48}.$$
Most of these terms can be ignored, as they will not contribute to the coefficient of $x^{20}$. Recall that the function we’re interested in is the product of this with $(1 - x)^{-4}$, and there are only two ways of getting an $x^{20}$ term from this product: by taking the constant term that we’ve just worked out, and multiplying it by the $x^{20}$ term from $(1 - x)^{-4}$; or by taking the $x^{12}$ term that we’ve just worked out, and multiplying it by the $x^8$ term from $(1 - x)^{-4}$. In the previous example, we worked out that the coefficient of $x^{20}$ in $(1 - x)^{-4}$ is $\left(\!\!\binom{4}{20}\!\!\right)$, and the coefficient of $x^8$ is $\left(\!\!\binom{4}{8}\!\!\right)$.
Thus, the number of ways in which Trent can roll a total of 24 on his four dice is the coefficient of $x^{24}$ in our generating function, which is
$$\left(\!\!\binom{4}{20}\!\!\right) - 4\left(\!\!\binom{4}{8}\!\!\right) = 1771 - 660 = 1111.$$
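The two routes to this answer can be compared directly. The following is a minimal sketch: it expands $(x + x^2 + \dots + x^{12})^4$ by brute-force multiplication and, separately, evaluates the multichoose expression from the answer. The function names `poly_power` and `multichoose` are invented for illustration.

```python
from math import comb


def poly_power(base, exponent):
    """Raise a polynomial (coefficient list, index = exponent of x) to a power."""
    result = [1]
    for _ in range(exponent):
        product = [0] * (len(result) + len(base) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(base):
                product[i + j] += a * b
        result = product
    return result


def multichoose(n, r):
    """Number of ways to choose r items from n types, repetition allowed."""
    return comb(n + r - 1, r)


one_die = [0] + [1] * 12            # x + x^2 + ... + x^12
four_dice = poly_power(one_die, 4)
direct = four_dice[24]              # coefficient of x^24 by full expansion

shortcut = multichoose(4, 20) - 4 * multichoose(4, 8)
print(direct, shortcut)
```

Both numbers agree with the 1111 found above, and the coefficients of `four_dice` sum to $12^4$, the total number of ordered rolls.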
Exercise 7.3.1
This page titled 7.3: Using Generating Functions To Count Things is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
7.4: Summary
If $n > 0$ is an integer, then $\binom{-n}{r} = (-1)^r \binom{n+r-1}{r}$.
The Generalised Binomial Theorem
$1 + x + \dots + x^k = \frac{1 - x^{k+1}}{1 - x}$
This page titled 7.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
This page titled 8: Generating Functions and Recursion is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
8.1: Partial Fractions
If a generating function looks like $\frac{1}{(1 + ax^i)^j}$, we can use the Generalised Binomial Theorem to find the coefficient of $x^r$. But what can we do if the generating function looks like $\frac{1}{a + bx + cx^2}$, for example, or even more complicated expressions?
One tool that can help us extract coefficients from some expressions like this, is the method of partial fractions.
Example 8.1.1
Find the coefficient of $x^r$ in $f(x) = \frac{1 + x}{(1 - 2x)(2 + x)}$.
Solution
Well, if we could separate the factors of the denominator, we would know how to deal with each separately. In fact, this is
exactly what we do. We set
$$f(x) = \frac{1 + x}{(1 - 2x)(2 + x)} = \frac{A}{1 - 2x} + \frac{B}{2 + x}.$$
As we work this through, you’ll see that in working this out, we end up with two equations in the two unknowns A and B ,
which we can therefore solve! So it is possible to “split up” the original generating function, into two separate fractions, each
of which has as its denominator one of the factors of the original denominator. This is the method of partial fractions.
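To solve for $A$ and $B$, we add the fractions $\frac{A}{1 - 2x}$ and $\frac{B}{2 + x}$ over a common denominator. This gives
$$\frac{1 + x}{(1 - 2x)(2 + x)} = \frac{A(2 + x) + B(1 - 2x)}{(1 - 2x)(2 + x)},$$
so $A(2 + x) + B(1 - 2x) = 1 + x$. The constant term gives $2A + B = 1$, and the coefficient of $x$ gives $A - 2B = 1$. Substituting $B = 1 - 2A$ into the second equation gives $5A - 2 = 1$, that is, $5A = 3$, so $A = \frac{3}{5}$. Now
$$B = 1 - 2\left(\frac{3}{5}\right) = \frac{-1}{5}.$$
Thus, we have
$$f(x) = \frac{3/5}{1 - 2x} - \frac{1/5}{2 + x}.$$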
Notice that the $2 + x$ is still a bit problematic. We can use the Generalised Binomial Theorem to work out coefficients for something that looks like $(1 + ax^i)^j$, but we need that 1, and here instead we have a 2. To deal with this, we observe that
$$2 + x = 2\left(1 + \left(\tfrac{1}{2}\right)x\right).$$
Thus,
$$f(x) = \frac{3/5}{1 - 2x} - \frac{1/10}{1 + \left(\frac{1}{2}\right)x}.$$
Now, $\frac{3}{5}(1 - 2x)^{-1} = \frac{3}{5}\left(1 + 2x + (2x)^2 + (2x)^3 + \dots\right)$, so the coefficient of $x^r$ in this part is $\left(\frac{3}{5}\right)2^r$. Also,
$$\frac{-1}{10}\left(1 + \left(\tfrac{1}{2}\right)x\right)^{-1} = \frac{-1}{10}\left(1 - \tfrac{1}{2}x + \left(\tfrac{1}{2}x\right)^2 - \left(\tfrac{1}{2}x\right)^3 + - \dots\right),$$
so the coefficient of $x^r$ in this part is $\left(\frac{-1}{10}\right)(-1)^r\left(\frac{1}{2}\right)^r$.
Thus, the coefficient of $x^r$ in $f(x)$ is $\left(\frac{3}{5}\right)(2^r) - \frac{1}{10}\left(\frac{-1}{2}\right)^r$.
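One way to double-check a coefficient formula like this is to expand the original fraction as a power series and compare. The sketch below does that with exact rational arithmetic; the helper `series_coefficients` is an assumption of the sketch (it uses the fact that if $q(x)s(x) = p(x)$ then each coefficient of $s$ is determined by the earlier ones), and the denominator is expanded by hand as $(1 - 2x)(2 + x) = 2 - 3x - 2x^2$.

```python
from fractions import Fraction


def series_coefficients(numerator, denominator, count):
    """First `count` power-series coefficients of numerator/denominator.

    Both arguments are coefficient lists (index = exponent of x), and
    denominator[0] must be non-zero.
    """
    coeffs = []
    for n in range(count):
        total = Fraction(numerator[n]) if n < len(numerator) else Fraction(0)
        for k in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[k] * coeffs[n - k]
        coeffs.append(total / denominator[0])
    return coeffs


# f(x) = (1 + x) / ((1 - 2x)(2 + x)) = (1 + x) / (2 - 3x - 2x^2)
expanded = series_coefficients([1, 1], [2, -3, -2], 10)

# Closed form from the partial-fraction calculation above.
closed = [
    Fraction(3, 5) * 2 ** r - Fraction(1, 10) * Fraction(-1, 2) ** r
    for r in range(10)
]
print(expanded == closed)
```

Using `Fraction` rather than floating point keeps the comparison exact, so agreement on the first ten coefficients is not an artefact of rounding.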
The method of partial fractions can be applied to any generating function that has a denominator that can be factored into simpler
terms. However, polynomials of degree 3 or higher can become hard to factor, so we’ll mostly restrict our attention to applying this
either with denominators that are already factored, or with denominators that have degree at most two.
There is an extra trick that you should be aware of. This arises if the denominator is divisible by a square. For example, if we are
looking for the coefficient of $x^r$ in
$$g(x) = \frac{1 + x}{(1 - 2x)^2 (2 + x)}$$
then it doesn’t make sense to separate all of the factors out as before, because
$$g(x) = \frac{A}{1 - 2x} + \frac{B}{1 - 2x} + \frac{C}{2 + x} = \frac{A + B}{1 - 2x} + \frac{C}{2 + x}$$
and when we add this up, the denominator will be $(1 - 2x)(2 + x)$ rather than $(1 - 2x)^2(2 + x)$. This can be dealt with in either of two ways. First, you can include both $1 - 2x$ and $(1 - 2x)^2$ as denominators:
$$g(x) = \frac{A}{1 - 2x} + \frac{B}{(1 - 2x)^2} + \frac{C}{2 + x}.$$
The second option is to include only $(1 - 2x)^2$ as one of the denominators, but to include an $x$ in the corresponding numerator, in addition to the constant term:
$$g(x) = \frac{Ax + B}{(1 - 2x)^2} + \frac{C}{2 + x}.$$
Either of these methods can be generalised in natural ways to cases where the denominator is divisible by some higher power.
We’ll see more examples of partial fractions applied to specific situations, so we’ll leave the explanation there for now.
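To make the first option concrete, here is a hedged sketch for $g(x) = \frac{1 + x}{(1 - 2x)^2(2 + x)}$. The constants $A = -\frac{2}{25}$, $B = \frac{3}{5}$, $C = -\frac{1}{25}$ were worked out by hand for this sketch and do not appear in the text; the code checks them by recombining the three fractions over the common denominator. The coefficient formula then uses the fact that the coefficient of $x^r$ in $(1 - 2x)^{-2}$ is $(r + 1)2^r$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    result = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_add(*polys):
    """Sum of several polynomials given as coefficient lists."""
    length = max(len(p) for p in polys)
    return [
        sum((p[k] for p in polys if k < len(p)), Fraction(0))
        for k in range(length)
    ]


# Assumed values, found by hand for this illustration.
A, B, C = Fraction(-2, 25), Fraction(3, 5), Fraction(-1, 25)

# A(1 - 2x)(2 + x) + B(2 + x) + C(1 - 2x)^2 should equal 1 + x.
numerator = poly_add(
    poly_mul([A], poly_mul([1, -2], [2, 1])),
    poly_mul([B], [2, 1]),
    poly_mul([C], poly_mul([1, -2], [1, -2])),
)
print(numerator == [1, 1, 0])


def coefficient(r):
    """Coefficient of x^r in g(x), term by term from the three fractions."""
    return A * 2 ** r + B * (r + 1) * 2 ** r + (C / 2) * Fraction(-1, 2) ** r


print([coefficient(r) for r in range(3)])
```

The factor $\frac{C}{2}$ comes from rewriting $2 + x$ as $2\left(1 + \frac{1}{2}x\right)$, exactly as in Example 8.1.1.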
Exercise 8.1.1
Find the coefficient of $x^r$ in each of the following generating functions, using the method of partial fractions and the Generalised Binomial Theorem.
3. $\frac{1 + 2x}{(1 - 2x)(2 + x)(1 + x)}$
This page titled 8.1: Partial Fractions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
8.2: Factoring Polynomials
You should be familiar with the quadratic formula, which allows us to factor any polynomial of degree two, into linear factors.
Specifically, it tells us that the roots of $ax^2 + bx + c$ are
$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \tag{8.2.1}$$
so that
$$ax^2 + bx + c = a\left(x - \left(\frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\right)\left(x - \left(\frac{-b - \sqrt{b^2 - 4ac}}{2a}\right)\right). \tag{8.2.2}$$
Recall that in order to use the Generalised Binomial Theorem, we need the constant term to be 1. If you are very comfortable with
algebraic manipulations, you can use the quadratic formula to factor as above, and then divide each factor by the appropriate value
so as to make the constant term 1. This may create a messy constant outside the whole thing, and a messy coefficient of x in each
term, but if you are careful, you can get the correct answer this way.
If you are more confident in memorising another formula (closely related to the quadratic formula) for factoring $ax^2 + bx + c$, you can also factor a quadratic polynomial directly into the form we want, using the following formula:
$$ax^2 + bx + c = c\left(1 - \frac{-b + \sqrt{b^2 - 4ac}}{2c}\,x\right)\left(1 - \frac{-b - \sqrt{b^2 - 4ac}}{2c}\,x\right). \tag{8.2.3}$$
Sometimes a denominator will already be factored in the formula for a generating function, but when it isn’t, either of the above
methods can be used to factor it.
Example 8.2.1
Factor $3x^2 - 2x + 1$ into linear factors.
Solution
We will use the formula given above. We have $a = 3$, $b = -2$, and $c = 1$. Then
$$\begin{aligned}
3x^2 - 2x + 1 &= \left(1 - \frac{2 + \sqrt{4 - 12}}{2}\,x\right)\left(1 - \frac{2 - \sqrt{4 - 12}}{2}\,x\right) \\
&= (1 - (1 + i\sqrt{2})x)(1 - (1 - i\sqrt{2})x).
\end{aligned}$$
It is always a good idea to check your result, by multiplying the factors back out.
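The "multiply the factors back out" check can also be done numerically. This is a small sketch, assuming the factoring formula (8.2.3) and Python's `cmath` module for complex square roots; the function name `factor_constant_one` is invented for illustration.

```python
import cmath


def factor_constant_one(a, b, c):
    """Return (alpha, beta) with a*x^2 + b*x + c = c*(1 - alpha*x)*(1 - beta*x).

    This is formula (8.2.3); c must be non-zero.
    """
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * c), (-b - root) / (2 * c)


a, b, c = 3, -2, 1
alpha, beta = factor_constant_one(a, b, c)

# Multiply c(1 - alpha x)(1 - beta x) back out:
#   c*alpha*beta x^2  -  c*(alpha + beta) x  +  c
rebuilt = (c * alpha * beta, -c * (alpha + beta), c)
print(alpha, beta)
print(rebuilt)
```

Because the roots are computed in floating point, the rebuilt coefficients match $3$, $-2$, $1$ only up to rounding error, so any comparison should use a small tolerance.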
When coefficients in the factorisation get ugly (even complex, as in the example above), you might find the algebra involved in
working out the coefficients hard to deal with. Let’s work through an example of this, using the factorisation we’ve just completed.
Example 8.2.2
Find the coefficient of $x^r$ in
$$f(x) = \frac{1}{3x^2 - 2x + 1}.$$
Solution
We have determined in the previous example, that
$$3x^2 - 2x + 1 = (1 - (1 + i\sqrt{2})x)(1 - (1 - i\sqrt{2})x),$$
so
$$\begin{aligned}
f(x) &= \frac{1}{3x^2 - 2x + 1} \\
&= \frac{A}{1 - (1 + i\sqrt{2})x} + \frac{B}{1 - (1 - i\sqrt{2})x} \\
&= \frac{A(1 - (1 - i\sqrt{2})x) + B(1 - (1 + i\sqrt{2})x)}{3x^2 - 2x + 1}.
\end{aligned}$$
Thus,
$$A(1 - (1 - i\sqrt{2})x) + B(1 - (1 + i\sqrt{2})x) = 1 + 0x,$$
so the constant term gives $A + B = 1$, while the coefficient of $x$ gives $A(1 - i\sqrt{2}) + B(1 + i\sqrt{2}) = 0$. Substituting $B = 1 - A$ into the latter equation, gives
$$A - i\sqrt{2}A + 1 + i\sqrt{2} - A - i\sqrt{2}A = 0,$$
so $1 + i\sqrt{2} = 2i\sqrt{2}A$. Hence
$$A = \frac{1 + i\sqrt{2}}{2i\sqrt{2}} = \frac{1}{2\sqrt{2}i} + \frac{1}{2}.$$
We make the denominator of the first fraction rational, by multiplying numerator and denominator by $\sqrt{2}i$, giving
$$A = -\frac{\sqrt{2}i}{4} + \frac{1}{2}.$$
Thus we have
$$f(x) = \frac{\frac{2 - \sqrt{2}i}{4}}{1 - (1 + i\sqrt{2})x} + \frac{\frac{2 + \sqrt{2}i}{4}}{1 - (1 - i\sqrt{2})x}.$$
Using the Generalised Binomial Theorem and $y = (1 + i\sqrt{2})x$, we see that the first fraction expands as
$$\left(\frac{2 - \sqrt{2}i}{4}\right)(1 + y + y^2 + y^3 + \dots),$$
and the coefficient of $x^r$ in this, will be $\left(\frac{2 - \sqrt{2}i}{4}\right)(1 + i\sqrt{2})^r$. Similarly, with $y = (1 - i\sqrt{2})x$, the second fraction expands as
$$\left(\frac{2 + \sqrt{2}i}{4}\right)(1 + y + y^2 + y^3 + \dots),$$
and the coefficient of $x^r$ in this, will be $\left(\frac{2 + \sqrt{2}i}{4}\right)(1 - i\sqrt{2})^r$.
Thus, the coefficient of $x^r$ in $f(x)$ is
$$\left(\frac{2 - \sqrt{2}i}{4}\right)(1 + i\sqrt{2})^r + \left(\frac{2 + \sqrt{2}i}{4}\right)(1 - i\sqrt{2})^r.$$
You can see from this example that the algebra can get ugly, but the process of finding the coefficient of $x^r$ is nonetheless straightforward.
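Since the final expression involves complex numbers but must produce real coefficients, a numerical sanity check is reassuring. The sketch below compares the closed form with coefficients generated from $(1 - 2x + 3x^2)f(x) = 1$, which forces $u_0 = 1$, $u_1 = 2$ and $u_r = 2u_{r-1} - 3u_{r-2}$ for the coefficients $u_r$ of $f(x)$; that recurrence and the names `closed_form` and `by_recurrence` are assumptions of the sketch, not taken from the text.

```python
import math

SQRT2 = math.sqrt(2)


def closed_form(r):
    """Coefficient of x^r from the partial-fraction formula (complex arithmetic)."""
    first = complex(2, -SQRT2) / 4 * complex(1, SQRT2) ** r
    second = complex(2, SQRT2) / 4 * complex(1, -SQRT2) ** r
    return first + second


def by_recurrence(count):
    """Coefficients u_0, ..., u_{count-1} from u_r = 2*u_{r-1} - 3*u_{r-2}."""
    values = [1, 2]
    while len(values) < count:
        values.append(2 * values[-1] - 3 * values[-2])
    return values[:count]


for r, exact in enumerate(by_recurrence(8)):
    value = closed_form(r)
    # The imaginary parts cancel, so only rounding noise should remain.
    print(r, exact, round(value.real), abs(value.imag) < 1e-9)
```

The two complex terms are conjugates of each other, which is why their sum is always real.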
Exercise 8.2.1
For each of the generating functions given, factor the denominator and use the method of partial fractions to determine the
coefficient of $x^r$.
1. $\frac{x}{x^2 + 5x - 1}$
2. $\frac{2 + x}{2x^2 + x - 1}$
3. $\frac{x}{x^2 - 3x + 1}$
This page titled 8.2: Factoring Polynomials is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
8.3: Using Generating Functions to Solve Recursively-Defined Sequences
At last, we are ready to apply the mechanics we’ve introduced in this chapter, to find an explicit formula for the $n$th term of a recursively-defined sequence.
This method is probably most easily understood using examples.
Example 8.3.1
Consider the recursively-defined sequence: $a_0 = 2$, and for every $n \ge 1$, $a_n = 3a_{n-1} - 1$. Find an explicit formula for $a_n$ in terms of $n$.
Solution
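The generating function for this sequence is $a(x) = \sum_{i=0}^{\infty} a_i x^i$.
Now, we are going to use the recursive relation. We know that $a_n = 3a_{n-1} - 1$, or, by rearranging this, $a_n - 3a_{n-1} = -1$. We therefore subtract $3x\,a(x)$ from $a(x)$, so that the coefficient of $x^n$ becomes $a_n - 3a_{n-1}$:
$$\begin{aligned}
a(x) &= a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \\
-3x\,a(x) &= \phantom{a_0} - 3a_0 x - 3a_1 x^2 - \dots - 3a_{n-1} x^n - \dots \\
\therefore\ (1 - 3x)a(x) &= a_0 - x - x^2 - \dots - x^n - \dots
\end{aligned}$$
Since $a_0 = 2$ and $1 + x + x^2 + \dots = \frac{1}{1 - x}$, the right-hand side is $3 - (1 + x + x^2 + \dots)$, so $(1 - 3x)a(x) = 3 - \frac{1}{1 - x}$. Dividing through by $1 - 3x$ gives
$$a(x) = \frac{3}{1 - 3x} - \frac{1}{(1 - x)(1 - 3x)}.$$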
Now it’s time to apply what we learned in the preceding sections of this chapter. The denominator is already factored, so we
can immediately apply the method of partial fractions to the second fraction. If
$$\frac{-1}{(1 - x)(1 - 3x)} = \frac{A}{1 - x} + \frac{B}{1 - 3x} = \frac{A(1 - 3x) + B(1 - x)}{(1 - 3x)(1 - x)},$$
then $A + B = -1$ and $-3A - B = 0$, so $B = -3A$, which gives $-2A = -1$, so $A = \frac{1}{2}$ and $B = -\frac{3}{2}$. Thus,
$$a(x) = \frac{3}{1 - 3x} + \frac{1/2}{1 - x} - \frac{3/2}{1 - 3x} = \frac{1/2}{1 - x} + \frac{3/2}{1 - 3x}.$$
The coefficient of $x^n$ in the first of these terms is $\frac{1}{2}$, while in the second term, the coefficient of $x^n$ is $\left(\frac{3}{2}\right)3^n$. Thus, $a_n = \frac{1}{2} + \left(\frac{3}{2}\right)3^n$. Since our generating function began with $a_0 x^0$, this formula applies for every $n \ge 0$.
When going through so much algebra, it’s easy to make a mistake somewhere along the way, so it’s wise to do some double-
checking. For a recursively-defined sequence, if the formula you work out gives the correct answer for the first three or four
terms of the sequence, then it’s very likely that you’ve done the calculations correctly. Let’s check the first three terms of this
one. We know from our initial condition that $a_0 = 2$, and our new formula gives
$$a_0 = \frac{1}{2} + \left(\frac{3}{2}\right)3^0 = \frac{1}{2} + \frac{3}{2} = 2.$$
Using the recursive relation, we should have $a_1 = 3(2) - 1 = 5$, and our formula gives
$$a_1 = \frac{1}{2} + \left(\frac{3}{2}\right)3^1 = \frac{1}{2} + \frac{9}{2} = 5.$$
Finally, the recursive relation gives $a_2 = 3(5) - 1 = 14$, while our formula gives
$$a_2 = \frac{1}{2} + \left(\frac{3}{2}\right)3^2 = \frac{1}{2} + \frac{27}{2} = 14.$$
You can see the benefit to having an explicit formula if you were asked to work out $a_{100}$. Clearly, it’s much easier to determine $\frac{1}{2} + \left(\frac{3}{2}\right)3^{100}$ than to apply the recursive relation one hundred times.
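The double-checking described here is easy to automate. A minimal sketch, with function names invented for illustration: `a_recursive` applies the recursive relation step by step, while `a_explicit` evaluates $\frac{1}{2} + \left(\frac{3}{2}\right)3^n$ rewritten as $\frac{1 + 3^{n+1}}{2}$ so that only integers are involved.

```python
def a_recursive(n):
    """a_n from a_0 = 2 and a_n = 3*a_{n-1} - 1, applied n times."""
    value = 2
    for _ in range(n):
        value = 3 * value - 1
    return value


def a_explicit(n):
    """a_n = 1/2 + (3/2)*3^n, written as (1 + 3^(n+1)) / 2.

    1 + 3^(n+1) is always even, so integer division is exact.
    """
    return (1 + 3 ** (n + 1)) // 2


print([a_explicit(n) for n in range(5)])
print(a_explicit(100) == a_recursive(100))
```

Writing the formula with integers avoids the rounding that floating-point arithmetic would introduce for a value as large as $a_{100}$.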
Let’s look at one more example, where the recursive relation involves more than one previous term.
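Example 8.3.2
Consider the recursively-defined sequence: $b_0 = 1$, $b_1 = 0$, $b_2 = 1$, and for every $n \ge 3$, $b_n = b_{n-1} - 2b_{n-3}$. Find an explicit formula for $b_n$ in terms of $n$.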
Solution
The generating function for this sequence is
$b(x) = \sum_{i=0}^{\infty} b_i x^i$.
Again, we’ll use the recursive relation, which we rearrange as
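$$b_n - b_{n-1} + 2b_{n-3} = 0$$
for every $n \ge 3$. We want to end up with a polynomial in which the coefficient of $x^m$ looks like $b_m - b_{m-1} + 2b_{m-3}$, so that we’ll be able to use the recursive relation to replace this by 0. In order to do this, we’ll take $b(x)$ (to get the $b_m x^m$ piece), minus $xb(x)$ (to get the $-b_{m-1} x^m$ piece), plus $2x^3 b(x)$ (to get the $+2b_{m-3} x^m$ piece).
$$\begin{aligned}
b(x) &= b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \dots + b_m x^m + \dots \\
-xb(x) &= \phantom{b_0} - b_0 x - b_1 x^2 - b_2 x^3 - \dots - b_{m-1} x^m - \dots \\
+2x^3 b(x) &= \phantom{b_0 + b_1 x + b_2 x^2} + 2b_0 x^3 + \dots + 2b_{m-3} x^m + \dots \\
\therefore\ (1 - x + 2x^3)b(x) &= b_0 + (b_1 - b_0)x + (b_2 - b_1)x^2 + 0x^3 + \dots + 0x^m + \dots
\end{aligned}$$
Substituting the initial conditions, the right-hand side is $1 - x + x^2$, so
$$b(x) = \frac{1 - x + x^2}{1 - x + 2x^3}.$$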
If we want to be able to do anything with this, we need to factor the denominator. Although we don’t have a general method for
factoring cubic polynomials, in this case it’s not hard to see that $-1$ is a zero of the polynomial (because $1 - (-1) + 2(-1)^3 = 0$), and hence $x + 1$ is a factor of the polynomial. You will not be expected to factor cubic polynomials yourself in this course, so we won’t review polynomial long division, but if you recall polynomial long division, you can use it to determine that
$$1 - x + 2x^3 = (1 + x)(2x^2 - 2x + 1).$$
In any case, you can multiply the right-hand side out to verify that it is true.
Now it’s time to use the factoring formula, with $a = 2$, $b = -2$, and $c = 1$, to factor $2x^2 - 2x + 1$. This gives
$$2x^2 - 2x + 1 = 1\left(1 - \frac{2 + \sqrt{4 - 8}}{2}\,x\right)\left(1 - \frac{2 - \sqrt{4 - 8}}{2}\,x\right) = (1 - (1 + i)x)(1 - (1 - i)x).$$
Having factored
$$1 - x + 2x^3 = (1 + x)(1 - (1 + i)x)(1 - (1 - i)x),$$
we now apply the method of partial fractions to split this up into three separate pieces.
If
$$\begin{aligned}
\frac{1 - x + x^2}{1 - x + 2x^3} &= \frac{A}{1 + x} + \frac{B}{1 - (1 + i)x} + \frac{C}{1 - (1 - i)x} \\
&= \frac{A(2x^2 - 2x + 1) + B(1 + x)(1 - (1 - i)x) + C(1 + x)(1 - (1 + i)x)}{1 - x + 2x^3}
\end{aligned}$$
then $2A - (1 - i)B - (1 + i)C = 1$ (from the coefficient of $x^2$); $-2A + (1 - (1 - i))B + (1 - (1 + i))C = -1$ (from the coefficient of $x$); and $A + B + C = 1$ (from the constant term). The second of these simplifies to $-2A + iB - iC = -1$.
3
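The algebra can be done in different ways and gets a bit ugly, but these three equations can be solved, resulting in $A = \frac{3}{5}$, $B = \frac{2 - i}{10}$, $C = \frac{2 + i}{10}$.
Thus,
$$\frac{1 - x + x^2}{1 - x + 2x^3} = \frac{3/5}{1 + x} + \frac{(2 - i)/10}{1 - (1 + i)x} + \frac{(2 + i)/10}{1 - (1 - i)x}.$$
The coefficient of $x^n$ in the first of these terms is $\left(\frac{3}{5}\right)(-1)^n$; in the second, it is $\left(\frac{2 - i}{10}\right)(1 + i)^n$, and in the third, it is $\left(\frac{2 + i}{10}\right)(1 - i)^n$. Thus,
$$b_n = \left(\frac{3}{5}\right)(-1)^n + \left(\frac{2 - i}{10}\right)(1 + i)^n + \left(\frac{2 + i}{10}\right)(1 - i)^n.$$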
It is somewhat surprising that these formulas involving complex numbers will always work out (when n is an integer) to be not
only real numbers, but integers! Once again, we should check this formula for several values of n to ensure we haven’t made
errors in our calculations along the way.
From the initial conditions, $b_0 = 1$. Our formula gives
$$b_0 = \frac{3}{5} + \frac{2 - i}{10} + \frac{2 + i}{10} = 1.$$
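Checking further values by hand gets tedious with complex numbers, so here is a small sketch that does it numerically. It assumes the initial values $b_0 = 1$, $b_1 = 0$, $b_2 = 1$ (these are what the numerator $1 - x + x^2$ of $b(x)$ implies) and the recursive relation $b_n = b_{n-1} - 2b_{n-3}$; the function names are invented for illustration.

```python
def b_recursive(count):
    """b_0, ..., b_{count-1} from b_n = b_{n-1} - 2*b_{n-3}."""
    values = [1, 0, 1]          # assumed initial conditions b_0, b_1, b_2
    while len(values) < count:
        values.append(values[-1] - 2 * values[-3])
    return values[:count]


def b_explicit(n):
    """The three-term formula, evaluated with complex floating point."""
    total = (3 / 5) * (-1) ** n
    total += complex(2, -1) / 10 * complex(1, 1) ** n
    total += complex(2, 1) / 10 * complex(1, -1) ** n
    return total


for n, exact in enumerate(b_recursive(10)):
    value = b_explicit(n)
    print(n, exact, round(value.real), abs(value.imag) < 1e-9)
```

The second and third terms are complex conjugates, so their imaginary parts cancel and the real part rounds to the integer given by the recursive relation.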
We now have a general method that we can apply to solve normal linear recursive relations:
Method:
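1) Rearrange the recurrence relation into the form
$$h_n - a_1 h_{n-1} - a_2 h_{n-2} - \dots - a_k h_{n-k} = f(n).$$
2) Define the generating function $h(x) = \sum_{n=0}^{\infty} h_n x^n$, and let $a(x) = 1 - a_1 x - a_2 x^2 - \dots - a_k x^k$.
3) Multiply $h(x)$ by each term of $a(x)$ and add the results, so that the coefficient of $x^i$ is $f(i)$ wherever the recurrence relation applies:
$$\begin{aligned}
h(x) &= h_0 + h_1 x + h_2 x^2 + \dots + h_i x^i + \dots \\
-a_1 x h(x) &= \phantom{h_0} - a_1 h_0 x - a_1 h_1 x^2 - \dots - a_1 h_{i-1} x^i - \dots \\
&\ \ \vdots \\
-a_k x^k h(x) &= -a_k h_{i-k} x^i + \dots \\
\therefore\ a(x)h(x) &= h_0 + (h_1 - a_1 h_0)x + (h_2 - a_1 h_1 - a_2 h_0)x^2 + \dots + f(i)x^i + \dots
\end{aligned} \tag{8.3.4}$$
So
$$h(x) = \frac{h_0 + (h_1 - a_1 h_0)x + \dots + (h_{i-1} - a_1 h_{i-2} - \dots - a_{k-1} h_{i-k})x^{i-1} + \sum_{n=i}^{\infty} f(n)x^n}{a(x)} \tag{8.3.5}$$
4) Factor $a(x)$ (remember that you can use complex roots), and find a closed form for
$$\sum_{n=i}^{\infty} f(n)x^n \tag{8.3.6}$$
5) Use partial fractions to get expressions that we can expand using the generalised binomial theorem.
6) Make variable substitutions if necessary to get forms that look like
$$\frac{A}{(1 + y)^n} \tag{8.3.7}$$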
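The first half of the method (building the fraction for the generating function) can be sketched in code for the homogeneous case, that is, assuming $f(n) = 0$. Everything here is illustrative: the function names are invented, the recurrence is passed as the list of coefficients of $h_{n-1}, \dots, h_{n-k}$, and the example data are the recurrence $b_n = b_{n-1} - 2b_{n-3}$ with assumed initial values $1, 0, 1$.

```python
from fractions import Fraction


def generating_function(recurrence, initial):
    """Numerator and denominator of h(x) when f(n) = 0.

    recurrence = [a_1, ..., a_k] for h_n = a_1*h_{n-1} + ... + a_k*h_{n-k};
    initial = [h_0, ..., h_{k-1}].
    """
    k = len(recurrence)
    denominator = [1] + [-a for a in recurrence]   # 1 - a_1 x - ... - a_k x^k
    numerator = []
    for i in range(k):
        # Coefficient of x^i in a(x)h(x): h_i - a_1*h_{i-1} - ... - a_i*h_0.
        term = initial[i] - sum(
            recurrence[j] * initial[i - 1 - j] for j in range(i)
        )
        numerator.append(term)
    return numerator, denominator


def expand(numerator, denominator, count):
    """First `count` power-series coefficients of numerator/denominator."""
    coeffs = []
    for n in range(count):
        total = Fraction(numerator[n]) if n < len(numerator) else Fraction(0)
        for j in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[j] * coeffs[n - j]
        coeffs.append(total / denominator[0])
    return coeffs


num, den = generating_function([1, 0, -2], [1, 0, 1])
print(num, den)              # coefficient lists, lowest degree first
print(expand(num, den, 8))   # should reproduce the sequence itself
```

Expanding the fraction back into a series and recovering the original sequence is a cheap way to confirm steps 1 to 3 before investing effort in factoring and partial fractions.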
Exercise 8.3.1
For each of the following recursively-defined sequences, use the method of generating functions to find an explicit formula for the $n$th term of the sequence.
1. $c_0 = 2$, $c_1 = 0$, $c_n = c_{n-1} + 2c_{n-2}$ for every $n \ge 2$.
2. $d_0 = 0$, $d_1 = 1$, $d_n = 2d_{n-2} + 1$ for every $n \ge 2$.
3. $e_0 = 2$, $e_n = 3e_{n-1} - 2$ for every $n \ge 1$.
4. $f_0 = 1$, $f_1 = 3$, and $f_n = 4(f_{n-1} - f_{n-2})$ for every $n \ge 2$.
5. $g_0 = 2$, $g_1 = 0$, and $g_n = 2g_{n-1} - 2g_{n-2}$ for every $n \ge 2$.
6. $h_0 = \frac{1}{2}$ and $h_n = 3h_{n-1} - \frac{1}{2}$ for every $n \ge 1$.
7. $i_0 = i_1 = 2$, $i_2 = 0$, and $i_n = 3i_{n-1} - 3i_{n-2} + i_{n-3}$ for every $n \ge 3$.
8. $j_0 = -1$, $j_1 = 0$, and $j_n = 2j_{n-1} + 3j_{n-2}$ for every $n \ge 2$.
9. $k_0 = 10$ and $k_n = 11k_{n-1} - 10$ for every $n \ge 1$.
Exercise 8.3.2
1) Let $p_n$ denote the number of ways to build a pipe $n$ units long, using segments that are either plastic or metal, and (for each material) come in lengths of 1 unit or 2 units. For example, $p_1 = 2$ since we can use a 1-unit segment that is either plastic or metal, and $p_2 = 6$ since we can use either type of 2-unit segment, or any of the $2^2$ possible ordered choices of 2 segments each
Determine a recurrence relation for $p_n$. Give a combinatorial proof that your recurrence relation does solve this counting problem. Use your recurrence relation and the method of generating functions to find a formula for $p_n$.
for every $n \ge 0$.]
2) Let $s_n$ denote the number of lists of any length that have the fixed sum of $n$, and whose entries come from $\{1, 2, 3\}$. For example, $s_2 = 2$ because (1, 1) and (2) are the only such lists; and $s_4 = 7$ because the lists are (3, 1), (1, 3), (2, 2), (2, 1, 1),
Determine $s_1$, $s_3$, and $s_5$ by finding all possible lists. Give a combinatorial proof that $s_n = s_{n-1} + s_{n-2} + s_{n-3}$ for every $n \ge 3$. Use this recurrence relation to show that the generating function $S(x)$ for $\{s_n\}$ is $\frac{1}{1 - x - x^2 - x^3}$
This page titled 8.3: Using Generating Functions to Solve Recursively-Defined Sequences is shared under a CC BY-NC-SA license and was
authored, remixed, and/or curated by Joy Morris.
8.4: Summary
Method of partial fractions
Formula for factoring quadratic polynomials into the required form
Applying generating functions to recursively-defined sequences
This page titled 8.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
This page titled 9: Some Important Recursively-Defined Sequences is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
9.1: Derangements
Definition: Derangements
A derangement of a list of objects is a permutation of the objects, in which no object is left in its original position.
A classic example of this is a situation in which you write letters to ten people, address envelopes to each of them, and then put
them in the envelopes, but accidentally end up with none of the letters in the correct envelope.
Another example might be a dance class in which five brother-sister pairs are enrolled. The instructor mixes them up so that no one
is dancing with a sibling.
Since we’re considering enumeration, it shouldn’t surprise you that the question we want answered is: in how many ways can this
happen? That is, given $n$ objects, how many derangements of the $n$ objects are there? Let’s use $D_n$ to denote the number of derangements of $n$ objects.
We can label the objects with the numbers {1, . . . , n}, and think of a derangement as a bijection
f : {1, . . . , n} → {1, . . . , n} (9.1.1)
such that f does not fix any value. There are n−1 choices for f (n) , since the only restriction is f (n) ≠ n . Say f (n) = i . We
consider two possible cases.
Case 1: f (i) = n
Now, on the other n−2 values between 1 and n that are neither i nor n , f must map {1, . . . , n − 1} ∖ {i} to
$\{1, \dots, n - 1\} \setminus \{i\}$, and must be a derangement. So there are $D_{n-2}$ derangements that have $f(n) = i$ and $f(i) = n$.
Case 2: $f(i) \ne n$
Then $f(j) = n$ for some $j \ne i$. From $f$, define a function $g$ on $\{1, \dots, n - 1\}$ as follows. We set $g(j) = i$, and for every other value, $g(a) = f(a)$ (that is, for every $a \in \{1, \dots, n - 1\} \setminus \{j\}$). We had $f(j) = n$ and $f(n) = i$, and we are eliminating $n$ from the derangement while maintaining a bijection, by creating the shortcut $g$ with $g(j) = i$ but $g(a) = f(a)$ for every other $a \in \{1, \dots, n - 1\}$. Since $f$ is a derangement and $j \ne i$, we see that $g$ is also a derangement (this time of $n - 1$ objects). So there are $D_{n-1}$ possible derangements $g$, and for a fixed choice of $i$, these are in one-to-one correspondence with derangements $f$ that have $f(j) = n$ and $f(n) = i$, so there are also $D_{n-1}$ of these.
Since there are $n - 1$ choices for $i$, combining the two cases gives the recursive relation $D_n = (n - 1)(D_{n-1} + D_{n-2})$. For initial conditions, $D_1 = 0$, since a single object cannot be arranged so that it is not in its correct place. Also, $D_2 = 1$, since there is exactly one way of deranging two objects (by interchanging them).
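The counting argument above gives $n - 1$ choices for $i$, with $D_{n-2}$ derangements in the first case and $D_{n-1}$ in the second, which amounts to the recurrence $D_n = (n - 1)(D_{n-1} + D_{n-2})$ with $D_1 = 0$ and $D_2 = 1$. The sketch below assumes that recurrence and compares it with a brute-force count over all permutations; the function names are invented for illustration.

```python
from itertools import permutations


def derangements_by_recurrence(n):
    """D_n from D_1 = 0, D_2 = 1 and D_n = (n - 1)(D_{n-1} + D_{n-2})."""
    if n == 1:
        return 0
    previous, current = 0, 1          # D_1, D_2
    for m in range(3, n + 1):
        previous, current = current, (m - 1) * (current + previous)
    return current


def derangements_by_brute_force(n):
    """Count permutations of range(n) that leave no position fixed."""
    return sum(
        all(image != position for position, image in enumerate(perm))
        for perm in permutations(range(n))
    )


for n in range(1, 8):
    print(n, derangements_by_recurrence(n), derangements_by_brute_force(n))
```

The brute-force count examines all $n!$ permutations, so it is only practical for small $n$; the recurrence takes a number of steps proportional to $n$.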
If we wanted to solve this recursively-defined sequence, we would need to use exponential generating functions, which we’ll
introduce in this chapter but won’t really study in this course. Instead, we’ll give the explicit formula for $D_n$ without proof.
Proposition 9.1.1
Exercise 9.1.1
1. Use induction to prove Proposition 9.1.1.
2. Which kind of induction did you have to use to prove Proposition 9.1.1?
3. Calculate $D_5$ using the explicit formula given in Proposition 9.1.1.
This page titled 9.1: Derangements is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
9.2: Catalan Numbers
This is an example that shows even more clearly the power of the generating function method.
The Catalan numbers are a sequence that can be defined in a variety of ways, because they arise in a number of different
circumstances. We’ll use the following definition.
This may be easier to see with an example. We’ve worked out $C_3$ above; in order to work out $C_4$ using this recursive relation, we also need to know $C_1$ and $C_2$. There is only one way to combine a single term (we don’t need brackets at all), so $C_1 = 1$. We also have $C_2 = 1$, since there is only one way to put brackets around a pair of terms: (_·_).
Now, to use brackets to order the operations in a four-term expression, our final operation must either combine a group of three
terms with a single term; a group of two terms with another group of two terms; or a single term with a group of three terms (this
time, the single term is at the left). The first two expressions below come from combining a group of three terms with a single term;
the third comes from combining a group of two terms with another group of two terms; and the last two come from combining a
single term with a group of three terms.
([(_·_)·_]·_), ([_·(_·_)]·_), [(_·_)·(_·_)], (_·[(_·_)·_]), (_·[_·(_·_)]).
Thus,
$$C_4 = C_3 C_1 + C_2 C_2 + C_1 C_3 = 2 + 1 + 2 = 5.$$
Now we want to use generating functions to figure out what we can about the Catalan numbers. Unfortunately, there is a difficulty.
Any time we want to use generating functions to solve a recursively-defined sequence, the sequence must start with a 0th term, to be the coefficient of $x^0$. With some recursively-defined sequences, we can simply use the recursive relation “backwards” to solve for previous terms, going down to $n = 0$, even if our initial conditions began with much higher terms. For example, if a recursively-defined sequence is given by $h_2 = 1$, $h_3 = 5$ and $h_n = 8h_{n-2} - h_{n-1}$ for every $n \ge 4$, we can use $n = 3$ in this to get
$$h_3 = 5 = 8h_1 - h_2 = 8h_1 - 1.$$
Solving for $h_1$ gives $h_1 = \frac{3}{4}$. Then using the recursive relation with $n = 2$ gives
$$h_2 = 1 = 8h_0 - h_1 = 8h_0 - \frac{3}{4}.$$
Solving for $h_0$ gives $h_0 = \frac{7}{32}$. This allows us to use generating functions on the sequence.
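Running a recursive relation "backwards" is just a rearrangement, which the following sketch carries out with exact fractions. It assumes the relation $h_n = 8h_{n-2} - h_{n-1}$ from the example, rearranged to $h_{n-2} = \frac{h_n + h_{n-1}}{8}$; the dictionary `h` is simply a convenient container chosen for the sketch.

```python
from fractions import Fraction

# h_n = 8*h_{n-2} - h_{n-1}  rearranges to  h_{n-2} = (h_n + h_{n-1}) / 8
h = {2: Fraction(1), 3: Fraction(5)}
for n in (3, 2):
    h[n - 2] = (h[n] + h[n - 1]) / 8

print(h[1], h[0])
```

This works here because the relation can be solved for its lowest-indexed term; the discussion that follows explains why the Catalan recurrence does not allow the same trick.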
The recursive relation for the Catalan numbers doesn’t have a form that allows us to solve for $C_0$ by knowing other terms of the sequence, so we do what we have to, in order to make things work. Instead of working with the generating function for the Catalan numbers themselves (since we can’t), we work with the generating function for the sequence $c_0, c_1, c_2, \dots$, where $c_i = C_{i+1}$ for every $i \ge 0$. In other words, the $n$th term of our new sequence will be the $(n+1)$st Catalan number.
Adjusting the recursive relation we’ve determined for the Catalan numbers to this new sequence, gives
$c_0 = 1$, and $c_n = \sum_{k=0}^{n-1} c_k c_{n-k-1}$ for every $n \ge 1$.
Notice that
$$\begin{aligned}
c(x)c(x) &= (c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots)(c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots) \\
&= c_0 c_0 + (c_1 c_0 + c_0 c_1)x + (c_2 c_0 + c_1 c_1 + c_0 c_2)x^2 + (c_0 c_3 + c_1 c_2 + c_2 c_1 + c_3 c_0)x^3 + \dots,
\end{aligned}$$
and the coefficient of $x^m$ in this is $\sum_{k=0}^{m} c_k c_{m-k}$.
This should look familiar! In fact, you can see that the coefficient of $x^m$ in $(c(x))^2$ is the same as the coefficient of $x^{m+1}$ in $c(x)$, since the latter is
$$\sum_{k=0}^{m} c_k c_{m-k}$$
also.
Thus, we have an expression for $c(x)$ in terms of $(c(x))^2$, since multiplying $(c(x))^2$ by $x$ gives all of the terms of $c(x)$ except $c_0$: $c(x) = x(c(x))^2 + c_0$. Since $c_0 = 1$, we can rearrange this equation, to see that
$$x[c(x)]^2 - c(x) + 1 = 0.$$
We are about to do something to this generating function that may seem a bit like black magic: we will use the quadratic formula to solve this quadratic equation in $c(x)$, treating $x$ as the coefficient of $(c(x))^2$. Thus, in the quadratic formula, we take $a = x$, $b = -1$, and $c = 1$, giving
$$c(x) = \frac{1 \pm \sqrt{1 - 4x}}{2x}.$$
Of course, there are two roots to this, and only one of them will give the correct generating function; we need to work out which one (whether to take the plus or the minus).
Using the Generalized Binomial Theorem, we see that
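$$(1 - 4x)^{\frac{1}{2}} = \binom{\frac{1}{2}}{0} + \binom{\frac{1}{2}}{1}(-4x) + \binom{\frac{1}{2}}{2}(-4x)^2 + \dots + \binom{\frac{1}{2}}{k}(-4x)^k + \dots$$
and
$$\binom{\frac{1}{2}}{k} = \frac{\left(\frac{1}{2}\right)\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right) \cdots \left(\frac{1}{2} - k + 1\right)}{k!} = \frac{(-1)^{k-1}\, 1 \cdot 3 \cdot 5 \cdots (2k - 3)}{2^k k!},$$
so the coefficient of $x^k$ is
$$\frac{(-1)^{k-1}\, 1 \cdot 3 \cdot 5 \cdots (2k - 3)}{2^k k!}(-4)^k = \frac{(-1)\, 1 \cdot 3 \cdot 5 \cdots (2k - 3)\, 2^k}{k!}.$$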
Whichever root we use will require this expression, so let’s work with it a bit more to get it into a nicer form.
$$2^k k! = 2^k (1 \cdot 2 \cdot 3 \cdots k) = 2 \cdot 4 \cdot 6 \cdots 2k,$$
so if we multiply the numerator and denominator of the fraction by $k!$ (which does not change the result), we see that we have
$$\frac{(-1)\, 1 \cdot 3 \cdot 5 \cdots (2k - 3) \cdot 2 \cdot 4 \cdot 6 \cdots 2k}{k!\,k!} = \frac{(-1)(2k - 2)!\,2k}{k!\,k!} = \frac{(-1)(2k)!}{(2k - 1)\,k!\,k!} = \frac{-1}{2k - 1}\binom{2k}{k},$$
so
$$(1 - 4x)^{\frac{1}{2}} = -\sum_{k=0}^{\infty} \frac{1}{2k - 1}\binom{2k}{k} x^k.$$
The coefficients shown on the right-hand side of this equation quickly get big and negative.
If
$$c(x) = \frac{1 + \sqrt{1 - 4x}}{2x}$$
then for $n > 0$ the coefficient of $x^n$ in $c(x)$ will be half of the coefficient of $x^{n+1}$ in $(1 - 4x)^{\frac{1}{2}}$, which (when $n$ is large) will be big and negative. But it is easy to see from the recurrence relation that all of the Catalan numbers are positive. To get positive coefficients, we must use
$$c(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.$$
Since in this expression we take the negative of the large negative coefficients, the result will be large positive coefficients (even when we divide by 2, and look for the coefficient of $x^{n+1}$).
Thus,
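$$c(x) = \frac{1 - (1 - 4x)^{\frac{1}{2}}}{2x} = \frac{1 + \sum_{k=0}^{\infty} \frac{1}{2k - 1}\binom{2k}{k} x^k}{2x}.$$
From this, we see that for $n > 0$, the coefficient of $x^n$ in $c(x)$ is half of the coefficient of $x^{n+1}$ in the numerator, that is, half of the negative of the coefficient of $x^{n+1}$ in $(1 - 4x)^{\frac{1}{2}}$, which is
$$\begin{aligned}
\frac{1}{2} \cdot \frac{1}{2n + 1}\binom{2(n + 1)}{n + 1} &= \frac{(2n + 2)!}{2(n + 1)!\,(n + 1)!\,(2n + 1)} \\
&= \frac{1}{2} \cdot \frac{2n + 2}{n + 1} \cdot \frac{(2n)!}{(n + 1)!\,n!} \cdot \frac{2n + 1}{2n + 1} \\
&= \frac{1}{n + 1}\binom{2n}{n}.
\end{aligned}$$
So
$$c_n = \frac{1}{n + 1}\binom{2n}{n}.$$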
Although we derived this expression for $n > 0$ only, we can verify that $c_0 = 1 = \frac{1}{0 + 1}\binom{0}{0}$ since $0! = 1$, so this expression is true for every $n \ge 0$.
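As with the earlier sequences, it is worth checking the explicit formula against the recursive relation for the first several terms. A minimal sketch, assuming the shifted sequence $c_0 = 1$, $c_n = \sum_{k=0}^{n-1} c_k c_{n-k-1}$ and the closed form $\frac{1}{n+1}\binom{2n}{n}$; the function name `shifted_catalan` is invented for illustration.

```python
from math import comb


def shifted_catalan(count):
    """c_0, ..., c_{count-1} from c_0 = 1 and c_n = sum_{k<n} c_k * c_{n-k-1}."""
    c = [1]
    for n in range(1, count):
        c.append(sum(c[k] * c[n - k - 1] for k in range(n)))
    return c[:count]


recursive = shifted_catalan(12)
# comb(2n, n) is always divisible by n + 1, so integer division is exact.
explicit = [comb(2 * n, n) // (n + 1) for n in range(12)]

print(recursive[:6])
print(recursive == explicit)
```

The first few values, 1, 1, 2, 5, match $C_1$ through $C_4$ as computed earlier in this section.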
Exercise 9.2.1
1. Use induction and the recursive relation for Catalan numbers (as adjusted for the values of $\{c_i\}$, where $c_i = C_{i+1}$) to prove that $c_n > 0$ for every $n \ge 0$.
This page titled 9.2: Catalan Numbers is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
9.3: Bell Numbers and Exponential Generating Functions
Sometimes a recurrence relation involves factorials, or binomial coefficients. When this happens, it becomes difficult if not
impossible to use ordinary generating functions to find an explicit formula for the nth term of the sequence. In some cases, a
different kind of generating function, the exponential generating function, may succeed where an ordinary generating function fails.
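The exponential generating function for a sequence \(a_0, a_1, a_2, \ldots\) is \(\sum_{i=0}^{\infty} a_i \frac{x^i}{i!}\). The difference between this and an ordinary generating function comes from the factorial expression in the denominator. Cancellation between this and expressions in the numerator can lead to nicer, more compact expressions.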
Example 9.3.1
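\[
\sum_{i=0}^{\infty} \frac{x^i}{i!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots
\]
This is the Taylor series expansion for \(e^x\). Thus, \(e^x\) is the exponential generating function for \(1, 1, 1, \ldots\).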
We will not be using exponential generating functions in this class; we are just introducing the topic. We will go through one example of a sequence for which exponential generating functions are useful: the Bell numbers, where \(B_n\) is the number of partitions of \(\{1, \ldots, n\}\) into subsets.
Example 9.3.2
{1, 2, 3}, so \(B_3 = 5\).
Probably after seeing the above examples, you don’t want to calculate larger Bell numbers directly. However, we can derive a recursive relation for these numbers. For this relation to work properly, we will define \(B_0 = 1\).
Proposition 9.3.1
For every \(n \ge 1\),
\[
B_n = \sum_{k=1}^{n} \binom{n-1}{k-1} B_{n-k}.
\]
Proof
We’ll use a combinatorial proof of this statement. We know that \(B_n\) is the number of partitions of \(\{1, \ldots, n\}\) into subsets.
For the other side of the equation, let’s consider the subset that contains the element \(n\), and call the cardinality of this subset \(k\). Since \(n\) is in this subset, \(k \ge 1\), and since this is a subset of \(\{1, \ldots, n\}\), we have \(k \le n\), so \(1 \le k \le n\). There are \(\binom{n-1}{k-1}\) ways to choose the remaining \(k-1\) elements of this subset; that is, for any \(1 \le k \le n\), there are \(\binom{n-1}{k-1}\) ways to choose the subset of \(\{1, \ldots, n\}\) that contains the element \(n\). For each of these ways, there are \(n-k\) other elements that must be partitioned, and by the definition of the Bell numbers, there are \(B_{n-k}\) ways to partition them into subsets. (Our definition of \(B_0 = 1\) deals with the case \(k = n\), ensuring that the \(\binom{n-1}{n-1} = 1\) way of choosing \(n\) to be in a single set of all \(n\) elements is counted once.) Summing over all possible values of \(k\) gives \(\sum_{k=1}^{n} \binom{n-1}{k-1} B_{n-k}\) partitions in total.
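The counting argument in this proof translates directly into a short computation. The sketch below is illustrative only (the function names are not from the text): it builds Bell numbers from the recurrence \(B_n = \sum_{k=1}^{n}\binom{n-1}{k-1}B_{n-k}\) with \(B_0 = 1\), and cross-checks them against a direct enumeration of set partitions for small \(n\).

```python
from math import comb


def bell_numbers(count: int) -> list[int]:
    """Return B_0, ..., B_{count-1} via B_n = sum_{k=1}^{n} C(n-1, k-1) * B_{n-k}."""
    bell = [1]  # B_0 = 1 by definition
    for n in range(1, count):
        bell.append(sum(comb(n - 1, k - 1) * bell[n - k] for k in range(1, n + 1)))
    return bell[:count]


def count_partitions(n: int) -> int:
    """Count the partitions of {1, ..., n} into subsets by direct enumeration.

    Elements are placed one at a time: each element either joins one of
    the blocks built so far or starts a new block.
    """
    def extend(next_element: int, blocks: int) -> int:
        if next_element > n:
            return 1
        # `blocks` ways to join an existing block, plus one way to start a new one.
        return (blocks * extend(next_element + 1, blocks)
                + extend(next_element + 1, blocks + 1))

    return extend(1, 0)


# The recurrence and the enumeration should agree for every small n.
print(all(bell_numbers(8)[n] == count_partitions(n) for n in range(8)))
```

The enumeration takes exponential time, so it is only useful as a check on small cases; the recurrence is the practical way to compute larger values.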
Let us try to find the exponential generating function for the Bell numbers. When dealing with exponential generating functions, notice that the derivative of \(\frac{x^n}{n!}\) is \(n\frac{x^{n-1}}{n!} = \frac{x^{n-1}}{(n-1)!}\), so taking derivatives often results in a nice expression that helps us find a nice expression for the coefficients. You already know a particularly nice example of this: the derivative of \(e^x\) is \(e^x\), which tells us that all of the coefficients in that exponential generating function are equal.
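Define
\[
B(x) = \sum_{i=0}^{\infty} B_i \frac{x^i}{i!} = B_0 + B_1 \frac{x}{1!} + B_2 \frac{x^2}{2!} + \cdots + B_n \frac{x^n}{n!} + \cdots
\]
Differentiating term by term gives
\[
\frac{d}{dx}B(x) = \sum_{n=1}^{\infty} B_n \frac{x^{n-1}}{(n-1)!}.
\]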
Using our recursive relation from Proposition 9.3.1, we see that this is
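\[
\begin{aligned}
\frac{d}{dx}B(x) &= \sum_{n=1}^{\infty} \left[ \sum_{k=1}^{n} \binom{n-1}{k-1} B_{n-k} \right] \frac{x^{n-1}}{(n-1)!} \\
&= \sum_{n=1}^{\infty} \left[ \sum_{k=1}^{n} \frac{(n-1)!}{(k-1)!\,(n-k)!} B_{n-k} \right] \frac{x^{n-1}}{(n-1)!} \\
&= \sum_{n=1}^{\infty} \left[ \sum_{k=1}^{n} \frac{1}{(k-1)!\,(n-k)!} B_{n-k}\, x^{n-1} \right] \\
&= \sum_{n=1}^{\infty} \left[ \sum_{k=1}^{n} \frac{x^{k-1}}{(k-1)!} B_{n-k} \frac{x^{n-k}}{(n-k)!} \right]
\end{aligned}
\]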
Notice that for each value of n, as k goes from 1 to n the values k − 1 and n − k take on every pair of non-negative integral values that add up to n − 1. Thus, as n goes from 1 to infinity, the values k − 1 and n − k take on every possible pair of non-negative integral values exactly once. Therefore, we can rewrite this expression as
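\[
\begin{aligned}
\frac{d}{dx}B(x) &= \sum_{j=0}^{\infty} \left[ \sum_{i=0}^{\infty} \frac{x^j}{j!} B_i \frac{x^i}{i!} \right] \\
&= \sum_{j=0}^{\infty} \frac{x^j}{j!} \left[ \sum_{i=0}^{\infty} B_i \frac{x^i}{i!} \right] \\
&= \left[ \sum_{j=0}^{\infty} \frac{x^j}{j!} \right] \left[ \sum_{i=0}^{\infty} B_i \frac{x^i}{i!} \right]
\end{aligned}
\tag{9.3.3}
\]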
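Since the first factor is the Taylor series for \(e^x\) and the second factor is \(B(x)\), this says that \(\frac{d}{dx}B(x) = e^x B(x)\). Multiplying both sides by \(e^{-(e^x)}\) and rearranging gives
\[
e^{-(e^x)} \frac{d}{dx}B(x) - e^x B(x) e^{-(e^x)} = 0,
\]
and the left-hand side is the derivative of \(e^{-(e^x)} B(x)\). Hence \(e^{-(e^x)} B(x) = c\) for some constant \(c\); that is, \(B(x) = c e^{(e^x)}\). Since
\[
B(0) = \sum_{n=0}^{\infty} B_n \frac{0^n}{n!} = 1 + \sum_{n=1}^{\infty} 0 = 1,
\]
(recall that \(0^0 = 1\), or if you don’t like that, simply use the expansion of \(B(0)\)), we see that
\[
c e^{e^0} = c e^1 = ce = 1,
\]
so \(c = e^{-1}\). Hence
\[
B(x) = e^{-1} e^{(e^x)} = e^{(e^x - 1)}.
\]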
There are techniques to extract coefficients from expressions like this, also, but we will not cover these techniques in this class.
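Even without those techniques, the closed form can be sanity-checked numerically. The sketch below is a hedged illustration, not a method from the text: it uses the relation \(F'(x) = e^x F(x)\), which any function of the form \(F(x) = e^{(e^x - 1)}\) satisfies, to compute Taylor coefficients \(f_n\) exactly, and then recovers \(n!\, f_n\), which should be the Bell numbers. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def taylor_coefficients(count: int) -> list[Fraction]:
    """Return the first `count` Taylor coefficients f_n of F(x) = exp(e^x - 1).

    Since F'(x) = e^x * F(x), comparing coefficients of x^(n-1) gives
    n * f_n = sum_{k=1}^{n} f_{n-k} / (k-1)!, with f_0 = F(0) = 1.
    """
    coeffs = [Fraction(1)]
    for n in range(1, count):
        total = sum(coeffs[n - k] / factorial(k - 1) for k in range(1, n + 1))
        coeffs.append(total / n)
    return coeffs[:count]


def bell_from_egf(count: int) -> list[int]:
    """Recover B_n = n! * f_n from the exponential generating function."""
    return [int(factorial(n) * f) for n, f in enumerate(taylor_coefficients(count))]


# The third entry (index 3) should equal B_3 = 5 from the example above.
print(bell_from_egf(4))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that no rounding error can hide a disagreement.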
Exercise 9.3.1
1. Find \(B_4\).
2. What is the exponential generating function for the sequence \(a_i = i!\) for every \(i \ge 0\)? Give the sequence in both an expanded and a closed form.
3. What is the exponential generating function for the sequence \(b_i = \frac{(i+1)!}{2}\) for every \(i \ge 0\)? Give the sequence in both an expanded and a closed form.
9.4: Summary
Generating functions must start with a 0th term
Important Definitions:
Derangements
Catalan Numbers
Exponential Generating Function
Bell Numbers
CHAPTER OVERVIEW
10: Other Basic Counting Techniques
10.1: The Pigeonhole Principle
The Pigeonhole Principle is a technique that you can apply when you are faced with items chosen from a number of different
categories of items, and you want to know whether or not some of them must come from the same category, without looking at all
of the items.
Example 10.1.1
Suppose I will be teaching an independent study course in graph theory to two students next semester, and I want to use Bondy
& Murty’s “Graph Theory” textbook. It has been issued in two editions, and I don’t care which edition we use, but I want both
students to have the same edition.
I find a website on which someone has posted that they have three copies of the text for sale, but they don’t say which editions
they are. Without any more information, I know that if I buy these texts, I will have suitable texts for my students.
The reasoning is straightforward. The first book could be edition 1 or edition 2. If the second text is the same as the first, then I
have what I need, so the only possible problem is if the first two books consist of one copy of edition 1, and one copy of
edition 2. But then the third book must match one or the other of the first two, since there are only two editions, so I will have
two copies of one or the other of the editions.
This idea can be generalised in several ways. We’ll look at the most straightforward generalisation first.
If there are n items that fall into m different categories and n > m, then at least two of the items must fall into the same category.
Proof
Amongst the first m items, either two of the items are from the same category (so we are done), or there is exactly one item
from each of the m categories. Since n > m , there is at least one more item. This item must fall into the same category as
one of the previous items since every category already has an item.
The name of this principle comes from the idea that it can be stated with the categories being a row of holes, and the items being
pigeons who are assigned to these holes.
In Example 10.1.1, the categories were the editions, and the items were the textbooks.
Example 10.1.1 was a very direct and straightforward application of the Pigeonhole Principle. The Principle can also apply in much
more subtle and surprising ways.
Example 10.1.2
Maria makes a bet with Juan. He must buy her at least one chocolate bar every day for the next 60 days. If, at the end of that
time, she cannot point out a span of consecutive days in which the number of chocolate bars he gave her was precisely 19, then
she will pay for all of the chocolate bars and give them back to him. If she can find such a span, then she gets to keep the
chocolate bars. To limit the size of the bet, they agree in advance that Juan will not buy more than 100 chocolate bars in total.
Is there a way for Juan to win this bet?
Solution
The answer is no. For \(1 \le i \le 60\), let \(a_i\) represent the number of chocolate bars that Juan has bought for Maria by the end of day \(i\). Then \(1 \le a_1 < a_2 < \cdots < a_{60} \le 100\). Maria is hoping that for some \(i < j\), she will be able to find that \(a_i + 19 = a_j\). We therefore also need to consider the values \(a_1 + 19 < a_2 + 19 < \cdots < a_{60} + 19\). By the bounds on \(a_1\) and \(a_{60}\), we have \(a_1 + 19 \ge 20\), and \(a_{60} + 19 \le 119\). Thus, the values \(a_1, \ldots, a_{60}, a_1 + 19, \ldots, a_{60} + 19\) are 120 numbers all of which lie between 1 and 119.
By the Pigeonhole Principle, at least two of these numbers must be equal, but we know that the \(a_i\)s are strictly increasing (as are the \(a_i + 19\)s), so there must exist some \(i < j\) such that \(a_i + 19 = a_j\). Maria must point to the span of days from the start of day \(i + 1\) to the end of day \(j\) since in this span Juan gave her 19 chocolate bars.
In fact, Juan could not win a bet of this nature that lasted more than 56 days, but proving this requires more detailed analysis
specific to the numbers involved, and is not really relevant to this course.
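The proof above is non-constructive, but for any particular record of purchases the span can be found by following the same idea: look for two running totals that differ by 19. The sketch below is illustrative only; the function name and the list-of-daily-counts representation are assumptions. It also includes the running total 0 (before day 1), so that spans starting on day 1 are found as well.

```python
from itertools import accumulate


def find_span_with_total(daily_bars: list[int], target: int = 19):
    """Return (first_day, last_day), 1-indexed, whose bars sum to `target`.

    Returns None if no span of consecutive days has that total.
    """
    # running_totals[i] = bars bought by the end of day i (0 before day 1).
    running_totals = [0] + list(accumulate(daily_bars))
    day_of_total = {total: day for day, total in enumerate(running_totals)}
    for i, total in enumerate(running_totals):
        j = day_of_total.get(total + target)
        if j is not None:
            return (i + 1, j)  # days i+1 through j
    return None


# One bar on most days and two bars every fifth day: 72 bars over 60 days.
purchases = [2 if day % 5 == 0 else 1 for day in range(1, 61)]
span = find_span_with_total(purchases)
print(span, sum(purchases[span[0] - 1:span[1]]))
```

Storing the running totals in a dictionary plays the role of the pigeonholes: a collision between some \(a_i + 19\) and some \(a_j\) is exactly a successful lookup.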
Here is another example that would be hard to prove if you didn’t know the Pigeonhole Principle.
Example 10.1.3
Fix n , and colour each point of the plane with one of n colours. Prove that there exists a rectangle whose four corners are the
same colour.
Solution
Take a grid of points with \(n + 1\) rows and \(n\binom{n+1}{2} + 1\) columns. We claim that this grid will contain such a rectangle.
Since \(n\) colours have been used, and there are \(n + 1\) points in each column, by the Pigeonhole Principle each column must contain at least two grid points that are the same colour.
In any column, there are \(\binom{n+1}{2}\) possible locations in which a pair of points of the same colour could appear. Thus there are at most \(\binom{n+1}{2}\) ways to position two points of colour 1 in a column so that the points do not occupy the same two locations in more than one of these columns. The same is true for each of the \(n\) colours. Therefore, we can create a maximum of \(n\binom{n+1}{2}\) columns, each having two points of some colour, in such a way as to avoid having the same colour occupy the same two locations in more than one of the columns. Since we have \(n\binom{n+1}{2} + 1\) columns, there must exist some pair of columns such that the same colour does occupy the same two locations in both of the columns. These four points form a rectangle whose corners all have the same colour.
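The argument suggests a simple search procedure: scan the columns, and for each column record every (colour, pair of rows) that it contains; the first repeated record gives a rectangle. The following sketch implements that idea under the assumption that the grid is given as a list of rows of integer colour labels; the names are illustrative.

```python
from itertools import combinations


def find_monochromatic_rectangle(grid: list[list[int]]):
    """Return (row1, row2, col1, col2) of a rectangle with same-coloured corners.

    `grid[r][c]` is the colour of the point in row r, column c.
    Returns None if the grid contains no such rectangle.
    """
    first_column_with = {}  # (colour, row1, row2) -> column where it first appeared
    for col in range(len(grid[0])):
        for r1, r2 in combinations(range(len(grid)), 2):
            if grid[r1][col] != grid[r2][col]:
                continue
            key = (grid[r1][col], r1, r2)
            if key in first_column_with:
                return (r1, r2, first_column_with[key], col)
            first_column_with[key] = col
    return None


# With n = 2 colours, a grid with 3 rows and 2 * 3 + 1 = 7 columns must contain one.
example = [
    [0, 1, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1],
]
print(find_monochromatic_rectangle(example))
```

There are only \(n\binom{n+1}{2}\) possible keys, and every column contributes at least one, which is why the dictionary must see a repeat by the final column at the latest.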
So far we have only thought about guaranteeing that there are at least two items in some category. Sometimes we might want to
know that there are at least k items in some category, where k > 2 . There is a generalisation of the Pigeonhole Principle that
applies to such situations.
Given n items that fall into m different categories, if n > km for some positive integer k , then at least k + 1 of the items must
fall into the same category.
Proof
Amongst the first km items, either k + 1 of the items are from the same category (so we are done), or there are exactly k
items from each of the m categories. Since n > km , there is at least one more item. This item must fall into the same
category as one of the previous items. Since every category already has k items, this means that there will be k + 1 items
in this category.
Notice that the Pigeonhole Principle is a special case of the Generalised Pigeonhole Principle, obtained by taking k = 1 .
Example 10.1.4
The population of West Lethbridge in the 2014 census was 35,377. Show that at least 97 residents of West Lethbridge share a
birthday. If you live in West Lethbridge, how many people can you be sure have the same birthday as you?
Solution
For the first part of this question, we apply the generalised pigeonhole principle, with m = 366 (for the 366 days of the year, counting February 29 since it is just as legitimate a birthday as any other despite being more uncommon), k = 96, and n = 35,377. We have
\[
n = 35{,}377 > 35{,}136 = 96 \cdot 366 = km,
\]
so the Generalised Pigeonhole Principle tells us that at least k + 1 = 97 people must share a birthday.
For the second part of the question, the answer is 0. There is no reason why every single other person in West Lethbridge might
not have their birthday on the day after yours (although that particular possibility is quite unlikely). There is certainly no
guarantee that any of them has the same birthday as yours.
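The choice of k in the first part is the largest k with km < n, so the guaranteed group size k + 1 is the ceiling of n/m. A minimal sketch of that arithmetic follows (the function names are illustrative); the second function spreads the items as evenly as possible to show that no larger group is guaranteed.

```python
def guaranteed_group_size(items: int, categories: int) -> int:
    """Size that some category must reach: the ceiling of items / categories."""
    return -(-items // categories)


def most_even_distribution(items: int, categories: int) -> list[int]:
    """Spread the items as evenly as possible over the categories."""
    base, extra = divmod(items, categories)
    return [base + 1] * extra + [base] * (categories - extra)


print(guaranteed_group_size(35377, 366))        # 97
print(max(most_even_distribution(35377, 366)))  # 97
```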
Notice that although we have found in the above example that some group of at least 97 people in West Lethbridge must have the
same birthday, we have no idea of which 97 people are involved, or of what the joint birthday is. This is rather remarkable, but is
an example of a type of proof that is quite common in advanced mathematics. Such proofs are referred to as “non-constructive,”
since they prove that something exists, without giving you any idea of how to find (or construct) such a thing.
The proof of the following theorem involves a more subtle application of the Generalised Pigeonhole Principle.
For every pair of integers a , b ≥ 1 , if S is a sequence of ab + 1 distinct real numbers, then there is either an increasing
subsequence of length a + 1 or a decreasing subsequence of length b + 1 in S .
Proof
Define a function f that maps each element of S to the length of the longest increasing subsequence that begins with that
element.
If there exists some s ∈ S such that f (s) ≥ a + 1 , then we are done. So we may assume that f (s) ≤ a for every s ∈ S .
Since there is always an increasing sequence of length at least 1 starting at any element of S , we in fact have 1 ≤ f (s) ≤ a
for every s ∈ S , so there are a possible values for the outputs of f . Since |S| = ab + 1 , and ab + 1 > ab , the Generalised
Pigeonhole Principle tells us that at least b + 1 elements of S must have the same output under the function f .
We claim that if x is before y in S and f (x) = f (y) , then x > y . By assumption, x ≠ y (all values of S are distinct), so
the only other possibility is x < y . If x < y , then taking x followed by an increasing subsequence of length f (y) that starts
at y, would give an increasing subsequence of length f (y) + 1 that starts at x , contradicting f (x) = f (y) . This
contradiction shows that x < y is not possible, so x > y , as claimed.
Let \(s_1, s_2, \ldots, s_{b+1}\) be elements of \(S\) that have the same output under \(f\), and appear in this order. Then by the claim we proved in the previous paragraph, \(s_1 > s_2 > \cdots > s_{b+1}\), which is a decreasing subsequence of length \(b + 1\).
In fact, ab + 1 is the smallest possible length for S that guarantees this property. For any a, b, there is a sequence of length ab in which the longest increasing subsequence has length a and the longest decreasing subsequence has length b. One such sequence is
\[
b,\ b-1,\ \ldots,\ 1;\quad 2b,\ 2b-1,\ \ldots,\ b+1;\quad \ldots;\quad ab,\ ab-1,\ \ldots,\ (a-1)b+1.
\]
Any increasing subsequence can only have one entry from each of the a subsequences of length b that are separated by semicolons,
so can only have length a . Any decreasing subsequence must be entirely contained within one of the subsequences of length b that
are separated by semicolons, so can only have length b .
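Both the theorem and the tightness claim can be explored computationally. The sketch below is illustrative: it computes the function f from the proof (the length of the longest increasing subsequence starting at each element), and builds a sequence of a blocks of length b in which each block decreases and successive blocks increase. That block construction is an assumption consistent with the description above; the function names are hypothetical.

```python
def longest_increasing_from_each(seq: list) -> list[int]:
    """f[i] = length of the longest increasing subsequence starting at seq[i]."""
    f = [1] * len(seq)
    for i in range(len(seq) - 1, -1, -1):
        for j in range(i + 1, len(seq)):
            if seq[j] > seq[i]:
                f[i] = max(f[i], f[j] + 1)
    return f


def longest_monotone_lengths(seq: list) -> tuple[int, int]:
    """Return (longest increasing length, longest decreasing length)."""
    if not seq:
        return (0, 0)
    increasing = max(longest_increasing_from_each(seq))
    # A decreasing subsequence of seq is an increasing one of the negated values.
    decreasing = max(longest_increasing_from_each([-x for x in seq]))
    return (increasing, decreasing)


def extremal_sequence(a: int, b: int) -> list[int]:
    """a blocks of length b: each block decreases, successive blocks increase."""
    return [block * b - offset for block in range(1, a + 1) for offset in range(b)]


print(extremal_sequence(2, 3))                            # [3, 2, 1, 6, 5, 4]
print(longest_monotone_lengths(extremal_sequence(2, 3)))  # (2, 3)
```

The quadratic-time computation of f is chosen for clarity, since it mirrors the definition used in the proof.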
Let \(n_1, n_2, \ldots, n_m\) be positive integers. Given
\[
n_1 + n_2 + \cdots + n_m - m + 1
\]
items that fall into \(m\) categories, there must be some \(1 \le i \le m\) such that at least \(n_i\) items fall into the \(i\)th category.
Proof
Amongst the first
\[
n_1 + n_2 + \cdots + n_m - m
\]
items, either there is some \(1 \le i \le m\) such that at least \(n_i\) of the items fall into the \(i\)th category, or there are precisely \(n_i - 1\) objects in the \(i\)th category, for every \(1 \le i \le m\). Since there is at least one more item, this item must fall into the \(i\)th category for some \(1 \le i \le m\), which means that there will be \(n_i\) items in this category.
Notice that the Generalised Pigeonhole Principle is a special case of the “Even more generalised pigeonhole principle,” obtained by taking
\[
n_1 = n_2 = \cdots = n_m = k + 1. \tag{10.1.3}
\]
Example 10.1.5
Suppose Ali owes Tomas $10, and wants to give him a number of identical pieces of currency to pay her debt. Her bank only
gives out currency in loonies, twonies, five-dollar bills, or ten-dollar bills, and does not take requests for specific kinds of
currency. How much money must Ali request from the teller, if she wants to be sure to have $10 in identical pieces of currency
with which to pay Tomas?
Solution
If Ali gets any $10 bills she can give one of those to Tomas and is done. If she gets at least two $5 bills, she is done. If she gets
at least five twonies, she is done, and if she gets at least 10 loonies she is done. So the most money she can get without being
able to give Tomas his $10 in a single type of currency, is 9 loonies, 4 twonies, and a $5 bill, for a total of $22. Therefore, if
Ali asks for $23, she is guaranteed to be able to pay Tomas in a single type of currency.
Although the above example does not directly use the “Even more generalised pigeonhole principle” because it asks for the value
of the currency Ali needs to request rather than the number of items she must request, it uses the same ideas and should be helpful
in understanding the concept.
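For readers who would like to confirm the $22 and $23 figures, the following brute-force sketch enumerates every way the teller could make up a given amount from the four denominations and checks whether some single denomination alone covers the $10 debt. It is only an illustration; the constant and function names are assumptions, and the check relies on every denomination dividing 10 evenly.

```python
from itertools import product

DENOMINATIONS = (1, 2, 5, 10)  # loonie, twonie, $5 bill, $10 bill
DEBT = 10


def can_pay_with_identical_pieces(counts) -> bool:
    """True if the pieces of some single denomination are worth at least the debt."""
    return any(value * count >= DEBT for value, count in zip(DENOMINATIONS, counts))


def ways_to_receive(amount: int):
    """Yield every tuple of piece counts whose total value is `amount`."""
    ranges = [range(amount // value + 1) for value in DENOMINATIONS]
    for counts in product(*ranges):
        if sum(v * c for v, c in zip(DENOMINATIONS, counts)) == amount:
            yield counts


def always_payable(amount: int) -> bool:
    """True if every way of receiving `amount` lets Ali pay in identical pieces."""
    return all(can_pay_with_identical_pieces(c) for c in ways_to_receive(amount))


print(always_payable(22))  # False
print(always_payable(23))  # True
```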
Exercise 10.1.1
1) Show that in any positioning of 17 rooks on an 8-by-8 chessboard, there must be at least three rooks none of which threaten
each other (i.e. no two of which lie in the same row or column).
2) Sixteen people must sit in a row of eighteen chairs. Prove that somewhere in the row there must be six adjacent chairs all
occupied.
3) An artist has produced a large work of art to be carried in a parade. Part of the concept is that it must be carried by people of
roughly the same size (i.e., either all adults, or all children). The artist has left it to the last minute to find people to carry this,
and is in a bit of a panic. He doesn’t know if he will be able to assemble enough of either adults or children to carry the piece,
so he decides to ask everyone he sees, until he has enough volunteers. It takes 15 adults to carry the piece, or 23 children. If
everyone approached agrees to help, how many people does the artist need to approach before he is sure to have enough people
to carry his art in the parade?
4) Let n be odd, let a be even, and let π : {1, . . . , n} → {1, . . . , n} be a permutation. Prove that the product
(a + 1 − π(1))(a + 2 − π(2)) ⋅ ⋅ ⋅ (a + n − π(n))
is even. Is the same conclusion necessarily true if n is even or if a is odd? Give a proof or a counterexample in each case.
5) Let n ≥ 1, let x be a positive integer, and let S be a subset of cardinality n + 1 from \(\{x, x^2, \ldots, x^{2n}\}\). Prove that there
6) Show that in every set of n + 1 distinct integers, there exist two elements a and b such that a − b is divisible by n .
7) A drawer contains socks of 8 different colours. How many socks must you pull out of the drawer to be certain that you have
two of the same colour?
8) There are 15 students in a Combinatorics class. Explain how you know that two of them have their birthday in the same
month.
9) A pizza restaurant has 8 different toppings. Every day in October, they will put a 2-topping pizza on sale. Prove that the
same pizza will be on sale on two different days.
10) Suppose A is a set of 10 natural numbers between 1 and 100 (inclusive). Show that two different subsets of A have the
same sum. For example, if
A = {2, 6, 13, 30, 45, 59, 65, 82, 88, 97},
then the subsets {6, 13, 45, 65} and {2, 30, 97} both add up to 129.
[Hint: Compare the answers to two questions: How many subsets of A are there? Since there are only 10 elements of A , and
each of them is at most 100, how many different possible sums are there?]
11) Consider any set of 5 points in the plane that have integer coordinates. Prove that there is some pair from these 5 points
such that the midpoint of the line segment joining this pair of points also has integer coordinates. Give an example of 4 points
in the plane that do not have this property, and list all of the midpoints as evidence.
10.2: Inclusion-Exclusion
In school, you probably saw Venn diagrams sometimes, showing groups that overlapped with one another. We could draw a very
basic Venn diagram showing the kinds of trees that are growing at the various houses on my street:
Looking at the Venn diagram can help us figure out the values of some of the pieces from knowing the values of others. Suppose
we know how many houses have deciduous trees, and how many houses have evergreen trees. Naively, you might think that adding
these together would give us the total number of houses with trees. However, by looking at the Venn diagram, we see that if we
simply add the values together, then any houses that have both kinds of trees have been counted twice (once as a house with a
deciduous tree, and again as a house with an evergreen tree). So in order to work out the number of houses that have trees, we can
add the number that have deciduous trees to the number that have evergreen trees and then subtract the number that have both kinds
of trees. This is the idea of “inclusion-exclusion.”
Specifically, if two sets A and B are disjoint, then |A ∪ B| = |A| + |B| . However, if A and B are not disjoint, then |A| + |B|
counts the elements of A ∩ B twice (both as elements of A and as elements of B ). Subtracting this overcount yields the correct
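answer:
\[
|A \cup B| = |A| + |B| - |A \cap B|.
\]
Proof
Let \(A_0 = A \setminus B\) and \(B_0 = B \setminus A\), so
\(A\) is the disjoint union of \(A_0\) and \(A \cap B\),
\(B\) is the disjoint union of \(B_0\) and \(A \cap B\), and
\(A \cup B\) is the disjoint union of \(A_0\), \(B_0\), and \(A \cap B\).
Then
\[
\begin{aligned}
|A| + |B| &= (|A_0| + |A \cap B|) + (|B_0| + |A \cap B|) \\
&= (|A_0| + |B_0| + |A \cap B|) + |A \cap B| \\
&= |A \cup B| + |A \cap B|
\end{aligned}
\tag{10.2.2}
\]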
Example 10.2.1
Example 10.2.2
Every one of the 4000 students at Modern U owns either a tablet or a smartwatch (or both). Surveys show that:
3500 students own a tablet, and
1000 students own a smartwatch.
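Let S be the set of all students, let T be the set of students who own a tablet, and let W be the set of students who own a smartwatch. Then, by assumption,
|S| = 4000, |T | = 3500, |W | = 1000.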
Since every student owns either a tablet or a smartwatch, we have S = T ∪ W . Therefore, Inclusion-Exclusion tells us that
|S| = |T ∪ W | = |T | + |W | − |T ∩ W | ,
so
|T ∩ W | = |T | + |W | − |S| = 3500 + 1000 − 4000 = 500 .
Hence, there are exactly 500 students who own both a tablet and a smartwatch.
The following exercise provides a formula for the union of three sets A , B , and C . The idea is that A ∩ B , A ∩ C , and B ∩ C
have all been overcounted. However, subtracting all of these will overcompensate, because the elements of A ∩ B ∩ C have been
subtracted too many times, so they need to be added back in.
Exercise 10.2.1
The following general formula calculates the cardinality of the union of any number of sets, by adding or subtracting the cardinality
of every possible intersection of the sets. It is called the Inclusion-Exclusion formula, because it works by adding (or “including”)
the cardinalities of certain sets, and subtracting (or “excluding”) the cardinalities of certain other sets.
\[
|A_1 \cup \cdots \cup A_n| = \left( \sum_{i=1}^{n} |A_i| \right) - \left( \sum_{1 \le i < j \le n} |A_i \cap A_j| \right) + \cdots + \left( (-1)^{n+1} |A_1 \cap \cdots \cap A_n| \right) \tag{10.2.3}
\]
Of course, we can figure out the value of any one of the terms in the inclusion-exclusion formula, if we know the values of all of the other terms.
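To see the formula at work on concrete sets, here is a minimal Python sketch that evaluates the alternating sum over all non-empty collections of the sets and compares it with the size of the union computed directly. The function name and the sample sets are illustrative assumptions.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets: list[set]) -> int:
    """Evaluate the inclusion-exclusion sum for |A_1 ∪ ... ∪ A_n|."""
    total = 0
    for size in range(1, len(sets) + 1):
        sign = 1 if size % 2 == 1 else -1  # (-1)^(size + 1)
        for chosen in combinations(sets, size):
            total += sign * len(set.intersection(*chosen))
    return total


# Multiples of 2, 3 and 5 among 1..30, purely as sample data.
sample = [set(range(d, 31, d)) for d in (2, 3, 5)]
print(union_size_by_inclusion_exclusion(sample) == len(set().union(*sample)))
```

The sum has \(2^n - 1\) terms, so in practice the formula is most useful when the intersection sizes are known or easy to compute, as in the examples that follow.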
Example 10.2.3
Sandy’s class is at Calaway Park. There are 21 students in the class. At the end of the day, the teacher asks some questions and
determines the following:
Every student rode at least one of the roller coaster, the train, the log ride, or the bumper cars;
13 students rode the roller coaster;
students who rode the train; \(A_3\) will be the set of students who rode the log ride; and \(A_4\) will be the set of students who rode
\[
|A_1 \cap A_2 \cap A_3| + |A_1 \cap A_2 \cap A_4| + |A_1 \cap A_3 \cap A_4| + |A_2 \cap A_3 \cap A_4| - 6 = 10,
\]
so
\[
|A_1 \cap A_2 \cap A_3| + |A_1 \cap A_2 \cap A_4| + |A_1 \cap A_3 \cap A_4| + |A_2 \cap A_3 \cap A_4| = 16.
\]
Thus, the inclusion-exclusion formula tells us that
\[
21 = (13 + 6 + 12 + 15) - \sum_{1 \le i < j \le 4} |A_i \cap A_j| + 16 - 2,
\]
so \(\sum_{1 \le i < j \le 4} |A_i \cap A_j| = 39\). Unfortunately, this still isn’t quite what we’re looking for. The value we want is the number of
students who rode exactly two of the four rides. Again, similar reasoning shows that the number of students who rode the roller
coaster and the train but neither of the other two rides, will be given by:
\[
|A_1 \cap A_2| - |A_1 \cap A_2 \cap A_3| - |A_1 \cap A_2 \cap A_4| + |A_1 \cap A_2 \cap A_3 \cap A_4|.
\]
Similar formulas can be worked out for each of the other five pairs that can be formed from the four rides. What we have been
asked for, is the sum of these six formulas. This works out to
\[
\sum_{1 \le i < j \le 4} |A_i \cap A_j| - 3 \left( \sum_{1 \le i < j < k \le 4} |A_i \cap A_j \cap A_k| \right) + 6 |A_1 \cap A_2 \cap A_3 \cap A_4| = 39 - 3(16) + 6(2) = 3.
\]
Only three of the students rode exactly two of the four rides.
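The "exactly two" formula used at the end of this solution can be checked against direct counting on arbitrary sets. The sketch below does that for four sets; the function names and the sample sets are made up for illustration and are not the data of the example.

```python
from itertools import combinations


def intersection_sum(sets, size: int) -> int:
    """Sum of |intersection| over all ways to choose `size` of the sets."""
    return sum(len(set.intersection(*chosen)) for chosen in combinations(sets, size))


def exactly_two_by_formula(four_sets) -> int:
    """(sum of pairwise) - 3 * (sum of triple) + 6 * (fourfold), for four sets."""
    s2 = intersection_sum(four_sets, 2)
    s3 = intersection_sum(four_sets, 3)
    s4 = intersection_sum(four_sets, 4)
    return s2 - 3 * s3 + 6 * s4


def exactly_two_by_counting(four_sets) -> int:
    """Count the elements that lie in exactly two of the sets, directly."""
    universe = set().union(*four_sets)
    return sum(1 for x in universe if sum(x in s for s in four_sets) == 2)


# Hypothetical sample sets, only to exercise the identity.
sample = [{1, 2, 3, 4, 5}, {4, 5, 6, 7}, {1, 5, 7, 8}, {2, 5, 9}]
print(exactly_two_by_formula(sample), exactly_two_by_counting(sample))
```

The coefficients 3 and 6 arise because an element lying in three of the sets is counted in three pairwise intersections, and an element lying in all four is counted in six.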
This was a very complicated example. You should not expect to have to work out examples that are quite so tricky, but this gives
you an idea of the power of inclusion-exclusion. Here is a more straightforward application.
Example 10.2.4
In the Faculty of Arts and Science, the voting method used is “approval;” that is, regardless of the number of positions
available, each voter can mark as many boxes as they wish on their ballot.
Imagine that Prof. Li, Prof. Cheng, and Prof. Osborn were all nominated for two computer science positions on the
department’s search committee. Barb Hodgson notes the following facts when counting the ballots:
Prof. Cheng received 18 votes; Prof. Osborn received 15 votes, and Prof. Li received 10 votes.
Only one ballot had all three boxes marked.
Five of the ballots were marked for both Prof. Osborn and Prof. Li.
Ten of the ballots were marked for Prof. Cheng and Prof. Osborn.
Six of the ballots were marked for Prof. Cheng and Prof. Li.
How many members of the department voted in the election?
Solution
Again, we begin by establishing some notation. Let C be the set of ballots that were marked for Prof. Cheng; let O be the set
of ballots that were marked for Prof. Osborn; and let L be the set of ballots that were marked for Prof. Li. Then what we want
is |C ∪ O ∪ L| : the number of ballots that were marked for at least one of the three candidates; this is the same as the number
of people who voted.
Inclusion-Exclusion tells us that
|C ∪ O ∪ L| = |C | + |O| + |L| − |C ∩ O| − |C ∩ L| − |O ∩ L| + |C ∩ O ∩ L| .
We have been given all of the values on the right-hand side of this equation, so we see that
|C ∪ O ∪ L| = 18 + 15 + 10 − 10 − 6 − 5 + 1 = 23 .
There were 23 department members who voted in the election.
In fact, the information we have been given is enough for us to fill in the values in every piece of the Venn diagram.
The 9 ballots marked for Prof. Cheng and Prof. Osborn but not Prof. Li are determined from the fact that 10 people voted for Professors Cheng and Osborn, and only one of those voted for all three professors. Similarly, the 4 ballots marked for Prof. Li and Prof. Osborn but not Prof. Cheng are determined from the fact that 5 people voted for Professors Li and Osborn, and only one of those voted for all three professors; also, the 5 ballots marked for Prof. Cheng and Prof. Li but not Prof. Osborn are determined from the fact that 6 people voted for Professors Cheng and Li, and only one of those voted for all three professors.
From the above deductions, we see that of the 18 votes Prof. Cheng received, one ballot was marked for all 3 candidates; 9 were marked for Professors Cheng and Osborn (but not Li); and 5 were marked for Professors Cheng and Li (but not Osborn). The remaining 3 votes must have been for Prof. Cheng alone, allowing us to fill in that spot. Similarly, of the 15 votes Prof. Osborn received, one ballot was marked for all 3 candidates; 9 were marked for Professors Cheng and Osborn (but not Li); and 4 were marked for Professors Osborn and Li (but not Cheng). The remaining vote must have been for Prof. Osborn alone, allowing us to fill in that spot. Finally, all of Prof. Li’s 10 votes are accounted for between the 5 who voted for Professors Cheng and Li (but not Osborn), the 4 who voted for Professors Li and Osborn (but not Cheng) and the one who voted for all three, so we put a 0 into the final spot.
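The region-by-region deductions above follow a fixed pattern, which the sketch below records as a small function: subtract the triple overlap from each pairwise overlap, then subtract the appropriate regions from each single total. The function name, parameter names, and dictionary keys are illustrative assumptions.

```python
def venn_regions(c: int, o: int, l: int, co: int, cl: int, ol: int, col: int) -> dict:
    """Return the seven region counts of a three-set Venn diagram.

    c, o, l are the set sizes; co, cl, ol are the pairwise intersection
    sizes; col is the size of the triple intersection.
    """
    only_co = co - col
    only_cl = cl - col
    only_ol = ol - col
    return {
        "C only": c - only_co - only_cl - col,
        "O only": o - only_co - only_ol - col,
        "L only": l - only_cl - only_ol - col,
        "C and O only": only_co,
        "C and L only": only_cl,
        "O and L only": only_ol,
        "all three": col,
    }


regions = venn_regions(c=18, o=15, l=10, co=10, cl=6, ol=5, col=1)
print(regions)
print(sum(regions.values()))  # 23
```

Summing the seven regions gives the size of the union, which agrees with the 23 voters found by inclusion-exclusion.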
Exercise 10.2.2
1. Of 15 students in a stats class, 8 are math majors, 6 are CS majors, and 7 are in education. None are in all three, and none
have any other majors. There are two math/CS joint majors, and 3 CS majors who are in education. How many math
majors are in education? How many of the math majors are not in either CS or education?
2. Kevin has 165 apps on his phone. Every one of these that is not a game and was not free, requires internet access. Of these,
78 were free. Internet access is necessary for 124 of the apps to function fully. Of the apps on his phone, 101 are games.
Kevin has 62 games on his phone that require internet access; 48 of these were free. Out of all of the games on his phone,
58 were free. How many of the free apps on Kevin’s phone that aren’t games, require internet access?
3. How many integers between 1 and 60 are divisible by at least one of 2, 3, and 5?
4. In the 403 area code, how many of the 10-digit possible phone numbers (where any combination of digits is allowed)
contain at least one of each odd digit?
5. Assume |U | = 15 , |V | = 12 , and |U ∩ V | = 4 . Find |U ∪ V | .
6. Assume |R| = 13 , |S| = 17 , and |R ∪ S| = 25 . Find |R ∩ S| .
7. Assume |J| = 300 , |J ∪ L| = 500 , and |J ∩ L| = 150 . Find |L|.
8. At a small university, there are 90 students that are taking either Calculus or Linear Algebra (or both). If the Calculus class
has 70 students, and the Linear Algebra class has 35 students, then how many students are taking both Calculus and Linear
Algebra?
9. How many numbers from 1 to 5000 are divisible by either 3 or 17?
10. How many 12-digit numbers (in which the first digit is not 0) have either no 0 or no 5?
10.3: Summary
Pigeonhole Principle
Generalised Pigeonhole Principle
Even more generalised Pigeonhole Principle
Inclusion-Exclusion
SECTION OVERVIEW
3: Graph Theory
11: Basics of Graph Theory
11.1: Background
11.2: Basic Definitions, Terminology, and Notation
11.3: Deletion, Complete Graphs, and the Handshaking Lemma
11.4: Graph Isomorphisms
11.5: Summary
CHAPTER OVERVIEW
11: Basics of Graph Theory
11.1: Background
In combinatorics, what we call a graph has nothing to do with the x and y axes, and plotting. Here, a graph is the most
straightforward way you could think of to model a network. A network could be a computer network, a road network, a telephone
network; it doesn’t matter what kind of network it is. Conceptually, any network consists of a bunch of things (let’s call them
nodes) that are being connected in some fashion. To model this, we draw some points for the nodes, and we draw edges between
nodes that have a direct connection.
Leonhard Euler laid the foundations of graph theory in 1735, with his solution to the Königsberg bridge problem. Königsberg,
Prussia (now Kaliningrad, Russia) was a city on the river Pregel. The city included two islands in the river, as well as land on both
sides of the river, and there were seven bridges connecting the various parts of the city. The lay-out of the city and its bridges
looked something like this:
The question had been posed: is it possible for residents of Königsberg, out for Sunday strolls, to cross each of the seven bridges
exactly once? Better yet, can they do this and end up in the same part of the city where they started?
Euler modeled the problem using a graph, with a vertex for each part of the city (one for each bank, and one for each island), and
edges representing the bridges. His model looked like this:
The nodes A and C represent the two banks of the river, while B and D represent the islands. In the model, the question becomes:
can we trace all of the edges of this graph without lifting our pen from the paper or going over an edge more than once?
Euler was actually able to find an easy method you can use on any graph, to quickly work out whether or not this can be done for
that graph. Also, unlike what we saw in the Pigeonhole Principle, his method is constructive: if it can be done, his method shows
you how to do it. We’ll go over this method later, in Chapter 13.
This page titled 11.1: Background is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
11.1.1 https://math.libretexts.org/@go/page/60127
11.2: Basic Definitions, Terminology, and Notation
Now that we have an intuitive understanding of what a graph is, it is time to make a formal definition.
According to this definition, Euler’s model of the bridges of Königsberg is not actually a graph, because some of the vertices have
more than one edge between them (for example, there are two edges between A and B ), which makes E a multiset rather than a
set. This leads naturally to another definition.
Another situation that we might like to allow for some purposes but not allow for others, is the possibility of a connection that goes
from a node back to itself.
For most of the graph theory we cover in this course, we will only consider simple graphs. However, there are some results for
which the proof is identical whether or not the graph is simple, and other results that actually become easier to prove if we allow
multigraphs and/or loops, than if we only allow simple graphs. It is worthwhile and sometimes important to think about which of
our results apply to multigraphs (and/or graphs with loops), and which do not. From this point on, unless otherwise specified, you
should assume that any time the word “graph” is used, it means a simple graph. However, be aware that many of our definitions
and results generalise to multigraphs and to graphs or multigraphs with loops, even where we don’t specify this.
There is still more basic terminology that we need to establish before we can say much about a graph.
Definition: Endvertex
If e = {u, v} is an edge of a graph (or a multigraph, with or without loops), then we say that u and v are endvertices (singular:
endvertex) of e . We say that e is incident with u and v (or vice versa, the vertices are also incident with the edge), and that u
and v are adjacent since there is an edge joining them, or that u is a neighbour of v .
Notation
We use the notation u ∼ v to denote that u is adjacent to v . We may also denote the edge e = {u, v} by uv or by vu .
After one more definition, we will go through some examples using the terminology we have established.
If v ∈ V is a vertex of a graph (simple or multi, with or without loops), then the number of times v appears as the endvertex of
some edge is called the valency of v in G. (Many sources use degree rather than valency, but the word degree has many
meanings in mathematics, making valency a preferable term for this.) A vertex of valency 0 is called an isolated vertex.
Note
In a graph without loops, we can define the valency of any vertex v as the number of edges incident with v . For most purposes,
this is a good way to think of the valency. However, when a graph has loops, many formulas work out more nicely if we
consider each loop to contribute 2 to the valency of its endvertex. This fits the definition we have given, since a vertex v
appears twice as the endvertex of any loop incident with v .
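The convention in this note can be checked with a short sketch. The edge-list representation and the function name `valency` are assumptions made for illustration; they are not notation from the text.

```python
from collections import Counter

def valency(vertices, edges):
    """Return a dict mapping each vertex to its valency.

    `edges` is a list of pairs (u, v). A loop is written (v, v), and a
    multiple edge is simply listed more than once. Each appearance of a
    vertex as an endvertex adds 1, so a loop adds 2 to its endvertex.
    """
    count = Counter({v: 0 for v in vertices})
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return dict(count)

# Hypothetical multigraph: a double edge between a and b, and a loop at c.
example = valency(["a", "b", "c", "d"],
                  [("a", "b"), ("a", "b"), ("b", "c"), ("c", "c")])
print(example)
```

Here the loop at c is counted twice, and d, which appears in no edge, is an isolated vertex.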
Notation
Example 11.2.1
Figure 11.2.3 : Two different drawings of the same graph. (Copyright; author via source)
Example 11.2.2
Let the graph G be defined by \(V = \{w, x, y, z\}\) and \(E = \{e_1, e_2\}\), where \(e_1 = \{w, x\}\) and \(e_2 = \{w, y\}\). There are no loops
or multiple edges, so G is a simple graph. The edge \(e_2\) has endvertices w and y. The vertex w is incident with both \(e_1\) and \(e_2\).
The vertices x and y are not adjacent. The vertex z is an isolated vertex, as it has no neighbours. The vertex y has only one
neighbour, w. The valency of w is 2. The valency of x and the valency of y are both 1. In verifying all of these statements,
drawing a diagram of the graph might help you.
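As an alternative to drawing a diagram, the statements in this example can be verified mechanically. The sketch below stores each edge as a two-element `frozenset`; that representation and the helper name `neighbours` are assumptions for illustration only.

```python
# The graph of Example 11.2.2: e_1 = {w, x} and e_2 = {w, y}.
V = {"w", "x", "y", "z"}
E = {frozenset({"w", "x"}), frozenset({"w", "y"})}

def neighbours(v):
    """Vertices joined to v by an edge of E."""
    return {u for e in E if v in e for u in e if u != v}

# In a simple graph, the valency of v is the number of edges incident with v.
valencies = {v: sum(1 for e in E if v in e) for v in V}
isolated = {v for v in V if valencies[v] == 0}
x_adjacent_y = frozenset({"x", "y"}) in E

print(valencies, isolated, neighbours("y"), x_adjacent_y)
```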
Exercise 11.2.1
For each of the following graphs (which may or may not be simple, and may or may not have loops), find the valency of each
vertex. Determine whether or not the graph is simple, and if there is any isolated vertex. List the neighbours of a, and all edges
with which a is incident.
1. Let G be defined by \(V = \{a, b, c, d, e\}\) and \(E = \{e_1, e_2, e_3, e_4, e_5, e_6\}\) with \(e_1 = \{a, c\}\), \(e_2 = \{b, d\}\), \(e_3 = \{c, d\}\),
Exercise 11.2.2
1. Let G be the graph whose vertices are the 2-element subsets of {1, 2, 3, 4, 5}, with vertices {a, b} and {c, d} adjacent if
and only if {a, b} ∩ {c, d} = ∅ . Draw G.
2. The number of edges in the k-dimensional cube \(Q_k\) (which is an important structure in network design, but you do not need
to know the structure to solve this) can be found by the recurrence relation:
\[ e(Q_0) = 0; \qquad e(Q_n) = 2e(Q_{n-1}) + 2^{n-1} \quad \text{for } n \ge 1. \]
Use generating functions to solve this recurrence relation and therefore determine the number of edges in the k-dimensional
cube.
This page titled 11.2: Basic Definitions, Terminology, and Notation is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
11.2.3 https://math.libretexts.org/@go/page/60128
11.3: Deletion, Complete Graphs, and the Handshaking Lemma
We’ll begin this section by introducing a basic operation that can change a graph (or a multigraph, with or without loops) into a
smaller graph: deletion.
Notation
The graph obtained by deleting the vertex v from G is denoted by G ∖ {v} . We can delete more than one vertex; for any set
S ⊆ V of vertices of G, we use G ∖ S to denote the graph obtained by deleting all of the vertices of S from G.
The graph G ∖ {v} might be a multigraph, but only if G is. It could have loops, but only if G has loops.
If we begin with the graph
and delete the vertex D, then we obtain the graph shown in Figure 11.2.2.
We can also delete edges, rather than vertices.
Notation
The graph obtained by deleting the edge e from G is denoted by G ∖ {e}. We can delete more than one edge; for any set T ⊆ E
of edges of G, we use G ∖ T to denote the graph obtained by deleting all of the edges of T from G.
The graph G ∖ {e} might be a multigraph, but only if G is. It could have loops, but only if G has loops.
Notice that deleting the edges {C , D}, {a, D} and {c, D} from the graph drawn above, does not result in the graph shown in
Figure 11.2.2, because the graph we obtain by deleting these edges still has the vertex D (as an isolated vertex), whereas the
graph shown in Figure 11.2.2 has only the six vertices {a, b, c, A, B, C }.
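The difference between the two kinds of deletion can be sketched in code. The function names and the small graph below are hypothetical, since the graph drawn in the text is not reproduced here. Vertices are held in a set and edges are two-element `frozenset`s.

```python
def delete_vertices(V, E, S):
    """Delete the vertices in S, together with every edge incident with one of them."""
    S = set(S)
    return V - S, {e for e in E if not (e & S)}

def delete_edges(V, E, T):
    """Delete the edges in T; every vertex is kept, even if it becomes isolated."""
    return set(V), E - set(T)

# Hypothetical graph: a triangle on 1, 2, 3 with an extra vertex 4 joined to 3.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3}), frozenset({3, 4})}

V_a, E_a = delete_vertices(V, E, {4})               # vertex 4 is gone
V_b, E_b = delete_edges(V, E, {frozenset({3, 4})})  # vertex 4 remains, isolated
print(V_a, V_b, E_a == E_b)
```

Both operations leave the same edge set here, but the resulting graphs differ because only vertex deletion removes vertex 4.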
Vertex and edge deletion will be very useful for using proofs by induction on graphs (and multigraphs, with or without loops). It is
handy to have terminology for a graph that can be obtained from another graph by deleting vertices and/or edges.
Definition: Subgraph
Let G be a graph. If H can be obtained from G by deleting vertices and/or edges, then H is a subgraph of G. A subgraph H
of G is proper if H ≠ G .
Definition: Complete Graph
A (simple) graph in which every vertex is adjacent to every other vertex, is called a complete graph. If this graph has n
vertices, then it is denoted by \(K_n\).
The notation \(K_n\) for a complete graph on n vertices comes from the name of Kazimierz Kuratowski, a Polish mathematician
who lived from 1896–1980. Although his main area of research was logic, Kuratowski proved an important theorem that
involves a complete graph. We’ll study his theorem later in the course.
With this setup, we are ready to prove our first result about graphs.
Proposition 11.3.1
The number of edges of \(K_n\) is \(\frac{n(n-1)}{2} = \binom{n}{2}\).
We present two proofs of this proposition: first, a combinatorial proof; then, a proof by induction.
Proof
1) Combinatorial Proof: A complete graph has an edge between any pair of vertices. From n vertices, there are \(\binom{n}{2}\) pairs
that must be connected by an edge for the graph to be complete. Thus, there are \(\binom{n}{2}\) edges in \(K_n\).
Before giving the proof by induction, let’s show a few of the small complete graphs. In particular, we’ll need to have \(K_1\) in mind for the base case.
2) Proof By Induction: Base case: n = 1. As we can see above, the graph \(K_1\) has 0 edges. Also,
\[ \frac{n(n-1)}{2} = \frac{1(0)}{2} = 0. \]
So the equality holds for n = 1. This completes the proof of the base case.
Inductive step: We begin with the inductive hypothesis. Let k ≥ 1 be arbitrary, and suppose that \(K_k\) has \(\binom{k}{2}\) edges.
We want to deduce that \(K_{k+1}\) has \(\binom{k+1}{2}\) edges. Start with \(K_{k+1}\), and let the number of edges of this graph be t. Now we
delete a vertex v from \(K_{k+1}\). By the definition of vertex deletion, we must delete every edge incident with v. Since \(K_{k+1}\) is
complete, v is adjacent to every other vertex, so there are k edges incident with v, and it is precisely these edges that we
have deleted. There must be t − k edges remaining.
Notice that deleting v does not affect edges that are not incident with v. Therefore, if we consider any two vertices in the
remaining graph, they will still be adjacent (since they were adjacent in \(K_{k+1}\) and the edge between them was not deleted).
Thus the remaining graph is a complete graph on k vertices, that is, \(K_k\).
Using our inductive hypothesis, we know that \(K_k\) has \(\frac{k(k-1)}{2}\) edges. We have shown that \(t - k = \frac{k(k-1)}{2}\), so
\[ t = \frac{k(k-1)}{2} + k = k\left(\frac{k-1}{2} + 1\right) = \frac{k(k+1)}{2} = \binom{k+1}{2}, \]
which is what we wanted to deduce. This completes the proof of the inductive step.
By the Principle of Mathematical Induction, \(K_n\) has \(\binom{n}{2}\) edges for every n ≥ 1.
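Both proofs can be sanity-checked for small n. The sketch below builds a complete graph on the vertices 1, …, n (the function name `complete_graph` is an assumption for illustration), compares its edge count with the formula, and then replays the inductive step for k = 5.

```python
from itertools import combinations
from math import comb

def complete_graph(n):
    """Return (V, E) for a complete graph on the vertices 1, ..., n."""
    V = set(range(1, n + 1))
    E = {frozenset(pair) for pair in combinations(V, 2)}
    return V, E

# The edge count agrees with n(n - 1)/2 and with the binomial coefficient.
for n in range(1, 8):
    V, E = complete_graph(n)
    assert len(E) == n * (n - 1) // 2 == comb(n, 2)

# Inductive step with k = 5: deleting vertex 6 from a complete graph on
# 6 vertices removes the k edges incident with it.
V6, E6 = complete_graph(6)
remaining = {e for e in E6 if 6 not in e}
removed = len(E6) - len(remaining)
print(removed, remaining == complete_graph(5)[1])
```

A check like this only covers finitely many cases, of course; it illustrates the proposition rather than proving it.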
Although this proof by induction may seem ridiculously long and complicated in comparison with the combinatorial proof, it
serves as a relatively simple illustration of how proofs by induction can work on graphs. This can be a very powerful technique for
proving results about graphs.
Here is another result that can be proven using either a combinatorial proof, or a proof by induction.
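Lemma 11.3.1: Euler’s Handshaking Lemma
For any graph (or multigraph, with or without loops), the sum of the valencies of all of the vertices is twice the number of edges:
\[ \sum_{v \in V} \operatorname{val}(v) = 2|E|, \]
where \(\operatorname{val}(v)\) denotes the valency of v.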
This is called the handshaking lemma because it is often explained using vertices to represent people, and edges as handshakes
between people. In this explanation, the lemma says that if you add up all of the hands shaken by all of the people, you will get
twice the number of handshakes that took place. This is an example of using two ways to count pairs (v, e) ∈ V × E such that
v is incident with e , a notion that we discussed briefly when we introduced combinatorial proofs.
Proof
For the left-hand side of the equation, at every vertex we count the number of edges incident with that vertex. To get the
right-hand side from this, observe that this process results in every edge having been counted exactly twice (once at each of
its two endvertices; or, in the case of a loop, twice at its single endvertex since both ends are there).
Although from the right perspective the handshaking lemma might seem obvious, it has a very important and useful corollary.
Corollary 11.3.1
Every graph has an even number of vertices of odd valency.
Proof
Since the sum of all of the valencies in the graph is even (by Euler’s handshaking lemma, Lemma 11.3.1), the number of
odd summands in this sum must be even. That is, the number of vertices that have odd valency must be even.
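The handshaking lemma and its corollary can be tested empirically on randomly generated multigraphs with loops. This is only a sketch: the generator, the seeds, and the function names are assumptions for illustration, and passing the check is evidence rather than a proof.

```python
import random

def random_multigraph(n, m, seed):
    """Return m random edges on the vertices 0..n-1; loops and repeated edges are allowed."""
    rng = random.Random(seed)
    return [(rng.randrange(n), rng.randrange(n)) for _ in range(m)]

def valencies(n, edges):
    """Count how many times each vertex appears as an endvertex of an edge."""
    val = [0] * n
    for u, v in edges:
        val[u] += 1
        val[v] += 1  # when u == v (a loop), the vertex is counted twice
    return val

def check(n, m, seed):
    edges = random_multigraph(n, m, seed)
    val = valencies(n, edges)
    handshake = sum(val) == 2 * len(edges)
    odd_count_is_even = sum(1 for d in val if d % 2 == 1) % 2 == 0
    return handshake and odd_count_is_even

all_ok = all(check(n=8, m=m, seed=s) for m in range(20) for s in range(25))
print(all_ok)
```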
Exercise 11.3.1
3. Show that there is a way of deleting an edge and a vertex from \(K_7\) (in that order) so that the resulting graph is complete.
Show that there is a way of deleting an edge and a vertex from \(K_7\) (in that order) so that the resulting graph is not complete.
4. Prove Corollary 11.3.1 by induction on the number of edges. (Use edge deletion, and remember that the base case needs to
be when there are no edges.)
5. Use graphs to give a combinatorial proof that
\[ \sum_{i=1}^{k} \binom{n_i}{2} \le \binom{n}{2}, \]
where \(\sum_{i=1}^{k} n_i = n\). Under what circumstances does equality hold?
This page titled 11.3: Deletion, Complete Graphs, and the Handshaking Lemma is shared under a CC BY-NC-SA license and was authored,
remixed, and/or curated by Joy Morris.
11.3.3 https://math.libretexts.org/@go/page/60129
11.4: Graph Isomorphisms
There is a problem with the way we have defined \(K_n\). A graph is supposed to consist of two sets, V and E. Unless the elements of
the sets are labeled, we cannot distinguish amongst them. Here are two graphs, G and H:
Which of these graphs is \(K_2\)? They can’t both be \(K_2\) since they aren’t the same graph – can they?
The answer lies in the concept of isomorphisms. Intuitively, graphs are isomorphic if they are identical except for the labels (on the
vertices). Recall that as shown in Figure 11.2.3, since graphs are defined by the sets of vertices and edges rather than by the
diagrams, two isomorphic graphs might be drawn so as to look quite different.
Definition: Isomorphism
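Two graphs \(G_1 = (V_1, E_1)\) and \(G_2 = (V_2, E_2)\) are isomorphic if there is a bijection (a one-to-one, onto map) φ from \(V_1\) to \(V_2\)
such that
\[ \{v, w\} \in E_1 \Leftrightarrow \{\varphi(v), \varphi(w)\} \in E_2. \]
In this case, we call φ an isomorphism from \(G_1\) to \(G_2\).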
Notation
When φ is an isomorphism from \(G_1\) to \(G_2\), we abuse notation by writing \(\varphi : G_1 \to G_2\) even though φ is actually a map on
the vertex sets.
We also write \(G_1 \cong G_2\) for “\(G_1\) is isomorphic to \(G_2\).”
So a graph isomorphism is a bijection that preserves edges and non-edges. If you have seen isomorphisms of other mathematical
structures in other courses, they would have been bijections that preserved some important property or properties of the structures
they were mapping. For graphs, the important property is which vertices are connected to each other. If that is preserved, then the
networks being represented are for all intents and purposes, the same.
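To make "preserves edges and non-edges" concrete, here is a minimal sketch that tests whether a given map is an isomorphism. The function name `is_isomorphism`, the use of a dict for φ, and the two small graphs are assumptions for illustration.

```python
from itertools import combinations

def is_isomorphism(phi, V1, E1, V2, E2):
    """Check that phi (a dict from V1 to V2) is a bijection preserving edges and non-edges."""
    # phi must be defined on all of V1 and hit all of V2; with equal sizes it is a bijection.
    if set(phi) != set(V1) or set(phi.values()) != set(V2) or len(V1) != len(V2):
        return False
    # {v, w} is an edge of the first graph exactly when {phi(v), phi(w)} is an edge of the second.
    return all(
        (frozenset({v, w}) in E1) == (frozenset({phi[v], phi[w]}) in E2)
        for v, w in combinations(V1, 2)
    )

# Two hypothetical drawings of a path on three vertices.
V1, E1 = {"a", "b", "c"}, {frozenset({"a", "b"}), frozenset({"b", "c"})}
V2, E2 = {1, 2, 3}, {frozenset({1, 2}), frozenset({1, 3})}

good = {"a": 2, "b": 1, "c": 3}  # sends the middle vertex b to the middle vertex 1
bad = {"a": 1, "b": 2, "c": 3}   # sends the edge {b, c} to the non-edge {2, 3}
print(is_isomorphism(good, V1, E1, V2, E2), is_isomorphism(bad, V1, E1, V2, E2))
```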
Recall from Math 2000 that a relation is called an equivalence relation if it satisfies three properties. It must be:
reflexive (every object must be related to itself);
symmetric (if object A is related to object B , then object B must also be related to object A ); and
transitive (if object A is related to object B and object B is related to object C , then object A must be related to object C ).
The relation “is isomorphic to” is an equivalence relation on graphs. To see this, observe that:
For any graph G, we have G ≅G by the identity map on the vertices;
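For any graphs \(G_1\) and \(G_2\), we have
\[ G_1 \cong G_2 \Rightarrow G_2 \cong G_1, \]
since any bijection has an inverse function that is also a bijection, and since
\[ \{v, w\} \in E_1 \Leftrightarrow \{\varphi(v), \varphi(w)\} \in E_2 \qquad (11.4.3) \]
is equivalent to
\[ \{\varphi^{-1}(v), \varphi^{-1}(w)\} \in E_1 \Leftrightarrow \{v, w\} \in E_2; \qquad (11.4.4) \]
For any graphs \(G_1\), \(G_2\), and \(G_3\), if \(\varphi_1 : G_1 \to G_2\) and \(\varphi_2 : G_2 \to G_3\) are isomorphisms, then the composition \(\varphi_2 \circ \varphi_1\) is a bijection from \(V_1\) to \(V_3\) that preserves edges and non-edges,
so \(G_1 \cong G_3\).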
The answer to our question about complete graphs is that any two complete graphs on n vertices are isomorphic, so even though
technically the set of all complete graphs on 2 vertices is an equivalence class of the set of all graphs, we can ignore the labels and
give the name \(K_2\) to all of the graphs in this class.
Example 11.4.1
φ(c) = y ;
φ(d) = x ;
φ(e) = w
To prove that two graphs are isomorphic, we must find a bijection that acts as an isomorphism between them. If we want to prove
that two graphs are not isomorphic, we must show that no bijection can act as an isomorphism between them.
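For very small graphs, both tasks can be carried out by exhaustive search: try every bijection, and either return one that works or report that none does. The sketch below is a brute-force illustration under that assumption, not a practical algorithm, since the number of bijections on n vertices is n!. The function name and the example graphs are hypothetical.

```python
from itertools import combinations, permutations

def find_isomorphism(V1, E1, V2, E2):
    """Return a dict V1 -> V2 that is an isomorphism, or None if no bijection works."""
    V1, V2 = sorted(V1), sorted(V2)
    if len(V1) != len(V2) or len(E1) != len(E2):
        return None
    for image in permutations(V2):
        phi = dict(zip(V1, image))
        if all((frozenset({v, w}) in E1) == (frozenset({phi[v], phi[w]}) in E2)
               for v, w in combinations(V1, 2)):
            return phi
    return None

def edge_set(pairs):
    """Turn strings such as 'ab' into two-element frozensets."""
    return {frozenset(p) for p in pairs}

# A 4-cycle, a relabelled 4-cycle, and a triangle with one pendant vertex.
V_cycle, E_cycle = set("abcd"), edge_set(["ab", "bc", "cd", "da"])
V_other = set("wxyz")
E_relabelled = edge_set(["wy", "yx", "xz", "zw"])
E_pendant = edge_set(["wx", "xy", "wy", "yz"])

phi = find_isomorphism(V_cycle, E_cycle, V_other, E_relabelled)
none_found = find_isomorphism(V_cycle, E_cycle, V_other, E_pendant)
print(phi, none_found)
```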
Sometimes it can be very difficult to determine whether or not two graphs are isomorphic. It is possible to create very large graphs
that are very similar in many respects, yet are not isomorphic. A common approach to this problem has been attempting to find an
“invariant” that will distinguish between non-isomorphic graphs. An “invariant” is a graph property that remains the same for all
graphs in any isomorphism class. Thus, if you can find an invariant that is different for two graphs, you know that these graphs
must not be isomorphic. We say in this case that this invariant distinguishes between these two graphs.
Mathematicians have come up with many, many graph invariants. Unfortunately, so far, for every known invariant it is possible to
find two graphs that are not isomorphic, but for which the invariant is the same. In other words, no known invariant distinguishes
between every pair of non-isomorphic graphs.
As an aside for those of you who may know what this means (probably those in computer science), the graph isomorphism problem is
particularly interesting because it is one of a very few (possibly two, the other being integer factorisation) problems that are known
to be in NP but that are not known to be either in P, or to be NP-complete.
We give a few graph invariants in the following proposition.
Proposition 11.4.1
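If \(G_1 \cong G_2\) are graphs, then
1. \(G_1\) and \(G_2\) have the same number of vertices;
2. \(G_1\) and \(G_2\) have the same number of edges;
3. if we list the valency of every vertex of \(G_1\) and do the same for \(G_2\), the lists will be the same (though possibly in a different
order).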
Proof
1. Since \(G_1 \cong G_2\), there is an isomorphism \(\varphi : V_1 \to V_2\) (where \(V_1\) is the vertex set of \(G_1\) and \(V_2\) is the vertex set of \(G_2\)).
Since φ is a bijection, \(|V_1| = |V_2|\).
2. Since
\[ \{v, w\} \in E_1 \Rightarrow \{\varphi(v), \varphi(w)\} \in E_2, \qquad (11.4.6) \]
we see that for every edge of \(E_1\), there is an edge of \(E_2\). Therefore, \(|E_2| \ge |E_1|\). Similarly, since
\[ \{\varphi(v), \varphi(w)\} \in E_2 \Rightarrow \{v, w\} \in E_1, \]
we see that \(|E_1| \ge |E_2|\). So \(|E_1| = |E_2|\).
Example 11.4.2
The graph G of Example 11.4.1 is not isomorphic to \(K_5\), because \(K_5\) has \(\binom{5}{2} = 10\) edges by Proposition 11.3.1, but G has
only 5 edges. Notice that the number of vertices, despite being a graph invariant, does not distinguish these two graphs.
The graphs G and H :
are not isomorphic. Each of them has 6 vertices and 9 edges. However, the graph G has two vertices of valency 2 (a and c ),
two vertices of valency 3 (d and e ), and two vertices of valency 4 (b and f ). Meanwhile, the graph H has one vertex of
valency 2 (w), four vertices of valency 3 (u, x, y , and z ), and one vertex of valency 4 (v ). Although each of these lists has the
same values (2s, 3s, and 4s), the lists are not the same since the number of entries that contain each of the values is different. In
particular, the two vertices a and c both have valency 2, but there is only one vertex of H (vertex w) of valency two. Either a
or c could be sent to w by an isomorphism, but either choice leaves no possible image for the other vertex of valency 2.
Therefore, an isomorphism between these graphs is not possible.
Observe that the two graphs
both have 6 vertices and 7 edges, and each has four vertices of valency 2 and two vertices of valency 3. Nonetheless, these
graphs are not isomorphic. Perhaps you can think of another graph invariant that is not the same for these two graphs.
To prove that these graphs are not isomorphic, since each has two vertices of valency 3, any isomorphism would have to map
{c, f } to {v, z} . Now, whichever vertex gets mapped to u must be a mutual neighbour of c and f since u is a mutual
neighbour of v and z. But c and f have no mutual neighbours, so this is not possible. Therefore there is no isomorphism between
these graphs.
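The list of valencies is easy to compute, which is part of what makes it a convenient invariant. The sketch below uses four small hypothetical graphs (not the ones pictured in the text): one pair that the invariant distinguishes, and one pair that it fails to distinguish even though the graphs are not isomorphic.

```python
def valency_list(V, E):
    """Sorted list of valencies of a simple graph; E holds two-element frozensets."""
    return sorted(sum(1 for e in E if v in e) for v in V)

def edge_set(pairs):
    return {frozenset(p) for p in pairs}

V4 = {0, 1, 2, 3}
star = edge_set([(0, 1), (0, 2), (0, 3)])  # one centre joined to three others
path = edge_set([(0, 1), (1, 2), (2, 3)])  # a path with three edges

V6 = set(range(6))
hexagon = edge_set([(i, (i + 1) % 6) for i in range(6)])
two_triangles = edge_set([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])

# Same numbers of vertices and edges, but different valency lists: not isomorphic.
print(valency_list(V4, star), valency_list(V4, path))
# Identical valency lists, yet these two graphs are still not isomorphic.
print(valency_list(V6, hexagon), valency_list(V6, two_triangles))
```

In the second pair, one graph is a single cycle and the other falls into two separate triangles, so a different invariant is needed to tell them apart.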
A natural problem to consider is: how many different graphs are there on n vertices? If we are not worrying about whether or
not the graphs are isomorphic, we could have infinitely many graphs just by changing the labels on the vertices, and that’s not
very interesting. To avoid this problem, we fix the set of labels that we use. Label the vertices with the elements of {1, . . . , n}.
We’ll call the number of graphs we find, the number of labeled graphs on n vertices.
Any edge is a 2-subset of \(\{1, \ldots, n\}\). There are \(\binom{n}{2}\) possible edges in total. Any graph is formed by taking a subset of the
\(\frac{n(n-1)}{2}\) possible edges. In Example 4.1.1, we learned how to count these: there are \(2^{n(n-1)/2}\) subsets.
Example 11.4.3
When n = 1, we have \(\binom{1}{2} = 0\), and \(2^0 = 1\), so there is exactly one labeled graph on 1 vertex. It looks like this:
When n = 2, we have \(\binom{2}{2} = 1\), and \(2^1 = 2\), so there are exactly two labeled graphs on 2 vertices. They look like this:
When n = 3, we have \(\binom{3}{2} = 3\), and \(2^3 = 8\), so there are exactly eight labeled graphs on 3 vertices. They look like this:
When n = 4, we have \(\binom{4}{2} = 6\), and \(2^6 = 64\), so there are exactly sixty-four labeled graphs on 4 vertices. We won’t attempt to
draw them all here.
Although that answer is true as far as it goes, you will no doubt observe that even though we are using a fixed set of labels,
some of the graphs we’ve counted are isomorphic to others. A more interesting question would be, how many isomorphism
classes of graphs are there on n vertices? Since we are considering isomorphism classes, the labels we choose for the vertices
are largely irrelevant except to tell us which vertices are connected to which other vertices, if we don’t have a diagram. Thus, if
we are drawing the graphs, we usually omit vertex labels and refer to the resulting graphs (each of which represents an
isomorphism class) as unlabeled. So the question is, how many unlabeled graphs are there on n vertices?
We can work out the answer to this for small values of n . From the labeled graphs on 3 vertices, you can see that there are four
unlabeled graphs on 3 vertices. These are:
There are 11 unlabeled graphs on four vertices. Unfortunately, since there is no known polynomial-time algorithm for solving
the graph isomorphism problem, determining the number of unlabeled graphs on n vertices gets very hard as n gets large, and
no general formula is known.
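The counts quoted in this example can be reproduced by brute force for n ≤ 4. The sketch below enumerates every labeled graph on {1, …, n} and groups them into isomorphism classes by computing, for each graph, the smallest edge list obtainable by relabelling. That "canonical form" is one simple approach chosen for illustration, and it takes n! steps per graph, so it is only feasible for tiny n.

```python
from itertools import combinations, permutations

def labeled_graphs(n):
    """Yield the edge set of every labeled graph on the vertex set {1, ..., n}."""
    possible = list(combinations(range(1, n + 1), 2))
    for r in range(len(possible) + 1):
        for chosen in combinations(possible, r):
            yield frozenset(chosen)

def canonical_form(n, edges):
    """Smallest relabelling of `edges` over all permutations of {1, ..., n}.

    Two labeled graphs are isomorphic exactly when their canonical forms agree.
    """
    best = None
    for perm in permutations(range(1, n + 1)):
        relabel = dict(zip(range(1, n + 1), perm))
        image = tuple(sorted(tuple(sorted((relabel[u], relabel[v]))) for u, v in edges))
        if best is None or image < best:
            best = image
    return best

def count_graphs(n):
    """Return (number of labeled graphs, number of unlabeled graphs) on n vertices."""
    graphs = list(labeled_graphs(n))
    classes = {canonical_form(n, g) for g in graphs}
    return len(graphs), len(classes)

counts = {n: count_graphs(n) for n in range(1, 5)}
print(counts)
```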
Exercise 11.4.1
For each of the following pairs of graphs, find an isomorphism or prove that the graphs are not isomorphic.
1)
2)
3) \(G_1 = (V_1, E_1)\) and \(G_2 = (V_2, E_2)\) with \(V_1 = \{a, b, c, d\}\), \(V_2 = \{A, B, C, D\}\), \(E_1 = \{ab, ac, ad\}\),
\(E_2 = \{BC, CD, BD\}\).
Exercise 11.4.2
1. Draw five unlabeled graphs on 5 vertices that are not isomorphic to each other.
2. How many labeled graphs on 5 vertices have 1 edge?
3. How many labeled graphs on 5 vertices have 3 or 4 edges?
This page titled 11.4: Graph Isomorphisms is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
11.4.5 https://math.libretexts.org/@go/page/60130
11.5: Summary
Graphs are defined by sets, not by diagrams
Deleting a vertex or edge
How to use proofs by induction on graphs
Euler’s handshaking lemma
Graph invariants, distinguishing between graphs
Labeled and unlabeled graphs
Important Definitions:
Graph, Vertex, Edge
Loop, Multiple Edge, Multigraph, Simple Graph
Endvertex, incident, adjacent, neighbour
Degree, Valency, Isolated Vertex
Subgraph
Complete Graph
Isomorphic graphs, Isomorphism between graphs
Notation:
u ∼ v
uv
G ∖ {v}, G ∖ {e}
\(K_n\)
\(G_1 \cong G_2\)
This page titled 11.5: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
11.5.1 https://math.libretexts.org/@go/page/60131
CHAPTER OVERVIEW
This page titled 12: Moving Through Graphs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
12.1: Directed Graphs
Some networks include connections that only allow travel in one direction (one-way roads; transmitters that are not receivers, etc.).
These can be modeled using directed graphs.
When drawing a digraph, we draw an arrow on each arc so that it points from the first vertex of the ordered pair to the second
vertex.
As with multigraphs, we will not study digraphs in this course, but you should be aware of the basic definition. Many of the results we
will cover in this course generalise to the context of digraphs.
Example 12.1.1
A digraph.
We will give one example of generalising a result on graphs, to the context of digraphs. In order to do so, we need a definition.
Definition: Outvalency and Invalency
The outvalency or outdegree of a vertex v in a digraph is the number of arcs whose first entry is v , i.e.,
|{w ∈ V |(v, w) ∈ A}|. (12.1.2)
The invalency or indegree of a vertex v in a digraph is the number of arcs whose second entry is v .
Notation
The outvalency of vertex v is denoted by \(d^+(v)\). The invalency of vertex v is denoted by \(d^-(v)\).
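Lemma: Euler’s Handshaking Lemma for Digraphs
For any digraph with vertex set V and arc set A,
\[ \sum_{v \in V} d^+(v) = \sum_{v \in V} d^-(v) = |A|. \]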
Proof
For the left-hand side of the equation, at every vertex we count the number of arcs that begin at that vertex. Since each of
these arcs ends at some vertex, we get the same result in the middle part of the equation, where at every vertex we count the
number of arcs that end at that vertex. In each case, we have counted every arc precisely once, so both of these values are
equal to the right-hand side of the equation, the number of arcs in the digraph.
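The counting argument in this proof can be mirrored directly in code. In the sketch below, arcs are ordered pairs (first vertex, second vertex); the function names and the small digraph are hypothetical and are not the digraph of Example 12.1.1.

```python
def out_valency(v, arcs):
    """Number of arcs whose first entry is v."""
    return sum(1 for (a, b) in arcs if a == v)

def in_valency(v, arcs):
    """Number of arcs whose second entry is v."""
    return sum(1 for (a, b) in arcs if b == v)

# Hypothetical digraph on four vertices with five arcs.
V = ["p", "q", "r", "s"]
A = [("p", "q"), ("q", "r"), ("r", "p"), ("p", "r"), ("s", "p")]

total_out = sum(out_valency(v, A) for v in V)
total_in = sum(in_valency(v, A) for v in V)
print(total_out, total_in, len(A))
```

Each arc is counted once at its first vertex and once at its second vertex, so the two totals agree with the number of arcs.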
Exercise 12.1.1
1. Use induction to prove Euler’s handshaking lemma for digraphs that have no loops (arcs of the form (v, v)) or multiarcs
(more than one arc from some vertex u to some other vertex v).
2. A digraph isomorphism is a bijection on the vertices that preserves the arcs. Come up with a digraph invariant, and prove
that it is an invariant.
3. List the indegree and outdegree of each vertex of the digraph from Example 12.1.1.
This page titled 12.1: Directed Graphs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
12.1.2 https://math.libretexts.org/@go/page/60133
12.2: Walks and Connectedness
Graphs can be connected or disconnected. Intuitively, this corresponds to the network being connected or disconnected – is it
possible to travel from any node to any other node? When a graph (or network) is disconnected, it has broken down into some
number of separate connected components - the pieces that still are connected.
Since this is mathematics, we require more formal definitions, to ensure that the meanings are not open to misunderstanding.
Before we can define connectedness, we need the concept of a walk in a graph.
Definition: Walk
A walk in a graph G is a sequence of vertices \((u_1, u_2, \ldots, u_n)\) such that for every 1 ≤ i ≤ n − 1, we have \(u_i \sim u_{i+1}\). (That
is, consecutive vertices in the walk must be adjacent.)
A u − v walk in G is a walk with \(u_1 = u\) and \(u_n = v\). (That is, a walk that begins at u and ends at v.)
Definition: Connected
The graph G with vertex set V is connected if for every u, v ∈ V , there is a u − v walk.
The connected component of G that contains the vertex u, is
{v ∈ V | there is a u − v walk.} (12.2.1)
This definition of connected component seems to depend significantly on the choice of the vertex u. In fact, though, being in the
same connected component of G is an equivalence relation on the vertices of G, so the connected components of G are a property
of G itself, rather than depending on particular choices of vertices. We won’t go through a formal proof that being in the same
connected component is an equivalence relation (we leave this as an exercise below), but we will go through the proof of a
proposition that is closely related.
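The connected component containing u can be computed by repeatedly stepping to adjacent vertices until nothing new is reached. The sketch below uses a breadth-first search, which is a standard technique rather than something taken from the text; the function names and the example graph are assumptions.

```python
from collections import deque

def connected_component(u, V, E):
    """Set of vertices v for which there is a u-v walk in the simple graph (V, E)."""
    adjacent = {v: set() for v in V}
    for e in E:
        a, b = tuple(e)
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen = {u}  # the one-vertex sequence (u) is a u-u walk
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adjacent[x] - seen:
            seen.add(y)
            queue.append(y)
    return seen

def is_connected(V, E):
    """A graph is connected if every vertex's component is the whole vertex set."""
    return all(connected_component(u, V, E) == set(V) for u in V)

# Hypothetical graph with three connected components.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({4, 5})}
print(connected_component(1, V, E), connected_component(4, V, E), is_connected(V, E))
```

Notice that starting the search from 1, 2, or 3 returns the same set, which is what we would expect if being in the same connected component is an equivalence relation.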
Proposition 12.2.1
Let G be a graph, and let u, v , w ∈ V (G) . Suppose that v and w are in the connected component of G that contains the vertex
u. Then w is in the connected component of G that contains the vertex v .
Proof
Since v and w are in the connected component of G that contains the vertex u, by definition there is a u − v walk, and a
u − w walk. Let \((u = u_1, u_2, \ldots, u_k = w)\) be a u − w walk, and let \((u = v_1, v_2, \ldots, v_m = v)\) be a u − v walk.
We need to show that w is in the connected component of G that contains the vertex v ; by definition, this is equivalent to
showing that there is a v − w walk. Consider the sequence of vertices:
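\[ (v = v_m, v_{m-1}, \ldots, v_1 = u = u_1, u_2, \ldots, u_k = w). \qquad (12.2.2) \]
Since \((u = v_1, v_2, \ldots, v_m = v)\) is a u − v walk, consecutive vertices are adjacent, so consecutive vertices in the first part
of the given sequence (from \(v = v_m\) through \(v_1 = u\)) are adjacent. Similarly, since \((u = u_1, u_2, \ldots, u_k = w)\) is a u − w
walk, consecutive vertices are adjacent, so consecutive vertices in the last part of the given sequence (from \(u = u_1\) through
\(u_k = w\)) are adjacent. Therefore the given sequence is a v − w walk, which completes the proof.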
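The construction in this proof, reversing the u − v walk and then following the u − w walk, can be written out as a short sketch. The function names and the small graph are hypothetical.

```python
def is_walk(seq, E):
    """True if consecutive vertices of seq are adjacent in the simple graph with edge set E."""
    return all(frozenset({a, b}) in E for a, b in zip(seq, seq[1:]))

def join_walks(u_to_v, u_to_w):
    """Reverse a u-v walk and append a u-w walk, giving a v-w walk."""
    assert u_to_v[0] == u_to_w[0], "both walks must start at the same vertex u"
    return list(reversed(u_to_v)) + list(u_to_w[1:])

# Hypothetical graph with edges ab, bc and ad; here u = a, v = c and w = d.
E = {frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"a", "d"})}
walk_to_v = ["a", "b", "c"]
walk_to_w = ["a", "d"]

joined = join_walks(walk_to_v, walk_to_w)
print(joined, is_walk(joined, E))
```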
When discussing walks, it is convenient to have standard terminology for describing the length of the walk.
Unfortunately, there is some disagreement amongst mathematicians as to whether the length of a walk should be used to mean the
number of vertices in the walk, or the number of edges in the walk. We will use the latter convention throughout this course
because it is consistent with the definition of the length of a cycle (which will be introduced in the next section). You should be
aware, though, that you might find the other convention used in other sources.
Sometimes it is obvious that a graph is disconnected from the way it has been drawn, but sometimes it is less obvious. In the
following example, you might not immediately notice whether or not the graph is connected.
Example 12.2.1
Find a walk of length 4 from a to f . Find the connected component that contains a . Is the graph connected?
Solution
A walk of length 4 from a to f is (a, c, a, c, f). (Notice that the vertices and edges used in a walk need not be distinct.)
Remember that the length of this walk is the number of edges used, which is one less than the number of vertices in the
sequence!
The connected component that contains a is {a, c, e, f }. There are walks from a to each of these vertices, but there are no
edges between any of these vertices and any of the vertices {b, d, g, h}.
Since there is no walk from a to b (for example), the graph is not connected.
Exercise 12.2.1
1) Prove that being in the same connected component of G is an equivalence relation on the vertices of any graph G.
2) Is the following graph connected? Find the connected component that contains a . Find a walk of length 5 from a to f .
3) Is the following graph connected? Find the connected component that contains a . Find a walk of length 3 from a to d .
4) Use Euler’s Handshaking Lemma to prove (by contradiction) that if G is a connected graph with n vertices and n−1
5) Fix n ≥ 1 . Prove by induction on m that for any m ≥ 0 , a graph with n vertices and m edges has at least n − m connected
components.
This page titled 12.2: Walks and Connectedness is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
12.2.3 https://math.libretexts.org/@go/page/60134
12.3: Paths and Cycles
Recall the definition of a walk. As we saw in Example 12.2.1, the vertices and edges in a walk do not need to be distinct.
There are many circumstances under which we might not want to allow edges or vertices to be re-visited. Efficiency is one possible
reason for this. We have a special name for a walk that does not allow vertices to be re-visited.
Definition: Path
A walk in which no vertex appears more than once is called a path.
Notation
For n ≥ 0 , a graph on n + 1 vertices whose only edges are those used in a path of length n (which is a walk of length n that is
also a path) is denoted by P_n. (Notice that P_0 ≅ K_1 and P_1 ≅ K_2.)
Notice that if an edge were to appear more than once in a walk, then both of its endvertices would also have to appear more than
once, so a path does not allow vertices or edges to be re-visited.
Example 12.3.1
In the graph [figure], (a, f , c, h) is a path of length 3. However, (a, f , c, h, d, f ) is not a path, even though no edges are
repeated, since the vertex f appears twice.
Proposition 12.3.1
Suppose that u and v are in the same connected component of a graph. Then any u − v walk of minimum length is a path. In
particular, if there is a u − v walk, then there is a u − v path.
Proof
Since u and v are in the same connected component of a graph, there is a u − v walk.
Towards a contradiction, suppose that we have a u − v walk of minimum length that is not a path. By the definition of a
path, this means that some vertex x appears more than once in the walk, so the walk looks like:
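(u = u_1, . . . , u_i = x, . . . , u_j = x, . . . , u_k = v), (12.3.1)
where i < j. Deleting the vertices u_{i+1}, . . . , u_j from this sequence gives the second sequence
(u = u_1, . . . , u_i = x, u_{j+1}, . . . , u_k = v)
(if j = k, the second sequence simply ends at u_i = x = v).
Since consecutive vertices were adjacent in the first sequence, they are also adjacent in the second sequence, so the second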
sequence is a walk. The length of the first walk is k − 1 , and the length of the second walk is k − 1 − (j − i) . Since j > i ,
the second walk is strictly shorter than the first walk. In particular, the first walk was not a u − v walk of minimum length.
This contradiction serves to prove that every u − v walk of minimum length is a path.
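The cutting step used in this proof can be applied repeatedly to turn any walk into a path. The following is a minimal sketch; the function name `walk_to_path` is an illustrative choice, and the result is a u − v path but not necessarily one of minimum length.

```python
def walk_to_path(walk):
    """Remove the stretch between two appearances of a vertex until no vertex repeats.

    Consecutive vertices of the result are consecutive somewhere in the
    input walk, so the result is again a walk with the same endpoints.
    """
    path = []
    position = {}                      # vertex -> its index in `path`
    for v in walk:
        if v in position:
            # v appeared before: discard everything after its first appearance
            cut = position[v] + 1
            for dropped in path[cut:]:
                del position[dropped]
            del path[cut:]
        else:
            position[v] = len(path)
            path.append(v)
    return path

shortened = walk_to_path(list("acacf"))   # the walk (a, c, a, c, f)
```

Each vertex is appended and dropped at most once per appearance, so the sketch runs in time linear in the length of the walk.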
This allows us to prove another interesting fact that will be useful later.
Proposition 12.3.2
Deleting an edge from a connected graph can never result in a graph that has more than two connected components.
Proof
Let G be a connected graph, and let uv be an arbitrary edge of G . If G ∖ {uv} is connected, then it has only one connected
component, so it satisfies our desired conclusion. Thus, we assume in the remainder of the proof that G ∖ {uv} is not
connected.
Let G_u denote the connected component of G ∖ {uv} that contains the vertex u , and let G_v denote the connected
component of G ∖ {uv} that contains the vertex v . We aim to show that G_u and G_v are the only connected components of
G ∖ {uv}.
Let x be an arbitrary vertex of G , and suppose that x is a vertex that is not in G_u. Since G is connected, there is a u − x
walk in G , and therefore by Proposition 12.3.1 there is a u − x path in G . Since x is not in G_u, this u − x path must use
the edge uv, so must start with this edge since u only occurs at the start of the path. Therefore, by removing the vertex
u from the start of this path, we obtain a v − x path that does not use the vertex u . This path cannot use the edge uv, so
it is a path in G ∖ {uv}, which shows that x is in G_v.
Since x was arbitrary, this shows that every vertex of G must be in one or the other of the connected components G_u and
G_v, so these are the only connected components of G ∖ {uv}. Since G was an arbitrary connected graph, this shows that
deleting any edge of a connected graph can never result in a graph with more than two connected components.
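As an informal check of Proposition 12.3.2, the sketch below deletes each edge of a small connected graph in turn and counts the connected components that remain. The graph (a triangle with a pendant path) and the helper name `count_components` are hypothetical choices made for illustration.

```python
def count_components(vertices, edges):
    """Count connected components using a simple union-find structure."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

# ASSUMPTION: an illustrative connected graph, not one from the text.
VERTICES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]

# Number of components left after deleting each single edge.
after_deletion = {
    edge: count_components(VERTICES, [e for e in EDGES if e != edge])
    for edge in EDGES
}
```

Edges on the triangle leave the graph connected, while the two pendant edges each split it into exactly two components.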
A cycle is like a path, except that it starts and ends at the same vertex. The structures that we will call cycles in this course, are
sometimes referred to as circuits.
Definition: Cycle
A walk of length at least 1 in which no vertex appears more than once, except that the first vertex is the same as the last, is
called a cycle.
Notation
For n ≥ 3 , a graph on n vertices whose only edges are those used in a cycle of length n (which is a walk of length n that is
also a cycle) is denoted by C_n.
The requirement that the walk have length at least 1 only serves to make it clear that a walk of just one vertex is not considered a
cycle. In fact, a cycle in a simple graph must have length at least 3.
Example 12.3.2
In the graph from Example 12.3.1, (a, e, f , a) is a cycle of length 3, and (b, g, d, h, c, f , b) is a cycle of length 6.
Here are drawings of some small paths and cycles:
We end this section with a proposition whose proof will be left as an exercise.
Proposition 12.3.3
Suppose that G is a connected graph. If G has a cycle in which u and v appear as consecutive vertices (so uv is an edge of G),
then G ∖ {uv} is connected.
Exercise 12.3.1
1) In the graph
5) Let G be a (simple) graph on n vertices. Suppose that G has the following property: whenever u ≁v ,
dG(u) + dG(v) ≥ n − 1 . Prove that G is connected.
This page titled 12.3: Paths and Cycles is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
12.3.3 https://math.libretexts.org/@go/page/60135
12.4: Trees
A special class of graphs that arise often in graph theory, is the class of trees. If a mathematician suspects that something is true for
all graphs, one of the first families of graphs for which s/he will probably try to prove it, is the family of trees because their strong
structure makes them much easier to work with than many other families of graphs.
Notice that the graph P_n is a tree, for every n ≥ 1 . We prove some important results about the structure of trees.
Proposition 12.4.1
Let T be a connected graph with no cycles. Then deleting any edge from T disconnects the graph.
Proof
If T has no edges, the statement is vacuously true. We may thus assume that T has at least one edge. Let {u, v} be an
arbitrary edge of T . (Since a loop is a cycle, we must have u ≠ v even if we were not assuming that our graphs are simple.)
Towards a contradiction, suppose that deleting {u, v} from T does not disconnect T . Then by the definition of a connected
graph, there is a u − v walk in T ∖ {u, v} . By Proposition 12.3.1, the shortest u − v walk in T ∖ {u, v} must be a u − v
path. If we take this same walk in T and add u to the end, this will still be a walk in T since T contains the edge uv. Since
the walk in T ∖ {u, v} was a path, no vertices were repeated. Adding u to the end of this walk makes a walk (certainly of
length at least 2) in which no vertex is repeated except that the first and last vertices are the same: by definition, a cycle.
Thus, T has a cycle, contradicting our hypothesis. This contradiction serves to prove that deleting any edge from T
disconnects the graph.
Since a tree is a connected graph with no cycles, this shows that deleting any edge from a tree will disconnect the graph.
Proposition 12.4.2
Every tree that has at least one edge, has at least two leaves.
Proof
We prove this by strong induction on the number of vertices. Notice that a (simple) graph on one vertex must be K_1, which
has no edges, so the proposition does not apply. Therefore our base case will be when there are 2 vertices.
Base case: n = 2 . Of the two (unlabeled) graphs on 2 vertices, only one is connected: K_2 (or P_1; these are isomorphic).
Both of the vertices have valency 1, so there are two leaves. This completes the proof of the base case.
Induction step: We begin with the strong inductive hypothesis. Let k ≥ 2 be arbitrary. Suppose that for every 2 ≤ i ≤ k ,
every tree with i vertices has at least two leaves. (Since i ≥ 2 and a tree is a connected graph, every tree on i vertices has at
least one edge, so we may omit this part of the hypothesis.)
Let T be a tree with k + 1 vertices. Since k + 1 > 1 , T has at least one edge. Choose any edge {u, v} of T , and delete it.
By Proposition 12.4.1, the resulting graph is disconnected. By Proposition 12.3.2, it cannot have more than two connected
components, so it must have exactly two connected components. Furthermore, by the proof of that proposition, the
components are T_u (the connected component that contains the vertex u ) and T_v (the connected component that contains
the vertex v ).
Since T has no cycles, neither do T_u or T_v. Since they are connected components, they are certainly connected. Therefore,
both T_u and T_v are trees. Since u is not a vertex of T_v and v is not a vertex of T_u, each of these trees has at most k vertices.
If both T_u and T_v have at least two vertices, then we can apply our induction hypothesis to both. This tells us that T_u and
T_v each have at least two leaves. In particular, T_u has some leaf x that is not u , and T_v has some leaf y that is
not v . Deleting uv from T did not change the valency of either x or y, so x and y must also have valency 1 in T . Therefore
T has at least two leaves. This completes the induction step and therefore the proof, in the case where T_u and T_v each have
at least two vertices. We must still consider the possibility that at least one of T_u and T_v has only one vertex.
Since k + 1 ≥ 3 , at least one of T_u and T_v must have at least two vertices, so only one of them can have only one vertex.
Without loss of generality (since nothing in our argument so far made any distinction between u and v , we can switch u and v
if we need to), we may assume that T_v has only one vertex, and T_u has at least two vertices. Applying our induction hypothesis
to T_u, we conclude that T_u has some leaf x that is not u , and that is also a leaf of T . Furthermore, since T_v has only one
vertex, this means that deleting the edge uv left v as an isolated vertex, so uv was the only edge incident with v in T .
Therefore, v is a leaf of T . Thus, T has at least two leaves: x and v . This completes the induction step and therefore the
proof, in the case where at least one of T_u and T_v has only one vertex. Since we have dealt with all possibilities, this
completes the induction step.
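Proposition 12.4.3
If a leaf is deleted from a tree, then the resulting graph is a tree.
Theorem 12.4.1
The following are equivalent for a graph T with n vertices:
1) T is a tree;
2) T is connected and has n − 1 edges;
3) T has no cycles, and has n − 1 edges;
4) T is connected, but deleting any edge leaves a disconnected graph.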
Proof
We will prove that the statements are equivalent by showing that 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 1 . Thus, by using a sequence of
implications, we see that any one of the statements implies any other.
(1 ⇒ 2) We assume that T is a tree, and we would like to deduce that T is connected and has n−1 edges. By the
definition of a tree, T is connected. We will use induction on n to show that T has n − 1 edges.
Base case: n = 1 . There is only one (unlabeled) graph on one vertex, it is K1 , so T ≅K1 , which has no edges. Since
0 = n−1 , this completes the proof of the base case.
Inductive step: We begin with the inductive hypothesis. Let k ≥1 be arbitrary, and suppose that every tree on k vertices
has k − 1 edges.
Let T be an arbitrary tree with k + 1 vertices. Since k + 1 ≥ 2 and T is connected, T must have at least one edge, so by
Proposition 12.4.2, T has at least two leaves. Let v be a leaf of T . By Proposition 12.4.3, T ∖ {v} is a tree. Also, T ∖ {v}
has k vertices, so we can apply our induction hypothesis to conclude that T ∖ {v} has k − 1 edges. Since v was a leaf, T
has precisely one more edge than T ∖ {v}, so T must have k = (k + 1) − 1 edges. This completes our inductive step.
By the Principle of Mathematical Induction, every tree on n vertices has n − 1 edges.
(2 ⇒ 3) We assume that T is connected and has n − 1 edges. We need to deduce that T has no cycles.
Towards a contradiction, suppose that T has a cycle. Repeatedly delete edges that are in cycles until no cycles remain. By
Proposition 12.3.3 (used repeatedly), the resulting graph is connected, so by definition it is a tree. Since we have already
proven that 1 ⇒ 2 , this tree must have n − 1 edges. Since we started with n − 1 edges and deleted at least one (based on
our assumption that T has at least one cycle), this is a contradiction. This contradiction serves to prove that T must not
have any cycles.
(3 ⇒ 4) We assume that T has no cycles, and has n − 1 edges. We must show that T is connected, and that deleting any
edge leaves a disconnected graph. We begin by showing that T is connected; we prove this by induction on n .
Base case: n = 1 . Then T ≅K1 is connected.
Inductive step: We begin with the inductive hypothesis. Let k ≥ 1 be arbitrary, and suppose that every graph on k vertices
that has k − 1 edges and no cycles, is connected.
Let T be an arbitrary graph with k + 1 vertices that has k edges and no cycles. By Euler’s handshaking lemma,
∑_{v∈V} d(v) = 2|E| = 2k.
If every one of the k + 1 vertices had valency 2 or more, then we would have
∑_{v∈V} d(v) ≥ 2(k + 1)
(this is a lot like the Pigeonhole Principle in concept, but the Pigeonhole Principle itself doesn’t apply to this situation).
Since 2k < 2(k + 1) , there must be some vertex v that does not have valency 2 or more. Delete v . In so doing, we delete at
most 1 edge, since v has at most 1 incident edge. Thus, the resulting graph has k vertices and k or k − 1 edges, and since
T has no cycles, neither does T ∖ {v}.
If T ∖ {v} has k edges, then deleting any of the edges results in a graph on k vertices with no cycles and k − 1 edges,
which by our inductive hypothesis must be connected. Therefore T ∖ {v} is a connected graph that remains connected after
any edge is deleted. By Proposition 12.4.1 (in the contrapositive), this means that T ∖ {v} must contain a cycle, but this is a
contradiction. This contradiction serves to prove that T ∖ {v} cannot have k edges.
Thus, T ∖ {v} has k − 1 edges and k vertices, and no cycles. By our inductive hypothesis, T ∖ {v} must be connected.
Furthermore, the fact that T ∖ {v} has k − 1 edges means that v is incident to an edge, which must have its other endvertex
in T ∖ {v}. Therefore T is connected. This completes the inductive step.
By the Principle of Mathematical Induction, every graph on n vertices with no cycles and n − 1 edges is connected.
It remains to be shown that deleting any edge leaves a disconnected graph, but now that we know that T is connected, this
follows from Proposition 12.4.1.
(4 ⇒ 1) We assume that T is connected, but deleting any edge leaves a disconnected graph. By the definition of a tree, we
must show that T has no cycles. This follows immediately from Proposition 12.3.3.
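The equivalence just proved can be checked by brute force on small cases. The sketch below tests the four conditions, as they are stated in the four parts of the proof, on every labelled graph with at most 5 vertices. All function names are illustrative, and an exhaustive check of small cases is only a sanity check, not a substitute for the proof.

```python
from itertools import combinations

def find(parent, v):
    """Root of v in a union-find forest."""
    while parent[v] != v:
        v = parent[v]
    return v

def is_connected(vertices, edges):
    parent = {v: v for v in vertices}
    for u, v in edges:
        parent[find(parent, u)] = find(parent, v)
    return len({find(parent, v) for v in vertices}) == 1

def has_cycle(vertices, edges):
    parent = {v: v for v in vertices}
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru == rv:            # u and v are already joined: this edge closes a cycle
            return True
        parent[ru] = rv
    return False

def four_conditions(vertices, edges):
    n = len(vertices)
    connected = is_connected(vertices, edges)
    acyclic = not has_cycle(vertices, edges)
    every_deletion_disconnects = all(
        not is_connected(vertices, [e for e in edges if e != d]) for d in edges
    )
    return (
        connected and acyclic,                    # 1: connected with no cycles
        connected and len(edges) == n - 1,        # 2
        acyclic and len(edges) == n - 1,          # 3
        connected and every_deletion_disconnects, # 4
    )

def all_graphs(n):
    """Every simple graph on the vertex set {0, ..., n-1}."""
    vertices = list(range(n))
    possible = list(combinations(vertices, 2))
    for r in range(len(possible) + 1):
        for edges in combinations(possible, r):
            yield vertices, list(edges)

# True when the four conditions agree on every graph examined.
agreement = all(
    len(set(four_conditions(v, e))) == 1
    for n in range(1, 6)
    for v, e in all_graphs(n)
)
```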
Exercise 12.4.1
This page titled 12.4: Trees is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
12.4.3 https://math.libretexts.org/@go/page/60136
12.5: Summary
Many definitions and results about graphs can be generalised to the context of digraphs.
Important Definitions:
Digraph
Arc
Walk
Length of a walk
Connected
Connected component
Path, Cycle
Tree, Forest, Leaf
Notation:
Pn
Cn
12.5: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by LibreTexts.
12.5.1 https://math.libretexts.org/@go/page/72992
CHAPTER OVERVIEW
This page titled 13: Euler and Hamilton is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
13.1: Euler Tours and Trails
To introduce these concepts, we need to know about some special kinds of walks.
Recall the historical example of the bridges of Königsberg. The problem of finding a route that crosses every bridge exactly once,
is equivalent to finding an Euler trail in the corresponding graph. If we want the route to begin and end at the same place (for
example, someone’s home), then the problem is equivalent to finding an Euler tour in the corresponding graph.
Euler tours and trails are important tools for planning routes for tasks like garbage collection, street sweeping, and searches.
Example 13.1.1
In the graph
Here is Euler’s method for finding Euler tours. We will state it for multigraphs, as that makes the corresponding result about Euler
trails a very easy corollary.
Theorem 13.1.1
A connected graph (or multigraph, with or without loops) has an Euler tour if and only if every vertex in the graph has even
valency.
Proof
As the statement is if and only if, we must prove both implications.
(⇒) Suppose we have a multigraph (possibly with loops) that has an Euler tour,
(u_1, u_2, . . . , u_{e+1} = u_1),
where e = |E| . Let u be an arbitrary vertex of the multigraph. Every time u appears in the tour, exactly two of the edges
incident with u are used: if u = u_j, then the edges used are u_{j−1}u_j and u_j u_{j+1}, unless j = 1 or j = e + 1 , in which case
u = u_1 = u_{e+1} and the edges are u_e u and u u_2 (and we consider this as one appearance of u in the tour). Therefore, if u
appears k times in the tour, then since by the definition of an Euler tour all edges incident with u are used exactly once, we
conclude that u must have valency 2k. Since u was an arbitrary vertex of the multigraph and k (the number of times u
appears in the tour) must be an integer, this shows that the valency of every vertex must be even.
(⇐) Suppose we have a connected multigraph in which the valency of every vertex is even. Consider the following
algorithm (which will be the first stage of our final algorithm):
Make u (some arbitrary vertex) our active vertex, with a list L of all of the edges of E . Make u the first vertex in a new
sequence C of vertices. Repeat the following step as many times as possible:
Call the active vertex v . Choose any edge vx in L that is incident with v . Add x (the other endvertex of this edge) to the
end of C , and make x the new active vertex. Remove vx from L.
We claim that when this algorithm terminates, the sequence C will be a tour (though not necessarily an Euler tour) in the
multigraph. By construction, C is a walk, and C cannot use any edge more than once since each edge appears in L only
once and is removed from L once it has been used, so C is a trail. We need to show that the walk C is closed.
The only way the algorithm can terminate is if L contains no edge that is incident with the active vertex. Towards a
contradiction, suppose that this happens at a time when the active vertex is y ≠ u . Now, y has valency 2r in the multigraph
for some integer r , so there were 2r edges in L that were incident with y when we started the algorithm. Since y ≠ u ,
every time y appears in C before this appearance, we removed 2 edges incident with y from L (one in the step when we
made y the active vertex, and one in the following step). Furthermore, we removed one additional edge incident with y
from L in the final step, when we made y the active vertex again. Thus if there are t previous appearances of y in C , we
have removed 2t + 1 edges incident with y from L. Since 2r is even and 2t + 1 is odd, there must still be at least one edge
incident with y that is in L, contradicting the fact that the algorithm terminated. This contradiction shows that, when the
algorithm terminates, the active vertex must be u , so the sequence C is a closed walk. Since C is a trail, we see that C
must be a tour.
If the tour C is not an Euler tour, let y be the first vertex that appears in C for which there remains an incident edge in L.
Repeat the previous algorithm starting with y being the active vertex, and with L starting at its current state (not all of E ).
The result will be a tour beginning and ending at y that uses only edges that were not in C . Insert this tour into C as
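follows: if C = (u = u_1, . . . , y = u_i, . . . , u_k = u) and the new tour is (y = v_1, . . . , v_j = y) , then the result of inserting
the new tour into C is (u = u_1, . . . , y = u_i = v_1, v_2, . . . , v_j = y, u_{i+1}, . . . , u_k = u).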
Example 13.1.2
Use the algorithm described in the proof of the previous result, to find an Euler tour in the following graph.
Solution
Let’s begin the algorithm at a . As E = L is a large set, we won’t list the remaining elements every time we choose a new
active vertex in the early stages. An easy method for you to keep track of the edges still in L is to colour the edges that are no
longer in L (the edges we use) with a different colour as we go.
There are many different possible outcomes for the algorithm since there are often many acceptable choices for the next active
vertex. One initial set of choices could be
C = (a, b, f , e, a, f , g, a).
The first stage of the algorithm terminates at this point since all four edges incident with a have been used. At this point, we
have
L = {bg, bh, cd, cf , cg, ch, df , dg, dh, gh}.
The first vertex in C that is incident with an edge in L is b . We run the first stage of the algorithm again with b as the initial
active vertex and this list for L. Again, there are many possible outcomes; one is (b, g, h, b).
We insert (b, g, h, b) into C , obtaining a new C = (a, b, g, h, b, f , e, a, f , g, a) . At this point, we have
L = {cd, cf , cg, ch, df , dg, dh}.
Now g is the first vertex in C that is incident with an edge in L. We run the first stage of the algorithm again with g as the
initial active vertex and the current L. One possible outcome is (g, c, f , d, g).
Inserting this into C yields a new
C = (a, b, g, c, f , d, g, h, b, f , e, a, f , g, a).
At this point, we have L = {cd, ch, dh}. The first vertex in C that is incident with an edge in L is c . We run the first stage of
the algorithm one final time with c as the initial active vertex and L = {cd, ch, dh}. This time there are only two possible
outcomes: (c, d, h, c) or (c, h, d, c). We choose (c, d, h, c).
Inserting this into C yields our Euler tour:
C = (a, b, g, c, d, h, c, f , d, g, h, b, f , e, a, f , g, a).
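The procedure followed in this example can be sketched in code. The edge list below is read off from the Euler tour found above (17 edges); the function names are illustrative, and the sketch assumes the input is connected with every valency even. Because the choice of edge at each step is arbitrary, the tour produced may differ from the one in the example.

```python
def first_stage(start, remaining):
    """Walk from `start`, removing each edge used from `remaining`, until stuck."""
    tour = [start]
    active = start
    while True:
        edge = next((e for e in remaining if active in e), None)
        if edge is None:
            return tour
        remaining.remove(edge)
        u, v = edge
        active = v if active == u else u     # move to the other endvertex
        tour.append(active)

def euler_tour(start, edges):
    """Build an Euler tour by repeatedly splicing closed trails into the tour."""
    remaining = list(edges)                  # the list L of unused edges
    tour = first_stage(start, remaining)
    while remaining:
        # first vertex on the tour that still has an unused incident edge
        i = next(
            (k for k, y in enumerate(tour) if any(y in e for e in remaining)),
            None,
        )
        if i is None:
            raise ValueError("unused edges are unreachable: graph is not connected")
        detour = first_stage(tour[i], remaining)
        tour[i:i + 1] = detour               # replace the single y by (y, ..., y)
    return tour

# Edges of the graph in this example, recovered from the tour given above.
EDGES = [tuple(e) for e in
         "ab ae af ag bf bg bh cd cf cg ch df dg dh ef fg gh".split()]

tour = euler_tour("a", EDGES)
used = sorted(tuple(sorted(pair)) for pair in zip(tour, tour[1:]))
```

Scanning the whole list for an edge at each step keeps the sketch short; an implementation meant for large graphs would keep a list of unused edges at each vertex instead.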
Corollary 13.1.1
A connected graph (or multigraph, with or without loops) has an Euler trail if and only if at most two vertices have odd
valency.
Proof
Suppose we have a connected graph (or multigraph, with or without loops), G . Since the statement is if and only if, there
are two implications to prove.
(⇒) Suppose that G has an Euler trail. If the trail is closed then it is a tour, and by Theorem 13.1.1, there are no vertices of
odd valency. If the trail is not closed, say it is a u − v walk. Add an edge between u and v to G , creating a new graph G*
(note that G* may be a multigraph if uv was already an edge of G , even if G wasn’t a multigraph), and add u to the end of
the Euler trail in G , to create an Euler tour in G*. By Theorem 13.1.1, the fact that G* has an Euler tour means that every
vertex of G* has even valency. Now, the vertices of G all have the same valency in G as they have in G*, with the
exception that the valencies of u and v are one higher in G* than in G . Therefore, in this case there are exactly two vertices
of odd valency in G , namely u and v .
(⇐) Suppose that at most two vertices of G have odd valency. By Euler’s handshaking lemma, the number of vertices of odd
valency is even, so it is either 0 or 2. If it is 0, then by Theorem 13.1.1, G has an Euler tour, which is also an Euler trail.
Otherwise, let u and v be the two vertices of odd valency, and add an edge uv to G to create a new graph G* (note
that G* may be a multigraph if uv was already an edge of G , even if G wasn’t a multigraph). Now in G* every vertex has
even valency, so G* has an Euler tour. In fact, a careful look at the algorithm given in the proof of Theorem 13.1.1 shows
that we may choose u and v (in that order) to be the first two vertices in this Euler tour, so that uv (the edge that is in G*
but not G ) is the first edge used in the tour. Now if we delete u from the start of this Euler tour, the result is an Euler trail in
G that starts at v and ends at u .
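Theorem 13.1.1 and Corollary 13.1.1 reduce the existence question to counting vertices of odd valency. The sketch below applies that count to the multigraph of the bridges of Königsberg (four land masses, seven bridges). The function names and the labels A to D are illustrative, and the input is assumed to be connected, which the sketch does not verify.

```python
def odd_vertices(vertices, edges):
    """Vertices of odd valency in a multigraph given as a list of edges."""
    valency = {v: 0 for v in vertices}
    for u, v in edges:
        valency[u] += 1
        valency[v] += 1          # a loop (u == v) adds 2 to the valency of u
    return [v for v in vertices if valency[v] % 2 == 1]

def euler_status(vertices, edges):
    """Classify a multigraph that is ASSUMED to be connected."""
    odd = len(odd_vertices(vertices, edges))
    if odd == 0:
        return "Euler tour"      # a closed Euler trail
    if odd == 2:
        return "Euler trail only"
    return "neither"

# The seven bridges of Königsberg; labels for the land masses are hypothetical.
KONIGSBERG_V = ["A", "B", "C", "D"]
KONIGSBERG_E = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
                ("A", "D"), ("B", "D"), ("C", "D")]

konigsberg = euler_status(KONIGSBERG_V, KONIGSBERG_E)
```

All four land masses have odd valency, so no route crosses every bridge exactly once.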
Exercise 13.1.1
For each of the following graphs, is there an Euler tour? Is there an Euler trail? If either exists, find one; if not, explain why
not.
1)
2)
3)
4) If it is possible, draw a graph that has an even number of vertices and an odd number of edges, that also has an Euler tour. If
that isn’t possible, explain why there is no such graph.
5) Which complete graphs have an Euler tour? Of the complete graphs that do not have an Euler tour, which of them have an
Euler trail?
Exercise 13.1.2
Sylvia’s cat is missing. She wants to look for it in all the nearby streets, but she is tired and doesn’t want to walk any farther
than she must. Find an efficient route for Sylvia to take through her neighbourhood so that she starts and ends at home and
walks through each street exactly once. The location of Sylvia’s house is marked with a house-shaped symbol ( ).
This page titled 13.1: Euler Tours and Trails is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
13.1.4 https://math.libretexts.org/@go/page/60138
13.2: Hamilton Paths and Cycles
Sometimes, rather than travelling along every connection in a network, our object is simply to visit every node of the network. This
relates to a different structure in the corresponding graph.
The definitions of path and cycle ensure that vertices are not repeated. Hamilton paths and cycles are important tools for planning
routes for tasks like package delivery, where the important point is not the routes taken, but the places that have been visited.
In 1857, William Rowan Hamilton first presented a game he called the “icosian game.” It involved tracing edges of a dodecahedron
in such a way as to visit each corner precisely once. In fact, two years earlier Reverend Thomas Kirkman had sent a paper to the
Royal Society in London, in which he posed the problem of finding what he called closed polygons in polyhedra; a closed polygon
he defined as a circuit of edges that passes through each vertex just once. Thus, Kirkman had posed a more general problem prior
to Hamilton (and made some progress toward solving it); nonetheless, it is Hamilton for whom these structures are now named. As
we’ll see later when studying Steiner Triple Systems in design theory, Kirkman was a gifted mathematician who seems to have
been singularly unlucky in terms of receiving proper credit for his achievements. As his title indicates, Kirkman was a minister who
pursued mathematics on the side, as a personal passion.
Hamilton managed to convince the company of John Jacques and sons, who were manufacturers of toys (including high-quality
chess sets) to produce and market the “icosian game.” It was produced under the name Around the World, and sold in two forms: a
flat board, or an actual dodecahedron. In both cases, nails were placed at the corners of the dodecahedron representing cities, and
the game was played by wrapping a string around the nails, travelling only along edges, visiting each nail once, and ending at the
starting point. Unfortunately, the game was not a financial success. It is not very difficult and becomes uninteresting once solved.
The thick edges form a Hamilton cycle in the graph of the dodecahedron:
Not every connected graph has a Hamilton cycle; in fact, not every connected graph has a Hamilton path.
Figure 13.2.1 : A graph with a Hamilton path but no Hamilton cycle. (Copyright; author via source)
Figure 13.2.2 : A graph with no Hamilton path. (Copyright; author via source)
Unfortunately, in contrast to Euler’s result about Euler tours and trails (given in Theorem 13.1.1 and Corollary 13.1.1), there is no
known characterisation that enables us to quickly determine whether or not an arbitrary graph has a Hamilton cycle (or path). This
is a hard problem in general. We do know of some necessary conditions (any graph that fails to meet these conditions cannot have a
Hamilton cycle) and some sufficient conditions (any graph that meets these must have a Hamilton cycle). However, many graphs
fail to meet any of these conditions. There are also some conditions that are either necessary or sufficient for the existence of a
Hamilton path.
Here is a necessary condition for a graph to have a Hamilton cycle.
Theorem 13.2.1
If G is a graph with a Hamilton cycle, then for every S ⊂V with S ≠∅ , V , the graph G∖ S has at most |S| connected
components.
Proof
Let C be a Hamilton cycle in G . Fix an arbitrary proper, nonempty subset S of V .
One at a time, delete the vertices of S from C . After the first vertex is deleted, the result is still connected, but has become
a path. When any of the subsequent |S| − 1 vertices is deleted, it either breaks some path into two shorter paths (increasing
the number of connected components by one) or removes a vertex at an end of some path (leaving the number of connected
components unchanged, or reducing it by one if this component was a P_0). So C ∖ S has at most 1 + (|S| − 1) = |S|
connected components.
Notice that if two vertices u and v are in the same connected component of C ∖ S , then they will also be in the same
connected component of G ∖ S . This is because adding edges can only connect things more fully, reducing the number of
connected components. More formally, if there is a u − v walk in C , then any pair of consecutive vertices in that walk is
adjacent in C so is also adjacent in G . Therefore the same walk is a u − v walk in G . This tells us that the number of
connected components of G ∖ S is at most the number of connected components of C ∖ S , which we have shown to be at
most |S|.
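The condition in Theorem 13.2.1 can be tested directly on small graphs by trying every nonempty proper subset S. The following is a brute-force sketch with illustrative function names; its running time is exponential in the number of vertices, and passing the test does not guarantee a Hamilton cycle, since the condition is only necessary.

```python
from itertools import combinations

def count_components(vertices, edges):
    """Count connected components with a simple union-find structure."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def violating_set(vertices, edges):
    """Return a nonempty proper subset S for which G minus S has more than |S|
    connected components, or None if there is no such subset."""
    for size in range(1, len(vertices)):
        for S in combinations(vertices, size):
            removed = set(S)
            rest = [v for v in vertices if v not in removed]
            kept = [(u, v) for u, v in edges
                    if u not in removed and v not in removed]
            if count_components(rest, kept) > size:
                return removed
    return None

# A path of length 2: deleting the middle vertex leaves two components.
path_witness = violating_set(["a", "b", "c"], [("a", "b"), ("b", "c")])

# A cycle of length 5 is its own Hamilton cycle, so no subset can violate the bound.
cycle_witness = violating_set(
    [0, 1, 2, 3, 4], [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
)
```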
Example 13.2.1
When a non-leaf is deleted from a path of length at least 2, the deletion of this single vertex leaves two connected components.
So no path of length at least 2 contains a Hamilton cycle.
Here’s a graph in which the non-existence of a Hamilton cycle might be less obvious without Theorem 13.2.1. Deleting the
three white vertices leaves four connected components.
As you might expect, if all of the vertices of a graph have sufficiently high valency, it will always be possible to find a
Hamilton cycle in the graph. (In fact, generally the graph will have many different Hamilton cycles.) Before we can formalise
this idea, it is helpful to have an additional piece of notation.
Notation
We use δ to denote the minimum valency of a graph, and Δ to denote its maximum valency. If we need to clarify the graph
involved, we use δ(G) or Δ(G).
If G is a graph with vertex set V such that |V | ≥ 3 and δ(G) ≥ |V |/2, then G has a Hamilton cycle.
Proof
Towards a contradiction, suppose that G is a graph with vertex set V , that |V | = n ≥ 3 , and that δ(G) ≥ n/2, but G has no
Hamilton cycle.
Repeat the following as many times as possible: if there is an edge that can be added to G without creating a Hamilton
cycle in the resulting graph, add that edge to G . When this has been done as many times as possible, call the resulting graph
H . The graph H has the same vertex set V , and since we have added edges we have not decreased the valency of any
vertex, so we have δ(H ) ≥ n/2. Now, H still has no Hamilton cycle, but adding any edge to H gives a graph that does have
a Hamilton cycle.
Since complete graphs on at least three vertices always have Hamilton cycles (see Exercise 13.2.1(1)), we must have
H ≇ K_n, so there are at least two vertices of H , say u and v , that are not adjacent. By our construction of H from G ,
adding the edge uv to H would result in a Hamilton cycle, and this Hamilton cycle must use the edge uv (otherwise it
would be a Hamilton cycle in H , but H has no Hamilton cycle). Thus, the portion of the Hamilton cycle that is in H forms
a Hamilton path from u to v . Write this Hamilton path as
(u = u_1, u_2, . . . , u_n = v). (13.2.3)
Define the sets
S = {u_i : u ∼ u_{i+1}} and T = {u_i : v ∼ u_i}.
That is, S is the set of vertices that appear immediately before a neighbour of u on the Hamilton path, while T is the set of
vertices on the Hamilton path that are neighbours of v . Notice that v = u_n ∉ S since u_{n+1} isn’t defined, and v = u_n ∉ T
since our graphs are simple (so have no loops). Thus, u_n ∉ S ∪ T , so |S ∪ T | < n .
Towards a contradiction, suppose that for some i, u_i ∈ S ∩ T . Then by the definitions of S and T , we have u ∼ u_{i+1} and
v ∼ u_i, so:
(u, u_{i+1}, u_{i+2}, . . . , u_n = v, u_i, u_{i−1}, . . . , u_1 = u)
is a Hamilton cycle in H , which contradicts our construction of H as a graph that has no Hamilton cycle. This
contradiction serves to prove that S ∩ T = ∅ .
Now we have
d_H(u) + d_H(v) = |S| + |T | = |S ∪ T | + |S ∩ T |
(the last equality comes from Inclusion-Exclusion). But we have seen that |S ∪ T | < n and |S ∩ T | = 0 , so this gives
d_H(u) + d_H(v) < n.
This contradicts δ(H ) ≥ n/2, since
d_H(u) + d_H(v) ≥ 2δ(H ) ≥ n.
This contradiction serves to prove that no graph G with vertex set V such that |V | ≥ 3 and δ(G) ≥ |V |/2 can fail to have a
Hamilton cycle.
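The minimum-valency condition is easy to test, and on small graphs its conclusion can be confirmed by exhaustive search. The sketch below uses illustrative function names and a brute-force search over vertex orderings, which is practical only for very small graphs. The two sample graphs show that the condition is sufficient but not necessary.

```python
from itertools import permutations

def adjacency(vertices, edges):
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def satisfies_min_valency_condition(vertices, adj):
    """True when |V| >= 3 and the minimum valency is at least |V| / 2."""
    n = len(vertices)
    return n >= 3 and 2 * min(len(adj[v]) for v in vertices) >= n

def hamilton_cycle(vertices, adj):
    """Brute-force search; returns a closed list of vertices, or None."""
    if len(vertices) < 3:
        return None
    first, rest = vertices[0], vertices[1:]
    for order in permutations(rest):
        cycle = [first, *order, first]
        if all(b in adj[a] for a, b in zip(cycle, cycle[1:])):
            return cycle
    return None

# Complete bipartite graph with parts {0, 1, 2} and {3, 4, 5}: every valency is 3 = 6/2.
K33_V = list(range(6))
K33 = adjacency(K33_V, [(a, b) for a in (0, 1, 2) for b in (3, 4, 5)])

# Cycle of length 5: every valency is 2 < 5/2, yet it has a Hamilton cycle.
C5_V = list(range(5))
C5 = adjacency(C5_V, [(i, (i + 1) % 5) for i in range(5)])

k33_cycle = hamilton_cycle(K33_V, K33)
c5_cycle = hamilton_cycle(C5_V, C5)
```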
In fact, the statement of Dirac’s theorem was improved by Bondy and Chvatal in 1974. They began by observing that the proof
given above for Dirac’s Theorem actually proves the following result.
Lemma 13.2.1
Suppose that G is a graph on n vertices, u and v are nonadjacent vertices of G, and d(u) + d(v) ≥ n . Then G has a Hamilton
cycle if and only if the graph obtained by adding the edge uv to G has a Hamilton cycle.
Definition: Closure
Let $G$ be a graph on $n$ vertices. The closure of $G$ is the graph obtained by repeatedly joining pairs of nonadjacent vertices $u$ and $v$ for which $d(u) + d(v) \ge n$, until no such pair remains.
Before they were able to work with this definition, they had to prove that the closure of a graph is well-defined. In other words,
since there will often be choices involved in forming the closure of a graph (if more than one pair of vertices satisfy the condition,
which edge do we add first?), is it possible that by making different choices, we might end up with a different graph at the end? The
answer, fortunately, is no; any graph has a unique closure, as we will now prove.
Lemma 13.2.2
The closure of any graph is well-defined; that is, any graph has a unique closure.
Proof
Let $(e_1, \ldots, e_\ell)$ be one sequence of edges we can choose to arrive at the closure of $G$, and let the resulting closure be the graph $G_1$. Let $(f_1, \ldots, f_m)$ be another such sequence, and let the resulting closure be the graph $G_2$. We will prove by induction that $e_i \in \{f_1, \ldots, f_m\}$ for every $1 \le i \le \ell$, where we write $e_i = \{u_i, v_i\}$.
Base case: $\ell = 1$, so only the edge $\{u_1, v_1\}$ is added to $G$ in order to form $G_1$. Since this was the first edge added, we must have
$$d_G(u_1) + d_G(v_1) \ge n. \qquad (13.2.8)$$
Every edge of $G$ is also an edge of $G_2$, so the valencies of $u_1$ and $v_1$ in $G_2$ also sum to at least $n$. Since $G_2$ is a closure of $G$, it has no pair of nonadjacent vertices whose valencies sum to $n$ or higher, so $u_1$ must be adjacent to $v_1$ in $G_2$. Since the edge $u_1v_1$ was not in $G$, it must be in $\{f_1, \ldots, f_m\}$. This completes the proof of the base case.
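Inductive step: We begin with the inductive hypothesis. Let $k \ge 1$ be arbitrary (with $k < \ell$), and suppose that
$$e_1, \ldots, e_k \in \{f_1, \ldots, f_m\}. \qquad (13.2.10)$$
Consider $e_{k+1} = \{u_{k+1}, v_{k+1}\}$. Let $G_0$ be the graph obtained by adding the edges $e_1, \ldots, e_k$ to $G$. Since $e_{k+1}$ was chosen to add to $G_0$ in forming $G_1$, it must be the case that
$$d_{G_0}(u_{k+1}) + d_{G_0}(v_{k+1}) \ge n.$$
By our induction hypothesis, all of the edges of $G_0$ are also in $G_2$, so this means
$$d_{G_2}(u_{k+1}) + d_{G_2}(v_{k+1}) \ge n.$$
Since $G_2$ is a closure of $G$, the vertices $u_{k+1}$ and $v_{k+1}$ must be adjacent in $G_2$, and since $e_{k+1}$ was not in $G$, it must be in $\{f_1, \ldots, f_m\}$. This completes the inductive step.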
By the Principle of Mathematical Induction, $G_2$ contains all of the edges of $G_1$. Since there was nothing special about $G_2$ as distinct from $G_1$, we could use the same proof to show that $G_1$ contains all of the edges of $G_2$. Therefore, $G_1$ and $G_2$ have the same edges. Since they also have the same vertices (the vertices of $G$), they are the same graph. Thus, the closure of any graph is unique.
This allowed Bondy and Chvatal to deduce the following result, which is stronger than Dirac’s although as we’ve seen the proof is
not significantly different.
Theorem 13.2.3
A simple graph has a Hamilton cycle if and only if its closure has a Hamilton cycle.
Proof
Repeatedly apply Lemma 13.2.1.
Corollary 13.2.1
A simple graph on at least 3 vertices whose closure is complete has a Hamilton cycle.
Proof
This is an immediate consequence of Theorem 13.2.3 together with the fact (see Exercise 13.2.1(1)) that every complete
graph on at least 3 vertices has a Hamilton cycle.
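As a rough illustration of how the closure can be computed, here is a minimal Python sketch. It assumes the vertices are labelled 0, ..., n − 1 and that the graph is given as a list of edges; the function names `closure` and `closure_is_complete` are ours and do not come from the text. By Corollary 13.2.1 a complete closure guarantees a Hamilton cycle, but a closure that is not complete tells us nothing either way.

```python
from itertools import combinations


def closure(n, edges):
    """Return the edge set of the closure of a simple graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(range(n), 2):
            # Join nonadjacent vertices whose valencies sum to at least n.
            if v not in adj[u] and len(adj[u]) + len(adj[v]) >= n:
                adj[u].add(v)
                adj[v].add(u)
                changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}


def closure_is_complete(n, edges):
    """True when the closure is K_n (a sufficient test for a Hamilton cycle)."""
    return len(closure(n, edges)) == n * (n - 1) // 2


# K_{3,3}: every nonadjacent pair has valency sum 3 + 3 = 6, so the closure is K_6.
k33 = [(a, b) for a in range(3) for b in range(3, 6)]
print(closure_is_complete(6, k33))

# A 5-cycle: valency sums are only 4 < 5, so no edge is ever added.
c5 = [(i, (i + 1) % 5) for i in range(5)]
print(closure_is_complete(5, c5))
```

The 5-cycle shows why the test is only sufficient: its closure is not complete, yet the cycle itself is a Hamilton cycle.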
Exercise 13.2.1
2) Find the closure of each of these graphs. Can you easily tell from the closure whether or not the graph has a Hamilton cycle?
a)
b)
3) Use Theorem 13.2.1 to prove that this graph does not have a Hamilton cycle.
4) Prove that if G has a Hamilton path, then for every nonempty proper subset S of V , G−S has no more than |S| + 1
connected components.
5) For the two graphs in Exercise 13.2.1(2), either find a Hamilton cycle or use Theorem 13.2.1 to show that no Hamilton cycle
exists.
This page titled 13.2: Hamilton Paths and Cycles is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
13.2.6 https://math.libretexts.org/@go/page/60139
13.3: Summary
Algorithms for finding Euler tours and trails
Important Definitions:
Closed walk, trail, tour
Euler tour, Euler trail
Hamilton cycle, Hamilton path
Minimum valency, maximum valency
Closure of a graph
Notation:
δ, Δ
This page titled 13.3: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
13.3.1 https://math.libretexts.org/@go/page/60140
CHAPTER OVERVIEW
This page titled 14: Graph Coloring is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
14.1: Edge Coloring
Suppose you have been given the job of scheduling a round-robin tennis tournament with n players. One way to approach the
problem is to model it as a graph: the vertices of the graph will represent the players, and the edges will represent the matches that
need to be played. Since it is a round-robin tournament, every player must play every other player, so the graph will be complete.
Creating the schedule amounts to assigning a time to each of the edges, representing the time at which that match is to be played.
Notice that there is a constraint. When you have assigned a time to a particular edge uv, no other edge incident with either u or v
can be assigned the same time, since this would mean that either player u or player v is supposed to play two games at once.
Instead of writing times on each edge, we will choose a colour to represent each of the time slots, and colour the edges that are to
be played at that time, with that colour.
Here is an example of a possible schedule for the tournament, when n = 7 .
Example 14.1.1
The players are numbered from 1 through 7, and we will spread the tournament out over seven days. Games to be played on
each day should have a different colour than the games on other days, but, because this text is printed in black-and-white, we
will use some line patterns, instead of colours. Games to be played on Monday will be drawn as usual. Games to be played on
Tuesday will be thin. Games to be played on Wednesday will be dotted. Games to be played on Thursday will be dashed.
Games to be played on Friday will be thick. Games to be played on Saturday will be grey. Games to be played on Sunday will
be thin and dashed.
This gives a schedule. For anyone who has trouble distinguishing the “colours” of the edges, the normal edges are 12, 37, and
46; the thin edges are 13, 24, and 57; the dotted edges are 15, 26, and 34; the dashed edges are 17, 36, and 45; the thick edges
are 23, 47, and 56; the grey edges are 14, 25, and 67; and the thin dashed edges are 16, 27, and 35.
A proper k-edge-colouring of a graph G is a function that assigns to each edge of G one of k colours, such that edges that
meet at an endvertex must be assigned different colours.
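The definition can be checked mechanically. The sketch below is only an illustration: the function name `is_proper_edge_colouring` and the dictionary layout are our own choices, and the data is the seven-day schedule listed in Example 14.1.1, with each day playing the role of a colour.

```python
from collections import defaultdict


def is_proper_edge_colouring(colouring):
    """colouring maps each edge (u, v) to a colour.

    The colouring is proper when no vertex meets two edges of the same colour.
    """
    seen = defaultdict(set)  # vertex -> colours already used at that vertex
    for (u, v), colour in colouring.items():
        for w in (u, v):
            if colour in seen[w]:
                return False
            seen[w].add(colour)
    return True


# The schedule from Example 14.1.1; "12" means the game between players 1 and 2.
schedule = {
    "Monday": ["12", "37", "46"],
    "Tuesday": ["13", "24", "57"],
    "Wednesday": ["15", "26", "34"],
    "Thursday": ["17", "36", "45"],
    "Friday": ["23", "47", "56"],
    "Saturday": ["14", "25", "67"],
    "Sunday": ["16", "27", "35"],
}
colouring = {
    (int(game[0]), int(game[1])): day
    for day, games in schedule.items()
    for game in games
}

print(len(colouring))                       # number of scheduled games
print(is_proper_edge_colouring(colouring))  # is no player double-booked?
```

The schedule covers all 21 edges of the complete graph on 7 vertices, and the check confirms that no player is asked to play twice on the same day.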
The constraint that edges of the same colour cannot meet at a vertex turns out to be a useful constraint in a number of contexts.
If the graph is large enough we are liable to run out of colours that can be easily distinguished (and we get tired of writing out the
names of colours). The usual convention is to refer to each colour by a number (the first colour is colour 1, etc.) and to label the
edges with the numbers rather than using colours.
Definition: k-Edge-Colourable
A graph G is k-edge-colourable if it admits a proper k -edge-colouring. The smallest integer k for which G is k -edge-
colourable is called the edge chromatic number, or chromatic index of G.
Notation
The chromatic index of $G$ is denoted by $\chi'(G)$, or simply by $\chi'$ if the context is unambiguous.
Proposition 14.1.1
For any graph $G$, $\chi'(G) \ge \Delta(G)$.
Proof
Recall that $\Delta(G)$ denotes the maximum value of $d(v)$ over all vertices $v$ of $G$. So there is some vertex $v$ of $G$ such that $d(v) = \Delta(G)$. In any proper edge-colouring, the $d(v)$ edges that are incident with $v$ must all be assigned different colours. Thus, any proper edge-colouring must have at least $d(v) = \Delta(G)$ distinct colours. This means $\chi'(G) \ge \Delta(G)$.
Example 14.1.2
with one edge of a given colour, there cannot be more than 3 edges coloured with any given colour (3 edges are already
incident with 6 of the 7 vertices, and a fourth edge would have to be incident with two others).
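We know that $K_7$ has $\binom{7}{2} = 21$ edges, so if at most 3 edges can be coloured with any given colour, we will require at least $21/3 = 7$ colours to properly colour the edges of $K_7$; that is, $\chi'(K_7) \ge 7$. Since $\Delta(K_7) = 6$, equality does not always hold in the bound of Proposition 14.1.1. Our next example shows that it is sometimes possible to achieve equality in that bound.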
Example 14.1.3
In case the edge colours are difficult to distinguish, the thick edges are 12, 36, and 45; the thin edges are 13, 24, and 56; the
dotted edges are 14, 26, and 35; the dashed edges are 15, 23, and 46; and the grey edges are 16, 25, and 34. This shows that
$\chi'(K_6) \le 5$. Since the valency of every vertex of $K_6$ is 5, Proposition 14.1.1 implies that $\chi'(K_6) \ge 5$. Putting these together, we see that $\chi'(K_6) = 5$.
The following rather remarkable result was proven by Vadim Vizing in 1964:
Theorem 14.1.1 (Vizing’s Theorem)
For any simple graph $G$, $\chi'(G) \in \{\Delta(G), \Delta(G) + 1\}$.
Proof
We will not go over the proof of this theorem.
Definition: Class One Graph and Class Two Graph
If $\chi'(G) = \Delta(G)$ then $G$ is said to be a class one graph, and if $\chi'(G) = \Delta(G) + 1$ then $G$ is said to be a class two graph.
To date, graphs have not been completely classified according to which graphs are class one and which are class two, but it has
been proven that “almost every” graph is of class one. Technically, this means that if you choose a random graph out of all of the
graphs on at most n vertices, the probability that you will choose a class two graph approaches 0 as n approaches infinity.
There are, however, infinitely many class two graphs; the same argument we used to show that $\chi'(K_7) \ge 7$ can also be used to prove that $\chi'(K_{2n+1}) = 2n + 1$ for any positive integer $n$, since the number of edges is
$$\frac{(2n+1)(2n)}{2} = n(2n+1) \qquad (14.1.1)$$
and each colour can only be used to colour $n$ of the edges. Since $\Delta(K_{2n+1}) = 2n$, this shows that $K_{2n+1}$ is class two.
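The counting argument is easy to tabulate. The following sketch (the function name `counting_lower_bound` is ours) computes the lower bound it gives for the chromatic index of a complete graph: the number of edges divided by the largest number of edges one colour can cover, rounded up. It only reproduces the lower bound, not a colouring that attains it.

```python
def counting_lower_bound(num_vertices):
    """Lower bound on the chromatic index of K_m (m >= 2) from edge counting.

    One colour class is a set of pairwise disjoint edges, so it contains
    at most floor(m / 2) edges.
    """
    num_edges = num_vertices * (num_vertices - 1) // 2
    per_colour = num_vertices // 2
    return -(-num_edges // per_colour)  # ceiling division


for m in range(3, 10):
    print(m, m - 1, counting_lower_bound(m))  # vertices, maximum valency, bound
```

For odd m the bound exceeds the maximum valency m − 1 by one, which is the class two behaviour described above; for even m the bound equals m − 1, consistent with the colouring of $K_6$ in Example 14.1.3.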
Large families of graphs have been shown to be class one graphs. We will devote most of the rest of this section to proving that all
of the graphs in one particular family are class one. First, we need to define the family.
Definition: Bipartition
A graph is bipartite if its vertices can be partitioned into two sets $V_1$ and $V_2$, such that every edge of the graph has one of its endvertices in $V_1$, and the other in $V_2$. The sets $V_1$ and $V_2$ form a bipartition of the graph.
Example 14.1.4
The following graphs are bipartite. Every edge has one endvertex on the left side, and one on the right.
The graph $K_n$ is not bipartite if $n \ge 3$. The first vertex may as well go into $V_1$; the second vertex is adjacent to it, so must go into $V_2$; but the third vertex is adjacent to both, so cannot go into either $V_1$ or $V_2$.
Although the following class of bipartite graphs will not be used in this chapter, they are an important class of bipartite graphs that
will come up again later.
The complete bipartite graph, $K_{m,n}$, is the bipartite graph on $m + n$ vertices with as many edges as possible subject to the constraint that it has a bipartition into sets of cardinality $m$ and $n$. That is, it has every edge between the two sets of the bipartition.
Before proving that all bipartite graphs are class one, we need to understand the structure of bipartite graphs a bit better. Here is an
important theorem.
Theorem 14.1.2
A graph $G$ is bipartite if and only if $G$ contains no cycle of odd length.
Proof
This is an if and only if statement, so we have two implications to prove.
(⇒) We prove the contrapositive, that if $G$ contains a cycle of odd length, then $G$ cannot be bipartite.
Let
$$(v_1, v_2, \ldots, v_{2k+1}, v_1)$$
be a cycle of odd length in $G$. We try to establish a bipartition $V_1$ and $V_2$ for $G$. Without loss of generality, we may assume that $v_1 \in V_1$. Then we must have $v_2 \in V_2$ since $v_2 \sim v_1$. Continuing in this fashion around the cycle, we see that for every $1 \le i \le k$, we have $v_{2i+1} \in V_1$ and $v_{2i} \in V_2$. In particular, $v_{2k+1} \in V_1$, but $v_1 \in V_1$ and $v_1 \sim v_{2k+1}$, contradicting the fact that every edge must have one of its endvertices in $V_2$. Thus, $G$ is not bipartite.
(⇐) Let $G$ be a graph that is not bipartite. We must show that there is an odd cycle in $G$.
If every connected component of $G$ is bipartite, then $G$ is bipartite (choose one set of the bipartition from each connected component; let $V_1$ be the union of these, and $V_2$ the set of all other vertices of $G$; this is a bipartition for $G$). Thus there is at least one connected component of $G$ that is not bipartite. Choose any vertex of this component and place it into $V_1$. Place all of its neighbours into $V_2$. Place all of their neighbours into $V_1$. Repeat this process, at each step putting all of the neighbours of every vertex of $V_1$ into $V_2$, and all of the neighbours of every vertex of $V_2$ into $V_1$.
Since this component is not bipartite, at some point we must run into the situation that we place a vertex $v$ into $V_j$, but a neighbour $u_1$ of $v$ is also in $V_j$ (for some $j \in \{1, 2\}$). By our construction of $V_1$ and $V_2$, there must be a walk from $u_1$ to $v$ that alternates between vertices in $V_j$ and vertices in $V_{3-j}$. By Proposition 12.3.1, there must in fact be a path from $u_1$ to $v$ that alternates between vertices in $V_j$ and vertices in $V_{3-j}$. Since the path alternates between the two sets but begins and ends in $V_j$, it has even length. Therefore, adding $u_1$ to the end of this path yields a cycle of odd length in $G$.
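The construction used in the (⇐) direction, placing neighbours alternately into the two sets until a conflict appears, is essentially a breadth-first search. Here is a minimal sketch under our own conventions: the graph is a dictionary mapping each vertex to a list of its neighbours, and the names `bipartition` and `cycle_graph` are hypothetical helpers, not anything defined in the text.

```python
from collections import deque


def bipartition(adj):
    """Return (V1, V2) if the graph is bipartite, and None otherwise."""
    side = {}  # vertex -> 1 or 2
    for start in adj:
        if start in side:
            continue  # this component has already been processed
        side[start] = 1
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in side:
                    side[u] = 3 - side[v]  # neighbours go into the other set
                    queue.append(u)
                elif side[u] == side[v]:
                    return None  # two adjacent vertices were forced into one set
    v1 = {v for v, s in side.items() if s == 1}
    v2 = {v for v, s in side.items() if s == 2}
    return v1, v2


def cycle_graph(n):
    """Adjacency lists of the cycle on vertices 0..n-1 (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(bipartition(cycle_graph(6)))  # an even cycle is bipartite
print(bipartition(cycle_graph(5)))  # an odd cycle is not
```

When the function returns None, the conflict it found corresponds to the odd cycle produced in the proof, although this sketch does not reconstruct that cycle explicitly.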
In order to prove that bipartite graphs are class one, we require a lemma.
Lemma 14.1.1
Let G be a connected graph that is not a cycle of odd length. Then the edges of G can be 2-coloured so that edges of both
colours are incident with every non-leaf vertex. (Note: this will probably not be a proper 2-edge-colouring of G.)
Proof
We first consider the case where every vertex of G has even valency.
Choose a vertex v of G subject to the constraint that if any vertex of G has valency greater than 2, then v is such a vertex.
Since every vertex of G has even valency, we can find an Euler tour of G that begins and ends at v . Alternate edge colours
around this tour. Clearly, every vertex that is visited in the middle of the tour (that is, every vertex except possibly v ) must
be incident with edges of both colours, since whichever colour is given to the edge we travel to reach that vertex, the other
colour will be given to the edge we travel when leaving that vertex. If any vertex of G has valency greater than 2, then by
our choice of v , the valency of v must be greater than 2, so v is visited in the middle of the tour, and this colouring has the
desired property. If every vertex of G has valency 2, then since G is connected, G must be a cycle (see Exercise 12.3.1(4)).
Since G is not a cycle of odd length (by hypothesis), G must be a cycle of even length. Therefore the number of edges of
G is even, so the tour will begin and end with edges of opposite colours, both of which are incident with v . Again we see
edge uv, so during any other visit, the edges we travel on to reach v and to leave v will have different colours. Neither of
these edges is incident with u , so both are in G . Thus, this colouring has the desired property.
Notation
Given a (not necessarily proper) edge-colouring C , we use c(v) to denote the number of distinct colours that have been used on
edges that are incident with v . Clearly, c(v) ≤ d(v) .
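An edge-colouring $C'$ is an improvement on $C$ if
$$\sum_{v \in V} c'(v) > \sum_{v \in V} c(v),$$
where $c'(v)$ denotes the number of distinct colours that $C'$ uses on edges incident with $v$. A $k$-edge-colouring is optimal if no $k$-edge-colouring is an improvement on it. Since $c(v) \le d(v)$ for every vertex $v$, if
$$\sum_{v \in V} c(v) = \sum_{v \in V} d(v)$$
then we must have $c(v) = d(v)$ for every $v \in V$. This is precisely equivalent to the definition of a proper colouring.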
At last, we are ready to prove that bipartite graphs are class one.
Theorem 14.1.3
Every bipartite graph is a class one graph.
Proof
Let $G$ be a bipartite graph. Towards a contradiction, suppose that $\chi'(G) > \Delta(G)$.
Let $C$ be an optimal $\Delta(G)$-edge-colouring of $G$. By assumption, $C$ will not be a proper edge colouring, so there must be some vertex $u$ such that $c(u) < d(u)$. By the Pigeonhole Principle, some colour $j$ must be used to colour at least two of the edges incident with $u$, and since there are $\Delta(G) \ge d(u)$ colours in total and only $c(u)$ are used on edges incident with $u$, there must be some colour $i$ that is not used to colour any edge incident with $u$.
Consider only the edges of $G$ that have been coloured with either $i$ or $j$ in the colouring $C$. Since $G$ is bipartite, these edges cannot include an odd cycle. We apply Lemma 14.1.1 to each connected component formed by these edges to re-colour these edges. Our re-colouring will use only colours $i$ and $j$, and if a vertex $v$ was incident to at least two edges coloured with either $i$ or $j$ in $C$, then under the re-colouring, $v$ will be incident with at least one edge coloured with $i$ and at least one edge coloured with $j$. Leave all of the other edge colours alone, and call this new colouring $C'$.
We claim that $C'$ is an improvement on $C$. Any vertex $v$ that had at most one incident edge coloured with either $i$ or $j$ under $C$ will still have exactly the same colours, except that the edge coloured $i$ or $j$ might have switched its colour to the other of $i$ and $j$. In any case, we will have $c'(v) = c(v)$. Any vertex $v$ that had at least two incident edges coloured with either $i$ or $j$ under $C$ will still have all of the same colours, except that it will now have incident edges coloured with both $i$ and $j$, so $c'(v) \ge c(v)$. Furthermore, we have $c'(u) > c(u)$ since the edges incident with $u$ now include edges coloured with both $i$ and $j$, where before there were only edges coloured with $j$. Thus,
$$\sum_{v \in V} c'(v) > \sum_{v \in V} c(v) \qquad (14.1.4)$$
so $C'$ is an improvement on $C$, as claimed.
We have contradicted our assumption that $C$ is an optimal $\Delta(G)$-edge-colouring. This contradiction serves to prove that $\chi'(G) = \Delta(G)$.
Exercise 14.1.1
5) Find a systematic approach to colouring the edges of complete graphs that demonstrates that $\chi'(K_{2n-1}) = \chi'(K_{2n}) = 2n - 1$.
6) Find a systematic approach to colouring the edges of complete bipartite graphs that demonstrates that $\chi'(K_{m,n}) = \Delta(K_{m,n}) = \max\{m, n\}$.
Exercise 14.1.2
The following exercises illustrate some of the connections between Hamilton cycles and edge-colouring.
1) Definition. A graph is said to be Hamilton-connected if there is a Hamilton path from each vertex in the graph to each of the
other vertices in the graph. Prove that if G is bipartite and has at least 3 vertices, then G is not Hamilton-connected.
[Hint: Prove this by contradiction. Consider the length of a Hamilton path and where it can end.]
2) Suppose that $G$ is a bipartite graph with $V_1$ and $V_2$ forming a bipartition. Show that if $|V_1| \ne |V_2|$ then $G$ has no Hamilton cycle.
3) Prove that if every vertex of G has valency 3, and G has a Hamilton cycle, then G is class one.
[Hint: Use the corollary to Euler’s handshaking lemma, and find a way to assign colours to the edges of the Hamilton cycle.]
This page titled 14.1: Edge Coloring is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
14.1.6 https://math.libretexts.org/@go/page/60143
14.2: Ramsey Theory
Although Ramsey Theory is an important part of Combinatorics (along with Enumeration, Graph Theory, and Design Theory), this
course will touch on it only very lightly. The basic idea is that if a very large object is cut into two pieces (or a small number of
pieces), then at least one of the pieces must contain a very nice subset. Here is an illustration.
Example 14.2.1
Suppose each edge of $K_6$ is coloured either red or blue. Show that either there is a triangle whose edges are all red, or there is a triangle whose edges are all blue. That is, $K_6$ contains a copy of $K_3$ that has all of its edges of the same colour. For short, we say that $K_6$ contains a monochromatic triangle.
Solution
Choose some vertex $v$. Since the 5 edges incident with $v$ are coloured with only two colours, the generalized Pigeonhole Principle implies that three of these edges are the same colour. For definiteness, let us say that three edges $vu_1$, $vu_2$, and $vu_3$ are all red. If the edges $u_1u_2$, $u_1u_3$, and $u_2u_3$ are all blue, then $u_1$, $u_2$, and $u_3$ are the vertices of the desired monochromatic triangle (namely, a blue triangle). So we may assume that one of the edges is red; say, $u_1u_2$ is red. Since the edges $vu_1$ and $vu_2$ are also red, we see that $v$, $u_1$, and $u_2$ are the vertices of a monochromatic triangle (namely, a red triangle).
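Because $K_6$ has only 15 edges, the claim of Example 14.2.1 can also be confirmed by exhaustive search over all $2^{15}$ colourings. The sketch below is a brute-force check, not a replacement for the argument above; the function name `every_colouring_has_mono_triangle` and the encoding of the two colours as 0 and 1 are our own choices.

```python
from itertools import combinations, product


def every_colouring_has_mono_triangle(n):
    """Check all 2-colourings of the edges of K_n for a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    index = {edge: i for i, edge in enumerate(edges)}
    # Each triangle is stored as the positions of its three edges.
    triangles = [
        (index[(a, b)], index[(a, c)], index[(b, c)])
        for a, b, c in combinations(range(n), 3)
    ]
    for colours in product((0, 1), repeat=len(edges)):
        if not any(colours[x] == colours[y] == colours[z] for x, y, z in triangles):
            return False  # found a colouring with no monochromatic triangle
    return True


print(every_colouring_has_mono_triangle(6))
print(every_colouring_has_mono_triangle(5))
```

The search succeeds for 6 vertices and fails for 5, since $K_5$ admits a colouring in which each colour class is a 5-cycle.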
1) Suppose each edge of $K_n$ is coloured either red or blue. We say there is a red copy of $K_k$ if there exist $k$ vertices $u_1, \ldots, u_k$, such that the edge $u_iu_j$ is red for all $i$ and $j$ (with $i \ne j$). Similarly, we say there is a blue copy of $K_\ell$ if there exist $\ell$ vertices $v_1, \ldots, v_\ell$, such that the edge $v_iv_j$ is blue for all $i$ and $j$ (with $i \ne j$).
2) The Ramsey number $R(k, \ell)$ is the smallest number $n$, such that whenever each edge of $K_n$ is coloured either red or blue, there is always either a red copy of $K_k$ or a blue copy of $K_\ell$.
Example 14.2.2
1) We have $R(k, 1) = 1$ for all $k$. This is because $K_1$ has no edges, so, for any colouring of any $K_n$, it is true (vacuously) that all of the edges of any $K_1$ are blue.
2) We have $R(k, 2) = k$ for all $k$. Namely, if some edge is blue, then there is a blue $K_2$, while if there are no blue edges, then every edge is red, so the entire $K_k$ is red.
3) We have $R(3, 3) = 6$. To see this, note that Example 14.2.1 shows $R(3, 3) \le 6$, while the edge-colouring of $K_5$ at right has no monochromatic triangle (because the only monochromatic cycles are of length 5), so $R(3, 3) > 5$.
4) We have $R(k, \ell) = R(\ell, k)$ for all $k$ and $\ell$. Namely, if every colouring of $K_n$ has either a red $K_k$ or a blue $K_\ell$, then we see that every colouring of $K_n$ must have either a red $K_\ell$ or a blue $K_k$, just by switching red and blue in the colouring.
5) If $k \le k'$ and $\ell \le \ell'$, then $R(k, \ell) \le R(k', \ell')$. Namely, if $n = R(k', \ell')$, then every colouring of $K_n$ contains either a red $K_{k'}$ or a blue $K_{\ell'}$. Since $k \le k'$ and $\ell \le \ell'$, we know that any $K_{k'}$ contains a copy of $K_k$, and any $K_{\ell'}$ contains a copy of $K_\ell$.
It is not at all obvious that R(k, ℓ) exists: theoretically, R(4, 4) might not exist, because it might be possible to colour the edges of
a very large $K_n$ in such a way that there is no monochromatic $K_4$. Fortunately, the following extension of the proof of Example 14.2.1 implies that $R(k, \ell)$ does exist for all $k$ and $\ell$. In fact, it provides a bound on how large $R(k, \ell)$ can be (see Exercise
14.2.1(3) below).
Proposition 14.2.1
$R(k, \ell) \le R(k - 1, \ell) + R(k, \ell - 1)$ for all $k, \ell \ge 2$.
Proof
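Let $n = R(k - 1, \ell) + R(k, \ell - 1)$, and suppose each edge of $K_n$ is coloured either red or blue. We wish to show there is either a red $K_k$ or a blue $K_\ell$. Choose some vertex $v$. There are $n - 1 = R(k - 1, \ell) + R(k, \ell - 1) - 1$ edges incident with $v$, so the very generalized Pigeonhole Principle implies that either $R(k - 1, \ell)$ of these edges are red, or $R(k, \ell - 1)$ of these edges are blue.
For definiteness, let us assume that the edges $vu_1, vu_2, \ldots, vu_r$ are all blue, where $r = R(k, \ell - 1)$. Now, $u_1, u_2, \ldots, u_r$ are the vertices of a copy of $K_r$ that is inside $K_n$. Since $r = R(k, \ell - 1)$, we know that this $K_r$ contains either a red $K_k$ or a blue $K_{\ell - 1}$. In the first case we are done. In the second case, adding $v$ to the blue $K_{\ell - 1}$ gives a blue $K_\ell$, because every edge from $v$ to $u_1, \ldots, u_r$ is blue. (The case of $R(k - 1, \ell)$ red edges is similar.)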
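To see what kind of bound this recursion produces, we can simply unfold it. The sketch below is an illustration under stated assumptions: it starts from the values $R(k, 1) = R(1, \ell) = 1$ of Example 14.2.2 and applies the inequality of Proposition 14.2.1 as if it were an equality. The name `ramsey_upper_bound` is ours, and the numbers it returns are upper bounds only, not Ramsey numbers.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def ramsey_upper_bound(k, l):
    """Upper bound on R(k, l) obtained by unfolding Proposition 14.2.1."""
    if k == 1 or l == 1:
        return 1  # R(k, 1) = R(1, l) = 1
    return ramsey_upper_bound(k - 1, l) + ramsey_upper_bound(k, l - 1)


for k in range(2, 6):
    print([ramsey_upper_bound(k, l) for l in range(2, 6)])

# The recursion is Pascal's rule, so the bound is a binomial coefficient.
print(ramsey_upper_bound(4, 4), comb(4 + 4 - 2, 4 - 1))
```

The bound is exact for $R(3, 3) = 6$ but not in general: it gives 20 for $R(4, 4)$, whose true value is 18.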
Exercise 14.2.1
[Hint: Show that there must either be two red (say) triangles, or a red triangle and a blue edge whose endvertices are not in the
triangle. Then show that any colouring of the edges joining the red triangle with the blue edge creates either a blue triangle or a
second red triangle.]
Note
The exact value of R(k, ℓ) seems to be extremely difficult to find, except for very small values of k and ℓ . For example,
although it has been proved that R(4, 4) = 18 and R(4, 5) = 25, no one has been able to determine the precise value of
R(k, ℓ) for any situation in which k and ℓ are both at least 5 . The legendary combinatorist Paul Erdös (1913–1996) said that it
would be hopeless to try to calculate the exact value of R(6, 6), even with all of the computer resources and brightest minds in
the whole world working on the problem for a year. (We do know that R(6, 6) is somewhere between 102 and 165.) For more
information about the values that have been calculated, see the Wikipedia article on Ramsey’s theorem.
Exercise 14.2.2
The edges of $K_n$ can also be coloured with more than two colours.
1) Show every colouring of the edges of $K_{17}$ with 3 colours has a monochromatic triangle.
2) Suppose there is a monochromatic triangle in every colouring of the edges of $K_n$ with $c$ colours. Show that if $N - 1 > (c + 1)(n - 1)$, then every colouring of the edges of $K_N$ with $c + 1$ colours has a monochromatic triangle.
Similar arguments (combined with induction on the number of colours) establish the following very general result.
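For every $c \ge 1$ and fixed sizes $n_1, \ldots, n_c \ge 1$, there is an integer $r = R(n_1, \ldots, n_c)$ such that for any $c$-colouring of the edges of $K_r$, there must be some $i \in \{1, \ldots, c\}$ such that $K_r$ has a subgraph isomorphic to $K_{n_i}$ all of whose edges have been coloured with colour $i$.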
Proof
We will prove this result by induction on c, the number of colours.
Base cases: When $c = 1$, all edges of $K_r$ are coloured with our single colour, so if we let $r = R(n_1) = n_1$, the whole graph is the $K_{n_1}$ all of whose edges have been coloured with colour 1.
We will also require $c = 2$ to be a base case in our induction. In order to prove this second base case, we perform a second proof by induction, this time on $n_1 + n_2$. To make the proof easier to read, we’ll call the two colours in any 2-edge-colouring red and blue, and if all of the edges of a $K_i$ have been coloured with one colour, we’ll simply call it a red $K_i$, or a blue $K_i$.
Base case for the second induction: We’ll actually prove a lot of base cases all at once. Since $n_1$ and $n_2$ are the number of vertices of a complete graph, we must have $n_1, n_2 \ge 1$. A $K_1$ has no edges, so vacuously its edges have whichever colour we desire. Thus if $n_1 = 1$ or $n_2 = 1$, we have $r = R(n_1, n_2) = 1$, since for any 2-edge-colouring of $K_1$, there is a red $K_1$ and a blue $K_1$.
Inductive step for the second induction: We begin with the inductive hypothesis. Let $k \ge 2$ be arbitrary. Assume that for every $k_1, k_2 \ge 1$ such that $k_1 + k_2 = k$, there is some integer $r = R(k_1, k_2)$ such that for any 2-edge-colouring of the edges of $K_r$, there is a subgraph that is either a red $K_{k_1}$ or a blue $K_{k_2}$.
Let $n_1, n_2 \ge 1$ such that $n_1 + n_2 = k + 1$. If either $n_1 = 1$ or $n_2 = 1$, then this was one of our base cases and the proof is complete, so we may assume that $n_1, n_2 \ge 2$. Then $n_1 + (n_2 - 1) = k$, so by our inductive hypothesis, there is some integer $r_1 = R(n_1, n_2 - 1)$ such that for any 2-edge-colouring of the edges of $K_{r_1}$, there is a subgraph that is either a red $K_{n_1}$ or a blue $K_{n_2 - 1}$. We can also use our inductive hypothesis to conclude that there is some integer $r_2 = R(n_1 - 1, n_2)$ such that for any 2-edge-colouring of the edges of $K_{r_2}$, there is a subgraph that is either a red $K_{n_1 - 1}$ or a blue $K_{n_2}$.
We claim that $R(n_1, n_2) \le r_1 + r_2$. We will show this by proving that any 2-edge-colouring of the edges of $K_{r_1 + r_2}$ has a subgraph that is either a red $K_{n_1}$ or a blue $K_{n_2}$.
Consider a complete graph on $r_1 + r_2$ vertices whose edges have been coloured with red and blue. Choose a vertex $v$, and divide the remaining vertices into two sets: $u \in V_1$ if the edge $uv$ has been coloured red, and $u \in V_2$ if the edge $uv$ has been coloured blue. Since
$$r_1 + r_2 = |V_1| + |V_2| + 1, \qquad (14.2.3)$$
we must have either $|V_1| \ge r_2$ or $|V_2| \ge r_1$.
Suppose first that $|V_1| \ge r_2$. Since $r_2 = R(n_1 - 1, n_2)$, the subgraph whose vertices are the elements of $V_1$ has a subgraph that is either a red $K_{n_1 - 1}$ or a blue $K_{n_2}$. In the latter case, this subgraph is also in our original $K_{r_1 + r_2}$ and we are done. In the former case, the subgraph whose vertices are the elements of $V_1 \cup \{v\}$ has a red $K_{n_1}$ and we are done.
Suppose now that $|V_2| \ge r_1$ (the proof is similar). Since $r_1 = R(n_1, n_2 - 1)$, the subgraph whose vertices are the elements of $V_2$ has a subgraph that is either a red $K_{n_1}$ or a blue $K_{n_2 - 1}$. In the former case, this subgraph is also in our original $K_{r_1 + r_2}$ and we are done. In the latter case, the subgraph whose vertices are the elements of $V_2 \cup \{v\}$ has a blue $K_{n_2}$ and we are done.
By the Principle of Mathematical Induction, for every $n_1, n_2 \ge 1$, there is some integer $r = R(n_1, n_2)$ such that for any 2-edge-colouring of the edges of $K_r$, there is a subgraph that is either a red $K_{n_1}$ or a blue $K_{n_2}$.
This second proof by induction completes the proof of the second base case for our original induction on c, the number of
colours. We are now ready for the inductive step for our original proof by induction.
Inductive step: We begin with the inductive hypothesis. Let $m \ge 2$ be arbitrary. Assume that for every $k_1, \ldots, k_m \ge 1$, there is an integer $r = R(k_1, \ldots, k_m)$ such that for any $m$-colouring of the edges of $K_r$, there must be some $i \in \{1, \ldots, m\}$ such that $K_r$ has a subgraph isomorphic to $K_{k_i}$, all of whose edges have been coloured with colour $i$.
Let $n_1, \ldots, n_{m+1} \ge 1$ be arbitrary, and let $r = R(n_1, \ldots, n_{m-1}, R(n_m, n_{m+1}))$, which exists by our inductive hypothesis together with the base case $c = 2$. Take a complete graph on $r$ vertices, and colour its edges with $m + 1$ colours. Temporarily consider the colours $m$ and $m + 1$ to be the same, resulting in a colouring of the edges with $m$ colours. By our inductive hypothesis, there must either be some $i \in \{1, \ldots, m - 1\}$ such that our $K_r$ has a subgraph isomorphic to $K_{n_i}$, all of whose edges have been coloured with colour $i$, or $K_r$ has a subgraph isomorphic to $K_{R(n_m, n_{m+1})}$ all of whose edges have been coloured with the $m$th colour (where this $m$th colour is really the combination of colours $m$ and $m + 1$). If there is some $i \in \{1, \ldots, m - 1\}$ such that our $K_r$ has a subgraph isomorphic to $K_{n_i}$ all of whose edges have been coloured with colour $i$, then we are done. The possibility remains that our $K_r$ has a subgraph isomorphic to $K_{R(n_m, n_{m+1})}$ all of whose edges have been coloured with either colour $m$ or colour $m + 1$. But by our base case for $c = 2$, this graph must have either a subgraph isomorphic to $K_{n_m}$ all of whose edges have been coloured with colour $m$, or a subgraph isomorphic to $K_{n_{m+1}}$ all of whose edges have been coloured with colour $m + 1$. This completes the inductive step.
By the Principle of Mathematical Induction, for every $c \ge 1$ and fixed sizes $n_1, \ldots, n_c \ge 1$, there is an integer $r = R(n_1, \ldots, n_c)$ such that for any $c$-colouring of the edges of $K_r$, there must be some $i \in \{1, \ldots, c\}$ such that $K_r$ has a subgraph isomorphic to $K_{n_i}$ all of whose edges have been coloured with colour $i$.
Exercise 14.2.3
Exercise 14.2.4
(Schur’s Theorem) Let $c \in \mathbb{N}^+$, and let $N = R(3, \ldots, 3)$ where there are $c$ entries (all equal to 3). If $\{A_1, A_2, \ldots, A_c\}$ is any partition of $\{1, 2, \ldots, N\}$ into $c$ subsets, show that some $A_i$ contains three integers $x$, $y$, and $z$, such that $x + y = z$.
[Hint: The vertices of $K_N$ are $1, 2, \ldots, N$. Put colour $i$ on each edge $uv$ with $|u - v| \in A_i$. If $u$, $v$, $w$ are the vertices of a monochromatic triangle with $u > v > w$, then $(u - v) + (v - w) = u - w$.]
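For small cases the statement of Schur’s Theorem can be explored by brute force before attempting the proof. The sketch below is only an experiment, with function names of our own choosing. It takes $c = 2$, so that $N = R(3, 3) = 6$, and it allows $x = y$ (as the hint does, since $u - v$ and $v - w$ may coincide). Parts of the partition are allowed to be empty here, which only makes the check more demanding.

```python
from itertools import product


def has_schur_triple(part):
    """True if the set contains x, y, z with x + y = z (x = y is allowed)."""
    return any(x + y in part for x in part for y in part)


def every_partition_has_schur_triple(N, c):
    """Check every way of splitting {1, ..., N} into c labelled parts."""
    for labels in product(range(c), repeat=N):
        parts = [set() for _ in range(c)]
        for number, label in zip(range(1, N + 1), labels):
            parts[label].add(number)
        if not any(has_schur_triple(part) for part in parts):
            return False  # this partition avoids x + y = z in every part
    return True


print(every_partition_has_schur_triple(6, 2))
print(every_partition_has_schur_triple(4, 2))
```

The check succeeds for $N = 6$, while for $N = 4$ it fails because the partition $\{1, 4\}, \{2, 3\}$ has no solution in either part. This shows that some lower bound on $N$ is needed, although $R(3, 3)$ is not the smallest value that works.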
This page titled 14.2: Ramsey Theory is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
14.2.4 https://math.libretexts.org/@go/page/60144
14.3: Vertex Colouring
Suppose you have been given the task of assigning broadcast frequencies to transmission towers. You have been given a list of
frequencies that you are permitted to assign. There is a constraint: towers that are too close together cannot be assigned the same
frequency, since they would interfere with each other.
One way to approach this problem is to model it as a graph. The vertices of the graph will represent the towers, and the edges will
represent towers that can interfere with each other. Your job is to assign a frequency to each of the vertices. Instead of writing a
frequency on each vertex, we will choose a colour to represent that frequency, and use that colour to colour the vertices to which
you assign that frequency.
Here is an example of this.
Example 14.3.1
This represents a possible assignment of 4 colours to the vertices. The colour of each vertex (red, green, blue, or yellow) is indicated by writing the first letter of the colour’s name on the vertex.
Notice that this colouring obeys the constraint that interfering towers are not assigned the same frequencies.
A proper k-vertex-colouring (or just k-colouring) of a graph G is a function that assigns to each vertex of G one of k colours, such that adjacent vertices must be assigned different colours.
As with edge-colouring, the constraint that adjacent vertices receive different colours turns out to be a useful constraint that arises
in many contexts. We often represent the k colours by the numbers 1, . . . , k, and label the vertices with the appropriate numbers
rather than colouring them.
Definition: k-Colourable
A graph G is k -colourable if it admits a proper k -(vertex-)colouring. The smallest integer k for which G is k -colourable is
called the chromatic number of G.
Notation
The chromatic number of G is denoted by χ(G) , or simply by χ if the context is unambiguous.
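A simple way to produce some proper colouring, and hence an upper bound on the chromatic number, is to colour the vertices one at a time, always using the smallest colour not already present on a neighbour. The sketch below illustrates this greedy idea under our own conventions (a dictionary of adjacency lists, colours numbered from 1, hypothetical function names). It is not claimed to find the chromatic number: the number of colours it uses depends on the order in which vertices are processed.

```python
def greedy_colouring(adj):
    """Colour vertices in dictionary order with the smallest available colour."""
    colour = {}
    for v in adj:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour


def is_proper_vertex_colouring(adj, colour):
    """True if no two adjacent vertices share a colour."""
    return all(colour[u] != colour[v] for v in adj for u in adj[v])


# A 5-cycle on vertices 0..4.
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
colouring5 = greedy_colouring(cycle5)
print(colouring5)
print(is_proper_vertex_colouring(cycle5, colouring5))
```

Each vertex has at most $\Delta(G)$ neighbours already coloured when it is processed, so this procedure never needs more than $\Delta(G) + 1$ colours.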
14.3.1 https://math.libretexts.org/@go/page/60145
Proposition 14.3.1
Example 14.3.2
Prove that for a graph G, χ(G) = 2 if and only if G is a bipartite graph that has at least one edge.
Solution
(⇒) Suppose that χ(G) = 2. Take a proper 2-colouring of G with colours 1 and 2. Let V_1 denote the set of vertices of colour
1, and let V_2 denote the set of vertices of colour 2. Since the colouring is proper, there are no edges both of whose endvertices
are in V_1 (as these would be adjacent vertices both coloured with colour 1). Similarly, there are no edges both of whose
endvertices are in V_2. Thus, the sets V_1 and V_2 form a bipartition of G, so G is bipartite. Since 2 colours were required to
properly colour G, the graph G has at least one edge.
(⇐) Suppose that G is bipartite with bipartition V_1, V_2, and that G has at least one edge. Colour the vertices of V_1 with colour 1, and
colour the vertices of V_2 with colour 2. By the definition of a bipartition, no pair of adjacent vertices can have been assigned
the same colour. Thus, this is a proper 2-colouring of G, so χ(G) ≤ 2. Since G has at least one edge, the endpoints of that
edge must be assigned different colours, so χ(G) ≥ 2. Thus χ(G) = 2.
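The equivalence in this example also suggests a procedure for finding a 2-colouring when one exists. The sketch below uses breadth-first search, which is one common way to do this and is not taken from the text; it assumes the graph is stored as a dictionary mapping each vertex to the set of its neighbours, and the name `two_colour` is hypothetical.

```python
from collections import deque


def two_colour(adj):
    """Try to properly 2-colour a graph given as {vertex: set of neighbours}.

    Returns a dict vertex -> 1 or 2, or None if the graph is not bipartite.
    """
    colour = {}
    for start in adj:
        if start in colour:
            continue  # this component has already been coloured
        colour[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in colour:
                    colour[w] = 3 - colour[u]  # the other colour
                    queue.append(w)
                elif colour[w] == colour[u]:
                    return None  # an edge inside one colour class
    return colour


path = {1: {2}, 2: {1, 3}, 3: {2}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(two_colour(path))
print(two_colour(triangle))
```

The two colour classes returned for the path play the roles of V_1 and V_2 in the solution above; the triangle is a cycle of odd length, so no 2-colouring is found.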
Example 14.3.3
Since a cycle of odd length is not bipartite (see Theorem 14.1.2), Example 14.3.2 shows that χ(C_{2n+1}) ≠ 2, so χ(C_{2n+1}) ≥ 3. Let the cycle be
(u_1, u_2, . . . , u_{2n+1}, u_1). Since the only edges in the graph are between consecutive vertices in this list, if we assign colour 1 to
u_1, colour 2 to u_{2i} for 1 ≤ i ≤ n, and colour 3 to u_{2i+1} for 1 ≤ i ≤ n, this will be a proper 3-colouring. Thus, χ(C_{2n+1}) = 3.
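As a quick sanity check of the colouring just described, here is a small sketch that builds it for a given n and confirms that it is proper. The vertex u_j is represented by the integer j; the function name `colour_odd_cycle` is an assumption made for illustration.

```python
def colour_odd_cycle(n):
    """3-colour the cycle (u_1, ..., u_{2n+1}, u_1): colour 1 on u_1,
    colour 2 on even-indexed vertices, colour 3 on the other odd-indexed ones."""
    length = 2 * n + 1
    colour = {1: 1}
    for j in range(2, length + 1):
        colour[j] = 2 if j % 2 == 0 else 3
    # Edges join consecutive vertices, and u_{2n+1} is joined back to u_1.
    edges = [(j, j % length + 1) for j in range(1, length + 1)]
    if any(colour[a] == colour[b] for a, b in edges):
        raise AssertionError("colouring is not proper")
    return colour


print(colour_odd_cycle(2))
```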
Definition: k-Critical
A graph G is k-critical if χ(G) = k , but for every proper subgraph H of G, χ(H ) < χ(G) .
Proposition 14.3.2
Proof
Towards a contradiction, suppose that G is a disconnected k-critical graph, and let G_1 and G_2 be (nonempty) subgraphs of
G such that every vertex of G is in either G_1 or G_2, and there is no edge from any vertex in G_1 to any vertex in G_2. By the
definition of k-critical, χ(G_1) < χ(G) and χ(G_2) < χ(G). But if we colour G_1 with χ(G_1) colours and G_2 with χ(G_2)
colours, since there is no edge from any vertex of G_1 to any vertex of G_2, this produces a proper colouring of G with
max(χ(G_1), χ(G_2)) < χ(G)
colours. This contradiction serves to prove that every k-critical graph is connected.
Theorem 14.3.1
Proof
Towards a contradiction, suppose that G is k -critical and has a vertex v of valency at most k − 2 . By the definition of k -
critical, G ∖ {v} must be (k − 1) -colourable. Now, since v has no more than k − 2 neighbours, its neighbours can be
assigned at most k − 2 distinct colours in this colouring. Therefore, amongst the colours used in the (k − 1) -colouring of
G ∖ {v} , there must be a colour that is not assigned to any of the neighbours of v . If we assign this colour to v , the result is
a proper (k − 1) -colouring of G , contradicting χ(G) = k . This contradiction serves to prove that every k -critical graph
has minimum valency at least k − 1 .
Corollary 14.3.1
Proof
Let G be an arbitrary graph. By deleting as many edges and vertices as it is possible to delete without reducing the
chromatic number (we can never increase the chromatic number by deleting vertices or edges, see Exercise 14.3.1(1)), we
see that G must have a subgraph H that is χ(G) -critical. By Theorem 14.3.1, we see that
δ(H ) ≥ χ(G) − 1. (14.3.2)
Thus, every vertex of H has valency at least χ(G) − 1 , so in G , these same vertices still have valency at least χ(G) − 1 .
For any such vertex v , we have
Δ(G) ≥ d(v) ≥ χ(G) − 1, (14.3.3)
so χ(G) ≤ Δ(G) + 1 .
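The bound χ(G) ≤ Δ(G) + 1 can also be seen constructively. The sketch below is a greedy colouring, which is a different argument from the proof above (it does not use critical subgraphs): each vertex has at most Δ(G) neighbours, so one of the colours 1, . . . , Δ(G) + 1 is always free. The dictionary-of-sets representation and the name `greedy_colouring` are assumptions of this sketch.

```python
def greedy_colouring(adj):
    """Colour the vertices one at a time, giving each the smallest colour
    not already used on one of its neighbours.

    adj maps each vertex to the set of its neighbours.
    """
    colour = {}
    for v in adj:
        used = {colour[w] for w in adj[v] if w in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour


# K_4 has maximum valency 3, so at most 4 colours are used.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
print(greedy_colouring(k4))
```

Greedy colouring gives an upper bound only; depending on the order in which vertices are visited it may use more than χ(G) colours.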
We have already seen two families of graphs for which this bound is attained: for complete graphs, we have
χ(K_n) = n = Δ(K_n) + 1, and for cycles of odd length, we have χ(C_{2n+1}) = 3 = Δ(C_{2n+1}) + 1
(see Example 14.3.3). In fact, Brooks proved in 1941 that these are the only connected graphs for which this bound is attained.
Proof
We will not include the proof of this result in this course. This theorem does allow us to determine the chromatic number of
some graphs with very little work.
Example 14.3.4
The following very famous graph is called the Petersen graph. It is an exceptional graph in many ways, so when
mathematicians are trying to come up with a proof or a counterexample in graph theory, it is often one of the first examples
they will check. Find its chromatic number.
Solution
We have Δ = 3 , and since this graph is neither a complete graph nor a cycle of odd length, by Brooks’ Theorem this shows
that χ ≤ 3 . We can find a cycle of length 5 around the outer edge of the graph, so this graph is not bipartite but has an edge.
Therefore (by Example 14.3.2), χ > 2 . Hence χ = 3 .
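For a graph this small, the conclusion χ = 3 can be double-checked by exhaustive search. The following sketch tries every assignment of k colours for k = 1, 2, 3, . . . and stops at the first k that admits a proper colouring. The vertex labelling (outer cycle 0 to 4, inner vertices 5 to 9) is an assumption of this sketch, not the labelling of the figure, and brute force is only practical for very small graphs.

```python
from itertools import product

# Assumed labelling: outer 5-cycle on 0..4, spokes i -- i+5,
# inner "star" joining i+5 to ((i+2) mod 5) + 5.
petersen_edges = (
    [(i, (i + 1) % 5) for i in range(5)]
    + [(i, i + 5) for i in range(5)]
    + [(i + 5, (i + 2) % 5 + 5) for i in range(5)]
)


def chromatic_number(n, edges):
    """Smallest k such that vertices 0..n-1 admit a proper k-colouring."""
    for k in range(1, n + 1):
        for colouring in product(range(k), repeat=n):
            if all(colouring[u] != colouring[v] for u, v in edges):
                return k
    return n  # only reached when n == 0


print(chromatic_number(10, petersen_edges))
```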
Exercise 14.3.1
more than j neighbours. What (if anything) can you say about χ(G) ? Can you say more if you know that G is connected and is
neither a complete graph nor a cycle of odd length?
Exercise 14.3.2
For each of the following graphs, determine its chromatic number by using theoretical arguments to provide a lower bound,
and then producing a colouring that meets the bound. Do the same for the edge-chromatic number.
1)
2)
3)
4)
5)
6)
This page titled 14.3: Vertex Colouring is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
14.4: Summary
Vizing’s Theorem
Graphs are bipartite if and only if they contain no cycle of odd length
Ramsey’s Theorem
Graphs are bipartite if and only if they are 2-colourable
Brooks’ Theorem
Petersen graph
Important Definitions:
Proper k -edge-colouring, k -edge-colourable
Edge chromatic number, chromatic index
Class one graph, class two graph
Bipartite, bipartition
Complete bipartite graph
Proper k -colouring, k -colourable
Chromatic number
k -critical
Notation:
χ′(G)
K_{m,n}
R(n_1, . . . , n_c)
χ(G)
This page titled 14.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
14.4.1 https://math.libretexts.org/@go/page/60146
CHAPTER OVERVIEW
This page titled 15: Planar Graphs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
15.1: Planar Graphs
Visually, there is always a risk of confusion when a graph is drawn in such a way that some of its edges cross over each other. Also,
in physical realisations of a network, such a configuration can lead to extra costs (think about building an overpass as compared
with building an intersection). It is therefore helpful to be able to work out whether or not a particular graph can be drawn in such a
way that no edges cross.
Definition: Planar
A graph is planar if it can be drawn in the plane (R^2) so that edges that do not share an endvertex have no points in common, and
edges that do share an endvertex have no other points in common.
Example 15.1.1
Theorem 15.1.1
Proof
Label the vertices of K_5 as v_1, . . . , v_5. Consider the 3-cycle (v_1, v_2, v_3, v_1). The vertex v_4 must lie either inside or outside
the boundary determined by this 3-cycle. Furthermore, since there is an edge between v_4 and v_5, the vertex v_5 must lie on
the same side (inside or outside) as v_4.
Suppose first that v_4 and v_5 lie inside the boundary. The edges v_1v_4, v_2v_4, and v_3v_4 divide the area inside the boundary
into three regions, and v_5 must lie inside one of these three regions. One of v_1, v_2, and v_3 is not a corner of this region, and
in fact lies outside of it while v_5 lies inside of it, making it impossible to draw the edge from this vertex to v_5.
The proof is similar if v_4 and v_5 lie on the outside of the boundary determined by the 3-cycle (v_1, v_2, v_3, v_1).
Theorem 15.1.2
Proof
Label the vertices in one of the bipartition sets as v_1, v_2, v_3, and the vertices in the other part as u_1, u_2, u_3. Consider the
4-cycle
(v_1, u_1, v_2, u_2, v_1). (15.1.1)
The vertex v_3 must lie either inside or outside the boundary determined by this 4-cycle. Furthermore, since there is an edge
between v_3 and u_3, the vertex u_3 must lie on the same side (inside or outside) as v_3.
Suppose first that v_3 and u_3 lie inside the boundary. The edges v_3u_1 and v_3u_2 divide the area inside the boundary into two
regions, and u_3 must lie inside one of these two regions. One of v_1 and v_2 does not lie on the boundary of this region, and
in fact lies outside of it while u_3 lies inside of it, making it impossible to draw the edge from this vertex to u_3.
The proof is similar if v_3 and u_3 lie on the outside of the boundary determined by the 4-cycle (v_1, u_1, v_2, u_2, v_1).
However, both K_5 and K_{3,3} can be embedded onto the surface of what we call a torus (a doughnut shape), with no edges meeting
except at mutual endvertices. Embeddings are shown in Figures 15.1.1 and 15.1.2.
Figure 15.1.1: K_5 embedded on a torus. The dotted edge wraps around through the hole in the torus. (Copyright; author via source)
You might think at this point that every graph can be embedded on the torus without edges meeting except at mutual endvertices,
but this is not the case. In fact, for any surface there are graphs that cannot be embedded in that surface (without any edges meeting
except at mutual endvertices).
For any embedding of a planar graph, there is another embedded planar graph that is closely related to it, which we will now
describe. Notice that a planar embedding partitions the plane into regions.
Definition: Faces
The regions into which a planar embedding partitions the plane, are called the faces of the planar embedding.
Figure 15.1.2 : K3,3 embedded on a torus. The dotted edge wraps around through the hole in the torus. (Copyright; author via
source)
Example 15.1.2
In these drawings, we have labeled the faces of the two planar embeddings with f_1, f_2, etc., to show them clearly.
15.1.2 https://math.libretexts.org/@go/page/60149
Notation
We use F (G) (or simply F if the graph is clear from the context) to denote the set of faces of a planar embedding.
It is possible that the dual graph of a planar embedding will not be a simple graph, even if the original graph was simple.
Example 15.1.3
Here we show how to find the planar duals of the embeddings given in Example 15.1.2. We include the original embedding as
above; the grey vertices and dashed edges are the vertices and edges of the dual graph.
Note that the second graph has loops and multiedges. Note also that although f_1 and f_5 meet at a vertex in the embedding of
the first graph, they are not adjacent in the dual since they do not share a common edge.
Some other useful observations:
|E(G)| = |E(G^*)|, and every dashed edge crosses exactly one black edge;
the valency of the vertex f_i in G^* is equal to the number of edges you trace, if you trace around the perimeter of the face f_i in
G (so edges that dangle inside the face get counted twice).
Proposition 15.1.1
The dual graph of a planar embedding has a natural planar embedding, so is a planar graph. Furthermore, (G^*)^* = G.
Proof
Both of these facts follow fairly directly from the definitions.
Example 15.1.4
Be careful! – Two different planar embeddings of the same graph may have nonisomorphic dual graphs, as we show here.
In the planar dual of the embedding on the left, f_1 will have valency 3; f_2 and f_3 will have valency 4; and f_4 will have
valency 7. In the planar dual of the embedding on the right, f_1 will have valency 3; f_2 will have valency 5; f_4 will have
valency 4, and f_3 will have valency 6. Since the lists 3, 4, 4, 7 and 3, 4, 5, 6 are not permutations of each other, the planar
duals of the two embeddings are not isomorphic.
Before moving on to other related topics, we present a classification of planar graphs. This is a theorem by Kuratowski (from
whose name the notation for complete graphs is taken). He proved this result in 1930.
We need one new definition.
Definition: Subdivision
An edge uv can be subdivided by placing a vertex somewhere along its length. Technically, this means deleting uv , adding a
new vertex x, and adding the edges ux and vx .
A subdivision of a graph is a graph that is obtained by subdividing some of the edges of the graph.
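The "technical" description of subdividing an edge translates directly into code. This is a minimal sketch, assuming the graph is stored as a list of edges (pairs of vertices); the function name `subdivide` and the vertex label "x" are hypothetical.

```python
def subdivide(edges, u, v, x):
    """Return a new edge list in which the edge uv has been deleted and
    the edges ux and vx, through the new vertex x, have been added."""
    result = list(edges)
    if (u, v) in result:
        result.remove((u, v))
    elif (v, u) in result:
        result.remove((v, u))
    else:
        raise ValueError("uv is not an edge of the graph")
    result.extend([(u, x), (v, x)])
    return result


print(subdivide([(1, 2), (2, 3)], 1, 2, "x"))
```

Each subdivision adds one vertex and one edge, so repeating it on several edges gives a subdivision of the graph in the sense defined above.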
Example 15.1.5
An example is shown in Figure 15.1.3. The white vertices are the new ones.
Proof
One direction of the proof is fairly straightforward, since we have already proven that K5 and K3,3 are not planar.
However, we won’t try to prove this theorem in this course.
A subdivision of K_5 or of K_{3,3} will sometimes be very difficult to find, but efficient algorithms do exist. Typically, to
Example 15.1.6
Solution
Here is a subdivision of K_{3,3} in the given graph. The white vertices are the vertices that are subdividing edges. Unnecessary
edges have been deleted. The bipartition consists of {a, c, e} and {b, g, h}.
Exercise 15.1.1
1) Prove that if a graph G has a subgraph H that is not planar, then G is not planar. Deduce that for every n ≥ 6, K_n is not
planar.
2) Find a planar embedding of the following graph, and find the dual graph of your embedding:
3) Find a planar embedding of the following graph, and find the dual graph of your embedding:
4) The graph in Example 15.1.6 also has a subgraph that is a subdivision of K_5. Find such a subgraph.
5) Prove that the Petersen graph is not planar. [Hint: Use Kuratowski’s Theorem.]
6) Find planar embeddings of the two graphs pictured below. (These graphs are obtained by deleting an edge from K_5 and
deleting an edge from K_{3,3}, respectively.)
This page titled 15.1: Planar Graphs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
15.2: Euler’s Formula
Euler came up with a formula that holds true for any planar embedding of a connected graph.
Theorem 15.2.1
If G is a planar embedding of a connected graph (or multigraph, with or without loops), then
|V | − |E| + |F | = 2. (15.2.1)
Proof 1:
We will prove this formula by induction on the number of faces of the embedding. Let G be a planar embedding of a
connected graph (or multigraph, with or without loops).
Base case: If |F| = 1 then G cannot have any cycles (otherwise the interior and exterior of the cycle would be 2 distinct
faces). So G must be a connected graph that has no cycles, i.e., a tree. By Theorem 12.4.1 we know that we must have
|E| = |V| − 1, so
|V| − |E| + |F| = |V| − (|V| − 1) + 1 = 2.
A tree cannot have any loops or multiple edges, as these form cycles.
Inductive step: We begin by stating our inductive hypothesis. Let k ≥ 1 be arbitrary, and assume that for any planar
embedding of a connected graph (or multigraph, with or without loops) with k faces, |V | − |E| + |F | = 2 .
Let G be a planar embedding of a connected graph with k + 1 ≥ 2 faces. Since trees have only one face, G must have a
cycle. Choose any edge e that is in a cycle of G , and let H = G ∖ {e} . Clearly, we have
|F (H )| = |F (G)| − 1 = k (15.2.4)
since the edge e being part of a cycle must separate two faces of G , which are united into one face of H . Furthermore,
since e was in a cycle and G is connected, by Proposition 12.3.4, H is connected, and H has a planar embedding induced
by the planar embedding of G . Therefore our inductive hypothesis applies to H , so
2 = |V(H)| − |E(H)| + |F(H)| = |V(G)| − (|E(G)| − 1) + (|F(G)| − 1) = |V(G)| − |E(G)| + |F(G)|.
This completes the inductive step.
The above proof is unusual for a proof by induction on graphs, because the induction is not on the number of vertices. If you try to
prove Euler’s formula by induction on the number of vertices, deleting a vertex might disconnect the graph, which would mean the
induction hypothesis doesn’t apply to the resulting graph.
However, there is a different graph operation that reduces the number of vertices by 1, and keeps the graph connected.
Unfortunately, it may turn a graph into a multigraph, so it can only be used to prove a result that holds true for multigraphs as well
as for graphs. This operation is called edge contraction.
V(G′) = (V(G) ∖ {u, v}) ∪ {u′}, (15.2.6)
E(G′) = ([E(G) ∖ {ux : ux ∈ E(G)}] ∖ {vx : vx ∈ E(G)}) ∪ {u′y : uy ∈ E(G) or vy ∈ E(G)}. (15.2.7)
If you think of vertices u and v as being connected by a very short elastic that has been stretched out in G, then you can think
of G′ as the graph you get if you allow the elastic to contract, combining the vertices u and v into a “new” vertex u′.
Notice that if G is connected, then the graph obtained by contracting any edge of G will also be connected. However, if uv is the
edge that we contract, and u and v have a mutual neighbour x, then in the graph obtained by contracting uv, there will be a
multiple edge between u′ and x. Also, if G has a planar embedding, then after contracting any edge there will still be a planar
embedding. If u ≠ v, then contracting uv reduces the number of vertices by one, reduces the number of edges by one, and does not
change the number of faces.
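Edge contraction on a multigraph can be sketched as follows. This is an illustration under stated assumptions rather than a transcription of the set-based definition: the multigraph is stored as a list of vertex pairs so that multiple edges are kept with their multiplicity, only one copy of uv is removed, and the merged vertex is given the hypothetical label "u'".

```python
def contract(edges, u, v, new="u'"):
    """Contract one edge uv (with u != v) of a multigraph given as a list of pairs.

    One copy of uv is removed and every remaining end at u or v is moved
    to the vertex `new`, so the result may contain multiple edges or loops.
    """
    remaining = list(edges)
    for e in ((u, v), (v, u)):
        if e in remaining:
            remaining.remove(e)
            break
    else:
        raise ValueError("uv is not an edge of the graph")

    def rename(w):
        return new if w in (u, v) else w

    return [(rename(a), rename(b)) for a, b in remaining]


# u = 1 and v = 2 have the mutual neighbour 3, so a multiple edge appears.
triangle = [(1, 2), (2, 3), (1, 3)]
print(contract(triangle, 1, 2))
```

In the triangle the number of vertices drops from 3 to 2 and the number of edges from 3 to 2, matching the counts described above.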
Now we can use this operation to prove Euler’s formula by induction on the number of vertices.
Theorem 15.2.1
If G is a planar embedding of a connected graph (or multigraph, with or without loops), then
|V | − |E| + |F | = 2. (15.2.8)
Proof 2:
Let G be a planar embedding of a connected graph (or multigraph, with or without loops).
Base case: If |V | = 1 then G has one vertex. Furthermore, every edge is a loop. Every loop involves 1 edge, and encloses 1
face. This graph will therefore have one more face than it has loops (since it has one face even if there are no loops). Thus,
|V | − |E| + |F | = 1 − e + (e + 1) = 2. (15.2.9)
Inductive step: We begin by stating our inductive hypothesis. Let k ≥ 1 be arbitrary, and assume that for any planar
embedding of a connected graph (or multigraph, with or without loops) with k vertices, |V | − |E| + |F | = 2 .
Let G be a planar embedding of a connected graph with k + 1 ≥ 2 vertices. Since the graph is connected and has at least
two vertices, it has at least one edge uv, with u ≠ v. Let G′ be the graph we obtain by contracting uv. Then G′ is a planar
embedding of a connected graph (or multigraph, with or without loops) on k vertices, so our inductive hypothesis applies to
G′. Therefore,
2 = |V(G′)| − |E(G′)| + |F(G′)| = (|V(G)| − 1) − (|E(G)| − 1) + |F(G)| = |V(G)| − |E(G)| + |F(G)|.
This completes the inductive step.
Contraction of edges has some other very important uses in graph theory. Before looking at some corollaries of Euler’s Formula,
we’ll explain one well-known theorem that involves edge contraction and planar graphs.
Definition: Minor
Let G be a graph. Then H is a minor of G if we can construct H from G by deleting or contracting edges and deleting
vertices.
15.2.2 https://math.libretexts.org/@go/page/60150
It is possible to prove Wagner’s Theorem as an easy consequence of Kuratowski’s Theorem, since if G has a subgraph that is a
subdivision of K_5 or K_{3,3} then contracting all but one piece of each subdivided edge gives us a minor that is isomorphic to K_5 or
K_{3,3}. Nonetheless, Wagner’s Theorem is important in its own right, as the first example of the much more recent and very powerful
work by Neil Robertson and Paul Seymour on graph minors.
A family is said to be minor-closed if given any graph in the family, any minor of the graph is also in the family. Planar graphs are
an example of a minor-closed family, since the operations of deletion (of edges or vertices) and contraction of edges preserve a
planar embedding. Robertson and Seymour proved the remarkable result that if a family of graphs is minor-closed, then the family
can be characterised by a finite set of “forbidden minors.” That is, for any such family F , there is a finite set L of graphs, such that
G ∈ F if and only if no minor of G appears in L . Wagner’s Theorem tells us that when F is the family of planar graphs,
L = {K_{3,3}, K_5}.
Corollary 15.2.1
Let G be a connected graph. Then every planar embedding of G has the same number of faces.
Proof
We have |V| − |E| + |F| = 2. Since |V| and |E| do not depend on the choice of embedding,
|F| = 2 + |E| − |V| cannot depend on the choice of embedding either.
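Because the number of faces is determined by |V| and |E| alone, it can be computed without drawing anything. A one-line sketch, assuming the graph is connected and planar (the function name `face_count` is hypothetical):

```python
def face_count(num_vertices, num_edges):
    """Number of faces in any planar embedding of a connected planar graph,
    obtained by rearranging |V| - |E| + |F| = 2."""
    return 2 - num_vertices + num_edges


# A tree on 5 vertices has 4 edges and a single face.
print(face_count(5, 4))
# The graph of a cube has 8 vertices and 12 edges.
print(face_count(8, 12))
```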
Corollary 15.2.2
Proof
Fix a planar embedding of G . We move around each face, counting the number of edges that we encounter, and work out
the result in two ways.
First, we look at every face in turn and count how many edges surround that face. Since the graph is simple, every face
must be surrounded by at least 3 edges unless there is only one face. If there is only one face and when moving around this
face we do not count at least 3 edges, then the graph is a tree that has at most one edge, so |V | ≤ 2 . Therefore, our count
will come to at least 3|F |.
Every edge either separates two faces, or dangles into a face. In the former case, it will be counted once each time we move
around one of the two incident faces. In the latter case, it will be counted twice as we move around the face it dangles into:
once when we move inwards along this dangling part, and once when we move back outward. Thus, every edge is counted
exactly twice, so our count will come to exactly 2|E|.
Combining these, we see that 2|E| ≥ 3|F|, so |F| ≤ 2|E|/3. If G has no cycles of length less than 4, then every face must
be surrounded by at least 4 edges, so the same argument gives 2|E| ≥ 4|F|, so |F| ≤ |E|/2.
By Euler’s Formula, |V| − |E| + |F| = 2, so in the general case
|V| − |E| + 2|E|/3 ≥ 2.
Multiplying through by 3 and moving the |E| terms to the right-hand side, gives
3|V | ≥ |E| + 6, (15.2.13)
which can easily be rearranged into the form of our original statement. In the case where G has no cycles of length less
than 4, we obtain instead
|V| − |E| + |E|/2 ≥ 2, (15.2.14)
so 2|V| ≥ |E| + 4, which again can easily be rearranged into the form given in the statement of this corollary.
Corollary 15.2.3
Proof
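Towards a contradiction, suppose that G is a simple connected planar graph, and for every v ∈ V, d(v) ≥ 6. Then
2|E| = ∑_{v∈V} d(v) ≥ ∑_{v∈V} 6 = 6|V|.
Therefore, |E| ≥ 3|V|, which contradicts the bound |E| ≤ 3|V| − 6 from Corollary 15.2.2.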
Euler’s Formula (and its corollaries) give us a much easier way to prove that K_5 and K_{3,3} are non-planar.
Corollary 15.2.4
Proof
In K_5 we have |E| = (5 choose 2) = 10, and |V| = 5. So,
3|V| − 6 = 9 < 10 = |E|.
By Corollary 15.2.2, K_5 must not be planar.
Corollary 15.2.5
Proof
In K_{3,3} we have |E| = 9, and |V| = 6. So,
2|V| − 4 = 8 < 9 = |E|.
Since K_{3,3} is bipartite, it has no cycles of length less than 4, so by Corollary 15.2.2, K_{3,3} must not be planar.
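The two counting arguments above can be packaged as a quick test. The sketch below checks the inequalities |E| ≤ 3|V| − 6 and, for graphs with no cycles of length less than 4, |E| ≤ 2|V| − 4, as rearranged in the proof of Corollary 15.2.2. It assumes a connected simple graph with at least 3 vertices; the function and parameter names are hypothetical. Failing the test proves non-planarity, but passing it proves nothing.

```python
from math import comb


def passes_edge_bound(num_vertices, num_edges, no_short_cycles=False):
    """Necessary (not sufficient) edge-count condition for planarity of a
    connected simple graph with at least 3 vertices.

    Set no_short_cycles=True if the graph has no cycles of length < 4.
    """
    if no_short_cycles:
        return num_edges <= 2 * num_vertices - 4
    return num_edges <= 3 * num_vertices - 6


print(passes_edge_bound(5, comb(5, 2)))                 # K_5
print(passes_edge_bound(6, 9, no_short_cycles=True))    # K_{3,3}
print(passes_edge_bound(10, 15))                        # Petersen graph
```

The Petersen graph (10 vertices, 15 edges) passes the test even though Exercise 15.1.1(5) asks for a proof that it is not planar, which illustrates why the condition is only necessary.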
Exercise 15.2.1
1) Use induction to prove an Euler-like formula for planar graphs that have exactly two connected components.
2) Euler’s formula can be generalised to disconnected graphs, but has an extra variable for the number of connected
components of the graph. Guess what this formula will be, and use induction to prove your answer.
3) Find and prove a corollary to Euler’s formula for disconnected graphs, similar to Corollary 15.2.2. (Use your answer to
question 2.)
4) For graphs embedded on a torus, |V | − |E| + |F | has a different (but constant) value, as long as all of the faces “look like”
discs. (If you are familiar with topology, the faces must be embeddable into a plane, rather than looking like a torus. So putting
a planar embedding of a graph down on one side of a torus doesn’t count.) What is this value?
5) Definition. We say that a planar embedding of a graph is self-dual if it is isomorphic to its planar dual. Prove that if a planar
embedding of the connected graph G is self-dual, then |E| = 2|V | − 2 .
6) Definition. The complement of G is the graph with the same vertices as G, but whose edges are precisely the non-edges of
G. (That is, u is adjacent to v in the complement of G if and only if u is not adjacent to v in G.) Therefore, if G^c is the
complement of G, then E(K_{|V(G)|}) is the disjoint union of E(G) and E(G^c). Show that if G is a simple planar graph with at
This page titled 15.2: Euler’s Formula is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
15.3: Map Colouring
Suppose we have a map of an island that has been divided into states.
Traditionally, map-makers colour the different states so as to ensure that if two states share a border, they are not coloured with the
same colour. (This makes it easier to distinguish the borders.) If two states simply meet at a corner, then they may be coloured with
the same colour.
Historically, each additional colour added to the cost of producing the map. Also, if there are too many colours they become harder and
harder to tell apart. The question is, without knowing in advance what the map will look like, is there a bound on how
many colours will be required to colour it? If so, what is that bound? In other words, what is the largest number of colours that
might be required to colour some map?
Well over a century ago, mathematicians observed that it never seemed to require more than 4 colours to colour a map. The map
shown above does require 4 colours, since the central rectangular state (marked with an asterisk) and the three states that surround
it must all receive different colours (each shares a border with each of the others). Unfortunately, they couldn’t prove that no more
would ever be required, although a number of purported proofs were published and later found to have errors.
Although the bound of 4 eluded many attempts at proof, in 1890 Percy John Heawood successfully proved that 5 colours suffice to
colour any map. (His method was based on an incorrect proof of the Four Colour Theorem by Kempe, from 1879.) This result is
known as the Five Colour Theorem. Its proof is slightly technical but not difficult, and we will give it in a moment. First, we will
give a very short proof that 6 colours suffice.
Notice that if we turn the map into a graph by placing a vertex wherever borders meet, and an edge wherever there is a border, this
problem is equivalent to finding a proper vertex colouring of the planar dual of this graph. Thus, what we will actually prove is that
the vertices of any planar graph can be properly coloured using 6 (or in the subsequent result, 5) colours. There is a detail that we
are skimming over here: the planar dual could have loops, which would make it impossible to colour the graph. However, this can
only happen if there is a face of the original map that meets itself along a border, which would never happen in a map. The planar
dual might also have multiedges, but this does not affect the number of colours required to properly colour the graph, so we can
delete any multiedges and assume that we are dealing with a simple planar graph.
Proposition 15.3.1
Proof
Towards a contradiction, suppose that there is a planar graph that is not properly 6-colourable. By deleting edges and
vertices, we can find a subgraph G that is a 7-critical planar graph.
By Corollary 15.2.3, we must have δ(G) ≤ 5 since G is planar. But by Theorem 14.3.1, we must have
δ(G) ≥ 7 − 1 = 6 (15.3.1)
since G is 7-critical. This contradiction serves to prove that every planar graph is properly 6-colourable.
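The same two facts (a planar graph always has a vertex of valency at most 5, and deleting vertices keeps a graph planar) also give a direct colouring procedure. The sketch below is an algorithmic restatement, not the proof by contradiction given above: it repeatedly removes a vertex of smallest valency, then colours the vertices in reverse order of removal, so each vertex sees at most 5 coloured neighbours. The dictionary-of-sets input and the name `six_colour` are assumptions.

```python
def six_colour(adj):
    """Properly colour a simple planar graph with colours 1..6.

    adj maps each vertex to the set of its neighbours.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    order = []
    while remaining:
        # Pick a vertex of smallest valency in what is left of the graph.
        v = min(remaining, key=lambda w: len(remaining[w]))
        if len(remaining[v]) > 5:
            raise ValueError("no vertex of valency at most 5: not planar")
        order.append(v)
        for w in remaining.pop(v):
            remaining[w].discard(v)
    colour = {}
    for v in reversed(order):
        used = {colour[w] for w in adj[v] if w in colour}
        colour[v] = next(c for c in range(1, 7) if c not in used)
    return colour


# The octahedron: each vertex i is adjacent to all others except (i + 3) mod 6.
octahedron = {i: {j for j in range(6) if j not in (i, (i + 3) % 6)}
              for i in range(6)}
print(six_colour(octahedron))
```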
15.3.1 https://math.libretexts.org/@go/page/60151
Theorem 15.3.1: Five Colour Theorem
Proof
Towards a contradiction, suppose that there is a planar graph that is not properly 5-colourable. By deleting edges and
vertices, we can find a subgraph G that is a 6-critical planar graph. Since G is planar, Corollary 15.2.3 tells us that
δ(G) ≤ 5 . We also know from Theorem 14.3.1 that δ(G) ≥ 6 − 1 = 5 since G is 6-critical. Let v be a vertex of valency
δ(G) = 5 .
By the definition of a k -critical graph, G ∖ {v} can be properly 5-coloured. Since G itself cannot be properly 5-coloured,
the neighbours of v must all have been assigned different colours in the proper 5-colouring of G ∖ {v} . Let’s label the
neighbours of v as v_1, v_2, v_3, v_4, and v_5 as they appear clockwise around v. We will call the colour of v_1 blue, the colour
of v_2 purple, the colour of v_3 yellow, and the colour of v_4 green, as shown in the picture. Here is a picture (where, because
this text is printed in black-and-white, we have put the first letter of a colour onto a vertex, instead of actually colouring the
vertex).
Consider the subgraph consisting of the vertices coloured blue or yellow (and all edges between such vertices). If v_1 and v_3
are not in the same connected component of this subgraph, then in the connected component that contains v_1, we could
interchange the colours yellow and blue. Since we are doing this to everything in a connected component of the yellow-blue
subgraph, the result will still be a proper colouring, but v_1 now has colour yellow, so v can be coloured with blue. This
contradicts the fact that G is 6-critical, so it must be the case that v_1 and v_3 are in the same connected component of the
yellow-blue subgraph. In particular, there is a walk from v_1 to v_3 that uses only yellow and blue vertices. By Proposition
12.3.1, there is in fact a path from v_1 to v_3 that uses only yellow and blue vertices.
Similarly, if we consider the subgraph consisting of the vertices coloured purple or green (and all edges between such
vertices), we see that there must be a path from v_2 to v_4 that uses only purple or green vertices.
There is no way to draw the yellow-blue path from v_1 to v_3 and the purple-green path from v_2 to v_4, without the two paths
crossing each other. Since the graph is planar, they must cross each other at a vertex, u. Since u is on the yellow-blue path,
it must be coloured either yellow or blue. Since u is on the purple-green path, it must be coloured either purple or green.
It’s not possible to satisfy both of these conditions on the colour of u. This contradiction serves to prove that no planar
6-critical graph exists, so every planar graph is properly 5-colourable.
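The colour interchange used in this proof (swapping two colours on one connected component of the two-coloured subgraph) can be sketched as follows. The graph is assumed to be a dictionary mapping each vertex to the set of its neighbours, and the name `swap_colours_in_component` is hypothetical.

```python
def swap_colours_in_component(adj, colour, start, a, b):
    """Interchange colours a and b on the connected component, containing
    `start`, of the subgraph formed by the vertices coloured a or b.

    Returns a new colouring; the input colouring is left unchanged.
    """
    if colour[start] not in (a, b):
        raise ValueError("start must have colour a or b")
    component, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in component and colour[w] in (a, b):
                component.add(w)
                stack.append(w)
    new_colour = dict(colour)
    for u in component:
        new_colour[u] = b if colour[u] == a else a
    return new_colour


# A path 1 - 2 - 3 - 4; the blue/yellow component of vertex 1 is {1, 2}.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
colour = {1: "blue", 2: "yellow", 3: "green", 4: "blue"}
print(swap_colours_in_component(adj, colour, 1, "blue", "yellow"))
```

Vertex 4 keeps its colour because it lies in a different component of the yellow-blue subgraph, and the colouring stays proper because every neighbour of the swapped component that is outside it has a colour other than a or b.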
In fact, Appel and Haken proved the Four Colour Theorem in 1976.
Proof
Their proof involved considering a very large number of cases – so many that they used a computer to analyse them all.
Although computers are often used in mathematical work now, this was the first proof that could not reasonably be verified
by hand. It was viewed with suspicion for a long time, but is now generally accepted.
One of the methods by which mathematicians attempted unsuccessfully to prove the Four Colour Theorem seemed
particularly promising, and has led to a lot of interesting work in its own right. We require a couple of definitions to explain
the connection.
Definition: Cubic Graph
A cubic graph is a graph for which all of the vertices have valency 3.
Definition: Bridge
A bridge in a connected graph is an edge whose deletion disconnects the graph.
Theorem 15.3.3
The problem of 4-colouring a planar graph is equivalent to the problem of 3-edge-colouring a cubic graph that has no bridges.
Proof
We’ll prove one direction of the equivalence stated in this theorem; the other direction is a bit more complicated.
Suppose that every planar graph can be properly 4-coloured, and that G is a (simple) bridgeless cubic graph, embedded in
the plane. We’ll show that there is a proper 3-edge-colouring of G . Since G is bridgeless, we don’t run into the problem of
a loop in the planar dual, so the Four Colour Theorem applies to the faces of G . Properly colour the faces of G with colours
red, green, yellow, and black. Every edge of G lies between faces of two distinct colours, by the definition of a proper
colouring of a map. Colour the edges of G according to the following table: if the colours of the faces separated by the edge
e are the colours listed in the left-hand column, then use the colour listed in the right-hand column to colour e .
Let v be an arbitrary vertex. We will show that the three edges incident with v must all receive different colours. Since 3
edges meet at v , three faces also meet at v , and every pair of these faces share an edge. Thus the three faces that meet at v
must all receive different colours. There are four different cases, depending on which colour is not used for a face at v . We
show what happens in the following picture, using R, G , Y , and B to indicate the face colours, and colouring the edges
according to the table above in each case.
In each case, the three edges incident with v are assigned different colours, so this is a proper 3-edge colouring of G .
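The recipe in this proof can be tried on a very small case. The following Python sketch is only an illustration under stated assumptions: it uses the planar embedding of the complete graph K4 (a bridgeless cubic graph with four triangular faces), and it assumes one particular pairing of face colours to edge colours, since the table is not reproduced here. The names `EDGE_COLOUR`, `face_colour`, `tait_edge_colouring`, and the edge colours 1, 2, 3 are ours, not the text's.

```python
from itertools import combinations

# Assumed (hypothetical) table: the three ways of splitting {R, G, Y, B}
# into two pairs correspond to the three edge colours 1, 2, 3.
EDGE_COLOUR = {
    frozenset("RG"): 1, frozenset("YB"): 1,
    frozenset("RY"): 2, frozenset("GB"): 2,
    frozenset("RB"): 3, frozenset("GY"): 3,
}

# Planar embedding of K4 on vertices 0..3: every face is a triangle that
# misses exactly one vertex, so we name each face by the vertex it misses.
face_colour = {0: "R", 1: "G", 2: "Y", 3: "B"}  # a proper 4-colouring of the faces


def tait_edge_colouring():
    """Colour each edge of K4 from the colours of the two faces it separates."""
    colouring = {}
    for a, b in combinations(range(4), 2):
        # Edge {a, b} lies between the faces missing c and missing d,
        # where c and d are the two remaining vertices.
        c, d = (v for v in range(4) if v not in (a, b))
        key = frozenset(face_colour[c] + face_colour[d])
        colouring[(a, b)] = EDGE_COLOUR[key]
    return colouring


def is_proper(colouring):
    """Check that the three edges at every vertex receive three different colours."""
    for v in range(4):
        colours_at_v = [col for edge, col in colouring.items() if v in edge]
        if len(set(colours_at_v)) != 3:
            return False
    return True
```

Calling `is_proper(tait_edge_colouring())` checks the conclusion of the proof for this one embedding; it is of course not a substitute for the case analysis above.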
This theorem was proven by Tait in 1880; he thought that every cubic graph with no bridges must be 3-edge-colourable, and thus
that he had proven the Four Colour Theorem. In fact, Vizing’s Theorem tells us that any cubic graph can be 4-edge-coloured, so we
would only need to reduce the number of colours by 1 in order to prove the Four Colour Theorem. The problem, therefore, boils
down to proving that there are no bridgeless planar cubic graphs that are class two.
In 1898, Petersen published the Petersen graph that we saw previously in Example 14.3.4.
This graph is cubic and has no bridges, but is not 3-edge-colourable (this can be proved using a case-by-case analysis). Thus, there
exist bridgeless cubic graphs that are class two! Many people have tried to find other examples, as classifying these could provide a
proof of the Four Colour Theorem.
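The case-by-case analysis mentioned above can also be delegated to a computer. The sketch below is a plain backtracking search, not the argument the text has in mind; the vertex labelling of the Petersen graph (outer cycle 0 to 4, inner vertices 5 to 9) and the function names are our own choices.

```python
def petersen_edges():
    """Edges of the Petersen graph: outer 5-cycle, spokes, inner pentagram."""
    outer = [(i, (i + 1) % 5) for i in range(5)]
    spokes = [(i, i + 5) for i in range(5)]
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
    return outer + spokes + inner


def has_proper_edge_colouring(edges, k):
    """Return True if the edges can be properly coloured with k colours."""
    used = {}  # vertex -> set of colours already on edges at that vertex
    for u, v in edges:
        used.setdefault(u, set())
        used.setdefault(v, set())

    def extend(i):
        if i == len(edges):
            return True
        u, v = edges[i]
        for c in range(k):
            if c not in used[u] and c not in used[v]:
                used[u].add(c)
                used[v].add(c)
                if extend(i + 1):
                    return True
                used[u].remove(c)
                used[v].remove(c)
        return False

    return extend(0)
```

With this sketch, `has_proper_edge_colouring(petersen_edges(), 3)` reports that no 3-edge-colouring exists, while allowing 4 colours succeeds, in line with Vizing's Theorem.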
For many years, Martin Gardner wrote a column in the Scientific American about interesting math problems and puzzles. As the
Four Colour Theorem is easy to explain without technical language, it was a topic he wrote about. When writing about the
importance of bridgeless cubic class two graphs, he decided they needed a more appealing name. Since they seemed rare and
elusive, he called them snarks, after Lewis Carroll’s poem “The Hunting of the Snark.” The name has stuck.
Definition: Snark
A snark is a bridgeless cubic class two graph.
Two infinite families and a number of individual snarks are known. There is no reason to believe that these are all of the snarks that
exist. By the Four Colour Theorem, we know that there are no planar snarks; if we could find a direct proof that there are no planar
snarks, this would provide a new proof of the Four Colour Theorem.
Exercise 15.3.1
1) Prove that if a cubic graph G has a Hamilton cycle, then G is a class one graph.
2) Properly 4-colour the faces of the map given at the start of this section.
3) The map given at the start of this section can be made into a cubic graph, by placing a vertex everywhere two borders meet
(including the coast as a border) and edges where there are borders. Use the method from the proof of Theorem 15.3.3 to
properly 3-edge-colour this cubic graph, using your 4-colouring of the faces.
4) Prove that a graph G that admits a planar embedding has an Euler tour if and only if every planar dual of G is bipartite.
5) Prove that if a graph G admits a planar embedding in which every face is surrounded by exactly 3 edges, then G is
3-colourable if and only if it has an Euler tour.
This page titled 15.3: Map Colouring is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
15.3.4 https://math.libretexts.org/@go/page/60151
15.4: Summary
Kuratowski’s Theorem, Wagner’s Theorem
Euler’s Formula
|E| ≤ 3|V | − 6 for a planar graph
Colouring maps
The Five Colour Theorem
The Four Colour Theorem
Important Definitions:
Planar graph, planar embedding
Face
Dual graph, planar dual
Subdividing an edge, subdivision of a graph
Edge contraction, contracting an edge
Minor
Cubic graph
Bridge
Snark
This page titled 15.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
15.4.1 https://math.libretexts.org/@go/page/60152
SECTION OVERVIEW
4: Design Theory
16: Latin Squares
16.1: Latin Squares and Sudokus
16.2: Mutually Orthogonal Latin Squares (MOLS)
16.3: Systems of Distinct Representatives
16.4: Summary
17: Designs
17.1: Balanced Incomplete Block Designs (BIBD)
17.2: Constructing Designs and Existence of Designs
17.3: Fisher’s Inequality
17.4: Summary
Thumbnail: The Fano plane is an example of a finite incidence structure, so many of its properties can be established using
combinatorial techniques and other tools used in the study of incidence geometries. Since it is a projective space, algebraic
techniques can also be effective tools in its study. (Public Domain; Gunther via Wikipedia)
This page titled 4: Design Theory is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
This page titled 16: Latin Squares is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
16.1: Latin Squares and Sudokus
You can think of a Latin square as a Sudoku puzzle that can be of any (square) size, and does not have the requirement that every
value appear in each of the outlined smaller subsquares.
Example 16.1.1
$$\begin{matrix}
1 & 2 & 3 & 4 \\
4 & 1 & 2 & 3 \\
3 & 4 & 1 & 2 \\
2 & 3 & 4 & 1
\end{matrix} \tag{16.1.1}$$
Solution
Notice that in the above example, we placed the numbers 1, 2, 3, and 4 in the first row, in that order. For each subsequent row,
we shifted the numbers one place to the right (wrapping around). This same technique (placing the numbers from 1 through n
across the first row) will work to construct a Latin square of order n .
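The shifting construction just described is easy to express in code. The following is a minimal sketch, assuming the symbols are 1 through n; the function names `cyclic_latin_square` and `is_latin_square` are ours.

```python
def cyclic_latin_square(n):
    """Row r is 1..n shifted r places to the right, wrapping around."""
    first_row = list(range(1, n + 1))
    return [[first_row[(c - r) % n] for c in range(n)] for r in range(n)]


def is_latin_square(square):
    """Check that every symbol of the first row appears once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok
```

For n = 4 the first function reproduces the square of Example 16.1.1, and the second can be used to confirm the Latin property for any order one cares to try.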
So you might think (with reason) that Latin squares aren’t very interesting. However, even knowing that there is a Latin square
of every possible order and they are easy to construct, there remain some interesting related questions.
Some of these questions are related to Sudokus. If we fill in some entries of a Latin square, are there conditions on these entries
that guarantee that this can be completed to a full Latin square? Are there conditions under which we can be sure that a partial
Latin square has a unique completion to a full Latin square?
Some of these questions have easy answers that are not what we are really looking for. For example, if we give you all but one
entry of a Latin square (or Sudoku), then if it can be completed at all, the completion will be unique. However, some
interesting mathematical work has been done on these problems, both for Latin squares and for Sudokus.
There are no known examples in which a Sudoku puzzle with 16 or fewer squares pre-filled can be completed uniquely.
However, there are tens of thousands of (non-isomorphic) ways of pre-filling 17 entries of a Sudoku puzzle that have a unique
completion.
Looking only at “non-isomorphic” examples is important, because there are many ways of creating Latin squares (or Sudoku
puzzles) that are essentially the same. The following operations take a Latin square to another Latin square that is structurally
essentially the same:
Permuting the symbols used in the set N . For example, changing every 1 to a 2 and every 2 to a 1.
Interchanging any two rows.
Interchanging any two columns.
Making all of the rows into columns, and all of the columns into rows.
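The four operations listed above can be sketched as small Python functions, which makes it easy to experiment with them. This is an illustration only; the function names are ours, and a square is assumed to be stored as a list of rows.

```python
def is_latin_square(square):
    """Check that every symbol of the first row appears once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def permute_symbols(square, mapping):
    """Replace every symbol x by mapping[x] (mapping must be a bijection)."""
    return [[mapping[x] for x in row] for row in square]


def swap_rows(square, i, j):
    """Interchange rows i and j (0-indexed), returning a new square."""
    rows = [row[:] for row in square]
    rows[i], rows[j] = rows[j], rows[i]
    return rows


def swap_columns(square, i, j):
    """Interchange columns i and j (0-indexed), returning a new square."""
    rows = [row[:] for row in square]
    for row in rows:
        row[i], row[j] = row[j], row[i]
    return rows


def transpose(square):
    """Make all of the rows into columns and all of the columns into rows."""
    return [list(col) for col in zip(*square)]
```

Applying any of these to the square of Example 16.1.1 and then calling `is_latin_square` gives a quick sanity check, though not a proof, that the Latin property is preserved.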
Exercise 16.1.1
1) Prove that interchanging two rows of a Latin square yields a Latin square.
2) Complete the following Latin square. Is the completion unique?
1 _ 4 _
_ 1 _ _
_ _ _ 3
_ _ _ _
3) Use the method described at the start of this chapter to create a Latin square of order 5. Which 3 of the operations listed
above (each of which changes a Latin square into an isomorphic Latin square) are required to arrive at the following result?
5 1 4 2 3
1 3 5 4 2
4 5 2 3 1
2 4 3 1 5
3 2 1 5 4
4) Show there are exactly two different Latin squares of order 3 whose first row is 1, 2, 3.
5) Show there are exactly twelve different Latin squares of order 3 whose entries are the numbers 1, 2, 3.
[Hint: Use Problem 4.]
6) There are four different Latin squares of order 4 whose first row is 1, 2, 3, 4 and whose first column is also 1, 2, 3, 4. That
is, there are only four ways to complete the following Latin square:
1 2 3 4
2 _ _ _
3 _ _ _
4 _ _ _
This page titled 16.1: Latin Squares and Sudokus is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy
Morris.
16.1.2 https://math.libretexts.org/@go/page/60155
16.2: Mutually Orthogonal Latin Squares (MOLS)
Most of design theory is concerned with creating nice structures in which different combinations of elements occur equally often.
This is the general structure of all of the design theory we will be covering here, and in this context, orthogonal Latin squares are
the natural thing to learn about.
Definition: Orthogonal
Two Latin squares $S_1$ and $S_2$ are orthogonal if, when we look at each position in turn and consider the ordered pair formed by
the entry of $S_1$ in that position and the entry of $S_2$ in that position, every possible ordered pair appears.
So here, we are looking at positions in the structure of Latin squares, and trying to ensure that every ordered pair appears in some
position. Notice that since the set $N$ has $n$ elements, the total number of ordered pairs possible is $n^2$ (there are $n$ choices for the
first entry and $n$ choices for the second entry). A Latin square has $n^2$ positions since it has $n$ rows and $n$ columns. Thus, if every
possible ordered pair appears in some position, then each ordered pair must appear exactly once.
Once again, Euler was involved in the origins of this problem. In fact, the name Latin square comes from his terminology. In 1782,
he posed the problem of arranging 36 officers into a 6 × 6 square. The officers come from 6 different regiments (which he denoted
with the Latin characters a , b , c , d , e , and f ) and each holds one of 6 possible ranks (which he denoted with the Greek characters
α , β, γ, δ , ε , and ζ ). No two officers from the same regiment hold the same rank. The question he posed was, is it possible to
organise the officers into the square so that in each row and each column, there is precisely one officer from each regiment, and
precisely one officer of each rank? Since he was using Greek and Roman letters to denote the classes, he called this a “Graeco-
Latin square.” He chose the first step to consist of arranging the regiments, i.e. for each regiment to set aside 6 positions in the
square to be filled with officers from that regiment. Subsequently, he would try to assign ranks to the officers in these 6 positions.
Since the regiments were denoted by Latin characters, he called this first step a “Latin square.” The Graeco-Latin square of his
question is a pair of orthogonal Latin squares of order 6, since there is to be one officer from each regiment who holds each of the
possible ranks.
Euler could not find a solution to this problem. Since there is also no pair of orthogonal Latin squares of order 2 (and possibly for
other reasons), he conjectured that there is no pair of orthogonal Latin squares of order n for any n ≡ 2 (mod 4). Although Euler
was correct that there is no pair of orthogonal Latin squares of order 6, his conjecture was not true. In 1959–1960, Bose,
Shrikhande, and Parker first found constructions for pairs of orthogonal Latin squares of orders 22 and 10, and then found a general
construction that can produce a pair of orthogonal Latin squares of order n for every n > 6 with n ≡ 2 (mod 4).
Example 16.2.1
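$$\begin{matrix}
1 & 2 & 3 \\
3 & 1 & 2 \\
2 & 3 & 1
\end{matrix}
\qquad
\begin{matrix}
1 & 2 & 3 \\
2 & 3 & 1 \\
3 & 1 & 2
\end{matrix} \tag{16.2.1}$$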
We see that the ordered pairs (1, 1), (2, 2), and (3, 3) appear in the first row; the pairs (3, 2), (1, 3), and (2, 1) appear in the
second row; and the pairs (2, 3), (3, 1), and (1, 2) appear in the third row. Every possible ordered pair whose entries lie in
{1, 2, 3} has appeared.
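The check carried out by hand in this example can be written as a short function. The sketch below assumes both inputs are Latin squares of the same order on an n-element symbol set; the name `are_orthogonal` is ours.

```python
def are_orthogonal(s1, s2):
    """Return True if the n*n ordered pairs (s1 entry, s2 entry) are all different.

    For two Latin squares of order n there are only n*n possible ordered
    pairs, so "all different" is the same as "every pair appears".
    """
    n = len(s1)
    pairs = {(s1[r][c], s2[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n


# The two squares of order 3 from the example above.
first = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
second = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
```

Here `are_orthogonal(first, second)` confirms the example, whereas comparing a square with itself fails, since only the pairs (i, i) ever occur.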
There is a nice pattern to the squares given in this example. The first follows the general construction we mentioned at the start of
this chapter. For the second, each row has been shifted one place to the left (rather than to the right) from the one above it. This
construction does actually work for n odd, but never for n even. For example, when n = 4 , it would give
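$$\begin{matrix}
1 & 2 & 3 & 4 \\
4 & 1 & 2 & 3 \\
3 & 4 & 1 & 2 \\
2 & 3 & 4 & 1
\end{matrix}
\qquad
\begin{matrix}
1 & 2 & 3 & 4 \\
2 & 3 & 4 & 1 \\
3 & 4 & 1 & 2 \\
4 & 1 & 2 & 3
\end{matrix} \tag{16.2.2}$$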
You can see that the ordered pair (1, 1) occurs in two positions: row 1, column 1, and row 3, column 3. So this pair of Latin
squares is definitely not orthogonal. In fact, the first of these squares has no Latin square that is orthogonal to it. However, there is
a pair of orthogonal Latin squares of order 4:
$$\begin{matrix}
1 & 2 & 3 & 4 \\
3 & 4 & 1 & 2 \\
4 & 3 & 2 & 1 \\
2 & 1 & 4 & 3
\end{matrix}
\qquad
\begin{matrix}
1 & 2 & 3 & 4 \\
4 & 3 & 2 & 1 \\
2 & 1 & 4 & 3 \\
3 & 4 & 1 & 2
\end{matrix} \tag{16.2.3}$$
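The claim that the shift-right and shift-left squares are orthogonal for odd n but not for even n can be explored numerically. The following sketch is an experiment rather than a proof; `shifted_square` and `are_orthogonal` are our own helper names, and the symbols are assumed to be 1 through n.

```python
def shifted_square(n, step):
    """Row r is 1..n shifted r*step places to the right (step = -1 shifts left)."""
    return [[(c - r * step) % n + 1 for c in range(n)] for r in range(n)]


def are_orthogonal(s1, s2):
    """True if all n*n ordered pairs of corresponding entries are different."""
    n = len(s1)
    pairs = {(s1[r][c], s2[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n


def shift_pair_is_orthogonal(n):
    """Test the shift-right / shift-left pair of order n."""
    return are_orthogonal(shifted_square(n, 1), shifted_square(n, -1))


# The orthogonal pair of order 4 displayed above, which is not of this shifted form.
order4_a = [[1, 2, 3, 4], [3, 4, 1, 2], [4, 3, 2, 1], [2, 1, 4, 3]]
order4_b = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3], [3, 4, 1, 2]]
```

Running `shift_pair_is_orthogonal` for small orders shows success exactly for the odd values tried, and `are_orthogonal(order4_a, order4_b)` confirms the order 4 pair.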
A set of Latin squares is mutually orthogonal if every pair of distinct Latin squares in the set is orthogonal. We call such a
set a set of MOLS (for Mutually Orthogonal Latin Squares).
The natural question that arises in this context is, how many Latin squares can there be in a set of MOLS?
Before we attempt to answer this question, notice that if we have a pair of orthogonal Latin squares and we permute the symbols
used in the set N independently for each of the squares (resulting in new Latin squares that are nonetheless essentially the same, as
discussed in Section 16.1), the resulting pair of Latin squares will still be orthogonal. If in the first square the symbol x maps to the
symbol y , and in the second square the symbol u maps to the symbol v , then in the new pair of Latin squares the ordered pair (y, v)
will appear precisely once, since the ordered pair (x, u) appeared precisely once in the original pair of Latin squares. This is true
for any pair of entries (y, v), so every pair of entries must appear precisely once.
This idea that we can independently permute the symbols in each square, leads to a very nice method of representing MOLS. The
key idea is that it is not necessary to use the same set of symbols for each square, since the symbols we choose can be permuted
independently to match each other. In fact, we don’t need to use symbols at all to represent some of the squares; we can vary some
other characteristic. For example, to represent the two orthogonal Latin squares of order 3 that were shown in Example 16.2.1, we
can use the symbols 1 to 3 to represent the first square, and the colours red (for 1), blue (for 2) and green (for 3) to represent the
second square. However, varying the colours is not feasible in this textbook, which is printed in black-and-white. Instead, for the
second square, let us use “tilted left” (for 1), “straight up” (for 2), and “tilted right” (for 3). So (for example) since in the second
row, third column the first square had a 2 and the second had a 1, we place a 2 that is tilted to the left in that location in our new
representation (tilted left because the entry of the second square was 1; and 2 because that was the entry of the first square). Here is
the complete representation:
By the property of orthogonality, every combination of tilting and number must appear in exactly one position! Even more
amazing, if we have a set of MOLS and vary different parameters for each of the squares, the fact that the squares are all mutually
orthogonal will mean that every combination of the parameters appears in exactly one position. For example, if we have a set of
five MOLS, we could place a coloured shape behind each coloured symbol, and have different numbers of copies of the symbol.
For any possible colour of any possible shape appearing behind any possible number of any possible colour of any possible
symbol, you would be able to find a position in which that combination appears!
This approach to MOLS is essentially the context in which they first arose, as we can see from Euler’s example of the officers. For
the two orthogonal Latin squares sought in his question, the symbols in one represent the ranks while the symbols in the other
represent the regiment. In his final square, each officer could be represented by a pair of symbols indicating his rank and his
regiment – or by a letter for his regiment and a colour for his rank.
Here is a partial answer to the question of how many MOLS of order n there can be:
Theorem 16.2.1
If $S$ is a set of MOLS of order $n$, then $|S| \le n - 1$.
Proof
We may assume that N = {1, . . . , n}. In each of the Latin squares in S , we can independently permute the symbols of N .
As was noted above, the result will still be a set of MOLS. We permute the symbols so that the first row of each of the Latin
squares has the entries 1, 2, . . . , n in that order.
Now, if we take any $i \in N$ and consider any pair of the Latin squares, the ordered pair $(i, i)$ appears somewhere in the first
row. Consider the first entry of the second row in each square of $S$. None of these entries can be 1, since 1 has already
appeared in the first column of each of the Latin squares. No two of the Latin squares can have the same entry $j$ in this
position, since the ordered pair $(j, j)$ has already appeared in the $j$th position of the first row of this pair of squares, so can’t
appear again in the first position of the second row. So there cannot be more squares in $S$ than the $n - 1$ distinct entries
from $N \setminus \{1\}$ that could go into this position. Thus, $|S| \le n - 1$, as claimed.
The next natural question is, is it possible to achieve n − 1 MOLS of order n ? We have already seen that the answer is yes in one
very small case, since we found 2 MOLS of order 3. In fact, there are infinitely many values of n for which there are n − 1 MOLS
of order n .
The following result can be generalised to prime powers using some basic field theory that you should understand if you have taken
Math 3400. However, for the purposes of this course, we will avoid the explicit field theory and prove the result only for primes.
We do require a bit of modular arithmetic for this result. As modular arithmetic will also be useful for some of our later results,
here is a quick review of some key points.
Definition: Modulo n
Performing calculations modulo n means replacing the result with the remainder you would get upon dividing that result by n .
In other words, if the result of a computation is n or larger, replace the result by its remainder upon division by n .
Notation
If a and b have the same remainder upon division by n , then we write a ≡ b (mod n ).
There are two key facts from modular arithmetic that we will require. The first is that if a ≡ b (mod n ) and 0 ≤ a , b < n , then we
must have a = b .
The other is that if qa ≡ qb (mod n ) and n and q have a greatest common divisor of 1, then a ≡ b (mod n ). In the special case
where n is prime, as long as q is not a multiple of n then n and q will always have a greatest common divisor of 1.
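The second fact (cancelling a common factor q) is the one that fails most visibly when the greatest common divisor condition is dropped; for instance 2 · 1 ≡ 2 · 4 (mod 6) although 1 and 4 are different modulo 6. The sketch below simply tests the cancellation rule by brute force for small moduli; the function name `cancellation_holds` is ours.

```python
from math import gcd


def cancellation_holds(q, n):
    """Check that q*a ≡ q*b (mod n) forces a == b for all a, b in 0..n-1."""
    return all(
        a == b
        for a in range(n)
        for b in range(n)
        if (q * a - q * b) % n == 0
    )


def agrees_with_gcd_condition(limit):
    """Compare the brute-force check with the gcd condition for all 2 <= n < limit."""
    return all(
        cancellation_holds(q, n) == (gcd(q, n) == 1)
        for n in range(2, limit)
        for q in range(1, n)
    )
```

For a prime modulus every q from 1 to p − 1 passes, which is exactly the special case used in the proof that follows.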
Theorem 16.2.2
For every prime $p$, there are $p - 1$ MOLS of order $p$.
Proof
We will use $N = \{0, \dots, p - 1\}$. In order to ensure that the results of our computations will be in $N$, all of the calculations
given in this result should be taken modulo $p$.
The squares will be $\{S_1, \dots, S_{p-1}\}$. For $k \in \{1, \dots, p - 1\}$,
$$S_k = \begin{bmatrix}
0 & 1 & \dots & p - 1 \\
k & k + 1 & \dots & k + (p - 1) \\
2k & 2k + 1 & \dots & 2k + (p - 1) \\
\vdots & \vdots & & \vdots \\
(p - 1)k & (p - 1)k + 1 & \dots & (p - 1)k + (p - 1)
\end{bmatrix} \tag{16.2.4}$$
We first verify that each $S_k$ is a Latin square. The entries in each row are easily seen to be distinct. If the entries in the first
column are distinct, then we can see that the entries in every other column will be distinct. Suppose that $0 \le i, j \le p - 1$
and that $ik \equiv jk \pmod{p}$. Then since every $k \in \{1, \dots, p - 1\}$ has a greatest common divisor of 1 with $p$, we see that
$i \equiv j \pmod{p}$. Since $0 \le i, j \le p - 1$, this forces $i = j$. So the entries in the first column of $S_k$ are all distinct. Thus,
each $S_k$ is a Latin square.
Suppose that for some $1 \le i, j \le p - 1$, the squares $S_i$ and $S_j$ have the same ordered pair in two positions: row $k_1$,
column $m_1$, and row $k_2$, column $m_2$. Then by the formulas given for the entries of each Latin square, we must have
$$(k_1 - 1)i + m_1 - 1 \equiv (k_2 - 1)i + m_2 - 1 \pmod{p},$$
so $(k_2 - k_1)i \equiv m_1 - m_2 \pmod{p}$. Applying the same reasoning to the entries of $S_j$ gives
$$(k_1 - 1)j + m_1 - 1 \equiv (k_2 - 1)j + m_2 - 1 \pmod{p},$$
so $(k_2 - k_1)j \equiv m_1 - m_2 \pmod{p}$.
Since $(k_2 - k_1)i \equiv (k_2 - k_1)j \pmod{p}$, and $1 \le i, j \le p - 1$, either $i = j$ (so we chose the same Latin square twice
instead of choosing a pair of distinct Latin squares), or $k_2 - k_1 \equiv 0 \pmod{p}$. Since $k_1$ and $k_2$ are row numbers, they are
between 1 and $p$ so this forces $k_1 = k_2$. Furthermore, in this case we must also have $m_2 - m_1 \equiv 0 \pmod{p}$, and we see
that this also forces $m_1 = m_2$. Thus, the two positions in which the same ordered pair appeared were actually the same
position. Hence no ordered pair appears more than once, and since there are $p^2$ positions and $p^2$ possible ordered pairs, every
ordered pair appears; that is, $S_i$ and $S_j$ are orthogonal.
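The construction in this proof translates directly into code. The following is a minimal sketch, assuming rows and columns are indexed from 0 so that the entry of the k-th square in row i, column j is ik + j reduced modulo p; the function names are ours.

```python
def mols_for_prime(p):
    """Squares S_1, ..., S_{p-1}: entry in row i, column j is (i*k + j) mod p."""
    return [
        [[(i * k + j) % p for j in range(p)] for i in range(p)]
        for k in range(1, p)
    ]


def is_latin_square(square):
    """Check that every symbol of the first row appears once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(s1, s2):
    """True if all n*n ordered pairs of corresponding entries are different."""
    n = len(s1)
    return len({(s1[r][c], s2[r][c]) for r in range(n) for c in range(n)}) == n * n


def is_set_of_mols(squares):
    """Check that every square is Latin and every pair of distinct squares is orthogonal."""
    return all(is_latin_square(s) for s in squares) and all(
        are_orthogonal(squares[a], squares[b])
        for a in range(len(squares))
        for b in range(a + 1, len(squares))
    )
```

Trying a composite value such as 4 in place of p shows where primality is used: the same formula then produces an array that is not even a Latin square.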
Example 16.2.2
Here are the first 8 of the 10 MOLS of order 11, found using the formula given in the proof above.
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 (16.2.6)
1 2 3 4 5 6 7 8 9 10 0 2 3 4 5 6 7 8 9 10 0 1
2 3 4 5 6 7 8 9 10 0 1 4 5 6 7 8 9 10 0 1 2 3
3 4 5 6 7 8 9 10 0 1 2 6 7 8 9 10 0 1 2 3 4 5
4 5 6 7 8 9 10 0 1 2 3 8 9 10 0 1 2 3 4 5 6 7
5 6 7 8 9 10 0 1 2 3 4 10 0 1 2 3 4 5 6 7 8 9
6 7 8 9 10 0 1 2 3 4 5 1 2 3 4 5 6 7 8 9 10 0
7 8 9 10 0 1 2 3 4 5 6 3 4 5 6 7 8 9 10 0 1 2
8 9 10 0 1 2 3 4 5 6 7 5 6 7 8 9 10 0 1 2 3 4
9 10 0 1 2 3 4 5 6 7 8 7 8 9 10 0 1 2 3 4 5 6
10 0 1 2 3 4 5 6 7 8 9 9 10 0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10
3 4 5 6 7 8 9 10 0 1 2 4 5 6 7 8 9 10 0 1 2 3
6 7 8 9 10 0 1 2 3 4 5 8 9 10 0 1 2 3 4 5 6 7
9 10 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 9 10 0
1 2 3 4 5 6 7 8 9 10 0 5 6 7 8 9 10 0 1 2 3 4
4 5 6 7 8 9 10 0 1 2 3 9 10 0 1 2 3 4 5 6 7 8
7 8 9 10 0 1 2 3 4 5 6 2 3 4 5 6 7 8 9 10 0 1
10 0 1 2 3 4 5 6 7 8 9 6 7 8 9 10 0 1 2 3 4 5
2 3 4 5 6 7 8 9 10 0 1 10 0 1 2 3 4 5 6 7 8 9
5 6 7 8 9 10 0 1 2 3 4 3 4 5 6 7 8 9 10 0 1 2
8 9 10 0 1 2 3 4 5 6 7 7 8 9 10 0 1 2 3 4 5 6
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10
5 6 7 8 9 10 0 1 2 3 4 6 7 8 9 10 0 1 2 3 4 5
10 0 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 10 0
4 5 6 7 8 9 10 0 1 2 3 7 8 9 10 0 1 2 3 4 5 6
9 10 0 1 2 3 4 5 6 7 8 2 3 4 5 6 7 8 9 10 0 1
3 4 5 6 7 8 9 10 0 1 2 8 9 10 0 1 2 3 4 5 6 7
8 9 10 0 1 2 3 4 5 6 7 3 4 5 6 7 8 9 10 0 1 2
2 3 4 5 6 7 8 9 10 0 1 9 10 0 1 2 3 4 5 6 7 8
7 8 9 10 0 1 2 3 4 5 6 4 5 6 7 8 9 10 0 1 2 3
1 2 3 4 5 6 7 8 9 10 0 10 0 1 2 3 4 5 6 7 8 9
6 7 8 9 10 0 1 2 3 4 5 5 6 7 8 9 10 0 1 2 3 4
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10
7 8 9 10 0 1 2 3 4 5 6 8 9 10 0 1 2 3 4 5 6 7
3 4 5 6 7 8 9 10 0 1 2 5 6 7 8 9 10 0 1 2 3 4
10 0 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 10 0 1
6 7 8 9 10 0 1 2 3 4 5 10 0 1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 10 0 1 7 8 9 10 0 1 2 3 4 5 6
9 10 0 1 2 3 4 5 6 7 8 4 5 6 7 8 9 10 0 1 2 3
5 6 7 8 9 10 0 1 2 3 4 1 2 3 4 5 6 7 8 9 10 0
1 2 3 4 5 6 7 8 9 10 0 9 10 0 1 2 3 4 5 6 7 8
8 9 10 0 1 2 3 4 5 6 7 6 7 8 9 10 0 1 2 3 4 5
4 5 6 7 8 9 10 0 1 2 3 3 4 5 6 7 8 9 10 0 1 2
We’ve now seen that it is possible to find p − 1 MOLS of order p for any prime p, and that the proof can be generalised to prime
powers. However, as we’ve already discussed in relation to Euler’s original problem, there are orders for which the bound of n − 1
MOLS of order n cannot be attained: in fact, for order 6 it is not possible even to find a pair of orthogonal Latin squares.
If you are interested in or familiar with some finite geometry, the existence of n − 1 MOLS of order n is equivalent to the
existence of a projective plane of order n . Projective planes, in turn, are a special kind of design. For an interesting article about
some of these relationships, see https://www.maa.org/sites/default/files/pdf/upload_library/22/Ford/Lam305-318.pdf. There is also
some information about this in Sections 18.3 and 18.4.
Exercise 16.2.1
1) Find the two MOLS of order 11 that are not included in Example 16.2.2, but are orthogonal to each other and to the squares
listed there.
2) Find a third Latin square of order 4 that is orthogonal to both of the orthogonal Latin squares of order 4 that were given
earlier in this section.
3) Here is a Latin square of order 8, and some entries for a second Latin square of order 8. Complete the second square so as to
obtain a pair of orthogonal Latin squares.
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 (16.2.7)
2 1 4 3 6 5 8 7 3 4 _ _ _ 8 5 6
3 4 1 2 7 8 5 6 5 _ 7 _ _ 2 3 _
4 3 2 1 8 7 6 5 7 _ _ 6 _ _ _ _
5 6 7 8 1 2 3 4 4 _ _ 1 8 _ _ _
6 5 8 7 2 1 4 3 2 _ _ _ _ 5 _ _
7 8 5 6 3 4 1 2 8 _ _ _ _ _ _ 1
8 7 6 5 4 3 2 1 6 _ _ 7 _ _ _ 3
4) Write down the six mutually orthogonal Latin squares $S_1, \dots, S_6$ of order 7 that are constructed by letting $p = 7$ in the
proof of Theorem 16.2.2.
This page titled 16.2: Mutually Orthogonal Latin Squares (MOLS) is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
16.2.6 https://math.libretexts.org/@go/page/60156
16.3: Systems of Distinct Representatives
Suppose we start filling in a Latin square, one row at a time, at each step ensuring that no element has yet appeared more than once
in a column (or in a row). Under what conditions will it be impossible to complete this to a Latin square? Although it may not be
immediately obvious, the answer to this question can be found in a well-known theorem published by Philip Hall in 1935, about
systems of distinct representatives.
Definition: System of Distinct Representatives
Let $T_1, \dots, T_n$ be sets. If there exist $a_1, \dots, a_n$, all distinct, such that $a_i \in T_i$ for every $1 \le i \le n$, then $\{a_1, \dots, a_n\}$ form a
system of distinct representatives (SDR) for $T_1, \dots, T_n$.
Example 16.3.1
The university is striking a student committee on the subject of tutorials. For each of the 5 Faculties, they ask students to elect
one representative who is taking classes from that faculty. They do not want one student trying to represent more than one
faculty. The candidates are:
Joseph, who is taking courses in Arts and Science, Fine Arts, Management, and Education;
René, who is taking courses in Health Sciences;
Claire, who is taking courses in Education and Health Sciences;
Sandra, who is taking courses in Management, Fine Arts, and Health Sciences;
Laci, who is taking courses in Education and Health Sciences; and
Jing, who is taking courses in Education.
Can the committee be filled?
Solution
The answer is no. For the three Faculties of Arts & Science, Fine Arts, and Management, there are only two possible student
representatives: Joseph (who could represent any of the three), and Sandra (who could represent either Fine Arts or
Management). So it is not possible to elect one student to represent each of the five Faculties, without allowing one of these
students to fill two roles.
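For a collection this small, the conclusion can be double-checked by exhaustive search. The sketch below is a naive backtracking search for a system of distinct representatives, adequate only for small inputs; the function name `find_sdr` and the encoding of the example as a list of candidate sets (one per Faculty, in the order Arts and Science, Fine Arts, Management, Education, Health Sciences) are ours.

```python
def find_sdr(sets):
    """Return a list of distinct representatives, one per set in order, or None."""
    chosen = []

    def extend(i):
        if i == len(sets):
            return True
        for x in sorted(sets[i]):
            if x not in chosen:
                chosen.append(x)
                if extend(i + 1):
                    return True
                chosen.pop()
        return False

    return list(chosen) if extend(0) else None


# Candidates for each Faculty, read off from the example.
committee = [
    {"Joseph"},                              # Arts and Science
    {"Joseph", "Sandra"},                    # Fine Arts
    {"Joseph", "Sandra"},                    # Management
    {"Joseph", "Claire", "Laci", "Jing"},    # Education
    {"René", "Claire", "Sandra", "Laci"},    # Health Sciences
]
```

The search returns `None` for `committee`, matching the solution, but succeeds as soon as one of the first three Faculties is dropped.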
In Example 16.3.1, we observed that we could find a collection of the sets to be represented, that collectively had fewer possible
representatives than there are sets in the collection. It is easy to see that if this happens, there cannot be a system of distinct
representatives for the sets.
What Philip Hall proved is the converse: unless we have an obstruction of this type, it is always possible to find a system of distinct
representatives.
The collection of sets $T_1, \dots, T_n$ has a system of distinct representatives if and only if for every $1 \le k \le n$, the union of any $k$
of the sets $T_1, \dots, T_n$ has cardinality at least $k$.
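The condition in this statement can be tested directly, although doing so examines every subcollection and is therefore practical only for a handful of sets. The following sketch looks for an obstruction, that is, k of the sets whose union has fewer than k elements; the name `hall_violation` and the example data are ours.

```python
from itertools import combinations


def hall_violation(sets):
    """Return the indices of k sets whose union has fewer than k elements, or None."""
    for k in range(1, len(sets) + 1):
        for idx in combinations(range(len(sets)), k):
            union = set().union(*(sets[i] for i in idx))
            if len(union) < k:
                return idx
    return None


# Candidate sets from Example 16.3.1, in the order Arts and Science,
# Fine Arts, Management, Education, Health Sciences.
committee = [
    {"Joseph"},
    {"Joseph", "Sandra"},
    {"Joseph", "Sandra"},
    {"Joseph", "Claire", "Laci", "Jing"},
    {"René", "Claire", "Sandra", "Laci"},
]
```

For `committee` the function returns the indices of the first three Faculties, which is the obstruction identified in the solution to Example 16.3.1.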
This theorem is often referred to as “Hall’s Marriage Theorem,” as one of the problems it solves can be stated as follows. Suppose
we have a collection of men and a collection of women. Each of the women has a list of men she likes (from the collection). When
is it possible to marry each of the women to a man that she likes? (The context is historical, and the assumption of the time was that
every woman would want to marry a man.)
We have seen that one of the two implications of Hall’s Theorem is easy to prove. We will not try to prove the other implication
here, but will focus on using the result. Here are some examples and exercises. For completeness, we provide a proof of Hall’s
Theorem at the end of this chapter.
16.3.1 https://math.libretexts.org/@go/page/60157
Example 16.3.2
Example 16.3.3
For clarity, we underline the representatives in the following list of the sets: $A_1 = \{\underline{a}, d\}$, $A_2 = \{a, \underline{c}\}$, $A_3 = \{\underline{b}, c\}$,
$A_4 = \{c, \underline{d}\}$.
(In fact, it also has another system of distinct representatives: take $d$ for $A_1$, $a$ for $A_2$, $b$ for $A_3$, and $c$ for $A_4$: $A_1 = \{a, \underline{d}\}$,
$A_2 = \{\underline{a}, c\}$, $A_3 = \{\underline{b}, c\}$, $A_4 = \{\underline{c}, d\}$.)
– – –
Exercise 16.3.1
For each collection of sets, determine whether or not it has a system of distinct representatives. If so, find one; if not, explain
why.
1) $A_1 = \{x\}$, $A_2 = \{y, z\}$, $A_3 = \{x, y\}$.
2) $A_1 = \{u, v, w, x, y, z\}$, $A_2 = \{v, w, y\}$, $A_3 = \{w, x, y\}$, $A_4 = \{v, w, x, y\}$, $A_5 = \{v, x, y\}$, $A_6 = \{v, y\}$.
3) $A_1 = \{x\}$, $A_2 = \{y\}$, $A_3 = \emptyset$.
4) $A_1 = \{x, z\}$, $A_2 = \{y\}$, $A_3 = \{x, y, z\}$.
5) $T_1 = \{a, b, c, d\}$, $T_2 = \{a, b, c\}$, $T_3 = \{a\}$, $T_4 = \{c\}$.
6) $U_1 = \{x, y\}$, $U_2 = \{y, z\}$, $U_3 = \emptyset$.
7) $V_1 = \{e, f\}$, $V_2 = \{e, g\}$, $V_3 = \{e, h\}$, $V_4 = \{f, g\}$, $V_5 = \{h, i\}$.
8) $W_1 = \{+, -, \times, \div, 0\}$, $W_2 = \{+, -, \times\}$, $W_3 = \{+, \times\}$, $W_4 = \{\times, -\}$, $W_5 = \{+, -\}$.
Let’s return to our original question about Latin squares. To answer this, we first give an important general consequence of Hall’s
Theorem.
Proposition 16.3.1
Suppose $T_1, \dots, T_n$ is a collection of sets each of which contains exactly $r$ elements. Further, suppose that no element appears
in more than $r$ of the sets. Then this collection has a system of distinct representatives.
Proof
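By Hall’s Theorem, we must show that for every $1 \le k \le n$, the union of any $k$ of the sets $T_1, \dots, T_n$ has cardinality at
least $k$. Let $k \in \{1, \dots, n\}$ be arbitrary, and arbitrarily choose $k$ of the sets $T_{j_1}, \dots, T_{j_k}$. If $k \le r$ then since each $T_{j_i}$ has $r$
elements, the union has cardinality at least $r \ge k$. If $k > r$, then counting with repetition, the $k$ chosen sets contain $kr$
elements in total. Since no element appears in more than $r$ of the sets, the union must contain at least $kr/r = k$ distinct
elements. In either case, the union has cardinality at least $k$.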
We can now answer the question about Latin squares.
Theorem 16.3.2
Suppose that m rows of a Latin square of order n have been filled, where m < n , and that to this point no entry appears more
than once in any row or column. Then another row can be added to the Latin square, maintaining the condition that no entry
appears more than once in any row or column.
Proof
For $1 \le i \le n$, let $T_i$ be the set of elements that have not yet appeared in column $i$ (from the entries in the first $m$ rows).
So $T_i$ can be thought of as the set of allowable entries for the $i$th column of the new row. Notice that each $T_i$ has
cardinality $n - m$ (the number of rows that are still empty). The task of finding a new row all of whose entries are distinct,
and whose $i$th entry comes from the set $T_i$ of allowable entries for that column, is equivalent to finding a system of distinct
representatives for the sets $T_1, \dots, T_n$. Thus, we must show that the collection $T_1, \dots, T_n$ has a system of distinct
representatives.
Notice also that every element has appeared once in each of the first m rows, and thus has appeared in precisely m of the
columns. Therefore, there are exactly n − m of the columns in which it has not yet appeared. In other words, each element
appears in exactly n − m of the sets.
We can now apply Proposition 16.3.1, with r = n − m to see that our sets do have a system of distinct representatives.
This can be used to form a new row for the Latin square.
Corollary 16.3.1
Suppose that m rows of a Latin square of order n have been filled, where m < n , and that to this point no entry appears more
than once in any row or column. This structure can always be completed to a Latin square.
Proof
As long as m < n , we can repeatedly apply Theorem 16.3.2 to deduce that it is possible to add a row. Once you actually
find a row that can be added (note that the statement of Hall’s Theorem does not explain how to do this), do so. Eventually,
this process will result in a complete square.
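As the proof remarks, Hall's Theorem guarantees that a suitable row exists without saying how to find it. One standard way to find a system of distinct representatives in practice is the augmenting-path method for bipartite matching; the sketch below uses it to complete a partially filled square row by row. This is our own illustration, not a method from the text: the names `find_sdr` and `complete_latin_square` are ours, and the symbols are assumed to be 1 through n.

```python
def find_sdr(sets):
    """Find a system of distinct representatives by augmenting paths, or return None."""
    owner = {}  # element -> index of the set it currently represents

    def assign(i, visited):
        for x in sets[i]:
            if x in visited:
                continue
            visited.add(x)
            # Use x if it is free, or if the set now using x can switch to another element.
            if x not in owner or assign(owner[x], visited):
                owner[x] = i
                return True
        return False

    for i in range(len(sets)):
        if not assign(i, set()):
            return None
    representatives = [None] * len(sets)
    for x, i in owner.items():
        representatives[i] = x
    return representatives


def complete_latin_square(rows, n):
    """Extend m valid filled rows (symbols 1..n) to a full Latin square of order n."""
    rows = [list(r) for r in rows]
    while len(rows) < n:
        # T_i = symbols not yet used in column i, as in the proof of Theorem 16.3.2.
        allowed = [set(range(1, n + 1)) - {r[c] for r in rows} for c in range(n)]
        rows.append(find_sdr(allowed))
    return rows
```

Because the order in which Python iterates over a set is not specified here, the completed square may differ from run to run; only the Latin property of the result is guaranteed by the argument above.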
Hall’s Theorem can also be used to prove a special case of a result we proved previously, Theorem 14.1.3. The special case we can
prove with Hall’s Theorem is the case where every vertex has the same valency.
Theorem 16.3.3
If $G$ is a bipartite graph in which every vertex has the same valency, then any bipartition sets $V_1$ and $V_2$ have the same
cardinality, and there is a set of $|V_1|$ edges that can be properly coloured with the same colour.
Proof
Let \(k\) be the valency of every vertex, and for some arbitrary bipartition sets \(V_1\) and \(V_2\), let \(n = |V_1|\). By a slight adaptation
of Euler’s handshaking lemma, taking into account the fact that every edge has exactly one of its endvertices in \(V_1\), we see
that \(nk = |E|\). By the same argument, \(k|V_2| = |E|\), which forces \(|V_2| = n = |V_1|\). Since the bipartition was arbitrary, the
first claim follows. For each \(1 \le i \le n\), let \(T_i\) be the set of vertices of \(V_2\) that are adjacent to the \(i\)th vertex of \(V_1\).
A system of distinct representatives for the sets \(T_1, \ldots, T_n\) will produce \(n = |V_1|\) edges that can be properly coloured with
the same colour.
Observe that every set \(T_i\) has cardinality \(k\) (since this is the valency of every vertex), and every vertex appears in exactly \(k\)
of the sets (again because this is the valency of the vertex). Therefore, by Proposition 16.3.1, there is a system of distinct
representatives for \(T_1, \ldots, T_n\). Hence we can find \(n = |V_1|\) edges that can be properly coloured with the same colour.
Corollary 16.3.2
If G is a bipartite graph in which every vertex has the same valency k , then its edges can be properly coloured using k colours.
Proof
Repeat the following step k times: find a set of edges that can be properly coloured. Colour them with a new colour, and
delete them from the graph.
At each stage, Theorem 16.3.3 tells us that, provided every vertex has the same valency, we can find a set of edges that can be
properly coloured with a single colour, one of which is incident with every vertex of the graph. Since the valency of every vertex
is reduced by exactly one when we delete such a set of edges, at each stage every vertex will have the same valency.
To conclude the chapter, we provide a proof of Hall’s Theorem. As previously noted, one direction is obvious, so we prove only the
other direction.
The collection of sets \(T_1, \ldots, T_n\) has a system of distinct representatives if for every \(1 \le k \le n\), the union of any \(k\) of the sets
has cardinality at least \(k\).
Proof
We will prove this by strong induction on the number of sets, n .
Base case: \(n = 1\). If a single set has at least one element, then that set has a representative. This completes the proof of the
base case.
Inductive step: We begin with the inductive hypothesis. Let \(m \ge 1\) be arbitrary. Suppose that whenever \(1 \le i \le m\), and a
collection of \(i\) sets \(T_1, \ldots, T_i\) has the property that for every \(1 \le k \le i\), the union of any \(k\) of the sets has cardinality at
union of any \(k\) of these sets has cardinality at least \(k + 1\). In this case, take any element \(t \in T_{m+1}\) (which by hypothesis is
nonempty) to be the representative for \(T_{m+1}\), and remove this element from each of the other sets. Due to the case we are
in, for every \(1 \le k \le m\), the union of any \(k\) of the sets \(T_1 - \{t\}, \ldots, T_m - \{t\}\) still has cardinality at least \(k\) (we have
removed only the element \(t\) from this union, which previously had cardinality at least \(k + 1\)). Thus we can apply our
induction hypothesis to find a system of distinct representatives for \(T_1, \ldots, T_m\) that does not include the element \(t\). This
cardinality precisely \(k\). By our induction hypothesis, there is a system of distinct representatives for these \(k\) sets. From the
other \(m + 1 - k\) sets (observe that \(1 \le m + 1 - k \le m\)), remove the \(k\) elements that were in the union of the original \(k\)
sets. Consider any \(k'\) of these adjusted sets, with \(1 \le k' \le m + 1 - k\). Observe that the union of the \(k + k'\) sets consisting
of the original \(k\) sets together with these \(k'\) sets must have contained at least \(k + k'\) distinct elements by hypothesis, so
after removing the \(k\) representatives of the original \(k\) sets (which are the only elements in those sets), these \(k'\) sets must
still have at least \(k'\) distinct elements in their union. Therefore, we can apply our induction hypothesis to these other
\(m + 1 - k\) adjusted sets, and see that they too have a system of distinct representatives, none of which are amongst the \(k\)
representatives for the original \(k\) sets. Combining these two systems of distinct representatives yields a system of distinct
representatives for the full collection of sets.
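For small collections, both sides of Hall’s Theorem can be checked directly. The sketch below tests Hall’s condition by brute force over all subcollections and, separately, searches for a system of distinct representatives by backtracking. The function names and the two example collections are illustrative assumptions, and neither routine is efficient for large inputs.

```python
from itertools import combinations


def satisfies_hall_condition(sets):
    """True if the union of any k of the sets has at least k elements."""
    for k in range(1, len(sets) + 1):
        for chosen in combinations(sets, k):
            if len(set().union(*chosen)) < k:
                return False
    return True


def find_sdr(sets, used=()):
    """Backtracking search: return a list of distinct representatives, or None."""
    if len(used) == len(sets):
        return list(used)
    for x in sorted(sets[len(used)]):
        if x not in used:
            result = find_sdr(sets, used + (x,))
            if result is not None:
                return result
    return None


# Hypothetical examples: the first collection meets Hall's condition,
# the second does not (three of its sets share only two elements).
good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1, 2}, {1, 2}, {1, 2}, {3, 4}]

good_sdr = find_sdr(good)
bad_sdr = find_sdr(bad)
```

In both examples the two checks agree, as the theorem says they must.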
Exercise 16.3.2
1) Onyx is doing some research on what people learn from visiting different countries. She has a set of questions that are to be
answered by someone who has visited the country. Before fully launching the study, she wants to try the questions out on some
of her friends. Since the questions are the same for different countries, she doesn’t want one person to answer the questions for
more than one country, as that could bias the results. Of her close friends, the following people have visited the following
countries:
England: Adam, Ella, Justin
Wales: Adam, Justin, Faith, Cayla
Scotland: Bryant, Justin, Ella
Ireland: Adam, Bryant, Justin
Germany: Cayla, Bryant, Justin, Faith, Denise
France: Ella, Justin, Bryant
Italy: Adam, Ella, Bryant
Prove that Onyx cannot find seven different friends, each of whom has visited a different one of these countries.
2) Can the following be completed to a 4 × 4 Latin square? Does Hall’s Theorem apply to this? If not, why not?
1 2 3 4
4 1 2 3
2 4
3) Show (by example) that it is possible to have a bipartite graph in which the bipartition sets have the same cardinality k and
the valency of every vertex is either 3 or 4, but no set of k edges can be properly coloured with a single colour.
4) How do you know (without actually finding a completion) that the following can be completed to a Latin square of order 7?
1 2 3 4 5 6 7
2 4 7 6 1 5 3
3 7 4 2 6 1 5
4 6 2 5 3 7 1
other words, determine whether there is a set of \(|V_1|\) edges that can be properly coloured with the same colour.)
(a)
(b)
(c)
This page titled 16.3: Systems of Distinct Representatives is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
16.4: Summary
Hall’s (Marriage) Theorem
A partial Latin square containing m rows can always be completed.
Important Definitions:
Latin Square
Orthogonal Latin Squares
MOLS (Mutually Orthogonal Latin Squares)
System of Distinct Representatives (SDR)
This page titled 16.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
16.4.1 https://math.libretexts.org/@go/page/60158
CHAPTER OVERVIEW
17: Designs
17.1: Balanced Incomplete Block Designs (BIBD)
17.2: Constructing Designs and Existence of Designs
17.3: Fisher’s Inequality
17.4: Summary
This page titled 17: Designs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
17.1: Balanced Incomplete Block Designs (BIBD)
Suppose you have 7 possible treatments for a disease, that you hope may work well singly or in combination. You would like to try
out every possible combination of them, but the number of (non-empty) subsets of the set of 7 treatments is \(2^7 - 1 = 127\), and you
have only 7 mice in the lab who have this disease. You believe that using a pair of the treatments together will have a more
significant impact than adding more of the treatments, but even trying every pair of treatments on a different mouse would require
\(\binom{7}{2} = 21\) mice.
Here is a strategy you could try: give each of the mice 3 of the treatments, according to the following scheme. For \(1 \le i \le 7\), the
treatments given to mouse \(i\) will be the elements of the \(i\)th set in the following list:
\[\{1, 2, 3\}, \{1, 4, 5\}, \{1, 6, 7\}, \{2, 4, 6\}, \{2, 5, 7\}, \{3, 4, 7\}, \{3, 5, 6\} \tag{17.1.1}\]
Careful perusal of this scheme will show that every pair of treatments is used together on precisely one of the mice.
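The “careful perusal” can also be delegated to a few lines of code. This sketch counts, for each of the 21 pairs of treatments, how many of the seven sets in (17.1.1) contain it; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# The seven sets of treatments from the scheme above.
blocks = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
          {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

# Count how often each unordered pair of treatments shares a mouse.
pair_counts = Counter(
    pair for block in blocks for pair in combinations(sorted(block), 2)
)

all_pairs = list(combinations(range(1, 8), 2))
every_pair_once = all(pair_counts[pair] == 1 for pair in all_pairs)
```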
This definition of a design is too broad to be of much interest without additional constraints, but a variety of different constraints
have been studied.
Notation
We use v to denote |V | and b to denote |B|.
Example 17.1.1
B = {{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}}.
occurring λ times.
Although this definition includes the possibility k = 1 or k = 2 , these are not interesting cases, and can usually be ignored.
17.1.1 https://math.libretexts.org/@go/page/60161
Example 17.1.2
Here is another BIBD. This one has parameters (20, 16, 5, 4, 1).
{a, b, c, d}, {e, f , g, h}, {i, j, k, l}, {m, n, o, p}, {a, e, i, m}, {a, f , j, n}, {a, g, k, o},
{a, h, l, p}, {b, e, j, o}, {b, f , i, p}, {b, g, l, m}, {b, h, k, n}, {c, e, l, o}, {c, f , k, p}, (17.1.2)
{c, g, i, n}, {c, h, j, m}, {d, e, k, n}, {d, f , l, m}, {d, g, j, p}, {d, h, i, o}
Theorem 17.1.1
Proof
For the first equation, we count the total number of appearances of each point in the design (including repetitions) in two
ways. This is another example of counting ordered pairs from a cartesian product, as we have discussed previously.
First, there are b blocks, each of which has k points in it. So the answer will be bk.
Second, there are v points, each of which appears r times. So the answer will be vr .
Thus, vr = bk .
For the second equation, we fix a point p and count the number of points with which p appears in a block, in two ways.
First, p appears in r blocks. In each of these, there are k − 1 points besides p . So the answer will be r(k − 1) .
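Second, for every point \(p' \in V\) with \(p' \ne p\), the point \(p'\) appears with \(p\) in \(\lambda\) different blocks. Since there are \(v - 1\)
such points, the answer will be \(\lambda(v - 1)\). Thus, \(r(k - 1) = \lambda(v - 1)\).
Rearranging these two equations gives
\[r = \frac{\lambda(v - 1)}{k - 1} \tag{17.1.4}\]
and
\[b = \frac{vr}{k} = \frac{\lambda v(v - 1)}{k(k - 1)}. \tag{17.1.5}\]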
Thus, if we know that a design is regular, uniform, and balanced, then the parameters r and b can be determined from the
parameters v , k , and λ . We therefore often shorten our notation and refer to a BIBD(v, k, λ) .
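As a small illustration of how r and b follow from v, k, and λ, the sketch below evaluates the two counting identities vr = bk and r(k − 1) = λ(v − 1) using exact fractions. The function name is an assumption made for this example.

```python
from fractions import Fraction


def derived_parameters(v, k, lam):
    """Return (r, b) computed from r(k - 1) = lam(v - 1) and vr = bk."""
    r = Fraction(lam * (v - 1), k - 1)
    b = r * v / k
    return r, b


# The 7-point design has v = 7, k = 3, lambda = 1.
r7, b7 = derived_parameters(7, 3, 1)
# Example 17.1.2 has v = 16, k = 4, lambda = 1.
r16, b16 = derived_parameters(16, 4, 1)
```

Working with `Fraction` rather than integer division makes it visible when a choice of v, k, and λ would force a non-integer r or b.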
Theorem 17.1.2
A BIBD\((v, k, \lambda)\) is equivalent to colouring the edges of the multigraph \(\lambda K_v\) (the multigraph in which each edge of \(K_v\) has
been replaced by \(\lambda\) copies of that edge) so that the edges of any colour form a \(K_k\).
Proof
Given an edge-colouring of \(\lambda K_v\) as described, define the points of the design to be the set of vertices of the multigraph, and
for each colour, create a block whose vertices are the vertices of the \(K_k\) that has that colour. All of these blocks will have
cardinality \(k\). Every vertex has valency \(\lambda(v - 1)\), and every \(K_k\) of one colour that contains that vertex will use \(k - 1\) of
the edges incident with that vertex, so every vertex will appear in
\[r = \frac{\lambda(v - 1)}{k - 1} \tag{17.1.6}\]
blocks. Now, any edge of the \(\lambda K_v\) must appear in some \(K_k\) (the one coloured with the colour of that edge). Thus for any
pair of points, these vertices are joined by \(\lambda\) edges each of which appears in some \(K_k\), so these points must appear together
in \(\lambda\) blocks.
Similarly, given a BIBD\((v, k, \lambda)\), and a multigraph \(\lambda K_v\), label the vertices of \(K_v\) with the points of the design. For each
block of the design, use a new colour to colour the edges of a \(K_k\) that connects the points in that block. There will be
enough uncoloured edges joining these points, since every pair of points appear together in exactly \(\lambda\) blocks, and there are
\(\lambda\) edges joining the corresponding vertices. In fact, careful counting can show that this will result in colouring every edge
of the multigraph.
This is nicest in the case where \(\lambda = 1\), when the BIBD corresponds to an edge-colouring of \(K_v\).
A colouring of the edges of a graph (or multigraph) is often referred to as a decomposition of the graph (or multigraph), since we
can think of the colour classes as sets of edges whose union forms the entire edge set of the graph.
These provide alternate ways of thinking of designs that may be more intuitive, and are certainly more visual. Equations 17.1.4 and
17.1.5 lead to numerical conditions on v , k , and λ that must be satisfied in order for a BIBD(v, k, λ) to exist.
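Theorem 17.1.3
If a BIBD\((v, k, \lambda)\) exists, then \(\frac{\lambda(v - 1)}{k - 1}\) and \(\frac{\lambda v(v - 1)}{k(k - 1)}\) are integers.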
Proof
By Equation 17.1.4, every point of the design must appear in
\[\frac{\lambda(v - 1)}{k - 1}\]
blocks. Since a point can only appear in an integral number of blocks, the first result follows.
Similarly, by Equation 17.1.5, there must be
\[\frac{\lambda v(v - 1)}{k(k - 1)}\]
blocks in the design. Since there can’t be a fractional number of blocks, the second result follows.
Although these conditions are necessary to the existence of a BIBD, there is no guarantee that a BIBD with specified parameters
will exist, even if those parameters satisfy these conditions.
Example 17.1.3
The parameters v = 15 , k = 5 , λ = 2 satisfy the conditions of Theorem 17.1.3, but there is no BIBD(15, 5, 2).
We will not prove that such a design does not exist as the proof would be tedious and unenlightening. We will verify that the
parameters satisfy the necessary conditions.
We have
\[\frac{\lambda(v - 1)}{k - 1} = \frac{2(14)}{4} = 7,\]
and
\[\frac{\lambda v(v - 1)}{k(k - 1)} = 2 \cdot 15 \cdot \frac{14}{20} = 21.\]
Both of these are integers, so if a design were to exist, each point would appear in 7 blocks, and there would be 21 blocks. A
computer search can verify that no such design exists.
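The integrality test of Theorem 17.1.3 is easy to automate. The following sketch (the function name is an assumption) checks both conditions with integer arithmetic. As Example 17.1.3 shows, a result of `True` means only that the parameters are not ruled out; it does not mean that a design exists.

```python
def necessary_conditions_hold(v, k, lam):
    """Check that lam(v-1)/(k-1) and lam*v(v-1)/(k(k-1)) are both integers."""
    first = (lam * (v - 1)) % (k - 1) == 0
    second = (lam * v * (v - 1)) % (k * (k - 1)) == 0
    return first and second


# The parameters of Example 17.1.3 pass, although no BIBD(15, 5, 2) exists.
passes_15_5_2 = necessary_conditions_hold(15, 5, 2)
# With v = 8, k = 3, lambda = 1, the first quantity is 7/2, so the test fails.
passes_8_3_1 = necessary_conditions_hold(8, 3, 1)
```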
Exercise 17.1.1
1) Show that for any BIBD\((v, k, \lambda)\), the number of edges of \(\lambda K_v\) is equal to the number of edges of \(K_k\) times the number of
blocks of the design.
2) Suppose there is a BIBD(16, 6, 3). How many blocks does it have? In how many of those blocks does each point appear?
3) Find an edge-colouring of \(K_5\) so that the edges of any colour form a \(K_2\). What are the parameters of the design to which
this corresponds?
4) Here are the blocks of a BIBD with λ = 1 :
B1 = {1, 2, 3} B2 = {1, 4, 7} B3 = {1, 5, 9} B4 = {1, 6, 8}
This page titled 17.1: Balanced Incomplete Block Designs (BIBD) is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
17.2: Constructing Designs and Existence of Designs
There are a number of nice methods for constructing designs. We will discuss some of these methods in this section. For some of
them, you must start with one design, and use it to create a different design.
is a design.
Proposition 17.2.1
Proof
The proof of this proposition is left to the reader, as Exercise 17.2.1(1).
Example 17.2.1
17.2.1 https://math.libretexts.org/@go/page/60162
Definition: Difference Collection and Difference Set
Fix an odd integer \(n\). A collection of sets \(D_1, \ldots, D_m \subseteq \{1, \ldots, n\}\) is a difference collection for \(n\) if taking the differences
\(j - i\) for every pair \(i \ne j\) with \(i, j \in D_k\), for each set \(D_k\), attains each of the values \(\pm 1, \ldots, \pm\frac{n - 1}{2}\) exactly once, when
computations are performed modulo \(n\). If \(m = 1\) then \(D_1\) is called a difference set.
Example 17.2.2
Difference Set    i     j     j − i, i − j
D1                1     2     ±1
D1                1     5     ±4
D1                2     5     ±3
D2                1     3     ±2
D2                1     10    ±9
D2                3     10    ±7
D3                1     7     ±6
D3                1     15    ±14
D3                7     15    ±8
Suppose we have a difference collection for \(v\) in which each set \(D_1, \ldots, D_m\) has the same cardinality. Use \(D_i + \ell\) to denote
the set
\[\{d + \ell \pmod{v} \mid d \in D_i\},\]
performing the modular arithmetic so as to ensure that \(D_i + \ell \subseteq \{1, \ldots, v\}\). Then the sets
\[\{D_i + \ell \mid 1 \le i \le m,\ 0 \le \ell \le v - 1\}\]
form a BIBD\((v, |D_1|, 1)\).
In the above example, taking the 57 sets \(\{1, 2, 5\} + \ell\), \(\{1, 3, 10\} + \ell\), and \(\{1, 7, 15\} + \ell\), where \(0 \le \ell \le 18\), gives a
BIBD(19, 3, 1).
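The construction can be carried out mechanically. The sketch below develops the three starter sets modulo 19, keeping points in {1, ..., 19}, and then counts how often each pair of points occurs. The function name `develop` is an assumption made for this example.

```python
from collections import Counter
from itertools import combinations


def develop(starters, v):
    """All translates D + l (mod v) for 0 <= l <= v - 1, with points in 1..v."""
    return [
        frozenset((d - 1 + shift) % v + 1 for d in D)
        for D in starters
        for shift in range(v)
    ]


starters = [{1, 2, 5}, {1, 3, 10}, {1, 7, 15}]
blocks = develop(starters, 19)

# Count how many blocks contain each unordered pair of points.
pair_counts = Counter(
    pair for block in blocks for pair in combinations(sorted(block), 2)
)
```

Since 57 blocks of size 3 contain 171 pairs in total, and there are exactly 171 pairs of points, every pair occurring at least once is the same as every pair occurring exactly once.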
Let’s go over this construction again, thinking about the graph version of the problem. For simplicity, we’ll look only at the special
case \(\lambda = 1\). So our object is to colour the edges of the complete graph \(K_v\) so as to ensure that every colour class is a \(K_k\). If we
draw the vertices of the graph in a circle, and think of the length of an edge as being one more than the number of vertices between
its endvertices as you travel around the circle in whichever direction is shorter, then for every possible length between \(1\) and
\(\frac{v - 1}{2}\), \(K_v\) has \(v\) edges of that length. (This is where the trouble arises if \(v\) is even: there are only \(\frac{v}{2}\) edges of length \(\frac{v}{2}\).)
Furthermore, if we rotate any edge by one step around the graph (i.e. move both of its endpoints one step in the same direction)
repeatedly, after \(v\) such rotations we will have moved the edge onto every other edge of that length.
These ideas demonstrate that if we can come up with a set of \(K_k\)s, such that every edge length appears in exactly one of the \(K_k\)s,
then by taking each one of these as well as every possible rotation of each one of these, as a colour class, we find our desired edge-
colouring of \(K_v\).
A picture is worth a thousand words. The example above is equivalent to edge-colouring \(K_{19}\) so that every colour class forms a
\(K_3\). We begin with three \(K_3\)s such that each edge length in \(\{1, \ldots, 9\}\) appears in exactly one of them. By rotating each of them,
giving each rotation a new colour, we obtain 57 \(K_3\)s that use every edge of \(K_{19}\) exactly once. We’ve labeled the vertices 1 through 19
to make the edge lengths easier to work out. This textbook is printed in black-and-white, so, instead of drawing actual colours on the
edges, we will draw solid edges for blue, dotted edges for red, and dashed edges for green.
The solid (or blue) triangle has edges of lengths 1, 3, and 4; the dotted (or red) triangle has edges of lengths 2, 7, and 9; and the
dashed (or green) triangle has edges of lengths 6, 8, and 5.
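The edge lengths quoted here can be recomputed from the vertex labels. In the sketch below, the length of an edge is the shorter way around the circle, and the pairing of line styles with the starter sets {1, 2, 5}, {1, 3, 10}, {1, 7, 15} is an assumption inferred from the lengths listed above.

```python
def edge_length(a, b, v):
    """Circular distance between vertices a and b on a cycle of v vertices."""
    d = abs(a - b) % v
    return min(d, v - d)


def triangle_lengths(triangle, v):
    a, b, c = sorted(triangle)
    return sorted([edge_length(a, b, v),
                   edge_length(a, c, v),
                   edge_length(b, c, v)])


# Assumed pairing of line styles with starter triangles.
starters = {"solid": (1, 2, 5), "dotted": (1, 3, 10), "dashed": (1, 7, 15)}
lengths = {name: triangle_lengths(t, 19) for name, t in starters.items()}
all_lengths = sorted(x for ls in lengths.values() for x in ls)
```

The combined list covers each length from 1 to 9 once, which is the property that makes the rotations use every edge of the complete graph on 19 vertices exactly once.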
A design created using this method is called a cyclic design, since a small number of “starter blocks” are being rotated cyclically
(in the graph) to find the remaining blocks of the design.
Notice that for a cyclic design to exist, since each set in the difference collection leads to v blocks in the final design, b must be a
multiple of v .
Although these methods can successfully create designs with many different sets of parameters, they are not nearly enough to allow
us to determine the parameters for which BIBDs exist. We noted previously that the necessary conditions given in Theorem 17.1.3
are not sufficient to guarantee the existence of a BIBD with a particular set of parameters. However, there is a very powerful result
along these lines, known as Wilson’s Theorem. It tells us that if we fix k , there are only finitely many values for v that satisfy the
necessary conditions but for which no BIBD(v, k, 1) exists. Then by Method 1 (repeating blocks), if a BIBD(v, k, 1) exists, then
so does a BIBD(v, k, λ) for any λ . Here is a formal statement of Wilson’s Theorem.
Given \(k\), there is an integer \(v(k)\) such that for every \(v > v(k)\) that satisfies the three conditions:
\(v \in \mathbb{Z}\);
\(\frac{v(v - 1)}{k(k - 1)} \in \mathbb{Z}\); and
\(\frac{v - 1}{k - 1} \in \mathbb{Z}\),
a BIBD\((v, k, 1)\) exists.
Proof
We will not give a proof of this theorem.
Exercise 17.2.1
1) Prove that the complement of a BIBD is indeed a design, and that it has the parameters we claimed in Proposition 17.2.1.
[Hint: Use inclusion-exclusion to determine how many blocks of the original design contain neither point from an arbitrary
pair.]
2) Find the complement of the BIBD(8, 4, 3) given by V = {1, 2, 3, 4, 5, 6, 7, 8} and
\[B = \{\{1, 2, 3, 4\}, \{5, 6, 7, 8\}, \{1, 2, 5, 6\}, \{1, 2, 7, 8\}, \{3, 4, 5, 6\}, \{3, 4, 7, 8\}, \{2, 4, 6, 8\},
\{1, 3, 5, 7\}, \{1, 3, 6, 8\}, \{2, 4, 5, 7\}, \{1, 4, 5, 8\}, \{1, 4, 6, 7\}, \{2, 3, 5, 8\}, \{2, 3, 6, 7\}\}. \tag{17.2.2}\]
3) By adding two more sets to the sets {1, 3, 7} and {1, 6, 13}, you can create a difference collection for 25 in which each of
the sets has 3 elements, and thus a cyclic BIBD(25, 3, 1). Find two sets to add.
4) Use a difference set to construct a cyclic (11, 11, 5, 5, 2) design.
5) Show that the collection C = {{0, 1, 3}, {0, 4, 5}, {0, 4, 7}, {0, 5, 7}} is a difference collection for 13. Construct the design
and give its parameters.
6) Determine whether the given set D is a difference set for the given value of n . If it is a difference set, find the parameters of
the resulting cyclic BIBD.
(a) D = {1, 2, 4, 10} for n = 13 .
(b) D = {2, 4, 5, 6, 10} for n = 21 .
7) Prove that in any cyclic design, there exists an integer c such that b = cv , ck(k − 1) = λ(v − 1) , and r = ck . What is the
significance of c in terms of the design?
8) Explain why a BIBD with v = 6 , b = 10 , k = 3 , r = 5 , and λ = 2 cannot be cyclic.
9) Does the condition you proved in Problem 7 show that a BIBD with v = 61 , b = 305 , k = 4 , r = 20 , and λ = 1 cannot be
cyclic?
This page titled 17.2: Constructing Designs and Existence of Designs is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
17.3: Fisher’s Inequality
There is one more important inequality that is not at all obvious, but is necessary for the existence of a BIBD(v, k, λ) . This is
known as Fisher’s Inequality since it was proven by Fisher. The proof we will give is somewhat longer than the standard proof.
This is because the standard proof uses linear algebra, which is not required background for this course.
Theorem 17.3.1
so \(b \ge v\) implies
\[\frac{\lambda v(v - 1)}{k(k - 1)} \ge v. \tag{17.3.2}\]
Since \(v\) is the number of points of a design, it must be positive, so dividing through by \(v\) does not reverse the inequality. Thus,
\[\frac{\lambda(v - 1)}{k(k - 1)} \ge 1. \tag{17.3.3}\]
Since \(k\) is the number of points in each block, both \(k\) and \(k - 1\) must be positive (we are ignoring the trivial case \(k = 1\)), so
multiplying through by \((k - 1)\) does not reverse the inequality. Thus,
Proof
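Suppose we have an arbitrary BIBD\((v, k, \lambda)\). Let \(B\) be an arbitrary block of this design. For each value of \(i\) between \(0\) and
\(k\) (inclusive), let \(n_i\) denote the number of blocks \(B' \ne B\) such that \(|B' \cap B| = i\). (When we say \(B' \ne B\) we allow the
blocks to be equal as sets if the block \(B\) is a repeated block of the design; we are only insisting that \(B'\) not be the exact
same block of the design as \(B\).) Then
\[\sum_{i=0}^{k} n_i = b - 1, \tag{17.3.5}\]
since each block other than \(B\) is counted exactly once on the left. Also,
\[\sum_{i=0}^{k} i\,n_i = k(r - 1), \tag{17.3.6}\]
because both sides of this equation count the number of times elements of \(B\) appear in some other block of the design.
Similarly,
\[\sum_{i=2}^{k} i(i - 1)\,n_i = k(k - 1)(\lambda - 1), \tag{17.3.7}\]
because both sides of this equation count the number of times all of the ordered pairs of elements from \(B\) appear together in
some other block of the design. Note that when \(i = 0\) or \(i = 1\), we have \(i(i - 1)n_i = 0\), so in fact
\[\sum_{i=0}^{k} i(i - 1)\,n_i = \sum_{i=2}^{k} i(i - 1)\,n_i = k(k - 1)(\lambda - 1). \tag{17.3.8}\]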
17.3.1 https://math.libretexts.org/@go/page/60163
Adding Equations 17.3.6 and 17.3.8, we obtain
\[\sum_{i=0}^{k} i^2 n_i = k(k - 1)(\lambda - 1) + k(r - 1). \tag{17.3.9}\]
Now comes the part of the proof where something mysterious happens, and for reasons that are not at all apparent, the
result we want will emerge. To fully understand a proof like this one requires deeper mathematics, but even seeing a proof
is useful to convince ourselves that the result is true.
\[\sum_{i=0}^{k} (x - i)^2 n_i = \sum_{i=0}^{k} (x^2 - 2xi + i^2) n_i = x^2 \sum_{i=0}^{k} n_i - 2x \sum_{i=0}^{k} i\,n_i + \sum_{i=0}^{k} i^2 n_i \tag{17.3.10}\]
Using Equations 17.3.5, 17.3.6, and 17.3.9, we see that this is equal to
\[x^2 (b - 1) - 2xk(r - 1) + k(k - 1)(\lambda - 1) + k(r - 1). \tag{17.3.11}\]
Notice that the format in which this polynomial started was a sum of squares times non-negative integers, so its value must
be non-negative for any x ∈ R .
Using the quadratic formula, \(ax^2 + b'x + c = 0\) has roots at
\[\frac{-b' \pm \sqrt{(b')^2 - 4ac}}{2a}. \tag{17.3.12}\]
If a quadratic polynomial has two real roots, then there is a region in which its values are negative. Since this polynomial is
non-negative for every \(x \in \mathbb{R}\), it can have at most one real root, so \((b')^2 - 4ac \le 0\). Substituting the actual values from
Equation 17.3.11, this says
\[(-2k(r - 1))^2 - 4(b - 1)\bigl(k(k - 1)(\lambda - 1) + k(r - 1)\bigr) \le 0. \tag{17.3.13}\]
Hence,
\[k^2 (r - 1)^2 - k(b - 1)\bigl((k - 1)(\lambda - 1) + r - 1\bigr) \le 0. \tag{17.3.14}\]
By Theorem 17.1.1, \(bk = vr\), so
\[k(b - 1) = bk - k = vr - k. \tag{17.3.15}\]
Hence
\[k^2 (r - 1)^2 - (vr - k)\bigl((k - 1)(\lambda - 1) + r - 1\bigr) \le 0. \tag{17.3.16}\]
Expand the second term slightly, and multiply both sides of the inequality by \(v - 1\):
\[k^2 (r - 1)^2 (v - 1) - (vr - k)(k - 1)(\lambda - 1)(v - 1) - (vr - k)(r - 1)(v - 1) \le 0. \tag{17.3.17}\]
In the middle expression, we have \((\lambda - 1)(v - 1)\). By Theorem 17.1.1, we know that \(\lambda = \frac{r(k - 1)}{v - 1}\), so
\[\lambda - 1 = \frac{r(k - 1) - (v - 1)}{v - 1}. \tag{17.3.18}\]
Therefore,
\[(\lambda - 1)(v - 1) = r(k - 1) - (v - 1) = rk - r - v + 1. \tag{17.3.19}\]
Thus, we have
\[k^2 (r - 1)^2 (v - 1) - (vr - k)(k - 1)(rk - r - v + 1) - (vr - k)(r - 1)(v - 1) \le 0. \tag{17.3.20}\]
The next step is a lot of work to do by hand. Fortunately there is good math software that can perform routine tasks like this
quickly. If we expand this inequality fully, remarkably it has a nice factorisation:
\[r(k - r)(v - k)^2 \le 0. \tag{17.3.21}\]
Now, \(r > 0\) for any design, and \((v - k)^2\) is a square, so must be nonnegative. Therefore, this inequality forces \(k - r \le 0\),
so \(k \le r\). Hence \(\frac{r}{k} \ge 1\). Using Theorem 17.1.1, we have
\[b = \frac{vr}{k} \ge v, \tag{17.3.22}\]
as desired.
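For readers without algebra software at hand, the factorisation used in the last step can be checked with plain integer arithmetic. The sketch below compares the left-hand side of (17.3.20) with \(r(k - r)(v - k)^2\) on a grid of integer values; the function names and the choice of grid are assumptions. Both expressions are polynomials of degree at most 3 in each of v, k, and r, so agreement on a 9 × 9 × 9 grid is enough to show that they are identical.

```python
from itertools import product


def expanded_form(v, k, r):
    """Left-hand side of inequality (17.3.20)."""
    return (k**2 * (r - 1)**2 * (v - 1)
            - (v * r - k) * (k - 1) * (r * k - r - v + 1)
            - (v * r - k) * (r - 1) * (v - 1))


def factored_form(v, k, r):
    """Left-hand side of inequality (17.3.21)."""
    return r * (k - r) * (v - k)**2


grid = range(-4, 5)
identity_holds = all(
    expanded_form(v, k, r) == factored_form(v, k, r)
    for v, k, r in product(grid, repeat=3)
)
```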
Notice that if k is fixed, then only finitely many values of v do not meet Fisher’s Inequality, so satisfying this inequality did not
need to be added as a condition to Wilson’s Theorem.
Exercise 17.3.1
1) Find values for v , k , and λ that satisfy Theorem 17.1.3 but do not satisfy Fisher’s Inequality. What can you say about the
existence of a design with these parameters?
2) Suppose that λ = 1 and k = 20 . How big must v be to satisfy Fisher’s Inequality? What is the smallest value for v that
satisfies all of the necessary conditions?
3) Suppose that λ = 2 and k = 20 . How big must v be to satisfy Fisher’s Inequality? What is the smallest value for v that
satisfies all of the necessary conditions?
4) Explain how you know there does not exist a BIBD with v = 46 , b = 23 , and k = 10 .
5) Explain how you know there does not exist a BIBD with v = 8 , b = 10 , k = 4 , and r = 5 .
6) If B is a BIBD with v = 22 , then what can you say about the value of b ?
This page titled 17.3: Fisher’s Inequality is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
17.4: Summary
Equivalence between designs and (multi)graph colouring/decomposition problem
Necessary conditions for a BIBD
Construction methods for Designs
Wilson’s Theorem
Fisher’s Inequality
Important Definitions:
Design
Blocks
Balanced, Regular, Uniform
BIBD
Complementary Design
Difference Collection
Notation:
b, v, r, k, λ
This page titled 17.4: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
17.4.1 https://math.libretexts.org/@go/page/60164
CHAPTER OVERVIEW
This page titled 18: More Designs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
18.1: Steiner and Kirkman Triple Systems
In 1844, W.S.B. Woolhouse, editor of the Ladies and Gentleman’s Diary, posed that publication’s annual prize problem:
Determine the number of combinations that can be made of n symbols, p symbols in each, with this limitation, that no combination
of q symbols which may appear in any one of them shall be repeated in any other.
If we take q = 2 then this is essentially a design with k = p and v = n , although it need not be balanced; some pairs might appear
once while other pairs do not appear. Although some responses were printed in 1845, they were not satisfactory, and in 1846
Woolhouse repeated the question for the special case q = 2 and p = 3 .
In 1847, Rev. Thomas Kirkman (previously mentioned in Chapter 13) found a complete solution to this problem in the case where
the design is balanced (with λ = 1 ), and made some progress towards solving the complete problem. In our terminology, his
solution completely determined the values of v for which a BIBD(v, 3, 1) exists.
Although Steiner did not study triple systems until 1853, he came up with Kirkman’s result independently, and his work was more
broadly disseminated in mathematical circles, so these structures still carry his name. Despite this, as we shall see in the next
section, there is a related problem that has been named after Kirkman.
Example 18.1.1
The cyclic BIBD(19, 3, 1) given in Example 17.2.2 is a Steiner triple system. So is the design on 7 points given by
{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6} (18.1.1)
Notation
Since the only variable in a Steiner triple system is v , for such a system on v points we use the notation STS(v) .
Triple systems might seem like a very special case of designs, and it would be reasonable to wonder why we have chosen to single
these out for special study and attention. The answer is that triple systems can be thought of as the smallest interesting examples of
designs, since (as noted previously) if k = 1 there are no pairs in any block and the design is trivial; and if k = 2 then the blocks
are simply copies of every possible pair from the set V . So triple systems are a natural starting point when we are learning about
designs: they include examples that are not too big or complicated to understand, but are non-trivial.
If you are now convinced that triple systems are worth studying, you might still be wondering about Steiner triple systems in
particular. We’ve seen that the method of repeating blocks allows us to construct a triple system for any λ if we first have a triple
system with λ = 1 , so Steiner triple systems will be the rarest kind of triple system, and are therefore of particular interest.
In the remainder of this section, we will prove Kirkman’s result characterising the values of v for which an STS(v) exists. To do
this, we will require two results about the existence of special kinds of Latin squares. There are many connections in
combinatorics!
The special kinds of Latin squares we require will be symmetric: that is, the entry in row i and column j must equal the entry in
row j , column i. We will also specify the entries on the main diagonal: the positions for which the row number and the column
number are the same.
Lemma 18.1.1
For every odd n , there is a symmetric Latin square of order n with 1, 2, . . . , n appearing in that order down the main diagonal.
Proof
Make the entries of the first row
18.1.1 https://math.libretexts.org/@go/page/60167
\[1, \frac{n + 3}{2}, 2, \frac{n + 5}{2}, 3, \ldots, \frac{n - 1}{2}, n, \frac{n + 1}{2}. \tag{18.1.2}\]
For i ≥ 2 , the entries of row i will be the entries of row i − 1 shifted one position to the left.
Clearly all of the entries in any row are distinct. Also, since it takes n shifts to the left to return to the starting point, all of
the entries in any column must be distinct.
Since the entry in row a , column b moves to the entry in row a + 1 column b − 1(mod n) , the positions in which this
entry appears will be precisely the positions (x, y) for which x + y ≡ a + b(mod n) . Since i + j = j + i , the entry in
column i of row j will be the same as the entry in column j of row i. Thus, the Latin square is symmetric.
The same argument shows that the entry in row i and column i will be the entry in position 2i − 1(mod n) of row 1. But
this is precisely where i has been placed, so this entry will be i, as desired.
Example 18.1.2
When n = 11 , here is the (symmetric) Latin square constructed in the proof of Lemma 18.1.1:
1 7 2 8 3 9 4 10 5 11 6 (18.1.3)
7 2 8 3 9 4 10 5 11 6 1
2 8 3 9 4 10 5 11 6 1 7
8 3 9 4 10 5 11 6 1 7 2
3 9 4 10 5 11 6 1 7 2 8
9 4 10 5 11 6 1 7 2 8 3
4 10 5 11 6 1 7 2 8 3 9
10 5 11 6 1 7 2 8 3 9 4
5 11 6 1 7 2 8 3 9 4 10
11 6 1 7 2 8 3 9 4 10 5
6 1 7 2 8 3 9 4 10 5 11
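The construction in the proof of Lemma 18.1.1 translates directly into code. This sketch (the function name is an assumption) builds the first row, shifts it one position to the left for each later row, and can be compared against the square for n = 11 shown above.

```python
def symmetric_latin_square_odd(n):
    """Symmetric Latin square of odd order n with 1, ..., n on the diagonal."""
    if n % 2 == 0:
        raise ValueError("this construction needs n to be odd")
    half = (n + 1) // 2
    first_row = [0] * n
    for i in range(1, half + 1):
        first_row[2 * i - 2] = i         # positions 1, 3, 5, ... hold 1, 2, 3, ...
    for j in range(1, half):
        first_row[2 * j - 1] = half + j  # positions 2, 4, ... hold (n+3)/2, ...
    # Row `row` (0-based) is the first row shifted `row` places to the left.
    return [[first_row[(row + col) % n] for col in range(n)] for row in range(n)]


square11 = symmetric_latin_square_odd(11)
diagonal11 = [square11[i][i] for i in range(11)]
```

The entry in a given position depends only on the sum of its row and column indices modulo n, which is the reason the square is symmetric.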
You can see that this construction will not work when n is even, since the values of some of the entries given in the formulas
would not be integers. In fact, it is not possible to construct a symmetric Latin square of order n with the entries 1, . . . n down
the main diagonal when n is even. Fortunately, it is possible to construct something similar that will achieve what we will
require.
Lemma 18.1.2
For every even \(n\), there is a symmetric Latin square of order \(n\) with the values \(1, \ldots, \frac{n}{2}, 1, \ldots, \frac{n}{2}\) appearing in that order
down the main diagonal.
Proof
Make the entries of the first row
\[1, \frac{n + 2}{2}, 2, \frac{n + 4}{2}, 3, \ldots, \frac{n - 2}{2}, n - 1, \frac{n}{2}, n. \tag{18.1.4}\]
For i ≥ 2 , the entries of row i will be the entries of row i − 1 shifted one position to the left.
The same arguments as in the proof of Lemma 18.1.1 show that this is a symmetric Latin square. The entry in row \(i\) and
column \(i\) will be the entry in position \(2i - 1 \pmod{n}\) of row 1. These are the entries \(1, 2, \ldots, \frac{n}{2}\), and since position
\[\frac{2(n + 2j)}{2} - 1 \equiv 2j - 1 \pmod{n}, \tag{18.1.5}\]
the entries in positions \((j, j)\) and \(\left(\frac{n + 2j}{2}, \frac{n + 2j}{2}\right)\) will be the same, so each of these entries will be repeated in the
same order.
Example 18.1.3
When n = 10 , here is the (symmetric) Latin square constructed in the proof of Lemma 18.1.2:
1 6 2 7 3 8 4 9 5 10 (18.1.6)
6 2 7 3 8 4 9 5 10 1
2 7 3 8 4 9 5 10 1 6
7 3 8 4 9 5 10 1 6 2
3 8 4 9 5 10 1 6 2 7
8 4 9 5 10 1 6 2 7 3
4 9 5 10 1 6 2 7 3 8
9 5 10 1 6 2 7 3 8 4
5 10 1 6 2 7 3 8 4 9
10 1 6 2 7 3 8 4 9 5
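The even-order construction from the proof of Lemma 18.1.2 can be sketched the same way. This is a minimal sketch under our own naming (`symmetric_latin_square_even` is not from the text); the first row alternates 1, 2, . . . , n/2 with n/2 + 1, . . . , n, as in Example 18.1.3.

```python
def symmetric_latin_square_even(n):
    """Symmetric Latin square of even order n with 1..n/2, 1..n/2 down the diagonal."""
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    half = n // 2
    first_row = [0] * n
    for i in range(1, half + 1):
        first_row[2 * i - 2] = i          # odd (1-based) positions: 1, 2, ..., n/2
        first_row[2 * i - 1] = half + i   # even positions: n/2 + 1, ..., n
    # Row r (0-based) is the first row shifted r places to the left.
    return [[first_row[(r + c) % n] for c in range(n)] for r in range(n)]


square = symmetric_latin_square_even(10)
print([square[i][i] for i in range(10)])  # the main diagonal
```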
We are now ready to characterise the values of v for which there is an STS(v) .
Theorem 18.1.1
An STS(v) exists if and only if v ≡ 1, 3 (mod 6).
Proof
We prove the two implications separately.
(⇒) Suppose that an STS(v) exists. Then by Theorem 17.1.3, the values (v − 1)/2 and v(v − 1)/6
must be integers. The first of these conditions tells us that v must be odd, so v must be congruent to 1, 3, or 5 (mod 6). If
v ≡ 5 (mod 6), then v = 6q + 5 for some q, so
\[\dfrac{v(v - 1)}{6} = \dfrac{(6q + 5)(6q + 4)}{6}.\]
Since neither 6q + 5 nor 6q + 4 is a multiple of 3, this will not be an integer. Thus, the second condition eliminates the
possibility v ≡ 5 (mod 6). Therefore, v ≡ 1, 3 (mod 6).
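The two integrality conditions used in this direction are easy to test numerically. The following sketch (the function name is our own) lists the small values of v that pass both conditions, which can then be compared with the congruence classes 1 and 3 (mod 6).

```python
def satisfies_necessary_conditions(v):
    """True when (v - 1)/2 and v(v - 1)/6 are both integers."""
    return (v - 1) % 2 == 0 and (v * (v - 1)) % 6 == 0


admissible = [v for v in range(3, 40) if satisfies_necessary_conditions(v)]
print(admissible)
print(all(v % 6 in (1, 3) for v in admissible))
```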
(⇐) Suppose that v ≡ 1, 3 (mod 6). We give separate constructions of Steiner triple systems on v points, depending on the
congruence class of v. We will use the graph theoretic approach to the problem, so our goal is to find colour classes for the
edges of K_v such that each colour class consists of a K_3.
v ≡ 3 (mod 6). Say v = 6q + 3. Label the vertices of K_{6q+3} with u_1, . . . , u_{2q+1}; v_1, . . . , v_{2q+1}; and
w_1, . . . , w_{2q+1}.
By Lemma 18.1.1, there is a Latin square of order 2q + 1 in which for every 1 ≤ i ≤ 2q + 1 , the entry i appears in
position (i, i).
For 1 ≤ i , j ≤ 2q + 1 with i ≠ j , if the entry in position (i, j) of this Latin square is ℓ , then use new colours to colour the
edges that join the vertices in each of the following sets:
{ ui , uj , vℓ }; { vi , vj , wℓ }; and { wi , wj , uℓ }. (18.1.9)
Since the Latin square was symmetric, both position (i, j) and position (j, i) give rise to the same colour classes. Since we
consider every pair i ≠ j, every edge of the form u_i u_j, v_i v_j, or w_i w_j has been coloured (we must have i ≠ j for such an
edge to exist). Since the square is Latin, every possible entry ℓ occurs somewhere in row i, so every edge of the form u_i v_ℓ,
v_i w_ℓ, or w_i u_ℓ has been coloured, except that we did not look at the entry of the Latin square in the position (i, j), where
i = j. We know that the entry in position (i, i) is i, so the edges of the form u_i v_i, v_i w_i, and w_i u_i are the only edges that
have not yet been coloured. For every 1 ≤ i ≤ 2q + 1, these three edges form a K_3 on the vertices { u_i, v_i, w_i }, so we use a
new colour for each of these triangles.
Now all of the edges of K6q+3 have been coloured, and each colour class forms a K3 , so we have constructed a Steiner
triple system.
v ≡ 1 (mod 6). Say v = 6q + 1. Label the vertices of K_{6q+1} with u_1, . . . , u_{2q}; v_1, . . . , v_{2q}; w_1, . . . , w_{2q}; and x.
By Lemma 18.1.2, there is a Latin square of order 2q in which for every 1 ≤i ≤q , the entry i appears in position (i, i) ,
and for every q + 1 ≤ i ≤ 2q , i − q appears in position (i, i).
For 1 ≤ i , j ≤ 2q with i ≠ j , if the entry in position (i, j) of this Latin square is ℓ , then use new colours to colour the
edges that join the vertices in each of the following sets:
{ ui , uj , vℓ }; { vi , vj , wℓ }; and { wi , wj , uℓ }. (18.1.11)
Since the Latin square was symmetric, both position (i, j) and position (j, i) give rise to the same colour classes. Since we
consider every pair i ≠ j, every edge of the form u_i u_j, v_i v_j, or w_i w_j has been coloured (we must have i ≠ j for such an
edge to exist). Since the square is Latin, every possible entry ℓ occurs somewhere in row i, so every edge of the form u_i v_ℓ,
v_i w_ℓ, or w_i u_ℓ has been coloured, except that we did not look at the entry of the Latin square in the position (i, j), where
i = j. We know that the entry in position (i, i) is i (if i ≤ q) or i − q (if i > q), so the only edges that have not yet been
coloured are the edges of the form u_i v_i, v_i w_i, and w_i u_i when i ≤ q and u_i v_{i−q}, v_i w_{i−q}, and w_i u_{i−q} when i > q,
as well as the edges that are incident with x. For every 1 ≤ i ≤ q, the edges u_i v_i, v_i w_i, and w_i u_i form a K_3, so we use a
new colour for each of these triangles. Of the remaining uncoloured edges
that are not incident with x, every vertex other than x is an endvertex of precisely one of the edges. For example, if i > q
then u_i v_{i−q} is one of these edges, while if i ≤ q, w_{i+q} u_i is one of these edges, so either way, u_i is an endvertex of
precisely one of these edges. Therefore, if for every q + 1 ≤ i ≤ 2q we use new colours to colour the edges that join the
vertices in each of the following sets:
{ u_i, v_{i−q}, x }; { v_i, w_{i−q}, x }; and { w_i, u_{i−q}, x }, (18.1.12)
every edge incident with x (as well as all of our other remaining edges) will have been coloured.
Now all of the edges of K6q+1 have been coloured, and each colour class forms a K3 , so we have constructed a Steiner
triple system.
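The case v ≡ 1 (mod 6) can be checked by machine. The sketch below is our own illustration, not the author's code: vertices are represented as hypothetical (letter, index) pairs, x is stored as ("x", 0), and we assume that for each i ≤ q the triangle on u_i, v_i, w_i receives its own colour, which is what the list of uncoloured edges requires. The helper `is_steiner_triple_system` simply confirms that every pair of points lies in exactly one triple.

```python
from itertools import combinations


def even_square(n):
    """Symmetric Latin square of even order n (Lemma 18.1.2), entries 1..n."""
    half = n // 2
    first_row = [0] * n
    for i in range(1, half + 1):
        first_row[2 * i - 2] = i
        first_row[2 * i - 1] = half + i
    return [[first_row[(r + c) % n] for c in range(n)] for r in range(n)]


def sts_1_mod_6(v):
    """Triples (colour classes) for v = 6q + 1, following the proof's construction."""
    if v < 7 or v % 6 != 1:
        raise ValueError("v must be at least 7 and congruent to 1 mod 6")
    q = (v - 1) // 6
    n = 2 * q
    square = even_square(n)
    x = ("x", 0)
    triples = []
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):      # symmetric square: each pair {i, j} once
            l = square[i - 1][j - 1]
            triples.append(frozenset({("u", i), ("u", j), ("v", l)}))
            triples.append(frozenset({("v", i), ("v", j), ("w", l)}))
            triples.append(frozenset({("w", i), ("w", j), ("u", l)}))
    for i in range(1, q + 1):              # diagonal entries i for i <= q
        triples.append(frozenset({("u", i), ("v", i), ("w", i)}))
    for i in range(q + 1, n + 1):          # diagonal entries i - q for i > q
        triples.append(frozenset({("u", i), ("v", i - q), x}))
        triples.append(frozenset({("v", i), ("w", i - q), x}))
        triples.append(frozenset({("w", i), ("u", i - q), x}))
    return triples


def is_steiner_triple_system(triples, v):
    """Check that the triples use v points and cover every pair exactly once."""
    points = set().union(*triples)
    pairs = [frozenset(p) for t in triples for p in combinations(t, 2)]
    return (len(points) == v
            and all(len(t) == 3 for t in triples)
            and len(pairs) == len(set(pairs)) == v * (v - 1) // 2)


print(is_steiner_triple_system(sts_1_mod_6(13), 13))
```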
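Example 18.1.4
For v = 15 we have q = 2, so the construction uses the symmetric Latin square of order 5 from the proof of Lemma 18.1.1:
1 4 2 5 3
4 2 5 3 1
2 5 3 1 4
5 3 1 4 2
3 1 4 2 5
The resulting triples are:
{ u1 , u2 , v4 }, { v1 , v2 , w4 }, { w1 , w2 , u4 }, { u1 , u3 , v2 }, { v1 , v3 , w2 }, { w1 , w3 , u2 },
{ u1 , u4 , v5 }, { v1 , v4 , w5 }, { w1 , w4 , u5 }, { u1 , u5 , v3 }, { v1 , v5 , w3 }, { w1 , w5 , u3 },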
{ u2 , u3 , v5 }, { v2 , v3 , w5 }, { w2 , w3 , u5 }, { u2 , u4 , v3 }, { v2 , v4 , w3 }, { w2 , w4 , u3 },
{ u2 , u5 , v1 }, { v2 , v5 , w1 }, { w2 , w5 , u1 }, { u3 , u4 , v1 }, { v3 , v4 , w1 }, { w3 , w4 , u1 },
{ u3 , u5 , v4 }, { v3 , v5 , w4 }, { w3 , w5 , u4 }, { u4 , u5 , v2 }, { v4 , v5 , w2 }, { w4 , w5 , u2 },
{ u1 , v1 , w1 }, { u2 , v2 , w2 }, { u3 , v3 , w3 }, { u4 , v4 , w4 }, { u5 , v5 , w5 }
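The triples in this example can be regenerated from the Latin square. The following sketch is an illustration under our own naming and data representation (vertices as (letter, index) pairs); it follows the v ≡ 3 (mod 6) construction, with one extra triangle on u_i, v_i, w_i for each diagonal position, and then counts how often each pair of points is covered.

```python
from itertools import combinations


def odd_square(n):
    """Symmetric Latin square of odd order n (Lemma 18.1.1), entries 1..n."""
    half = (n + 1) // 2
    first_row = [0] * n
    for i in range(1, half + 1):
        first_row[2 * i - 2] = i
    for j in range(1, half):
        first_row[2 * j - 1] = half + j
    return [[first_row[(r + c) % n] for c in range(n)] for r in range(n)]


def sts_3_mod_6(v):
    """Triples for v = 6q + 3, following the proof's construction."""
    if v < 3 or v % 6 != 3:
        raise ValueError("v must be congruent to 3 mod 6")
    n = v // 3                              # n = 2q + 1
    square = odd_square(n)
    triples = []
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            l = square[i - 1][j - 1]
            triples.append(frozenset({("u", i), ("u", j), ("v", l)}))
            triples.append(frozenset({("v", i), ("v", j), ("w", l)}))
            triples.append(frozenset({("w", i), ("w", j), ("u", l)}))
    for i in range(1, n + 1):               # diagonal entry in position (i, i) is i
        triples.append(frozenset({("u", i), ("v", i), ("w", i)}))
    return triples


triples = sts_3_mod_6(15)
pairs = [frozenset(p) for t in triples for p in combinations(t, 2)]
print(len(triples), len(pairs), len(set(pairs)))
```

If every pair of the 15 points is covered exactly once, the last two printed numbers agree and both equal the number of edges of K_15.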
After solving Woolhouse’s problem, Kirkman noticed that his construction of an STS(15) had a very nice property. He
challenged others to come up with this solution in the following problem that he published in the 1850 edition of the Lady’s
and Gentleman’s Diary:
Fifteen young ladies in a school walk out three abreast for seven days in succession: it is required to arrange them daily so that
no two shall walk twice abreast.
This has become known as Kirkman’s Schoolgirl Problem.
Although this problem begins by requiring an STS(15), it has the additional requirement that it must be possible to partition the
blocks of this design (the rows of young ladies) into seven groups (of five blocks each) so that every point (young lady)
appears exactly once in each group. This extra requirement comes from the fact that each of the young ladies must walk out
every day, and can only be in one row in any given day.
Notice that since each block has 3 points in it, a Kirkman triple system is only possible if v/3 is an integer. Since a Kirkman triple
system is also a Steiner triple system, this means that we must have v ≡ 3 (mod 6).
There are seven non-isomorphic Kirkman triple systems of order 15.
Kirkman triple systems are also known to exist whenever v ≡ 3(mod 6) .
Exercise 18.1.1
1) For v = 37 , give the Latin square you would have to use in order to construct a Steiner triple system using the method
described in the proof of Theorem 18.1.1.
2) For v = 39 , give the Latin square you would have to use in order to construct a Steiner triple system using the method
described in the proof of Theorem 18.1.1.
3) For v = 19 , use the method described in the proof of Theorem 18.1.1 to construct a Steiner triple system.
4) Is the STS(15) constructed in Example 18.1.4 a Kirkman triple system? Explain your answer.
[Hint: Think of Pigeonhole-type arguments.]
5) Find all values of λ for which a triple system on six varieties exists. For each such value of λ , either give a design or explain
how to construct it.
[Hint: Begin by showing that if such a design exists, its parameters are (2r, 6, r, 3, 2r/5). Then determine what values r can take
on. Finally use some results about how to construct designs.]
6) Construct an STS(13) design. Show your work.
7) Let v = 21 .
(a) Write down the Latin square that you would use to construct a Steiner triple system (for this value of v ) using the method
described in the proof of Theorem 18.1.1.
(b) For the resulting Steiner triple system, which triples contain:
(i) both u_1 and u_3?
(iv) both v_5 and w_5?
(v) w_3?
8) Let v = 27 .
(a) Write down the Latin square that you would use to construct a Steiner triple system (for this value of v ) using the method
described in the proof of Theorem 18.1.1.
(b) For the resulting Steiner triple system, which triples contain:
(i) both v_3 and u_8?
(iv) v_2?
This page titled 18.1: Steiner and Kirkman Triple Systems is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
Joy Morris.
18.2: t-Designs
In a BIBD, every pair appears together λ times. In the notation of Woolhouse’s problem, q = 2 . What about larger values of q?
(We’ll still only consider the case where every q-set appears an equal number of times λ , so the design must be balanced, but we
will include the more general situation that λ ≥ 1 .)
A t -(v, k, λ) design is a design on v points with blocks of cardinality k , such that every t -subset of V appears in exactly λ
blocks.
Theorem 18.2.1
In a t-(v, k, λ) design,
\(\binom{k-i}{t-i}\) is a divisor of \(\lambda\binom{v-i}{t-i}\) for every 0 ≤ i ≤ t − 1. (18.2.1)
Proof
We first consider the special case where i = 0. Notice that in each of the b blocks, there are \(\binom{k}{t}\) subsets of cardinality t that
appear in that block. So in the entire design, \(b\binom{k}{t}\) subsets of cardinality t appear.
There exist \(\binom{v}{t}\) subsets of cardinality t from the v points of V, and each appears in λ blocks, so in the entire design,
\(\lambda\binom{v}{t}\) subsets of cardinality t appear. Therefore
\(b\binom{k}{t} = \lambda\binom{v}{t}\).
Similarly, if we fix any set of i varieties, there are \(\binom{v-i}{t-i}\) subsets of cardinality t that include these i varieties. Each such
subset appears in λ blocks. However, for each of the blocks that contains these i elements (the number of these will be our
quotient), we can complete our i-set to a set of cardinality t that lies within this block, in \(\binom{k-i}{t-i}\) ways. Thus, we have
counted any such block \(\binom{k-i}{t-i}\) times in the preceding count. So \(\binom{k-i}{t-i}\) must be a divisor of
\(\lambda\binom{v-i}{t-i}\), as claimed.
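The divisibility conditions of Theorem 18.2.1 are mechanical to check. The sketch below uses our own function names and Python's standard `math.comb`; it reports, for each i, the divisor, the dividend, and whether the condition holds. Passing every condition is only a necessary condition, so a result of `True` does not by itself show that a design exists.

```python
from math import comb


def divisibility_report(t, v, k, lam):
    """For each 0 <= i <= t - 1, return (i, C(k-i, t-i), lam * C(v-i, t-i), holds?)."""
    if not 1 <= t <= k <= v:
        raise ValueError("expected 1 <= t <= k <= v")
    report = []
    for i in range(t):
        divisor = comb(k - i, t - i)
        dividend = lam * comb(v - i, t - i)
        report.append((i, divisor, dividend, dividend % divisor == 0))
    return report


def passes_divisibility_conditions(t, v, k, lam):
    """True when every condition of Theorem 18.2.1 holds for these parameters."""
    return all(holds for _, _, _, holds in divisibility_report(t, v, k, lam))


for line in divisibility_report(3, 10, 4, 1):
    print(line)
print(passes_divisibility_conditions(5, 16, 7, 1))
```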
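Example 18.2.1
Consider the parameters of a 5-(16, 7, 1) design. When i = 0 we have \(b\binom{7}{5} = \binom{16}{5}\), so
21b = 4368. Therefore b = 208. This condition is satisfied.
When i = 1 we have \(\binom{k-i}{t-i} = \binom{6}{4} = 15\), and
\(\lambda\binom{v-i}{t-i} = \binom{15}{4} = \dfrac{15 \cdot 14 \cdot 13 \cdot 12}{4 \cdot 3 \cdot 2} = 15 \cdot 7 \cdot 13\), which is divisible by 15.
This condition is satisfied.
When i = 2 we have \(\binom{k-i}{t-i} = \binom{5}{3} = 10\), and
\(\lambda\binom{v-i}{t-i} = \binom{14}{3} = \dfrac{14 \cdot 13 \cdot 12}{3 \cdot 2} = 14 \cdot 13 \cdot 2\), which is not divisible by 10 since
it is not divisible by 5. Therefore no 5-(16, 7, 1) design exists.
For a 3-(10, 4, 1) design, \(b\binom{4}{3} = \binom{10}{3}\),
so \(4b = \dfrac{10 \cdot 9 \cdot 8}{6} = 120\). Thus, b = 30. Also, since bk = vr, we have 30 ⋅ 4 = 10r, so r = 12.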
18.2.1 https://math.libretexts.org/@go/page/60168
Example 18.2.2
{1, 5, 6, 10}, {1, 2, 8, 9}, {2, 3, 6, 7}, {3, 4, 9, 10}, {4, 5, 7, 8}, {1, 3, 4, 7},
{2, 4, 5, 10}, {1, 3, 5, 8}, {1, 2, 4, 6}, {2, 3, 5, 9}, {4, 6, 8, 9}, {1, 7, 9, 10},
{3, 6, 8, 10}, {5, 6, 7, 9}, {2, 7, 8, 10}, {1, 2, 3, 10}, {1, 2, 5, 7}, {1, 4, 5, 9}, (18.2.2)
{1, 3, 6, 9}, {1, 6, 7, 8}, {1, 4, 8, 10}, {2, 3, 4, 8}, {2, 4, 7, 9}, {2, 5, 6, 8},
{2, 6, 9, 10}, {3, 4, 5, 6}, {3, 5, 7, 10}, {3, 7, 8, 9}, {4, 6, 7, 10}, {5, 8, 9, 10}
Notice: For t ≥ 3, a t-design is also a (t − 1)-design. If every t-set appears in exactly λ blocks, then any (t − 1)-set must
appear in exactly
λ(v − t + 1)/(k − t + 1)
blocks. This is because if we fix a (t − 1) -set, it can be made into a t -set by adding any one of the v − t + 1 other elements of
V . Each of these t -sets appears in λ of the blocks. However, some of these blocks are the same; in fact, we have counted each
block containing this (t − 1) -set once for every other element of the block (since every other element of the block forms a t -
set when put together with the (t − 1) -set). So every block that contains this (t − 1) -set has been counted k − (t − 1) times.
The result follows. (From the above formula we can see that k − t + 1 is a divisor of λ(v − t + 1) ; this is exactly the
condition that Theorem 18.2.1 gives when we take i = t − 1. )
Therefore, since
1(10 − 3 + 1)/(4 − 3 + 1) = 4,
the 3-(10, 4, 1) design that we gave above is also a 2-(10, 4, 4) design. In more generality, a t-(v, k, λ) design with
t > 2 is also a (t − 1)-(v, k, λ(v − t + 1)/(k − t + 1)) design.
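Repeating this count gives the number of blocks containing a fixed s-set for every s below t. The sketch below (with a function name of our own) applies the formula λ(v − s + 1)/(k − s + 1) step by step. As an extension beyond the statement above, it continues down to s = 1 and s = 0, where the same counting argument gives the replication number r and the number of blocks b.

```python
def derived_lambdas(t, v, k, lam):
    """Map s -> number of blocks containing a fixed s-set, for s = t, t - 1, ..., 0."""
    counts = {t: lam}
    for s in range(t, 0, -1):
        numerator = counts[s] * (v - s + 1)
        denominator = k - s + 1
        if numerator % denominator != 0:
            # A non-integer count means no design with these parameters can exist.
            raise ValueError(f"the count for {s - 1}-sets is not an integer")
        counts[s - 1] = numerator // denominator
    return counts


print(derived_lambdas(3, 10, 4, 1))
```

For the 3-(10, 4, 1) parameters, the values for s = 1 and s = 0 can be compared with r = 12 and b = 30 computed earlier.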
Steiner’s name is also used in this more general context, and without the constraint on the block size.
Given k and t , there is an integer v(k, t) such that for every v > v(k, t) that satisfies the conditions:
v∈ Z ; and
for every 0 ≤ i ≤ t − 1,
\(\binom{k-i}{t-i}\) is a divisor of \(\lambda\binom{v-i}{t-i}\), (18.2.3)
a t-(v, k, 1) design exists.
Proof
We will not give a proof of this theorem.
Exercise 18.2.1
1) Substituting t = 2 into the equations of Theorem 18.2.1 doesn’t immediately look like either of the equations in Theorem
17.1.1. Use the equations of Theorem 17.1.1 to deduce that
\(\binom{k}{2}\) is a divisor of \(\lambda\binom{v}{2}\).
2) If v = 15 and λ = 1 , what are all possible values of k and t ≥ 2 for which t -designs might exist? Do not include any trivial
t − (v, t, 1) design, so you may assume v > k > t .
3) Is it possible for a 3 − (16, 6, 1) design to exist? If so, how many blocks will it have? What will the value of r be?
4) Let B be the (7, 3, 1)-design. Define a new design D as follows, on the varieties {1, . . . , 8}. It has 14 blocks, of two types:
(I) the blocks of B but with variety 8 added to each; and
(II) the blocks of the complementary design to B .
Prove that this is a Steiner system with t = 3, k = 4, and v = 8. Use the structure of B and its complement to show that
λ = 1; do not check all \(\binom{8}{3}\) possible 3-subsets of {1, . . . , 8}.
5) Define a design as follows. Label the edges of the complete graph K_6; these will be the varieties of the design. The blocks
Determine the parameters of this t -design (including the highest value of t for which this is a t -design, and justifying each
value you determine), and show that this is a Steiner system.
6) Might a 3-(20, 5, 8) design exist according to the necessary conditions we have determined (Theorem 18.2.1)? State the
formulas that must be satisfied and show your work.
This page titled 18.2: t-Designs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
18.3: Affine Planes
You are probably familiar with at least some of Euclid’s axioms of geometry. The following are amongst Euclid’s axioms (we have
not used the same terms Euclid used in his Elements, but commonly-used statements that are equivalent to Euclid’s):
If you haven’t taken geometry classes in university, you may not know that we can apply these axioms to finite sets of points, and
discover structures that we call finite Euclidean geometries, or more commonly, affine planes. To avoid some trivial situations,
we also require that the structure has at least three points, and that not all of the points lie on a single line.
The following definition probably seems obvious.
Definition: Parallel
We say that two lines are parallel if no point lies on both lines.
We’ve made special note of this definition because in the finite case, “parallel” lines might be drawn in such a way that they don’t
look parallel according to our usual understanding of the term. Since each line has only a finite number of points on it, two lines
L_1 and L_2 are parallel as long as none of the points on L_1 also lies on L_2, even if in a particular drawing it appears that these
lines will meet if we extend them. In the figure below, lines L_1 and L_2 have three points each, and the lines are parallel.
For the purposes of this book, we will only consider finite affine planes, so assume from now on that the set of points is finite. It is
not very obvious, but the parallel postulate (together with the final axiom) ensures that it is not possible to have a line that doesn’t
contain any points, so the set of lines will also be finite.
There is a very nice bijective argument that can be used to show that the number of points on any two lines is equal. Before
presenting this, we show that there cannot be a line that contains only one point.
Proposition 18.3.1
In a finite affine plane, no line contains exactly one point.
Proof
The following diagram may be a helpful visual aid as you read through the proof below.
18.3.1 https://math.libretexts.org/@go/page/60169
Towards a contradiction, suppose that there were a line L that contained only a single point, p. By the final axiom, there are
at least two other points in the finite affine plane, q and r, that do not both lie on a line with p. By the first axiom, there is a
line L′ that contains both p and q (but by our choice of q and r, L′ does not contain r). Now by the parallel postulate, there
is a line M through r that is parallel to L′. Furthermore, there is a unique line through p that is parallel to M. But we know
that M is parallel to L′; in addition, M does not contain p, so it is also parallel to L; this is a contradiction.
We can now show that every line contains the same number of points.
Proposition 18.3.2
In a finite affine plane, if one line contains exactly n points, then every line contains exactly n points.
Proof
Again, we begin with a diagram that may be a helpful visual aid to understanding this proof.
Let L be a line that contains exactly n points, and let L′ be any other line of the finite affine plane. By Proposition 18.3.1,
we know that n > 1, and that L′ also contains at least two points (we observed above that no line can be empty of points).
Since any two points lie together on a unique line, if L and L′ meet, they meet in a single point, so there is a point p of L
that is not in L′, and a point q of L′ that is not in L. By the first axiom, there is a line M that contains the points p and q.
We define a map ψ from points of L to points of L′ as follows. Let ψ(p) = q. For any other point p′ of L (with p′ ≠ p),
by the parallel postulate there is a unique line M′ through p′ that is parallel to M. Since M passes through q and is parallel
to M′, it is the unique line with this property, so in particular, L′ cannot be parallel to M′. Therefore, M′ has a unique
point of intersection, say q′, with L′. Define ψ(p′) = q′. Since M′ and q′ were uniquely determined, the map ψ is well-defined
(that is, there is no ambiguity about which point of L′ is found to be ψ(p′)).
We claim that ψ is a bijection between the points of L and the points of L′; proving this will complete the proof. We first
show that ψ is one-to-one. Suppose that ψ(p_1) = q_1, ψ(p_2) = q_2, and q_1 and q_2 are actually the same point of L′. Then by
the definition of ψ, q_1 is on some line M_1 that is parallel to M, and contains p_1, while q_2 is on some line M_2 that is
parallel to M, and contains p_2. Since q_1 = q_2, the parallel postulate tells us that we must have M_1 = M_2. This line can
only meet L in a single point, so p_1 = p_2.
To show that ψ is onto, let q′′ be any point of L′ with q′′ ≠ q (we already know that q has a pre-image, p). By the parallel
postulate, there is a unique line M′′ through q′′ that is parallel to M. Since M is the unique line through p that is parallel
to M′′, we see that L is not parallel to M′′, so L must meet M′′ at some point that we will call p′′. Now by the definition
of ψ, we have ψ(p′′) = q′′.
We refer to the number of points on each line of a finite affine plane as the order of the plane. We can now figure out how many
points are in a finite affine plane of order n .
Proposition 18.3.3
A finite affine plane of order n has n² points.
Proof
Since the plane has at least three points, not all of which lie on the same line, it has at least two lines L and L′ that intersect
at a point q but are not equal. We know that each of these lines contains n points. By the parallel postulate, for each of the
n − 1 points on L′ that is not on L, there is a line through that point that is parallel to L. Now, L and these n − 1 lines that
are parallel to L each contain n points, and since they are all parallel (it is an exercise, see below, to prove that if M is
parallel to L and N is parallel to L, then N is parallel to M), these points are all distinct. Therefore the plane has at least
n² points.
Consider any point p that is not on L. By the parallel postulate, there must be a line P through p that is parallel to L. Now,
L is the unique line through q that is parallel to P, so in particular, L′ is not parallel to P. Therefore, L′ and P have a
point of intersection, which is one of the n − 1 points of L′ that is not on L. So P was one of the n − 1 lines that we
found in our first paragraph, meaning that p is one of the n² points that we found there. Thus, the plane has exactly n²
points.
You might be wondering by now why we are spending so much time looking at affine planes, when they are a geometric structure.
Despite the fact that they come from geometry, finite affine planes can be thought of as a special kind of design.
Think of the points of a finite affine plane as points of a design, and the lines as blocks, with a point being in a block if it is incident
with (on) the corresponding line. The first axiom for the incidence relation guarantees that every pair of points appear together in
exactly one block, so our design has λ = 1 . By Proposition 18.3.2, we see that an affine plane of order n is uniform, with k = n .
By Proposition 18.3.3, an affine plane of order n has v = n². Although we have not included a proof of this, it can also be shown
Using
\(b\binom{k}{t} = \lambda\binom{v}{t}\) (18.3.1)
from Theorem 18.2.1, we see that in a finite affine plane of order n looked at as a design, we have
\(b\binom{n}{2} = \binom{n^2}{2}\), so
bn(n − 1) = n²(n² − 1). Hence b = n(n + 1). In other words, a finite affine plane of order n has n(n + 1) lines.
Example 18.3.1
A finite affine plane of order 3 has 3² = 9 points. Each line has 3 points, and there are 3(4) = 12 lines. Since each line has
three points, every line lies in a parallel class consisting of three mutually parallel lines, so there are four such parallel classes.
We can choose two of the parallel classes of lines to be “horizontal” and “vertical” lines. The other two classes will be the two
types of diagonal lines. We can’t draw all of these as straight lines, so we have drawn one parallel class of lines as sets of three
points joined by dashes, and the other as sets of three points joined by dots.
The dashes and dots that join sets of points that aren’t in a straight line may not provide a very clear image of what’s going on;
it is probably clearer to think of the diagonal lines as “wrapping around” when they go off the bottom, top, or either side of the
image, and reappearing on the opposite side.
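The “wrapping around” picture can be made concrete with arithmetic modulo 3. The sketch below is our own illustration and rests on an assumption not stated in the text: the points are given coordinates (x, y) with x, y ∈ {0, . . . , p − 1} for a prime p, the lines x = c form one parallel class, and for each slope m the lines y = mx + c (mod p) form another. For p = 3, slope 0 gives one of the straight classes and slopes 1 and 2 give the two kinds of diagonal lines.

```python
from itertools import combinations


def affine_plane(p):
    """Parallel classes of lines on points (x, y), 0 <= x, y < p; p is assumed prime."""
    classes = []
    # One parallel class of lines x = c.
    classes.append([frozenset((c, y) for y in range(p)) for c in range(p)])
    # For each slope m, the parallel class of lines y = m*x + c (mod p).
    for m in range(p):
        classes.append([frozenset((x, (m * x + c) % p) for x in range(p))
                        for c in range(p)])
    return classes


def every_pair_on_exactly_one_line(classes, p):
    """Check the first axiom: each pair of the p*p points lies on exactly one line."""
    lines = [line for cls in classes for line in cls]
    pairs = [frozenset(pair) for line in lines for pair in combinations(line, 2)]
    n_points = p * p
    return len(pairs) == len(set(pairs)) == n_points * (n_points - 1) // 2


classes = affine_plane(3)
print(len(classes), sum(len(cls) for cls in classes))
print(every_pair_on_exactly_one_line(classes, 3))
```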
There is a nice connection between affine planes and mutually orthogonal Latin squares.
Theorem 18.3.1
There is an affine plane of order n > 1 if and only if there are n − 1 mutually orthogonal Latin squares of order n .
Proof
(⇒) An affine plane of order n has n + 1 classes of parallel lines (each class containing n lines). Consider two of these
sets as the horizontal lines and the vertical lines: H_1, . . . , H_n and V_1, . . . , V_n. Label a point as (i, j) if it lies at the
intersection of V_i and H_j. Since each of the n² points lies on a vertical line and on a horizontal line, and since every pair of
points lie together on only one line, this is actually a bijection between {1, . . . , n} × {1, . . . , n} and the points of the affine
plane; that is, this provides n² coordinates that uniquely determine the n² points of the plane.
Consider any one of the remaining parallel classes of n lines, L_1, . . . , L_n. Observe that every point of the affine plane lies
on precisely one of these lines. Create a Latin square from this parallel class by placing k in position (i, j) of the Latin
square if and only if the point (i, j) of the affine plane lies on line L_k. Since every line of L_1, . . . , L_n meets every line of
H_1, . . . , H_n exactly once (by the axioms of an affine plane), each entry will appear exactly once in each row. Similarly,
since every line of L_1, . . . , L_n meets every line of V_1, . . . , V_n exactly once (by the axioms of an affine plane), each entry
will appear exactly once in each column. So we have indeed created a Latin square.
We will now show that the n − 1 Latin squares created by this method (using the n − 1 parallel classes of lines that remain
after excluding the ones we have designated as horizontal and vertical lines) are mutually orthogonal. We’ll do this by
considering two arbitrary Latin squares, L (coming from the lines L_1, . . . , L_n) and M (coming from the lines
M_1, . . . , M_n). In position (i, j), the entry of L being i′ means that line L_{i′} passes through the point (i, j) (which is the
intersection of lines V_i and H_j). Similarly, this entry of M being j′ means that line M_{j′} passes through the point (i, j)
(which is the intersection of lines V_i and H_j). Since the lines L_{i′} and M_{j′} have a unique point of intersection, there cannot
be any other positions in which the entry of L is i′ while the entry of M is j′. Thus, each ordered pair
(i′, j′) ∈ {1, . . . , n} × {1, . . . , n} must appear in exactly one position as the entries of L and M (in that order), and hence
L and M are orthogonal. Since they were arbitrary, we have n − 1 mutually orthogonal Latin squares.
(⇐) The converse of this proof uses the same idea, in the opposite direction. Given n − 1 mutually orthogonal n by n
Latin squares, take the n² coordinate positions to be the points of our affine plane. Define two parallel classes of lines (each
containing n lines) to be the points whose first coordinate is equal (so all of the points with first coordinate 1 form one line,
and all of the points with first coordinate 2 form a second line, etc.), and the points whose second coordinate is equal. Each
of the n − 1 Latin squares determines an additional parallel class of n lines: namely, each line consists of the points for
which the entry of the Latin square has some fixed value. Since n > 1 , there are clearly at least three points that are not all
on the same line. We leave it as an exercise to prove that any two points lie together in a unique line and that the parallel
postulate is satisfied.
Example 18.3.2
Use the formula from the proof of Theorem 16.3.2 to construct 6 MOLS of order 7. Use the construction given in the proof of
Theorem 18.3.1 to construct an affine plane of order 7 from your squares.
Solution
The squares will be:
0 1 2 3 4 5 6 0 1 2 3 4 5 6 (18.3.2)
1 2 3 4 5 6 0 2 3 4 5 6 0 1
2 3 4 5 6 0 1 4 5 6 0 1 2 3
3 4 5 6 0 1 2 6 0 1 2 3 4 5
4 5 6 0 1 2 3 1 2 3 4 5 6 0
5 6 0 1 2 3 4 3 4 5 6 0 1 2
6 0 1 2 3 4 5 5 6 0 1 2 3 4
0 1 2 3 4 5 6 0 1 2 3 4 5 6
3 4 5 6 0 1 2 4 5 6 0 1 2 3
6 0 1 2 3 4 5 1 2 3 4 5 6 0
2 3 4 5 6 0 1 5 6 0 1 2 3 4
5 6 0 1 2 3 4 2 3 4 5 6 0 1
1 2 3 4 5 6 0 6 0 1 2 3 4 5
4 5 6 0 1 2 3 3 4 5 6 0 1 2
0 1 2 3 4 5 6 0 1 2 3 4 5 6
5 6 0 1 2 3 4 6 0 1 2 3 4 5
3 4 5 6 0 1 2 5 6 0 1 2 3 4
1 2 3 4 5 6 0 4 5 6 0 1 2 3
6 0 1 2 3 4 5 3 4 5 6 0 1 2
4 5 6 0 1 2 3 2 3 4 5 6 0 1
2 3 4 5 6 0 1 1 2 3 4 5 6 0
The affine plane will have 7² = 49 points, and we will denote these as ordered pairs (a, b), where a, b ∈ {1, . . . , 7}, and
consider them to represent the 49 positions in a 7 by 7 Latin square. There will be 7(8) = 56 lines, in 8 parallel classes of
seven lines each. Although we could draw the affine plane, you’ve already seen from the affine plane of order 3 that a
black-and-white image all of which is pre-drawn can be more confusing than helpful, so instead we will list each of the 56 lines as a
set of 7 points.
The first parallel class will represent the horizontal rows:
{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1)}, {(1, 2), (2, 2), (3, 2), (4, 2), (5, 2), (6, 2), (7, 2)},
{(1, 3), (2, 3), (3, 3), (4, 3), (5, 3), (6, 3), (7, 3)}, {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (7, 4)},
(18.3.3)
{(1, 5), (2, 5), (3, 5), (4, 5), (5, 5), (6, 5), (7, 5)}, {(1, 6), (2, 6), (3, 6), (4, 6), (5, 6), (6, 6), (7, 6)},
{(1, 7), (2, 7), (3, 7), (4, 7), (5, 7), (6, 7), (7, 7)}
and similarly the second parallel class will represent the vertical rows:
{(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7)}, {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7)},
{(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7)}, {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (4, 7)},
(18.3.4)
{(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7)}, {(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6), (6, 7)},
{(7, 1), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6), (7, 7)}.
The remaining six parallel classes will each represent one of the Latin squares. In the next parallel class, the first line consists
of all of the points where the entry of the first Latin square is 0; the second consists of all the points where the entry is 1, and so
on.
{(1, 1), (7, 2), (6, 3), (5, 4), (4, 5), (3, 6), (2, 7)}, {(2, 1), (1, 2), (7, 3), (6, 4), (5, 5), (4, 6), (3, 7)},
{(3, 1), (2, 2), (1, 3), (7, 4), (6, 5), (5, 6), (4, 7)}, {(4, 1), (3, 2), (2, 3), (1, 4), (7, 5), (6, 6), (5, 7)},
(18.3.5)
{(5, 1), (4, 2), (3, 3), (2, 4), (1, 5), (7, 6), (6, 7)}, {(6, 1), (5, 2), (4, 3), (3, 4), (2, 5), (1, 6), (7, 7)},
{(7, 1), (6, 2), (5, 3), (4, 4), (3, 5), (2, 6), (1, 7)}.
The next parallel class comes from the second Latin square (reading across, so the second square is the second one in the first
line):
{(1, 1), (6, 2), (4, 3), (2, 4), (7, 5), (5, 6), (3, 7)}, {(2, 1), (7, 2), (5, 3), (3, 4), (1, 5), (6, 6), (4, 7)},
{(3, 1), (1, 2), (6, 3), (4, 4), (2, 5), (7, 6), (5, 7)}, {(4, 1), (2, 2), (7, 3), (5, 4), (3, 5), (1, 6), (6, 7)},
(18.3.6)
{(5, 1), (3, 2), (1, 3), (6, 4), (4, 5), (2, 6), (7, 7)}, {(6, 1), (4, 2), (2, 3), (7, 4), (5, 5), (3, 6), (1, 7)},
{(7, 1), (5, 2), (3, 3), (1, 4), (6, 5), (4, 6), (2, 7)}.
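The next parallel class comes from the third Latin square:
{(1, 1), (5, 2), (2, 3), (6, 4), (3, 5), (7, 6), (4, 7)}, {(2, 1), (6, 2), (3, 3), (7, 4), (4, 5), (1, 6), (5, 7)},
{(3, 1), (7, 2), (4, 3), (1, 4), (5, 5), (2, 6), (6, 7)}, {(4, 1), (1, 2), (5, 3), (2, 4), (6, 5), (3, 6), (7, 7)},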
(18.3.7)
{(5, 1), (2, 2), (6, 3), (3, 4), (7, 5), (4, 6), (1, 7)}, {(6, 1), (3, 2), (7, 3), (4, 4), (1, 5), (5, 6), (2, 7)},
{(7, 1), (4, 2), (1, 3), (5, 4), (2, 5), (6, 6), (3, 7)}.
The fourth Latin square gives:
{(1, 1), (4, 2), (7, 3), (3, 4), (6, 5), (2, 6), (5, 7)}, {(2, 1), (5, 2), (1, 3), (4, 4), (7, 5), (3, 6), (6, 7)},
{(3, 1), (6, 2), (2, 3), (5, 4), (1, 5), (4, 6), (7, 7)}, {(4, 1), (7, 2), (3, 3), (6, 4), (2, 5), (5, 6), (1, 7)},
(18.3.8)
{(5, 1), (1, 2), (4, 3), (7, 4), (3, 5), (6, 6), (2, 7)}, {(6, 1), (2, 2), (5, 3), (1, 4), (4, 5), (7, 6), (3, 7)},
{(7, 1), (3, 2), (6, 3), (2, 4), (5, 5), (1, 6), (4, 7)}.
The fifth Latin square gives:
{(1, 1), (3, 2), (5, 3), (7, 4), (2, 5), (4, 6), (6, 7)}, {(2, 1), (4, 2), (6, 3), (1, 4), (3, 5), (5, 6), (7, 7)},
{(3, 1), (5, 2), (7, 3), (2, 4), (4, 5), (6, 6), (1, 7)}, {(4, 1), (6, 2), (1, 3), (3, 4), (5, 5), (7, 6), (2, 7)},
(18.3.9)
{(5, 1), (7, 2), (2, 3), (4, 4), (6, 5), (1, 6), (3, 7)}, {(6, 1), (1, 2), (3, 3), (5, 4), (7, 5), (2, 6), (4, 7)},
{(7, 1), (2, 2), (4, 3), (6, 4), (1, 5), (3, 6), (5, 7)}.
And finally,
{(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)}, {(2, 1), (3, 2), (4, 3), (5, 4), (6, 5), (7, 6), (1, 7)},
{(3, 1), (4, 2), (5, 3), (6, 4), (7, 5), (1, 6), (2, 7)}, {(4, 1), (5, 2), (6, 3), (7, 4), (1, 5), (2, 6), (3, 7)},
(18.3.10)
{(5, 1), (6, 2), (7, 3), (1, 4), (2, 5), (3, 6), (4, 7)}, {(6, 1), (7, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7)},
{(7, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)}.
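The 56 lines above can be regenerated and checked by machine. The sketch below is an illustration under stated assumptions: the six squares are rebuilt from the pattern visible in (18.3.2), where the entry in row i and column j (both counted from 0) of the m-th square is mi + j (mod 7), and a point (a, b) is taken to mean column a and row b, which is the labelling that matches the lists above. The function names are our own.

```python
from itertools import combinations


def squares_mod(p):
    """Squares whose (row i, column j) entry is (m*i + j) mod p, for m = 1, ..., p - 1."""
    return [[[(m * i + j) % p for j in range(p)] for i in range(p)]
            for m in range(1, p)]


def plane_from_squares(squares):
    """Lines of the affine plane built from n - 1 MOLS of order n with entries 0..n-1."""
    n = len(squares[0])
    coords = range(1, n + 1)
    lines = [frozenset((a, b) for a in coords) for b in coords]    # second coordinate fixed
    lines += [frozenset((a, b) for b in coords) for a in coords]   # first coordinate fixed
    for square in squares:
        for value in range(n):
            # Point (a, b) is column a, row b of the square.
            lines.append(frozenset((a, b) for a in coords for b in coords
                                   if square[b - 1][a - 1] == value))
    return lines


lines = plane_from_squares(squares_mod(7))
pairs = [frozenset(pair) for line in lines for pair in combinations(line, 2)]
print(len(lines), len(pairs) == len(set(pairs)) == 49 * 48 // 2)
```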
Every affine plane that we know of has as its order some prime power. We have previously seen (through the connection to
MOLS) that there are affine planes of every prime order. Many design theorists have tried to answer the question of whether or not
the order of an affine plane must always be a prime power, but the answer is not yet known. In fact, it is not currently known
whether or not there is an affine plane of order 12.
Exercise 18.3.1
1) Prove that if L, M , and N are lines of an affine plane, and L is parallel to both M and N , then M is parallel to N .
2) Draw a finite affine plane of order 5. How many lines does it have?
3) How many points, and how many lines are in a finite affine plane of order 19?
4) Prove the omitted details from the proof of Theorem 18.3.1: that is, that the given construction yields a structure that
satisfies the axioms of an affine plane.
5) Draw an affine plane of order 5. Use the construction given in the proof of Theorem 18.3.1 to produce 4 mutually
orthogonal Latin squares of order 5 from your plane.
This page titled 18.3: Affine Planes is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
18.4: Projective Planes
A projective plane is another geometric structure (closely related to affine planes). In a finite projective plane, the set of points (and
therefore the set of lines) must be finite. Like finite affine planes, finite projective planes can be thought of as a special kind of
design.
As in the case of affine planes, the final axiom has been developed to avoid some trivial situations.
Think of the points of a finite projective plane as points of a design, and the lines as blocks, with a point being in a block if it is
incident with the corresponding line. Then the first condition on the incidence relation for a projective plane guarantees that every
pair of points appear together in exactly one block.
Example 18.4.1
The Fano plane is the best-known finite projective plane (and also the smallest). It has 7 points
and 7 lines; in its usual drawing, one of the lines is the circle around the middle.
You have seen this structure already in this course; it is the same as the BIBD(7, 3, 1) that appeared in Example 17.1.1.
The following is a very interesting connection. We will not try to present the proof here, but it is a natural extension of the similar
result that we proved for affine planes.
Theorem 18.4.1
There is a finite projective plane with n + 1 points on each line, if and only if there is a complete set of n − 1 MOLS of order
n.
Exercise 18.4.1
1) Is every design with λ = 1 a projective plane? If not, what condition could fail?
2) Which (if any) of the designs we have seen in this course, are projective planes?
3) From our results on MOLS, for what values can you be sure that a projective plane exists?
4) From our results on MOLS, for what values can you be sure that a projective plane does not exist?
5) What can you determine about the parameters of a design that corresponds to a projective plane?
This page titled 18.4: Projective Planes is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
18.5: Summary
Construction of Steiner Triple Systems
Structure of Affine Planes
Connection between Affine Planes and MOLS
Important Definitions:
Triple System, Steiner Triple System
Resolvable Design, Kirkman Triple System
t -Design
Affine Plane
Projective Plane
Notation:
STS(v)
This page titled 18.5: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
CHAPTER OVERVIEW
This page titled 19: Designs and Codes is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
19.1: Introduction
When information is transmitted, it may get garbled along the way. Error-correcting codes can make it possible for the recipient of
a garbled message to figure out what the sender intended to say.
Assumption 19.1.1
For definiteness, we assume the message to be sent is a string (or “word”) of bits (0s and 1s). (Information stored in a
computer is always converted to such a string, so this is not a serious limitation.)
Example 19.1.1
Perhaps the word 0110 tells an automated factory to close the 2nd and 3rd valves. If we send that message over a wireless
network, interference (or some other issue) might change one of the bits, so the factory receives the message 0010. As a result,
the factory closes only the 3rd valve, and leaves the 2nd valve open. This could have disastrous consequences, so we would
like a way to detect such errors.
Example 19.1.2
One simple solution is to append a check-bit to the end of the message. To do this, we set three rules:
1. We require all messages to have a certain length. (For example, let’s say that all messages must have exactly 5 bits.)
2. We require all messages to have an even number of 1s.
3. We agree that the final bit of the message (called a “parity check-bit”) will not convey any information, but will be used
only to guarantee that Rule 2 is obeyed. (Thus, each message we send will have 4 bits of information, plus the check bit.)
In particular, if we wish to send the message 0110 (which already has an even number of 1s), then we append 0 to the end, and
send the message 01100. If, say, the 2nd bit gets changed in transmission, so the factory receives the message 00100, then the
factory’s computer control can see that this cannot possibly be the intended message, because it has an odd number of 1s. So
the factory can return an error message, asking us to send our instructions again.
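The three rules of this example can be sketched in a few lines of Python. This is only an illustration: the function names `add_parity_bit` and `has_even_parity` are assumptions made here, not notation from the text.

```python
def add_parity_bit(message: str) -> str:
    """Append a check-bit so the resulting word has an even number of 1s."""
    check_bit = message.count("1") % 2
    return message + str(check_bit)


def has_even_parity(word: str) -> bool:
    """Return True if the word obeys Rule 2 (an even number of 1s)."""
    return word.count("1") % 2 == 0


sent = add_parity_bit("0110")
print(sent)                      # 01100
print(has_even_parity(sent))     # True
print(has_even_parity("00100"))  # False: the single-bit error is detected
```

Note that the check only detects that something went wrong; it gives no information about which bit was changed.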
Note
As a real-life example, bar-code scanners used by cashiers employ the above principle: if the check-bit is not correct, then the
scanner does not beep, so the cashier knows that the item needs to be rescanned.
Exercise 19.1.1
Under the rules of Example 19.1.2, which of the following strings are allowed to be sent as a message?
00110, 10101, 00000, 11011 .
It is sometimes not feasible to have a message re-sent (for example, if it spends a long time in transit), so it would be much
better to have a system that enables the recipient to correct transmission errors, not just detect them.
We could agree to send each bit of our message 3 times. For example, if we want to send the message 0110, then we would
transform (or “encode”) it as 000111111000. If, say, the 2nd bit gets garbled, so the factory receives 010111111000, then it
knows there was a problem in the transmission of the first 3 bits, because they are not all the same. Furthermore, since most of
these bits are 0, the factory can figure out that we probably meant to say 000, and correctly decode the entire message as 0110.
The triple-repetition code works, but it is very inefficient, because only one-third of the bits we send are conveying information —
most of the bits are check-bits that were added to correct the possible errors. After developing some theory, we will see some codes
that are able to correct every single-bit error, but use far fewer check bits.
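As a rough sketch of the triple-repetition code described above, the following Python encodes by repeating each bit and decodes each group of three bits by majority vote. The names `encode_triple` and `decode_triple` are chosen here for illustration.

```python
def encode_triple(message: str) -> str:
    """Send each bit of the message three times."""
    return "".join(bit * 3 for bit in message)


def decode_triple(received: str) -> str:
    """Decode each block of three bits by majority vote."""
    blocks = [received[i:i + 3] for i in range(0, len(received), 3)]
    return "".join("1" if block.count("1") >= 2 else "0" for block in blocks)


print(encode_triple("0110"))          # 000111111000
print(decode_triple("010111111000"))  # 0110
```

Majority voting recovers the message as long as no block of three bits contains more than one error.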
Exercise 19.1.2
Let $T_n$ be the set of all ternary sequences of length $n$ (so every entry is 0, 1, or 2). Write a recurrence relation for $c_n$, the
number of code words from $T_n$ that have no 2 consecutive zeros. Use generating functions to solve your recurrence relation.
This page titled 19.1: Introduction is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
19.2: Error-Correcting Codes
In order to be able to correct errors in transmission, we agree to send only strings that are in a certain set C of codewords. (So the
information we wish to send will need to be “encoded” as one of the codewords.) In our above examples, C was
the set of all words of length 5 that have an even number of 1s, or
the set of words of length 12 that consist of four strings of three equal bits.
The set C is called a code. Choosing the code cleverly will enable us to successfully correct transmission errors.
When a transmission is received, the recipient will assume that the sender transmitted the codeword that is “closest” to the string
that was received. For example, if C were the set of 5-letter words in the English language, and the string “fruiz” were received,
then the recipient would assume (presumably correctly), that the sender transmitted the word “fruit,” because that is only off by one
letter. (This is how we ordinarily deal with the typographical errors that we encounter.)
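By the “closest” codeword, we mean the codeword at the smallest distance, in the following sense: the Hamming distance d(x, y)
between two strings x and y of the same length is the number of positions in which they differ.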
Example 19.2.1
For clarity, we underline the bits in the second string that differ from the corresponding bit in the first string:
d(11111, 11101) = 1
d(11100, 01001) = 3
d(10101, 01010) = 5
d(10101, 10101) = 0
Note
When two bits are “transposed” (or “switched”), meaning that a string 01 gets changed to 10 (or vice-versa), this counts as two
bits being different because a 0 is changed to a 1 and a 1 is changed to a 0, even though you might think of the switch as being
only a single operation.
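The distances in Example 19.2.1, and the remark about transposed bits, can be checked with a short Python sketch. The function name `hamming_distance` is an assumption made for this illustration.

```python
def hamming_distance(x: str, y: str) -> int:
    """Count the positions in which two equal-length words differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


print(hamming_distance("11111", "11101"))  # 1
print(hamming_distance("11100", "01001"))  # 3
print(hamming_distance("10101", "01010"))  # 5
print(hamming_distance("01", "10"))        # 2: a transposition counts twice
```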
Exercise 19.2.1
Compute the Hamming distance between the following pairs of words: {110, 011}, {000, 010}, {brats, grass},
{11101, 00111}.
Exercise 19.2.2
Exercise 19.2.3
Prove that the Hamming function satisfies each of the following properties, which define a metric.
Let x, y , and z be words of the same length. Then:
1) d(x, y) ≥ 0 .
2) d(x, y) = 0 ⟺ x = y .
3) d(x, y) = d(y, x).
4) d(x, z) ≤ d(x, y) + d(y, z) .
Note
Part (4) of Exercise 19.2.3 is called the triangle inequality because it says that the length of one side of a triangle is always
less than (or equal to) the sum of the lengths of the other two sides.
Example 19.2.2
Exercise 19.2.4
Making the minimum distance of a code C large is the key to ensuring that it can detect (or correct) large errors. We will make this
idea very explicit in our next two results.
Theorem 19.2.1
A code C can detect all possible errors affecting at most k bits if and only if d(C) > k.
Proof
We will prove the contrapositive:
d(C ) ≤ k ⟺ there exists an error that cannot be detected, and affects only k (or fewer) bits.
(⇐) Suppose there is a situation in which:
a codeword x is sent,
a message y is received,
with only k incorrect bits, and
the receiver does not realize that there were any errors.
Since the receiver did not realize there were any errors, the message that was received must be a codeword. In other words,
y ∈ C . Since there are k errors in the received message, we also know that d(x, y) = k . Since x , y ∈ C , this implies
d(C ) ≤ k .
(⇒) By assumption, there exist x , y ∈ C with x ≠ y , but d(x, y) ≤ k . Now, suppose codeword x is sent. Since
d(x, y) ≤ k , changing k (or fewer) bits can change x to y, so y can be the message that is received, even if errors in
transmission affect only k bits. Since y ∈ C , the recipient does not realize an error was made, and assumes that y was the
intended message. So the k (or fewer) errors were not detected.
Although a minimum distance greater than k allows us to detect errors that affect at most k bits, it isn’t sufficient to allow us to correct all
such errors. For the purposes of correcting errors, we require the minimum distance to be greater than 2k.
Theorem 19.2.2
A code C can correct all possible errors affecting at most k bits if and only if d(C) > 2k.
Proof
We will prove the contrapositive:
d(C) ≤ 2k ⟺ there exists an error that is not properly corrected, and affects only k (or fewer) bits.
(⇐) Suppose there is a situation in which a codeword x is sent, a message y is received with only k (or fewer) incorrect bits,
and the receiver decodes y incorrectly, as some codeword z ≠ x. Then z is at least as close to y as x is, so
d(y, z) ≤ d(x, y) ≤ k. By the triangle inequality, d(x, z) ≤ d(x, y) + d(y, z) ≤ 2k.
So d(C) ≤ 2k (because x, z ∈ C).
(⇒) By assumption, there exist x, y ∈ C with x ≠ y, but d(x, y) ≤ 2k. Let r = ⌈d(x, y)/2⌉ ≤ ⌈2k/2⌉ = k. (In other
words, r is obtained by rounding d(x, y)/2 up to the nearest integer.)
Now suppose codeword x is sent, and errors change exactly r of the d(x, y) bits in which x differs from y. The received
message z then has only r ≤ k incorrect bits, and
d(x, z) = r and d(z, y) = d(x, y) − r ≤ r.
Therefore y is at least as close to z as x is, so the recipient has no way of knowing that x is the message that was sent. So it
was not possible to correct the k (or fewer) errors.
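Theorems 19.2.1 and 19.2.2 translate into a simple computation once the minimum distance d(C) is known. The sketch below assumes that d(C) means the smallest Hamming distance between two distinct codewords; the helper names are hypothetical, and the two codes are the ones from the introductory examples.

```python
from itertools import combinations, product


def hamming_distance(x, y):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """Smallest distance between two distinct codewords (assumed meaning of d(C))."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def max_detectable(code):
    """Largest k with d(C) > k (Theorem 19.2.1)."""
    return minimum_distance(code) - 1


def max_correctable(code):
    """Largest k with d(C) > 2k (Theorem 19.2.2)."""
    return (minimum_distance(code) - 1) // 2


# Words of length 5 with an even number of 1s.
parity_code = ["".join(w) for w in product("01", repeat=5) if w.count("1") % 2 == 0]
# Words of length 12 made of four blocks of three equal bits.
repetition_code = ["".join(b * 3 for b in m) for m in product("01", repeat=4)]

print(minimum_distance(parity_code), max_detectable(parity_code), max_correctable(parity_code))
print(minimum_distance(repetition_code), max_detectable(repetition_code), max_correctable(repetition_code))
```

Comparing all pairs of codewords is only practical for small codes, but it makes the two thresholds concrete.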
Exercise 19.2.5
Exercise 19.2.6
Let $B^n$ represent the set of binary strings of length $n$. Prove that a code from $B^{10}$ that has more than 2 words cannot correct 3
errors. Hypothesize a generalisation of this result to codes on $B^n$ with more than 2 words.
This page titled 19.2: Error-Correcting Codes is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Joy Morris.
19.3: Using the Generator Matrix For Encoding
Notation
It is convenient to represent the binary string $x_1 x_2 \ldots x_n$ as a column vector:
\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. \tag{19.3.1} \]
Example 19.3.1
Appending a parity check-bit to the string 010 yields 0101. The same result can be obtained by multiplying the column vector
corresponding to 010 by the following generator matrix:
\[ G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} I_3 \\ A \end{bmatrix} \]
In fact, multiplying any 3-bit string by G yields the same string with its parity check-bit appended.
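\[ G \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_1 + x_2 + x_3 \end{bmatrix}, \quad \text{and} \quad x_1 + x_2 + x_3 \pmod 2 = \begin{cases} 0 & \text{if even \# of 1s} \\ 1 & \text{if odd \# of 1s.} \end{cases} \]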
General Method
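For some $k, r \in \mathbb{N}^+$, choose an $r \times k$ matrix $A$ of 0s and 1s, and let $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$, where $I_k$ is the $k \times k$ identity matrix and all arithmetic is done mod 2.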
Multiplying a k -bit string by G yields the same string, with r check bits appended at the end. We let C be the set of all possible
strings Gx, and we call G the generator matrix of this code.
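A minimal sketch of encoding with a generator matrix, assuming (as in Example 19.3.1) that G is the identity matrix stacked on top of a matrix A of check-bit rows and that all arithmetic is mod 2. The function names are illustrative, and matrices are represented as plain lists of rows.

```python
def generator_matrix(A):
    """Stack the k x k identity matrix on top of the r x k matrix A."""
    k = len(A[0])
    identity = [[1 if i == j else 0 for j in range(k)] for i in range(k)]
    return identity + [list(row) for row in A]


def encode(G, message: str) -> str:
    """Compute Gx (mod 2), where x is the message written as a column vector."""
    bits = [int(b) for b in message]
    return "".join(str(sum(g * m for g, m in zip(row, bits)) % 2) for row in G)


G = generator_matrix([[1, 1, 1]])  # the single parity check-bit of Example 19.3.1
print(encode(G, "010"))            # 0101
```

Because the top k rows of G form the identity, the first k bits of every codeword are the message itself, and the remaining r bits are the check bits.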
Note
In the next section, we will see how to choose G so that the resulting code C can correct errors.
Although many important error-correcting codes are constructed by other methods, we will only discuss the ones that come from
generator matrices (except in Section 19.5).
Definition: Binary Linear Code
Any code that comes from a generator matrix G (by the General Method described above) is said to be a binary linear code.
Example 19.3.2
Find all the codewords of the binary linear code C corresponding to the generator matrix
\[ G = \begin{bmatrix} I_3 \\ A \end{bmatrix}, \quad \text{with } A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}. \]
Solution
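We have
\[ G = \begin{bmatrix} I_3 \\ A \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}. \]
Multiplying G (mod 2) by each of the 8 possible 3-bit messages 000, 001, 010, 011, 100, 101, 110, 111 gives the elements of C:
\[ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}. \tag{19.3.2} \]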
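The codewords of this example can also be listed mechanically. The following sketch assumes the matrix A of Example 19.3.2 and appends the check bits Am (mod 2) to each 3-bit message m; the function name `encode` is an illustrative choice.

```python
from itertools import product

A = [[1, 0, 1],
     [0, 1, 1]]  # the matrix A of Example 19.3.2


def encode(message):
    """Return the codeword: the message bits followed by the check bits Am (mod 2)."""
    check = [sum(a * m for a, m in zip(row, message)) % 2 for row in A]
    return "".join(str(b) for b in list(message) + check)


codewords = [encode(m) for m in product((0, 1), repeat=3)]
print(codewords)
# ['00000', '00111', '01001', '01110', '10010', '10101', '11011', '11100']
```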
Exercise 19.3.1
Encode each of the given words by using the generator matrix $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$ associated to the given matrix $A$.
1) $A = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$. Words to encode: 0101, 0010, 1110.
2) $A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$. Words to encode: 110, 011, 111, 000.
The generator matrix provides an easy way to encode messages for sending, but it is hard to use it to decode a message that has
been received. For that, the next section will introduce a slightly different matrix. From this new matrix, it will be easy to
determine whether the corresponding code can correct every single-bit error.
This page titled 19.3: Using the Generator Matrix For Encoding is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
19.4: Using the Parity-Check Matrix For Decoding
Notation
A binary linear code is of type (n, k) (or we say C is an (n, k) code) if its generator matrix $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$ is an n × k matrix. In
other words, G encodes messages of length k as codewords of length n, which means that the number of check bits is n − k.
We usually use r to denote the number of check bits, so r = n − k. Then A is an r × k matrix.
Exercise 19.4.1
How many codewords are there in a binary linear code of type (n, k)?
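If $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$ is the generator matrix of a binary linear code C, and A is an r × k matrix, then the
parity-check matrix of C is
\[ P = [A \;\; I_r]. \tag{19.4.1} \]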
Example 19.4.1
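1) For the code C of Example 19.3.2, the matrix A is 2 × 3, so r = 2. Therefore, the parity-check matrix of C is
\[ P = [A \;\; I_r] = [A \;\; I_2] = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \end{bmatrix}. \]
2) For a single parity check-bit, as in Example 19.3.1, we have A = [1 1 1]. This is a 1 × 3 matrix, so r = 1. Therefore, the
parity-check matrix of the code is
\[ P = [A \;\; I_r] = [A \;\; I_1] = [1 \;\; 1 \;\; 1 \;\; 1] \]
(since $I_1 = [1]$).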
Exercise 19.4.2
⎢ 0 1 0 0⎥
⎢ ⎥
⎢ 0 0 1 0⎥
⎢ ⎥
⎢ ⎥
⎢ 0 0 0 1⎥
⎢ ⎥
⎢ ⎥
⎢ 1 0 1 1⎥
⎢ ⎥
⎢ 1 0 0 1⎥
⎣ ⎦
0 1 1 1
The parity-check matrix can be used to check whether a message we received is a valid codeword:
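Proposition 19.4.1
Let C be a binary linear code with parity-check matrix P. A binary string x is a codeword of C if and only if Px ≡ 0 (mod 2).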
Proof
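(⇒) Since x is a codeword, we have x = Gm for some (k-bit) message m. This means that
\[ x = Gm = \begin{bmatrix} I_k \\ A \end{bmatrix} m = \begin{bmatrix} m \\ Am \end{bmatrix}. \tag{19.4.2} \]
Then
\[ Px = [A \;\; I_r] \begin{bmatrix} m \\ Am \end{bmatrix} = [Am + Am] = [2Am] \equiv 0 \pmod 2. \tag{19.4.3} \]
(⇐) Suppose Px = 0. Write $x = \begin{bmatrix} m \\ y \end{bmatrix}$, where m is the first k rows of x and y is the remaining r = n − k rows. Then
\[ 0 = Px = [A \;\; I_r] \begin{bmatrix} m \\ y \end{bmatrix} = [Am + y]. \tag{19.4.4} \]
This means y = −Am = Am (mod 2), so
\[ x = \begin{bmatrix} m \\ y \end{bmatrix} = \begin{bmatrix} m \\ Am \end{bmatrix} = \begin{bmatrix} I_k \\ A \end{bmatrix} m = Gm, \tag{19.4.5} \]
so x ∈ C.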
Example 19.4.3
Here is a simple illustration of Proposition 19.4.1. For the code in which every codeword is required to have an even number of
1s, Example 19.4.1(2) tells us that the parity-check matrix is P = [1 1 1 1]. Hence, for any 4-bit string $x_1 x_2 x_3 x_4$, we have
\[ Px = [1 \;\; 1 \;\; 1 \;\; 1] \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = [x_1 + x_2 + x_3 + x_4]. \]
This is 0(mod 2) if and only if there are an even number of 1s in x, which is what it means to say that x is a codeword.
Example 19.4.4
Use the parity-check matrix to determine whether each of these words is in the code C of Example 19.3.2:
11111, 10101, 00000, 11010.
Solution
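From Example 19.4.1(1), we know that the parity-check matrix of this code is
\[ P = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \end{bmatrix}. \]
We have:
\[ P \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 1 \\ 0 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \text{ so 11111 is not a codeword.} \]
\[ P \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 0 \cdot 0 + 1 \cdot 1 + 1 \cdot 0 + 0 \cdot 1 \\ 0 \cdot 1 + 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 0 + 1 \cdot 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \text{ so 10101 is a codeword.} \]
\[ P \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \cdot 0 + 0 \cdot 0 + 1 \cdot 0 + 1 \cdot 0 + 0 \cdot 0 \\ 0 \cdot 0 + 1 \cdot 0 + 1 \cdot 0 + 0 \cdot 0 + 1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \text{ so 00000 is a codeword.} \]
\[ P \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 0 \\ 0 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 + 0 \cdot 1 + 1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \text{ so 11010 is not a codeword.} \]
(These answers can be verified by looking at the list of the elements of C in the solution of Example 19.3.2.)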
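The four checks in this example can be reproduced with a short Python sketch of Proposition 19.4.1. The names `syndrome` and `is_codeword` are assumptions made for illustration; the text itself does not name the product Px.

```python
P = [[1, 0, 1, 1, 0],
     [0, 1, 1, 0, 1]]  # parity-check matrix from Example 19.4.1(1)


def syndrome(P, word: str):
    """Compute Px (mod 2) for the column vector x given as a bit string."""
    bits = [int(b) for b in word]
    return [sum(p * b for p, b in zip(row, bits)) % 2 for row in P]


def is_codeword(P, word: str) -> bool:
    """A word is a codeword exactly when Px is the zero vector."""
    return not any(syndrome(P, word))


for word in ["11111", "10101", "00000", "11010"]:
    print(word, syndrome(P, word), is_codeword(P, word))
```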
It is evident from the parity-check matrix whether a code corrects every single-bit error:
Theorem 19.4.1
A binary linear code C can correct every single-bit error if and only if the columns of its parity-check matrix are all distinct and
nonzero.
Proof
Suppose a codeword x is transmitted, but the ith bit gets changed, so a different string y is received. Let $e_i$ be the string that
is all 0s, except that the ith bit is 1, so $y = x + e_i$. Then
\[ Py = P(x + e_i) = Px + Pe_i = 0 + Pe_i = Pe_i \tag{19.4.6} \]
is the ith column of P.
Therefore, if all the columns of P are nonzero, then Py is nonzero, so the receiver can detect that there was an error. If, in
addition, all of the columns of P are distinct, then Py is equal to the ith column of P, and not equal to any other column,
so the receiver can conclude that the error is in the ith bit. Changing this bit corrects the error.
Conversely, if either the ith column of P is zero, or the ith column is equal to the jth column, then either $Pe_i = 0$ or
$Pe_i = Pe_j$. Therefore, when the codeword 00...0 is sent, and an error changes the ith bit, resulting in the message $e_i$
being received, either $Pe_i = 0$, so the receiver does not detect the error (and erroneously concludes that the message $e_i$ is
what was sent), or cannot tell whether the error is in the ith bit (and message 0 was sent) or the error is in the jth bit (and
message $e_i + e_j$ was sent). In either case, this is a single-bit error that cannot be corrected.
Exercise 19.4.3
The proof of Theorem 19.4.1 shows how to correct any single-bit error (when it is possible):
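General Method.
Assume the word y has been received. Calculate Py.
If Py = 0, then y is a codeword. Assume there were no errors, so y is the codeword that was sent.
Now suppose Py ≠ 0.
If Py is equal to the ith column of P, then let $x = y + e_i$. (In other words, create x by changing the ith bit of y from 0 to 1
or vice versa.) Then x is a codeword; assume it is the codeword that was sent.
If Py is not equal to any of the columns of P, then at least two of the bits of y are wrong, and the error cannot be corrected by this method.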
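The decoding procedure above can be sketched as follows. The matrix P used here is an illustrative 3 × 6 parity-check matrix of the form [A I_3] with distinct nonzero columns; it is an assumption chosen for this sketch rather than a matrix quoted from the text, and the function names are hypothetical.

```python
def syndrome(P, word: str):
    """Compute Py (mod 2) for the received word y."""
    bits = [int(b) for b in word]
    return [sum(p * b for p, b in zip(row, bits)) % 2 for row in P]


def decode(P, received: str):
    """Return the corrected codeword, or None if Py matches no column of P."""
    s = syndrome(P, received)
    if not any(s):
        return received  # Py = 0: assume there were no errors
    columns = [[row[i] for row in P] for i in range(len(P[0]))]
    if s in columns:
        i = columns.index(s)  # columns are assumed distinct, so i is unique
        flipped = "1" if received[i] == "0" else "0"
        return received[:i] + flipped + received[i + 1:]
    return None  # at least two bits are wrong


# Illustrative parity-check matrix (an assumption for this sketch).
P = [[1, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]

print(decode(P, "111000"))  # 111100 (the 4th bit is corrected)
print(decode(P, "101001"))  # 101001 (already a codeword)
print(decode(P, "001101"))  # None (cannot be decoded)
```

Returning `None` in the last case is a design choice of this sketch; a real system might instead request retransmission.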
Example 19.4.5
Solution
Let P be the given parity-check matrix. Then:
\[ P \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
This is the 4th column of P, so changing the 4th bit corrects the error. This means that the received word 111000 decodes as 111100.
\[ P \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]
This is 0, so there is no error. This means that the received word 101001 decodes as 101001.
\[ P \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. \]
This is not any of the columns of P, so there are at least two errors. Therefore, we cannot decode the received word 001101.
Exercise 19.4.4
⎢1 0 0 1 0 1 0 0⎥
P =⎢ ⎥.
⎢0 1 0 1 0 0 1 0⎥
⎣ ⎦
0 1 1 1 0 0 0 1
(a) Decode each of the following received words: 10001111, 11110000, 01111101.
(b) Find the generator matrix of the code.
2) The parity check matrix of a certain binary linear code is
\[ P = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}. \]
Example 19.4.6
Let
\[ A = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \end{bmatrix}. \]
This is a 4 × 11 matrix whose columns list all of the binary vectors of length 4 that have at least two 1s. The corresponding
4 × 15 parity-check matrix $P = [A \;\; I_4]$ lists all $2^4 − 1 = 15$ nonzero binary vectors of length 4 (without repetition), so by
Theorem 19.4.1 the corresponding code can correct all single-bit errors. It encodes an 11-bit message as a 15-bit codeword, adding only
15 − 11 = 4 check bits. This is much more efficient than the triple-repetition code of Example 19.1.3, which would have to add
22 check bits to correct every single-bit error in an 11-bit message.
Note
Generalizing Example 19.4.6, a binary linear code is called a Hamming code if the columns of its parity-check matrix
$P = [A \;\; I_r]$ are a list of all the $2^r − 1$ nonzero binary vectors of length r (in some order, and without repetition). Every
Hamming code can correct all single-bit errors. Because of their high efficiency, Hamming codes are often used in real-world
applications. But they only correct single-bit errors, so other binary linear codes (which we will not discuss) need to be used in
situations where it is likely that more than one bit is wrong.
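One way to build the parity-check matrix of a Hamming code for a given number r of check bits is sketched below. The function name `hamming_parity_check` and the particular column order are assumptions made here; any order works, as long as the last r columns form the identity matrix.

```python
from itertools import product


def hamming_parity_check(r: int):
    """Return P = [A I_r] whose columns are all nonzero binary vectors of length r."""
    nonzero = [v for v in product((0, 1), repeat=r) if any(v)]
    heavy = [v for v in nonzero if sum(v) >= 2]  # columns of A
    unit = [tuple(1 if i == j else 0 for i in range(r)) for j in range(r)]  # columns of I_r
    columns = heavy + unit
    return [[col[i] for col in columns] for i in range(r)]  # r rows, 2^r - 1 columns


P = hamming_parity_check(4)
n = len(P[0])
k = n - len(P)
print(n, k)  # 15 11

# Theorem 19.4.1: all columns must be distinct and nonzero.
columns = list(zip(*P))
print(len(set(columns)) == n and all(any(c) for c in columns))  # True
```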
Exercise 19.4.5
1) Explain how to make a binary linear code of type (29, 24) that corrects all single-bit errors.
2) Explain why it is impossible to find a binary linear code of type (29, 25) that corrects all single-bit errors.
3) For each k ≤ 20 , find the smallest possible number r of check bits in a binary linear code that will let you send k -bit
messages and correct all single-bit errors. (That is, for each k , we want a code of type (n, k) that corrects all single-bit errors,
and we want r = n − k to be as small as possible.)
4) What is the smallest possible number r of check bits in a binary linear code that will let you send 100 -bit messages and
correct all single-bit errors?
This page titled 19.4: Using the Parity-Check Matrix For Decoding is shared under a CC BY-NC-SA license and was authored, remixed, and/or
curated by Joy Morris.
19.5: Codes From Designs
An error-correcting code can be constructed from any design BIBD(v, k, λ) for which λ = 1 . Namely, from each block of the
design, create a binary string of length v, by placing a 1 in each of the positions that correspond to points in the block, and 0s
everywhere else. (However, this will not usually have a generator matrix, so it is not a binary linear code.)
Example 19.5.1
For the BIBD(7, 3, 1) that has arisen in previous examples, with blocks
{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
the corresponding code is
C = {1110000, 1001100, 1000011, 0101010, 0100101, 0011001, 0010110}.
Proposition 19.5.1
A code constructed (as described above) from a BIBD(v, k, 1) can correct up to k − 2 errors.
Proof
Let B and B′ be blocks of the design, and let b, b′ be the corresponding binary strings of length v, as described at the start
of this section. If the blocks have no points in common, then d(b, b′) = 2k. If the blocks have 1 point in common, then
d(b, b′) = 2(k − 1)
(the strings differ in the k − 1 positions corresponding to points that are in B but not in B′, and in the k − 1 positions
corresponding to points that are in B′ but not in B). Since λ = 1, the blocks cannot have more than one point in common.
So in any case,
d(b, b′) ≥ 2(k − 1).
Since b and b′ were arbitrary words of the code (because B and B′ were arbitrary blocks), this means that
d(C) ≥ 2(k − 1). This is greater than 2(k − 2), so Theorem 19.2.2 tells us that the code can correct any k − 2 errors.
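The bound in this proof can be checked on the BIBD(7, 3, 1) of Example 19.5.1. The sketch below converts each block to a binary string of length v = 7 and computes the minimum distance directly; the helper names are illustrative assumptions.

```python
from itertools import combinations

blocks = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
v, k = 7, 3


def block_to_word(block, v):
    """Place a 1 in each position corresponding to a point of the block."""
    return "".join("1" if point in block else "0" for point in range(1, v + 1))


def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))


code = [block_to_word(b, v) for b in blocks]
d = min(hamming_distance(x, y) for x, y in combinations(code, 2))

print(code)
print(d, 2 * (k - 1))  # 4 4
print((d - 1) // 2)    # 1 error can be corrected, which equals k - 2
```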
Exercise 19.5.1
1. Suppose you use a BIBD to create a code whose words have length 10 and that is 4-error-correcting. How many words will your code
have?
2. How many errors can be corrected by a code that comes from a BIBD(21, 4, 1)?
3. Recall the 2-(8, 4, 3) design given in Exercise 17.2.1(2). It is possible to show that this is also a 3-(8, 4, 1) design; for the
purposes of this problem, you may assume that this is true. If we convert these blocks to binary strings to form code words
for a code, how many errors can this code correct?
19.5: Codes From Designs is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by LibreTexts.
19.6: Summary
Important Definitions:
Binary String
Code, Codeword
Hamming Distance
Minimum Distance of a Code
Detect Errors, Correct Errors
Encode, Decode
Generator Matrix
Parity-Check Matrix
Binary Linear Code of type (n, k)
Hamming Code
Notation:
d(x, y)
d(C)
$G = \begin{bmatrix} I_k \\ A \end{bmatrix}$
$P = [A \;\; I_r]$
19.6: Summary is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by LibreTexts.
Index
A
affine planes
    18.3: Affine Planes
B
Bell numbers
    9.3: Bell Numbers and Exponential Generating Functions
binomial theorem
    7.2: The Generalized Binomial Theorem
C
Catalan numbers
    9.2: Catalan Numbers
closed polygon
    13.2: Hamilton Paths and Cycles
closed walk
    13.1: Euler Tours and Trails
closure
    13.2: Hamilton Paths and Cycles
coding theory
    1.5: Coding Theory
complete graph
    11.3: Deletion, Complete Graphs, and the Handshaking Lemma
cycles
    12.3: Paths and Cycles
D
deletion
    11.3: Deletion, Complete Graphs, and the Handshaking Lemma
design theory
    1.4: Design Theory
Dirac’s theorem
    13.2: Hamilton Paths and Cycles
E
Edge coloring
    14.1: Edge Coloring
Euler tour
    13.1: Euler Tours and Trails
Euler trail
    13.1: Euler Tours and Trails
Euler's handshaking lemma
    11.3: Deletion, Complete Graphs, and the Handshaking Lemma
Exponential Generating Functions
    9.3: Bell Numbers and Exponential Generating Functions
F
factorial
    3.1: Permutations
forest
    12.4: Trees
G
generalized binomial theorem
    7.2: The Generalized Binomial Theorem
graph
    11.2: Basic Definitions, Terminology, and Notation
H
Hamilton cycle
    13.2: Hamilton Paths and Cycles
Hamilton paths
    13.2: Hamilton Paths and Cycles
handshaking lemma
    11.3: Deletion, Complete Graphs, and the Handshaking Lemma
I
icosian game
    13.2: Hamilton Paths and Cycles
isomorphisms
    11.4: Graph Isomorphisms
L
latin squares
    16.1: Latin Squares and Sudokus
leaf
    12.4: Trees
M
maximum valency
    13.2: Hamilton Paths and Cycles
minimum valency
    13.2: Hamilton Paths and Cycles
multiedge
    11.2: Basic Definitions, Terminology, and Notation
multigraph
    11.2: Basic Definitions, Terminology, and Notation
P
path
    12.3: Paths and Cycles
permutation
    3.1: Permutations
Pigeonhole Principle
    10.1: The Pigeonhole Principle
R
Ramsey theory
    1.3: Ramsey Theory
    14.2: Ramsey Theory
S
subgraph
    11.3: Deletion, Complete Graphs, and the Handshaking Lemma
sudokus
    16.1: Latin Squares and Sudokus
T
tour
    13.1: Euler Tours and Trails
trail
    13.1: Euler Tours and Trails
trees
    12.4: Trees
List of Notation
$n!$, Chapter 3.1
$\binom{n}{r}$, Chapter 3.2
$\left(\!\binom{n}{r}\!\right)$, Chapter 5.1
$\binom{n}{r_1, \ldots, r_m}$, Chapter 5.2
$uv$, Chapter 11.2
$G \setminus \{e\}$, Chapter 11.3
$K_n$, Chapter 11.3
$\varphi : G_1 \to G_2$, Chapter 11.4
$G_1 \cong G_2$, Chapter 11.4
$\chi'(G)$, Chapter 14.1
$F(G)$, Chapter 15.1
$F$, Chapter 15.1
$G^*$, Chapter 15.1
MOLS, Chapter 16.2
$\lambda$, Chapter 17.1
Appendix A: Solutions To Selected Exercises
For the reader’s convenience, solutions below are given with full work shown as well as a final numerical solution. Typically the
final numerical solution would not be expected, but makes it easier to verify an answer that has been reached using a different
method.
2) In Candyce’s book, the reader will have 3 choices at the first decision point, and 2 choices at each of the following three decision
points. Thus, there are a total of $3 \cdot 2 \cdot 2 \cdot 2 = 3 \cdot 2^3 = 24$ possible storylines. Candyce must write 24 endings.
2) If Maple is thinking of a letter, there are 26 things she could be thinking of. If she is thinking of a digit, there are 10 things she
could be thinking of. In total, there are 10 + 26 = 36 things she could be thinking of.
positions). Similarly, if there are 9 characters, then there are $10^9$ passwords consisting entirely of digits. In total, there are
2) Use only the product rule. There are 6 outcomes from the red die, and for each of these, there are 6 outcomes from the yellow
die, for a total of 6 ⋅ 6 = 36 outcomes.
whole, so the number of sets of three cards that are not all spades is $\binom{52}{3} - \binom{13}{3} = 22100 - 286 = 21814$.
3) The leading digit cannot be a zero, so if there are to be exactly two zeroes, we have 4 possible positions in which they can be
placed. Thus, there are $\binom{4}{2}$ ways of choosing where to place the two zeroes. In each of the remaining three positions, we can place
any of the digits 1 through 9, so there are $9^3$ choices for the remaining digits. Thus, there are $\binom{4}{2} 9^3 = 4374$ 5-digit numbers that
contain exactly two zeroes.
$\left(\sum_{r=0}^{5} \binom{5}{r} a^r b^{5-r}\right) \left(\sum_{s=0}^{6} \binom{6}{s} c^s d^{6-s}\right)$.
$= \binom{5}{2}\binom{6}{2} a^2 b^3 c^2 d^4$.
Thus, the coefficient of $a^2 b^3 c^2 d^4$ is $\binom{5}{2}\binom{6}{2} = 10 \cdot 15 = 150$.
3
3 2 4
3
3
(b )
2
. Thus, the coefficient of a b is ( ) + ( ) = 10 + 4 = 14 .
1 3 2 5
3
4
structures from the set {1, . . . , n}: there are 3 choices for each of the n entries in the ternary string.
3) We identify each of the ten Olympic contenders with a crib, and each of the three dolls with one of the three medals. If the doll
corresponding to the gold medal goes into crib i, this corresponds to the competitor corresponding to crib i winning the gold medal.
Similarly, if the doll corresponding to the silver medal goes into crib j , this is equivalent to the contender corresponding to crib j
winning the silver medal; and the doll corresponding to bronze going into crib k is equivalent to the contender corresponding to
crib k winning the bronze medal.
Counting Method 1
From the n dogs, we first choose the r who will enter the competition. This can be done in $\binom{n}{r}$ ways. For each of these ways, we
can choose k of the r competitors to become finalists in $\binom{r}{k}$ ways. Thus, there are a total of $\binom{n}{r}\binom{r}{k}$ ways to choose the dogs.
Counting Method 2
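From the n dogs, choose k who will be the finalists. This can be done in $\binom{n}{k}$ ways. For each of these ways, we can look at the
remaining n − k dogs and choose r − k to be the competitors who will not be finalists, in $\binom{n-k}{r-k}$ ways. Thus, there are a total of
$\binom{n}{k}\binom{n-k}{r-k}$ ways to choose the dogs.
Since both of these solutions count the answer to the same problem, the answers must be equal,
so we have $\binom{n}{r}\binom{r}{k} = \binom{n}{k}\binom{n-k}{r-k}$.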
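A combinatorial proof establishes this identity for all values at once, but a quick numerical spot-check can catch transcription slips. The sketch below uses Python's `math.comb`; the function name `subset_identity_holds` is an illustrative choice.

```python
from math import comb


def subset_identity_holds(n: int, r: int, k: int) -> bool:
    """Check C(n, r) * C(r, k) == C(n, k) * C(n - k, r - k) for 0 <= k <= r <= n."""
    return comb(n, r) * comb(r, k) == comb(n, k) * comb(n - k, r - k)


print(all(subset_identity_holds(n, r, k)
          for n in range(12)
          for r in range(n + 1)
          for k in range(r + 1)))  # True
```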
3) COMBINATORIAL PROOF. We will count the number of ways to choose a random sample of $n$ people from a class of $n$ men and $n$ women.
Counting Method 1
From the $2n$ total people, choose $n$ of them for the random sample. This can be done in $\binom{2n}{n}$ ways.
Counting Method 2
Let $r$ represent the number of men who will be in the sample. Notice that $r$ may have any value from 0 up to $n$. We divide the problem into these $n+1$ cases, and take the sum of all of the answers. In each case, we can choose the $r$ men for the sample from the $n$ men, in $\binom{n}{r}$ ways. For each of these ways, from the $n$ women, we choose $r$ who will not be part of the sample (so the remaining $n-r$ will be in the sample, for a total of $r + n - r = n$ people in the sample). There are $\binom{n}{r}$ ways to do this. Thus the total number of ways of choosing $r$ men and $n-r$ women for the sample is $\binom{n}{r}^2$. Adding up the solutions for all of the cases, we obtain a final answer of $\sum_{r=0}^{n}\binom{n}{r}^2$.
Since both of these solutions count the answer to the same problem, the answers must be equal, so we have
$$\sum_{r=0}^{n}\binom{n}{r}^{2} = \binom{2n}{n}.$$
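The identity obtained from the two counting methods can be spot-checked numerically. This is a small sketch, with a hypothetical helper name, that sums the squared binomial coefficients and compares against the central binomial coefficient.

```python
from math import comb


def sum_of_squares(n):
    # Counting Method 2: add up C(n, r)^2 over the number r of men chosen.
    return sum(comb(n, r) ** 2 for r in range(n + 1))


if __name__ == "__main__":
    for n in range(8):
        # Counting Method 1 gives C(2n, n) directly.
        print(n, sum_of_squares(n), comb(2 * n, n))
```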
choice, there are $\binom{k}{r}$ ways of choosing the $r$ books to display from the books that are kept. Thus, there are a total of
$$\sum_{k=r}^{n}\binom{n}{k}\binom{k}{r}$$
solutions to this problem.
teacher). There are $\left(\!\!\binom{6}{3}\!\!\right)$ ways for Kim to choose his other three prizes; $\left(\!\!\binom{6}{2}\!\!\right)$ ways for Jordan to choose his other two prizes, and $\left(\!\!\binom{6}{5}\!\!\right)$ ways for Finn to choose his other five prizes. Thus, the total number of ways for the prizes (including teacher gifts) to be chosen is
$$6^3\left(\!\!\binom{6}{3}\!\!\right)\left(\!\!\binom{6}{2}\!\!\right)\left(\!\!\binom{6}{5}\!\!\right) = 6^3\binom{6+3-1}{3}\binom{6+2-1}{2}\binom{6+5-1}{5} = 6^3\binom{8}{3}\binom{7}{2}\binom{10}{5} = 6^3 \cdot 56 \cdot 21 \cdot 252 = 64{,}012{,}032.$$
3) Since the judges must choose at least one project from each age group, this is equivalent to a problem in which they are choosing only six projects to advance, with no restrictions on how they choose them. They can choose six projects from three categories in
$$\left(\!\!\binom{3}{6}\!\!\right) = \binom{3+6-1}{6} = \binom{8}{6} = 28$$
ways.
k
.
))
Counting Method 2
We divide our count into two cases, according to whether or not we choose any orders of mac and cheese. If we do not choose any mac and cheese, then we must choose our $k$ items from the other $n-1$ entries on the menu. We can do this in $\left(\!\!\binom{n-1}{k}\!\!\right)$ ways. If we do choose at least one order of mac and cheese, then we must choose the other $k-1$ items from amongst the $n$ entries on the menu (with mac and cheese still being an option for additional choices). We can do this in $\left(\!\!\binom{n}{k-1}\!\!\right)$ ways. By the sum rule, the total number of ways is $\left(\!\!\binom{n-1}{k}\!\!\right) + \left(\!\!\binom{n}{k-1}\!\!\right)$.
Since both of these methods are counting the same thing, the answers must be equal, so
$$\left(\!\!\binom{n}{k}\!\!\right) = \left(\!\!\binom{n-1}{k}\!\!\right) + \left(\!\!\binom{n}{k-1}\!\!\right).$$
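The recurrence for multichoose numbers can be checked against a direct enumeration of multisets. The sketch below assumes the standard formula of n + k - 1 choose k for the number of ways to choose k items from n types with repetition; the function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb


def multichoose(n, k):
    # Number of multisets of size k drawn from n types.
    if n == 0:
        return 1 if k == 0 else 0
    return comb(n + k - 1, k)


def multichoose_by_enumeration(n, k):
    # Brute-force count: list every multiset of size k from n menu items.
    return sum(1 for _ in combinations_with_replacement(range(n), k))


def recurrence_holds(max_n, max_k):
    # Check ((n, k)) = ((n-1, k)) + ((n, k-1)) for n, k >= 1.
    return all(
        multichoose(n, k) == multichoose(n - 1, k) + multichoose(n, k - 1)
        for n in range(1, max_n + 1)
        for k in range(1, max_k + 1)
    )


if __name__ == "__main__":
    print(multichoose(3, 6), recurrence_holds(10, 10))
```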
3) Adjusting the recurrence relation from Example 6.1.3, we obtain the new relation
$$r_n = r_{n-1} - 20 + 0.01(r_{n-1} - 20).$$
$b_{k+1} = 5 + 4(k + 1 - 1) = 5 + 4k$.
Using the recursive relation, we have $b_{k+1} = b_k + 4$ since $k + 1 \ge 2$. Using the inductive hypothesis, we have $b_k = 5 + 4(k - 1)$. Putting these together gives
$$b_{k+1} = 5 + 4(k-1) + 4 = 5 + 4k - 4 + 4 = 5 + 4k = 5 + 4(k + 1 - 1),$$
Now I want to deduce that I can put $(k + 1) onto my gift card. Using the inductive hypothesis in the case i = k − 3, I see that I
can put $(k − 3) onto my gift card by buying increments of $4 or $5. Now if I buy one additional increment of $4, I have put a
total of $(k − 3 + 4) = $(k + 1) onto my gift card, as desired. This completes the proof of the inductive step.
By the (strong) Principle of Mathematical Induction, I can put any amount of dollars that is at least $12 onto my gift card.
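To illustrate the conclusion of the induction, here is a minimal sketch that searches for an explicit way to reach a given amount using increments of $4 and $5. The function name `decompose` is hypothetical, and the check only covers a finite range, so it illustrates the result rather than proving it.

```python
def decompose(amount):
    """Return (fours, fives) with 4*fours + 5*fives == amount, or None."""
    for fives in range(amount // 5 + 1):
        rest = amount - 5 * fives
        if rest % 4 == 0:
            return rest // 4, fives
    return None


if __name__ == "__main__":
    # Every amount from $12 up to $40 should be reachable; $11 should not.
    for amount in range(11, 41):
        print(amount, decompose(amount))
```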
$$\binom{-5}{7} = (-1)^{7}\binom{5+7-1}{7} = -\binom{11}{7} = -330.$$
2) By the Generalised Binomial Theorem, the coefficient of $y^4$ in $(1+y)^{-2}$ is $\binom{-2}{4}$, so (replacing $y$ with $-x$) the coefficient of $x^4$ in $(1-x)^{-2}$ is
$$(-1)^{4}\cdot\binom{-2}{4} = (1)\cdot\frac{(-2)(-3)(-4)(-5)}{4!} = 5.$$
gives $1 + x$, so the two sides are equal. Since a generating function is a formal object, $x$ is acting as a placeholder, and we do not need to worry about the possibility that $1 - x = 0$ that would prevent us from cancelling these factors.
Inductive step: Let $k \ge 1$ be arbitrary, and suppose that
$$1 + \cdots + x^{k} = \frac{1 - x^{k+1}}{1 - x}.$$
We have
$$1 + \cdots + x^{k+1} = (1 + \cdots + x^{k}) + x^{k+1}.$$
Applying our inductive hypothesis, this is $\frac{1 - x^{k+1}}{1 - x} + x^{k+1}$. Adding this up over a common denominator of $1 - x$ gives
$$\frac{1 - x^{k+1} + x^{k+1} - x^{k+2}}{1 - x} = \frac{1 - x^{k+2}}{1 - x},$$
as desired.
By the Principle of Mathematical Induction,
$$1 + \cdots + x^{n} = \frac{1 - x^{n+1}}{1 - x}$$
for every $n \ge 1$.
4) The generating function for this problem is
$$(x + x^2 + x^3 + x^4 + x^5 + x^6)^5.$$
$$(1 + x + x^2 + x^3 + x^4 + x^5)^5 = \left(\frac{1 - x^6}{1 - x}\right)^5.$$
$$(1 - x^6)^5 = (-x^6)^0 + \binom{5}{1}(-x^6)^1 + \binom{5}{2}(-x^6)^2 + \binom{5}{3}(-x^6)^3 + \binom{5}{4}(-x^6)^4 + (-x^6)^5 = 1 - 5x^6 + 10x^{12} - 10x^{18} + 5x^{24} - x^{30}.$$
The function we're interested in is the product of this with $(1-x)^{-5}$, and we are looking for the coefficient of $x^6$. The only ways of getting an $x^6$ term from this product are by taking the $x^0$ term above and multiplying it by the $x^6$ term from $(1-x)^{-5}$, or by taking the $x^6$ term above and multiplying it by the $x^0$ term from $(1-x)^{-5}$.
Thus, the number of ways in which Trent can roll a total of 11 on his five dice is the coefficient of $x^{11}$ in our generating function, which is $\binom{10}{6} - 5 = 205$. The probability of this happening is 205 divided by the total number of outcomes of his roll, which is $6^5 = 7776$, so $\frac{205}{7776}$, or about 2.6%.
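The generating-function count can be cross-checked by listing all outcomes of five ordinary six-sided dice. The sketch below is an illustration only; the function names are hypothetical, and the formula function simply evaluates the two contributing products described in the solution.

```python
from fractions import Fraction
from itertools import product
from math import comb


def count_by_enumeration(total=11, dice=5):
    # Enumerate all 6^dice ordered rolls and count those with the given sum.
    return sum(1 for roll in product(range(1, 7), repeat=dice) if sum(roll) == total)


def count_by_formula():
    # Coefficient of x^6 in (1 - 5x^6 + ...) * (1 - x)^(-5):
    # (x^0 term) * C(5+6-1, 6)  plus  (x^6 term, which is -5) * C(5+0-1, 0).
    return comb(5 + 6 - 1, 6) - 5 * comb(5 + 0 - 1, 0)


if __name__ == "__main__":
    ways = count_by_enumeration()
    print(ways, count_by_formula(), float(Fraction(ways, 6 ** 5)))
```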
Now the numerator gives $2A + B + (2B - A)x = 1$ as polynomials. Hence we must have $2B - A = 0$ and $2A + B = 1$. Combining these gives $B = \frac{1}{5}$ and $A = \frac{2}{5}$. Thus the given generating function is equal to
$$\frac{2}{5}(1 + 2x)^{-1} + \frac{1}{5}(2 - x)^{-1} = \frac{2}{5}(1 + 2x)^{-1} + \frac{1}{10}\left(1 - \frac{1}{2}x\right)^{-1}.$$
Using the Generalised Binomial Theorem, the coefficient of $x^r$ in the first of these summands is $\frac{2}{5}(-1)^r 2^r$, while the coefficient of $x^r$ in the second summand is $\frac{1}{10}\left(\frac{1}{2}\right)^r$. Thus, the coefficient of $x^r$ is
$$\frac{2}{5}(-1)^r 2^r + \frac{1}{10}\left(\frac{1}{2}\right)^r.$$
$$= 2A + B + 2C + (3A - B - 3C)x + (A - 2B - 2C)x^2 = 1 + 2x$$
as polynomials, so we have $2A + B + 2C = 1$, $3A - B - 3C = 2$, and $A - 2B - 2C = 0$. Solving this gives $C = -\frac{1}{3}$, $B = \frac{3}{5}$, and $A = \frac{8}{15}$. Thus (taking a factor of 2 out of the denominator of the second piece) the given generating function is equal to
$$\frac{8}{15}(1 - 2x)^{-1} + \frac{3}{10}\left(1 + \frac{1}{2}x\right)^{-1} - \frac{1}{3}(1 + x)^{-1}.$$
Using the Generalised Binomial Theorem, the coefficient of $x^r$ in the first of these summands is $\frac{8}{15}2^r$; the coefficient of $x^r$ in the second summand is $\frac{3}{10}(-1)^r\left(\frac{1}{2}\right)^r$; and the coefficient of $x^r$ in the third summand is $-\frac{1}{3}(-1)^r$. We conclude that the coefficient of $x^r$ is
$$\frac{8}{15}2^r + \frac{3}{10}(-1)^r\left(\frac{1}{2}\right)^r - \frac{1}{3}(-1)^r.$$
Now, write
$$f(x) = \frac{2 + x}{2x^2 + x - 1} = \frac{2 + x}{(2x - 1)(x + 1)} = \frac{A}{2x - 1} + \frac{B}{x + 1}.$$
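Subtracting the second equation from the first tells us that $-3B = 1$, so $B = -\frac{1}{3}$. Then the first equation tells us that $A = 2 - \frac{1}{3} = \frac{5}{3}$. So we have
$$f(x) = \frac{5/3}{2x - 1} - \frac{1/3}{x + 1} = -\frac{5/3}{1 - 2x} - \frac{1/3}{1 + x}.$$
The coefficient of $x^r$ in $\frac{1}{1 - x}$ is 1, so the coefficient of $x^r$ in $\frac{1}{1 - 2x}$ is $2^r$ (by replacing $x$ with $2x$), and the coefficient of $x^r$ in $\frac{1}{1 + x}$ is $(-1)^r$ (by replacing $x$ with $-x$). Thus the coefficient of $x^r$ in $f(x)$ is
$$-\frac{5}{3}(2^r) - \frac{1}{3}(-1)^r.$$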
so
$$(1 - x - 2x^2)C(x) = C(x) - xC(x) - 2x^2C(x) = c_0 + (c_1 - c_0)x + \sum_{n=2}^{\infty}(c_n - c_{n-1} - 2c_{n-2})x^n = c_0 + (c_1 - c_0)x$$
since (by the recurrence relation) we have $c_n - c_{n-1} - 2c_{n-2} = 0$ for $n \ge 2$. Therefore
$$C(x) = \frac{c_0 + (c_1 - c_0)x}{1 - x - 2x^2}.$$
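so $A = \frac{4}{3}$. Then the second equation tells us that
$$B = 2A - 2 = 2\left(\frac{4}{3}\right) - 2 = \frac{2}{3}.$$
So
$$C(x) = \frac{2 - 2x}{(1 + x)(1 - 2x)} = \frac{A}{1 + x} + \frac{B}{1 - 2x} = \frac{4/3}{1 + x} + \frac{2/3}{1 - 2x} = \frac{4}{3}\left(\frac{1}{1 + x}\right) + \frac{2}{3}\left(\frac{1}{1 - 2x}\right).$$
The coefficient of $x^n$ in $\frac{1}{1 + x} = (1 + x)^{-1}$ is
$$\binom{-1}{n} = (-1)^n\binom{1 + n - 1}{n} = (-1)^n\binom{n}{n} = (-1)^n.$$
The coefficient of $x^n$ in $\frac{1}{1 - 2x} = (1 - 2x)^{-1}$ is
$$\binom{-1}{n}(-2)^n = (-1)^n\binom{1 + n - 1}{n}(-2)^n = (-1)^{2n}\binom{n}{n}2^n = 2^n.$$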
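The partial-fraction result can be compared with the recurrence directly. The sketch below assumes the initial values c_0 = 2 and c_1 = 0, which are inferred here from the numerator 2 - 2x (they are not stated in this excerpt), together with the recurrence c_n = c_{n-1} + 2c_{n-2}. The closed form used is the one read off from the two coefficients above.

```python
from fractions import Fraction


def c_recursive(count):
    # Assumed initial values: c_0 = 2 and c_1 = 0 (inferred from 2 - 2x).
    c = [2, 0]
    for n in range(2, count):
        c.append(c[n - 1] + 2 * c[n - 2])
    return c[:count]


def c_closed(n):
    # (4/3) * (-1)^n + (2/3) * 2^n, from the partial fractions.
    return Fraction(4, 3) * (-1) ** n + Fraction(2, 3) * 2 ** n


if __name__ == "__main__":
    print(c_recursive(8))
    print([int(c_closed(n)) for n in range(8)])
```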
3) Let $E(x) = \sum_{n=0}^{\infty} e_n x^n$ be the generating function of $\{e_n\}$. Then
$$E(x) = e_0 + e_1x + e_2x^2 + e_3x^3 + e_4x^4 + \cdots$$
$$xE(x) = e_0x + e_1x^2 + e_2x^3 + e_3x^4 + \cdots$$
$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + x^4 + \cdots$$
so
$$(1 - 3x)E(x) + \frac{2}{1 - x} = E(x) - 3xE(x) + 2\cdot\frac{1}{1 - x} = (e_0 + 2) + \sum_{n=1}^{\infty}(e_n - 3e_{n-1} + 2)x^n = (e_0 + 2),$$
Adding the two equations tells us that $2 - 4 = (A + B) - (A + 3B) = -2B$, so $B = 1$. Then the first equation tells us that $A = 2 - B = 2 - 1 = 1$. So
$$E(x) = \frac{2 - 4x}{(1 - 3x)(1 - x)} = \frac{A}{1 - 3x} + \frac{B}{1 - x} = \frac{1}{1 - 3x} + \frac{1}{1 - x}.$$
From the generalized binomial theorem, we know that the coefficient of $x^n$ in $\frac{1}{1 - 3x}$ is $3^n$, and the coefficient of $x^n$ in $\frac{1}{1 - x}$ is 1.
$$= 120\left(1 - 1 + \frac{1}{2} - \frac{1}{6} + \frac{1}{24} - \frac{1}{120}\right) = 60 - 20 + 5 - 1 = 44.$$
4) The initial conditions are $D_1 = 0$ and $D_2 = 1$. The recursive relation $D_n = (n - 1)(D_{n-1} + D_{n-2})$ for $n \ge 3$ gives
$D_3 = 2(D_2 + D_1) = 2(1 + 0) = 2$,
$D_4 = 3(D_3 + D_2) = 3(2 + 1) = 9$,
and
$D_5 = 4(D_4 + D_3) = 4(9 + 2) = 4(11) = 44$.
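Both computations of the number of derangements of five objects can be reproduced in a few lines. This is a minimal sketch with hypothetical function names: one function follows the recursive relation, the other evaluates the alternating sum using exact fractions.

```python
from fractions import Fraction
from math import factorial


def derangements_recursive(n):
    # D_1 = 0, D_2 = 1, and D_m = (m - 1)(D_{m-1} + D_{m-2}) for m >= 3.
    d = {1: 0, 2: 1}
    for m in range(3, n + 1):
        d[m] = (m - 1) * (d[m - 1] + d[m - 2])
    return d[n]


def derangements_formula(n):
    # n! * (1 - 1/1! + 1/2! - ... + (-1)^n / n!)
    total = sum(Fraction((-1) ** i, factorial(i)) for i in range(n + 1))
    return int(factorial(n) * total)


if __name__ == "__main__":
    print([derangements_recursive(n) for n in range(1, 8)])
    print([derangements_formula(n) for n in range(1, 8)])
```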
Now we want to deduce that $c_{k+1} > 0$. Using the recursive relation, we have
$$c_{k+1} = \sum_{i=0}^{k} c_i c_{(k+1)-i-1} = \sum_{i=0}^{k} c_i c_{k-i}.$$
Using the inductive hypothesis, we have $c_j > 0$ for every $j$ such that $0 \le j \le k$. Putting these together gives that $c_{k+1}$ is a sum of $k + 1$ terms where each term has the form $c_i c_{k-i}$ with $0 \le i \le k$. Since $0 \le k - i \le k$, we see that $c_i > 0$ and $c_{k-i} > 0$ so that $c_i c_{k-i} > 0$. Hence
$$c_{k+1} = \sum_{i=0}^{k} c_i c_{k-i} > 0$$
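3) If $b_i = \frac{(i+1)!}{2}$ then the expanded exponential generating function for this sequence is
$$\sum_{i=0}^{\infty}\frac{b_i x^i}{i!} = \sum_{i=0}^{\infty}\frac{(i+1)!\,x^i}{2\,i!} = \sum_{i=0}^{\infty}\frac{(i+1)x^i}{2}.$$
This is
$$\sum_{i=0}^{\infty}\frac{1}{2}(i+1)x^i = \frac{1}{2}\left(\sum_{i=0}^{\infty}(i+1)x^i\right) = \frac{1}{2(1-x)^2}.$$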
8 rooks.
There are at least 17 − 8 = 9 rooks that are not in Row A . Since there are 7 other rows on the chessboard, and 9 > 1(7) , the
Pigeonhole Principle says that there must be at least 1 + 1 = 2 rooks that are in the same row, from amongst the other rows of the
board. Choose such a row, and call it Row B . Note that Row B also contains at most 8 rooks.
There is at least 17 − 8 − 8 = 1 rook remaining, so there must be a rook somewhere on the board that is in neither Row A nor
Row B . Choose such a rook, Rook 1, and call the row that it is in Row C . Since there are at least 2 rooks in Row B , at least one of
them must not be in the same column as Rook 1. Choose such a rook, Rook 2. Since there are at least 3 rooks in Row A , at least
one of them must not be in the same column as either Rook 1 or Rook 2. Choose such a rook, and call it Rook 3. Now Rooks 1, 2,
and 3 do not threaten each other, so fulfill the requirements of the problem.
$n_1 + n_2 - m + 1 = 15 + 23 - 2 + 1 = 37$
people are approached, he will have enough people to carry his art in the parade.
told:
|F | = 78, |I | = 124, |G| = 101, |F ∩ G| = 58, |G ∩ I | = 62, |F ∩ G ∩ I | = 48, |F ∪ G ∪ I | = 165.
We have been asked for |A ∪ B ∪ C | . Using inclusion-exclusion, we see that the answer is 30 + 20 + 12 − 10 − 6 − 4 + 2 = 44 .
Since e = {e, e} is a loop, the graph is not simple. There is no isolated vertex, because no vertex has valency 0. The only
6
Notice that H has e + 1 − 1 = e edges, so our induction hypothesis applies to H . Therefore, H has an even number of vertices
′ ′ ′
valency in H if the vertex is either u or v . Consider the three possible cases: u and v both have even valency in H ; u and v both
′ ′
valency of each of u and v goes up by 1). So the number of vertices of odd valency in H must be 2m (even though one of the
specific vertices of odd valency has changed between u and v ), which is even.
In all cases, H has an even number of vertices of odd valency. This completes the proof of the inductive step.
By the Principle of Mathematical Induction, every graph with at least 0 edges has an even number of vertices of odd valency.
(namely d ), but the only vertex of valency 1 in H (namely, z ) is adjacent only to a vertex of valency 1 (namely, y ).
Here is a more complete proof. Suppose φ : G → H is an isomorphism. (This will lead to a contradiction.) We must have
$d_H(\varphi(c)) = d_G(c) = 1$.
(This principle was pointed out in the proof of Proposition 11.4.1(3).) The only vertex of valency 1 in H is z , so this implies that
φ(c) = z .
3) Notice that there are $\binom{5}{2} = 10$ total edges possible in a graph on 5 vertices. Thus, the number of labeled graphs on 5 vertices with 3 edges is the number of ways of choosing 3 of these 10 labeled edges. So there are $\binom{10}{3} = 120$ labeled graphs on 5 vertices that have 3 edges. Similarly, there are $\binom{10}{4} = 210$ labeled graphs on 5 vertices that have 4 edges. Thus, in total there are $120 + 210 = 330$ labeled graphs on 5 vertices that have 3 or 4 edges.
Base case: $n = 1$. Let $G$ be a digraph with no loops or multiarcs, and with only one vertex $v_1$. Then there are no arcs in $G$, so $|A(G)| = 0 = d^+_G(v_1) = d^-_G(v_1)$. So the desired conclusion is true when $n = 1$.
We now establish the induction step. Assume that $n \ge 1$, the formula holds for every digraph with $n$ vertices that has no loops or multiarcs, and $G$ is a digraph with $n + 1$ vertices that has no loops or multiarcs.
Pick an arbitrary vertex $u$ of $G$. Let $N^+$ be the set of outneighbours of $u$, and $N^-$ the set of inneighbours of $u$, with $s = |N^+|$ and $t = |N^-|$, and let $G'$ be the digraph obtained from $G$ by deleting $u$.
Note that:
$V(G') = V(G) \setminus u$, so $G'$ has $n$ vertices.
$|A(G')| = |A(G)| - s - t$.
For $v \in V(G') \setminus N^-$, we have $d^+_{G'}(v) = d^+_G(v)$ (because the outneighbours of $v$ in $G'$ are exactly the same as the outneighbours of $v$ in $G$).
For $v \in N^-$, we have $d^+_{G'}(v) = d^+_G(v) - 1$ (because $u$ is counted as an outneighbour of $v$ in $G$, but it is not in $G'$ so it cannot be counted as an outneighbour in $G'$).
Hence
$$\sum_{v \in V(G)} d^+_G(v) = \sum_{v \in V(G) \setminus (N^- \cup \{u\})} d^+_{G'}(v) + \sum_{v \in N^-}\left(d^+_{G'}(v) + 1\right) + d^+_G(u)$$
$$= \left(\sum_{v \in V(G) \setminus (N^- \cup \{u\})} d^+_{G'}(v) + \sum_{v \in N^-} d^+_{G'}(v)\right) + |N^-| + d^+_G(u)$$
$$= \sum_{v \in V(G')} d^+_{G'}(v) + t + s$$
$$= |A(G')| + s + t \quad \text{(induction hypothesis)}$$
$$= |A(G)|.$$
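The statement proved by this induction (the number of arcs equals the sum of the outvalencies, and likewise the sum of the invalencies) is easy to test on randomly generated digraphs. The sketch below is an illustration with hypothetical helper names; it builds digraphs without loops or multiarcs by storing arcs in a set.

```python
import random


def random_digraph(n, p, seed):
    # Arcs are ordered pairs (u, v) with u != v, so there are no loops,
    # and a set cannot hold the same arc twice, so there are no multiarcs.
    rng = random.Random(seed)
    return {(u, v) for u in range(n) for v in range(n) if u != v and rng.random() < p}


def out_valency(arcs, v):
    return sum(1 for (tail, _head) in arcs if tail == v)


def in_valency(arcs, v):
    return sum(1 for (_tail, head) in arcs if head == v)


def arc_count_matches(n, arcs):
    total_out = sum(out_valency(arcs, v) for v in range(n))
    total_in = sum(in_valency(arcs, v) for v in range(n))
    return total_out == len(arcs) == total_in


if __name__ == "__main__":
    for seed in range(5):
        arcs = random_digraph(8, 0.3, seed)
        print(len(arcs), arc_count_matches(8, arcs))
```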
3) Beginning at the top and working clockwise, label the vertices of the digraph a , b , c , d , and e . Then:
a has outvalency 2 and invalency 1;
b has outvalency 2 and invalency 2;
c has outvalency 1 and invalency 2 ;
3) This graph is not connected. To see this, note that there are no edges from any vertex in {a, d, e, f , g, j} to any vertex in
{b, c, h, i}. Indeed the connected component that contains a is {a, d, e, f , g, j} . (The walk (a, d, e, f , g, j) passes through all of
these vertices, but none of these vertices are adjacent to any vertex that is not in the subset.) There are several walks of length 3
from a to d . One example is (a, d, a, d) .
occurs twice.) It is not a cycle, because the first vertex (namely, a ) is not the same as the final vertex (namely, b ).
3) PROOF. Let $(u = u_1, u_2, \ldots, v = u_k, u)$ be a cycle of $G$ in which $u$ and $v$ appear as consecutive vertices. Let $G' = G \setminus \{uv\}$.
Let $x$ and $y$ be arbitrary vertices of $G$. Since $G$ is connected, there is a walk $(x = x_1, x_2, \ldots, x_m = y)$ from $x$ to $y$ in $G$. If this walk does not contain the edge $uv$ then it is also a walk in $G'$. If it does contain the edge $uv$, then we can find some $i$ with $1 \le i \le m - 1$ such that either $x_i = u$ and $x_{i+1} = v$, or vice versa. For every such $i$, replace the pair $(x_i, x_{i+1})$ in the walk by the path $(u_1, u_2, \ldots, u_k)$ or its reverse, as appropriate. The
result is a walk from $x$ to $y$ that does not use the edge $uv$, so is in $G'$. Since $x$ and $y$ were arbitrary vertices of $G'$, for any two
and after v in the walk, since v has only one neighbour, so such a walk would not be a path. Thus, the x − y path cannot use the
(b) The closure of this graph is $K_6$. We can easily see from this that the graph does have a Hamilton cycle. (To see that the closure is $K_6$, observe that every vertex of the graph has valency at least 2. Thus, the two vertices of valency 4 can be joined to each of their nonneighbours. After doing so, every vertex has valency at least 3, so every vertex can be joined to every other vertex.)
3) Let G be the graph that has been shown here. Using the notation of Theorem 13.2.1, let S = {a, f } . Then |S| = 2 , but G∖ S
has 3 connected components: {b, e}, {c, h}, and {d, g}. Since 3 > 2 , G cannot have a Hamilton cycle.
is class one. Since every vertex of $C_n$ has valency two, this means that the graph has a proper edge-colouring that uses only 2
both incident to $v_1$), so $v_1v_2$ must be blue. The edge $v_2v_3$ cannot be the same colour as $v_1v_2$ (because they are both incident to $v_2$),
whenever k is odd. (That is, the two colours must alternate red, blue, red, blue, red, blue,. . . as we go around the cycle.)
the edges $v_{n-1}v_n$ and $v_0v_1$ are both incident to the vertex $v_0$), so they cannot be the same colour. This contradicts the fact that both
3) PROOF. We prove this by induction on $k + \ell$. The base case is when $k + \ell = 2$ (so $k = \ell = 1$). Then
$$R(k, \ell) = R(1, 1) = 1 < 4 = 2^{1+1} = 2^{k+\ell}.$$
So the inequality is valid in the base case.
For the induction step, assume $k + \ell \ge 2$, and that $R(k', \ell') \le 2^{k'+\ell'}$, whenever $k' + \ell' < k + \ell$. Since $R(k, \ell) = R(\ell, k)$, we may assume $k \le \ell$ (by interchanging $k$ and $\ell$, if necessary). If $k = 1$, then
$$R(k, \ell) = R(1, \ell) = 1 = 2^0 < 2^{k+\ell}.$$
Therefore, we may assume $2 \le k \le \ell$. Since $(k - 1) + \ell < k + \ell$ and $k + (\ell - 1) < k + \ell$, the induction hypothesis tells us that
$$R(k - 1, \ell) \le 2^{(k-1)+\ell} \quad\text{and}\quad R(k, \ell - 1) \le 2^{k+(\ell-1)}.$$
Therefore
$$R(k, \ell) \le R(k - 1, \ell) + R(k, \ell - 1) \le 2^{(k-1)+\ell} + 2^{k+(\ell-1)} = 2^{k+\ell-1} + 2^{k+\ell-1} = 2 \cdot 2^{k+\ell-1} = 2^{k+\ell}.$$
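The inequality used in the induction step can be unrolled into a concrete table of upper bounds. The sketch below is an illustration, not part of the proof: it assumes only the inequality R(k, l) <= R(k-1, l) + R(k, l-1) and the value R(1, l) = R(k, 1) = 1 used above, and the name `ramsey_upper` is hypothetical. The numbers it produces are upper bounds, not exact Ramsey numbers in general.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def ramsey_upper(k, l):
    # Upper bound obtained by applying the recursive inequality repeatedly.
    if k == 1 or l == 1:
        return 1
    return ramsey_upper(k - 1, l) + ramsey_upper(k, l - 1)


if __name__ == "__main__":
    for k in range(1, 6):
        print([ramsey_upper(k, l) for l in range(1, 6)])
```

Unrolling the recursion this way gives the binomial coefficient with upper entry k + l - 2 and lower entry k - 1, which is always smaller than 2 to the power k + l, consistent with the bound proved above.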
a vertex $v$. Since $v$ has $N - 1 > (n - 1)(c + 1)$ incident edges, the generalised pigeonhole principle tells us that there must be some set of at least $n$ edges incident with $v$ that are all coloured with the same colour, say colour $i$. Look at the induced subgraph of $K_N$ on the $n$ other endpoints of these edges. If any edge $xy$ of this induced subgraph is coloured with colour $i$, then all of the edges of the triangle $\{v, x, y\}$ have been coloured with colour $i$, so $K_N$ has a monochromatic triangle.
If on the other hand no edge of the induced subgraph has been coloured with colour $i$, then the induced subgraph is a $K_n$ whose edges have been coloured with the remaining $c$ colours. By hypothesis, every such colouring has a monochromatic triangle. This completes the proof.
dashed $K_2$ or a dotted $K_2$ or a solid triangle. The following colouring shows that $R(2, 2, 3) > 2$:
However, $R(2, 2, 3) = 3$. This is because if any edges are dotted or dashed, then there is a dotted or dashed $K_2$; if no edges are dotted or dashed, then every edge is solid, so there is a solid $K_3$.
2) We will show that $R(2, 4) = 4$. We are looking for the smallest value of $n$ such that every edge-colouring of $K_n$ with dashed or solid lines has either a dashed $K_2$ or a solid $K_4$. The following colouring shows that $R(2, 4) > 3$:
However, in $K_4$ if any edge is dashed, then there is a dashed $K_2$, while if no edges are dashed, then there is a solid $K_4$.
Now consider the graph $K_{k+1}$. Let $v$ be an arbitrary vertex of this graph. By our induction hypothesis, $\chi(K_{k+1} \setminus \{v\}) = k$. Thus, any proper colouring of $K_{k+1}$ must use at least $k$ colours on the vertices other than $v$. It is not possible to colour $v$ with any of these $k$ colours, since $v$ is adjacent to all of the other vertices, so has a neighbour that is coloured with each of these $k$ colours. Therefore, $\chi(K_{k+1}) \ge k + 1$. In fact, since $v$ is the only vertex not yet coloured by these $k$ colours, it is clear that $k + 1$ colours suffice to colour the graph: we colour $v$ with a new colour, which is the $(k+1)$st colour. This will certainly be a proper colouring of
χ(G) ≤ Δ(G) + 1 ≤ j + 1 .
So, 4 ≤ i ≤ χ(G) ≤ j + 1 ≤ 7 .
If we also know that G is connected and is neither a complete graph nor a cycle of odd length, then χ(G) ≤ Δ(G) ≤ j , so
4 ≤ i ≤ χ(G) ≤ j ≤ 6 in this case.
3) We show a planar embedding of the graph, the planar embedding with the dual graph shown in grey, and the dual graph.
PROOF. We will prove this formula by induction on the number of faces of the embedding. Let G be a planar embedding of a
graph with exactly two connected components.
Base case: If |F | = 1 then G cannot have any cycles (otherwise the interior and exterior of the cycle would be 2 distinct faces). So
G must consist of two connected graphs that have no cycles, i.e., two trees, $T_1$ and $T_2$. By Theorem 12.4.1 we know that we must
since the edge e being part of a cycle must separate two faces of G, which are united into one face of H . Furthermore, since e was
in a cycle and G has two connected components, by an argument similar to that given in Proposition 12.3.3 H has two connected
components, and H has a planar embedding induced by the planar embedding of G. Therefore our inductive hypothesis applies to
H , so
2 = |V (H )| − |E(H )| + |F (H )|
except that the $i$th entry has been exchanged with the $j$th entry. Since every element of $N$ appears exactly once in column $k$ of $L$, it also appears exactly once in column $k$ of $L'$ (although possibly in a different position). Since $k$ was arbitrary, every element of $N$ appears exactly once in each column of $L'$.
Now consider row $k$ of $L'$. If $k \ne i, j$, then this row is exactly the same as row $k$ of $L$. Since every element of $N$ appears exactly once in row $k$ of $L$, it also appears exactly once (and in the same position even) in row $k$ of $L'$. If $k = i$ or $k = j$, then row $k$ of $L'$ is the same as some other row (the $j$th or $i$th row, respectively) of $L$. Since every element of $N$ appears exactly once in that row of
2 1 3 4 3 1 2 4 3 1 2 4
4 2 1 3 2 4 1 3 4 2 1 3
3 4 2 1 4 2 3 1 2 4 3 1
2 1 4 3
3 4 1 2
4 3 2 1
3)
3 4 1 2 7 8 5 6
5 6 7 8 1 2 3 4
7 8 5 6 3 4 1 2
4 3 2 1 8 7 6 5
2 1 4 3 6 5 8 7
8 7 6 5 4 3 2 1
6 5 8 7 2 1 4 3
4) This collection does have a system of distinct representatives: $x$, $y$, and $z$, for $A_1$, $A_2$, and $A_3$ (respectively).
2
) = λ( )
v
2
.
k(k − 1)
and
$$\lambda\,\frac{v(v-1)}{k(k-1)} = 1 \cdot \frac{16(16-1)}{6(6-1)} = \frac{16 \cdot 15}{6 \cdot 5} = 8$$
1 ⋅ (v − 1) ≥ 20(20 − 1) = 380 ,
so v ≥ 380 + 1 = 381 . Therefore, v must be at least 381 to satisfy Fisher’s Inequality. Since
$$\lambda\,\frac{v-1}{k-1} = 1 \cdot \frac{381-1}{20-1} = \frac{380}{19} = 20$$
and
$$\lambda\,\frac{v(v-1)}{k(k-1)} = 1 \cdot \frac{381(381-1)}{20(20-1)} = \frac{381(380)}{380} = 381$$
are integers, the conditions in Theorem 17.1.2 are also satisfied. So 381 is the smallest value for v that satisfies all three conditions.
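The search for the smallest admissible v can be sketched as a short loop. The code below assumes that "all three conditions" means the two integrality conditions computed above together with Fisher's Inequality in the form b >= v, and it starts the search at v = k + 1 (an assumption made here to skip the trivial case v = k). The function name is hypothetical.

```python
def smallest_v(k, lam=1):
    # Search upward for the first v > k meeting all three conditions.
    v = k + 1
    while True:
        r_numerator = lam * (v - 1)        # r = lam (v - 1) / (k - 1)
        b_numerator = lam * v * (v - 1)    # b = lam v (v - 1) / (k (k - 1))
        if r_numerator % (k - 1) == 0 and b_numerator % (k * (k - 1)) == 0:
            b = b_numerator // (k * (k - 1))
            if b >= v:                     # Fisher's Inequality
                return v
        v += 1


if __name__ == "__main__":
    print(smallest_v(20))
```

Passing these conditions is necessary but not sufficient for a design to exist, so the output is a lower bound on v rather than a guarantee.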
8 2 9 3 10 4 11 5 12 6 13 7 1
2 9 3 10 4 11 5 12 6 13 7 1 8
9 3 10 4 11 5 12 6 13 7 1 8 2
3 10 4 11 5 12 6 13 7 1 8 2 9
10 4 11 5 12 6 13 7 1 8 2 9 3
4 11 5 12 6 13 7 1 8 2 9 3 10
11 5 12 6 13 7 1 8 2 9 3 10 4
5 12 6 13 7 1 8 2 9 3 10 4 11
12 6 13 7 1 8 2 9 3 10 4 11 5
6 13 7 1 8 2 9 3 10 4 11 5 12
13 7 1 8 2 9 3 10 4 11 5 12 6
7 1 8 2 9 3 10 4 11 5 12 6 13
through the construction, 5 have one u-girl, one v-girl, and one w-girl; the other 30 have either two u-girls with a v-girl, two v-girls with a w-girl, or two w-girls with a u-girl.
A Kirkman system requires us to divide the blocks into 7 groups of 5 blocks such that each girl appears exactly once in each group of blocks. Since there should be 7 groups of 5 blocks, but there are only 5 blocks that have a u-girl, a v-girl, and a w-girl, there must be at least one group of blocks (in fact, at least two) that has no block consisting of a u-girl, a v-girl, and a w-girl.
Consider such a group of 5 blocks. We must have all 5 of the u-girls. If no block contained more than one u-girl, then in order to get all 5 u-girls we would have to choose only blocks that have two w-girls and a u-girl. However, this would mean that we had 10 w-girls and no v-girls, which is not allowed. So we must choose at least one block that has two u-girls and a v-girl. Repeating the same argument with v or w taking the place of u, we see that we must also choose at least one block that has two v-girls and a w-girl, and at least one block that has two w-girls and a u-girl. Since we are only choosing 5 blocks but there are these three classes of blocks, there must be some class of blocks of which we only choose one.
Without loss of generality, suppose that we only choose one of the blocks that has two u-girls and a v-girl. In order to have all 5 of the u-girls, we must choose three blocks that have two w-girls and a u-girl. But this means that we have six w-girls, which is not allowed.
Therefore, there is no way to partition the blocks of this design into seven groups of five blocks so that every girl appears exactly
once in each group.
$$b\binom{k}{t} = \lambda\binom{v}{t} = \binom{15}{t}.$$
Since we are not including any trivial t − (v, t, 1) design, we have t ≥ 2 , 3 ≤ k ≤ 14 , and t < k .
Now
$$\frac{15!}{t!\,(15-t)!} = b\,\frac{k!}{t!\,(k-t)!},$$
k(k − 1) ⋅ ⋅ ⋅ (k + 1 − t)
is an integer.
$\binom{k-1}{t-1}$ divides $\binom{14}{t-1}$, so that $\frac{(k-1)!}{(k-t)!}$ divides $\frac{14!}{(15-t)!}$. In other words,
$$\frac{14!\,(k-t)!}{(15-t)!\,(k-1)!}$$
is an integer. If we call this integer $y$, combining this with the previous paragraph tells us that $k$ is a divisor of $15y$. We can also further work with the algebra to obtain
$$y = \frac{14 \cdot 13 \cdots k}{(15-t)(14-t)\cdots(k+1-t)}.$$
When $k = 14$, this gives $y = \frac{14}{15-t}$. Since $k$ divides $15y$ and $k$ is coprime to 15, we must have $k$ divides $y$. But then $\frac{y}{14} = \frac{1}{15-t}$ is an integer, implying $t = 14$. This contradicts $t < k$. Thus $k = 14$ cannot arise.
When $k = 13$ this gives $y = \frac{14 \cdot 13}{(15-t)(14-t)}$. Since $k$ divides $15y$ and $k$ is coprime to 15, we must have $k$ divides $y$. But then $\frac{y}{13} = \frac{14}{(15-t)(14-t)}$ is an integer. Since $t < k = 13$, we have $14 - t \ge 2$, but no two consecutive integers each of which is at

is an integer. Since the numerator is not a multiple of $2^2$, the denominator cannot be either, leaving only the possibilities $t = 4$, 8. Since the numerator is not a multiple of $3^2$, the denominator cannot be either, which eliminates $t = 4$. When $t = 8$, the numerator of $y$ is not a multiple of 5, but the denominator is, so this is also impossible. Thus $k = 12$ cannot arise.
When $k = 11$ this gives
$$y = \frac{14 \cdot 13 \cdot 12 \cdot 11}{(15-t)(14-t)(13-t)(12-t)}.$$
Since $k$ divides $15y$ and $k$ is coprime to 15, we must have $k$ divides $y$. But then
$$\frac{y}{11} = \frac{14 \cdot 13 \cdot 12}{(15-t)(14-t)(13-t)(12-t)}$$
is an integer. Since the numerator is not a multiple of 5, the four consecutive numbers that are the factors of the denominator must be 6 through 9 (since $t \ge 2$, they cannot be 11 through 14, and since $t < 11$ they cannot be 1 through 4). Thus, we must have $t = 6$. But then the numerator is not divisible by $3^2$, while the denominator is divisible by $3^3$, contradicting $\frac{y}{11}$ being an integer. Thus $k = 11$ is not possible.
When $k = 10$, this gives
$$y = \frac{14 \cdot 13 \cdot 12 \cdot 11 \cdot 10}{(15-t)(14-t)(13-t)(12-t)(11-t)}.$$

is an integer. Since the numerator is not a multiple of $2^4$, the denominator cannot be either. In particular, 8 cannot be one of the factors that appears in the denominator (since some other even factor would appear with it), nor can 2, 4, and 6 all be factors that appear in the denominator. Also, the numerator is not divisible by $3^3$, so we cannot have $11 - t = 9$. This leaves $t = 8$ as the only possibility. However, the numerator of $y$ is not divisible by $3^2$, so $t = 8$ is also not possible. Thus $k = 10$ is not possible.
being an integer. Since the numerator is not divisible by $2^5$, the denominator cannot be either. In particular, 8 cannot appear as one of the factors in the denominator (or two other numbers divisible by 2 would also appear as factors), so the only possibility is $t = 8$. However, if we take $k = 9$, $t = 8$, and $i = 2$, the necessary condition is $\binom{7}{6} = 7$ divides $\binom{13}{6}$, which is not true. Thus, $k = 9$ is not possible.
When $k = 8$, we calculate
$$y = \frac{14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9 \cdot 8}{(15-t)(14-t)(13-t)(12-t)(11-t)(10-t)(9-t)}.$$
Since $k$ divides $15y$ and $k$ is coprime to 15, we must have $k$ divides $y$. But then
$$\frac{y}{8} = \frac{14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{(15-t)(14-t)(13-t)(12-t)(11-t)(10-t)(9-t)}.$$
Since the consecutive factors in the denominator include 8 and at least two other even numbers, this implies that the numerator should also be a multiple of $2^5$, but it is not. Thus $k = 8$ is not possible.
Since the consecutive factors in the denominator include 6 and 9, if they also include another multiple of 3 then the numerator must be divisible by $3^4$, but it is not. This leaves the possibility that $t = 4$ so the factors in the denominator are 4 through 11, but this is
If $k = 6$ then
$$y = \frac{14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6}{(15-t)(14-t)(13-t)(12-t)(11-t)(10-t)(9-t)(8-t)(7-t)}.$$

being an integer. The numerator is not divisible by $2^8$, so the denominator cannot be either; in particular it cannot include as factors all of the even integers from 4 through 10 as well as one other. This leaves the possibilities $t = 2$ and $t = 4$. If $t = 2$ then $y = \frac{14}{5}$ which is not an integer, and similarly if $t = 4$ then we have $y = \frac{14 \cdot 13 \cdot 12}{5 \cdot 4 \cdot 3}$ which is not an integer. So $k = 6$ is not possible.
If $k = 5$ then
$$y = \frac{14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5}{(15-t)(14-t)(13-t)(12-t)(11-t)(10-t)(9-t)(8-t)(7-t)(6-t)}.$$
The numerator is not divisible by $2^9$, so the denominator cannot be either; in particular, the denominator cannot include all of 4, 6, 8, 10, and 12 as factors in the product. This leaves only the possibility $t = 4$. If $k = 5$ and $t = 4$ then $i = 0$ gives $\binom{5}{4} = 5$ divides $\binom{15}{4} = 1365$, which is true; $i = 1$ gives $\binom{4}{3} = 4$ divides $\binom{14}{3} = 364$, which is true; $i = 2$ gives $\binom{3}{2} = 3$ divides $\binom{13}{2} = 78$, which is true; $i = 3$ gives $\binom{2}{1} = 2$ divides $\binom{12}{1} = 12$, which is true. Thus a $4$-$(15, 5, 1)$ design could exist, but this is the only possibility with $k = 5$.
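The divisibility checks carried out by hand for particular values of i can be automated. The sketch below assumes the necessary condition in the form used in these computations: for each i from 0 to t - 1, the binomial coefficient with entries (k - i, t - i) must divide lambda times the one with entries (v - i, t - i). The function name is hypothetical, and passing the test does not show that a design exists.

```python
from math import comb


def passes_divisibility(t, v, k, lam=1):
    # Check that C(k - i, t - i) divides lam * C(v - i, t - i) for 0 <= i < t.
    return all(
        (lam * comb(v - i, t - i)) % comb(k - i, t - i) == 0
        for i in range(t)
    )


if __name__ == "__main__":
    print(passes_divisibility(4, 15, 5))   # the 4-(15, 5, 1) parameters
    print(passes_divisibility(8, 15, 9))   # fails at i = 2
    print(passes_divisibility(2, 15, 3))   # Steiner triple system parameters
```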
When k = 4 we have
The denominator includes 3, 6, 9, and 12, so is divisible by $3^5$, but the numerator is not. Thus, $k = 4$ is not possible.
If $k = 3$ then $2 \le t < k$ implies $t = 2$. We know these parameters are possible, as these are the parameters of a Steiner triple system.
Thus, the only possible values of $k$ and $t \ge 2$ for which nontrivial $t$-designs might exist with $v = 15$ and $\lambda = 1$ are $k = 5$ and $t = 4$: a $4$-$(15, 5, 1)$ design, or $k = 3$ and $t = 2$: a $(15, 3, 1)$ Steiner triple system.
3) If a $3$-$(16, 6, 1)$ design exists then we have $bk = vr$ and $b\binom{k}{t} = \lambda\binom{v}{t}$. The second equation gives $b\binom{6}{3} = \binom{16}{3}$, so $b = 28$, so it would have 28 blocks. Now $bk = vr$ gives $28 \cdot 6 = 16r$. This has no integral solution, so such a design is not possible.
We obtain the following 4 MOLS of order 5 from this affine plane, by using the vertical and horizontal lines to create the
coordinates. To make things easier to see, we will have the positions in the Latin squares correspond visually to the positions in the
5 by 5 array of points that we have drawn, so the top-left entry in the Latin squares will come from the top-left point of the array,
etc. We will number the lines in each parallel class so as to ensure that the entries in the top row of each square are 1, 2, 3, 4, and 5,
in that order. The first square corresponds to the dashed lines; the second to the dotted lines; the third to the solid grey lines, and the
fourth to the dashed grey lines.
1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
2 3 4 5 1 5 1 2 3 4 3 4 5 1 2 4 5 1 2 3
3 4 5 1 2 4 5 1 2 3 5 1 2 3 4 2 3 4 5 1
4 5 1 2 3 3 4 5 1 2 2 3 4 5 1 5 1 2 3 4
5 1 2 3 4 2 3 4 5 1 4 5 1 2 3 3 4 5 1 2
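The four squares above can be checked mechanically. The sketch below transcribes them (reading each printed row as four blocks of five entries, which is an assumption about the flattened layout) and tests that each is a Latin square and that every pair is orthogonal. The helper names are hypothetical.

```python
from itertools import combinations

# Each printed row holds one row of each of the four squares, left to right.
PRINTED_ROWS = [
    "1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5",
    "2 3 4 5 1 5 1 2 3 4 3 4 5 1 2 4 5 1 2 3",
    "3 4 5 1 2 4 5 1 2 3 5 1 2 3 4 2 3 4 5 1",
    "4 5 1 2 3 3 4 5 1 2 2 3 4 5 1 5 1 2 3 4",
    "5 1 2 3 4 2 3 4 5 1 4 5 1 2 3 3 4 5 1 2",
]
ROWS = [[int(x) for x in line.split()] for line in PRINTED_ROWS]
SQUARES = [[row[5 * s:5 * s + 5] for row in ROWS] for s in range(4)]


def is_latin(square):
    symbols = set(range(1, len(square) + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    # Superimposing the squares must give every ordered pair exactly once.
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


if __name__ == "__main__":
    print(all(is_latin(sq) for sq in SQUARES))
    print(all(are_orthogonal(a, b) for a, b in combinations(SQUARES, 2)))
```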
differ from y . In each of these positions, since y has the same entry as x, y must have a different entry than z . Therefore
d(y, z) ≥ k − i . Now d(x, y) + d(y, z) ≥ i + k − i = k = d(x, z) , completing the proof.
$$G = \begin{bmatrix} I_k \\ A \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$
we have
$$G\begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad G\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad G\begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}.$$
This means that 0101 encodes as 010111, 0010 encodes as 001000, and 1110 encodes as 111001.
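The three encodings can be reproduced by multiplying the generator matrix by each message over the integers mod 2. In the sketch below the first row of the matrix is taken to be the first row of the 4 by 4 identity, which is an assumption since that row is not legible in the extracted text; the function name is hypothetical.

```python
# Generator matrix G = [I_4 ; A]; the top row (1, 0, 0, 0) is assumed.
G = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]


def encode(message):
    # Multiply G by the message column vector, working mod 2.
    bits = [int(ch) for ch in message]
    return "".join(
        str(sum(g * b for g, b in zip(row, bits)) % 2) for row in G
    )


if __name__ == "__main__":
    for message in ("0101", "0010", "1110"):
        print(message, encode(message))
```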
A =⎢1 0 0 1⎥
⎣ ⎦
0 1 1 1
$P\,(0, 0, 1, 0, 0, 1)^{T} = (0, 1, 0)^{T}$. This is the 5th column of $P$, so changing the 5th bit corrects the error. The received word 001001 decodes as 001011.
$P\,(1, 1, 0, 0, 1, 1)^{T} = (0, 0, 0)^{T}$. This is 0, so there is no error. The received word 110011 decodes as 110011.
$P\,(0, 0, 0, 1, 1, 0)^{T} = (1, 1, 0)^{T}$. This is the 2nd column of $P$, so changing the 2nd bit corrects the error. The received word 000110 decodes as 010110.
31 − 5 = 26 different nonzero strings with at least two 1s. Therefore, we can make a 5 × 24 matrix A such that the columns of A
are 24 different binary column vectors with at least two 1s in each column (because there are 26 different possible columns to
choose from, and we need only 24 of them). The columns of the resulting parity-check matrix $P = [A \; I_5]$ are all nonzero and
distinct, so Theorem 19.4.1 tells us that the resulting binary linear code can correct every single-digit error.
Furthermore, since A is 5 × 24, we know that r = 5 and k = 24. Since r = n − k, this implies n = k + r = 24 + 5 = 29. So the
code is of type (n, k) = (29, 24), as desired.
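The counting and the construction described above can be checked with a small Python sketch. Picking the first 24 candidate columns is an arbitrary choice made here for illustration (any 24 of the 26 would do), and the variable names are not from the text.

```python
from itertools import product

r, k = 5, 24

# All binary columns of length 5 with at least two 1s.
candidates = [c for c in product((0, 1), repeat=r) if sum(c) >= 2]

# Any 24 of them will do; taking the first 24 is an arbitrary choice.
A_columns = candidates[:k]

# Columns of the identity matrix I_5.
identity_columns = [tuple(int(i == j) for i in range(r)) for j in range(r)]

# Columns of P = [A I_5].
P_columns = A_columns + identity_columns
n = len(P_columns)

# Condition used above: all columns of P are nonzero and distinct.
columns_ok = len(set(P_columns)) == n and all(any(c) for c in P_columns)
print(len(candidates), n, columns_ok)
```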
3) Suppose P is the parity-check matrix of a binary linear code of type (n, k) that corrects all single-bit errors, and let r = n − k.
Then Theorem 19.4.1 tells us that the columns of P must be distinct (and nonzero). However, P is r × n, and n = k + r, so it has
k + r columns of length r, and there are only $2^r - 1$ different possible nonzero columns of length r. Therefore, we must have
$k + r \le 2^r - 1$. Conversely, if this inequality is satisfied, then we can construct an r × (k + r) parity-check matrix whose columns
are distinct and nonzero.
Thus:
r = 2 check bits suffice for k = 1, because $k + r = 1 + 2 = 3 = 2^2 - 1 = 2^r - 1$. (But r = 1 check bit does not suffice,
because $k + r \ge 1 + 1 = 2 > 2^1 - 1 = 2^r - 1$.)
r = 3 check bits suffice for k = 2, 3, 4, because $k + r \le 4 + 3 = 7 = 2^3 - 1 = 2^r - 1$. (But r = 2 check bits do not suffice,
because $k + r \ge 2 + 2 = 4 > 2^2 - 1 = 2^r - 1$.)
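The criterion above, that r check bits suffice exactly when k + r ≤ 2^r − 1, turns directly into a short search for the smallest such r. The following Python sketch is one way to do that; the function name `min_check_bits` is hypothetical.

```python
def min_check_bits(k):
    """Smallest r satisfying k + r <= 2**r - 1."""
    r = 1
    while k + r > 2**r - 1:
        r += 1
    return r

for k in [1, 2, 3, 4, 24]:
    print(k, min_check_bits(k))
```

For k = 24 this gives r = 5, in agreement with the (29, 24) code constructed earlier.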
By Page
Combinatorics (Morris) - CC BY-NC-SA 4.0
  Front Matter - CC BY-NC-SA 4.0
    TitlePage - CC BY-NC-SA 4.0
    InfoPage - CC BY-NC-SA 4.0
    Table of Contents - Undeclared
    Licensing - Undeclared
  1: Introduction - CC BY-NC-SA 4.0
    1: What is Combinatorics? - CC BY-NC-SA 4.0
      1.1: Enumeration - CC BY-NC-SA 4.0
      1.2: Graph Theory - CC BY-NC-SA 4.0
      1.3: Ramsey Theory - CC BY-NC-SA 4.0
      1.4: Design Theory - CC BY-NC-SA 4.0
      1.5: Coding Theory - CC BY-NC-SA 4.0
      1.6: Summary - CC BY-NC-SA 4.0
  2: Enumeration - CC BY-NC-SA 4.0
    2: Basic Counting Techniques - CC BY-NC-SA 4.0
      2.1: The Product Rule - CC BY-NC-SA 4.0
      2.2: The Sum Rule - CC BY-NC-SA 4.0
      2.3: Putting Them Together - CC BY-NC-SA 4.0
      2.4: Summing Up - CC BY-NC-SA 4.0
      2.5: Summary - CC BY-NC-SA 4.0
    3: Permutations, Combinations, and the Binomial Theorem - CC BY-NC-SA 4.0
      3.1: Permutations - CC BY-NC-SA 4.0
      3.2: Combinations - CC BY-NC-SA 4.0
      3.3: The Binomial Theorem - CC BY-NC-SA 4.0
      3.4: Summary - CC BY-NC-SA 4.0
    4: Bijections and Combinatorial Proofs - CC BY-NC-SA 4.0
      4.1: Counting via Bijections - CC BY-NC-SA 4.0
      4.2: Combinatorial Proofs - CC BY-NC-SA 4.0
      4.3: Summary - CC BY-NC-SA 4.0
    5: Counting with Repetitions - CC BY-NC-SA 4.0
      5.1: Unlimited Repetition - CC BY-NC-SA 4.0
      5.2: Sorting a Set that Contains Repetition - CC BY-NC-SA 4.0
      5.3: Summary - CC BY-NC-SA 4.0
    6: Induction and Recursion - CC BY-NC-SA 4.0
      6.1: Recursively-Defined Sequences - CC BY-NC-SA 4.0
      6.2: Basic Induction - CC BY-NC-SA 4.0
      6.3: More Advanced Induction - CC BY-NC-SA 4.0
      6.4: Summary - CC BY-NC-SA 4.0
    7: Generating Functions - CC BY-NC-SA 4.0
      7.1: What is a Generating Function? - CC BY-NC-SA 4.0
      7.2: The Generalized Binomial Theorem - CC BY-NC-SA 4.0
      7.3: Using Generating Functions To Count Things - CC BY-NC-SA 4.0
      7.4: Summary - CC BY-NC-SA 4.0
    8: Generating Functions and Recursion - CC BY-NC-SA 4.0
      8.1: Partial Fractions - CC BY-NC-SA 4.0
      8.2: Factoring Polynomials - CC BY-NC-SA 4.0
      8.3: Using Generating Functions to Solve Recursively-Defined Sequences - CC BY-NC-SA 4.0
      8.4: Summary - CC BY-NC-SA 4.0
    9: Some Important Recursively-Defined Sequences - CC BY-NC-SA 4.0
      9.1: Derangements - CC BY-NC-SA 4.0
      9.2: Catalan Numbers - CC BY-NC-SA 4.0
      9.3: Bell Numbers and Exponential Generating Functions - CC BY-NC-SA 4.0
      9.4: Summary - CC BY-NC-SA 4.0
    10: Other Basic Counting Techniques - CC BY-NC-SA 4.0
      10.1: The Pigeonhole Principle - CC BY-NC-SA 4.0
      10.2: Inclusion-Exclusion - CC BY-NC-SA 4.0
1 https://math.libretexts.org/@go/page/115448
      10.3: Summary - CC BY-NC-SA 4.0
  3: Graph Theory - CC BY-NC-SA 4.0
    11: Basics of Graph Theory - CC BY-NC-SA 4.0
      11.1: Background - CC BY-NC-SA 4.0
      11.2: Basic Definitions, Terminology, and Notation - CC BY-NC-SA 4.0
      11.3: Deletion, Complete Graphs, and the Handshaking Lemma - CC BY-NC-SA 4.0
      11.4: Graph Isomorphisms - CC BY-NC-SA 4.0
      11.5: Summary - CC BY-NC-SA 4.0
    12: Moving Through Graphs - CC BY-NC-SA 4.0
      12.1: Directed Graphs - CC BY-NC-SA 4.0
      12.2: Walks and Connectedness - CC BY-NC-SA 4.0
      12.3: Paths and Cycles - CC BY-NC-SA 4.0
      12.4: Trees - CC BY-NC-SA 4.0
      12.5: Summary - CC BY-NC-SA 4.0
    13: Euler and Hamilton - CC BY-NC-SA 4.0
      13.1: Euler Tours and Trails - CC BY-NC-SA 4.0
      13.2: Hamilton Paths and Cycles - CC BY-NC-SA 4.0
      13.3: Summary - CC BY-NC-SA 4.0
    14: Graph Coloring - CC BY-NC-SA 4.0
      14.1: Edge Coloring - CC BY-NC-SA 4.0
      14.2: Ramsey Theory - CC BY-NC-SA 4.0
      14.3: Vertex Colouring - CC BY-NC-SA 4.0
      14.4: Summary - CC BY-NC-SA 4.0
    15: Planar Graphs - CC BY-NC-SA 4.0
      15.1: Planar Graphs - CC BY-NC-SA 4.0
      15.2: Euler’s Formula - CC BY-NC-SA 4.0
      15.3: Map Colouring - CC BY-NC-SA 4.0
      15.4: Summary - CC BY-NC-SA 4.0
  4: Design Theory - CC BY-NC-SA 4.0
    16: Latin Squares - CC BY-NC-SA 4.0
      16.1: Latin Squares and Sudokus - CC BY-NC-SA 4.0
      16.2: Mutually Orthogonal Latin Squares (MOLS) - CC BY-NC-SA 4.0
      16.3: Systems of Distinct Representatives - CC BY-NC-SA 4.0
      16.4: Summary - CC BY-NC-SA 4.0
    17: Designs - CC BY-NC-SA 4.0
      17.1: Balanced Incomplete Block Designs (BIBD) - CC BY-NC-SA 4.0
      17.2: Constructing Designs and Existence of Designs - CC BY-NC-SA 4.0
      17.3: Fisher’s Inequality - CC BY-NC-SA 4.0
      17.4: Summary - CC BY-NC-SA 4.0
    18: More Designs - CC BY-NC-SA 4.0
      18.1: Steiner and Kirkman Triple Systems - CC BY-NC-SA 4.0
      18.2: t-Designs - CC BY-NC-SA 4.0
      18.3: Affine Planes - CC BY-NC-SA 4.0
      18.4: Projective Planes - CC BY-NC-SA 4.0
      18.5: Summary - CC BY-NC-SA 4.0
    19: Designs and Codes - CC BY-NC-SA 4.0
      19.1: Introduction - CC BY-NC-SA 4.0
      19.2: Error-Correcting Codes - CC BY-NC-SA 4.0
      19.3: Using the Generator Matrix For Encoding - CC BY-NC-SA 4.0
      19.4: Using the Parity-Check Matrix For Decoding - CC BY-NC-SA 4.0
      19.5: Codes From Designs - CC BY-NC-SA 4.0
      19.6: Summary - CC BY-NC-SA 4.0
  Back Matter - CC BY-NC-SA 4.0
    Index - CC BY-NC-SA 4.0
    Glossary - CC BY-NC-SA 4.0
    List of Notation - CC BY-NC-SA 4.0
    Appendix A: Solutions To Selected Exercises - CC BY-NC-SA 4.0
    Detailed Licensing - Undeclared