LectureNotes DHolt
Derek Holt
revisions by David Mond and Daan Krammer
September 1, 2006
Contents
1 Introduction: 3 Sample Problems
2 Sums
2.1 Notation
2.2 Sums and Recurrences
2.3 The Perturbation Method
2.4 Multiple Sums
2.5 Finite and Infinite Calculus
2.6 Negative Exponents in Rising and Falling powers
3 Integer Functions
3.1 Floors and Ceilings
3.2 Floor and Ceiling Problems
3.3 Spectra
3.4 Division
3.5 Floor and Ceiling Sums
4 Binomial Coefficients
4.1 Homogeneous trees
4.2 Binomial Coefficients
4.3 Easy Identities
4.4 Some more complicated identities
4.5 Derangements
4.6 Multinomial coefficients
5 Special Numbers
5.1 Stirling Numbers
5.2 Harmonic Numbers
5.3 Fibonacci Numbers, Fn
6 Generating Functions
6.1 Basic Manipulation
6.2 Representations of sequences
6.3 Partial Fractions
6.4 Solving Recurrences
6.5 The simplest sort of differential equations
6.6 Exponential Generating Functions
6.7 Generating functions in more variables
7 Discrete Probability
7.1 Sample Spaces and Random Variables
7.2 Probability Generating Functions (PGF’s)
7.3 Tossing Coins
1 Introduction: 3 Sample Problems
[Figure: three pegs A, B and C, with a pile of n discs on peg A.]
Object: Move pile A to B by moving one disc at a time. A disc may never
rest on a smaller one. What is the minimal number of moves, $T_n$?
$$T_1 = 1, \qquad T_n = 2T_{n-1} + 1 \qquad (1)$$
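Recurrence (1) can be checked against an explicit list of moves. The sketch below is only an illustration: the function names are made up for this example, and the recursive strategy shown is the standard one (move n − 1 discs aside, move the largest disc, move the n − 1 discs back), which uses exactly $T_n$ moves.

```python
def hanoi_count(n: int) -> int:
    """T_n computed from the recurrence T_1 = 1, T_n = 2*T_{n-1} + 1."""
    t = 1
    for _ in range(n - 1):
        t = 2 * t + 1
    return t


def hanoi_moves(n, src="A", dst="B", via="C"):
    """Yield the moves (disc, from_peg, to_peg) of the standard recursive strategy."""
    if n == 0:
        return
    yield from hanoi_moves(n - 1, src, via, dst)  # park the n-1 smaller discs on the spare peg
    yield (n, src, dst)                           # move the largest disc
    yield from hanoi_moves(n - 1, via, dst, src)  # bring the smaller discs back on top


for n in range(1, 6):
    # number of discs, T_n from the recurrence, number of moves actually generated
    print(n, hanoi_count(n), sum(1 for _ in hanoi_moves(n)))
```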
What is the maximum number of regions that n infinite lines can divide a
plane into? Call this $L_n$. The maximum is attained when no two lines are
parallel and no three lines meet at a point.
$L_0 = 1$ (no line)
$L_1 = 2$
$L_2 = 4$
$L_3 = 7$
What happens when we introduce line n, which intersects the other (n − 1) lines?
It gets divided into n sections. Each section divides an old region into two.
So
$$L_n = L_{n-1} + n. \qquad (2)$$
Hence $L_n = L_0 + \sum_{k=1}^{n} k$, for example:
$L_5 = L_4 + 5$
$= L_3 + 4 + 5$
$= L_2 + 3 + 4 + 5$
$= L_1 + 2 + 3 + 4 + 5$
$= L_0 + 1 + 2 + 3 + 4 + 5.$
Solution:
$$L_n = 1 + \frac{n(n+1)}{2},$$
which can be proved formally by induction.
In general, if
$$L_n = L_{n-1} + f(n),$$
then
$$L_n = L_0 + \sum_{k=1}^{n} f(k).$$
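As a quick numerical check of the closed form, the following sketch (the function name is chosen for this example only) iterates recurrence (2) and compares the result with $1 + n(n+1)/2$.

```python
def regions(n: int) -> int:
    """L_n from the recurrence L_0 = 1, L_n = L_{n-1} + n."""
    total = 1
    for k in range(1, n + 1):
        total += k  # the k-th line is cut into k sections, each splitting one old region
    return total


for n in range(6):
    print(n, regions(n), 1 + n * (n + 1) // 2)
```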
[Figure: n people seated in a circle, numbered 1, 2, ..., n.]
n people sit in a circle. Going round clockwise, alternate people are removed.
Which person is left last? Call the number of this person Jn . For example,
[Figure: a circle of 8 people, numbered 1 to 8.]
If n = 2k: in the first round all even-numbered people are removed. Then k
people remain, numbered 2i − 1 ($1 \le i \le k$). Hence the person left at the
end will have number $2J_k - 1$. So $J_{2k} = 2J_k - 1$.
If n = 2k + 1: in the first round the even-numbered people go. Then 1 goes.
Then k people are left with numbers 2i + 1 ($1 \le i \le k$). So $J_{2k+1} = 2J_k + 1$.
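Claim: if $n = 2^m + L$ with $0 \le L < 2^m$, then $J_n = 2L + 1$. The proof is by induction on n.
Case 1: n = 2k.
$$n = 2^m + L \;\Rightarrow\; k = 2^{m-1} + L/2$$
By the inductive hypothesis:
$$J_k = 2(L/2) + 1 = L + 1.$$
So
$$J_n = J_{2k} = 2J_k - 1 = 2(L + 1) - 1 = 2L + 1$$
as required.
Case 2: n = 2k + 1.
$$n = 2^m + L \;\Rightarrow\; k = 2^{m-1} + (L - 1)/2$$
By the inductive hypothesis:
$$J_k = 2\left(\frac{L-1}{2}\right) + 1 = L.$$
So
$$J_n = J_{2k+1} = 2J_k + 1 = 2L + 1,$$
as required.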
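The formula $J_n = 2L + 1$ for $n = 2^m + L$, $0 \le L < 2^m$, can be compared with a direct simulation of the circle. This is a small sketch with illustrative function names; it assumes, as in the text, that person 2 is the first to be removed.

```python
def josephus_simulated(n: int) -> int:
    """Remove every second person from a circle 1..n and return the survivor."""
    people = list(range(1, n + 1))
    idx = 0  # position of the person who is skipped next
    while len(people) > 1:
        idx = (idx + 1) % len(people)  # step past one person ...
        people.pop(idx)                # ... and remove the next one
    return people[0]


def josephus_closed(n: int) -> int:
    """J_n = 2L + 1 where n = 2^m + L and 0 <= L < 2^m."""
    m = n.bit_length() - 1  # largest m with 2^m <= n
    return 2 * (n - 2**m) + 1


for n in range(1, 17):
    print(n, josephus_simulated(n), josephus_closed(n))
```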
2 Sums
2.1 Notation
$$\sum_{k=1}^{n} a_k = a_1 + a_2 + \dots + a_n \quad\text{and}\quad \sum_{1 \le k \le n} a_k$$
[5 is prime] = 1
[2 + 3 = 5] = 1
[2 + 3 = 6] = 0
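$$\sum_{1 \le k \le 12} \frac{[k \text{ is prime}]}{k} = \frac12 + \frac13 + \frac15 + \frac17 + \frac1{11}$$
$f : \mathbb{R} \to \mathbb{R}$, where
$$f(x) = \begin{cases} 0, & x \text{ rational} \\ 1, & x \text{ irrational} \end{cases}$$
is the same as
$$f(x) = [x \notin \mathbb{Q}].$$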
The empty sum is zero. That is, if
$$S_n = \sum_{k=1}^{n} k$$
then
$S_0$ = empty sum = 0
Discrete              Continuous
Sum                   Integral
Recurrence relation   Differential equation
Sums can be converted to recurrences, e.g. $S_n = \sum_{k=1}^{n} 2^{-k}$ is equivalent to
$S_0 = 0$, $S_n = S_{n-1} + 2^{-n}$. It is sometimes useful to go the other way. This
can be done by multiplying by the “summation factor” (which is like the
integrating factor in differential equations).
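Example 4: (Hanoi)
$$T_0 = 0, \qquad T_n = 2T_{n-1} + 1$$
Dividing by $2^n$ gives
$$\frac{T_n}{2^n} = \frac{T_{n-1}}{2^{n-1}} + \frac{1}{2^n}.$$
Put $S_n = T_n/2^n$, so
$$S_n = S_{n-1} + 2^{-n},$$
with $S_0 = 0$. So
$$S_n = \sum_{k=1}^{n} 2^{-k} = 1 - \left(\frac12\right)^n \quad \text{(sum of a G.P.)}$$
So $T_n = 2^n - 1$.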
(We used:
$$\sum_{k=1}^{n} ar^k = \frac{a(r^{n+1} - r)}{r - 1},$$
the sum of a G.P. (geometric progression).)
$$a_n T_n = b_n T_{n-1} + c_n \quad (b_n \ne 0),$$
$$S_n = \Omega_n a_n T_n$$
to get
$$S_n = S_{n-1} + \Omega_n c_n.$$
So
$$S_n = \sum_{k=1}^{n} \Omega_k c_k + S_0.$$
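Example 5: (from the analysis of Quicksort)
$$c_0 = 0$$
$$c_n = n + 1 + \frac{2}{n} \sum_{k=0}^{n-1} c_k \qquad (3)$$
$$n c_n = n^2 + n + 2 \sum_{k=0}^{n-1} c_k \qquad (4)$$
so
$$S_n = S_{n-1} + \frac{2}{n+1}$$
where $S_0 = 0$. So
$$S_n = \sum_{k=1}^{n} \frac{2}{k+1}.$$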
Notation: Since there is no closed formula for this sum, we introduce a new notation.
Define the harmonic numbers $H_n$ by
$$H_n = \sum_{k=1}^{n} \frac1k, \qquad H_0 = 0$$
Note that $H_n \to \infty$ as $n \to \infty$.
The solution to the recurrence is
$$S_n = 2(H_{n+1} - 1)$$
$$c_n = 2(n + 1)(H_{n+1} - 1).$$
In fact
$$H_n \sim \log_e n + \gamma$$
where
γ = Euler’s constant ≈ 0.5772156
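The solution $c_n = 2(n+1)(H_{n+1} - 1)$ can be compared with values computed directly from recurrence (3). The sketch below uses exact rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction


def quicksort_costs(n_max: int):
    """c_0..c_{n_max} from c_0 = 0, c_n = n + 1 + (2/n) * (c_0 + ... + c_{n-1})."""
    c = [Fraction(0)]
    running = Fraction(0)  # c_0 + ... + c_{n-1}
    for n in range(1, n_max + 1):
        running += c[n - 1]
        c.append(n + 1 + Fraction(2, n) * running)
    return c


def harmonic(n: int) -> Fraction:
    """H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


costs = quicksort_costs(10)
for n, c_n in enumerate(costs):
    print(n, c_n, 2 * (n + 1) * (harmonic(n + 1) - 1))
```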
2.3 The Perturbation Method
Let $S_n = \sum_{k=0}^{n} a_k$ be a sum. Then
$$S_{n+1} = S_n + a_{n+1} = a_0 + \sum_{k=0}^{n} a_{k+1}$$
The idea is to try to get a formula for the right-hand side in terms of $S_n$. Then solve
for $S_n$.
Example 6: Geometric Series. We define $S_n = \sum_{k=0}^{n} x^k$. Then
$$S_{n+1} = S_n + x^{n+1}.$$
Also
$$S_{n+1} = 1 + \sum_{k=0}^{n} x^{k+1} = 1 + xS_n$$
so
$$S_n = \frac{x^{n+1} - 1}{x - 1} \quad \text{if } x \ne 1$$
Example 7: $T_n = \sum_{k=0}^{n} k x^k$. We have
$$T_{n+1} = T_n + (n + 1)x^{n+1}$$
and
$$T_{n+1} = \sum_{k=0}^{n} (k + 1)x^{k+1} = xT_n + \sum_{k=0}^{n} x^{k+1} = xT_n + \frac{x^{n+2} - x}{x - 1}$$
so
$$T_n = \frac{-x^{n+2} + x}{(x - 1)^2} + \frac{(n + 1)x^{n+1}}{x - 1}.$$
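The closed form of Example 7 is easy to get wrong by a sign, so a numerical check is worthwhile. This is a minimal sketch with illustrative names, using exact fractions and assuming $x \ne 1$.

```python
from fractions import Fraction


def t_direct(n: int, x: Fraction) -> Fraction:
    """T_n = sum of k * x^k for k = 0..n, summed term by term."""
    return sum((k * x**k for k in range(n + 1)), Fraction(0))


def t_closed(n: int, x: Fraction) -> Fraction:
    """Closed form obtained by the perturbation method (requires x != 1)."""
    return (-x**(n + 2) + x) / (x - 1)**2 + (n + 1) * x**(n + 1) / (x - 1)


x = Fraction(3, 2)
for n in range(6):
    print(n, t_direct(n, x), t_closed(n, x))
```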
factorises to
$$\Big(\sum_{j=1}^{n} a_j\Big)\Big(\sum_{k=1}^{n} b_k\Big).$$
We aim to reduce these to single sums. More generally, the indices are
not independent, for example:
$$\sum_{1 \le j < k \le n} (\cdots) = \sum_{j=1}^{n} \sum_{k=j+1}^{n} (\cdots) = \sum_{k=1}^{n} \sum_{j=1}^{k-1} (\cdots).$$
If you find it hard to find an equation like (7), it can help to draw a picture
consisting of boxes, one box at position (i, j) for each value of (i, j) involved
in the sum. In the case of (7) with n = 3 that would be:
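Example 8:
$$\sum_{1 \le j < k \le n} a_j a_k$$
$$\begin{pmatrix} a_1^2 & a_1 a_2 & \dots & a_1 a_n \\ a_2 a_1 & a_2^2 & \dots & a_2 a_n \\ \vdots & \vdots & & \vdots \\ a_n a_1 & a_n a_2 & \dots & a_n^2 \end{pmatrix}$$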
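We are summing terms above the main diagonal of the matrix. Call this sum
$S_q$, and call the sum of the terms below the diagonal $S_x$. The matrix is
symmetric, giving $S_q = S_x$. So
$$\Big(\sum_{k=1}^{n} a_k\Big)^2 = \Big(\sum_{j=1}^{n} a_j\Big)\Big(\sum_{k=1}^{n} a_k\Big) = \sum_{j=1}^{n} \sum_{k=1}^{n} a_j a_k = S_x + S_q + \sum_{k=1}^{n} a_k^2 = 2S_q + \sum_{k=1}^{n} a_k^2.$$
So
$$S_q = \frac12 \left[\Big(\sum_{k=1}^{n} a_k\Big)^2 - \sum_{k=1}^{n} a_k^2\right].$$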
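The identity for the sum above the diagonal can be checked on any list of integers. The sketch below (function names are illustrative) compares the double sum over pairs j < k with the single-sum formula.

```python
from itertools import combinations


def upper_triangle_sum(a):
    """Sum of a_j * a_k over all pairs with j < k."""
    return sum(x * y for x, y in combinations(a, 2))


def upper_triangle_formula(a):
    """((sum a_k)^2 - sum a_k^2) / 2; the numerator is always even for integers."""
    return (sum(a) ** 2 - sum(x * x for x in a)) // 2


sample = [3, -1, 4, 1, 5]
print(upper_triangle_sum(sample), upper_triangle_formula(sample))
```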
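Example 9:
If
$$S_n = \sum_{1 \le j < k \le n} \frac{1}{k - j}$$
then
$S_1 = 0$ (empty sum)
$$S_3 = \frac{1}{2-1} + \frac{1}{3-1} + \frac{1}{3-2} = \frac52$$
[Figure: the terms 1/(k − j) laid out in a grid indexed by j and k, with entries 1/1, 1/2, 1/3, 1/4, ...]
• Sum by rows:
$$S_n = \sum_{j=1}^{n} \sum_{k=j+1}^{n} \frac{1}{k - j}.$$
Change of variable: put k − j = l.
$$S_n = \sum_{j=1}^{n} \sum_{l=1}^{n-j} \frac1l = \sum_{j=1}^{n} H_{n-j} = \sum_{j=0}^{n-1} H_j.$$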
$$S_1(n) = \sum_{k=0}^{n} k = \frac{n(n+1)}{2} = \frac{n^2}{2} + O(n)$$
$$S_2(n) = \sum_{k=0}^{n} k^2 = \frac{n^3}{3} + O(n^2)$$
In general,
$$S_i(n) = \frac{n^{i+1}}{i+1} + O(n^i)$$
corresponds roughly to
$$\int x^i \, dx = \frac{x^{i+1}}{i+1},$$
but the correspondence is not exact.
Definition 1: The following notation is not very common, but we will find
it quite useful.
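Some properties (here $x^{\underline{m}} = x(x-1)\cdots(x-m+1)$ is the falling power and $x^{\overline{m}} = x(x+1)\cdots(x+m-1)$ the rising power):
$$(x+1)^{\underline{m}} - x^{\underline{m}} = x(x-1)\cdots(x-m+2)\big((x+1) - (x-m+1)\big) = m x^{\underline{m-1}} \qquad (8)$$
Similarly
$$x^{\overline{m}} - (x-1)^{\overline{m}} = m x^{\overline{m-1}}. \qquad (9)$$
These are similar to differentiation formulae.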
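Consider
$$\sum_{k=0}^{n} k^{\underline{m}}$$
with
$$k^{\underline{m}} = \frac{(k+1)^{\underline{m+1}} - k^{\underline{m+1}}}{m+1}$$
(by (8) using (m + 1) for m).
On summing, most terms cancel. We get
$$\sum_{k=0}^{n} k^{\underline{m}} = \frac{(n+1)^{\underline{m+1}} - 0^{\underline{m+1}}}{m+1} = \frac{(n+1)^{\underline{m+1}}}{m+1}, \quad \text{for } m \ge 0, \qquad (10)$$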
We have:
$$\sum_{k=1}^{n} k = \frac{n(n+1)}{2},$$
$$\sum_{k=1}^{n} k(k+1) = \frac{n(n+1)(n+2)}{3}, \quad \text{etc.}$$
So
$$S_3(n) = \frac{n(n+1)(n+2)(n+3)}{4} - \frac{3n(n+1)(n+2)}{3} + \frac{n(n+1)}{2}$$
$$= \frac{n(n+1)}{4}\left(n^2 + 5n + 6 - 4n - 8 + 2\right) = \frac{n^2(n+1)^2}{4} = (S_1(n))^2.$$
Note: Since
$$k^{\underline{m}} = m! \binom{k}{m},$$
equation (10) gives
$$\sum_{k=0}^{n} \binom{k}{m} = \binom{n+1}{m+1}, \quad n \ge 0.$$
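Equations (8) and (10), and the binomial form just noted, can be checked numerically. The sketch below assumes the usual definition of the falling power, $x^{\underline{m}} = x(x-1)\cdots(x-m+1)$ for $m \ge 0$; the function name is illustrative.

```python
from math import comb


def falling(x: int, m: int) -> int:
    """Falling power x(x-1)...(x-m+1) for m >= 0 (empty product = 1)."""
    result = 1
    for i in range(m):
        result *= x - i
    return result


n, m = 10, 3
lhs = sum(falling(k, m) for k in range(n + 1))
rhs = falling(n + 1, m + 1) // (m + 1)
print(lhs, rhs)                                   # both sides of (10)
print(sum(comb(k, m) for k in range(n + 1)), comb(n + 1, m + 1))
```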
2.6 Negative Exponents in Rising and Falling powers
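Motivation:
$x^{\underline{3}} = x(x-1)(x-2)$; divide by (x − 2)
$x^{\underline{2}} = x(x-1)$; divide by (x − 1)
$x^{\underline{1}} = x$; divide by x
$x^{\underline{0}} = 1$; divide by x + 1
$x^{\underline{-1}} = \frac{1}{x+1}$; ...
$x^{\underline{-2}} = \frac{1}{(x+1)(x+2)}$; ...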
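Definition 2:
$$x^{\underline{-m}} = \frac{1}{(x+1)\cdots(x+m)} = \frac{1}{(x+1)^{\overline{m}}}$$
$$x^{\overline{-m}} = \frac{1}{(x-1)\cdots(x-m)} = \frac{1}{(x-1)^{\underline{m}}}$$
It can be checked that equations (8) and (9) are still true for $m \le 0$. So they
are true for all $m \in \mathbb{Z}$.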
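Example 10:
$$\sum_{k=0}^{n} k^{\underline{-2}} = \sum_{k=0}^{n} \frac{1}{(k+1)(k+2)} = -\left(\frac{1}{n+2} - 1\right) \quad\text{so}\quad \sum_{k=0}^{\infty} k^{\underline{-2}} = 1$$
$$\sum_{k=0}^{n} k^{\underline{-3}} = -\frac12\left(\frac{1}{(n+2)(n+3)} - \frac12\right) \quad\text{so}\quad \sum_{k=0}^{\infty} k^{\underline{-3}} = \frac14$$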
What happens for m = −1?
$$\sum_{k=0}^{n} k^{\underline{-1}} = \sum_{k=0}^{n} \frac{1}{k+1} = H_{n+1} \sim \log_e(n+1)$$
Compare this with $\int \frac1x \, dx = \log_e x$.
3 Integer Functions
Definition 3: For x ∈ R:
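Example 11:
$\lfloor 3 \rfloor = \lfloor \pi \rfloor = 3$,
$\lceil 3 \rceil = 3$,
$\lceil \pi \rceil = 4$,
$\lfloor -2 \rfloor = \lfloor -1.4 \rfloor = -2$,
$\lceil -1.4 \rceil = -1$,
$\lceil -2 \rceil = -2$.
$$\lfloor x \rfloor = x \iff x \in \mathbb{Z} \iff \lceil x \rceil = x$$
$$x - 1 < \lfloor x \rfloor \le x \le \lceil x \rceil < x + 1$$
$$\lfloor -x \rfloor = -\lceil x \rceil, \qquad \lceil -x \rceil = -\lfloor x \rfloor$$
$$\lfloor x \rfloor = n \iff n \le x < n + 1 \iff x - 1 < n \le x$$
If $n \in \mathbb{Z}$ then $\lfloor x + n \rfloor = \lfloor x \rfloor + n$.
Note that $\lfloor x + y \rfloor$ and $\lfloor x \rfloor + \lfloor y \rfloor$ are not always equal, for example
$\lfloor 3/4 + 3/4 \rfloor = 1 \ne 0 = \lfloor 3/4 \rfloor + \lfloor 3/4 \rfloor$.
$$x \bmod y = x - y\lfloor x/y \rfloor$$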
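These rules can be spot-checked with exact rational numbers. The sketch below uses Python's `math.floor` and `math.ceil` on `Fraction` values; the helper name `mod` is chosen here for illustration and implements the definition $x \bmod y = x - y\lfloor x/y \rfloor$ directly.

```python
from fractions import Fraction
from math import ceil, floor


def mod(x, y):
    """x mod y defined as x - y * floor(x / y), for y != 0."""
    return x - y * floor(x / y)


x = Fraction(-7, 5)                      # -1.4
print(floor(x), ceil(x))                 # floor and ceiling of -1.4
print(floor(-x) == -ceil(x))             # reflection rule
print(floor(Fraction(3, 4) + Fraction(3, 4)), floor(Fraction(3, 4)) * 2)
print(mod(Fraction(7), Fraction(3)), mod(Fraction(-7), Fraction(3)))
```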
Example 12:
Exercise. Let a < b be real numbers. We have the four types of intervals
$[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}$,
$[a, b) = \{x \in \mathbb{R} \mid a \le x < b\}$,
$(a, b] = \{x \in \mathbb{R} \mid a < x \le b\}$,
$(a, b) = \{x \in \mathbb{R} \mid a < x < b\}$.
Complete the following table, which expresses the number of integers in such
intervals in terms of the floor and ceiling:
|[a, b] ∩ Z| = |[a, b) ∩ Z| =
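Proof:
$$m = \lfloor \sqrt{\lfloor x \rfloor} \rfloor \iff m \le \sqrt{\lfloor x \rfloor} < m + 1$$
$$\iff m^2 \le \lfloor x \rfloor < (m + 1)^2$$
$$\iff m^2 \le x < (m + 1)^2$$
$$\iff m \le \sqrt{x} < m + 1$$
$$\iff m = \lfloor \sqrt{x} \rfloor$$
Example 14: Is $\lceil \sqrt{\lfloor x \rfloor} \rceil = \lceil \sqrt{x} \rceil$? Not always.
$m + 1 = \lceil \sqrt{\lfloor x \rfloor} \rceil \iff m^2 < \lfloor x \rfloor \le (m + 1)^2$. If $m^2 \le \lfloor x \rfloor < m^2 + 1$ then
$\lfloor x \rfloor = m^2$ and $\sqrt{\lfloor x \rfloor} = m$.
So this equation fails for x satisfying $m^2 < x < m^2 + 1$ for some m, for
example x = 4.3 or x = 9.5.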
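Both statements can be tested on a grid of sample values. The sketch below uses floating-point square roots, which is adequate for the small values used here; the function names and the grid of multiples of 1/10 are choices made for this illustration.

```python
from math import ceil, floor, sqrt


def floor_identity_holds(x: float) -> bool:
    """Check floor(sqrt(floor(x))) == floor(sqrt(x)) for x >= 0."""
    return floor(sqrt(floor(x))) == floor(sqrt(x))


def ceil_identity_holds(x: float) -> bool:
    """Check ceil(sqrt(floor(x))) == ceil(sqrt(x)) for x >= 0 (false in general)."""
    return ceil(sqrt(floor(x))) == ceil(sqrt(x))


samples = [k / 10 for k in range(0, 200)]
print(all(floor_identity_holds(x) for x in samples))
print([x for x in samples if not ceil_identity_holds(x)][:12])  # first few failures
```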
For $n \in [729 \dots 999]$, $\lfloor \sqrt[3]{n} \rfloor = 9$, so (∗) holds for multiples of 9 in this range.
For n = 1000, $\lfloor \sqrt[3]{n} \rfloor = 10$.
3.3 Spectra
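Proof
Since $1 \le x$ we have $\lfloor x \rfloor < \lfloor 2x \rfloor < \lfloor 3x \rfloor < \cdots$ and similarly for y. So in
order to prove $\mathrm{spec}(x) \ne \mathrm{spec}(y)$ it is enough to show that there exists n > 0
with $\lfloor nx \rfloor \ne \lfloor ny \rfloor$.
Since x < y, there exists $n \in \mathbb{N}$ with 1/n < y − x. It follows that 1 < ny − nx
and therefore $\lfloor ny \rfloor > \lfloor nx \rfloor$ as required. □
Let $\phi = (\sqrt5 + 1)/2$, the golden ratio, about 1.618. Then
We notice that
$$\mathrm{spec}(\phi) \cup \mathrm{spec}(\phi^2) = \mathbb{N}$$
$$\mathrm{spec}(\phi) \cap \mathrm{spec}(\phi^2) = \emptyset$$
For example, take $\alpha = \sqrt2$ and $\beta = 2 + \sqrt2$.
$$\mathrm{spec}(\sqrt2) = \{1, 2, 4, 5, 7, 8, 9, 11, \dots\}$$
$$\mathrm{spec}(2 + \sqrt2) = \{3, 6, 10, 13, 17, \dots\}$$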
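The two spectra in this example can be generated exactly with integer arithmetic, using $\lfloor k\sqrt2 \rfloor = \lfloor \sqrt{2k^2} \rfloor$ and $\lfloor k(2+\sqrt2) \rfloor = 2k + \lfloor k\sqrt2 \rfloor$. The sketch below (names are illustrative) lists both and checks that together they cover 1, 2, ..., N without overlap.

```python
from math import isqrt


def spectrum(floor_multiple, limit: int):
    """All values floor(k * alpha) <= limit for k = 1, 2, ..., given k -> floor(k * alpha)."""
    values = []
    k = 1
    while floor_multiple(k) <= limit:
        values.append(floor_multiple(k))
        k += 1
    return values


def floor_k_sqrt2(k: int) -> int:
    return isqrt(2 * k * k)            # exact floor of k * sqrt(2)


def floor_k_2_plus_sqrt2(k: int) -> int:
    return 2 * k + isqrt(2 * k * k)    # exact floor of k * (2 + sqrt(2))


N = 40
a = spectrum(floor_k_sqrt2, N)
b = spectrum(floor_k_2_plus_sqrt2, N)
print(a)
print(b)
print(sorted(a + b) == list(range(1, N + 1)))
```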
Proof
For α ∈ R , α > 0 , n ∈ N put
We claim that if
$$\frac1\alpha + \frac1\beta = 1,$$
where α and β are irrational, then
$$N(\alpha, n) + N(\beta, n) = n, \quad \forall n \in \mathbb{N}.$$
This will prove the proposition, as changing n to (n + 1) must increase exactly
one of the terms N(α, n), N(β, n) by 1, that is,
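Proof of claim:
$$N(\alpha, n) = \sum_{k>0} [\lfloor k\alpha \rfloor \le n] = \sum_{k>0} [\lfloor k\alpha \rfloor < n + 1]$$
$$= \sum_{k>0} [k\alpha < n + 1] = \sum_{k>0} [0 < k < (n + 1)/\alpha]$$
$$= \lceil (n + 1)/\alpha \rceil - 1.$$
(In general the number of integers in (0, x) is $\lceil x \rceil - 1$.)
Since $\alpha \notin \mathbb{Q}$, we have $(n + 1)/\alpha \notin \mathbb{Z}$, so
$$\lceil (n + 1)/\alpha \rceil = (n + 1)/\alpha + \eta$$
for some η with 0 < η < 1. Similarly,
$$N(\beta, n) = \lceil (n + 1)/\beta \rceil - 1 = (n + 1)/\beta + \theta - 1$$
for some 0 < θ < 1.
$$N(\alpha, n) + N(\beta, n) = \frac{n+1}{\alpha} + \eta - 1 + \frac{n+1}{\beta} + \theta - 1$$
$$= n + 1 + \eta + \theta - 1 - 1$$
$$= n + \eta + \theta - 1$$
Since this is an integer and η, θ ∈ (0, 1), this can only be equal to n. So this
completes the proof. □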
3.4 Division
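Given n cakes to share between k people, how many should each get?
Write n = kq + r with $0 \le r < k$.
Then r people get q + 1 cakes and k − r people get q cakes. Note that
$$\lceil n/k \rceil = \lceil (n-1)/k \rceil = \dots = \lceil (n-r+1)/k \rceil = q + 1,$$
$$\lceil (n-r)/k \rceil = \lceil (n-r-1)/k \rceil = \dots = \lceil (n-k+1)/k \rceil = q.$$
So
$$\lceil n/k \rceil + \lceil (n-1)/k \rceil + \dots + \lceil (n-k+1)/k \rceil = r(q + 1) + (k - r)q = r + kq = n.$$
True for all $n \in \mathbb{Z}$, $k \in \mathbb{N}$.
Similarly,
$$\lfloor n/k \rfloor + \lfloor (n+1)/k \rfloor + \dots + \lfloor (n+k-1)/k \rfloor = n$$
This generalises to $x \in \mathbb{R}$. Replace n by kx to get
$$\lfloor kx \rfloor = \lfloor x \rfloor + \lfloor x + 1/k \rfloor + \lfloor x + 2/k \rfloor + \dots + \lfloor x + (k-1)/k \rfloor$$
True for all $x \in \mathbb{R}$, $k \in \mathbb{N}$.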
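The three identities of this section can be verified over a range of values. In the sketch below (illustrative names), integer ceilings are computed as `-(-n // k)`, and the real-variable identity is tested on rational x.

```python
from fractions import Fraction
from math import floor


def ceil_div(n: int, k: int) -> int:
    """Ceiling of n / k for integers with k > 0."""
    return -(-n // k)


def ceiling_share_sum(n: int, k: int) -> int:
    return sum(ceil_div(n - j, k) for j in range(k))


def floor_share_sum(n: int, k: int) -> int:
    return sum((n + j) // k for j in range(k))


def floor_multiple(x: Fraction, k: int) -> int:
    """Right-hand side of floor(kx) = sum of floor(x + j/k) for j = 0..k-1."""
    return sum(floor(x + Fraction(j, k)) for j in range(k))


print(ceiling_share_sum(17, 5), floor_share_sum(17, 5))
print(floor(5 * Fraction(22, 7)), floor_multiple(Fraction(22, 7), 5))
```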
3.5 Floor and Ceiling Sums
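Example 16:
$$\sum_{0 \le k \le n} \lfloor \sqrt{k} \rfloor = 0 + \underbrace{1 + 1 + 1}_{2^2 - 1^2 \text{ terms}} + \underbrace{2 + 2 + 2 + 2 + 2}_{3^2 - 2^2 \text{ terms}} + 3 + 3 + \dots$$
$$= \sum j\big((j + 1)^2 - j^2\big) + \text{some left over}$$
More precisely, put $a = \lfloor \sqrt{n} \rfloor$. Then
$$\sum_{0 \le k \le n} \lfloor \sqrt{k} \rfloor = \sum_{0 \le k < a^2} \lfloor \sqrt{k} \rfloor + \underbrace{\sum_{a^2 \le k \le n} \lfloor \sqrt{k} \rfloor}_{\text{left over terms}}$$
$$= \sum_{0 \le j < a} j\big((j + 1)^2 - j^2\big) + (n - a^2 + 1)a$$
$$= \sum_{0 \le j < a} j(2j + 1) + (n - a^2 + 1)a$$
$$= \sum_{0 \le j < a} \big(2j(j - 1) + 3j\big) + a(n - a^2 + 1)$$
$$= \frac{2a(a - 1)(a - 2)}{3} + \frac{3a(a - 1)}{2} + a(n - a^2 + 1)$$
$$= na - \frac{a^3}{3} - \frac{a^2}{2} + \frac{5a}{6}$$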
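The closed form of Example 16 can be compared with the sum computed term by term. In this sketch (illustrative names) the formula is multiplied through by 6 so that only integer arithmetic is needed.

```python
from math import isqrt


def floor_sqrt_sum_direct(n: int) -> int:
    """Sum of floor(sqrt(k)) for k = 0..n."""
    return sum(isqrt(k) for k in range(n + 1))


def floor_sqrt_sum_closed(n: int) -> int:
    """n*a - a^3/3 - a^2/2 + 5a/6 with a = floor(sqrt(n)), written over the denominator 6."""
    a = isqrt(n)
    return (6 * n * a - 2 * a**3 - 3 * a**2 + 5 * a) // 6


for n in (0, 3, 4, 10, 99, 100):
    print(n, floor_sqrt_sum_direct(n), floor_sqrt_sum_closed(n))
```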
4 Binomial Coefficients
n! = 1 · 2 · 3 · · · n.
| Bij(Xn )| = n!
f (2) = 3 f (2) = 8
When choosing f (1), all values of Xn are available, which gives n possibilities.
Xn \{f (1)}.
$n \cdot (n - 1) \cdot (n - 2) \cdots 1 = n!$. □
The remarkable thing in this proof is that the number of possible values of
f(2) is n − 1, regardless of the value of f(1) that has just been chosen.
Footnote 1: Compare the number of elements of a Cartesian product: $|A \times B| = |A| \times |B|$
An event of choosing an object is a path from top to bottom in a diagram
called a homogeneous (rooted) tree.
[Figure: a homogeneous rooted tree with $a_0 = 2$, $a_1 = 3$, $a_2 = 2$. Labels: x = a vertex, y = a child of x, z = a leaf. Number of leaves: $a_0 a_1 a_2 = 2 \cdot 3 \cdot 2 = 12$. Caption: This rooted tree is homogeneous.]
A rooted tree is homogeneous if any two vertices on the same level have the
same number of children. If a homogeneous tree has n levels, and a vertex
on level i has $a_i$ children, then the number of leaves (= vertices with no
children) is
$$a_0 a_1 \cdots a_{n-2}.$$
For nonhomogeneous trees there is no such formula.
We’ll see another homogeneous tree in the following section, and many more
after that.
Proof. For the duration of this proof, we write g(n, k) for the number of
k-element subsets of $X_n = \{1, \dots, n\}$. So we want to prove $g(n, k) = \binom{n}{k}$.
In Proposition 3 (counting permutations) we saw that | Bij(Xn )| = n!. Let
us compute | Bij(Xn )| another way as follows.
Step 1: Choose {f (1), f (2), . . . , f (k)} (only as a set, not the individual f (i)!).
By definition, there are g(n, k) ways to do this.
Step 2: Choose the individual values f (1), f (2), . . . , f (k). This boils down
to ordering k objects. By Proposition 3 (counting permutations) there are
k! ways of doing this. Note that this number does not depend on what
happened during step 1: the tree is homogeneous so far!
and
$$g(n, k) = \frac{n!}{k!\,(n - k)!} = \binom{n}{k}.$$
This finishes the proof. □
Note that for all $r \in \mathbb{R}$, $\binom{r}{0} = 1$. Also, $\binom{n}{k} = \binom{n}{n-k}$ if $0 \le k \le n$ are
integers.
Absorption: for $k \ne 0$,
$$\binom{r}{k} = \frac{r}{k}\binom{r-1}{k-1}$$
Proof
First assume n > k > 0, $n \in \mathbb{Z}$. Then
$\binom{n}{k}$ = number of ways of choosing k things from n things
$\binom{n-1}{k}$ = subsets of size k which do not include the red ball
So
$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}.$$
Write (13) as
$$\binom{x}{k} = \binom{x-1}{k} + \binom{x-1}{k-1}.$$
This is a polynomial equation in x of degree k. It is true for all $x \in \mathbb{Z}$ with
x > 0, so it has more than k roots. Hence (13) must be an identity. That is,
true for all $x \in \mathbb{R}$. □
Using Pascal’s triangle identity repeatedly, one easily produces the first few
rows of Pascal’s triangle:
Pascal’s triangle: the entry in row n, column k is $\binom{n}{k}$.
n\k  0   1   2   3   4   5   6   7
0    1
1    1   1
2    1   2   1
3    1   3   3   1
4    1   4   6   4   1
5    1   5  10  10   5   1
6    1   6  15  20  15   6   1
7    1   7  21  35  35  21   7   1
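The repeated use of Pascal's triangle identity is easy to mechanise. The sketch below (the generator name is illustrative) builds each row from the previous one and compares it with `math.comb`.

```python
from math import comb


def pascal_rows(n_rows: int):
    """Yield rows 0..n_rows-1, each built from the previous one by C(n,k) = C(n-1,k-1) + C(n-1,k)."""
    row = [1]
    for _ in range(n_rows):
        yield row
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]


for n, row in enumerate(pascal_rows(8)):
    print(n, row)
```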
In general (upper summation),
$$\sum_{0 \le k \le n} \binom{k}{m} = \binom{n+1}{m+1} \quad \text{for } m, n \in \mathbb{Z} \text{ and } m, n \ge 0,$$
Negation:
$$r^{\underline{k}} = r(r - 1)\cdots(r - k + 1)$$
$$= (-1)^k (-r)(1 - r)\cdots(k - 1 - r)$$
$$= (-1)^k (k - 1 - r)^{\underline{k}}$$
In particular
$$\binom{-1}{k} = (-1)^k \quad \text{for } k \ge 0,$$
$$\binom{-2}{k} = (-1)^k \binom{k+1}{k} = (-1)^k (k + 1).$$
From this we obtain a formula for the alternating sum of the first m + 1 elements
(k = 0, ..., m) in the r’th row of Pascal’s triangle:
$$\sum_{k \le m} (-1)^k \binom{r}{k} = \sum_{k \le m} \binom{k - 1 - r}{k} \quad \text{(upper negation)}$$
$$= \binom{-r + m}{m} \quad \text{(parallel summation)}$$
$$= (-1)^m \binom{r - 1}{m} \quad \text{(upper negation)}$$
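For integer rows the alternating-sum formula can be checked directly. The sketch below (illustrative names) restricts to integers $r \ge 1$, where `math.comb` applies.

```python
from math import comb


def alternating_row_sum(r: int, m: int) -> int:
    """Sum of (-1)^k * C(r, k) for k = 0..m."""
    return sum((-1) ** k * comb(r, k) for k in range(m + 1))


def alternating_row_closed(r: int, m: int) -> int:
    """(-1)^m * C(r - 1, m), valid here for integers r >= 1, m >= 0."""
    return (-1) ** m * comb(r - 1, m)


for m in range(7):
    print(m, alternating_row_sum(6, m), alternating_row_closed(6, m))
```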
But there is no known simple expression for the same sum without alternating
signs,
$$\sum_{k \le m} \binom{r}{k},$$
Vandermonde Convolution:
$$\sum_{k \in \mathbb{Z}} \binom{r}{k}\binom{s}{n-k} = \binom{r+s}{n}, \quad \text{for } n \in \mathbb{Z},\ r, s \in \mathbb{R}.$$
1. $r \in \mathbb{N}$, series is finite
The top ten binomial coefficient identities.
Factorial expansion: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, $n \ge k \ge 0$ integers
Symmetry: $\binom{n}{k} = \binom{n}{n-k}$, $n \ge k \ge 0$ integers
Absorption: $\binom{r}{k} = \frac{r}{k}\binom{r-1}{k-1}$, k > 0 integer
Pascal’s triangle identity: $\binom{r}{k} = \binom{r-1}{k} + \binom{r-1}{k-1}$, k integer
Upper negation: $\binom{r}{k} = (-1)^k \binom{k-r-1}{k}$, k integer
Trinomial revision: $\binom{r}{m}\binom{m}{k} = \binom{r}{k}\binom{r-k}{m-k}$, m, k integers
Binomial theorem: $(x + y)^r = \sum_k \binom{r}{k} x^k y^{r-k}$, $r \ge 0$ integer or |x/y| < 1
Parallel summation: $\sum_{k \le n} \binom{r+k}{k} = \binom{r+n+1}{n}$, n integer
Upper summation: $\sum_{0 \le k \le n} \binom{k}{m} = \binom{n+1}{m+1}$, $m, n \ge 0$ integers
Vandermonde convolution: $\binom{r+s}{n} = \sum_k \binom{r}{k}\binom{s}{n-k}$, n integer.
By trinomial revision
$$\binom{n}{m}\binom{m}{k} = \binom{n}{k}\binom{n-k}{m-k},$$
So
$$\binom{m}{k}\Big/\binom{n}{k} = \binom{n-k}{m-k}\Big/\binom{n}{m}$$
Hence
$$\binom{n}{m} \times S = \sum_{k=0}^{m} \binom{n-k}{m-k}.$$
This looks like parallel summation. We need to change variable in the summation.
Put l = m − k, k = m − l.
$$\binom{n}{m} \times S = \sum_{l=0}^{m} \binom{n-m+l}{l}$$
$$= \binom{n-m+m+1}{m} \quad \text{(parallel summation)}$$
$$= \binom{n+1}{m}$$
$$\Rightarrow S = \binom{n+1}{m}\Big/\binom{n}{m} = \frac{n+1}{n+1-m}$$
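Example 18:
$$S = \sum_{k=0}^{m} k\binom{n-k-1}{n-m-1}\Big/\binom{n}{m} \quad \text{for } n > m \ge 0.$$
Writing k = n − (n − k) gives $\binom{n}{m} S = S_1 - S_2$, where
$$S_1 = \sum_{k=0}^{m} n\binom{n-k-1}{n-m-1}$$
and
$$S_2 = \sum_{k=0}^{m} (n - k)\binom{n-k-1}{n-m-1}.$$
By absorption,
$$S_2 = (n - m)\sum_{k=0}^{m} \binom{n-k}{n-m}$$
Put l = n − k, k = n − l.
$$S_2 = (n - m)\sum_{l=n-m}^{n} \binom{l}{n-m}$$
But if l < n − m, then $\binom{l}{n-m} = 0$. So
$$S_2 = (n - m)\sum_{l=0}^{n} \binom{l}{n-m} = (n - m)\binom{n+1}{n-m+1}$$
by upper summation.
$$S_1 = n\sum_{k=0}^{m} \binom{n-k-1}{n-m-1}$$
Put n − k − 1 = l, k = n − l − 1.
$$S_1 = n\sum_{l=0}^{n-1} \binom{l}{n-m-1} = n\binom{n}{n-m} \quad \text{by upper summation}$$
So
$$\binom{n}{m} S = S_1 - S_2$$
$$= n\binom{n}{n-m} - (n - m)\binom{n+1}{n-m+1}$$
$$= n\binom{n}{n-m} - \frac{(n + 1)(n - m)}{n - m + 1}\binom{n}{n-m}$$
$$= \frac{m}{n - m + 1}\binom{n}{n-m}$$
$$= \frac{m}{n - m + 1}\binom{n}{m}.$$
So
$$S = \frac{m}{n - m + 1}.$$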
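The result of Example 18 can be confirmed by evaluating the defining sum exactly. This is a small sketch with an illustrative function name, valid for integers n > m ≥ 0.

```python
from fractions import Fraction
from math import comb


def example_18_sum(n: int, m: int) -> Fraction:
    """S = sum_{k=0}^{m} k * C(n-k-1, n-m-1) / C(n, m), for n > m >= 0."""
    numerator = sum(k * comb(n - k - 1, n - m - 1) for k in range(m + 1))
    return Fraction(numerator, comb(n, m))


for n, m in [(2, 1), (3, 2), (7, 3), (10, 9)]:
    print(n, m, example_18_sum(n, m), Fraction(m, n - m + 1))
```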
4.5 Derangements
Note: The sums are the same as $\sum_{k=0}^{n}$ because $\binom{n}{k} = 0$ otherwise. So they
are finite sums.
Proof
By symmetry, it is enough to prove LHS ⇒ RHS, so assume LHS. Then
$$\sum_k \binom{n}{k}(-1)^k g(k) = \sum_k \binom{n}{k}(-1)^k \sum_j \binom{k}{j}(-1)^j f(j)$$
$$= \sum_j f(j) \sum_k (-1)^{j+k}\binom{n}{k}\binom{k}{j}$$
So our sum is
$$\sum_j f(j)\binom{n}{j}\binom{n-j-1}{n-j}. \qquad (*)$$
The only non-zero term in (∗) is when j = n. So
$$(*) = f(n)\binom{n}{n} = f(n), \quad \text{for all } n \ge 0.$$
Let D(n) be the number of derangements. Can we get a formula for D(n)?
Example 19: For n = 4 the permutations (1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 4, 2),
(1, 3, 2, 4), (1, 4, 2, 3), (1, 4, 3, 2), (1, 2)(3, 4), (1, 4)(3, 2), (1, 3)(2, 4) fix 0 points.
So there are 9 derangements.
(1, 2, 3), (1, 3, 2), (1, 2, 4), (1, 4, 2), (1, 3, 4), (2, 3, 4), (2, 4, 3), (1, 4, 3) fix 1
point.
(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4) fix 2 points
We can now use Theorem (7) with g(n) = n! and $f(n) = (-1)^n D(n)$ to
conclude that
$$f(n) = (-1)^n D(n) = \sum_k \binom{n}{k}(-1)^k k!$$
so
$$D(n) = \sum_{k=0}^{n} (-1)^{n+k}\frac{n!}{(n-k)!} = \sum_{k=0}^{n} (-1)^{n-k}\frac{n!}{(n-k)!}$$
$$= \sum_{k=0}^{n} (-1)^k \frac{n!}{k!}, \quad \text{replacing } n - k \text{ by } k.$$
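Example 20:
$D(1) = 0$
$D(2) = 2!\left(1 - \frac{1}{1!} + \frac{1}{2!}\right) = 1$
$D(3) = 3!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!}\right) = 2$
$D(4) = 4!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}\right) = 9$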
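The formula for D(n) can be compared with a brute-force count of permutations without fixed points. The sketch below uses illustrative function names and is only practical for small n.

```python
from itertools import permutations
from math import factorial


def derangements_formula(n: int) -> int:
    """D(n) = sum_{k=0}^{n} (-1)^k * n! / k!."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_brute_force(n: int) -> int:
    """Count permutations p of 0..n-1 with p[i] != i for every i."""
    return sum(
        all(p[i] != i for i in range(n))
        for p in permutations(range(n))
    )


for n in range(1, 8):
    print(n, derangements_formula(n), derangements_brute_force(n))
```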
Note that
$$\binom{k+\ell}{k, \ell} = \binom{k+\ell}{k},$$
the binomial coefficient.
We’ve seen before that $\binom{n}{k}$ is the number of k-element subsets of an n-element
set. The following generalises this to multinomial coefficients, necessarily
with a somewhat different language.
Let X(m) denote a set of m elements, and let $A(n_1, \dots, n_k)$ denote the set
of k-tuples $(Y_1, \dots, Y_k)$ of disjoint sets whose union is $X(n_1 + \cdots + n_k)$, and
such that $|Y_i| = n_i$ for all i.
Proposition 8 Combinatorial description of multinomial coefficients.
$$\binom{n_1 + \cdots + n_k}{n_1, \dots, n_k} = |A(n_1, \dots, n_k)|.$$
The binomial theorem also has its generalisation, which looks as follows.
1. Prove
$$\binom{k+\ell+m}{k, \ell, m} = \binom{k+\ell+m}{k+\ell, m}\binom{k+\ell}{k, \ell}$$
and deduce trinomial revision.
2. Prove the multinomial theorem, using the binomial theorem and induction on k.
3. Prove the multinomial theorem, using differentiation and induction on
m.
4. Prove:
$$\binom{n_1 + \cdots + n_k}{n_1, \dots, n_k} = \binom{n_1 + \cdots + n_k - 1}{n_1 - 1, n_2, \dots, n_k} + \binom{n_1 + \cdots + n_k - 1}{n_1, n_2 - 1, n_3, \dots, n_k}$$
$$+ \cdots + \binom{n_1 + \cdots + n_k - 1}{n_1, \dots, n_{k-2}, n_{k-1} - 1, n_k} + \binom{n_1 + \cdots + n_k - 1}{n_1, \dots, n_{k-1}, n_k - 1} \qquad (15)$$
5 Special Numbers
Properties:
$${0 \brace 0} = 1$$
$${n \brace 0} = 0, \quad \text{for } n > 0$$
$${n \brace 1} = 1, \quad \text{for } n > 0$$
$${n \brace 2} = 2^{n-1} - 1 = \frac{2^n - 2}{2}, \quad \text{for } n \ge 2$$
(note that $2^n - 2$ is the number of non-empty proper subsets of $X_n$.)
Proof
Consider a partition of $X_n$ into k non-empty subsets (n, k > 0).
Either
1. {n} is one of the subsets. Then there are ${n-1 \brace k-1}$ ways of decomposing
$X_n \setminus \{n\}$ into k − 1 non-empty subsets.
2. n is in a larger subset. Then we decompose $X_n \setminus \{n\}$ into k non-empty
subsets in ${n-1 \brace k}$ ways. We can place n in any of these k subsets. So
there are $k{n-1 \brace k}$ ways of doing this.
Result follows. □
Values of ${n \brace k}$:
n\k  0   1   2   3   4   5
0    1
1    0   1
2    0   1   1
3    0   1   3   1
4    0   1   7   6   1
5    0   1  15  25  10   1
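The table can be regenerated from the recurrence proved above, ${n \brace k} = {n-1 \brace k-1} + k{n-1 \brace k}$. The sketch below uses a memoised recursive function whose name is chosen for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of partitions of an n-element set into k non-empty subsets."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # n is either a singleton block, or it joins one of the k blocks of a partition of the rest
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


for n in range(6):
    print(n, [stirling2(n, k) for k in range(n + 1)])
```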
For example, the orbits of $(123)(67) \in S_7$ are {1, 2, 3}, {4}, {5}, {6, 7}.
Its cycles are (123), (4), (5), (67). Cycles are always non-empty. Note:
(1234) = (2341) etc (cyclic permutation).
Definition 5: ${n \brack k}$ is the number of permutations in $S_n$ with k cycles.
Example 23: n = 4, k = 2. (1, 2, 3)(4), (1, 3, 2)(4), (1, 2, 4)(3), (1, 4, 2)(3),
(1, 3, 4)(2), (1, 4, 3)(2), (2, 3, 4)(1), (2, 4, 3)(1), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3).
So ${4 \brack 2} = 11$
$${0 \brack 0} = 1,$$
$${n \brack 0} = 0, \quad \text{for } n > 0$$
$${n \brack 1} = (n - 1)!$$
the last equality because we can write the same n-cycle in precisely n different
ways: given an n-cycle, choose as first digit any one of the n digits in the
cycle, and then the cyclic order determines the order of the remaining digits.
Note that
$$\sum_k {n \brack k} = n!$$
Proof
Either (n) is a cycle by itself, leaving $X_n \setminus \{n\} = X_{n-1}$ to be decomposed
into k − 1 disjoint cycles (in ${n-1 \brack k-1}$ ways), or we decompose $X_n \setminus \{n\}$ into k
non-empty cycles in ${n-1 \brack k}$ ways and then insert n into one of the cycles.
For any given decomposition of $X_{n-1}$ into disjoint cycles, there are (n − 1)
ways of inserting n into one of them: put n immediately before any of the
other (n − 1) numbers. □
Values of ${n \brack k}$:
n\k  0   1   2   3   4   5
0    1
1    0   1
2    0   1   1
3    0   2   3   1
4    0   6  11   6   1
5    0  24  50  35  10   1
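This table follows from the recurrence in the proof above, ${n \brack k} = {n-1 \brack k-1} + (n-1){n-1 \brack k}$. The sketch below (the function name is illustrative) rebuilds the rows and checks that each row sums to n!.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling1(n: int, k: int) -> int:
    """Number of permutations of n elements with exactly k cycles."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # n is either a fixed point, or is inserted before one of the other n-1 numbers
    return stirling1(n - 1, k - 1) + (n - 1) * stirling1(n - 1, k)


for n in range(6):
    row = [stirling1(n, k) for k in range(n + 1)]
    print(n, row, sum(row) == factorial(n))
```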
Example 24: How many ways are there of inserting 6 into (1, 2, 3)(4, 5)?
(6, 1, 2, 3)(4, 5)
(1, 6, 2, 3)(4, 5)
(1, 2, 6, 3)(4, 5)
(1, 2, 3)(6, 4, 5)
(1, 2, 3)(4, 6, 5)
(Note that (1, 2, 3, 6)(4, 5) is the same as (6, 1, 2, 3)(4, 5), and (1, 2, 3)(4, 5, 6)
is the same as (1, 2, 3)(6, 4, 5), as decompositions into disjoint cycles — and,
thus, of course, as permutations). Gives 6 − 1 = 5 possible ways.
Exercise. $\sum_k (-1)^k {n \brack k} = 0$ if $n \ge 2$.
Recall the rising and falling powers from Assignment 2. We were able to
express ordinary powers in terms both of rising powers, and of falling powers.
For example
$$x^4 = x^{\underline{4}} + 6x^{\underline{3}} + 7x^{\underline{2}} + x^{\underline{1}}$$
$$x^4 = x^{\overline{4}} - 6x^{\overline{3}} + 7x^{\overline{2}} - x^{\overline{1}}$$
Proof
We do (1) and (3). (2) and (4) are similar.
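1. Induction on n. n = 0 gives 1 = 1, true. So assume true for n − 1.
Note that $x^{\underline{k+1}} = x^{\underline{k}}(x - k)$. So $x \cdot x^{\underline{k}} = x^{\underline{k+1}} + kx^{\underline{k}}$. We have
$$x^n = x \cdot x^{n-1}$$
$$= x \sum_k {n-1 \brace k} x^{\underline{k}} \quad \text{(by induction)}$$
$$= \sum_k {n-1 \brace k}\left(x^{\underline{k+1}} + kx^{\underline{k}}\right)$$
$$= \sum_k {n-1 \brace k} x^{\underline{k+1}} + \sum_k k{n-1 \brace k} x^{\underline{k}}$$
$$= \sum_k {n-1 \brace k-1} x^{\underline{k}} + \sum_k k{n-1 \brace k} x^{\underline{k}}$$
$$= \sum_k {n \brace k} x^{\underline{k}} \quad \text{(by Proposition 23)}$$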
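$$x^{\underline{n}} = x^{\underline{n-1}}(x - n + 1)$$
$$= (x - n + 1)\sum_k {n-1 \brack k}(-1)^{n-1-k} x^k \quad \text{(by induction)}$$
$$= \sum_k {n-1 \brack k}(-1)^{n-1-k} x^{k+1} + \sum_k {n-1 \brack k}(-1)^{n-k}(n - 1)x^k$$
$$= \sum_k {n-1 \brack k-1}(-1)^{n-k} x^k + \sum_k {n-1 \brack k}(-1)^{n-k}(n - 1)x^k$$
$$= \sum_k {n \brack k}(-1)^{n-k} x^k \quad \text{(by Proposition 24)}$$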
But the polynomials $\{x^{\underline{\ell}} : \ell \in \mathbb{Z}_{\ge 0}\}$ (falling powers) are linearly independent.
Therefore, we can compare coefficients of $x^{\underline{\ell}}$, which yields
$$[n = \ell] = \sum_k (-1)^{n-k}{n \brack k}{k \brace \ell}.$$
such that n + 1 is in a (k + 1)-cycle, and f(k) the number of elements of
F(k). So
$${n+1 \brack \ell+1} = \sum_k f(k). \qquad (18)$$
Step 1. Choose the orbit $\{g^m(n + 1) \mid m \in \mathbb{Z}\}$ of n + 1. The number of
choices is $\binom{n}{k}$.
Step 2. Choose the cycle of n + 1. The number of them is ${k+1 \brack 1} = k!$.
Step 3. Finish g. There are ${n-k \brack \ell}$ choices.
so by (18) we find
$${n+1 \brack \ell+1} = \sum_k \binom{n}{k} k! {n-k \brack \ell}.$$
We define
$$H_n = \sum_{k=1}^{n} \frac1k, \qquad H_0 = 0.$$
Recall that
$$\log x = \int_1^x \frac1t \, dt$$
(where log x means $\log_e x$).
Consider the following figure, showing the graph of y = 1/x and of a step
function f whose integral between 1 and n + 1 is $H_n$.
[Figure: the graph of y = 1/x together with the step function f, with x-axis marks 1, 2, 3.]
It is clear that $H_n > \log(n + 1) > \log n$, and that $\log n > H_n - 1$ (throw away
the first (square) box and shift all the others one unit to the left). Hence
$H_n - \log(n + 1)$ is bounded above by 1. It is an increasing sequence, and so
tends to a limit, known as γ. Since $\log(n + 1) - \log n = \log(1 + 1/n) \to 0$ as
$n \to \infty$,
[Figure: cards 1, 2, 3, ..., n stacked on a table, with the overhangs $d_1, d_2, \dots, d_n$ marked.]
Let $d_n$ be a possible overhang (possible in the sense that the cards don’t fall).
Number the cards 1–n from top to bottom. Since the first card does not fall,
we have
$$d_1 \le 1.$$
The reason for this is that the center of the first card must be above the
second card.
More generally, the center of gravity of the first k cards must be above the
(k + 1)-st card, where the table plays the role of the (n + 1)-st card. In order
to compute this center of gravity, put x = 0 at the end of the first card. Then
the center of gravity of the first k cards is
$$\frac{(1 + d_0) + (1 + d_1) + \cdots + (1 + d_{k-1})}{k}$$
and therefore,
$$d_k \le \frac{(1 + d_0) + (1 + d_1) + \cdots + (1 + d_{k-1})}{k}. \qquad (19)$$
Here $d_0 = 0$.
Note that the greater $d_0, \dots, d_{k-1}$ are, the greater the upper bound for $d_k$
given by (19) is. It follows that $d_n$ is greatest precisely when equality holds
in (19), for all k:
$$d_k = \frac{(1 + d_0) + (1 + d_1) + \cdots + (1 + d_{k-1})}{k} \quad \text{for all } k,$$
or
$$k d_k = k + (d_0 + \cdots + d_{k-1}). \qquad (20)$$
Replacing k by k − 1 in (20) and subtracting gives
$$k d_k - (k - 1)d_{k-1} = 1 + d_{k-1},$$
that is,
$$d_k = \frac1k + d_{k-1}.$$
Since $d_0 = 0$ we find that $d_k$ is just the harmonic number $d_k = H_k$.
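Equation (20) determines the maximal overhangs one at a time, so the conclusion $d_k = H_k$ can be checked by direct computation. The sketch below uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction


def max_overhangs(n: int):
    """d_0..d_n from d_0 = 0 and k*d_k = k + (d_0 + ... + d_{k-1})."""
    d = [Fraction(0)]
    for k in range(1, n + 1):
        d.append((k + sum(d)) / k)
    return d


def harmonic(n: int) -> Fraction:
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


for k, d_k in enumerate(max_overhangs(6)):
    print(k, d_k, harmonic(k))
```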
Example 28: You are collecting football stickers. The complete set has
size n. You buy them one at a time, selected randomly from a complete set.
How many do you expect to have to buy to get a full set?
Suppose you already have k < n distinct stickers. How many do you expect
to have to buy until you get a new sticker number k + 1? At this point, the
probability that a random sticker is new is (n − k)/n = p. The expected
time for new sticker k + 1 is $\sum_{r \ge 1} r P(r)$, where P(r) is the probability that
you get the new sticker on your r’th purchase. This occurs if you get r − 1
old stickers in a row, and then get a new one. So $P(r) = (1 - p)^{r-1} p$, and, writing q = 1 − p, the
expected time is
$$\sum_{r \ge 1} r(1 - p)^{r-1} p = \sum_{r \ge 1} r q^{r-1} p = \frac{p}{(1 - q)^2} = \frac{p}{p^2} = \frac1p = \frac{n}{n - k}.$$
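Adding the expected waiting times n/(n − k) over k = 0, 1, ..., n − 1 gives the expected number of purchases for the whole set. The surviving text breaks off before stating this total, so the summation step in the sketch below is an assumption about how the example concludes; the sum itself equals $nH_n$. Function names are illustrative.

```python
from fractions import Fraction


def expected_purchases(n: int) -> Fraction:
    """Sum of the expected waiting times n/(n-k) for k = 0..n-1 stickers already owned."""
    return sum((Fraction(n, n - k) for k in range(n)), Fraction(0))


def harmonic(n: int) -> Fraction:
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


for n in (1, 2, 5, 10, 50):
    total = expected_purchases(n)
    print(n, float(total), total == n * harmonic(n))
```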
[Figure: a worm moving at 1 cm/sec along a 100 cm elastic band with one end, A, fixed.]
Does the worm ever reach the end of the elastic? After 1 second, 1/100’th of the journey is over.
Then the elastic is stretched to 200cm long. But the worm remains 1/100’th of the way
along since the stretching is uniform. In the 2nd second, the worm completes a further
1/200’th of the journey. So in total 1/100 + 1/200 has been done. This remains
true after stretching to 300cm. In the 3rd second 1/300’th of the journey is done. After
n seconds, the fraction of the journey completed is $1/100 + 1/200 + \dots + 1/100n = H_n/100$.
Since $H_n \to \infty$ the answer to the question is yes.
It takes approximately n seconds where n is the first integer with $H_n > 100$.
$H_n \sim \log n + \gamma$, so,
$$n \sim e^{100 - \gamma} = e^{99.423} \text{ seconds} \approx 4.79 \times 10^{35} \text{ years}$$
At this stage, the length of the elastic, about 100n cm, is $\sim 10^{27}$ light years!
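The full problem is far too large to simulate, but the same reasoning can be tried on a shorter elastic. The sketch below is a scaled-down illustration under stated assumptions: the elastic starts at `length_cm` centimetres and grows by `length_cm` each second (instead of 100), so the worm arrives at the first n with $H_n$ > `length_cm`. The function name and parameter are hypothetical.

```python
from fractions import Fraction


def seconds_to_finish(length_cm: int) -> int:
    """First n with H_n > length_cm, i.e. fraction H_n / length_cm of the journey exceeds 1."""
    h = Fraction(0)
    n = 0
    while h <= length_cm:
        n += 1
        h += Fraction(1, n)  # in the n-th second the worm covers 1/(n * length_cm) of the band
    return n


for length in (1, 2, 3, 4, 5):
    print(length, seconds_to_finish(length))
```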
$$\frac{dx}{dt} = \frac{x}{t + 1} + 1.$$
Solution $x = (t + 1)\log(t + 1)$.
$$\frac{x}{l} = \frac{\log(t + 1)}{100}$$
x = l when $\ln(t + 1) = 100 \Rightarrow t = e^{100} - 1$.
5.3 Fibonacci Numbers, Fn
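n     −5  −4  −3  −2  −1   0   1   2   3   4   5   6   7
$F_n$    5  −3   2  −1   1   0   1   1   2   3   5   8  13
Let
$$\phi = \frac{\sqrt5 + 1}{2} \approx 1.618$$
be a root of $x^2 - x - 1 = 0$. The other root is
$$\hat\phi = 1 - \phi = -\phi^{-1} \approx -0.618$$
Then
$$F_n = \frac{\phi^n - \hat\phi^n}{\sqrt5} = \left\lfloor \frac{\phi^n}{\sqrt5} + \frac12 \right\rfloor$$
(the second form for $n \ge 0$), which will be proved in chapter 6.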
Cassini’s Identity:
$$F_{n+1}F_{n-1} - F_n^2 = (-1)^n$$
For example,
n = 8: $13 \times 34 - 21^2 = 1$
n = 6: $13 \times 5 - 64 = 1$
Proof by induction.
Another identity:
$$F_{n+k} = F_k F_{n+1} + F_{k-1}F_n, \quad \text{for all } n, k \in \mathbb{Z}$$
by easy induction on k. For example,
k = n: $F_{2n} = F_n(F_{n+1} + F_{n-1})$, so $F_n \mid F_{2n}$
k = 2n: $F_{3n} = F_{2n}F_{n+1} + F_{2n-1}F_n$, so $F_n \mid F_{3n}$
By easy induction $F_n \mid F_{kn}$ for all $k \ge 1$. In fact $\mathrm{hcf}(F_m, F_n) = F_{\mathrm{hcf}(m,n)}$
(harder).
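The identities of this section, including the extension to negative indices shown in the table, can be checked numerically. The sketch below (function name illustrative) extends the sequence to negative n using $F_{-n} = (-1)^{n+1}F_n$, which agrees with the table above.

```python
from math import gcd


def fib(n: int) -> int:
    """Fibonacci number F_n for any integer n, with F_0 = 0, F_1 = 1."""
    if n < 0:
        m = -n
        sign = 1 if m % 2 == 1 else -1   # F_{-m} = (-1)^{m+1} F_m
        return sign * fib(m)
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([fib(n) for n in range(-5, 8)])
print(fib(9) * fib(7) - fib(8) ** 2)        # Cassini with n = 8
print(gcd(fib(12), fib(18)), fib(gcd(12, 18)))
```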
For example, $100 = 89 + 8 + 3 = F_{11} + F_6 + F_4$
Proof
Induction on r
r = 2: $F_2 < F_3$
r = 3: $F_3 < F_4$
Proof of proposition
Existence: induction on n. For n = 1, $1 = F_2$, so OK.
For n > 1, choose $k_1$ with $F_{k_1} \le n < F_{k_1+1}$. Apply induction to $n - F_{k_1}$.
We have $n - F_{k_1} < F_{k_1+1} - F_{k_1} = F_{k_1-1}$, so we get $k_2 < k_1 - 1$ when we write
$n - F_{k_1} = F_{k_2} + F_{k_3} + \dots + F_{k_r}$. Since $F_2 = F_1 = 1$, we never need to use $F_1$,
so we get $k_r > 1$.
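The existence argument is effectively a greedy algorithm: take the largest Fibonacci number not exceeding n, subtract, and repeat. The sketch below follows that procedure; the function names are illustrative.

```python
def fib(k: int) -> int:
    """F_k for k >= 0."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def fibonacci_expansion(n: int):
    """Indices k_1 > k_2 > ... > k_r with n = F_{k_1} + ... + F_{k_r}, chosen greedily."""
    fibs = [0, 1]                      # fibs[k] = F_k
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    indices = []
    k = len(fibs) - 1
    while n > 0:
        while fibs[k] > n:             # largest k with F_k <= n (never reaches k = 1)
            k -= 1
        indices.append(k)
        n -= fibs[k]
    return indices


print(fibonacci_expansion(100))
print([fib(k) for k in fibonacci_expansion(100)])
```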
Uniqueness: to prove uniqueness let $n = F_{k_1} + \dots + F_{k_r}$, $k_i > k_{i+1} + 1$, $k_r > 1$.
$k_1$ must be the largest possible such that
Example 30: What is the Fibonary expansion of $3F_n$ ($n \ge 0$)? We begin
by computing a few small cases.
n   $F_n$   $3F_n$   Fibonary expansion of $3F_n$
0   0     0      0
1   1     3      $F_4$
2   1     3      $F_4$
3   2     6      $F_5 + F_2$
4   3     9      $F_6 + F_2$
5   5     15     $F_7 + F_3$
This is true for all $n \in \mathbb{Z}$ and can easily be proved by induction. Now (22) is
the Fibonary expansion for $3F_n$ provided $n \ge 4$. For $0 \le n < 4$ refer to the
table.
Example 31: Game: two players A and B. A chooses $n \in \mathbb{N}$, n > 1. Then
players take it in turn to subtract an integer m (which they choose each time)
from n. The player who first gets it to 0 wins. Rules:
2. You may never subtract more than 2m, where m was the previous number
subtracted.
Player        B    A    B    A    B    A    B
Subtract     −2   −3   −2   −1   −1   −1   −2
n       12   10    7    5    4    3    2    0
Here B wins.
6 Generating Functions
We will nearly always treat this as a formal power series, and not concern
ourselves with convergence.
Basic Operations:
2. Shifting: ($m \in \mathbb{N}$)
$$z^m G(z) = \sum_n g_n z^{n+m} = \sum_n g_{n-m} z^n.$$
3. Scalar Multiplication:
$$G(cz) = \sum_n g_n c^n z^n \quad (\ne cG(z))$$
4. Differentiation:
$$G'(z) = \sum_n (n + 1)g_{n+1} z^n$$
$$zG'(z) = \sum_n n g_n z^n.$$
5. Multiplication or convolution:
$$F(z)G(z) = \sum_n \Big(\sum_k f_k g_{n-k}\Big) z^n$$
$$= f_0 g_0 + (f_0 g_1 + f_1 g_0)z + (f_0 g_2 + f_1 g_1 + f_2 g_0)z^2 + \dots$$
6. Division: Since
$$(1 - z)(1 + z + z^2 + z^3 + \dots) = 1,$$
we can write
$$\frac{1}{1 - z} = \sum_{n \ge 0} z^n.$$
In general
$$\frac{G(z)}{1 - z} = \Big(\sum_n g_n z^n\Big)(1 + z + z^2 + \dots) = \sum_n \Big(\sum_{k \le n} g_k\Big) z^n$$
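Treating a formal power series as the list of its first few coefficients makes these operations concrete. The sketch below is one possible representation (truncated coefficient lists, with illustrative function names); it shows in particular that multiplying by 1/(1 − z) produces partial sums.

```python
def convolve(f, g):
    """Coefficients of F(z)G(z), truncated to the shorter input length."""
    n_terms = min(len(f), len(g))
    return [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(n_terms)]


def shift(g, m):
    """Coefficients of z^m G(z), truncated to the original length (0 <= m <= len(g))."""
    return [0] * m + g[: len(g) - m]


def derivative(g):
    """Coefficients of G'(z): the n-th coefficient is (n+1) * g_{n+1}."""
    return [(n + 1) * g[n + 1] for n in range(len(g) - 1)]


def partial_sums(g):
    out, running = [], 0
    for value in g:
        running += value
        out.append(running)
    return out


g = [1, 4, 9, 16, 25]
geometric = [1] * len(g)            # coefficients of 1/(1 - z)
print(convolve(g, geometric), partial_sums(g))
print(shift(g, 2), derivative(g))
```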
(2) Recursion. A formula which expresses each $a_n$ in terms of the previous terms.
For example $a_n = 2a_{n-1}$ or
$$a_n = a_{n-1} + a_{n-2} + \cdots + a_0 + 1.$$
(3) Equation for the GF. An equation involving the generating function
$A(x) = \sum_{n \ge 0} a_n x^n$. We distinguish between the following.
You’re supposed to know about partial fractions, but for clarity we give
the main result and a few examples.
By (4) we are given a direct formula for a GF. We suppose it’s a rational
function (a quotient of two polynomials) because only then does (5) make sense.
By $\mathbb{C}(z)$ we shall denote the set of rational functions, that is, quotients of
polynomials.
In $\mathbb{C}(z)$ one can add, subtract, multiply and divide by nonzero elements.
In particular, it is a vector space over $\mathbb{C}$. The following proposition gives a
basis.
(b) Let $q \in \mathbb{C}[z]$ be a nonzero polynomial, and let $d \ge 0$. Let $V = V(q, d) \subset \mathbb{C}(z)$
be the set of rational functions whose denominator is q (or a
divisor of it), and such that the degree of the numerator is smaller than
d plus the degree of the denominator. Here is a basis for the complex
vector space V(q, d), which is indeed a subset of the basis of (a):
$$\{z^k \mid 0 \le k < d\} \cup \{(z - \alpha)^{-k} \mid 0 < k,\ (z - \alpha)^{-k} q \in \mathbb{C}[z]\}.$$
Expressing a given rational function in partial fractions simply means writing
it as a linear combination of the particular basis given in part (a) of the
above proposition. In practice, one calculates a partial fractions expansion
by choosing q and d such that the rational function involved is in V(q, d);
usually one immediately knows such q and d. Then one need only consider
the finite basis of V(q, d) given in part (b).
so
a+b+c=0
a + 3b − 4c = −25
−6a + 4c = 0.
You know how to solve this and you find (a, b, c) = (2, −5, 3), so
$$G(x) = \frac{2}{1 - 2x} + \frac{-5}{(1 - 2x)^2} + \frac{3}{1 + 3x}. \qquad (24)$$
The second and last step is to turn partial fraction expansions into a direct
formula for the coefficients. This is aided by the following formula.
\[ \frac{1}{(1-x)^n} = \sum_{k \ge 0} \binom{n+k-1}{k} x^k. \]
Proof. We have
\[ (1-x)^{-n} = \sum_{k \ge 0} \binom{-n}{k} (-x)^k = \sum_{k \ge 0} \binom{n+k-1}{k} x^k. \]
The first identity is the Binomial Theorem 6, the second identity is upper
negation (14). □
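The expansion of $1/(1-x)^n$ can be checked numerically. The following Python sketch is an illustration, not part of the course, and the function name is hypothetical. It starts from the series 1 and multiplies by $1/(1-x)$ a total of n times, using the fact that multiplying a series by $1/(1-x)$ replaces its coefficients by their partial sums.

```python
from math import comb


def inverse_power_coeffs(n, terms):
    """Coefficients of 1/(1-x)^n up to x^(terms-1)."""
    coeffs = [1] + [0] * (terms - 1)  # the constant series 1
    for _ in range(n):
        # Multiplying by 1/(1-x) turns coefficients into partial sums.
        running = 0
        for k in range(terms):
            running += coeffs[k]
            coeffs[k] = running
    return coeffs


if __name__ == "__main__":
    n, terms = 4, 8
    print(inverse_power_coeffs(n, terms))
    print([comb(n + k - 1, k) for k in range(terms)])
```

The two printed lists should agree for any positive n.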
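Applying this formula to each term of (24), the coefficient of $x^k$ in $G(x)$ is
\[ 2 \cdot 2^k - 5(k+1)2^k + 3(-3)^k = -(5k+3)\,2^k + 3(-3)^k. \]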
Remark. This remark does not belong to the syllabus, but you may find it
helpful because it explains what power series of rational functions look like,
and in practice many generating functions are rational functions. Let G(z)
be the generating function of (g0 , g1 , . . .). Then the following can be shown
to be equivalent:
2. There is a recurrence $a_0 g_n + a_1 g_{n-1} + \cdots + a_k g_{n-k} = 0$ ($a_i \in \mathbb{C}$, $a_0 \ne 0$),
which is true for all n big enough.
The examples in this subsection have in common that the recursion formula
for the sequence (gn )n is linear inhomogeneous, that is, of the form
The procedure for finding a formula for gn from the recursion formula is as
follows.
Step 1. Write down a single equation for a recurrence, true for all n.
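$g_0 = 0$, $g_1 = 1$, $g_n = g_{n-1} + g_{n-2}$
Step 1:
\[ g_n = g_{n-1} + g_{n-2}, \quad \text{for } n \ge 2. \tag{25} \]
What happens for $n < 2$? With the convention $g_n = 0$ for $n < 0$:
n = 1: $1 = g_1 = g_0 + g_{-1} + 1$
n = 0: $0 = g_0 = g_{-1} + g_{-2}$
In fact, (25) is true for all $n \le 0$. It is wrong only for n = 1. The single equation is
\[ g_n = g_{n-1} + g_{n-2} + [n = 1]. \]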
Step 2:
\[ g_n z^n = g_{n-1} z^n + g_{n-2} z^n + [n = 1]\, z^n. \]
Sum over $n \in \mathbb{Z}$ to get
\[ G(z) = zG(z) + z^2 G(z) + z. \]
Step 3:
\[ G(z) = \frac{z}{1 - z - z^2}. \]
Now factorise the denominator and use partial fractions. Note: if $x^2 + ax + b = 0$
has roots α, β then $1 + az + bz^2 = (1 - \alpha z)(1 - \beta z)$. In this case, the roots
of $x^2 - x - 1 = 0$ are $\phi$ and $\hat\phi$ where $\phi = (1 + \sqrt5)/2$. So
$1 - z - z^2 = (1 - \phi z)(1 - \hat\phi z)$. By partial fractions,
\[ \frac{z}{(1 - \phi z)(1 - \hat\phi z)} = \frac{1}{\sqrt5} \Bigl( \frac{1}{1 - \phi z} - \frac{1}{1 - \hat\phi z} \Bigr) \]
since $\sqrt5 = \phi - \hat\phi$.
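Expanding the two geometric series in this partial fraction expansion gives the coefficient of $z^n$ as $(\phi^n - \hat\phi^n)/\sqrt5$. The Python sketch below compares this with the recurrence; it assumes that $\hat\phi = (1-\sqrt5)/2$ is the second root, and the function names are illustrative only.

```python
from math import sqrt


def fib_recurrence(count):
    """First `count` terms of g_0 = 0, g_1 = 1, g_n = g_{n-1} + g_{n-2}."""
    g = [0, 1]
    while len(g) < count:
        g.append(g[-1] + g[-2])
    return g[:count]


def fib_closed_form(n):
    """Coefficient of z^n read off from the partial fraction expansion."""
    phi = (1 + sqrt(5)) / 2
    phi_hat = (1 - sqrt(5)) / 2  # assumed: the other root of x^2 - x - 1
    # Floating point is accurate enough here for moderate n; round to an integer.
    return round((phi ** n - phi_hat ** n) / sqrt(5))


if __name__ == "__main__":
    print(fib_recurrence(12))
    print([fib_closed_form(n) for n in range(12)])
```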
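Example 35: $g_0 = g_1 = 1$, $g_n = g_{n-1} + 2g_{n-2} + (-1)^n$ for $n \ge 2$.
n     0  1  2  3   4   5   6   7
g_n   1  1  4  5  14  23  52  97
Step 1:
n = 1: $1 = g_1 = g_0 + 2g_{-1} + (-1)^n + 1$
n = 0: $1 = g_0 = g_{-1} + 2g_{-2} + (-1)^n$
n < 0: $g_n = g_{n-1} + 2g_{n-2}$
So single equation:
\[ g_n = g_{n-1} + 2g_{n-2} + (-1)^n [n \ge 0] + [n = 1]. \]
Step 2:
\[ G(z) = zG(z) + 2z^2 G(z) + \sum_{n \ge 0} (-1)^n z^n + z. \]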
Step 3: Note that
\[ \sum_{n \ge 0} (-1)^n z^n = \frac{1}{1+z}. \]
\[ G(z) = \frac{1 + z(1+z)}{(1+z)(1 - z - 2z^2)}
        = \frac{1 + z + z^2}{(1-2z)(1+z)^2}
        = \frac{A}{1-2z} + \frac{B}{1+z} + \frac{C}{(1+z)^2}. \]
Step 4: Expressions for each of the three terms can easily be obtained from
the Binomial Theorem. Adding them, we find: coefficient of $z^n$ is
\[ \frac{7}{9}\, 2^n - \frac{1}{9} (-1)^n + \frac{n+1}{3} (-1)^n
   = \frac{7 \cdot 2^n}{9} + \Bigl( \frac{n}{3} + \frac{2}{9} \Bigr) (-1)^n. \]
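The closed form can be checked against the table of Example 35. The sketch below assumes the recurrence is $g_n = g_{n-1} + 2g_{n-2} + (-1)^n$ for $n \ge 2$ (this reproduces the tabulated values), and uses exact rational arithmetic so that no rounding is involved. The function names are illustrative.

```python
from fractions import Fraction


def g_recurrence(count):
    """g_0 = g_1 = 1, g_n = g_{n-1} + 2 g_{n-2} + (-1)^n (assumed recurrence)."""
    g = [1, 1]
    for n in range(2, count):
        g.append(g[n - 1] + 2 * g[n - 2] + (-1) ** n)
    return g[:count]


def g_closed(n):
    """Closed form 7*2^n/9 + (n/3 + 2/9)(-1)^n from Step 4."""
    return Fraction(7 * 2 ** n, 9) + (Fraction(n, 3) + Fraction(2, 9)) * (-1) ** n


if __name__ == "__main__":
    print(g_recurrence(8))
    print([g_closed(n) for n in range(8)])
```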
Example 36:
Part 1. How many ways can you tile a 2 × n rectangle with n dominoes
(1 × 2 rectangles)? Call this number $t_n$.
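[Figure: the tilings for n = 3; $t_3 = 3$.]
[Figure: $t_{n-1}$ ways of filling the remainder.]
[Figure: $u_2 = 3$.]
[Figure: two shapes of width n − 1, with $u_{n-2}$ ways and $v_{n-3}$ ways of filling the remainder.]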
Evidently v0 = 0, v1 = 1.
Step 2:
\[ U(z) = z^2 U(z) + 2z V(z) + 1 \tag{26} \]
\[ V(z) = z U(z) + z^2 V(z) \tag{27} \]
Step 4: Coefficient of $z^{2n}$:
\[ u_{2n} = \frac{(2+\sqrt3)^n}{3-\sqrt3} + \frac{(2-\sqrt3)^n}{3+\sqrt3} \qquad (\text{and } u_{2n+1} = 0). \]
n     0  1  2  3   4   5   6   7    8
u_n   1  0  3  0  11   0  41   0  153
v_n   0  1  0  4   0  15   0  56    0
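The table can be reproduced from equations (26) and (27). Comparing coefficients of $z^n$ gives $u_n = u_{n-2} + 2v_{n-1} + [n = 0]$ and $v_n = u_{n-1} + v_{n-2}$, with terms of negative index taken to be 0. The sketch below (function names are illustrative) computes both sequences and also evaluates the closed form in $(2 \pm \sqrt3)^n$, read here as a formula for the even-indexed terms $u_{2n}$.

```python
from math import sqrt


def tilings(count):
    """u_n and v_n for n < count, from the coefficients of (26) and (27)."""
    u, v = [], []
    for n in range(count):
        u_prev2 = u[n - 2] if n >= 2 else 0
        v_prev1 = v[n - 1] if n >= 1 else 0
        u.append(u_prev2 + 2 * v_prev1 + (1 if n == 0 else 0))
        u_prev1 = u[n - 1] if n >= 1 else 0
        v_prev2 = v[n - 2] if n >= 2 else 0
        v.append(u_prev1 + v_prev2)
    return u, v


def u_even_closed(n):
    """Closed form for u_{2n}, rounded to the nearest integer."""
    r = sqrt(3)
    return round((2 + r) ** n / (3 - r) + (2 - r) ** n / (3 + r))


if __name__ == "__main__":
    u, v = tilings(9)
    print(u)
    print(v)
    print([u_even_closed(n) for n in range(5)])
```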
[Figure: the two tile shapes, a domino and a hook.]
if we agree that a picture stands for a number, namely the number of ways
to tile the picture by dominoes and hooks. Or the number of ways to finish
the tiling if a tiling has already begun (this applies in the pictures starting
from (28)).
We also define $q_n$ by a picture of width n [picture not reproduced]. Expanding the pictures gives
\[ \cdots = p_{n-1} + p_{n-2} + 2q_{n-1}. \tag{28} \]
Also
\[ q_n = q_{n-1} + p_{n-2}. \tag{29} \]
Example 38: Given a product $x_0 x_1 x_2 \cdots x_n$ of n + 1 terms, in how many
different ways can we bracket it? That is, in how many different ways can we
calculate it using n multiplications? Call this number $c_n$.
$c_0 = c_1 = 1$
(we have used convolution of sequences, (5) in the Basic Operations, to get
the last line here).
Hence
\[ zC(z)^2 - C(z) + 1 = 0, \]
and
Step 3:
\[ C(z) = \frac{1 \pm \sqrt{1 - 4z}}{2z}. \]
With the positive sign we would get a term 1/z in the expansion, which cannot be
correct. So the negative sign must be correct.
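\[ \sqrt{1-4z} = (1-4z)^{1/2}
   = \sum_{k \ge 0} \binom{1/2}{k} (-4z)^k \quad \text{(by the binomial theorem)} \]
\[ \phantom{\sqrt{1-4z}} = 1 + \sum_{k \ge 1} \frac{1}{2k} \binom{-1/2}{k-1} (-4z)^k \quad \text{(by absorption)}. \]
So
\[ C(z) = \frac{1 - \sqrt{1-4z}}{2z}
        = \sum_{k \ge 1} \frac{1}{k} \binom{-1/2}{k-1} (-4z)^{k-1}
        = \sum_{k \ge 0} \binom{-1/2}{k} \frac{(-4z)^k}{k+1}. \]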
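Note that
\[ \binom{-1/2}{k} = \frac{(-1/2)(-3/2)(-5/2) \cdots (-(2k-1)/2)}{k!}
   = \frac{(-1)^k}{2^k\, k!} \bigl( 1 \cdot 3 \cdot 5 \cdots (2k-1) \bigr)
   = \frac{(-1)^k (2k)!}{2^k\, k!\; 2 \cdot 4 \cdot 6 \cdots 2k} \]
\[ = \frac{(-1)^k (2k)!}{2^k\, k!\; 2^k\, k!}
   = \frac{(-1)^k}{4^k} \binom{2k}{k}
   = \frac{1}{(-4)^k} \binom{2k}{k}. \]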
So
\[ C(z) = \sum_{k \ge 0} \binom{2k}{k} \frac{z^k}{k+1}. \]
Step 4:
\[ c_n = \binom{2n}{n} \frac{1}{n+1}. \]
These are called the Catalan Numbers.
n 0 1 2 3 4 5 6 7 8 9
cn 1 1 2 5 14 42 132 429 1430 4862
\[ \lim_{n \to \infty} \frac{c_{n+1}}{c_n} = 4. \]
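The table of Catalan numbers can be reproduced in two ways. Read coefficient by coefficient, the equation $zC(z)^2 - C(z) + 1 = 0$ says $c_0 = 1$ and $c_n = \sum_k c_k c_{n-1-k}$ for $n \ge 1$. The sketch below compares this convolution recurrence with the closed formula from Step 4; the function names are illustrative.

```python
from math import comb


def catalan_by_recurrence(count):
    """c_0 = 1 and c_n = sum_{k=0}^{n-1} c_k c_{n-1-k}."""
    c = [1]
    for n in range(1, count):
        c.append(sum(c[k] * c[n - 1 - k] for k in range(n)))
    return c[:count]


def catalan_closed(n):
    """c_n = binom(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    print(catalan_by_recurrence(10))
    print([catalan_closed(n) for n in range(10)])
    # The ratio of consecutive terms approaches 4 slowly.
    print(catalan_closed(201) / catalan_closed(200))
```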
Another example of where these numbers arise: How many solutions are
there of:
\[ a_1 + a_2 + \cdots + a_{2n} = 0, \quad a_i = \pm 1, \]
\[ a_1 + a_2 + \cdots + a_j \ge 0, \quad 0 \le j \le 2n. \]
Call this number $d_n$. Let $S_{2k} = a_1 + a_2 + \cdots + a_{2k}$. Let k be maximal with
k < n and $S_{2k} = 0$ (possibly k = 0). There are $d_k$ possibilities for $a_1, \ldots, a_{2k}$,
we must have $a_{2k+1} = 1$ and $a_{2n} = -1$, and there are
$d_{n-k-1}$ possibilities for $a_{2k+2}, \ldots, a_{2n-1}$. So we get the recurrence
$d_n = \sum_k d_k d_{n-k-1}$ ($n \ge 1$). Since this is the same recurrence as for $(c_n)$, and the initial values
$c_1 = d_1$, $c_2 = d_2$ are the same, $c_n = d_n$ for all n.
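For small n the claim $c_n = d_n$ can be confirmed by brute force. The sketch below is an illustration with a hypothetical function name. It lists all sequences of ±1 of length 2n and keeps those with total 0 whose partial sums never become negative.

```python
from itertools import product


def count_ballot_sequences(n):
    """Number of (a_1, ..., a_2n) in {+1, -1} with sum 0 and all partial sums >= 0."""
    count = 0
    for seq in product((1, -1), repeat=2 * n):
        partial = 0
        valid = True
        for a in seq:
            partial += a
            if partial < 0:
                valid = False
                break
        if valid and partial == 0:
            count += 1
    return count


if __name__ == "__main__":
    print([count_ballot_sequences(n) for n in range(7)])
```

The running time grows like $4^n$, so this is only a check for small n, not a way of computing $d_n$.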
To avoid all confusion, let us recall a few definitions about graphs. A graph
is a pair (V, E) where V is a set and E is a set of two-element subsets {x, y} of V.
The elements of V are the vertices, the elements of E the edges. In a picture,
the elements of V are depicted by points and an edge {x, y} by a line connecting
the vertices x and y. Thus, the graph ({1, 2, 3, 4}, {{1, 2}, {1, 3}, {1, 4}})
can be depicted as follows.
[Figure: vertex 1 joined by a line to each of the vertices 2, 3 and 4.]
A graph (V, E) is connected if any two of its vertices x, y can be connected
by a path
\[ x = z_0, z_1, \ldots, z_k = y \]
which means that $\{z_i, z_{i+1}\}$ is an edge for all i. This path is called a cycle if
$z_0, \ldots, z_{k-1}$ are all different but $z_k = z_0$, and k > 2.
[Figure: an almost-wheel, with rim vertices labelled $v_1$, $v_2$, $v_3$, $v_4$.]
Let v1 , . . ., vn denote the vertices of Gn different from the origin, in this order
starting and ending at the missing edge.
In order to count the number of type k spanning trees, note that they depend
on two things. Firstly, one among v1 , . . . , vk needs to be connected to the
origin: there are k choices for this. Secondly, all vertices except v1 , . . ., vk
form a smaller almost-wheel Gn−k . A spanning tree for the Gn−k should be
chosen for which there are tn−k choices.
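Let us write
\[ T(z) = \sum_{n \ge 0} t_n z^n = \text{generating function of } t_n. \]
Then
\[ T(z) = 1 + \sum_{n \ge 1} t_n z^n = 1 + \sum_{n \ge 0} z^n \sum_{k=0}^{n} k\, t_{n-k}
        = 1 + \Bigl( \sum_{k \ge 0} k z^k \Bigr) \Bigl( \sum_{\ell \ge 0} t_\ell z^\ell \Bigr)
        = 1 + \frac{z}{(1-z)^2}\, T(z). \]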
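The counting argument gives $t_0 = 1$ and $t_n = \sum_{k=1}^{n} k\, t_{n-k}$ for $n \ge 1$. Solving the equation above for $T(z)$ gives $T(z) = (1-z)^2/(1-3z+z^2)$; this last step is worked out here and is not taken from the notes. The sketch below, with illustrative function names, computes $t_n$ both from the counting recurrence and from this rational function.

```python
def spanning_tree_counts(count):
    """t_0 = 1 and t_n = sum_{k=1}^{n} k * t_{n-k}."""
    t = [1]
    for n in range(1, count):
        t.append(sum(k * t[n - k] for k in range(1, n + 1)))
    return t[:count]


def series_coefficients(count):
    """Power series coefficients of (1 - 2z + z^2) / (1 - 3z + z^2)."""
    numerator = [1, -2, 1]
    t = []
    for n in range(count):
        # From (1 - 3z + z^2) T(z) = 1 - 2z + z^2, coefficient of z^n.
        value = numerator[n] if n < len(numerator) else 0
        if n >= 1:
            value += 3 * t[n - 1]
        if n >= 2:
            value -= t[n - 2]
        t.append(value)
    return t


if __name__ == "__main__":
    print(spanning_tree_counts(8))
    print(series_coefficients(8))
```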
We will be interested in differential equations of the form
\[ f'(x) = a(x) f(x) \tag{30} \]
where a(x) is an (explicitly) given function, and one looks for solutions f .
Such equations are called linear homogeneous ordinary differential equations
of order 1.
Proposition 17 Let A(x) denote a primitive of a(x) (that is, $a(x) = A'(x)$).
Then all solutions of (30) are given by
\[ f(x) = c\, e^{A(x)} \]
for a constant c.
\[ \frac{f'(x)}{f(x)} = a(x). \]
Then we note that a primitive of the left hand side is known:
\[ \bigl( \log f(x) \bigr)' = a(x). \]
This is only a minor variation on the definition of the ordinary generating
function, but in some situations it is a lot more convenient. Some recurrences are
much easier to solve using EGFs rather than GFs.
Example 40:
\[ g_0 = 0, \qquad 3g_n = n g_{n-1} + 2\, n! \quad (n \ge 1). \]
So $g_n = n! - n!/3^n$.
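The reason the EGF is convenient here is that dividing the recurrence by n! gives $3\,g_n/n! = g_{n-1}/(n-1)! + 2$, a recurrence with constant coefficients for the numbers $g_n/n!$. The sketch below (illustrative names, exact rational arithmetic) checks the stated solution against the recurrence, assuming it is meant for all $n \ge 1$.

```python
from fractions import Fraction
from math import factorial


def g_by_recurrence(count):
    """g_0 = 0 and 3 g_n = n g_{n-1} + 2 n! for n >= 1."""
    g = [Fraction(0)]
    for n in range(1, count):
        g.append((n * g[n - 1] + 2 * factorial(n)) / 3)
    return g[:count]


def g_closed_form(n):
    """The stated solution g_n = n! - n!/3^n."""
    return factorial(n) - Fraction(factorial(n), 3 ** n)


if __name__ == "__main__":
    for n, value in enumerate(g_by_recurrence(6)):
        print(n, value, g_closed_form(n))
```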
Binomial Convolution
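Let $\hat F(z)$ and $\hat G(z)$ be EGF’s of $(f_0, f_1, \ldots)$ and $(g_0, g_1, \ldots)$.
Let
\[ \hat H(z) = \hat F(z) \hat G(z)
   = \Bigl( \sum_{i \ge 0} \frac{f_i z^i}{i!} \Bigr) \Bigl( \sum_{j \ge 0} \frac{g_j z^j}{j!} \Bigr)
   = \sum_{n \ge 0} \Bigl( \sum_k \frac{f_k\, g_{n-k}}{k!\,(n-k)!} \Bigr) z^n
   = \sum_{n \ge 0} \frac{h_n}{n!} z^n \]
where
\[ h_n = \sum_k \frac{n!}{k!\,(n-k)!} f_k\, g_{n-k} = \sum_k \binom{n}{k} f_k\, g_{n-k}. \]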
We call the sequence (hn ) defined by this formula the binomial convolution
of the sequences (fn ) and (gn ). Thus, binomial convolution of sequences
corresponds to multiplication of their EGF’s.
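A small illustration of binomial convolution follows; it is a sketch and the function name is hypothetical. The sequences $f_n = 2^n$ and $g_n = 3^n$ have EGFs $e^{2z}$ and $e^{3z}$, whose product $e^{5z}$ is the EGF of $5^n$, so the binomial convolution should return the powers of 5.

```python
from math import comb


def binomial_convolution(f, g):
    """h_n = sum_k binom(n, k) f_k g_{n-k}, for as many n as both lists allow."""
    terms = min(len(f), len(g))
    return [
        sum(comb(n, k) * f[k] * g[n - k] for k in range(n + 1))
        for n in range(terms)
    ]


if __name__ == "__main__":
    f = [2 ** n for n in range(8)]
    g = [3 ** n for n in range(8)]
    print(binomial_convolution(f, g))
    print([5 ** n for n in range(8)])
```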
Example 41: Bernoulli numbers.
Exercise. Show that Bn = 0 if n > 2 is odd. Hint: let f (z) denote the
function defined in (32) and consider f (z) − f (−z).
n     0    1     2    3    4      5    6     7    8      9    10
B_n   1   −1/2  1/6   0   −1/30   0   1/42   0   −1/30   0   5/66
It is clear that
Bn (0) = Bn .
We will now show that Bn (x) is indeed a polynomial, and we will express
these polynomials in terms of the Bernoulli numbers. We have
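\[ \sum_{n \ge 0} B_n(x) \frac{z^n}{n!} \overset{(33)}{=} \frac{z}{e^z - 1}\, e^{xz}
   = \Bigl( \sum_{k \ge 0} B_k \frac{z^k}{k!} \Bigr) \Bigl( \sum_{k \ge 0} x^k \frac{z^k}{k!} \Bigr) \]
where the last equality is by (32) and the exponential series. By binomial
convolution we find
\[ B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_k\, x^{n-k}. \]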
Proposition 18
\[ \sum_{k=0}^{n} k^{m-1} = \frac{B_m(n+1) - B_m}{m}. \]
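Proposition 18 can be tested with exact arithmetic. The sketch below assumes that (32) defines the Bernoulli numbers by $z/(e^z - 1) = \sum_n B_n z^n/n!$, which matches the table ($B_1 = -1/2$). Multiplying by $e^z - 1$ and comparing coefficients then gives $\sum_{k \le m} \binom{m+1}{k} B_k = [m = 0]$, which is used to compute the $B_m$. All function names are illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} from sum_{k<=m} binom(m+1, k) B_k = [m == 0]."""
    B = []
    for m in range(count):
        rhs = Fraction(1 if m == 0 else 0)
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append((rhs - acc) / (m + 1))
    return B


def bernoulli_poly(m, x, B):
    """B_m(x) = sum_k binom(m, k) B_k x^(m-k)."""
    return sum(comb(m, k) * B[k] * Fraction(x) ** (m - k) for k in range(m + 1))


def power_sum(m, n, B):
    """(B_m(n+1) - B_m) / m, the right hand side of Proposition 18."""
    return (bernoulli_poly(m, n + 1, B) - B[m]) / m


if __name__ == "__main__":
    B = bernoulli_numbers(11)
    print(B)
    print(power_sum(3, 10, B), sum(k ** 2 for k in range(11)))
```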
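Proof. We have
\[ \sum_{m \ge 0} \bigl( B_m(x+1) - B_m(x) \bigr) \frac{z^m}{m!}
   \overset{(33)}{=} \frac{z\, e^{(x+1)z}}{e^z - 1} - \frac{z\, e^{xz}}{e^z - 1}
   = \frac{z\, e^{xz} (e^z - 1)}{e^z - 1} = z\, e^{xz} \]
\[ = \sum_{m \ge 0} x^m \frac{z^{m+1}}{m!} = \sum_{m \ge 1} x^{m-1} \frac{z^m}{(m-1)!}. \]
Comparing coefficients of $z^m$ gives
\[ B_m(x+1) - B_m(x) = m\, x^{m-1}. \]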
So if we put
\[ B(z) = \sum_{n \ge 0} b(n) \frac{z^n}{n!} = \text{exponential GF of the Bell numbers} \]
Therefore,
\[ \frac{d}{dz} \log B(z) = \frac{B'(z)}{B(z)} = e^z, \qquad \log B(z) = e^z + c, \]
and
\[ B(z) = e^{e^z + c} \]
for some constant c. Since $e^{1+c} = B(0) = b(0) = 1$ we must have c = −1 so
\[ B(z) = e^{e^z - 1}. \]
We thus find a closed formula for the exponential generating function of the
Bell numbers, but no closed formula for the Bell numbers themselves is available.
Exercise. Prove $b(n) = e^{-1} \sum_{k \ge 0} \frac{k^n}{k!}$. (Ignore convergence questions.)
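The formula in the exercise can at least be checked numerically. Since $B'(z) = e^z B(z)$, binomial convolution gives the recurrence $b(n+1) = \sum_k \binom{n}{k} b(k)$ with $b(0) = 1$. The sketch below compares this with a truncated version of the series; the truncation at 60 terms and the function names are choices made for this illustration.

```python
from math import comb, exp, factorial


def bell_numbers(count):
    """b(0) = 1 and b(n+1) = sum_k binom(n, k) b(k), from B'(z) = e^z B(z)."""
    b = [1]
    for n in range(count - 1):
        b.append(sum(comb(n, k) * b[k] for k in range(n + 1)))
    return b[:count]


def bell_by_series(n, terms=60):
    """Truncation of e^{-1} * sum_{k>=0} k^n / k! after `terms` terms."""
    return exp(-1) * sum(k ** n / factorial(k) for k in range(terms))


if __name__ == "__main__":
    exact = bell_numbers(8)
    print(exact)
    print([round(bell_by_series(n), 6) for n in range(8)])
```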
Step 3. Finish g. There are (n + 1) − (p + 1) = n − p elements left to permute
so the number of choices is D(n − p, k).
and by (34)
\[ D(n+1, k) = \sum_{p=k}^{n} \frac{n!\; D(n-p, k)}{(n-p)!}. \tag{35} \]
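We find
\[ f'(x) = \sum_{n \ge 1} \frac{D(n,k)\, x^{n-1}}{(n-1)!} = \sum_{n \ge 0} \frac{D(n+1,k)\, x^n}{n!} \]
\[ = \sum_{n \ge k} \frac{D(n+1,k)\, x^n}{n!} \quad \text{because the other terms are zero} \]
\[ = \sum_{n \ge k} \frac{x^n}{n!} \Bigl( \sum_{p=k}^{n} \frac{n!\; D(n-p,k)}{(n-p)!} \Bigr) \quad \text{by (35)} \]
\[ = \sum_{p \ge k} \sum_{m \ge 0} \frac{x^{m+p}\, D(m,k)}{m!} \]
\[ = \Bigl( \sum_{p \ge k} x^p \Bigr) \Bigl( \sum_{m \ge 0} \frac{D(m,k)\, x^m}{m!} \Bigr) = \frac{x^k}{1-x}\, f(x). \]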
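In section 6.5 we have learned how to solve a differential equation like the
one here, $f'(x) = x^k (1-x)^{-1} f(x)$. We find
\[ \frac{f'}{f} = \frac{x^k}{1-x}, \]
\[ (\log f)' = x^k + x^{k+1} + \cdots = \frac{1}{1-x} - \bigl( 1 + x + x^2 + \cdots + x^{k-1} \bigr), \]
\[ \log f = -\log(1-x) - \Bigl( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^k}{k} \Bigr) + c \quad (\text{and } c = 0), \]
\[ f(x) = \frac{e^{-\left( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^k}{k} \right)}}{1-x}. \]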
A closed formula for D(n, k) is not possible.
exists and compute it. (In this exercise, you may use, without proving it,
that there are $a_0, a_1, \ldots \in \mathbb{C}$ such that
\[ e^{-\left( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^k}{k} \right)} = \sum_{n \ge 0} a_n x^n \]
for any complex number x, in the sense that the right hand side converges to
the left hand side. By the way, if you know a little complex function theory
then you can prove this by observing that the left hand side is a holomorphic
function in x on the complex plane.)
Exercise. (a) Prove:
\[ \sum_{k, \ell \ge 0} \binom{k+\ell}{k} x^k y^\ell = \frac{1}{1-x-y}. \]
(b) Put $a_n := \sum_{m=0}^{n} \binom{n-m}{m}$. Use (a) to compute $A(t) = \sum_{n \ge 0} a_n t^n$.
Exercise. Prove
\[ \sum_{n_1, \ldots, n_k \ge 0} \min(n_1, \ldots, n_k)\, x_1^{n_1} \cdots x_k^{n_k}
   = \frac{x_1 \cdots x_k}{(1-x_1) \cdots (1-x_k)(1 - x_1 \cdots x_k)}. \]
7 Discrete Probability
Example 45: Ω = {1, 2, 3, 4, 5, 6}, P (ω) = 1/6 for all ω ∈ Ω (for example,
throwing a die).
\[ A = \{5, 6\}, \qquad P(A) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]
P (A ∪ B) = P (A) + P (B) − P (A ∩ B)
(Exercise).
Example 46: Ω = {(i, j) | i, j ∈ [1 . . . 6]}, P (ω) = 1/36 for all ω ∈ Ω (throw
a die twice). Then
S1 : (i, j) → i
S2 : (i, j) → j
S : (i, j) → i + j
x          2     3     4     5     6     7     8     9     10    11    12
P (S = x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Range(S) = [2 . . . 12]
Two random variables X : Ω → T1 and Y : Ω → T2 are called independent if
P (X = x and Y = y) = P (X = x)P (Y = y)
= E(X)E(Y ).
The standard deviation σ(X) is $\sqrt{V(X)}$.
V (X + Y ) = E((X + Y )2 ) − E(X + Y )2
= E(X 2 ) + E(Y 2 ) + 2E(XY ) − E(X)2 − E(Y )2 − 2E(X)E(Y )
But
E(XY ) = E(X)E(Y )
by independence. So
V (X + Y ) = V (X) + V (Y ) .
Note that
V (X + X) = V (2X) = 4V (X)
so we need independence. In the dice example
\[ V(S_1) = \frac{1^2 + 2^2 + \cdots + 6^2}{6} - \frac{49}{4} = \frac{35}{12}, \]
\[ V(S) = V(S_1 + S_2) = \frac{35}{6}. \]
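These values can be confirmed by enumerating the 36 outcomes. The sketch below is an illustration with hypothetical names. It computes variances exactly and also shows that $V(2S_1) = 4V(S_1)$ differs from $V(S_1 + S_2)$, which is why independence is needed.

```python
from fractions import Fraction
from itertools import product


def variance(outcomes, value):
    """Variance of the random variable `value` on equally likely outcomes."""
    p = Fraction(1, len(outcomes))
    mean = sum(value(w) * p for w in outcomes)
    return sum((value(w) - mean) ** 2 * p for w in outcomes)


# Sample space of Example 46: two throws of a die.
throws = list(product(range(1, 7), repeat=2))

if __name__ == "__main__":
    print(variance(throws, lambda w: w[0]))          # V(S1)
    print(variance(throws, lambda w: w[0] + w[1]))   # V(S1 + S2)
    print(variance(throws, lambda w: 2 * w[0]))      # V(2 S1)
```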
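Choose any α > 0. Then
\[ V(X) = \sum_{\omega \in \Omega} (X(\omega) - \mu)^2 P(\omega)
   \ge \sum_{\substack{\omega \in \Omega \\ (X(\omega)-\mu)^2 \ge \alpha}} (X(\omega) - \mu)^2 P(\omega)
   \ge \sum_{\substack{\omega \in \Omega \\ (X(\omega)-\mu)^2 \ge \alpha}} \alpha\, P(\omega)
   = \alpha\, P\bigl( (X-\mu)^2 \ge \alpha \bigr). \]
Define c by
$\alpha = c^2 \sigma^2$ (where $\sigma = \sqrt{V}$ is the standard deviation of X). Then
\[ P\bigl( (X-\mu)^2 \ge \alpha \bigr) \le \frac{V(X)}{\alpha}
   \;\Rightarrow\; P\bigl( |X - \mu| \ge c\sigma \bigr) \le \frac{1}{c^2}. \tag{36} \]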
So the probability that the difference from the mean is greater than or equal
to c times the standard deviation, is less than or equal to 1/c2 . This is true
for any RV on any sample space. In specific cases, we get a stronger result.
For example, for the normal distribution
\[ P(|X - \mu| \ge 2\sigma) \approx 0.05. \]
We next discuss the notion of multiple independent instances of a single
random variable X. This models the idea of repeated independent trials, e.g.
n independent dice throws. Suppose Ω1 is a sample space, with probability
measure P1 : Ω1 → [0, 1]. Define Ωn = Ω1 × · · · × Ω1 (n times). We define a
probability measure Pn : Ωn → [0, 1] by
The spread of averages of n instances is smaller than the spread of single
instances.
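\[ P\Bigl( \Bigl| \frac{S}{n} - \mu \Bigr| \ge \frac{c\sigma}{\sqrt{n}} \Bigr) \le \frac{1}{c^2}. \]
Or putting $d = c/\sqrt{n}$,
\[ P\Bigl( \Bigl| \frac{S}{n} - \mu \Bigr| \ge d\sigma \Bigr) \le \frac{1}{n d^2}. \]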
Since
\[ \sum_{k \ge 0} g_k = 1 \]
(for the subsets {X = k} partition the sample space Ω), we must have
$G_X(1) = 1$. And so $G_X(1)$ must be a convergent series. Moreover,
\[ E(X) = \sum_k P(X = k)\, k = G_X'(1), \quad \text{where } G_X' \text{ is the derivative of } G_X. \]
This series is not always convergent, that is, E(X) can be infinite:
Example 47: we toss a coin until we get tails, and if we get k heads before
the tail, then you pay me $2^k$ pennies. The sample space Ω is
Ω = {t, ht, hht, hhht, . . .}.
Denote h · · · ht (k heads followed by a tail) by $h^k t$. Provided the coin is
fair, heads and tails each have probability 1/2; we assume the outcomes of
successive tosses are independent, so the probability measure P : Ω → [0, 1]
is given by $P(h^k t) = (1/2)^{k+1}$. The random variable X is $X(h^k t) = 2^k$, so
\[ E(X) = \sum_{k \ge 0} \frac{2^k}{2^{k+1}} = \sum_{k \ge 0} \frac{1}{2} = \infty. \]
Example 48: A random variable $U_n : \Omega \to \{0, \ldots, n-1\} \subset \mathbb{N}_0$ with uniform
probability distribution: $U_n(\omega) = 0, 1, \ldots, n-1$ each with probability 1/n.
The PGF is
\[ G_{U_n}(z) = \frac{1 + z + z^2 + \cdots + z^{n-1}}{n} = \frac{1 - z^n}{n(1-z)}, \quad \text{if } z \ne 1. \]
So
\[ G_{U_n}(1) = 1, \qquad G_{U_n}'(1) = \frac{n-1}{2}, \qquad G_{U_n}''(1) = \frac{(n-1)(n-2)}{3}, \]
\[ E(U_n) = G_{U_n}'(1) = \frac{n-1}{2}, \]
\[ V(U_n) = G_{U_n}''(1) + G_{U_n}'(1) - G_{U_n}'(1)^2 = \frac{n^2 - 1}{12}. \]
For example, if n = 6,
\[ V(U_6) = \frac{35}{12}. \]
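For a PGF with finitely many terms, $G'(1)$ and $G''(1)$ are finite sums, so the formulas for $E$ and $V$ are easy to evaluate exactly. The sketch below (the function name is illustrative) takes the list of probabilities $g_k = P(X = k)$ and returns mean and variance, and is applied to the uniform distribution of Example 48.

```python
from fractions import Fraction


def mean_and_variance(pgf_coeffs):
    """Return (G'(1), G''(1) + G'(1) - G'(1)^2) for G(z) = sum_k g_k z^k."""
    d1 = sum(k * g for k, g in enumerate(pgf_coeffs))            # G'(1)
    d2 = sum(k * (k - 1) * g for k, g in enumerate(pgf_coeffs))  # G''(1)
    return d1, d2 + d1 - d1 ** 2


if __name__ == "__main__":
    for n in (2, 6, 10):
        uniform = [Fraction(1, n)] * n
        print(n, mean_and_variance(uniform), Fraction(n * n - 1, 12))
```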
So the PGF of X + Y is
\[ G_{X+Y}(z) = \sum_{n \ge 0} \Bigl( \sum_k g_k\, h_{n-k} \Bigr) z^n = G_X(z)\, G_Y(z). \]
A single coin toss has two outcomes h, t with probabilities p, q with p+q = 1.
A fair coin has p = q = 1/2.
2. Suppose we toss the coin until we get h then stop. Ω = {h, th, tth, ttth, . . .} =
$\{t^k h \mid k \in \mathbb{N}_0\}$ where $t^k h$ means t k times, then h.
\[ P(t^k h) = q^k p, \]
\[ \sum_{k \ge 0} q^k p = \frac{p}{1-q} = \frac{p}{p} = 1. \]
So we do have a sample space, that is, $\sum_{\omega \in \Omega} P(\omega) = 1$.
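Then
\[ F(z) = \sum_{k \ge 1} q^{k-1} p\, z^k = \frac{pz}{1 - qz}. \]
The PGF of $F_n$ is
\[ \Bigl( \frac{pz}{1 - qz} \Bigr)^n. \]
So
\[ F'(z) = \frac{(1 - qz)p + pzq}{(1 - qz)^2} = \frac{p}{(1 - qz)^2}, \qquad
   F'(1) = \frac{p}{(1-q)^2} = \frac{1}{p}, \]
\[ E(F) = \frac{1}{p}, \qquad E(F_n) = \frac{n}{p}, \]
\[ F''(z) = \frac{2pq}{(1 - qz)^3}, \qquad F''(1) = \frac{2q}{p^2}, \]
\[ V(F) = \frac{2q}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{q}{p^2}, \qquad V(F_n) = \frac{nq}{p^2}. \]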
3. Now we keep tossing the coin until we get h twice in a row, hh. The sample
space is {hh, thh, hthh, tthh, . . .} with probabilities $\{p^2, qp^2, qp^3, q^2p^2, \ldots\}$.
Let G on Ω be an RV equal to the total number of tosses, as in (2).
Strings of length 1 in t and ht are t, ht. Length 2: tt, tht, htt, htht. Adding
these up, this could be written as $(t + ht)^2$. Adding up strings of length 3
in t and ht gives $(t + ht)^3$. Adding up strings of length k in t and ht gives
$(t + ht)^k$. So the sum of the elements of Ω can be written as
\[ \sum_{k \ge 0} (t + ht)^k\, hh. \]
So the PGF is
\[ G(z) = \sum_{k \ge 0} (qz + pqz^2)^k\, p^2 z^2 = \frac{p^2 z^2}{1 - qz - pqz^2}, \]
\[ G(1) = \frac{p^2}{1 - q - pq} = \frac{p^2}{p - pq} = \frac{p^2}{p(1-q)} = 1. \]
So
\[ \sum_{\omega \in \Omega} P(\omega) = 1 \]
and Ω really is a sample space. That is, sequence hh occurs with probability 1
if you keep tossing the coin.
\[ G'(z) = \frac{2(1 - qz - pqz^2)\, p^2 z + p^2 z^2 (q + 2pqz)}{(1 - qz - pqz^2)^2}, \]
\[ \text{expected number of tosses} = G'(1) = \frac{2p^4 + p^2(q + 2pq)}{p^4}
   = \frac{2p^4 + p^2(1 - 2p^2 + p)}{p^4} = \frac{1}{p} + \frac{1}{p^2}. \]
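The value of $G'(1)$ for the hh experiment can be double-checked with exact arithmetic. The sketch below is an illustration with a hypothetical function name. It applies the quotient rule at z = 1 to $G(z) = p^2 z^2 / (1 - qz - pqz^2)$ and compares the result with $1/p + 1/p^2$.

```python
from fractions import Fraction


def expected_tosses_until_hh(p):
    """G'(1) for G(z) = p^2 z^2 / (1 - q z - p q z^2), via the quotient rule."""
    p = Fraction(p)
    q = 1 - p
    num, num_prime = p ** 2, 2 * p ** 2               # N(1), N'(1) for N(z) = p^2 z^2
    den, den_prime = 1 - q - p * q, -q - 2 * p * q    # D(1), D'(1)
    return (num_prime * den - num * den_prime) / den ** 2


if __name__ == "__main__":
    for p in (Fraction(1, 2), Fraction(1, 3), Fraction(3, 4)):
        print(p, expected_tosses_until_hh(p), 1 / p + 1 / p ** 2)
```

For a fair coin the formula $1/p + 1/p^2$ gives 6 tosses on average before hh first appears.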
In case p = 1/2
\[ G(z) = \frac{z^3}{z^3 - 8z + 8}, \]
\[ G'(z) = \frac{(z^3 - 8z + 8)\, 3z^2 - z^3 (3z^2 - 8)}{(z^3 - 8z + 8)^2}, \]
\[ G'(1) = 8. \]
5. Consider the following game between A and B. We toss a fair coin (p =
1/2) until either hht occurs or thh. A wins if hht comes first, B if thh first.
A wins with pattern $h^k t$ ($k \ge 2$) only (that is, A wins only if the first two
tosses are h, with probability 1/4). B wins with $(t + ht)^k hh$ ($k \ge 1$). The
generating function for the game is the GF of $h^k t$ plus the GF of $(t + ht)^k hh$.
\[ G(z) = \frac{z^3}{4(2 - z)} + \frac{(z/2 + z^2/4)\, z^2/4}{1 - z/2 - z^2/4}. \]
Putting z = 1 gives
\[ \underbrace{\tfrac{1}{4}}_{\text{A wins}} + \underbrace{\tfrac{3}{4}}_{\text{B wins}}. \]
So thh is a “better” combination than hht.
6. A new game: now suppose that A wins with hht and B wins with htt.
The winning pattern for A is $t^k (ht)^l h^m\, hht$, $k, l, m \ge 0$. B wins with $t^k (ht)^l\, htt$,
$k, l \ge 0$.
The GF is
\[ \underbrace{\frac{z^3}{8(1 - z/2)(1 - z^2/4)(1 - z/2)}}_{\text{A wins}}
   + \underbrace{\frac{z^3}{8(1 - z/2)(1 - z^2/4)}}_{\text{B wins}}. \]
Putting z = 1 gives
\[ \frac{2}{3} + \frac{1}{3}. \]
A wins with probability 2/3.
So hht is better than htt. We have thh > hht > htt.
But thh and htt are equally likely to come first (by symmetry). So the
relationship ‘better than’ is not transitive.
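The three comparisons can be reproduced without generating functions. The sketch below is an illustration; the function name, the state encoding and the cut-off of 200 tosses are choices made here. It follows the probability distribution of the last two tosses of a fair coin and accumulates the probability that each pattern of length 3 is completed first. The probability still undecided after 200 tosses is negligible.

```python
from fractions import Fraction


def race(pattern_a, pattern_b, steps=200):
    """Approximate P(pattern_a first), P(pattern_b first) for a fair coin.

    Both patterns are strings of length 3 over 'h' and 't'.
    """
    half = Fraction(1, 2)
    dist = {"": Fraction(1)}  # distribution over the last (at most two) tosses
    win_a = win_b = Fraction(0)
    for _ in range(steps):
        new_dist = {}
        for suffix, prob in dist.items():
            for toss in "ht":
                s = suffix + toss
                if s.endswith(pattern_a):
                    win_a += prob * half
                elif s.endswith(pattern_b):
                    win_b += prob * half
                else:
                    key = s[-2:]
                    new_dist[key] = new_dist.get(key, 0) + prob * half
        dist = new_dist
    return float(win_a), float(win_b)


if __name__ == "__main__":
    print(race("hht", "thh"))  # item 5
    print(race("hht", "htt"))  # item 6
    print(race("thh", "htt"))  # the symmetric pair
```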