Part IA - Probability: Definitions
Definitions
Lent 2015
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Basic concepts
Classical probability, equally likely outcomes. Combinatorial analysis, permutations
and combinations. Stirling’s formula (asymptotics for log n! proved). [3]
Axiomatic approach
Axioms (countable case). Probability spaces. Inclusion-exclusion formula. Continuity
and subadditivity of probability measures. Independence. Binomial, Poisson and geometric
distributions. Relation between Poisson and binomial distributions. Conditional
probability, Bayes’s formula. Examples, including Simpson’s paradox. [5]
Contents
0 Introduction
1 Classical probability
  1.1 Classical probability
  1.2 Counting
  1.3 Stirling’s formula
2 Axioms of probability
  2.1 Axioms and definitions
  2.2 Inequalities and formulae
  2.3 Independence
  2.4 Important discrete distributions
  2.5 Conditional probability
4 Interesting problems
  4.1 Branching processes
  4.2 Random walk and gambler’s ruin
6 More distributions
  6.1 Cauchy distribution
  6.2 Gamma distribution
  6.3 Beta distribution*
  6.4 More on the normal distribution
  6.5 Multivariate normal
8 Summary of distributions
  8.1 Discrete distributions
  8.2 Continuous distributions
0 Introduction
1 Classical probability
1.1 Classical probability
Definition (Classical probability). Classical probability applies in a situation
when there are a finite number of equally likely outcomes.
Definition (Sample space). The set of all possible outcomes is the sample space,
Ω. We can list the outcomes as ω1, ω2, · · · ∈ Ω. Each ω ∈ Ω is an outcome.
Definition (Event). A subset of Ω is called an event.
Definition (Set notations). Given any two events A, B ⊆ Ω,
– The complement of A is A^C = A′ = Ā = Ω \ A.
– “A or B” is the set A ∪ B.
– “A and B” is the set A ∩ B.
– A and B are mutually exclusive or disjoint if A ∩ B = ∅.
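The set notation above maps directly onto Python's built-in set operations. The sketch below is only an illustration: the sample space (one roll of a six-sided die) and the events A and B are assumed examples, not taken from the notes.

```python
# Assumed example: the sample space for one roll of a six-sided die.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event "the roll is even"
B = {1, 2, 3}  # event "the roll is at most 3"

A_complement = omega - A  # Ω \ A
A_or_B = A | B            # A ∪ B
A_and_B = A & B           # A ∩ B

# An event and its complement are always mutually exclusive.
disjoint = A.isdisjoint(A_complement)
```

Here A and B are not disjoint, since both contain the outcome 2.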
1.2 Counting
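Definition (Sampling with replacement). When we sample with replacement,
after choosing an item, it is put back and can be chosen again. Here the sample
is described by a sampling function f : {1, · · · , n} → X, where X is the set of
items and f(i) is the item chosen on the ith draw. Then any sampling
function f satisfies sampling with replacement.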
Definition (Sampling without replacement). When we sample without replacement,
after choosing an item, we kill it with fire and cannot choose it again.
Then f must be an injective function, and clearly we must have |X| ≥ n.
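As a rough illustration of the two sampling schemes, the sketch below enumerates every ordered sample of size n from a small set X. It reads f as a map from the draw positions to X, which is an interpretation assumed here; the set X and the value of n are made up for the example.

```python
from itertools import permutations, product
from math import perm

X = ["a", "b", "c", "d"]  # hypothetical set of items
n = 2                     # number of draws

# With replacement: every function f from draw positions to X is allowed.
with_replacement = list(product(X, repeat=n))

# Without replacement: only injective f, so no item appears twice.
without_replacement = list(permutations(X, n))

count_with = len(with_replacement)        # |X|^n
count_without = len(without_replacement)  # |X|! / (|X| - n)!
matches_formula = count_without == perm(len(X), n)
```

With these values there are 16 samples with replacement and 12 without.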
Definition (Multinomial coefficient). A multinomial coefficient is
\[
  \binom{n}{n_1, n_2, \dots, n_k}
  = \binom{n}{n_1} \binom{n - n_1}{n_2} \cdots \binom{n - n_1 - \cdots - n_{k-1}}{n_k}
  = \frac{n!}{n_1!\, n_2! \cdots n_k!}.
\]
It is the number of ways to distribute n items into k positions, in which the ith
position has ni items.
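The two expressions for the multinomial coefficient can be compared numerically. This is a minimal sketch; the function names are chosen here for illustration and are not from the notes.

```python
from math import comb, factorial

def multinomial_via_binomials(ns):
    """Product of binomials: choose n1 of n, then n2 of the rest, and so on."""
    remaining = sum(ns)
    result = 1
    for k in ns:
        result *= comb(remaining, k)
        remaining -= k
    return result

def multinomial_via_factorials(ns):
    """n! / (n1! n2! ... nk!), using exact integer division at each step."""
    result = factorial(sum(ns))
    for k in ns:
        result //= factorial(k)
    return result

example = multinomial_via_binomials((2, 1, 1))  # 4 items into groups of 2, 1, 1
```

For the group sizes (2, 1, 1) both functions give 12.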
2 Axioms of probability
2.1 Axioms and definitions
Definition (Probability space). A probability space is a triple (Ω, F, P). Ω is a
set called the sample space, F is a collection of subsets of Ω, and P : F → [0, 1]
is the probability measure.
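F has to satisfy the following axioms:
(i) ∅, Ω ∈ F.
(ii) A ∈ F ⇒ A^C ∈ F.
(iii) A1, A2, · · · ∈ F ⇒ ⋃_{i=1}^∞ Ai ∈ F.
And P has to satisfy the following axioms:
(i) 0 ≤ P(A) ≤ 1 for all A ∈ F.
(ii) P(Ω) = 1.
(iii) For any countable collection of pairwise disjoint events A1, A2, · · · ∈ F,
P(⋃_i Ai) = Σ_i P(Ai).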
Items in Ω are known as the outcomes, items in F are known as the events, and
P(A) is the probability of the event A.
Definition (Probability distribution). Let Ω = {ω1, ω2, · · · }. Choose non-negative
numbers p1, p2, · · · such that Σ_{i=1}^∞ pi = 1. Let p(ωi) = pi. Then define
\[
  P(A) = \sum_{\omega_i \in A} p(\omega_i).
\]
This P(A) satisfies the above axioms, and p1, p2, · · · is the probability distribution.
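A small finite case makes this construction concrete. The sketch below assumes a three-outcome sample space with made-up weights, and uses exact fractions so that the checks are not affected by rounding.

```python
from fractions import Fraction

# Hypothetical distribution on three outcomes; the weights sum to 1.
p = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}

def prob(event):
    """P(A): the sum of p(w) over the outcomes w in the event A."""
    return sum((p[w] for w in event), Fraction(0))

omega = set(p)
A = {"w1", "w3"}
B = {"w2"}

total = prob(omega)                            # should be 1
additive = prob(A | B) == prob(A) + prob(B)    # A and B are disjoint
```

In this example P(Ω) = 1 and additivity holds for the disjoint events A and B, as the definition requires.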
Definition (Limit of events). A sequence of events A1, A2, · · · is increasing if
A1 ⊆ A2 ⊆ · · · . Then we define the limit as
\[
  \lim_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} A_n.
\]
\[
  P(A \mid B) = \frac{P(A \cap B)}{P(B)}.
\]
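The conditional probability formula above only makes sense when P(B) > 0. As an illustration under the classical assumption of equally likely outcomes, the sketch below uses two fair dice; the events A and B are invented for the example.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered pairs from two fair six-sided dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(A, B):
    """P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if prob(B) == 0:
        raise ValueError("P(B) must be positive")
    return prob(A & B) / prob(B)

A = {w for w in omega if w[0] + w[1] == 8}  # the total is 8
B = {w for w in omega if w[0] == 3}         # the first die shows 3

unconditional = prob(A)
conditional = cond_prob(A, B)
```

Here P(A) = 5/36, while P(A | B) = 1/6, since given a first roll of 3 only a second roll of 5 gives a total of 8.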
3 Discrete random variables
and similarly for the other distributions we have come up with before.
Definition (Expectation). The expectation (or mean) of a real-valued X is
equal to
\[
  E[X] = \sum_{\omega \in \Omega} p_\omega X(\omega).
\]
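The sum defining E[X] can be evaluated directly on a small sample space. The sketch below assumes a fair six-sided die, so pω = 1/6 for each outcome; the random variables used are illustrative.

```python
from fractions import Fraction

# Assumed distribution: a fair six-sided die.
p = {w: Fraction(1, 6) for w in range(1, 7)}

def expectation(X):
    """E[X]: the sum over outcomes w of p_w * X(w)."""
    return sum(p[w] * X(w) for w in p)

mean_roll = expectation(lambda w: w)        # X(w) = w
mean_square = expectation(lambda w: w * w)  # X(w) = w^2
```

For the fair die this gives E[X] = 7/2 and E[X^2] = 91/6.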
3.2 Inequalities
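Definition (Convex function). A function f : (a, b) → R is convex if for all
x1, x2 ∈ (a, b) and λ1, λ2 ≥ 0 such that λ1 + λ2 = 1,
\[
  \lambda_1 f(x_1) + \lambda_2 f(x_2) \geq f(\lambda_1 x_1 + \lambda_2 x_2).
\]
[Figure: graph of a convex function, with the chord value λ1 f(x1) + λ2 f(x2) lying above the curve at the point λ1 x1 + λ2 x2 between x1 and x2.]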
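Convexity says that the chord joining two points of the graph lies on or above the graph, that is, λ1 f(x1) + λ2 f(x2) ≥ f(λ1 x1 + λ2 x2). The sketch below checks this inequality on a grid of weights for one pair of points. It can only detect a violation and is not a proof of convexity; the function name and tolerance are choices made for this example.

```python
def violates_convexity(f, x1, x2, steps=100, tol=1e-12):
    """Return True if the chord from x1 to x2 dips below f at some grid weight."""
    for i in range(steps + 1):
        lam1 = i / steps
        lam2 = 1 - lam1
        chord = lam1 * f(x1) + lam2 * f(x2)
        curve = f(lam1 * x1 + lam2 * x2)
        if chord < curve - tol:
            return True
    return False

square_ok = not violates_convexity(lambda x: x * x, -1.0, 2.0)
neg_square_fails = violates_convexity(lambda x: -x * x, -1.0, 2.0)
```

The function x^2 passes the check on this pair of points, while −x^2 fails it.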
\[
  P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}.
\]
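To see the formula in action, the sketch below computes a conditional distribution from a small joint table. The joint probabilities are invented for illustration, and the helper names are not from the notes.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y) on {0, 1} x {0, 1}.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}

def marginal_y(y):
    """P(Y = y): sum the joint probabilities over all x."""
    return sum((q for (_, yy), q in joint.items() if yy == y), Fraction(0))

def cond_x_given_y(x, y):
    """P(X = x | Y = y); requires P(Y = y) > 0."""
    return joint.get((x, y), Fraction(0)) / marginal_y(y)

p_y1 = marginal_y(1)
p_x1_given_y1 = cond_x_given_y(1, 1)
```

With this table P(Y = 1) = 5/8 and P(X = 1 | Y = 1) = 3/5, and the conditional probabilities over x sum to 1.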
4 Interesting problems
4.1 Branching processes
4.2 Random walk and gambler’s ruin
Definition (Random walk). Let X1, · · · , Xn be iid random variables such
that each Xi = +1 with probability p, and −1 with probability q = 1 − p. Let Sn =
S0 + X1 + · · · + Xn. Then (S0, S1, · · · , Sn) is a 1-dimensional random walk.
If p = q = 1/2, we say it is a symmetric random walk.
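A random walk of this kind is easy to simulate. The following is a minimal sketch, assuming Python's standard pseudo-random generator is an adequate source of independent steps; the function name and its parameters are choices made for this example.

```python
import random

def random_walk(n, p=0.5, s0=0, seed=None):
    """Return the path (S0, S1, ..., Sn) of a 1-dimensional random walk.

    Each step is +1 with probability p and -1 with probability 1 - p.
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path

path = random_walk(100, p=0.5, seed=0)  # a symmetric walk started at 0
```

The returned list has n + 1 entries, and consecutive entries always differ by exactly 1.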
5 Continuous random variables
f(x̂) ≥ f(x)
for all x. Note that a distribution can have many modes. For example, in the
uniform distribution, every x in the support is a mode.
We say it is a median if
\[
  \int_{-\infty}^{\hat{x}} f(x)\, dx = \frac{1}{2} = \int_{\hat{x}}^{\infty} f(x)\, dx.
\]
where
\[
  f(x_1, \dots, x_n) \geq 0
\]
and
\[
  \int_{\mathbb{R}^n} f(x_1, \dots, x_n)\, dx_1 \cdots dx_n = 1.
\]
and
\[
  f(x_1, \dots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n)
\]
are each individually equivalent to the definition above.
\[
  f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right).
\]
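The density above, which is the normal pdf with mean µ and standard deviation σ, can be checked numerically. The sketch below evaluates it and integrates it with a simple midpoint rule; the integration range and step count are arbitrary choices for this example, so the result is approximate rather than exact.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density with mean mu and standard deviation sigma, evaluated at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

def integrate(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Integrate over mu ± 10 sigma, where essentially all of the mass lies.
total_mass = integrate(lambda x: normal_pdf(x, mu=1.0, sigma=2.0), -19.0, 21.0)
```

The computed total mass is very close to 1, as a probability density requires.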
6 More distributions
6.1 Cauchy distribution
Definition (Cauchy distribution). The Cauchy distribution has pdf
\[
  f(x) = \frac{1}{\pi(1 + x^2)}
\]
\[
  f(x) = \frac{\lambda^n x^{n-1} e^{-\lambda x}}{(n - 1)!}.
\]
\[
  f(x; a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\Gamma(b)}\, x^{a-1} (1 - x)^{b-1}
\]
for 0 ≤ x ≤ 1.
This has mean a/(a + b).
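The stated mean a/(a + b) can be checked numerically for particular parameters. The sketch below uses `math.gamma` for Γ and a midpoint rule on [0, 1]; the parameter values a = 2, b = 3 and the step count are arbitrary choices, so this is a sanity check rather than a derivation.

```python
from math import gamma

def beta_pdf(x, a, b):
    """The density f(x; a, b) on 0 <= x <= 1."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x ** (a - 1) * (1 - x) ** (b - 1)

def integrate(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, b = 2.0, 3.0
total_mass = integrate(lambda x: beta_pdf(x, a, b), 0.0, 1.0)
mean = integrate(lambda x: x * beta_pdf(x, a, b), 0.0, 1.0)
```

For a = 2 and b = 3 the numerical mean agrees with a/(a + b) = 0.4 to within the integration error.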
7 Central limit theorem
8 Summary of distributions
8.1 Discrete distributions
8.2 Continuous distributions