Lecture 03
1 Independence
Definition 1.1 (Independence of events). For a probability space (Ω, F, P), a family of events (Ai ∈ F : i ∈ I) is said to be independent if for any finite set F ⊆ I, we have
P(∩i∈F Ai) = ∏i∈F P(Ai).
Remark 1. The certain event Ω and the impossible event ∅ are independent of every event A ∈ F.
Example 1.2 (Two coin tosses). Consider two coin tosses, such that the sample space is Ω = {HH, HT, TH, TT} and the event space is F = 2^Ω. It suffices to define a probability function P : F → [0, 1] on the event space. We define one such probability function P, such that
P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.
Let events A1 ≜ {HH, HT} and A2 ≜ {HH, TH} correspond to getting a head on the first and the second toss respectively.
From the defined probability function, the probability of getting a head on the first or the second toss is 1/2, identical to the probability of getting a tail. That is, P(A1) = P(A2) = 1/2, and the intersecting event A1 ∩ A2 = {HH} has probability P(A1 ∩ A2) = 1/4.
Therefore, for events A1, A2 ∈ F, we have
P(A1 ∩ A2) = P(A1)P(A2),
and the events A1 and A2 are independent.
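For concreteness, the product rule in this example can be checked by direct enumeration. The following Python sketch is our own illustration; the string encoding of outcomes is an arbitrary choice.

from fractions import Fraction

# Sample space of two coin tosses; each outcome carries probability 1/4.
omega = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in omega}

def prob(event):
    # Probability of an event, i.e. a subset of the sample space.
    return sum(P[w] for w in event)

A1 = {w for w in omega if w[0] == "H"}  # head on the first toss
A2 = {w for w in omega if w[1] == "H"}  # head on the second toss

# P(A1 ∩ A2) = 1/4 equals P(A1) P(A2) = 1/2 * 1/2.
assert prob(A1 & A2) == prob(A1) * prob(A2)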
Example 1.3 (Countably infinite coin tosses). Consider a sequence of coin tosses, such that the sample space is Ω = {H, T}^N. For the sets of outcomes En ≜ {ω ∈ Ω : ωn = H}, we consider the event space F ≜ σ({En : n ∈ N}). We define a probability function P : F → [0, 1] by P(∩i∈F Ei) = p^|F| for any finite subset F ⊆ N. By definition, (En : n ∈ N) is a sequence of independent events.
We observe that the set of outcomes corresponding to at least one head in the first n tosses is An ≜ ∪i∈[n] Ei, and the set of outcomes corresponding to the first head appearing at the nth toss is Bn ≜ En ∩ (∩i∈[n−1] Ei^c). In particular, this implies that σ({An : n ∈ N}) ⊆ F and σ({Bn : n ∈ N}) ⊆ F. We can show that P(An) = 1 − (1 − p)^n and P(Bn) = p(1 − p)^(n−1) for n ∈ N.
Let Fn be the event space generated by the first n coin tosses, i.e. Fn ≜ σ({Ei : i ∈ [n]}). Then, we can show that F = σ({Fn : n ∈ N}). For any ω ∈ Ω, we can define the number of heads in the first n trials by kn(ω) ≜ ∑i∈[n] 1{ωi = H}. Then, we observe that any event A ∈ Fn can be written as a union of intersections ∩i∈[n] Ci, where Ci = Ei or Ei^c. That is, we can specify the first n outcomes for each ω ∈ A. Since P(∩i∈[n] Ci) = ∏i∈[n] P(Ci), we have
P(A) = ∑ω∈A p^(kn(ω)) (1 − p)^(n−kn(ω)).
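Since An and Bn lie in Fn, the closed forms P(An) = 1 − (1 − p)^n and P(Bn) = p(1 − p)^(n−1) can be verified by enumerating the first n outcomes. A minimal Python sketch, with the bias p and the horizon n chosen arbitrarily for illustration:

from fractions import Fraction
from itertools import product

p, n = Fraction(1, 3), 5  # arbitrary head probability and number of tosses

def prob(event):
    # Each length-n prefix w has probability p^(#heads) (1-p)^(#tails).
    return sum(p**w.count("H") * (1 - p)**w.count("T") for w in event)

prefixes = ["".join(w) for w in product("HT", repeat=n)]
A_n = [w for w in prefixes if "H" in w]  # at least one head in first n tosses
B_n = [w for w in prefixes if w[:-1] == "T" * (n - 1) and w[-1] == "H"]  # first head at toss n

assert prob(A_n) == 1 - (1 - p)**n
assert prob(B_n) == p * (1 - p)**(n - 1)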
Example 1.4 (Counterexample). Consider a probability space (Ω, F, P) and the events A1, A2, A3 ∈ F. The condition P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3) is not sufficient to guarantee independence of the three events. In particular, we see that if
P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3),   P(A1 ∩ A2 ∩ A3^c) ≠ P(A1)P(A2)P(A3^c),
then P(A1 ∩ A2) = P(A1 ∩ A2 ∩ A3) + P(A1 ∩ A2 ∩ A3^c) ≠ P(A1)P(A2).
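One concrete instance of this failure is the following; the eight-point uniform space and the particular events are our own construction, with A2 = A1 chosen so that the pairwise failure is immediate.

from fractions import Fraction

omega = set(range(1, 9))  # uniform probability on eight points

def prob(event):
    return Fraction(len(event), len(omega))

A1 = {1, 2, 3, 4}
A2 = {1, 2, 3, 4}  # identical to A1, hence certainly not independent of it
A3 = {1, 5, 6, 7}

# The triple product condition holds: both sides equal 1/8 ...
assert prob(A1 & A2 & A3) == prob(A1) * prob(A2) * prob(A3)
# ... yet the pair (A1, A2) violates the product rule: 1/2 != 1/4.
assert prob(A1 & A2) != prob(A1) * prob(A2)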
Definition 1.5. A family of collections of events (𝒜i ⊆ F : i ∈ I) is called independent, if for any finite set F ⊆ I and any choice of events Ai ∈ 𝒜i for all i ∈ F, we have
P(∩i∈F Ai) = ∏i∈F P(Ai).
3 Conditional Probability
Consider N trials of a random experiment over an outcome space Ω and an event space F. Let ωn ∈ Ω denote the outcome of the nth trial. Consider two events A, B ∈ F, and denote the number of times event A and event B occur by N(A) and N(B) respectively. We denote the number of times both events A and B occurred by N(A ∩ B). Then, we can write these numbers in terms of indicator functions as
N(A) = ∑n∈[N] 1{ωn ∈ A},   N(B) = ∑n∈[N] 1{ωn ∈ B},   N(A ∩ B) = ∑n∈[N] 1{ωn ∈ A ∩ B}.
We denote the relative frequencies of events A, B, A ∩ B in N trials by N(A)/N, N(B)/N, N(A ∩ B)/N respectively. We can find the relative frequency of event A on the trials where B occurred as
(N(A ∩ B)/N) / (N(B)/N) = N(A ∩ B)/N(B).
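In a simulation, this ratio stabilizes as N grows. A short Python sketch in this spirit, where the experiment (a fair die roll) and the events A, B are our own choices:

import random

random.seed(0)
N = 100_000
N_A = N_B = N_AB = 0

for _ in range(N):
    roll = random.randint(1, 6)  # one trial of the experiment
    in_A = roll % 2 == 0         # event A: the roll is even
    in_B = roll <= 4             # event B: the roll is at most 4
    N_A += in_A
    N_B += in_B
    N_AB += in_A and in_B

# Relative frequency of A on the trials where B occurred; for this
# experiment it should settle near |{2, 4}| / |{1, 2, 3, 4}| = 1/2.
print(N_AB / N_B)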
Inspired by the relative frequency, we define the probability of an event conditioned on another event.
Definition 3.1. Fix an event B ∈ F such that P(B) > 0. We can define the conditional probability P(·|B) : F → [0, 1] of any event A ∈ F conditioned on the event B as
P(A|B) = P(A ∩ B)/P(B).
Lemma 3.2 (Conditional probability). For any event B ∈ F such that P( B) > 0, the conditional probability
P(·| B) : F → [0, 1] is a probability measure on space (Ω, F ).
Proof. We will show that the conditional probability satisfies the axioms of a probability measure.
Non-negativity: For all events A ∈ F, we have P(A|B) ≥ 0, since P(A ∩ B) ≥ 0 and P(B) > 0.
σ-additivity: For an infinite sequence of mutually disjoint events (Ai ∈ F : i ∈ N) such that Ai ∩ Aj = ∅ for all i ≠ j, we have P(∪i∈N Ai|B) = ∑i∈N P(Ai|B). This follows from the disjointness of the sequence (Ai ∩ B ∈ F : i ∈ N) and the σ-additivity of P, since P(∪i∈N Ai|B) = P(∪i∈N(Ai ∩ B))/P(B) = ∑i∈N P(Ai ∩ B)/P(B) = ∑i∈N P(Ai|B).
Certainty: Since Ω ∩ B = B, we have P(Ω|B) = P(B)/P(B) = 1.
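On a finite space, the lemma can be checked directly from Definition 3.1. A minimal sketch, with a uniform die roll and a conditioning event B of our choosing:

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # uniform die roll
B = {1, 2, 3}               # conditioning event with P(B) = 1/2

def prob(event):
    return Fraction(len(event), len(omega))

def cond(event):
    # P(event | B) per Definition 3.1.
    return prob(event & B) / prob(B)

assert cond(omega) == 1                        # certainty
assert all(cond({w}) >= 0 for w in omega)      # non-negativity
assert cond({1, 2}) == cond({1}) + cond({2})   # additivity on disjoint events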
Remark 2. For two independent events A, B ∈ F such that P( A ∩ B) > 0, we have P( A| B) = P( A) and
P( B| A) = P( B). If either P( A) = 0 or P( B) = 0, then P( A ∩ B) = 0.
Remark 3. For any countable partition (Bn ∈ F : n ∈ N) of the sample space Ω, if P(Bn) > 0 for all n ∈ N, then from the law of total probability and the definition of conditional probability, we have for any event A ∈ F
P(A) = ∑n∈N P(A|Bn)P(Bn).
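As a worked instance of this identity, consider the hypothetical two-urn experiment below, where Bn indicates which urn is chosen and A is the event of drawing a red ball; the numbers are ours, purely for illustration.

from fractions import Fraction

# A two-set partition: which of two urns is chosen, each with probability 1/2.
P_B = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}
# Conditional probability of drawing a red ball from each urn.
P_A_given_B = {"B1": Fraction(2, 3), "B2": Fraction(1, 4)}

# P(A) = sum over the partition of P(A|Bn) P(Bn) = 1/3 + 1/8 = 11/24.
P_A = sum(P_A_given_B[b] * P_B[b] for b in P_B)
print(P_A)  # 11/24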
4 Conditional Independence
Definition 4.1 (Conditional independence of events). For a probability space (Ω, F, P), a family of events (Ai ∈ F : i ∈ I) is said to be conditionally independent given an event C ∈ F such that P(C) > 0, if for any finite set F ⊆ I, we have
P(∩i∈F Ai|C) = ∏i∈F P(Ai|C).
Remark 4. Let C ∈ F be an event such that P(C) > 0. Two events A, B ∈ F are said to be conditionally independent given the event C, if
P(A ∩ B|C) = P(A|C)P(B|C).
If the event C = Ω, this reduces to the independence of the events A and B.
Remark 5. Two events may be independent but not conditionally independent, and vice versa.
Example 4.2. Consider two independent events A, B ∈ F such that P( A ∩ B) > 0 and P( A ∪ B) < 1.
Then the events A and B are not conditionally independent given A ∪ B. To see this, we observe that
P(A ∩ B|A ∪ B) = P((A ∩ B) ∩ (A ∪ B))/P(A ∪ B) = P(A ∩ B)/P(A ∪ B) = P(A)P(B)/P(A ∪ B) = P(A|A ∪ B)P(B).
We further observe that P(B|A ∪ B) = P(B)/P(A ∪ B) ≠ P(B), and hence P(A ∩ B|A ∪ B) ≠ P(A|A ∪ B)P(B|A ∪ B).
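This can be confirmed exactly on the two-toss space of Example 1.2, with A and B the heads events of that example; the encoding below is our own.

from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))  # uniform two-toss sample space

def prob(event):
    return Fraction(len(set(event)), len(omega))

def cond(event, given):
    # Conditional probability P(event | given) on the uniform space.
    return prob(set(event) & set(given)) / prob(given)

A = {w for w in omega if w[0] == "H"}  # head on the first toss
B = {w for w in omega if w[1] == "H"}  # head on the second toss
C = A | B

assert prob(A & B) == prob(A) * prob(B)           # A and B are independent,
assert cond(A & B, C) != cond(A, C) * cond(B, C)  # but not given C = A ∪ B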
Example 4.3. Consider two non-independent events A, B ∈ F such that P( A) > 0. Then the events A
and B are conditionally independent given A. To see this, we observe that
P(A ∩ B|A) = P(A ∩ B)/P(A) = P(B|A) = P(B|A)P(A|A),
since P(A|A) = 1.
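Conversely, on the same two-toss space, taking the dependent pair A (head on the first toss) and B (heads on both tosses) shows that conditioning on A makes them independent; again the encoding is our own illustration.

from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))  # uniform two-toss sample space

def prob(event):
    return Fraction(len(set(event)), len(omega))

def cond(event, given):
    return prob(set(event) & set(given)) / prob(given)

A = {("H", "H"), ("H", "T")}  # head on the first toss
B = {("H", "H")}              # heads on both tosses

assert prob(A & B) != prob(A) * prob(B)           # A and B are dependent,
assert cond(A & B, A) == cond(A, A) * cond(B, A)  # but independent given A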