Mathematical Foundations of Computer Science Lecture Outline
November 8, 2018
Pr[X = 0 ∧ Y = 0] = 0, but Pr[X = 0] · Pr[Y = 0] = (1/3) · (2/3) = 2/9 ≠ 0.

Note that for all ω ∈ Ω, X(ω)Y(ω) = 0. Also, E[X] = 0 and E[Y] = 1/3. Thus we have

E[XY] = 0 = E[X] · E[Y]
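For concreteness, one pair of random variables consistent with the numbers above (an assumption here, since the definition of X and Y is given earlier in the notes) is: Ω = {−1, 0, 1} with the uniform distribution, X(ω) = ω, and Y(ω) = 1 if ω = 0 and 0 otherwise. A short enumeration in Python confirms the computation:

    from fractions import Fraction

    omega = [-1, 0, 1]                 # uniform sample space, each outcome with probability 1/3
    pr = Fraction(1, 3)
    X = {w: w for w in omega}
    Y = {w: 1 if w == 0 else 0 for w in omega}

    E_X = sum(pr * X[w] for w in omega)           # 0
    E_Y = sum(pr * Y[w] for w in omega)           # 1/3
    E_XY = sum(pr * X[w] * Y[w] for w in omega)   # 0
    print(E_X, E_Y, E_XY)   # E[XY] = E[X] · E[Y] even though X and Y are not independent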
Example (Chebyshev’s Inequality). Let X be a random variable. Show that for any
a > 0,
Pr[|X − E[X]| ≥ a] ≤ Var[X] / a²
Solution. The inequality that we proved in the earlier homework is called Markov's inequality: for any non-negative random variable Z and any a > 0, Pr[Z ≥ a] ≤ E[Z]/a. We will use it to prove the above tail bound, called Chebyshev's inequality. Apply Markov's inequality to the non-negative random variable (X − E[X])². Since |X − E[X]| ≥ a if and only if (X − E[X])² ≥ a², we get

Pr[|X − E[X]| ≥ a] = Pr[(X − E[X])² ≥ a²] ≤ E[(X − E[X])²] / a² = Var[X] / a²
Example. Use Chebyshev’s inequality to bound the probability of obtaining at least 3n/4
heads in a sequence of n fair coin flips.
Solution. Let X be the random variable denoting the total number of heads in n flips of a fair coin. For 1 ≤ i ≤ n, let Xi be a random variable that is 1 if the ith flip results in heads, and 0 otherwise. Thus,
X = X1 + X2 + · · · + Xn
By the linearity of expectation, E[X] = n/2. Since the random variables Xi are independent, we have

Var[X] = Σ_{i=1}^{n} Var[Xi] = Σ_{i=1}^{n} (1/2 − 1/4) = n/4

By Chebyshev's inequality,

Pr[X ≥ 3n/4] ≤ Pr[|X − n/2| ≥ n/4] ≤ Var[X] / (n/4)² = (n/4) / (n²/16) = 4/n
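As a quick numerical sanity check, the bound can be compared against a simulation. The sketch below assumes Python with only the standard library; the choice n = 100 and the number of trials are illustrative.

    import random

    def estimate_tail(n, trials=100_000):
        """Empirically estimate Pr[at least 3n/4 heads in n fair coin flips]."""
        threshold = 3 * n / 4
        hits = sum(
            1 for _ in range(trials)
            if sum(random.random() < 0.5 for _ in range(n)) >= threshold
        )
        return hits / trials

    n = 100
    print("empirical probability:", estimate_tail(n))
    print("Chebyshev bound 4/n:  ", 4 / n)

The empirical probability is typically far smaller than 4/n; Chebyshev's inequality is valid here but loose, since the true tail probability decays exponentially in n.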
Probability Distributions
Tossing a coin is an experiment with exactly two outcomes: heads (“success”) with a
probability of, say p, and tails (“failure”) with a probability of 1 − p. Such an experiment
is called a Bernoulli trial. Let Y be a random variable that is 1 if the experiment succeeds
and is 0 otherwise. Y is called a Bernoulli or an indicator random variable. For such a
variable we have
E[Y ] = p · 1 + (1 − p) · 0 = p = Pr[Y = 1]
Thus, for a fair coin, if we consider heads as "success", then the expected value of the corresponding indicator random variable is 1/2.
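As a small illustration, the empirical average of many simulated indicator variables approaches p. The following Python sketch assumes only the standard library; p = 0.3 is an illustrative value.

    import random

    p, trials = 0.3, 100_000
    samples = [1 if random.random() < p else 0 for _ in range(trials)]
    print(sum(samples) / trials)   # close to E[Y] = p = 0.3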
A sequence of Bernoulli trials means that the trials are independent and each has a prob-
ability p of success. We will study two important distributions that arise from Bernoulli
trials: the geometric distribution and the binomial distribution.
Suppose we repeatedly flip a coin that comes up heads ("success") with probability p, independently, until the first head appears, and let X be the number of flips. X is called a geometric random variable with parameter p. Note that the sample space Ω consists of all sequences that end in H and contain exactly one H. That is,

Ω = {H, TH, TTH, TTTH, TTTTH, . . .}

For any ω ∈ Ω of length i, Pr[ω] = (1 − p)^{i−1} p.
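These probabilities can be checked by simulating the experiment directly. The Python sketch below assumes only the standard library; p = 0.4 and the number of trials are illustrative choices.

    import random
    from collections import Counter

    def flips_until_first_head(p):
        """Flip a p-biased coin until the first head; return the number of flips."""
        flips = 1
        while random.random() >= p:    # tails, with probability 1 - p
            flips += 1
        return flips

    p, trials = 0.4, 200_000
    counts = Counter(flips_until_first_head(p) for _ in range(trials))
    for i in range(1, 6):
        print(i, counts[i] / trials, (1 - p) ** (i - 1) * p)   # empirical vs. (1 - p)^(i-1) p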
Let’s now calculate the expectation of a geometric random variable, X. We can do this in
several ways. One way is to use the definition of expectation.
E[X] = Σ_{i=1}^{∞} i · Pr[X = i]

     = Σ_{i=1}^{∞} i (1 − p)^{i−1} p

     = (p / (1 − p)) · Σ_{i=1}^{∞} i (1 − p)^{i}

     = (p / (1 − p)) · (1 − p) / (1 − (1 − p))²        (∵ Σ_{k=0}^{∞} k x^{k} = x / (1 − x)² for |x| < 1)

     = (p / (1 − p)) · (1 − p) / p²

     = 1 / p
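A truncated version of this series can be evaluated numerically to confirm that it approaches 1/p. The following is a Python sketch; p = 0.25 and the cutoff N = 1000 are arbitrary illustrative choices.

    p, N = 0.25, 1000
    series = sum(i * (1 - p) ** (i - 1) * p for i in range(1, N + 1))
    print(series, 1 / p)   # both approximately 4.0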
Another way to compute the expectation is to note that X is a random variable that takes on only non-negative values. From a theorem proved in the last class we know that if X takes on only non-negative integer values then

E[X] = Σ_{i=1}^{∞} Pr[X ≥ i]
Using this result we can calculate the expectation of the geometric random variable X. For
the geometric random variable X with parameter p,
Pr[X ≥ i] = Σ_{j=i}^{∞} (1 − p)^{j−1} p
          = (1 − p)^{i−1} p · Σ_{j=0}^{∞} (1 − p)^{j}
          = (1 − p)^{i−1} p · (1 / (1 − (1 − p)))
          = (1 − p)^{i−1}
Therefore
E[X] = Σ_{i=1}^{∞} Pr[X ≥ i]
     = Σ_{i=1}^{∞} (1 − p)^{i−1}
     = (1 / (1 − p)) · Σ_{i=1}^{∞} (1 − p)^{i}
     = (1 / (1 − p)) · (1 − p) / (1 − (1 − p))
     = 1 / p
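The tail-sum computation can be verified numerically in the same way (a Python sketch with the same illustrative p and truncation point):

    p, N = 0.25, 1000
    tail_sum = sum((1 - p) ** (i - 1) for i in range(1, N + 1))   # truncated Σ Pr[X ≥ i]
    print(tail_sum, 1 / p)   # both approximately 4.0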
Memoryless Property. For a geometric random variable X with parameter p and for
n > 0,
Pr[X = n + k | X > k] = Pr[X = n]
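The memoryless property can be observed empirically as well. The Python sketch below assumes only the standard library; p, n, and k are illustrative values, and geometric() is a hypothetical helper that samples X by flipping until the first head.

    import random

    def geometric(p):
        """Sample a geometric random variable: the number of flips until the first head."""
        flips = 1
        while random.random() >= p:
            flips += 1
        return flips

    p, n, k, trials = 0.3, 4, 2, 500_000
    samples = [geometric(p) for _ in range(trials)]
    beyond_k = [x for x in samples if x > k]
    lhs = sum(1 for x in beyond_k if x == n + k) / len(beyond_k)   # Pr[X = n + k | X > k]
    rhs = sum(1 for x in samples if x == n) / trials               # Pr[X = n]
    print(lhs, rhs)   # the two estimates should be close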
Recall that the expectation of X can also be computed by conditioning on another random variable Y:

E[X] = Σ_{y} Pr[Y = y] · E[X | Y = y]

where the summation is over all possible values y that the random variable Y can assume.
We can also calculate the expectation of a geometric random variable X using the memoryless property of the geometric random variable. Let Y be a random variable that is 0 if the first flip results in tails and 1 if the first flip results in heads. Using conditional expectation we have