
Mathematical Foundations of Computer Science

Lecture Outline
November 8, 2018

Theorem. If X and Y are independent real-valued random variables then

Var[X + Y ] = Var[X] + Var[Y ] and E[X · Y ] = E[X] · E[Y ]

Both results extend to any finite number of mutually independent random variables.


Note that the converse of the above statement is not true as illustrated by the following
example. Let Ω = {a, b, c}, with all three outcomes equally likely. Let X and Y be
random variables defined as follows: X(a) = 1, X(b) = 0, X(c) = −1 and Y (a) = 0, Y (b) =
1, Y (c) = 0. Note that X and Y are not independent since

Pr[X = 0 ∧ Y = 0] = 0, but Pr[X = 0] · Pr[Y = 0] = (1/3) · (2/3) = 2/9 ≠ 0.

Note that for all ω ∈ Ω, X(ω)Y (ω) = 0. Also, E[X] = 0 and E[Y ] = 1/3. Thus we have

E[XY ] = 0 = E[X]E[Y ]

It is also easy to verify that Var[X + Y ] = Var[X] + Var[Y ].
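As a quick numerical check, the following short Python sketch enumerates the three equally likely outcomes and confirms both equalities even though X and Y are dependent; it is only an illustration of the example above.

from fractions import Fraction

# The sample space of the counterexample: three equally likely outcomes.
outcomes = ['a', 'b', 'c']
prob = {w: Fraction(1, 3) for w in outcomes}
X = {'a': 1, 'b': 0, 'c': -1}
Y = {'a': 0, 'b': 1, 'c': 0}

def expect(f):
    # E[f] for a function f of the outcome
    return sum(prob[w] * f(w) for w in outcomes)

def variance(f):
    mu = expect(f)
    return expect(lambda w: (f(w) - mu) ** 2)

EX, EY = expect(lambda w: X[w]), expect(lambda w: Y[w])
print(expect(lambda w: X[w] * Y[w]) == EX * EY)              # True: E[XY] = E[X]E[Y]
print(variance(lambda w: X[w] + Y[w]) ==
      variance(lambda w: X[w]) + variance(lambda w: Y[w]))   # True: Var[X+Y] = Var[X]+Var[Y]
# X and Y are still dependent: Pr[X = 0 and Y = 0] = 0, while Pr[X = 0]*Pr[Y = 0] = 2/9.
print(sum(prob[w] for w in outcomes if X[w] == 0 and Y[w] == 0))  # 0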

Example (Chebyshev’s Inequality). Let X be a random variable. Show that for any
a > 0,
Pr[|X − E[X]| ≥ a] ≤ Var[X]/a²

Solution. The inequality that we proved in the earlier homework is called Markov’s in-
equality. We will use it to prove the above tail bound called Chebyshev’s inequality.

Pr[|X − E[X]| ≥ a] = Pr[(X − E[X])² ≥ a²]
                   ≤ E[(X − E[X])²]/a²        (using Markov's inequality on the non-negative random variable (X − E[X])²)
                   = Var[X]/a²
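As a concrete illustration, the short Python sketch below applies the inequality to a fair six-sided die (an assumed example, not one from the lecture) and compares the exact tail probability with the bound Var[X]/a² for a = 2.

from fractions import Fraction

# Fair six-sided die: E[X] = 7/2, Var[X] = 35/12.
values = range(1, 7)
EX = Fraction(sum(values), 6)
VarX = sum((v - EX) ** 2 for v in values) / 6

a = 2
exact = Fraction(sum(1 for v in values if abs(v - EX) >= a), 6)  # Pr[|X - E[X]| >= 2] = 1/3
bound = VarX / a ** 2                                            # 35/48
print(exact, "<=", bound)   # 1/3 <= 35/48: the bound holds but is not tight here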

Example. Use Chebyshev’s inequality to bound the probability of obtaining at least 3n/4
heads in a sequence of n fair coin flips.

Solution. Let X be the random variable denoting the total number of heads in n flips of a
fair coin. For 1 ≤ i ≤ n, let Xi be a random variable that is 1 if the ith flip results in
heads and 0 otherwise. Thus,

X = X1 + X2 + · · · + Xn

By the linearity of expectation, E[X] = n/2. Since the random variables Xi are independent,
we have

Var[X] = ∑_{i=1}^{n} Var[Xi] = ∑_{i=1}^{n} (1/2 − 1/4) = n/4

Using Chebyshev’s inequality, we get

Pr[X ≥ 3n/4] = Pr[X − n/2 ≥ n/4]
             = Pr[X − E[X] ≥ n/4]
             = (1/2) · Pr[|X − E[X]| ≥ n/4]        (X is symmetric about its mean n/2, so the two tails are equally likely)
             ≤ (1/2) · Var[X]/(n/4)²
             = (1/2) · (n/4)/(n²/16)
             = 2/n
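The bound can also be compared against simulation. The following sketch is illustrative only; the values of n and the number of trials are arbitrary choices. It estimates Pr[X ≥ 3n/4] empirically and prints it next to the bound 2/n.

import random

def tail_estimate(n, trials=100000):
    # Empirical estimate of Pr[at least 3n/4 heads in n fair coin flips].
    hits = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(n))
        if heads >= 3 * n / 4:
            hits += 1
    return hits / trials

for n in (20, 40, 80):
    print(n, tail_estimate(n), "Chebyshev bound:", 2 / n)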

Probability Distributions
Tossing a coin is an experiment with exactly two outcomes: heads (“success”) with some
probability p, and tails (“failure”) with probability 1 − p. Such an experiment
is called a Bernoulli trial. Let Y be a random variable that is 1 if the experiment succeeds
and is 0 otherwise. Y is called a Bernoulli or an indicator random variable. For such a
variable we have
E[Y ] = p · 1 + (1 − p) · 0 = p = Pr[Y = 1]
Thus for a fair coin, if we consider heads as “success”, then the expected value of the
corresponding indicator random variable is 1/2.

A sequence of Bernoulli trials means that the trials are independent and each has a prob-
ability p of success. We will study two important distributions that arise from Bernoulli
trials: the geometric distribution and the binomial distribution.

The Geometric Distribution


Consider the following question. Suppose we have a biased coin with heads probability p
that we flip repeatedly until it lands on heads. What is the distribution of the number
of flips? This is an example of a geometric distribution. It arises in situations where we
perform a sequence of independent trials until the first success, where each trial succeeds
with probability p.

Note that the sample space Ω consists of all sequences that end in H and have exactly one
H. That is
Ω = {H, T H, T T H, T T T H, T T T T H, . . .}
For any ω ∈ Ω of length i, Pr[ω] = (1 − p)^(i−1) p.

Definition. A geometric random variable X with parameter p is given by the following
distribution for i = 1, 2, . . . :

Pr[X = i] = (1 − p)^(i−1) p
We can verify that the geometric random variable admits a valid probability distribution
as follows:
∑_{i=1}^{∞} (1 − p)^(i−1) p = p ∑_{i=1}^{∞} (1 − p)^(i−1) = (p/(1 − p)) ∑_{i=1}^{∞} (1 − p)^i = (p/(1 − p)) · ((1 − p)/(1 − (1 − p))) = 1

Note that to obtain the second-last term we have used the fact that ∑_{i=1}^{∞} c^i = c/(1 − c) for |c| < 1.
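As a quick numerical sanity check (with p = 0.3 chosen arbitrarily for illustration), truncating the sum at a large index gives a total probability of essentially 1.

# Truncated check that the geometric probabilities sum to 1.
p = 0.3   # arbitrary illustrative value of the parameter
total = sum((1 - p) ** (i - 1) * p for i in range(1, 200))
print(total)   # approximately 1.0; the missing mass is (1 - p)**199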

Let’s now calculate the expectation of a geometric random variable, X. We can do this in
several ways. One way is to use the definition of expectation.

E[X] = ∑_{i=1}^{∞} i · Pr[X = i]
     = ∑_{i=1}^{∞} i (1 − p)^(i−1) p
     = (p/(1 − p)) ∑_{i=1}^{∞} i (1 − p)^i
     = (p/(1 − p)) · ((1 − p)/(1 − (1 − p))²)        (∵ ∑_{k=1}^{∞} k x^k = x/(1 − x)² for |x| < 1)
     = (p/(1 − p)) · ((1 − p)/p²)
     = 1/p
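A minimal sketch (illustrative only; the values of p and the truncation point are arbitrary) that truncates this series and compares the result with 1/p:

# Truncated version of E[X] = sum over i of i * (1 - p)^(i - 1) * p, compared with 1/p.
for p in (0.1, 0.3, 0.5, 0.9):
    approx = sum(i * (1 - p) ** (i - 1) * p for i in range(1, 2000))
    print(p, approx, 1 / p)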
Another way to compute the expectation is to note that X is a random variable that takes
on non-negative integer values. From a theorem proved in the last class we know that if X
takes on only non-negative integer values, then

E[X] = ∑_{i=1}^{∞} Pr[X ≥ i]
Using this result we can calculate the expectation of the geometric random variable X. For
the geometric random variable X with parameter p,
Pr[X ≥ i] = ∑_{j=i}^{∞} (1 − p)^(j−1) p = (1 − p)^(i−1) p ∑_{j=0}^{∞} (1 − p)^j = (1 − p)^(i−1) p · 1/(1 − (1 − p)) = (1 − p)^(i−1)

Therefore
E[X] = ∑_{i=1}^{∞} Pr[X ≥ i] = ∑_{i=1}^{∞} (1 − p)^(i−1) = (1/(1 − p)) ∑_{i=1}^{∞} (1 − p)^i = (1/(1 − p)) · ((1 − p)/(1 − (1 − p))) = 1/p
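The same answer can be checked by simulation. The sketch below is illustrative; p = 0.25 and the helper name geometric_sample are arbitrary choices. It repeatedly flips a biased coin until the first head and compares the average number of flips with 1/p.

import random

def geometric_sample(p):
    # Flip a coin with heads probability p until the first head; return the number of flips.
    flips = 1
    while random.random() >= p:
        flips += 1
    return flips

p = 0.25   # arbitrary illustrative value of the parameter
samples = [geometric_sample(p) for _ in range(100000)]
print(sum(samples) / len(samples), "vs", 1 / p)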

Memoryless Property. For a geometric random variable X with parameter p and for any
n > 0 and k ≥ 0,

Pr[X = n + k | X > k] = Pr[X = n]
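The property can be checked directly from the pmf. A small sketch (illustrative, with p = 0.3 chosen arbitrarily) verifies Pr[X = n + k | X > k] = Pr[X = n] for several values of n and k.

p = 0.3   # arbitrary illustrative value of the parameter

def pmf(i):
    # Pr[X = i] for the geometric distribution
    return (1 - p) ** (i - 1) * p

def tail(k):
    # Pr[X > k] = (1 - p)^k
    return (1 - p) ** k

for n in (1, 2, 5):
    for k in (0, 3, 7):
        lhs = pmf(n + k) / tail(k)               # Pr[X = n + k | X > k]
        print(n, k, abs(lhs - pmf(n)) < 1e-12)   # True in every case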

Conditional Expectation. The following is the definition of conditional expectation.


E[Y | Z = z] = ∑_y y · Pr[Y = y | Z = z],

where the summation is over all possible values y that the random variable Y can assume.

Example. For any random variables X and Y ,


E[X] = ∑_y Pr[Y = y] · E[X | Y = y]

We can also calculate the expectation of a geometric random variable X using the
memoryless property. Let Y be a random variable that is 0 if the first flip results in tails
and 1 if the first flip results in heads. Using conditional expectation we have

E[X] = Pr[Y = 0] · E[X | Y = 0] + Pr[Y = 1] · E[X | Y = 1]
     = (1 − p)(E[X] + 1) + p · 1        (using the memoryless property, E[X | Y = 0] = 1 + E[X])

∴ pE[X] = 1, and hence E[X] = 1/p
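The conditioning argument can also be seen numerically. The sketch below is illustrative; p = 0.2 and the helper geometric_sample are arbitrary choices. It estimates E[X | Y = 0] and E[X | Y = 1] from simulated runs and recombines them by the law of total expectation.

import random

def geometric_sample(p):
    # Flip a coin with heads probability p until the first head; return the number of flips.
    flips = 1
    while random.random() >= p:
        flips += 1
    return flips

p = 0.2   # arbitrary illustrative value of the parameter
by_first_flip = {0: [], 1: []}   # Y = 1 if the first flip is heads, else 0
for _ in range(200000):
    x = geometric_sample(p)
    y = 1 if x == 1 else 0       # the first flip is heads exactly when X = 1
    by_first_flip[y].append(x)

cond_mean = {y: sum(v) / len(v) for y, v in by_first_flip.items()}
total = (1 - p) * cond_mean[0] + p * cond_mean[1]   # Pr[Y=0]E[X|Y=0] + Pr[Y=1]E[X|Y=1]
print(cond_mean[0], "≈ 1 + E[X];", cond_mean[1], "= 1;", total, "≈", 1 / p)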
