4 Random Variables
May 2017
A random variable is a variable that probabilistically takes on different values. You can think of an RV as being like a variable in a programming language: it takes on values, has a type, and has a domain over which it is applicable. We can define events that occur if the random variable takes on values that satisfy a numerical test (e.g., does the variable equal 5? is the variable less than 8?). We often think about the probabilities of such events.
As an example, let's say we flip three fair coins. We can define a random variable Y to be the total number of "heads" on the three coins. We can ask about the probability of Y taking on different values using the following notation:

$$P(Y = 0) = 1/8, \quad P(Y = 1) = 3/8, \quad P(Y = 2) = 3/8, \quad P(Y = 3) = 1/8$$
Even though we use the same notation for random variables and for events (both use capital letters), they are distinct concepts. An event is a scenario; a random variable is an object. The scenario where a random variable takes on a particular value (or range of values) is an event. When possible, I will try to use the letters E, F, G for events and X, Y, Z for random variables.
Using random variables is a convenient notational technique that assists in decomposing problems. There are many different types of random variables (indicator, binary, choice, Bernoulli, etc.). The two main families of random variable types are discrete and continuous. For now we are going to develop intuition around discrete random variables.
Probability Mass Function

Figure: On the left, the PMF of a single 6-sided die roll, where each value x from 1 to 6 has P(X = x) = 1/6. On the right, the PMF of the sum of two dice rolls.
There are many ways that these Probability Mass Functions can be specified. We could draw a graph. We
could have a table (or for you CS folks, a Map) that lists out all the probabilities for all possible events. Or
we could write out a mathematical expression.
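For instance, here is what the Map representation might look like in Python. This is a minimal sketch (the name die_pmf is my own, not standard), storing the PMF of a single die roll:

```python
from fractions import Fraction

# PMF of a single fair 6-sided die, stored as a map from value to probability.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

assert sum(die_pmf.values()) == 1  # a valid PMF must sum to 1
print(die_pmf[3])                  # P(X = 3) = 1/6
```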
For example, let's consider the random variable X which is the sum of two dice rolls. The probability mass function can be defined by the graph on the right of the figure above. It could also have been defined using the equation:

$$p_X(x) = \begin{cases} \dfrac{x-1}{36} & \text{if } x \in \mathbb{Z},\ 2 \le x \le 7 \\[4pt] \dfrac{13-x}{36} & \text{if } x \in \mathbb{Z},\ 8 \le x \le 12 \\[4pt] 0 & \text{otherwise} \end{cases}$$
The probability mass function, $p_X(x)$, defines the probability of X taking on the value x. The new notation $p_X(x)$ is simply a different way of writing P(X = x). Using this new notation makes it more apparent that we are specifying a function. Try a few values of x, and compare the value of $p_X(x)$ to the graph in the figure above. They should be the same.
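If you would rather check every value at once, here is a minimal Python sketch of the equation above (the function name p_X is just illustrative), cross-checked against a brute-force enumeration of the 36 equally likely outcomes:

```python
from fractions import Fraction

def p_X(x):
    """PMF of X = the sum of two fair 6-sided dice, from the piecewise formula.
    Assumes x is an integer."""
    if 2 <= x <= 7:
        return Fraction(x - 1, 36)
    if 8 <= x <= 12:
        return Fraction(13 - x, 36)
    return Fraction(0)

# Cross-check the formula against enumerating all 36 equally likely rolls.
for s in range(2, 13):
    count = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == s)
    assert p_X(s) == Fraction(count, 36)
```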
Expected Value
A relevant statistic for a random variable is the average value of the random variable over many repetitions
of the experiment it represents. This average is called the Expected Value.
$$E[X] = \sum_{x : P(x) > 0} x \, P(x)$$
It goes by many other names: Mean, Expectation, Weighted Average, Center of Mass, 1st Moment.
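In code, the definition translates almost directly. A minimal sketch, using the same dictionary PMF representation as before:

```python
from fractions import Fraction

def expectation(pmf):
    """E[X]: sum of x * P(X = x) over all values with nonzero probability."""
    return sum(x * p for x, p in pmf.items() if p > 0)

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
print(expectation(die_pmf))  # 7/2
```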
Example 1
Let's say you roll a 6-sided die and that a random variable X represents the outcome of the roll. What is E[X]? This is the same as asking for the average value:

$$E[X] = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \tfrac{21}{6} = \tfrac{7}{2}$$
Example 2
Let's say a school has 3 classes with 5, 10, and 150 students. If we randomly choose a class with equal probability and let X = size of the chosen class:

$$E[X] = 5\left(\tfrac{1}{3}\right) + 10\left(\tfrac{1}{3}\right) + 150\left(\tfrac{1}{3}\right) = \tfrac{165}{3} = 55$$

If instead we randomly choose a student with equal probability and let Y = size of the class the student is in:

$$E[Y] = 5\left(\tfrac{5}{165}\right) + 10\left(\tfrac{10}{165}\right) + 150\left(\tfrac{150}{165}\right) = \tfrac{22625}{165} \approx 137.1$$
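Both averages are easy to verify in Python with exact fractions (a quick illustrative check):

```python
from fractions import Fraction

sizes = [5, 10, 150]
total = sum(sizes)  # 165 students in all

# X: choose one of the 3 classes uniformly; X is its size.
E_X = sum(Fraction(1, 3) * s for s in sizes)

# Y: choose one of the 165 students uniformly; Y is that student's class size.
E_Y = sum(Fraction(s, total) * s for s in sizes)

print(E_X, E_Y, float(E_Y))  # 55, 4525/33, about 137.1
```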
Example 3
Consider a game played with a fair coin, which comes up heads with p = 0.5. Let n = the number of coin flips before the first "tails". In this game you win $2^n$ dollars. How many dollars do you expect to win? Let X be a random variable which represents your winnings.
$$E[X] = \frac{1}{2}\,2^0 + \frac{1}{2^2}\,2^1 + \frac{1}{2^3}\,2^2 + \frac{1}{2^4}\,2^3 + \cdots = \sum_{i=0}^{\infty} \frac{1}{2^{i+1}}\,2^i = \sum_{i=0}^{\infty} \frac{1}{2} = \infty$$
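A short simulation makes this tangible: the sample average of the winnings never settles down, because rare, very long runs of heads contribute enormous payouts. This is only an illustrative sketch; the exact numbers will differ from run to run:

```python
import random

def play_once():
    """Flip a fair coin until the first tails; win 2**(number of heads seen)."""
    n = 0
    while random.random() < 0.5:  # heads
        n += 1
    return 2 ** n

for trials in (10**3, 10**5, 10**6):
    average = sum(play_once() for _ in range(trials)) / trials
    print(trials, average)  # the average tends to drift upward, not converge
```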
Properties of Expectation
Expectations preserve linearity, which means that

$$E[aX + b] = aE[X] + b$$
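A quick numerical check of this property for the die roll from Example 1, with arbitrarily chosen constants a and b:

```python
from fractions import Fraction

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf):
    return sum(x * p for x, p in pmf.items())

a, b = 3, 10
# E[aX + b]: relabel each outcome x as a*x + b; probabilities are unchanged.
lhs = sum((a * x + b) * p for x, p in die_pmf.items())
assert lhs == a * expectation(die_pmf) + b == Fraction(41, 2)  # 3(7/2) + 10
```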
There is a wonderful law called the Law of the Unconscious Statistician that is used to calculate the expected value of a function g(X) of a random variable X when one knows the probability distribution of X but does not explicitly know the distribution of g(X):

$$E[g(X)] = \sum_{x} g(x) \cdot p_X(x)$$
For example, let's apply the Law of the Unconscious Statistician to compute the expectation of the square of a random variable (called the second moment):

$$\begin{aligned} E[X^2] &= E[g(X)] && \text{where } g(X) = X^2 \\ &= \sum_x g(x) \cdot p_X(x) && \text{by the Law of the Unconscious Statistician} \\ &= \sum_x x^2 \cdot p_X(x) && \text{since } g(x) = x^2 \end{aligned}$$
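The law translates to one line of Python: we never need the PMF of g(X), only the PMF of X. A minimal sketch (the helper name lotus is my own):

```python
from fractions import Fraction

def lotus(g, pmf):
    """E[g(X)] = sum of g(x) * p_X(x); no need for the distribution of g(X)."""
    return sum(g(x) * p for x, p in pmf.items())

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
print(lotus(lambda x: x**2, die_pmf))  # second moment of a die roll: 91/6
```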
Variance
Expectation is a truly useful statistic, but it does not give a detailed view of the probability mass function.
Consider the following 3 distributions (PMFs):

Figure: Three PMFs with the same expected value, E[X] = 3, but very different "spread".

All three have the same expected value, E[X] = 3, but the "spread" in the distributions is quite different. Variance is a formal quantification of "spread", which is otherwise a hard concept to pin down.
The variance of a discrete random variable, X, with expected value µ is:
$$\text{Var}(X) = E[(X - \mu)^2]$$

When computing the variance, we often use a different form of the same equation:

$$\text{Var}(X) = E[X^2] - (E[X])^2$$
Example 1
Let X = the value on a roll of a 6-sided die. Recall that E[X] = 7/2. First let's calculate E[X^2]:

$$E[X^2] = \tfrac{1}{6}(1^2) + \tfrac{1}{6}(2^2) + \tfrac{1}{6}(3^2) + \tfrac{1}{6}(4^2) + \tfrac{1}{6}(5^2) + \tfrac{1}{6}(6^2) = \tfrac{91}{6}$$
Which we can use to compute the variance:

$$\text{Var}(X) = E[X^2] - (E[X])^2 = \tfrac{91}{6} - \left(\tfrac{7}{2}\right)^2 = \tfrac{35}{12}$$
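As a final check, here is a small Python sketch verifying that both forms of the variance formula agree for the die roll:

```python
from fractions import Fraction

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in die_pmf.items())                # E[X]   = 7/2
second_moment = sum(x**2 * p for x, p in die_pmf.items())  # E[X^2] = 91/6

var_definition = sum((x - mu)**2 * p for x, p in die_pmf.items())
var_shortcut = second_moment - mu**2
assert var_definition == var_shortcut == Fraction(35, 12)
```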