Ec310 Day 4 Lecture Notes
Day 4 – Video 1
Probability Theory
• Everything we’ve covered so far has been to make data more digestible
• Probability theory helps us to be more formal about what the sample tells us about the population
Terminology
• When we’re discussing a random occurrence, it’s often useful to clarify whether or not we know the outcome yet
• We do this using the following (Latin) phrases:
• Ex ante – means “before the fact”
• Ex post – means “after the fact”
• Why make this distinction? Well, it has a huge impact on how we answer basic questions…
• Consider a coin toss:
• What is the probability of heads ex ante?
• What is the probability of heads ex post?
• Usually in statistics we’re speaking ex ante, but there will be a few exceptions
Random Experiments
• A random experiment is an action that leads to one of several possible outcomes
• Examples: coin flip, die roll
• Is your grade in this class a random experiment?
• To analyze a random experiment, the first step is to come up with the sample space (S)
• This is a list of all possible outcomes (O) of the experiment
• This list must be exhaustive and mutually exclusive. In other words: Ex post, exactly one of the outcomes must have occurred
• Are these valid sample spaces? (If not, what’s wrong and how do we fix it?)
• Your grade in this class: A, AB, B, BC, C, D, Pass, & Fail
• Roulette: Red & Black
• Will the temperature tomorrow be greater than 65 degrees: Yes & No
• So there is more than one way to define the sample space!
• Venn diagrams are a useful way to illustrate the rules of probability
• Think of the area as the probability of that outcome (aka a probability diagram)
[Venn diagram: S = {A, B, C}. This diagram represents a valid sample space: outcomes are exhaustive and mutually exclusive]
[Venn diagrams: one sample space that is not mutually exclusive and one that is not exhaustive]
• The second step in analyzing an experiment is to assign probabilities to each outcome
• The probability of any outcome must lie between 0 and 1: $0 \le P(O_i) \le 1$
• And the sum of the probabilities over all possible outcomes must equal 1: $\sum_{i=1}^{k} P(O_i) = 1$
• So graphically on our Venn Diagrams, the area of each outcome must be in the range [0,1]
• And the area of the entire diagram must equal one
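The two requirements on a probability assignment (each probability in [0, 1], and all of them summing to 1) can be checked mechanically. The following is a minimal sketch; the function name `is_valid_assignment` and both example assignments are hypothetical and not taken from the lecture.

```python
from fractions import Fraction

def is_valid_assignment(probabilities, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    values = list(probabilities.values())
    in_range = all(0 <= p <= 1 for p in values)
    sums_to_one = abs(sum(values) - 1) <= tol
    return in_range and sums_to_one

# Hypothetical assignment for a fair coin
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
# Hypothetical invalid assignment: the probabilities sum to 1.2
bad = {"A": 0.5, "B": 0.4, "C": 0.3}

print(is_valid_assignment(coin))  # True
print(is_valid_assignment(bad))   # False
```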
• When we’re analyzing an experiment, how do we determine the probability of each outcome?
• What is the probability of a head for this coin?
• This corresponds to the classical approach of assigning probabilities
1. Define a sample space having k equally likely outcomes (requires theory or intuition)
2. Assign each outcome a probability of 1/k
Classical Approach
• Will the classical approach work for these experiments?
• Rolling a six-sided die?
• Rolling two dice and observing the total?
• Outcomes of interest: {2, 3, …, 12}
• But we could instead define the sample space as:
(1,1), (1,2), ... (1,6)
(2,1), (2,2), ... (2,6)
...
(6,1), (6,2), ... (6,6)
k = ?   P(1,1) = ?
• And use the classical approach on this sample space to calculate the probabilities for the actual outcomes of interest
P(total is 2) = 1/36   P(total is 6) = 5/36

Total on Two Dice
                  Die 1
           1   2   3   4   5   6
Die 2  1   2   3   4   5   6   7
       2   3   4   5   6   7   8
       3   4   5   6   7   8   9
       4   5   6   7   8   9  10
       5   6   7   8   9  10  11
       6   7   8   9  10  11  12

Relative Frequency Approach (also known as the empirical method)
• Was your intuition for the coin example that P(H)=2/3?
• This corresponds to the relative frequency approach to assigning probabilities
• More formally, the relative frequency approach recommends the following procedure:
1. Define the sample space
2. Assign a probability to each outcome based on experimentation or historical data
• Example: A computer shop tracks the number of computers it sells per day, and they have data on sales for the past 30 days:

Desktops Sold   Number of Days   Probability of Outcome
0               1                1/30 = .03
1               2                2/30 = .07
2               10               10/30 = .33
3               12               12/30 = .40
4               5                5/30 = .17

• From this we can construct the probability that a specific number of desktops is sold on a given day
• Usually one of these two methods is most appropriate
• But occasionally we’ll be in an unfortunate situation where:
• Theory/intuition isn’t helpful in determining a set of k equally likely outcomes
• We have no data
• In this case, we’re left with no option but to rely on judgment
• When we use an expert opinion or an individual’s beliefs to assign a probability to each outcome, this is called the subjective approach
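For the computer shop example, the relative frequency calculation can be sketched as follows. The day counts come from the table in these notes; the variable names are illustrative only.

```python
from fractions import Fraction

# Number of days (out of the past 30) on which each quantity of desktops was sold
days_by_sales = {0: 1, 1: 2, 2: 10, 3: 12, 4: 5}
total_days = sum(days_by_sales.values())

# Relative frequency approach: P(outcome) = days with that outcome / total days
probabilities = {
    sold: Fraction(days, total_days) for sold, days in days_by_sales.items()
}

for sold, p in probabilities.items():
    print(f"P({sold} desktops sold) = {p} = {float(p):.2f}")
```

Because every day falls into exactly one row of the table, the resulting probabilities sum to one, as a valid assignment must.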
Which Approach to Use?
• Recall the three approaches are: classical, relative frequency, and subjective
• We’ve defined the random experiment, the sample space, and we’ve assigned probabilities
to each outcome
• But often we care about the probability of some more complicated event
• A simple event is another term for one of the exhaustive and mutually exclusive
outcomes that make up a random experiment
• Recall the example from the last video where we were interested in the two dice being rolled
• There are 36 outcomes: (1,1) (1,2) (1,3) (1,4) (1,5) (1,6) (2,1) (2,2) (2,3) (2,4) (2,5) (2,6) (3,1) (3,2) (3,3) (3,4) (3,5) (3,6) (4,1) (4,2) (4,3) (4,4) (4,5) (4,6) (5,1) (5,2) (5,3) (5,4) (5,5) (5,6) (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
• But we were interested in the sum of the two dice, so the events of interest are: {2,3,4,5,6,7,8,9,10,11,12}
• The event 6 is composed of the outcomes: {(1,5) (2,4) (3,3) (4,2) (5,1)}

                  Die 1
           1   2   3   4   5   6
Die 2  1   2   3   4   5   6   7
       2   3   4   5   6   7   8
       3   4   5   6   7   8   9
       4   5   6   7   8   9  10
       5   6   7   8   9  10  11
       6   7   8   9  10  11  12

• The probability of an event is the sum of the probabilities of the outcomes that compose the event
• This coincides with what we correctly intuited a few slides back:
P(6) = P[(1,5)] + P[(2,4)] + P[(3,3)] + P[(4,2)] + P[(5,1)] = 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 5/36
NOTE: This is a special case of the addition rule, which we’ll come back to at the end of the lecture
• Recall we found that P(6)=5/36
• What’s the probability that the sum of the two dice is NOT equal to six?
• This rule has a name – the complement rule
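The two-dice calculations can be reproduced by enumerating the 36 equally likely outcomes and summing the probabilities of those that make up each event. This is a small sketch under the classical approach; the helper name `prob_total` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely ordered pairs (die 1, die 2)
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))  # classical approach: 1/k with k = 36

def prob_total(total):
    """P(sum of the two dice equals `total`): add up the matching outcomes."""
    return sum(p_outcome for d1, d2 in outcomes if d1 + d2 == total)

p_six = prob_total(6)
p_not_six = 1 - p_six  # everything that is not in the event "total is 6"

print(p_six)      # 5/36
print(p_not_six)  # 31/36
```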
The Complement Rule
• So we’ve calculated the probability of some event A, but we’re interested in the probability that A does not occur
• In other words, we’re interested in the probability of the complement of A
• Where the complement of an event is all the outcomes that are not in that event
• The complement of A is denoted by A^c
• This Venn diagram illustrates the concept of a complement
[Venn diagram: A and A^c]
• Since the probabilities of all outcomes must sum to one, we know that: P(A^c) = 1 - P(A)
• P(Total=6) = 5/36
Intersection of Events
• The intersection of events A and B is the set of outcomes that are in both A and B
• The probability of this intersection is written as: P(A and B) or P(A ∩ B)
• We call this the joint probability of A and B
• This Venn diagram illustrates the concept of an intersection
[Venn diagram: A, B, and the overlap A and B]
• Example: Two dice
• If A is the event where the first die is a 1: {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)}
• And B is the event where the second die is a 5: {(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)}
• Then what is the intersection?
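For the two-dice example, the intersection can be found by treating each event as a set of outcomes. The sketch below assumes all 36 outcomes are equally likely (the classical approach); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))  # (die 1, die 2)

A = {o for o in outcomes if o[0] == 1}  # first die is a 1
B = {o for o in outcomes if o[1] == 5}  # second die is a 5

A_and_B = A & B  # set intersection: outcomes in both A and B
p_joint = Fraction(len(A_and_B), len(outcomes))

print(A_and_B)  # {(1, 5)}
print(p_joint)  # 1/36
```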
• When two events are mutually exclusive, this means they cannot occur together
• Example: Consider two events – It rains tomorrow and it does not rain tomorrow
• What is the joint probability P( RAIN and NO RAIN ) ?
• Another way of saying two events are mutually exclusive is that the events are disjoint
• The union of events A and B is the set of outcomes that are in either A or B (or both)
• The probability of this union is written as: P(A or B) or P(A ∪ B)
• This Venn diagram illustrates the concept of a union
[Venn diagram: A and B, with the whole of both circles shaded]
• If A is the event where the first die is a 1: {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)}
• And B is the event where the second die is a 5: {(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)}
• Then the union of A and B is {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,5), (3,5), (4,5), (5,5), (6,5)}
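The union in the two-dice example can be checked the same way, using sets of outcomes. This sketch again assumes 36 equally likely outcomes; note that (1,5) belongs to both events but is counted only once in the union.

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))  # (die 1, die 2)

A = {o for o in outcomes if o[0] == 1}  # first die is a 1
B = {o for o in outcomes if o[1] == 5}  # second die is a 5

A_or_B = A | B  # set union: outcomes in A, in B, or in both
p_union = Fraction(len(A_or_B), len(outcomes))

print(sorted(A_or_B))
print(p_union)  # 11/36
```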
Joint Probability Tables
• Why do some individuals engage in crime while others do not?

                   Crime   NoCrime
Low Testosterone   0.11    0.38

• So it’s a joint probability! And the entire table is thus called a joint probability table
Marginal Probabilities
• Let’s introduce shorthand notation for these events:
• What is P(High) – the probability an individual has high testosterone?
• Marginal probabilities have this name because they are typically written in the margins of the table
• Useful way to double check your work – the total row and total column should both sum to one (caveat: there may be rounding error)
• In case the previous slide wasn’t intuitive, here are the contents of each cell and the equations for calculating the marginal probabilities

         Crime                NoCrime                Totals
High     P(High and Crime)    P(High and NoCrime)    P(High) = P(High and Crime) + P(High and NoCrime)
Low      P(Low and Crime)     P(Low and NoCrime)     P(Low) = P(Low and Crime) + P(Low and NoCrime)
Totals   P(Crime) = P(High and Crime) + P(Low and Crime)    P(NoCrime) = P(High and NoCrime) + P(Low and NoCrime)
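The marginal probabilities can be computed by summing across the rows and columns of the joint probability table. The sketch below assumes the four cell values that appear in fragments of these notes (High: .17 and .34; Low: .11 and .38); the dictionary layout and the helper `marginal` are illustrative choices.

```python
from fractions import Fraction

# Joint probabilities, keyed by (testosterone level, crime status).
# Assumption: the cell values are pieced together from the table fragments in the notes.
joint = {
    ("High", "Crime"): Fraction(17, 100),
    ("High", "NoCrime"): Fraction(34, 100),
    ("Low", "Crime"): Fraction(11, 100),
    ("Low", "NoCrime"): Fraction(38, 100),
}

def marginal(position, label):
    """Sum the joint probabilities whose key has `label` at `position`."""
    return sum(p for key, p in joint.items() if key[position] == label)

p_high = marginal(0, "High")
p_low = marginal(0, "Low")
p_crime = marginal(1, "Crime")
p_no_crime = marginal(1, "NoCrime")

print(float(p_high), float(p_low))       # 0.51 0.49
print(float(p_crime), float(p_no_crime)) # 0.28 0.72
```

As the notes suggest, a quick check is that the row totals sum to one and the column totals sum to one.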
Source: http://xkcd.com
Conditional Probability
• Conditional probabilities are used when we want to know the probability of one event given the occurrence of another event
P(A|B) = P(A and B) / P(B)
• Let’s compare the equations for P(A|B) and P(B|A):
P(A|B) = P(A and B) / P(B)
P(B|A) = P(A and B) / P(A)
• These Venn diagrams illustrate the concept of a conditional probability
[Venn diagrams: A, B, and the overlap A and B]
• You can think of P(A) as being ex ante, and P(A|B) as being ex post (wrt event B)
• Let’s return to our crime example: What’s the probability a young man has committed a felony given that the individual has high testosterone?
• So in our shorthand notation, we want to know P(Crime|High)

        Crime   NoCrime   Totals
High    .17     .34       .51

• We often want to know whether events are independent
• Two events are independent of one another when the probability of one event is unaffected by the occurrence of the other event
• One useful property of conditional probabilities is they give us a quick way to check whether two events are related
• Two events A and B are said to be independent if: P(A|B) = P(A) or P(B|A) = P(B)
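The conditional probability in the crime example, and the independence check, can be worked through numerically. In this sketch, P(High and Crime) = .17 and P(High) = .51 come from the High row of the table; P(Crime) = .28 is an assumption obtained by adding the .11 from the Low row shown earlier in the notes.

```python
from fractions import Fraction

p_high_and_crime = Fraction(17, 100)  # joint probability from the table
p_high = Fraction(51, 100)            # marginal probability from the table
p_crime = Fraction(17, 100) + Fraction(11, 100)  # assumed: .17 + .11 = .28

# Conditional probability: P(Crime | High) = P(High and Crime) / P(High)
p_crime_given_high = p_high_and_crime / p_high

# Independence would require P(Crime | High) == P(Crime)
independent = p_crime_given_high == p_crime

print(p_crime_given_high)  # 1/3
print(float(p_crime))      # 0.28
print(independent)         # False
```

Under these numbers, knowing that an individual has high testosterone raises the probability of crime from .28 to about .33, so the two events are not independent.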
• We can also illustrate independence with a Venn Diagram:
[Venn diagram: A, B, and the overlap A and B. So knowing we’re in B doesn’t change the probability of A!]
• The multiplication rule is used to calculate the joint probability of two events. It is based on the formulas for conditional probability defined earlier:
P(A|B) = P(A and B) / P(B)   and   P(B|A) = P(A and B) / P(A)
• If we multiply both sides of the first equation by P(B) we have:
P(A and B) = P(A|B) × P(B)
• You can’t decide on your last class, so you decide to enroll randomly
• There are 3 open classes in Economics and 2 in History
E E E H H
• What’s the probability that you’ll enroll in an economics class?
P(E) = 3/5
• Now suppose you decide to enroll in your last two classes randomly
• Again, there are 3 open classes in Economics and 2 in History
• Let E1 represent the event that the first class is in Econ and E2 represent the event that the second class is in Econ
• We want to know: What is the probability both classes are in Econ?
• Begin by assuming that the two events are independent
• If the events are independent, we can use this version of the multiplication rule:
P(E1 and E2) = P(E1) × P(E2) = 3/5 × 3/5 = 0.36
• Any objections?
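One way to probe the independence assumption is to enumerate every possible pair of enrollments. The sketch below assumes the two classes are chosen without replacement from the five open classes (you cannot enroll in the same class twice); the class labels are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels for the 3 open Econ classes and 2 open History classes
classes = ["Econ A", "Econ B", "Econ C", "Hist A", "Hist B"]

# All equally likely ordered (first class, second class) pairs with no repeats
pairs = list(permutations(classes, 2))
both_econ = [p for p in pairs if p[0].startswith("Econ") and p[1].startswith("Econ")]

p_enumerated = Fraction(len(both_econ), len(pairs))

# General multiplication rule: P(E1 and E2) = P(E1) x P(E2 | E1)
p_general = Fraction(3, 5) * Fraction(2, 4)

# Version that (incorrectly here) assumes independence
p_if_independent = Fraction(3, 5) * Fraction(3, 5)

print(p_enumerated, p_general, p_if_independent)  # 3/10 3/10 9/25
```

The enumeration agrees with the general multiplication rule (0.30) rather than with the independence shortcut (0.36), because after one Econ class is taken only 2 of the 4 remaining classes are in Econ.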
E E E H H
• Now we realize E1 and E2 aren’t independent (since we can’t enroll in a class twice)
• Since the events aren’t independent, we need this version of the multiplication rule: P(E1 and E2) = P(E2|E1) × P(E1)
• We stated earlier that the probability of a union of two events is denoted as P(A or B)
• We can use this concept to answer questions like: What’s the probability an individual has committed a felony OR has high testosterone?
• What is P(High or Crime)?
• One way to calculate this probability is to divide the event into the simple events that make up the event: (High and Crime), (High and NoCrime), and (Low and Crime)
• We’ve been throwing around the addition rule to calculate the probability of events from simple events, but there’s a complication to the addition rule that we’ve yet to address
• Suppose you’re working concessions at a Mallards game and you know:
• So far, when calculating the probability of unions, we’ve simply added the probabilities of the relevant events
• But this is because, in all the examples thus far, we’ve been considering the unions of mutually exclusive events
• The actual addition rule looks like this: P(A or B) = P(A) + P(B) - P(A and B)
• And if A and B are mutually exclusive, it simplifies to: P(A or B) = P(A) + P(B)
• If you’re a visual person, here’s the rule in Venn diagram form:
[Venn diagram: (A or B) = A + B - (A and B)]
• So back to our Mallards example, if we have:
• 60% buy a hot dog
• 80% buy a beer
• 50% buy both a hot dog and a beer
• What proportion buy either a hot dog or a beer? P(Hot Dog or Beer)
• Using the addition rule:
P(Hot Dog or Beer) = P(Hot Dog) + P(Beer) - P(Hot Dog and Beer) = .60 + .80 - .50 = .90
• In fact, this gives us a third way of calculating the probability we were interested in before
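The addition rule can be applied to both examples in a few lines. In this sketch the hot dog and beer figures come straight from the notes; for the crime example, P(High) = .51 and P(High and Crime) = .17 are from the table, while P(Crime) = .28 is an assumption obtained by adding the .17 and .11 cells. The function name `prob_union` is illustrative.

```python
from fractions import Fraction

def prob_union(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# Concessions example
p_hot_dog_or_beer = prob_union(Fraction(60, 100), Fraction(80, 100), Fraction(50, 100))

# Crime example via the addition rule (P(Crime) = .28 is assumed, see above)
p_high_or_crime = prob_union(Fraction(51, 100), Fraction(28, 100), Fraction(17, 100))

# Same probability by adding the simple events that make up the union
p_by_simple_events = Fraction(17, 100) + Fraction(34, 100) + Fraction(11, 100)

print(float(p_hot_dog_or_beer))  # 0.9
print(float(p_high_or_crime))    # 0.62
```

Subtracting P(A and B) removes the double counting of outcomes that fall in both events, which is why simply adding .60 and .80 would overshoot.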