Ec310 Day 4 Lecture Notes

This document discusses probability theory and approaches to assigning probabilities to outcomes of random experiments. It introduces key terminology like sample space and explains the classical and relative frequency approaches to determining probabilities, with the classical approach assigning equal probabilities to all outcomes and the relative frequency approach basing probabilities on experimental results or historical data.

Uploaded by

mikeywilfert

Econ 310
Statistics – Measurement in Economics

Day 4 – Video 1
Probability Theory
Random Experiments

(Also see Keller Chapter 6)

Motivation

• Everything we’ve covered so far has been to make data more digestible
• This is an important topic
• But our long run goal is more ambitious...
• We’d like to be able to conduct statistical inference!
• In order to conduct inference, we’re going to need some theory
• Probability theory helps us to be more formal about what the sample tells us about the population

Terminology

• When we’re discussing a random occurrence, it’s often useful to clarify whether or not we know the outcome yet
• We do this using the following (Latin) phrases:
• Ex ante – means “before the fact”
• Ex post – means “after the fact”
• Why make this distinction? Well, it has a huge impact on how we answer basic questions…
• Consider a coin toss:
• What is the probability of heads ex ante?
• What is the probability of heads ex post?
• Usually in statistics we’re speaking ex ante, but there will be a few exceptions

Terminology

• A random experiment is an action that leads to one of several possible outcomes
• Examples: coin flip, die roll
• Is your grade in this class a random experiment?
• To analyze a random experiment, the first step is to come up with the sample space (S)
• This is a list of all possible outcomes (O) of the experiment
• This list must be exhaustive and mutually exclusive
In other words: Ex post, exactly one of the outcomes must have occurred
• Are these valid sample spaces? (If not, what’s wrong and how do we fix it?)
• Your grade in this class: A, AB, B, BC, C, D, Pass, & Fail
• Roulette: Red & Black
• Will the temperature tomorrow be greater than 65 degrees: Yes & No
So there is more than one way to define the sample space!
• Notation: $S = \{O_1, O_2, \ldots, O_k\}$ (For now, assume k is finite)

Venn Diagrams

• Venn diagrams are a useful way to illustrate the rules of probability
• Think of the area as the probability of that outcome (aka a probability diagram)
• So in this diagram, outcome A is most likely...

[Venn diagram: S={A,B,C}, with regions A, B, and C]
This diagram represents a valid sample space
Outcomes are exhaustive and mutually exclusive

[Venn diagrams of regions A, B, and C: one labeled “Not Mutually Exclusive”, one labeled “Not Exhaustive”]

Assigning Probabilities to Outcomes

• The second step in analyzing an experiment is to assign probabilities to each outcome
• The probability of any outcome must lie between 0 and 1:

$P(O_i) \in [0, 1]$ for each i

• And the sum of the probabilities over all possible outcomes must equal 1:

$\sum_{i=1}^{k} P(O_i) = 1$

• So graphically on our Venn Diagrams, the area of each outcome must be in the range [0,1]

[Venn diagram: outcomes O1 and O2]
And the area of the entire diagram must equal one...

• When we’re analyzing an experiment, how do we determine the probability of each outcome?
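
The two requirements on outcome probabilities (each lies in [0, 1], and together they sum to 1) can be checked mechanically. The following is a minimal sketch; the function name `is_valid_assignment` and the example probabilities for A, B, and C are made up for illustration and do not come from the slides.

```python
from fractions import Fraction

def is_valid_assignment(probabilities):
    """Check that each P(O_i) is in [0, 1] and that all P(O_i) sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probabilities.values())
    sums_to_one = sum(probabilities.values()) == 1
    return each_in_range and sums_to_one

# Hypothetical areas for outcomes A, B, C (A is the most likely outcome)
valid = {"A": Fraction(1, 2), "B": Fraction(1, 3), "C": Fraction(1, 6)}

# Hypothetical assignment whose areas only cover 3/4 of the diagram
incomplete = {"A": Fraction(1, 2), "B": Fraction(1, 4)}

print(is_valid_assignment(valid))
print(is_valid_assignment(incomplete))
```

Exact fractions are used here so that the sum-to-one check is not affected by floating-point rounding.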

Approaches to Assigning Probabilities

• A coin has been flipped 6 times: H T H H T H
• What is the probability of a head for this coin?

Classical Approach

• Is your intuition that P(H)=1/2?
• This corresponds to the classical approach of assigning probabilities
• More formally, the classical approach recommends the following procedure:

1. Define a sample space having k equally likely outcomes (requires theory or intuition)
2. Assign a probability of 1/k to each outcome

Classical Approach

• Will the classical approach work for these experiments?
• Rolling a six-sided die?
• Rolling two dice and observing the total?
• Outcomes of interest: {2, 3, ..., 12}
• But we could instead define the sample space as:
(1,1), (1,2), ... (1,6)
(2,1), (2,2), ... (2,6)
...
(6,1), (6,2), ... (6,6)
k = ?
P(1, 1) = ?
• And use the classical approach on this sample space to calculate the probabilities for the actual outcomes of interest....
P(total is 2) = ? = 1/36
P(total is 6) = ? = 5/36

Total on Two Dice

              Die 1
          1   2   3   4   5   6
Die 2  1  2   3   4   5   6   7
       2  3   4   5   6   7   8
       3  4   5   6   7   8   9
       4  5   6   7   8   9   10
       5  6   7   8   9   10  11
       6  7   8   9   10  11  12

Relative Frequency Approach

Also known as the empirical method

• Was your intuition for the coin example that P(H)=2/3?
• This corresponds to the relative frequency approach to assigning probabilities
• More formally, the relative frequency approach recommends the following procedure:

1. Define the sample space
2. Assign a probability to each outcome based on experimentation or historical data

Relative Frequency Approach

• Example: A computer shop tracks the number of computers it sells per day, and they have data on sales for the past 30 days:

Desktops Sold   Number of Days   Probability of Outcome
0               1                1/30 = .03
1               2                2/30 = .07
2               10               10/30 = .33
3               12               12/30 = .40
4               5                5/30 = .17

• From this we can construct the probability that a specific number of desktops is sold on a given day
• So using the relative frequency approach, the probabilities are those in the last column

Subjective Approach

• Usually one of these two methods is most appropriate
• But occasionally we’ll be in an unfortunate situation where:
• Theory/intuition isn’t helpful in determining a set of k equally likely outcomes
• We have no data
• In this case, we’re left with no option but to rely on judgment
• When we use an expert opinion or an individual’s beliefs to assign a probability to each outcome, this is called the subjective approach

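
Returning to the computer shop example, the relative frequency approach amounts to dividing each count of days by the total number of days observed. The sketch below uses the counts from the table; the variable names are chosen for illustration only.

```python
from fractions import Fraction

# Number of days (out of the past 30) on which 0, 1, 2, 3, or 4 desktops were sold
days_by_sales = {0: 1, 1: 2, 2: 10, 3: 12, 4: 5}

total_days = sum(days_by_sales.values())

# Relative frequency: days with that outcome divided by total days observed
probabilities = {
    sold: Fraction(days, total_days) for sold, days in days_by_sales.items()
}

for sold, p in probabilities.items():
    print(f"P({sold} desktops sold) = {float(p):.2f}")
```

Because the table rounds each probability to two decimals, the rounded column need not add to exactly 1, while the exact fractions do.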
Which Approach to Use?

• Recall the three approaches are: classical, relative frequency, and subjective

• Which approach would you lean toward in the following situations:

• What is the probability of finding life on Mars?

• What is the probability of landing on four when you play Roulette?

• What’s the probability of rain tomorrow?

Econ 310
Statistics – Measurement in Economics

Day 4 – Video 2
Probability Theory
Rules of Probability

(Also see Keller Chapter 6)

Events

• We’ve defined the random experiment, the sample space, and we’ve assigned probabilities to each outcome
• But often we care about the probability of some more complicated event
• This raises some new terminology:
• A simple event is another term for one of the exhaustive and mutually exclusive outcomes that make up a random experiment
• An event, on the other hand, is a set of one or more simple events

Probability of an Event

• Recall the example from the last video where we were interested in the two dice being rolled
• There are 36 outcomes: (1,1) (1,2) (1,3) (1,4) (1,5) (1,6) (2,1) (2,2) (2,3) (2,4) (2,5) (2,6) (3,1) (3,2) (3,3) (3,4) (3,5) (3,6) (4,1) (4,2) (4,3) (4,4) (4,5) (4,6) (5,1) (5,2) (5,3) (5,4) (5,5) (5,6) (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
• But we were interested in the sum of the two dice, so the events of interest are: {2,3,4,5,6,7,8,9,10,11,12}
• The event 6 is composed of the outcomes: {(1,5) (2,4) (3,3) (4,2) (5,1)}
• The probability of an event is the sum of the probabilities of the outcomes that compose the event
• This coincides with what we correctly intuited a few slides back:

$P(6) = P[(1,5)] + P[(2,4)] + P[(3,3)] + P[(4,2)] + P[(5,1)] = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{5}{36}$

NOTE: This is a special case of the addition rule, which we’ll come back to at the end of the lecture

Total on Two Dice

              Die 1
          1   2   3   4   5   6
Die 2  1  2   3   4   5   6   7
       2  3   4   5   6   7   8
       3  4   5   6   7   8   9
       4  5   6   7   8   9   10
       5  6   7   8   9   10  11
       6  7   8   9   10  11  12

Probability of a Complement

• Recall we found that P(6)=5/36
[Table: Total on Two Dice, repeated from the previous slide]
• What’s the probability that the sum of the two dice is NOT equal to six?
• This rule has a name – the complement rule

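
The rule that an event's probability is the sum of the probabilities of its outcomes can be sketched by enumerating the 36 equally likely dice pairs. The code below is an illustration, not part of the original slides; `prob_total` is a name invented here. It also applies the complement rule named above to the event "total is not 6".

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (die 1, die 2), each with probability 1/36
sample_space = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(sample_space))

def prob_total(total):
    """Sum the probabilities of the outcomes whose two dice add up to `total`."""
    return sum(p_outcome for d1, d2 in sample_space if d1 + d2 == total)

p_six = prob_total(6)
p_not_six = 1 - p_six  # complement rule: P(A^c) = 1 - P(A)

print(p_six)
print(p_not_six)
```

The same function gives P(total is 2) = 1/36, which matches the classical-approach result from the first video.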
The Complement Rule

• So we’ve calculated the probability of some event A, but we’re interested in the probability that A does not occur
• In other words, we’re interested in the probability of the complement of A
• Where the complement of an event is all the outcomes that are not in that event
• The complement of A is denoted by $A^c$
• This Venn diagram illustrates the concept of a complement

[Venn diagram: A and $A^c$]

• Since the probabilities of all outcomes must sum to one, we know that:

$P(A^c) = 1 - P(A)$

• So for the sum of two dice
• P(Total=6) = 5/36
• So P(Total≠6) = 1 - 5/36

Intersection of Events

• The intersection of events A and B is the set of outcomes that are in both A and B
• The probability of this intersection is written as: P(A and B) or P(A ∩ B)
• We call this the joint probability of A and B
• This Venn diagram illustrates the concept of an intersection

[Venn diagram: A and B, with the overlap labeled “A and B”]

• Example: Two dice
• If A is the event where the first die is a 1
{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)}
• And B is the event where the second die is a 5
{(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)}
• Then what is the intersection?
• What is the joint probability?
• P(A and B) = ?

Mutually Exclusive Events

• When two events are mutually exclusive, this means they cannot occur together
• Example: Consider two events – It rains tomorrow and it does not rain tomorrow
• What is the joint probability P( RAIN and NO RAIN ) ?
• This Venn diagram illustrates the concept of mutually exclusive events

[Venn diagram: A and B, not overlapping]
Another way of saying this is that the events are disjoint

Union of Events

• The union of events A and B is the set of outcomes that are in either A or B (or both)
• The probability of this union is written as: P(A or B) or P(A ∪ B)
• This Venn diagram illustrates the concept of a union

[Venn diagram: A and B, overlapping]

• Example: Two dice
• If A is the event where the first die is a 1
{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)}
• And B is the event where the second die is a 5
{(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)}
• Then the union of A and B is {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,5), (3,5), (4,5), (5,5), (6,5)}

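
The intersection and union of the two dice events A (first die is a 1) and B (second die is a 5) can be computed directly with sets. This is a minimal sketch that assumes all 36 ordered pairs are equally likely; the helper `prob` and the extra event `C` (total is 12) are introduced here only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs (die 1, die 2)
sample_space = set(product(range(1, 7), repeat=2))

A = {(d1, d2) for d1, d2 in sample_space if d1 == 1}  # first die is a 1
B = {(d1, d2) for d1, d2 in sample_space if d2 == 5}  # second die is a 5
C = {(d1, d2) for d1, d2 in sample_space if d1 + d2 == 12}  # hypothetical extra event

def prob(event):
    """Classical probability: outcomes in the event divided by all outcomes."""
    return Fraction(len(event), len(sample_space))

intersection = A & B  # outcomes in both A and B
union = A | B         # outcomes in A or B (or both)

print(intersection, prob(intersection))
print(len(union), prob(union))
print(A.isdisjoint(B), A.isdisjoint(C))  # mutually exclusive means no shared outcomes
```

A and B share the outcome (1,5), so they are not mutually exclusive, whereas A and C cannot occur together.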
Joint Probability Tables

• Why do some individuals engage in crime while others do not?
• Is it possible young men’s criminal behavior is related to testosterone levels?

                    Has Committed One or More Felonies   Has Not Committed a Felony
High Testosterone   0.17                                  0.34
Low Testosterone    0.11                                  0.38

Callout on the 0.17 cell: This is the probability the individual is high testosterone AND committed a felony
So it’s a joint probability!
And the entire table is thus called a joint probability table

Marginal Probabilities

• Let’s introduce shorthand notation for these events:

        Crime   NoCrime
High    .17     .34
Low     .11     .38

• What is P(High) – the probability an individual has high testosterone?
• What is P(Crime) – the probability an individual has committed a felony?
• Marginal probabilities are probabilities computed by adding across either rows or columns
• They have this name because they are typically written in the margins of the table
• Useful way to double check your work – the total row and total column should both sum to one (caveat: there may be rounding error)
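
Marginal probabilities can be sketched as sums over the joint probability table. The dictionary layout and the helper name `marginal` below are illustrative choices, not something the slides prescribe.

```python
from fractions import Fraction

# Joint probability table from the slide, keyed by (testosterone, crime) labels
joint = {
    ("High", "Crime"): Fraction("0.17"),
    ("High", "NoCrime"): Fraction("0.34"),
    ("Low", "Crime"): Fraction("0.11"),
    ("Low", "NoCrime"): Fraction("0.38"),
}

def marginal(label):
    """Add every joint probability whose row or column label equals `label`."""
    return sum(p for labels, p in joint.items() if label in labels)

p_high = marginal("High")    # adds across the High row
p_crime = marginal("Crime")  # adds down the Crime column

print(float(p_high), float(p_crime))
print(float(marginal("High") + marginal("Low")))  # double check: should be 1
```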

Marginal Probabilities

• In case the previous slide wasn’t intuitive, here are the contents of each cell and the equations for calculating the marginal probabilities

         Crime               NoCrime               Totals
High     P(High and Crime)   P(High and NoCrime)   P(High)
Low      P(Low and Crime)    P(Low and NoCrime)    P(Low)
Totals   P(Crime)            P(NoCrime)

P(High) = P(High and Crime) + P(High and NoCrime)
P(Low) = P(Low and Crime) + P(Low and NoCrime)
P(Crime) = P(High and Crime) + P(Low and Crime)
P(NoCrime) = P(High and NoCrime) + P(Low and NoCrime)

What did we call this rule again?

What’s Wrong with this Logic?

[Comic]
Source: http://xkcd.com

Conditional Probability

• Conditional probabilities are used when we want to know the probability of one event given the occurrence of another event
• The conditional probability is written as P(A|B) (“the probability of A given B”):

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$

• Let’s compare the equations for P(A|B) and P(B|A):

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$

$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$

Conditional Probability

• These Venn diagrams illustrate the concept of a conditional probability

[Venn diagrams: left, region A (“Marginal probability of A”); right, the overlap “A and B” within B (“Probability of A conditional on B”)]

• Note how this corresponds with the equation for conditional probability

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$

• You can think of P(A) as being ex ante, and P(A|B) as being ex post (wrt event B)

Blood Disorder Example

• During an annual checkup, a patient tests positive for a rare blood disorder:
• The disorder is present in only 1 per 1000 individuals
• When a person with the disorder takes the test, the result is positive 99% of the time
• When a person is free of the disorder, the test is negative 95% of the time
• Suppose you test positive – what is the probability you truly have the disease?

A. Below 50%
B. 50% to 95%
C. Between 95% and 99%
D. 99% or above

Blood Disorder Example

The disorder is present in 1 per 1000 individuals
Result is positive 99% of the time for someone with the disease
Result is negative 95% of the time for someone without the disease

• Consider a diagram of 1000 individuals…

[Diagram: 1000 individuals]

The disorder is present in 1 in 1000 individuals, so suppose one person is ill
We’d expect this ill person to test positive (99% probability), so suppose they do
For the 999 without the disorder, we’d expect 5% to test positive, so suppose 50 do
Conditional on testing positive, what’s the probability of being ill?
1/51, or less than 2%!

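
The counting argument for the blood disorder example can be repeated without rounding. The sketch below works with expected counts in a population of 1000, mirroring the diagram; the variable names are illustrative. The slide rounds to 1 ill person and 50 false positives, giving 1/51, while the unrounded expected counts give a slightly smaller value.

```python
population = 1000

p_ill = 1 / 1000            # disorder is present in 1 per 1000 individuals
p_pos_given_ill = 0.99      # test is positive 99% of the time for the ill
p_neg_given_healthy = 0.95  # test is negative 95% of the time for the healthy

# Expected number of people in each group
ill = population * p_ill
healthy = population - ill
true_positives = ill * p_pos_given_ill
false_positives = healthy * (1 - p_neg_given_healthy)

# Conditional on testing positive, the share who are actually ill
p_ill_given_positive = true_positives / (true_positives + false_positives)

print(f"{p_ill_given_positive:.4f}")
print(f"{1 / 51:.4f}")  # the rounded answer from the slide
```

Both versions land below 2%, so answer A is the correct choice either way.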
Conditional Probability

• Let’s return to our crime example: What’s the probability a young man has committed a felony given that the individual has high testosterone?
• So in our shorthand notation, we want to know P(Crime|High)

         Crime   NoCrime   Totals
High     .17     .34       .51
Low      .11     .38       .49
Totals   .28     .72       1.00

• So what is P(Crime|High)?

$P(\text{Crime} \mid \text{High}) = \frac{P(\text{Crime and High})}{P(\text{High})} = \frac{.17}{.51} = .333$

Independence

• We often want to know whether events are independent
• Two events are independent of one another when the probability of one event is unaffected by the occurrence of the other event
• One useful property of conditional probabilities is they give us a quick way to check whether two events are related
• Two events A and B are said to be independent if:

$P(A \mid B) = P(A)$ or $P(B \mid A) = P(B)$

• Example: Returning to our crime example, we saw that:
• P(Crime|High)=.333
• P(Crime)=0.28
• How do we interpret this? Are these events independent?
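
The independence check for the crime example can be sketched by computing P(Crime|High) from the table and comparing it with P(Crime). The exact-equality test below is a simplification chosen for illustration; with rounded table entries or real data, some tolerance for rounding or sampling error would be needed.

```python
from fractions import Fraction

# Values taken from the joint probability table with totals
p_crime_and_high = Fraction("0.17")
p_high = Fraction("0.51")
p_crime = Fraction("0.28")

# Conditional probability: P(Crime | High) = P(Crime and High) / P(High)
p_crime_given_high = p_crime_and_high / p_high

# Independence would require P(Crime | High) == P(Crime)
independent = p_crime_given_high == p_crime

print(f"P(Crime|High) = {float(p_crime_given_high):.3f}")
print(f"P(Crime)      = {float(p_crime):.3f}")
print(independent)
```

Since .333 differs from .28, knowing that an individual has high testosterone changes the probability of a felony, so in this table the two events are not independent.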

Independence

• We can also illustrate independence with a Venn Diagram:

[Venn diagram: A and B, with the overlap labeled “A and B”]

$P(A) = 0.25$

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = 0.25$

So knowing we’re in B doesn’t change the probability of A!

Multiplication Rule

• The multiplication rule is used to calculate the joint probability of two events. It is based on the formulas for conditional probability defined earlier:

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$ and $P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$

• If we multiply both sides of the first equation by P(B) we have:

$P(A \text{ and } B) = P(A \mid B) \times P(B)$

• Likewise, multiplying both sides of the second equation by P(A) we have:

$P(A \text{ and } B) = P(B \mid A) \times P(A)$

• Note, if A and B are independent events these equations simplify to:

$P(A \text{ and } B) = P(A) \times P(B)$

Multiplication Rule Example

• You can’t decide on your last class, so you decide to enroll randomly
• There are 3 open classes in Economics and 2 in History
• What’s the probability that you’ll enroll in an economics class?

$P(E) = \frac{3}{5}$

Multiplication Rule Example

• Now suppose you decide to enroll in your last two classes randomly

[Diagram: five open classes, E E E H H]

• Again, there are 3 open classes in Economics and 2 in History
• Let E1 represent the event that the first class is in Econ and E2 represent the event that the second class is in Econ
• We want to know: What is the probability both classes are in Econ?
• Begin by assuming that the two events are independent
• If the events are independent, we can use this version of the multiplication rule:

$P(E_1 \text{ and } E_2) = P(E_1) \times P(E_2) = \frac{3}{5} \times \frac{3}{5} = 0.36$

• Any objections?
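
One way to probe the independence assumption in the enrollment example is to enumerate every ordered pair of two different classes and count the pairs in which both are Econ. This sketch assumes each ordered pair of distinct classes is equally likely, which is one reading of "enroll randomly"; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Five open classes: 3 in Economics (E) and 2 in History (H)
classes = ["E", "E", "E", "H", "H"]

# Every ordered (first class, second class) choice without repeating a class
pairs = list(permutations(range(len(classes)), 2))
both_econ = [(i, j) for i, j in pairs if classes[i] == "E" and classes[j] == "E"]

p_by_counting = Fraction(len(both_econ), len(pairs))
p_if_independent = Fraction(3, 5) * Fraction(3, 5)

print(float(p_by_counting))     # counting pairs of distinct classes
print(float(p_if_independent))  # the value obtained by assuming independence
```

The two numbers disagree because the same class cannot be chosen twice, so the outcome of the first draw changes the probabilities for the second.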

Multiplication Rule Example

[Diagram: five open classes, E E E H H]

• Now we realize E1 and E2 aren’t independent (since we can’t enroll in a class twice)
• Since the events aren’t independent, we need this version of the multiplication rule:

$P(E_1 \text{ and } E_2) = P(E_2 \mid E_1) \times P(E_1)$

• We’ve already decided: $P(E_1) = \frac{3}{5}$
• How about: $P(E_2 \mid E_1)$? Once one Econ class is taken, 2 of the 4 remaining open classes are in Econ, so $P(E_2 \mid E_1) = \frac{2}{4}$ and $P(E_1 \text{ and } E_2) = \frac{2}{4} \times \frac{3}{5} = 0.30$

So there’s a 30% chance both classes will be in Econ

Union

• We stated earlier that the probability of a union of two events is denoted as P(A or B)
• We can use this concept to answer questions like: What’s the probability an individual has committed a felony OR has high testosterone?

Union

• What is P(High or Crime)?
• One way to calculate this probability is to divide the event into the simple events that make up the event: (High and Crime), (High and NoCrime), and (Low and Crime)

         Crime   NoCrime   Totals
High     .17     .34       .51
Low      .11     .38       .49
Totals   .28     .72       1.00

• Summing the probabilities of these simple events, we get the appropriate probability:

P(High or Crime) = .17 + .34 + .11 = .62

• Alternatively, take 1.00 and subtract the outcomes where High or Crime doesn’t occur

P(High or Crime) = 1 – P(Low and NoCrime) = 1 – .38 = .62

Addition Rule

• We’ve been throwing around the addition rule to calculate the probability of events from simple events, but there’s a complication to the addition rule that we’ve yet to address
• Suppose you’re working concessions at a Mallard game and you know:
• 60% buy a hot dog
• 80% buy a beer
• What we want to know is the probability an attendee buys a hot dog or a beer:
• Can we do this?

P(Hot Dog or Beer) = P(Hot Dog) + P(Beer)

• What’s going wrong?

Addition Rule

• So far, when calculating the probability of unions, we’ve simply added the probabilities of the relevant events
• But this is because, in all the examples thus far, we’ve been considering the unions of mutually exclusive events
• The actual addition rule looks like this:

$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

• And if A and B are mutually exclusive, it simplifies to: $P(A \text{ or } B) = P(A) + P(B)$
• If you’re a visual person, here’s the rule in Venn diagram form:

[Venn diagrams: (A or B) = A + B – (A and B)]

• So without the final term, we’d end up double-counting the overlap!

Addition Rule

• So back to our Mallard’s example, if we have:
• 60% buy a hot dog
• 80% buy a beer
• 50% buy both a hot dog and a beer
• What proportion buy either a hot dog or a beer? P(Hot Dog or Beer)

Using the addition rule

$P(\text{Hot Dog or Beer}) = P(\text{Hot Dog}) + P(\text{Beer}) - P(\text{Hot Dog and Beer}) = .60 + .80 - .50 = .90$

So 90% buy a hot dog or beer (or both)

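
The hot dog and beer calculation can be sketched as a small function implementing the addition rule. The function name `prob_union` is chosen for illustration; the probabilities are the ones given in the example.

```python
from fractions import Fraction

def prob_union(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

p_hot_dog = Fraction("0.60")
p_beer = Fraction("0.80")
p_both = Fraction("0.50")

# Simply adding the two probabilities double-counts people who buy both
naive_sum = p_hot_dog + p_beer

p_hot_dog_or_beer = prob_union(p_hot_dog, p_beer, p_both)

print(float(naive_sum))          # larger than 1, so it cannot be a probability
print(float(p_hot_dog_or_beer))
```

For mutually exclusive events the joint probability is 0, so passing `p_a_and_b=0` reduces the function to the simple sum.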
Union

• We can also practice using our crime and testosterone example
• In fact, this gives us a third way of calculating the probability we were interested in before

         Crime   NoCrime   Totals
High     .17     .34       .51
Low      .11     .38       .49
Totals   .28     .72       1.00

• Now we can add up the marginal probabilities of the events we’re interested in...

$P(\text{High or Crime}) = P(\text{High}) + P(\text{Crime}) - P(\text{High and Crime}) = 0.51 + 0.28 - 0.17 = 0.62$

But since these events aren’t mutually exclusive, we need to subtract out the joint probability...
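
The three ways of computing P(High or Crime) described in these slides can be checked against each other. The sketch below uses the joint probability table from the crime example; the dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

joint = {
    ("High", "Crime"): Fraction("0.17"),
    ("High", "NoCrime"): Fraction("0.34"),
    ("Low", "Crime"): Fraction("0.11"),
    ("Low", "NoCrime"): Fraction("0.38"),
}

# Way 1: add the simple events that make up (High or Crime)
by_simple_events = (
    joint[("High", "Crime")] + joint[("High", "NoCrime")] + joint[("Low", "Crime")]
)

# Way 2: complement rule, subtracting the one cell where neither occurs
by_complement = 1 - joint[("Low", "NoCrime")]

# Way 3: addition rule on the marginal probabilities
p_high = joint[("High", "Crime")] + joint[("High", "NoCrime")]
p_crime = joint[("High", "Crime")] + joint[("Low", "Crime")]
by_addition_rule = p_high + p_crime - joint[("High", "Crime")]

print(float(by_simple_events), float(by_complement), float(by_addition_rule))
```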
