Mod02 Intro Probability

Uploaded by

Zameer Qasim

Review

Probability

1
Events

• Definition: any collection of outcomes of an experiment.


• Events consisting of single outcomes in the sample space are
called elementary or simple events.
• Events consisting of more than one outcome are called compound
events.

2
Venn Diagrams

[Figures: a sequence of Venn diagrams of two events A and B in the sample space, illustrating set operations on A and B, including the difference A − B = A ∩ B'.]
10
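Venn Diagrams

• Mutually Exclusive Events: A and B have no outcomes in common (A ∩ B = ∅).
• Totally Exhaustive Events: together the events cover the whole sample space (A ∪ B = S).
• Mutually Exclusive and Totally Exhaustive Events: the events do not overlap and together cover S.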
13
Relationships among events

• If A and B are two events in the sample space S, then

– A ∪ B (A union B) = 'either A or B occurs or both occur'
– A ∩ B (A intersection B) = 'both A and B occur'
– A ⊂ B (A is a subset of B) = 'if A occurs, so does B'
– A' or Ā = 'event A does not occur'
– ∅ (the empty set) = an impossible event
– S (the sample space) = an event that is certain to occur

14
Probability in discrete space

Probability Axioms:

P(A) ≥ 0

P(S) = 1

For Mutually Exclusive/Disjoint Events:

P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An)


A1 A2 A3 A4
A5 A6 A7 A8
A9 A10 A11 A12
A13 A14 A15 A16
A17 A18 A19 A20

15
Probability in discrete space

Lemma:

P(A) = 1 − P(A')

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

16
Probability in discrete space

Example 1:

Jerry and Susan have a joint bank account.

Jerry goes to the bank on 20% of days.

Susan goes there on 30% of days.

Both are at the bank on 8% of days.

What is the probability that on a particular day at least one of them visits the bank?

P(J ∪ S) = P(J) + P(S) − P(J ∩ S) = 20% + 30% − 8% = 42%
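
As a quick check of Example 1, the addition rule can be evaluated directly. This is only a minimal sketch; the function name `prob_union` is an illustrative choice and does not come from the slides.

```python
def prob_union(p_a: float, p_b: float, p_both: float) -> float:
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


# Jerry visits on 20% of days, Susan on 30%, and both on 8%.
p_at_least_one = prob_union(0.20, 0.30, 0.08)
print(f"P(at least one of them visits) = {p_at_least_one:.2f}")
```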

17
Events – class assignments
An insurance company offers four different deductible levels, none (N), low (L), medium (M), and high (H), for its homeowner's policyholders, and three different levels (L, M, H) for its automobile policyholders. Given the following random sample of policyholders:
• What is the probability that the individual has a medium auto deductible and a high homeowner's deductible?
• What is the probability that the individual has a medium auto deductible?
• What is the probability that the individual has a high homeowner's deductible?
• What is the probability that the individual is in the same category for both auto and homeowner's deductibles?
• What is the probability that the individual is in two different categories?
• What is the probability that the individual has a medium auto deductible given he/she has a high homeowner's deductible?
• What is the probability that the individual has a high homeowner's deductible given he/she has a medium auto deductible?

            Home
Auto     N     L     M     H
L       40    60    50    30
M       70   100   200   100
H       20    30   150   150

See the Excel File.
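
One possible way to work through these questions outside Excel is sketched below. It assumes the sample table above is the whole population of interest (1,000 policyholders); the names `COUNTS`, `joint`, `p_auto`, and `p_home` are illustrative and not part of the course material.

```python
from fractions import Fraction

# Counts from the sample table: rows are auto deductible, columns are home deductible.
HOME_LEVELS = ["N", "L", "M", "H"]
COUNTS = {
    "L": [40, 60, 50, 30],
    "M": [70, 100, 200, 100],
    "H": [20, 30, 150, 150],
}
TOTAL = sum(sum(row) for row in COUNTS.values())


def joint(auto: str, home: str) -> Fraction:
    """P(Auto = auto and Home = home)."""
    return Fraction(COUNTS[auto][HOME_LEVELS.index(home)], TOTAL)


def p_auto(auto: str) -> Fraction:
    """Marginal P(Auto = auto): sum across the row."""
    return Fraction(sum(COUNTS[auto]), TOTAL)


def p_home(home: str) -> Fraction:
    """Marginal P(Home = home): sum down the column."""
    col = HOME_LEVELS.index(home)
    return Fraction(sum(row[col] for row in COUNTS.values()), TOTAL)


# "Same category" is only possible for L, M and H, because auto has no N level.
p_same = sum(joint(level, level) for level in COUNTS)

print("P(Auto=M and Home=H) =", float(joint("M", "H")))
print("P(Auto=M)            =", float(p_auto("M")))
print("P(Home=H)            =", float(p_home("H")))
print("P(same category)     =", float(p_same))
print("P(different)         =", float(1 - p_same))
print("P(Auto=M | Home=H)   =", float(joint("M", "H") / p_home("H")))
print("P(Home=H | Auto=M)   =", float(joint("M", "H") / p_auto("M")))
```

Exact fractions are used so that the conditional probabilities are not affected by rounding before the final conversion to a decimal.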

18
Conditional Probability

P(A | B) = P(A ∩ B) / P(B)

P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A)

19
Events – class assignments

Another insurance company offers four different deductible levels, none (N), low (L), medium (M), and high (H), for its homeowner's policyholders, and three different levels (L, M, H) for its automobile policyholders. Given the following random sample of policyholders:

            Home
Auto     N     L     M     H
L       20    40    80    60
M       50   100   200   150
H       30    60   120    90

See the Excel File.

20
Independent Events

A and B are independent (knowing that B occurred gives no additional information about A) if:

P(A | B) = P(A) or, equivalently, P(A ∩ B) = P(A) P(B)
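
The product rule can be checked cell by cell on a table of counts. The sketch below applies it to the second insurance sample (rows L, M, H for auto; columns N, L, M, H for home); the helper name `is_independent` is an assumption made for illustration.

```python
from fractions import Fraction

# Second insurance sample: rows are auto deductible, columns are home deductible.
COUNTS = [
    [20, 40, 80, 60],     # Auto = L
    [50, 100, 200, 150],  # Auto = M
    [30, 60, 120, 90],    # Auto = H
]


def is_independent(counts: list[list[int]]) -> bool:
    """True if P(row and column) == P(row) * P(column) holds for every cell."""
    total = sum(map(sum, counts))
    row_p = [Fraction(sum(row), total) for row in counts]
    col_p = [Fraction(sum(col), total) for col in zip(*counts)]
    return all(
        Fraction(cell, total) == row_p[i] * col_p[j]
        for i, row in enumerate(counts)
        for j, cell in enumerate(row)
    )


print("Auto and home deductibles independent?", is_independent(COUNTS))
```

With real survey data an exact equality test like this would almost never pass; it works here because the sample was constructed so that the product rule holds exactly.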

21
Conditionally Independent Events

A and B are conditionally independent given C if:

P(A ∩ B | C) = P(A | C) P(B | C)

See the
Excel File

22
Example

              Home
Auto      N      L      M      H    Total
L        4%     6%     5%     3%     18%
M        7%    10%    20%    10%     47%
H        2%     3%    15%    15%     35%
Total   13%    19%    40%    28%    100%

• What is P(Auto = H | Home = H)?

P(Auto = H | Home = H) = P(Auto = H and Home = H) / P(Home = H) = 15% / 28% = 53.57%

23
Example

You roll 2 dice.

P(First Die = 1) = ?
P(Total = 7) = ?
P(First Die = 1 | Total = 7) = ?
P(First Die = 1 | Total = 8) = ?

Does Total = 7 give any additional information about the first die?

24
Example

Total of the two dice:

             Die 2
Die 1    1    2    3    4    5    6
1        2    3    4    5    6    7
2        3    4    5    6    7    8
3        4    5    6    7    8    9
4        5    6    7    8    9   10
5        6    7    8    9   10   11
6        7    8    9   10   11   12

See the
Excel File
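
The two-dice questions can also be answered by enumerating the 36 equally likely outcomes in the table. The following is a small sketch under that assumption (fair, independent dice); the helper names `prob`, `cond_prob`, `first_is_1`, and `total_is` are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """P(event), where event is a predicate on a (die1, die2) pair."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def cond_prob(event, given) -> Fraction:
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def first_is_1(outcome) -> bool:
    return outcome[0] == 1


def total_is(target: int):
    """Build a predicate that is true when the two dice sum to target."""
    return lambda outcome: sum(outcome) == target


print("P(First = 1)             =", prob(first_is_1))
print("P(Total = 7)             =", prob(total_is(7)))
print("P(First = 1 | Total = 7) =", cond_prob(first_is_1, total_is(7)))
print("P(First = 1 | Total = 8) =", cond_prob(first_is_1, total_is(8)))
```

Conditioning on Total = 7 leaves the probability at 1/6, so that event carries no information about the first die, whereas Total = 8 rules out a first die of 1 entirely.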

25
Law of Total Probability

Let B1, B2, …, Bn be disjoint events that together cover the sample space:

P(B1 ∪ B2 ∪ ... ∪ Bn) = P(S) = 1

Then, for each event A:

P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)

26
Law of Total Probability:
P(B1 ∪ B2 ∪ B3 ∪ ...) = P(B1) + P(B2) + P(B3) + ⋯

[Figure: the sample space divided into disjoint cells B1 to B20, with event A overlapping several of the cells.]

P(A) = P(A ∩ B1) + P(A ∩ B2) + P(A ∩ B3) + ⋯

P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A)

P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)

27
Game Show

You are the finalist in a game show, and you have the option of choosing one of the
three briefcases. One of the briefcases contains a $1 million prize, and the other
two are empty. To make the show more exciting, after you choose a briefcase, the
host opens one of the remaining two briefcases, shows you that it is empty, and
gives you the option of either keeping your original briefcase, or switching to the
remaining briefcase. Should you switch?

P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)

B1 = Choosing the Prize, B2 = Choosing Empty

Policy 1: Always Switch
P(winning) = P(prize) × P(winning | prize) + P(empty) × P(winning | empty)
P(winning) = 1/3 × 0 + 2/3 × 1 = 2/3

Policy 2: Do Not Switch
P(winning) = 1/3 × 1 + 2/3 × 0 = 1/3
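
The 2/3 versus 1/3 result can be sanity-checked with a simulation. This is a rough sketch that assumes the host knows where the prize is, always opens an empty briefcase the contestant did not pick, and picks at random when two such briefcases are available; the names `play` and `win_rate` are illustrative.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant ends up with the prize."""
    prize = rng.randrange(3)
    choice = rng.randrange(3)
    # The host opens an empty briefcase that the contestant did not pick.
    opened = rng.choice([c for c in range(3) if c != choice and c != prize])
    if switch:
        # Move to the one briefcase that is neither chosen nor opened.
        choice = next(c for c in range(3) if c != choice and c != opened)
    return choice == prize


def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate P(winning) for a policy by repeated play."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print(f"Always switch: {win_rate(True):.3f}")
print(f"Never switch:  {win_rate(False):.3f}")
```

The estimates should land close to 2/3 and 1/3 respectively, in line with the law of total probability calculation above.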

28
Bayes’ Formula

P(A ∩ B1) = P(A | B1) P(B1)

P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)

Dividing the first line by the second gives Bayes' formula:

P(B1 | A) = P(A | B1) P(B1) / P(A)

29
Bayes’ Formula

The probability of an event, before obtaining information, is called the "prior probability".

The probability of an event, after obtaining information, is called the "posterior probability".

Using Bayes' Formula we can calculate posterior probabilities from prior probabilities and signals.

30
Medical Testing

A new and very accurate test has been developed for the detection of a disease (e.g. cancer). The test is 99.9 percent accurate, with error rates of 0.1% for both types of errors. In other words:

• Out of 1,000 sick patients, the test misses only 1 patient, and
• It results in only 1 false positive for every 1,000 healthy individuals.
The prevalence rate of the disease in the general population is about 1,000 per million. Given a positive test result, what is the probability that the individual is in fact sick?

The second test of the patient is also positive. Now, what is the probability that the individual is in fact sick?

31
Medical Testing

                Sick    Not sick       Total
Positive         999         999       1,998
Not positive       1     998,001     998,002
Total          1,000     999,000   1,000,000

P(sick | positive) = 999 / 1,998 = 50%

32
Medical Testing

The second test of the patient is also positive. Now, what is the probability that the individual is in fact sick?

Among the 1,998 people whose first test was positive:

                Sick    Not sick    Total
Positive         998           1      999
Not positive       1         998      999
Total            999         999    1,998

P(sick | 1st & 2nd positive) = 998 / 999 = 99.9%
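
The same two answers follow from applying Bayes' formula twice, using the first posterior as the prior for the second test. The sketch below assumes the two tests err independently given the patient's true status, which the slides imply but do not state; the function name `posterior` is illustrative.

```python
from fractions import Fraction


def posterior(prior: Fraction, p_pos_if_sick: Fraction, p_pos_if_healthy: Fraction) -> Fraction:
    """Bayes' formula for P(sick | positive test)."""
    # Law of total probability: P(positive) over the sick and healthy groups.
    p_positive = p_pos_if_sick * prior + p_pos_if_healthy * (1 - prior)
    return p_pos_if_sick * prior / p_positive


PREVALENCE = Fraction(1_000, 1_000_000)  # 1,000 sick per million
SENSITIVITY = Fraction(999, 1_000)       # the test catches 999 of 1,000 sick patients
FALSE_POSITIVE = Fraction(1, 1_000)      # 1 false positive per 1,000 healthy people

after_first = posterior(PREVALENCE, SENSITIVITY, FALSE_POSITIVE)
# Assumption: the second test errs independently of the first, given the true status.
after_second = posterior(after_first, SENSITIVITY, FALSE_POSITIVE)

print("After one positive test: ", after_first)
print("After two positive tests:", after_second)
```

The exact values are 1/2 and 999/1000; the table's 998/999 comes from rounding the counts to whole people and agrees to the precision shown.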

33
Document classification

A developer claims that her app can distinguish AI-generated documents from human-generated ones. To assess its performance, we have submitted 1,000 AI-generated and 1,000 human-generated documents to the app.
• The app misclassified 60 human-generated documents as AI-generated
• and 50 AI-generated documents as human-generated.
Create a probability table based on this information and answer the following questions:
1. What is the probability of a randomly selected document being classified correctly (accuracy)?
2. Given that a document is predicted as AI-generated, what is the probability of the document truly being AI-generated (precision)?
3. Given that a document is truly AI-generated, what is the probability that the app classifies the document correctly (recall)?

34
Results of document classification

                                           AI-Generated   Human-Generated   Total
                                           (Positive)     (Negative)
Predicted as AI-Generated (Positive)            950              60         1,010
Predicted as Human-Generated (Negative)          50             940           990
Total                                         1,000           1,000         2,000

1. "What is the probability of a correct prediction?" (Accuracy)
2. "Given an AI-generated prediction, what is the probability of correctness?" (Precision)
3. "Given that a document is AI-generated, what is the probability of a correct prediction?" (Recall)

35
Confusion Matrix, Accuracy, Precision, Recall and F1

Accuracy = (TP + TN) / (TP + FP + FN + TN)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1 = 2 × Precision × Recall / (Precision + Recall)

                     Actual Positive   Actual Negative
Predicted Positive         TP                FP
Predicted Negative         FN                TN
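
These four formulas can be applied to the document-classification counts (TP = 950, FP = 60, FN = 50, TN = 940). The sketch below is one way to do so; the function name `classification_metrics` is an illustrative choice rather than anything defined in the slides.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts from the AI-generated vs human-generated document experiment.
metrics = classification_metrics(tp=950, fp=60, fn=50, tn=940)
for name, value in metrics.items():
    print(f"{name:>9}: {value:.4f}")
```

Note that the function does not guard against empty denominators, so it would need extra handling for a classifier that never predicts the positive class.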

36
Expectation of a function of a random variable

E[f(X)] = Σ_x f(x) p(x)

37
Expectation of a random variable

μ = E[X] = Σ_x x p(x)

E[a] = a; E[aX] = a E[X]

38
Variance of a random variable

Var(X) = E[(X − μ)^2]

σ^2 = Σ_x (x − μ)^2 p(x)

Var(a) = 0; Var(aX) = a^2 Var(X)

39
Example

You roll a die.

a. What are the possible outcomes?

b. What are the probabilities for each outcome?

c. What’s the expected value?

d. What is the variance? The standard deviation?

40
Example

X          P      X*P     (X−μ)^2   ((X−μ)^2)*P
1         1/6    0.167     6.25        1.04
2         1/6    0.333     2.25        0.38
3         1/6    0.500     0.25        0.04
4         1/6    0.667     0.25        0.04
5         1/6    0.833     2.25        0.38
6         1/6    1.000     6.25        1.04
Overall   21     3.5                   2.92

Mean = 3.5, Var = 2.92, Std = sqrt(2.92) ≈ 1.71
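
The table can be reproduced from the definitions of expectation and variance. This is a minimal sketch for a fair six-sided die; the helper names `expectation` and `variance` and the dictionary-based probability table are illustrative assumptions.

```python
from fractions import Fraction
from math import sqrt


def expectation(pmf: dict[int, Fraction]) -> Fraction:
    """E[X] = sum over x of x * p(x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf: dict[int, Fraction]) -> Fraction:
    """Var(X) = sum over x of (x - mu)^2 * p(x)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# A fair die: each face has probability 1/6.
DIE = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expectation(DIE)
var = variance(DIE)
std = sqrt(var)

print(f"mean = {float(mean):.2f}, var = {float(var):.2f}, std = {std:.2f}")
```

Working with exact fractions gives a mean of 7/2 and a variance of 35/12, which round to the 3.5 and 2.92 in the table.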

41
Example

You roll 2 dice.

a. What are the possible outcomes? Show the histogram

b. What are the probabilities for each outcome?

c. What’s the expected value?

d. What is the variance? The standard deviation?

42
Example

X          P       X*P     (X−μ)^2   ((X−μ)^2)*P
2         1/36    0.056    25.00        0.69
3         2/36    0.167    16.00        0.89
4         3/36    0.333     9.00        0.75
5         4/36    0.556     4.00        0.44
6         5/36    0.833     1.00        0.14
7         6/36    1.167     0.00        0.00
8         5/36    1.111     1.00        0.14
9         4/36    1.000     4.00        0.44
10        3/36    0.833     9.00        0.75
11        2/36    0.611    16.00        0.89
12        1/36    0.333    25.00        0.69
Overall           7.000                 5.83

Mean 7
Var 5.83
Std 2.42
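
The distribution of the total can be built by counting outcomes rather than typing in the probabilities. The sketch below assumes two fair, independent dice and prints a rough text histogram; the variable names `ways` and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls give each total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {total: Fraction(n, 36) for total, n in sorted(ways.items())}

mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())
std = float(var) ** 0.5

# A rough text histogram: one '#' per way of rolling the total.
for total, n in sorted(ways.items()):
    print(f"{total:>2} {'#' * n}")

print(f"mean = {mean}, var = {float(var):.2f}, std = {std:.2f}")
```

The variance comes out as 35/6, exactly twice the 35/12 of a single die, as expected for the sum of two independent rolls.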
43
Normal Distribution

44
Standard Normal Distribution

45
Z-Score

46
Example

The amount of distilled water dispensed by a certain machine is normally distributed with mean value 64 oz and standard deviation 0.78 oz. What container size (c) will ensure that overflow occurs only one-half of one percent of the time?

P(X > c) = 0.005, so c = 64 + 2.576 × 0.78 ≈ 66 oz
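
The container size is the 99.5th percentile of the fill distribution, which can be computed with Python's standard library instead of a z-table. This is a short sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Fill amount in ounces: normal with mean 64 and standard deviation 0.78.
fill = NormalDist(mu=64, sigma=0.78)

# Overflow should happen only 0.5% of the time, so c is the 99.5th percentile.
c = fill.inv_cdf(1 - 0.005)
z = (c - fill.mean) / fill.stdev

print(f"z = {z:.3f}, c = {c:.2f} oz")
```

The z-score here is about 2.576, so c is roughly 66.01 oz, matching the rounded answer of 66.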

47