
Chapter 1

Probability

1.1 The concept of probability

For the purpose of this course, and unless stated otherwise, we take probability to be the relative frequency with which an event occurs when the experiment is repeated a large number of times, under similar conditions.
Example 1 If you toss a fair coin many times, half of the time the result will be heads and half of the time the result will be tails. That is, the ratio of obtaining heads to tails is 1:1, and we say that the probability of each side showing up is 1/2, or 0.5.
This does not mean that if you toss it 100 times, exactly 50 results will be heads (and exactly 50 tails). However, if you toss it 1,000,000,000 times, the proportion of heads to tails will approach 1:1. (You don't have to believe this yet. In fact, see the questions below.)
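If you would like some empirical evidence in the meantime, the frequency interpretation is easy to simulate. Here is a minimal sketch in Python (the toss counts and the seed are arbitrary choices):

import random

# Minimal simulation sketch: toss a fair coin n times and watch the
# proportion of heads approach 0.5 as n grows.
random.seed(0)                          # fixed seed so the run is reproducible
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)                 # proportion of heads; close to 0.5 for large n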
Question 1 According to this understanding, is it possible to speak of the probability of
(1) the sun exploding tomorrow?
(2) being run over by a bicycle today?
Question 2 (1) Why do you believe that a fair coin will give heads half of the time and tails half of the time?
(2) After getting all heads in 10 consecutive coin tosses, what is the probability that the next toss will result in a head?
(3) If your answer is 0.5, why do you think that the ratio of heads to tails will be 1:1 after more tosses?
Don't worry if you cannot answer these questions satisfactorily. On the contrary, you should congratulate yourself if you are able to understand why these are troubling questions. The concept of probability is a difficult one; it took us many, many years to get it to work. (In fact, read https://en.wikipedia.org/wiki/Probability_interpretations to see where we are today.)
Most of the time, one may use the term probability very liberally. However, in this course
you are expected to know how and when to define it properly. Being able to define it properly
will
(1) Save you from many pitfalls mankind has faced in the past, as well as
(2) Allow you to make full use of its power.

1.2 Definitions: outcomes and events

Probability arises when one considers possible results of some experiment.


Definition 1 (1) A collection of all elementary results, or outcomes of an experiment, is called a sample space. We use the symbol Ω to denote sample spaces.
(2) An event is a collection of outcomes within the sample space. (Of course, an event may consist of only one outcome.)
You are free to define the sample space any way you want. Typically, we want the sample space to be defined in such a way that the probabilities of the outcomes are evident and easily assigned, and such that they allow us to solve our problem elegantly.
Example 2 If we are interested in finding the probability of getting three (possibly non-consecutive)
sixes in five consecutive dice throws, we may define our sample space as all the possible results
from five consecutive dice throws, that is
11111
11112
11113
11114
11115
11116
11121
11122
...
66665
66666.
Why do we want to choose this as the sample space? Because it has these nice properties, which help simplify the calculation:
(1) The sample space includes all the possible outcomes,
(2) No two outcomes in the sample space can happen at the same time (we say that they are mutually exclusive),
(3) The probability of each outcome in the sample space is the same, i.e. (1/6)^5.
To get the probability for the event of obtaining three sixes in five dice throws, simply find
all the outcomes in the sample space with exactly three sixes, and add up their probabilities.
Note how we have made use of all the three properties above to enable this simple calculation.
Question 3 (1) How many outcomes are there in the sample space above? What is the sum of their probabilities?
(2) How many outcomes in the sample space have exactly three sixes? What is the sum of their probabilities?

1.2.1 Handling non-mutually exclusive events

Consider now if we want to find the probability of making five consecutive dice throws and having at least one of these events occur:
(E1 ) There are exactly two (possibly non-consecutive) ones in the dice throws, and
(E2 ) There are exactly two (possibly non-consecutive) sixes in the dice throws.
We can simply add up the probabilities of all the outcomes in the earlier sample space with
two sixes or two ones.
Let us first look at the sample space.
Outcome    Probability    Fulfills E1 or E2
11111      (1/6)^5        No
11112      (1/6)^5        No
11113      (1/6)^5        No
...
11165      (1/6)^5        No
11166      (1/6)^5        Yes (fulfills E2)
11211      (1/6)^5        No
...
66116      (1/6)^5        Yes (fulfills E1)
66121      (1/6)^5        Yes (fulfills both E1 and E2)
66122      (1/6)^5        Yes (fulfills E2)
...
66665      (1/6)^5        No
66666      (1/6)^5        No

We can write a program which
(1) Generates all the possible outcomes in the sample space,
(2) For each outcome, checks if it fulfills either E1 or E2, and
(3) If yes, adds its probability into our sum.
However, when the sample space is large, this computation is going to take a long time. (Also, some people may dislike such a method because it's brute-force and not elegant. However, this is just a matter of taste. An ugly solution is still a solution as long as it serves our purpose, and serves it correctly.) A sketch of such a program is given below.
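One possible brute-force sketch in Python:

from itertools import product

# Enumerate all 6^5 outcomes of five dice throws, check whether each
# outcome fulfills E1 (exactly two ones) or E2 (exactly two sixes),
# and add up the probabilities of those that do.
p_outcome = (1 / 6) ** 5              # every outcome is equally likely
total = 0.0
for outcome in product(range(1, 7), repeat=5):
    e1 = outcome.count(1) == 2        # exactly two (possibly non-consecutive) ones
    e2 = outcome.count(6) == 2        # exactly two (possibly non-consecutive) sixes
    if e1 or e2:
        total += p_outcome
print(total)                          # approximately 0.306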
So, let's try to get a more efficient solution.

We first note that there are exactly (5 choose 2) × 5^3 = (5!/(2! 3!)) × 5^3 = 1250 outcomes that fulfill E1. Similarly, there are exactly 1250 outcomes that fulfill E2. Aha! So, the solution is 2500 × (1/6)^5, no?
No. Regrettably, some outcomes in the sample space have both two sixes and two ones, e.g. 11266, 13616.
That is, some of the 1250 outcomes which fulfill E2 also fulfill E1. (In such a situation, we say that the events E1 and E2 are not mutually exclusive.) Hence, our calculation of 2500 × (1/6)^5 carelessly added in the probabilities of these outcomes twice.
One way to resolve this situation is to count how many of the outcomes which fulfill E1 also fulfill E2, and subtract off their probabilities. (If you don't know how to solve this combinatorial problem, take out a piece of paper, or use a spreadsheet, and write out all the outcomes. That will give you some intuition for solving the problem.)
 
Convince yourself that there are (5 choose 2) × (3 choose 2) × 4 = 120 outcomes that fulfill both E1 and E2. Hence, our solution is (2500 − 120) × (1/6)^5 = 2380 × (1/6)^5 ≈ 0.306.
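These counts are easy to double-check with a few lines of Python (a sketch using math.comb):

from math import comb

# Double-check the combinatorial counts used above.
n_e1 = comb(5, 2) * 5 ** 3               # exactly two ones: 1250
n_both = comb(5, 2) * comb(3, 2) * 4     # two ones AND two sixes: 120
n_e1_or_e2 = 2 * n_e1 - n_both           # subtract the overlap counted twice
print(n_e1_or_e2, n_e1_or_e2 / 6 ** 5)   # 2380, approximately 0.306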
Notice the trick which we have just used? We have subtracted off the probabilities of the overlapping outcomes. We now proceed formally, with proper terminology and rules for handling event probabilities.

1.3 Rules of probability

Definition 2 A union of events A, B, C, . . . is an event consisting of all the outcomes in all these events. We write the union as A ∪ B ∪ C ∪ . . . (A or B or C, . . .).
Definition 3 An intersection of events A, B, C, . . . is an event consisting of all the outcomes that are common to all these events. We write the intersection as A ∩ B ∩ C ∩ . . . (A and B and C, . . .).
Definition 4 A complement of an event A is an event which consists of all the outcomes in the sample space which are not in A. We write the complement of A as Ā (not A).
Definition 5 The difference of events A and B consists of all the outcomes that are in A but
not in B. We write the difference of A and B as A \ B (A but not B).
Definition 6 Two events A and B are mutually exclusive if A ∩ B = ∅.
E1 and E2 in Section 1.2.1 are non-mutually exclusive events. As such, E1 ∩ E2 ≠ ∅.
Definition 7 Events A, B, C, . . . are exhaustive if A ∪ B ∪ C ∪ . . . equals the sample space Ω.
Exhaustiveness is an important concept because (for convenience and by common sense) we
routinely construct our sample space Ω to have a probability of 1. That is, P {Ω} = 1. You will
appreciate this later.
Example 3 When a card is pulled from a complete deck of cards (excluding the joker), the
events
(E1 ) The card is a spade,
(E2 ) The card is a diamond,
(E3 ) The card is a heart,
(E4 ) The card is a club.
are mutually exclusive. Furthermore, the events (E1)-(E4) are exhaustive.
Theorem 1 For any events E1, E2, . . . , En,
(1) the complement of E1 ∪ E2 ∪ . . . ∪ En is Ē1 ∩ Ē2 ∩ . . . ∩ Ēn, and
(2) the complement of E1 ∩ E2 ∩ . . . ∩ En is Ē1 ∪ Ē2 ∪ . . . ∪ Ēn.



Theorem 2 For any event E consisting of mutually exclusive outcomes w1, . . . , wn,

P {E} = Σ_{wi ∈ E} P {wi}.

Theorem 3 For any mutually exclusive events A and B,

P {A ∪ B} = P {A} + P {B}.
Example 4 If the email which you are waiting for arrives today with probability 0.6 and arrives
tomorrow with probability 0.3, then (since it cannot arrive on both days) the probability of it
arriving today or tomorrow is 0.3 + 0.6 = 0.9. (Yes, this example may be problematic since
your email may arrive BOTH today and tomorrow due to various reasons such as problems with
email servers/gateways. I know.)
Recall that we counted the outcomes in E1 ∩ E2 and removed their double-counted probabilities to arrive at the probability for E1 ∪ E2. The observant will be quick to point out that this is similar to the principle |A ∪ B| = |A| + |B| − |A ∩ B|. The following formalizes this principle for the case of probability.
Theorem 4 (Probability of Union) For any events A and B,

P {A ∪ B} = P {A} + P {B} − P {A ∩ B}.
Example 5 If a network failure occurs today with probability 0.5 and, independently, occurs tomorrow with probability 0.7, what is the probability that a network failure occurs either today or tomorrow?
There are at least two ways of solving this problem:
First: the probability can be computed as 1.0 minus the probability of both events not occurring, that is, 1 − (1 − 0.7)(1 − 0.5) = 0.85.
Second: sum up the probabilities of
(A) the failure happening today, and
(B) the failure happening tomorrow,
then subtract off the probability where (A) and (B) overlap, that is, the probability of failures occurring on both days. This gives us 0.7 + 0.5 − (0.7 × 0.5) = 0.85.
Note that Theorem 4 reduces to Theorem 3 when A and B are mutually exclusive, that is, when P {A ∩ B} = P {∅} = 0.
Question 4 How would you generalize Theorem 4 to the case of three events? That is, A ∪ B ∪ C? How about four events? Hint: Ask Google about the inclusion-exclusion principle.
Theorem 5 (Complement Rule) For any event A, P {Ā} = P {Ω} − P {A} = 1 − P {A}.
Definition 8 Events E1, E2, . . ., En are said to be independent if and only if they occur independently of each other. That is, P {Ei} is not affected by the occurrence of Ej, j ≠ i. (For those who already know, this means the conditional probability P {Ei |Ej} = P {Ei}.)


Theorem 6 (Probability of Intersection of Independent Events) For any independent events E1, E2, . . ., En,

P {E1 ∩ E2 ∩ . . . ∩ En} = P {E1}P {E2} · · · P {En}.
Question 5 There is a 0.01 probability for a hard disk to crash. It has two backups, each with a probability of 0.02 of crashing. The event of one hard disk crashing does not affect the probability of the other two crashing. What is the probability of all three hard disks crashing at the same time?
Example 6 Suppose that your email will reach your friend if and only if three important gateways are functioning. These three gateways fail independently, with probabilities 0.03, 0.02 and 0.05, respectively. What is the probability of your email reaching your friend?
Your email will reach your friend if all three gateways are functioning. That is, (1 − 0.03) × (1 − 0.02) × (1 − 0.05) = 0.90307.
Alternatively, we may want to solve the problem by adding up the probabilities of all the cases where at least one of the gateways fails. However, we cannot simply add up the three probabilities (0.03, 0.02, and 0.05) because these events are not mutually exclusive: more than one gateway can fail simultaneously.
One way to solve this problem is to discount the overcounted probability (like we did before with the five dice throws in Section 1.2.1) through Theorem 4. However, whereas we used P {A ∪ B} = P {A} + P {B} − P {A ∩ B} earlier, in the present case the theorem becomes

P {F1 = X ∪ F2 = X ∪ F3 = X} =
    P {F1 = X} + P {F2 = X} + P {F3 = X}
    − P {F1 = X ∩ F2 = X} − P {F1 = X ∩ F3 = X} − P {F2 = X ∩ F3 = X}
    + P {F1 = X ∩ F2 = X ∩ F3 = X},

where we have denoted the condition of the first gateway as F1, the second F2, the third F3, and have used an X to indicate failure.
The probabilities of the cases where at least one gateway fails are as follows (where we use
an O to indicate a working state):
F1    F2    F3    Probability
X     O     O     .02793
O     X     O     .01843
O     O     X     .04753
X     X     O     .00057
X     O     X     .00147
O     X     X     .00097
X     X     X     .00003
Total             .09693
Now the observant would have noted one thing: the sum of these probabilities already gives us P {F1 = X ∪ F2 = X ∪ F3 = X}! We have no need to use Theorem 4. The probability of the email reaching your friend is 1 − .09693 = .90307. (Note that from the table, P {F1 = X} = .02793 + .00057 + .00147 + .00003 = .03, P {F2 = X} = .01843 + .00057 + .00097 + .00003 = .02, and P {F3 = X} = .04753 + .00147 + .00097 + .00003 = .05, as expected.)
Just for the sake of verification, we compute P {F1 = X ∪ F2 = X ∪ F3 = X} again through Theorem 4:
P {F1 = X ∩ F2 = X} = .00057 + .00003 = .0006.
P {F1 = X ∩ F3 = X} = .00147 + .00003 = .0015.
P {F2 = X ∩ F3 = X} = .00097 + .00003 = .001.
P {F1 = X ∩ F2 = X ∩ F3 = X} = .00003.
Finally,
P {F1 = X ∪ F2 = X ∪ F3 = X} =
    P {F1 = X} + P {F2 = X} + P {F3 = X}
    − P {F1 = X ∩ F2 = X} − P {F1 = X ∩ F3 = X} − P {F2 = X ∩ F3 = X}
    + P {F1 = X ∩ F2 = X ∩ F3 = X}
    = .03 + .02 + .05 − .0006 − .0015 − .001 + .00003 = .09693.
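The table above can also be reproduced with a short brute-force sketch in Python (assuming, as in the example, that the gateways fail independently):

from itertools import product

# Enumerate the 2^3 working (O) / failed (X) states of the three gateways
# and sum the probabilities of the states with at least one failure.
fail_probs = [0.03, 0.02, 0.05]
p_at_least_one = 0.0
for states in product("OX", repeat=3):
    p = 1.0
    for state, q in zip(states, fail_probs):
        p *= q if state == "X" else (1 - q)   # independent gateways
    if "X" in states:
        p_at_least_one += p
print(p_at_least_one)        # approximately .09693
print(1 - p_at_least_one)    # approximately .90307, the email gets through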
Question 6 Define a sample space for four computers, where each one has a 0.2 probability of being infected with a virus.
(1) How many outcomes are in the sample space?
(2) Count the number of outcomes where exactly two computers are infected. Subsequently, give the probability for exactly two computers to be infected.

1.3.1 When combinatorics is all you need

As we have seen so far, combinatorics can help us calculate probabilities very efficiently for the problems where it is applicable. In fact, when the sample space consists of outcomes of equal probability (such as the case discussed in Section 1.2.1), each outcome has the same probability, say 1/n, where n is the size of the sample space, and the probability for an event E of t outcomes can be computed as t/n, or

P {E} = (number of outcomes in E) / (number of outcomes in Ω).

In such a case, the computation of probabilities becomes purely a matter of counting the number of outcomes, which is exactly the kind of problem studied in combinatorics. Hence, when combinatorics can be applied, it offers a very efficient way of computing the probabilities of these problems. (Of course, some counting problems may not yield simple solutions through combinatorics.)
Example 7 The total number of outcomes in the sample space of Question 3 is 6^5. The total number of outcomes with exactly three sixes is (5 choose 3) × 5^2 = 250. Hence, the probability of obtaining exactly three sixes in five throws of a fair die is 250/6^5 ≈ 0.032.
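This count-based calculation is easy to reproduce in Python (a sketch using math.comb):

from math import comb

# Count-based probability of exactly three sixes in five throws of a fair die.
favourable = comb(5, 3) * 5 ** 2        # choose positions for the sixes; the other
                                        # two throws can each show 1-5
total = 6 ** 5                          # size of the sample space
print(favourable, favourable / total)   # 250, approximately 0.032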
Example 8 When a card is drawn from a complete deck of 52 cards, the probability that the
card is a spade is 13/52 = 0.25.


Example 9 The probability that a string of length 8, picked at random from the set of all strings of length 8 over both uppercase and lowercase letters, consists of only lowercase letters is 1/2^8 ≈ 0.0039 (surprisingly small!).
Notice how the probabilities in these examples are calculated using only combinatorics? Are
you convinced that combinatorics is important now? Use the following questions to practice
your combinatorics kung-fu.
Question 7 How many passwords exactly eight characters long can be constructed, if only alphanumeric characters (including both uppercase and lowercase Roman letters) are allowed?
Question 8 Define a sample space which consists of all the possible ways to seat five (distinct)
students in ten chairs (chairs are indistinguishable).
(1) How many outcomes are there in the sample space?
(2) Count the number of outcomes where no two students are seated next to each other.
Question 9 How many passwords that
(Condition 1) are exactly six characters long, and
(Condition 2) contain the string ccc,
can be constructed from lowercase Roman letters?
Question 10 How many passwords exactly eight letters long can be constructed from the Roman alphabet,
(1) if both lowercase and uppercase characters are allowed?
(2) if each occurrence of a vowel must follow a non-vowel letter?
(3) if a vowel cannot follow another vowel, except that e can follow a (i.e., ae is allowed)?
Question 11 Some smartphones use pattern locks, a sequence of digits entered by swiping on a numeric keypad. Each pattern swiped specifies a password string; for example, the Z pattern corresponds to the string 1235789. How would you write a program to count the number of pattern lock strings at most six digits long?

1.4 Conditional probability

Definition 9 The conditional probability of event A given event B, written P {A|B}, is the probability of event A occurring given that event B has occurred.
We can compute the conditional probability P {A|B} as

P {A|B} = P {A ∩ B} / P {B}.

In a sample space where every outcome has equal probability, this becomes

P {A|B} = (number of outcomes in A ∩ B) / (number of outcomes in B).

Since P {A|B} = P {A ∩ B} / P {B}, it follows that

P {A ∩ B} = P {A|B}P {B}.
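For instance, if a card is drawn from a complete deck of 52 cards, B is the event that the card is black, and A is the event that the card is a spade, then P {A|B} = P {A ∩ B} / P {B} = (13/52)/(26/52) = 1/2.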


Question 12 Ninety percent of flights depart on time. Eighty percent of flights arrive on time. Seventy percent of flights depart on time and arrive on time.
(1) You are meeting a flight that departed on time. What is the probability that it will arrive on
time?
(2) You have met a flight, and it arrived on time. What is the probability that it departed on
time?
(3) Are the events, departing and arriving on time, independent?
Question 13 A test for a certain virus is 95% reliable for infected patients and 99% reliable for others. That is, if a patient has the virus (event V), the test will show the virus to be present (event S) with probability P {S|V} = 0.95, and if the patient does not have the virus, the test will show the virus to be absent with probability P {S̄|V̄} = 0.99. Given that the test shows a patient to have the virus, what is the probability that the patient indeed has the virus? Give your answer in terms of P {V}.
Question 14 (1) The Smiths have two children. What is the probability that both are girls?
(2) The Smiths have two children. You met the elder child, who is a girl. What is the probability that the other is also a girl?
(3) The Smiths have two children. You saw one of them walking out of the house, and it turns out to be a girl. What is the probability that the other is also a girl?
Hint: This question is so famous that you can't possibly miss it when searching the Internet.
Theorem 7 (Bayes Rule)

P {B|A} = P {A|B}P {B} / P {A}.

(This follows directly from P {A ∩ B} = P {A|B}P {B} = P {B|A}P {A}.)

Question 15 On a mid-term exam, students X, Y, and Z forgot to sign their papers. The professor knows that they can do well in the exam with probabilities 0.8, 0.7, and 0.5, respectively. After
the grading, he notices that two unsigned exams are good while the remaining one is bad. Given
this information, what is the probability that the paper belongs to student Z?
Definition 10 The law of total probability states that, for mutually exclusive and exhaustive events B1, . . . , Bk,

P {A} = Σ_{j=1}^{k} P {A|Bj}P {Bj}.

Combining this with Bayes Rule (taking k = 2, with the events B and B̄), it follows that

P {B|A} = P {A|B}P {B} / (P {A|B}P {B} + P {A|B̄}P {B̄}).
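For a quick illustration with made-up numbers: if P {B} = 0.3, P {A|B} = 0.9 and P {A|B̄} = 0.2, then P {A} = 0.9 × 0.3 + 0.2 × 0.7 = 0.41, and P {B|A} = 0.27/0.41 ≈ 0.66.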

Question 16 A new computer program consists of two components. The first component contains an error with probability 0.2. The second component is more complex; it has a probability of 0.4 of containing an error, independently of the first component. An error in the first component
alone causes the program to crash with probability 0.5. For the second component, this probability
is 0.8. If there are errors in both components, the program crashes with probability 0.9. Suppose
the program crashed. What is the probability that there are errors in both components?
