Biostat Lecture five
Introduction to Probability
02/08/25 [email protected] 1
Probability
Probability is the chance of observing a particular outcome or likelihood of
observing an event.
It assumes a “random” process, i.e., the outcome is not predetermined: there
is an element of chance.
Probability theory developed from the study of games of chance like dice and
cards.
Processes such as flipping a coin, rolling a die, or drawing a card from a deck are
probability experiments.
Why Probability in Statistics and Medicine?
• Because medicine is not an exact science,
physicians seldom can predict an outcome with
absolute certainty.
population.
Basic Terms of Probability
• Probability experiment: an action through
which specific results/outcomes (counts,
measurements or responses) are obtained.
Example:
• Tossing a coin and observing the face showing
up is a probability experiment.
• Outcome: the result of a single trial in a
probability experiment. It is also called a simple
event.
Example: the sex of a newborn delivered in the
delivery room is either male or female.
Basic Terms Cont…
• Sample space: The set of all possible outcomes of a statistical
experiment is called the sample space and is represented by the
symbol S.
Example: The sample space for the sex of newborns
when two mothers are in the gynecology ward to
give birth is: S= {MM, MF, FM, FF}
• An event: consists of one or more outcomes and is a
subset of the sample space
Example: From the above experiment, the event consisting of
at least one female is E = {MF, FM, FF}
• Random variable: a function that associates a unique
numerical value with every outcome of an experiment.
Basic terms…
Certain event: An event which is sure to occur.
Two Categories of Probability
The two categories are objective and subjective probability.
Objective probability comprises:
1) Classical probability and
2) Relative frequency probability.
Types of probability
Classical (or theoretical) probability
It is used when each outcome in a sample
space is equally likely to occur.
That is, if an experiment has n equally likely
outcomes, then each possible outcome has
probability 1/n. Equivalently, the probability of an event E is:
P(E) = (number of outcomes in E) / (total number of outcomes in S)
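As a minimal sketch of the classical rule, the Python snippet below enumerates the sample space for the sex of two newborns and counts the outcomes in the event "at least one female". It assumes the four outcomes are equally likely, which is an idealization; the names `sample_space`, `event` and `classical_probability` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space for the sex of two newborns: MM, MF, FM, FF
sample_space = {"".join(pair) for pair in product("MF", repeat=2)}


def classical_probability(event, space):
    """P(E) = (outcomes in E) / (outcomes in S).

    Valid only under the assumption that all outcomes are equally likely.
    """
    return Fraction(len(event & space), len(space))


# Event E: at least one female
event = {outcome for outcome in sample_space if "F" in outcome}
p_event = classical_probability(event, sample_space)
print(sorted(event), p_event)
```

Exact fractions are used so the result can be compared directly with a hand calculation.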
Types of probability cont…
Empirical (or statistical) probability: is based
on observations obtained from
experiments (a large number of trials) or
from historical data.
Example:
• A medical doctor observed that out of 100,000
patients who visited the hospital, 50 were
cancer cases. What is the probability that a
patient to be examined will be positive for
cancer?
P(+ve for cancer) = 50/100,000 = 0.0005
Example 2
In a sample of 50 people, 21 had type O blood, 22 had
type A blood, 5 had type B blood, and 2 had type
AB blood. Set up a frequency distribution and find
the following probabilities
a. A person has type O blood
b. A person has type A or type B blood
c. A person has neither type A nor type O blood
d. A person does not have type AB blood
Solution
Blood type Frequency
A 22
B 5
AB 2
O 21
Total 50
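The four probabilities asked for in Example 2 can be read off the frequency table. The sketch below is one way to do the arithmetic, treating each relative frequency as an empirical probability; the helper `prob` and the variable names are hypothetical.

```python
from fractions import Fraction

# Frequency distribution from the sample of 50 people
blood_counts = {"A": 22, "B": 5, "AB": 2, "O": 21}
total = sum(blood_counts.values())


def prob(*types):
    """Empirical probability that a person has one of the given blood types."""
    return Fraction(sum(blood_counts[t] for t in types), total)


p_o = prob("O")                  # a. type O
p_a_or_b = prob("A", "B")        # b. type A or type B (mutually exclusive)
p_not_a_not_o = prob("B", "AB")  # c. neither A nor O leaves B or AB
p_not_ab = 1 - prob("AB")        # d. complement of type AB

for label, p in [("a", p_o), ("b", p_a_or_b), ("c", p_not_a_not_o), ("d", p_not_ab)]:
    print(label, p, float(p))
```

Parts b and c rely on blood types being mutually exclusive, and part d uses the complement rule.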
Example: Of 158 people who attended a dinner party, 99 were ill.
The empirical probability of illness is 99/158 ≈ 0.63.
Subjective Probability
Personalistic (represents one’s degree of belief in the occurrence of
an event).
E.g., If someone says that he is 95% certain that a cure for AIDS will be
discovered within 5 years, then he means that:
P(discovery of cure for AIDS within 5 years) = 95% = 0.95
Although the subjective view of probability has enjoyed increased
attention over the years, it has not been fully accepted by scientists.
Mutually Exclusive Events
Two events A and B are mutually exclusive if they cannot both happen
at the same time:
P (A ∩ B) = 0
Example:
A coin toss cannot produce heads and tails simultaneously.
The weight of an individual cannot be classified simultaneously as
“underweight”, “normal” and “overweight”.
Independent Events
Two events A and B are independent if the probability of the first one
happening is the same no matter how the second one turns out.
The outcome of one event has no effect on the occurrence or
non-occurrence of the other.
Example:
The outcomes of the first and second coin tosses are independent.
Intersection and union
The intersection of two events A and B, A ∩ B, is the event that A and
B happen simultaneously
P ( A and B ) = P (A ∩ B )
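Example: let A be the event that a newborn is low birth weight (LBW)
and B the event that the newborn is from a multiple birth.
The intersection of A and B is the event that the infant is both LBW
and from a multiple birth.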
The union of A and B, A ∪ B, is the event that either A happens or B
happens or they both happen simultaneously
P ( A or B ) = P ( A ∪ B )
In the example above, the union of A and B is the event that the
newborn is either LBW or from a multiple birth, or both
The concept of probability is used to understand:
Probability distributions: Binomial, Poisson, and Normal
distributions
Sampling and sampling distributions
Estimation
Hypothesis testing
Properties of Probability
1. The numerical value of a probability always lies between 0 and 1,
inclusive.
0 ≤ P(E) ≤ 1
A value of 0 means the event cannot occur (impossible event).
A value of 1 means the event definitely will occur (sure event).
A value of 0.5 means that the probability that the event will occur
is the same as the probability that it will not occur.
2. The sum of the probabilities of all mutually exclusive outcomes is
equal to 1.
4. The complement of an event A, denoted by Ā or Aᶜ, is the event
that A does not occur.
It consists of all the outcomes in which event A does NOT occur.
In the example, the complement of A is the event that a newborn is
not LBW.
In other words, Ā is the event that the child weighs at least 2500 grams at
birth. With P(A) = 0.076:
P(Ā) = 1 − P(A)
= 1 − 0.076
= 0.924
Basic Probability Rules
1. Addition rule
If events A and B are mutually exclusive:
P(A ∪ B) = P(A) + P(B)
Example: The probabilities below represent years of
schooling completed by mothers of newborn infants.
What is the probability that a mother has completed < 12 years of
schooling?
P(≤ 8 years) = 0.056 and
P(9-11 years) = 0.159
Since these two events are mutually exclusive,
P(< 12 years) = P(≤ 8 years) + P(9-11 years)
= 0.056 + 0.159
= 0.215
What is the probability that a mother has completed 12 or more years of
schooling?
P(≥ 12 years) = P(12) + P(13-15) + P(≥ 16)
= 0.321 + 0.218 + 0.230
= 0.769
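The two schooling calculations are the addition rule for mutually exclusive events applied to the quoted probabilities. A minimal sketch follows; the category labels `"<=8"` and `">=16"` are assumptions about the table's open-ended classes, and `p_union_exclusive` is a hypothetical helper name.

```python
# Probabilities quoted in the example (years of schooling completed).
# The labels "<=8" and ">=16" are assumed names for the open-ended classes.
p_schooling = {"<=8": 0.056, "9-11": 0.159, "12": 0.321, "13-15": 0.218, ">=16": 0.230}


def p_union_exclusive(*categories):
    """Addition rule for mutually exclusive events: the probabilities add."""
    return sum(p_schooling[c] for c in categories)


p_less_than_12 = p_union_exclusive("<=8", "9-11")
p_12_or_more = p_union_exclusive("12", "13-15", ">=16")
print(round(p_less_than_12, 3), round(p_12_or_more, 3))
```

Simple summation is only valid here because a mother cannot fall into two schooling categories at once.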
If A and B are not mutually exclusive events,
then subtract the overlap:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
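To see why the overlap is subtracted, the sketch below checks the general addition rule on the two-newborn sample space S = {MM, MF, FM, FF}. The events A (first newborn is female) and B (second newborn is female) are chosen here for illustration, and the four outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

space = {"".join(p) for p in product("MF", repeat=2)}  # MM, MF, FM, FF
A = {o for o in space if o[0] == "F"}  # first newborn is female
B = {o for o in space if o[1] == "F"}  # second newborn is female


def P(event):
    """Classical probability, assuming equally likely outcomes."""
    return Fraction(len(event), len(space))


lhs = P(A | B)                # P(A ∪ B) by direct counting
rhs = P(A) + P(B) - P(A & B)  # addition rule with the overlap removed
print(lhs, rhs)
```

The outcome FF belongs to both A and B, so adding P(A) and P(B) alone would count it twice.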
2. Multiplication rule
If A and B are independent events, then
P(A ∩ B) = P(A) × P(B)
Conditional Probability
Refers to the probability of an event, given that another event is
known to have occurred.
The conditional probability that event B occurs given that
event A has already occurred is denoted P(B|A) and is defined as
P(B|A) = P(A ∩ B) / P(A),
provided that P(A) ≠ 0.
Example:
A study investigated the effect of prolonged exposure to
bright light on retina damage in premature infants.
The probability of developing retinopathy is:
P(Retinopathy) = No. of infants with retinopathy / Total no. of infants
= 39/60 = 0.65
We want to compare the probability of retinopathy, given that the
infant was exposed to bright light, with the probability given that
the infant was exposed to reduced light.
Exposure to bright light and exposure to reduced light are
conditioning events, events we want to take into account when
calculating conditional probabilities.
The conditional probability of retinopathy, given exposure to bright
light, is:
P(Retinopathy | exposure to bright light) = 18/21 = 0.86
P(Retinopathy | exposure to reduced light) = 21/39 = 0.54
The conditional probabilities suggest that premature infants exposed
to bright light have a higher risk of retinopathy than premature infants
exposed to reduced light.
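The retinopathy comparison can be reproduced from the counts quoted in the example: 18 of 21 infants under bright light and 21 of 39 under reduced light. The sketch below restricts attention to each exposure group in turn; the dictionary layout and the function name `p_retinopathy_given` are illustrative assumptions.

```python
# Counts quoted in the example
retinopathy = {"bright": 18, "reduced": 21}  # infants who developed retinopathy
exposed = {"bright": 21, "reduced": 39}      # infants in each exposure group


def p_retinopathy_given(light):
    """P(retinopathy | light) = count(retinopathy and light) / count(light)."""
    return retinopathy[light] / exposed[light]


p_overall = sum(retinopathy.values()) / sum(exposed.values())
p_bright = p_retinopathy_given("bright")
p_reduced = p_retinopathy_given("reduced")
print(round(p_overall, 2), round(p_bright, 2), round(p_reduced, 2))
```

Conditioning changes the denominator from all 60 infants to the size of the exposure group, which is why the two conditional probabilities differ from the overall 0.65.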
For independent events A and B,
P(A|B) = P(A).
Test for Independence
Two events A and B are independent if:
P(A|B) = P(A)
or
P(A ∩ B) = P(A) × P(B)
Two events A and B are dependent if:
P(A|B) ≠ P(A)
or
P(A ∩ B) ≠ P(A) × P(B)
Example
In a study of optic-nerve degeneration in Alzheimer’s disease,
postmortem examinations were conducted on 10 Alzheimer’s
patients.
Sex       Optic-nerve degeneration
          Present    Not present
Female    4          1
Male      4          1
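Solution
P(Optic-nerve degeneration | Female) =
No. of females with degeneration / No. of females
= 4/5 = 0.80
P(Optic-nerve degeneration) =
No. of patients with degeneration / Total no. of patients
= 8/10 = 0.80
Because the two probabilities are equal, optic-nerve degeneration is
independent of sex in this sample.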
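The independence check compares the conditional probability of degeneration among females with the overall probability of degeneration. The sketch below does this with exact fractions on the 2 × 2 table; the variable names are hypothetical.

```python
from fractions import Fraction

# Rows: sex; columns: (degeneration present, degeneration not present)
table = {"Female": (4, 1), "Male": (4, 1)}

n_total = sum(sum(row) for row in table.values())
n_present = sum(row[0] for row in table.values())

p_present = Fraction(n_present, n_total)
p_present_given_female = Fraction(table["Female"][0], sum(table["Female"]))

# Independence in this sample: P(degeneration | Female) == P(degeneration)
independent = p_present_given_female == p_present
print(p_present_given_female, p_present, independent)
```

Exact equality is a reasonable criterion for this small teaching table. With real data the two proportions rarely match exactly, so a formal test of independence would normally be used.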
Random variables: can be either discrete or continuous.
Common Probability Distributions
1. Binomial distribution
Consider a dichotomous variable (a nominal variable with only two
possible values).
The two mutually exclusive outcomes are referred to as “failure” and
“success”.
E.g., let X represent smoking status: X=1 for a smoker and X=0 for a non-smoker.
The two outcomes are mutually exclusive.
E.g., in 1987, 29% of the adults in the USA were smokers; therefore
Pr (X=1) = 0.29 and Pr (X=0) = 1-0.29 = 0.71.
Binomial distribution…
If an experiment with success probability p is repeated n independent times,
the probability Pr(X = x) that exactly x successes occur is
$$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\; p^{x} (1-p)^{n-x}$$
Binomial distribution….
Suppose that in a certain population 52% of all recorded births are
males. If we randomly select 10 birth records, what is the probability
that:
A. Exactly 5 will be males? Here n = 10, x = 5 and p = 0.52, so
$$\Pr(X = 5) = \frac{10!}{5!\,5!}\,(0.52)^{5}(0.48)^{5} \approx 0.244$$
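The binomial formula is easy to evaluate directly. The sketch below uses `math.comb` from the Python standard library for the n!/(x!(n−x)!) term and applies it to the birth-records example with n = 10, x = 5 and p = 0.52; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Pr(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Exactly 5 males among 10 randomly selected birth records, p = 0.52
p_five_males = binomial_pmf(5, 10, 0.52)
print(round(p_five_males, 4))

# Sanity check: the probabilities over x = 0..n add up to 1
total_probability = sum(binomial_pmf(x, 10, 0.52) for x in range(11))
```

The same function answers follow-up questions by summing terms, for example "at most 2 males" as the sum over x = 0, 1, 2.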
2. Normal Distribution
The normal distribution (ND) is the most important probability distribution in statistics.
Properties of the Normal Distribution
1. It is symmetrical about its mean, μ.
2. The mean, the median and the mode are equal, and it is uni-modal.
3. The total area under the curve above the x-axis is 1 square unit.
5. As the value of σ increases, the curve becomes more and more flat.
Standard Normal Distribution
It is a normal distribution that has a mean equal to 0 and a
SD equal to 1, and is denoted by N(0, 1).
The main idea is to standardize the given data by
using Z-scores.
These Z-scores can then be used to find the area (and thus
the probability) under the normal curve.
The standard normal distribution has mean 0 and variance 1
Z - Transformation
If a random variable X is normally distributed with mean μ and standard
deviation σ, then we can transform it to a SND with the help of the
Z-transformation:
$$Z = \frac{X - \mu}{\sigma}$$
Z represents the Z-score for a given x value.
It tells us how many SDs the value x lies from the mean.
2. Find the z value in tenths in the column at left margin and locate its row.
Find the hundredths place in the appropriate column.
3. Read the value of the area (P) from the body of the table where the row and
column intersect.
Values of P are in the form of a decimal point and four places.
Some Useful Tips
Only a single curve for which μ = 0 and σ = 1 is tabulated.
a) What is the probability that z < -1.96?
(4) The answer is the area to the left of the line: P(z < -1.96) = 0.0250
b) What is the probability that -1.96 < z < 1.96?
c) What is the probability that z > 1.96?
The answer is the area to the right of the line, found by subtracting the table value
from 1.0000: P(z > 1.96) = 1.0000 - 0.9750 = 0.0250
Formula
P(x < Z < y) = P(Z < y) - P(Z < x)
P(Z > x) = 1 - P(Z < x)
P(Z < x) = 1 - P(Z > x)
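The three table look-ups for z = ±1.96 can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution, N(0, 1)

p_left = Z.cdf(-1.96)                   # a) P(z < -1.96): area to the left
p_between = Z.cdf(1.96) - Z.cdf(-1.96)  # b) P(-1.96 < z < 1.96)
p_right = 1 - Z.cdf(1.96)               # c) P(z > 1.96): area to the right
print(round(p_left, 4), round(p_between, 4), round(p_right, 4))
```

By symmetry the left and right tail areas are equal, and the three areas together account for the whole area under the curve.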
Exercise
1. Compute P(-1 ≤ Z ≤ 1.5)
Answer: P(Z ≤ 1.5) - P(Z ≤ -1) = 0.9332 - 0.1587 = 0.7745
Example on z-transformation
The diastolic blood pressures (DBP) of males 35-44 years of age are normally
distributed with µ = 80 mm Hg and σ² = 144 (mm Hg)², so σ = 12 mm Hg.
Individuals with DBP above 95 mm Hg are considered to be hypertensive.
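a. What proportion of this population is hypertensive?
Z = (95 - 80)/12 = 1.25, and P(Z > 1.25) = 0.1056.
Approximately 10.6% of this population would be classified as
hypertensive.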
b. What is the probability that a randomly selected male has a
DBP above 110 mm Hg?
Z = (110 - 80)/12 = 2.50
P(DBP > 110) = P(Z > 2.50) = 0.0062
c. What is the probability that a randomly selected male has a DBP
below 60 mm Hg?
Z = (60 - 80)/12 = -1.67
P(DBP < 60) = P(Z < -1.67) = 0.0475
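The blood-pressure example follows one pattern throughout: convert the cut-off to a Z-score, then read off a tail area. The sketch below reproduces parts a to c, with `statistics.NormalDist` assumed as a stand-in for the Z table; `z_score` and the other names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, variance = 80, 144
sigma = sqrt(variance)  # standard deviation: 12 mm Hg
dbp = NormalDist(mu=mu, sigma=sigma)


def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


p_above_95 = 1 - dbp.cdf(95)    # a. hypertensive: DBP above 95 mm Hg
p_above_110 = 1 - dbp.cdf(110)  # b. DBP above 110 mm Hg
p_below_60 = dbp.cdf(60)        # c. DBP below 60 mm Hg

for cutoff, p in [(95, p_above_95), (110, p_above_110), (60, p_below_60)]:
    print(cutoff, round(z_score(cutoff), 2), round(p, 4))
```

The computed values can differ from table values in the third or fourth decimal place, because the table approach rounds Z to two decimals first (for example, -1.67 in part c).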
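The normal distribution depends on the two parameters μ and σ.
μ determines the location of the curve.
But σ determines the scale of the curve, i.e. how flat or peaked it is.
[Figure: normal curves with different means (μ1 < μ2 < μ3) and different standard deviations (σ1, σ2, σ3)]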
Student’s t Distribution
The t distribution, a family of continuous probability distributions,
was introduced by W. S. Gosset in 1908.
He used the pseudonym Student to avoid getting fired for doing
statistics on the job!!!
The t distribution is flatter/broader than the standard normal, N(0, 1).
This means:
The variability of t is greater than that of a Z that is N(0, 1).
Thus, there is more area in the tails and less at the center.
Student’s t Distribution…….
The t distribution has a (slightly) different shape for each possible
sample size.
Student’s t Table
Thank You for Being
Patient Till the End!!!