Statistics and Probability Module
Week # ____1___
I. INTRODUCTION
Basketball players are accepted or drafted based on their performance in the game and other relevant
characteristics. If you were the manager of a basketball team, how would you answer the following questions?
1. What information should be obtained to select the player your team needs?
2. How do you count or measure the information needed for making decisions?
To answer these questions, we need to know certain basic concepts.
Element is the source of relevant information or data, i.e., an individual, entity, or population unit.
Variable is a characteristic of an element that can take on different values.
Random Variable is a variable being measured to produce numerical observations associated with the random
outcomes of a chance experiment.
Observations are the numerical values obtained by measuring the variable.
To answer the focus question, let us apply and illustrate the concepts to this table.
Elements   Random Variables
Players    Points     No. of      No. of      Playing time   Field goal   Height   Weight
           per game   rebounds    assists     per game       (%)          (in m)   (in lbs)
                      per game    per game    (in mins)
A          5          2           4           4.65           85           1.83     165
B          10         3           4           5.9            80           1.88     175
C          18         5           6           6.7            75.7         1.96     195
D          22         7           8           8              68.4         2.06     210
E          20         4           10          7.5            50           1.93     205
F          11         3           15          6              45.3         1.83     160
G          4          2           18          5              38.9         1.80     158.2
In this table, “Players” are the elements, while “Points per game”, “No. of rebounds per game”,
etc. are the random variables. Of these, the first three are discrete random variables, while the last four are
continuous random variables.
The variable “Points per game” is a discrete random variable since each observation is a whole
number, here from 4 to 22. Playing time per game is measured in minutes and can take any value in an
interval, here from 4.65 minutes to 8 minutes; this is a continuous random variable.
V. PRACTICE
Let us consider another example. In each of the experiments to be performed, determine the possible
observations that can be made, and classify the variable according to type.
In rolling a die, there are six possible observations, corresponding to the numbers 1 to 6. When two dice are
rolled and the random variable is the sum of the numbers on the two dice, the possible observations come from
the combinations of numbers on the two dice, such as 1 and 1, 1 and 2, or 6 and 6. Therefore, the sum is any
whole number from 2 to 12, written mathematically as 2 ≤ x ≤ 12. The variable is then considered discrete.
For free-falling objects, the acceleration, which is computed from the distance traveled by the object over the
square of the time expressed in seconds, may be any value greater than or equal to 0, written mathematically as
x ≥ 0, and is thus considered continuous.
VI. ENRICHMENT
Many of the improvements the world enjoys today are attributed to an understanding of the nature of the
information or data that arise in an experimental event. These data are the characteristics of a certain variable,
randomly observed as an outcome of a chance experiment.
Let us apply and illustrate the concepts we have learned in the table below.
ALL RIGHTS RESERVED: IMMACULADA CONCEPCION COLLEGE (STATISTICS AND PROBABILITY) PAGE 2 OF 48
Immaculada Concepcion College
Of Soldier’s Hills Caloocan City, Inc.
Soldier’s Hills III Subd. Brgy. 180, Tala, North Caloocan City
VII. EVALUATION
B. Give an example of a discrete and a continuous variable that would be of interest to each of the following:
Discrete Variable Continuous Variable
1. Biologist _____________________________ _____________________________
2. Accountant _____________________________ _____________________________
3. Economist _____________________________ _____________________________
4. Engineer _____________________________ _____________________________
5. Chef _____________________________ _____________________________
6. Computer game developer _____________________________ _____________________________
C. Scholars of physical science devote much of their time to performing experiments. They are interested in
verifying theories in areas such as physics, astronomy, geology, and chemistry based on the data
resulting from experiments. The following variables have been gathered under various conditions.
Which are discrete and which are continuous variables?
Week # _____2____
I. INTRODUCTION
The discovery of patterns in the likelihood of the occurrence of outcomes (probability distributions) paved the
way to forecasting and estimating significant results of related variables. This lesson sets out to build an
understanding of the concepts and to explore their applications.
1. What is the chance that when a die is rolled, the number 2 will appear? The number 5?
2. How can we show graphically the probability of the occurrence of an event?
Recall that since the data obtained by rolling a die are whole numbers from 1 to 6, the variable is
considered a discrete random variable. Let us consider how to describe a discrete random variable.
Since the die is rolled 20 times, the total number of observations in the experiment is N = 20.
From the table, the number of times the possible outcome “2” (x = 2) has occurred, written [(x):n], is 7, or [(2):7].
The chance that a “2” will appear when a die is rolled is the quotient of the number of occurrences
[(x):n] divided by the total number of observations N. Thus, we get 7/20. This is also known as
the probability of occurrence.
The probability mass function for this random variable is given by
$P(x) = \frac{[(x):n]}{N}$
We extend this and say that the probability that a “5” will appear when a die is rolled is 2/20.
Let us come up with the following table for the die rolling experiment.
Outcomes (x)   No. of Occurrences [(x):n]   [(x):n]/N   P(x)
1              3                            3/20        0.15
2              7                            7/20        0.35
3              4                            4/20        0.20
4              3                            3/20        0.15
5              2                            2/20        0.10
6              1                            1/20        0.05
               N = 20                                   ∑P(x) = 1.00
The following is the histogram showing the probability distribution of the die rolling experiment.
From the above, the following observations and analysis can be made.
1. In a discrete probability distribution, the probability values for all its possible outcomes are greater than
or equal to zero and at most one [0 ≤ P(x) ≤ 1] (1st descriptive condition).
2. The sum of the probability values associated with the possible outcomes is equal to one [∑P(x) = 1]
(2nd descriptive condition).
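The computation of P(x) and the two descriptive conditions can be checked with a short script. This is only a sketch: the names `counts` and `probability_distribution` are made up for illustration, and the counts are the ones from the die-rolling table above.

```python
from fractions import Fraction

# Observed counts from the die-rolling table: outcome -> number of occurrences
counts = {1: 3, 2: 7, 3: 4, 4: 3, 5: 2, 6: 1}


def probability_distribution(counts):
    """Return {outcome: P(outcome)} using P(x) = [(x):n] / N."""
    total = sum(counts.values())  # N, the total number of observations
    return {x: Fraction(n, total) for x, n in counts.items()}


pmf = probability_distribution(counts)

# 1st condition: every probability lies between 0 and 1
condition_1 = all(0 <= p <= 1 for p in pmf.values())
# 2nd condition: the probabilities add up to exactly 1
condition_2 = sum(pmf.values()) == 1

for x, p in pmf.items():
    print(x, p, float(p))
print(condition_1, condition_2)
```

Exact fractions are used here so that the sum of the probabilities can be compared with 1 without rounding error.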
V. PRACTICE
A common selection criterion considered by any sports team is the number of points scored by the prospective
player in every game played. A coaching staff obtained the following data on two players.
Points per game   No. of Games Played
                  Player A   Player B
0                 2          0
1                 3          16
2                 5          14
3                 10         12
4                 15         8
5                 4          2
6                 8          3
7                 5          1
Total             60         60
1. What kind of information or data should be obtained to select the player your team needs?
2. What can be done statistically to get the best player?
3. How can we describe completely the data given?
4. How can we visually represent this distribution with a graph?
5. Between Player A and Player B, who is the better player?
VI. ENRICHMENT
The histograms below give a better picture of the performance of the two players based on the probability
of scoring a given number of points per game.
[Histogram: Points of Player A. Horizontal axis: 0 to 7; vertical axis: 0 to 0.3]
[Histogram: Points of Player B. Horizontal axis: 0 to 7; vertical axis: 0 to 0.3]
Activity:
Compute the probabilities for each random variable x. Draw its histogram.
A. Given the variable x and the frequency of its occurrence.
X ƒ P(x)
5 3
10 8
15 26
20 10
25 3
VII. EVALUATION
x    [(x):n]/N    P(x)
100 0.38
250 0.30
380 0.17
420 0.10
510 0.05
X ƒ P(x)
1-5 0.10
6-10
11-15 0.25
16-20 0.50
3. A real estate broker needs to advertise 2 townhouses, 2 single detached homes, and 2 duplexes. However, the
broker decides to choose at random only one of the six properties for an open house on a certain weekend. Let
the random variable x take on the value:
a. if a townhouse is chosen,
b. if a single detached is chosen,
c. if a duplex is chosen.
5. In a batch of circuit boards, there is 1 board that needs to be returned to the factory, 2 boards that need
repair but do not need to be sent back to the factory, and 5 boards that are in good working condition.
Answer the following:
a. What is the probability that a circuit board selected at random needs to be returned to the factory?
b. What is the probability that a circuit board is in good working condition?
c. What is the probability that a circuit board needs repair?
d. Construct a probability distribution table.
Week # ___3___
I. INTRODUCTION
The discovery of patterns in the likelihood of the occurrence of outcomes (probability distributions) paved the
way to forecasting and estimating significant results of related variables. This lesson sets out to build an
understanding of the concepts and to explore their applications.
1. What is the variance of the results if a number is drawn from a jar containing numbers 2, 3, 4, 5, and 6?
2. What is the standard deviation of the results if a number is picked from a jar containing two 2s, three 4s, and
five 6s?
3. By investing in a particular stock, Florence can make $40 in a month with a probability of 0.2 or take a loss of
$10 with a probability of 0.8. What are the variance and standard deviation?
4. A box contains balls numbered 1 through 5. You are to draw a ball from the box and you will be paid 12
chips if the number is even. However, you are going to pay 7 chips if the number is odd. What is the standard
deviation of the possible results?
5. Your father said that if in one grading period your grade in math is 90 and above, he will add 50Php to your daily
allowance, 20Php if your grade is 80-89, but decrease it by 10Php if your grade is 79 and below. If the
probability of getting 90 and above is 12% while you have a 45% chance of getting 80-89, what is the standard deviation
of your possible allowance per quarter?
Suppose four tiles numbered 1, 2, 3, and 4 are in a jar. A tile is picked and returned to the jar 15 times.
The results are as follows:
Tile   Number of times picked
1      2
2      4
3      8
4      1
From the results, the average number per pick would be computed by:
$\bar{x} = \frac{1(2) + 2(4) + 3(8) + 4(1)}{15} = \frac{38}{15} \approx 2.53$
This means that for every tile picked from the jar, the number on the tile is on average 2.53. This may not be a
possible result of any individual pick, but it is a very important measure in statistics.
If we rewrite the calculation, separating each tile number from its probability based on the results, the
computation would be:
$\bar{x} = 1\left(\frac{2}{15}\right) + 2\left(\frac{4}{15}\right) + 3\left(\frac{8}{15}\right) + 4\left(\frac{1}{15}\right) = 2.53$
The value 2.53 in the example above is called expected value, mathematical expectation, or mean of
the discrete random variable defined.
Definition
If X is a discrete random variable with values $x_1, x_2, x_3, \dots, x_n$ with probabilities $f(x_1), f(x_2), f(x_3), \dots, f(x_n)$,
respectively, then the mean or expected value of X, denoted by E(X), is:
$E(X) = x_1 f(x_1) + x_2 f(x_2) + \dots + x_n f(x_n) = \sum_{i=1}^{n} x_i f(x_i)$
Answer:
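Let Y be the random variable defined by the outcomes. Since the die is fair, each of the six outcomes has a
probability of 1/6; thus the expected value per roll is:
$E(Y) = 1\left(\frac{1}{6}\right) + 2\left(\frac{1}{6}\right) + 3\left(\frac{1}{6}\right) + 4\left(\frac{1}{6}\right) + 5\left(\frac{1}{6}\right) + 6\left(\frac{1}{6}\right) = \frac{21}{6}$
E(Y) = 3.5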
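The expected value of a discrete random variable is the sum of each value times its probability. As a minimal sketch, the helper below (the name `expected_value` is made up for illustration) applies this to the fair die and to the tile experiment described earlier in this lesson.

```python
from fractions import Fraction


def expected_value(distribution):
    """E(X): the sum of x * f(x) over all values x of the random variable."""
    return sum(x * p for x, p in distribution.items())


# Fair die: each face from 1 to 6 has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Tile experiment: relative frequencies from the 15 picks
tiles = {1: Fraction(2, 15), 2: Fraction(4, 15), 3: Fraction(8, 15), 4: Fraction(1, 15)}

print(expected_value(die))                      # 7/2, that is 3.5
print(round(float(expected_value(tiles)), 2))   # 2.53
```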
From the example of picking a tile from a jar containing tiles numbered 1, 2, 3, and 4, the results after 15
picks are:
Tile   Number of times picked
1      2
2      4
3      8
4      1
The mean or expected value E(X) is 2.53.
To better describe these results, the variance of the random variable defined here must also be known.
The variance of any random variable X, denoted by σ² or V(X), can be computed by getting the
sum of the products of the squared deviations from the mean of X and their corresponding probabilities (formula 1):
$V(X) = \sum [x - E(X)]^2 f(x)$
This process is very similar to the way we solve for the variance of any data set (especially if weighted or
grouped). The probabilities of each value of the random variable are used as weights.
By mathematical manipulation, and using the property that the sum of all the probabilities ƒ(x) of a
random variable is 1, it follows that (formula 2):
$V(X) = E(X^2) - [E(X)]^2 = \sum x^2 f(x) - [E(X)]^2$
Note that formula 2 is usually used as the computational formula because formula 1 can sometimes
be more difficult to use, especially if the mean E(X) has a decimal part.
Aside from the variance, the standard deviation is usually used as a measure of variability. As in any set
of data, the standard deviation is the positive square root of the variance. Thus, the standard deviation of a
random variable, say X, is $Sd(X) = \sqrt{V(X)}$
Example 1:
From the experiment above, we can use x to represent the tile number and change the “number of times
picked” to ƒ(x) by dividing each of its values by 15; thus the table becomes:
x   ƒ(x)
1   2/15
2   4/15
3   8/15
4   1/15
Formula 1 for the variance can be applied. However, more columns can be added to the table to make the
calculation easier.
x   ƒ(x)   xƒ(x)   x − E(X)   [x − E(X)]²   [x − E(X)]²ƒ(x)
1   2/15   2/15    −1.53      2.341         0.312
2   4/15   8/15    −0.53      0.281         0.075
3   8/15   24/15   0.47       0.221         0.118
4   1/15   4/15    1.47       2.161         0.144
E(X) = 2.53   V(X) = 0.65
Using formula 2 with E(X²) = 1²(2/15) + 2²(4/15) + 3²(8/15) + 4²(1/15) = 106/15 ≈ 7.07 and E(X) rounded to 2.53:
V(X) = 7.07 − (2.53)² = 0.67
Sd(X) = 0.82
Notice that the results have small discrepancies. These come from rounding off values
throughout the computations.
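To see that the discrepancy comes only from rounding, the two variance formulas can be evaluated with exact fractions. The sketch below uses illustrative variable names (`f`, `var_formula_1`, `var_formula_2`) and the tile distribution from this example; it is not part of the module's own solution.

```python
from fractions import Fraction
from math import sqrt

# Probability distribution of the tile number (from the table above)
f = {1: Fraction(2, 15), 2: Fraction(4, 15), 3: Fraction(8, 15), 4: Fraction(1, 15)}

mean = sum(x * p for x, p in f.items())  # E(X)

# Formula 1: sum of squared deviations from the mean, weighted by probability
var_formula_1 = sum((x - mean) ** 2 * p for x, p in f.items())

# Formula 2: E(X^2) minus the square of E(X)
var_formula_2 = sum(x ** 2 * p for x, p in f.items()) - mean ** 2

sd = sqrt(float(var_formula_1))  # standard deviation

print(mean, var_formula_1, var_formula_2)         # 38/15 146/225 146/225
print(round(float(var_formula_1), 2), round(sd, 2))  # 0.65 0.81
```

Without intermediate rounding, both formulas give the same variance, 146/225 (about 0.65).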
V. PRACTICE
A. Xander is paid P20 whenever the results of tossing two coins are both heads but pays P10 whenever the
results are not both heads. What is his expected gain per toss?
Let X be the random variable defined. There are 4 outcomes in tossing two coins, in which only 1 is a HH. The
other results are HT, TH, and TT. The probability of both heads is ¼ while the probability of not both heads is ¾,
therefore, Xander’s expected gain per toss is:
E(X) = ?????
B. The probability distribution below shows the number of typing errors (x) and the probability ƒ(x) of committing
these errors whenever clerks type-in a document.
Compute the variance and standard deviation.
Y 0 1 2 3 4 5
VI. ENRICHMENT
1. Find the expected number of monthly absences of Jemar based on his previous records of absences as
presented in the probability distribution below.
0 25%
1 30%
2 30%
3 15%
2. Find the variance of the number of monthly absences of Jemar based on his previous records of absences as
presented in the probability distribution below.
0 25%
1 30%
2 30%
3 15%
VII. EVALUATION
1.
X 0 1 2 3 4
P(x) 1/5 1/5 1/5 1/5 1/5
2.
Y 1 2 3
P(y) 1/2 1/6 1/3
3.
Z 3 5 7 9
P(z) 0.6 0.1 0.2 0.1
4.
R 10 20
P(r) 3/7 4/7
5.
S 3 4 12 20
P(s) 0.1 0.5 0.2 0.2
6.
S 4 6 8 10
P(s) 1/12 3/12 7/12 1/12
7.
Y 1 2 3
P(y) 20% 30% 50%
8.
S 3 6 9 10 15
P(s) 0.25 0.30 0.05 0.20 0.20
1. The random variable X, representing the number of nuts in a chocolate bar has the following probability
distribution. Compute the mean.
X 0 1 2 3 4
P(x) 1/10 3/10 3/10 2/10 1/10
2. Find the mean of the random variable Y representing the number of red m&m’s chocolates per 160-gram
pack that has the following probability distribution.
S 3 4 12 20
P(s) 0.1 0.5 0.2 0.2
3. Find the mean of the random variable Z representing the number of male teachers per elementary school.
Z 3 4 5 6 7
P(z) 40% 32% 11% 9% 8%
4. Find the expected number of times a baby wakes his/her mother after midnight, given the following
probability distribution.
X 1 2 3 4 5
P(x) 0.12 0.25 0.45 0.1 0.08
Week # ____4______
The area between the curve and the horizontal axis is exactly equal to 1. Half of the area is above the mean
and the remaining half is below the mean.
There are many normal distributions. A normal distribution is determined by two parameters: the mean μ and the
standard deviation σ. If the mean μ is 0 and the standard deviation σ is 1, then the normal distribution is a
standard normal distribution, and the areas under this curve can be found using the Areas under the Normal Curve
Table.
However, the mean μ is not always equal to 0 and the standard deviation σ is not always equal to 1. In the normal
curve below, μ = 40 and σ = 12.
Suppose two curves are sketched above the same horizontal axis and those normal curves have the same
standard deviations but different means.
Notice that if the mean μ is changed from 55 to 39, the curve is moved to the left but its shape remains the same.
Suppose the normal curves have the same means but different standard deviations.
Notice that the shape of the normal curve with σ = 20 is flatter than that with σ = 10.
Suppose the curves have different means and different standard deviations.
Notice that they are centered at different positions on the horizontal axis. The normal curve on the left is flatter
and spreads out further. This is because it has a larger standard deviation.
Areas under the normal curve can be found using the Areas under the Standard Normal Curve table. These areas
correspond to regions under the normal curve.
SOLUTION 1:
SOLUTION 2:
A1 = 0.4357
A2 = 0.4938
A = A2 − A1
A = 0.4938 − 0.4357
= 0.0581
Hence, the area between z = 1.52 and z = 2.5 is 0.0581.
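The table entries used in this solution can be reproduced numerically. The sketch below assumes that each table entry is the area between the mean (z = 0) and z, computed from the error function in Python's standard `math` module; the helper name `area_from_mean` is made up for illustration.

```python
from math import erf, sqrt


def area_from_mean(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * erf(abs(z) / sqrt(2))


# Round to four decimal places, as a printed table would
a1 = round(area_from_mean(1.52), 4)
a2 = round(area_from_mean(2.5), 4)
area_between = round(a2 - a1, 4)

print(a1, a2, area_between)  # 0.4357 0.4938 0.0581
```

Because the table values are rounded before subtracting, an unrounded computation may differ from the table-based answer in the fourth decimal place.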
➢ A positive z-score indicates that the score or observed value is above the mean.
➢ A negative z-score indicates that the score or observed value is below the mean.
EXAMPLE 1: The scores of the students in the midyear examination for Mathematics have a mean (μ) of 32 and
a standard deviation (σ) of 5. Find the z-scores corresponding to each of the following:
a) 37
b) 22
c) 33
d) 28
SOLUTIONS:
a) $z = \frac{x - \mu}{\sigma} = \frac{37 - 32}{5} = \frac{5}{5} = 1$
d) $z = \frac{x - \mu}{\sigma} = \frac{28 - 32}{5} = \frac{-4}{5} = -0.8$
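The same formula, z = (x − μ)/σ, can be applied to all four scores in Example 1 at once. This is a minimal sketch; the function name `z_score` and the dictionary `scores` are illustrative only.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


mu, sigma = 32, 5  # mean and standard deviation from Example 1
scores = {"a": 37, "b": 22, "c": 33, "d": 28}

for label, x in scores.items():
    print(label, z_score(x, mu, sigma))
```

Scores above 32 give positive z-scores and scores below 32 give negative ones, in line with the two remarks before the example.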
V. PRACTICE
I. Find the area under the normal curve in each of the following cases.
VI. ENRICHMENT
A. A light bulb manufacturer knows that the lifetime of its manufactured light bulbs is normally distributed
with a mean life of 2150 hours and a standard deviation of 75 hours.
1. What is the proportion of light bulbs with a lifetime exceeding 2000 hours?
2. If the standard deviation remains the same, find the necessary mean lifetime so that 98% of the light
bulbs will last more than 2000 hours.
3. If the mean life remains at 2150 hours, find the necessary standard deviation so that 98% of the light
bulbs will last more than 2000 hours.
B. In R&B Manufacturing Company, workers are able to produce an average of 250 units of its product per
person per day with a standard deviation of 25 units. In order to raise the productivity level, the management
announced that there will be an incentive pay for the top 20% producers.
1. If a worker is chosen at random, what is the probability that the worker can produce:
a. more than 270 units per day
b. less than 260 units per day
2. What is the minimum number of units that a worker should produce in order to qualify for the
incentive pay?
C. Copper rods are mass-produced at XYZ Factory. A customer ordered rods of length 45 cm on the
condition that the rods will be acceptable if their lengths lie within the limits 44.95 cm and 45.05 cm. On testing
the rods supplied to him, the customer finds that 5% are undersized and 10% are oversized. If the lengths of the
rods are normally distributed, find their mean length and standard deviation.
VII. EVALUATION
A. Given a normal distribution with a mean of 42 and standard deviation of 6, find the area BELOW.
(SHOW YOUR SOLUTION)
1. 36
2. 54
3. 38
4. 60
5. 58
B. Given a normal distribution with a mean of 125 and standard deviation of 15, find the area ABOVE.
(SHOW YOUR SOLUTION)
6. 128
7. 119
8. 158
9. 100
10. 120
C. Given a normal distribution with a mean of 24 and standard deviation of 4, find the area BETWEEN the
following:
11. 28 and 30
12. 12 and 38
13. 16 and 22
14. 19 and 31
15. 17 and 24
Week # ___5______
I. INTRODUCTION
Oftentimes, in our research or even in daily activities, we are concerned with a large group of people or
objects. It is of course difficult, or sometimes impossible, to deal with every member of this large group, known as
the population. In times like this, we have a remedy, which is selecting a portion known as a sample; this process is
called sampling.
Sampling
As noted above, selecting a portion of the population, known as a sample, is called sampling.
One of the best methods of sampling, which is usually used in research, is called random
sampling.
If the population we are concerned with is finite or small in number, say the 25 captive-bred Philippine Eagles
successfully produced by the Philippine Eagle Foundation (PEF) as of October 15, 2015, then we can easily
describe it. Every measurement or quantity that represents a general characteristic of this population, say that the
average height of these 25 captive-bred raptors is 2.5 meters, is called a parameter.
On the other hand, if we are dealing with a very large population and we have resorted to sampling, then every
measurement or quantity that describes a characteristic of the sample is called a sample statistic or simply
a statistic.
Suppose a jar contains the numbers 1, 3, and 5. If we take two numbers in succession with replacement,
then the possible 2-number samples are: (1,1), (3,3), (5,5), (1,3), (3,1), (1,5), (5,1), (3,5), and (5,3). The averages or
means of the pairs, in that order, are 1, 3, 5, 2, 2, 3, 3, 4, and 4. If we denote the sample mean as the random variable X̄, then
X̄ = {1, 2, 3, 4, 5}
As we can see, P(1) = 1/9, P(2) = 2/9, P(3) = 3/9 or 1/3, P(4) = 2/9, and P(5) = 1/9.
X̄      1     2     3     4     5
P(X̄)   1/9   2/9   3/9   2/9   1/9
The probability distribution above represents the means of the samples, that’s why the distribution is now called
Sampling Distribution of the Sample Means.
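The listing of all possible samples and their means can be automated. The following is a sketch under the same setup as the jar example (sampling with replacement, every ordered sample equally likely); the function name `sampling_distribution_of_mean` is made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution_of_mean(population, n):
    """Return {sample mean: probability} for ordered samples of size n drawn with replacement."""
    samples = list(product(population, repeat=n))          # all equally likely samples
    counts = Counter(Fraction(sum(s), n) for s in samples)  # how often each mean occurs
    return {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}


dist = sampling_distribution_of_mean([1, 3, 5], 2)
for m, p in dist.items():
    print(m, p)
```

For the jar containing 1, 3, and 5, this prints the means 1 through 5 with probabilities 1/9, 2/9, 1/3, 2/9, and 1/9. The same function can be called with `range(1, 7)` as the population to list the means of two dice.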
Example 1:
In order to test the effect of a new drug on humans, 20 patients were given a dose. After a minute, it was
found that the body temperature, on average, decreased by 2°C. Answer the following:
a) Are the 20 patients mentioned above a population or a sample?
Answer:
b) Since the measurement 2°C refers to the average decrease of the 20 patients (sample), it is therefore
considered a statistic.
Example 2
Construct the sampling distribution of the sample means when two dice are rolled.
Answer:
X̄      1      1.5    2      2.5    3      3.5    4      4.5    5      5.5    6
f(X̄)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
V. PRACTICE
Construct the sampling distribution of the sample means and answer the questions that follow:
A jar contains the numbers 1, 2, 3, and 4. Construct the sampling distribution of the sample means when two numbers
are taken from the jar with replacement.
1. What is the probability that the mean of the numbers is 2.5?
2. What is the probability that the mean of the numbers is less than 2?
3. What is the probability that the mean of the numbers is greater than 1.5?
4. What is the probability that the mean of the numbers is between 1.5 and 5?
Construct the histogram of the sampling distribution.
VI. ENRICHMENT
The totality of subjects (people, animals, or objects) under consideration is called the population. The portion
chosen from a population is called a sample, and the process of taking samples is called sampling.
Random sampling refers to the sampling technique in which each member of the population is given an equal
chance to be chosen as part of the sample. The lottery method, drawing lots, or the use of random numbers
can be used to accomplish random sampling.
The measurement or quantity that describes the population is called a parameter, while the measurement or
quantity that describes the sample is called a statistic.
VII. EVALUATION
Construct the sampling distribution of the sample means and answer the questions that follow:
1. What is the probability that his mean grade is lower than 83?
2. What is the probability that his mean grade is greater than 82.33?
4. What is the probability that his mean grade is between 82.33 and 83?
B. Three containers contain the numbers 0,1, and 2. Construct the sampling distribution of the sample mean
when a number is taken from each container.
2. What is the probability that the sample mean is greater than 0.67?
C. Determine if the given subject is a population or a sample, then describe the given quantity as a parameter or
a statistic:
2. 50 out of the 200 animals in the zoo were taken and checked for their weight. The variance of their weight is
12.5 kg.
50 animals: __________________________
Variance (12.5 kg): ___________________
3. The standard deviation of the life span of a species endemic to the Philippines is 2.3 years.
A species endemic to the Philippines: ____________________
Standard Deviation (2.3 years): _________________________
4. Based on a survey conducted among 1200 respondents, 1 out of 3 Filipinos can’t live without a cell phone.
5. Based on the US National Hospital Discharge Record in 2010, the average length of stay of patients in
US hospitals is 4.8 days.
Patients: ________________________________
Week # ______6____
I. INTRODUCTION
In probability distributions, it is important that we know the mean and variance or standard deviation of the
sampling distribution, specifically of the sample means. This lesson includes the discussion of the mean and
variance as well as the standard deviation of the sampling distribution of the sample means.
$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
THEOREM
If all possible random samples of size n are taken with replacement (independent) from a
population with mean µ and variance σ², then the mean (µx̄) and variance (σ²x̄)
of the sampling distribution of the sample mean are:
$\mu_{\bar{X}} = \mu$ (mean)
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$ (variance)
If the samples are taken without replacement from a finite population of size N, then:
$\mu_{\bar{X}} = \mu$ (mean)
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ (variance)
Note: The factor (N − n)/(N − 1) is called the correction factor for finite population. It will be close
to 1 and can be safely ignored when n is small compared to N.
Note: As we increase the sample size, the variance of the sample mean decreases.
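The formulas in the theorem can be collected into one small helper. This is a sketch only: the function name `sample_mean_parameters`, the optional argument `population_size`, and the numbers in the usage lines are hypothetical and do not come from the module. Passing `population_size` applies the correction factor for a finite population.

```python
from math import sqrt


def sample_mean_parameters(mu, variance, n, population_size=None):
    """Return (mean, variance, standard deviation) of the sampling distribution of the sample mean."""
    var_mean = variance / n  # sigma^2 / n
    if population_size is not None:
        # Correction factor (N - n) / (N - 1) for a finite population
        var_mean *= (population_size - n) / (population_size - 1)
    return mu, var_mean, sqrt(var_mean)


# Hypothetical values: mu = 50, variance = 36, sample size 9
print(sample_mean_parameters(50, 36, 9))                       # (50, 4.0, 2.0)
print(sample_mean_parameters(50, 36, 9, population_size=100))  # variance slightly below 4
```

With a finite population of 100, the correction factor is 91/99, so the variance of the sample mean is a little smaller than σ²/n.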
Example:
From our previous example, suppose a jar contains the numbers 1, 3, and 5.
X      1     3     5
f(x)   1/3   1/3   1/3
Since the distribution is uniform, that is, the observations have the same probabilities, the population mean (µ)
and population variance (σ²) can be easily computed as:
µ = (1 + 3 + 5)/3 = 3
σ² = [(1 − 3)² + (3 − 3)² + (5 − 3)²]/3 = 8/3
If we take two numbers in succession with replacement, then the possible 2-number samples are: (1,1), (3,3),
(5,5), (1,3), (3,1), (1,5), (5,1), (3,5), and (5,3). The averages or means of the pairs, in that order, are 1, 3, 5, 2, 2, 3, 3, 4,
and 4.
X̄      1     2     3     4     5
f(x̄)   1/9   2/9   3/9   2/9   1/9
The mean of the sampling distribution (using the formula for the mean of a random variable) is:
µx̄ = ∑ x̄ f(x̄)
= 1(1/9) + 2(2/9) + 3(3/9) + 4(2/9) + 5(1/9)
= 27/9
= 3, and µ = 3 as well
Therefore, µx̄ = µ
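The variance of the sampling distribution (using the formula for the variance of a random variable) is:
σ²x̄ = E(X̄²) − [E(X̄)]²
= 93/9 − 9
= 12/9 = 4/3
Since the variance of the population 1, 3, 5 is σ² = 8/3 and n = 2, we also have σ²/n = (8/3)/2 = 4/3.
Therefore, σ²x̄ = σ²/n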
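The two results of this example, µx̄ = µ and σ²x̄ = σ²/n, can be confirmed by listing every sample and computing the mean and variance of the sample means directly. The variable names in this sketch are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 3, 5]
n = 2  # sample size, drawn with replacement

# Population mean and variance
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Means of all equally likely ordered samples of size n
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

print(mu, sigma2, mu_xbar, var_xbar)  # 3 8/3 3 4/3
```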
V. PRACTICE
Determine the mean (µx̄), variance (σ²x̄), and standard deviation (σx̄) of each. (Show your solution.)
1. A random sample of size 4 is taken with replacement from a population with µ = 12 and σ² = 8.
µx̄ = _____ σ²x̄ = ______ σx̄ = ________
2. An independent random sample of size 9 is taken from a population with µ = 25.2 and σ² = 12.
µx̄ = ______ σ²x̄ = ______ σx̄ = __________
3. A random sample of size 25 is taken with replacement from a population with µ = 121.4 and σ² = 50.5.
µx̄ = _______ σ²x̄ = _______ σx̄ = __________
4. An independent random sample of size 100 is taken with replacement from a population with
µ = 72 and σ² = 25.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
5. A random sample of size 40 is taken with replacement from a population with µ = 82.4 and σ² = 60.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
6. A random sample of size 3 is taken with replacement from a population with µ = 8 and σ² = 2.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
7. A random sample of 20 independent observations is taken from a population with µ = 48 and σ² = 5.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
8. A random sample of size 30 is drawn with replacement from a population with µ = 48 and σ² = 6.5.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
9. A random sample of size 1600 is taken with replacement from a population with µ = 509.23
and σ² = 40.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
10. A random sample of size 120 is taken with replacement from a population with µ = 120 and σ² = 28.
µx̄ = ____ σ²x̄ = ______ σx̄ = __________
VI. ENRICHMENT
Compute the mean (µx̄), variance (σ²x̄), and standard deviation (σx̄) of the sampling distribution of the sample
mean taken from the following populations.
1. When 25 numbers are drawn with replacement from a jar containing 1, 3, 5, and 6.
2. When a sample of size 9 is taken with replacement from the population 1, 1, 2, 2, 2, 3, 3, 4.
3. When a sample of size 36 is taken with replacement from the population 7, 6, 6, 6, 5, 4, 3, 1, 1, and 1.
4. When an unbiased die is rolled 50 times.
5. When a biased die whose even numbers come up twice as often as the odd numbers is rolled 16 times.
6. A community has 1500 people with a mean age of 42 and a variance of 16. If you draw a random sample
of 30 people, what are the mean, variance, and standard error of the sampling distribution of their mean age?
7. What are the mean, variance, and standard error of the sample mean when 60 students are taken from a
population of 2000 with a mean score of 75 and a standard deviation of 5?
8. The mean sugar level of 1000 patients in XYZ Hospital is 150 mg/dL with a variance of 64. If 50 of them are
taken as samples, what are the mean, variance, and standard error of the sampling distribution?
9. The mean IQ of 1000 students of AJ University is 98 with a standard deviation of 4. If 100 of them are taken
as samples, what are the mean, variance, and standard error of the sampling distribution?
10. The mean monthly salary of the 1440 employees of Ragos Electrical Company is P20,000 with a standard
deviation of P800. If 40 of them are randomly selected, what are the mean, variance, and standard error of
the sampling distribution?
VII. EVALUATION
Compute the mean (µx̄), variance (σ²x̄), and standard deviation (σx̄) of each sampling distribution.
X 1 2 3 4 5
X 0 1 2 3 4
Y 1 2 3
z 3 5 7 9
r 10 20
f(r) 3/7 4/7
s 3 4 12 20
t 5 10 20
V -1 0 1
m -5 -2 2 4
Week # ___7______
I. INTRODUCTION
Sampling distributions are important in understanding statistical inference. Statistical inference
techniques are based on the concept of the sampling distribution of a statistic. Probability distributions allow
us to answer questions about sampling, and they provide the foundation for statistical inference procedures.
In this lesson, you will learn the concept of the sampling distribution and its application.
Let X1, X2, X3, ..., X9 be independent normal random variables with mean μX = 3 and standard
deviation σX = 2. Let X̄ be the mean of these 9 random variables, namely
X̄ = (X1 + X2 + ⋯ + X9)/9
(a) What is the shape of the distribution of 𝑋̅?
(d) Can we determine P(𝑋̅ < 2.5) using a z-score? You do not need to compute this probability,
just answer yes or no and briefly explain why or why not.
Theorem
If random samples of size n are taken from a population with mean μ and standard deviation σ, then
the sampling distribution of the sample mean X̄ approaches a normal distribution with mean μx̄ = μ and
standard deviation σx̄ = σ/√n, and thus can be standardized as
z = (x̄ − μ)/(σ/√n)
As n increases, the sampling distribution of the sample mean gets nearer and nearer to the normal
distribution.
Note:
➢ If σ is unknown, compute the sample standard deviation s and use it in place of σ in the formula,
provided that n ≥ 30.
➢ Even if n < 30, the formula can still be used provided that the population is approximately normal
and the population standard deviation σ is known.
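The theorem can be seen at work in a small simulation. The sketch below is an illustration only: the population (a fair die), the number of trials, and the function name are our own assumptions. It draws many samples of size n and compares the mean and standard deviation of the resulting sample means with μ and σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

faces = [1, 2, 3, 4, 5, 6]          # a fair die: a non-normal population
mu = statistics.mean(faces)          # population mean, 3.5
sigma = statistics.pstdev(faces)     # population standard deviation

def simulate_sample_means(n, trials=5000):
    """Draw `trials` samples of size n with replacement; return their means."""
    return [sum(random.choices(faces, k=n)) / n for _ in range(trials)]

for n in (2, 10, 40):
    means = simulate_sample_means(n)
    print(n,
          round(statistics.mean(means), 3),    # should be close to mu
          round(statistics.pstdev(means), 3),  # should be close to sigma / sqrt(n)
          round(sigma / n ** 0.5, 3))
```

As n grows, the spread of the simulated sample means shrinks in step with σ/√n, while their average stays near μ.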
Example 1.
The height of pupils in Luna Elementary School has a mean of 121 cm with a standard deviation of 5 cm. If 50 of
them are taken as samples, what is the probability that their mean height is less than 120 cm?
Answer:
From the problem, μ = 121, σ = 5, x̄ = 120, and n = 50. Using the formula:
z = (x̄ − μ)/(σ/√n) = (120 − 121)/(5/√50) = −1/0.707 = −1.41
Using the z-table, the area below z = −1.41 is 0.0793. Thus, the probability that the mean height of the sample is
less than 120 cm is 0.0793 or 7.93%.
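The same computation can be sketched in code. The function name below is our own; `NormalDist` comes from Python's standard `statistics` module and plays the role of the z-table. Because the code does not round z to two decimal places before looking up the area, its probability (about 0.079) differs slightly from the table value 0.0793.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean: z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))

z = z_for_sample_mean(x_bar=120, mu=121, sigma=5, n=50)
p = NormalDist().cdf(z)   # area under the standard normal curve below z

print(round(z, 2), round(p, 4))
```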
Theorem
If x̄ and s are the mean and standard deviation, respectively, of a random sample of size n taken from a
normally distributed population with mean μ, then x̄ can be standardized as:
t = (x̄ − μ)/(s/√n)
a value of a random variable T following the t-distribution.
Note:
➢ The formula is used when n < 30 and the population standard deviation is unknown. The sample standard
deviation is computed as:
s = √[ Σ(x − x̄)² / (n − 1) ]
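A short sketch of these two formulas follows. The five measurements and the claimed mean are hypothetical values chosen only for illustration; they do not come from this module. `statistics.stdev` divides by n − 1, which matches the formula for s.

```python
from math import sqrt
from statistics import mean, stdev

sample = [12, 15, 11, 14, 13]   # hypothetical measurements (not from the module)
mu = 12                          # hypothetical population mean

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                # divides by n - 1, as in the formula for s

t = (x_bar - mu) / (s / sqrt(n))
df = n - 1                       # degrees of freedom

print(x_bar, round(s, 3), round(t, 3), df)
```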
The t-distribution
The t-distribution, like the z-distribution (standard normal distribution), is bell-shaped and symmetric about the
y-axis. Compared to the z-distribution, the t-distribution is more variable, since its value depends on the
fluctuations of both the mean and the variance from sample to sample. Notice in the formula of s or s² the divisor
n − 1 instead of n; this divisor is called the degrees of freedom, df. This means that the t-distribution differs
from one sample size to another. Since it is not practical to tabulate the whole t-distribution for every df from
1 to 28, the t-table lists only the values of t for some special areas such as 0.005, 0.001, 0.025, etc. These
special areas are denoted by α. If α = 0.05, then it refers to the area 0.05 or 5% on the right tail of the t-curve
for any df. The notation tα,df is a way of conveniently writing the t-value at a given α and df. The notation
tα=0.05, df=20 means the t-value corresponding to α = 0.05 and df = 20. To look for this value in the t-table, first
locate α on the top row, then the df on the leftmost column. The intersection of α = 0.05 and df = 20 is 1.725.
Thus, t = 1.725.
Example:
What is the t-value when n = 22 at α = 0.01? Since df = 22 − 1 = 21, t = 2.518.
V. PRACTICE
VI. ENRICHMENT
A. Assume that the heights of adult women are normally distributed with a mean of 63 in and a standard
deviation of 2.5 in.
1. If 36 women are randomly selected, what is the probability that their mean height is less than 62 in?
2. If 70 women are taken as samples, what is the probability that their mean height is greater than 62.5 in?
3. If 100 women are randomly selected, what is the probability that their mean height is between 63.2 in
and 63.8 in?
B. Replacement times of TV sets are reported to follow a normal distribution having a mean of 8.5 years with a
standard deviation of 1.2 years.
4. If 30 TV sets are selected at random, what is the probability that the mean replacement time is less than
8 years?
5. If 20 TV sets are taken as samples, what is the probability that the mean replacement time is longer than
7.8 years?
6. If 25 TV sets are selected, what is the probability that the mean replacement time is between 8.4 years
and 9 years?
C. American teenage girls are reported to spend an average of $31 on shopping per month, with a standard
deviation of $8. If these expenses are normally distributed, answer the following.
7. If 85 American teenage girls are randomly selected, what is the probability that their mean expense on
shopping per month is less than $30?
8. If 60 American teenage girls are selected, what is the probability that their mean expense on shopping
is greater than $32.5?
9. If 90 American teenage girls are randomly selected, what is the probability that their mean expense is
between $30.5 and $32?
10. If 5 of these teenage girls are asked about their expenses on shopping per month, what is the probability
that their mean expense is between $28.7 and $35.8?
VII. EVALUATION
A. Compute the z-value for each; assume that each population is normally distributed.
1. µ = 100, σ = 2, x̄ = 100.5, and n = 80
2. µ = 62, σ = 6, x̄ = 59, and n = 30
3. µ = 140, σ = 14, x̄ = 145, and n = 12
4. µ = 46, σ² = 9, x̄ = 45.5, and n = 20
5. µ = 245, σ² = 20, x̄ = 248, and n = 25
6. µ = 45, s = 6, x̄ = 46.5, and n = 55
7. µ = 12.5, s = 5, x̄ = 11.8, and n = 50
8. µ = 156, s = 18.5, x̄ = 159, and n = 40
9. µ = 87, s² = 30, x̄ = 86.2, and n = 33
10. µ = 75, s² = 18, x̄ = 73.2, and n = 48
Week # ____8_______
I. INTRODUCTION
Since a population is usually large, describing it (determining its parameters) is very difficult. This is one of
the reasons why there is Statistics. In this lesson, we will discuss estimating the population parameter.
Point estimate is a single value that estimates the population parameter, such as x̄ as an estimate for µ or s as
an estimate for σ.
Interval estimate, sometimes called confidence interval, is a range or interval (with lower and upper limits)
used to estimate the population parameter. It is usually in the form a < θ < b, which tells that the estimated
parameter (θ) is between two values (a and b) at a certain level of confidence.
When the population variance or standard deviation is known, or when n ≥ 30 (by the central limit theorem), the
formula below can be used as an interval estimate of the population mean (µ) at a certain degree of confidence
(1 − α):
x̄ − zα/2 (σ/√n) < µ < x̄ + zα/2 (σ/√n)
Where x̄ = sample mean,
σ = population standard deviation,
n = sample size,
zα/2 = z-value that leaves an area of α/2 in the right tail.
The values of zα/2 are listed below with the usual confidence levels used in estimating the population mean.
Formula: n = (zα/2 · σ / E)², where E = margin of error
σ = population standard deviation
Example:
Compute the margin of error of the 95% confidence interval estimate of µ when σ = 10 and n = 25.
Answer:
From the table, for the 95% level, zα/2 = 1.96; thus, the margin of error is:
zα/2 (σ/√n) = 1.96 (10/√25) = 1.96 (2) = 3.92
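The margin of error in this example, and the sample size formula given before it, can be sketched as follows. The function names are our own, and `NormalDist().inv_cdf` stands in for the z-table. The target margin of error E = 2 in the last line is a hypothetical value used only to show the sample size formula.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(confidence):
    """z-value that leaves an area of alpha/2 in the right tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def margin_of_error(confidence, sigma, n):
    """E = z * sigma / sqrt(n)."""
    return z_critical(confidence) * sigma / sqrt(n)

def sample_size(confidence, sigma, E):
    """Smallest whole n whose margin of error does not exceed E."""
    return ceil((z_critical(confidence) * sigma / E) ** 2)

print(round(z_critical(0.95), 2))               # critical value for 95%
print(round(margin_of_error(0.95, 10, 25), 2))  # the example: sigma = 10, n = 25
print(sample_size(0.95, 10, 2))                 # hypothetical target E = 2
```

The sample size is rounded up because a fractional observation cannot be collected and rounding down would give a margin of error larger than E.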
We have established earlier that when n < 30 (small sample size), the central limit theorem cannot be
applied. Thus, if the population standard deviation σ is unknown for a small sample size, the sample standard
deviation s cannot simply take its place in the z formula, and the interval estimate using the z-table cannot be
used. For this case, the t-distribution is used to make an interval estimate. The formula becomes:
x̄ − tα/2 (s/√n) < µ < x̄ + tα/2 (s/√n)
Where x̄ = sample mean, s = sample standard deviation, n = sample size,
tα/2 = t-value with n − 1 degrees of freedom that leaves an area of α/2 in the right tail.
In the formula, tα/2 (s/√n) is called the margin of error.
Example:
Compute the 95% confidence interval estimate of µ given the following: s = 9, n = 12, and x̄ = 27.
Answer:
From the given, df = 12 − 1 = 11. Since it is a 95% confidence level, α = 5%; thus, tα/2 = t0.025. From the
t-table, for df = 11, t0.025 = 2.201.
x̄ − tα/2 (s/√n) < µ < x̄ + tα/2 (s/√n)
27 − 2.201 (9/√12) < µ < 27 + 2.201 (9/√12)
27 − 2.201 (2.60) < µ < 27 + 2.201 (2.60)
21.28 < µ < 32.72
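The interval for this example can be checked with a few lines of code. This is a sketch only: the function name is our own, and the critical value 2.201 is typed in from the t-table (df = 11, area 0.025) rather than computed. Note that the standard error uses the sample size n = 12, so s/√n = 9/√12 ≈ 2.60.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) limits: x_bar -/+ t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# t_crit = 2.201 is read from the t-table for df = 11 and area 0.025.
lower, upper = t_interval(x_bar=27, s=9, n=12, t_crit=2.201)
print(round(lower, 2), round(upper, 2))
```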
Estimating a population proportion is similar to estimating a population mean. When the sample proportion p̂
(pronounced as p-hat) is computed from a large sample n, the interval estimate of the population
proportion p at a certain α can be computed as:
p̂ − zα/2 √(p̂q̂/n) < p < p̂ + zα/2 √(p̂q̂/n)
Where p̂ = sample proportion, q̂ = 1 − p̂, n = sample size, zα/2 = z-value that leaves an area of α/2 in the right tail
Example:
Compute the 90% confidence interval estimate of p given the following: p̂ = 0.65 and n = 50.
Answer:
Since p̂ = 0.65, then q̂ = 0.35. From the table, for the 90% level, zα/2 = 1.64.
p̂ − zα/2 √(p̂q̂/n) < p < p̂ + zα/2 √(p̂q̂/n)
0.65 − 1.64 √[(0.65)(0.35)/50] < p < 0.65 + 1.64 √[(0.65)(0.35)/50]
0.65 − 1.64 (0.0675) < p < 0.65 + 1.64 (0.0675)
0.539 < p < 0.761
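The proportion interval in this example can be sketched the same way. The function name is our own, and the critical value 1.64 is the table value used in the example for the 90% level.

```python
from math import sqrt

def proportion_interval(p_hat, n, z_crit):
    """Return (lower, upper) limits: p_hat -/+ z_crit * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    margin = z_crit * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# z_crit = 1.64 is the table value for the 90% confidence level.
lower, upper = proportion_interval(p_hat=0.65, n=50, z_crit=1.64)
print(round(lower, 3), round(upper, 3))
```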
V. PRACTICE
Compute the margin of error of the interval estimate of µ given the level of confidence, sample standard
deviation s, and sample size n.
1. Confidence level = 90%, s = 3, n = 10
2. Confidence level = 95%, s = 8, n = 20
3. Confidence level = 98%, s = 12, n = 12
4. Confidence level = 99%, s = 21, n = 15
5. Confidence level = 98%, s = 15, n = 21
6. Confidence level = 90%, s = 8.2, n = 8
7. Confidence level = 90%, s = 14.4, n = 26
8. Confidence level = 95%, s = 8.8, n = 17
9. Confidence level = 95%, s = 12.8, n = 13
10. Confidence level = 98%, s = 9, n = 19
VI. ENRICHMENT
Solve the following problems:
1. A coffee machine is regulated so that the amount it dispenses is normally distributed. A random
sample of 21 cups had an average of 8 ounces with a standard deviation of 0.5 ounces. Construct a 95%
confidence interval estimate for the average amount of all cups of coffee dispensed by this machine.
2. The average weight of 15 adult Dagupan bangus is 750 grams with a standard deviation of 80 grams.
Construct a 98% confidence interval estimate of the average weight of all adult Dagupan bangus.
3. Ten (10) “Taklobo” or giant clams have an average length of 45 inches across their shells with a standard
deviation of 4 inches. Construct a 95% confidence interval estimate of the average length across shells of all
giant clams.
4. In a personnel services analytics study, 20 managers were found to spend a mean of 2.5 hours
each day on paperwork with a standard deviation of 1.2 hours. Construct a 90% confidence interval
estimate of the average time spent on paperwork by all managers.
5. A study was conducted to test a new variety of rice. A sample of 5 plots showed the yields per square
meter recorded below. Construct a 95% confidence interval estimate of the average
yield per square meter of the new variety of rice.
Plot            1    2     3    4     5
Yield (kg/m²)   2    2.5   3    1.6   2.4
(Hint: Compute first the sample mean and the sample standard deviation.)
VII. EVALUATION
A. Compute the interval estimate of µ given the confidence level, sample mean x̄, population standard
deviation σ, and sample size n.
1. Confidence level = 90%, x̄ = 42, σ = 10, and n = 40
2. Confidence level = 98%, x̄ = 21, σ = 15, and n = 50
3. Confidence level = 95%, x̄ = 142, σ = 9, and n = 25
4. Confidence level = 99%, x̄ = 28, σ = 12, and n = 60
5. Confidence level = 97%, x̄ = 45, σ = 8, and n = 140
B. Compute the interval estimate of µ given the confidence level, sample mean x̄, sample standard
deviation s, and sample size n.
1. Confidence level = 90%, x̄ = 42, s = 10, and n = 20
2. Confidence level = 98%, x̄ = 21, s = 15, and n = 10
3. Confidence level = 95%, x̄ = 142, s = 9, and n = 15
4. Confidence level = 99%, x̄ = 28, s = 12, and n = 11
5. Confidence level = 90%, x̄ = 45, s = 8, and n = 16
C. Compute the interval estimate for p given the level of confidence, sample proportion p̂, and sample
size n.
1. Confidence level = 95%, p̂ = 0.3, n = 30
2. Confidence level = 90%, p̂ = 0.8, n = 50
3. Confidence level = 98%, p̂ = 0.6, n = 35
4. Confidence level = 99%, p̂ = 0.5, n = 70
5. Confidence level = 90%, p̂ = 0.15, n = 55
Week # ____1____
Hypotheses or claims can be classified as Null Hypothesis (H0) or Alternative Hypothesis (Ha). Hypothesis (a) is a
Null Hypothesis, while hypotheses (b) and (c) are called Alternative Hypotheses.
Definition
Hypothesis Testing is a process of gathering evidence to either support or rebut a claim or conjecture,
known as a hypothesis.
Null Hypothesis (H0) is a claim that denotes “absence”, such as absence of difference, absence of
relationship, or equality to a certain value, and the like.
It usually comes with “=”, “≥”, or “≤” when written in symbols.
Alternative Hypothesis (Ha) is a claim that denotes “presence”, such as presence of difference, presence of
relationship, or inequality to a certain value, and the like. It usually comes with “≠”, “<”, or “>” when written in
symbols.
After testing a hypothesis, a decision is made as the basis for a conclusion: either to reject or not to
reject the null hypothesis (the testable hypothesis). In making a decision, four possible results can occur: two
right decisions and two wrong decisions.
Right Decisions:
✓ rejecting a false null hypothesis
✓ not rejecting a true null hypothesis
Wrong Decisions:
x rejecting a true null hypothesis (Type I Error)
x not rejecting a false null hypothesis (Type II Error)
As shown above, two possible errors could be committed. The probability of committing a Type I error is
represented by α (Greek letter alpha), while the probability of committing a Type II error is denoted by β (Greek
letter beta). However, in testing a hypothesis, only the probability of committing a Type I error, α, is used. The
usual acceptable values of α are 0.05 and 0.01. If α = 0.05, then the probability of rejecting a true null hypothesis
is 5%, which means that the probability of not rejecting a true null hypothesis is 95% (this is NOT the value of β).
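The meaning of α can be illustrated with a simulation. The sketch below is an illustration under our own assumptions: a hypothetical normal population with mean 50 and standard deviation 10, samples of size 25, and a two-tailed z-test at α = 0.05. Since the null hypothesis is true by construction, every rejection is a Type I error, and the fraction of rejections should be close to 0.05.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

MU_0, SIGMA, N = 50.0, 10.0, 25   # hypothetical population; H0: mu = 50 is TRUE
Z_CRITICAL = 1.96                  # two-tailed critical value for alpha = 0.05

def rejects_true_null():
    """Draw one sample from the H0 population and run a two-tailed z-test."""
    sample = [random.gauss(MU_0, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    z = (x_bar - MU_0) / (SIGMA / sqrt(N))
    return abs(z) >= Z_CRITICAL

trials = 4000
type_i_rate = sum(rejects_true_null() for _ in range(trials)) / trials
print(type_i_rate)  # fraction of trials that wrongly rejected a true H0
```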
V. PRACTICE
A social worker wants to test (at α = 0.05) whether the average body mass index (BMI) of the pupils under the
feeding program is different from 8.2 kg.
a. State the null and alternative hypothesis in words.
b. State the null and alternative hypothesis in symbols.
c. What is the probability of committing Type I error?
d. State the conclusion when H0 is rejected.
e. State the conclusion when H0 is not rejected.
VI. ENRICHMENT
In each of the situations, answer the following.
a. State the null and alternative hypothesis in words.
b. State the null and alternative hypothesis in symbols.
c. What is the probability of committing Type I error?
d. State the conclusion when H0 is rejected.
e. State the conclusion when H0 is not rejected.
1. A college dean claims that a bachelor’s degree could be earned in an average of five years.
Test the claim using 95% confidence level.
a) H0: _______________________________________________
Ha: _______________________________________________
b) H0: __________ Ha:________________________
c) ___________________________________________________
d) ___________________________________________________
e) ___________________________________________________
2. An FDA officer claims that Pharma XYZ’s new caplet drug contains less than 300mg of paracetamol.
Test the claim using 99% confidence level.
a) H0: _______________________________________________
Ha: _______________________________________________
b) H0: __________ Ha:________________________
c) ___________________________________________________
d) ___________________________________________________
e) ___________________________________________________
3. A cigarette manufacturer claims that the average nicotine content per stick is 2.1 mg.
Test the claim using 90% confidence level.
a) H0: _______________________________________________
Ha: _______________________________________________
b) H0: __________ Ha:________________________
c) ___________________________________________________
d) ___________________________________________________
e) ___________________________________________________
4. A real estate agent claims that 60% of all condominium units built today are studio-type. Test the claim
using 98% confidence level.
a) H0: _______________________________________________
Ha: _______________________________________________
b) H0: __________ Ha:________________________
c) ___________________________________________________
d) ___________________________________________________
e) ___________________________________________________
5. The PQR Chamber of Commerce claims that their mean annual income is US$60,000. Test the claim using
95% confidence level.
a) H0: _______________________________________________
Ha: _______________________________________________
b) H0: __________ Ha:________________________
c) ___________________________________________________
d) ___________________________________________________
e) ___________________________________________________
VII. EVALUATION
Write the Null Hypothesis (H0) or the Alternative Hypothesis (Ha) of the following:
Ha: ______________________________________________________________
Ha: ______________________________________________________________
8. H0: ______________________________________________________________
Ha: The proportion of obese children ages 3-10 is higher than 40%.
9. H0: ______________________________________________________________
Ha: The mean amount of coffee dispensed by a new vending machine is greater than 300 ml.
Week # _____2_______
I. INTRODUCTION
As mentioned earlier, a hypothesis or claim about a population mean or population proportion can be tested
using the five-step hypothesis testing procedure. This text only includes the discussion of two basic tests of
hypothesis about the population mean using a single sample.
z = (x̄ − µ0)/(σ/√n)
Where: µ0 = claimed population mean, σ = population standard deviation (can be replaced by s when n ≥ 30),
x̄ = sample mean, n = sample size
Note:
NEGATIVE sign of the computed z is disregarded when comparing it to the critical value of z if the hypothesis is
non-directional.
t = (x̄ − µ0)/(s/√n)
Note:
NEGATIVE sign of the computed t is disregarded when comparing it to the critical value of t if the hypothesis is
non-directional.
V. PRACTICE
Determine the decision for each of the following, given the computed and critical values of z or t:
1. zcomputed = 1.02 zcritical = 1.64 Decision: _______________________
2. zcomputed = 2.15 zcritical = 1.96 Decision: _______________________
3. zcomputed = 2.24 zcritical = 2.33 Decision: _______________________
4. zcomputed = 0.16 zcritical = 2.17 Decision: _______________________
5. zcomputed = 3.25 zcritical = 1.28 Decision: _______________________
6. tcomputed = 5.26 tcritical = 2.55 Decision: _______________________
7. tcomputed = 1.97 tcritical = 2.12 Decision: _______________________
8. tcomputed = 0.26 tcritical = 1.53 Decision: _______________________
9. tcomputed = 1.19 tcritical = 1.31 Decision: _______________________
10. tcomputed = 2.89 tcritical = 1.86 Decision: _______________________
VI. ENRICHMENT
A printer manufacturing company claims that its new ink-efficient printer can print an average of 1500 pages of
word documents with a standard deviation of 60. Thirty-five (35) of these printers showed a mean of 1475 pages.
Does this support the company’s claim? Use 95% confidence level.
z = (1475 − 1500)/(60/√35) = −25/10.14 = −2.47 (NEGATIVE SIGN could be disregarded since the test is two-tailed)
4. Decision (reject or not to reject H0)
Since the computed z (disregarding the negative sign) is greater than the critical value of z, H0 is REJECTED.
5. Conclusion:
There is sufficient evidence to reject the company’s claim.
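The computation and decision for the printer problem can be sketched in code. The function name is our own, and `NormalDist().inv_cdf` is used in place of the z-table to get the two-tailed critical value for α = 0.05.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar, mu_0, sigma, n):
    """z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

alpha = 0.05                                    # 95% confidence level
z = one_sample_z(x_bar=1475, mu_0=1500, sigma=60, n=35)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value

# Two-tailed test: compare the absolute value of z with the critical value.
reject_h0 = abs(z) >= z_crit
print(round(z, 2), round(z_crit, 2), reject_h0)
```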
VII. EVALUATION
Week # _____3_____
I. INTRODUCTION
We have mentioned in the previous lesson that the population proportion can be estimated only for a large
sample size (n ≥ 30). The same is true in testing a claim or hypothesis about the population proportion (p).
In the example, the researcher may initially believe that 50% of the rat population is female. Suppose he has
set up traps to collect a number of rats in different parts of the region, and out of the 50 rats he has collected,
23 are female. Would this support his initial belief?
To test a claim about a population proportion, we use the z-test for population proportion.
The formula is:
z = (p̂ − p)/√(pq/n)
where p̂ = sample proportion, p = claimed population proportion, q = 1 − p, and n = sample size.
As in the use of the z-test for means, the decision rule below is used:
➢ If zcomputed ≥ zcritical, REJECT H0
➢ If zcomputed < zcritical, DO NOT REJECT H0
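The formula and decision rule can be sketched as two small functions. The function names and the numbers in the usage lines are hypothetical (a claimed proportion of 0.40 and 52 successes out of 100); they are not taken from this module. The absolute value of z is used in the decision because the sketch assumes a two-tailed test.

```python
from math import sqrt

def proportion_z(successes, n, p_claimed):
    """z = (p_hat - p) / sqrt(p * q / n), with q = 1 - p from the claim."""
    p_hat = successes / n
    q_claimed = 1 - p_claimed
    return (p_hat - p_claimed) / sqrt(p_claimed * q_claimed / n)

def decide(z_computed, z_critical):
    """Two-tailed decision rule: the sign of the computed z is disregarded."""
    return "REJECT H0" if abs(z_computed) >= z_critical else "DO NOT REJECT H0"

# Hypothetical data: claim p = 0.40; 52 out of 100 sampled items have the trait.
z = proportion_z(successes=52, n=100, p_claimed=0.40)
print(round(z, 2), decide(z, 1.96))
```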
V. PRACTICE
From the example above, the researcher wants to test his belief that 50% or 0.5 of the population of rats is
female. From his collected samples, 23 out of 50 are female. Would this support his claim? Use α = 0.05.
H0: p = ______
Ha: p ≠ _______
α = 0.05
zcritical = 1.96
3. Computation:
From the problem, p̂ = 23/50 = 0.46. Under H0, p = 0.5; thus, q = 1 − p = 0.5.
z = (p̂ − p)/√(pq/n)
= (0.46 − 0.5)/√[(0.5)(0.5)/50]
= ?/?
z = _______ (NEGATIVE SIGN could be disregarded since the test is two-tailed)
4. Decision (reject or not to reject H0)
Since the computed z (disregarding the negative sign) is (less than? or greater than?) the critical value of z,
H0 is (REJECTED? or NOT REJECTED?).
5. Conclusion:
There is (sufficient? or no sufficient?) evidence to deny the researcher’s claim that 50% of the rat
population is female.
VI. ENRICHMENT
VII. EVALUATION
Compute the z for each given the claim (p), the observed proportion (p̂), and the sample size (n).
1. p = 0.2, p̂ = 0.18, n = 50
3. p = 0.66, p̂ = 0.61, n = 40
5. p = 0.7, p̂ = 0.68, n = 30
6. p = 0.85, p̂ = 0.88, n = 45
7. p = 0.6, p̂ = 0.65, n = 36
8. p = 0.12, p̂ = 0.1, n = 49
Week # ____4______
I. INTRODUCTION
A college mathematics instructor wants to analyze the grades of his 30 students in English and Mathematics. He
asked the students to get a piece of paper and write on it their grades in these two subjects, as well as the
school where they graduated from. The teacher tallied the data collected and set up three tables. He found
out that 10 students graduated from public schools, 10 from private sectarian schools, and 10 from private non-
sectarian schools.
How do we determine the relationship between Math and English grades for each group of students?
Let us now determine the relationship between the English and Math grades for each group using a scatterplot.
Since Math and English grades are both quantitative, one variable (Math grade) is plotted along the horizontal
axis and the second variable (English grade) along the vertical axis.
From the scatterplots, we can say that there is a positive correlation between the Math and English grades of
students from the public schools, a negative correlation between the Math and English grades of students from
the private sectarian schools, and no correlation between the Math and English grades of students from the
private non-sectarian schools. A perfect correlation happens when all the points lie on a straight line.
V. PRACTICE
Write the statement showing the relationship between two variables.
[Scatterplots 1 to 4 are not reproduced here.]
VI. ENRICHMENT
A candy retailer repacked candies in different ways and recorded the number of packs sold based on the
number of candies per pack. He noticed that very few packs are sold when there are 100 candies in a pack,
so he offers a discount per pack of 100 candies based on the number of packs bought.
No. of pieces per pack   No. of packs sold   No. of packs (100 pcs. per pack)   Discount per pack (in P)
10                       30                  3                                  P2
20                       27                  4                                  P4
30                       24                  5                                  P6
40                       21                  6                                  P8
50                       18                  7                                  P10
60                       15                  8                                  P12
70                       12                  9                                  P14
80                       9                   10                                 P16
90                       6                   11                                 P18
100                      3                   12                                 P20
Draw the scatterplot for the following and interpret.
1. Number of candies per pack and the number of packs sold.
2. Number of packs (100 pcs. per pack) to be bought and the discount to be given.
VII. EVALUATION
Draw the scatterplot and write the statement showing the relationship between the bivariate data.
English grade 80 80 85 88 89
Science grade 80 90 83 80 89
Anxiety level 2 3 5 7 8 9
GPA 95 92 91 88 86 87
Family size 3 5 6 7 8 9 10
Communication skills 4 5 6 7 7 8 9 10
Confidence level 5 4 7 7 6 9 10 9
Weight 45 48 50 51 53 54 57 60 63
Fuel in tank (liters) 80 73 67 61 52 46 37
Week # ____5_____
I. INTRODUCTION
In this lesson, you will learn how to determine the association between bivariate data using a scatterplot, and
how to compute and interpret the correlation coefficient and the coefficient of determination.
X 11 17 26
Y 23 18 19
X 11 17 26
Z
r = [N(ΣXY) − (ΣX)(ΣY)] / √{[N(ΣX²) − (ΣX)²][N(ΣY²) − (ΣY)²]}
To interpret the value of the correlation coefficient, we can use the table below.
Value of r          Interpretation
1.0                 Perfect positive correlation
0.90 to 0.99        Very high positive correlation
0.70 to 0.89        High positive correlation
0.40 to 0.69        Moderate positive correlation
0.20 to 0.39        Small positive correlation
−0.20 to 0.19       Very small; negligible
−0.40 to −0.21      Small negative correlation
−0.70 to −0.41      Moderate negative correlation
−0.90 to −0.71      High negative correlation
−0.99 to −0.91      Very high negative correlation
−1.0                Perfect negative correlation
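The formula for r can be sketched as a short function. The function name is our own; the data are the X values (11, 17, 26) and Y values (23, 18, 19) listed at the start of this lesson, used here only as an illustration. Each sum in the code corresponds to one column of the usual X, Y, XY, X², Y² working table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from the raw-score (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

X = [11, 17, 26]
Y = [23, 18, 19]
r = pearson_r(X, Y)
print(round(r, 2))
```

The result can then be matched against the interpretation table; a value between −0.70 and −0.41 is read as a moderate negative correlation.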
V. PRACTICE
VI. ENRICHMENT
The production manager at XYZ Company is interested in determining the nature of the relationship between
training and productivity. The following data were collected over a one-quarter period on 10 employees.
TRAINING LEVEL (HRS) PRODUCTIVITY (UNITS/HR)
18 124
14 110
26 155
9 119
17 137
6 100
22 146
10 123
18 117
12 120
VII. EVALUATION
The head of the production department of a RM electronic company wants to determine the relationship
between the number of workers who assemble the product and the number of units assembled per day.
No. of workers 10 12 14 16 13 20 18 17
No. of units produced. 120 180 220 224 176 320 270 275
X Y XY X² Y²
Week # ___6_____
I. INTRODUCTION
This lesson focuses on regression analysis, which will help you determine the effect of the independent
variable on the dependent variable and allow you to create mathematical models that can be used for
prediction.
1. Using the least-squares regression line, find the passing percentage (to the nearest whole number)
when the class contains:
a. 30 students
b. 35 students
c. 40 students
d. 50 students
e. 55 students
The coefficient of determination (r²) is used to determine how well the least-squares regression line fits the sample
data. It is useful in assessing how much the errors in predicting the dependent variable (y) can be reduced
by using the information provided by the independent variable (x).
To get the value of the coefficient of determination (r²), compute the value of the coefficient of correlation (r)
and square the result. Since the value of the correlation coefficient ranges from -1 to 1, the value of the
coefficient of determination ranges from 0 to 1.
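The two steps above (compute r, then square it) can be sketched in Python. The sample data are the three (X, Y) pairs listed in the Week 5 lesson, and the variable names are illustrative assumptions.

```python
from math import sqrt

# Sample data: the three (X, Y) pairs listed in the Week 5 lesson.
X = [11, 17, 26]
Y = [23, 18, 19]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Step 1: coefficient of correlation r (raw-sum formula).
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Step 2: square r to get the coefficient of determination.
r_squared = r ** 2

print(round(r, 3))          # -0.676
print(round(r_squared, 3))  # 0.457
```

Read as a proportion, r² of about 0.457 suggests that roughly 46% of the variation in Y is accounted for by its linear relationship with X. With only three data points, this is an illustration rather than a meaningful result.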
Key Concepts
Coefficient of determination – a measure of how well the least-squares regression line fits the sample data.
Least-squares regression equation – an equation used to predict the value of the dependent variable
based on the value of the independent variable.
Least-squares regression line – the graphical representation of the least-squares regression equation; it can be
used to determine the approximate value of the dependent variable based on the value of the independent
variable given in the scatterplot.
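The regression formulas themselves do not appear in this part of the module, so the sketch below assumes the standard least-squares formulas for a line y = a + bx, namely b = [NΣXY − (ΣX)(ΣY)] / [NΣX² − (ΣX)²] and a = (ΣY − bΣX) / N. The function names `fit_line` and `predict` are hypothetical, and the sample data are the three (X, Y) pairs from the Week 5 lesson.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("slope is undefined when all x values are identical")
    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = (sum_y - b * sum_x) / n                     # y-intercept
    return a, b


def predict(a, b, x):
    """Predicted value of the dependent variable for a given x."""
    return a + b * x


# Sample data: the three (X, Y) pairs listed in the Week 5 lesson.
X = [11, 17, 26]
Y = [23, 18, 19]
a, b = fit_line(X, Y)
print(round(a, 3), round(b, 3))     # 24.263 -0.237
print(round(predict(a, b, 20), 2))  # 19.53
```

With these values the regression equation is approximately y = 24.263 − 0.237x, so an x of 20 gives a predicted y of about 19.53. Predictions are most trustworthy for x values inside the range of the sample data.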
V. PRACTICE
The marketing manager of a large supermarket chain would like to determine the effect of shelf space on
the sales of pet food. A random sample of 7 stores was selected.
VI. ENRICHMENT
VII. EVALUATION
The production manager at XYZ Company is interested in determining the nature of the relationship between
training and productivity. The following data were collected over a one-quarter period on 10 employees.
TRAINING LEVEL (HRS) PRODUCTIVITY (UNITS/HR)
18 124
14 110
26 155
9 119
17 137
6 100
22 146
10 123
18 117
12 120
3. Using the least-squares regression line, determine the weekly sales when the shelf space is 200 cm.
4. Using the least-squares regression equation, predict the weekly sales when the shelf space is 215 cm and
when it is 250 cm.