Statistics and Probability M - PLV TextBook

Statistics can be defined as the process of making discoveries, decisions, and predictions using data. It plays a role in many fields like health organizations' response to the COVID-19 pandemic by collecting data on cases, recoveries, and deaths to make decisions. This document discusses key concepts in statistics including descriptive and inferential statistics, populations and samples, parameters and statistics, quantitative and qualitative variables, and levels of measurement. It also covers data collection methods like primary and secondary sources, and sampling techniques used to collect a portion of data from a population.

STATISTICS AND PROBABILITY

ALCANTARA, JOHN AARON D.


ASUNCION, JUSTINE N.
BALILI, JUN P.
BELLOSO, CHERRY MAE H.
PEREZ, FATIMA M.
UNIT 1 INTRODUCTION TO STATISTICS

1 Statistics and Data
2 Collection of Data
3 Presentation of Data

Statistics can be defined as the process behind how we make discoveries, make decisions based on data, and make predictions. The application of Statistics is very wide, for it plays a vital role in every field of human activity.

For instance, during the pandemic that we all experienced in 2020, we saw daily updates on COVID-19 from around the world. These data included the number of active cases, recoveries, and deaths. Through statistics, national governments, health organizations, and universities were able to make decisions on how to stop and prevent the spread of the coronavirus, such as imposing community quarantines that restricted gatherings of people. They were also able to make predictions and set goals during the pandemic.
In this unit, you will learn the basic concepts of statistics and how to collect and present data.
Lesson 1 Statistics and Data

At the end of this lesson, you are expected to:
• identify the different branches of Statistics,
• define sample and population,
• distinguish parameter and statistic,
• illustrate quantitative and qualitative data,
• distinguish and illustrate the different levels of measurement.

Pre-assessment:
Identify if the variable being described is quantitative or qualitative.
1. Monthly income in a household
2. Beverage preference
3. Degree of agreement
4. Learner reference number
5. Average score of students in a quiz

WHAT IS STATISTICS?

It is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

DATA
A collection of facts from experiments, observations, sample surveys, censuses, and administrative reporting systems.

VARIABLE
A characteristic that is observable or measurable in every unit of the population.

BRANCHES OF STATISTICS

Descriptive Statistics
It is the branch of Statistics that involves the organization, summarization, and display of data.

Inferential Statistics
The branch of Statistics that uses data from samples to make inferences about the population from which
the sample was drawn. In inferential statistics, we use statistics to estimate parameters.
Population
The collection of all outcomes, responses, measurements, or counts that are of interest.

Sample
A subset, or a part, of a population.

Parameter – It is a numerical measure that describes characteristics of a population.

Statistic – It is a numerical measure that describes characteristics of a sample.

Here are some examples of parameters and statistics that we will be using in this module:

Measure               Parameter   Statistic
Mean                  μ           x̄
Proportion            p           p̂
Variance              σ²          s²
Standard Deviation    σ           s
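
To make the distinction concrete, here is a minimal Python sketch that computes two parameters (μ and σ²) from a whole population and the corresponding statistics (x̄ and s²) from a sample. The scores and the choice of sample are hypothetical, made up only for illustration. The sketch also assumes the usual convention that the population variance divides by N while the sample variance divides by n − 1.

```python
import statistics

# Hypothetical population: quiz scores of ALL 10 students in a small class.
population = [12, 15, 11, 18, 14, 16, 13, 17, 15, 19]

# Hypothetical sample: four of those students (illustrative, not a random draw).
sample = population[:4]

mu = statistics.mean(population)            # parameter: population mean
sigma2 = statistics.pvariance(population)   # parameter: population variance (divides by N)
x_bar = statistics.mean(sample)             # statistic: sample mean
s2 = statistics.variance(sample)            # statistic: sample variance (divides by n - 1)

print("Parameters:", mu, sigma2)
print("Statistics:", x_bar, s2)
```

The sample values only estimate the population values, which is the idea behind inferential statistics described earlier.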

LET’S TRY THIS!

Which of the following are statistics and which are parameters?

1. The proportion of all patients who recovered from COVID-19 in the month of June.
2. The mean difference in scores between a randomly selected class taught statistics by a new method and another class taught by an old method.
3. The mean score of all incoming senior high students of Pamantasan ng Lungsod ng Valenzuela in their entrance exam.
4. The proportion of voters who reside in Valenzuela among all the voters of the Philippines.
5. The variability of the salaries of 10% of the employees in the company.
6. The average height of 100 Grade 11 students in PLV.

Answer:
1. Parameter
2. Statistic
3. Parameter
4. Parameter
5. Statistic
6. Statistic
TYPES OF VARIABLES

• Qualitative or Categorical – variables that express a categorical attribute.
   o Example: sex, religion, region of residence

• Quantitative – otherwise called numerical data; it has actual units of measure.
   o Example: height, weight, household size

Discrete – these are measurements that can only be expressed in whole units.
Example:
Continuous – data that can be measured. The possible values are uncountably infinite.

LEVELS OF MEASUREMENT

Nominal – it refers to measurements that serve as labels to identify items or classes. The data are classified into categories and cannot be arranged in any particular order.
Example: Student number, color, Music genre, sex

Ordinal – measurements that reflect the rank order of the individuals or objects. It can be arranged in
some order, but the differences between data values cannot be determined or are meaningless. It does
not tell how much one is different from the other.
Example: Social status, hardness of minerals, degrees of agreement

Interval – the values of the variable can be ranked, and the difference of the values show the distances
between the values. It has no true zero point. True zero point refers to the absence of the characteristic.
Example: temperature, test scores

Ratio – it is the highest level of measurement. The differences between values show the distances between them, the ratio of values is also defined, and it has a true zero point, or absolute zero.
Example: height, age, weight
Lesson 2 Collection of Data

WHAT YOU SHOULD LEARN
At the end of this lesson, you are expected to:
• identify the different sources of data,
• identify the different methods of collecting data,
• identify the sampling techniques to be used.

Pre-assessment:
1. Identify if the source is a primary or secondary source.
   a. Wikipedia
   b. Interview
   c. Administrative data
   d. Journal
   e. Textbook
2. Differentiate probability and non-probability sampling.

DATA SOURCES

Variables are observed or measured using any of three methods of data collection: objective, subjective, and use of existing records.

• Primary – uses the objective and subjective methods. The data are obtained directly from the source.
• Secondary – data obtained through the use of existing records, or data collected by other entities for certain purposes.

Primary
   Advantages:
   • You know how the data was collected.
   • You can get exactly the data you need.
   • You know the accuracy of the data.
   Disadvantages:
   • It will take a long time to collect the data you need.
   • It can be expensive.

Secondary
   Advantages:
   • It is a quick and cheap way to get a large amount of data.
   Disadvantages:
   • You may not know how the data was collected.
   • You may not get the exact data you need.
   • The data might be out of date.
   • You may not know how accurate the data is.
METHODS OF COLLECTING DATA

• INTERVIEW METHOD
   o DIRECT – the researcher personally interviews the respondents.
   o INDIRECT – the researcher uses a telephone, web cam, or cellphone to interview the respondents.

• QUESTIONNAIRE METHOD – a list of well-planned questions written on paper, which can be either personally administered or mailed by the researcher to the respondents using any of the following forms:
   o Guided-Response Type
   o Multiple Choice Type
   o Recall Type
   o Multiple Response Type
   o Dichotomous Type
   o Rating Scale Type

• EMPIRICAL OBSERVATION METHOD – observation is commonly used in psychological and anthropological studies; data are obtained through seeing, hearing, tasting, touching, and smelling.

• TEST METHOD – this is widely used in psychological research and psychiatry. Standard tests are used because of their validity, reliability, and usability.

• REGISTRATION METHOD – the mechanical devices that can be used for social and educational research in data gathering are the camera, projector, video tape, tape recorder, etc.
SAMPLING TECHNIQUES

Sampling – the process of obtaining samples.

• RANDOM OR PROBABILITY SAMPLING – one in which every member of the population has an equal chance of being selected.

Simple Random Sampling – the names of respondents are written on small pieces of paper, which are rolled, placed in a jar, and picked at random.

Stratified Sampling – it is used when it is important for the sample to have members from each
segment of the population.
Depending on the focus of the study, members of the population are divided into two or
more subsets, called strata, that share a similar characteristic such as age, gender,
ethnicity, or even political preference.

Cluster Sampling – clusters consist of geographic groupings and each cluster should contain
members with all of the characteristics. All of the members of one or more groups are used.

Systematic Sampling – a sample in which each member of the population is assigned a number.
The members of the population are ordered in some way, a starting number is randomly selected
and then sample members are selected at regular intervals from the starting number. (Ex. Every 3rd,
5th, or 100th member is selected)

• NON-RANDOM OR NON-PROBABILITY SAMPLING – one in which elements of the population are drawn based on the judgment of the researcher.

Purposive – the respondents are chosen based on their knowledge of the information required by the researcher.
Example: Suppose a researcher wants to make a historical study about Town A. The target
population is the senior citizens of the town living in Town A since birth since they are the most
reliable persons to know the history of the town.

Convenience – this technique is resorted to by a researcher who needs the information in the fastest way possible.
Example: A computer software store conducts a marketing study by interviewing potential
customers who happen to be in the store browsing through the available software.

Quota – is formed when the main consideration is to complete the designated proportional part of
the population.
Example: You are to investigate the relationship of students’ performance in Math and their attitude
towards the subject. However, you are only given limited time to do the study. You may only
consider 25 out of 500 students in your school.

Census or Complete Enumeration – a method of collecting data from the entire population.
Example: To know the number of persons in different places in our country, the government conducts a census that takes the entire population into consideration.
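
The probability sampling techniques described in this lesson can be sketched in a few lines of Python. This is only an illustrative sketch. The function names, the numbered list of 100 students, and the two strata are hypothetical. It assumes that the same number of members is drawn from every stratum, and the standard `random` module stands in for picking rolled papers from a jar.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n members at random, like drawing rolled papers from a jar."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=None):
    """Pick a random starting position, then take every k-th member."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start among the first k members
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from each stratum (subset) of the population."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: students numbered 1 to 100.
students = list(range(1, 101))
# Hypothetical strata: two grade levels of 50 students each.
strata = {"Grade 11": students[:50], "Grade 12": students[50:]}

print(simple_random_sample(students, 10, seed=1))
print(systematic_sample(students, 5, seed=1))
print(stratified_sample(strata, 3, seed=1))
```

Passing a `seed` only makes the draws repeatable for checking; leave it out for a fresh random draw each time.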
Lesson 3 Presentation of Data

WHAT YOU SHOULD LEARN
At the end of this lesson, you are expected to:
• identify the different presentations of data,
• illustrate graphs/charts of given data.

Pre-assessment:
1. Which type of graph can represent the performance of Asian countries in the stock market based on their BMI Global Indexes? How about the average temperature of Baguio City in a year?
2. Differentiate the three types of data presentation.

KINDS OF DATA PRESENTATION

TEXTUAL – detailed information is given in textual presentation; narrative report (qualitative purposes).
TABULAR – numerical values are presented using tables, such as a frequency distribution table.
GRAPHICAL – trends are easily seen in graphs compared to tables. It is good to present data using pictures or figures like a pictograph, pie chart, or line graph.

KINDS OF GRAPHS/CHARTS

1. BAR GRAPH – a graph drawn using rectangular bars to show how large each value is. The bars can be either horizontal or vertical. It is used to show how one item is related to another. It is composed of an x-axis (horizontal) and a y-axis (vertical), where the x-axis has the categories being measured and the y-axis has a scale for the numbers in each category.
For instance, the Facebook page BusinessWorld recorded the unique individuals tested per day for Corona Virus Disease (COVID-19) in April and May 2020.

Figure 1.0 Unique Individuals tested per day

Using Figure 1.0, which day has the highest number of unique individuals tested? Which day has the lowest?
As shown in the graph, May 14 has the highest number of unique individuals tested, with a total of 10 841, while April 5 has the lowest, with 344 individuals tested that day.

Another example uses a multiple bar graph to compare different data that are not opposite in nature.

Figure 1.1 Popol and Kupa's Savings in a Week (in PHP) [multiple bar graph; horizontal axis: Monday to Friday; vertical axis: savings in PHP, 0 to 35; series: Popol, Kupa]

Using Figure 1.1, who has more savings on Monday? Wednesday? Friday?
2. PIE GRAPH/CHART – is a circle divided into sectors proportional to the frequencies. It shows how a part of something relates to the whole. It is important to define what the whole represents. It is used when you are showing the relative proportion or percentage of numbers that add up to a sum.

For example, the Department of Foreign Affairs reported the status of Filipinos abroad regarding the record of Corona Virus Disease (COVID-19) cases. They presented four pie charts for Asia Pacific Region, Middle East/Africa, Europe, and America.

Figure 2.0

Another example:

Figure 2.1 [pie chart; sectors: Office Rental, Maintenance, Taxes, Operating Expenses, Wages, Profit, Others; sector labels: 6%, 10%, 20%, 12%, 6%, 16%, 30%]

Using Figure 2.1, answer the following:
a. Give a title for the graph above.
b. If the monthly budget amounts to PHP 500 000, how much is the monthly profit of the business? Office rental? Maintenance?
c. If the monthly maintenance amounts to an average of PHP 25 000, find the amount of the monthly budget.
d. What fraction of the whole budget is allotted to the operating expenses?
3. LINE GRAPH – a graph that shows information that is connected in some way and that shows trends in data clearly. It is used when you would like to show how one value changes with respect to another over a certain period of time. The movement of the line indicates the variation of these changes. It also provides a picture of a possible pattern, allowing you to predict possibilities. Similar to a bar graph, it has a horizontal and a vertical axis, each with a scale of equal intervals.

Figure 3.0          Figure 3.1

For instance, the graph at the left (Figure 3.0) shows the trend in the number of daily cases of COVID-19 in the Philippines. This graph can show whether the country is flattening its curve in dealing with the disease. Another example is the analysis by the University of the Philippines of the post-Enhanced Community Quarantine (ECQ) measures relative to healthcare capacity (see Figure 3.1).

Line graphs are used extensively in Sales and Marketing, Economics, and Business. Another example is the stock market chart, which deals with the volume and price of stocks.
4. HISTOGRAM – a graphical representation showing a visual impression of the distribution of data. It is a bar graph showing data in a grouped frequency table. The bars are placed next to each other to show that as one data interval ends, the next data interval begins.

For example, Mr. Jolly is worried about customer complaints regarding long queues in the branch. He first wants to analyze how frequently customers wait for a given length of time. He called the cashier and asked him for the details. Below are the customer waiting times at the cash counter of Jolly Me during peak hours, as observed by the cashier. Let’s use a histogram to show the data graphically.

CUSTOMER WAITING TIME (IN MINUTES)
2.30, 5.00, 3.55, 2.50, 5.10, 4.21, 3.33, 4.10, 2.55, 5.07, 3.45, 4.10, 5.12

CUSTOMER WAITING TIME
(IN MINUTES)             FREQUENCY
2.30-2.86                3
2.86-3.43                1
3.43-3.99                2
3.99-4.56                3
4.56-5.12                4
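
The grouping of the waiting times into classes can be checked with a short Python sketch. The function name is hypothetical. The sketch also assumes a boundary convention that the text does not state: each class includes its lower limit and excludes its upper limit, except the last class, which includes both.

```python
# Waiting times (in minutes) and class limits taken from the tables above.
waiting_times = [2.30, 5.00, 3.55, 2.50, 5.10, 4.21, 3.33,
                 4.10, 2.55, 5.07, 3.45, 4.10, 5.12]
class_limits = [2.30, 2.86, 3.43, 3.99, 4.56, 5.12]

def count_frequencies(values, limits):
    """Count how many values fall in each class [lower, upper)."""
    counts = [0] * (len(limits) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            lower, upper = limits[i], limits[i + 1]
            # Assumed convention: the last class also includes its upper limit.
            if lower <= v < upper or (i == last and v == upper):
                counts[i] += 1
                break
    return counts

frequencies = count_frequencies(waiting_times, class_limits)

# Text version of the histogram: one '#' per customer in the class.
for i, f in enumerate(frequencies):
    print(f"{class_limits[i]:.2f}-{class_limits[i + 1]:.2f}  {'#' * f}  ({f})")
```

The counts produced this way should match the FREQUENCY column of the table.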

TRY IT YOURSELF

Construct a histogram given the 100 ages of Grade 7 students of General Tiburcio de
Leon National High School.

AGES OF GRADE 7 STUDENTS    FREQUENCY
11-12                       13
12-13                       28
13-14                       22
14-15                       18
15-16                       9
16-17                       5
17-18                       3
18-19                       2
5. PICTURE GRAPH/ PICTOGRAM – it is a visual presentation of statistical quantities by means
of drawing pictures or symbols related to the subject under study. See figure a.

6. MAP GRAPH/CARTOGRAM – it is one of the best ways to present geographical data. This
kind of graph is always accompanied by a legend which tells us the meaning of the lines,
colors or other symbols used and positioned in a map. See figure b.

7. SCATTER PLOT DIAGRAM – it is a graphical device to show the relationship between two
quantitative variables. See figure c.

Figure a. COVID-19 Medical Assistance (OCHA)
Figure b. Class Suspension in Metro Manila (Earth Shaker)
Figure c. Ice Cream Sales vs. Noon Temperature

CREATE A GRAPH/CHART USING EXCEL

1. Select the data for which you want to create a chart.
2. Click INSERT > Recommended Charts.
3. On the Recommended Charts tab, scroll through the list of charts that Excel recommends for your data, and click any chart to see how your data will look. If you don’t see a chart you like, click All Charts to see all the available chart types.
4. When you find the chart you like, click it > OK.
5. Use the Chart Elements, Chart Styles, and Chart Filters buttons, next to the upper-right corner of the chart, to add chart elements like axis titles or data labels, customize the look of your chart, or change the data that is shown in the chart.
6. To access additional design and formatting features, click anywhere in the chart to add the CHART TOOLS to the ribbon, and then click the options you want on the DESIGN and FORMAT tabs.
Source: https://support.microsoft.com/en-us/office/video-create-a-chart-4d95c6a5-42d2-4cfc-aede-0ebf01d409a8
Examples:

1. Construct a pie chart given the data below.

INCOME STATEMENT (FOR THE YEAR ENDED DECEMBER 31, 2019)


REVENUES
Copying Services PHP 25 000
Internet Services PHP 55 000
Printing Services PHP 60 000
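
Before drawing the pie chart, it may help to see the share of the whole that each sector represents. The following Python sketch is not part of the Excel procedure. The function name is hypothetical, and the sketch simply applies the idea that each sector is proportional to its amount, so its angle is its share of 360 degrees.

```python
# Revenue figures taken from the income statement above (in PHP).
revenues = {
    "Copying Services": 25_000,
    "Internet Services": 55_000,
    "Printing Services": 60_000,
}

def sector_shares(data):
    """Return each category's fraction of the total."""
    total = sum(data.values())
    return {name: amount / total for name, amount in data.items()}

shares = sector_shares(revenues)

for name, share in shares.items():
    # Each sector's central angle is its share of the full circle (360 degrees).
    print(f"{name}: {share:.1%} of revenue, {share * 360:.1f} degrees")
```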

NOTE: You may insert a chart title

2. Use a line graph to present the inflation rates of the Philippines from 2008 to 2014.

Year                 2008   2009   2010   2011   2012   2013   2014
Inflation Rate (%)   3.6    4.5    4.4    5.4    3.6    4.8    5.5

UNIT 2 RANDOM VARIABLES AND PROBABILITY DISTRIBUTION

1 Random Variables
2 Discrete Probability Distribution
3 Mean, Variance, and Standard Deviation of a Discrete Probability Distribution

Probability distributions are applied in a variety of fields like economics, business, sports, weather, and insurance. They help us to describe, or predict, the probability of an event. For instance, they are used in analyzing insurance policies to determine which plans are best for you or your family and what deductible amounts you need.

This unit will discuss the concepts of random variable and probability distribution. You will learn how to construct the probability mass function of a discrete probability distribution and describe its properties and characteristics by computing its mean and variance.
Lesson 1 Random Variables

WHAT YOU SHOULD LEARN
At the end of this lesson, you are expected to:
• illustrate a random variable,
• distinguish between a discrete random variable and a continuous random variable; and
• find the possible values of a random variable.

Pre-assessment:
List the sample space of the following experiments.

Experiment                                             Sample Space
1. Tossing three coins
2. Rolling a die
3. Getting a defective item when two items are
   randomly selected from a box of two defective
   and three non-defective items.

RANDOM VARIABLES

A random variable is a function or rule that assigns a number to each outcome of an experiment.
It is denoted by an uppercase letter while its lowercase counterpart represents the value of the
random variable.

KINDS OF RANDOM VARIABLES:

Discrete Random Variable – a random variable whose set of possible values is countable.
Example: In tossing a coin, let X be the random variable representing the number of tails that occur. X = 0 if it is a head, and X = 1 if it is a tail.

Continuous Random Variable – a random variable whose set of possible values is uncountably infinite, such as all the values in an interval.
Example: An experiment is conducted to determine the distance that a certain type of car will travel using 10 liters of gasoline over a prescribed test course. Let Y be the random variable representing the distance; then Y ≥ 0.

TRY THIS!

Random or Not?
For each of the following, indicate whether it is or is not a random variable. Classify each random variable as either discrete or continuous.

1. determining whether the trains arrive on time
2. number of sixes rolled in two rolls of a die that has six on all of its faces
3. classifying insects by their species
4. time between customers entering a checkout lane at a convenience store
5. number of clerical errors on a medical chart
6. number of accident-free days in one month at EDSA
7. number of people out of 200 surveyed who say no to a question
8. number of lottery tickets you have to buy before you win the jackpot
9. the temperature of a cup of coffee served in a restaurant
10. number of customers arriving at Jollibee between 5:00 PM and 6:00 PM

In this lesson, we will be focusing on discrete random variables only.


Example 1:
Suppose two coins are tossed. Let X be the random variable representing the number of tails that occur. Find the values of the random variable X.

Steps and Solution
1. Determine the sample space. Let H represent head and T represent tail.
   Sample space: HH, HT, TH, and TT

2. Count the number of tails in each outcome in the sample space and assign this number to this outcome.

   Possible Outcomes   Value of the random variable X
   HH                  0
   HT                  1
   TH                  1
   TT                  2

Possible values of the random variable X: 0, 1, and 2
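
The two steps of Example 1 can also be carried out with a short Python sketch. The variable names are illustrative only; `itertools.product` from the standard library is used here to list every outcome of tossing two coins.

```python
from itertools import product

# Step 1: list the sample space for two coins (H = head, T = tail).
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

# Step 2: assign to each outcome the number of tails it contains.
x_values = {outcome: outcome.count("T") for outcome in sample_space}

# The possible values of X are the distinct counts.
possible_values = sorted(set(x_values.values()))

print(sample_space)
print(x_values)
print(possible_values)
```

Changing `repeat=2` to `repeat=3` lists the eight outcomes for three coins in the same way.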

Example 2:
Suppose three cell phones are tested at random. Let D represent the defective cell phone and let N represent the non-defective cell phone. Let Y be the random variable representing the number of defective cell phones.

Steps and Solution
1. Determine the sample space. Let D represent the defective cell phone and let N represent the non-defective cell phone.
   Sample space: NNN, NND, NDN, DNN, DDN, DND, NDD, DDD

2. Count the number of defective cell phones in each outcome in the sample space and assign this number to this outcome.

   Possible Outcomes   Value of the random variable Y
   NNN                 0
   NND                 1
   NDN                 1
   DNN                 1
   DDN                 2
   DND                 2
   NDD                 2
   DDD                 3

Possible values of the random variable Y: 0, 1, 2, and 3
Lesson 2 Probability Distribution

WHAT YOU SHOULD LEARN
At the end of this lesson, you are expected to:
• illustrate a probability distribution for a discrete random variable and its properties,
• compute probabilities corresponding to a given random variable; and
• construct a probability mass function of a discrete random variable.

Pre-assessment:
Find the probability of the following events.

Event                                                                        Probability
1. Getting a sum of 7 when two dice are rolled.
2. Getting two heads in tossing three coins.
3. Getting a queen when a card is drawn from a deck.
4. Getting a red ball from a box containing 2 red balls and 4 black balls.
5. Getting doubles when two dice are rolled.

PROBABILITY DISTRIBUTION

A probability distribution is a function or rule that assigns to each value of a random variable the probability associated with that value.

As we noted earlier, we use the uppercase letter to represent the random variable and lowercase
letter to represent the value of the random variable. Then, we represent the probability that the
random variable X will equal x as

P(X = x) or more simply as P(x)

Example 1:

Suppose three coins are tossed. Let Z be the random variable representing the number of heads
that occur. Find the probability values P(Z) to each value of the random variable.

Steps and Solution
1. Determine the sample space. Let H represent head and T represent tail.
   S = {HHH, THH, HTH, HHT, HTT, THT, TTH, TTT}

2. Determine the possible values of the random variable Z representing the number of heads.

   Possible Outcomes   Value of the random variable Z
   HHH                 3
   THH                 2
   HTH                 2
   HHT                 2
   HTT                 1
   THT                 1
   TTH                 1
   TTT                 0

3. Assign probability values P(Z) to each value of the random variable.

   Z    P(Z)
   0    1/8
   1    3/8
   2    3/8
   3    1/8

Table 1.1. The Probability Distribution or the Probability Mass Function of Discrete Random Variable Z

Z      0     1     2     3
P(Z)   1/8   3/8   3/8   1/8
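
The construction of Table 1.1 can be reproduced with a brief Python sketch. The variable names are illustrative, and the sketch assumes fair coins, so that all eight outcomes are equally likely. Exact fractions from the standard `fractions` module are used so that the results appear as 1/8 and 3/8 rather than decimals.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Step 1: sample space for three coins.
outcomes = ["".join(o) for o in product("HT", repeat=3)]

# Step 2: number of heads in each outcome, tallied by value of Z.
head_counts = Counter(o.count("H") for o in outcomes)

# Step 3: P(Z = z) = (outcomes with z heads) / (total outcomes),
# assuming every outcome is equally likely.
pmf = {z: Fraction(count, len(outcomes))
       for z, count in sorted(head_counts.items())}

for z, p in pmf.items():
    print(z, p)
```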

Example 2:
In a recent census, the number of televisions per household was recorded.

Number of televisions   0       1        2        3        4       5
Number of households    1 218   32 379   37 961   19 386   7 714   2 842

a. Construct the probability distribution of X, the number of televisions per household.
b. Determine the following probabilities.
   • P(X ≤ 2)
   • P(X > 2)
   • P(X ≥ 4)

Solution:

a. Construct the probability distribution of X, the number of televisions per household.

i. Determine the sum of the number of households.
1 218 + 32 379 + 37 961 + 19 386 + 7 714 + 2 842 = 101 500

ii. Assign the probability value P(X) to each value of the random variable. Reduce it to lowest terms, if possible.

   X    P(X)
   0    3/250
   1    32 379/101 500
   2    187/500
   3    9 693/50 750
   4    19/250
   5    7/250

Table 1.2. The Probability Distribution or the Probability Mass Function of Discrete Random Variable X

X      0       1                2         3              4        5
P(X)   3/250   32 379/101 500   187/500   9 693/50 750   19/250   7/250

b. Determine the following probabilities.
   • P(X ≤ 2)
   • P(X > 2)
   • P(X ≥ 4)

Solution:

• P(X ≤ 2): we are looking for the probability that the number of televisions per household is less than or equal to 2. Those values of X are 0, 1, and 2. Thus,

P(X ≤ 2) = P(0) + P(1) + P(2)
P(X ≤ 2) = 3/250 + 32 379/101 500 + 187/500
P(X ≤ 2) = 35 779/50 750

• P(X > 2): we are looking for the probability that the number of televisions per household is greater than 2. Those values of X are 3, 4, and 5. Thus,

P(X > 2) = P(3) + P(4) + P(5)
P(X > 2) = 9 693/50 750 + 19/250 + 7/250
P(X > 2) = 14 971/50 750

• P(X ≥ 4): we are looking for the probability that the number of televisions per household is greater than or equal to 4. Those values of X are 4 and 5. Thus,

P(X ≥ 4) = P(4) + P(5)
P(X ≥ 4) = 19/250 + 7/250
P(X ≥ 4) = 13/125
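
The arithmetic in this example can be verified with a small Python sketch. The variable names are illustrative. `Fraction` from the standard library reduces each probability to lowest terms automatically, which makes it easy to compare with the table and the sums in the solution.

```python
from fractions import Fraction

# Number of households by number of televisions, from the census table.
households = {0: 1218, 1: 32379, 2: 37961, 3: 19386, 4: 7714, 5: 2842}
total = sum(households.values())

# P(X = x) is the share of households with x televisions.
pmf = {x: Fraction(n, total) for x, n in households.items()}

p_at_most_2 = sum(p for x, p in pmf.items() if x <= 2)
p_more_than_2 = sum(p for x, p in pmf.items() if x > 2)
p_at_least_4 = sum(p for x, p in pmf.items() if x >= 4)

print(total)
print(pmf)
print(p_at_most_2, p_more_than_2, p_at_least_4)
```

Note that P(X ≤ 2) and P(X > 2) together cover every possible value of X, so the two results add up to 1.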

Example 3:
An online seller advertises that she will deliver the products that a customer purchases in 3 to 6 days. The seller wants to be precise in her advertising. Accordingly, she records the number of days it takes her to deliver the goods to customers. From the data, the following probability distribution is developed.

Number of days 0 1 2 3 4 5 6 7 8
Probability 0 0 0.01 0.04 0.28 0.42 0.21 0.02 0.02

a. What is the probability that a delivery will be made within the advertised 3 to 6 day period?
b. What is the probability that a delivery will be late?
c. What is the probability that a delivery will be early?

Solution:

a. What is the probability that the delivery will be made within the 3 to 6 day period?

P(3 ≤ X ≤ 6) = P(3) + P(4) + P(5) + P(6)
P(3 ≤ X ≤ 6) = 0.04 + 0.28 + 0.42 + 0.21
P(3 ≤ X ≤ 6) = 0.95

b. What is the probability that a delivery will be late?

P(X > 6) = P(7) + P(8)
P(X > 6) = 0.02 + 0.02
P(X > 6) = 0.04

c. What is the probability that a delivery will be early?

P(X < 3) = P(0) + P(1) + P(2)
P(X < 3) = 0 + 0 + 0.01
P(X < 3) = 0.01
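
The three answers follow the same pattern: add P(x) over the values of x that satisfy a condition. A minimal Python sketch of that pattern is shown below. The helper name `probability` is hypothetical, and the results are rounded to two decimal places because sums of decimal fractions in floating point are not always exact.

```python
# Probability distribution of the number of delivery days, from the table above.
delivery_pmf = {0: 0.00, 1: 0.00, 2: 0.01, 3: 0.04, 4: 0.28,
                5: 0.42, 6: 0.21, 7: 0.02, 8: 0.02}

def probability(pmf, condition):
    """Add P(x) over every value x that satisfies the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

p_on_time = probability(delivery_pmf, lambda x: 3 <= x <= 6)
p_late = probability(delivery_pmf, lambda x: x > 6)
p_early = probability(delivery_pmf, lambda x: x < 3)

# Rounded for display; the values should agree with the worked solution.
print(round(p_on_time, 2), round(p_late, 2), round(p_early, 2))
```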
In the preceding probability distributions, what do you notice about the probability of each value of the random
variable?
____________________________________________________________________________________________________________________________
____________________________________________________________________________________________________________________________
In each of the preceding probability distributions, get the sum of the probabilities of all values of the random
variable. What sum did you get?
____________________________________________________________________________________________________________________________
____________________________________________________________________________________________________________________________

PROPERTIES OF A PROBABILITY DISTRIBUTION

1. The probability of each value of the random variable must be between 0 and 1, inclusive.
In symbols, we write it as 0 ≤ P(X) ≤ 1.

2. The sum of the probabilities of all values of the random variable must be equal to 1.
In symbols, we write it as ∑ P(X) = 1.
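
The two properties can be turned into a simple check. The sketch below is one possible way to do it in Python. The function name and the tolerance `tol` are assumptions added for illustration; a small tolerance is used because decimal probabilities may not add up to exactly 1 in floating point.

```python
import math

def is_valid_pmf(probabilities, tol=1e-9):
    """Check the two properties of a discrete probability distribution."""
    # Property 1: every probability lies between 0 and 1, inclusive.
    in_range = all(0 <= p <= 1 for p in probabilities)
    # Property 2: the probabilities add up to 1 (within a small tolerance).
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Table 1.1 (number of heads in three coin tosses) passes both checks.
print(is_valid_pmf([1/8, 3/8, 3/8, 1/8]))

# These made-up lists fail: the first sums to 1.1, the second has values outside [0, 1].
print(is_valid_pmf([0.5, 0.6]))
print(is_valid_pmf([1.2, -0.2]))
```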
Lesson 3 Mean, Variance, and Standard Deviation of a Probability Distribution

WHAT YOU SHOULD LEARN
At the end of this lesson, you are expected to:
• illustrate the mean, variance, and standard deviation of a discrete random variable,
• calculate the mean or expected value of a discrete probability distribution; and
• compute for the variance and standard deviation of a discrete probability distribution.

Pre-assessment:
Complete the following frequency distribution table:

X     F      (X − X̄)   (X − X̄)²   F(X − X̄)²
5     3
8     5
10    4
12    5
15    3
      n=20

Find:
a. Mean
b. Variance
c. Standard Deviation

MEAN OF A DISCRETE RANDOM VARIABLE

MEAN OF A DISCRETE PROBABILITY DISTRIBUTION

• The mean μ of a discrete random variable X is called the expected value of X, written E(X).
• The expected value of a discrete random variable is equal to the mean of the random variable.

Formula for the Mean of a Probability Distribution:

The mean of a random variable with a discrete probability distribution is:

𝑬(𝑿) = 𝝁 = ∑[𝑿 ∙ 𝑷(𝑿)]

Where:
X – value of the random variable
P(X) – probability of the value X
Example 2:
An insurance company sells a life insurance policy worth ₱100 000 for a premium of ₱2 000 per
year. Actuarial tables show that the probability of death in the year following the
purchase of this policy is 0.1%. What is the expected gain from this policy? Let the
random variable X be the amount of gain of the insurance company.

Solution:

Step 1. Construct the probability distribution for the random variable. The company keeps the
₱2 000 premium if the policyholder survives (probability 0.999) and has a net loss of
₱100 000 − ₱2 000 = ₱98 000 if the policyholder dies (probability 0.001).

X            P(X)
₱2 000       0.999
−₱98 000     0.001

Step 2. Multiply each value of the random variable by the corresponding probability.

X            P(X)      X ∙ P(X)
₱2 000       0.999     1 998
−₱98 000     0.001     −98

Step 3. Add the results obtained in Step 2.

E(X) = 1 998 + (−98) = ₱1 900

INTERPRETATION: The insurance company’s expected gain from each individual who
avails of the policy is ₱1 900 each year.
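
The three steps amount to a single weighted sum. As a minimal sketch (the function name `expected_value` and the dictionary layout are choices made for this illustration only), the same computation can be written as:

```python
def expected_value(dist):
    """E(X) = sum of X * P(X) over all values of the random variable."""
    return sum(x * p for x, p in dist.items())


# Company's gain in pesos: it keeps the 2 000 premium if the policyholder
# survives, and loses 100 000 - 2 000 = 98 000 if the policyholder dies.
gain = {2000: 0.999, -98000: 0.001}

print(round(expected_value(gain), 2))
```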
VARIANCE AND STANDARD DEVIATION OF A DISCRETE RANDOM VARIABLE

The variance and standard deviation describe the amount of spread, dispersion, or variability of the
items in the distribution.

Formula for the Variance and Standard Deviation of a Discrete Probability Distribution

The variance of a discrete probability distribution is given by the formula:

\[ \sigma^2 = \sum (X - \mu)^2 \cdot P(X) \]

The standard deviation of a discrete probability distribution is given by the formula:

\[ \sigma = \sqrt{\sum (X - \mu)^2 \cdot P(X)} \]


Where:
X – value of the random variable
P(X) – probability of the random variable X
μ - mean of the probability distribution

Steps in Finding the Variance and Standard Deviation


1. Find the mean of the probability distribution.
2. Subtract the mean from each value of the random variable.
3. Square the results obtained in Step 2.
4. Multiply the results obtained in Step 3 by the corresponding probability.
5. Get the sum of the results obtained in Step 4. (The result is the value of the variance.)
6. Get the square root of the variance to get the standard deviation.

Example:
1. Determine the variance and standard deviation of the following probability mass function.

X       1      2      3      4      5      6
P(X)    0.15   0.25   0.30   0.15   0.10   0.05

Finding the variance and standard deviation:

X    P(X)    X ∙ P(X)    X − μ    (X − μ)²    (X − μ)² ∙ P(X)
1    0.15    0.15        −1.95    3.8025      0.570375
2    0.25    0.50        −0.95    0.9025      0.225625
3    0.30    0.90        0.05     0.0025      0.00075
4    0.15    0.60        1.05     1.1025      0.165375
5    0.10    0.50        2.05     4.2025      0.42025
6    0.05    0.30        3.05     9.3025      0.465125
Sum          2.95                             1.8475

\[ \mu = 2.95 \]

\[ \sigma^2 = 1.8475 \approx 1.85 \]

\[ \sigma = \sqrt{1.8475} = 1.359227722 \approx 1.36 \]

Thus, the variance is 1.85 and the standard deviation is 1.36.
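
The six steps listed earlier can be sketched in a few lines of code. This is only an illustration; the function name `mean_var_sd` is hypothetical and the probabilities are the ones from the example above.

```python
from math import sqrt


def mean_var_sd(pmf):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())                    # Step 1
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Steps 2 to 5
    return mean, variance, sqrt(variance)                        # Step 6


pmf = {1: 0.15, 2: 0.25, 3: 0.30, 4: 0.15, 5: 0.10, 6: 0.05}
mu, var, sd = mean_var_sd(pmf)
print(round(mu, 2), round(var, 4), round(sd, 2))
```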
UNIT 3 NORMAL CURVE DISTRIBUTION

1 Normal Curve Distribution and Standard Normal Distribution
2 Regions of Areas Under the Normal Curve
3 Applications of Normal Curve Concepts in Real Life Problems
Lesson 1 The Normal Curve Distribution

At the end of the lesson, you are expected to:
• illustrate a normal random variable and its characteristics;
• construct a normal curve; and
• identify regions under the normal curve corresponding to different standard normal values.

Normal Curve Distribution

The normal distribution, also known as the Gaussian distribution, is a probability distribution
that is symmetric about the mean. The shape and position of the normal distribution curve depend
on two parameters: the mean and the standard deviation.

PROPERTIES OF THE NORMAL CURVE DISTRIBUTION:

1. The distribution curve is bell-shaped.
2. The curve is symmetrical about its center.
3. The mean, median, and mode coincide at the center.
4. The width of the curve is determined by the standard deviation of the distribution.
5. The tails of the curve approach the horizontal axis but never touch it; that is, the curve is
asymptotic to the baseline.
6. The total area under the curve is equal to 1.00 or 100%.

A normal distribution can have any mean and any positive standard deviation. These two
parameters completely determine the shape of the normal curve. The mean gives the location of the
line of symmetry, and the standard deviation describes how much the data are spread out.
Empirical Rule
The area under the normal curve that lies within
• one standard deviation of the mean is approximately 0.68 (68%);
• two standard deviations of the mean is approximately 0.95 (95%); and
• three standard deviations of the mean is approximately 0.997 (99.7%).
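
The three percentages can be reproduced numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` class; the helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The printed areas are slightly more precise than the rounded 68%, 95%, and 99.7% figures quoted in the rule.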

STANDARD NORMAL DISTRIBUTION

The standard normal distribution is a normal distribution with a mean of 0 and a standard
deviation of 1.
Since each normally distributed variable has its own mean and standard deviation, the shape and location
of these curves will vary. In practical applications, one would have to have a table of areas under the curve
for each variable. To simplify this, statisticians use the standard normal distribution.
Standard Normal Cumulative Probability Table
Cumulative probabilities for NEGATIVE z-values are shown in the following table:

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641

Standard Normal Cumulative Probability Table


Cumulative probabilities for POSITIVE z-values are shown in the following table:
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
PROPERTIES OF STANDARD NORMAL DISTRIBUTION

1. The cumulative area is close to 0 for z-scores close to z = −3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.

EXAMPLE

Using the Standard Normal Table:

1. Find the area that corresponds to a z-score of 1.13.


2. Find the area that corresponds to a z-score of -2.57.
3. Find the area that corresponds to a z-score of 0.36.
Solution:
1. Find the area that corresponds to z = 1.13 by finding 1.1 in the left column and then moving
across the row to the column under 0.03. The number in that row and column is 0.8708. So, the
area to the left of z = 1.13 is 0.8708.

2. Find the area that corresponds to z = −2.57 by finding −2.5 in the left column and then moving
across the row to the column under 0.07. The number in that row and column is 0.0051. So, the
area to the left of z = −2.57 is 0.0051.

3. Find the area that corresponds to z = 0.36 by finding 0.3 in the left column and then moving
across the row to the column under 0.06. The number in that row and column is 0.6406. So, the
area to the left of z = 0.36 is 0.6406.
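
A table lookup can be cross-checked by computing the cumulative area directly. The following is a minimal sketch that assumes the standard-library `statistics.NormalDist` class; `table_area` is a hypothetical helper that rounds to four decimal places the way the printed table does.

```python
from statistics import NormalDist


def table_area(z):
    """Cumulative area to the left of z, rounded to 4 places like the z-table."""
    return round(NormalDist().cdf(z), 4)


for z in (1.13, -2.57, 0.36):
    print(z, table_area(z))
```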
Lesson 2 Regions of Area Under the Normal Curve

WHAT YOU SHOULD LEARN
At the end of the lesson, you are expected to:
• identify the regions of the areas under the normal curve;
• express the areas under the normal curve as probabilities or percentages; and
• determine the areas under the normal curve given z-values.

We already know that the area under the curve is equal to 1, so we can make a correspondence
between area and probability. We also learned how to use the z-table to identify areas of
regions under the normal curve. When we say "region under the curve," we are referring to the
area of that region.

Task:
Sketch the graph of a normal curve. Draw a vertical line through the specified z-values and
shade the region.
1. z < 1.13
2. z > 1.13
3. −1.37 < z < 1.13
4. z < −2.47 or z > 1.13

[Figures: four normal curves with the regions for items 1 to 4 shaded.]
Area Under the Standard Normal Distribution Curve

1. To the left of any z value:
Look up the z value in the table and use the area given.

Example: z < 1.13
Step 1. Use the table to find the area for the z-score.
Step 2. The area to the left of z = 1.13 is 0.8708. Multiply it by 100 to express it as a
percentage: 87.08%.

2. To the right of any z value:
Look up the z value and subtract the area from 1.

Example: z > 1.13
Step 1. Use the table to find the area for the z-score.
Step 2. The area to the left of z = 1.13 is 0.8708.
Step 3. Subtract to find the area to the right of z = 1.13: 1 − 0.8708 = 0.1292. Multiply it by
100 to express it as a percentage: 12.92%.

Or simply look up −1.13 in the table. By symmetry, the area to the left of z = −1.13 is 0.1292,
which is the same as the area to the right of z = 1.13.
3. Between two z values:
Look up both z values and subtract the corresponding areas.

Example: −1.37 < z < 1.13
Step 1. Use the table to find the area for each z-score.
Step 2. The area to the left of z = −1.37 is 0.0853.
Step 3. The area to the left of z = 1.13 is 0.8708.
Step 4. Subtract to find the area of the region between the two z-scores:
0.8708 − 0.0853 = 0.7855. Multiply it by 100 to express it as a percentage: 78.55%.

4. In both tails:
Find the area in each tail and add the two areas.

Example: z < −2.47 or z > 1.13
Step 1. Use the table to find the area for each z-score.
Step 2. The area to the left of z = −2.47 is 0.0068 (0.68%).
Step 3. The area to the left of z = 1.13 is 0.8708, so the area to the right of z = 1.13 is
1 − 0.8708 = 0.1292 (12.92%).
Step 4. Add the results of Steps 2 and 3: 0.0068 + 0.1292 = 0.1360. Multiply it by 100 to
express it as a percentage: 13.60%.
Alternative Solution
Step 1. Use the table to find the area for each z-score.
Step 2. The area to the left of z = −2.47 is 0.0068.
Step 3. The area to the left of z = 1.13 is 0.8708.
Step 4. Subtract the result of Step 2 from the result of Step 3: 0.8708 − 0.0068 = 0.8640
(86.40%). Take note that this is the area of the unshaded region.
Step 5. To get the area of the shaded region, subtract 0.8640 from 1: 1 − 0.8640 = 0.1360, or
13.60%.

Find the z value such that the area under the standard normal distribution curve
between 0 and the z value is 0.2123.

Step 1. The table gives cumulative areas to the left of a z-score, so first convert the given
area to a cumulative area.
Step 2. The area to the left of z = 0 is 0.5000.
Step 3. Add 0.5000 and 0.2123 to get the area to the left of the unknown z value:
0.5000 + 0.2123 = 0.7123.
Step 4. Look for 0.7123 inside the table. It is found in the row for 0.5 and the column for 0.06.

The z value is 0.56
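
The four kinds of regions, and the reverse lookup, can be sketched as follows. This assumes the standard-library `statistics.NormalDist` class, and the variable names are chosen only for this illustration. Because the code works with unrounded areas, a result may differ from the table-based answer in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

left = Z.cdf(1.13)                         # area to the left of z = 1.13
right = 1 - Z.cdf(1.13)                    # area to the right of z = 1.13
between = Z.cdf(1.13) - Z.cdf(-1.37)       # area for -1.37 < z < 1.13
tails = Z.cdf(-2.47) + (1 - Z.cdf(1.13))   # area for z < -2.47 or z > 1.13

# Reverse lookup: the z value whose cumulative area is 0.5000 + 0.2123.
z_unknown = Z.inv_cdf(0.5 + 0.2123)

for name, value in [("left", left), ("right", right), ("between", between),
                    ("tails", tails), ("z", z_unknown)]:
    print(name, round(value, 4))
```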


Lesson 3 Application of Normal Curve Concepts in Real-Life Problems

WHAT YOU SHOULD LEARN
At the end of the lesson, you are expected to:
• apply the normal curve in solving word problems; and
• develop the habit of reasoning using normal curve concepts.

Application of Normal Curve Distribution

The standard normal distribution curve can be used to solve a wide variety of practical
problems. The only requirement is that the variable be normally or approximately normally
distributed.

To solve problems by using the standard normal distribution, transform the original variable
to a standard normal distribution variable by using the z value formula.

Z-Value (Standard Value)
The z value is the number of standard deviations that a particular X value is away from the
mean. The formula for finding the z value is:

\[ z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \quad \text{or} \quad z = \frac{X - \mu}{\sigma} \]

EXAMPLE

A survey by the National Retail Federation found that women spend on average $146.21 for the Christmas
holidays. Assume the standard deviation is $29.44. Find the percentage of women who spend less than
$160.00. Assume the variable is normally distributed.

Step 1: Draw the normal distribution curve.

Step 2: Find the z value corresponding to $160.00.

\[ z = \frac{X - \mu}{\sigma} = \frac{160 - 146.21}{29.44} = 0.47 \]

Step 3: Find the area to the left of z = 0.47.

The table gives us an area of 0.6808, or 68.08%.

About 68% of women spend less than $160.
EXAMPLE

Each month, an American household generates an average of 28 pounds of newspaper for garbage or
recycling. Assume the standard deviation is 2 pounds. If a household is selected at random, find the
probability of its generating between 27 and 31 pounds per month. Assume the variable is approximately
normally distributed.

Step 1: Draw the normal distribution curve.

Step 2: Find the z values corresponding to 27 pounds and 31 pounds.

\[ z = \frac{X - \mu}{\sigma} = \frac{27 - 28}{2} = -0.5 \]

\[ z = \frac{X - \mu}{\sigma} = \frac{31 - 28}{2} = 1.5 \]

Step 3: Find the area between z = −0.5 and z = 1.5.

1. The area to the left of z = −0.5 is 0.3085.
2. The area to the left of z = 1.5 is 0.9332.
3. Subtract #1 from #2: 0.9332 − 0.3085 = 0.6247.

The probability is 0.6247, or about 62%.

EXAMPLE
The American Automobile Association reports that the average time it takes to respond to an emergency call
is 25 minutes. Assume the variable is approximately normally distributed and the standard deviation is 4.5
minutes. If 80 calls are randomly selected, approximately how many will be responded to in less than 15
minutes?

Step 1: Draw the normal distribution curve.

Step 2: Find the z value corresponding to 15 minutes.

\[ z = \frac{X - \mu}{\sigma} = \frac{15 - 25}{4.5} = -2.22 \]

Step 3: Find the area to the left of z = −2.22.

The table gives us an area of 0.0132, or 1.32%.

To find how many calls will be responded to in less than 15 minutes, multiply the sample size
80 by the area of the shaded region: 80 × 0.0132 = 1.056. Hence, approximately 1 call will be
responded to in under 15 minutes.
EXAMPLE

To qualify for a police academy, candidates must score in the top 10% on a general abilities test. The test has a
mean of 200 and a standard deviation of 20. Find the lowest possible score to qualify. Assume the test scores
are normally distributed.

Step 1: Draw the normal distribution curve.

Step 2: Subtract 0.1000 from 1 to find the area to the left of the cutoff: 1 − 0.1000 = 0.9000.
Look for the closest value to 0.9000 in the table. The closest value is 0.8997, which
corresponds to a z value of 1.28.

Step 3: Solve the z value formula for X.

\[ z = \frac{X - \mu}{\sigma} \quad \Rightarrow \quad X = z\sigma + \mu = 1.28(20) + 200 = 225.60 \]

The cutoff, the lowest possible score to qualify, is 226.
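
The four word problems in this lesson follow the same pattern, which can be sketched in code. This is only an illustration that assumes the standard-library `statistics.NormalDist` class and uses variable names invented here. The code does not round z to two decimal places first, so its results can differ slightly from the table-based answers.

```python
from math import ceil
from statistics import NormalDist

# Holiday spending: mean 146.21, standard deviation 29.44.
spending = NormalDist(mu=146.21, sigma=29.44)
p_less_160 = spending.cdf(160)

# Newspaper weight: mean 28 pounds, standard deviation 2 pounds.
paper = NormalDist(mu=28, sigma=2)
p_27_to_31 = paper.cdf(31) - paper.cdf(27)

# Emergency response time: mean 25 minutes, standard deviation 4.5 minutes.
response = NormalDist(mu=25, sigma=4.5)
expected_calls = 80 * response.cdf(15)

# Police academy test: the top 10% starts where the cumulative area is 0.90.
scores = NormalDist(mu=200, sigma=20)
cutoff = scores.inv_cdf(0.90)

print(round(p_less_160, 4), round(p_27_to_31, 4))
print(round(expected_calls, 2), ceil(cutoff))
```

Rounding the cutoff up with `ceil` reflects the fact that a candidate needs a whole-number score at or above the computed value.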


UNIT 4 SAMPLING DISTRIBUTION AND CENTRAL LIMIT THEOREM

1 Sampling Distribution
2 Central Limit Theorem

Did you know that as you increase the number of dice you roll, the distribution of the possible
results approaches a normal distribution? In this unit, you will learn about the importance of
the sample size in obtaining approximately normally distributed data. French mathematician
Abraham de Moivre used the normal distribution to approximate the distribution of the number of
heads that will result when a fair coin is tossed a large number of times. This idea is called
the Central Limit Theorem (CLT), and Russian mathematician Aleksander Lyapunov later gave its
first rigorous proof.
Lesson 1 SAMPLING DISTRIBUTION

At the end of this lesson, you are expected to:
• illustrate random sampling;
• identify the sampling distribution of a statistic (sample mean); and
• find the mean and variance of the sampling distribution of the sample mean.

PRE-ASSESSMENT:

Ednel is working at TN Department Store in Valenzuela City. The numbers of bags he was able to
sell on three days are 20, 30, and 50. List all the possible samples of size 2 that can be
drawn from the population with replacement.

A SAMPLING DISTRIBUTION is the probability distribution of a sample statistic that is formed
when samples of size n are taken from a population.

A SAMPLING DISTRIBUTION OF SAMPLE MEANS is a frequency distribution using the means computed
from all possible random samples of a specific size taken from a population.

SAMPLING ERROR refers to the difference between the sample mean and the population mean.

EXAMPLE #1

A population consists of the numbers 2, 4, 9, 10, and 5. Let us list all possible samples of
size 3 from this population and compute the mean of each sample.

STEP 1. Determine the number of sets of all possible random samples that can be drawn from the
given population. Then list all the possible samples and compute the mean of each sample.

\[ {}_{N}C_{n} = \frac{N!}{n!(N-n)!} = \frac{5!}{3!(5-3)!} = \frac{5!}{3!\,2!} = 10 \]

Sample       x̄
2, 4, 9      5
2, 4, 10     5.33
2, 4, 5      3.67
2, 9, 10     7
2, 9, 5      5.33
2, 10, 5     5.67
4, 9, 10     7.67
4, 9, 5      6
4, 10, 5     6.33
9, 10, 5     8

STEP 2. Construct the sampling distribution of the means.

x̄       Frequency    P(x̄)
3.67    1            1/10
5       1            1/10
5.33    2            2/10
5.67    1            1/10
6       1            1/10
6.33    1            1/10
7       1            1/10
7.67    1            1/10
8       1            1/10

STEP 3. Illustrate using a histogram.
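
Steps 1 and 2 can be sketched with the standard library. The code below is only an illustration: it enumerates the samples without replacement using `itertools.combinations`, and the variable names are invented for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 9, 10, 5]
n = 3

# Step 1: every sample of size n drawn without replacement, with its mean.
samples = list(combinations(population, n))
means = [round(sum(s) / n, 2) for s in samples]

# Step 2: frequency and probability of each distinct sample mean.
counts = Counter(means)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

print(len(samples))
for m, p in dist.items():
    print(m, counts[m], p)
```

`Fraction` reduces each probability to lowest terms, so the probability 2/10 is displayed as 1/5.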

EXAMPLE #2

Going back to the situation given in the Pre-Assessment, let’s identify the samples of size 2
(with replacement), given that the numbers of bags he was able to sell on three days are 20, 30,
and 50.

STEP 1. List all the possible samples and compute the mean of each sample.

Observation    Sample    x̄
1              20, 30    (20+30)/2 = 25
2              20, 50    (20+50)/2 = 35
3              30, 50    (30+50)/2 = 40
4              30, 20    (30+20)/2 = 25
5              50, 20    (50+20)/2 = 35
6              50, 30    (50+30)/2 = 40
7              20, 20    (20+20)/2 = 20
8              30, 30    (30+30)/2 = 30
9              50, 50    (50+50)/2 = 50

STEP 2. Construct the sampling distribution of the means.

x̄     Frequency    P(x̄)
20    1            1/9
25    2            2/9
30    1            1/9
35    2            2/9
40    2            2/9
50    1            1/9

STEP 3. Illustrate using a histogram.

EXAMPLE #3

Nanno receives either 92 or 93 as her grade in each of her three major subjects: Basic Calculus
(BC), General Chemistry (GC), and General Biology (GB). Construct the sampling distribution of
her mean grade.

STEP 1. List all the possible samples and compute the mean of each sample.

BC    GC    GB    x̄
92    92    92    92
92    92    93    92.33
92    93    92    92.33
92    93    93    92.67
93    92    92    92.33
93    92    93    92.67
93    93    92    92.67
93    93    93    93

STEP 2. Construct the sampling distribution of the means.

x̄        Frequency    P(x̄)
92       1            1/8
92.33    3            3/8
92.67    3            3/8
93       1            1/8

STEP 3. Illustrate using a histogram.

2.1. What is the probability that her mean grade is lower than 93?

\[ P(\bar{x} < 93) = P(92) + P(92.33) + P(92.67) \]

\[ P(\bar{x} < 93) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} = \frac{7}{8} = 0.875 \]

Hence, the probability that her mean grade is lower than 93 is 87.5%.

2.2. What is the probability that her mean grade is greater than 92.33?

\[ P(\bar{x} > 92.33) = P(92.67) + P(93) \]

\[ P(\bar{x} > 92.33) = \frac{3}{8} + \frac{1}{8} = \frac{4}{8} = 0.5 \]

Therefore, the probability that her mean grade is greater than 92.33 is 50%.
PROPERTIES OF SAMPLING DISTRIBUTION OF SAMPLE MEANS

1. The mean of the sample means x̄ is equal to the population mean μ.

\[ \mu_{\bar{x}} = \mu \]

To solve for the mean of the sample means:

\[ \mu_{\bar{x}} = \sum [\bar{x} \cdot P(\bar{x})] \]

2. The variance of the sampling distribution of the sample means is given by

\[ \sigma^2_{\bar{x}} = \sum [P(\bar{x}) \cdot (\bar{x} - \mu)^2] \]

or

\[ \sigma^2_{\bar{x}} = \sum [\bar{x}^2 \cdot P(\bar{x})] - \mu^2 \]

If σ² and n are given, then

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \]
for infinite population (with replacement)

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} \]
for finite population (without replacement)

3. The standard deviation of the sampling distribution of the sample mean is given by:

\[ \sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} \]
for infinite population (with replacement)

\[ \sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}} \]
for finite population (without replacement)

The standard deviation of the sampling distribution of the sample mean is called the
STANDARD ERROR of the mean.

EXAMPLE #1.1

Refer to Example #1, where the population consists of the numbers 2, 4, 9, 10, and 5.

a. Compute the population mean.

\[ \mu = \frac{\sum x}{N} = \frac{2 + 4 + 9 + 10 + 5}{5} = 6 \]

b. Compute the population variance.

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{(2 - 6)^2 + (4 - 6)^2 + (9 - 6)^2 + (10 - 6)^2 + (5 - 6)^2}{5} = \frac{46}{5} = 9.2 \]

c. Compute the mean of the sample means x̄.

x̄       P(x̄)    x̄ ∙ P(x̄)
3.67    1/10    0.367
5       1/10    0.5
5.33    2/10    1.066
5.67    1/10    0.567
6       1/10    0.6
6.33    1/10    0.633
7       1/10    0.7
7.67    1/10    0.767
8       1/10    0.8

\[ \mu_{\bar{x}} = \sum [\bar{x} \cdot P(\bar{x})] = 6 \]

d. Compute the variance of the sampling distribution of the sample means.

x̄       P(x̄)    x̄ ∙ P(x̄)    x̄²         x̄² ∙ P(x̄)
3.67    1/10    0.367       13.4689    1.34689
5       1/10    0.5         25         2.5
5.33    2/10    1.066       28.4089    5.68178
5.67    1/10    0.567       32.1489    3.21489
6       1/10    0.6         36         3.6
6.33    1/10    0.633       40.0689    4.00689
7       1/10    0.7         49         4.9
7.67    1/10    0.767       58.8289    5.88289
8       1/10    0.8         64         6.4
Sum             6                      37.53334

\[ \sigma^2_{\bar{x}} = \sum [\bar{x}^2 \cdot P(\bar{x})] - \mu^2 = 37.53334 - (6)^2 \]

\[ \sigma^2_{\bar{x}} = 1.53334 \approx 1.53 \]

You can also use the alternative method. Using the population variance,

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} \]

\[ \sigma^2_{\bar{x}} = \frac{9.2}{3} \cdot \frac{5 - 3}{5 - 1} = 1.53 \]
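
Both methods can be checked against each other by brute force. The sketch below assumes the standard-library `statistics` and `itertools` modules, and its variable names are chosen only for this illustration. It enumerates all samples of size 3 drawn without replacement and compares the variance of their means with the finite-population formula.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [2, 4, 9, 10, 5]
N, n = len(population), 3

mu = mean(population)            # population mean
sigma2 = pvariance(population)   # population variance (divides by N)

# Means of all samples of size n drawn without replacement.
sample_means = [sum(s) / n for s in combinations(population, n)]

mu_xbar = mean(sample_means)         # mean of the sample means
var_xbar = pvariance(sample_means)   # variance of the sample means

# Finite-population formula from the properties above.
formula = sigma2 / n * (N - n) / (N - 1)

print(mu, sigma2)
print(round(mu_xbar, 4), round(var_xbar, 4), round(formula, 4))
```

`pvariance` is used rather than `variance` because the ten sample means are treated as the complete sampling distribution, not as a sample from it.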
EXAMPLE #4

If the variance of the sampling distribution of means is 2.5 and the sample size is n = 4,
find the population variance σ².

Rearrange the formula for the variance of the sampling distribution of means:

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \]

\[ \sigma^2 = (\sigma^2_{\bar{x}})(n) \]

\[ \sigma^2 = (2.5)(4) \]

\[ \sigma^2 = 10 \]

Hence, the population variance is 10.

EXAMPLE #5

Suppose a random sample of size 200 is taken from a population with a mean of 510
kg and a standard deviation of 15 kg.
a. Find the mean and the variance of the sample mean.
b. If it is required to reduce the standard error of the mean to less than 0.5 kg,
what is the minimum sample size?

a. The mean of the sample mean is equal to the population mean, so it is 510 kg.

Since 15 kg is the population standard deviation, we can use this formula for the variance of
the sample mean:

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \]

\[ \sigma^2_{\bar{x}} = \frac{15^2}{200} \]

\[ \sigma^2_{\bar{x}} = 1.125 \approx 1.13 \]

b. Using the formula for the standard error of the mean,

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

\[ 0.5 > \frac{15}{\sqrt{n}} \]

\[ (0.5)(\sqrt{n}) > 15 \]

\[ \sqrt{n} > \frac{15}{0.5} \]

\[ \sqrt{n} > 30 \]

\[ n > 900 \]

Hence, the minimum sample size is 901.
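
The inequality in part b can be turned into a small helper. This is a sketch under the same assumption as the example (an infinite population, so the standard error is σ/√n); the function name `min_sample_size` is hypothetical.

```python
from math import floor, sqrt


def min_sample_size(sigma, max_se):
    """Smallest n for which sigma / sqrt(n) is strictly less than max_se."""
    n = floor((sigma / max_se) ** 2) + 1
    # Guard against floating-point rounding near the boundary.
    while sigma / sqrt(n) >= max_se:
        n += 1
    return n


print(15 ** 2 / 200)             # variance of the sample mean for n = 200
print(min_sample_size(15, 0.5))  # minimum n for a standard error below 0.5 kg
```

With n = 900 the standard error is exactly 0.5 kg, which does not satisfy "less than 0.5 kg", so the helper returns the next whole number.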


Lesson 2 CENTRAL LIMIT THEOREM

At the end of this lesson, you are expected to:
• illustrate the Central Limit Theorem;
• define the sampling distribution of the mean using the Central Limit Theorem; and
• solve problems involving the sampling distribution of the mean.

PRE-ASSESSMENT:

Try this before you proceed to the next part of the lesson.

A die has 6 faces, and each face shows x = 1, 2, 3, 4, 5, or 6 dots. Taking these values as
the population, consider the following sample sizes:
• n = 1
• n = 2
• n = 3

Illustrate the probability histogram of the sampling distribution of the mean.

Observe the probability histogram for n = 1 and n = 2.

For n = 1

x̄      Frequency    P(x̄)
1      1            1/6
2      1            1/6
3      1            1/6
4      1            1/6
5      1            1/6
6      1            1/6

For n = 2

x̄      Frequency    P(x̄)
1      1            1/36
1.5    2            2/36
2      3            3/36
2.5    4            4/36
3      5            5/36
3.5    6            6/36
4      5            5/36
4.5    4            4/36
5      3            3/36
5.5    2            2/36
6      1            1/36

Try to illustrate when n = 3.

Given the probability histograms of the sampling distribution of the mean, observe that as the
sample size increases, the distribution of the sample mean looks more and more like a normal
distribution. This illustrates the idea of the Central Limit Theorem.
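
For readers who want to check their table for n = 3, the exact sampling distribution can be enumerated. The sketch below assumes a fair six-sided die and sampling with replacement; the function name `mean_distribution` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def mean_distribution(n, faces=range(1, 7)):
    """Exact sampling distribution of the mean of n fair dice (with replacement)."""
    totals = Counter(sum(roll) for roll in product(faces, repeat=n))
    outcomes = len(faces) ** n
    return {Fraction(t, n): Fraction(c, outcomes) for t, c in sorted(totals.items())}


for x_bar, p in mean_distribution(3).items():
    # A crude text histogram: one '#' per outcome out of 216.
    print(f"{float(x_bar):5.2f}  {'#' * (p * 216).numerator}")
```

The bars rise toward the middle values and fall off symmetrically, which is the bell-like shape the lesson describes.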
The first version of the Central Limit Theorem was proved by the French-born mathematician
Abraham de Moivre (1667-1754). He used the normal distribution to approximate the distribution of
the number of heads that will result when a fair coin is tossed a large number of times. The first
rigorous proof of the general Central Limit Theorem was introduced by the Russian mathematician,
mechanician and physicist, Aleksander Lyapunov (1857-1918).

CENTRAL LIMIT THEOREM

This theorem describes the relationship between the sampling distribution of sample means and the
population that the samples are taken from.

1. If samples of size n, where n ≥ 30, are drawn from any population with a mean μ and a standard
deviation σ, then the sampling distribution of sample means approximates a normal distribution.
The greater the sample size, the better the approximation.
2. If the population itself is normally distributed, then the sampling distribution of sample means is
normally distributed for any sample size n.

To convert a sample mean x̄ to a z value, use

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

or

\[ z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} \]
EXAMPLE #1

The population mean monthly salary for Associate Professors is about ₱63 500. A
random sample of 35 Associate Professors is drawn from the population. What is the
probability that the mean salary of the sample is less than ₱60 000? Assume that
σ = ₱6 100.

Using the formula of the Central Limit Theorem,

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

If x̄ = 60 000,

\[ z = \frac{60\,000 - 63\,500}{6\,100 / \sqrt{35}} \approx -3.39 \]

Finding the probability to the left of z,

P(z < −3.39) = 0.0003

P(z < −3.39) = 0.03%

Thus, the probability that the mean monthly salary of the sample of 35 Associate Professors is
less than ₱60 000 is 0.03%.

EXAMPLE #2

Out of 150 teenage drivers, you randomly picked 50 drivers. What is the probability
that the mean time they spend driving each day is between 24.7 and 25.5 minutes?
Assume that σ = 1.5 minutes and μ = 25 minutes.

Using the formula of the Central Limit Theorem,

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

If x̄ = 24.7,

\[ z = \frac{24.7 - 25}{1.5 / \sqrt{50}} = -1.41 \]

If x̄ = 25.5,

\[ z = \frac{25.5 - 25}{1.5 / \sqrt{50}} = 2.36 \]

Finding the probability to the left of each z,

P(z < −1.41) = 0.0793
P(z < 2.36) = 0.9909

P(−1.41 < z < 2.36) = 0.9909 − 0.0793

P(−1.41 < z < 2.36) = 0.9116

P(−1.41 < z < 2.36) = 91.16%

Thus, the probability that the drivers have a mean driving time between 24.7
minutes and 25.5 minutes is 91.16%.

EXAMPLE #3

The mean NAT score of Grade 10 students is 65. Sixty students were chosen, and the standard
deviation of their scores was found to be 5. What is the probability that their mean
score is between 64 and 67?

Using the formula of the Central Limit Theorem,

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

If x̄ = 64,

\[ z = \frac{64 - 65}{5 / \sqrt{60}} = -1.55 \]

If x̄ = 67,

\[ z = \frac{67 - 65}{5 / \sqrt{60}} = 3.10 \]

Finding the probability to the left of each z,

P(z < −1.55) = 0.0606
P(z < 3.10) = 0.9990

P(−1.55 < z < 3.10) = 0.9990 − 0.0606

P(−1.55 < z < 3.10) = 0.9384

P(−1.55 < z < 3.10) = 93.84%

Therefore, the probability that the mean score is between 64 and 67 is 0.9384 or
93.84%.
EXAMPLE #4

Suppose the mean amount of cholesterol in eggs labeled “large” is 186 milligrams, with
standard deviation 7 milligrams. Find the probability that the mean amount of cholesterol
in a sample of 144 eggs will be within 2 milligrams of the population mean.

Using the Central Limit Theorem formula,

$$z = \frac{x - \mu}{\sigma / \sqrt{n}}$$

If $x = 186 - 2 = 184$,

$$z = \frac{184 - 186}{7 / \sqrt{144}} = -3.43$$

If $x = 186 + 2 = 188$,

$$z = \frac{188 - 186}{7 / \sqrt{144}} = 3.43$$

Finding the probability to the left of each $z$,

$$P(z < -3.43) = 0.0003, \qquad P(z < 3.43) = 0.9997$$

$$P(-3.43 < z < 3.43) = 0.9997 - 0.0003 = 0.9994 = 99.94\%$$

Therefore, the probability that the mean amount of cholesterol in a sample of 144
eggs will be within 2 milligrams of the population mean is 0.9994 or 99.94%.
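
The "between two values" examples all follow the same pattern: find the area to the left of each bound and subtract. A minimal sketch of that pattern follows. The helper name `prob_mean_between` is hypothetical, and because it uses unrounded z-values instead of the z-table, its results can differ from the worked answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(low, high, mu, sigma, n):
    """P(low < sample mean < high), treating the sample mean as normal
    with mean mu and standard error sigma / sqrt(n)."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


# Driving-time example: mu = 25, sigma = 1.5, n = 50
p_drivers = prob_mean_between(24.7, 25.5, 25, 1.5, 50)

# Cholesterol example: mu = 186, sigma = 7, n = 144, within 2 mg of the mean
p_eggs = prob_mean_between(184, 188, 186, 7, 144)

print(f"drivers: {p_drivers:.4f}")
print(f"eggs:    {p_eggs:.4f}")
```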
UNIT 5
CONFIDENCE INTERVALS

1 Confidence Intervals for the Mean (Large Samples)
2 Confidence Intervals for the Mean (Small Samples), t-distribution
3 Confidence Intervals for Population Proportion

You wish to find the leading candidate for the presidency in the next election. Since it is
impossible to ask all registered voters whom they will vote for, you conduct a survey of 5000
registered voters. You find that 33% of them want Rodrigo Duterte to become the next president.
Since the estimated percentage is just a single number, it is hard to tell whether it is the true
proportion. To estimate the result, you use a margin of error to obtain a range in which the true
proportion lies. In this case, with a 1% margin of error, statistically 32-34% want to vote for
Duterte. In this unit, you will learn how to estimate a parameter in a given situation.
Lesson 1 CONFIDENCE INTERVALS FOR THE MEAN (LARGE SAMPLES)

At the end of this lesson, you are expected to:
• illustrate point and interval estimation,
• distinguish between point and interval estimation, and
• compute the point estimate of the population mean.

PRE-ASSESSMENT:

Below is the frequency distribution table of a random sample of the weights (in kg) of
Grade 11 students in Pamantasan ng Lungsod ng Valenzuela. Find the mean.

WEIGHT (in kg)    FREQUENCY
43-47             6
48-52             10
53-57             7
58-62             4
63-67             1
68-72             2

In this lesson, you will learn how to use sample statistics to make an estimate of the population
parameter when the sample size is at least 30 or when the population is normally distributed and the
standard deviation is known. To make such an inference, begin by finding a point estimate.
A point estimate is a single value estimate for a population parameter. The most unbiased point
estimate of the population mean is the sample mean 𝑥̅ .
An interval estimate is an interval, or range of values, used to estimate a population parameter.

The level of confidence c is the probability that the interval estimate contains the population
parameter.

The critical value is the value that marks the point beyond which the rejection region lies. This
region does not contain the true population parameter.

For example, if $c = 90\%$, then 5% of the area lies to the left of $-z_c = -1.645$ and 5% lies to
the right of $z_c = 1.645$.

CONFIDENCE LEVEL (%)    CRITICAL VALUE OF z ($z_c$)
80                      ±1.28
90                      ±1.645
95                      ±1.96
98                      ±2.33
99                      ±2.58
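
The critical values in the table can be reproduced from the standard normal distribution: for a confidence level c, the area left in each tail is (1 − c)/2, and $z_c$ is the point with that much area to its right. The sketch below assumes Python's standard-library `NormalDist`; the function name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Positive critical value z_c for a two-sided confidence level (e.g. 0.95)."""
    tail_area = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail_area)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z_c = {z_critical(level):.3f}")
```

Rounded to the precision used in the table, these agree with the listed values.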

Given a level of confidence c, the margin of error E (sometimes also called the maximum error
of estimate or error tolerance) is the greatest possible distance between the point estimate and the
value of the parameter it is estimating.
$$E = z_c \frac{\sigma}{\sqrt{n}}$$

CONFIDENCE INTERVALS FOR THE POPULATION MEAN

Using a point estimate and a margin of error, you can construct an interval estimate of a population
parameter such as $\mu$. This interval estimate is called a confidence interval.

$$\bar{x} - E < \mu < \bar{x} + E$$

Here $\bar{x} - E$ is the left endpoint (LE) and $\bar{x} + E$ is the right endpoint (RE).

The probability that the confidence interval contains $\mu$ is c.


Finding a Confidence Interval for a Population Mean (𝑛 ≥ 30 or 𝜎 known with a normally
distributed population)

STEPS
1. Find the sample statistics n and $\bar{x}$.
2. Specify $\sigma$ if known. Otherwise, if $n \ge 30$, find the sample standard deviation s and
   use it as an estimate for $\sigma$.
3. Find the critical value $z_c$ that corresponds to the given level of confidence.
4. Find the margin of error E.
5. Find the left and right endpoints and form the confidence interval.

FIND A MINIMUM SAMPLE SIZE TO ESTIMATE 𝝁

Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate
the population mean 𝜇 is
$$n = \left(\frac{z_c \sigma}{E}\right)^2$$

If $\sigma$ is unknown, you can estimate it using s, provided you have a preliminary sample with at
least 30 members.
Let’s go back to the situation given in the pre-assessment. Solving for the mean of the given data,

WEIGHT (in kg)    FREQUENCY (f)    MIDPOINT (x)    fx
43-47             6                45              270
48-52             10               50              500
53-57             7                55              385
58-62             4                60              240
63-67             1                65              65
68-72             2                70              140

$$\bar{x} = \frac{\sum fx}{n} = \frac{1600}{30} = 53.33 \text{ kg}$$
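
The grouped-data mean above can be recomputed directly from the class limits and frequencies. This is a small sketch; the variable names are arbitrary and the midpoint of each class is taken as the average of its lower and upper limits, as in the table.

```python
# (lower limit, upper limit), frequency for each weight class in kg
classes = [
    ((43, 47), 6),
    ((48, 52), 10),
    ((53, 57), 7),
    ((58, 62), 4),
    ((63, 67), 1),
    ((68, 72), 2),
]

n = sum(freq for _, freq in classes)                            # total frequency
sum_fx = sum((lo + hi) / 2 * freq for (lo, hi), freq in classes)  # sum of f * midpoint
mean = sum_fx / n

print(f"n = {n}, sum of fx = {sum_fx:.0f}, mean = {mean:.2f} kg")
```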

To identify the interval for the population parameter of the given data, the sample mean of
53.33 kg will be the point estimate. Now, given a 95% confidence level, find the margin of error for
the mean weight of the Grade 11 students of Pamantasan ng Lungsod ng Valenzuela. Assume
that the standard deviation is about 7 kg.

$$E = z_c \frac{\sigma}{\sqrt{n}}$$

$$z_c = 1.96, \quad \sigma = 7, \quad n = 30$$

$$E = (1.96)\left(\frac{7}{\sqrt{30}}\right) = 2.50$$

Thus, given the 95% confidence level, the margin of error for the population mean is 2.50 kg.

Finally, let’s construct the confidence interval.

$$\bar{x} - E < \mu < \bar{x} + E$$
$$53.33 - 2.50 < \mu < 53.33 + 2.50$$
$$50.83 < \mu < 55.83$$

In conclusion, with 95% confidence, the population mean weight of Grade 11
students of Pamantasan ng Lungsod ng Valenzuela is between 50.83 kg and 55.83 kg.
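
Steps 4 and 5 of the procedure (margin of error, then endpoints) can be sketched as a single helper. The name `z_interval` is hypothetical, and the critical value is passed in from the table rather than computed.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z_c):
    """Return (E, left endpoint, right endpoint) for a z-based confidence interval."""
    margin = z_c * sigma / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Grade 11 weights: x_bar = 53.33 kg, sigma = 7 kg, n = 30, 95% confidence
margin, left, right = z_interval(53.33, 7, 30, 1.96)
print(f"E = {margin:.2f} kg, interval: {left:.2f} kg to {right:.2f} kg")
```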
EXAMPLE #2

From a random sample of 60 days of the year 2020, Philippine gasoline prices had a mean
of ₱60.25 and a standard deviation of ₱21.75. Construct the 90%, 95%, and 99% confidence
intervals for the population mean.

With 90% confidence level, $c = 90\%$, $z_c = 1.645$, $\sigma = 21.75$, $n = 60$

$$E = z_c \frac{\sigma}{\sqrt{n}} = (1.645)\left(\frac{21.75}{\sqrt{60}}\right) = 4.62$$

$$\bar{x} - E < \mu < \bar{x} + E$$
$$60.25 - 4.62 < \mu < 60.25 + 4.62$$
$$55.63 < \mu < 64.87$$

With 90% confidence, the population mean price of gasoline in the Philippines in 2020 is
between ₱55.63 and ₱64.87.

With 95% confidence level, $c = 95\%$, $z_c = 1.96$, $\sigma = 21.75$, $n = 60$

$$E = z_c \frac{\sigma}{\sqrt{n}} = (1.96)\left(\frac{21.75}{\sqrt{60}}\right) = 5.50$$

$$\bar{x} - E < \mu < \bar{x} + E$$
$$60.25 - 5.50 < \mu < 60.25 + 5.50$$
$$54.75 < \mu < 65.75$$

With 95% confidence, the population mean price of gasoline in the Philippines in 2020 is
between ₱54.75 and ₱65.75.

With 99% confidence level, $c = 99\%$, $z_c = 2.58$, $\sigma = 21.75$, $n = 60$

$$E = z_c \frac{\sigma}{\sqrt{n}} = (2.58)\left(\frac{21.75}{\sqrt{60}}\right) = 7.24$$

$$\bar{x} - E < \mu < \bar{x} + E$$
$$60.25 - 7.24 < \mu < 60.25 + 7.24$$
$$53.01 < \mu < 67.49$$

With 99% confidence, the population mean price of gasoline in the Philippines in 2020 is
between ₱53.01 and ₱67.49.
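
To see how the interval widens as the confidence level rises, the three gasoline-price intervals can be generated in one loop. This is an illustrative sketch only; the dictionary `Z_TABLE` simply copies the critical values from the table in this lesson.

```python
from math import sqrt

# Critical values copied from the lesson's table
Z_TABLE = {90: 1.645, 95: 1.96, 99: 2.58}

x_bar, sigma, n = 60.25, 21.75, 60  # gasoline price example

intervals = {}
for level, z_c in Z_TABLE.items():
    margin = z_c * sigma / sqrt(n)
    intervals[level] = (round(x_bar - margin, 2), round(x_bar + margin, 2))
    print(f"{level}%: E = {margin:.2f}, interval = {intervals[level]}")
```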
MARGIN OF ERROR

$$E = \frac{\text{Right Endpoint} - \text{Left Endpoint}}{2} = \frac{RE - LE}{2}$$

LENGTH OF CONFIDENCE INTERVAL

$$L = \text{Right Endpoint} - \text{Left Endpoint} = RE - LE$$

$$L = 2E = 2 z_c \frac{\sigma}{\sqrt{n}}$$

EXAMPLE #3

Find the margin of error and the length of the confidence interval,

a. if the confidence interval is $35.08 < \mu < 36.92$;
b. if the confidence level is 95%, $\sigma = 0.60$, and $n = 44$.

a.
$$E = \frac{RE - LE}{2} = \frac{36.92 - 35.08}{2} = 0.92$$

$$L = RE - LE = 36.92 - 35.08 = 1.84$$

Hence, the margin of error is 0.92 and the length of the confidence interval is 1.84.

b.
$$E = z_c \frac{\sigma}{\sqrt{n}} = (1.96)\left(\frac{0.60}{\sqrt{44}}\right) = 0.18$$

$$L = 2E = 2(0.18) = 0.36$$

Thus, the margin of error is 0.18 and the length of the confidence interval is 0.36.
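
Both parts of this example can be checked numerically. The sketch below uses two hypothetical helpers, one working from the endpoints and one from $z_c$, $\sigma$, and $n$. Note that doubling the unrounded margin in part (b) gives a length of about 0.35, while doubling the rounded value 0.18 gives 0.36, so the last digit depends on when you round.

```python
from math import sqrt


def margin_and_length(left, right):
    """Margin of error and interval length from the two endpoints."""
    length = right - left
    return length / 2, length


def margin_from_sigma(z_c, sigma, n):
    """Margin of error E = z_c * sigma / sqrt(n)."""
    return z_c * sigma / sqrt(n)


e_a, len_a = margin_and_length(35.08, 36.92)   # part (a)
e_b = margin_from_sigma(1.96, 0.60, 44)        # part (b)

print(f"(a) E = {e_a:.2f}, L = {len_a:.2f}")
print(f"(b) E = {e_b:.2f}, L = {2 * e_b:.2f} (unrounded E)")
```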
EXAMPLE #4

Given $E = 75$ and $\sigma = 250$, find the minimum sample size if the confidence level is:
(a) 90%, (b) 95%, and (c) 99%.

In each case, $n = \left(\frac{z_c \sigma}{E}\right)^2$.

With 90% confidence level,

$$n = \left(\frac{1.645 \cdot 250}{75}\right)^2 = 30.07$$

The minimum sample size is 31.

With 95% confidence level,

$$n = \left(\frac{1.96 \cdot 250}{75}\right)^2 = 42.68$$

The minimum sample size is 43.

With 99% confidence level,

$$n = \left(\frac{2.58 \cdot 250}{75}\right)^2 = 73.96$$

The minimum sample size is 74.

EXAMPLE #5

A company president wishes to estimate the average number of hours his part-time
employees work per week. The standard deviation from a previous study is 9.3 hours.
How large a sample must be selected if he wants to be 99% confident that the sample
mean is within 4 hours of the true mean?

With 99% confidence level,

$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{2.58 \cdot 9.3}{4}\right)^2 = 35.98$$

Thus, the president needs a sample of at least 36 part-time employees.


EXAMPLE #6

A researcher found that the IQ scores of the ALS students in the Division of
Valenzuela are normally distributed with a mean of 110 and a standard deviation of 10.
How many ALS students need to be tested so that the estimate is within 5 points of the
population mean with a 99% level of confidence?

With 99% confidence level,

$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{2.58 \cdot 10}{5}\right)^2 = 26.63$$

Therefore, 27 ALS students need to be tested so that the estimate is within 5 points
of the population mean with a 99% level of confidence.
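
The minimum-sample-size examples all apply the same formula and then round up. A minimal sketch, assuming a hypothetical helper `min_sample_size` that takes the table value of $z_c$ as input:

```python
from math import ceil


def min_sample_size(z_c, sigma, margin):
    """Smallest whole n with z_c * sigma / sqrt(n) <= margin (always rounds up)."""
    return ceil((z_c * sigma / margin) ** 2)


# E = 75, sigma = 250 at 90%, 95%, and 99% confidence
sizes = [min_sample_size(z_c, 250, 75) for z_c in (1.645, 1.96, 2.58)]
print(sizes)

# Part-time hours (sigma = 9.3, E = 4) and ALS IQ scores (sigma = 10, E = 5), both at 99%
print(min_sample_size(2.58, 9.3, 4), min_sample_size(2.58, 10, 5))
```

Using `ceil` rather than ordinary rounding matters: rounding 30.07 down to 30 would give a margin of error slightly larger than the one requested.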

NOTES
• Increasing the confidence level also increases the margin of error, which gives a
  wider interval for the population mean.
• As the level of confidence increases, the confidence interval widens. As the confidence
  interval widens, the precision of the estimate decreases. To prevent the loss of precision,
  the sample size should also increase.
• For minimum sample size, round UP the result to obtain a whole number.
• There are three (3) factors that influence sample size determination: (1) level of
  confidence, (2) population standard deviation, and (3) the margin of error.
  Researchers can control the margin of error and the confidence level. The less error you are
  willing to accept, the bigger the sample size needs to be. Also, the more confident you
  want to be, the bigger the sample size needs to be.
Lesson 2 CONFIDENCE INTERVALS FOR THE MEAN (SMALL SAMPLES)

At the end of this lesson, you are expected to:
• illustrate the t-distribution,
• identify regions under the t-distribution corresponding to t-values,
• compute the confidence interval estimate based on the appropriate form of the
  estimator for the population mean, and
• solve problems involving confidence interval estimation of the population mean.

PRE-ASSESSMENT:

Given that the sample mean is 150.5, $\sigma = 30.25$, and $n = 50$, find the confidence interval
if the confidence level is:
(a) 90%,
(b) 95%, and
(c) 99%.

In many real-life situations, the population standard deviation is unknown. Moreover, because of
various constraints such as time and cost, it is often not practical to collect samples of size 30
or more. So, how can you construct a confidence interval for a population mean given such
circumstances? If the random variable is normally distributed (or approximately normally
distributed), you can use a t-distribution.

t-DISTRIBUTION

If the distribution of a random variable x is approximately normal, then

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

follows a t-distribution.

Critical values of t are denoted by $t_c$. Several properties of the t-distribution are as follows.
1. The t-distribution is bell-shaped and symmetric about the mean.
2. The t-distribution is a family of curves, each determined by a parameter called the
   degrees of freedom. The degrees of freedom are the number of free choices left after
   a sample statistic such as $\bar{x}$ is calculated. When you use a t-distribution to estimate a
   population mean, the degrees of freedom are equal to one less than the sample size.
   Degrees of freedom: d.f. = n − 1
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are equal to 0.
5. As the degrees of freedom increase, the t-distribution approaches the normal
   distribution. After 30 d.f., the t-distribution is very close to the standard normal
   z-distribution.

ILLUSTRATION OF DEGREES OF FREEDOM
Suppose the number of chairs in your classroom equals the number of students: 20 chairs for
20 students. Each of the first 19 students has a choice of which chair he or she will sit in.
There is no freedom of choice, however, for the 20th student who enters the room.
t-table
EXAMPLE #1

Find the critical value 𝒕𝒄 for a 90% confidence level when the sample size is 14.

𝑛 = 14
𝑑𝑓 = 𝑛 − 1 = 14 − 1 = 13
𝑐 = 90%

𝑡𝑐 = ±1.771

EXAMPLE #2

Find the critical value 𝒕𝒄 for a 95% confidence level when the sample size is 20.

𝑛 = 20
𝑑𝑓 = 𝑛 − 1 = 20 − 1 = 19
𝑐 = 95%

𝑡𝑐 = ±2.093
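
Looking up $t_c$ is a two-step process: convert the sample size to degrees of freedom, then read the table at the chosen confidence level. The sketch below mimics that lookup with a tiny dictionary holding only the t-table entries quoted in this lesson; `T_TABLE` and `t_critical` are illustrative names, and a full table or statistical software would be needed for other cases.

```python
# (degrees of freedom, confidence level) -> t_c, limited to values used in this lesson
T_TABLE = {
    (13, 0.90): 1.771,
    (15, 0.90): 1.753,
    (15, 0.95): 2.131,
    (15, 0.99): 2.947,
    (19, 0.95): 2.093,
}


def t_critical(n, confidence):
    """Critical value t_c for a sample of size n, using d.f. = n - 1."""
    degrees_of_freedom = n - 1
    return T_TABLE[(degrees_of_freedom, confidence)]


print(t_critical(14, 0.90))  # sample size 14 -> d.f. 13
print(t_critical(20, 0.95))  # sample size 20 -> d.f. 19
```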
CONFIDENCE INTERVALS AND t-DISTRIBUTIONS
Constructing a confidence interval using the t-distribution is similar to constructing a confidence
interval using the normal distribution—both use a point estimate and a margin of error E.

Constructing a Confidence Interval for the Mean: t-Distribution


1. Find the sample statistics n, 𝑥̅ and s.
2. Identify the degrees of freedom, the level of confidence c, and the critical value 𝑡𝑐 .
3. Find the margin of error E.

$$E = t_c \frac{s}{\sqrt{n}}$$

4. Find the left and right endpoints and form the confidence interval.

(𝑥̅ − 𝐸) < 𝜇 < ( 𝑥̅ + 𝐸)


EXAMPLE #3

Find the margin of error if $s = 5$, $n = 16$, and the confidence level is: (a) 90%, (b) 95%,
(c) 99%.

In each case, $d.f. = n - 1 = 16 - 1 = 15$ and $E = t_c \frac{s}{\sqrt{n}}$.

(a) $c = 90\%$, $t_c = 1.753$

$$E = 1.753\left(\frac{5}{\sqrt{16}}\right) = 2.19$$

(b) $c = 95\%$, $t_c = 2.131$

$$E = 2.131\left(\frac{5}{\sqrt{16}}\right) = 2.66$$

(c) $c = 99\%$, $t_c = 2.947$

$$E = 2.947\left(\frac{5}{\sqrt{16}}\right) = 3.68$$

EXAMPLE #4

Using Example 3, construct the confidence intervals if the sample mean is 18.65.

$$(\bar{x} - E) < \mu < (\bar{x} + E)$$

With 90% confidence level and 𝐸 = 2.19,


(𝑥̅ − 𝐸) < 𝜇 < ( 𝑥̅ + 𝐸)
(18.65 − 2.19) < 𝜇 < ( 18.65 + 2.19)
16.46 < 𝜇 < 20.84
With 90% confidence, the population mean is between 16.46 and 20.84.

With 95% confidence level and 𝐸 = 2.66,


(𝑥̅ − 𝐸) < 𝜇 < ( 𝑥̅ + 𝐸)
(18.65 − 2.66) < 𝜇 < ( 18.65 + 2.66)
15.99 < 𝜇 < 21.31
With 95% confidence, the population mean is between 15.99 and 21.31.

With 99% confidence level and 𝐸 = 3.68,


(𝑥̅ − 𝐸) < 𝜇 < ( 𝑥̅ + 𝐸)
(18.65 − 3.68) < 𝜇 < ( 18.65 + 3.68)
14.97 < 𝜇 < 22.33
With 99% confidence, the population mean is between 14.97 and 22.33.

EXAMPLE #5

You randomly select 16 coffee shops and measure the temperature of the coffee sold
at each. The sample mean temperature is 𝟏𝟔𝟐. 𝟎℉ with a sample standard deviation of
𝟏𝟎. 𝟎℉. Construct a 95% confidence interval for the population mean temperature.
Assume the temperatures are approximately normally distributed.

𝑠 = 10, 𝑛 = 16, 𝑑𝑓 = 16 − 1 = 15, 𝑥̅ = 162, 𝑐 = 95%, 𝑡𝑐 = 2.131

$$E = t_c \frac{s}{\sqrt{n}}$$

$$E = 2.131\left(\frac{10}{\sqrt{16}}\right) = 5.33$$

(𝑥̅ − 𝐸) < 𝜇 < ( 𝑥̅ + 𝐸)


(162 − 5.33) < 𝜇 < ( 162 + 5.33)
156.67 < 𝜇 < 167.33

With 95% confidence, the population mean temperature of coffee sold in coffee shops is
between 𝟏𝟓𝟔. 𝟔𝟕℉ and 𝟏𝟔𝟕. 𝟑𝟑℉.
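
The t-based interval has the same shape as the z-based one, with s in place of $\sigma$ and $t_c$ in place of $z_c$. A minimal sketch, assuming a hypothetical helper `t_interval` that takes $t_c$ from the t-table as input:

```python
from math import sqrt


def t_interval(x_bar, s, n, t_c):
    """Return (E, left endpoint, right endpoint) for a t-based confidence interval."""
    margin = t_c * s / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Coffee temperature example: x_bar = 162.0, s = 10.0, n = 16, t_c = 2.131 (95%, d.f. = 15)
margin, left, right = t_interval(162.0, 10.0, 16, 2.131)
print(f"E = {margin:.2f}, interval: {left:.2f} to {right:.2f} degrees F")
```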
Lesson 3 CONFIDENCE INTERVALS FOR POPULATION PROPORTIONS

At the end of this lesson, you are expected to:
• compute the point estimate for a population proportion,
• determine the minimum sample size required when estimating a population proportion,
• compute the confidence interval estimate of the population proportion, and
• solve problems involving confidence interval estimation of the population proportion.

The previous lessons of this unit estimated the population mean and focused on quantitative data.
This lesson, meanwhile, deals with estimation for qualitative data. Recall that the probability of
success in a single trial of a binomial experiment is p. This probability is a population
proportion. In this lesson, you will learn how to estimate a population proportion p using a
confidence interval. As with confidence intervals for $\mu$, you will start with a point estimate.

POINT ESTIMATE FOR A POPULATION PROPORTION


The point estimate for p, the population proportion of successes, is given by the proportion of
successes in a sample and is denoted by

$$\hat{p} = \frac{x}{n} \qquad \text{(sample proportion)}$$

where x is the number of successes in the sample and n is the sample size. The point estimate for the
population proportion of failures is $\hat{q} = 1 - \hat{p}$. The symbols $\hat{p}$ and $\hat{q}$ are
read as “p hat” and “q hat.”

A c-CONFIDENCE INTERVAL for a population proportion p is

$$\hat{p} - E < p < \hat{p} + E$$

where p is the population proportion, E is the margin of error, $\hat{p} - E$ is the lower confidence
limit, and $\hat{p} + E$ is the upper confidence limit. The margin of error is

$$E = z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Constructing a Confidence Interval for a Population Proportion


1. Identify the sample statistics n and x.
2. Find the point estimate 𝑝̂ .
3. Verify that the sampling distribution of 𝑝̂ can be approximated by a normal distribution.
4. Find the critical value 𝑧𝑐 that corresponds to the given level of confidence c.
5. Find the margin of error E.
6. Find the left and right endpoints and form the confidence interval.
EXAMPLE #1

In a survey of 1000 adults, 373 said that it is acceptable to legalize divorce in the country.

a. Find a point estimate for the population proportion of adults who say it is acceptable
to legalize divorce in the country.
b. Construct a 95% confidence interval for the population proportion of adults who say
that it is acceptable to legalize divorce in the country.

$$\hat{p} = \frac{x}{n} = \frac{373}{1000} = 0.373$$

$$\hat{q} = 1 - \hat{p} = 1 - 0.373 = 0.627$$

$$E = z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \quad z_c = 1.96$$

$$E = (1.96)\sqrt{\frac{0.373(0.627)}{1000}} = 0.030$$

$$\hat{p} - E < p < \hat{p} + E$$
$$0.373 - 0.030 < p < 0.373 + 0.030$$
$$0.343 < p < 0.403$$

Hence, with 95% confidence, the population proportion of adults who say that it is
acceptable to legalize divorce in the country is between 34.3% and 40.3%.
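
The six construction steps can be sketched as one function. The name `proportion_interval` is hypothetical. For step 3, the sketch checks that both $n\hat{p}$ and $n\hat{q}$ are at least 5, which is a commonly used rule of thumb and an assumption here, since the lesson does not state the exact condition.

```python
from math import sqrt


def proportion_interval(x, n, z_c):
    """Return (p_hat, E, lower limit, upper limit) for a population proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Assumed rule of thumb for the normal approximation (not stated in the lesson)
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation may not be appropriate")
    margin = z_c * sqrt(p_hat * q_hat / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# Divorce survey: 373 of 1000 adults, 95% confidence
p_hat, margin, lower, upper = proportion_interval(373, 1000, 1.96)
print(f"p_hat = {p_hat:.3f}, E = {margin:.3f}, interval: {lower:.3f} to {upper:.3f}")
```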

EXAMPLE #2

In a survey of 2000 Filipinos (aged 16-25), 1231 said that BlackPink is the best KPOP girl
group in Asia.

a. Find a point estimate for the population proportion of Filipinos who say BlackPink is
the best KPOP girl group in Asia.
b. Construct a 90% confidence interval for the population proportion of Filipinos who
say BlackPink is the best KPOP girl group in Asia.
$$\hat{p} = \frac{x}{n} = \frac{1231}{2000} = 0.6155$$

$$\hat{q} = 1 - \hat{p} = 1 - 0.6155 = 0.3845$$

$$E = z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \quad z_c = 1.645$$

$$E = (1.645)\sqrt{\frac{0.6155(0.3845)}{2000}} = 0.0179$$

$$\hat{p} - E < p < \hat{p} + E$$
$$0.6155 - 0.0179 < p < 0.6155 + 0.0179$$
$$0.5976 < p < 0.6334$$

Hence, with 90% confidence, the population proportion of Filipinos who say BlackPink is
the best KPOP girl group in Asia is between 59.76% and 63.34%.

FINDING A MINIMUM SAMPLE SIZE TO ESTIMATE p


Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate p
is

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2$$

This formula assumes that you have preliminary estimates of $\hat{p}$ and $\hat{q}$. If not, use
$\hat{p} = 0.5$ and $\hat{q} = 0.5$.
EXAMPLE #3

Miriam is running for President and wishes to estimate, with 95% confidence, the
population proportion of registered voters who will vote for her. Her estimate must be accurate
within 3% of the population proportion. Find the minimum sample size needed if (a) no
preliminary estimate is available and (b) a preliminary estimate gives $\hat{p} = 0.31$.

(a) $\hat{p} = 0.5$, $\hat{q} = 0.5$, $z_c = 1.96$, $E = 0.03$

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2 = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 = 1067.11$$

Rounding up, $n = 1068$.

The minimum sample size for no preliminary estimate is 1068 registered voters.

(b) $\hat{p} = 0.31$, $\hat{q} = 0.69$, $z_c = 1.96$, $E = 0.03$

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2 = (0.31)(0.69)\left(\frac{1.96}{0.03}\right)^2 = 913.02$$

Rounding up, $n = 914$.

The minimum sample size if $\hat{p} = 0.31$ is 914 registered voters.
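
As with the mean, the sample size for a proportion is always rounded up, even when the fractional part is tiny (913.02 still requires 914 respondents). A minimal sketch, assuming a hypothetical helper `min_sample_size_proportion` whose default of 0.5 covers the case with no preliminary estimate:

```python
from math import ceil


def min_sample_size_proportion(z_c, margin, p_hat=0.5):
    """Smallest whole n for the given margin of error; p_hat defaults to 0.5
    when no preliminary estimate is available."""
    q_hat = 1 - p_hat
    return ceil(p_hat * q_hat * (z_c / margin) ** 2)


n_no_estimate = min_sample_size_proportion(1.96, 0.03)          # case (a)
n_with_estimate = min_sample_size_proportion(1.96, 0.03, 0.31)  # case (b)
print(n_no_estimate, n_with_estimate)
```

The default of 0.5 is the conservative choice because the product $\hat{p}\hat{q}$ is largest there, so it never underestimates the sample size needed.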
