Measures of Variability and Normal Distribution
Module
Table of Contents
❖ INTRODUCTION
❖ OBJECTIVES
❖ TOPICS
❖ MEASURES OF VARIABILITY
● Importance of Measuring Variability
● Range and Interquartile Range
● Variance
● Standard Deviation
● Consideration of Choosing Measures of Variability
❖ NORMAL DISTRIBUTION
● Normal Distribution
● Areas Under the Normal Curve
● Shaded Region Under the Normal Curve
● Understanding Z-score
● Percentile Under the Normal Curve
❖ PRACTICAL APPLICATIONS
INTRODUCTION
Measures of variability and the normal distribution are fundamental concepts in statistics.
Measures of variability, such as the range, variance, and standard deviation, quantify the spread or
dispersion of data points in a dataset. The range is the simplest measure, representing the difference
between the maximum and minimum values. Variance and standard deviation provide more
detailed information by calculating the average of the squared differences from the mean, with the
standard deviation being the square root of the variance. A smaller standard deviation indicates
that data points are closer to the mean, while a larger one suggests greater variability.
The normal distribution, also known as the Gaussian distribution or bell curve, is a widely
used probability distribution. It is described by a symmetric, bell-shaped curve, where data tend
to cluster around the mean, with decreasing frequency as values move away from the mean. The
68-95-99.7 rule states that approximately 68% of the data fall within one standard deviation of
the mean, 95% within two standard deviations, and 99.7% within three standard deviations. This
property makes the normal distribution a powerful tool in statistical analysis, hypothesis testing,
and modeling in fields ranging from the natural sciences to the social sciences. These concepts
are essential for understanding and analyzing data.
Objectives related to measures of variability and the normal
distribution typically focus on understanding, calculating, and
applying these concepts in various statistical and data analysis
contexts. Here are some common objectives related to these topics:
★ Understanding Measures of Variability:
1. Comprehend the concept of variability in data.
2. Differentiate between variance, standard deviation, range, and interquartile range
(IQR).
3. Recognize the significance of measures of variability in the field of statistics.
★ Calculating Variance and Standard Deviation:
1. Learn the procedures for computing variance and standard deviation for datasets.
2. Grasp the formulas and steps involved in these computations.
3. Gain proficiency in calculating variance and standard deviation for both sample and
population data.
★ Interpreting Variability Measures:
1. Interpret the implications of variance and standard deviation within the context of
data analysis.
2. Articulate the meaning of high and low values of these measures.
3. Explore the link between variability and the spread of data.
★ Exploring the Range and IQR:
1. Calculate and elucidate the significance of the range and IQR when assessing data
distribution.
2. Comprehend how the range and IQR aid in discerning data spread.
3. Analyze and contrast the range and IQR as measures of data spread.
★ Grasping Normal Distribution Fundamentals:
1. Define and elucidate the characteristics of the normal distribution.
2. Understand the key attributes of the normal distribution, including its bell-shaped
curve, mean, and standard deviation.
3. Identify the empirical rule (68-95-99.7) associated with the normal distribution.
★ Understanding Characteristics of Normal Distribution:
1. Justify the importance of the normal distribution in statistical and data analysis
contexts.
2. Describe the symmetry and skewness patterns observed in normal distributions.
3. Delve into how the normal distribution is employed to model real-world data.
★ Developing Problem-Solving Proficiency:
1. Cultivate problem-solving abilities in interpreting and analyzing data using
measures of variability and the normal distribution.
2. Solve a variety of statistical problems and exercises pertaining to these concepts.
Measures of Variability
This module has been specifically created with your learning needs in mind. Its primary
purpose is to help you understand the concept of variability measures. This module is designed to
be comprehensive and self-contained for your current learning situation. The language used in this
module is tailored to your vocabulary level. The lessons are organized to align with the standard
curriculum sequence, but you have the flexibility to read them in a different order if it better
2. Explain the concept of measures of variability (range, average deviation, variance, standard
Importance of Measuring
Variability
The term "variability" refers to the distance between data points within a distribution and
their distance from its center. Measures of variability give you descriptive statistics that summarize
your data in addition to measures of central tendency.
Variability summarizes the distance between your points, whereas central tendency, or
average, indicates where the majority of your points are located. This is significant because the
degree of variability affects the degree to which results from the sample may be applied to the
entire population.
Low variability is desirable because it makes it easier to extrapolate population information
from sample data. It is more difficult to
Range and Interquartile
Range
The range and interquartile range are two measures of the spread or dispersion of data in a
dataset.
Range is a statistical measure that represents the spread or dispersion of data in a dataset.
The range gives you an idea of how much the data values vary from the smallest to the largest in
the dataset. It’s a simple way to understand the extent of data dispersion.
Suppose you have the following set of exam scores for a class of students:
3. Finally, subtract the minimum value from maximum value to calculate the range.
Range = 97 - 63
Range = 34
So, the range of exam scores in this dataset is 34. This means that the scores vary from a minimum
Interquartile Range (IQR) is a statistical measure of the spread or dispersion of data that is less
2. Calculate the first quartile (Q1), which represents the 25th percentile of the data. It’s the
IQR = Q3 - Q1
The interquartile range gives you a measure of the spread of the middle 50% of the data. It’s useful
for identifying the variability of the central portion of the dataset while minimizing the influence
Let’s use the following dataset of exam scores: 68, 75, 80, 85, 88, 92, 95, 98
1. First, arrange the data in ascending order: 68, 75, 80, 85, 88, 92, 95, 98
2. Calculate the first quartile (Q1) and the third quartile (Q3):
● Q1 (25th percentile): The median of the lower half of the data, which is the average
Q1 = ( 75 + 80 ) / 2 = 77.5
● Q3 (75th percentile): The median of the upper half of the data, which is the average
Q3 = ( 92 + 95 ) / 2 = 93.5
IQR = Q3 - Q1
IQR = 93.5 - 77.5
IQR = 16
So, the interquartile range (IQR) for this dataset is 16. It represents the spread of the middle
50% of the data, indicating that the middle 50% of exam scores spans 16 points.
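The quartile steps above can be checked with a short Python sketch. It assumes the "median of each half" method described in this lesson; other quartile conventions exist and can give slightly different values. The function name quartiles_by_halves is only illustrative. For this dataset the upper half is 88, 92, 95, 98, so its median is (92 + 95) / 2 = 93.5.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # smallest half of the values
    upper = data[-half:]   # largest half (the middle value is skipped when the count is odd)
    return median(lower), median(upper)


scores = [68, 75, 80, 85, 88, 92, 95, 98]
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1
data_range = max(scores) - min(scores)

print("Q1 =", q1)
print("Q3 =", q3)
print("IQR =", iqr)
print("Range =", data_range)
```

The range uses only the two extreme scores, while the IQR uses only the middle half of the data, which is why the two numbers differ.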
Variance
Variance is a measure of how data points differ from the mean. In layman's terms,
variance measures how far a set of numbers is spread out from its mean (average) value.
Variance is the average of the squared deviations of the values from the mean. The larger
the variance, the more the data are scattered around the mean; the smaller the variance, the
more closely the data cluster around the mean. For this reason, it is called a measure of the
spread of data about the mean.
In probability and statistics, variance is the expected value of the squared deviation of a
random variable from its mean.
The value of variance is equal to the square of the standard deviation, which is another central tool.
Population variance: σ² = ∑ (X − μ)² / N
Sample variance: s² = ∑ (x − x̄)² / (n − 1)
Where:
X (or x) = Value of Observations
μ = Population mean of all Values
x̄ = Sample mean
N = Total number of values in the population
n = Total number of values in the sample
EXAMPLE
Given,
3, 8, 6, 10, 12, 9, 11, 10, 12, 7
SOLUTIONS:
STEP 1
Compute the mean of the 10 values given.
μ = (3 + 8 + 6 + 10 + 12 + 9 + 11 + 10 + 12 + 7) / 10 = 88 / 10 = 8.8
STEP 2
Make a table with three columns, one for the X values, the second for the deviations and the third
for squared deviations. As the data is not given as sample data so we use the formula for population
variance. Thus, the mean is denoted by μ.
STEP 3
Divide the sum of the squared deviations by the number of values:
σ² = ∑ (X − μ)² / N
= 73.6 / 10
= 7.36
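As a rough check of the three steps, the following Python sketch rebuilds the three-column table (value, deviation, squared deviation) for the ten values and then divides by N, treating the data as a population as the solution does. The variable names are only illustrative.

```python
values = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]

# Step 1: mean of the 10 values
N = len(values)
mu = sum(values) / N

# Step 2: deviations from the mean and their squares
deviations = [x - mu for x in values]
squared = [d ** 2 for d in deviations]

print(f"{'X':>4} {'X - mu':>8} {'(X - mu)^2':>12}")
for x, d, sq in zip(values, deviations, squared):
    print(f"{x:>4} {d:>8.1f} {sq:>12.2f}")

# Step 3: population variance = sum of squared deviations / N
sum_of_squares = sum(squared)
population_variance = sum_of_squares / N

print("mean =", round(mu, 2))
print("sum of squared deviations =", round(sum_of_squares, 2))
print("population variance =", round(population_variance, 2))
```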
VARIANCE FORMULAS
Variance can be computed for either grouped or ungrouped data. To recall, there are two types of
variance:
1. Variance of a population
Population Variance - All the members of a group are known as the population. When we want to
find how the data points in a given population vary or are spread out, we use the population
variance. It is the average of the squared distances of each data point from the population mean.
2. Variance of a sample
Sample Variance - If the population is too large, it is difficult to take every data point into
consideration. In such a case, a select number of data points are picked from the population to
form a sample that can describe the entire group. The sample variance is the sum of the squared
distances from the sample mean divided by n − 1, one less than the sample size. It is always
calculated with respect to the sample mean.
There are separate variance formulas for ungrouped data and grouped data. The variance
formulas are given below.
σ² = ∑ (x − x̅)² / n
Where,
σ² = Population Variance
∑ = denotes the sum
xi = ith observation of given data
x̄ = is the mean
n = Total number of observations (Population size)
EXAMPLE:
Calculate Population Variance (σ²) from the following data
10, 50, 30, 20,10, 20, 70, 30
Solution:
Mean x̅ = ∑x/n
= (10 + 50 + 30 + 20 + 10 + 20 + 70 + 30)/8
= 240/8
= 30
Population Variance
σ² = ∑ (x − x̅)² / n
= 3000/8
σ² = 375
EXAMPLE:
Calculate Sample Variance (s²) from the following data
10,50,30,20,10,20,70,30
Mean x̅ = ∑x/n
= (10+50+30+20+10+20+70+30)/8
= 240/8
= 30
Sample Variance
s² = ∑ (x − x̅)² / (n − 1)
= 3000/7
s² = 428.5714
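The two examples differ only in the divisor. A minimal Python sketch using the standard library's statistics module shows this: pvariance divides by n (population) and variance divides by n − 1 (sample).

```python
from statistics import pvariance, variance

data = [10, 50, 30, 20, 10, 20, 70, 30]

mean = sum(data) / len(data)
sum_of_squares = sum((x - mean) ** 2 for x in data)

population_var = pvariance(data)   # sum of squares / n
sample_var = variance(data)        # sum of squares / (n - 1)

print("mean =", mean)
print("sum of squared deviations =", sum_of_squares)
print("population variance =", population_var)
print("sample variance =", round(sample_var, 4))
```

The sample variance is always the larger of the two for the same data, because the same sum of squares is divided by a smaller number.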
VARIANCE FORMULAS FOR GROUPED DATA
EXAMPLE:
Calculate Population Variance (σ²) from the following grouped data
Solution:
Mean x̅ = ∑fx/n
= 55/25
= 2.2
Population Variance
σ² = [∑fx² − (∑fx)²/n] / n
= [147-(55)²/25]/25
= (147-121)/25
= 26/25
σ² = 1.04
EXAMPLE:
Calculate Sample Variance (s²) from the following grouped data
Solution:
Mean x̅ = ∑fx/n
= 55/25
= 2.2
Sample Variance
s² = [∑fx² − (∑fx)²/n] / (n − 1)
= [147-(55)²/25]/24
= (147-121)/24
= 26/24
s² = 1.0833
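The grouped-data arithmetic can be sketched in Python from the three totals used in the solutions (n = ∑f = 25, ∑fx = 55, ∑fx² = 147). The function name grouped_variance and its parameters are hypothetical, chosen only for this illustration; the frequency table itself is not needed once the totals are known.

```python
def grouped_variance(n, sum_fx, sum_fx2, sample=False):
    """Variance from grouped-data totals using [sum(fx^2) - (sum(fx))^2 / n] / divisor."""
    numerator = sum_fx2 - sum_fx ** 2 / n
    divisor = n - 1 if sample else n   # n - 1 for a sample, n for a population
    return numerator / divisor


n, sum_fx, sum_fx2 = 25, 55, 147

mean = sum_fx / n
population_var = grouped_variance(n, sum_fx, sum_fx2)
sample_var = grouped_variance(n, sum_fx, sum_fx2, sample=True)

print("mean =", mean)
print("population variance =", round(population_var, 4))
print("sample variance =", round(sample_var, 4))
```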
PRACTICAL APPLICATION
1. Find the variance for the heights of the top 12 buildings in London, England. The heights
(in feet) are: 800, 720, 655, 655, 625, 600, 590, 529, 513, 502, 502, 502.
4. Calculate Population Variance (σ²) and Sample Variance (s²) from the following grouped
data
Standard
Deviation
In Lesson 3, you learned how to compute the variance, which is defined as the average of
the squared differences from the mean. Variance can be computed for a population or for a
sample, and for grouped or ungrouped data, as discussed in the previous lesson. The standard
deviation is the square root of the variance.
In the image, the curve on top is more spread out and therefore has a
higher standard deviation, while the curve below is more clustered
around the mean and therefore has a lower standard deviation.
σ = √( ∑ (xi − µ)² / N )
Where:
σ = standard deviation
∑ = denotes the sum
xi = individual data point in the set
µ = is the mean
N = Total number of observations (Population size)
OK. Let us explain it step by step.
Say we have a bunch of numbers like 9, 2, 5, 4, 12, 7, 8, 11.
To calculate the standard deviation of those numbers:
1. Work out the mean (the simple average of the numbers)
2. Then for each number: subtract the Mean and square the result.
3. Then work out the mean of those squared differences.
4. Take the square root of that and we are done!
The formula actually says all of that, and I will show you how.
Example:
1. Sam has 20 Rose Bushes. The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the Standard Deviation.
Solution:
Step 1. Work out the mean
The mean is:
= (9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20
= 140 / 20
= 7
And so μ = 7
Step 2. Then for each number: subtract the Mean and square the result.
This is the part of the formula that says: (xi − μ)²
So what is xi? They are the individual x values 9, 2, 5, 4, 12, 7, etc.
In other words x1 = 9, x2 = 2, x3 = 5, etc.
So it says "for each value, subtract the mean and square the result", like this:
(9 - 7)² = (2)² = 4
(2 - 7)² = (-5)² = 25
(5 - 7)² = (-2)² = 4
(4 - 7)² = (-3)² = 9
(12 - 7)² = (5)² = 25
(7 - 7)² = (0)² = 0
(8 - 7)² = (1)² = 1
... etc ...
And we get these results:
4, 25, 4, 9, 25, 0, 1, 16, 4, 16, 0, 9, 25, 4, 9, 9, 4, 1, 4, 9
Note:
The handy sigma notation says to sum up as many terms as we want.
We want to add up all the values from 1 to N, where N = 20 in our case because there are
20 values: ∑ (xi − μ)², for i = 1 to 20.
Step 3. Then work out the mean of those squared differences.
We already calculated (x1 − 7)² = 4 etc. in the previous step, so just sum them up:
= 4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn't the mean yet; we need to divide by how many, which is done by
multiplying by 1/N (the same as dividing by N):
(1/20) × 178 = 8.9
Step 4. Take the square root of that:
σ = √(8.9) = 2.983…
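The four steps for the rose bush example can be followed line by line in a short Python sketch. It treats the 20 bushes as the whole population, as the example does; the variable names are only illustrative.

```python
from math import sqrt

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

# Step 1: work out the mean
N = len(flowers)
mu = sum(flowers) / N

# Step 2: for each number, subtract the mean and square the result
squared_differences = [(x - mu) ** 2 for x in flowers]

# Step 3: work out the mean of those squared differences (divide by N)
total = sum(squared_differences)
mean_of_squares = total / N

# Step 4: take the square root
sigma = sqrt(mean_of_squares)

print("mean =", mu)
print("sum of squared differences =", total)
print("mean of squared differences =", mean_of_squares)
print("population standard deviation =", round(sigma, 3))
```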
SAMPLE STANDARD DEVIATION
Example:
2. Sam has 20 rose bushes, but only counted the flowers on 6 of them!
The "population" is all 20 rose bushes, and the "sample" is the 6 bushes that Sam
counted the flowers of.
Let us say Sam's flower counts are:
9, 2, 5, 4, 12, 7
Note:
We can still estimate the Standard Deviation.
But when we use the sample as an estimate of the whole population, the Standard
Deviation formula changes to this:
s = √( ∑ (X − x̄)² / (N − 1) )
where:
s = sample standard deviation
∑ = denotes the sum
X = each value in the data
x̄ = the sample mean
N = Total number of observations (sample size)
Note:
The important change is "N-1" instead of "N" (which is called "Bessel's correction").
The symbols also change to reflect that we are working on a sample instead of the whole
population:
● The mean is now x̄ (called "x-bar") for sample mean, instead of μ for the population
mean,
● And the answer is s (for sample standard deviation) instead of σ.
But they do not affect the calculations. Only N-1 instead of N changes the calculations.
Solution:
Step 1. Work out the mean
Using sampled values 9, 2, 5, 4, 12, 7
The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5
So: x̄ = 6.5
Step 2. Then for each number: subtract the Mean and square the result.
(9 - 6.5)² = (2.5)² = 6.25
(2 - 6.5)² = (-4.5)² = 20.25
(5 - 6.5)² = (-1.5)² = 2.25
(4 - 6.5)² = (-2.5)² = 6.25
(12 - 6.5)² = (5.5)² = 30.25
(7 - 6.5)² = (0.5)² = 0.25
Step 3. Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how many:
6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
But hang on ... we are calculating the Sample Standard Deviation, so instead of
dividing by how many (N), we will divide by N-1:
65.5 / (6 - 1) = 13.1
Step 4. Take the square root of that:
s = √(13.1) = 3.619…
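To see the effect of dividing by N − 1 instead of N, the sketch below runs both calculations on Sam's six sampled counts using Python's statistics module: stdev applies Bessel's correction (N − 1) and pstdev does not.

```python
from statistics import mean, pstdev, stdev

sample = [9, 2, 5, 4, 12, 7]

x_bar = mean(sample)
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)

s = stdev(sample)                  # divides by N - 1 (sample standard deviation)
uncorrected = pstdev(sample)       # divides by N (treats the 6 values as a population)

print("sample mean =", x_bar)
print("sum of squared differences =", sum_of_squares)
print("sample standard deviation (N - 1) =", round(s, 3))
print("uncorrected value (N) =", round(uncorrected, 3))
```

Dividing by N − 1 gives a slightly larger result, which compensates for a small sample tending to understate the spread of the whole population.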
Note: For you to better compute the standard deviation in either the whole population or
sample population, you can use the Distribution Table in computing the variance. This
will help you compute it easily since standard deviation is the square root of variance.
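GROUPED DATA
SAMPLE:
$$s = \sqrt{s^2} = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}}$$
POPULATION:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum f(X - \mu)^2}{N}}$$
UNGROUPED DATA
SAMPLE:
$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
POPULATION:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$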
Where:
f = frequency
x = classmark for sample
X = classmark for population
x̄ = sample mean
µ = population mean
N = count of values in Population
n = count of individual values in sample
Question: Calculate the mean, variance and standard deviation for the following data:
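Solution #1:
n = ∑f = 55
$$\text{Mean} = \frac{\sum fx}{\sum f} = \frac{925}{55} = 16.818$$
$$\text{Variance} = s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{27575 - \frac{(925)^2}{55}}{55 - 1} = \frac{27575 - 15556.8182}{54} = 222.559$$
$$\text{Standard Deviation} = s = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}} = \sqrt{222.559} = 14.918$$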
Solution #2:
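n = ∑f = 55
$$\text{Mean} = \frac{\sum fx}{\sum f} = \frac{925}{55} = 16.818$$
$$\text{Variance} = s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{12008.05}{54} = 222.37$$
$$\text{Standard Deviation} = s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{222.37} = 14.91$$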
Note: You can use solution number 1 and 2 when you are calculating standard deviation of
grouped data. You can refer to the formula that has been discussed in lesson 3 (Variance) and
then compute the standard deviation by getting the square root of the value of the variance.
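The two solution methods are algebraically the same, so they should agree whenever the mean is not rounded along the way. The Python sketch below compares them on a small frequency table. The table values are hypothetical, made up only for this illustration, and are not the data from the question above.

```python
from math import isclose, sqrt

# Hypothetical frequency table: (class mark x, frequency f)
table = [(2, 3), (4, 5), (6, 2)]

n = sum(f for _, f in table)
sum_fx = sum(f * x for x, f in table)
sum_fx2 = sum(f * x ** 2 for x, f in table)
mean = sum_fx / n

# Solution 1 style: computational formula using the totals
variance_1 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Solution 2 style: definitional formula using deviations from the mean
variance_2 = sum(f * (x - mean) ** 2 for x, f in table) / (n - 1)

sd_1 = sqrt(variance_1)
sd_2 = sqrt(variance_2)

print("variance (solution 1) =", variance_1)
print("variance (solution 2) =", variance_2)
print("methods agree:", isclose(variance_1, variance_2))
```

Small differences between the two methods in hand calculations usually come from rounding the mean before squaring the deviations.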
x 60 61 62 63 64 65 66 67 68
f 2 1 12 29 25 12 10 4 5
PRACTICAL APPLICATION
Let’s calculate the standard deviation for the number of gold coins on a ship run by pirates.
There are a total of 100 pirates on the ship, so statistically the population is 100. If we know
the number of gold coins every pirate has, we use the standard deviation equation for the entire
population.
Now consider a sample of 5 pirates. With a sample size of 5, we use the standard deviation
equation for a sample of a population.
Consider the number of gold coins the 5 pirates have: 4, 2, 5, 8, 6.
Consideration of
Choosing a Measure of
Variability
Earlier in this module, we discussed five measures of variation (the IQV, range,
interquartile range, variance, and standard deviation) that can be used to indicate a
distribution's level of variability. Which one should we use? There is no single answer: we
typically report one measure of variation, and choosing the appropriate one involves several
considerations. As with selecting a measure of central tendency, the variable's level of
measurement is one of the most fundamental factors. To be used correctly, a measure requires
data measured at the level it needs or higher.
Figure 1
A. Nominal level. The options for a measure of variability with nominal variables are limited
to the IQV.
B. Ordinal level. For ordinal variables, it is more challenging to choose the appropriate
measure of variation. Although the IQV can be used to reflect variation in ordinal variable
distributions, it is less informative because it is not sensitive to the rank ordering of values
implied by ordinal variables. The interquartile range is another option. However, it relies on
the difference between two scores to express variation, information that cannot be obtained
from ordinal scores. An acceptable compromise is to report Q1 and Q3 together with the median,
interpreting the interquartile range as the range of rank-ordered values that includes the
middle 50% of the observations.
C. Interval-ratio level. There are three options for interval-ratio variables: the variance (or
its square root, the standard deviation), the range, or the interquartile range. The variance
and standard deviation are typically favored because the range and, to a lesser extent, the
interquartile range are based on just two scores in the distribution (and, as a result, are
sensitive if either of those two scores is extreme). However, the range and the interquartile
range may be used if a distribution is so highly skewed that the mean is no longer indicative
of its central tendency. The range and the interquartile range are also helpful when reading
tables or quickly scanning data to get a general sense of how dispersed a distribution is.
Practical Application
You decide to investigate how young Americans feel about alcohol (ATDRINK) and cigarette use
(ATSMOKE). You obtain the selected output shown below. You should note that
ATDRINK measures how respondents feel about trying alcohol, while ATSMOKE measures how
respondents feel about smoking one pack of cigarettes per day. These are substantially different
questions, and you should consider that in your answer.
c. In 2006, was there more variability in attitudes toward trying alcohol or smoking one pack of
cigarettes per day? Offer an explanation for your findings.
Normal Distribution
Previously, you have learned about continuous random variables - variables that have a
value anywhere in a given interval. In this module, you will learn about the most important of all
● identify the regions under the normal curve that correspond to different standard normal
values;
Normal
Distribution
The normal distribution is the most important distribution in statistics. Many researchers
from different fields use its idea in order to test their research hypotheses that will generate new
knowledge and transform this knowledge into new applications that improve the quality of
You are expected to learn normal distribution and its characteristics and how to construct
a normal curve.
The Normal Random Variable
A continuous random variable is considered normal when its values are distributed
normally, that is, when the majority of the values are close to the expected value and only a
few values are much smaller or much larger. For example, in a grade 11 class, it may be
observed that most students have a height of 170 cm or very close to that, with only a few
students who are extremely tall or extremely short. This illustrates a normal random variable.
Other examples of normal random variables include blood pressure, scores in a test, and the
weights of students belonging to the same group.
Figure 6.1 shows the graph of a normal distribution. The graph of a normal distribution is
a bell-shaped curve, which is also called the normal curve, and the majority of the values are
clustered around the value of 5 with only very few values which are too small and too large.
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
where μ is the expected value (mean), σ is the standard deviation, π ≈ 3.14, and e ≈
2.718.
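The density formula can be written directly as a small Python function and compared with the standard library's statistics.NormalDist. The function name normal_pdf and the example values (a mean of 5 and a standard deviation of 2) are assumptions made only for this sketch.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return exp(exponent) / (sigma * sqrt(2 * pi))


# Peak of the standard normal curve (mu = 0, sigma = 1) occurs at the mean
peak_standard = normal_pdf(0, 0, 1)

# Same point evaluated by hand and by the standard library
by_hand = normal_pdf(6, 5, 2)
by_library = NormalDist(mu=5, sigma=2).pdf(6)

print("peak of standard normal curve =", round(peak_standard, 4))
print("f(6) by hand    =", by_hand)
print("f(6) by library =", by_library)
```

Note that f(x) is the height of the curve, not a probability; probabilities come from areas under the curve.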
This means that the mean, median and mode of
the given distribution are located at exactly one
point since their values are equal, and they are
located at the center of the graph which
indicates the highest peak of the curve.
4. The width of the curve is determined by the standard deviation of the distribution.
7. The standard deviation precisely describes the spread of the normal curve. In fact,
approximately 68.3% of the values in the distribution are within one standard deviation
from mean (from each side), 95.4% is within two standard deviations from mean, and
99.7% is within three standard deviations from the mean.
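The percentages in property 7 can be reproduced, as a rough check, from the cumulative distribution of the standard normal curve. This sketch uses Python's statistics.NormalDist; the helper name area_within is only illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1 = area_within(1)
within_2 = area_within(2)
within_3 = area_within(3)

for k, area in [(1, within_1), (2, within_2), (3, within_3)]:
    print(f"within {k} standard deviation(s): {area * 100:.1f}%")
```

Because every normal curve can be standardized, the same three areas apply to any mean and standard deviation.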
These properties will be very important as you explore further the study of the normal
distribution and its applications. Moreover, knowing the properties of the distribution will also
facilitate the solutions to some problems involving the identification of the mean and standard
deviation, as well as the construction of the normal curve.
Example 1: Find the standard deviation of the normal distribution where 99.7% of the values
fall between 52 and 82.
Solution: The mean is the midpoint halfway between 52 and 82. Thus,
μ = (52 + 82) / 2
μ = 134 / 2
μ = 67
Since 99.7% of the values fall between 52 and 82, then by property 7, these endpoints lie
three standard deviations from the mean, i.e., at μ ± 3σ. Solving for σ,
μ + 3σ = 82 or μ − 3σ = 52
67 + 3σ = 82     67 − 3σ = 52
3σ = 15     −3σ = −15
σ = 5     σ = 5
The standard deviation is 5.
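The reasoning in Example 1 can be sketched as a small Python helper: the mean is the midpoint of the interval, and the standard deviation is the distance from the mean to an endpoint divided by the number of standard deviations k implied by property 7 (k = 1 for 68.3%, 2 for 95.4%, 3 for 99.7%). The function name and parameters are hypothetical.

```python
def mean_and_sd_from_interval(low, high, k):
    """Recover (mu, sigma) when the interval [low, high] spans mu - k*sigma to mu + k*sigma."""
    mu = (low + high) / 2        # midpoint of the interval
    sigma = (high - mu) / k      # distance to an endpoint, split into k equal steps
    return mu, sigma


# Example 1: 99.7% of the values fall between 52 and 82, so k = 3
mu, sigma = mean_and_sd_from_interval(52, 82, 3)
print("mean =", mu)
print("standard deviation =", sigma)
```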
Example 2: Assume that 68.3% of grade 11 students have heights between 1.5 and 1.7 m and
the data are normally distributed.
a. Find the mean.
b. Compute the standard deviation.
Solution: a. To find the mean, compute the value that is halfway between 1.5 and 1.7.
μ = (1.5 + 1.7) / 2
1. Complete the statement by filling in the appropriate word or term on the blank.
a. The graph of a normal distribution is asymptotic to the ________.
b. The total area under the normal curve is ________.
c. The graph of a normal distribution is symmetric along the vertical line that contains
the ________ of the distribution.
d. The mean, median and mode of normal distribution are ________.
e. The graph of the normal distribution depends on the ________ and the ________.
2. The IQ scores of 95.4% of the grade 11 students are between 90 and 110.
a. Compute the mean.
b. Find the standard deviation.
Areas Under the
Normal Curve
Areas under all normal curves are related. For example, the area
percentage to the right of 1.5 standard deviations above the mean is identical
for all normal curves. (The term "area" will refer to "area percentage".)
The fact stated above is the reason we can find an area over an interval for any normal curve by
finding the corresponding area under a standard normal curve (with a mean of 0 and a standard
deviation of 1).
We have seen that the Empirical Rule (68% - 95% - 99.7%) subdivides the area under a normal
distribution into sections with widths of one standard deviation. These subdivisions are fine for
determining percentages as long as we are dealing with values that fall at these exact subdivision
locations.
What do we do when the value does not fall at an Empirical Rule subdivision? By using z-scores,
we have the ability to locate a percentage (or area) under a standard normal distribution at any
location. Z-scores allow for the calculation of area percentages (also called proportions or
probabilities) anywhere along a standard normal distribution curve (and, consequently along the
corresponding normal distribution).
The area percentage (proportion, probability) calculated using a z-score will be a decimal value
between 0 and 1, and will appear in a Z-Score Table. The total area under any normal curve is 1
(or 100%). Since the normal curve is symmetric about the mean, the area on either side of the
mean is 0.5 (or 50%).
To find a specific area under a normal curve, find the z-score of the data value and use a
Z-Score Table to find the area. A Z-Score Table is a table that shows the percentage of values (or
area percentage) to the left of a given z-score on a standard normal distribution.
Positive Z-Score Table and Negative Z-Score Table
• These tables are designed only for the standard normal distribution, which has a mean of 0 and a
standard deviation of 1.
• The left most column is how many standard deviations above (or below) the mean to one decimal
place. (The label in the row contains the integer part and the first decimal of the z-score.)
• The part of the z-score denoting hundredths is found across the top row of the table. (The label
for columns contains the second decimal of the z-score.)
• The intersection of the rows and columns gives the probability or area under the normal curve.
Each value in the body of the table is a cumulative area.
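A Z-Score Table entry can also be computed rather than looked up. The sketch below, which assumes Python's statistics.NormalDist, returns the cumulative area to the left of a z-score rounded to four decimal places, the precision that printed tables commonly use. The function name left_area is only illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def left_area(z):
    """Cumulative area to the left of z, as a Z-Score Table would list it."""
    z = round(z, 2)   # tables are indexed by z to two decimal places
    return round(standard_normal.cdf(z), 4)


# Row 1.2, column 0.03 of a positive table corresponds to z = 1.23
area_pos = left_area(1.23)
# The matching entry in a negative table is z = -1.23
area_neg = left_area(-1.23)
# At the mean, half of the area lies to the left
area_mid = left_area(0)

print("area left of z = 1.23  :", area_pos)
print("area left of z = -1.23 :", area_neg)
print("area left of z = 0     :", area_mid)
```

By symmetry, the areas to the left of z and of −z always add up to 1.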
Shaded Region Under
The Normal Curve
Mathematicians are not fond of lengthy expressions. They use notations or symbols instead.
Probability notations are commonly used to express a lengthy idea about the normal curve in
symbols.
The following are the most common probability notations used in studying concepts on the
normal curve, where a and b are z-score values.
P(a < z < b) denotes the probability that the z-value is between a and b.
P(z > a) denotes the probability that the z-value is above a.
P(z < a) denotes the probability that the z-value is below a.
P(z = a) = 0 denotes that the probability that the z-value is exactly equal to a is 0. A single
z-value corresponds to exactly one point on the horizontal axis, and the line drawn at that point
encloses no area under the curve. That is why the probability that a z-value is exactly equal to
a given value is 0.
Illustration.
1. Find the proportion of the area between z = 2 and z = 3.
Steps: With the graph, decide on what operation will be used to identify the proportion of
the area of the region. Use probability notation to avoid lengthy expressions.
Solution: With the given graph, the operation to be used is subtraction.
P(2 < z < 3) = 0.4987 - 0.4772 = 0.0215
Steps: With the graph, decide on what operation will be used to identify the proportion of
the area of the region. Use probability notation to avoid lengthy expressions.
Solution: With the given graph, the operation to be used is addition.
P(z < 1) = 0.5000 + 0.3413 = 0.8413
This is so because the area of the region from z = 0 to its left is 0.5, since it represents half
of the normal curve. Because the total area under the curve is equal to 1, half of its area is
0.5000 or 0.5.
Steps: Locate from the z-Table the corresponding areas of the given z-values.
Solution: With the given graph, there is no need to decide on what operation to use since, as
defined, if a z-value is equal to exactly one number then its probability, or the proportion of
the area of the region, is automatically 0.
Steps: Make a concluding statement.
Solution: The required area for the z-value exactly equal to 1 is 0.
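The three illustrations can be checked numerically. This Python sketch, which assumes statistics.NormalDist as the source of cumulative areas, uses subtraction for a "between" region and reads a "below" region directly from the cumulative area. The unrounded result for the first case is about 0.0214; the table-based answer of 0.0215 differs in the last digit only because each table entry is already rounded to four places.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)

# P(2 < z < 3): area to the left of 3 minus area to the left of 2
p_between = z.cdf(3) - z.cdf(2)

# P(z < 1): the left half (0.5) plus the area from 0 to 1
area_0_to_1 = z.cdf(1) - z.cdf(0)
p_below = 0.5 + area_0_to_1

# P(z = 1): a single point has no width, so it encloses no area
p_exact = z.cdf(1) - z.cdf(1)

print("P(2 < z < 3) =", round(p_between, 4))
print("area from 0 to 1 =", round(area_0_to_1, 4))
print("P(z < 1) =", round(p_below, 4))
print("P(z = 1) =", p_exact)
```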
Understanding
Z-score
Z-score: Definition, Formula, and Uses
Introduction:
What is a Z-score?
A z-score measures the distance between a data point and the mean using standard
deviations. Z-scores can be positive or negative. The sign tells you whether the observation is
above or below the mean.
For example, a z-score of +2 indicates that the data point falls two standard deviations
above the mean, while a -2 signifies it is two standard deviations below the mean.
A z-score of zero equals the mean. Statisticians also refer to z-scores as standard scores,
and I’ll use those terms interchangeably.
Standardizing the raw data by transforming them into z-scores provides the following benefits:
o Identify outliers
To calculate z-scores, take the raw measurements, subtract the mean, and divide by the standard
deviation:
z = (X − μ) / σ
X represents the data point of interest.
μ (mu) and σ (sigma) represent the mean and standard deviation for the population from which
you drew your sample.
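The z-score calculation takes only one line of code. In the Python sketch below, the function name z_score and the example numbers (a score of 85 or 55 in a population with mean 70 and standard deviation 10) are hypothetical values chosen only to show a positive and a negative result.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma


# Hypothetical population: mean 70, standard deviation 10
above = z_score(85, 70, 10)
below = z_score(55, 70, 10)
at_mean = z_score(70, 70, 10)

print("z for 85:", above)
print("z for 55:", below)
print("z for 70:", at_mean)
```

Because z-scores are unit-free, they allow observations from differently scaled distributions to be compared directly.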
Z-scores help you understand where a specific observation falls within a distribution. Sometimes
the raw test scores are not informative.
When your data are normally distributed, you can graph z-scores on the standard normal
distribution, which is a particular form of the normal distribution. The mean occurs at the peak
with a z-score of zero. Above average z-scores are on the right half of the distribution and below
average values are on the left. The graph below shows where the baby’s z-score of 0.74 fits in the
population.
Percentile Under the
Normal Curve
Introduction:
Which of the following expressions are familiar to you?
● “First honor”
● “Top five”
● “A score of 98%”
These are expressions of order. They indicate relative standing. In real life, many people want
to attain a high level in terms of relative standing.
PERCENTILE:
For any set of measurements (arranged in ascending or descending order), a percentile (or a
centile) is a point in the distribution such that a given percentage of cases is below it.
A percentile is a comparison score between a particular score and the scores of the rest of a group.
It shows the percentage of scores that a particular score surpassed. For example, if you score 75
points on a test and are ranked in the 85th percentile, it means that the score 75 is higher than 85%
of the scores.
AREAS UNDER THE NORMAL CURVE
A percentile is the value in a normal distribution that has a specified percentage of observations
below it. Percentiles are often used in standardized tests like the GRE and in comparing height and
weight of children to gauge their development relative to their peers.
What is the 95th percentile of a normal curve?
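One way to answer this question is to invert the cumulative area: find the z-score that has 95% of the area to its left. The Python sketch below uses statistics.NormalDist.inv_cdf for this. The second part converts that z-score back to a raw value for a hypothetical distribution with mean 100 and standard deviation 15, numbers assumed only for illustration.

```python
from statistics import NormalDist

# z-score with 95% of the area under the standard normal curve to its left
z_95 = NormalDist(mu=0, sigma=1).inv_cdf(0.95)

# Converting back to a raw value: x = mu + z * sigma (hypothetical mu and sigma)
mu, sigma = 100, 15
x_95 = mu + z_95 * sigma

# The same raw value obtained directly from the non-standard distribution
x_95_direct = NormalDist(mu=mu, sigma=sigma).inv_cdf(0.95)

print("95th percentile z-score =", round(z_95, 3))
print("95th percentile raw value =", round(x_95, 2))
```

A printed Z-Score Table gives the same answer when read in reverse: locate the area closest to 0.9500 in the body of the table and read off the row and column labels.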
A. Directions: Solve the following problems.
Scenario: Danny is one of the students who took final examinations in three subjects. The results
of the examination are as follows:
Math 78 5 85
English 82 9 87
Science 85 15 92
1. In which subject did Danny get a higher score than the rest?
2. In which subject did Danny get a lower score than the rest?
3. If the top 7% of the examinees in Math will be given incentives, what must be their score
to be part of the list?
4. What is the probability that a randomly selected value lies between 115 and 125 if the mean
is 100, with a standard deviation of 15?
C. Directions: Find the area under a normal curve in percent given the following conditions.
1. from z = 0 to z = 2.07
2. from z = 0 to z = −1.03
3. from z = −2.33 to z = 3.03
4. from z = 0.22 to z = 2.22
The average net sales per year of the products in 60 branches of DG Company is ₱85
million, with a standard deviation of ₱15 million. Determine how many branches have net sales
of:
Learning Materials
● Module
● Ppt Presentation