Lesson 5 - Measures of Variability
INTRODUCTION:
In the previous lessons, you learned how to compute the mean, median, and
mode. These measures of central tendency only tell you which score best
represents the entire set of data. To describe how spread out the scores are,
you need a measure of variability. In this lesson, you will learn the different
kinds of measures of variability, sometimes called measures of dispersion.
OBJECTIVES:
At the end of this lesson, you should be able to:
1. Compute and interpret the various measures of variability such as the
range, variance, standard deviation and coefficient of variation for a set of
ungrouped data.
2. Compute and interpret the various measures of variability such as the
range, variance, standard deviation and coefficient of variation for a set of
grouped data.
Range
Range is the crudest measure of dispersion. It is the difference between
the highest and the lowest scores in the data set. Because it considers only
two scores, the range is the most unstable measure of dispersion. For
ungrouped data, the range is
R = H – L
where:
R – range
H – highest score
L – lowest score
For example, find the range of the two sets of data below.
1. 3, 4, 5, 5, 6, 8, 10, 15
Solution:
R = 15 – 3 = 12
2. 89, 70, 37, 40, 30, 95, 77, 25, 53, 36
Solution:
R = 95 – 25 = 70
In MS Excel, after pressing the “Enter” key, the result will be 12, which is the
same result as in the manual computation for the first data set. MS Excel is
most helpful if you have hundreds of entries.
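The same computation can be sketched in Python as a cross-check. This is a minimal illustration, not part of the original lesson; the function name `data_range` and the variable names are assumptions chosen for this example.

```python
def data_range(scores):
    """Return the range R = H - L of a list of scores."""
    highest = max(scores)  # H, the highest score
    lowest = min(scores)   # L, the lowest score
    return highest - lowest

# The two data sets from the example above.
set_1 = [3, 4, 5, 5, 6, 8, 10, 15]
set_2 = [89, 70, 37, 40, 30, 95, 77, 25, 53, 36]

print(data_range(set_1))  # 12
print(data_range(set_2))  # 70
```

The data do not need to be sorted first, since only the highest and lowest scores matter.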
For grouped data, the range is
R = HU – LL
where:
R – range
HU – highest upper limit
LL – lowest lower limit
Solution:
The highest upper limit among the class intervals is 94 and the lowest
lower limit is 60. Therefore, the range will be
R = 94 – 60 = 34
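As a rough sketch, the grouped-data range can also be computed in Python. The class intervals below are assumed to be the 60 - 64 through 90 - 94 distribution used in this lesson, and the name `grouped_range` is hypothetical.

```python
# Class intervals as (lower limit, upper limit) pairs.
# Assumption: this is the 60 - 64 through 90 - 94 distribution from the lesson.
class_intervals = [(90, 94), (85, 89), (80, 84), (75, 79),
                   (70, 74), (65, 69), (60, 64)]

def grouped_range(intervals):
    """Return R = HU - LL for a list of (lower, upper) class limits."""
    highest_upper = max(upper for _, upper in intervals)  # HU
    lowest_lower = min(lower for lower, _ in intervals)   # LL
    return highest_upper - lowest_lower

print(grouped_range(class_intervals))  # 34
```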
$$IQR = Q_3 - Q_1$$
$$QD = \frac{Q_3 - Q_1}{2}$$
Solution:
Find Q1: The position of Q1 is $\frac{n + 1}{4} = \frac{10 + 1}{4} = \frac{11}{4} = 2.75$.
Therefore: $Q_1 = 3 + 0.75(3 - 3) = 3$
Find Q3: The position of Q3 is $\frac{3(n + 1)}{4} = \frac{3(10 + 1)}{4} = \frac{33}{4} = 8.25$.
Therefore: $Q_3 = 6 + 0.25(7 - 6) = 6.25$
Since we have already computed the values of Q1 and Q3, we can now plug them
into our formulas:
$$IQR = Q_3 - Q_1 = 6.25 - 3 = 3.25$$
$$QD = \frac{Q_3 - Q_1}{2} = \frac{3.25}{2} = 1.625$$
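The position rule used above can be sketched in Python as follows. The data set in this example is hypothetical: it was chosen only so that the 2nd and 3rd scores are 3 and the 8th and 9th scores are 6 and 7, matching the interpolation steps shown in the solution. The function name `quartile` is also an assumption.

```python
import math

def quartile(sorted_scores, k):
    """Return the k-th quartile (k = 1, 2, 3) using the k(n + 1)/4 position rule."""
    n = len(sorted_scores)
    if n < 3:
        raise ValueError("need at least 3 scores for this position rule")
    position = k * (n + 1) / 4         # 1-based position, possibly fractional
    lower_rank = math.floor(position)  # rank of the score at or just below the position
    fraction = position - lower_rank
    lower_value = sorted_scores[lower_rank - 1]
    if fraction == 0:
        return lower_value
    upper_value = sorted_scores[lower_rank]
    # Interpolate between the two neighboring scores.
    return lower_value + fraction * (upper_value - lower_value)

# Hypothetical data set with n = 10 (not taken from the lesson).
scores = sorted([1, 3, 3, 4, 5, 5, 5, 6, 7, 9])

q1 = quartile(scores, 1)
q3 = quartile(scores, 3)
iqr = q3 - q1
qd = iqr / 2
print(q1, q3, iqr, qd)
```

Other quartile conventions exist, so spreadsheet or library functions may return slightly different values for the same data.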
Class Intervals f
36 - 40 4
31 - 35 6
26 - 30 10
21 - 25 15
16 - 20 20
11 - 15 16
6 - 10 4
Solution:
The first thing we have to do is construct a cumulative frequency (Cf)
column and identify the class size i and the total frequency N.
Class Intervals f Cf
36 - 40 4 75
31 - 35 6 71
26 - 30 10 65
21 - 25 15 55
16 - 20 20 40
11 - 15 16 20
6 - 10 4 4
i=5 N=75
For Q1: Solve $\frac{N}{4} = \frac{75}{4} = 18.75$ and locate it in the Cf column.
Class Intervals f Cf
36 - 40 4 75
31 - 35 6 71
26 - 30 10 65
21 - 25 15 55
16 - 20 20 40
LB=10.5 11 - 15 16 20
6 - 10 4 4
i=5 N=75
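$$Q_1 = LB + \left(\frac{\frac{N}{4} - Cf_b}{f_{Q_1}}\right) i$$
$$Q_1 = 10.5 + \left(\frac{\frac{75}{4} - 4}{16}\right)(5) = 10.5 + \left(\frac{18.75 - 4}{16}\right)(5) = 15.11$$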
For Q3: Solve $\frac{3N}{4} = \frac{3(75)}{4} = 56.25$ and locate it in the Cf column.
Class Intervals f Cf
36 - 40 4 75
31 - 35 6 71
LB=25.5 26 - 30 10 65
21 - 25 15 55
16 - 20 20 40
11 - 15 16 20
6 - 10 4 4
i=5 N=75
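$$Q_3 = LB + \left(\frac{\frac{3N}{4} - Cf_b}{f_{Q_3}}\right) i$$
$$Q_3 = 25.5 + \left(\frac{\frac{3(75)}{4} - 55}{10}\right)(5) = 25.5 + \left(\frac{56.25 - 55}{10}\right)(5) = 26.125$$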
Therefore:
$$IQR = Q_3 - Q_1 = 26.125 - 15.11 = 11.015$$
$$QD = \frac{Q_3 - Q_1}{2} = \frac{26.125 - 15.11}{2} = 5.51$$
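The grouped-data quartile steps can be sketched in Python as shown below. This is an illustrative sketch, not the lesson's own method of computation; the function name `grouped_quartile` is hypothetical, and the code assumes consecutive classes are one unit apart, so that the lower boundary is the lower limit minus 0.5.

```python
# (lower limit, upper limit, frequency), listed from the lowest class up.
classes = [(6, 10, 4), (11, 15, 16), (16, 20, 20), (21, 25, 15),
           (26, 30, 10), (31, 35, 6), (36, 40, 4)]

def grouped_quartile(classes, k):
    """Return the k-th quartile (k = 1, 2, 3) of a frequency distribution."""
    total = sum(freq for _, _, freq in classes)  # N
    target = k * total / 4                       # kN/4
    cf_below = 0                                 # Cfb, cumulative frequency below the class
    for lower, upper, freq in classes:
        if cf_below + freq >= target:
            lower_boundary = lower - 0.5         # LB (assumes unit gaps between classes)
            width = upper - lower + 1            # i, the class size
            return lower_boundary + (target - cf_below) / freq * width
        cf_below += freq
    raise ValueError("k must be 1, 2, or 3")

q1 = grouped_quartile(classes, 1)
q3 = grouped_quartile(classes, 3)
iqr = q3 - q1
qd = iqr / 2
print(round(q1, 2), q3, round(iqr, 3), round(qd, 2))
```

Because the code keeps Q1 unrounded (15.109375 rather than 15.11), the IQR differs from the manual result in the third decimal place.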
$$MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$
for a sample and
$$MAD = \frac{\sum_{i=1}^{N} |x_i - \mu|}{N}$$
for a population.
For example, we are going to find the MAD of the following scores: 3, 4, 5, 5, 6,
8, 10, 15.
Solution:
It is easier to compute the MAD if we construct a table.
Step 1. Construct the table for easy computation.
x
3
4
5
5
6
8
10
15
As you can see, the result is 3. This is equal to the result of the manual
computation.
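A short Python sketch of the same MAD computation is given below as a cross-check. The function name `mean_absolute_deviation` is an assumption for this example.

```python
def mean_absolute_deviation(scores):
    """Return the mean absolute deviation of a list of scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum the absolute deviations from the mean, then divide by n.
    return sum(abs(x - mean) for x in scores) / n

scores = [3, 4, 5, 5, 6, 8, 10, 15]
print(mean_absolute_deviation(scores))  # 3.0
```

Here the mean is 7, the absolute deviations add up to 24, and 24 divided by 8 gives 3.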
$$MAD = \frac{\sum_{i} f_i |x_i - \bar{x}|}{n}$$
where:
Σ – summation symbol
fi – frequency of ith class
xi – midpoint of ith class
x̄ – mean of distribution
n – number of cases
Let us take the frequency distribution table given above and compute the MAD:
Class Intervals f
90 - 94 5
85 - 89 7
80 - 84 10
75 - 79 15
70 - 74 10
65 - 69 5
60 - 64 3
Solution:
Step 1. Compute the mean. In the previous lesson, the mean of the given data was
computed as 77.91.
$$MAD = \frac{\sum f |x_m - \bar{x}|}{n} = \frac{350.01}{55} = 6.36$$
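The grouped MAD can be sketched in Python using the class midpoints and frequencies from the table above. The function name `grouped_mad` is hypothetical. Note that the code uses the unrounded mean, so the sum of weighted deviations comes out as 350.00 rather than the 350.01 obtained with the rounded mean of 77.91; the final answer still rounds to 6.36.

```python
# (class midpoint, frequency) for the 60 - 64 through 90 - 94 distribution.
table = [(92, 5), (87, 7), (82, 10), (77, 15), (72, 10), (67, 5), (62, 3)]

def grouped_mad(table):
    """Return the mean absolute deviation of a frequency distribution."""
    n = sum(freq for _, freq in table)                         # number of cases
    mean = sum(freq * midpoint for midpoint, freq in table) / n
    weighted_deviations = sum(freq * abs(midpoint - mean)
                              for midpoint, freq in table)
    return weighted_deviations / n

mad = grouped_mad(table)
print(round(mad, 2))  # 6.36
```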
Population variance:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
Sample variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
where: σ² – population variance
s² – sample variance
x – individual score
μ – population mean
x̄ – sample mean
N – number of scores (population)
n – number of scores (sample)
For standard deviation, since it is the square root of the variance, the formulas
for the population and sample standard deviation are:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
and
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Since the variance and standard deviation are measures of variability or
spread, they are interpreted as follows: the lower the value, the more clustered
the scores are; the higher the value, the more spread out the scores are.
Suppose a sample of 10 children has the following ages: 4, 5, 5, 6, 6, 7, 8, 9, 9,
11. Let us try to compute the variance and the standard deviation of the ages of these
10 children.
$$\bar{x} = \frac{4 + 5 + 5 + 6 + 6 + 7 + 8 + 9 + 9 + 11}{10} = \frac{70}{10} = 7$$
x    x – x̄    (x – x̄)²
4 -3 9
5 -2 4
5 -2 4
6 -1 1
6 -1 1
7 0 0
8 1 1
9 2 4
9 2 4
11 4 16
Σ(x – x̄)² = 44
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{44}{10 - 1} = \frac{44}{9} = 4.89$$
For standard deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{44}{10 - 1}} = \sqrt{\frac{44}{9}} = \sqrt{4.89} = 2.21$$
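The sample variance and standard deviation of the ages can be checked with a short Python sketch. The function name `sample_variance` is an assumption for this example; Python's standard `statistics` module also provides `variance` and `stdev`, which use the same n - 1 divisor.

```python
import math

ages = [4, 5, 5, 6, 6, 7, 8, 9, 9, 11]

def sample_variance(scores):
    """Return the sample variance, dividing by n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = sum((x - mean) ** 2 for x in scores)
    return squared_deviations / (n - 1)

variance = sample_variance(ages)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 2))  # 4.89 2.21
```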
Alternate Formula for Variance and Standard Deviation of Ungrouped Data
This is the alternative formula for the variance and standard deviation of
ungrouped data:
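For variance,
$$s^2 = \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n - 1)}$$
and for standard deviation,
$$s = \sqrt{\frac{n\sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$$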
Solution:
x    x²
4 16
5 25
5 25
6 36
6 36
7 49
8 64
9 81
9 81
11 121
Step 2. Get the sum of the x column and the sum of the x² column.
x    x²
4 16
5 25
5 25
6 36
6 36
7 49
8 64
9 81
9 81
11 121
Σx = 70    Σx² = 534
Step 3. Substitute in the formula.
For variance:
$$s^2 = \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n - 1)} = \frac{10(534) - (70)^2}{10(10 - 1)} = \frac{5340 - 4900}{10(9)} = \frac{440}{90} = 4.89$$
For standard deviation:
$$s = \sqrt{4.89} = 2.21$$
These values are the same as the results of the previous computation, which
used a different formula.
After pressing the “Enter” button, MS Excel will give you 2.21 as an answer.
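The alternate formula can also be sketched in Python. Only the sum of x and the sum of x² are needed, so no deviation column is required. The function name `sample_variance_shortcut` is hypothetical.

```python
import math

ages = [4, 5, 5, 6, 6, 7, 8, 9, 9, 11]

def sample_variance_shortcut(scores):
    """Sample variance from the sums of x and x squared."""
    n = len(scores)
    sum_x = sum(scores)                   # sum of x, here 70
    sum_x2 = sum(x * x for x in scores)   # sum of x squared, here 534
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

variance = sample_variance_shortcut(ages)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 2))  # 4.89 2.21
```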
Solution:
Step 1. Compute the mean. The mean is equal to 77.91 as computed previously.
Class Intervals    f    xm    xm – x̄
90 - 94 5 92 14.09
85 - 89 7 87 9.09
80 - 84 10 82 4.09
75 - 79 15 77 -0.91
70 - 74 10 72 -5.91
65 - 69 5 67 -10.91
60 - 64 3 62 -15.91
Step 3. Get the square of xm – x̄ and put it in the (xm – x̄)² column.
Class Intervals    f    xm    xm – x̄    (xm – x̄)²
90 - 94 5 92 14.09 198.5281
85 - 89 7 87 9.09 82.6281
80 - 84 10 82 4.09 16.7281
75 - 79 15 77 -0.91 0.8281
70 - 74 10 72 -5.91 34.9281
65 - 69 5 67 -10.91 119.0281
60 - 64 3 62 -15.91 253.1281
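For variance,
$$s^2 = \frac{n\sum f x_m^2 - \left(\sum f x_m\right)^2}{n(n - 1)}$$
and for standard deviation,
$$s = \sqrt{\frac{n\sum f x_m^2 - \left(\sum f x_m\right)^2}{n(n - 1)}}$$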
Let us take the same example as above and compare the results.
Class Intervals    f    xm    fxm    fxm²
90 - 94 5 92 460 42320
85 - 89 7 87 609 52983
80 - 84 10 82 820 67240
75 - 79 15 77 1155 88935
70 - 74 10 72 720 51840
65 - 69 5 67 335 22445
60 - 64 3 62 186 11532
N = 55    Σfxm = 4285    Σfxm² = 337295
Solution:
$$s^2 = \frac{n\sum f x_m^2 - \left(\sum f x_m\right)^2}{n(n - 1)} = \frac{55(337295) - 4285^2}{55(55 - 1)} = \frac{18551225 - 18361225}{55(54)} = \frac{190000}{2970} = 63.97$$
$$s = \sqrt{63.97} = 7.998$$
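The grouped-data computation above can be sketched in Python from the midpoints and frequencies alone. The function name `grouped_sample_variance` is an assumption for this example, not something defined in the lesson.

```python
import math

# (class midpoint, frequency) for the 60 - 64 through 90 - 94 distribution.
table = [(92, 5), (87, 7), (82, 10), (77, 15), (72, 10), (67, 5), (62, 3)]

def grouped_sample_variance(table):
    """Sample variance of grouped data using the alternate formula."""
    n = sum(freq for _, freq in table)                           # 55
    sum_fx = sum(freq * mid for mid, freq in table)              # 4285
    sum_fx2 = sum(freq * mid * mid for mid, freq in table)       # 337295
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

variance = grouped_sample_variance(table)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 3))  # 63.97 7.998
```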
Coefficient of Variation
The standard deviation is not a reliable measure for comparing the spread of
two data sets when they are measured in different units, or when they have the
same units but widely dissimilar means. The coefficient of variation was
developed to address this kind of problem. The formula for the coefficient of
variation is given below:
$$CV = \frac{s}{\bar{x}}$$
where:
CV – coefficient of variation
s – standard deviation
x̄ – mean
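As an illustration only, the sketch below applies the formula to two results from this lesson: the ages of the 10 children and the grouped scores with a mean of 77.91 and a standard deviation of 7.998. Comparing these two particular sets is an assumption made for this example, not something the lesson does, and the function name `coefficient_of_variation` is hypothetical.

```python
import statistics

def coefficient_of_variation(scores):
    """Return CV = s / mean, using the sample standard deviation."""
    return statistics.stdev(scores) / statistics.mean(scores)

# Ages of the 10 children (mean 7, s about 2.21).
ages = [4, 5, 5, 6, 6, 7, 8, 9, 9, 11]
cv_ages = coefficient_of_variation(ages)

# Grouped scores: standard deviation and mean taken from the lesson's results.
cv_scores = 7.998 / 77.91

print(round(cv_ages, 3), round(cv_scores, 3))  # 0.316 0.103
```

Although the scores have the larger standard deviation, their CV is smaller, so relative to their mean they are less spread out than the ages.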