5. Lecture Note 05_ Measures of Dispersion (2)

Uploaded by

Sabbir Ahmed
Measures of Dispersion

The essential purpose of statistical averages is to summarize a large mass of data. These averages
serve to locate the ‘center’ of a distribution, but they do not reveal how the items or the
observations are spread out or scattered on each side of the center. This characteristic or property
of a distribution is commonly referred to as the ‘dispersion’, ‘scatter’ or ‘variation’.
It is just as important to measure this property of a distribution as to locate the central values. If
the dispersion is small, it indicates high uniformity of the observations in the distribution. Absence
of dispersion in the data indicates perfect uniformity. This situation arises when all observations in
the distribution are identical. If this were the case, description of any single observation would
suffice.
A measure of dispersion serves two purposes. First, it is one of the most important
quantities used to characterize a frequency distribution. Second, it affords a basis of comparison
between two or more frequency distributions. The study of dispersion derives its importance from
the fact that different distributions may have exactly the same average but differ substantially
in variability.
The frequently used measures of dispersion are:
1. The range
2. The quartile deviation
3. The mean (or average) deviation
4. The variance
5. The standard deviation
The above measures are sometimes classified as absolute measures. They are absolute in the
sense that they are expressed in the same statistical units in which the original data are presented,
such as dollar, taka, meter, kilogram etc. When two sets of data are expressed in different units,
however, the absolute measures are not comparable. In that case relative measures are used
instead. The relative measures are usually expressed in the form of coefficients and are
pure numbers, independent of the unit of measurement. These measures are:
1. Coefficient of range
2. Coefficient of quartile deviation
3. Coefficient of mean deviation
4. Coefficient of variation.
The Range
The simplest and the crudest measure of dispersion is the range. This is defined as the difference
between the smallest and largest values in the distribution. The symbol R is used for the range.
𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒
For grouped data, the range is taken to be the difference between the upper class limit of the
highest class and the lower class limit of the lowest class.
Although the range is meaningful, it is of little use because of its marked instability, particularly
when it is based on a small sample. If there is one extreme value in a distribution,
the range of the distribution will appear to be large, when in fact removal of this value may reveal
an otherwise compact distribution with extremely low dispersion.

Prepared by: Suman Biswas, Assistant Professor, Department of Statistics, Islamic University, Kushtia-7003
Example: A testing lab wishes to test two experimental brands of outdoor paint to see how long
each will last before fading. The testing lab makes 6 gallons of each paint to test. Since different
chemical agents are added to each group and only six cans are involved, these two groups
constitute two small populations. The results (in months) are shown.
Group A Group B
10 35
60 45
50 30
30 35
40 40
20 25

Find the range of each group.


Solution: For brand A, the range is
𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒 = 60 − 10 = 50 months
For brand B, the range is
𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑣𝑎𝑙𝑢𝑒 = 45 − 25 = 20 months
The range for brand A shows that 50 months separate the largest data value from the smallest data
value. For brand B, 20 months separate the largest data value from the smallest data value, which
is less than one-half of brand A’s range.
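The range calculation above can be reproduced with a short Python sketch. The function name `value_range` is only an illustrative choice; the data are the two paint groups from the example.

```python
def value_range(values):
    """Range R = highest value - lowest value."""
    return max(values) - min(values)

# Lasting times (in months) from the paint example
group_a = [10, 60, 50, 30, 40, 20]
group_b = [35, 45, 30, 35, 40, 25]

range_a = value_range(group_a)
range_b = value_range(group_b)
print("Range of A:", range_a, "months")
print("Range of B:", range_b, "months")
```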
The Quartile Deviation
A measure similar to the range is the inter-quartile range, usually denoted as $Q$. It is the
difference between the third quartile ($Q_3$) and the first quartile ($Q_1$). Thus
$$Q = Q_3 - Q_1$$
The inter-quartile range is frequently reduced to the semi inter-quartile range, also
known as the quartile deviation (QD), by dividing it by 2. Thus
$$QD = \frac{Q_3 - Q_1}{2}$$
The measure is more meaningful than the range because it is not based on two extreme values.
Example: The number of cloudy days for the top 11 cloudiest cities is shown.
209, 223, 211, 227, 213, 240, 240, 211, 229, 212, 214
Find the interquartile range and quartile deviation.
Solution: Arrange the data in order.
209, 211, 211, 212, 213, 214, 223, 227, 229, 240, 240
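The median is the 6th ordered value, 214. The first quartile is the median of the five values below it
and the third quartile is the median of the five values above it, so $Q_1 = 211$ and $Q_3 = 229$.
Hence $Q = Q_3 - Q_1 = 229 - 211 = 18$ and $QD = \frac{Q_3 - Q_1}{2} = \frac{18}{2} = 9$.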
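As a rough check of this example, the sketch below finds the quartiles as the medians of the lower and upper halves of the ordered data, excluding the overall median when the number of values is odd. This convention is an assumption made here to match the worked answer; other quartile conventions can give slightly different values. The helper name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so that it belongs to neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(upper)

cloudy_days = [209, 223, 211, 227, 213, 240, 240, 211, 229, 212, 214]
q1, q3 = quartiles(cloudy_days)
iqr = q3 - q1   # inter-quartile range Q
qd = iqr / 2    # quartile deviation QD
print(q1, q3, iqr, qd)
```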

The Mean Deviation
The mean deviation is the average of the absolute deviations of the individual observations from a
central value of the series.
If $x_1, x_2, \ldots, x_n$ form a sample of observations, the formula for computing the average or mean
deviation from the arithmetic mean is
$$MD(\bar{x}) = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} = \frac{\sum_{i=1}^{n} |d_i|}{n}$$
where $d_i = x_i - \bar{x}$ stands for the deviation of the $i$th observation from the mean,
and $|\cdot|$ means that the signs of the deviations, whether positive or negative, are ignored.
Example: Compute mean deviation from the mean using the data in the following table.
Physician No. of visit (𝒙𝒊 )
1 5
2 0
3 1
4 4
5 7
6 0
7 12
8 2
9 0
10 20
11 3
12 5
13 6
Total 65
Solution: To compute the average deviation from the arithmetic mean, follow these steps:
(i) Compute the arithmetic mean, which in this case is
$$\bar{x} = \frac{\text{Total number of visits}}{\text{Total number of physicians}} = \frac{65}{13} = 5$$
(ii) Obtain the absolute deviation $|d_i|$ of each $x_i$ value from the arithmetic mean $\bar{x}$.
Physician No. of visit (𝒙𝒊 ) |𝑑𝑖 | = |𝑥𝑖 − 𝑥̅ |
1 5 0
2 0 5
3 1 4
4 4 1
5 7 2
6 0 5
7 12 7
8 2 3
9 0 5
10 20 15
11 3 2
12 5 0
13 6 1
Total 65 50
(iii) Sum the absolute deviations to compute $\sum_{i=1}^{n} |d_i|$.
(iv) Divide the quantity $\sum_{i=1}^{n} |d_i|$ by $n$.
(v) Thus, the mean deviation from the arithmetic mean is:
$$MD(\bar{x}) = \frac{\sum_{i=1}^{n} |d_i|}{n} = \frac{50}{13} = 3.85$$
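Steps (i) to (v) can be sketched in a few lines of Python. The function name `mean_deviation` is an illustrative choice; the data are the 13 physician visit counts from the table.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n                            # step (i)
    abs_devs = [abs(x - mean) for x in values]        # step (ii)
    return sum(abs_devs) / n                          # steps (iii) and (iv)

visits = [5, 0, 1, 4, 7, 0, 12, 2, 0, 20, 3, 5, 6]
md = mean_deviation(visits)
print(round(md, 2))
```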
If a grouped frequency distribution is constructed, as is usually done with large samples, the
mean deviation is
$$MD(\bar{x}) = \frac{\sum_{i=1}^{k} f_i |x_i - \bar{x}|}{n}$$
where $MD(\bar{x})$ = mean deviation about the mean
$k$ = number of classes
$x_i$ = midpoint of the $i$th class
$f_i$ = frequency of the $i$th class
$n = \sum_{i=1}^{k} f_i$

Example: Compute mean deviation from the following frequency distribution:


Class Interval Mid-Point (𝒙𝒊 ) Frequency (𝒇𝒊 )
48.5-53.5 51 2
53.5-58.5 56 2
58.5-63.5 61 3
63.5-68.5 66 5
68.5-73.5 71 5
73.5-78.5 76 5
78.5-83.5 81 5
83.5-88.5 86 7
88.5-93.5 91 10
93.5-98.5 96 6
Total - 50
Solution: To compute the average deviation from the arithmetic mean, follow these steps:
(i) Compute the arithmetic mean, which in this case is
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{3955}{50} = 79.1$$
(ii) Obtain the absolute deviation $|d_i|$ of each $x_i$ value from the arithmetic mean $\bar{x}$ and multiply the
deviations by the corresponding frequencies to get $f_i |d_i|$ as below
Class Interval Mid-Point (𝒙𝒊 ) Frequency (𝒇𝒊 ) |𝒅𝒊 | = |𝒙𝒊 − 𝒙̅| 𝒇𝒊 |𝒅𝒊 |
48.5-53.5 51 2 28.1 56.2
53.5-58.5 56 2 23.1 46.2
58.5-63.5 61 3 18.1 54.3
63.5-68.5 66 5 13.1 65.5
68.5-73.5 71 5 8.1 40.5
73.5-78.5 76 5 3.1 15.5
78.5-83.5 81 5 1.9 9.5
83.5-88.5 86 7 6.9 48.3
88.5-93.5 91 10 11.9 119.0
93.5-98.5 96 6 16.9 101.4
Total 3955 50 - 556.4

(iii) Sum the products to compute $\sum_{i=1}^{k} f_i |d_i|$.
(iv) Divide the quantity $\sum_{i=1}^{k} f_i |d_i|$ by $n = \sum f_i$.
(v) Thus, the mean deviation from the arithmetic mean is:
$$MD(\bar{x}) = \frac{\sum_{i=1}^{k} f_i |d_i|}{\sum f_i} = \frac{556.4}{50} = 11.1$$
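The grouped calculation can be checked with a minimal Python sketch that works from the class midpoints and frequencies in the table. The function name `grouped_mean_deviation` is illustrative only.

```python
def grouped_mean_deviation(midpoints, freqs):
    """Mean deviation about the mean for a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    total = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs))
    return mean, total / n

midpoints = [51, 56, 61, 66, 71, 76, 81, 86, 91, 96]
freqs = [2, 2, 3, 5, 5, 5, 5, 7, 10, 6]

mean, md = grouped_mean_deviation(midpoints, freqs)
print(round(mean, 1), round(md, 1))
```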
Usually, the mean deviation is computed as the arithmetic mean of the absolute values of the
deviations from the typical value of a distribution. The typical value may be the arithmetic mean,
median, mode or even an arbitrary value. The median is sometimes preferred as a typical value in
computing the average deviation, because the sum of the absolute values of the deviations from
the median is smaller than that from any other value. In practice, however, the arithmetic mean is generally
used. If the distribution is symmetrical, the mean is identical with the median and the same average
deviation is obtained.
N.B. To calculate mean deviation from median/mode, compute median/mode of the distribution
in usual manner and replace the mean by the median/mode value. The other steps remain the
same as in the case of calculating mean deviation from mean.
The Variance and Standard Deviation (Ungrouped data)
Instead of ignoring the signs of the deviations from the mean, as in the computation of an average
deviation, the deviations may each be squared and the results added. The sum of squares can be
regarded as a measure of the total dispersion of the distribution. Dividing the sum of squares by
the number of observations gives the average of the squared deviations, a measure called the variance of the
distribution. If the observations are all from a population, the resulting variance is referred to as the
population variance. As a formula, the variance of population observations $x_1, x_2, \ldots, x_N$,
commonly designated as $\sigma^2$, is
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
where $\mu$ is the mean of all the observations and $N$ is the total number of observations in the
population.
Because of the operation of squaring, the variance is expressed in square units (e.g., km², kg², taka²
etc.) and not in the original unit (e.g., km, kg, taka etc.). It is therefore necessary to extract the
positive square root to restore the original unit. The measure of dispersion thus obtained is called
the population standard deviation and is usually denoted by $\sigma$. Thus
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{\mathrm{Variance}(x)}$$
Thus, by definition, the standard deviation is the positive square root of the mean of the squared
deviations of the observations from their arithmetic mean.
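As an illustration of the population formulas, the sketch below applies them to the two paint groups from the range example, which that example describes as two small populations. The function names are illustrative, and the printed values are simply what the formulas give for those data.

```python
from math import sqrt

def population_variance(values):
    """sigma^2 = sum((x - mu)^2) / N, with divisor N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def population_sd(values):
    """Positive square root of the population variance."""
    return sqrt(population_variance(values))

group_a = [10, 60, 50, 30, 40, 20]
group_b = [35, 45, 30, 35, 40, 25]

var_a, sd_a = population_variance(group_a), population_sd(group_a)
var_b, sd_b = population_variance(group_b), population_sd(group_b)
print(round(var_a, 1), round(sd_a, 1))
print(round(var_b, 1), round(sd_b, 1))
```

Both groups have the same mean (35 months), yet group A has the much larger standard deviation, which is the situation a measure of dispersion is meant to reveal.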
In many statistical applications, we deal with a sample rather than a population. Thus, while a set
of population observations yields a population variance, a set of sample observations yields a
sample variance. If $x_1, x_2, \ldots, x_n$ represent a set of sample observations of size $n$, then the
sample variance, denoted by $s^2$, is expressed as
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
where $\bar{x}$ is the mean of the sample observations and $n$ is the total number of
observations in the sample.
The sample variance can also be computed from the formula
$$s^2 = \frac{n \sum x_i^2 - (\sum x_i)^2}{n(n - 1)}$$
The square root of the sample variance $s^2$ is the sample standard deviation, usually denoted by $s$.
Thus
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
or
$$s = \sqrt{s^2} = \sqrt{\frac{n \sum x_i^2 - (\sum x_i)^2}{n(n - 1)}}$$

Example: The sizes (number of members) of 10 households are given in the following table:
Family # 1 2 3 4 5 6 7 8 9 10
Size (𝒙𝒊 ) 3 3 4 4 5 5 6 6 7 7
Compute the sample variance and standard deviation.
Solution: The quantities needed for computing the variance and standard deviation are shown in
the following table:
Family # Size (𝒙𝒊 ) (𝒙𝒊 − 𝒙̅) (𝒙𝒊 − 𝒙̅)² 𝒙𝒊²
1 3 -2 4 9
2 3 -2 4 9
3 4 -1 1 16
4 4 -1 1 16
5 5 0 0 25
6 5 0 0 25
7 6 1 1 36
8 6 1 1 36
9 7 2 4 49
10 7 2 4 49
Total 50 - 20 270
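Here, $\bar{x} = \frac{\sum x_i}{n} = \frac{50}{10} = 5$.
So, the variance is
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{20}{9} = 2.22$$
or, equivalently,
$$s^2 = \frac{n \sum x_i^2 - (\sum x_i)^2}{n(n - 1)} = \frac{10 \times 270 - (50)^2}{10 \times 9} = \frac{200}{90} = 2.22$$
and the standard deviation, by either formula, is
$$s = \sqrt{s^2} = \sqrt{2.22} = 1.49$$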
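A short Python sketch can confirm that the definitional and the computational formulas agree for the household data. The function names are illustrative; with divisor $n - 1 = 9$ both formulas give $20/9$, about 2.22, and a standard deviation of about 1.49.

```python
from math import sqrt

def sample_variance(values):
    """Definitional form: sum((x - mean)^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Computational form: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
s2_def = sample_variance(sizes)
s2_short = sample_variance_shortcut(sizes)
s = sqrt(s2_def)
print(round(s2_def, 2), round(s2_short, 2), round(s, 2))
```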

Example 3.21, 3.22, 3.23 [From Bluman’s Book: Chapter 03]


Variance and Standard deviation for Grouped data
When the observations $x_1, x_2, \ldots, x_k$ are paired with their corresponding frequencies $f_1, f_2, \ldots, f_k$
in the fashion $\{f_i, x_i\}$ to form a frequency distribution, the formulas for computing the
variance and standard deviation must be modified, since those given above are based on
ungrouped data. With divisors $N$ and $n - 1$, the formulas for computing the variance for a
population and a sample are, respectively,
$$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N} \quad \text{and} \quad s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1}$$
where $f_i$ = frequency of the $i$th observation
$x_i$ = value of the $i$th observation, i.e., the mid value of the $i$th class if the frequency distribution is
presented by class intervals
$N$ or $n$ = $\sum f_i$
The formula for the sample variance presented above can be written in a more
convenient form as follows:
$$s^2 = \frac{n \sum f_i x_i^2 - (\sum f_i x_i)^2}{n(n - 1)} \quad \text{where } n = \sum_{i=1}^{k} f_i$$

Example: Compute the variance and standard deviation for the following grouped data:
𝒙𝒊 𝒇𝒊
3 2
5 3
7 2
8 2
9 1
Total 10
Solution: The quantities to be calculated for computing the variance and standard deviation of the
data are given below:
𝒙𝒊 𝒇𝒊 𝒇𝒊 𝒙𝒊 (𝒙𝒊 − 𝒙̅) (𝒙𝒊 − 𝒙̅)² 𝒇𝒊 (𝒙𝒊 − 𝒙̅)²
3 2 6 -3 9 18
5 3 15 -1 1 3
7 2 14 1 1 2
8 2 16 2 4 8
9 1 9 3 9 9
Total 10 60 - - 40
Here, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{60}{10} = 6$.
So the variance is $s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{40}{9} = 4.44$.
And the standard deviation is $s = \sqrt{s^2} = \sqrt{4.44} = 2.11$.
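The same computation can be sketched in Python using the definitional grouped formula with divisor $n - 1$. The function name `grouped_sample_variance` is an illustrative choice.

```python
from math import sqrt

def grouped_sample_variance(values, freqs):
    """s^2 = sum(f * (x - mean)^2) / (n - 1), where n = sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    ss = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return mean, ss / (n - 1)

x = [3, 5, 7, 8, 9]
f = [2, 3, 2, 2, 1]

mean, s2 = grouped_sample_variance(x, f)
s = sqrt(s2)
print(mean, round(s2, 2), round(s, 2))
```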


Example: The lengths of 32 leaves were measured correct to the nearest mm. Find the mean length
and the standard deviation.
Length 20-22 23-25 26-28 29-31 32-34
Frequency 3 6 12 9 2
Solution: The quantities to be computed are given below:
Length Frequency (𝒇𝒊 ) 𝒙𝒊 𝒙𝒊² 𝒇𝒊 𝒙𝒊 𝒇𝒊 𝒙𝒊²
20-22 3 21 441 63 1323
23-25 6 24 576 144 3456
26-28 12 27 729 324 8748
29-31 9 30 900 270 8100
32-34 2 33 1089 66 2178
Total 32 - - 867 23805
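Here,
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{867}{32} = 27.1$$
$$s^2 = \frac{n \sum f_i x_i^2 - (\sum f_i x_i)^2}{n(n - 1)} = \frac{32(23805) - (867)^2}{32 \times 31} = \frac{10071}{992} = 10.15$$
$$\therefore s = \sqrt{s^2} = \sqrt{10.15} = 3.19$$
Hence, mean = 27.1 mm, $s^2 = 10.15$ mm² and $s = 3.19$ mm.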
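The leaf-length example can be reproduced from the class limits with a sketch like the one below, which takes each class midpoint as the average of its two limits and then applies the computational formula. The variable names are illustrative. Carrying full precision, the formula gives $s^2 = 10071/992 \approx 10.15$ and $s \approx 3.19$.

```python
from math import sqrt

classes = [(20, 22), (23, 25), (26, 28), (29, 31), (32, 34)]
freqs = [3, 6, 12, 9, 2]

# Class midpoints: 21, 24, 27, 30, 33
midpoints = [(lo + hi) / 2 for lo, hi in classes]

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))

mean = sum_fx / n
s2 = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
s = sqrt(s2)
print(round(mean, 1), round(s2, 2), round(s, 2))
```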
Example 3.24 [From Bluman’s Book: Chapter 03]
Properties of variance
(i) A change of origin has no effect on the variance (i.e., $s_x^2 = s_y^2$)
Let $x_1, x_2, \ldots, x_n$ be observations whose variance, denoted by $s_x^2$, is computed as
$$s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Let $y$ be the transformed variable defined as follows:
$y_i = x_i + c$; $i = 1, 2, 3, \ldots, n$, where $c$ is a constant.
Summing and dividing throughout by $n$, we get $\bar{y} = \bar{x} + c$.
Since $y_1 = x_1 + c$, $y_2 = x_2 + c$, ..., $y_n = x_n + c$,
$$s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} = \frac{\sum (x_i + c - \bar{x} - c)^2}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1} = s_x^2$$
This completes the proof.
(ii) A change of scale affects the variance (i.e., $s_y^2 = c^2 s_x^2$)
Let $x_1, x_2, \ldots, x_n$ be observations whose variance, denoted by $s_x^2$, is computed as
$$s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Let $y$ be the transformed variable defined as follows:
$y_i = c x_i$; $i = 1, 2, 3, \ldots, n$, where $c$ is a constant.
Summing and dividing throughout by $n$, we get $\bar{y} = c\bar{x}$.
Thus,
$$s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} = \frac{\sum (c x_i - c\bar{x})^2}{n - 1} = \frac{c^2 \sum (x_i - \bar{x})^2}{n - 1} = c^2 s_x^2$$
This shows that if each observation is multiplied by a constant $c$, the variance is multiplied
by the square of that constant. That is, a change in the scale of measurement affects the
variance.
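Both properties can be checked numerically. The sketch below uses exact fractions so that the comparisons are free of rounding error; the data are the household sizes from the earlier example, and the constant $c = 3$ is an arbitrary illustrative choice.

```python
from fractions import Fraction

def sample_variance(values):
    """Exact sample variance with divisor n - 1."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - 1)

x = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
c = 3  # arbitrary constant chosen for illustration

var_x = sample_variance(x)
var_shifted = sample_variance([xi + c for xi in x])   # change of origin
var_scaled = sample_variance([c * xi for xi in x])    # change of scale

print(var_x, var_shifted, var_scaled)
```

The shifted data reproduce the original variance, while the scaled data give $c^2 = 9$ times the original variance.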

Uses of the Variance and Standard Deviation
1. Variances and standard deviations can be used to determine the spread of the data. If the
variance or standard deviation is large, the data are more dispersed. This information is useful
in comparing two (or more) data sets to determine which is more (most) variable.
2. The measures of variance and standard deviation are used to determine the consistency of a
variable. For example, in the manufacture of fittings, such as nuts and bolts, the variation in the
diameters must be small, or the parts will not fit together.
3. The variance and standard deviation are used to determine the number of data values that fall
within a specified interval in a distribution. For example, Chebyshev’s theorem (explained later)
shows that, for any distribution, at least 75% of the data values will fall within 2 standard
deviations of the mean.
4. Finally, the variance and standard deviation are used quite often in inferential statistics.

Relative Measures of Dispersion


Coefficient of Variation
The standard deviation discussed above is an absolute measure of dispersion. The corresponding
relative measure, proposed by Karl Pearson, is the coefficient of variation (CV), which attempts to
measure the relative variability in the data set.
The coefficient of variation is computed as the ratio of the standard deviation of the distribution to the
mean of the same distribution.
Symbolically,
$$CV = \frac{s_x}{\bar{x}}$$
The CV is usually expressed as a percentage, in which case $CV = \frac{s_x}{\bar{x}} \times 100$.

Thus, a CV of 33 percent (say) implies that the standard deviation of the sample values is
33 percent of the mean of the same distribution.
As an illustration of the use of the CV as a descriptive statistic, suppose that we wish to find out
whether height is more variable than weight in the same population. For this
purpose, suppose we have the following data obtained from 150 children in a community.
Height Weight
Mean 40 inch 10 kg
SD 5 inch 2 kg
CV 0.125 0.200
Examination of the respective standard deviations does not tell us in any meaningful way which
characteristic has more variability than the other, because they are in different units. If we now
compute the coefficient of variation, the results become comparable, because the coefficient of
variation is dimensionless.
Thus, since the coefficient of variation for weight is greater than that for height, we would tend
to conclude that weight has more variability than height in the population.
Again, even if two variables in the same population are measured in the same unit, the standard deviation
may fail to provide a correct picture of their relative variability. Consider the blood pressure
of a group of patients measured at two levels, systolic and diastolic, both in
the same unit. The results were as follows:
Systolic Diastolic
Mean 130 mm Hg 60 mm Hg
SD 15 mm Hg 8 mm Hg
CV 0.115 0.133
As the data show, the systolic pressure is more variable in absolute terms (sd = 15 mm Hg) than the
diastolic pressure (sd = 8 mm Hg). However, in relative terms, as measured by the CV, the diastolic
pressure has the greater variability.
This shows that relative variability can be of more concern than absolute variation, hence the
importance of the coefficient of variation.
The coefficient of variation may be helpful in comparing the relative variation in several data sets
that have different means and different standard deviations.
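The two comparisons above can be reproduced with a small Python sketch. The function name `coefficient_of_variation` and its `percent` flag are illustrative choices; the means and standard deviations are those given in the two tables.

```python
def coefficient_of_variation(sd, mean, percent=False):
    """CV = sd / mean, optionally expressed as a percentage."""
    cv = sd / mean
    return cv * 100 if percent else cv

# Height (inch) and weight (kg) of 150 children
cv_height = coefficient_of_variation(5, 40)
cv_weight = coefficient_of_variation(2, 10)

# Systolic and diastolic blood pressure (mm Hg)
cv_systolic = coefficient_of_variation(15, 130)
cv_diastolic = coefficient_of_variation(8, 60)

print(cv_height, cv_weight)
print(round(cv_systolic, 3), round(cv_diastolic, 3))
```

In both cases the variable with the larger CV (weight, diastolic pressure) is the relatively more variable one, even though the diastolic pressure has the smaller standard deviation.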
Example 3.25, 3.26 [From Bluman’s Book: Chapter 03]
Coefficient of Range
The coefficient of range is a relative measure corresponding to the range and is obtained by the
following formula:
$$\text{Coefficient of range} = \frac{L - S}{L + S} \times 100$$
where $L$ and $S$ are respectively the largest and the smallest observations in the data set. The
coefficient of range is rarely used as a measure of dispersion because of its inherent difficulties in
interpretation.
Coefficient of Mean Deviation
The third relative measure is the coefficient of mean deviation. As the mean deviation can be
computed from the mean, median, mode or any arbitrary value, a general formula for computing the
coefficient of mean deviation may be put as follows:
$$\text{Coefficient of mean deviation} = \frac{\text{Mean deviation from } A}{A} \times 100 = \frac{MD(A)}{A} \times 100$$
where $A$ is the mean, median, mode or any other arbitrary value.
Coefficient of Quartile Deviation
The coefficient of quartile deviation is computed from the first and the third quartiles using the
following formula:
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$$
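To make the three coefficients concrete, the sketch below applies them to data from the earlier examples: the paint groups for the coefficient of range, the cloudy-day quartiles ($Q_1 = 211$, $Q_3 = 229$) for the coefficient of quartile deviation, and the physician visits (mean 5, mean deviation $50/13$) for the coefficient of mean deviation. The function names are illustrative, and the resulting figures are not stated in the notes themselves.

```python
def coefficient_of_range(values):
    """(L - S) / (L + S) * 100, with L the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest) * 100

def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1) * 100."""
    return (q3 - q1) / (q3 + q1) * 100

def coefficient_of_mean_deviation(md, a):
    """MD(A) / A * 100, where A is the central value used for MD."""
    return md / a * 100

group_a = [10, 60, 50, 30, 40, 20]
group_b = [35, 45, 30, 35, 40, 25]

cr_a = coefficient_of_range(group_a)
cr_b = coefficient_of_range(group_b)
cqd = coefficient_of_quartile_deviation(211, 229)
cmd = coefficient_of_mean_deviation(50 / 13, 5)

print(round(cr_a, 2), round(cr_b, 2), round(cqd, 2), round(cmd, 2))
```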
Comparing the Measures of Dispersion
Like the measures of averages, a measure of dispersion should also satisfy certain criteria in order
to be reckoned as an ideal measure. From this point of view, a measure of dispersion should be
• Rigidly defined
• Easy to comprehend
• Based on all the observations
• Affected less due to sampling fluctuation
• Less affected by extreme values and
• Amenable to algebraic treatment
A brief overview of the advantages, disadvantages and limitations of these measures is given
below:
The range is easy to compute and is a common way to describe dispersion. It is especially useful in
situations where the purpose of investigation is only to find out the extent of extreme variations.
The range, however, has certain disadvantages and drawbacks that tend to limit its usefulness as a
measure of variability. Since it depends solely on the highest and lowest values, it is highly sensitive
to the presence of unusual and extreme values in the series. Furthermore, the range does not
provide measurement of dispersion of the items relative to the central value. It tends to increase
as the size of the sample increases. Moreover, the range cannot be used meaningfully with nominal
or ordinal data. It is restricted to interval data, where it is meaningful to talk about the
largest and the smallest values. As it is based on only two terminal observations, it is not suitable
for algebraic treatment.

The average or mean deviation possesses many of the desirable properties of an ideal measure
indicated at the outset. It takes into account each and every item in the distribution and shows the
scatter of the items around the measure of central tendency. It is often found that more than half of
the observations lie within one average deviation of the mean $\bar{x}$. The
chief advantage is that its knowledge helps us to understand the standard deviation, which is one
of the most important measures of dispersion.
One of the drawbacks of the average deviation is the ambiguity about the measure of central
tendency to be used for its computation. In order to avoid confusion, it is necessary to state clearly
whether the mean or the median is used in computing the average deviation.

Because of its high degree of accuracy and precision, standard deviation is the most prominently
used measure of dispersion. It is based on all the observations, highly amenable to further algebraic
treatment and is considerably less affected due to sampling fluctuations.

The quartile deviation has a special utility in measuring variation in the case of open-ended
distributions. It has the advantage that it is less affected by extreme values in the data set. The chief
disadvantage is that it ignores 50 percent of the observations in the computation, 25 percent from
the upper tail and 25 percent from the lower tail. Furthermore, no algebraic manipulation is
possible with the quartile deviation. It is also less affected by sampling variability.
The coefficient of variation is a dimensionless measure and, for this reason, is the
most commonly used measure of relative variation.
