Lecture V Probability and Statistics

The document discusses measures of relative dispersion, including the Coefficient of Quartile Deviation, Coefficient of Mean Deviation, and Coefficient of Variation, which help compare the variability of different datasets regardless of their units. It also covers concepts of skewness and kurtosis, explaining how to assess the symmetry and peakedness of distributions using various coefficients. Examples and exercises are provided to illustrate the application of these statistical measures.

Uploaded by adamsedwin06
Copyright: © All Rights Reserved

5.1 Measures of Relative Dispersion

These measures are used to compare the spreads of two or more sets of observations. They are
independent of the units of measurement because each is a ratio, and they are called
coefficients. The smaller the coefficient, the lower the relative spread, and vice versa.

Suppose that the two distributions to be compared are expressed in the same units and their
means are equal or nearly equal. Then their variability can be compared directly by using their
standard deviations. However, if their means are widely different or if they are expressed in
different units of measurement, we cannot use the standard deviations as such for comparing
their variability. We have to use the relative measures of dispersion in such situations.
Examples of these measures of relative dispersion include the coefficient of quartile deviation,
the coefficient of mean deviation and the coefficient of variation.

5.1.1 Coefficient of Quartile Deviation and Coefficient of Mean Deviation


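The Coefficient of Quartile Deviation of x, CQD(x), is given by
$$\mathrm{CQD}(x) = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\%$$
where $Q_1$ and $Q_3$ are the first and third quartiles.
The Coefficient of Mean Deviation, CMD(x), is given by
$$\mathrm{CMD}(x) = \frac{\mathrm{MAD}}{\mathrm{Mean}} \times 100\%$$
where MAD is the mean (absolute) deviation.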
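The two coefficients above can be sketched in Python as follows. This is only an illustration under stated assumptions: the quartiles are located at the (n + 1)p-th ordered value (the `method="exclusive"` rule of the standard `statistics` module), MAD is taken as the mean absolute deviation about the mean, and the data set `sample` is hypothetical. Other quartile conventions give slightly different answers.

```python
from statistics import mean, quantiles


def coefficient_of_quartile_deviation(data):
    """CQD = (Q3 - Q1) / (Q3 + Q1) * 100, returned as a percentage."""
    # Assumption: quartiles sit at the (n + 1)p-th ordered value.
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    return (q3 - q1) / (q3 + q1) * 100


def coefficient_of_mean_deviation(data):
    """CMD = MAD / mean * 100, with MAD measured about the mean (assumption)."""
    centre = mean(data)
    mad = sum(abs(x - centre) for x in data) / len(data)
    return mad / centre * 100


sample = [2, 4, 6, 8, 10, 12, 14]  # hypothetical data, not from the notes
print(round(coefficient_of_quartile_deviation(sample), 2))  # 50.0
print(round(coefficient_of_mean_deviation(sample), 2))      # 42.86
```

For this sample, Q1 = 4 and Q3 = 12, so CQD = 8/16 × 100 = 50%. The mean is 8 and the absolute deviations sum to 24, so MAD = 24/7 and CMD ≈ 42.86%.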

5.1.2 Coefficient of Variation


The coefficient of variation is the ratio of the standard deviation to the arithmetic mean,
usually expressed as a percentage. The coefficient of variation of x, denoted C.V(x), is given
by the formula
$$\mathrm{C.V}(x) = \frac{S}{\bar{x}} \times 100\%$$
where $\bar{x}$ is the mean and $S$ is the standard deviation of x.
Note: The standard deviation is an absolute measure of dispersion, while the coefficient of
variation is a relative measure of dispersion.

Example 1 Consider the distribution of the yields (per plot) of two groundnut varieties. For
the first variety, the mean and standard deviation are 82 kg and 16 kg respectively. For the
second variety, the mean and standard deviation are 55 kg and 8 kg respectively. Then we have,
for the first variety $\mathrm{C.V}(x) = \frac{16}{82} \times 100 \approx 19.5\%$
For the second variety $\mathrm{C.V}(x) = \frac{8}{55} \times 100 \approx 14.5\%$
The relative variability of the second variety is therefore lower than that of the first.
In other situations, a comparison based on the standard deviations alone could give the
reverse conclusion.

Example 2 Below are the scores of two cricketers in 10 innings. Find who is the more 'consistent
scorer' by the indirect method.
Cricketer A   204    68   150    30    70    95    60    76    24    19
Cricketer B    99   190   130    94    80    89    69    85    65    40
Solution:
From a calculator, $\bar{x}_A = 79.6$, $S_A = 58.2$, $\bar{x}_B = 94.1$ and $S_B = 41.1$
(these are sample standard deviations, with divisor n − 1, rounded to one decimal place).
Coefficient of variation for player A is $\mathrm{C.V}(x) = \frac{58.2}{79.6} \times 100 \approx 73.153\%$
Coefficient of variation for player B is $\mathrm{C.V}(x) = \frac{41.1}{94.1} \times 100 \approx 43.7028\%$
(both percentages are computed from the unrounded standard deviations).
The coefficient of variation of A is greater than the coefficient of variation of B, and hence we
conclude that player B is more consistent.
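The two examples above can be checked with a short Python sketch. The function names are illustrative only. `statistics.stdev` uses the n − 1 divisor, which is an assumption about how the calculator values in Example 2 were obtained, although it does reproduce them.

```python
from statistics import mean, stdev


def coefficient_of_variation(data):
    """C.V = S / mean * 100, using the sample standard deviation (n - 1)."""
    return stdev(data) / mean(data) * 100


def cv_from_summary(mean_value, sd):
    """C.V when only the mean and standard deviation are known."""
    return sd / mean_value * 100


# Example 1: groundnut varieties, summary statistics only
print(round(cv_from_summary(82, 16), 1))  # 19.5
print(round(cv_from_summary(55, 8), 1))   # 14.5

# Example 2: cricketers' scores
cricketer_a = [204, 68, 150, 30, 70, 95, 60, 76, 24, 19]
cricketer_b = [99, 190, 130, 94, 80, 89, 69, 85, 65, 40]
cv_a = coefficient_of_variation(cricketer_a)
cv_b = coefficient_of_variation(cricketer_b)
print(round(cv_a, 2), round(cv_b, 2))     # 73.15 43.7
print("More consistent:", "A" if cv_a < cv_b else "B")
```

Using the population standard deviation (`statistics.pstdev`) instead would lower both percentages slightly but would not change which player is judged more consistent.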

Exercise
1) Find the coefficient of quartile deviation, the coefficient of mean deviation and the
coefficient of variation of x for the following data:
   a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4
   b) 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8 and 9
   c) 3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
   d) data on marks given by the table below
   Marks Obtained    0-10   10-20   20-30   30-40   40-50   50-60   60-70
   No. of Students      6      12      22      24      16      12       8
2) The weights of 7 ear-heads of sorghum are 89, 94, 102, 107, 108, 115 and 126 g. Find the
arithmetic mean and standard deviation using a calculator, and hence determine the coefficient
of variation of the ear-heads of sorghum.
3) The following are the heights (in cm) of 381 soybean plants collected from a particular plot.
Using the coding formula, find the mean and standard deviation of the plant heights, and hence
determine the coefficient of variation of the soybean plant heights:
   Plant heights (cm)    No. of plants
   6.8-7.2                9
   7.3-7.7               10
   7.8-8.2               11
   8.3-8.7               32
   8.8-9.2               42
   9.3-9.7               58
   9.8-10.2              65
   10.3-10.7             55
   10.8-11.2             37
   11.3-11.7             31
   11.8-12.2             24
   12.3-12.7              7

5.2 Measures of Skewness and Kurtosis


5.2.1 Skewness
Before discussing the concept of skewness, an understanding of the concept of symmetry is
essential. A plot of frequency against class mark joined with a smooth curve can help us to
visually assess the symmetry of a distribution. Usually, symmetry is about the central value.
Symmetry is said to exist if the smoothed frequency polygon of the distribution can be divided
into two identical halves which are mirror images of each other. Skewness, on the other hand,
means lack of symmetry, and it can be positive or negative.
If the distribution has a tail on the right (see figure below), then the distribution is
positively skewed, e.g. most students having very low marks in an examination. However, if
the distribution has a tail on the left (see figure below), then the distribution is negatively
skewed, e.g. most students having very high marks in an examination.
Measures of Skewness
Generally, for any set of values $x_1, x_2, x_3, \ldots, x_n$, the moment coefficient of skewness
$\alpha_3$ is given by
$$\alpha_3 = \frac{\sum f(x - \bar{x})^3}{nS^3}$$
where $f$ is the frequency of $x$ and $S$ is the standard deviation. It's worth noting that if
$\alpha_3 < 0$, the distribution is negatively skewed, if $\alpha_3 > 0$, the distribution is
positively skewed, and if $\alpha_3 = 0$ the distribution is symmetric (as is the case for the
normal distribution).
Other measures of skewness include the Karl Pearson coefficient of skewness $SK_p$, Bowley's
coefficient of skewness $SK_B$ and Kelly's coefficient of skewness $SK_k$.

Karl Pearson's coefficient of skewness is based upon the divergence of the mean from the mode
in a skewed distribution. Recall the empirical relation between mean, median and mode, which
states that, for a moderately skewed distribution, we have
Mean − Mode ≈ 3 (Mean − Median)
Hence Karl Pearson's coefficient of skewness is defined by
$$SK_p = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
Bowley's coefficient of skewness is based on quartiles. For a symmetrical distribution, $Q_1$
and $Q_3$ are equidistant from the median.
$$SK_B = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$
where $Q_k$ is the kth quartile.
Kelly's coefficient of skewness is based on $P_{90}$ and $P_{10}$, so that only 10% of the
observations on each extreme are ignored. This is an improvement over Bowley's coefficient,
which leaves out 25% of the observations on each extreme of the distribution.
$$SK_k = \frac{P_{90} - 2P_{50} + P_{10}}{P_{90} - P_{10}}$$
where $P_k$ is the kth percentile.
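The three coefficients above can be sketched in Python for ungrouped data. Several choices here are assumptions rather than part of the notes: quartiles and percentiles are located with the default (n + 1)p rule of `statistics.quantiles`, the Pearson coefficient uses the median form with the population standard deviation (`pstdev`), and the data set `marks` is hypothetical.

```python
from statistics import mean, median, pstdev, quantiles


def pearson_skewness(data):
    """Median form: 3 (mean - median) / standard deviation."""
    # Assumption: population standard deviation (divisor n).
    return 3 * (mean(data) - median(data)) / pstdev(data)


def bowley_skewness(data):
    """(Q3 - 2 Q2 + Q1) / (Q3 - Q1)."""
    q1, q2, q3 = quantiles(data, n=4)
    return (q3 - 2 * q2 + q1) / (q3 - q1)


def kelly_skewness(data):
    """(P90 - 2 P50 + P10) / (P90 - P10)."""
    deciles = quantiles(data, n=10)  # nine cut points: P10, P20, ..., P90
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    return (p90 - 2 * p50 + p10) / (p90 - p10)


marks = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9, 15]  # hypothetical, long right tail
print(round(bowley_skewness(marks), 4))   # 0.3333
print(round(kelly_skewness(marks), 4))    # 0.7143
print(pearson_skewness(marks) > 0)        # True
```

All three coefficients are positive for this data set, in line with its long right tail. Kelly's value is larger than Bowley's here because the extreme values 9 and 15 affect $P_{90}$ but not $Q_3$.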

Interpreting Skewness
If the coefficient of skewness is positive, the distribution is positively skewed or skewed right,
meaning that the right tail of the distribution is longer than the left. If the coefficient of skewness
is negative, the data are negatively skewed or skewed left, meaning that the left tail is longer.
If the coefficient of skewness = 0, the data are perfectly symmetrical. But a skewness of exactly
zero is quite unlikely for real-world data, so how can you interpret the skewness number?
Bulmer, M. G., Principles of Statistics (Dover, 1979), a classic, suggests this rule of thumb:
If the coefficient of skewness is:
• less than −1 or greater than +1, the distribution is highly skewed.
• between −1 and −1/2 or between +1/2 and +1, the distribution is moderately skewed.
• between −1/2 and +1/2, the distribution is approximately symmetric.
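The rule of thumb above can be written as a small Python helper. The function name is illustrative. The rule does not say how to treat values exactly equal to ±1/2 or ±1, so this sketch assumes they fall into the milder category.

```python
def describe_skewness(coefficient):
    """Classify a skewness coefficient using the rule of thumb above."""
    size = abs(coefficient)
    if size > 1:
        return "highly skewed"
    if size > 0.5:
        return "moderately skewed"
    # Assumption: boundary values such as exactly 0.5 land here.
    return "approximately symmetric"


for value in (-1.4, 0.744, -0.119):
    print(value, "->", describe_skewness(value))
```

The sign of the coefficient still has to be read separately to decide whether the skew is to the right (positive) or to the left (negative).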

Example The following figures relate to the size of capital of 285 companies:
Capital (in Ks lacs.)   1-5   6-10   11-15   16-20   21-25   26-30   31-35
No. of companies         20     27      29      38      48      53      70
Compute Bowley's coefficient of skewness and interpret the results.
Solution
Boundaries   0.5-5.5   5.5-10.5   10.5-15.5   15.5-20.5   20.5-25.5   25.5-30.5   30.5-35.5
CF                20         47          76         114         162         215         285
$Q_1 = \frac{1}{4}(286)\text{th value} = 71.5\text{th value} = 10.5 + \left(\frac{71.5 - 47}{29}\right) \times 5 \approx 14.7241$
$Q_2 = \frac{1}{2}(286)\text{th value} = 143\text{rd value} = 20.5 + \left(\frac{143 - 114}{48}\right) \times 5 \approx 23.5208$
$Q_3 = \frac{3}{4}(286)\text{th value} = 214.5\text{th value} = 25.5 + \left(\frac{214.5 - 162}{53}\right) \times 5 \approx 30.4528$
$$SK_B = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} = \frac{30.4528 - 2 \times 23.5208 + 14.7241}{30.4528 - 14.7241} \approx -0.11855$$
This value lies between −1/2 and +1/2, therefore the distribution is approximately symmetric.
Question: Compute the Karl Pearson's and the Kelly’s coefficient of skewness for the above
data and interpret the results.
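The quartile interpolation used in the Bowley example above can be sketched in Python as follows. The function name `grouped_quantile` is hypothetical. The sketch assumes, as in the worked solution, that the kth quartile is the k(N + 1)/4-th value and that values are spread evenly within each class.

```python
def grouped_quantile(boundaries, frequencies, position):
    """Value at an ordered position, by linear interpolation within its class."""
    cumulative = 0  # cumulative frequency below the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if cumulative + f >= position:
            return lower + (position - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("position exceeds the total frequency")


boundaries = [(0.5, 5.5), (5.5, 10.5), (10.5, 15.5), (15.5, 20.5),
              (20.5, 25.5), (25.5, 30.5), (30.5, 35.5)]
frequencies = [20, 27, 29, 38, 48, 53, 70]
n = sum(frequencies)  # 285 companies

q1 = grouped_quantile(boundaries, frequencies, (n + 1) / 4)      # 71.5th value
q2 = grouped_quantile(boundaries, frequencies, (n + 1) / 2)      # 143rd value
q3 = grouped_quantile(boundaries, frequencies, 3 * (n + 1) / 4)  # 214.5th value
sk_b = (q3 - 2 * q2 + q1) / (q3 - q1)

print(round(q1, 4), round(q2, 4), round(q3, 4))  # 14.7241 23.5208 30.4528
print(round(sk_b, 4))                            # -0.1186
```

The same function can be called with positions such as 0.1(N + 1) and 0.9(N + 1) to obtain $P_{10}$ and $P_{90}$, provided the same positioning convention is assumed for percentiles.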

5.2.2 Kurtosis
Kurtosis measures the peakedness of a distribution. If the values of x are very close to the
mean, the peak is very high and the distribution is said to be leptokurtic. On the other hand,
if the values of x are very far away from the mean, the peak is very low and the distribution is
said to be platykurtic. Finally, if the x values are at a moderate distance from the mean, then
the peak is moderate and the distribution is said to be mesokurtic (see figure on pg 42).
Measures of Kurtosis
Generally, for a set of values $x_1, x_2, x_3, \ldots, x_n$, the moment coefficient of kurtosis
$\alpha_4$ is given by
$$\alpha_4 = \frac{\sum f(x - \bar{x})^4}{nS^4}$$
where $\bar{x}$ and $S$ are the arithmetic mean and standard deviation.
Example Find the coefficient of skewness $\alpha_3$ and kurtosis $\alpha_4$ for the data 5, 6, 7, 6, 9, 4, 5

Solution
$\bar{x} = \frac{1}{n}\sum x = \frac{42}{7} = 6$ and standard deviation $S = \sqrt{\frac{1}{n}\sum (x - \bar{x})^2} = \sqrt{\frac{16}{7}} = \frac{4}{\sqrt{7}}$

x             5    6    7    6    9    4    5    Total
(x − x̄)²      1    0    1    0    9    4    1      16
(x − x̄)³     −1    0    1    0   27   −8   −1      18
(x − x̄)⁴      1    0    1    0   81   16    1     100

Coefficient of skewness
$$\alpha_3 = \frac{\sum (x - \bar{x})^3}{nS^3} = \frac{18}{7} \times \left(\frac{\sqrt{7}}{4}\right)^3 \approx 0.744118$$
Notice that this distribution is moderately skewed to the right.

Coefficient of kurtosis
$$\alpha_4 = \frac{\sum (x - \bar{x})^4}{nS^4} = \frac{100}{7} \times \left(\frac{\sqrt{7}}{4}\right)^4 \approx 2.73438$$
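The worked example above can be reproduced with a few lines of Python. The function name `moment_coefficient` is illustrative. As in the solution, the standard deviation uses the divisor n, and each value is treated as having frequency 1.

```python
def moment_coefficient(data, order):
    """Sum of (x - mean)**order divided by n * S**order, with S using divisor n."""
    n = len(data)
    centre = sum(data) / n
    s = (sum((x - centre) ** 2 for x in data) / n) ** 0.5
    return sum((x - centre) ** order for x in data) / (n * s ** order)


data = [5, 6, 7, 6, 9, 4, 5]
skewness = moment_coefficient(data, 3)
kurtosis = moment_coefficient(data, 4)
print(round(skewness, 6))  # 0.744118
print(round(kurtosis, 5))  # 2.73438
```

For grouped data, the same idea applies with each class mark weighted by its frequency f and with n replaced by the total frequency.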
Exercise
1. Find the moment coefficient of skewness and kurtosis for the data below.
   a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4
   b) 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8 and 9
   c) 3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
   d) data on marks given by the table below
   Marks Obtained    0-10   10-20   20-30   30-40   40-50   50-60   60-70
   No. of Students      6      12      22      24      16      12       8
   e) data given by the table below
   Marks Obtained    0-10   10-20   20-30   30-40
   No. of Students      1       3       4       2
2. Compute Bowley's coefficient of skewness, Kelly's coefficient of skewness and the
percentile coefficient of kurtosis for the following data and interpret the results.
   a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4
   b) 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8 and 9
   c) 3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
   d) data on heights given by the table below
   Height (in inches)    58   59   60   61   62   63   64   65
   No. of persons        10   18   30   42   35   28   16    8
   e) data on daily expenditure of families given by the table below
   Daily Expenditure (Rs)   0-20   20-40   40-60   60-80   80-100
   No. of persons             13      25      27      19       16
   f) data on marks given by the table below
   Marks Obtained     0-20   20-40   40-60   60-80   80-100
   No. of Students       8      28      35      17       12
3. The following measures were computed for a frequency distribution:
Mean = 50, coefficient of variation = 35% and Karl Pearson's coefficient of skewness
SKp = −0.25. Compute the standard deviation, mode and median of the distribution.
