Unit 13 - Basic Statistics
13.1 Introduction
In this unit we discuss some of the concepts of basic statistics. A single
value that is representative of a set of values may be used to indicate the
general size of the members of the set; the word ‘average’ is often used for
this single value. The statistical term for ‘average’ is the arithmetic mean,
or just the mean. Other measures of central tendency include the median and
the mode. The standard deviation of a set of data indicates the amount of
dispersion, or scatter, of the members of the set about the measure of central
tendency.
Objectives:
At the end of this unit you will be able to understand
the concept of central tendency and its applications
how to calculate the standard deviation and variance of a given set of data.
The arithmetic mean of the observations $x_1, x_2, \ldots, x_n$ is
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
If the observations $x_1, x_2, \ldots, x_n$ have frequencies $f_1, f_2, \ldots, f_n$,
the arithmetic mean is
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum fx}{N}$$
(for a discrete frequency distribution)
where $N = \sum f$ is the total frequency.
Thus, for raw data, the arithmetic mean is
$$\bar{x} = \frac{\sum x}{n}$$
For tabulated data (discrete or continuous), it is
$$\bar{x} = \frac{\sum fx}{N}$$
Example 1: Heights of six students are 163, 173, 168, 156, 162 and 165
cms. Find the arithmetic mean.
Solution: The arithmetic mean of the heights is
$$\bar{x} = \frac{\sum x}{n} = \frac{163 + 173 + 168 + 156 + 162 + 165}{6} = \frac{987}{6} = 164.5 \text{ cms.}$$
Example 2: In a one-day cricket match, a bowler bowls 8 overs. He gives
away 3, 5, 12, 0, 4, 1, 3 and 7 runs in these overs. Find the mean run rate
per over.
Solution: The mean run rate is
$$\bar{x} = \frac{\text{Sum of values}}{\text{Number of values}} = \frac{\sum x}{n} = \frac{3 + 5 + 12 + 0 + 4 + 1 + 3 + 7}{8} = \frac{35}{8} = 4.375 \text{ runs per over.}$$
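As a quick check of Examples 1 and 2, the mean of raw data can be sketched in Python. The choice of language and the helper name `arithmetic_mean` are illustrative assumptions, not part of the unit.

```python
def arithmetic_mean(values):
    """Return the arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

heights_cm = [163, 173, 168, 156, 162, 165]   # Example 1
runs = [3, 5, 12, 0, 4, 1, 3, 7]              # Example 2

print(arithmetic_mean(heights_cm))  # 164.5
print(arithmetic_mean(runs))        # 4.375
```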
Solution:
Salary (Rs.) (x)    Employees (f)    fx
2430                4                9720
2590                28               72520
2870                31               88970
3390                16               54240
4720                3                14160
5160                2                10320
Total               84               249930
$$\bar{x} = \frac{\sum fx}{N} = \frac{249930}{84} = \text{Rs. } 2975.36$$
(ii) Total salary of the employees is
$\sum fx$ = Rs. 249930.
Example 4: A survey of 128 smokers revealed the following frequency
distribution of their daily expenditure on smoking. Find the
mean daily expenditure.
Expenditure (Rs.)   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60   60 – 70   70 – 80
No. of smokers      23        44        35        12        9         3         2
Solution:
Expenditure (Rs.)   Frequency (f)   Mid-value (x)   fx
10 – 20             23              15              345
20 – 30             44              25              1100
30 – 40             35              35              1225
40 – 50             12              45              540
50 – 60             9               55              495
60 – 70             3               65              195
70 – 80             2               75              150
Total               128             -               4050
The mean is
$$\bar{x} = \frac{\sum fx}{N} = \frac{4050}{128} = \text{Rs. } 31.64$$
The mean daily expenditure is Rs. 31.64.
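The mid-value calculation of Example 4 can be sketched as follows. The tuple layout and the name `grouped_mean` are assumptions made for illustration only.

```python
# (lower limit, upper limit, frequency) for each class of Example 4
classes = [(10, 20, 23), (20, 30, 44), (30, 40, 35), (40, 50, 12),
           (50, 60, 9), (60, 70, 3), (70, 80, 2)]

def grouped_mean(classes):
    """Mean of a continuous frequency distribution using class mid-values."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

print(round(grouped_mean(classes), 2))  # 31.64
```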
Change of Origin and Scale
Let $x_1, x_2, \ldots, x_n$ be n values. Let ‘a’ be a constant. Then $x_1 - a, x_2 - a, \ldots, x_n - a$
are the values of $x_1, x_2, \ldots, x_n$ with origin shifted to ‘a’. If ‘c’
is a positive constant,
$$\frac{x_1 - a}{c}, \frac{x_2 - a}{c}, \ldots, \frac{x_n - a}{c}$$
are the values $x_1, x_2, \ldots, x_n$ with origin shifted to a and scale changed
by c. Thus, $u = \frac{x - a}{c}$ is the variable x with origin shifted to a and scale
changed by c.
Here $u = \frac{x - a}{c}$; therefore, $x = a + cu$.
And so, $\bar{x} = a + c\bar{u} = a + c\,\frac{\sum fu}{N}$
However, if c = 1, $\bar{x} = a + \bar{u} = a + \frac{\sum u}{n}$
Deviations: Let $x_1, x_2, x_3, \ldots, x_n$ be n values. Let ‘a’ be a constant. Then
$(x_1 - a), (x_2 - a), (x_3 - a), \ldots, (x_n - a)$ are the deviations of the values from
the constant a. The squares of these deviations, namely,
$(x_1 - a)^2, (x_2 - a)^2, (x_3 - a)^2, \ldots, (x_n - a)^2$ are the squared deviations of
the values.
Thus, $(x_1 - \bar{x}), (x_2 - \bar{x}), (x_3 - \bar{x}), \ldots, (x_n - \bar{x})$ are the deviations from the
arithmetic mean, and
$(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, (x_3 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$ are the squared deviations
from the arithmetic mean. The deviations may be positive, negative or zero.
But the squared deviations will never be negative.
Properties of Arithmetic Mean:
Arithmetic mean has the following important properties:
1. The algebraic sum of the deviations of a set of values from their arithmetic
mean is zero.
That is, $\sum (x - \bar{x}) = 0$
2. The sum of the squared deviations of a set of values is minimum when the
deviations are taken around the arithmetic mean.
3. Let $\bar{x}_1$ be the arithmetic mean of a set of $n_1$ values, and let $\bar{x}_2$ be the
arithmetic mean of another set of $n_2$ values. Then, the arithmetic
mean of the two sets of values put together is
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \quad \text{(combined arithmetic mean)}$$
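Properties 1 and 3 can be checked numerically. The sketch below splits the six heights of Example 1 into two arbitrary groups; the split and the variable names are assumptions chosen only for illustration.

```python
def mean(values):
    return sum(values) / len(values)

group_1 = [163, 173, 168]   # first three heights of Example 1
group_2 = [156, 162, 165]   # remaining three heights
all_values = group_1 + group_2

# Property 1: deviations from the mean sum to zero (up to rounding error).
deviation_sum = sum(x - mean(all_values) for x in all_values)

# Property 3: combined mean from the two group means and group sizes.
n1, n2 = len(group_1), len(group_2)
combined = (n1 * mean(group_1) + n2 * mean(group_2)) / (n1 + n2)

print(abs(deviation_sum) < 1e-9)                 # True
print(abs(combined - mean(all_values)) < 1e-9)   # True
```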
Median
The median of a set of values is the middle-most value when they are arranged
in ascending order of magnitude. (Such an arrangement is called an
array.) It is a value that is greater than half of the values and less than the
remaining half. The median is denoted by M.
In the case of raw data and also of a discrete frequency distribution, the
median is
$$M = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ value in the arrayed series}$$
In the case of a continuous frequency distribution, the median is
$$M = l + \frac{\left(\frac{N}{2} - m\right) c}{f}$$
where l : lower limit of the median class
c : width of the median class
f : frequency of the median class
m : less-than cumulative frequency up to l (the cumulative
frequency corresponding to the class preceding the median
class)
N : total frequency
The median class is the class which contains the median.
Example 7
The following data relate to the number of children of 25 couples. Find the
median.
No. of children per couple: 2, 0, 5, 2, 5, 1, 0, 0, 3, 4, 2, 1, 1, 2, 3, 0,
1, 2, 7, 2, 2, 1, 3, 4, 1.
Solution:
The arrayed series (ascending order) is:
0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5,
7
Here, n = 25. Therefore, the median is
$$M = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ value in the arrayed series}$$
Manipal University Jaipur B0947 Page No.: 331
Mathematics for IT Unit 13
$$= \left(\frac{25 + 1}{2}\right)^{\text{th}} \text{ value} = 13^{\text{th}} \text{ value}$$
= 2 children per couple
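The same steps (arrange the values, then pick the ((n + 1)/2)th one) can be sketched in Python. The function name `median_odd` is an illustrative assumption, and the sketch covers only an odd number of values, as in this example.

```python
data = [2, 0, 5, 2, 5, 1, 0, 0, 3, 4, 2, 1, 1, 2, 3, 0,
        1, 2, 7, 2, 2, 1, 3, 4, 1]

def median_odd(values):
    """Median of an odd number of values: the ((n + 1) / 2)th item of the array."""
    arrayed = sorted(values)                 # ascending order
    position = (len(arrayed) + 1) // 2       # 1-based position in the array
    return arrayed[position - 1]

print(median_odd(data))  # 2
```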
Merits of median:
1. The logic behind its computation is easily understood. It can be easily
computed.
2. Even when some of the extreme values are missing, it can be
computed.
3. It is not affected by abnormal extreme values
4. It can be used for the study of qualitative data also.
Demerits of median:
1. It is not based on all the values
2. It cannot be used in deep statistical analysis.
Mode
The mode is the value which has the highest frequency. It is the most
frequently occurring value. It is denoted by Z.
In the case of raw data, and also in the case of a discrete frequency
distribution, the mode is the value with the highest frequency.
In the case of a continuous frequency distribution, the mode is
$$Z = l + \frac{(f - f_1)\,c}{2f - f_1 - f_2}$$
where l : lower limit of the modal class
f : frequency of the modal class
c : width of the modal class
f1 : frequency of the class preceding the modal class
f2 : frequency of the class succeeding the modal class
The modal class is the class containing the mode.
Generally, the modal class will be the class with the highest frequency. But
sometimes it may be a class other than the class with the highest
frequency. In such a situation, the mode is obtained by using the formula
$$Z = l + \frac{c f_2}{f_1 + f_2}$$
Most frequency distributions have only one value with the highest frequency.
Such frequency distributions are unimodal: they have only one mode. On
the other hand, if in a frequency distribution there is more than one value
with the highest frequency, the distribution is multimodal: it has more
than one mode. If there are two modes, the distribution is bimodal.
For a distribution which has more than one mode, the mode is said to be
ill-defined.
Example 8:
The following are the numbers of children of 20 couples. Find the mode.
No. of children per couple: 2, 3, 6, 3, 4, 0, 5, 2, 2, 4, 3, 2, 1, 0, 4, 2, 2, 1, 1, 3
Solution:
The data should be tabulated first.
No. of children    Tally marks    No. of couples (frequency)
0                  II             2
1                  III            3
2                  IIII I         6
3                  IIII           4
4                  III            3
5                  I              1
6                  I              1
Total                             20
The value 2 has the highest frequency (6 couples), so the mode is Z = 2
children per couple.
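The tally in Example 8 can be reproduced with Python's standard `collections.Counter`; this is a minimal sketch, and the variable names are assumptions made for illustration.

```python
from collections import Counter

children = [2, 3, 6, 3, 4, 0, 5, 2, 2, 4, 3, 2, 1, 0, 4, 2, 2, 1, 1, 3]

frequency = Counter(children)                # value -> number of couples
mode, count = frequency.most_common(1)[0]    # most frequent value and its frequency

print(mode, count)  # 2 6
```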
No. of students: 8 19 29 36 25 13 4
Solution:
Here, the class intervals are of the inclusive type. First, they should be
converted into the exclusive type.
Modal class
Since 36 is the highest frequency and it is far higher than the other
frequencies, the class interval 39.5 – 49.5 is the modal class. Thus l = 39.5,
f = 36, f1 = 29, f2 = 25 and c = 10.
The mode is
$$Z = l + \frac{(f - f_1)\,c}{2f - f_1 - f_2} = 39.5 + \frac{(36 - 29) \times 10}{2 \times 36 - 29 - 25} = 39.5 + \frac{70}{18} = 43.4\%$$
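The interpolation formula for the mode of a continuous distribution can be written as a small function. The name `grouped_mode` and its keyword arguments are illustrative assumptions that mirror the symbols l, f, f1, f2 and c used above.

```python
def grouped_mode(l, f, f1, f2, c):
    """Mode of a continuous distribution: l + (f - f1) * c / (2f - f1 - f2)."""
    return l + (f - f1) * c / (2 * f - f1 - f2)

z = grouped_mode(l=39.5, f=36, f1=29, f2=25, c=10)
print(round(z, 1))  # 43.4
```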
Merits and demerits of mode
The merits and demerits of the mode are the same as those of the
median. In addition, one further demerit of the mode can be listed:
for some frequency distributions, the mode is ill-defined.
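Formulae for standard deviation ($\sigma$), by method and type of series
1. Actual mean
   Individual observations: $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$
   Discrete series: $\sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}}$
   Continuous series: $\sigma = \sqrt{\frac{\sum f(m - \bar{X})^2}{N}}$
2. Direct method
   Individual observations: $\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
   Discrete series: $\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$
   Continuous series: $\sigma = \sqrt{\frac{\sum fm^2}{N} - \left(\frac{\sum fm}{N}\right)^2}$
3. Assumed mean
   Individual observations: $\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$
   Discrete series: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$
   Continuous series: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$
4. Step deviation
   Individual observations: $\sigma = C \times \sqrt{\frac{\sum d'^2}{N} - \left(\frac{\sum d'}{N}\right)^2}$
   Discrete series: $\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2}$
   Continuous series: $\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2}$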
Individual Observations
Method 1: Deviation taken from actual mean
Standard Deviation, $\sigma = \sqrt{\frac{\sum x^2}{N}}$, where $x = X - \bar{X}$. This method is simple
when the $X - \bar{X}$ values are integers.
Steps:
1. Form a table with the given values, X, in the first column.
2. Find out the arithmetic mean, $\bar{X} = \frac{\sum X}{N}$
3. Find out the deviation of each value from the actual mean and call it x,
i.e., find $x = X - \bar{X}$. Enter those values in the next column.
4. Find out the squares of the deviations of the values from the actual mean,
i.e., find $x^2$. Enter those values in the next column.
5. Find out the mean of the squared deviations of the values from their
..
X      x = X − X̄    x²
74     0             0
76     2             4
$\sum X = 740$, $\sum (X - \bar{X}) = 0$, $\sum x^2 = 44$
$$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{4.4} = 2.10$$
Standard Deviation, $\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
This method can be used for all kinds of data. This formula is also used later for
correcting mistakes in the calculations.
Steps:
1. Form a table with the given values, X, in the first column.
2. Find the square of each X and write it in the next column under the title X².
3. Find the totals $\sum X$ and $\sum X^2$, and N, the number of values.
4. Substitute in the above formula and simplify.
Example 2:
Ten students of the B.Com class of a college obtained the following marks
in statistics out of 100. Calculate the standard deviation.
S. No : 1 2 3 4 5 6 7 8 9 10
Marks : 5 10 20 25 40 42 45 48 70 80
Solution:
S. No    Marks (X)    X²
1        5            25
2        10           100
3        20           400
4        25           625
5        40           1600
6        42           1764
7        45           2025
8        48           2304
9        70           4900
10       80           6400
Total    ΣX = 385     ΣX² = 20143
Standard Deviation
$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{20143}{10} - \left(\frac{385}{10}\right)^2} = \sqrt{2014.30 - 38.5^2}$$
$$= \sqrt{2014.30 - 1482.25} = \sqrt{532.05} = 23.07$$
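The direct method of Example 2 can be sketched in Python and compared with the standard library's population standard deviation, `statistics.pstdev`. The variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import pstdev

marks = [5, 10, 20, 25, 40, 42, 45, 48, 70, 80]

n = len(marks)
sum_x = sum(marks)                      # 385
sum_x2 = sum(x * x for x in marks)      # 20143
sd_direct = sqrt(sum_x2 / n - (sum_x / n) ** 2)

print(round(sd_direct, 2))        # 23.07
print(round(pstdev(marks), 2))    # 23.07
```

Both lines divide by N, not N − 1, which matches the formulae used in this unit.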
Steps:
1. Form a table with the given values, X, in the first column.
2. Assume any value as ‘A’ if it is not specified in the problem. It is
preferable to assume a value between the minimum value and the
maximum value of X as A.
3. Find out the deviation of each value from the assumed mean A and
call it d, i.e., find d = X – A, and write these values in the next column.
4. Write the squares of the deviations, d², in the next column.
5. Find $\sum d$ and $\sum d^2$ and identify N, the number of values. Substitute them in
the above formula and simplify.
Example 3:
For the data below, calculate the standard deviation:
40, 50, 60, 70, 80, 90, 100
Solution:
X        d = X – A (A = 70)    d²
40       –30                   900
50       –20                   400
60       –10                   100
70       0                     0
80       10                    100
90       20                    400
100      30                    900
Total    Σd = 0                Σd² = 2800
Standard Deviation
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{2800}{7} - \left(\frac{0}{7}\right)^2} = \sqrt{400 - 0^2} = 20$$
Standard Deviation, $\sigma = C \times \sqrt{\frac{\sum d'^2}{N} - \left(\frac{\sum d'}{N}\right)^2}$, where $d' = \frac{X - A}{C}$
This method is preferred when the $X - \bar{X}$ values are fractions and there is a common
difference between the X values.
Steps:
1. Form a table with the given values, X, in the first column.
2. Choose A and C as mentioned under Arithmetic mean.
3. Find out the step deviation corresponding to each X, i.e., find
$d' = \frac{X - A}{C}$, and write those values in the next column.
4. Write the squares of d', i.e., $d'^2$, in the next column.
5. Find $\sum d'$ and $\sum d'^2$ and identify N, the number of values. Substitute them
in the above formula and simplify.
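Example 4: Given below are the marks obtained by 5 B.Sc. students.
Roll No : 101 102 103 104 105
Marks : 10 30 20 25 15
Calculate the standard deviation.
Solution:
Roll No.    Marks (X)    d' = (X – A)/C (A = 20, C = 5)    d'²
101         10           –2                                4
102         30           2                                 4
103         20           0                                 0
104         25           1                                 1
105         15           –1                                1
Total                    Σd' = 0                           Σd'² = 10
Standard Deviation
$$\sigma = C \times \sqrt{\frac{\sum d'^2}{N} - \left(\frac{\sum d'}{N}\right)^2} = 5 \times \sqrt{\frac{10}{5} - \left(\frac{0}{5}\right)^2} = 5 \times \sqrt{2 - 0^2} = 5 \times 1.4142 = 7.07$$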
X        f         fX          x = X – X̄    x²    fx²
6        7         42          –6            36    252
9        12        108         –3            9     108
12       13        156         0             0     0
15       10        150         3             9     90
18       8         144         6             36    288
Total    N = 50    ΣfX = 600   –             –     Σfx² = 738
Arithmetic mean
$$\bar{X} = \frac{\sum fX}{N} = \frac{600}{50} = 12.00$$
Standard Deviation
$$\sigma = \sqrt{\frac{\sum fx^2}{N}} = \sqrt{\frac{738}{50}} = \sqrt{14.76} = 3.84$$
Method 2: Direct Method
Under this method, the formula becomes the following:
Standard Deviation, $\sigma = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2}$
Steps:
1. Form a table with the given values, x, and the frequencies, f, in the first
two columns.
2. Multiply each x by the corresponding f to find fx. Write all such fx
values in the next column.
3. Multiply each fx by the corresponding x to find fx² (it is not (fx)²; that
is, fx should not be squared). Write all such fx² values in the next column.
4. Find N (= Σf), Σfx and Σfx².
5. Substitute in the above formula and simplify.
Example 6: Calculate the standard deviation.
No. of goals scored in a match (x) : 0 1 2 3 4 5
No. of matches (f) : 1 2 4 3 0 2
Solution:
x        f         fx          fx²
0        1         0           0
1        2         2           2
2        4         8           16
3        3         9           27
4        0         0           0
5        2         10          50
Total    N = 12    Σfx = 29    Σfx² = 95
Standard Deviation
$$\sigma = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2} = \sqrt{\frac{95}{12} - \left(\frac{29}{12}\right)^2} = \sqrt{7.9167 - 2.4167^2}$$
$$= \sqrt{7.9167 - 5.8404} = \sqrt{2.0763} = 1.44$$
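Example 7:
Calculate the standard deviation from the following data:
x : 6 9 12 15 18
f : 7 12 19 10 2
Solution:
Let X: 6, 9, 12, 15 and 18.
X        f         d = X – A (A = 12)    fd          fd²
6        7         –6                    –42         252
9        12        –3                    –36         108
12       19        0                     0           0
15       10        3                     30          90
18       2         6                     12          72
Total    N = 50    –                     Σfd = –36   Σfd² = 522
Standard Deviation
$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = \sqrt{\frac{522}{50} - \left(\frac{-36}{50}\right)^2} = \sqrt{10.44 - 0.72^2}$$
$$= \sqrt{10.4400 - 0.5184} = \sqrt{9.9216} = 3.15$$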
13.5.1 Method 4: Step Deviation Method
The following formula is used.
Standard Deviation, $\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2}$
where $d' = \frac{X - A}{C}$ and $N = \sum f$
Steps:
1. Form a table with the given values, X, and the frequencies, f, in the first
two columns.
2. Choose the values for A and C.
3. Find out $d' = \frac{X - A}{C}$ corresponding to each X and enter them in the next
column.
4. Multiply each d' by the corresponding f to get fd'. Enter them in the
next column.
5. Multiply each fd' by the corresponding d' to get fd'². Enter them in
the next column.
6. Find $N = \sum f$, $\sum fd'$ and $\sum fd'^2$.
7. Substitute in the above formula and simplify.
Example 8:
The weekly salaries of a group of employees are given in the following table.
Find the mean and standard deviation of the salaries.
Salaries (in Rs.) : 75 80 85 90 95 100
No. of persons : 3 7 18 12 6 4
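Solution:
Salary (in Rs.) x    No. of persons f    d' = (X – A)/C (A = 85; C = 5)    fd'          fd'²
75                   3                   –2                                –6           12
80                   7                   –1                                –7           7
85                   18                  0                                 0            0
90                   12                  1                                 12           12
95                   6                   2                                 12           24
100                  4                   3                                 12           36
Total                N = 50              –                                 Σfd' = 23    Σfd'² = 91
Arithmetic mean
$$\bar{X} = A + \frac{\sum fd'}{N} \times C = 85 + \frac{23}{50} \times 5 = 85 + 2.3 = 87.30 \text{ (Rs.)}$$
Standard Deviation
$$\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 5 \times \sqrt{\frac{91}{50} - \left(\frac{23}{50}\right)^2} = 5 \times \sqrt{1.82 - 0.46^2}$$
$$= 5 \times \sqrt{1.8200 - 0.2116} = 5 \times \sqrt{1.6084} = 5 \times 1.27 = 6.35 \text{ (Rs.)}$$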
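The step deviation calculation of Example 8 can be sketched as follows. The variable names are illustrative assumptions, with `A` and `C` standing for the assumed mean and the common step.

```python
from math import sqrt

salaries = [75, 80, 85, 90, 95, 100]
persons = [3, 7, 18, 12, 6, 4]
A, C = 85, 5                                   # assumed mean and step size

n = sum(persons)
d = [(x - A) / C for x in salaries]            # step deviations d'
sum_fd = sum(f * di for f, di in zip(persons, d))          # 23
sum_fd2 = sum(f * di * di for f, di in zip(persons, d))    # 91

mean = A + C * sum_fd / n
sd = C * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

print(round(mean, 2), round(sd, 2))  # 87.3 6.34
```

Without intermediate rounding the standard deviation comes to about 6.34; the figure 6.35 in the worked solution arises from rounding the square root to 1.27 before multiplying by 5.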
Standard deviation, $\sigma = \sqrt{\frac{\sum f(m - \bar{X})^2}{N}}$
Steps:
1. Form a table with class intervals and class frequencies in the first two
columns.
2. Find the mid values (m) and write them in the next column.
3. Find the products of f and m and write them in the next column.
4. Find $\bar{X} = \frac{\sum fm}{N}$, where $N = \sum f$. $\bar{X}$ may be found by other formulae also.
5. Subtract $\bar{X}$ from each m. Enter the resulting $m - \bar{X}$ values in the next
column.
6. Write $(m - \bar{X})^2$ in the next column.
7. Find $f(m - \bar{X})^2$ and write them in the next column.
8. Find $\sum f(m - \bar{X})^2$.
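Class      f         m     fm          m – X̄    (m – X̄)²    f(m – X̄)²
10 – 20    5         15    75          –8        64           320
20 – 30    9         25    225         2         4            36
40 – 50    1         45    45          22        484          484
Total      N = 20    –     Σfm = 460   –         –            Σf(m – X̄)² = 1920
Arithmetic Mean
$$\bar{X} = \frac{\sum fm}{N} = \frac{460}{20} = 23$$
Standard Deviation
$$\sigma = \sqrt{\frac{\sum f(m - \bar{X})^2}{N}} = \sqrt{\frac{1920}{20}} = \sqrt{96} = 9.80$$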
Method 2: Direct method
Under this method, the form of the formula is as follows:
Standard Deviation, $\sigma = \sqrt{\frac{\sum fm^2}{N} - \left(\frac{\sum fm}{N}\right)^2}$
Steps:
1. Form a table with class intervals and frequencies in the first two
columns.
2. Find the mid values (m) and write them in the next column.
3. Find the products of f and m and write those fm values in the next
column.
4. Find the products of m and fm and write those fm² values in the next
column.
5. Find N (= Σf), Σfm and Σfm².
6. Substitute in the formula and simplify.
Example 10:
The following data were obtained while observing the life span of a few neon
lights of a company. Calculate the S.D.
Life span (years) : 4 – 6 6 – 8 8 – 10 10 – 12 12 – 14
No. of neon lights : 10 17 32 21 20
Solution:
Life span (years)    No. of neon lights (f)    Mid value (m)    fm          fm²
4 – 6                10                        5                50          250
6 – 8                17                        7                119         833
8 – 10               32                        9                288         2592
10 – 12              21                        11               231         2541
12 – 14              20                        13               260         3380
Total                N = 100                   –                Σfm = 948   Σfm² = 9596
Standard Deviation,
$$\sigma = \sqrt{\frac{\sum fm^2}{N} - \left(\frac{\sum fm}{N}\right)^2} = \sqrt{\frac{9596}{100} - \left(\frac{948}{100}\right)^2} = \sqrt{95.96 - 9.48^2}$$
$$= \sqrt{95.9600 - 89.8704} = \sqrt{6.0896} = 2.47$$
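The direct method for a continuous series, as used in Example 10, can be sketched in Python. The class limits follow the solution table (last class 12 – 14, mid value 13), and the tuple layout is an assumption made for illustration.

```python
from math import sqrt

# (lower limit, upper limit, frequency) from the solution table of Example 10
classes = [(4, 6, 10), (6, 8, 17), (8, 10, 32), (10, 12, 21), (12, 14, 20)]

n = sum(f for _, _, f in classes)
sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)           # 948
sum_fm2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)   # 9596
sd = sqrt(sum_fm2 / n - (sum_fm / n) ** 2)

print(round(sd, 2))  # 2.47
```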
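Combined mean
$$\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2} = \frac{63 \times 27.6 + 26 \times 19.2}{63 + 26} = \frac{1738.8 + 499.2}{89} = \frac{2238.0}{89} = 25.15$$
Combined standard deviation:
$$\sigma_{12} = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$
where $d_1 = \bar{X}_1 - \bar{X}_{12}$ and $d_2 = \bar{X}_2 - \bar{X}_{12}$.
For three groups, the combined mean is
$$\bar{X}_{123} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + N_3 \bar{X}_3}{N_1 + N_2 + N_3}$$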
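The combined mean and combined standard deviation formulae can be sketched as two small functions. The function names are illustrative assumptions, and the two short lists `a` and `b` are hypothetical data used only to check the formula against a direct calculation; only the figures 63, 27.6, 26 and 19.2 come from the worked example.

```python
from math import sqrt
from statistics import fmean, pstdev

def combined_mean(n1, m1, n2, m2):
    """Mean of two groups put together, from their sizes and means."""
    return (n1 * m1 + n2 * m2) / (n1 + n2)

def combined_sd(n1, m1, s1, n2, m2, s2):
    """Standard deviation of two groups put together."""
    m12 = combined_mean(n1, m1, n2, m2)
    d1, d2 = m1 - m12, m2 - m12          # group means minus the combined mean
    return sqrt((n1 * s1**2 + n2 * s2**2 + n1 * d1**2 + n2 * d2**2) / (n1 + n2))

print(round(combined_mean(63, 27.6, 26, 19.2), 2))   # 25.15

# Hypothetical groups, used only to verify the combined formula.
a, b = [2, 4, 6], [10, 12, 14, 16]
s12 = combined_sd(len(a), fmean(a), pstdev(a), len(b), fmean(b), pstdev(b))
print(abs(s12 - pstdev(a + b)) < 1e-9)               # True
```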
Uses
Standard deviation is the best absolute measure of dispersion. It is a part of
many statistical concepts such as skewness, kurtosis, correlation,
regression, estimation, sampling, tests of significance and statistical quality
control. Standard deviation is of immense use not only in statistics but also
in biology, education, psychology and other disciplines.
Merits
1. Standard deviation is rigidly defined
2. It is calculated on the basis of the magnitudes of all the items
3. It could be manipulated further. The combined standard deviation can
be calculated
4. Mistakes in its calculation can be corrected. The entire calculation
need not be redone.
5. Coefficient of variation is based on standard deviation. It is the best
and most widely used relative measure of dispersion
6. It is the measure of dispersion least affected by sampling fluctuations. This
property of sampling stability has given it an indispensable place in tests of
significance.
7. It reduces the complexity in the approach of normal distribution by
providing the standard normal variable.
8. It is the most important absolute measure of dispersion. It is used in all
the areas of statistics. It is widely used in other disciplines such as
psychology, education and biology as well.
9. Scientific calculators show the standard deviation of any series.
10. Different forms of the formula are available.
Demerits
1. Compared with other absolute measures of dispersion, it is difficult to
calculate
2. It is not simple to understand
3. It gives more weightage to the items away from the mean than those
near the mean as the deviations are squared.
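Solution:
X        X – X̄ (X̄ = 51)    (X – X̄)²
40       –11                 121
41       –10                 100
45       –6                  36
49       –2                  4
50       –1                  1
51       0                   0
55       4                   16
59       8                   64
60       9                   81
60       9                   81
Total    ΣX = 510            Σ(X – X̄)² = 504
Mean
$$\bar{X} = \frac{\sum X}{N} = \frac{510}{10} = 51.00$$
S.D.
$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{504}{10}} = \sqrt{50.4} = 7.10$$
C.V.
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{7.10}{51.00} \times 100 = 13.92$$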
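The coefficient of variation in the solution above can be checked with a short Python sketch. It uses the standard library's `fmean` and `pstdev` (population standard deviation, dividing by N); the variable names are assumptions made for illustration.

```python
from statistics import fmean, pstdev

values = [40, 41, 45, 49, 50, 51, 55, 59, 60, 60]

mean = fmean(values)          # 51.0
sd = pstdev(values)           # population standard deviation, about 7.10
cv = sd / mean * 100          # coefficient of variation in percent

print(round(cv, 2))  # 13.92
```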
13.9 Variance
Definition
Variance is the mean square deviation of the values from their arithmetic
mean. Its symbol is $\sigma^2$ (read ‘sigma squared’). Standard deviation is the
positive square root of variance and is denoted by $\sigma$. The term variance
was introduced by R.A. Fisher in the year 1918. It is used extensively in sampling,
analysis of variance, etc. In analysis of variance, the total variation is split into a
few components. Each component is due to one factor of variation. The
significance of the variation is then tested.
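Formulae
These formulae can be compared with those under standard deviation.
1. Actual mean
   Individual observations: $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
   Discrete series: $\sigma^2 = \frac{\sum f(X - \bar{X})^2}{N}$
   Continuous series: $\sigma^2 = \frac{\sum f(m - \bar{X})^2}{N}$
2. Direct method
   Individual observations: $\sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$
   Discrete series: $\sigma^2 = \frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2$
   Continuous series: $\sigma^2 = \frac{\sum fm^2}{N} - \left(\frac{\sum fm}{N}\right)^2$
3. Assumed mean
   Individual observations: $\sigma^2 = \frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2$
   Discrete series: $\sigma^2 = \frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2$
   Continuous series: $\sigma^2 = \frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2$
4. Step deviation
   Individual observations: $\sigma^2 = C^2 \left[\frac{\sum d'^2}{N} - \left(\frac{\sum d'}{N}\right)^2\right]$
   Discrete series: $\sigma^2 = C^2 \left[\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2\right]$
   Continuous series: $\sigma^2 = C^2 \left[\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2\right]$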
Individual Observations
Example 13: The numbers of goals scored by a team in different matches are
given below. Calculate the variance.
Goals (X)    X – X̄ (X̄ = 1.7)    (X – X̄)²
2            0.3                  0.09
0            –1.7                 2.89
1            –0.7                 0.49
3            1.3                  1.69
0            –1.7                 2.89
4            2.3                  5.29
3            1.3                  1.69
1            –0.7                 0.49
1            –0.7                 0.49
2            0.3                  0.09
Mean
$$\bar{X} = \frac{\sum X}{N} = \frac{17}{10} = 1.7$$
Variance
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{16.10}{10} = 1.61$$
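The variance in Example 13 can be verified with a few lines of Python. This is a minimal sketch; `pvariance` is the standard library's population variance (dividing by N), and the variable names are assumptions.

```python
from statistics import pvariance

goals = [2, 0, 1, 3, 0, 4, 3, 1, 1, 2]

mean = sum(goals) / len(goals)                                   # 1.7
variance = sum((x - mean) ** 2 for x in goals) / len(goals)      # mean squared deviation

print(round(variance, 2))           # 1.61
print(round(pvariance(goals), 2))   # 1.61
```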
Self Assessment Questions
1. Why is standard deviation considered to be the most popular
measure of dispersion?
2. Calculate the standard deviation from the following data:
14, 22, 9, 15, 20, 17, 12, 11
3. The table below gives the marks obtained by 10 B.Com. students in a
statistics examination. Calculate the standard deviation.
Numbers: 1 2 3 4 5 6 7 8 9 10
Marks: 43 48 65 57 31 60 37 48 78 59
13.10 Summary
In this unit we discussed the concept of standard deviation and the different
forms of its formula, illustrated with worked examples. The concept of
variance was then discussed with examples.
13.12 Answers
Self Assessment Questions
1. Karl Pearson introduced the concept of standard deviation in 1893. It is
the most important measure of dispersion and is widely used in many
statistical formulae. Standard deviation is also called root-mean-square
deviation, mean error or mean square error.
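2.
Value (X)    X – X̄ (X̄ = 15)    (X – X̄)²
14           –1                  1
22           7                   49
9            –6                  36
15           0                   0
20           5                   25
17           2                   4
12           –3                  9
11           –4                  16
Total        ΣX = 120            Σ(X – X̄)² = 140
$$\bar{X} = \frac{\sum X}{N} = \frac{120}{8} = 15$$
$$\sigma = \sqrt{\frac{\sum x^2}{N}} \text{ or } \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{140}{8}} = \sqrt{17.5} = 4.18$$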
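Alternatively
We can find the standard deviation by using the variable directly, i.e., no
deviation is found.
Values (X)    X²
14            196
22            484
9             81
15            225
20            400
17            289
12            144
11            121
ΣX = 120      ΣX² = 1940
$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{1940}{8} - \left(\frac{120}{8}\right)^2} = \sqrt{242.5 - 225} = \sqrt{17.5} = 4.18$$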
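3. Deviation taken from assumed mean method
The formula is
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
$$\sigma = \sqrt{\frac{1826}{10} - \left(\frac{26}{10}\right)^2} = \sqrt{182.6 - 2.6^2} = \sqrt{182.6 - 6.76} = \sqrt{175.84} = 13.26$$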
4.
Marks (x)    f         fx           d = x – x̄ (x̄ = 30.8)    d²        fd²
10           8         80           –20.8                     432.64    3461.12
20           12        240          –10.8                     116.64    1399.68
30           20        600          –0.8                      0.64      12.80
40           10        400          9.2                       84.64     846.40
50           7         350          19.2                      368.64    2580.48
60           3         180          29.2                      852.64    2557.92
ΣX = 210     N = 60    Σfx = 1850                                       Σfd² = 10,858.40
Mean
$$\bar{X} = \frac{\sum fx}{N} = \frac{1850}{60} = 30.8$$
Standard deviation
$$\sigma = \sqrt{\frac{\sum fd^2}{N}} = \sqrt{\frac{10858.40}{60}} = 13.45$$
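Another method
Marks (X)    f         d = x – 30    fd          fd²
10           8         –20           –160        3200
20           12        –10           –120        1200
30           20        0             0           0
40           10        10            100         1000
50           7         20            140         2800
60           3         30            90          2700
ΣX = 210     N = 60    –             Σfd = 50    Σfd² = 10,900
Standard deviation
$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$
Dividing the deviations by the common factor C = 10 (step deviations $d' = d/C$), this becomes
$$\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2}$$
$\sum fd'^2 = 109$; $\sum fd' = 5$; N = 60; C = 10
$$\sigma = 10 \times \sqrt{\frac{109}{60} - \left(\frac{5}{60}\right)^2} = 10 \times \sqrt{1.817 - 0.0069} = 10 \times \sqrt{1.81} = 10 \times 1.345 = 13.45$$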
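5.
Class (x)    Mid value    d' = (X – A)/C = (X – 35)/10    f         fd'           fd'²
0 – 10       5            –3                              8         –24           72
10 – 20      15           –2                              12        –24           48
20 – 30      25           –1                              17        –17           17
30 – 40      35           0                               14        0             0
40 – 50      45           1                               9         9             9
50 – 60      55           2                               7         14            28
60 – 70      65           3                               4         12            36
Total                                                     N = 71    Σfd' = –30    Σfd'² = 210
$$\bar{X} = A + \frac{\sum fd'}{N} \times C$$
A = 35, $\sum fd' = -30$, N = 71, C = 10
$$\bar{X} = 35 + \frac{-30}{71} \times 10 = 35 - 4.225 = 30.775$$
Standard deviation
$$\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 10 \times \sqrt{\frac{210}{71} - \left(\frac{-30}{71}\right)^2} = 10 \times \sqrt{2.957 - 0.4225^2}$$
$$= 10 \times \sqrt{2.7785} = 10 \times 1.667 = 16.67$$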
Terminal Questions
1. The standard deviation is an absolute measure of dispersion. The coefficient
of variation is regarded as the percentage variation in the mean, the standard
deviation being regarded as the total variation in the mean. That is, it
shows the relationship between the standard deviation and the
arithmetic mean, expressed as a percentage.
$$\text{Coefficient of variation} = \frac{\text{standard deviation}}{\text{mean}} \times 100$$
(or) $\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$
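2. Calculation of coefficient of variation
Price x    dx = x – 20    dx²         Price y    dy = y – 15    dy²
20         0              0           10         –5             25
22         2              4           20         5              25
19         –1             1           18         3              9
23         3              9           12         –3             9
16         –4             16          15         0              0
Σx = 100   Σdx = 0        Σdx² = 30   Σy = 75    Σdy = 0        Σdy² = 68
City A
$$\bar{x} = \frac{\sum x}{N} = \frac{100}{5} = 20$$
$$\sigma_x = \sqrt{\frac{\sum dx^2}{N}} = \sqrt{\frac{30}{5}} = \sqrt{6} = 2.45$$
$$\text{C.V.} = \frac{\sigma_x}{\bar{x}} \times 100 = \frac{2.45}{20} \times 100 = 12.25$$
Variance $\sigma_x^2 = 2.45^2 \approx 6$
City B
$$\bar{y} = \frac{\sum y}{N} = \frac{75}{5} = 15$$
$$\sigma_y = \sqrt{\frac{\sum dy^2}{N}} = \sqrt{\frac{68}{5}} = \sqrt{13.6} = 3.69$$
$$\text{C.V.} = \frac{\sigma_y}{\bar{y}} \times 100 = \frac{3.69}{15} \times 100 = 24.6$$
Variance $\sigma_y^2 = 3.69^2 \approx 13.6$
City A had more stable prices than city B because the coefficient of
variation is lower in city A.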
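3. “Skewness is the degree of asymmetry, or departure from symmetry, of
a distribution.”
1. Karl Pearson’s coefficient of skewness
$$SK_P = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} \quad \text{or} \quad \frac{\bar{X} - Z}{\sigma}$$
(or)
$$SK_P = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{X} - M)}{\sigma}$$
The second form can be used when the mode is ill-defined.
2. Bowley’s coefficient of skewness
$$SK_B = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$$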
M.A.
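4.
Mid value (x)    Frequency (f)    d' = (x – A)/C (A = 50, C = 10)    fd'         fd'²          Class interval    cf
20               1                –3                                 –3          9             15 – 25           1
30               12               –2                                 –24         48            25 – 35           13
40               55               –1                                 –55         55            35 – 45           68
50               91               0                                  0           0             45 – 55           159
60               55               1                                  55          55            55 – 65           214
70               12               2                                  24          48            65 – 75           226
80               1                3                                  3           9             75 – 85           227
Total            N = 227          –                                  Σfd' = 0    Σfd'² = 224   –                 –
i) A.M.
$$\bar{X} = A + \frac{\sum fd'}{N} \times C = 50 + \frac{0}{227} \times 10 = 50 + 0 = 50$$
S.D.
$$\sigma = C \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 10 \times \sqrt{\frac{224}{227} - \left(\frac{0}{227}\right)^2} = 9.93$$
Q1:
$$Q_1 = 35 + \frac{10}{55}(56.75 - 13) = 35 + \frac{10}{55}(43.75) = 42.95$$
Q2 (or) median
$\frac{N}{2} = \frac{227}{2} = 113.5$. Median class interval: 45 – 55
L = 45, f = 91, h = 10, C = 68
$$M = L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 45 + \frac{10}{91}(113.5 - 68) = 50$$
Q3: $\frac{3N}{4} = 170.25$. Q3 class interval: 55 – 65
L = 55, h = 10, f = 55, C = 159
$$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - C\right) = 55 + \frac{10}{55}(170.25 - 159) = 57.05$$
Bowley’s Skewness
$$SK_B = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1} = \frac{57.05 + 42.95 - 2 \times 50}{57.05 - 42.95} = \frac{0}{14.10} = 0$$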
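The quartile and median interpolations above all follow the same pattern, so they can be sketched with one helper. The function name `partition_value` and the tuple layout are assumptions made for illustration; the sketch assumes every class has a non-zero frequency.

```python
# (lower limit, upper limit, frequency) for the distribution in Question 4
classes = [(15, 25, 1), (25, 35, 12), (35, 45, 55), (45, 55, 91),
           (55, 65, 55), (65, 75, 12), (75, 85, 1)]

def partition_value(classes, fraction):
    """Interpolated value below which `fraction` of the total frequency lies."""
    n = sum(f for _, _, f in classes)
    target = fraction * n
    cumulative = 0                       # cumulative frequency before the class
    for lower, upper, f in classes:
        if cumulative + f >= target:
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")

q1 = partition_value(classes, 0.25)
median = partition_value(classes, 0.50)
q3 = partition_value(classes, 0.75)
bowley = (q3 + q1 - 2 * median) / (q3 - q1)

print(round(q1, 2), round(median, 2), round(q3, 2))  # 42.95 50.0 57.05
```

Because the distribution is symmetric about 50, `bowley` evaluates to zero up to floating-point rounding.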
5.
Class intervals    Frequency    True class intervals    Cumulative frequency
30 – 49            25           29.5 – 49.5             25
50 – 69            40           49.5 – 69.5             65
70 – 89            50           69.5 – 89.5             115
90 – 109           100          89.5 – 109.5            215
110 – 129          80           109.5 – 129.5           295
130 – 149          50           129.5 – 149.5           345
150 – 169          25           149.5 – 169.5           370
                   N = 370
$\frac{10N}{100} = \frac{10 \times 370}{100} = 37$. The 37th cumulative frequency is included in the class
interval 49.5 – 69.5. It is the P10 class interval.
L10 = 49.5, h10 = 20, f10 = 40, C10 = 25
$$P_{10} = L_{10} + \frac{h_{10}}{f_{10}}\left(\frac{10N}{100} - C_{10}\right) = 49.5 + \frac{20}{40}(37 - 25) = 49.5 + 6 = 55.5$$
$\frac{N}{2} = \frac{370}{2} = 185$. 89.5 – 109.5 is the median class interval.
L = 89.5, h = 20, f = 100, C = 115
$$M = L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 89.5 + \frac{20}{100}(185 - 115) = 89.5 + 14 = 103.5$$
$\frac{90N}{100} = \frac{90 \times 370}{100} = 333$. 129.5 – 149.5 is the P90 class interval.
$$P_{90} = L_{90} + \frac{h_{90}}{f_{90}}\left(\frac{90N}{100} - C_{90}\right) = 129.5 + \frac{20}{50}(333 - 295) = 129.5 + 15.2 = 144.7$$
Kelly’s coefficient of skewness
$$SK_K = \frac{P_{90} + P_{10} - 2M}{P_{90} - P_{10}} = \frac{144.7 + 55.5 - 2 \times 103.5}{144.7 - 55.5} = \frac{-6.8}{89.2} = -0.0762$$
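The percentile interpolation and Kelly's coefficient can be sketched in Python using the true class limits. The function name `percentile` and the tuple layout are assumptions made for illustration, and every class is assumed to have a non-zero frequency.

```python
# (true lower limit, true upper limit, frequency) for Question 5
classes = [(29.5, 49.5, 25), (49.5, 69.5, 40), (69.5, 89.5, 50),
           (89.5, 109.5, 100), (109.5, 129.5, 80), (129.5, 149.5, 50),
           (149.5, 169.5, 25)]

def percentile(classes, k):
    """k-th percentile: L + (h / f) * (k * N / 100 - C) for the class containing it."""
    n = sum(f for _, _, f in classes)
    target = k * n / 100
    cumulative = 0                       # C: cumulative frequency before the class
    for lower, upper, f in classes:
        if cumulative + f >= target:
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("k must lie between 0 and 100")

p10, med, p90 = percentile(classes, 10), percentile(classes, 50), percentile(classes, 90)
kelly = (p90 + p10 - 2 * med) / (p90 - p10)

print(round(p10, 1), round(med, 1), round(p90, 1))  # 55.5 103.5 144.7
print(round(kelly, 4))                              # -0.0762
```

The negative sign shows that the numerator 144.7 + 55.5 − 207 is −6.8, i.e., the distribution is slightly negatively skewed.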