
Final Unit III Mathematic IV (Stat. Tech. I)

Uploaded by

Piyush lavania
Engineering Mathematics-IV (KAS-302)


Unit-III
Statistical Techniques-I

Statistics is as old as the human society itself.


Definition: Statistics is the science which deals with the methods of collecting, classifying, presenting, comparing and interpreting numerical data, drawing valid conclusions from them, and thereafter making reasonable decisions on the basis of such analysis.
Variable (Variate): A quantity which can vary from one individual to another is called a variable or variate, such as heights, weights, ages and wages of persons, rainfall records of cities, etc.
Continuous variables: Quantities which can take any numerical value within a certain range are called continuous variables.
For example, as a child grows, his/her height takes all possible values from 50 cm to 100 cm.
Discrete variables: Quantities which are incapable of taking all possible values are called discrete or discontinuous variables.
For example, the number of children a person can have takes only whole-number values such as 1, 2, 3, etc. (no value between any two consecutive integers).
Comparison of frequency distributions:
When two or more different series of the same type are compared, tabulation of the observations is not sufficient. It is often desirable to describe the characteristics of a frequency distribution quantitatively.
The characteristics of a frequency distribution: There are two fundamental characteristics in which similar frequency distributions may differ.
(i) They may differ in measures of location or central tendency, i.e. in the value of the variate 'x' around which they centre.
(ii) They may differ in the extent to which the observations are scattered about the central value. Measures of this kind are called measures of dispersion.
Measures of central tendency:
A single representative value summarises a frequency distribution and helps in its understanding and comparison. Measures of central tendency or measures of location (also popularly called averages) serve this purpose.
There are five types of averages in common use.
1. Arithmetic Average or Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean
# Arithmetic Mean:
In case of individual observations (frequency is not given):
(i) Direct Method:
If the observations are $x_1, x_2, x_3, \ldots, x_n$, then
\[ \text{A.M.} = \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x}{n} \]
(ii) Short cut Method (shifting of origin):
Shifting the origin to an arbitrary point 'a', the formula $\bar{x} = \frac{\sum x}{n}$ becomes
\[ \bar{x} = a + \frac{\sum (x - a)}{n} = a + \frac{\sum d_x}{n}, \quad \text{where } d_x = x - a \]
a = arbitrary number, called Assumed Mean.
$\sum d_x = \sum (x - a) = (x_1 - a) + (x_2 - a) + \cdots + (x_n - a)$
n = number of observations
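The two methods above can be sketched in a few lines of Python. This is only an illustration: the function names `mean_direct` and `mean_shortcut` are made up for this sketch, and the sample observations are illustrative values.

```python
def mean_direct(xs):
    """Direct method: sum of the observations divided by their number."""
    return sum(xs) / len(xs)


def mean_shortcut(xs, a):
    """Short cut method: assumed mean a plus the average deviation from a."""
    return a + sum(x - a for x in xs) / len(xs)


marks = [5, 15, 25, 35, 45, 55]    # illustrative observations
print(mean_direct(marks))          # 30.0
print(mean_shortcut(marks, a=40))  # 30.0
```

Whatever value is chosen for the assumed mean, both methods give the same result; the short cut only keeps the numbers small when working by hand.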
In case of discrete series:
(i) Direct Method:
If the frequency distribution is
x : $x_1, x_2, x_3, \ldots, x_n$
f : $f_1, f_2, f_3, \ldots, f_n$
then
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum f x}{\sum f} \]
(ii) Shortcut Method (shift of origin): Shifting the origin to an arbitrary point 'a', the formula $\bar{x} = \frac{\sum f x}{\sum f}$ becomes
\[ \bar{x} = a + \frac{\sum f (x - a)}{\sum f} = a + \frac{\sum f d_x}{\sum f} = a + \frac{\sum f d_x}{N} \]
where $\sum f d_x = \sum f (x - a) = f_1 (x_1 - a) + f_2 (x_2 - a) + \cdots + f_n (x_n - a)$,
a = arbitrary number, called Assumed Mean, and
$N = f_1 + f_2 + \cdots + f_n$.

# Weighted Mean: If the variate values are not of equal importance, we may attach to them 'weights' $w_1, w_2, w_3, \ldots, w_n$ as measures of their importance.
The weighted mean $\bar{x}_w$ is defined as
\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w x}{\sum w} \]
Ex. Find the mean.

Marks        No. of students     Marks        No. of students
Below 10      5                  Below 60     60
Below 20      9                  Below 70     70
Below 30     17                  Below 80     78
Below 40     29                  Below 90     83
Below 50     45                  Below 100    85

Sol. Converting the cumulative series into class frequencies and taking a = 55, h = 10:

Marks     Mid value (x)    f     x - 55    u = (x - 55)/10    fu
0-10       5                5    -50       -5                 -25
10-20     15                4    -40       -4                 -16
20-30     25                8    -30       -3                 -24
30-40     35               12    -20       -2                 -24
40-50     45               16    -10       -1                 -16
50-60     55               15      0        0                   0
60-70     65               10     10        1                  10
70-80     75                8     20        2                  16
80-90     85                5     30        3                  15
90-100    95                2     40        4                   8
                     N = Σf = 85                         Σfu = -56

Here h = 10.
\[ \bar{x} = a + h \frac{\sum fu}{N} = 55 + 10 \left( \frac{-56}{85} \right) = 48.41 \]
Ex. The mean of 200 items was 50. Later on it was discovered that two items were misread as 92 and 8 instead of 192 and 88. Find the correct mean.
Sol.
Incorrect value of $\bar{x} = 50$, $n = 200$.
\[ \bar{x} = \frac{\sum x}{n} \quad \Rightarrow \quad \sum x = n \bar{x} = 50 \times 200 = 10000 \]
Corrected $\sum x$ = 10000 - (92 + 8) + (192 + 88) = 10180
\[ \text{Correct Mean} = \frac{\text{corrected } \sum x}{n} = \frac{10180}{200} = 50.9 \]
Note: If the frequencies are given in terms of class intervals, the mid values of the class intervals are taken as 'x' and then the above formulae are applied. In case of a continuous series having equal class intervals, say of width 'h', we use a different formula.
Shift of origin and change of scale (Step Deviation Method):
Let $u = \frac{x - a}{h}$, then $x = a + hu$ and
\[ \sum f x = \sum f (a + hu) = a \sum f + h \sum f u \quad \cdots (1) \]
Dividing both sides of eq. (1) by $N = \sum f$, we get
\[ \frac{\sum f x}{N} = a + h \frac{\sum f u}{N} \quad \text{or} \quad \bar{x} = a + h \frac{\sum f u}{N}, \qquad u = \frac{x - a}{h} \]
a = arbitrary number, called Assumed Mean.
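The step deviation formula can be checked numerically. The sketch below is a minimal illustration, with the hypothetical function name `mean_step_deviation`; the data are the class mid values and class frequencies of the marks example worked above.

```python
def mean_step_deviation(mid_values, freqs, a, h):
    """Mean of a grouped series: x_bar = a + h * sum(f*u) / N, with u = (x - a) / h."""
    n_total = sum(freqs)
    sum_fu = sum(f * (x - a) / h for x, f in zip(mid_values, freqs))
    return a + h * sum_fu / n_total


# Mid values 5, 15, ..., 95 and the class frequencies of the marks example
mids = list(range(5, 100, 10))
freqs = [5, 4, 8, 12, 16, 15, 10, 8, 5, 2]
print(round(mean_step_deviation(mids, freqs, a=55, h=10), 2))  # 48.41
```

The result does not depend on the particular choice of a and h; they only simplify the arithmetic.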

Problem Set On Arithmetic Mean

Q.1 Calculate Arithmetic Mean.

Roll No.    1    2    3    4    5    6
Marks       5   15   25   35   45   55

Sol.
\[ \text{A.M.} = \bar{x} = \frac{\sum x}{n} = \frac{180}{6} = 30 \]
Alternate solution (Short Cut): Let assumed mean a = 40.

Roll No.    Marks (x)    (x - a)
1            5           -35
2           15           -25
3           25           -15
4           35            -5
5           45             5
6           55            15
n = 6                    Σ(x - a) = -60

\[ \text{A.M.} = \bar{x} = a + \frac{\sum (x - a)}{n} = 40 + \frac{(-60)}{6} = 30 \]
Q.2 Calculate Mean in discrete series:

Marks              5    15    25    35    45    55
No. of students   10    20    30    50    40    50

Sol.

Marks (x)    No. of students (f)    fx
5            10                       50
15           20                      300
25           30                      750
35           50                     1750
45           40                     1800
55           50                     2750
             N = Σf = 200           Σfx = 7400

\[ \text{A.M.} = \bar{x} = \frac{\sum fx}{N} = \frac{7400}{200} = 37 \]
Alternate Solution (Shortcut Method)
Let assumed mean a = 40.

Marks (x)    dx = (x - a)    No. of students (f)    fdx
5            -35             10                     -350
15           -25             20                     -500
25           -15             30                     -450
35            -5             50                     -250
45             5             40                      200
55            15             50                      750
                             N = 200                Σfdx = -600

\[ \bar{x} = a + \frac{\sum f d_x}{N} = 40 + \frac{(-600)}{200} = 37 \]
Q.3 Calculate Arithmetic Mean of

Marks              0-10    10-20    20-30    30-40    40-50    50-60
No. of students    10      20       30       50       40       30

Sol.

Marks    Mid value (x)    No. of students (f)    fx
0-10      5               10                       50
10-20    15               20                      300
20-30    25               30                      750
30-40    35               50                     1750
40-50    45               40                     1800
50-60    55               30                     1650
                          N = Σf = 180           Σfx = 6300

\[ \text{Mean} = \bar{x} = \frac{\sum fx}{N} = \frac{6300}{180} = 35 \]
Alternate Solution (Shortcut Method)
Let assumed mean a = 40.

Marks    Mid value (x)    No. of students (f)    d = x - a    fd
0-10      5               10                     -35          -350
10-20    15               20                     -25          -500
20-30    25               30                     -15          -450
30-40    35               50                      -5          -250
40-50    45               40                       5           200
50-60    55               30                      15           450
                          N = Σf = 180                        Σfd = -900

\[ \text{Mean} = \bar{x} = a + \frac{\sum fd}{N} = 40 + \frac{(-900)}{180} = 35 \]
Q.4 Calculate Mean by step deviation method

Marks              0-10    10-20    20-30    30-40    40-50    50-60
No. of students    10      20       30       50       40       30

Sol. Let assumed mean a = 45.

Marks    Mid value (x)    No. of students (f)    u = (x - 45)/10    fu
0-10      5               10                     -4                 -40
10-20    15               20                     -3                 -60
20-30    25               30                     -2                 -60
30-40    35               50                     -1                 -50
40-50    45               40                      0                   0
50-60    55               30                      1                  30
                          N = Σf = 180                              Σfu = -180

Here class interval h = 10.
\[ \text{Mean} = \bar{x} = a + h \frac{\sum fu}{N} = 45 + 10 \times \frac{(-180)}{180} = 35 \]
# Median:
Median is the central value of the variable that divides the series into two equal parts when the series is arranged in ascending or descending order. Two types of data are given: ungrouped data (individual series) and grouped data (discrete or continuous series).
Methods:
(i) Individual series:
Step (1): Arrange the items in ascending or descending order of size.
(2) Find the $\left( \frac{N+1}{2} \right)$th item.
(3) Calculate the median:
(a) In case the $\left( \frac{N+1}{2} \right)$th item works out to be a whole number,
Median = size of the $\left( \frac{N+1}{2} \right)$th item.
(b) In case the $\left( \frac{N+1}{2} \right)$th item works out to be a fraction,
Median = size of full item + 50% of the difference between the size of the immediate next item and the size of the full item.

Q. Obtain the median for the following:

Roll No.    1     2     3    4     5     6
Marks       25    15    5    35    45    55

Sol. (1) Arranging the items in ascending order of size: 5, 15, 25, 35, 45, 55
(2) $\left( \frac{N+1}{2} \right)$th item = (6 + 1)/2 = 7/2 = 3.5th item
(3) Median = 3rd item + 50% of (4th item - 3rd item)
= 25 + (35 - 25)/2 = 30 Ans.
Q. Obtain the median for the following:

Roll No.    1     2     3    4     5     6     7
Marks       25    15    5    45    55    35    60

Sol. (1) Arranging the items in ascending order of size:
5, 15, 25, 35, 45, 55, 60
(2) $\left( \frac{N+1}{2} \right)$th item = 8/2 = 4th item
(3) Median = size of 4th item = 35 Ans.
In case of discrete series:
Step (1): Arrange the items in ascending order of size.
(2) Calculate the cumulative frequency (c.f.).
(3) Ascertain the $\left( \frac{N+1}{2} \right)$th item.
(4) Find the cumulative frequency which includes the $\left( \frac{N+1}{2} \right)$th item.
(5) Calculate Median = size of the item corresponding to the c.f. which includes the $\left( \frac{N+1}{2} \right)$th item.
Q. Obtain the median for the following frequency distribution:

Marks              45    55    25    35    5     15
No. of students    40    30    30    50    10    20

Sol. (1)

Marks in ascending order    No. of students (f)    c.f.
5                           10                      10
15                          20                      30
25                          30                      60
35                          50                     110
45                          40                     150
55                          30                     180
                            N = 180

(2) $\frac{N+1}{2} = \frac{181}{2} = 90.5$th item.
(3) c.f. which includes the 90.5th item = 110
(4) Median = size of the item corresponding to c.f. 110 = 35. Ans.
Q. Obtain the median for the following frequency distribution:

x    1    2     3     4     5     6     7     8    9
f    8    10    11    16    20    25    15    9    6

Sol.

x    f     c.f.
1     8      8
2    10     18
3    11     29
4    16     45
5    20     65
6    25     90
7    15    105
8     9    114
9     6    120
     N = 120

Here N = 120 and $\frac{N+1}{2} = 60.5$. The c.f. just greater than $\frac{N+1}{2}$ is 65, and the value of x corresponding to c.f. 65 is 5. Hence the median is 5.
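The cumulative-frequency steps for a discrete series can be sketched as follows. The function name `median_discrete` is hypothetical, and the sketch makes one simplifying assumption: the median is taken as the first value whose cumulative frequency reaches the (N + 1)/2-th position, with no interpolation between two adjacent values.

```python
def median_discrete(values, freqs):
    """Median of a discrete series: the value whose cumulative frequency
    first reaches the (N + 1)/2-th item."""
    pairs = sorted(zip(values, freqs))     # step 1: ascending order of size
    position = (sum(freqs) + 1) / 2        # the (N + 1)/2-th item
    cumulative = 0
    for value, freq in pairs:
        cumulative += freq                 # running c.f.
        if cumulative >= position:
            return value
    raise ValueError("empty series")


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
f = [8, 10, 11, 16, 20, 25, 15, 9, 6]
print(median_discrete(x, f))  # 5
```

The values need not be supplied in order, since the function sorts them first, as in step (1) of the method.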
(ii) In case of continuous series:
Step (1): Calculate the cumulative frequency (c.f.).
(2) Ascertain the $\left( \frac{N}{2} \right)$th item.
(3) Ascertain the c.f. which includes the $\left( \frac{N}{2} \right)$th item, the corresponding class (the median class), its frequency (f) and lower limit (l), the interval (i) between the upper and lower limits of that class, and the c.f. of the preceding class (c).
(4) Calculate
\[ \text{Median} = l + \frac{\frac{N}{2} - c}{f} \times i \]
where l = lower limit of the median class,
c = c.f. = cumulative frequency of the preceding class,
f = frequency of the median class,
i = class interval.
# Mode:
Mode is the value which occurs most frequently in a set of observations and around which the other items of the set cluster densely. It is the point of maximum frequency or the point of greatest density.
Other definition: Mode of the distribution is that value of the variable for which the frequency is maximum.
Calculation of Mode:
(a) In case of a discrete frequency distribution, the mode is the value of x corresponding to the maximum frequency.
But in any one of the following cases:
(1) if the maximum frequency is repeated,
(2) if the maximum frequency occurs at the very beginning or at the end of the distribution,
(3) if there are irregularities in the distribution,
the value of the mode is determined by the method of grouping.
(b) In case of a continuous frequency distribution, the mode is given by:
\[ M_0 = l + \frac{f_m - f_1}{2 f_m - f_1 - f_2} \times i \]
where l = lower limit of the modal class,
i = width of the modal class,
$f_m$ = frequency of the modal class,
$f_1$ = frequency of the class preceding the modal class,
$f_2$ = frequency of the class succeeding the modal class.
Note: The above formula applies when the class intervals are of the same size. If they are unequal, they should first be made equal, on the assumption that the frequencies are equally distributed throughout the class.
In terms of the differences $\Delta_1 = f_m - f_1$ and $\Delta_2 = f_m - f_2$ (so that $\Delta_1 + \Delta_2 = 2 f_m - f_1 - f_2$), use the formula
\[ \text{Mode} = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i \]
(c) For a symmetrical distribution, Mean, Median and Mode coincide.
(d) Where the mode is ill-defined, i.e. the method of grouping also fails, its value can be ascertained by the formula: Mode = 3 Median - 2 Mean.
This measure is called the empirical mode.
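Ex. Calculate the mode.

Size (x)         4    5    6    7    8     9     10    11    12    13
Frequency (f)    2    5    8    9    12    14    14    15    11    13

Sol. Grouping table (each entry is the sum of the frequencies of the sizes grouped together):

Column    Sizes grouped                               Frequencies
I         4, 5, 6, 7, 8, 9, 10, 11, 12, 13            2, 5, 8, 9, 12, 14, 14, 15, 11, 13
II        (4,5), (6,7), (8,9), (10,11), (12,13)       7, 17, 26, 29, 24
III       (5,6), (7,8), (9,10), (11,12)               13, 21, 28, 26
IV        (4,5,6), (7,8,9), (10,11,12)                15, 35, 40
V         (5,6,7), (8,9,10), (11,12,13)               22, 40, 39
VI        (6,7,8), (9,10,11)                          29, 43
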
Procedure:
Column I: The original frequencies are written.
Column II: The frequencies of column I are combined two by two.
Column III: Leave the first frequency of column I and combine the others two by two.
Column IV: The frequencies of column I are combined three by three.
Column V: Leave the first frequency of column I and combine the others three by three.
Column VI: Leave the first two frequencies of column I and combine the others three by three.
In each of these columns, the maximum frequency is marked. The size that is included most often in the marked groups is taken as the mode.
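The grouping procedure can be sketched in Python as below. This is an illustration under stated assumptions, not a definitive implementation: the function name `mode_by_grouping` is hypothetical, incomplete groups at the end of a column are simply left out, and if several groups tie for a column maximum all of them are counted.

```python
def mode_by_grouping(sizes, freqs):
    """Method of grouping: build columns I to VI, mark the maximum of each
    column, and count how often each size falls in a marked group."""
    n = len(freqs)
    # (group width, number of leading frequencies left out) for columns I to VI
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    tally = {size: 0 for size in sizes}
    for width, skip in columns:
        groups = [
            (sum(freqs[i:i + width]), sizes[i:i + width])
            for i in range(skip, n - width + 1, width)
        ]
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:              # every group attaining the column maximum
                for size in members:
                    tally[size] += 1
    return max(tally, key=tally.get), tally


sizes = list(range(4, 14))
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]
mode, tally = mode_by_grouping(sizes, freqs)
print(mode)  # 10
```

For the example above, size 10 falls in the marked group of five columns, size 11 of four and size 9 of three, so the mode comes out as 10 even though the largest single frequency belongs to size 11.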
Calculation of mode by the interpolation formula:
(a) Where the modal class is the one having the maximum frequency:
\[ \text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i \]
L = lower limit of the modal class
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
i = width of the modal class
(b) Where the modal class is other than the one having the maximum frequency:
\[ \text{Mode} = L + \frac{f_2}{f_0 + f_2} \times i \]
(c) Where there are two or more values having the same maximum frequency:
Mode = 3 Median - 2 Mean
Q. Calculate the mode in case of a continuous exclusive series:

Marks              0-10    10-20    20-30    30-40    40-50    50-60
No. of students    10      20       30       50       40       30

Sol. Maximum frequency = 50, so the modal class is 30-40.
$L = 30,\ f_1 = 50,\ f_0 = 30,\ f_2 = 40,\ i = 10$
\[ \text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i = 30 + \frac{50 - 30}{2 \times 50 - 30 - 40} \times 10 = 36.667 \]

Q. Calculate the mode in case of a continuous less than series:

Marks less than/upto    10    20    30    40     50     60
No. of students         10    30    60    110    150    180

Sol. Let us first convert the less than series into a continuous series.

Marks              0-10    10-20    20-30    30-40    40-50    50-60
No. of students    10      20       30       50       40       30

$L = 30,\ f_1 = 50,\ f_0 = 30,\ f_2 = 40,\ i = 10$
\[ \text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i = 30 + \frac{50 - 30}{2 \times 50 - 30 - 40} \times 10 = 36.667 \]
Mode = 36.667 (the working is the same as in the previous question).
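A minimal sketch of the interpolation formula for the case where the modal class is the one with the maximum frequency. The names `frequencies_from_less_than` and `mode_grouped` are hypothetical, and a missing neighbouring class (modal class at either end) is treated as having frequency 0, which is an assumption of this sketch.

```python
def frequencies_from_less_than(cumulative):
    """Convert a 'less than' (cumulative) series into class frequencies."""
    return [c - p for p, c in zip([0] + cumulative[:-1], cumulative)]


def mode_grouped(lower_limits, freqs, width):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * i for equal class widths."""
    modal = max(range(len(freqs)), key=freqs.__getitem__)   # class with maximum frequency
    f1 = freqs[modal]
    f0 = freqs[modal - 1] if modal > 0 else 0               # preceding class
    f2 = freqs[modal + 1] if modal + 1 < len(freqs) else 0  # succeeding class
    return lower_limits[modal] + (f1 - f0) / (2 * f1 - f0 - f2) * width


lower = [0, 10, 20, 30, 40, 50]
freqs = frequencies_from_less_than([10, 30, 60, 110, 150, 180])
print(freqs)                                      # [10, 20, 30, 50, 40, 30]
print(round(mode_grouped(lower, freqs, 10), 3))   # 36.667
```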
# Geometric Mean:
(a) Geometric mean (G.M.) of n individual observations $x_1, x_2, x_3, \ldots, x_n$ ($x_i > 0$) is the nth root of their product.
\[ G = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} \]
Taking log on both sides,
\[ \log G = \frac{1}{n} \left( \log x_1 + \log x_2 + \log x_3 + \cdots + \log x_n \right) = \frac{1}{n} \sum \log x_i \]
\[ G = \text{antilog} \left( \frac{1}{n} \sum \log x_i \right) \]
(b) If $x_1, x_2, x_3, \ldots, x_n$ occur $f_1, f_2, f_3, \ldots, f_n$ times respectively and $N = \sum_{i=1}^{n} f_i$, then
\[ G = \left( x_1^{f_1} \cdot x_2^{f_2} \cdot x_3^{f_3} \cdots x_n^{f_n} \right)^{1/N} \]
Taking log on both sides,
\[ \log G = \frac{1}{N} \left( f_1 \log x_1 + f_2 \log x_2 + \cdots + f_n \log x_n \right) = \frac{1}{N} \sum_{i=1}^{n} f_i \log x_i \]
\[ G = \text{antilog} \left( \frac{1}{N} \sum_{i=1}^{n} f_i \log x_i \right) \]
Q. Obtain the Geometric mean of:

Marks              0-10    10-20    20-30    30-40    40-50
No. of students    10      5        8        7        20

Sol.

Marks    Mid value (x)    No. of students (f)    log x     f log x
0-10      5               10                     0.6990     6.9900
10-20    15                5                     1.1761     5.8805
20-30    25                8                     1.3979    11.1832
30-40    35                7                     1.5441    10.8087
40-50    45               20                     1.6532    33.0640
                          Σf = 50                          Σ f log x = 67.9264

\[ \log G = \frac{1}{N} \sum f \log x = \frac{67.9264}{50} = 1.3585 \]
\[ G = \text{antilog}\, 1.3585 = 22.83 \]
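The logarithmic route to the geometric mean can be sketched as below. The function name `geometric_mean_grouped` is hypothetical; base-10 logarithms are used to mirror the log table, although any base would give the same result.

```python
import math


def geometric_mean_grouped(values, freqs):
    """G = antilog( (1/N) * sum(f * log x) ), using base-10 logarithms."""
    n_total = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / n_total)


mids = [5, 15, 25, 35, 45]
freqs = [10, 5, 8, 7, 20]
print(round(geometric_mean_grouped(mids, freqs), 2))  # 22.83
```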


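# Harmonic Mean:
Harmonic mean (H) of a number of observations is the reciprocal of the Arithmetic Mean (A.M.) of the reciprocals of the given values.
(a) For individual observations:
\[ H = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \]
(b) For a frequency distribution:
\[ H = \frac{1}{\frac{1}{N} \sum_{i=1}^{n} \frac{f_i}{x_i}} = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_n}{x_n}} \]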
Q. Obtain the Harmonic mean of:

Marks (out of 150)    10    20    40    60    120
No. of students       2     3     6     5     4

Sol.

Marks (x)    f     1/x      f/x
10           2     0.100    0.200
20           3     0.050    0.150
40           6     0.025    0.150
60           5     0.017    0.085
120          4     0.008    0.032
             Σf = 20        Σ(f/x) = 0.617

\[ \text{H.M.} = \frac{N}{\sum \frac{f}{x}} = \frac{20}{0.617} = 32.4 \]

Q. An aeroplane flies along the four sides of a square at speeds of 100, 200, 300 and 400 km/h respectively. What is the average speed of the aeroplane in its flight around the square?
Sol. When equal distances are covered at unequal speeds, the harmonic mean is the proper average.
\[ \text{Avg. Speed} = \frac{4}{\frac{1}{100} + \frac{1}{200} + \frac{1}{300} + \frac{1}{400}} = 192 \text{ km/h} \quad \text{Ans.} \]
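Both harmonic mean formulas can be checked with a short sketch. The function name `harmonic_mean` is hypothetical; when no frequencies are passed, each value is assumed to occur once.

```python
def harmonic_mean(values, freqs=None):
    """H = N / sum(f / x); each value counts once when freqs is omitted."""
    if freqs is None:
        freqs = [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Aeroplane flying equal distances at four different speeds (km/h)
print(round(harmonic_mean([100, 200, 300, 400]), 1))                    # 192.0
# Marks example with frequencies
print(round(harmonic_mean([10, 20, 40, 60, 120], [2, 3, 6, 5, 4]), 1))  # 32.4
```

The arithmetic mean of the four speeds would be 250 km/h, which overstates the average because more time is spent on the slower sides of the square.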

Partition Values: These are the values of the variate which divide the total frequency into a number of equal parts, the median being that value of the variate which divides the total frequency into two equal parts.
(a) Quartiles: Quartiles are those values of the variate which divide the total frequency into four equal parts.
When the lower half before the median is divided into two equal parts, the value of the dividing variate is called the lower quartile and is denoted by $Q_1$. The value of the variate dividing the upper half into two equal parts is called the upper quartile and is denoted by $Q_3$ ($Q_2$ being the median). The formulae are
\[ Q_1 = l + \frac{\frac{N}{4} - F}{f} \times i, \qquad Q_3 = l + \frac{\frac{3N}{4} - F}{f} \times i \]
where l, f and i are the lower limit, frequency and width of the class containing the required item, and F is the cumulative frequency of the preceding class.
(b) Deciles: Deciles are those values of the variate which divide the total frequency into 10 equal parts: $D_1, D_2, D_3, \ldots, D_9$.
\[ D_1 = l + \frac{\frac{N}{10} - F}{f} \times i, \qquad D_4 = l + \frac{\frac{4N}{10} - F}{f} \times i \qquad (D_5 \text{ is the median}) \]
(c) Percentiles: Percentiles are those values of the variate which divide the total frequency into 100 equal parts: $P_1, P_2, P_3, \ldots, P_{99}$.
\[ P_1 = l + \frac{\frac{N}{100} - F}{f} \times i, \quad P_9 = l + \frac{\frac{9N}{100} - F}{f} \times i, \quad \ldots, \quad P_{72} = l + \frac{\frac{72N}{100} - F}{f} \times i \]
Ex. Find the median, lower and upper quartiles for the following:

Marks       No. of students     Marks       No. of students
Below 10    15                  Below 50     94
Below 20    35                  Below 60    127
Below 30    60                  Below 70    198
Below 40    84                  Below 80    249

Sol.

Marks    No. of students (f)    c.f.
0-10     15                      15
10-20    20                      35
20-30    25                      60
30-40    24                      84
40-50    10                      94
50-60    33                     127
60-70    71                     198
70-80    51                     249

(i) Here $\frac{N}{2} = \frac{\sum f}{2} = \frac{249}{2} = 124.5$, so the median class is 50-60.
$l = 50,\ i = 10,\ f = 33,\ F = 94$
\[ \text{Median} = l + \frac{\frac{N}{2} - F}{f} \times i = 50 + \frac{124.5 - 94}{33} \times 10 = 59.24 \]
(ii) For $Q_1$: $\frac{N}{4} = \frac{249}{4} = 62.25$, so the class containing $Q_1$ is 30-40.
$l = 30,\ i = 10,\ f = 24,\ F = 60$
\[ Q_1 = l + \frac{\frac{N}{4} - F}{f} \times i = 30 + \frac{62.25 - 60}{24} \times 10 = 30.94 \]
(iii) Similarly, for $Q_3$: $\frac{3N}{4} = 186.75$, so the class containing $Q_3$ is 60-70.
$l = 60,\ i = 10,\ f = 71,\ F = 127$
\[ Q_3 = l + \frac{\frac{3N}{4} - F}{f} \times i = 60 + \frac{186.75 - 127}{71} \times 10 = 68.42 \]
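The median, quartile, decile and percentile formulas differ only in the fraction of N they target, so one helper can cover all of them. The sketch below uses a hypothetical function name `partition_value` and assumes classes of equal width listed in ascending order.

```python
def partition_value(lower_limits, freqs, width, fraction):
    """Value below which `fraction` of the total frequency lies:
    l + (fraction * N - F) / f * i."""
    target = fraction * sum(freqs)
    cumulative = 0                       # F: c.f. of the classes before the current one
    for lower, freq in zip(lower_limits, freqs):
        if freq > 0 and cumulative + freq >= target:
            return lower + (target - cumulative) / freq * width
        cumulative += freq
    raise ValueError("fraction must lie between 0 and 1")


lower = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [15, 20, 25, 24, 10, 33, 71, 51]
print(round(partition_value(lower, freqs, 10, 1 / 2), 2))  # 59.24  (median)
print(round(partition_value(lower, freqs, 10, 1 / 4), 2))  # 30.94  (Q1)
print(round(partition_value(lower, freqs, 10, 3 / 4), 2))  # 68.42  (Q3)
```

Passing a fraction such as 4/10 or 72/100 gives the corresponding decile or percentile in the same way.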

# Moments:
Meaning of moments: In mechanics, moments refer to the turning effect of a force. In Statistics, they are used to describe the peculiarities of a frequency distribution. According to A. E. Waugh, the arithmetic means of the various powers of the deviations are called the moments of the distribution.
Central Moments: Central moments refer to the moments about the actual arithmetic mean. A moment about the mean is denoted by the Greek letter µ. Moments can be extended to any power, but in practice the first four moments generally suffice. Moments about the actual arithmetic mean are as follows:
\[ \mu_1 = \frac{\sum (X - \bar{X})}{n} \text{ or } \frac{\sum x}{n} = 0, \qquad \mu_3 = \frac{\sum (X - \bar{X})^3}{n} \text{ or } \frac{\sum x^3}{n} \]
\[ \mu_2 = \frac{\sum (X - \bar{X})^2}{n} \text{ or } \frac{\sum x^2}{n}, \qquad \mu_4 = \frac{\sum (X - \bar{X})^4}{n} \text{ or } \frac{\sum x^4}{n} \]
where $x = X - \bar{X}$. Here $\mu_1 = 0$, since the sum of the deviations of the items from the actual arithmetic mean is always zero.
For a frequency distribution:
\[ \mu_1 = \frac{\sum f (X - \bar{X})}{N} \text{ or } \frac{\sum f x}{N}, \qquad \mu_3 = \frac{\sum f (X - \bar{X})^3}{N} \text{ or } \frac{\sum f x^3}{N} \]
\[ \mu_2 = \frac{\sum f (X - \bar{X})^2}{N} \text{ or } \frac{\sum f x^2}{N}, \qquad \mu_4 = \frac{\sum f (X - \bar{X})^4}{N} \text{ or } \frac{\sum f x^4}{N} \]
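Raw (Non-Central) Moments or moments about an arbitrary origin: Raw moments refer to the moments about the assumed mean A. These are denoted by µ′. The first four moments about the assumed arithmetic mean are as follows:
\[ \mu_1' = \frac{\sum (X - A)}{n}, \qquad \mu_3' = \frac{\sum (X - A)^3}{n} \]
\[ \mu_2' = \frac{\sum (X - A)^2}{n}, \qquad \mu_4' = \frac{\sum (X - A)^4}{n} \]
For a frequency distribution, with $d = \frac{X - A}{i}$:
\[ \mu_1' = \frac{\sum f (X - A)}{N} \text{ or } \frac{\sum f d}{N} \times i \]
\[ \mu_2' = \frac{\sum f (X - A)^2}{N} \text{ or } \frac{\sum f d^2}{N} \times i^2 \]
\[ \mu_3' = \frac{\sum f (X - A)^3}{N} \text{ or } \frac{\sum f d^3}{N} \times i^3 \]
\[ \mu_4' = \frac{\sum f (X - A)^4}{N} \text{ or } \frac{\sum f d^4}{N} \times i^4 \]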
Conversion of Raw Moments into Central Moments:
First calculate the raw moments about the assumed mean and then convert them into central moments on the basis of the relations
\[ \mu_1 = \mu_1' - \mu_1' = 0, \qquad \mu_2 = \mu_2' - (\mu_1')^2 \]
\[ \mu_3 = \mu_3' - 3 \mu_2' \mu_1' + 2 (\mu_1')^3 \]
\[ \mu_4 = \mu_4' - 4 \mu_3' \mu_1' + 6 \mu_2' (\mu_1')^2 - 3 (\mu_1')^4 \]
\[ \bar{X} = \mu_1' + A, \qquad \sigma^2 = \mu_2 = \mu_2' - (\mu_1')^2 \]
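Moments about zero:
The moments about zero are denoted by ν. The first four moments about zero are
\[ \nu_1 = \frac{\sum f X}{N} = A + \mu_1' = \bar{X}, \qquad \nu_2 = \frac{\sum f X^2}{N} = \mu_2 + \nu_1^2 \]
\[ \nu_3 = \frac{\sum f X^3}{N} = \mu_3 + 3 \nu_1 \nu_2 - 2 \nu_1^3 \]
\[ \nu_4 = \frac{\sum f X^4}{N} = \mu_4 + 4 \nu_1 \nu_3 - 6 \nu_1^2 \nu_2 + 3 \nu_1^4 \]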
Moments as a measure of skewness:
β1 measures skewness and is calculated with the help of µ3 and µ2:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \]
Moment coefficient of skewness:
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\sqrt{\mu_2^3}} \]
(i) If $\gamma_1 = 0$ (symmetric distribution)
(ii) If $\gamma_1 > 0$ (positively skewed distribution)
(iii) If $\gamma_1 < 0$ (negatively skewed distribution)
β1 gives good results only for distributions having moderate skewness.
# Skewness:
Skewness denotes the opposite of symmetry. It is lack of symmetry.
Skew-symmetrical distribution:
A distribution which is not symmetrical is said to be a skew-symmetrical distribution. The left tail and the right tail are not of equal length; one tail is longer than the other.
(i) Negative-skew distribution: The left tail is longer than the right tail.
(ii) Positive-skew distribution: The right tail is longer than the left tail.

Tests of Skewness:
(1) There is no skewness in the distribution if A.M. = mode = median.
(2) There is no skewness in the distribution if Q3 - median = median - Q1.
(3) There is no skewness in the distribution if the sum of the frequencies which are less than the mode = the sum of the frequencies which are greater than the mode.
(4) There is no skewness in the distribution if the quartiles are equidistant from the median.
(5) The distribution is negatively skewed if A.M. < mode.
(6) The curve is not symmetrical about the median if A.M. ≠ mode ≠ median.
Uses of Skewness:
(1) It gives the nature of the curve.
(2) It gives the nature and concentration of the observations about the mean.
Types of distribution:
(1) Fairly symmetrical
(2) Positively skewed
(3) Negatively skewed

Measures of skewness:
(1) Absolute measure = Mean - Mode.
(2) Relative measures: there are four measures of skewness.
(a) Karl Pearson's coefficient of skewness
(b) Bowley's coefficient of skewness
(c) Kelly's coefficient of skewness
(d) Coefficient based on moments
(a) Karl Pearson's coefficient of skewness (using mode = 3 median - 2 mean)
\[ = \frac{\text{mean} - \text{mode}}{\text{S.D.}} = \frac{\text{mean} - (3\,\text{median} - 2\,\text{mean})}{\text{S.D.}} = \frac{3(\text{mean} - \text{median})}{\text{S.D.}} \]
It generally lies between -1 and +1.
If it is zero, there is no skewness.
(i) If mean = mode, the distribution is fairly symmetrical.
(ii) If mean > mode, the skewness is positive.
(iii) If mean < mode, the skewness is negative.
(b) Bowley's coefficient of skewness
\[ = \frac{Q_3 + Q_1 - 2\,\text{median}}{Q_3 - Q_1} \]
(c) Kelly's coefficient of skewness
\[ = \frac{P_{10} + P_{90} - 2 P_{50}}{P_{90} - P_{10}}, \quad (P_{50} = \text{median}) \]
\[ = \frac{D_1 + D_9 - 2\,\text{median}}{D_9 - D_1}, \quad (D_5 = \text{median}) \]
where, for example,
\[ P_{10} = l + \frac{\frac{10 N}{100} - C}{f} \times i \]
# Kurtosis:
It measures the degree of peakedness of a distribution and is given by the measure of kurtosis
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
where
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{N} \quad \text{and} \quad \mu_4 = \frac{\sum f (x - \bar{x})^4}{N} \]
(i) If $\beta_2 = 3$, the curve is normal or mesokurtic.
(ii) If $\beta_2 > 3$, the curve is peaked or leptokurtic.
(iii) If $\beta_2 < 3$, the curve is flat topped or platykurtic.

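# Standard Deviation:
(a) Standard deviation from actual mean:
(i) Calculate the mean $\bar{X} = \frac{\sum X}{N}$.
(ii) Find the deviations $x = (X - \bar{X})$.
(iii) Square the deviations to obtain $x^2$.
(iv) Calculate
\[ \text{S.D.} (\sigma) = \sqrt{\frac{\sum x^2}{N}} \]
Coefficient of S.D. = $\frac{\sigma}{\bar{X}} \times 100$
(b) Standard deviation from assumed mean:
\[ \text{S.D.} (\sigma) = \sqrt{\frac{\sum d^2}{N} - \left( \frac{\sum d}{N} \right)^2}, \quad d = X - A, \quad A = \text{assumed mean} \]
Coefficient of S.D. = $\frac{\sigma}{\bar{X}} \times 100$
(c) For a frequency distribution:
\[ \text{S.D.} (\sigma) = \sqrt{\frac{\sum f x^2}{N}}, \quad \bar{X} = \text{actual mean} = \frac{\sum f X}{N}, \quad x = X - \bar{X} \]
(d) $\text{S.D.} = \sigma = \sqrt{\text{variance}} = \sqrt{\mu_2}$
(e) Mode = 3 Median - 2 Mean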
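Formulas (a) and (b) always give the same standard deviation, which a short sketch can confirm. The function names `sd_actual_mean` and `sd_assumed_mean` and the sample observations are illustrative choices, not part of the notes.

```python
import math


def sd_actual_mean(xs):
    """sigma = sqrt( sum((X - mean)^2) / N )."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sd_assumed_mean(xs, a):
    """sigma = sqrt( sum(d^2)/N - (sum(d)/N)^2 ), with d = X - A."""
    n = len(xs)
    d = [x - a for x in xs]
    return math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


marks = [5, 15, 25, 35, 45, 55]               # illustrative observations
print(round(sd_actual_mean(marks), 2))        # 17.08
print(round(sd_assumed_mean(marks, a=40), 2)) # 17.08
```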
Importance of moments:
(1) The first central moment is always zero, i.e. $\mu_1 = 0$.
(2) The second central moment indicates the variance, i.e. $\mu_2 = \sigma^2$.
(3) The third central moment is used to measure skewness (i.e. lack of symmetry): $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$.
(4) The fourth central moment is used to measure kurtosis (i.e. degree of peakedness): $\beta_2 = \frac{\mu_4}{\mu_2^2}$.

Q. The first three moments of a distribution about the value 2 of the variable are 1, 16 and -40. Show that the mean = 3, the variance = 15 and µ3 = -86.
Sol. We know that
\[ \mu_1' = \bar{X} - a = d, \qquad \mu_2' = \sigma^2 + d^2, \qquad \mu_3 = \mu_3' - 3 \mu_2' d + 2 d^3 \]
Here $a = 2,\ \mu_1' = 1,\ \mu_2' = 16,\ \mu_3' = -40$.
Substituting the values we get
\[ \bar{X} = \mu_1' + a = 1 + 2 = 3, \qquad d = \bar{X} - a = 3 - 2 = 1 \]
\[ \sigma^2 = \mu_2 = \mu_2' - d^2 = 16 - 1 = 15 \]
\[ \mu_3 = \mu_3' - 3 \mu_2' d + 2 d^3 = -40 - 3 (16)(1) + 2 (1)^3 = -86 \]

Q. The first three central moments of a distribution are 0, 2.5, 0.7. Find the value of the moment coefficient of skewness.
Sol. $\mu_1 = 0,\ \mu_2 = 2.5,\ \mu_3 = 0.7$
Moment coefficient of skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\sqrt{\mu_2^3}} = \frac{0.7}{\sqrt{(2.5)^3}} = 0.1771 \]
Q. In a certain distribution, the first four moments about a point are -1.5, 17, -30 and 108. Calculate the moments about the mean, β1 and β2, and state whether the distribution is leptokurtic or platykurtic.
Sol. We have
\[ \mu_1' = -1.5, \quad \mu_2' = 17, \quad \mu_3' = -30, \quad \mu_4' = 108 \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 17 - (-1.5)^2 = 14.75 \]
\[ \mu_3 = \mu_3' - 3 \mu_2' \mu_1' + 2 (\mu_1')^3 = -30 - 3 (17)(-1.5) + 2 (-1.5)^3 = 39.75 \]
\[ \mu_4 = \mu_4' - 4 \mu_3' \mu_1' + 6 \mu_2' (\mu_1')^2 - 3 (\mu_1')^4 = 142.3125 \]
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(39.75)^2}{(14.75)^3} = 0.4924 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{142.3125}{(14.75)^2} = 0.6541 < 3 \quad \text{(Platykurtic)} \]
Q. Compute skewness and kurtosis, if the first four moments of a frequency distribution f(x) about the value x = 4 are 1, 4, 10 and 45 respectively.
Sol. We have
\[ a = 4, \quad \mu_1' = 1, \quad \mu_2' = 4, \quad \mu_3' = 10, \quad \mu_4' = 45 \]
Moments about the mean:
\[ \mu_1 = 0 \quad \text{(always)} \]
\[ \mu_2 = \mu_2' - (\mu_1')^2 = 4 - (1)^2 = 3 \]
\[ \mu_3 = \mu_3' - 3 \mu_2' \mu_1' + 2 (\mu_1')^3 = 10 - 3 (4)(1) + 2 (1)^3 = 0 \]
\[ \mu_4 = \mu_4' - 4 \mu_3' \mu_1' + 6 \mu_2' (\mu_1')^2 - 3 (\mu_1')^4 = 45 - 40 + 24 - 3 = 26 \]
Skewness
Moment coefficient of skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\sqrt{\mu_2^3}} = \frac{0}{\sqrt{27}} = 0 \]
The distribution is symmetrical.
Kurtosis
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{26}{(3)^2} = 2.89 < 3 \quad \text{(Platykurtic)} \]
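Q. Calculate the first four moments about the assumed mean of the given distribution. Also find β1 and β2.

x    2.0    2.5    3.0    3.5    4.0    4.5    5.0
f    4      36     60     90     70     40     10

Sol. Taking a = 3.5 and h = 0.5, let $u = \frac{x - 3.5}{0.5}$.

x      f      u     fu     fu²    fu³     fu⁴
2.0     4    -3    -12     36    -108     324
2.5    36    -2    -72    144    -288     576
3.0    60    -1    -60     60     -60      60
3.5    90     0      0      0       0       0
4.0    70     1     70     70      70      70
4.5    40     2     80    160     320     640
5.0    10     3     30     90     270     810
Σf = 310           36    560     204    2480

In terms of u, the moments about the assumed mean are
\[ \mu_1' = \frac{\sum fu}{\sum f} = \frac{36}{310} = 0.116, \qquad \mu_2' = \frac{\sum fu^2}{\sum f} = \frac{560}{310} = 1.806 \]
\[ \mu_3' = \frac{\sum fu^3}{\sum f} = \frac{204}{310} = 0.658, \qquad \mu_4' = \frac{\sum fu^4}{\sum f} = \frac{2480}{310} = 8.000 \]
The corresponding central moments are $\mu_2 = 1.793$, $\mu_3 = 0.032$ and $\mu_4 = 7.840$, so that
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \approx 0.00018, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 2.44 \]
Since β1 and β2 are ratios, they are unaffected by the change of scale h.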
# Calculation of Bowley's coefficient of skewness
Step (1): Calculate Q1, Q3 and the median.
Step (2):
\[ \text{Coefficient of } S_k = \frac{Q_3 + Q_1 - 2\,\text{median}}{Q_3 - Q_1} \]
(a) Zero value of $S_k$ = symmetrical distribution
(b) Positive value of $S_k$ = positively skewed distribution
(c) Negative value of $S_k$ = negatively skewed distribution

Uses of quartile skewness:
(i) When the class intervals are unequal.
(ii) When the modal class is the last class in case of a continuous series.

Ex. Calculate Bowley's coefficient of skewness.

Marks              5     15    25    35    45    55
No. of students    10    20    30    50    40    30

Sol.

x     f     c.f.
5     10     10
15    20     30
25    30     60
35    50    110
45    40    150
55    30    180
      N = 180

(i) $Q_1$ = size of the $\left( \frac{N+1}{4} \right)$th item = $\frac{181}{4}$ = 45.25th item, so $Q_1 = 25$.
(ii) $Q_3$ = size of the $\left( \frac{3(N+1)}{4} \right)$th item = $\frac{3 \times 181}{4}$ = 135.75th item, so $Q_3 = 45$.
(iii) Median $Q_2$ = size of the $\left( \frac{N+1}{2} \right)$th item = $\frac{181}{2}$ = 90.5th item, so $Q_2 = 35$.
\[ \text{Bowley's coefficient of } S_k = \frac{Q_3 + Q_1 - 2\,\text{median}}{Q_3 - Q_1} = \frac{45 + 25 - 2 \times 35}{45 - 25} = 0 \]
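The quartile look-ups and Bowley's coefficient for a discrete series can be sketched as follows. The names `item_value` and `bowley_skewness` are hypothetical, and each quartile is taken as the first value whose cumulative frequency reaches the required position, as in the worked example.

```python
def item_value(values, freqs, position):
    """Size of the item at the given position, read off the cumulative frequencies."""
    cumulative = 0
    for value, freq in sorted(zip(values, freqs)):
        cumulative += freq
        if cumulative >= position:
            return value
    raise ValueError("position lies beyond the total frequency")


def bowley_skewness(values, freqs):
    """(Q3 + Q1 - 2 * median) / (Q3 - Q1) for a discrete series."""
    n = sum(freqs)
    q1 = item_value(values, freqs, (n + 1) / 4)
    q2 = item_value(values, freqs, (n + 1) / 2)
    q3 = item_value(values, freqs, 3 * (n + 1) / 4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


marks = [5, 15, 25, 35, 45, 55]
students = [10, 20, 30, 50, 40, 30]
print(bowley_skewness(marks, students))  # 0.0
```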

# Moment Generating Function:
For certain theoretical developments, an indirect method for computing moments is used; the method depends on finding the moment generating function.
# In case of a continuous variable x, it is defined as
\[ M(t) = \int_a^b e^{tx} f(x)\, dx \quad \cdots (1) \]
where the integral is a function of the parameter 't' only and f(x) is a distribution function for which the integral given by (1) exists.
Then $e^{tx}$ may be expanded in a power series:
\[ M(t) = \int_a^b \left( 1 + tx + \frac{t^2}{2!} x^2 + \cdots \right) f(x)\, dx = \int_a^b f(x)\, dx + t \int_a^b x f(x)\, dx + \frac{t^2}{2!} \int_a^b x^2 f(x)\, dx + \cdots \]
\[ M(t) = \nu_0 + \nu_1 t + \nu_2 \frac{t^2}{2!} + \cdots + \nu_r \frac{t^r}{r!} + \cdots, \qquad \nu_r = \int_a^b x^r f(x)\, dx \quad \cdots (2) \]
The coefficient of $\frac{t^r}{r!}$ in (2) is the rth moment about the origin.
Also
\[ \left[ \frac{d^r}{dt^r} M(t) \right]_{t=0} = \left[ \nu_r \frac{r!}{r!} + \nu_{r+1} t + \cdots \right]_{t=0} = \nu_r \quad \cdots (3) \]
Thus $\nu_r$ about the origin = rth derivative of M(t) at t = 0.
Although the moment generating function (m.g.f.) has been defined for the variable x only, the definition can be generalized so that it holds for a variable z, where z is a function of x, e.g. z = x - m, m = mean.
The rth moment of z will then be the rth moment of x about the mean m.
\[ M_z(t) = \int_a^b e^{tz} f(x)\, dx \]
\[ M_{(x-m)}(t) = \int_a^b e^{t(x-m)} f(x)\, dx = e^{-mt} \int_a^b e^{tx} f(x)\, dx = e^{-mt} M_x(t) \]

# In case of a discrete distribution of the variable x:
We know that for a variable x
\[ \nu_r = \sum x^r P \quad \cdots (1) \]
where P is the probability that the variable takes the value x.
If z is any function of x, we get the rth moment of z by the relation
\[ \nu_r = \sum z^r P \quad \cdots (2) \]
and the moment generating function is given by
\[ M_z(t) = \sum e^{tz} P \quad \cdots (3) \]
\[ M_z(t) = \sum \left( 1 + tz + \frac{t^2}{2!} z^2 + \cdots \right) P = \sum P + t \sum zP + \frac{t^2}{2!} \sum z^2 P + \cdots \]
\[ = \nu_0 + \nu_1 t + \nu_2 \frac{t^2}{2!} + \cdots + \nu_r \frac{t^r}{r!} + \cdots \]
\[ \nu_r = \left[ \frac{d^r}{dt^r} M_z(t) \right]_{t=0} \]
M(t) is clearly the expected value of $e^{tx}$, hence it can be written as $E(e^{tx})$, which gives the m.g.f. about a point a:
\[ M_{x-a}(t) = E\left[ e^{t(x-a)} \right] = \sum_i e^{t(x_i - a)} P_i = e^{-at} \sum_i e^{t x_i} P_i \]
\[ M_{x-a}(t) = e^{-at} M_0(t) \]
Properties:
(i) $M_{x+y}(t) = M_x(t) \cdot M_y(t)$ (provided x and y are independent)
(ii) $M_u(t) = e^{-at/h} M_x\left( \frac{t}{h} \right)$ (effect of change of origin and scale on the m.g.f., $u = \frac{x - a}{h}$)
(iii) $M_{cx}(t) = E\left[ e^{tcx} \right] = M_x(ct)$ (c being a constant)

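Ex. Find the moment generating function of the exponential distribution. Also find the mean and S.D.
\[ f(x) = \frac{1}{c} e^{-x/c}, \quad 0 \le x < \infty, \quad c > 0 \]
Sol.
\[ M_x(t) = \int_0^\infty e^{tx} \cdot \frac{1}{c} e^{-x/c}\, dx = \frac{1}{c} \int_0^\infty e^{(t - 1/c) x}\, dx \]
\[ = \frac{1}{c} \left[ \frac{e^{(t - 1/c) x}}{t - \frac{1}{c}} \right]_0^\infty = \frac{1}{c} \left[ \frac{-e^{-(1 - ct) x / c}}{(1 - ct)/c} \right]_0^\infty = \frac{1}{1 - ct}, \quad t < \frac{1}{c} \]
\[ = (1 - ct)^{-1} = 1 + ct + c^2 t^2 + c^3 t^3 + \cdots \quad \text{(Binomial theorem)} \]
\[ \nu_r = \left[ \frac{d^r}{dt^r} M_x(t) \right]_{t=0} \]
\[ \nu_1 = \left[ \frac{d}{dt} M_x(t) \right]_{t=0} = \left[ \frac{d}{dt} \left( 1 + ct + c^2 t^2 + c^3 t^3 + \cdots \right) \right]_{t=0} = \left[ c + 2 c^2 t + 3 c^3 t^2 + \cdots \right]_{t=0} = c \]
\[ \text{and} \quad \nu_2 = \left[ \frac{d^2}{dt^2} M_x(t) \right]_{t=0} = 2 c^2 \]
Now
\[ \text{Mean} = \bar{x} = \nu_1 = c \]
\[ \text{Variance} = \mu_2 = \nu_2 - \bar{x}^2 = \nu_2 - \nu_1^2 = 2 c^2 - c^2 = c^2, \quad \text{so S.D.} = c \]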
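The rule "rth moment = rth derivative of M(t) at t = 0" can be checked numerically. The sketch below approximates the first two derivatives by central finite differences; the function name `moments_from_mgf`, the step size `h` and the value c = 2 are illustrative assumptions, and the results are approximations rather than exact values.

```python
def moments_from_mgf(mgf, h=1e-4):
    """Approximate nu1 = M'(0) and nu2 = M''(0) by central finite differences."""
    nu1 = (mgf(h) - mgf(-h)) / (2 * h)
    nu2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
    return nu1, nu2


c = 2.0                                   # illustrative parameter of the distribution


def exponential_mgf(t):
    """M(t) = 1 / (1 - c t), valid for t < 1/c."""
    return 1.0 / (1.0 - c * t)


nu1, nu2 = moments_from_mgf(exponential_mgf)
variance = nu2 - nu1 ** 2
print(nu1, variance)                      # close to c and c**2
```

With c = 2 the printed values are close to 2 and 4, in agreement with mean = c and variance = c².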
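Ex. Find the moment generating function of the random variable x having the probability distribution
\[ f(x) = \begin{cases} x & \text{for } 0 \le x < 1 \\ 2 - x & \text{for } 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases} \]
Also find $\nu_1$, $\nu_2$ and the variance $\mu_2$.
Sol.
\[ M_x(t) = E\left[ e^{tx} \right] = \int_0^1 x e^{tx}\, dx + \int_1^2 (2 - x) e^{tx}\, dx + \int_2^\infty 0 \cdot e^{tx}\, dx \]
\[ = \left[ \frac{x e^{tx}}{t} - \frac{e^{tx}}{t^2} \right]_0^1 + \left[ \frac{2 e^{tx}}{t} - \frac{x e^{tx}}{t} + \frac{e^{tx}}{t^2} \right]_1^2 \]
\[ = \frac{e^{2t} - 2 e^t + 1}{t^2} = \left( \frac{e^t - 1}{t} \right)^2 \]
\[ = \frac{1}{t^2} \left( t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots \right)^2 = 1 + t + \frac{7}{12} t^2 + \cdots \]
\[ \text{Mean} = \nu_1 = \left[ \frac{d}{dt} M_x(t) \right]_{t=0} = 1 \]
\[ \nu_2 = \left[ \frac{d^2}{dt^2} M_x(t) \right]_{t=0} = 2! \times \frac{7}{12} = \frac{7}{6}, \qquad \mu_2 = \nu_2 - \bar{x}^2 = \nu_2 - \nu_1^2 = \frac{7}{6} - 1 = \frac{1}{6} = \text{variance} \]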
Method of Least Squares:
Let
\[ a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \]
\[ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \]
\[ \vdots \]
\[ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \]
be a set of m equations in the n unknowns $x_1, x_2, \ldots, x_n$.
Rewrite the equations as
\[ E_i = a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n - b_i \quad \cdots (1), \qquad i = 1, 2, 3, \ldots, m \]
Let S denote the sum of the squares of these errors; then we have
\[ S = \sum_{i=1}^{m} \left( a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n - b_i \right)^2, \qquad S = \sum_{i=1}^{m} E_i^2 \quad \cdots (2) \]
The principle of least squares asserts that the most plausible values of the unknowns are those which make S given by (2) a minimum.
In differential calculus, the maxima and minima of $F = F(x_1, x_2, x_3, \ldots, x_n)$ are given by
\[ \frac{\partial F}{\partial x_1} = 0 = \frac{\partial F}{\partial x_2} = \frac{\partial F}{\partial x_3} = \cdots = \frac{\partial F}{\partial x_n} \]
provided the partial derivatives exist.
Hence S will be a maximum or minimum for the values which satisfy the following equations:
\[ \frac{\partial S}{\partial x_1} = 0 = \frac{\partial S}{\partial x_2} = \frac{\partial S}{\partial x_3} = \cdots = \frac{\partial S}{\partial x_n} \]
i.e.
\[ \sum_{i=1}^{m} a_{i1} E_i = 0, \quad \sum_{i=1}^{m} a_{i2} E_i = 0, \quad \ldots, \quad \sum_{i=1}^{m} a_{in} E_i = 0 \quad \cdots (3) \]
These n equations given by (3) are known as the normal equations. They can be solved, and the values of $x_1, x_2, \ldots, x_n$ so obtained are the most plausible values or best values.

Ex. Find the normal equations and hence find the most plausible values of x, y, z in the least square sense from the following equations.
\[ x + 2y + z = 1, \quad 2x + y + z = 4, \quad -x + y + 2z = 4, \quad 4x + 2y - 5z = -7 \]
Sol. The given equations can be rewritten as
\[ x + 2y + z - 1 = 0, \quad 2x + y + z - 4 = 0, \quad -x + y + 2z - 4 = 0, \quad 4x + 2y - 5z + 7 = 0 \]
To obtain the normal equation for x, we multiply each of these equations by the coefficient of x in that equation and then add.
Thus we get the normal equation for x as
\[ 1 \cdot (x + 2y + z - 1) + 2 \cdot (2x + y + z - 4) + (-1)(-x + y + 2z - 4) + 4 (4x + 2y - 5z + 7) = 0 \]
\[ \Rightarrow 22x + 11y - 19z + 23 = 0 \quad \cdots (1) \]
Similarly the normal equation for y is
\[ 11x + 10y - 5z + 4 = 0 \quad \cdots (2) \]
and the normal equation for z is
\[ -19x - 5y + 31z - 48 = 0 \quad \cdots (3) \]
Solving eqs. (1), (2) and (3), we get
\[ x = 0.910, \quad y = -0.378, \quad z = 2.045 \]
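The normal equations can be formed and solved mechanically. The sketch below is a plain-Python illustration with hypothetical function names `normal_equations` and `solve`. The coefficient signs used for the example are an assumption: they are the ones consistent with the normal equations 22x + 11y - 19z + 23 = 0, 11x + 10y - 5z + 4 = 0 and -19x - 5y + 31z - 48 = 0.

```python
def normal_equations(rows, rhs):
    """Return (A^T A, A^T b) for the over-determined system rows . x = rhs."""
    n = len(rows[0])
    ata = [[sum(r[j] * r[k] for r in rows) for k in range(n)] for j in range(n)]
    atb = [sum(r[j] * b for r, b in zip(rows, rhs)) for j in range(n)]
    return ata, atb


def solve(matrix, vector):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [v] for row, v in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# x + 2y + z = 1, 2x + y + z = 4, -x + y + 2z = 4, 4x + 2y - 5z = -7 (signs assumed)
rows = [[1, 2, 1], [2, 1, 1], [-1, 1, 2], [4, 2, -5]]
rhs = [1, 4, 4, -7]
ata, atb = normal_equations(rows, rhs)
x, y, z = solve(ata, atb)
print(round(x, 3), round(y, 3), round(z, 3))  # 0.91 -0.378 2.045
```

Each row of `ata` together with the matching entry of `atb` is one normal equation, written as coefficients on the left and constant on the right.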
Ex. Find the values of x and y which satisfy the following equations most closely, with
the help of the normal equations for x and y.

\[
x + 2.5y = 21, \quad 4x + 1.2y = 42.04, \quad 3.2x - y = 28, \quad 1.5x + 6.3y = 40
\]

Ans. $x = 9.620, \; y = 4.064$
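As a sketch of the intermediate step for this exercise, assume the four equations read $x + 2.5y = 21$, $4x + 1.2y = 42.04$, $3.2x - y = 28$, $1.5x + 6.3y = 40$ (the minus sign in the third equation is an assumption; it is the choice that reproduces the stated answer). Multiplying each equation by its coefficient of x and adding, and then doing the same with the coefficients of y, gives the normal equations

\[
29.49\,x + 13.55\,y = 338.76
\]
\[
13.55\,x + 48.38\,y = 326.948
\]

Solving this pair gives $x \approx 9.620$ and $y \approx 4.064$, in agreement with the answer quoted.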
Ex. Find the most plausible values of x and y from the following equations:

\[
x + y = 3.00, \quad 2x - y = 0.5, \quad x + 3y = 7.25, \quad 3x + y = 4.95
\]

Sol. The given equations can be rewritten as

\[
x + y - 3.00 = 0, \quad 2x - y - 0.5 = 0, \quad x + 3y - 7.25 = 0, \quad 3x + y - 4.95 = 0
\]

The sum of the squares of the errors is given by

\[
S = (x + y - 3.00)^2 + (2x - y - 0.5)^2 + (x + 3y - 7.25)^2 + (3x + y - 4.95)^2
\]

For extreme values of S we have

\[
\frac{\partial S}{\partial x} = 0, \qquad \frac{\partial S}{\partial y} = 0
\]

$\dfrac{\partial S}{\partial x} = 0$ gives

\[
2(x + y - 3.00) + 4(2x - y - 0.5) + 2(x + 3y - 7.25) + 6(3x + y - 4.95) = 0
\]
\[
\Rightarrow 15x + 5y - 26.10 = 0 \tag{1}
\]

and $\dfrac{\partial S}{\partial y} = 0$ gives

\[
2(x + y - 3.00) - 2(2x - y - 0.5) + 6(x + 3y - 7.25) + 2(3x + y - 4.95) = 0
\]
\[
\Rightarrow 5x + 12y - 29.20 = 0 \tag{2}
\]

From (1) and (2) we have

\[
x = 1.079, \quad y = 1.984
\]
#Curve Fitting:
We find that the fitting of curves helps us to get a close functional relation between the variables
x and y and this relation is generally expressed as a polynomial but other types of algebraic,
exponential or logarithmic relationship can also be fitted with the help of the principle of least
squares.
Suppose we have to fit a curve of degree r, given by

\[
y = a + bx + cx^2 + \cdots + kx^r \tag{1}
\]

to the given values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
The curve given by (1) has (r+1) unknown constants a, b, c, ..., k. If n = r+1, we get (r+1)
equations, which determine the constants uniquely. If n > r+1, there are more equations than
unknowns, so a unique solution for a, b, c, ..., k is impossible and we apply the principle of
least squares.
Now let

\[
\hat{y}_i = a + bx_i + cx_i^2 + \cdots + kx_i^r \tag{2}
\]

be the value given by the curve at $x_i$, and let $y_i$ be the observed value of y for $x_i$.
Then if $u_i$ is the residual for this point, we have

\[
u_i = y_i - \hat{y}_i = y_i - (a + bx_i + cx_i^2 + \cdots + kx_i^r) \quad \text{from (2)}
\]
\[
u_i = y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r \tag{3}
\]

In order to make the sum of the squares of the residuals a minimum, we have to minimize

\[
S = \sum_{i=1}^{n} u_i^2 = \sum_{i=1}^{n} \left(y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r\right)^2 \tag{4}
\]

S will have its extreme values when

\[
\frac{\partial S}{\partial a} = 0, \quad \frac{\partial S}{\partial b} = 0, \quad \ldots, \quad \frac{\partial S}{\partial k} = 0
\]

which gives (r+1) equations:

\[
\frac{\partial S}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \left(y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r\right) = 0
\]
\[
\frac{\partial S}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i \left(y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r\right) = 0
\]
\[
\frac{\partial S}{\partial c} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i^2 \left(y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r\right) = 0
\]
\[
\vdots
\]
\[
\frac{\partial S}{\partial k} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i^r \left(y_i - a - bx_i - cx_i^2 - \cdots - kx_i^r\right) = 0
\]

These reduce to the normal equations

\[
\sum y_i = na + b\sum x_i + c\sum x_i^2 + \cdots + k\sum x_i^r
\]
\[
\sum x_i y_i = a\sum x_i + b\sum x_i^2 + c\sum x_i^3 + \cdots + k\sum x_i^{r+1}
\]
\[
\sum x_i^2 y_i = a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4 + \cdots + k\sum x_i^{r+2}
\]
\[
\vdots
\]
\[
\sum x_i^r y_i = a\sum x_i^r + b\sum x_i^{r+1} + c\sum x_i^{r+2} + \cdots + k\sum x_i^{2r}
\]

Solve all (r+1) normal equations to find the values of the constants a, b, c, ..., k.

The best fitting curve is then

\[
y = a + bx + cx^2 + dx^3 + \cdots + kx^r
\]
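The normal equations for a curve of degree r have a regular pattern: the entry in row j and column k of the coefficient matrix is $\sum x_i^{j+k}$ and the right-hand side of row j is $\sum x_i^j y_i$. The sketch below builds and solves them in plain Python. The names `poly_fit` and `gauss_solve` are hypothetical helpers introduced for illustration, and the sample data is made up from the exact parabola $y = 1 + 2x + 3x^2$ so that the expected constants are known in advance.

```python
def gauss_solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in M[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def poly_fit(xs, ys, r):
    """Return [a, b, c, ..., k] for y = a + b x + ... + k x^r via the normal equations."""
    # Power sums: sx[p] = sum of x^p for p = 0 .. 2r (sx[0] is n, the number of points).
    sx = [sum(x ** p for x in xs) for p in range(2 * r + 1)]
    # Moments: sxy[p] = sum of x^p * y for p = 0 .. r.
    sxy = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(r + 1)]
    # Row j of the normal equations: sum x^j y = a sum x^j + b sum x^(j+1) + ...
    N = [[sx[j + k] for k in range(r + 1)] for j in range(r + 1)]
    return gauss_solve(N, sxy)


# Made-up illustrative data lying exactly on y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = poly_fit(xs, ys, 2)
print([round(v, 6) for v in coeffs])
```

Solving the normal equations directly is adequate for the low degrees used in these notes; for high degrees the power sums grow quickly and the system becomes ill-conditioned.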
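(a) Straight line

\[
y = a + bx \tag{1}
\]

(b) Parabola

\[
y = a + bx + cx^2 \tag{2}
\]

(c) Exponential curve

\[
y = ae^{bx} \tag{3}
\]

Taking logarithms to base 10,

\[
\log_{10} y = \log_{10} a + bx\log_{10} e
\]
\[
Y = A + Bx
\]

where $Y = \log_{10} y$, $A = \log_{10} a$, $B = b\log_{10} e$.

(d) Fitting of the curve

\[
y = ab^x
\]
\[
\log_{10} y = \log_{10} a + x\log_{10} b
\]
\[
Y = A + xB
\]

where $Y = \log_{10} y$, $A = \log_{10} a$, $B = \log_{10} b$.

(e) Fitting of the curve

\[
y = ax^b
\]
\[
\log_{10} y = \log_{10} a + b\log_{10} x
\]
\[
Y = A + bX
\]

where $Y = \log_{10} y$, $A = \log_{10} a$, $X = \log_{10} x$.

(f) Fitting of the curve

\[
PV^{\gamma} = k \;\Rightarrow\; V^{\gamma} = kP^{-1}
\]

Taking logarithms,

\[
\gamma\log_{10} V = \log_{10} k - \log_{10} P
\]
\[
\log_{10} V = \frac{1}{\gamma}\log_{10} k - \frac{1}{\gamma}\log_{10} P
\]
\[
\Rightarrow Y = A + BX
\]

where $Y = \log_{10} V$, $A = \dfrac{1}{\gamma}\log_{10} k$, $B = -\dfrac{1}{\gamma}$, $X = \log_{10} P$.

(g) Fitting of the curve

\[
xy = b + ax
\]
\[
y = \frac{b}{x} + a
\]
\[
\Rightarrow Y = a + bX \quad \text{where } Y = y, \; X = \frac{1}{x}
\]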
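Each of the cases (c) to (g) turns the curve into a straight line in transformed variables, so one straight-line fit serves for all of them. As a minimal sketch, the code below does this for case (e), $y = ax^b$, in plain Python. The function names `fit_line` and `fit_power_law` are hypothetical, and the data is made up from the exact curve $y = 2x^3$ purely for illustration.

```python
import math


def fit_line(X, Y):
    """Solve the normal equations of Y = A + B*X for A and B (Cramer's rule)."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxx = sum(x * x for x in X)
    sxy = sum(x * y for x, y in zip(X, Y))
    det = n * sxx - sx * sx
    A = (sy * sxx - sx * sxy) / det
    B = (n * sxy - sx * sy) / det
    return A, B


def fit_power_law(xs, ys):
    """Fit y = a * x**b using X = log10(x), Y = log10(y), so that Y = A + b*X."""
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    A, b = fit_line(X, Y)
    return 10 ** A, b  # a = antilog(A)


# Made-up illustrative data lying exactly on y = 2 x^3 (x and y must be positive).
xs = [1, 2, 4, 8]
ys = [2 * x ** 3 for x in xs]
a, b = fit_power_law(xs, ys)
print(round(a, 6), round(b, 6))
```

Note that the fit minimizes the squared errors in $\log_{10} y$, not in y itself, so for noisy data the result can differ slightly from a direct least squares fit of the original curve.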
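(h) Fitting of the curve

\[
y = ax^2 + \frac{b}{x}
\]
\[
S = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left(y_i - ax_i^2 - \frac{b}{x_i}\right)^2 \text{ is minimum.}
\]
\[
\frac{\partial S}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i^2 y_i = a\sum_{i=1}^{n} x_i^4 + b\sum_{i=1}^{n} x_i
\]
\[
\frac{\partial S}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \frac{y_i}{x_i} = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} \frac{1}{x_i^2}
\]

Normal equations are

\[
\sum x^2 y = a\sum x^4 + b\sum x
\]
\[
\sum \frac{y}{x} = a\sum x + b\sum \frac{1}{x^2}
\]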

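(i) Fitting of the curve

\[
y = ax + bx^2
\]
\[
S = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left(y_i - ax_i - bx_i^2\right)^2 \text{ is minimum.}
\]
\[
\frac{\partial S}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i^2 + b\sum_{i=1}^{n} x_i^3
\]
\[
\frac{\partial S}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i^2 y_i = a\sum_{i=1}^{n} x_i^3 + b\sum_{i=1}^{n} x_i^4
\]

Normal equations are

\[
\sum xy = a\sum x^2 + b\sum x^3
\]
\[
\sum x^2 y = a\sum x^3 + b\sum x^4
\]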

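(j) Fitting of the curve

\[
y = ax + \frac{b}{x}
\]
\[
S = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left(y_i - ax_i - \frac{b}{x_i}\right)^2 \text{ is minimum.}
\]
\[
\frac{\partial S}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i^2 + b\sum_{i=1}^{n} \frac{x_i}{x_i} = a\sum_{i=1}^{n} x_i^2 + nb
\]
\[
\frac{\partial S}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \frac{y_i}{x_i} = na + b\sum_{i=1}^{n} \frac{1}{x_i^2}
\]

Normal equations are

\[
\sum xy = a\sum x^2 + nb
\]
\[
\sum \frac{y}{x} = na + b\sum \frac{1}{x^2}
\]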

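(k) Fitting of the curve

\[
y = a + \frac{b}{x} + \frac{c}{x^2}
\]

Normal equations are

\[
\sum y = na + b\sum \frac{1}{x} + c\sum \frac{1}{x^2}
\]
\[
\sum \frac{y}{x} = a\sum \frac{1}{x} + b\sum \frac{1}{x^2} + c\sum \frac{1}{x^3}
\]
\[
\sum \frac{y}{x^2} = a\sum \frac{1}{x^2} + b\sum \frac{1}{x^3} + c\sum \frac{1}{x^4}
\]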
(l) Fitting of the curve

\[
y = \frac{c_0}{x} + c_1\sqrt{x}
\]

Normal equations are

\[
\sum \frac{y}{x} = c_0\sum \frac{1}{x^2} + c_1\sum \frac{1}{\sqrt{x}}
\]
\[
\sum y\sqrt{x} = c_0\sum \frac{1}{\sqrt{x}} + c_1\sum x
\]

(m) Fitting of the curve

\[
2^x = ax^2 + bx + c
\]

Normal equations are

\[
\sum 2^x x^2 = a\sum x^4 + b\sum x^3 + c\sum x^2
\]
\[
\sum 2^x x = a\sum x^3 + b\sum x^2 + c\sum x
\]
\[
\sum 2^x = a\sum x^2 + b\sum x + nc
\]
(n) Fitting of the curve

\[
y = ae^{3x} + be^{2x}
\]

Normal equations are

\[
\sum ye^{3x} = a\sum e^{6x} + b\sum e^{5x}
\]
\[
\sum ye^{2x} = a\sum e^{5x} + b\sum e^{4x}
\]

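Cases (h), (i), (j), (l) and (n) all have the form $y = a\,f(x) + b\,g(x)$, and their normal equations follow one pattern: $\sum f y = a\sum f^2 + b\sum fg$ and $\sum g y = a\sum fg + b\sum g^2$. The sketch below implements that pattern in plain Python. The function name `fit_two_terms` is hypothetical, and the data is made up from the exact curve $y = 2x^2 + 3/x$ of type (h) for illustration only.

```python
def fit_two_terms(xs, ys, f, g):
    """Least squares constants a, b for y = a*f(x) + b*g(x) from the two normal equations."""
    sff = sum(f(x) * f(x) for x in xs)
    sfg = sum(f(x) * g(x) for x in xs)
    sgg = sum(g(x) * g(x) for x in xs)
    sfy = sum(f(x) * y for x, y in zip(xs, ys))
    sgy = sum(g(x) * y for x, y in zip(xs, ys))
    # Normal equations: sfy = a*sff + b*sfg and sgy = a*sfg + b*sgg, solved by Cramer's rule.
    det = sff * sgg - sfg * sfg
    a = (sfy * sgg - sfg * sgy) / det
    b = (sff * sgy - sfg * sfy) / det
    return a, b


# Case (h): f(x) = x^2, g(x) = 1/x. Made-up data lying exactly on y = 2x^2 + 3/x.
xs = [1, 2, 3, 4]
ys = [2 * x * x + 3 / x for x in xs]
a, b = fit_two_terms(xs, ys, lambda x: x * x, lambda x: 1 / x)
print(round(a, 6), round(b, 6))
```

With $f(x) = x^2$ and $g(x) = 1/x$ the sums `sff`, `sfg`, `sgg` are $\sum x^4$, $\sum x$, $\sum 1/x^2$, which are exactly the coefficients in the normal equations of case (h); other cases only need different `f` and `g`.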
Q. Fit a straight line to the following data:

x      1     2     3     4     5
y      5     7     9    10    11

Sol.

          x      y     xy    x^2
          1      5      5      1
          2      7     14      4
          3      9     27      9
          4     10     40     16
          5     11     55     25
Total    15     42    141     55

Let the equation of the line be

\[
y = a + bx \tag{1}
\]

Then its normal equations are

\[
\sum y = na + b\sum x
\]
\[
\sum xy = a\sum x + b\sum x^2
\]

These reduce to

\[
42 = 5a + 15b \tag{2}
\]
\[
141 = 15a + 55b \tag{3}
\]

On solving we get

\[
b = \frac{3}{2}, \quad a = 3.9
\]

so $y = 3.9 + 1.5x$.
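The elimination behind "on solving" can be written out as follows, using only the two reduced equations $42 = 5a + 15b$ and $141 = 15a + 55b$. Multiplying the first by 3 and subtracting it from the second removes a:

\[
141 - 3(42) = (15a + 55b) - (15a + 45b) \;\Rightarrow\; 15 = 10b \;\Rightarrow\; b = \frac{3}{2}
\]
\[
a = \frac{42 - 15b}{5} = \frac{42 - 22.5}{5} = 3.9
\]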

Q. Fit a straight line to the following data, regarding x as the independent variable:

x      0     1     2     3     4
y    1.0   1.8   3.3   4.5   6.3

Hence find the difference between the actual value of y and the value of y obtained from the
fitted line when x = 3.
Sol.
The number of values of x is 5, i.e. odd, so we take the middle value x = 2 as the origin.
Let u = x - 2 and v = y - 0, i.e. v = y.

          x   u=x-2      y   v=y-0    u^2     uv
          0      -2    1.0     1.0      4   -2.0
          1      -1    1.8     1.8      1   -1.8
          2       0    3.3     3.3      0      0
          3       1    4.5     4.5      1    4.5
          4       2    6.3     6.3      4   12.6
Total    10       0   16.9    16.9     10   13.3

Let the equation of the line be

\[
v = a + bu \tag{1}
\]

Then its normal equations are

\[
\sum v = na + b\sum u \;\Rightarrow\; 16.9 = 5a + b \cdot 0 \;\Rightarrow\; a = 3.38
\]
\[
\sum uv = a\sum u + b\sum u^2 \;\Rightarrow\; 13.3 = a \cdot 0 + b \cdot 10 \;\Rightarrow\; b = 1.33
\]
\[
\Rightarrow v = 3.38 + 1.33u
\]
\[
\Rightarrow y = 3.38 + 1.33(x - 2)
\]
\[
\Rightarrow y = 1.33x + 0.72
\]

When x = 3, the fitted line gives y = 4.71. The actual value is 4.5, so the difference
(actual minus fitted) is 4.5 - 4.71 = -0.21.
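The change of origin used here makes $\sum u = 0$, so each normal equation contains only one unknown. A minimal Python sketch of this shortcut follows; the function name `fit_line_shifted_origin` is a hypothetical helper, and it assumes an odd number of equally spaced x values in increasing order, as in this example.

```python
def fit_line_shifted_origin(xs, ys):
    """Fit y = intercept + slope*x using u = x - x0, where x0 is the middle x value.

    Assumes an odd number of equally spaced x values in increasing order,
    so that sum(u) = 0 and the two normal equations decouple.
    """
    n = len(xs)
    x0 = xs[n // 2]
    us = [x - x0 for x in xs]
    if abs(sum(us)) > 1e-12:
        raise ValueError("sum(u) must vanish for this shortcut to apply")
    a = sum(ys) / n  # from sum(v) = n*a
    b = sum(u * y for u, y in zip(us, ys)) / sum(u * u for u in us)  # from sum(uv) = b*sum(u^2)
    # v = a + b*u with u = x - x0 becomes y = (a - b*x0) + b*x.
    return a - b * x0, b


# Data of the example above.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 1.8, 3.3, 4.5, 6.3]
intercept, slope = fit_line_shifted_origin(xs, ys)
fitted_at_3 = intercept + slope * 3
difference = ys[3] - fitted_at_3  # actual value minus fitted value at x = 3
print(round(intercept, 2), round(slope, 2), round(fitted_at_3, 2), round(difference, 2))
```

The shift changes only the arithmetic, not the fitted line: the same constants come out of the unshifted normal equations, with more work.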
Page38

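Q. Fit a second degree curve to the following data, taking u = x - 5 as the independent variable
and v = y - 7 as the dependent variable:

x      1     2     3     4     5     6     7     8     9
y      2     6     7     8    10    11    11    10     9

Sol.
Let the second degree curve to be fitted be

\[
v = a + bu + cu^2 \tag{1}
\]

Normal equations are

\[
\sum v = na + b\sum u + c\sum u^2 \tag{2}
\]
\[
\sum uv = a\sum u + b\sum u^2 + c\sum u^3 \tag{3}
\]
\[
\sum u^2 v = a\sum u^2 + b\sum u^3 + c\sum u^4 \tag{4}
\]

Here n = 9 and

\[
\sum u = 0, \quad \sum v = 11, \quad \sum uv = 51, \quad \sum u^2 = 60
\]
\[
\sum u^2 v = -9, \quad \sum u^3 = 0, \quad \sum u^4 = 708
\]

so the normal equations become $11 = 9a + 60c$, $51 = 60b$ and $-9 = 60a + 708c$.
Solving the normal equations we get

\[
a = 3.004, \quad b = 0.85, \quad c = -0.267
\]

Hence $v = 3.004 + 0.85u - 0.267u^2$, i.e. $y - 7 = 3.004 + 0.85(x - 5) - 0.267(x - 5)^2$, which gives

\[
y = -0.921 + 3.52x - 0.267x^2
\]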

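With u measured from the middle value, $\sum u = \sum u^3 = 0$, so b comes from one equation alone and a, c from a pair of equations. The following Python sketch carries out that arithmetic for the data of this example. The function name `fit_parabola_centred` is hypothetical, and the shortcut is valid only when both odd power sums vanish, which the code checks.

```python
def fit_parabola_centred(us, vs):
    """Fit v = a + b*u + c*u^2 when sum(u) = sum(u^3) = 0 (values symmetric about 0)."""
    if sum(us) != 0 or sum(u ** 3 for u in us) != 0:
        raise ValueError("shortcut needs sum(u) = 0 and sum(u^3) = 0")
    n = len(us)
    s2 = sum(u ** 2 for u in us)
    s4 = sum(u ** 4 for u in us)
    sv = sum(vs)
    suv = sum(u * v for u, v in zip(us, vs))
    su2v = sum(u * u * v for u, v in zip(us, vs))
    b = suv / s2  # from sum(uv) = b * sum(u^2)
    # Remaining pair: sv = n*a + s2*c and su2v = s2*a + s4*c, solved by Cramer's rule.
    det = n * s4 - s2 * s2
    a = (sv * s4 - s2 * su2v) / det
    c = (n * su2v - s2 * sv) / det
    return a, b, c


# Data of the example above, shifted to u = x - 5 and v = y - 7.
xs = list(range(1, 10))
ys = [2, 6, 7, 8, 10, 11, 11, 10, 9]
us = [x - 5 for x in xs]
vs = [y - 7 for y in ys]
a, b, c = fit_parabola_centred(us, vs)
print(round(a, 3), round(b, 3), round(c, 3))
```

The constants in terms of x and y then follow by substituting u = x - 5 and v = y - 7 and expanding; small differences in the last digit can arise depending on whether a, b, c are rounded before or after the expansion.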
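Q. Fit the curve $y = ae^{bx}$ to the following data, where e = 2.71828:

x        0      2      4
y    5.012     10  31.62

Sol. The curve to be fitted is

\[
y = ae^{bx} \tag{1}
\]

Taking logarithms to base 10,

\[
\log_{10} y = \log_{10} a + (b\log_{10} e)\,x
\]
\[
Y = A + Bx
\]

where $Y = \log_{10} y$, $A = \log_{10} a$, $B = b\log_{10} e$.
Normal equations are

\[
\sum Y = nA + B\sum x \tag{2}
\]
\[
\sum xY = A\sum x + B\sum x^2 \tag{3}
\]

          x        y   Y=log10(y)     xY    x^2
          0    5.012          0.7      0      0
          2       10          1.0    2.0      4
          4    31.62          1.5    6.0     16
Total     6                   3.2    8.0     20

We have

\[
3.2 = 3A + 6B \tag{4}
\]
\[
8 = 6A + 20B \tag{5}
\]
\[
\Rightarrow A = 0.6667, \quad B = 0.2
\]
\[
A = 0.6667 \;\Rightarrow\; \log_{10} a = 0.6667 \;\Rightarrow\; a = \operatorname{antilog}(0.6667) = 4.642
\]
\[
B = 0.2 \;\Rightarrow\; b\log_{10} 2.718 = 0.2 \;\Rightarrow\; b\,(0.4343) = 0.2 \;\Rightarrow\; b = 0.46
\]
\[
\Rightarrow y = 4.64\,e^{0.46x}
\]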

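The same calculation can be sketched in Python as a check on the arithmetic. The function name `fit_exponential` is a hypothetical helper; it uses the unrounded values of $\log_{10} y$ rather than the one-decimal values in the table, so its output agrees with the hand calculation only to about two or three significant figures.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by a straight-line fit of Y = log10(y) against x."""
    n = len(xs)
    Ys = [math.log10(y) for y in ys]
    sx, sY = sum(xs), sum(Ys)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * Y for x, Y in zip(xs, Ys))
    # Normal equations: sY = n*A + B*sx and sxY = A*sx + B*sxx, solved by Cramer's rule.
    det = n * sxx - sx * sx
    A = (sY * sxx - sx * sxY) / det
    B = (n * sxY - sx * sY) / det
    a = 10 ** A                   # since A = log10(a)
    b = B / math.log10(math.e)    # since B = b * log10(e)
    return a, b


# Data of the example above.
xs = [0, 2, 4]
ys = [5.012, 10, 31.62]
a, b = fit_exponential(xs, ys)
print(round(a, 2), round(b, 2))
```

As with the other logarithmic transformations, this minimizes the squared errors in $\log_{10} y$ rather than in y.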