Statistics

Statistics jee

Uploaded by

defnotavi08
Statistics

Introduction
Statistics is studied here under two heads :

Measures of central tendency
1. mean, median, mode
2. quartile, decile

Measures of dispersion
1. Range
2. Mean deviation
3. σ, σ²
• Eg. Suppose Virat Kohli and Rohit Sharma both average 50 runs.
The chart of Virat's runs looks like this :

[Figure: Virat's runs plotted on an axis marked 0, 50, 100]
While Rohit's runs are distributed like this :

[Figure: Rohit's runs plotted on an axis marked 0, 50, 100]
Although both have the same average, Virat is more consistent: his runs are less scattered about the mean. This scatter is what the measures of dispersion quantify.

Types of data
(a) Ungrouped data : Eg. runs scored are 10, 20, 30, 40, 50. These values are denoted by $x_i$.
(b) Grouped data :
i. Discrete frequency distribution : Every value $x_i$ has a corresponding frequency (denoted by $f_i$).

$x_i$ :   10    20    30    40    50
$f_i$ :    5     2     5     1     4

ii. Continuous frequency distribution :

$x_i$ :   0 – 10   10 – 20   20 – 30   30 – 40   40 – 50
$f_i$ :     5         1         3         4         2

Here the runs are represented by intervals, i.e. there were 5 innings in which the runs lie in the 0 – 10 interval and 3 innings in which the runs lie in the 20 – 30 interval. These intervals are called class intervals. The assumption is that the data in each class is centred at the middle of the class interval, i.e. the 2 observations in the 40 – 50 class are each treated as 45 runs.

Measures of central tendency


Mean
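• For ungrouped data
$$\bar{x} = \frac{\sum x_i}{n}, \quad \text{where } n \text{ is the number of observations}$$
• For grouped data
$$\bar{x} = \frac{\sum f_i x_i}{n}, \quad \text{where } n = \sum f_i$$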
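Q1. The mean of 100 items is 49. It was discovered that 3 items which should have been 60, 70, 80 were wrongly read as 40, 20, 50 respectively. What is the correct mean?
A1. Ans. 50. The recorded total is 100 × 49 = 4900; the correction adds (60 + 70 + 80) − (40 + 20 + 50) = 100, so the correct mean is 5000/100 = 50.
Q2. If a variable takes values 0, 1, 2, …, n with frequencies 1, $^nC_1$, $^nC_2$, …, $^nC_n$, then the A.M. is
(A) $n$ (B) $2^n/n$ (C) $n + 1$ (D) $n/2$
A2. Ans. D, since $\sum r\,{}^nC_r = n\,2^{n-1}$ and $\sum {}^nC_r = 2^n$.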

• Eg. Discrete frequency

$x_i$      $f_i$      $x_i f_i$
2          2          4
5          8          40
6          10         60
8          7          56
10         8          80
12         5          60
           $N = 40$   $\sum x_i f_i = 300$

$$\text{Mean } \bar{x} = \frac{\sum x_i f_i}{N} = \frac{300}{40} = 7.5$$
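The table above can be checked with a few lines of Python. This is only a sketch; the function name `grouped_mean` is made up for illustration and is not part of the notes.

```python
def grouped_mean(values, freqs):
    """Mean of a discrete frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    total = sum(f * x for x, f in zip(values, freqs))  # sum of x_i * f_i
    n = sum(freqs)                                     # N = sum of f_i
    return total / n

x = [2, 5, 6, 8, 10, 12]
f = [2, 8, 10, 7, 8, 5]
print(grouped_mean(x, f))  # 7.5
```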
• Eg. Continuous distribution

Marks obtained    $x_i$    $f_i$
10 – 20           15       2
20 – 30           25       3
30 – 40           35       8
40 – 50           45       14
50 – 60           55       8
60 – 70           65       3
70 – 80           75       2

Method – 1 : Make a column for $x_i f_i$
Method – 2 : Assumed mean method
Assumed mean method
• Eg. 10, 20, 30, 40, 50, 60
Looking at the values, we can guess that the mean is about 35, and the formula confirms that it is exactly 35. The assumed mean method builds on such an educated guess: assume a mean, then correct it.
• Suppose the assumed mean is 40 ($A = 40$). Then the mean is given by
$$\bar{x} = A + \frac{\sum (x_i - A)}{n} = 40 + \frac{-30 - 20 - 10 + 0 + 10 + 20}{6} = 40 - 5 = 35$$
• We can extend this to grouped data :
$$\bar{x} = A + \frac{\sum f_i (x_i - A)}{n}$$
• In the marks example above, every $x_i$ is a multiple of 5, so we take $A = 45$.

Marks obtained    $x_i$    $f_i$    $x_i - A$    $f_i (x_i - A)$
10 – 20           15       2        -30          -60
20 – 30           25       3        -20          -60
30 – 40           35       8        -10          -80
40 – 50           45       14       0            0
50 – 60           55       8        10           80
60 – 70           65       3        20           60
70 – 80           75       2        30           60
                                                 $\sum f_i (x_i - A) = 0$

$$\therefore \bar{x} = 45 + \frac{0}{40} = 45$$
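A short Python sketch of the assumed mean method follows. The function name `assumed_mean` is hypothetical; the point is that any choice of $A$ gives the same mean, and a good guess only makes the hand arithmetic smaller.

```python
def assumed_mean(mids, freqs, A):
    """Grouped mean via an assumed mean A: A + sum(f_i * (x_i - A)) / n."""
    n = sum(freqs)
    correction = sum(f * (x - A) for x, f in zip(mids, freqs)) / n
    return A + correction

mids = [15, 25, 35, 45, 55, 65, 75]   # class marks x_i
freqs = [2, 3, 8, 14, 8, 3, 2]        # frequencies f_i

print(assumed_mean(mids, freqs, 45))  # 45.0 (correction term is 0)
print(assumed_mean(mids, freqs, 35))  # 45.0 (correction term is 10)
```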

Q3. The mean of the following frequency table is 50.

Class       Frequency
0 – 20      17
20 – 40     $f_1$
40 – 60     32
60 – 80     $f_2$
80 – 100    19
Total       120

The missing frequencies are :
(A) 28, 24 (B) 24, 36 (C) 36, 28 (D) none of these
A3. Ans. (A). The total gives $f_1 + f_2 = 52$ and the mean gives $30 f_1 + 70 f_2 = 2520$, so $f_1 = 28$, $f_2 = 24$.
Median
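• Eg. 10, 30, 50, 20, 60, 70, 80
The median is not 20 (the middle entry as listed). The data must first be arranged in increasing or decreasing order: 10, 20, 30, 50, 60, 70, 80, which gives a median of 50.
• Median is the central value/observation when the data is arranged in increasing/decreasing order. For $N$ observations,
$$\text{Median} = \begin{cases} \left(\dfrac{N+1}{2}\right)^{\text{th}} \text{ term}, & N \text{ odd} \\[2ex] \dfrac{\left(\dfrac{N}{2}\right)^{\text{th}} \text{ term} + \left(\dfrac{N}{2}+1\right)^{\text{th}} \text{ term}}{2}, & N \text{ even} \end{cases}$$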
• For grouped data :
Eg.

$x_i$    $f_i$    $cf_i$
2        2        2
5        8        10
6        10       20
8        7        27
10       8        35
12       5        40

Step 1 : Arrange the data in increasing/decreasing order
Step 2 : Make a cumulative frequency column
Step 3 : Use the formula of median
In this example $N = 40$, so
$$\text{Median} = \frac{20^{\text{th}} \text{ term} + 21^{\text{st}} \text{ term}}{2} = \frac{6 + 8}{2} = 7$$
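The three steps can be sketched in Python as below. The helper names `median_discrete` and `term` are illustrative only; the cumulative frequency is built on the fly instead of being stored as a column.

```python
def median_discrete(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequency."""
    pairs = sorted(zip(values, freqs))   # Step 1: arrange in increasing order
    n = sum(freqs)

    def term(k):
        """Return the k-th observation (1-indexed) using a running c.f."""
        cf = 0
        for x, f in pairs:               # Step 2: accumulate frequencies
            cf += f
            if cf >= k:
                return x

    if n % 2 == 1:                       # Step 3: apply the median formula
        return term((n + 1) // 2)
    return (term(n // 2) + term(n // 2 + 1)) / 2

print(median_discrete([2, 5, 6, 8, 10, 12], [2, 8, 10, 7, 8, 5]))  # 7.0
```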
• Eg.

Marks      $x_i$    $f_i$    $cf_i$
10 – 20    15       2        2
20 – 30    25       3        5
30 – 40    35       8        13
40 – 50    45       14       27
50 – 60    55       8        35
60 – 70    65       3        38
70 – 80    75       2        40

The median is the 20th term of the data, which falls in the 40 – 50 row (the first row whose cumulative frequency reaches 20). Reading off the class mark, the median would be 45. But finding the median this way leaves scope for error, because it assumes that all the data of the class sits at the value 45. Suppose we change 14 (the $f_i$ of that class) to 24: this should have some effect on the median. Similarly, the frequencies of the classes before and after the median class should also affect the value of the median.
• Hence the formula of median comes out as
$$\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times h$$
where $l$ is the lower limit of the median class
$C$ is the cumulative frequency of the class preceding the median class
$N$ is the total number of observations
$f$ is the frequency of the median class
$h$ is the width of the median class
• In the above example :
Step 1 : Identify the median class.
$$\frac{N}{2} = 20 \rightarrow \text{so the median class is } 40 - 50$$
Step 2 : Apply the formula
$$M = 40 + \left(\frac{20 - 13}{14}\right) \times 10 = 45$$
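A minimal Python sketch of the two steps is given below. The name `grouped_median` is hypothetical, equal class widths are assumed, and the median class is taken as the first class whose cumulative frequency reaches $N/2$.

```python
def grouped_median(lowers, freqs, h):
    """Median = l + (N/2 - C) / f * h for a continuous frequency distribution."""
    n = sum(freqs)
    cf = 0                                   # C: c.f. of the preceding classes
    for l, f in zip(lowers, freqs):
        if f > 0 and cf + f >= n / 2:        # Step 1: this is the median class
            return l + (n / 2 - cf) / f * h  # Step 2: apply the formula
        cf += f
    raise ValueError("empty distribution")

lowers = [10, 20, 30, 40, 50, 60, 70]        # lower limits of the classes
freqs = [2, 3, 8, 14, 8, 3, 2]
print(grouped_median(lowers, freqs, 10))     # 45.0
```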
Mode
• The mode or modal value of a distribution is that value of the variable for which the frequency is maximum. The class having the highest frequency is called the modal class.
• For a continuous series, the mode is calculated as
$$\text{Mode} = l + \left[\frac{f_1 - f_0}{2 f_1 - f_0 - f_2}\right] \times h$$
where $l$ = the lower limit of the modal class
$f_1$ = the frequency of the modal class
$f_0$ = the frequency of the class preceding the modal class
$f_2$ = the frequency of the class succeeding the modal class
$h$ = the size (width) of the modal class
• Eg. Find the mode of the data 2, 4, 6, 8, 8, 12, 17, 6, 8, 9.
Soln. 8 occurs the maximum number of times, so mode = 8.
• Eg. Find the mode of the following frequency distribution

Class :   0 – 10   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60   60 – 70   70 – 80
$f_i$ :     2        18        30        45        35        20        6         3

Soln. Here the class 30 – 40 has the maximum frequency, so this is the modal class.
$l = 30,\ f_1 = 45,\ f_0 = 30,\ f_2 = 35,\ h = 10$
$$\therefore \text{Mode} = l + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times h = 30 + \frac{45 - 30}{2 \times 45 - 30 - 35} \times 10 = 36$$
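The modal class formula can be sketched in Python as follows. The name `grouped_mode` is illustrative. Treating a missing neighbouring class as frequency 0 (when the modal class is the first or last class) is an assumption of this sketch, not something the notes state.

```python
def grouped_mode(lowers, freqs, h):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for equal-width classes."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # assumption: 0 if absent
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0      # assumption: 0 if absent
    return lowers[i] + (f1 - f0) * h / (2 * f1 - f0 - f2)

lowers = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [2, 18, 30, 45, 35, 20, 6, 3]
print(grouped_mode(lowers, freqs, 10))  # 36.0
```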

Types of distribution
• Symmetric distribution : A distribution is a symmetric distribution if the values of the
mean, mode and median coincide. In a symmetric distribution, frequencies are
symmetrically distributed on both sides of the centre point of the frequency curve.
• Asymmetric distribution : A distribution which is not symmetric is called a skewed distribution. In a moderately asymmetric distribution, the interval between the mean and the median is approximately one-third of the interval between the mean and the mode, i.e. we have the following empirical relation between them

Mean – Mode = 3(Mean – Median) ⇒ Mode = 3 Median – 2 Mean

Partition values

The partition values are : Median, Quartile, Decile, Percentile

Quartile

[Figure: a line from 0 to N marked at Q1, Q2 and Q3]
Quartiles divide the distribution into four equal parts. Q1 stands for the lower quartile, Q3 stands for the upper quartile and Q2 is the middle quartile, which is the same as the median.
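• Lower quartile :
i. Discrete series : $Q_1 = \text{size of } \left(\dfrac{N+1}{4}\right)^{\text{th}} \text{ item}$
ii. Continuous series : $Q_1 = l + \dfrac{\frac{N}{4} - C}{f} \times h$
• Upper quartile :
i. Discrete series : $Q_3 = \text{size of } \left[\dfrac{3(N+1)}{4}\right]^{\text{th}} \text{ item}$
ii. Continuous series : $Q_3 = l + \dfrac{\frac{3N}{4} - C}{f} \times h$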
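• Eg. 10, 20, 30, 40, 50, 60, 70, 80
$N = 8$
$$\therefore Q_1 = \left(\frac{N+1}{4}\right)^{\text{th}} \text{ term} = \frac{9}{4}^{\text{th}} \text{ term} = 2.25^{\text{th}} \text{ term}$$
The 2.25th term lies between the 2nd term (20) and the 3rd term (30). These notes approximate it by their average,
$$2.25^{\text{th}} \text{ term} \approx \frac{20 + 30}{2} = 25$$
(strict linear interpolation would give $20 + 0.25 \times 10 = 22.5$).
• For a continuous distribution the first step is to identify the $Q_1$ class (i.e. the first class with c.f. $> \frac{N}{4}$) or the $Q_3$ class (i.e. the first class with c.f. $> \frac{3N}{4}$), as asked in the question.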
Q4. Find $Q_3$

$x_i$ :   5    4    9    12   15   6    10
$f_i$ :   8    6    12   8    6    9    10

A4. Ans. 10 (hint : arrange the data in ascending order)
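The discrete-series rule can be sketched in Python as below. The helper names are hypothetical. For a fractional position such as the 2.25th term, this sketch interpolates linearly between the neighbouring terms, which is one common convention and an assumption here; other conventions give slightly different values.

```python
import math

def kth_term(values, freqs, k):
    """k-th observation (1-indexed) of a frequency table, found via running c.f."""
    cf = 0
    for x, f in sorted(zip(values, freqs)):   # arrange in ascending order first
        cf += f
        if cf >= k:
            return x
    raise ValueError("position beyond the data")

def quartile(values, freqs, j):
    """Q_j = size of the j*(N+1)/4 th item, interpolating fractional positions."""
    n = sum(freqs)
    p = j * (n + 1) / 4
    lo, hi = math.floor(p), math.ceil(p)
    a, b = kth_term(values, freqs, lo), kth_term(values, freqs, hi)
    return a + (p - lo) * (b - a)

x = [5, 4, 9, 12, 15, 6, 10]
f = [8, 6, 12, 8, 6, 9, 10]
print(quartile(x, f, 3))  # 10.0, the 45th of N = 59 items
```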

Decile
• Deciles divide the total frequency $N$ into ten equal parts.
$$D_j = l + \frac{\frac{N \times j}{10} - C}{f} \times h \qquad [j = 1, 2, 3, 4, 5, 6, 7, 8, 9]$$
• Eg. If $j = 5$, then $D_5 = l + \dfrac{\frac{N}{2} - C}{f} \times h$. Hence $D_5$ is the same as the median.

Q5. Find $D_5$

$x_i$ :   5    4    9    12   15   6    10
$f_i$ :   8    6    12   8    6    9    10

A5. Ans. 9
Hint : Use the formula $D_5 = 5\left(\dfrac{N+1}{10}\right)^{\text{th}} \text{ term}$

Percentile
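• Percentiles divide the total frequency $N$ into hundred equal parts.
$$P_k = l + \frac{\frac{N \times k}{100} - C}{f} \times h \qquad \text{where } k = 1, 2, 3, 4, 5, \ldots, 99$$
• Eg. For ungrouped data
$$P_{10} = \text{value of } 10\left(\frac{N+1}{100}\right)^{\text{th}} \text{ term}$$
For continuous data
$$P_{10} = l + \frac{\frac{10 N}{100} - C}{f} \times h$$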

Q6. The marks obtained by 50 students are given below. If 70% of the students pass the test, find the minimum marks needed to pass the exam.

Marks :             0 – 10   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60
No. of students :     3        5         9         12        18        3

A6. Ans. 28 [or 27.78] (Hint : the bottom 30% fail, so the value of $P_{30}$ is required)
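The continuous-series formulas for median, quartiles, deciles and percentiles differ only in the fraction of $N$ they target, so one function can cover them all. The sketch below uses a made-up name `grouped_partition` and assumes equal class widths; it takes the first class whose cumulative frequency reaches the target.

```python
def grouped_partition(lowers, freqs, h, k, parts):
    """k-th of `parts` partition values: l + (k*N/parts - C) / f * h.

    parts=2 gives the median, 4 quartiles, 10 deciles, 100 percentiles.
    """
    n = sum(freqs)
    target = k * n / parts
    cf = 0                                    # C: c.f. of the preceding classes
    for l, f in zip(lowers, freqs):
        if f > 0 and cf + f >= target:
            return l + (target - cf) / f * h
        cf += f
    raise ValueError("empty distribution")

lowers = [0, 10, 20, 30, 40, 50]
students = [3, 5, 9, 12, 18, 3]
p30 = grouped_partition(lowers, students, 10, 30, 100)
print(round(p30, 2))  # 27.78
```

Here the target is $30 \times 50 / 100 = 15$, the $P_{30}$ class is 20 – 30 with $C = 8$ and $f = 9$, so $P_{30} = 20 + \frac{7}{9} \times 10 \approx 27.78$.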

Measures of dispersion
• The degree to which numerical data tends to spread about an average value is called the
dispersion of the data.
• Four measures of dispersion are : (1) Range, (2) Mean Deviation, (3) Standard deviation,
(4) Square deviation
Range
• It is the difference between the values of the extreme items in a series.
$$\text{Range} = X_{max} - X_{min}$$
• The coefficient of range (scatter) $= \dfrac{X_{max} - X_{min}}{X_{max} + X_{min}}$
• Range is not the measure of central tendency. Range is widely used in statistical series
relating to quality control in production.
Inter quartile range
• The inter – quartile range is found by taking the difference between third and first quartiles
and is given by the formula :
Inter – quartile range = 𝑄3 − 𝑄1
where 𝑄1 = first quartile or lower quartile and 𝑄3 = third quartile or upper quartile
Percentile range
• This is measured by the following formula
Percentile range = 𝑃90 − 𝑃10
where 𝑃90 = 90th percentile and 𝑃10 = 10th percentile
• Percentile range is considered better than range as well as inter quartile range
Quartile deviation or semi – quartile range
• It is one-half of the difference between the third quartile and the first quartile, i.e.
$$Q.D. = \frac{Q_3 - Q_1}{2}$$
• Coefficient of quartile deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
where $Q_3$ is the third or upper quartile and $Q_1$ is the first or lower quartile

Q7. Find the quartile deviation.

$x_i$ :   5    10   15   20   25   30
$f_i$ :   2    3    8    7    6    4

A7. Ans. 5
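A quick Python check of this answer is sketched below. The name `quartile_deviation` is illustrative; the sketch expands the table into a sorted list and reads off the $\frac{N+1}{4}$th and $\frac{3(N+1)}{4}$th items, interpolating linearly for fractional positions (an assumed convention, and it needs at least three observations).

```python
def quartile_deviation(values, freqs):
    """Return (Q.D., coefficient of Q.D.) for a discrete frequency table."""
    data = sorted(x for x, f in zip(values, freqs) for _ in range(f))
    n = len(data)

    def at(p):
        """Value at 1-indexed position p, interpolating if p is fractional."""
        lo = int(p)
        a = data[lo - 1]
        b = data[min(lo, n - 1)]
        return a + (p - lo) * (b - a)

    q1 = at((n + 1) / 4)       # 7.75th item of 30, which is 15
    q3 = at(3 * (n + 1) / 4)   # 23.25th item of 30, which is 25
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)

qd, coeff = quartile_deviation([5, 10, 15, 20, 25, 30], [2, 3, 8, 7, 6, 4])
print(qd, coeff)  # 5.0 0.25
```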

Mean deviation
• The arithmetic average of the deviations (all taken positive) from the mean, median or
mode is known as mean deviation.
• Formulae :
$$\text{Mean deviation about mean : } M.D.(\bar{x}) = \frac{\sum |x_i - \bar{x}|}{n}$$
$$\text{Mean deviation about median : } M.D.(\text{median}) = \frac{\sum |x_i - M|}{n}$$
• By convention, mean deviation (M.D.) without further qualification stands for mean deviation about the median.

Q8. Find mean deviation about mean : 6,7,10,12,13,4,8,12


A8. Ans. 2.75

• Mean deviation of discrete frequencies :
$$\text{Mean deviation} = \frac{\sum f |x - M|}{n} = \frac{\sum f d_M}{n}$$
where $M$ is the mean (or the median or mode, whichever the deviations are taken from), $d_M = |x - M|$ and $n = \sum f$
• Some more formulae :
i. Mean coefficient of dispersion $= \dfrac{\text{Mean deviation from the mean}}{\text{Mean}}$
ii. Median coefficient of dispersion $= \dfrac{\text{Mean deviation from the median}}{\text{Median}}$
iii. Mode coefficient of dispersion $= \dfrac{\text{Mean deviation from the mode}}{\text{Mode}}$

Q9. Find $M.D.(\bar{x})$.

$x_i$ :   2    5    6    8    10   12
$f_i$ :   2    8    10   7    8    5

A9. Ans. 2.3
$$\bar{x} = \frac{\sum x_i f_i}{N} = \frac{300}{40} = 7.5$$

$x_i$ :                 2     5     6     8     10    12
$f_i$ :                 2     8     10    7     8     5
$|x_i - \bar{x}|$ :     5.5   2.5   1.5   0.5   2.5   4.5
$f_i |x_i - \bar{x}|$ : 11    20    15    3.5   20    22.5

$$\sum f_i |x_i - \bar{x}| = 92$$
$$M.D.(\bar{x}) = \frac{\sum f_i |x_i - \bar{x}|}{N} = \frac{92}{40} = 2.3$$
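The same computation as a small Python sketch (the function name `mean_deviation_about_mean` is illustrative only):

```python
def mean_deviation_about_mean(values, freqs):
    """M.D. about the mean: sum(f_i * |x_i - mean|) / N for a frequency table."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    return mean, md

mean, md = mean_deviation_about_mean([2, 5, 6, 8, 10, 12], [2, 8, 10, 7, 8, 5])
print(mean, round(md, 2))  # 7.5 2.3
```

Passing a frequency of 1 for every value reduces it to the ungrouped formula, e.g. the data 6, 7, 10, 12, 13, 4, 8, 12 gives mean 9 and mean deviation 2.75.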

Q10. Find the mean deviation from the mean

$x_i$ :   5    7    9    10   12   15
$f_i$ :   8    6    2    2    2    6

A10. Ans. 3.38

$x$     $f$    $fx$    $x - \bar{x}$    $|x - \bar{x}|$    $f|x - \bar{x}|$
5       8      40      -4               4                  32
7       6      42      -2               2                  12
9       2      18      0                0                  0
10      2      20      1                1                  2
12      2      24      3                3                  6
15      6      90      6                6                  36

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{234}{26} = 9$$
$$\sum f|x - \bar{x}| = 88$$
$$\text{Now, } M.D.(\bar{x}) = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{88}{26} = 3.38$$

Q11. Find the mean deviation about the median

$x_i$ :   3    6    9    12   13   15   21   22
$f_i$ :   3    4    5    2    4    5    4    3

A11. Ans. 4.97

$x_i$ :             3    6    9    12   13   15   21   22
$f_i$ :             3    4    5    2    4    5    4    3
$cf_i$ :            3    7    12   14   18   23   27   30

$$M = \frac{15^{\text{th}} \text{ term} + 16^{\text{th}} \text{ term}}{2} = \frac{13 + 13}{2} = 13$$

$|x_i - M|$ :       10   7    4    1    0    2    8    9
$f_i |x_i - M|$ :   30   28   20   2    0    10   32   27

$$M.D.(M) = \frac{\sum |x_i - M| f_i}{N} = \frac{149}{30} = 4.97$$
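This can be cross-checked in Python by expanding the table into raw observations and using the standard library's `statistics.median`. The function name `mean_deviation_about_median` is made up for this sketch.

```python
from statistics import median

def mean_deviation_about_median(values, freqs):
    """M.D. about the median: expand the table, then average |x - M|."""
    data = [x for x, f in zip(values, freqs) for _ in range(f)]
    m = median(data)                  # averages the 15th and 16th of 30 terms
    md = sum(abs(x - m) for x in data) / len(data)
    return m, md

x = [3, 6, 9, 12, 13, 15, 21, 22]
f = [3, 4, 5, 2, 4, 5, 4, 3]
m, md = mean_deviation_about_median(x, f)
print(m, round(md, 2))  # 13.0 4.97
```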
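Standard deviation (σ) and variance (σ²)
• $S.D. = \sigma = \sqrt{\sigma^2}$
$$\text{where } \sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$$
• Also,
$$\sigma^2 = \frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2$$
• The above formulae are for ungrouped data. For grouped data, the standard deviation can be calculated from the formulae below :
$$\sigma^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n}$$
$$\text{And, } \sigma^2 = \frac{\sum f_i x_i^2}{n} - \left(\frac{\sum f_i x_i}{n}\right)^2$$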

Q12. Prove that the S.D. of the first $n$ natural numbers is $\sqrt{\dfrac{n^2 - 1}{12}}$.
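One way to carry out the proof is sketched below. It uses the formula $\sigma^2 = \frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2$ together with the standard sums $\sum k = \frac{n(n+1)}{2}$ and $\sum k^2 = \frac{n(n+1)(2n+1)}{6}$.

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum_{k=1}^{n} k^2 - \left(\frac{1}{n}\sum_{k=1}^{n} k\right)^2
          = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} \\
         &= \frac{n+1}{12}\,\bigl[\,2(2n+1) - 3(n+1)\,\bigr]
          = \frac{(n+1)(n-1)}{12}
          = \frac{n^2-1}{12}, \\
\sigma   &= \sqrt{\frac{n^2-1}{12}}.
\end{aligned}
```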

Q13. In an experiment with 15 observations on $x$, the following results were available :
$$\sum x^2 = 2830 \text{ and } \sum x = 170$$
One observation that was 20 was found to be wrong and was replaced by the correct value 30. What is the corrected variance?
A13. Ans. 78
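The correction only needs the two running sums, as the Python sketch below shows (the function name `corrected_variance` is hypothetical). Replacing 20 by 30 changes $\sum x$ to 180 and $\sum x^2$ to $2830 - 400 + 900 = 3330$.

```python
def corrected_variance(n, sum_x, sum_x2, wrong, right):
    """Variance after replacing one wrong observation, using only the sums."""
    sum_x = sum_x - wrong + right              # corrected sum of x
    sum_x2 = sum_x2 - wrong ** 2 + right ** 2  # corrected sum of x^2
    return sum_x2 / n - (sum_x / n) ** 2

print(corrected_variance(15, 170, 2830, 20, 30))  # 78.0
```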

Q14. Find the variance and standard deviation of the following frequency distribution :

$x_i$ :   2    4    6    8    10   12   14   16
$f_i$ :   4    4    5    15   8    5    4    5

A14. Calculation of variance and standard deviation :

$x_i$    $f_i$    $f_i x_i$    $x_i - \bar{X} = x_i - 9$    $(x_i - \bar{X})^2$    $f_i (x_i - \bar{X})^2$
2        4        8            -7                           49                     196
4        4        16           -5                           25                     100
6        5        30           -3                           9                      45
8        15       120          -1                           1                      15
10       8        80           1                            1                      8
12       5        60           3                            9                      45
14       4        56           5                            25                     100
16       5        80           7                            49                     245
         $N = \sum f_i = 50$    $\sum f_i x_i = 450$                               $\sum f_i (x_i - \bar{X})^2 = 754$

Here $N = 50$, $\sum f_i x_i = 450$
$$\bar{X} = \frac{\sum f_i x_i}{N} = \frac{450}{50} = 9$$
We have $\sum f_i (x_i - \bar{X})^2 = 754$
$$\therefore Var(X) = \frac{1}{N}\left[\sum f_i (x_i - \bar{X})^2\right] = \frac{754}{50} = 15.08$$
$$S.D. = \sqrt{Var(X)} = \sqrt{15.08} = 3.88$$
Alternate method : Use the formula
$$\sigma^2 = \frac{\sum f_i x_i^2}{n} - \left(\frac{\sum f_i x_i}{n}\right)^2$$
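Both methods can be compared in a short Python sketch. The function names are illustrative; the two results should agree up to floating-point rounding.

```python
import math

def variance_definition(values, freqs):
    """sigma^2 = sum(f_i * (x_i - mean)^2) / N."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

def variance_alternate(values, freqs):
    """sigma^2 = sum(f_i * x_i^2) / N - (sum(f_i * x_i) / N)^2."""
    n = sum(freqs)
    s1 = sum(f * x for x, f in zip(values, freqs))
    s2 = sum(f * x * x for x, f in zip(values, freqs))
    return s2 / n - (s1 / n) ** 2

x = [2, 4, 6, 8, 10, 12, 14, 16]
f = [4, 4, 5, 15, 8, 5, 4, 5]
var = variance_definition(x, f)
print(round(var, 2), round(variance_alternate(x, f), 2))  # 15.08 15.08
print(round(math.sqrt(var), 2))                            # 3.88
```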

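• Short cut method : We calculate variance about $\bar{x}$, but the deviations can be taken from any convenient number $A$, provided a correction term is subtracted. With $d = x_i - A$ :
i. Just like $\sigma^2 = \dfrac{\sum (x_i - \bar{x})^2}{N}$, the quantity $\dfrac{\sum d^2}{N}$ is the mean square deviation about $A$; it equals $\sigma^2$ only when $A = \bar{x}$.
ii. Just like $\sigma^2 = \dfrac{\sum x_i^2}{N} - \left(\dfrac{\sum x_i}{N}\right)^2$, similarly
$$\sigma^2 = \frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2 \quad \text{for any } A$$
where $d = x_i - A$ = deviation from the assumed mean $A$
$f$ = frequency of the item (for a frequency distribution, use $\sum f d^2$ and $\sum f d$ in place of $\sum d^2$ and $\sum d$)
$N = \sum f$ = sum of frequencies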
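• Standard deviation for continuous series :
$$\sigma^2 = h^2\left[\frac{\sum f_i \left(\frac{x_i - A}{h}\right)^2}{N} - \left(\frac{\sum f_i \left(\frac{x_i - A}{h}\right)}{N}\right)^2\right] = h^2\left[\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2\right]$$
where $h$ = class width and
$$u_i = \frac{x_i - A}{h}$$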
• The formula below can also be used for continuous series :
$$\sigma^2 = \frac{\sum f_i (x_i - A)^2}{n} - \left(\frac{\sum f_i (x_i - A)}{n}\right)^2$$
(notice that this is the formula above with the unnecessary $h^2$ scaling removed)

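Q15. If $\sum_{i=1}^{18} (x_i - 8) = 9$ and $\sum_{i=1}^{18} (x_i - 8)^2 = 45$, then find the standard deviation of $x_1, x_2, \ldots, x_{18}$.
A15. Ans. $\dfrac{3}{2}$, since $\sigma^2 = \dfrac{45}{18} - \left(\dfrac{9}{18}\right)^2 = \dfrac{9}{4}$.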

Q16. Calculate the mean and standard deviation for the following data :

Wages upto (in Rs.) :   15   30   45   60    75    90    105   120
No. of workers :        12   30   65   107   157   202   222   230

A16. The given numbers of workers are cumulative, so the class frequencies are obtained by successive differences.

Class interval    Cumulative frequency    Mid-value $x_i$    Frequency $f_i$    $u_i = \frac{x_i - 67.5}{15}$    $f_i u_i$    $f_i u_i^2$
0 – 15            12                      7.5                12                 -4                               -48          192
15 – 30           30                      22.5               18                 -3                               -54          162
30 – 45           65                      37.5               35                 -2                               -70          140
45 – 60           107                     52.5               42                 -1                               -42          42
60 – 75           157                     67.5               50                 0                                0            0
75 – 90           202                     82.5               45                 1                                45           45
90 – 105          222                     97.5               20                 2                                40           80
105 – 120         230                     112.5              8                  3                                24           72
                                                             $\sum f_i = 230$                                    $\sum f_i u_i = -105$    $\sum f_i u_i^2 = 733$

Here $A = 67.5$, $h = 15$, $N = 230$, $\sum f_i u_i = -105$ and $\sum f_i u_i^2 = 733$
$$\therefore \text{Mean} = A + h\left(\frac{1}{N}\sum f_i u_i\right) = 67.5 + 15\left(\frac{-105}{230}\right) = 67.5 - 6.85 = 60.65$$
$$\text{and } Var(x) = h^2\left[\frac{1}{N}\sum f_i u_i^2 - \left(\frac{1}{N}\sum f_i u_i\right)^2\right]$$
$$\Rightarrow Var(x) = 225\left[\frac{733}{230} - \left(\frac{-105}{230}\right)^2\right] = 225[3.18696 - 0.20841] \approx 670.17$$
$$\therefore S.D. = \sqrt{Var(x)} = \sqrt{670.17} \approx 25.89$$
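The step-deviation calculation can be reproduced with the Python sketch below. The function name `step_deviation` is made up for illustration; the frequencies are recovered from the cumulative counts by successive differences.

```python
import math

def step_deviation(mids, freqs, A, h):
    """Mean and variance via u_i = (x_i - A) / h for a continuous series."""
    n = sum(freqs)
    u = [(x - A) / h for x in mids]
    s1 = sum(f * ui for f, ui in zip(freqs, u))       # sum of f_i * u_i
    s2 = sum(f * ui * ui for f, ui in zip(freqs, u))  # sum of f_i * u_i^2
    mean = A + h * s1 / n
    var = h * h * (s2 / n - (s1 / n) ** 2)
    return mean, var

cumulative = [12, 30, 65, 107, 157, 202, 222, 230]
freqs = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
mids = [7.5, 22.5, 37.5, 52.5, 67.5, 82.5, 97.5, 112.5]

mean, var = step_deviation(mids, freqs, A=67.5, h=15)
print(round(mean, 2), round(var, 2), round(math.sqrt(var), 2))  # 60.65 670.17 25.89
```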

Q17. Determine the variance of the following frequency distribution.

Class :   0 – 2   2 – 4   4 – 6   6 – 8   8 – 10   10 – 12
$f_i$ :     2       7       12      19      9        1

A17. Let $a = 7$, $h = 2$

Class      $x_i$    $f_i$    $u_i = \frac{x_i - a}{h}$    $f_i u_i$    $f_i u_i^2$
0 – 2      1        2        -3                           -6           18
2 – 4      3        7        -2                           -14          28
4 – 6      5        12       -1                           -12          12
6 – 8      7        19       0                            0            0
8 – 10     9        9        1                            9            9
10 – 12    11       1        2                            2            4
                    $N = 50$                              $\sum f_i u_i = -21$    $\sum f_i u_i^2 = 71$

$$\sigma^2 = h^2\left[\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2\right] = 4\left[\frac{71}{50} - \left(\frac{-21}{50}\right)^2\right] = 4[1.42 - 0.1764] = 4.97$$
Coefficient of standard deviation
• To compare the dispersion of two frequency distributions, the relative measure of standard deviation is computed. It is known as the coefficient of standard deviation and is given by
$$\text{Coefficient of S.D.} = \frac{\sigma}{\bar{x}}, \text{ where } \bar{x} \text{ is the A.M.}$$
• Also, coefficient of variation = coefficient of S.D. × 100 $= \dfrac{\sigma}{\bar{x}} \times 100$
Square deviation
• Root mean square deviation :
$$S = \sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^2}$$
where $A$ is any arbitrary number and $S^2$ is called the mean square deviation.
• Relation between S.D. and root mean square deviation :
If $\sigma$ is the standard deviation and $S$ is the root mean square deviation about $A$, then $S^2 = \sigma^2 + d^2$, where $d = \bar{x} - A$. Obviously, $S^2$ will be least when $d = 0$, i.e. $\bar{x} = A$. Hence the mean square deviation, and consequently the root mean square deviation, is least if the deviations are taken from the mean.
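The relation $S^2 = \sigma^2 + d^2$ can be derived in a few lines, sketched here with $d = \bar{x} - A$ and $N = \sum f_i$. The cross term vanishes because $\sum f_i (x_i - \bar{x}) = 0$.

```latex
\begin{aligned}
S^2 &= \frac{1}{N}\sum f_i\,(x_i - A)^2
     = \frac{1}{N}\sum f_i\,\bigl[(x_i - \bar{x}) + (\bar{x} - A)\bigr]^2 \\
    &= \frac{1}{N}\sum f_i\,(x_i - \bar{x})^2
     + \frac{2(\bar{x} - A)}{N}\sum f_i\,(x_i - \bar{x})
     + (\bar{x} - A)^2 \\
    &= \sigma^2 + 0 + d^2 .
\end{aligned}
```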

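Q18. The mean square deviation of a set of $n$ observations $x_1, x_2, \ldots, x_n$ about a point $c$ is defined as $\frac{1}{n}\sum_{i=1}^{n} (x_i - c)^2$. The mean square deviations about $-2$ and $2$ are 18 and 10 respectively. The standard deviation of this set of observations is
(A) 3 (B) 2 (C) 1 (D) None of these
A18. Ans. (A)
$$\frac{1}{n}\sum (x_i + 2)^2 = 18 \text{ and } \frac{1}{n}\sum (x_i - 2)^2 = 10$$
$$\Rightarrow \sum (x_i + 2)^2 = 18n \text{ and } \sum (x_i - 2)^2 = 10n$$
$$\Rightarrow \sum (x_i + 2)^2 + \sum (x_i - 2)^2 = 28n \text{ and } \sum (x_i + 2)^2 - \sum (x_i - 2)^2 = 8n$$
$$\Rightarrow 2\sum x_i^2 + 8n = 28n \text{ and } 8\sum x_i = 8n$$
$$\Rightarrow \sum x_i^2 = 10n \text{ and } \sum x_i = n$$
$$\Rightarrow \frac{\sum x_i^2}{n} = 10 \text{ and } \frac{\sum x_i}{n} = 1$$
$$\sigma = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2} = \sqrt{10 - (1)^2} = 3$$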

Variance of the combined series


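• If $n_1, n_2$ are the sizes, $\bar{x}_1, \bar{x}_2$ the means and $\sigma_1, \sigma_2$ the standard deviations of two series, then the variance of the combined series is
$$\sigma^2 = \frac{1}{n_1 + n_2}\left[n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)\right]$$
$$\text{where } d_1 = \bar{x}_1 - \bar{x},\quad d_2 = \bar{x}_2 - \bar{x} \quad \text{and} \quad \bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$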

Q19. For 2 data sets, each of size 5, the variances are given to be 4 and 5 and the corresponding means are given to be 2 and 4 respectively. What is the variance of the combined data set?
A19. $n_1 = 5,\ n_2 = 5,\ \sigma_1^2 = 4,\ \sigma_2^2 = 5,\ \bar{x}_1 = 2,\ \bar{x}_2 = 4$
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{10 + 20}{10} = 3$$
$$d_1 = \bar{x}_1 - \bar{x} = -1$$
$$d_2 = \bar{x}_2 - \bar{x} = 1$$
$$\sigma^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}$$
$$\Rightarrow \sigma^2 = \frac{5(4 + 1) + 5(5 + 1)}{10} = \frac{11}{2}$$
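The combined-series formula translates directly into Python. The sketch below uses a hypothetical function name `combined_variance` and takes the variances (not the standard deviations) of the two series as inputs.

```python
def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined mean and variance of two series from their sizes, means, variances."""
    mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - mean                 # shift of series 1 from the combined mean
    d2 = mean2 - mean                 # shift of series 2 from the combined mean
    var = (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / (n1 + n2)
    return mean, var

print(combined_variance(5, 2, 4, 5, 4, 5))  # (3.0, 5.5)
```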
