Module I - Statistical Measures questions
SYLLABUS:
Measures of central tendency: Arithmetic Mean, Median and Mode –
Measures of variation: Range, Mean deviation, Standard deviation
and Coefficient of variation – Correlation (Discrete Data): Karl
Pearson’s Correlation coefficient, Spearman’s Rank Correlation –
Regression lines (Discrete Data).
Arithmetic Mean
Problems:
1. Find the arithmetic mean of the values 7, 5, 3, 4, 6, 4, 5
2. Andy has grades of 84, 65, and 76 on three tests. What grade
must he obtain on the next test to have an average of exactly
80 for the four tests?
3. If the mean of five observations x, x + 4, x + 6, x + 8 and x +
12 is 16, find the value of x.
10. The marks of the students who passed a class test are given
below. If the average mark of all fifty students was 5.16, find
the average mark of the students who failed.
Marks (x)    No. of students (f)
4            8
5            10
6            9
7            6
8            4
9            3
11. The average marks secured by 50 students was 44. Later on,
it was discovered that a score of 36 had been misread as 56. Find the
correct average marks secured by the students.
Weighted Arithmetic Mean:
If $x_1, x_2, \dots, x_n$ are the observations and $w_1, w_2, \dots, w_n$ are the
assigned weights, the weighted arithmetic mean $\bar{x}$ is given by
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
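The weighted mean formula can be checked numerically. The following is a minimal Python sketch; the function name `weighted_mean` and the marks and weights used are illustrative assumptions, not data from these notes.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Illustrative data (assumed): marks in three subjects weighted by credits.
marks = [70, 80, 90]
credits = [2, 3, 5]
print(weighted_mean(marks, credits))  # 83.0
```

With all weights equal, the function reduces to the ordinary arithmetic mean.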
If $\bar{x}_1$ is the A.M. of $n_1$ observations and $\bar{x}_2$ is the A.M. of another $n_2$
observations, then the A.M. of the combined set is
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
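As a quick illustration of the combined-mean formula, here is a short Python sketch. The function name `combined_mean` and the group sizes and means are assumptions chosen for the example.

```python
def combined_mean(n1, mean1, n2, mean2):
    """A.M. of two groups pooled together: (n1*mean1 + n2*mean2) / (n1 + n2)."""
    if n1 + n2 == 0:
        raise ValueError("at least one group must be non-empty")
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Illustrative data (assumed): 20 observations with mean 50, 30 with mean 60.
print(combined_mean(20, 50, 30, 60))  # 56.0
```

The combined mean always lies between the two group means and is pulled towards the mean of the larger group.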
14. There are two branches of a company employing 280 and 320
persons respectively. If the A.M. of the salaries of the two branches
are Rs. 750 and Rs.937.5 respectively, find the A.M. of the salaries
of the employees of the company as a whole.
15. The average salary of male employees in a firm was Rs.5200
and that of female employees was Rs.4200. If the mean salary of
all the employees was Rs.5000, find the percentage of male and
female employees.
Median
Median is the value of the middle item when the items are arranged
in ascending or descending order of magnitude. It is a positional
average.
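The definition above can be sketched in Python as follows. The function name `median` and the sample values are illustrative; for an even number of items the sketch assumes the usual convention of averaging the two middle items, which the definition above does not spell out.

```python
def median(values):
    """Middle item of the sorted data (mean of the two middle items if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Assumed convention for even n: average the two central items.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 5, 1]))  # 5
print(median([7, 3, 9, 5]))     # 6.0
```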
Mode
Mode is the value which occurs most frequently in a set of
observations.
For a discrete frequency distribution, the mode is the value of x
corresponding to the maximum frequency.
For a continuous frequency distribution, the modal class is the class
having the maximum frequency, and
$$\text{Mode} = l + \frac{(f_1 - f_0) \times c}{2f_1 - f_0 - f_2}$$
where
$l$ = lower limit of the modal class
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class preceding (just above, in the table) the modal class
$f_2$ = frequency of the class succeeding (just below, in the table) the modal class
$c$ = class width
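The grouped-data mode formula can be sketched in Python as below. The function name `grouped_mode`, the treatment of a missing neighbouring class as frequency 0, and the sample table are assumptions made for illustration; equal class widths are assumed throughout.

```python
def grouped_mode(lower_limits, frequencies, class_width):
    """Mode = l + (f1 - f0) * c / (2*f1 - f0 - f2) for equal-width classes."""
    # Index of the modal class (the first one, if several share the maximum).
    k = max(range(len(frequencies)), key=lambda i: frequencies[i])
    f1 = frequencies[k]
    # Assumption: a missing neighbouring class is given frequency 0.
    f0 = frequencies[k - 1] if k > 0 else 0
    f2 = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 = 0")
    return lower_limits[k] + (f1 - f0) * class_width / denominator


# Illustrative table (assumed): classes 0-10, 10-20, 20-30, 30-40.
print(grouped_mode([0, 10, 20, 30], [5, 12, 20, 8], 10))  # 24.0
```

Here the modal class is 20-30, so l = 20, f1 = 20, f0 = 12, f2 = 8 and the mode is 20 + (8 × 10)/20 = 24.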
1. For the six values 140, 220, 90, 180, 140, 200, find: (a)
the mean (b) the median (c) the mode
6. The following table gives the length of life of 150 electric lamps.
Calculate the mode.
Life (hours) Frequency
0-400 4
400-800 12
800-1200 40
1200-1600 41
1600-2000 27
2000-2400 13
2400-2800 9
2800-3200 4
$$\text{Mean} - \text{Median} = \frac{1}{3}(\text{Mean} - \text{Mode})$$
i.e., Mode = 3 Median – 2 Mean (for an asymmetrical distribution)
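The empirical relation gives a quick estimate of the mode when the mean and median are known. A small Python sketch follows; the function name `empirical_mode` and the input values are illustrative assumptions.

```python
def empirical_mode(mean, median):
    """Estimate the mode from the empirical relation Mode = 3*Median - 2*Mean."""
    return 3 * median - 2 * mean


# Illustrative values (assumed): mean 50, median 52.
mode = empirical_mode(50, 52)
print(mode)  # 56
# Check the first form of the relation: Mean - Median = (Mean - Mode) / 3
print(50 - 52 == (50 - mode) / 3)  # True
```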
Measures of Dispersion
The degree to which numerical data tend to spread about the
average value, is called variation or dispersion of data. The
measures of dispersion are:
(i) Range
(ii) Mean deviation
(iii) Standard deviation
(iv) Quartile deviation
Range
Range is the difference between the greatest and the least of the
given values.
Range = L – S
For continuous frequency distribution, take L = upper limit of the
highest class and S = lower limit of the lowest class.
Coefficient of range
$$\text{Coefficient of range} = \frac{L - S}{L + S}$$
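Both measures are simple to compute. The Python sketch below uses hypothetical function names (`data_range`, `coefficient_of_range`) and made-up observations.

```python
def data_range(values):
    """Range = L - S, the greatest value minus the least value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    if largest + smallest == 0:
        raise ValueError("coefficient of range is undefined when L + S = 0")
    return (largest - smallest) / (largest + smallest)


# Illustrative data (assumed).
observations = [12, 7, 25, 18, 5]
print(data_range(observations))                      # 20
print(round(coefficient_of_range(observations), 3))  # 0.667
```

For a continuous frequency distribution, pass the lower limit of the lowest class and the upper limit of the highest class as the two values.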
1. If the marks of 5 students are 45, 92, 26, 81 and 72 , find the
range.
2. The profits (in '000 Rs.) of a company for the last 8 years are
given below. Calculate the range and coefficient of range.
Year:    1975  1976  1977  1978  1979  1980  1981  1982
Profit:  40    30    80    100   120   90    200   230
3. Calculate the range of the prices of gold from Monday to Saturday
of a week.
Mon Tue Wed Thu Fri Sat
1160 1158 1170 1142 1175 1187
4. Calculate range and coefficient of range:
Daily wages (Rs.):  50-60  60-70  70-80  80-90  90-100  100-110  110-120
No. of labourers:   60     45     45     40     35      30       30
Mean Deviation (from the mean)
5. Find the M.D. from the mean of the numbers: 4800, 4600, 4400,
4200, 4000
6. Calculate the mean deviation from the mean: 100.500, 100.250,
100.375, 100.625, 100.750, 100.125, 100.375, 100.625, 100.500,
100.125
7. Calculate the mean deviation from the mean, for the following
data
Marks (x)    No. of students (f)
5            5
15           8
25           15
35           16
45           6
8. Calculate the mean deviation from the mean, for the following
data:
Marks:            0-10  10-20  20-30  30-40  40-50  50-60  60-70
No. of students:  6     5      8      15     7      6      3
Standard Deviation
1. Calculate the mean and standard deviation of the heights (in cm)
of 10 students given below:
160, 160, 161, 162, 163, 163, 163, 164, 164, 170
Coefficient of variation
$$\text{Coefficient of variation} = \frac{\text{SD}}{\text{Mean}} \times 100$$
It is a pure number independent of the units of measurement.
To compare the variability of two series: The series having
greater C.V. is more variable than the other. The series having
less C.V. is said to be more consistent or more homogeneous
than the other.
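The comparison rule can be sketched in Python with the standard `statistics` module. The function name `coefficient_of_variation` and both series are illustrative; the sketch assumes the S.D. is computed with divisor n (`statistics.pstdev`), which is one common convention and is not confirmed by these notes.

```python
import statistics


def coefficient_of_variation(values):
    """C.V. = (SD / Mean) * 100, using the population S.D. (divisor n)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return statistics.pstdev(values) / mean * 100


# Illustrative series (assumed).
series_a = [10, 12, 14]
series_b = [20, 30, 40]
cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))
print("A is more consistent" if cv_a < cv_b else "B is more consistent")
```

Because the C.V. is a ratio, multiplying every value in a series by the same positive constant leaves it unchanged.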
Note: $\text{Coefficient of S.D.} = \frac{\text{SD}}{\text{Mean}}$
1. The prices of two commodities over 10 weeks are given below.
Find out which price shows less variation.
A: 54 55 53 56 52 52 58 49 50 51
B: 108 107 105 106 105 103 102 104 104 101
Correlation
The given data are plotted on a graph in the form of dots, i.e., for
each pair of X and Y values we put a dot. Looking at the scatter of the
points, we form an idea of whether the two variables are related. The
more the plotted points scatter over the chart, the lower the degree of
relationship between the two variables. The nearer the points come to
a line, the higher the degree of relationship. If the points lie in a
haphazard manner, there is no relationship between the variables.
Note:
$\frac{1}{n}\sum (x - \bar{x})(y - \bar{y})$ is called the covariance between X and Y, written Cov(X, Y).
$$r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$
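The computational formula for the correlation coefficient translates directly into code. This is a minimal Python sketch; the function name `pearson_r` and the paired data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """r = (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_xx - sum_x ** 2) * math.sqrt(n * sum_yy - sum_y ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Illustrative paired data (assumed).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 4))  # 0.7746
```

For this data n∑xy − ∑x∑y = 30, n∑x² − (∑x)² = 50 and n∑y² − (∑y)² = 30, so r = 30/√1500 ≈ 0.7746.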
Regression
Regression is the measure of the average relationship between
two or more variables in terms of the original units of the data.
It provides a mechanism for predicting or forecasting.
Note:
1. $b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$, $\quad b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$
2. Both the regression lines pass through the point (𝑥̅ , 𝑦̅) .
Hence, by solving the two regression equations, we can find
the means of X and Y.
3. Both the regression coefficients will have the same sign;
either both will be positive or both will be negative.
4. Correlation coefficient is the geometric mean between the
regression coefficients.
i.e., $r_{xy} = \pm\sqrt{b_{xy} \cdot b_{yx}}$
If both the regression coefficients are positive, r will be
positive; if both the regression coefficients are negative, r will
be negative.
5. Regression coefficients are independent of the change of
origin, but not of scale.
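The notes above can be checked numerically. The Python sketch below uses hypothetical function names (`regression_coefficients`, `r_from_coefficients`) and made-up data; it computes both regression coefficients, recovers r as their signed geometric mean, and confirms that shifting the origin of x leaves the coefficients unchanged.

```python
import math


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) from the computational formulas in note 1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_xx - sum_x ** 2)
    b_xy = numerator / (n * sum_yy - sum_y ** 2)
    return b_yx, b_xy


def r_from_coefficients(b_yx, b_xy):
    """r is the geometric mean of the coefficients, taking their common sign."""
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)


# Illustrative paired data (assumed).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy = regression_coefficients(xs, ys)
print(b_yx, b_xy)
print(round(r_from_coefficients(b_yx, b_xy), 4))

# Regression line of y on x passes through the means: y - ybar = b_yx * (x - xbar)
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
print(round(y_bar + b_yx * (6 - x_bar), 2))  # estimate of y at x = 6

# Change of origin: adding 100 to every x does not alter the coefficients.
print(regression_coefficients([x + 100 for x in xs], ys) == (b_yx, b_xy))
```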
Angle between regression lines
If 𝜃 is the angle between the two regression lines, then
$$\tan\theta = \left(\frac{1 - r^2}{r}\right) \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$
Note:
1. When r = 0, $\tan\theta = \infty$, so $\theta = \frac{\pi}{2}$,
i.e., the two regression lines are perpendicular to each
other. Their equations are $y = \bar{y}$ and $x = \bar{x}$.
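The angle formula can be sketched in Python as follows. The function name `angle_between_regression_lines` and the inputs are illustrative; the sketch assumes the acute angle is wanted (it takes the absolute value of tan θ) and treats r = 0 as the perpendicular case described in the note.

```python
import math


def angle_between_regression_lines(r, sigma_x, sigma_y):
    """Angle (radians) from tan(theta) = ((1 - r^2)/r) * sx*sy / (sx^2 + sy^2)."""
    if r == 0:
        # tan(theta) is infinite: the lines y = ybar and x = xbar are perpendicular.
        return math.pi / 2
    tan_theta = ((1 - r ** 2) / r) * (sigma_x * sigma_y) / (sigma_x ** 2 + sigma_y ** 2)
    # Assumption: report the acute angle, so the sign of r is ignored.
    return math.atan(abs(tan_theta))


# Illustrative values (assumed): r = 0.5 and equal standard deviations.
theta = angle_between_regression_lines(0.5, 1.0, 1.0)
print(round(math.degrees(theta), 2))  # 36.87
print(angle_between_regression_lines(1.0, 2.0, 3.0))  # 0.0, the lines coincide
```

As |r| increases towards 1, the angle shrinks to zero and the two regression lines coincide.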
3. For the following data, find the most likely price at Madras
corresponding to the price 70 at Bombay and that at Bombay
corresponding to the price 68 at Madras
                 Madras   Bombay
Average price    65       67
S.D. of price    0.5      3.5
S.D. of the difference between the prices at Madras and
Bombay is 3.1
Exercises:
1. Calculate Karl Pearson’s Coefficient of correlation between price
and supply of a commodity from the following data:
Price (Rs.): 17 18 19 20 21 22 23 24 25 26
Supply (Kg): 38 37 38 33 32 33 34 29 26 23
2. Compute the coefficient of correlation between the corresponding
values of x and y in the following table:
X: 2 4 5 6 8 11
Y: 18 12 10 8 7 5
3. Calculate Karl Pearson’s correlation coefficient from the data:
Roll No.:            1   2   3   4   5   6   7   8   9   10
Marks in Economics:  78  36  98  25  75  82  90  62  65  39
Marks in Maths:      84  51  91  60  68  62  86  58  53  47
4. Calculate the coefficient of correlation from the following data:
X: 9 8 7 6 5 4 3 2 1
Y: 15 16 14 13 11 12 10 8 9
14. You are given the following information about advertising and
sales:
15. The equations of the two lines of regression for a bivariate data
are Y = 10(X – 5) and X = 2.5(Y – 14). Find the arithmetic
means of X and Y as well as the coefficient of correlation between
X and Y.