Lecture IV Measures of relative positioning
Measures of relative positioning are values which divide a sorted data set into N equal parts.
They are also known as quantiles or N-tiles. The commonly used quantiles are quartiles, deciles
and percentiles, which divide a sorted data set into four, ten and one hundred equal parts
respectively. To work with percentiles, deciles and quartiles you need to learn two different
tasks: first, how to find the percentile that corresponds to a particular score, and second, how
to find the score in a set of data that corresponds to a given percentile.
Quartiles
A set of data has three quartiles: the lower, middle and upper quartiles, denoted Q1, Q2 and Q3
respectively. The lower quartile Q1 separates the bottom 25% from the top 75%, Q2 is the
median, and Q3 separates the top 25% from the bottom 75%.
Example 1 Find the lower and upper quartiles, the 7th decile and the 85th percentile of the
following data. 3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
Sorted data: 3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15 Here n=11
Q1 = $\tfrac{1}{4}(11 + 1)$th value = 3rd value = 6. Similarly, Q3 = $\tfrac{3}{4}(11 + 1)$th value = 9th value = 13.
D7 = $\tfrac{7}{10}(11 + 1)$th value = 8.4th value = 8th value + 0.4(9th value − 8th value) = 12 + 0.4(13 − 12) = 12.4 (by linear interpolation).
P85 = $\tfrac{85}{100}(11 + 1)$th value = 10.2th value = 10th value + 0.2(11th value − 10th value) = 13 + 0.2(15 − 13) = 13.4 (by linear interpolation).
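The position rule used in this example (take the k(n + 1)/N-th value of the sorted data and interpolate linearly between neighbours) can be sketched in Python as follows. The function name `quantile_by_position` and the clamping of positions outside 1..n are illustrative assumptions, not part of the notes.

```python
def quantile_by_position(data, k, parts):
    """Return the k-th of `parts` quantiles using the k(n+1)/parts position rule.

    Hypothetical helper: k=1, parts=4 gives Q1; k=7, parts=10 gives D7, etc.
    """
    values = sorted(data)
    n = len(values)
    pos = k * (n + 1) / parts          # 1-based position, possibly fractional
    pos = min(max(pos, 1), n)          # assumption: clamp to the available values
    whole = int(pos)                   # whole part of the position
    frac = pos - whole                 # fractional part used for interpolation
    if whole >= n:
        return float(values[-1])
    below, above = values[whole - 1], values[whole]
    return below + frac * (above - below)


data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
q1 = quantile_by_position(data, 1, 4)       # lower quartile
q3 = quantile_by_position(data, 3, 4)       # upper quartile
d7 = quantile_by_position(data, 7, 10)      # 7th decile
p85 = quantile_by_position(data, 85, 100)   # 85th percentile
print(q1, q3, d7, p85)
```

Passing the quantile as a pair (k, parts) rather than a decimal fraction keeps whole-number positions such as 3 and 9 exact.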
Example 2
Estimate the lower quartile, 4th decile and the 72nd percentile for the frequency table below
Class 1-4 5-8 9-12 13-16 17-20 21-24
frequency 10 14 20 16 12 8
Solution
Boundaries   0.5-4.5   4.5-8.5   8.5-12.5   12.5-16.5   16.5-20.5   20.5-24.5
C.F          10        24        44         60          72          80
Q1 = $\tfrac{1}{4}(80 + 1)$th value = 20.25th value = $4.5 + 4\left(\frac{20.25 - 10}{14}\right) \approx 7.428571$
D4 = $\tfrac{4}{10}(80 + 1)$th value = 32.4th value = $8.5 + 4\left(\frac{32.4 - 24}{20}\right) = 10.18$
P72 = $\tfrac{72}{100}(80 + 1)$th value = 58.32th value = $12.5 + 4\left(\frac{58.32 - 44}{16}\right) = 16.08$
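The grouped-data estimate above (locate the class containing the required position, then interpolate within it using the class boundaries and cumulative frequencies) might be coded as below. This is a sketch under the same k(n + 1)/N position rule; the name `grouped_quantile` and the fallback for positions beyond the last class are assumptions.

```python
def grouped_quantile(boundaries, freqs, k, parts):
    """Estimate the k-th of `parts` quantiles from a grouped frequency table.

    boundaries: (lower, upper) class boundaries in increasing order
    freqs:      the matching class frequencies
    """
    n = sum(freqs)
    pos = k * (n + 1) / parts              # position of the required value
    cum = 0                                # cumulative frequency below the class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and pos <= cum + f:
            # linear interpolation inside the class that contains `pos`
            return lower + (upper - lower) * (pos - cum) / f
        cum += f
    return float(boundaries[-1][1])        # assumption: cap at the top boundary


boundaries = [(0.5, 4.5), (4.5, 8.5), (8.5, 12.5),
              (12.5, 16.5), (16.5, 20.5), (20.5, 24.5)]
freqs = [10, 14, 20, 16, 12, 8]

q1 = grouped_quantile(boundaries, freqs, 1, 4)
d4 = grouped_quantile(boundaries, freqs, 4, 10)
p72 = grouped_quantile(boundaries, freqs, 72, 100)
print(round(q1, 6), round(d4, 2), round(p72, 2))
```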
Exercise
1) Find the lower and upper quartiles, the 7th decile and the 85th percentile of the data:
a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4 b) 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9
2) The number of goals scored in 15 hockey matches is shown in the table.
No of goals 1 2 3 4 5
No of matches 2 1 5 3 4
Estimate the lower quartile, 4th decile and the 72nd percentile of the number of goals scored.
4) The table shows the heights of 30 students in a class. Calculate an estimate of the upper and
lower quartiles of the height.
Height (cm)      140<x<144  144<x<148  148<x<152  152<x<156  156<x<160  160<x<164
No of students   4          5          8          7          5          1
5) The distance each of 150 people travels to work is shown in the following frequency table.
Distance (km)   0<d<5   5<d<10   10<d<15   15<d<20   20<d<25   25<d<30
No of people    15      28       40        35        20        12
a) Work out what percentage of the 150 people travel more than 20 km to work.
b) Calculate an estimate for the median distance travelled to work by the people.
(ii) They are affected by change of scale. Multiplying every observation in a data set by a
constant K multiplies all the measures of location by the same constant. That is,
New measure = K (old measure).
Example: Consider the three sets of data A, B and C below.
Set A: 65, 53, 42, 52, 53    $\bar{x}_A = 53$ and Median A = 53
Set B: 15, 3, -8, 2, 3       $\bar{x}_B = 3$ and Median B = 3
Set C: 45, 9, -24, 6, 9      $\bar{x}_C = 9$ and Median C = 9
• Notice that set B is obtained by subtracting 50 from every observation in set A, and
clearly $\bar{x}_B = \bar{x}_A - 50$ and Median B = Median A − 50. Therefore
New measure = old measure ± k (here k = 50 is subtracted). This is referred to as change of origin.
• Set C is obtained by multiplying every observation in set B by 3, and clearly
$\bar{x}_C = 3\bar{x}_B$ and Median C = 3 Median B. Thus New measure = K (old measure). This is
referred to as change of scale.
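The effect of a change of origin and a change of scale on the mean and median can be checked numerically with sets A, B and C. The sketch below uses Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

set_a = [65, 53, 42, 52, 53]
set_b = [x - 50 for x in set_a]   # change of origin: subtract 50
set_c = [3 * x for x in set_b]    # change of scale: multiply by 3

# Measures of location shift by the same constant under a change of origin ...
origin_ok = (mean(set_b) == mean(set_a) - 50
             and median(set_b) == median(set_a) - 50)

# ... and are multiplied by the same constant under a change of scale.
scale_ok = (mean(set_c) == 3 * mean(set_b)
            and median(set_c) == 3 * median(set_b))

print(origin_ok, scale_ok)
```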
Spread is the degree of scatter or variation of the variable about the central value. Examples of
measures of spread include the range, the inter-quartile range, the quartile deviation (also called
the semi inter-quartile range, SIQR), the mean absolute deviation (MAD), the variance and the
standard deviation.
For ungrouped data MAD = $\frac{\sum |x - \bar{x}|}{n}$, but for grouped data MAD = $\frac{\sum f\,|x - \bar{x}|}{n}$.
Example 1 Find the quartile deviation and the mean absolute deviation for the data.
3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
Sorted data: 3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15
Recall from the earlier calculations that Q1 = 6 and Q3 = 13.
Thus SIQR = $\tfrac{1}{2}(Q_3 - Q_1) = \tfrac{1}{2}(13 - 6) = 3.5$
$\bar{x} = \frac{3 + 5 + 6 + 6 + 7 + 9 + 10 + 12 + 13 + 13 + 15}{11} = 9$
MAD = $\frac{\sum |x - \bar{x}|}{n} = \frac{|3 - 9| + |5 - 9| + |6 - 9| + \dots + |15 - 9|}{11} = \frac{6 + 4 + 3 + \dots + 6}{11} = \frac{36}{11} \approx 3.2727$
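The two measures in this example can be computed with a few lines of Python. This is a minimal sketch; the function names are illustrative, and the quartiles are passed in as already found rather than recomputed.

```python
def quartile_deviation(q1, q3):
    """Semi inter-quartile range: half the distance between the quartiles."""
    return (q3 - q1) / 2


def mean_absolute_deviation(data):
    """Average absolute distance of the observations from their mean."""
    n = len(data)
    centre = sum(data) / n
    return sum(abs(x - centre) for x in data) / n


data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
siqr = quartile_deviation(6, 13)          # Q1 = 6 and Q3 = 13 from the example
mad = mean_absolute_deviation(data)
print(siqr, round(mad, 4))
```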
Example 1 Find the variance and standard deviation for the data.
3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
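$\bar{x} = \frac{3 + 5 + 6 + 6 + 7 + 9 + 10 + 12 + 13 + 13 + 15}{11} = 9$
$S^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{(3 - 9)^2 + (5 - 9)^2 + (6 - 9)^2 + \dots + (15 - 9)^2}{11} = \frac{36 + 16 + 9 + 9 + 4 + 0 + 1 + 9 + 16 + 16 + 36}{11} = \frac{152}{11} \approx 13.8182$
Standard deviation $s = \sqrt{\text{variance}} = \sqrt{13.8182} \approx 3.7173$.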
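As a check on the arithmetic, the variance can be computed both from the definition (mean squared deviation) and from the equivalent shortcut (mean of squares minus square of the mean). The sketch below divides by n throughout, as these notes do; the function names are illustrative.

```python
from math import sqrt


def variance_from_definition(data):
    """Mean of the squared deviations from the mean (divisor n)."""
    n = len(data)
    centre = sum(data) / n
    return sum((x - centre) ** 2 for x in data) / n


def variance_shortcut(data):
    """Equivalent form: mean of the squares minus the square of the mean."""
    n = len(data)
    centre = sum(data) / n
    return sum(x * x for x in data) / n - centre ** 2


data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
var_def = variance_from_definition(data)
var_short = variance_shortcut(data)
sd = sqrt(var_def)
print(var_def, var_short, sd)
```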
Mean $\bar{x} = \frac{\sum x}{n} = \frac{2 + 4 + 8 + \dots + 5}{10} = \frac{63}{10} = 6.3$ and $\sum x^2 = 2^2 + 4^2 + 8^2 + \dots + 5^2 = 455$
Standard deviation $s = \sqrt{\frac{1}{n}\sum x^2 - \bar{x}^2} = \sqrt{45.5 - 6.3^2} \approx 2.4104$.
Example 3 Estimate the mean, and standard deviation for the frequency table below:
Class 5-9 10-14 15-19 20-24 25-29 30-34 35-39
freq 5 12 32 40 16 9 6
Solution
Mid pts (x)   7     12     17     22      27      32     37     Total
Freq (f)      5     12     32     40      16      9      6      120
fx            35    144    544    880     432     288    222    2545
fx^2          245   1728   9248   19360   11664   9216   8214   59675
Mean $\bar{x} = \frac{\sum fx}{n} = \frac{2545}{120} \approx 21.2083$ and $\sum fx^2 = 59675$
Standard deviation $s = \sqrt{\frac{1}{n}\sum fx^2 - \bar{x}^2} = \sqrt{\frac{59675}{120} - 21.2083^2} \approx 6.8919$.
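The grouped-data calculation in this example can be sketched as below, treating every observation in a class as sitting at the class midpoint. The function name `grouped_mean_sd` is an illustrative assumption.

```python
from math import sqrt


def grouped_mean_sd(midpoints, freqs):
    """Estimate the mean and standard deviation from class midpoints and frequencies."""
    n = sum(freqs)
    total_fx = sum(f * x for x, f in zip(midpoints, freqs))
    total_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    mean = total_fx / n
    sd = sqrt(total_fx2 / n - mean ** 2)   # divisor n, as in the notes
    return mean, sd


midpoints = [7, 12, 17, 22, 27, 32, 37]
freqs = [5, 12, 32, 40, 16, 9, 6]
mean, sd = grouped_mean_sd(midpoints, freqs)
print(round(mean, 4), round(sd, 4))
```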
Exercise
1) Find the quartile deviation, the mean absolute deviation and the standard deviation of the
data: a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4 b) 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9
2) The number of goals scored in 20 hockey matches is shown in the table
No of goals 1 2 3 4 5
No of matches 2 5 6 3 4
Estimate the quartile deviation, the mean absolute deviation and the standard deviation of
the number of goals scored.
3) Consider the frequency table below and estimate the quartile deviation, the mean absolute
deviation and the standard deviation.
Class 8-12 13-17 18-22 23-27 28-32 33-37
Freq 3 10 12 9 5 1
4) The table shows the heights of 30 students in a class. Calculate an estimate of the quartile
deviation, the mean absolute deviation and the standard deviation of the height.
Height (cm)      140<x<144  144<x<148  148<x<152  152<x<156  156<x<160  160<x<164
No of students   4          5          8          7          5          1
5) The grouped frequency table gives information about the distance each of 150 people
travels to work.
Distance (km)   0<d<5   5<d<10   10<d<15   15<d<20   20<d<25   25<d<30
No of people    15      28       40        35        20        12
Calculate an estimate for the quartile deviation and the standard deviation of the distance
travelled to work by the people.
• Notice that set B is obtained by subtracting 50 from every observation in set A, and
clearly MAD B = MAD A and Variance B = Variance A. Therefore a change of origin has no
effect on these measures of spread, i.e. New measure = old measure.
• Set C is obtained by multiplying every observation in set B by 3, and clearly
MAD C = 3 MAD B and Variance C = $3^2$ Variance B. Thus, under a change of scale,
New MAD = K (old MAD) and New variance = $K^2$ (old variance).
Typing 1 then = gives the value of $\sum x^2$.
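The behaviour of the MAD and the variance under a change of origin and a change of scale, noted above for sets A, B and C (A: 65, 53, 42, 52, 53), can be verified numerically. The sketch below is illustrative; `mad` is a hypothetical helper and `pvariance` is the divisor-n variance from Python's standard `statistics` module.

```python
from statistics import pvariance


def mad(data):
    """Mean absolute deviation about the mean (divisor n)."""
    n = len(data)
    centre = sum(data) / n
    return sum(abs(x - centre) for x in data) / n


set_a = [65, 53, 42, 52, 53]
set_b = [x - 50 for x in set_a]   # change of origin
set_c = [3 * x for x in set_b]    # change of scale with K = 3

mad_a, mad_b, mad_c = mad(set_a), mad(set_b), mad(set_c)
var_a, var_b, var_c = pvariance(set_a), pvariance(set_b), pvariance(set_c)

# Change of origin leaves both measures unchanged; change of scale
# multiplies the MAD by K and the variance by K squared.
print(mad_a, mad_b, mad_c)
print(var_a, var_b, var_c)
```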
If the observations are so large that computing the totals directly is tedious, we can take
one of the observations as the working (assumed) mean. Let A be any guessed or assumed
arithmetic mean and let $d_i = x_i - A$ be the deviations of $x_i$ from A. Then the
arithmetic mean and variance are respectively given by
$\bar{x} = A + \frac{1}{n}\sum fd = A + \bar{d}$ and $S^2 = \frac{1}{n}\sum fd^2 - \left(\frac{1}{n}\sum fd\right)^2 = \frac{1}{n}\sum fd^2 - \bar{d}^2$
where A is the assumed mean, which is generally taken as the midpoint of the middle class or of
a class where the frequency is large.
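A short derivation may help to see why the assumed-mean formulae hold. It is a sketch that assumes each observation is written as x = A + d and that n = Σf:

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum f x
         = \frac{1}{n}\sum f (A + d)
         = A\,\frac{\sum f}{n} + \frac{1}{n}\sum f d
         = A + \bar{d}, \\
S^2 &= \frac{1}{n}\sum f (x - \bar{x})^2
     = \frac{1}{n}\sum f (d - \bar{d})^2
     = \frac{1}{n}\sum f d^2 - 2\bar{d}\,\frac{1}{n}\sum f d + \bar{d}^2
     = \frac{1}{n}\sum f d^2 - \bar{d}^2 .
\end{align*}
```

The second line uses $x - \bar{x} = (A + d) - (A + \bar{d}) = d - \bar{d}$, so the variance does not depend on the choice of A.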
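Remark: In most cases the deviations $d_i$ of $x_i$ from A are multiples of the class interval $i$, i.e.
$d_i = i\,t_i$, so that $t_i = \frac{d_i}{i} = \frac{x_i - A}{i}$.
(Here the subscript $i$ indexes the classes, while the unsubscripted $i$ is the class interval.)
In these cases we can use t rather than d in the computation. The above formulae reduce to
$\bar{x} = A + \frac{i}{n}\sum ft = A + i\,\bar{t}$ and $S^2 = i^2\left[\frac{1}{n}\sum ft^2 - \left(\frac{1}{n}\sum ft\right)^2\right] = i^2\left[\frac{1}{n}\sum ft^2 - \bar{t}^2\right]$
respectively.
The latter formulae are referred to as coding formulae.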
Example
Using coding formulae, find the mean and standard deviation of the following data
Class 6340-6349 6350-6359 6360-6369 6370-6379 6380-6389
Freq 2 3 7 5 3
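Solution
Let A = 6364.5 and $t = \frac{x - 6364.5}{i}$, where the class interval is i = 10.
Class       Mid pts   Freq   t    ft   ft^2
6340-6349   6344.5    2      -2   -4   8
6350-6359   6354.5    3      -1   -3   3
6360-6369   6364.5    7      0    0    0
6370-6379   6374.5    5      1    5    5
6380-6389   6384.5    3      2    6    12
Total       -         20     -    4    28
$\bar{x} = A + \frac{i}{n}\sum ft = 6364.5 + \frac{10}{20}(4) = 6366.5$
$S = i\sqrt{\frac{1}{n}\sum ft^2 - \left(\frac{1}{n}\sum ft\right)^2} = 10\sqrt{\frac{28}{20} - \left(\frac{4}{20}\right)^2} \approx 11.6619$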
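The coding method used in this solution can be sketched in Python as follows. The function name `coded_mean_sd` and its parameter names are illustrative assumptions; the assumed mean A and class interval i are supplied by the caller.

```python
from math import sqrt


def coded_mean_sd(midpoints, freqs, assumed_mean, width):
    """Mean and standard deviation via the coding formulae.

    Each midpoint x is coded as t = (x - A) / i, where A is the assumed
    mean and i is the class interval (width).
    """
    n = sum(freqs)
    t = [(x - assumed_mean) / width for x in midpoints]
    t_bar = sum(f * ti for ti, f in zip(t, freqs)) / n
    t_sq_bar = sum(f * ti * ti for ti, f in zip(t, freqs)) / n
    mean = assumed_mean + width * t_bar
    sd = width * sqrt(t_sq_bar - t_bar ** 2)
    return mean, sd


midpoints = [6344.5, 6354.5, 6364.5, 6374.5, 6384.5]
freqs = [2, 3, 7, 5, 3]
mean, sd = coded_mean_sd(midpoints, freqs, assumed_mean=6364.5, width=10)
print(mean, round(sd, 4))
```

Choosing a different assumed mean changes the coded values t but not the final mean or standard deviation, which is what makes the method convenient for hand calculation.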
Exercise
1) Consider the following frequency distribution.
classes 5410-5414 5415-5419 5420-5424 5425-5429 5430-5434
frequency 7 11 14 13 5
Estimate the mean and standard deviation using coding formula
2) Using coding formula, find the mean and standard deviation for the data below
Class   6710-6720   6720-6730   6730-6740   6740-6750   6750-6760   6760-6770   6770-6780   6780-6790   6790-6800   6800-6810
Freq    4           5           7           13          16          11          9           6           4           3
3) The table shows the speed distribution of vehicles on Thika Superhighway on a typical
day. Using coding formulae, find the mean speed and the standard deviation of the speeds.
Speed (km/hr)    2260-2269   2270-2279   2280-2289   2290-2299   2300-2309   2310-2319   2320-2329   2330-2339   2340-2349
No of vehicles   138         163         325         541         427         214         110         52          30
4) The following table shows a frequency distribution of the weekly wages of 65 employees
at P&R Company. Using coding formula find the mean & standard deviation of the wages.