
Lecture IV Measures of relative positioning

The document discusses measures of relative positioning (quantiles) including quartiles, deciles, and percentiles, explaining how to calculate them for both ungrouped and grouped data. It also covers measures of spread or dispersion such as inter-quartile range, mean absolute deviation, variance, and standard deviation, providing formulas and examples for each. Additionally, it includes exercises for practice on calculating these statistical measures.

Uploaded by

adamsedwin06

4.1 Measures of Relative Positioning (Quantiles)

Quantiles (also called N-tiles) are values that divide a sorted data set into N equal parts.
The commonly used quantiles are quartiles, deciles and percentiles, which divide a sorted
data set into four, ten and one hundred equal parts respectively. To work with percentiles,
deciles and quartiles you need to learn two different tasks: first, how to find the percentile
that corresponds to a particular score, and second, how to find the score in a set of data
that corresponds to a given percentile.

Quartiles
One can divide a set of data using three quartiles: the lower, middle and upper quartiles,
denoted Q1, Q2 and Q3 respectively. The lower quartile Q1 separates the bottom 25% from the
top 75%, Q2 is the median, and Q3 separates the top 25% from the bottom 75%.

The kth quartile is given by: $Q_k = \frac{k}{4}(n+1)$th value, where k = 1, 2, 3.

Deciles and Percentiles


For a set of data you can divide the data using nine deciles ($D_1, D_2, \ldots, D_9$) and 99
percentiles ($P_1, P_2, \ldots, P_{99}$). The kth decile $D_k$ and the kth percentile $P_k$ are
respectively given by

$D_k = \frac{k}{10}(n+1)$th value and $P_k = \frac{k}{100}(n+1)$th value.

NB: For ungrouped data, when the required position is not a whole number we use linear
interpolation between the two neighbouring values to get the required quantile.
For grouped data, the Kth value is given by

$K\text{th value} = LCB + \left(\frac{K - Cf_a}{f}\right) \times i$

where LCB, i and f are the lower class boundary, class interval and frequency of the class
containing the Kth value, and $Cf_a$ is the cumulative frequency of the class above (that is,
immediately before) this particular class.
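The grouped-data formula can be sketched in Python as follows. This is only an illustrative sketch: the function name `grouped_quantile` and its argument names are assumptions, not part of the lecture, and the class table is the one used in Example 2 below (boundaries 0.5-4.5, ..., 20.5-24.5 with class interval 4).

```python
def grouped_quantile(position, lower_boundaries, frequencies, width):
    """Estimate the value at `position` (the K in the formula) for grouped data.

    Hypothetical helper: walks through the classes until the cumulative
    frequency reaches the position, then applies LCB + (K - Cfa) / f * i.
    """
    cumulative = 0  # Cfa: cumulative frequency of the classes before the current one
    for lcb, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= position:
            return lcb + (position - cumulative) / f * width
        cumulative += f
    raise ValueError("position exceeds the total frequency")


lower_boundaries = [0.5, 4.5, 8.5, 12.5, 16.5, 20.5]
frequencies = [10, 14, 20, 16, 12, 8]
n = sum(frequencies)  # 80

q1 = grouped_quantile(1 / 4 * (n + 1), lower_boundaries, frequencies, 4)
d4 = grouped_quantile(4 / 10 * (n + 1), lower_boundaries, frequencies, 4)
p72 = grouped_quantile(72 / 100 * (n + 1), lower_boundaries, frequencies, 4)
print(round(q1, 4), round(d4, 2), round(p72, 2))  # 7.4286 10.18 16.08
```

The three printed values agree with the hand calculation for this table in Example 2.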

Example 1 Find the lower and upper quartiles, the 7th decile and the 85th percentile of the
following data: 3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
Sorted data: 3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15. Here n = 11.
$Q_1 = \frac{1}{4}(11+1)$th = 3rd value = 6. Similarly $Q_3 = \frac{3}{4}(11+1)$th = 9th value = 13.
$D_7 = \frac{7}{10}(11+1)$th = 8.4th value = 8th value + 0.4(9th value − 8th value) = 12 + 0.4(13 − 12) = 12.4 (linear interpolation)
$P_{85} = \frac{85}{100}(11+1)$th = 10.2th value = 10th value + 0.2(11th value − 10th value) = 13 + 0.2(15 − 13) = 13.4 (linear interpolation)
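The position rule with linear interpolation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `quantile` is hypothetical, and clamping positions that fall outside the data to the smallest or largest value is an assumption that the lecture does not discuss.

```python
def quantile(data, k, parts):
    """Return the k-th of `parts` quantiles using the (k/parts)(n + 1)th-value rule."""
    values = sorted(data)
    n = len(values)
    position = k * (n + 1) / parts   # 1-based position, possibly fractional
    whole = int(position)            # rank of the value just below the position
    fraction = position - whole
    if whole < 1:                    # assumption: clamp to the smallest value
        return values[0]
    if whole >= n:                   # assumption: clamp to the largest value
        return values[-1]
    lower, upper = values[whole - 1], values[whole]
    return lower + fraction * (upper - lower)   # linear interpolation


data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
print(quantile(data, 1, 4))      # Q1 -> 6.0
print(quantile(data, 3, 4))      # Q3 -> 13.0
print(quantile(data, 85, 100))   # P85 -> approximately 13.4
```

Other software packages use slightly different position rules, so their quartiles may differ a little from the (n + 1) rule used in these notes.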

Example 2
Estimate the lower quartile, 4th decile and the 72nd percentile for the frequency table below
Class        1-4      5-8      9-12      13-16      17-20      21-24
Frequency    10       14       20        16         12         8

Solution
Boundaries   0.5-4.5  4.5-8.5  8.5-12.5  12.5-16.5  16.5-20.5  20.5-24.5
C.F          10       24       44        60         72         80
$Q_1 = \frac{1}{4}(80+1)$th = 20.25th value $= 4.5 + \left(\frac{20.25 - 10}{14}\right) \times 4 \approx 7.428571$
$D_4 = \frac{4}{10}(80+1)$th = 32.4th value $= 8.5 + \left(\frac{32.4 - 24}{20}\right) \times 4 \approx 10.18$
$P_{72} = \frac{72}{100}(80+1)$th = 58.32th value $= 12.5 + \left(\frac{58.32 - 44}{16}\right) \times 4 \approx 16.08$
Exercise
1) Find the lower and upper quartiles, the 7th decile and the 85th percentile of the data.
a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4 b) 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9
2) The number of goals scored in 15 hockey matches is shown in the table.
No of goals      1  2  3  4  5
No of matches    2  1  5  3  4
Estimate the lower quartile, 4th decile and the 72nd percentile of the number of goals scored.

4) The table shows the heights of 30 students in a class. Calculate an estimate of the upper and
lower quartiles of the height.
Height (cm)      140<x<144  144<x<148  148<x<152  152<x<156  156<x<160  160<x<164
No of students   4          5          8          7          5          1
5) The distance each of 150 people travels to work is shown in the following frequency table.
Distance (km)    0<d<5  5<d<10  10<d<15  15<d<20  20<d<25  25<d<30
No of people     15     28      40       35       20       12

a) Work out what percentage of the 150 people travel more than 20 km to work.
b) Calculate an estimate for the median distance travelled to work by the people.

Properties of measures of Location


(i) They are affected by change of origin. Adding or subtracting a constant k from each and
every observation in a data set causes all the measures of location to shift by the same
amount. That is, New measure = old measure ± k
(ii) They are affected by change of scale. Multiplying each and every observation in a data set
by a constant k scales all the measures of location by the same factor. That is,
New measure = k (old measure)
Example: Consider the three sets of data A, B and C below
Set A: 65, 53, 42, 52, 53    $\bar{x}_A = 53$ and Median A = 53
Set B: 15, 3, -8, 2, 3       $\bar{x}_B = 3$ and Median B = 3
Set C: 45, 9, -24, 6, 9      $\bar{x}_C = 9$ and Median C = 9
• Notice that set B is obtained by subtracting 50 from each and every observation in set A,
and clearly $\bar{x}_B = \bar{x}_A - 50$ and Median B = Median A − 50. Therefore
New measure = old measure ± k. This is referred to as change of origin.
• Set C is obtained by multiplying each and every observation in set B by 3, and
clearly $\bar{x}_C = 3\bar{x}_B$ and Median C = 3 Median B. Thus New measure = k (old measure). This is
referred to as change of scale.
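The two properties can be checked numerically. The short sketch below rebuilds sets B and C from set A and uses the standard-library `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

set_a = [65, 53, 42, 52, 53]
set_b = [x - 50 for x in set_a]   # change of origin: subtract 50
set_c = [3 * x for x in set_b]    # change of scale: multiply by 3

for name, values in (("A", set_a), ("B", set_b), ("C", set_c)):
    print(name, mean(values), median(values))
# A 53 53
# B 3 3
# C 9 9
```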

4.2 Measures of Spread/ Dispersion

Spread is the degree of scatter or variation of the variable about the central value. Examples
of these measures include the range, the inter-quartile range, the quartile deviation (also
called the semi inter-quartile range), the mean absolute deviation, the variance and the
standard deviation.

Inter-Quartile range and Semi Inter-Quartile Range


The inter-quartile range (IQR) is the difference between the upper and lower quartiles. Half of
this difference is called the quartile deviation or the semi inter-quartile range (SIQR), i.e.
$IQR = Q_3 - Q_1$ and $SIQR = \frac{1}{2}(Q_3 - Q_1)$

Mean Absolute Deviation (MAD)


It is the average of the absolute deviations from the mean and it is given by

$MAD = \frac{\sum |x - \bar{x}|}{n}$ for ungrouped data, but for grouped data $MAD = \frac{\sum f|x - \bar{x}|}{n}$

Example 1 Find the quartile deviation and the mean absolute deviation for the data.
3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
Sorted data: 3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15
Recall $Q_1 = 6$ and $Q_3 = 13$ from the earlier calculations.
Thus $SIQR = \frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(13 - 6) = 3.5$
$\bar{x} = \frac{3 + 5 + 6 + 6 + 7 + 9 + 10 + 12 + 13 + 13 + 15}{11} = 9$
$MAD = \frac{\sum |x - \bar{x}|}{n} = \frac{|3-9| + |5-9| + |6-9| + \ldots + |15-9|}{11} = \frac{6 + 4 + 3 + \ldots + 6}{11} = \frac{36}{11} \approx 3.2727$
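As a cross-check, the same two measures can be computed with a short Python sketch. The helper names (`nth_value`, `semi_interquartile_range`, `mean_absolute_deviation`) are illustrative assumptions, and the quartiles use the (n + 1) position rule from Section 4.1.

```python
def nth_value(sorted_values, position):
    """Value at a 1-based, possibly fractional position (linear interpolation).

    Assumes 1 <= position <= len(sorted_values).
    """
    whole = int(position)
    fraction = position - whole
    lower = sorted_values[whole - 1]
    upper = sorted_values[min(whole, len(sorted_values) - 1)]
    return lower + fraction * (upper - lower)


def semi_interquartile_range(data):
    values = sorted(data)
    n = len(values)   # assumes n >= 3 so that both positions lie inside the data
    q1 = nth_value(values, (n + 1) / 4)
    q3 = nth_value(values, 3 * (n + 1) / 4)
    return (q3 - q1) / 2


def mean_absolute_deviation(data):
    centre = sum(data) / len(data)
    return sum(abs(x - centre) for x in data) / len(data)


data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
print(semi_interquartile_range(data))            # 3.5
print(round(mean_absolute_deviation(data), 4))   # 3.2727
```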

Variance and Standard Deviation


Ignoring the negative signs in order to compute MAD is not the only option we have for dealing
with deviations. We can instead square the deviations and then average. The average of the
squared deviations from the mean is called the variance, denoted $s^2$, and it is given by
$s^2 = \frac{1}{n}\sum (x - \bar{x})^2$. A little algebraic simplification of this formula gives $s^2 = \frac{1}{n}\sum x^2 - \bar{x}^2$.
For grouped data $s^2 = \frac{1}{n}\sum f(x - \bar{x})^2 = \frac{1}{n}\sum fx^2 - \bar{x}^2$, where n is the sum of frequencies.
To reverse the squaring of the units we take the square root of the variance. The standard
deviation, denoted s, is the square root of the variance.

Example 1 Find the variance and standard deviation for the data.
3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution
$\bar{x} = \frac{3 + 5 + 6 + 6 + 7 + 9 + 10 + 12 + 13 + 13 + 15}{11} = 9$
$s^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{(3-9)^2 + (5-9)^2 + (6-9)^2 + \ldots + (15-9)^2}{11} = \frac{36 + 16 + 9 + \ldots + 36}{11} = \frac{152}{11} \approx 13.8182$
Standard deviation $s = \sqrt{\text{variance}} = \sqrt{13.8182} \approx 3.7173$.

Example 2 Find the standard deviation of the data: 2, 4, 8, 7, 9, 4, 6, 10, 8, and 5.

Solution

Mean $\bar{x} = \frac{\sum x}{n} = \frac{2 + 4 + 8 + \ldots + 5}{10} = \frac{63}{10} = 6.3$ and $\sum x^2 = 2^2 + 4^2 + 8^2 + \ldots + 5^2 = 455$
Standard deviation $s = \sqrt{\frac{1}{n}\sum x^2 - \bar{x}^2} = \sqrt{45.5 - 6.3^2} \approx 2.4104$.
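The definition of the variance and its simplified form give the same number, which can be confirmed with a few lines of Python. This is only a sketch; the variable names are illustrative.

```python
from math import sqrt

data = [2, 4, 8, 7, 9, 4, 6, 10, 8, 5]
n = len(data)
mean = sum(data) / n

# Definition: average of the squared deviations from the mean.
var_definition = sum((x - mean) ** 2 for x in data) / n
# Simplified form: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in data) / n - mean ** 2

print(round(var_definition, 2), round(var_shortcut, 2))  # 5.81 5.81
print(round(sqrt(var_shortcut), 4))                       # 2.4104
```

The simplified form needs only the running totals of x and x squared, which is why it is convenient for hand and calculator work.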

Example 3 Estimate the mean, and standard deviation for the frequency table below:
Class   5-9   10-14   15-19   20-24   25-29   30-34   35-39
Freq    5     12      32      40      16      9       6
Solution
Mid pts (x)   7     12     17     22      27      32     37     Total
Freq (f)      5     12     32     40      16      9      6      120
fx            35    144    544    880     432     288    222    2545
fx^2          245   1728   9248   19360   11664   9216   8214   59675

Mean $\bar{x} = \frac{\sum fx}{n} = \frac{2545}{120} \approx 21.2083$ and $\sum fx^2 = 59675$
Standard deviation $s = \sqrt{\frac{1}{n}\sum fx^2 - \bar{x}^2} = \sqrt{\frac{59675}{120} - 21.2083^2} \approx 6.8919$.
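The same grouped-data calculation can be sketched in Python, using class midpoints to stand in for the observations in each class. The variable names are illustrative assumptions.

```python
from math import sqrt

class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
frequencies = [5, 12, 32, 40, 16, 9, 6]

midpoints = [(low + high) / 2 for low, high in class_limits]
n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / n
std_dev = sqrt(sum_fx2 / n - mean ** 2)

print(n, sum_fx, sum_fx2)                  # 120 2545.0 59675.0
print(round(mean, 4), round(std_dev, 4))   # 21.2083 6.8919
```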

Exercise
1) Find the quartile deviation, the mean absolute deviation and the standard deviation of the
data: a) 9, 3, 4, 2, 9, 5, 8, 4, 7, 4 b) 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9
2) The number of goals scored in 20 hockey matches is shown in the table
No of goals      1  2  3  4  5
No of matches    2  5  6  3  4
Estimate the quartile deviation, the mean absolute deviation and the standard deviation of
the number of goals scored.
3) Consider the frequency table below and estimate the quartile deviation, the mean absolute
deviation and the standard deviation.
Class   8-12   13-17   18-22   23-27   28-32   33-37
Freq    3      10      12      9       5       1

4) The table shows the heights of 30 students in a class. Calculate an estimate of the quartile
deviation, the mean absolute deviation and the standard deviation of the height.
Height (cm)      140<x<144  144<x<148  148<x<152  152<x<156  156<x<160  160<x<164
No of students   4          5          8          7          5          1
5) The grouped frequency table gives information about the distance each of 150 people
travel to work.
Distance (km)    0<d<5  5<d<10  10<d<15  15<d<20  20<d<25  25<d<30
No of people     15     28      40       35       20       12
Calculate an estimate for the quartile deviation and the standard deviation of the distance
travelled to work by the people.

Properties of Measures of Spread


a) They are not affected by change of origin. Adding or subtracting a constant from each and
every observation in a data set does not affect any measure of spread. That is,
New measure = old measure
b) They are affected by change of scale. Multiplying each and every observation in a data set
by a constant k scales all the measures of spread by the same factor, except the
variance, which is scaled by the square of that constant,
i.e. New measure = k (old measure) but New variance = $k^2 \times$ old variance
Example: Consider the three sets of data A, B and C below (variances use the divisor n, as in
the formula for $s^2$ above)
Set A: 65, 53, 42, 52, 53    Range = 23, MAD A = 4.8 and Variance A = 53.2
Set B: 15, 3, -8, 2, 3       Range = 23, MAD B = 4.8 and Variance B = 53.2
Set C: 45, 9, -24, 6, 9      Range = 69, MAD C = 14.4 and Variance C = 478.8

• Notice that set B is obtained by subtracting 50 from each and every observation in set A,
and clearly MAD B = MAD A and Variance B = Variance A.
Therefore a change of origin has no effect, i.e. New measure = old measure.
• Set C is obtained by multiplying each and every observation in set B by 3, and
clearly MAD C = 3 × MAD B and Variance C = $3^2 \times$ Variance B. Thus
New measure = k (old measure) and New variance = $k^2 \times$ old variance
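These properties can also be checked numerically. The sketch below uses the standard-library `statistics.pvariance`, which divides by n; the helper names `value_range` and `mad` are illustrative assumptions. The ratio of the variances is 9 whichever divisor (n or n − 1) is used.

```python
from statistics import mean, pvariance


def value_range(values):
    return max(values) - min(values)


def mad(values):
    centre = mean(values)
    return sum(abs(x - centre) for x in values) / len(values)


set_a = [65, 53, 42, 52, 53]
set_b = [x - 50 for x in set_a]   # change of origin
set_c = [3 * x for x in set_b]    # change of scale, k = 3

for name, values in (("A", set_a), ("B", set_b), ("C", set_c)):
    print(name, value_range(values), mad(values))
# A 23 4.8
# B 23 4.8
# C 69 14.4

variance_ratio = pvariance(set_c) / pvariance(set_b)
print(round(variance_ratio, 6))   # 9.0, i.e. k squared
```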

Mean and Standard Deviation Using a Calculator


a) With the calculator on, press the mode key to display COMP SD REG, with the numbers 1 2 3 respectively below.
b) Press 2 to select SD for statistical data.
c) Enter the data one by one, pressing M+ after every value entered. The screen will show
the number of observations that have been fully entered. E.g. 31 M+ 52 M+ 29 M+ 60 M+ 58 M+
d) Pressing shift then 1 gives $\sum x^2$, $\sum x$ and $n$, labelled 1 2 3. Typing 1 then = gives the
value of $\sum x^2$. Typing 2 then = gives the value of $\sum x$.
e) Pressing shift then 2 gives $\bar{x}$, $x\sigma_n$ and $x\sigma_{n-1}$, labelled 1 2 3.
These are the mean, the uncorrected standard deviation (divisor n) and the corrected standard
deviation (divisor n − 1). For the entered values $\bar{x} = 46$ and the corrected standard
deviation is $x\sigma_{n-1} = 14.91643$ (the uncorrected value is $x\sigma_n \approx 13.3417$).
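For readers without this calculator, the same quantities can be obtained with Python's standard-library `statistics` module. This is a sketch, not a description of the calculator's internals: `pstdev` divides by n (uncorrected) and `stdev` divides by n − 1 (corrected).

```python
from statistics import mean, pstdev, stdev

values = [31, 52, 29, 60, 58]

# n, sum of x, and sum of x squared (the totals the calculator stores)
print(len(values), sum(values), sum(x * x for x in values))   # 5 230 11470
print(mean(values))               # 46
print(round(pstdev(values), 5))   # divisor n:     13.34166
print(round(stdev(values), 5))    # divisor n - 1: 14.91643
```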

4.3 Assumed Mean and the Coding Formula

If the observations are so large that the direct computation of totals is tedious,
we can take one of the observations as the working (assumed) mean. Let A be any
guessed or assumed arithmetic mean and let $d_i = x_i - A$ be the deviations of $x_i$ from A;
then the arithmetic mean and variance are respectively given by
$\bar{x} = A + \frac{1}{n}\sum fd = A + \bar{d}$ and $S^2 = \frac{1}{n}\sum fd^2 - \left(\frac{1}{n}\sum fd\right)^2 = \frac{1}{n}\sum fd^2 - \bar{d}^2$

where A is the assumed mean, which is generally taken as the midpoint of the middle class or
of the class with the largest frequency.
Remark: In most cases the deviation d of $x_i$ from A is a multiple of the class interval i, i.e.
$d_i = t_i \times i \Rightarrow t_i = \frac{d_i}{i} = \frac{x_i - A}{i}$.
In these cases we can use t rather than d in the computation. The above formulae reduce to
$\bar{x} = A + \frac{i}{n}\sum ft = A + i\bar{t}$ and $S^2 = i^2\left[\frac{1}{n}\sum ft^2 - \left(\frac{1}{n}\sum ft\right)^2\right] = i^2\left[\frac{1}{n}\sum ft^2 - \bar{t}^2\right]$ respectively.
The latter formulae are referred to as coding formulae.

Example
Using coding formulae, find the mean and standard deviation of the following data
Class   6340-6349   6350-6359   6360-6369   6370-6379   6380-6389
Freq    2           3           7           5           3

Solution
Let A = 6364.5 and i = 10, so that $t = \frac{x - 6364.5}{10}$.
Class        Mid pts   Freq   t    ft   ft^2
6340-6349    6344.5    2      -2   -4   8
6350-6359    6354.5    3      -1   -3   3
6360-6369    6364.5    7      0    0    0
6370-6379    6374.5    5      1    5    5
6380-6389    6384.5    3      2    6    12
Total        -         20     -    4    28
$\bar{x} = A + \frac{i}{n}\sum ft = 6364.5 + \frac{10}{20}(4) = 6366.5$
$S = i\sqrt{\frac{1}{n}\sum ft^2 - \left(\frac{1}{n}\sum ft\right)^2} = 10\sqrt{\frac{28}{20} - \left(\frac{4}{20}\right)^2} \approx 11.6619$
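The coding calculation for this example can be reproduced with a short Python sketch. The variable names (`assumed_mean`, `width`, `t`) are illustrative assumptions, with `width` playing the role of the class interval i.

```python
from math import sqrt

midpoints = [6344.5, 6354.5, 6364.5, 6374.5, 6384.5]
frequencies = [2, 3, 7, 5, 3]
assumed_mean = 6364.5   # A: midpoint of the middle class
width = 10              # i: class interval

# Coded values t = (x - A) / i
t = [(x - assumed_mean) / width for x in midpoints]
n = sum(frequencies)
sum_ft = sum(f * ti for f, ti in zip(frequencies, t))
sum_ft2 = sum(f * ti * ti for f, ti in zip(frequencies, t))

mean = assumed_mean + width * sum_ft / n
std_dev = width * sqrt(sum_ft2 / n - (sum_ft / n) ** 2)

print(sum_ft, sum_ft2)           # 4.0 28.0
print(mean, round(std_dev, 4))   # 6366.5 11.6619
```

Working with the small coded values t keeps the totals manageable by hand, and the results are converted back to the original units only at the end.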
Exercise
1) Consider the following frequency distribution.
Classes     5410-5414   5415-5419   5420-5424   5425-5429   5430-5434
Frequency   7           11          14          13          5
Estimate the mean and standard deviation using the coding formula.
2) Using the coding formula, find the mean and standard deviation for the data below

Class   6710-6720  6720-6730  6730-6740  6740-6750  6750-6760  6760-6770  6770-6780  6780-6790  6790-6800  6800-6810
Freq    4          5          7          13         16         11         9          6          4          3
3) The table shows the speed distribution of vehicles on Thika Superhighway on a typical
day. Using coding formulae, find the mean speed and the standard deviation of the speeds.
Speed (km/hr)    2260-2269  2270-2279  2280-2289  2290-2299  2300-2309  2310-2319  2320-2329  2330-2339  2340-2349
No of vehicles   138        163        325        541        427        214        110        52         30
4) The following table shows a frequency distribution of the weekly wages of 65 employees
at P&R Company. Using the coding formula, find the mean and standard deviation of the wages.

Wages              $250.00-259.99  $260.00-269.99  $270.00-279.99  $280.00-289.99  $290.00-299.99  $300.00-309.99  $310.00-319.99
No. of employees   8               10              16              14              10              5               2
