Chapter THREE
Measures of Central Tendency
Definition:
Central tendency is a descriptive summary of a dataset through a single value that reflects the center of the data distribution.
The 3 most common measures of central tendency are the mode, median, and mean.
• Mode: the most frequent value.
• Median: the middle number in an ordered data set.
• Mean: the sum of all values divided by the total number of values.
Different measures of central tendency can be easily demonstrated by the below chart:
Mean
Mean is the most commonly used measure of central tendency. There are different types of
mean: the arithmetic mean (AM), geometric mean (GM) and harmonic mean (HM).
Arithmetic Mean
Definition:
The Arithmetic Mean (AM), also called the average, is the ratio of the sum of all observations
to the total number of observations.
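Arithmetic Mean Formula for ungrouped data
The population mean is represented by the Greek letter mu (μ). It is given by the formula
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
For a sample, the mean is written $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.
The total number of observations is either N or n, depending on whether the data are a population or a sample.
Arithmetic Mean Formula for grouped data
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
Where:
x_i = value (or mid-point) of the i-th class
f_i = frequency of the i-th class
k = number of classes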
Properties of Arithmetic Mean
• The sum of the deviations of the items from their arithmetic mean is always zero, i.e.
∑(x – x̄) = 0.
• The sum of the squared deviations of the items from the arithmetic mean (A.M) is a
minimum, i.e. it is less than the sum of the squared deviations of the items from any
other value.
• If each item in the series is replaced by the mean, then the sum of these
replacements will be equal to the sum of the original items.
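The three properties of the arithmetic mean can be checked numerically. The sketch below uses a small illustrative data set (2, 4, 9), which is an assumption and not taken from the text; the helper names are also hypothetical.

```python
# Illustrative data set (an assumption, not from the text).
values = [2, 4, 9]
mean = sum(values) / len(values)  # arithmetic mean of the data


def sum_of_deviations(data, centre):
    """Sum of (x - centre) over all items."""
    return sum(x - centre for x in data)


def sum_of_squared_deviations(data, centre):
    """Sum of (x - centre)^2 over all items."""
    return sum((x - centre) ** 2 for x in data)


# Property 1: deviations from the mean add up to zero.
print(sum_of_deviations(values, mean))

# Property 2: squared deviations are smallest when taken about the mean.
print(sum_of_squared_deviations(values, mean))
print(sum_of_squared_deviations(values, mean - 1))
print(sum_of_squared_deviations(values, mean + 1))

# Property 3: replacing every item by the mean keeps the total unchanged.
print(len(values) * mean == sum(values))
```

Trying other centres in place of `mean - 1` and `mean + 1` should always give a larger sum of squares than the mean does.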
Demerits of Arithmetic Mean
• It is affected by extreme items, i.e. very small and very large values.
• It can rarely be identified by inspection.
• In some cases, the A.M. does not correspond to any actual item. For example, the average
number of patients admitted to a hospital may be 10.7 per day.
• The arithmetic mean is not suitable for extremely asymmetrical distributions.
Example:
The one-way train fares of five selected BS students are recorded as follows (birr): 10, 5, 15, 8 and
12. Calculate the arithmetic mean of these data.
Solution:
$\sum_{i=1}^{5} x_i = 10 + 5 + 15 + 8 + 12 = 50$
$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{10 + 5 + 15 + 8 + 12}{5} = \frac{50}{5} = 10$ birr
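The same calculation can be sketched in Python. The function name `arithmetic_mean` is a hypothetical helper chosen for illustration; the data are the five train fares from the example.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("at least one value is required")
    return sum(values) / len(values)


fares = [10, 5, 15, 8, 12]  # one-way train fares in birr
print(arithmetic_mean(fares))  # 10.0
```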
Example:
Find the arithmetic mean of the following frequency distribution of the ages of first year students of a
particular college:
Age (Years)          13   14   15   16   17
Number of Students    2    5   13    7    3
Solution:
The given distribution is grouped data; the variable involved is the age of first year students,
while the number of students represents the frequencies.
Age in years (x)   Number of Students (f)   fx
13                 2                        26
14                 5                        70
15                 13                       195
16                 7                        112
17                 3                        51
Total              ∑f = 30                  ∑fx = 454
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{454}{30} = 15.13$ years
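As a rough check of the frequency table, the mean of a frequency distribution can be computed as ∑fx / ∑f. The helper name `frequency_mean` below is an assumption made for this sketch.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_f = sum(frequencies)
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / total_f


ages = [13, 14, 15, 16, 17]   # x
students = [2, 5, 13, 7, 3]   # f
print(round(frequency_mean(ages, students), 2))  # 15.13
```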
Example:
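The following data show the distance covered by 100 people to perform their routine jobs. Find the mean distance.
Distance (Km)      0 - 10   10 - 20   20 - 30   30 - 40
Number of People   10       20        40        30
Solution:
The given distribution is grouped data; the variable involved is the distance covered, while the
number of people represents the frequencies. The mid-point of each class is taken as x.
Distance (Km)   Mid-point (x)   Number of People (f)   fx
0 - 10          5               10                     50
10 - 20         15              20                     300
20 - 30         25              40                     1000
30 - 40         35              30                     1050
Total                           ∑f = 100               ∑fx = 2400
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2400}{100} = 24$ km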
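For class intervals, each class is represented by its mid-point before the grouped mean is taken. A minimal sketch, assuming the classes are given as (lower, upper) pairs; the name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes, frequencies):
    """Mean of grouped data, using each class mid-point as x."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


distance_classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # km
people = [10, 20, 40, 30]
print(grouped_mean(distance_classes, people))  # 24.0
```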
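Harmonic Mean
Definition:
The Harmonic Mean (HM) is defined as the reciprocal of the arithmetic mean of the reciprocals
of the given data values.
Since the harmonic mean is the reciprocal of the arithmetic mean of the reciprocals, the formula for the
harmonic mean “HM” of n values x1, x2, x3, …, xn is given as follows:
$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$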
Calculating the weighted harmonic mean is similar to calculating the simple harmonic mean; the simple
harmonic mean is the special case in which all the weights are equal to 1. If a set of weights w1, w2, w3,
…, wn is associated with the values x1, x2, x3, …, xn, the weighted harmonic mean can be calculated
using the following formula:
$HM = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}} = \frac{\sum_{i=1}^{n} w_i}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \frac{w_3}{x_3} + \cdots + \frac{w_n}{x_n}}$
If the frequencies “f” are taken as the weights “w”, then the harmonic mean is calculated as
follows:
If x1, x2, x3, …, xn are n items with corresponding frequencies f1, f2, f3, …, fn, then the
grouped harmonic mean is
$HM = \frac{\sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} \frac{f_i}{x_i}} = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \frac{f_3}{x_3} + \cdots + \frac{f_n}{x_n}}$
where $N = \sum_{i=1}^{n} f_i$ is the total frequency.
Note:
1. f values are considered as weights
2. For continuous series, mid-value = (Lower limit + Upper limit)/2 and is taken as x
Harmonic Mean Uses
The main uses of harmonic means are as follows:
• The harmonic mean is used in finance to average multiples such as the price-earnings
ratio.
• It is also used by market technicians to identify patterns such as Fibonacci
sequences.
Merits and Demerits of Harmonic Mean
Merits:
• It is rigidly defined.
• It is based on all the observations of a series, i.e. it cannot be computed by ignoring any item of
a series.
• It is capable of further algebraic treatment.
• It provides a more reliable result when the results to be achieved are the same for the
various means adopted.
• It gives the highest weight to the smallest item of a series.
• It can also be calculated when a series contains negative values.
• It produces a skewed distribution of a normal one.
• It produces a curve straighter than that of the A.M and G.M.
Demerits:
• The harmonic mean is greatly affected by the values of the extreme items.
• It cannot be calculated if any of the items is zero.
• The calculation of the harmonic mean is cumbersome, as it involves the reciprocals of
the numbers.
Example:
Solution:
For the n = 5 values 1, 2, 5, 7 and 9:
$HM = \frac{5}{\frac{1}{1} + \frac{1}{2} + \frac{1}{5} + \frac{1}{7} + \frac{1}{9}} = \frac{5}{1.95} = 2.56$
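Example:
Solution:
$HM = \frac{N}{\sum \frac{f_i}{x_i}}$
Here $N = \sum f = 2 + 3 + 3 + 2 = 10$ and x = 2, 4, 8, 16
$HM = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \frac{f_3}{x_3} + \frac{f_4}{x_4}} = \frac{10}{\frac{2}{2} + \frac{3}{4} + \frac{3}{8} + \frac{2}{16}} = \frac{10}{2.25} = 4.44$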
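Example:
Find the harmonic mean of the following data:
x   1   3   5   7   9    11
f   2   4   6   8   10   12
Solution:
The calculation for the harmonic mean is shown in the table below:
x       f        1/x      f/x
1       2        1        2
3       4        0.333    1.332
5       6        0.2      1.2
7       8        0.143    1.144
9       10       0.1111   1.111
11      12       0.091    1.092
        N = 42            Σ f/x = 7.879
$HM = \frac{N}{\sum \frac{f}{x}} = \frac{42}{7.879} = 5.33$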
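The simple and frequency-weighted harmonic means can be sketched with one function. This is a minimal illustration, not a library routine: `harmonic_mean` is a hypothetical helper, and passing no weights is assumed to mean that every weight equals 1.

```python
def harmonic_mean(values, weights=None):
    """Weighted harmonic mean: sum(w) / sum(w / x).

    With no weights every item gets weight 1, which gives the
    simple harmonic mean n / sum(1 / x).
    """
    if weights is None:
        weights = [1] * len(values)
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when an item is zero")
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


x = [1, 3, 5, 7, 9, 11]
f = [2, 4, 6, 8, 10, 12]
print(round(harmonic_mean(x, f), 2))  # 5.33
```

The zero check mirrors the demerit listed above: a reciprocal cannot be taken when an item is zero.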
Geometric Mean
Definition:
If x1, x2, …, xn are the observations, then the Geometric Mean is defined as:
$G.M = \sqrt[n]{x_1 \times x_2 \times x_3 \times \cdots \times x_n} = (x_1 \times x_2 \times x_3 \times \cdots \times x_n)^{\frac{1}{n}}$
If we have a series of n positive values with repeated values such as x1, x2, x3, …, xk which are
repeated f1, f2, f3, …, fk times respectively, then the geometric mean will become:
$G.M = \sqrt[n]{x_1^{f_1} \times x_2^{f_2} \times x_3^{f_3} \times \cdots \times x_k^{f_k}} = (x_1^{f_1} \times x_2^{f_2} \times x_3^{f_3} \times \cdots \times x_k^{f_k})^{\frac{1}{n}}$
Where n = f1 + f2 + f3 + ⋯ + fk
Merits and Demerits of Geometric Mean
Merits:
• It is rigidly defined.
• It is based on all the items.
• It is capable of further algebraic treatment.
• It gives less weight to large items and more to small items.
Demerits:
• It is difficult to compute.
• It is not easy to understand.
• If there are negative values in the series, it cannot be computed.
Example of Suitable Average
• Geometric mean is considered as the best average in the construction of index numbers.
• When large weights are to be given to small items and small weights to large items,
geometric mean is very useful.
• It is useful in averaging rates, ratios and percentages.
Example:
Solution:
GM = 6
Example:
Solution:
We know that,
$GM = (x_1 \times x_2 \times x_3 \times \cdots \times x_n)^{\frac{1}{n}}$
GM = 13.92
Example:
$GM = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right)$
= Antilog (8.925/5)
= Antilog 1.785
= 60.95
Example:
Find the geometric mean of the following grouped data for the frequency distribution of weights.
Solution:
Weights of ear heads (g)   No of ear heads (f)   Mid x   Log x   f log x
60-80                      22                    70      1.845   40.59
80-100                     38                    90      1.954   74.25
100-120                    45                    110     2.041   91.85
120-140                    35                    130     2.114   73.99
140-160                    20                    150     2.176   43.52
Total                      160                                   324.2
$GM = \text{Antilog}\left(\frac{\sum f \log x_i}{n}\right) = \text{Antilog}\left(\frac{324.2}{160}\right)$
GM = Antilog ( 2.02625 )
GM = 106.23
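The logarithm method for grouped data can be sketched as follows. The function name `grouped_geometric_mean` is an assumption; the class mid-points and frequencies are those of the ear-head table. Because the code uses full-precision logarithms rather than three-decimal table values, its result is about 106.3, slightly above the 106.23 obtained by hand.

```python
import math


def grouped_geometric_mean(midpoints, frequencies):
    """GM = antilog( sum(f * log10(x)) / sum(f) ) for positive x."""
    if any(x <= 0 for x in midpoints):
        raise ValueError("geometric mean needs strictly positive values")
    n = sum(frequencies)
    log_sum = sum(f * math.log10(x) for x, f in zip(midpoints, frequencies))
    return 10 ** (log_sum / n)


mid_x = [70, 90, 110, 130, 150]      # class mid-points in grams
ear_heads = [22, 38, 45, 35, 20]     # frequencies
gm = grouped_geometric_mean(mid_x, ear_heads)
print(round(gm, 1))
```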
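Median
Definition:
Median is a statistical measure that determines the middle value of a dataset listed in
ascending order (i.e., from smallest to largest value).
Ungrouped data
Step 1: Arrange the data in ascending order.
Step 2: Find the number of observations in the given set of data. It is denoted by n.
Step 3: If n is odd, the median equals the $\left(\frac{n+1}{2}\right)^{th}$ observation.
Step 4: If n is even, then the median is the average of the two middle observations:
$\text{Median} = \frac{\left(\frac{n}{2}\right)^{th}\text{ term} + \left(\frac{n}{2}+1\right)^{th}\text{ term}}{2}$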
Grouped data
Step 1: Make a table with 4 columns. First column for the class interval, second column
for frequency, f, the third column for cumulative frequency, cf and fourth
column for class boundary if necessary.
Step 2: Write the class intervals and the corresponding frequency in the respective
columns.
Step 3: Obtain the total frequency N and find N/2.
Step 4: Find the class whose cumulative frequency is just greater than the value N/2.
This class is known as the median class.
Step 5: To calculate the median, use the formula
$\text{Median} = L + \left(\frac{\frac{N}{2} - cf}{f}\right) \times w$
Where:
L = lower limit of the median class
N = total number of observations (total frequency)
cf = cumulative frequency of the class preceding the median class
f = frequency of the median class
w = class size (assuming classes are of equal size)
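Example:
Find the median of the following data:
4, 17, 77, 25, 22, 23, 92, 82, 40, 24, 14, 12, 67, 23, 29
Solution:
Arrange the data in ascending order:
4, 12, 14, 17, 22, 23, 23, 24, 25, 29, 40, 67, 77, 82, 92
Since the total number of observations is odd (n = 15), the formula to calculate the median is:
$\text{Median} = \left(\frac{n+1}{2}\right)^{th}\text{ term}$
$\text{Median} = \left(\frac{15+1}{2}\right)^{th}\text{ term} = \left(\frac{16}{2}\right)^{th}\text{ term} = 8^{th}\text{ term} = 24$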
Example:
Rahul’s family drove through 7 states on summer vacation. The prices of gasoline differ from
state to state. Calculate the median of the gasoline cost.
Solution:
Since the total number of observations is odd (n = 7), the median is the
$\left(\frac{7+1}{2}\right)^{th} = 4^{th}$ term of the ordered prices.
Hence, the median of the gasoline cost is 1.84. There are three states with greater gasoline costs
and three with smaller ones.
Example:
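Find the median of the values 5, 7, 10, 20, 16, 12
Solution:
Arrange the values in ascending order: 5, 7, 10, 12, 16, 20. Here n = 6, which is even.
$\text{Median} = \text{value of the } \left(\frac{n+1}{2}\right)^{th}\text{ item} = \left(\frac{6+1}{2}\right)^{th}\text{ item} = 3.5^{th}\text{ item}$
The 3.5th item is the average of the 3rd and 4th items:
$\text{Median} = \frac{10+12}{2} = 11$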
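The odd and even cases for ungrouped data can be sketched in a few lines. The helper name `median_ungrouped` is hypothetical; the two data sets are the ones used in the examples above.

```python
def median_ungrouped(values):
    """Middle value of the sorted data; average of the two middle
    values when the number of observations is even."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one value is required")
    middle = n // 2
    if n % 2 == 1:                    # odd: the ((n+1)/2)th term
        return ordered[middle]
    # even: average of the (n/2)th and (n/2 + 1)th terms
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_ungrouped([5, 7, 10, 20, 16, 12]))  # 11.0
print(median_ungrouped([4, 17, 77, 25, 22, 23, 92, 82, 40, 24, 14, 12, 67, 23, 29]))  # 24
```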
Example:
The following are the marks scored by the students in the Summative Assessment exam.
Solution:
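Median position = (50/2)th value
= 25th value
Median class = 30 - 40
Substitute L = 30, N = 50, cf = 24, f = 10 and w = 10.
$\text{Median} = 30 + \left(\frac{\frac{50}{2} - 24}{10}\right) \times 10 = 30 + \left(\frac{25 - 24}{10}\right) \times 10 = 31$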
Example:
The following table gives the weekly expenditure of 200 families. Find the median of the weekly
expenditure.
Solution:
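Median position = (200/2)th value
= 100th value
Substitute L = 2000, N = 200, cf = 74, f = 54 and w = 1000.
$\text{Median} = 2000 + \left(\frac{\frac{200}{2} - 74}{54}\right) \times 1000 = 2000 + \left(\frac{100 - 74}{54}\right) \times 1000 = 2481.5$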
Example:
Group       60 – 64   65 – 69   70 – 74   75 – 79   80 – 84   85 – 89
Frequency   1         5         9         12        7         2
Solution:
Group     f    Class Boundary   Cumulative Frequency
60 – 64   1    59.5 – 64.5      1
65 – 69   5    64.5 – 69.5      6
70 – 74   9    69.5 – 74.5      15
75 – 79   12   74.5 – 79.5      27
80 – 84   7    79.5 – 84.5      34
85 – 89   2    84.5 – 89.5      36
Median = value of the $\left(\frac{n}{2}\right)^{th}$ observation
= value of the $\left(\frac{36}{2}\right)^{th}$ observation
= value of the 18th observation
From the column of cumulative frequency cf, we find that the 18th observation lies in the class
75-79, whose class boundaries are 74.5 – 79.5.
Now,
n = Total frequency = 36, L = 74.5, cf = 15, f = 12 and w = 5
∴ $\text{Median} = 74.5 + \left(\frac{18 - 15}{12}\right) \times 5 = 74.5 + 1.25 = 75.75$
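The grouped-median steps can be sketched as a short function. This is an illustration under stated assumptions: `grouped_median` is a hypothetical name, classes are passed as (lower boundary, upper boundary) pairs in ascending order, and the median class is taken as the first class whose cumulative frequency exceeds n/2. The data are the class boundaries and frequencies of the table above.

```python
def grouped_median(boundaries, frequencies):
    """Median = L + ((n/2 - cf) / f) * w for the median class."""
    n = sum(frequencies)
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if cf + f > half:             # first class whose cf exceeds n/2
            width = upper - lower
            return lower + (half - cf) / f * width
        cf += f
    raise ValueError("frequencies must contain at least one positive value")


class_boundaries = [(59.5, 64.5), (64.5, 69.5), (69.5, 74.5),
                    (74.5, 79.5), (79.5, 84.5), (84.5, 89.5)]
freq = [1, 5, 9, 12, 7, 2]
print(grouped_median(class_boundaries, freq))  # 75.75
```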
Example:
Solution:
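L = 20, N = 55 + x, cf = 30, w = 10, f = x
$\text{Median} = L + \left(\frac{\frac{N}{2} - cf}{f}\right) \times w$
Substitute.
$24 = 20 + \left(\frac{\frac{55+x}{2} - 30}{x}\right) \times 10$
$4 = \left(\frac{\frac{55+x}{2} - 30}{x}\right) \times 10$
$4 = \left(\frac{\frac{55+x-60}{2}}{x}\right) \times 10$
$4 = \left(\frac{x-5}{2x}\right) \times 10$
$4 = \frac{10x - 50}{2x}$
$8x = 10x - 50$
$2x = 50$
$x = 25$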
Mode
Definition:
The mode is the value that appears most frequently in a data set. A set of data may have one
mode, more than one mode, or no mode at all.
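Grouped data
Step 1: Prepare the frequency distribution table in such a way that its first column
consists of the class intervals and the second column the respective frequencies.
Step 2: Determine the class of maximum frequency by inspection. This class is called
the modal class.
Step 3: Calculate the mode using the formula
$\text{Mode} = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times w$
Where,
L = lower limit of the modal class
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
w = class size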
Example:
Find the mode of the given data set: 3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48.
Solution:
15 is the mode, since it appears more times in the set than any other number.
Example:
Find the mode of the data set 4, 4, 4, 9, 15, 15, 15, 27, 37, 48.
Solution:
A data set can have more than one mode if two or more values occur with the same highest
frequency.
Hence, here both 4 and 15 are modes of the set.
Example:
Find the mode of the data set 3, 6, 9, 16, 27, 37, 48.
Solution:
If no value or number in a data set appears more than once, then the set has no mode.
Hence, for the set 3, 6, 9, 16, 27, 37, 48, there is no mode.
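The three cases above (one mode, more than one mode, no mode) can be sketched with a frequency count. The function name `modes` is hypothetical, and returning an empty list when no value repeats is a convention chosen here to match the text's "no mode" case.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency.

    An empty list means no value repeats, i.e. there is no mode.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48]))  # [15]
print(modes([4, 4, 4, 9, 15, 15, 15, 27, 37, 48]))      # [4, 15]
print(modes([3, 6, 9, 16, 27, 37, 48]))                 # []
```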
Example:
Solution:
The maximum class frequency is 12 and the class interval corresponding to this frequency is
20 – 30. Thus, the modal class is 20 – 30.
$\text{Mode} = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times w = 20 + \left(\frac{12 - 5}{(2 \times 12) - 5 - 8}\right) \times 10 = 26.36$
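The grouped-mode formula can be sketched as a small function. The name `grouped_mode` and its parameter names are assumptions made for illustration; the arguments follow the symbols L, w, f0, f1 and f2 of the formula.

```python
def grouped_mode(lower, width, f0, f1, f2):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * w.

    lower: lower limit of the modal class (L)
    width: class size (w)
    f0, f1, f2: frequencies of the preceding, modal and succeeding classes
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 = f0 + f2")
    return lower + (f1 - f0) / denominator * width


# Modal class 20 - 30 with f1 = 12, f0 = 5, f2 = 8
print(round(grouped_mode(20, 10, 5, 12, 8), 2))  # 26.36
```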
MIXED QUESTIONS ON MEAN MEDIAN AND MODE FOR UNGROUPED DATA
Question 1 :
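The monthly salaries (in $) of 10 employees in a factory are given below :
5000, 7000, 5000, 7000, 8000, 7000, 7000, 8000, 7000, 5000
Find the mean, median and mode.
Solution :
Mean :
= (5000 + 7000 + 5000 + 7000 + 8000 + 7000 + 7000 + 8000 + 7000 + 5000)/10
= 66000/10
= 6600
Median :
Arrange the data in ascending order :
5000, 5000, 5000, 7000, 7000, 7000, 7000, 7000, 8000, 8000
There are 10 values, so the median is the average of the 5th and 6th values.
= (7000 + 7000)/2
= 14000/2
= 7000
Mode :
7000 occurs 5 times, more often than any other value, so the mode is 7000.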
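The three measures for the salary data in Question 1 can be cross-checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are chosen for illustration.

```python
import statistics

salaries = [5000, 7000, 5000, 7000, 8000, 7000, 7000, 8000, 7000, 5000]

mean_salary = statistics.mean(salaries)      # sum of values / number of values
median_salary = statistics.median(salaries)  # average of the two middle values
mode_salary = statistics.mode(salaries)      # most frequent value

print(mean_salary, median_salary, mode_salary)
```

For data with several equally frequent values, `statistics.multimode` returns all of them as a list.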
Question 2 :
Solution :
3.1 and 3.3 each occur twice, so the modes are 3.1 and 3.3.
The data is bimodal.
Question 3 :
For the data 11, 15, 17, x+1, 19, x–2, 3 if the mean is 14 , find the value of x. Also find the mode
of the data.
Solution :
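Mean = (11 + 15 + 17 + x + 1 + 19 + x - 2 + 3)/7
14 = (64 + 2x)/7
14(7) = 64 + 2x
2x = 98 - 64
2x = 34
x = 34/2 = 17
Substituting x = 17, the data are 11, 15, 17, 18, 19, 15, 3.
15 occurs twice and every other value occurs once, so the mode is 15.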
Solution :
Demand for the size 40 is 37.
In the case of a grouped frequency distribution, the exact values of the variable are not known,
so it is difficult to locate the mode accurately.
The class interval with the maximum frequency is called the modal class.
Solution :
Marks Number of students
0-10 22
10-20 38
20-30 46
30-40 34
40-50 20
The modal class is 20 - 30, so L = 20, f1 = 46, f0 = 38, f2 = 34 and w = 10.
Mode = 20 + [(46 - 38)/(2(46) - 38 - 34)] x 10
= 20 + [8/(92 - 38 - 34)] x 10
= 20 + [8/20] x 10
= 20 + 4
= 24
Solution :
Marks Number of students
24.5 - 34.5 4
34.5 - 44.5 8
44.5 - 54.5 10
54.5 - 64.5 14
64.5 - 74.5 8
74.5 - 84.5 6
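The modal class is 54.5 - 64.5, so L = 54.5, f1 = 14, f0 = 10, f2 = 8 and w = 10.
Mode = 54.5 + [(14 - 10)/(2(14) - 10 - 8)] x 10
= 54.5 + [4/10] x 10
= 54.5 + 4
= 58.5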
Please use the following summary table to know what the best measure of central tendency is
with respect to the different types of data.
Normal distribution
In a normal distribution, data is symmetrically distributed with no skew. Most values cluster
around a central region, with values tapering off as they go further away from the center. The
mean, mode and median are exactly the same in a normal distribution.
Example: Normal distribution. You survey a sample in your local community on the number of
books they read in the last year.
A histogram of your data shows the frequency of responses for each possible number of books.
From looking at the chart, you see that there is a normal distribution.
The mean, median, and mode are all equal; the central tendency of this data set is 8.
Skewed distributions
In skewed distributions, more values fall on one side of the center than the other, and the mean,
median and mode all differ from each other. One side has a longer, more spread out tail with
fewer scores than the other. The direction of this tail tells you the side of the skew.
In a positively skewed distribution, there’s a cluster of lower scores and a spread out tail on the
right. In a negatively skewed distribution, there’s a cluster of higher scores and a spread out tail
on the left.
Positively skewed distribution
In this histogram, your distribution is skewed to the right, and the central tendency of your data
set is on the lower end of possible scores.
Negatively skewed distribution
In this histogram, your distribution is skewed to the left, and the central tendency of your data set
is towards the higher end of possible scores.
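The effect of skew on the three measures can be seen on a small made-up sample. Both data sets below are hypothetical and chosen only for illustration, and comparing the mean with the median is a rough heuristic for the direction of skew, not a formal skewness measure.

```python
import statistics


def skew_direction(values):
    """Rough direction of skew from comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"   # long tail on the right pulls the mean up
    if mean < median:
        return "negative"   # long tail on the left pulls the mean down
    return "symmetric"


# Hypothetical scores: a cluster of low values with a long right tail.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# The mirror image: a cluster of high values with a long left tail.
left_tailed = [16 - x for x in right_tailed]

for data in (right_tailed, left_tailed):
    print(statistics.mean(data), statistics.median(data),
          statistics.mode(data), skew_direction(data))
```

In the right-tailed sample the mode is below the median and the median is below the mean; the order is reversed in the left-tailed sample.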