Chapter 4 - Measures of Central Tendency

Chapter 4: MEASURES OF CENTRAL TENDENCY

A measure of central tendency is a single value that describes a set of data by identifying the
central position within that set. The mean (often called the average) is probably the measure of
central tendency that you are most familiar with, but there are others, such as the median and
the mode.
The mean, median and mode are all valid measures of central tendency, but under different
conditions some become more appropriate to use than others. In the following sections we will
look at the mean, median and mode, learn how to calculate them, and see under what conditions
each is most appropriate.
The mean, median and mode can be computed whether the data are ungrouped or grouped.
Ungrouped data are raw data: they have been collected but not yet sorted into groups or classes.
Grouped data, on the other hand, have been organized from the raw data into classes and are
given in intervals.

4.1 The Arithmetic Mean


The arithmetic mean or simply the mean refers to the sum of all the given values or items in a
distribution divided by the number of values or items summed.
4.1.1 Computation of the Mean for Ungrouped Data
The arithmetic mean, or mean, of the ungrouped values X1, X2, X3, ..., Xn, denoted by X̄, is

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{or, more simply,} \quad \bar{X} = \frac{\sum X}{n}$$

where X stands for the values or items and n is the number of values in the distribution.

Example: The mean of 86, 80, 75, 78, and 86 is

$$\bar{X} = \frac{86 + 80 + 75 + 78 + 86}{5} = \frac{405}{5} = 81$$
When some values or items appear several times, the mean is computed using the formula below.

$$\bar{X} = \frac{\sum X_i f_i}{n}$$

where Xi fi is the product of each distinct score (Xi) and its frequency (fi), and n is the number
of items or cases.
Example 2: The mean of 80, 80, 75, 76, 74, 74, 74 is

$$\bar{X} = \frac{2(80) + 75 + 76 + 3(74)}{7} = \frac{533}{7} \approx 76.14$$
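
As an illustration, the two formulas above can be sketched in Python. This is a minimal sketch and not part of the original text; the function names `mean` and `mean_from_frequencies` are assumptions chosen for this example, and the standard-library `collections.Counter` is used to tally the frequencies.

```python
from collections import Counter

def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

def mean_from_frequencies(values):
    """Same mean, computed as sum(Xi * fi) / n from a frequency count."""
    freq = Counter(values)            # maps each distinct value Xi to its frequency fi
    n = sum(freq.values())            # total number of items
    return sum(x * f for x, f in freq.items()) / n

scores_1 = [86, 80, 75, 78, 86]
scores_2 = [80, 80, 75, 76, 74, 74, 74]

print(mean(scores_1))                              # 81.0
print(round(mean_from_frequencies(scores_2), 2))   # 76.14
```

Both functions give the same answer for the same data; the frequency form only saves additions when values repeat.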

4.1.2 Computation of the Mean for Grouped Data


The mean for grouped data is computed by the formula:

$$\bar{X} = \frac{\sum X_i f_i}{n}$$

where Xi stands for the class mark, fi is the corresponding frequency, and n = ∑ fi is the total
number of cases.
To illustrate, please see the table below.
Table 4.1. Statistics Test Score

Class Interval    Class Mark (Xi)    Frequency (fi)    Xi fi
90 – 94           92                 2                 184
85 – 89           87                 6                 522
80 – 84           82                 3                 246
75 – 79           77                 8                 616
70 – 74           72                 5                 360
65 – 69           67                 2                 134
60 – 64           62                 10                620
55 – 59           57                 3                 171
50 – 54           52                 4                 208
45 – 49           47                 3                 141
40 – 44           42                 4                 168

                                     ∑ fi = 50         ∑ Xi fi = 3,370

The following steps are involved when the formula is applied:


Step 1. Find the class mark of each class interval by adding its lower and upper class limits and
dividing the sum by 2.
Step 2. Multiply the class mark by its corresponding frequency to obtain the Xi fi column.
Step 3. Get the sum of Xi fi column to get ∑ 𝑋𝑖 𝑓𝑖 .
Step 4. Substitute the values of ∑ 𝑋𝑖 𝑓𝑖 and n or ∑ 𝑓𝑖 in the formula.

Hence,

$$\bar{X} = \frac{\sum X_i f_i}{n} = \frac{3{,}370}{50} = 67.4$$
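
The four steps above can be sketched in Python as follows. This is an illustrative sketch only: the name `grouped_mean` and the tuple layout (lower limit, upper limit, frequency) are assumptions made for this example, with the data taken from Table 4.1.

```python
# (lower class limit, upper class limit, frequency) for each row of Table 4.1
table_4_1 = [
    (90, 94, 2), (85, 89, 6), (80, 84, 3), (75, 79, 8), (70, 74, 5),
    (65, 69, 2), (60, 64, 10), (55, 59, 3), (50, 54, 4), (45, 49, 3),
    (40, 44, 4),
]

def grouped_mean(classes):
    """Mean of grouped data: sum(class mark * frequency) / total frequency."""
    n = sum(f for _, _, f in classes)                    # n = sum of fi
    total = sum((lo + hi) / 2 * f for lo, hi, f in classes)  # sum of Xi * fi
    return total / n

print(round(grouped_mean(table_4_1), 2))   # 67.4
```

Because every score in a class is represented by its class mark, the result is an approximation of the mean of the raw scores.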
4.2. The Weighted Mean
The weighted mean is a type of mean calculated by multiplying the weight (or probability)
associated with each event or outcome by its quantitative value, summing all the products and,
when the weights do not already add up to 1, dividing by the total of the weights. It is very
useful when calculating a theoretically expected outcome where each outcome has a different
probability of occurring, which is the key feature that distinguishes the weighted mean from the
arithmetic mean.

Weighted means are useful in a wide variety of scenarios. For example, a student may use a
weighted mean to calculate his or her percentage grade in a course. In such an example, the
student would multiply the weighting of each assessment item in the course (e.g., assignments,
exams, projects) by the grade obtained in that item and add the products.

The formula for computing the weighted mean is:

$$\bar{X} = \frac{\sum W_i X_i}{n}$$

where Wi = weight of each item or value
Xi = each of the items or values
n = total of the weights, ∑ Wi

Example: Let us determine the weighted mean price if 500 bags were sold at P250 each, 350 bags
at P200 each, 200 bags at P150 each, 150 bags at P100 each, and 50 bags at P80 each.

Solution:

$$\bar{X} = \frac{(500 \times 250) + (350 \times 200) + (200 \times 150) + (150 \times 100) + (50 \times 80)}{1{,}250} = \frac{244{,}000}{1{,}250} = \text{P}195.20$$
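
The bag example can be checked with a short Python sketch. The function name `weighted_mean` is an assumption made for this illustration; here the weights are the numbers of bags sold and the values are the prices.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(Wi * Xi) divided by the total of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

prices = [250, 200, 150, 100, 80]       # Xi: price per bag, in pesos
bags_sold = [500, 350, 200, 150, 50]    # Wi: number of bags sold at each price

print(weighted_mean(prices, bags_sold))   # 195.2
```

When all the weights are equal, the weighted mean reduces to the ordinary arithmetic mean.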
4.3. Median
The median refers to the middle observation in an ordered distribution.

4.3.1. Computation of the Median for Ungrouped Data


To get the median for ungrouped data, we first arrange the data from the highest value to the
lowest or vice versa.
When there is an odd number of observations, the middle value is the median.
For example: 6, 7, 8, 9, 10, 12 and 16; the median is 9.
If the number of observations is even, the average of the two middle scores is the median.
For example: 8, 7, 6, 5, 4, and 3; the median is the average of 6 and 5, so

$$Md = \frac{6 + 5}{2} = 5.5$$
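
The odd and even cases above can be sketched in Python. This is a minimal illustration; the function name `median` is chosen for this example.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)          # arrange according to magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd number of observations
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even number of observations

print(median([6, 7, 8, 9, 10, 12, 16]))   # 9
print(median([8, 7, 6, 5, 4, 3]))         # 5.5
```

Sorting inside the function means the caller does not have to arrange the scores first.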

4.3.2 Computation of the Median for Grouped Data


The formula used in getting the median for grouped data in the form of a frequency distribution is:

$$Md = u + \left(\frac{\frac{n}{2} - cf}{f_m}\right) i$$

where u = exact lower limit of the class interval containing the median
n/2 = one-half of the total number of cases
cf = cumulative frequency immediately below u
fm = frequency of the class interval containing the median
i = class size

Steps in Calculating the Median


Step 1. Determine the cumulative frequency (cf) by adding the individual frequencies successively,
starting from the bottom (lowest class interval). In Table 4.3, copy the frequency at the bottom,
which is 10, then add the next frequency, 15, to obtain the succeeding cf, which is 10 + 15 or 25.
The next is 25 + 14 or 39, and so on, up to the top. Notice that the last cf is 176, which is the
value of n or ∑ f.
Step 2. Find the median class, which is the class where the median lies. Compute n/2, which is
176/2 = 88. The median class is the class interval that contains the 88th item. The smallest cf
that reaches the 88th item is 90, and the class interval across from 90 is 410 – 419. Our median
class is therefore 410 – 419.
Step 3. Find the cf of the class immediately below the median class, which is 70. This is the sum
of all the frequencies below the median class.
Step 4. Find fm, the frequency of the median class, which is 20.
Step 5. Determine the class size, which is 10.
Step 6. Substitute all the needed values in the formula.

$$Md = u + \left(\frac{\frac{n}{2} - cf}{f_m}\right) i$$

Example:
Table 4.3. Computations of the Median

Income/week    Class Boundaries    Frequency    Cumulative Frequency
460 – 469      459.5 – 469.5       16           176
450 – 459      449.5 – 459.5       22           160
440 – 449      439.5 – 449.5       19           138
430 – 439      429.5 – 439.5       14           119
420 – 429      419.5 – 429.5       15           105
410 – 419      409.5 – 419.5       20           90     (the 88th item is here; therefore, the median class is 410 – 419)
400 – 409      399.5 – 409.5       13           70
390 – 399      389.5 – 399.5       18           57
380 – 389      379.5 – 389.5       14           39
370 – 379      369.5 – 379.5       15           25
360 – 369      359.5 – 369.5       10           10

∑ fi = n = 176
i = 370 – 360 = 10

Therefore, n = 176; u = 409.5; cf = 70; fm = 20; i = 10

$$Md = u + \left(\frac{\frac{n}{2} - cf}{f_m}\right) i = 409.5 + \left(\frac{\frac{176}{2} - 70}{20}\right) 10 = 418.5$$
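
The six steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the name `grouped_median` and the (exact lower limit, frequency) layout are choices made for this example, the classes are listed from lowest to highest, and all classes are assumed to share the same class size.

```python
# (exact lower limit u, frequency) for each class of Table 4.3, lowest class first
table_4_3 = [
    (359.5, 10), (369.5, 15), (379.5, 14), (389.5, 18), (399.5, 13),
    (409.5, 20), (419.5, 15), (429.5, 14), (439.5, 19), (449.5, 22),
    (459.5, 16),
]

def grouped_median(classes, class_size):
    """Md = u + ((n/2 - cf) / fm) * i for a frequency distribution."""
    n = sum(f for _, f in classes)
    half = n / 2
    cf = 0                                  # cumulative frequency below the current class
    for lower, f in classes:
        if f > 0 and cf + f >= half:        # first class whose cf reaches n/2: the median class
            return lower + (half - cf) / f * class_size
        cf += f
    raise ValueError("frequency table is empty")

print(round(grouped_median(table_4_3, 10), 2))   # 418.5
```

The loop builds the cumulative frequency from the bottom, exactly as in Step 1, and stops at the median class found in Step 2.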

4.4. The Mode


The mode is the item or value in a distribution with the highest frequency or the greatest number
of cases.
4.4.1 Determining the Mode for Ungrouped Data
For ungrouped data, the mode is simply the value which occurs most often.
Example: Let us determine the mode of each distribution below.
Set A: 79, 20, 20, 20, 18, 15, 13, 9
Set B: 8, 8, 9, 6, 10, 8, 10, 2
Set C: 9, 2, 8, 3, 4, 5, 7, 6
In Set A, the mode is 20 since it is the score that appears most frequently. Since there is only one
mode, the distribution is said to be unimodal.
In Set B, the mode is 8, since it appears three times while 10 appears only twice. A distribution
in which two values tie for the highest frequency is said to be bimodal; Set B would be bimodal,
with modes 8 and 10, if one of its 8s were removed.
In Set C, the mode does not exist because all the frequencies are equal.
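
The three sets can be tallied with a short Python sketch. The function name `modes` is an assumption for this illustration, and it follows the convention used for Set C: when several distinct values all share the same frequency, no mode is reported. For Set B the tally gives 8 three times and 10 twice.

```python
from collections import Counter

def modes(values):
    """Return the list of modes; an empty list means the mode does not exist."""
    freq = Counter(values)                       # value -> frequency
    if not freq:
        return []
    if len(freq) > 1 and len(set(freq.values())) == 1:
        return []                                # all frequencies equal: no mode
    top = max(freq.values())
    return sorted(x for x, f in freq.items() if f == top)

set_a = [79, 20, 20, 20, 18, 15, 13, 9]
set_b = [8, 8, 9, 6, 10, 8, 10, 2]
set_c = [9, 2, 8, 3, 4, 5, 7, 6]

print(modes(set_a))   # [20]
print(modes(set_b))   # [8]
print(modes(set_c))   # []
```

A returned list with one element corresponds to a unimodal distribution, and a list with two elements to a bimodal one.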

4.4.2 Computation of the Mode for Grouped Data


The mode in grouped data is the class mark or midpoint of the class interval with the highest
frequency. This class interval is known as the modal class. The mode obtained in this manner is
called a crude mode because it is just a rough approximation of the actual mode. We use the
formula below to improve the computation.

$$Mo = u + \left(\frac{d_1}{d_1 + d_2}\right) i$$

where u = exact lower limit of the modal class
d1 = difference between the frequency of the modal class and that of the next class lower in value
d2 = difference between the frequency of the modal class and that of the next class higher in value
i = class size of the modal class

This formula is used only if the modal class and the two adjoining class intervals have the same
width. The modal class is the class interval with the highest frequency.
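Thus, from the previous example (Table 4.3), the modal class is 450 – 459, since its frequency of
22 is the highest in the table, and
u = 449.5; d1 = 22 – 19 = 3; d2 = 22 – 16 = 6; i = 10

$$Mo = u + \left(\frac{d_1}{d_1 + d_2}\right) i = 449.5 + \left(\frac{3}{3 + 6}\right) 10 \approx 452.83$$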
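
The formula for the mode of grouped data can be sketched in Python. This is an illustrative sketch under stated assumptions: the name `grouped_mode` and the (exact lower limit, frequency) layout are choices made for this example, a missing neighbouring class is treated as having frequency 0, and the first class is used if several tie for the highest frequency. The function selects the class with the largest frequency, which in Table 4.3 is 450 – 459 with a frequency of 22.

```python
# (exact lower limit u, frequency) for each class of Table 4.3, lowest class first
table_4_3 = [
    (359.5, 10), (369.5, 15), (379.5, 14), (389.5, 18), (399.5, 13),
    (409.5, 20), (419.5, 15), (429.5, 14), (439.5, 19), (449.5, 22),
    (459.5, 16),
]

def grouped_mode(classes, class_size):
    """Mo = u + (d1 / (d1 + d2)) * i, with the modal class chosen by highest frequency."""
    freqs = [f for _, f in classes]
    k = max(range(len(freqs)), key=lambda j: freqs[j])      # index of the modal class
    f_lower = freqs[k - 1] if k > 0 else 0                  # next class lower in value
    f_higher = freqs[k + 1] if k + 1 < len(freqs) else 0    # next class higher in value
    d1 = freqs[k] - f_lower
    d2 = freqs[k] - f_higher
    return classes[k][0] + d1 / (d1 + d2) * class_size

print(round(grouped_mode(table_4_3, 10), 2))   # 452.83
```

As the text notes, the formula applies only when the modal class and its two neighbours have the same width, which is why a single `class_size` is passed in.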
4.5 Uses of the Measures of Central Tendency
4.5.1. The Mean is Used
• for interval and ratio measurements
• if higher statistical computations are wanted
• if there are no extreme values in the distribution, that is, when the distribution is
approximately normal, since the mean is easily affected by extremely high or extremely
low scores
• when the greatest reliability of the measure of central tendency is wanted, since its
computation includes all the given values
4.5.2 The Median is Used
• for ordinal or ranked measurements
• if there are extreme cases, that is, when the distribution is markedly skewed
• if we desire to know whether the cases fall within the upper half or the lower half
of a distribution
• for an open-end distribution, that is, one in which the lowest or the highest class interval
or both are not defined, as in "50 and below" or "100 and above"

4.5.3 The Mode is Used
• for nominal or categorical data
• if the most popular or most typical case or value in a distribution is wanted
• if a rough or quick estimate of a central value is wanted
4.6 Limitations of the Measures of Central Tendency


4.6.1 The Limitations of the Mean:

• It is the most widely used average because it is the most familiar. It is, however, often
misused. It should not be used if the values or items do not cluster substantially. An
example is representing the scores 10 and 100 by their mean, since they are far apart.
• When the given values do not tend to cluster around a central value, the mean is a poor
measure of central location.
• It is easily affected by extremely large or small values. One very small value can easily
pull down the mean.
• The mean alone cannot be used to compare distributions, since the means of two or more
distributions may be the same while their other characteristics may be entirely different.
The means of distribution A, whose values are 80, 85, and 90, and distribution B, whose
values are 86, 85, and 84, are both 85. However, we cannot infer that both distributions
possess the same characteristics, since their patterns of dispersion or variation are
markedly different despite having the same mean.
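
The last point can be checked numerically with a short Python sketch. The helper names `mean` and `value_range` are assumptions for this illustration, and the range (highest value minus lowest value) is used here only as one simple indicator of dispersion.

```python
def mean(values):
    """Arithmetic mean of the values."""
    return sum(values) / len(values)

def value_range(values):
    """Range: highest value minus lowest value, a simple indicator of spread."""
    return max(values) - min(values)

dist_a = [80, 85, 90]
dist_b = [86, 85, 84]

print(mean(dist_a), mean(dist_b))                 # 85.0 85.0
print(value_range(dist_a), value_range(dist_b))   # 10 2
```

The two distributions share the same mean, yet the values of distribution A are spread over a range five times as wide as those of distribution B.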
4.6.2 The Limitations of the Median:
• It is easily affected by the number of items in a distribution.
• It cannot be determined if the given values are not arranged according to magnitude.
• If the distribution contains many values, arranging them according to magnitude becomes
a laborious task.
• Its value is not as accurate as the mean because it is just an ordinal statistic.

4.6.3 The Limitations of the Mode:


• It is seldom used since it does not always exist.
• It is very unstable because its value easily changes depending on the approach used
in finding it.
• Its value is just a rough estimate of the center of concentration of a distribution.

4.7 Quantiles
The idea of the median can be extended to the discussion of quantiles. Quantiles are
values which divide the distribution into a given number of equal parts. The median divides the
distribution into two equal parts. The other types of quantiles which we shall discuss in this
section are the quartiles, the deciles, and the percentiles or centiles.
The quartiles divide the distribution into four equal parts. The quartiles are Q1 (first
quartile), Q2 (second quartile or median), Q3 (third quartile) and Q4 (fourth quartile).
The deciles divide the distribution into ten equal parts. The deciles are D1 (first decile), D2
(second decile), D3 (third decile), up to D10 (tenth decile).
The percentiles or centiles divide the distribution into one hundred equal parts. They are
P1 (first percentile), P2 (second percentile), and so on up to P100. Notice that P50 = Q2 = Median,
P20 = D2, P90 = D9, etc.
For ungrouped data, quantiles are determined after the values in the distribution have been
arranged according to magnitude. The computation of the quantiles for grouped data is similar to
that of the median.
To understand the concept of quantiles, let us take the percentiles or centiles. A percentile
is a point in a distribution below which a given percent of the cases lie. For example, the 70th
percentile or P70 is the point or score in a distribution below which 70% of the cases lie. If P70
is equal to 80 in a distribution of scores given to freshman entrants in a certain college, a
student who got 80 in the admission exam surpassed 70% of the cases, with only 30% of the
examinees scoring higher.
4.7.1 Computations of the Quantiles for Ungrouped Data

To determine any quantile, change it first to percentile and follow the steps below:
Step 1. Arrange first the scores according to magnitude or size
Step 2. Find the position of the given percentile in the distribution using the formula P(n+1)/100
where P is the given percent and n is the number of cases.
Step 3. Locate the score corresponding to the obtained position in the distributions starting from
the lowest score.
Step 4. Interpolate to get the score if the obtained position from step 2 is not exact.

Example: Find the 20th percentile or P20 of the following scores: 25, 22, 20, 16, 17, 12, 8, 6, 5
Steps:
1. Locate the position of the score corresponding to the 20th percentile using P(n+1)/100.

Solution:

$$\frac{P(n+1)}{100} = \frac{20(9+1)}{100} = 2$$

2. Locate the second score from the lowest. The answer is 6.
3. Hence, the 20th percentile or P20 = 6.
This means that 20% of the cases scored below 6.

Example: Find the 60th percentile or P60 of the following scores: 99, 95, 80, 75, 70, 60, 40
Steps:
1. Compute the position of the 60th percentile.

$$\frac{P(n+1)}{100} = \frac{60(7+1)}{100} = 4.8$$

2. Since 4.8 is between the 4th and 5th scores from the bottom, we have to interpolate to find
the answer. Take the 4th score from the bottom, which is 75, and the 5th score, which is 80.
3. Find the difference between these two scores, 80 – 75, which is 5.
4. Multiply the difference 5 by the decimal part of the position obtained in step 1, which is
0.8. The product is 4.
5. Finally, add this product to the lower score, 75.
Thus P60 = 75 + 4 = 79

This means that 60% of the cases fall below the score 79.
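
The position-and-interpolate procedure used in both examples can be sketched in Python. This is a minimal sketch: the function name `percentile` is chosen for this illustration, and clamping the position to the range 1 to n for very low or very high percentiles is an assumption that the text does not cover.

```python
def percentile(values, p):
    """P-th percentile using the position P(n + 1)/100 with linear interpolation."""
    ordered = sorted(values)                  # Step 1: arrange from lowest to highest
    n = len(ordered)
    position = p * (n + 1) / 100              # Step 2: 1-based position from the lowest score
    position = min(max(position, 1), n)       # assumption: keep the position inside the data
    whole = int(position)
    fraction = position - whole
    lower = ordered[whole - 1]                # Step 3: score at the whole-number position
    if fraction == 0:
        return lower
    upper = ordered[whole]
    return lower + fraction * (upper - lower)  # Step 4: interpolate between the two scores

scores_a = [25, 22, 20, 16, 17, 12, 8, 6, 5]
scores_b = [99, 95, 80, 75, 70, 60, 40]

print(percentile(scores_a, 20))             # 6
print(round(percentile(scores_b, 60), 2))   # 79.0
```

Other textbooks and software packages use slightly different position formulas, so results for small data sets may differ from those of this method.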
4.7.2 Computations of the Quantiles for Grouped Data
The computation of any quantile for grouped data is similar to that of the median.
The formula is:

$$P_p = u + \left(\frac{np - cf}{f_p}\right) i$$

where Pp = the desired percentile
u = exact lower limit of the class interval containing Pp
n = number of cases
p = proportion corresponding to the desired percentage
cf = cumulative frequency immediately below the class interval containing Pp
fp = frequency of the class interval containing Pp
i = class size

Table 4.5. Computations of Some Quantiles

Class Interval    Class Boundaries    f     cf
64 – 67           63.5 – 67.5         7     90
60 – 63           59.5 – 63.5         6     83
56 – 59           55.5 – 59.5         7     77
52 – 55           51.5 – 55.5         5     70     (class interval containing P75; the 67.5th item is here)
48 – 51           47.5 – 51.5         10    65
44 – 47           43.5 – 47.5         14    55
40 – 43           39.5 – 43.5         10    41
36 – 39           35.5 – 39.5         5     31
32 – 35           31.5 – 35.5         9     26
28 – 31           27.5 – 31.5         7     17
24 – 27           23.5 – 27.5         6     10
20 – 23           19.5 – 23.5         4     4

∑ f = 90
i = 24 – 20 = 4

Solution of Some Quantiles:


1. Compute for Q3 or P75

Step 1: Solve for np


np = 90 x 75% = 90 x .75
= 67.5
Step 2: Locate 67.5 under the cf column. The 67.5th item is contained in the cf 70, which
corresponds to the class interval 52 – 55. Therefore, P75 lies within this class interval.

Step 3: Find the exact lower limit of 52 which is 51.5. So, u = 51.5.

Step 4: Find the cf value immediately below the class interval, 52-55. Thus, cf = 65.

Step 5: Find the fp value or frequency of the class interval 52-55. So, fp = 5.

Step 6: Get the class size which is 4. So i = 4.

Step 7: Compute for P75 by substituting all the needed values in the formula.

Solution:

$$P_{75} = u + \left(\frac{np - cf}{f_p}\right) i = 51.5 + \left(\frac{67.5 - 65}{5}\right) 4 = 53.5$$

This means that 75% of the cases lie below the score 53.5
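
The seven steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the name `grouped_percentile` and the (exact lower limit, frequency) layout are choices made for this example, the classes are listed from lowest to highest, and the percentile is passed as a percentage (75 for P75).

```python
# (exact lower limit u, frequency) for each class of Table 4.5, lowest class first
table_4_5 = [
    (19.5, 4), (23.5, 6), (27.5, 7), (31.5, 9), (35.5, 5), (39.5, 10),
    (43.5, 14), (47.5, 10), (51.5, 5), (55.5, 7), (59.5, 6), (63.5, 7),
]

def grouped_percentile(classes, class_size, percent):
    """Pp = u + ((np - cf) / fp) * i for a frequency distribution."""
    n = sum(f for _, f in classes)
    target = n * percent / 100               # Step 1: np
    cf = 0                                   # cumulative frequency below the current class
    for lower, f in classes:
        if f > 0 and cf + f >= target:       # Step 2: first class whose cf reaches np
            return lower + (target - cf) / f * class_size
        cf += f
    raise ValueError("frequency table is empty")

print(grouped_percentile(table_4_5, 4, 75))   # 53.5
```

Calling the same function with 50 gives the median of the grouped data, which reflects the statement that the computation of any quantile is similar to that of the median.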

References:
Nocon, F.P., Torrecampo, J.T., Balacua, M.P., Dagua, W.B. General Statistics Made Simple for
Filipinos
https://centergrove.instructure.com/courses/1823759/pages/module-7-measures-of-central-tendency
https://corporatefinanceinstitute.com/resources/knowledge/other/weighted-mean/
