Lesson 3: Measures of Central Tendency
For example, let us say the average income in Baguio City is PHP13,000 while in
Manila, it is PHP20,000. This example shows how we can describe a large set of data
with a single, representative number. Central tendency characterizes what is typical for a
large population and in doing so makes large amounts of data more digestible. Instead
of having thousands of incomes per month per individual in each city, you can use the
central tendency measure to compare both cities. Statisticians sometimes use the
expression “number crunching” to illustrate this aspect of data description. That is, we
take a distribution consisting of many scores and “crunch” them down to a single value
that describes them all.
                          POPULATION    SAMPLE
TOTAL NUMBER OF SCORES        N           n
Mean
The mean, also known as the arithmetic average, is computed by adding all the
scores in the distribution and dividing by the number of scores. The mean for a population
is identified by the Greek letter mu, μ (pronounced “mew”), and the mean for a sample
is identified by M (common in research) or X̄ (common in textbooks; read “x-bar”).
Take note that, as we move on, X will represent either the scores or the class
midpoints, depending on what we are trying to solve and what frequency distribution
we have. Also, always round off to 2 decimal places.
1. RAW DATA
Series of scores (e.g. 1,2,3,4,5)
FORMULA: $M = \frac{\sum X}{n}$
EXAMPLE:
Get the mean of the scores of students on a quiz. The scores are
as follows:
5,7,9,10,3,4,5
M = (5 + 7 + 9 + 10 + 3 + 4 + 5) / 7 = 43 / 7 = 6.14
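The same quiz example can be checked with a few lines of Python. This is only a minimal sketch: the variable names are our own, and statistics.mean from the standard library is used as a cross-check.

```python
from statistics import mean

scores = [5, 7, 9, 10, 3, 4, 5]

# Add all the scores, then divide by how many scores there are.
manual_mean = sum(scores) / len(scores)

# The standard library computes the same quantity.
library_mean = mean(scores)

print(round(manual_mean, 2))   # 6.14
print(round(library_mean, 2))  # 6.14
```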
2. ORGANIZED DATA
Grouped or ungrouped frequency table
MIDPOINT METHOD
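FORMULA: $M = \frac{\sum fX}{n}$, where f is the frequency and X is the score (ungrouped table) or the class midpoint (grouped table)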
EXAMPLE (Ungrouped Frequency table):
Scores on a Quiz (X)     f     fX (frequency × score)
        9                1      9
        8                2     16
        6                3     18
        5                4     20
        3                5     15
        3                7     21
        2                3      6
                   ∑f = 25 (n)   ∑fX = 105
M = ∑fX / n = 105 / 25 = 4.20
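EXAMPLE (Grouped Frequency table; X is the midpoint of each class interval):
Class Intervals     f     X     fX
15-17               4    16     64
12-14               6    13     78
9-11                8    10     80
6-8                12     7     84
3-5                10     4     40
              ∑f = 40 (N)   ∑fX = 346
M = ∑fX / N = 346 / 40 = 8.65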
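The midpoint method can be sketched in Python as follows. The function name mean_from_frequencies is our own (hypothetical) helper. For the grouped case we assume the class intervals tabulated in this lesson (3-5 up to 15-17), with X taken as the midpoint of each interval, which reproduces the 346 / 40 result.

```python
def mean_from_frequencies(values, frequencies):
    """Return sum(f * X) / sum(f) for paired values X and frequencies f."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    n = sum(frequencies)
    return total_fx / n

# Ungrouped table: X is the score itself.
quiz_scores = [9, 8, 6, 5, 3, 3, 2]
quiz_freqs = [1, 2, 3, 4, 5, 7, 3]
ungrouped_mean = mean_from_frequencies(quiz_scores, quiz_freqs)
print(round(ungrouped_mean, 2))  # 4.2

# Grouped table: X is the midpoint of each class interval.
intervals = [(15, 17), (12, 14), (9, 11), (6, 8), (3, 5)]
class_freqs = [4, 6, 8, 12, 10]
midpoints = [(low + high) / 2 for low, high in intervals]
grouped_mean = mean_from_frequencies(midpoints, class_freqs)
print(round(grouped_mean, 2))  # 8.65
```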
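EXAMPLE (Grouped Frequency table, with d counted in class steps from the starred class):
Class Intervals     f     d     fd
15-17               4     3     12
12-14               6     2     12
9-11                8     1      8
*6-8               12     0      0
3-5                10    -1    -10
              ∑f = 40 (N)   ∑fd = 22
M = 7 + 3(22 / 40) = 7 + 1.65 = 8.65
Here, 7 is the midpoint of the starred class (d = 0) and 3 is the class width.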
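The computation above can be sketched in Python. We assume the formula has the form M = A + i(∑fd / N), where A is the midpoint of the starred class and i is the class width. The names coded_mean, assumed_mean, and class_width are ours and do not come from the lesson.

```python
def coded_mean(assumed_mean, class_width, frequencies, deviations):
    """Mean from coded deviations: A + i * (sum(f * d) / N)."""
    n = sum(frequencies)
    total_fd = sum(f * d for f, d in zip(frequencies, deviations))
    return assumed_mean + class_width * (total_fd / n)

freqs = [4, 6, 8, 12, 10]   # classes 15-17 down to 3-5
devs = [3, 2, 1, 0, -1]     # steps away from the starred class 6-8

result = coded_mean(assumed_mean=7, class_width=3,
                    frequencies=freqs, deviations=devs)
print(round(result, 2))  # 8.65
```

The answer matches the midpoint method (346 / 40 = 8.65), which is a good way to check your work.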
Characteristics of the Mean
1. The mean is the amount each individual would get if the total were divided
equally among all the individuals. If the total money is PHP20 and there are 4
individuals, then each one would have PHP5.00. In this case, the mean is also
PHP5.00.
2. The mean serves as the balance point. It is the center of the distribution, so it
ALWAYS occurs between the highest and lowest score.
3. Changing any score in the distribution can affect the mean.
4. Adding or removing a score from the distribution will change the mean UNLESS
the new or removed score is equal to the mean.
Example: If you have a set of scores, 1,2,3,4,5, the mean would be 3. If
you remove the score 3, you will only have 1,2,4,5 but the mean would
still be 3. If you add a score, 3, you will have 1,2,3,3,4,5 but the mean
would still be 3.
5. Adding a constant to every score, subtracting a constant from every score, or
multiplying or dividing every score by a constant will change the mean in the same way.
Example: If you multiply all your scores, 1,2,3,4,5, by 3, then you will have
3,6,9,12,15. In the original set, the mean is 3, and if you check the new
set, the mean is 9 (the original mean multiplied by 3).
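Characteristics 4 and 5 can be verified with a short Python sketch. The scores are the ones used in the examples above. The constant 10 in the last step is our own illustrative choice and does not come from the lesson.

```python
from statistics import mean

scores = [1, 2, 3, 4, 5]
original = mean(scores)                      # 3

# Characteristic 4: removing or adding a score equal to the mean
# leaves the mean unchanged.
without_three = mean([1, 2, 4, 5])           # still 3
with_extra_three = mean([1, 2, 3, 3, 4, 5])  # still 3

# Characteristic 5: multiplying every score by 3 multiplies the mean by 3.
tripled = mean([x * 3 for x in scores])      # 9

# Adding a constant (here 10, an assumed value) adds it to the mean.
shifted = mean([x + 10 for x in scores])     # 13

print(original, without_three, with_extra_three, tripled, shifted)
```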
Weighted Mean
The weighted mean is used when there are response scales with different weights.
For example, in an item in an attitude scale which states that “I am concerned when
others seem sad”, the scaled responses and their weights are: Strongly agree (4), agree
(3), disagree (2), and strongly disagree (1). The formula for computing the weighted mean
is:
$$\text{Weighted Mean} = \frac{f_1X_1 + f_2X_2 + f_3X_3 + \dots + f_kX_k}{f_1 + f_2 + f_3 + \dots + f_k} = \frac{\sum fX}{n}$$
Let’s try to look at another example. The following items are from a kindness
scale. The scale uses a 4-point Likert scale as follows: Always (4), Often (3), Sometimes
(2), and Never (1). The scale was answered by 10 respondents and the table below
represents the data.
Items                                  Frequency (# of respondents who chose option)
                                       Always   Often   Sometimes   Never
1. I help others as much as I can.       5        3         1         1
Using the formula above, let’s try to compute for the weighted mean of each
item.
For item 1, 5 people chose always which has a scale value of 4 so it would be
(5)(4), 3 people chose often which has a value of 3 so it would be (3)(3), 1 person
chose sometimes with a value of 2 so it would be (1)(2), and 1 person chose never with
a value of 1 so it would be (1)(1). For the f in the denominator, we just added how
many people chose each option so it would be 5+3+1+1 which is equal to 10. That is
correct since in our example, the scale was given to 10 respondents. Using the formula,
the weighted mean of item #1 is 3.2.
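The computation for item 1 can be written out in Python. This is a minimal sketch: the dictionary layout and the function name weighted_mean are our own choices, while the weights and frequencies are the ones given above.

```python
# Scale values (weights) of each response option.
weights = {"Always": 4, "Often": 3, "Sometimes": 2, "Never": 1}

# Number of respondents who chose each option for item 1.
item1_freqs = {"Always": 5, "Often": 3, "Sometimes": 1, "Never": 1}

def weighted_mean(frequencies, weights):
    """Return sum(f * X) / sum(f) over all response options."""
    total_fx = sum(frequencies[opt] * weights[opt] for opt in frequencies)
    n = sum(frequencies.values())
    return total_fx / n

item1_mean = weighted_mean(item1_freqs, weights)
print(item1_mean)  # (20 + 9 + 2 + 1) / 10 = 3.2
```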
Now, how are we going to interpret that? First, you have to make your ranges. Get
the highest possible value among your options; in our example, that is always, which
has a value of 4. Subtract the lowest value (1) from the highest value: 4 - 1 = 3. Then
divide the answer by the total number of options: 3/4 = 0.75. Now, starting from the
lowest value, which is 1 (Never), keep adding 0.75 until you reach the highest value.
We start with 1 + 0.75 = 1.75, so our first range is 1.00 – 1.75. Next, add 0.75 to
1.75 and you get 2.50. Our next range should start with 1.76, since our last one ended
with 1.75, so that would be 1.76 – 2.50. Look at the table below for the complete
ranges. Now that you have your ranges, you can assign verbal interpretations to each.
You can use your original options, such that 1.00 – 1.75 is never, 1.76 – 2.50 is
sometimes, 2.51 – 3.25 is often, and 3.26 – 4.00 is always. Or you can assign other
interpretations, such as 1.00 – 1.75 is very low, 1.76 – 2.50 is low, 2.51 – 3.25 is
high, and 3.26 – 4.00 is very high. In our example, we’ll just use the original options.
3.26 – 4.00 Always/Very High
2.51 – 3.25 Often/High
1.76 – 2.50 Sometimes/Low
1.00 – 1.75 Never/Very Low
Now that you have your ranges and their interpretations, you can now interpret
your weighted mean. For item #1, our weighted mean is 3.20, which according to our
table above, is often. This means that in general, the 10 respondents often help others
as much as they can or if you use the levels, that would be high; so, this could also
mean that your 10 respondents are highly likely to help others as much as they can.
You can also get the overall mean to get the overall interpretation. To do so,
just add the weighted means of the items and divide by the number of items. So, in
our example, that would be (3.20 + 3.00 + 3.30) / 3 = 3.17. Again, you can use the table
to interpret this: 3.17 falls under often or high, so you can say that your respondents
are often kind (remember: we are dealing with the overall mean, and our questionnaire
is a kindness scale) or have a high level of kindness.
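The range-building and interpretation steps can be sketched in Python. The function name interpret and its parameters are our own. The sketch assumes that weighted means are already rounded to 2 decimal places, so a value such as 1.76 falls in the second range.

```python
def interpret(value, labels, lowest=1, highest=4):
    """Map a weighted mean to a label; labels run from lowest option to highest."""
    if not lowest <= value <= highest:
        raise ValueError("value is outside the scale")
    width = (highest - lowest) / len(labels)   # (4 - 1) / 4 = 0.75
    for step, label in enumerate(labels, start=1):
        if value <= lowest + step * width:     # upper bounds 1.75, 2.50, 3.25, 4.00
            return label
    return labels[-1]  # guards against floating-point drift at the top of the scale

labels = ["Never", "Sometimes", "Often", "Always"]
item_means = [3.20, 3.00, 3.30]
overall = round(sum(item_means) / len(item_means), 2)

print(interpret(3.20, labels))     # Often
print(overall)                     # 3.17
print(interpret(overall, labels))  # Often
```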
Median
If the scores in a distribution are listed in order from smallest to largest, the
median is the midpoint of the list. It divides the distribution of scores into two equal
parts. More specifically, the median is the point on the measurement scale below which
50% of the scores in the distribution are located. There are no specific symbols depicting
the median but for easier identification, we will use Md.
1. RAW DATA
FORMULA (position of the median in the ordered list): (N + 1) / 2
ODD number of scores
EXAMPLE: 4,3,5,8,7
Arrange the scores from lowest to highest – 3, 4, 5, 7, 8
Compute the position of the middle score – (5 + 1) / 2 = 3, so the 3rd score
Count and identify – Median = 5
EVEN number of scores
EXAMPLE: 4,3,5,8,7,7
Arrange scores from lowest to highest – 3, 4, 5, 7, 7, 8
Compute the position of the middle score – (6 + 1) / 2 = 3.5, so between the 3rd and 4th scores
Count and identify – middle scores = 5 and 7
Add both scores and divide by 2 – (5 + 7) / 2 = 12 / 2 = 6
Median = 6
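The odd and even cases above can be sketched in Python. The function median_by_position is our own helper that follows the (N + 1) / 2 position rule, with statistics.median from the standard library as a cross-check. It covers only the odd and even cases, not the adjustment for duplicated scores near the median.

```python
from statistics import median

def median_by_position(scores):
    """Find the median using the (N + 1) / 2 position rule."""
    ordered = sorted(scores)
    position = (len(ordered) + 1) / 2       # 1-based position in the ordered list
    if position.is_integer():               # odd N: a single middle score
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]      # e.g. position 3.5 -> 3rd score
    upper = ordered[int(position)]          # ... and 4th score
    return (lower + upper) / 2

odd_median = median_by_position([4, 3, 5, 8, 7])
even_median = median_by_position([4, 3, 5, 8, 7, 7])
print(odd_median)   # 5
print(even_median)  # 6.0
```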
DUPLICATION of scores near the median
EXAMPLE: 3,4,5,7,8,7,7,8
Arrange scores from lowest to highest – 3, 4, 5, 7, 7, 7, 8, 8
Compute for middle score – (8 + 1) / 2 = 9 / 2 = 4.5 = 4th – 5th
Count and identify – duplicated score = 7
Divide the set into 2 equal halves and put a line between them.
o 3, 4, 5, 7 | 7, 7, 8, 8
Get the Upper Limit (UL, score + 0.5) or Lower Limit (LL, score - 0.5) of the duplicated
score: UL = 7.5, LL = 6.5
2. ORGANIZED DATA
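FORMULA: $Md = L + i\left(\frac{N/2 - fc}{fm}\right)$
L = REAL LOWER LIMIT of the median class
i = class width
fc = cumulative frequency of the class just below the median class
fm = frequency of the median class
MEDIAN CLASS = Compute N/2 and find the class interval (CI) whose cumulative
frequency (cf) is equal to or greater than N/2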
EXAMPLE:
Class Intervals f cf
15-17 4 40
12-14 6 36
9-11 8 30
*6-8 12 22
3-5 10 10
∑f = 40 (N)
* - median class
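The formula can be applied to the table above with a short Python sketch. We assume that fc is the cumulative frequency below the median class, fm is the frequency of the median class, and the real lower limit is the stated lower limit minus 0.5 (whole-number scores). The function name grouped_median is ours, and it expects the classes listed from lowest to highest, the reverse of the table.

```python
def grouped_median(intervals, frequencies):
    """Md = L + i * ((N/2 - fc) / fm); classes ordered lowest first."""
    half = sum(frequencies) / 2
    cumulative = 0                          # cf of the classes below the current one
    for (low, high), f in zip(intervals, frequencies):
        if cumulative + f >= half:          # first class whose cf reaches N/2
            real_lower = low - 0.5          # L
            width = high - low + 1          # i
            return real_lower + width * (half - cumulative) / f
        cumulative += f
    raise ValueError("empty frequency table")

intervals = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17)]
freqs = [10, 12, 8, 6, 4]

md = grouped_median(intervals, freqs)
print(md)  # 5.5 + 3 * (20 - 10) / 12 = 8.0
```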
Mode
The mode is the score or category that has the greatest frequency. The word
mode means “the customary fashion” or “a popular style.” There are no symbols or
special notation used to identify the mode or to differentiate between a sample mode
and a population mode. For easier understanding, we will use Mo to represent mode.
The mode is a useful measure of central tendency because it can be used to determine
the typical or most frequent value for any scale of measurement, including a nominal
scale.
The mode also can be useful because it is the only measure of central tendency
that corresponds to an actual score in the data; by definition, the mode is the most
frequently occurring score. The mean and the median, on the other hand, are both
calculated values and often produce an answer that does not equal any score in the
distribution.
In a frequency distribution graph, the greatest frequency will appear as the tallest
part of the figure. Although a distribution will have only one mean and only one median,
it is possible to have more than one mode. Specifically, it is possible to have two or
more scores that have the same highest frequency. In a frequency distribution graph, the
different modes will correspond to distinct, equally high peaks. A distribution with two
modes is said to be bimodal, and a distribution with more than two modes is called
multimodal. Occasionally, a distribution with several equally high points is said to have
no mode.
1. RAW DATA
Look at the set of scores and identify what the most frequent score is
Example: 5,6,6,6,7,3,2
Mode = 6
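Finding the mode of raw data can be sketched in Python by counting how often each score occurs. The second data set (two_modes) is our own made-up example, added only to show a bimodal case. statistics.multimode requires Python 3.8 or later.

```python
from collections import Counter
from statistics import multimode

scores = [5, 6, 6, 6, 7, 3, 2]

# Count each score, then keep every score that reaches the highest count.
counts = Counter(scores)
highest = max(counts.values())
modes = [score for score, count in counts.items() if count == highest]
print(modes)  # [6]

# A made-up bimodal set: 1 and 2 both occur twice.
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)  # [1, 2]
```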
2. ORGANIZED DATA
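FORMULA: $Mo = L + i\left(\frac{D1}{D1 + D2}\right)$
L = REAL LOWER LIMIT of the modal class
i = class width
D1 = frequency of the modal class minus the frequency of the class just below it
D2 = frequency of the modal class minus the frequency of the class just above it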
EXAMPLE:
Class Intervals f
15-17 4
12-14 6
9-11 8
*6-8 12
3-5 10
∑f = 40 (N)
* - Modal class
D1 = 12 – 10 = 2
D2 = 12 – 8 = 4
Mo = 5.5 + 3(2 / (2 + 4)) = 5.5 + 1.00 = 6.50
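The same computation can be sketched in Python. The function name grouped_mode is ours, and it expects the classes listed from lowest to highest. Treating a missing neighbor class as having a frequency of 0 is our own assumption for the case where the modal class is the first or last class; the lesson does not cover that case.

```python
def grouped_mode(intervals, frequencies):
    """Mo = L + i * (D1 / (D1 + D2)); classes ordered lowest first."""
    k = max(range(len(frequencies)), key=frequencies.__getitem__)  # modal class index
    low, high = intervals[k]
    below = frequencies[k - 1] if k > 0 else 0                     # assumed 0 if absent
    above = frequencies[k + 1] if k < len(frequencies) - 1 else 0  # assumed 0 if absent
    d1 = frequencies[k] - below
    d2 = frequencies[k] - above
    real_lower = low - 0.5          # L
    width = high - low + 1          # i
    return real_lower + width * d1 / (d1 + d2)

intervals = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17)]
freqs = [10, 12, 8, 6, 4]

mo = grouped_mode(intervals, freqs)
print(mo)  # 5.5 + 3 * 2 / (2 + 4) = 6.5
```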
You usually can compute two or even three measures of central tendency for the
same set of data. Although the three measures often produce similar results, there are
situations in which they are predictably different. Deciding which measure of central
tendency is best to use depends on several factors.
Whenever the scores are numerical values (interval or ratio scale) the mean is
usually the preferred measure of central tendency. Because the mean uses every score
in the distribution, it typically produces a good representative value. Remember that the
goal of central tendency is to find the single value that best represents the entire
distribution. Also, the mean has the added advantage of being closely related to variance
and standard deviation, the most common measures of variability. This relationship makes
the mean a valuable measure for purposes of inferential statistics. For these reasons,
and others, the mean generally is the best of the three measures of central tendency.
However, there are specific situations in which it is impossible to compute a mean or in
which the mean is not particularly representative. It is in these situations that the mode
and the median are used.
o Ordinal Scale: When scores are measured on an ordinal scale, the median
is always appropriate and is usually the preferred measure of central
tendency. The median is compatible with this type of measurement because
it is defined by direction: half of the scores are above the median and half
are below the median.
The treatment group showed fewer errors (M = 2.56) on the task than the control
group (M = 11.76).
When there are many means to report, tables with headings provide an organized
and more easily understood presentation. Refer to the table below for an example.
The median can be reported using the abbreviation Mdn, as in “Mdn = 8.5 errors,”
or it can simply be reported in narrative text, as follows:
The median number of errors for the treatment group was 8.5, compared to a median
of 13 for the control group.