Lesson 3: Measures of Central Tendency: Total Number of Scores Population N Sample N

The document discusses measures of central tendency, which are statistical methods used to describe the center or typical value of a data set. The most common measure is the mean, which is calculated by adding all values and dividing by the total number of data points. The mean represents an equal distribution of the total among all data points. Other measures discussed include the weighted mean, which accounts for different weights or scales in the data. Measures of central tendency allow large data sets to be described concisely using a single representative value.
Lesson 3: Measures of Central Tendency

The general purpose of descriptive statistical methods is to organize and
summarize a set of scores. Perhaps the most common method for summarizing and
describing a distribution is to find a single value that defines the average score and can
serve as a typical example to represent the entire distribution. In statistics, the concept
of an average or representative score is called central tendency. Central tendency is a
statistical measure to determine a single score that defines the center of a distribution.
The goal in measuring central tendency is to describe a distribution of scores by
determining a single value that identifies the center of the distribution. Ideally, this central
value will be the score that is the best representative value for all the individuals in the
distribution.

In everyday language, central tendency attempts to identify the “average” or
“typical” individual. This average value can then be used to provide a simple description
of an entire population or a sample. In addition to describing an entire distribution,
measures of central tendency are also useful for making comparisons between groups
of individuals or between sets of data.

For example, let us say the average income in Baguio City is PHP13,000 while in
Manila, it is PHP20,000. This example shows how we can describe a large set of data
with a single, representative number. Central tendency characterizes what is typical for a
large population and in doing so makes large amounts of data more digestible. Instead
of having thousands of incomes per month per individual in each city, you can use the
central tendency measure to compare both cities. Statisticians sometimes use the
expression “number crunching” to illustrate this aspect of data description. That is, we
take a distribution consisting of many scores and “crunch” them down to a single value
that describes them all.

Before we describe the measures of central tendency, let us familiarize ourselves
with some symbols used to describe values. Refer to the table below.

                          POPULATION    SAMPLE
TOTAL NUMBER OF SCORES    N             n

When describing the total number of individuals in the population, N is used. As
for the total number of individuals in the sample, n is appropriate.
Mean

The mean, also known as the arithmetic average, is computed by adding all the
scores in the distribution and dividing by the number of scores. The mean for a population
is identified by the Greek letter mu, μ (pronounced “mew”), and the mean for a sample
is identified by M (common in research) or X̄ (common in textbooks; read “x-bar”).

Do take note that as we move on, X will represent either the scores or
the midpoint, depending on what we are trying to solve or what frequency distribution
we have. Also, always round off to 2 decimal places.

1. RAW DATA
• Series of scores (e.g. 1, 2, 3, 4, 5)
• FORMULA: $\bar{X} = \frac{\sum X}{N}$
• EXAMPLE:
Get the mean of the scores of students on a quiz. The scores are
as follows:
5, 7, 9, 10, 3, 4, 5
$\bar{X} = \frac{5+7+9+10+3+4+5}{7} = \frac{43}{7} = 6.14$
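The raw-data computation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and are not part of the lesson.

```python
from statistics import mean

# Quiz scores from the example above
scores = [5, 7, 9, 10, 3, 4, 5]

sum_x = sum(scores)      # sum of all scores (∑X)
n = len(scores)          # number of scores
x_bar = sum_x / n        # mean = ∑X / N

print(sum_x, n, round(x_bar, 2))   # 43 7 6.14

# The standard library function gives the same value
assert abs(mean(scores) - x_bar) < 1e-9
```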
2. ORGANIZED DATA
• Grouped or ungrouped frequency table
• MIDPOINT METHOD
FORMULA: $\bar{X} = \frac{\sum fX}{N}$

EXAMPLE (Ungrouped Frequency table):

Scores on a Quiz (X)    f                fX (frequency * score)
9                       1                9
8                       2                16
6                       3                18
5                       4                20
3                       5                15
3                       7                21
2                       3                6
                        ∑f = 25 (n)      ∑fX = 105

$\bar{X} = \frac{105}{25} = 4.20$

EXAMPLE (Grouped Frequency table):

Class Intervals    f              X (midpoint of each class interval)    fX
15-17              4              16                                     64
12-14              6              13                                     78
9-11               8              10                                     80
6-8                12             7                                      84
3-5                10             4                                      40
                   ∑f = 40 (N)                                           ∑fX = 346

$\bar{X} = \frac{346}{40} = 8.65$
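The midpoint method can be sketched in Python as follows. The function name `mean_from_frequencies` is a hypothetical helper introduced here for illustration; the data are copied row by row from the two tables above.

```python
def mean_from_frequencies(values, freqs):
    """Return ∑fX / ∑f for paired values (scores or midpoints) and frequencies."""
    sum_f = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    return sum_fx / sum_f

# Ungrouped table: X and f exactly as listed in the example
quiz_x = [9, 8, 6, 5, 3, 3, 2]
quiz_f = [1, 2, 3, 4, 5, 7, 3]
ungrouped_mean = mean_from_frequencies(quiz_x, quiz_f)

# Grouped table: X is the midpoint of each class interval
intervals = [(15, 17), (12, 14), (9, 11), (6, 8), (3, 5)]
grouped_f = [4, 6, 8, 12, 10]
midpoints = [(low + high) / 2 for low, high in intervals]
grouped_mean = mean_from_frequencies(midpoints, grouped_f)

print(f"{ungrouped_mean:.2f}", f"{grouped_mean:.2f}")   # 4.20 8.65
```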

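• DEVIATION METHOD (ASSUMED MEAN)
Grouped frequency table
• FORMULA: $\bar{X} = X + i\left(\frac{\sum fd}{N}\right)$, where X is the midpoint of the class interval assigned d = 0 (the assumed mean) and i is the width of each class interval

EXAMPLE:
Class Intervals    f              d     fd
15-17              4              3     12
12-14              6              2     12
9-11               8              1     8
*6-8               12             0     0
3-5                10             -1    -10
                   ∑f = 40 (N)          ∑fd = 22

• * - class interval with the highest f
• d – put 0 on the class interval with the highest frequency, then count up (1, 2, 3) for the intervals above it and down (-1, -2) for the intervals below it

$\bar{X} = 7 + 3\left(\frac{22}{40}\right) = 8.65$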
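As a rough check of the deviation (assumed mean) method, the sketch below rebuilds the d column and applies the formula to the table above. It assumes the class intervals are listed from highest to lowest and have equal width; all variable names are illustrative.

```python
# Class intervals listed from highest to lowest, as in the table above
intervals = [(15, 17), (12, 14), (9, 11), (6, 8), (3, 5)]
freqs = [4, 6, 8, 12, 10]

n_total = sum(freqs)                              # N = 40
width = intervals[0][1] - intervals[0][0] + 1     # i = 3 (15, 16, 17)

# The class interval with the highest frequency gets d = 0
ref = freqs.index(max(freqs))
low, high = intervals[ref]
assumed_mean = (low + high) / 2                   # midpoint of 6-8 = 7

# Rows above the reference class count up, rows below count down
d = [ref - row for row in range(len(intervals))]  # [3, 2, 1, 0, -1]
sum_fd = sum(f * dev for f, dev in zip(freqs, d)) # 22

x_bar = assumed_mean + width * (sum_fd / n_total)
print(round(x_bar, 2))                            # 8.65
```

The result matches the midpoint method, which is expected because both methods use the same midpoints and frequencies.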
Characteristics of the Mean

1. The mean represents dividing the total score and distributing it equally among
the individuals. If the total money is PHP20 and there are 4 individuals, then it
can be said that everyone has PHP5.00 each. In this case, the mean is also
PHP5.00.
2. The mean serves as the balance point. It is the center of the distribution, so it
ALWAYS occurs between the highest and lowest score.
3. Changing any score in the distribution can affect the mean.
4. Adding or removing a score from the distribution will change the mean UNLESS
the new or removed score is equal to the mean.
• Example: If you have a set of scores, 1,2,3,4,5, the mean would be 3. If
you remove the score 3, you will only have 1,2,4,5 but the mean would
still be 3. If you add a score, 3, you will have 1,2,3,3,4,5 but the mean
would still be 3.
5. Adding a constant to every score, subtracting a constant from every score, or
multiplying or dividing every score by a constant changes the mean in the same way.
• Example: If you multiply all your scores, 1,2,3,4,5, by 3, then you will have
3,6,9,12,15. In the original set, the mean is 3, and if you check the new
set, the mean is 9 (the original mean multiplied by 3).
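Characteristics 4 and 5 can be verified directly. The sketch below reuses the scores 1,2,3,4,5 from the examples; the added constant of 10 is an arbitrary value chosen only for illustration.

```python
from statistics import mean

scores = [1, 2, 3, 4, 5]
assert mean(scores) == 3

# Characteristic 4: removing or adding a score equal to the mean leaves it unchanged
assert mean([1, 2, 4, 5]) == 3
assert mean([1, 2, 3, 3, 4, 5]) == 3

# Characteristic 5: multiplying every score by 3 multiplies the mean by 3
tripled = [x * 3 for x in scores]
assert mean(tripled) == 9

# Characteristic 5: adding a constant (10 here) to every score adds it to the mean
shifted = [x + 10 for x in scores]
assert mean(shifted) == 13
```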

Weighted Mean

The weighted mean is used when there are response scales with different weights.
For example, in an item in an attitude scale which states that “I am concerned when
others seem sad”, the scaled responses and their weights are: Strongly agree (4), agree
(3), disagree (2), and strongly disagree (1). The formula for computing the weighted mean
is:

$$\text{Weighted Mean} = \frac{f_1X_1 + f_2X_2 + f_3X_3 + \dots + f_kX_k}{f_1 + f_2 + f_3 + \dots + f_k} = \frac{\sum fX}{n}$$

In the equation above, f is the frequency, that is, how many
respondents chose that option, and X is the value of that option. In our example
above, strongly agree is 4, agree is 3, and so on. So, for example, let’s say that
for item #1, “I am concerned when others seem sad,” 5 respondents chose “strongly
agree”; f1X1 would then be (5)(4) = 20. The formula ends with fkXk because the number
of terms depends on how many options you have. In our example, we have 4, but there
could be up to 10. This formula is applied to each item in a questionnaire or survey.

Let’s try to look at another example. The following items are from a kindness
scale. The scale uses a 4-point Likert scale as follows: Always (4), Often (3), Sometimes
(2), and Never (1). The scale was answered by 10 respondents and the table below
represents the data.
Items                                              Frequency (# of respondents who chose option)
                                                   Always    Often    Sometimes    Never
1. I help others as much as I can.                 5         3        1            1
2. I do not get angry at others for hurting me.    3         2        5            1
3. I encourage others who are down.                6         2        1            1

Using the formula above, let’s try to compute for the weighted mean of each
item.

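Item 1: $\text{Weighted Mean} = \frac{(5)(4) + (3)(3) + (1)(2) + (1)(1)}{5+3+1+1} = \frac{20 + 9 + 2 + 1}{10} = \frac{32}{10} = 3.2$

Item 2: $\text{Weighted Mean} = \frac{(3)(4) + (2)(3) + (5)(2) + (1)(1)}{3+2+5+1} = \frac{12 + 6 + 10 + 2}{10} = \frac{30}{10} = 3$

Item 3: $\text{Weighted Mean} = \frac{(6)(4) + (2)(3) + (1)(2) + (1)(1)}{6+2+1+1} = \frac{24 + 6 + 2 + 1}{10} = \frac{33}{10} = 3.3$

Note: as listed in the table, the frequencies for item 2 add up to 11 rather than 10, and
(1)(1) equals 1 rather than 2. The result of 3 follows the original computation, so the
item 2 row should be checked against the actual responses.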

For item 1, 5 people chose always which has a scale value of 4 so it would be
(5)(4), 3 people chose often which has a value of 3 so it would be (3)(3), 1 person
chose sometimes with a value of 2 so it would be (1)(2), and 1 person chose never with
a value of 1 so it would be (1)(1). For the f in the denominator, we just added how
many people chose each option so it would be 5+3+1+1 which is equal to 10. That is
correct since in our example, the scale was given to 10 respondents. Using the formula,
the weighted mean of item #1 is 3.2.
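The per-item computation can be sketched in Python as shown below. The helper name `weighted_mean` is hypothetical, and only items 1 and 3 are used because their frequencies add up to the 10 respondents stated in the example.

```python
def weighted_mean(freqs, values):
    """Return ∑fX / ∑f for one questionnaire item."""
    return sum(f * x for f, x in zip(freqs, values)) / sum(freqs)

# Scale values in the same order as the frequency columns
values = [4, 3, 2, 1]            # Always, Often, Sometimes, Never

item_1 = weighted_mean([5, 3, 1, 1], values)
item_3 = weighted_mean([6, 2, 1, 1], values)
print(item_1, item_3)            # 3.2 3.3
```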

Now, how are we going to interpret that? First, you have to make your ranges. In
doing so get the highest possible value in your options so in our example, we have
always which has a value of 4. Subtract 1 from the highest value so that would be 4-1
is equal to 3. Then, divide the answer by the total number of options so that would be
¾ = 0.75. Now starting from the lowest value which is 1 (Never), add 0.75 till you reach
the highest value. So, we start with 1 + 0.75 which is 1.75 therefore our first range is
1.00 – 1.75. Next, add 0.75 to 1.75 and you get 2.50. Our next range should start with
1.76 since our last ended with 1.75 so that would be 1.76-2.50. Look at the table below
for the complete ranges. Now that you have your ranges, you can assign verbal
interpretations to each. You can use your original options such that 1.00 – 1.75 would
be never, 1.76-2.50 is sometimes, 2.51-3.25 is often, and 3.26-4.00 is always. Or you can
also assign other interpretations such as 1.00-1.75 very low, 1.76-2.50 is low, 2.51-3.25
is high, and 3.26-4.00 is very high. In our example, we’ll just use the original options.
3.26 – 4.00 Always/Very High
2.51 – 3.25 Often/High
1.76 – 2.50 Sometimes/Low
1.00 – 1.75 Never/Very Low

Now that you have your ranges and their interpretations, you can now interpret
your weighted mean. For item #1, our weighted mean is 3.20, which according to our
table above, is often. This means that in general, the 10 respondents often help others
as much as they can or if you use the levels, that would be high; so, this could also
mean that your 10 respondents are highly likely to help others as much as they can.

You can also get the overall mean to get the overall interpretation. In doing so,
just add the weighted means for each item and divide by the number of items. So, in
our example that would be (3.20 + 3.00 + 3.30) / 3 = 3.17. Again, you can use the table
to interpret this. 3.17 is equal to often or high so you can say that your respondents
are often kind (remember: we are dealing with the overall mean, and our questionnaire is
a kindness scale) or have a high level of kindness.
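The range-building and interpretation steps can be sketched as follows. This assumes a 4-point scale running from 1 to 4 and weighted means rounded to 2 decimal places, as in the lesson; the function name `interpret` is illustrative only.

```python
labels = ["Never", "Sometimes", "Often", "Always"]   # lowest to highest option
lowest, highest = 1, 4

# (highest value - 1) divided by the number of options: (4 - 1) / 4 = 0.75
step = (highest - lowest) / len(labels)
upper_bounds = [lowest + step * (k + 1) for k in range(len(labels))]
# upper_bounds is [1.75, 2.5, 3.25, 4.0]

def interpret(weighted_mean):
    """Return the verbal interpretation of a weighted mean rounded to 2 decimals."""
    value = round(weighted_mean, 2)
    for bound, label in zip(upper_bounds, labels):
        if value <= bound:
            return label
    raise ValueError("weighted mean is above the highest scale value")

item_means = [3.20, 3.00, 3.30]                      # weighted means of items 1 to 3
overall = sum(item_means) / len(item_means)

print(interpret(3.20))                               # Often
print(round(overall, 2), interpret(overall))         # 3.17 Often
```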

Median

If the scores in a distribution are listed in order from smallest to largest, the
median is the midpoint of the list. It divides the distribution of scores into two equal
parts. More specifically, the median is the point on the measurement scale below which
50% of the scores in the distribution are located. There are no specific symbols depicting
the median but for easier identification, we will use Md.

1. RAW DATA
• FORMULA (position of the middle score): (N + 1) / 2
• ODD number of scores
EXAMPLE: 4, 3, 5, 8, 7
Arrange the scores from lowest to highest – 3, 4, 5, 7, 8
Compute for the middle score – (5 + 1) / 2 = 3, so the median is the 3rd score
Count and identify – Median = 5
• EVEN number of scores
EXAMPLE: 4, 3, 5, 8, 7, 7
Arrange the scores from lowest to highest – 3, 4, 5, 7, 7, 8
Compute for the middle score – (6 + 1) / 2 = 3.5, so the median lies between the 3rd and 4th scores
Count and identify – middle scores = 5 and 7
Add both scores and divide by 2 – (5 + 7) / 2 = 12 / 2 = 6
Median = 6
• DUPLICATION of scores near the median
EXAMPLE: 3, 4, 5, 7, 8, 7, 7, 8
Arrange the scores from lowest to highest – 3, 4, 5, 7, 7, 7, 8, 8
Compute for the middle score – (8 + 1) / 2 = 9 / 2 = 4.5, so the median lies between the 4th and 5th scores
Count and identify – duplicated score = 7
Divide the set into 2 equal halves and put a line between them.
o 3, 4, 5, 7 | 7, 7, 8, 8
Get the Upper Limit (UL) (+ 0.5) or Lower Limit (LL) (- 0.5) of the duplicated
score: UL = 7.5, LL = 6.5
Compute for the median:
o Count how many duplicated scores are below and above the
line, and the total number of duplicated scores
a. Below – 1
b. Above – 2
c. Total – 3
o If using the UL, the formula is: UL – (no. of duplicated scores above the line) / (total number of duplicated scores)
o Example: 7.5 – (2 / 3) = 6.83
o If using the LL, the formula is: LL + (no. of duplicated scores below the line) / (total number of duplicated scores)
o Example: 6.5 + (1 / 3) = 6.83
o Median = 6.83
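The three raw-data cases can be sketched in Python. The odd and even cases use the standard library; `median_with_duplicates` is a hypothetical helper that follows the LL formula above and assumes an even number of whole-number scores with the duplicated score sitting on both sides of the dividing line.

```python
from statistics import median

# Odd and even cases: median() sorts the scores and averages the middle pair if needed
print(median([4, 3, 5, 8, 7]))        # 5
print(median([4, 3, 5, 8, 7, 7]))     # 6.0

def median_with_duplicates(scores):
    """Median = LL + (duplicates below the line) / (total duplicates)."""
    ordered = sorted(scores)
    half = len(ordered) // 2              # the line falls after this many scores
    dup = ordered[half]                   # the duplicated score at the line
    below = ordered[:half].count(dup)     # duplicated scores below the line
    total = ordered.count(dup)            # total number of duplicated scores
    lower_limit = dup - 0.5               # LL of the duplicated score
    return lower_limit + below / total

result = median_with_duplicates([3, 4, 5, 7, 8, 7, 7, 8])
print(round(result, 2))                   # 6.83
```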

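2. ORGANIZED DATA
• FORMULA: $Md = L + i\left(\frac{N/2 - fc}{fm}\right)$
L = REAL LOWER LIMIT of the median class
i = width of the class interval
fc = cumulative frequency of the class interval below the median class
fm = frequency of the median class
MEDIAN CLASS = Compute N/2 and find the lowest class interval whose cf
is equal to or greater than N/2
EXAMPLE:
Class Intervals    f              cf
15-17              4              40
12-14              6              36
9-11               8              30
*6-8               12             22
3-5                10             10
                   ∑f = 40 (N)

* - median class (N/2 = 20, and 22 is the first cf equal to or greater than 20)

$Md = 5.5 + 3\left(\frac{20 - 10}{12}\right) = 8.00$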
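A minimal sketch of the grouped-data median follows. It assumes whole-number class limits of equal width and lists the intervals from lowest to highest so that the cumulative frequency can be built up in one pass; `grouped_median` is an illustrative name.

```python
def grouped_median(intervals, freqs):
    """intervals: (low, high) class limits, listed from lowest to highest."""
    half = sum(freqs) / 2                     # N / 2
    cum_below = 0                             # fc: cumulative frequency below the class
    for (low, high), f in zip(intervals, freqs):
        if cum_below + f >= half:             # first class whose cf reaches N / 2
            real_lower = low - 0.5            # L: real lower limit
            width = high - low + 1            # i: class width
            return real_lower + width * (half - cum_below) / f
        cum_below += f
    raise ValueError("empty frequency table")

intervals = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17)]
freqs = [10, 12, 8, 6, 4]
md = grouped_median(intervals, freqs)
print(md)                                     # 8.0
```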
Mode

The mode is the score or category that has the greatest frequency. The word
mode means “the customary fashion” or “a popular style.” There are no symbols or
special notation used to identify the mode or to differentiate between a sample mode
and a population mode. For easier understanding, we will use Mo to represent mode.
The mode is a useful measure of central tendency because it can be used to determine
the typical or most frequent value for any scale of measurement, including a nominal
scale.

The mode also can be useful because it is the only measure of central tendency
that corresponds to an actual score in the data; by definition, the mode is the most
frequently occurring score. The mean and the median, on the other hand, are both
calculated values and often produce an answer that does not equal any score in the
distribution.

In a frequency distribution graph, the greatest frequency will appear as the tallest
part of the figure. Although a distribution will have only one mean and only one median,
it is possible to have more than one mode. Specifically, it is possible to have two or
more scores that have the same highest frequency. In a frequency distribution graph, the
different modes will correspond to distinct, equally high peaks. A distribution with two
modes is said to be bimodal, and a distribution with more than two modes is called
multimodal. Occasionally, a distribution with several equally high points is said to have
no mode.

1. RAW DATA
• Look at the set of scores and identify what the most frequent score is
• Example: 5, 6, 6, 6, 7, 3, 2
• Mode = 6
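Counting frequencies is easy to sketch in Python. The first list is the example above; the second list is a made-up bimodal set added only to show what happens when two scores share the highest frequency.

```python
from collections import Counter
from statistics import multimode

scores = [5, 6, 6, 6, 7, 3, 2]

counts = Counter(scores)
print(counts.most_common(1))          # [(6, 3)] : score 6 occurs 3 times
print(multimode(scores))              # [6]

# A made-up bimodal set: 1 and 3 both occur twice
bimodal = [1, 1, 2, 3, 3]
print(multimode(bimodal))             # [1, 3]
```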

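2. ORGANIZED DATA
• FORMULA: $Mo = L + i\left(\frac{D_1}{D_1 + D_2}\right)$
L = REAL LOWER LIMIT of the modal class
i = width of the class interval
D1 = frequency of the modal class minus the frequency of the class interval below it
D2 = frequency of the modal class minus the frequency of the class interval above it

EXAMPLE:
Class Intervals    f
15-17              4
12-14              6
9-11               8
*6-8               12
3-5                10
                   ∑f = 40 (N)
• * - Modal class
• D1 = 12 – 10 = 2
• D2 = 12 – 8 = 4
• $Mo = 5.5 + 3\left(\frac{2}{2 + 4}\right) = 6.50$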
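The grouped-data mode can be sketched the same way. This version assumes a single modal class that is not the lowest or highest interval, whole-number class limits, and intervals listed from lowest to highest; `grouped_mode` is a hypothetical helper name.

```python
def grouped_mode(intervals, freqs):
    """intervals: (low, high) class limits, listed from lowest to highest."""
    k = freqs.index(max(freqs))               # position of the modal class
    if k == 0 or k == len(freqs) - 1:
        raise ValueError("modal class has no neighbor on one side")
    low, high = intervals[k]
    real_lower = low - 0.5                    # L: real lower limit
    width = high - low + 1                    # i: class width
    d1 = freqs[k] - freqs[k - 1]              # modal f minus f of the class below
    d2 = freqs[k] - freqs[k + 1]              # modal f minus f of the class above
    return real_lower + width * d1 / (d1 + d2)

intervals = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17)]
freqs = [10, 12, 8, 6, 4]
mo = grouped_mode(intervals, freqs)
print(mo)                                     # 6.5
```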

Selecting a Measure of Central Tendency

You usually can compute two or even three measures of central tendency for the
same set of data. Although the three measures often produce similar results, there are
situations in which they are predictably different. Deciding which measure of central
tendency is best to use depends on several factors.

Whenever the scores are numerical values (interval or ratio scale) the mean is
usually the preferred measure of central tendency. Because the mean uses every score
in the distribution, it typically produces a good representative value. Remember that the
goal of central tendency is to find the single value that best represents the entire
distribution. Also, the mean has the added advantage of being closely related to variance
and standard deviation, the most common measures of variability. This relationship makes
the mean a valuable measure for purposes of inferential statistics. For these reasons,
and others, the mean generally is the best of the three measures of central tendency.
However, there are specific situations in which it is impossible to compute a mean or in
which the mean is not particularly representative. It is in these situations that the mode
and the median are used.

• When to Use the Median
o Extreme scores: When a distribution has a few extreme scores, scores that
are very different in value from most of the others, then the mean may
not be a good representative of most of the distribution. The problem
comes from the fact that one or two extreme values can have a large
influence and cause the mean to be displaced. In this situation, the fact
that the mean uses all the scores equally can be a disadvantage. The
median, on the other hand, is not easily affected by extreme scores.
o Undetermined Values: When one or more scores are unknown, it is impossible
to compute the mean because ∑X cannot be found. The median can still be
obtained because the unknown value is still counted when the scores are
divided into halves. The median for the sample below is 12.5 including the
undetermined or unknown value.
o Open-ended distributions: A distribution is said to be open-ended when
there is no upper limit (or lower limit) for one of the categories. The table
below provides an example of an open-ended distribution, showing the
number of pizzas eaten during a 1-month period for a sample of n = 20
high school students. The top category in this distribution shows that three
of the students consumed “5 or more” pizzas. This is an open-ended
category. Notice that it is impossible to compute a mean for these data
because you cannot find ∑X (the total number of pizzas for all 20 students).
However, you can find the median. Listing the 20 scores in order produces
X = 1 and X = 2 as the middle two scores. For these data, the median is
1.5.

o Ordinal Scale: When scores are measured on an ordinal scale, the median
is always appropriate and is usually the preferred measure of central
tendency. The median is compatible with this type of measurement because
it is defined by direction: half of the scores are above the median and half
are below the median.
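The effect of extreme scores is easy to see with a small made-up data set (the numbers below are not from the lesson). Replacing one score with an extreme value moves the mean a long way while the median stays where it was.

```python
from statistics import mean, median

typical = [10, 11, 12, 13, 14]
with_extreme = [10, 11, 12, 13, 100]   # the score 14 replaced by an extreme score

mean_typical = mean(typical)
mean_extreme = mean(with_extreme)

print(f"{mean_typical:.2f}", median(typical))        # 12.00 12
print(f"{mean_extreme:.2f}", median(with_extreme))   # 29.20 12
```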

• When to Use the Mode
o Nominal Scales: The primary advantage of the mode is that it can be used
to measure and describe central tendency for data that are measured on
a nominal scale. Recall that the categories that make up a nominal scale
are differentiated only by name, such as classifying people by occupation
or college major. Because nominal scales do not measure quantity (distance
or direction), it is impossible to compute a mean or a median for data
from a nominal scale. Therefore, the mode is the only option for describing
central tendency for nominal data.
o Discrete Variables: Recall that discrete variables are those that exist only
in whole, indivisible categories. Often, discrete variables are numerical
values, such as the number of children in a family or the number of rooms
in a house. When these variables produce numerical scores, it is possible
to calculate means. However, the calculated means are usually fractional
values that cannot exist. For example, computing means will generate results
such as “the average family has 2.4 children and a house with 5.33 rooms.”
The mode, on the other hand, always identifies an actual score (the most
typical case) and, therefore, it produces more sensible measures of central
tendency. Using the mode, our conclusion would be “the typical, or modal,
family has 2 children and a house with 5 rooms.” In many situations,
especially with discrete variables, people are more comfortable using the
realistic, whole-number values produced by the mode.
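The contrast between a fractional mean and a whole-number mode can be illustrated with invented data. The list below is a hypothetical sample of 10 families chosen so that the mean comes out to 2.4 children.

```python
from statistics import mean, mode

# Hypothetical number of children in each of 10 families
children = [2, 2, 3, 1, 2, 4, 2, 3, 2, 3]

avg_children = mean(children)
modal_children = mode(children)

print(f"{avg_children:.1f}")   # 2.4 (no family can have this many children)
print(modal_children)          # 2  (the most typical case)
```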

Reporting Measures of Central Tendency

Measures of central tendency are commonly used in the behavioral sciences to
summarize and describe the results of a research study. For example, a researcher may
report the sample means from two different treatments or the median score for a large
sample. These values may be reported in text describing the results or presented in
tables or in graphs.

In reporting results, many behavioral science journals use guidelines adopted by
the American Psychological Association (APA), as outlined in the Publication Manual of
the American Psychological Association (2010). We will refer to the APA manual from
time to time in describing how data and research results are reported in the scientific
literature. The APA style uses the letter M as the symbol for the sample mean. Thus, a
study might state:

The treatment group showed fewer errors (M = 2.56) on the task than the control
group (M = 11.76).

When there are many means to report, tables with headings provide an organized
and more easily understood presentation. Refer to the table below for an example.

The median can be reported using the abbreviation Mdn, as in “Mdn = 8.5 errors,”
or it can simply be reported in narrative text, as follows:

The median number of errors for the treatment group was 8.5, compared to a median
of 13 for the control group.

There is no special symbol or convention for reporting the mode. If mentioned at
all, the mode is usually just reported in narrative text.
Using SPSS to Find the Mean, Median, and Mode
