Module 06 Skewness and The Mean, Median and Mode
Figure 2.11
The histogram displays a symmetrical distribution of data.
A distribution is symmetrical if a vertical line can be drawn at
some point in the histogram such that the shape to the left and
the right of the vertical line are mirror images of each other.
The mean, the median, and the mode are each seven for these
data. In a perfectly symmetrical distribution, the mean and
the median are the same. This example has one mode
(unimodal), and the mode is the same as the mean and median.
In a symmetrical distribution that has two modes (bimodal), the
two modes would be different from the mean and median.
The histogram for the data: 4; 5; 6; 6; 6; 7; 7; 7; 7; 8 shown
in Figure 2.12 is not symmetrical. The right-hand side seems
"chopped off" compared to the left side. A distribution of this
type is called skewed to the left because it is pulled out to the
left. We can formally measure the skewness of a distribution
just as we can mathematically measure the center weight of
the data or its general "spreadness". The mathematical formula
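for skewness is:
$$a_3 = \frac{\sum (x_i - \bar{x})^3}{n s^3}$$
where $\bar{x}$ is the mean of the data, $s$ is the sample
standard deviation, and $n$ is the sample size. The further
$a_3$ is from zero, the greater the degree of skewness. If the skewness is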
negative then the distribution is skewed left as in Figure 2.12. A
positive measure of skewness indicates right skewness such
as Figure 2.13.
Figure 2.12
The mean is 6.3, the median is 6.5, and the mode is
seven. Notice that the mean is less than the median, and they
are both less than the mode. The mean and the median both
reflect the skewing, but the mean reflects it more so.
The histogram for the data: 6; 7; 7; 7; 7; 8; 8; 8; 9; 10 shown
in Figure 2.13, is also not symmetrical. It is skewed to the right.
Figure 2.13
The mean is 7.7, the median is 7.5, and the mode is seven. Of
the three statistics, the mean is the largest, while the mode is
the smallest. Again, the mean reflects the skewing the most.
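The three statistics quoted for the two skewed data sets can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module; the dictionary keys and variable names are chosen here for illustration and are not part of the original text.

```python
import statistics

# The two data sets listed in the text for the skewed histograms.
datasets = {
    "skewed_left": [4, 5, 6, 6, 6, 7, 7, 7, 7, 8],
    "skewed_right": [6, 7, 7, 7, 7, 8, 8, 8, 9, 10],
}

summary = {}
for label, data in datasets.items():
    # Store (mean, median, mode) for each data set.
    summary[label] = (
        statistics.mean(data),
        statistics.median(data),
        statistics.mode(data),
    )
    mean, median, mode = summary[label]
    print(f"{label}: mean={mean}, median={median}, mode={mode}")
```

For the left-skewed set the ordering is mean < median < mode, and for the right-skewed set it is reversed, matching the values stated in the text.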
The mean is affected by outliers that do not influence the
median. Therefore, when the distribution of data is skewed to
the left, the mean is often less than the median. When the
distribution is skewed to the right, the mean is often greater
than the median. In symmetric distributions, we expect the
mean and median to be approximately equal in value. This is an
important connection between the shape of the distribution
and the relationship of the mean and median. It is not,
however, true for every data set. The most common exceptions
occur in sets of discrete data.
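To see how a discrete data set can break the usual rule, consider the sketch below. The data set is hypothetical, invented only for this illustration. It relies on the fact that the sign of the skewness equals the sign of the sum of cubed deviations from the mean, because the denominator of the skewness formula is positive.

```python
import statistics

# Hypothetical discrete data set, invented for this illustration only.
data = [2, 2, 2, 2, 3, 3, 3, 3, 3, 6]

mean = statistics.fmean(data)
median = statistics.median(data)

# The sign of this sum is the sign of the skewness measure.
cubed_sum = sum((x - mean) ** 3 for x in data)

print(f"mean={mean:.2f}, median={median}, cubed deviations sum={cubed_sum:.2f}")
print("skewed right by the formula:", cubed_sum > 0)
print("mean less than median:", mean < median)
```

Here the single large value of 6 makes the skewness positive (skewed right), yet the cluster of 2s keeps the mean slightly below the median.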
As with the mean, median, and mode, and, as we will see
shortly, the variance, there are mathematical formulas that give
us precise measures of these characteristics of the distribution
of the data. Looking again at the formula for skewness, we see
that it relates the individual observations to the mean of the
data through their cubed deviations:
$$a_3 = \frac{\sum (x_i - \bar{x})^3}{n s^3}$$
Because cubing preserves sign, observations far below the mean
pull $a_3$ negative and observations far above the mean pull it
positive.
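The skewness formula can be applied directly to the two skewed data sets from the text. The sketch below is one possible reading of the formula. It assumes that s is the sample standard deviation with an n − 1 denominator (what `statistics.stdev` computes); other conventions exist and give slightly different values. The function name `skewness` is chosen here for illustration.

```python
import statistics

def skewness(data):
    """Sum of cubed deviations from the mean, divided by n * s**3."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # assumption: sample std dev (n - 1 denominator)
    return sum((x - mean) ** 3 for x in data) / (n * s ** 3)

left_skewed = [4, 5, 6, 6, 6, 7, 7, 7, 7, 8]
right_skewed = [6, 7, 7, 7, 7, 8, 8, 8, 9, 10]

a3_left = skewness(left_skewed)
a3_right = skewness(right_skewed)

print(f"left-skewed data:  a3 = {a3_left:+.3f}")
print(f"right-skewed data: a3 = {a3_right:+.3f}")
```

Under this assumption the two values come out at roughly −0.52 and +0.52. They are equal in magnitude because the two data sets are mirror images of each other.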
1. Use the following information to answer the next three exercises: State whether
the data are symmetrical, skewed to the left, or skewed to the right.
1; 1; 1; 2; 2; 2; 2; 3; 3; 3; 3; 3; 3; 3; 3; 4; 4; 4; 5; 5
Answer:
The data are symmetrical. The median is 3 and the mean is 2.85. They are close,
and the mode lies close to the middle of the data, so the data are symmetrical.
2.
When the data are skewed left, what is the typical relationship between the mean
and median?
Answer:
When the data are skewed left, the mean is typically less than the median.
3.
When the data are symmetrical, what is the typical relationship between the
mean and median?
Answer:
When the data are symmetrical, the mean and median are close or the same.
5. Describe the relationship between the mean and the median of this distribution.
Figure 5
Answer:
The mean is 4.1 and is slightly greater than the median, which is four.
6. Describe the relationship between the mode and the median of this distribution.
Figure 6
Answer:
The mode and the median are the same. In this case, they are both five.
8. Describe the relationship between the mean and the median of this distribution.
Figure 8
Answer:
The mean and the median are both six.
9. Which is the greatest, the mean, the mode, or the median of the data set?
11; 11; 12; 12; 12; 12; 13; 15; 17; 22; 22; 22
Answer:
The mode is 12, the median is 12.5, and the mean is 15.1. The mean is the largest.
10. Which is the least, the mean, the mode, or the median of the data set?
56; 56; 56; 58; 59; 60; 62; 64; 64; 65; 67
Of the three measures, which tends to reflect skewing the most, the mean, the mode, or
the median? Why?
Answer:
The mean tends to reflect skewing the most because it is affected the most by outliers.
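The comparisons asked for in exercises 9 and 10 can be checked numerically. The following is a minimal sketch using the standard `statistics` module; the helper name `measures` and the variable names are illustrative choices, not part of the exercises.

```python
import statistics

def measures(data):
    """Return the mean, median, and mode of a data set, keyed by name."""
    return {
        "mean": statistics.fmean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
    }

exercise_9 = [11, 11, 12, 12, 12, 12, 13, 15, 17, 22, 22, 22]
exercise_10 = [56, 56, 56, 58, 59, 60, 62, 64, 64, 65, 67]

m9 = measures(exercise_9)
m10 = measures(exercise_10)

# Name of the measure with the largest (or smallest) value.
greatest_9 = max(m9, key=m9.get)
least_10 = min(m10, key=m10.get)

print("Exercise 9:", m9, "-> greatest:", greatest_9)
print("Exercise 10:", m10, "-> least:", least_10)
```

For exercise 9 this reports the mean as the greatest of the three, in agreement with the answer given. For exercise 10 it reports the mode as the least.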