Unit 4 Descriptive Statistics

This document provides an introduction and overview of measures of dispersion used in descriptive statistics. It defines key terms like range, quartiles, mean deviation, variance, and standard deviation. It explains that measures of dispersion describe how spread out or varied the data is around the central value. The document also briefly discusses the normal curve and numerical measures of shape like skewness and kurtosis. The objectives are to explain the basic concepts and formulas for calculating various measures of dispersion.

Uploaded by

HafizAhmad

UNIT-4

DESCRIPTIVE STATISTICS: MEASURES OF DISPERSION

Written By:
Miss Sumbal Asghar

Reviewed By:
Dr. Rizwan Akram Rana
Introduction
Measures of central tendency estimate the typical or central value of a dataset, while
measures of dispersion describe the spread of the data, or its variation
around a central value. Two distinct samples may have the same mean or median but
completely different levels of variability, and vice versa. A proper description of a set of
data should include both of these characteristics. There are various methods that can be
used to measure the dispersion of a dataset. In this unit you will study range, quartiles,
quartile deviation, mean or average deviation, standard deviation and variance. Two
measures of shape and a discussion of the coefficient of variation are also included in this
unit.

Objectives
After reading this unit, you will be able to:
1. tell the basic purpose of measure of central tendency.
2. define Range.
3. determine range of a given data.
4. write down the formulas for determining quartiles.
5. define mean or average deviation.
6. determine variance and standard deviation.
7. define normal curve.
8. explain skewness and kurtosis.

4.1 Introduction to Measures of Dispersion


Measures of central tendency focus on what is average or in the middle of the
distribution of scores. Often the information provided by these measures does not give us
a clear picture of the data, and we need something more. Knowing the mean,
median, and mode of a distribution does not allow us to differentiate between two or more
distributions; we need additional information about each distribution. This
additional information is provided by a series of measures which are commonly known as
measures of dispersion.

There is dispersion when there is dissimilarity among the data values. The greater the
dissimilarity, the greater the degree of dispersion will be.

Measures of dispersion are needed for four basic purposes.


i) To determine the reliability of an average.
ii) To serve as a basis for the control of the variability.
iii) To compare two or more series with regard to their variability.
iv) To facilitate the use of other statistical measures.

Measures of dispersion enable us to compare two or more series with regard to their
variability. Dispersion is also regarded as a means of determining uniformity or consistency. A high
degree of variation means little consistency or uniformity, whereas a low degree of variation
means greater uniformity or consistency within the data set. Commonly used
measures of dispersion are range, quartile deviation, mean deviation, variance, and
standard deviation.

4.1.1 Range
The range is the simplest measure of spread and is the difference between the highest and
lowest scores in a data set. In other words we can say that range is the distance between
largest score and the smallest score in the distribution. We can calculate range as:
Range = Highest value of the data – Lowest value of the data

For example, if lowest and highest marks scored in a test are 22 and 95 respectively, then
Range = 95 – 22 = 73

The range is the easiest measure of dispersion to compute, and is useful when you wish to evaluate
the whole of a dataset. But it is not considered a good measure of dispersion, as it uses only
the two extreme values and ignores the rest of the information about the spread. Outliers, whether extremely low or
extremely high values, can considerably affect the range.
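
As a small illustration of the range formula, the following Python sketch computes the range of a list of marks. The marks are hypothetical; only the lowest and highest values (22 and 95) are taken from the example above. The second call adds one extreme low score to show how strongly a single outlier can affect the range.

```python
def value_range(scores):
    """Range = highest value of the data - lowest value of the data."""
    return max(scores) - min(scores)


# Hypothetical marks; only the extremes (22 and 95) come from the example in the text.
marks = [22, 47, 58, 61, 70, 95]
print(value_range(marks))  # 73

# One extreme low score (an outlier) widens the range considerably.
marks_with_outlier = marks + [2]
print(value_range(marks_with_outlier))  # 93
```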

4.1.2 Quartiles
The values that divide a given set of data into four equal parts are called quartiles, and are
denoted by Q1, Q2, and Q3. Q1 is called the lower quartile and Q3 is called the upper
quartile. 25% of scores are less than Q1 and 75% of scores are less than Q3. Q2 is the median.
With N scores arranged in order, the formulas for the quartiles are:
$$Q_1 = \frac{N + 1}{4}\text{th score}$$
$$Q_2 = \frac{2(N + 1)}{4}\text{th} = \frac{N + 1}{2}\text{th score}$$
$$Q_3 = \frac{3(N + 1)}{4}\text{th score}$$

4.1.3 Quartile Deviation (QD)


Quartile deviation, or semi-interquartile range, is one half of the difference between the first
and the third quartile, i.e.
$$QD = \frac{Q_3 - Q_1}{2}$$
where Q1 = the first quartile (lower quartile)
Q3 = the third quartile (upper quartile)

Calculating quartile deviation from ungrouped data:

In order to calculate quartile deviation from ungrouped data, the following steps are used.
i) Arrange the test scores from highest to lowest.
ii) Assign a serial number to each score. The first serial number is assigned to the
lowest score.
iii) Determine the first quartile (Q1) by using the formula $Q_1 = \frac{N + 1}{4}$. Use the obtained value to
locate the serial number of the score that falls under Q1.
iv) Determine the third quartile (Q3) by using the formula $Q_3 = \frac{3(N + 1)}{4}$. Locate the serial number
corresponding to the obtained answer. Opposite to this number is the test score
corresponding to Q3.
v) Subtract Q1 from Q3, and divide the difference by 2.
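
The steps above can be sketched in Python as follows. The scores are hypothetical, and the helper names are introduced only for this illustration. Sorting in ascending order gives the lowest score serial number 1, as in step ii. The quartile positions are taken as (N + 1)/4 and 3(N + 1)/4. When a position is not a whole number, the sketch interpolates linearly between the two neighbouring scores; that choice is an assumption, since the unit does not say how fractional positions should be handled.

```python
def value_at(ordered, position):
    """Score at a 1-based position in an ascending list.

    Fractional positions are resolved by linear interpolation between the
    two neighbouring scores (an assumption made for this sketch).
    """
    lower = int(position)      # serial number just below the position
    frac = position - lower    # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    below = ordered[lower - 1]
    above = ordered[lower]
    return below + frac * (above - below)


def quartile_deviation(scores):
    """Return (Q1, Q3, QD) using the (N + 1)/4 and 3(N + 1)/4 positions."""
    ordered = sorted(scores)   # serial number 1 is the lowest score
    n = len(ordered)
    q1 = value_at(ordered, (n + 1) / 4)
    q3 = value_at(ordered, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2


# Hypothetical test scores (N = 7, so Q1 is the 2nd score and Q3 the 6th).
scores = [50, 10, 70, 30, 20, 60, 40]
q1, q3, qd = quartile_deviation(scores)
print(f"Q1 = {q1}, Q3 = {q3}, QD = {qd}")  # Q1 = 20.0, Q3 = 60.0, QD = 20.0
```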

4.1.4 Mean Deviation or Average Deviation


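The mean deviation, or average deviation, is defined as the arithmetic mean of the deviations of the
scores from the mean or the median. The deviations are taken as positive, i.e. their absolute values are used. Mathematically,
for ungrouped data
$$M.D. = \frac{\sum |X - \bar{X}|}{N}$$

and for grouped data
$$M.D. = \frac{\sum f\,|X - \bar{X}|}{\sum f}$$
where $\bar{X}$ is the mean, N is the number of scores and f is the frequency of each value.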
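
A minimal Python sketch of both formulas follows. It takes the deviations from the mean (the definition also allows the median), and all data values are hypothetical. For grouped data, `values` is assumed to hold the class midpoints and `freqs` the matching frequencies.

```python
def mean_deviation(scores):
    """Mean deviation about the mean for ungrouped data."""
    n = len(scores)
    mean = sum(scores) / n
    # Deviations are taken as positive, hence abs().
    return sum(abs(x - mean) for x in scores) / n


def mean_deviation_grouped(values, freqs):
    """Mean deviation about the mean for grouped data (values with frequencies)."""
    total_f = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total_f
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / total_f


# Hypothetical ungrouped scores: the mean is 6, the deviations are 4, 2, 0, 2, 4.
scores = [2, 4, 6, 8, 10]
print(mean_deviation(scores))  # 2.4

# Hypothetical grouped data: class midpoints and their frequencies (mean is 15).
midpoints = [5, 15, 25]
frequencies = [2, 6, 2]
print(mean_deviation_grouped(midpoints, frequencies))  # 4.0
```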

4.1.5 Standard Deviation


Standard deviation is the most commonly used and the most important measure of
variation. It determines whether the scores are generally near or far from the mean, i.e.
whether the scores are clustered together or scattered. In simple words, standard deviation tells
how tightly the scores are clustered around the mean in a data set. When the scores are
close to the mean, the standard deviation is small; a large standard deviation indicates that the
scores are spread apart. Standard deviation is simply the square root of the variance, i.e.
$$\sigma = \sqrt{\text{Variance}}$$

or
$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$
where $\sigma$ is the Greek letter sigma, X stands for each score, $\bar{X}$ is the mean and n is the number of scores.

4.1.6 Variance
The variance of a set of scores is denoted by $\sigma^2$ and is defined as
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n}$$

where $\bar{X}$ is the mean, n is the number of data values, X stands for each of the scores,
and $\sum$ means add up all the values.

An alternate, equivalent formula for variance is

$$\sigma^2 = \frac{\sum X^2}{n} - \bar{X}^2$$
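
The two variance formulas and the standard deviation can be checked against each other with a short Python sketch. The scores are hypothetical, and both functions divide by n, as the formulas in this unit do.

```python
import math


def variance(scores):
    """Variance as the mean of the squared deviations from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def variance_alternate(scores):
    """Alternate formula: mean of the squares minus the square of the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(x * x for x in scores) / n - mean ** 2


def standard_deviation(scores):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(scores))


# Hypothetical scores with mean 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))            # 4.0
print(variance_alternate(scores))  # 4.0
print(standard_deviation(scores))  # 2.0
```

Python's standard `statistics` module provides `pvariance` and `pstdev`, which use the same divide-by-n definitions and can serve as a cross-check.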

4.2 Normal Curve


One way of showing how data are distributed is to plot them in a graph. If the data
are distributed symmetrically around the center, with most values near the middle, the graph will form a bell-shaped curve. In statistics this curve is called
a normal curve, and in the social sciences it is called the bell curve. A normal or bell-shaped
distribution of data may occur naturally in many forms, with a number of
possibilities for the standard deviation (which can be any positive value). A standard
normal curve has a mean of 0 and a standard deviation of 1. The larger the standard deviation, the
flatter the curve will be, and vice versa. A standard normal distribution is given below.

Source: Google Images

A normal curve has following properties.


i) The mean, median and mode are equal.
ii) The curve is symmetric at the center (i.e. around the mean).
iii) Exactly half of the values are to the left of the center and half to the right.
iv) The total area under the curve is 1.
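
Properties ii) to iv) can be checked numerically for the standard normal curve. The sketch below assumes the usual density formula for a standard normal distribution, which this unit does not state, and approximates areas with the trapezoidal rule over the interval from -8 to 8, outside of which the curve is practically zero. The function names are introduced only for this illustration.

```python
import math


def standard_normal_density(x):
    """Height of the standard normal curve (mean 0, standard deviation 1) at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def area_under_curve(a, b, steps=10_000):
    """Approximate the area under the curve between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (standard_normal_density(a) + standard_normal_density(b))
    for i in range(1, steps):
        total += standard_normal_density(a + i * width)
    return total * width


# Symmetry: the curve has the same height at equal distances from the mean.
print(standard_normal_density(-1.5) == standard_normal_density(1.5))

# Total area (approximately 1) and the area to the left of the mean (approximately 0.5).
total_area = area_under_curve(-8, 8)
left_area = area_under_curve(-8, 0)
print(round(total_area, 4), round(left_area, 4))
```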

4.2.1 Numerical Measures of Shape


One of the fundamental tasks in any statistical analysis is to characterize the location and
variability of a data set. Two important measures of shape, skewness and kurtosis, give us
a more precise evaluation of the data. Measures of dispersion tell us about the variation of
the data set, while skewness tells us about the direction of the variation and kurtosis tells us
about the shape of the variation. Let us have a brief review of these measures of shape.

a) Skewness
Skewness tells us about the amount and direction of the variation of the data set. It is a
measure of symmetry. A distribution or data set is symmetric if it looks the same to the left
and right of the central point. If the bulk of the data is at the left, i.e. the peak is towards the left and the
right tail is longer, we say that the distribution is skewed right or positively skewed.

On the other hand, if the bulk of the data is towards the right or, in other words, the peak is
towards the right and the left tail is longer, we say that the distribution is skewed left or
negatively skewed. If the skewness is equal to zero, the data are perfectly symmetrical,
but this is quite unlikely in the real world.

Source: Google Images

Here are some rules of thumb:


i) If the skewness is less than −1 or greater than +1, the distribution is highly skewed.
ii) If the skewness is between −1 and −½ or between +½ and +1, the distribution is
moderately skewed.
iii) If the skewness is between −½ and +½, the distribution is approximately symmetric.
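
These rules of thumb can be applied to a computed skewness value. The unit does not give a formula for skewness, so the sketch below assumes one common choice, the moment coefficient (the third central moment divided by the 3/2 power of the second central moment). It also assumes the conventional cut-offs of one half and one for the three categories. The data sets are hypothetical.

```python
def skewness(scores):
    """Moment coefficient of skewness (an assumed formula, not given in the unit)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5


def describe_skewness(value):
    """Classify a skewness value using the cut-offs of one half and one."""
    size = abs(value)
    if size > 1:
        return "highly skewed"
    if size >= 0.5:
        return "moderately skewed"
    return "approximately symmetric"


symmetric = [1, 2, 3, 4, 5]      # looks the same on both sides of the mean
long_right = [1, 1, 1, 2, 10]    # bulk of the data on the left, long right tail

print(skewness(symmetric), describe_skewness(skewness(symmetric)))
# 0.0 approximately symmetric
print(round(skewness(long_right), 2), describe_skewness(skewness(long_right)))
# 1.46 highly skewed
```

A positive result corresponds to a positively (right) skewed distribution, and mirroring the data around zero flips the sign.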

b) Kurtosis
Kurtosis is a parameter that describes the shape of the variation. It is a measurement that tells
us how peaked the graph of the data set is and how high the graph is around the mean.
In other words, kurtosis measures the shape of the distribution, i.e. the
fatness of the tails; it focuses on how the scores are arranged around the mean. A negative
value means that too little data is in the tails, and a positive value means that too much data
is in the tails. This heaviness or lightness in the tails means that the data look more peaked
or less peaked. Kurtosis is measured against the standard normal distribution. A standard
normal distribution has a kurtosis of 3, so the values discussed here are excess kurtosis, i.e.
kurtosis minus 3, which is zero for a normal distribution.

Kurtosis has three types: mesokurtic, platykurtic, and leptokurtic. If the distribution has
an excess kurtosis of zero, then the graph is nearly normal. This nearly normal distribution is called
mesokurtic. If the distribution has negative excess kurtosis, it is called platykurtic. An example
of a platykurtic distribution is a uniform distribution, which has as much data in each tail as
it does in the peak. If the distribution has positive excess kurtosis, it is called leptokurtic. Such a
distribution has the bulk of its data in the peak.
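
The three types can be illustrated with a short Python sketch. The unit gives no formula for kurtosis, so the sketch assumes the common moment-based definition (the fourth central moment divided by the square of the second) and subtracts 3, so that a normal distribution scores zero and the sign of the result separates the types. The data sets are hypothetical.

```python
def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3 (an assumed formula, not given in the unit)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


def kurtosis_type(excess):
    """Name the type of kurtosis from the sign of the excess kurtosis."""
    if excess > 0:
        return "leptokurtic"
    if excess < 0:
        return "platykurtic"
    return "mesokurtic"


flat = list(range(1, 11))                       # evenly spread, like a uniform distribution
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]      # most data in the peak, a few far out

print(round(excess_kurtosis(flat), 2), kurtosis_type(excess_kurtosis(flat)))
# -1.22 platykurtic
print(excess_kurtosis(peaked), kurtosis_type(excess_kurtosis(peaked)))
# 2.0 leptokurtic
```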

Source: Google Images

4.3 Co-Efficient of Variation
The coefficient of variation is another useful statistic for measuring the dispersion of a data
set. The coefficient of variation is
$$C.V. = \frac{s}{\bar{x}} \times 100$$
where s is the standard deviation and $\bar{x}$ is the mean.

The coefficient of variation is invariant with respect to the scale of the data. On the other
hand, the standard deviation is not scale invariant.
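
The difference in how the two statistics respond to a change of scale can be seen in a short Python sketch. The heights are hypothetical, and the standard deviation is computed with the divide-by-n formula from Section 4.1.5; that choice is an assumption, since the unit does not say which standard deviation s denotes here.

```python
import math


def standard_deviation(scores):
    """Standard deviation with the divide-by-n formula (an assumption for s)."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / n)


def coefficient_of_variation(scores):
    """C.V. = (standard deviation / mean) x 100."""
    mean = sum(scores) / len(scores)
    return standard_deviation(scores) / mean * 100


# The same hypothetical heights expressed in two different units.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

# The standard deviation changes with the unit of measurement ...
print(round(standard_deviation(heights_cm), 2), round(standard_deviation(heights_m), 4))
# ... while the coefficient of variation stays the same.
print(round(coefficient_of_variation(heights_cm), 2),
      round(coefficient_of_variation(heights_m), 2))
# 8.32 8.32
```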

4.4 Self Assessment Questions


Q. 1 Write down the basic purpose of measure of central tendency.
Q. 2 Define range.
Q. 3 Write down the range of the following data.
12, 15, 35, 18, 21, 33, 18, 24, 48, 55, 36, 32, 17
Q. 4 What do you understand by mean deviation?
Q. 5 Define normal curve.
Q. 6 Write down the properties of normal curve.
Q. 7 Write down the types of kurtosis.

4.5 Activities
Take a cardboard. Cut it into 4x4 pieces, and:
i) Cut one piece into standard normal distribution shape and mention its name on it.
ii) Cut one piece into negatively skewed shape and mention its name on it.
iii) Cut one piece into positively skewed shape and mention its name on it.
iv) Cut one piece into a non-skewed (symmetric) shape and mention its name on it.
v) Cut one piece into mesokurtic shape and mention its name on it.
vi) Cut one piece into platykurtic shape and mention its name on it.
vii) Cut one piece into leptokurtic shape and mention its name on it.

4.6 Bibliography
Bartz, A. E. (1981). Basic Statistical Concepts (2nd Ed.). Minnesota: Burgess Publishing
Company.

Dietz, T., & Kalof, L. (2009). Introduction to Social Statistics. UK: Wiley-Blackwell.

Gravetter, F. J., & Wallnau, L. B. (2002). Essentials of Statistics for the Behavioral
Sciences (4th Ed.). Wadsworth, California, USA.
