MEASURES OF VARIATION

The document outlines the objectives and key concepts of a course on Epidemiology and Biostatistics, focusing on measures of variation in numerical data. It explains various measures of dispersion, including range, variance, and standard deviation, along with their advantages and disadvantages. Additionally, it discusses quartiles and the interquartile range as tools for understanding data distribution.

Uploaded by

chandaevans33
Copyright
© All Rights Reserved

EPIDEMIOLOGY & BIOSTATISTICS

COURSE CODE: DPH 2135


Bwalya Chanda
OBJECTIVES
• To describe the properties of measures of
variation in numerical data
• To calculate descriptive summary measures of
variation for a sample and population:
• Range, inter-quartile range, variance and standard
deviation
• To construct and interpret a box-and-whisker plot
Quartiles
MEASURES OF DISPERSION
• The measure of dispersion shows how the data is spread or scattered around the mean.
• The measure of location or central tendency is a central value that the data values group around. It gives an average value.
• The measure of skewness is how symmetrical (or not) the distribution of data values is.
MEASURES OF DISPERSION
• A measure of central tendency does not, on its own, provide an adequate summary of the characteristics of a data set. We will usually require, in addition, measures of dispersion, which measure the degree of variability among the observations.
• The most important are the range, the variance and the standard deviation.
MEASURES OF DISPERSION
RANGE
• Simplest measure of dispersion
• The data is ranked in order
• Difference between the largest and the smallest observations
• If xmin and xmax are the smallest and the largest observations respectively, then the range, denoted by r, is
  r = xmax − xmin
MEASURES OF DISPERSION

Example: The numbers below are test scores for a class.

44 56 58 62 64 64 70 72

Range = 72 – 44 = 28, i.e. the scores span 28 points, from 44 to 72.
• Communicates very little information
MEASURES OF DISPERSION
• The range is a very poor measure of the variability since it involves only two numbers, and it provides no indication about the spread of the other values.
• It is also sensitive to outliers:
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
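
The effect of a single outlier on the range can be checked with a short Python sketch. The function name `value_range` is an illustrative choice, not something from the slides; the two lists are the data sets shown above.

```python
def value_range(values):
    """Return the range r = xmax - xmin of a non-empty collection of numbers."""
    return max(values) - min(values)

# The two data sets from the slide: identical except for the last value.
without_outlier = [1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5]
with_outlier = [1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

Only the largest value differs between the two lists, yet the range jumps from 4 to 119, which is why the range alone says little about how the bulk of the data is spread.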
MEASURES OF DISPERSION
Advantages and Disadvantages of the Range
Advantage
• It is easy to compute

Disadvantage
• It communicates very little information about the
data set
– It only takes into account the largest and smallest
value
– This makes it a poor measure of dispersion
MEASURES OF DISPERSION
Variance

• The variance is a measure of variability which takes into account the differences between each observation and the sample mean
• It measures the scatter of the values in a set of data about the mean
• When the values are close to the mean the dispersion is small, and when they are far from the mean it is large
– Hence the logic of measuring the variation of values from the mean
MEASURES OF DISPERSION
Calculation of the Variance

• Sample variance = The sum of the squared deviations, divided by (n – 1).
  Mathematical notation: s² = Σ(x – x̄)² / (n – 1)
• The quantity s² is called the sample estimate of the variance
• Population variance:
  Mathematical notation: σ² = Σ(x – μ)² / N
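
As a rough illustration of the two formulas, the following Python sketch computes both variances for the eight test scores used in the range example. The function names are illustrative assumptions; the last line cross-checks against Python's standard `statistics` module.

```python
import statistics

def sample_variance(values):
    """s^2: sum of squared deviations from the sample mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """sigma^2: sum of squared deviations from the population mean, divided by N."""
    size = len(values)
    mu = sum(values) / size
    return sum((x - mu) ** 2 for x in values) / size

scores = [44, 56, 58, 62, 64, 64, 70, 72]  # the eight test scores from the range example

print(sample_variance(scores))      # about 77.64 (543.5 / 7)
print(population_variance(scores))  # 67.9375 (543.5 / 8)

# Cross-check against the standard library.
print(statistics.variance(scores), statistics.pvariance(scores))
```

The mean of the scores is 61.25 and the squared deviations sum to 543.5; the only difference between the two results is the divisor, n – 1 = 7 for the sample and N = 8 for the population.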
MEASURES OF DISPERSION
Advantages and disadvantages of the variance

Advantage
• It takes into consideration all the values in the set
of data.
Disadvantage
• The units of measure are squared which may be
difficult to communicate
– e.g. variance of weight will be in kg squared.
MEASURES OF DISPERSION

Standard deviation
• Most commonly used measure of variation
• Shows variation about the mean and is the square root of the variance
• The quantity denoted by s is called the sample standard deviation
• Thus, if s² = Σ(x – x̄)² / (n – 1), then
  s = √( Σ(x – x̄)² / (n – 1) )
• The population standard deviation is therefore denoted as σ = √σ², where
  σ² = Σ(x – μ)² / N
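
A minimal Python sketch of the same relationship, again using the eight test scores from the range example as assumed input; the function names are illustrative only.

```python
import math
import statistics

def sample_std(values):
    """s: square root of the sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)

def population_std(values):
    """sigma: square root of the population variance (divisor N)."""
    size = len(values)
    mu = sum(values) / size
    variance = sum((x - mu) ** 2 for x in values) / size
    return math.sqrt(variance)

scores = [44, 56, 58, 62, 64, 64, 70, 72]

print(sample_std(scores))      # about 8.81
print(population_std(scores))  # about 8.24

# The standard library gives the same values.
print(statistics.stdev(scores), statistics.pstdev(scores))
```

Unlike the variance, both results are in the same units as the data (here, test-score points), which makes the standard deviation easier to report.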
MEASURES OF DISPERSION
Quartiles
• Quartiles are the three values (Q1, Q2, Q3) that divide the ranked data set into four (approximately) equal parts.

25% 25% 25% 25%

Q1 Q2 Q3

• Q1 is the lower quartile
• Q2 is the median
• Q3 is the upper quartile
MEASURES OF DISPERSION
• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
• Q2 is the same as the median and it implies 50% of the observations are smaller and 50% are larger
• Only 25% of the observations are greater than the third quartile, Q3
MEASURES OF DISPERSION
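• Lower quartile Q1: its position in the ranked data is defined as
  Position of Q1 = (25/100)n, where n is the sample size
• Median quartile Q2: its position in the ranked data is defined as
  Position of Q2 = (50/100)n, where n is the sample size
• Upper quartile Q3: its position in the ranked data is defined as
  Position of Q3 = (75/100)n, where n is the sample size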
MEASURES OF DISPERSION
• For any set of data (ranked in order from least to
greatest):
• The second quartile, Q2, is the median.
• The first quartile, Q1, is the median of all items
below Q2.
• The third quartile, Q3, is the median of all items
above Q2.
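
The median-of-halves rule above can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and one detail is an assumption not stated on the slide: when n is odd, the middle item is left out of both halves (textbooks differ on this). "Below" and "above" are taken by position in the ranked list, so tied values can fall on either side. At least two observations are required.

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ranked = sorted(values)            # rank from least to greatest
    n = len(ranked)
    half = n // 2
    q2 = statistics.median(ranked)     # Q2 is the median of all items
    lower = ranked[:half]              # items below the median position
    upper = ranked[n - half:]          # items above it (middle item skipped if n is odd)
    return statistics.median(lower), q2, statistics.median(upper)

scores = [44, 56, 58, 62, 64, 64, 70, 72]
print(quartiles_by_halves(scores))  # (57.0, 63.0, 67.0)
```

For these eight scores the lower half is 44, 56, 58, 62 and the upper half is 64, 64, 70, 72, so Q1 = 57, Q2 = 63 and Q3 = 67.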
MEASURES OF DISPERSION
• The interquartile range shows the spread of the middle 50% of the data.
• The interquartile range (IQR) is thus calculated as
  IQR = Q3 − Q1
MEASURES OF DISPERSION
Example
• The following are test scores (out of 100) for a
particular math class.
44 56 58 62 64 64 70 72
72 72 74 74 75 78 78 79
80 82 82 84 86 87 88 90
92 95 96 96 98 100
• Find the three quartiles.
• And find the interquartile range
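
One way to work this example is sketched below in Python, assuming the median-of-halves rule stated earlier (Q1 and Q3 are the medians of the lower and upper halves of the ranked data). Other quartile conventions, such as interpolating between positions, can give slightly different values for Q1 and Q3.

```python
import statistics

scores = [
    44, 56, 58, 62, 64, 64, 70, 72,
    72, 72, 74, 74, 75, 78, 78, 79,
    80, 82, 82, 84, 86, 87, 88, 90,
    92, 95, 96, 96, 98, 100,
]

ranked = sorted(scores)
half = len(ranked) // 2    # 30 scores, so 15 in each half

q1 = statistics.median(ranked[:half])                 # median of the lower 15 scores
q2 = statistics.median(ranked)                        # median of all 30 scores
q3 = statistics.median(ranked[len(ranked) - half:])   # median of the upper 15 scores
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 72 78.5 88 16
```

Under this rule Q2 is the average of the 15th and 16th scores (78 and 79), Q1 is the 8th score, Q3 is the 23rd score, and the middle 50% of the class spans 16 points.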