2_Final_Introduction to Data_Measure_Central_Tendency_DPPM_II_PG

Uploaded by

kunturandal


2024 QUANTITATIVE METHODS

IN DECISION MAKING

POST-GRADUATE DIPLOMA IN PROJECT PLANNING & MANAGEMENT,

DPPM II- KLA WKD
MODULE: Quantitative Methods in Decision Making

Key concepts, application & Types of Data

Dr. PG, Mphil_Epidemiology, PhD

Dr Philip Govule
Some useful definitions:
• Variable: a quantity or a quality which varies between one individual
and another.
• Frequency: the number of individuals with a specific value of a
variable.
• Probability: the proportion of times an event would occur in
repetition of given circumstances.
• Population: a collection of individuals of interest.
• Sample: a group of individuals taken from a larger population used to
find out about the population.
DESCRIPTIVE STATISTICS
INTRODUCTION TO DATA MANAGEMENT

The bigger picture
[Figure: the research process. Study design (observational and experimental); data collection; data management (data entry, editing and reconciliation; ensuring data quality); statistical analysis (measures of location and dispersion, plus inferential statistics: univariate, bivariate and multivariate; correlation, chi-square, t-test, ANOVA, MANCOVA)]
DATA
• Data are numbers obtained by measuring or counting properties of objects.
• Data are obtained from:
– Analysis of existing records
– Cross-sectional surveys, i.e. primary data (e.g. UDHS, student research)
– Census (last Population and Housing Census in Uganda: 2024)
– Experiments (clinical trials, field trials)
– Reports, etc.
MEASUREMENT
Measurement is the assignment of numbers to objects or events in a systematic fashion.
• A variable is any measured characteristic or attribute that varies from subject to subject, e.g. weight, age, height.
• A random variable is one whose value cannot be predicted in advance because it arises by chance.
Qualities of Variables
• Exhaustive: the attributes should include all possible responses.
• Mutually exclusive: no respondent should be able to have two attributes simultaneously. For example,
– Male vs. female
– Employed vs. unemployed (a respondent looking for a second job while employed could fall into both, so the categories must be defined carefully).
Definitions and examples
Definition    Example
Variable      Gender
Attributes    Female, Male

VARIABLES

Observations or measurements are used to obtain the value of a random variable.
There are two types of variable:
• Quantitative (numerical) variables
• Qualitative/categorical (non-numerical) variables, also called attributes

Levels of Measurement
VARIABLE & MEASUREMENT SCALE

[Figure: classification of data. Attribute (qualitative) data are nominal or ordinal. Numeric (quantitative) data are discrete (count) or continuous, measured on an interval or ratio scale]
The Levels of Measurement

 Nominal

 Ordinal

 Interval

 Ratio
Why Is Level of Measurement Important?

• Helps you decide what statistical analysis is appropriate for the values that were assigned.
• Helps you decide how to interpret the data from that variable.
NOMINAL SCALE (sounds like "names" or labels)
• Nominal variables allow only classification or categorization based on some distinctively different characteristic; the categories cannot be rank-ordered.
• A categorical variable, also called a nominal variable, has mutually exclusive (non-overlapping), unordered categories, none of which has any numerical significance.
• "Nominal" scales could simply be called "labels".
– They are mere codes assigned to objects as labels; they are not measurements.
– Not a measure of quantity.
Nominal Measurement

Examples: sex, religion, blood group, symptoms of disease, cause of death
Measurement?
The relationship of the values that are assigned to the attributes for a variable

Variable Party Affiliation

Attributes Movement Independent FDC

Values 1 2 3

Relationship
ORDINAL SCALE
Ordinal Measurement
When attributes can be rank‐ordered…

With ordinal scales, the order of the values is what is important and significant, but the differences between them are not really known.
Ordinal Measurement
o Attributes can be rank-ordered.
o Distances between attributes do not have any meaning. All we know is that #4 > #3 or #2,
o but we don't know, and cannot quantify, how much better it is.
o E.g. is the difference between "OK" and "Unhappy" the same as the difference between "Very Happy" and "Happy"? We can't say.
o Typically measures non-numeric concepts like satisfaction, happiness, discomfort, etc.
o "Ordinal" sounds like "order" (order matters, but that's all you really get from these).
Ordinal Measurement
Educational Attainment (in rank order)
0 = No education
1 = Primary
2 = Secondary
3 = Tertiary
4 = University (Graduate)
5 = Post graduate, e.g. PG-DJCM, DHSMA
Ordinal Measurement
Ordinal variables
• Also known as ordered categorical variables
• Consist of ordered categories, where the differences between
categories cannot be considered to be equal
• Example student evaluation rating made up of:
– Excellent,
– Satisfactory,
– Unsatisfactory
Continuous variable
• A continuous variable is a number on a continuous scale
and so it can take on an unlimited number of values. Eg:
weight, height, income
– 150.4, 150.8
Discrete variable
• A discrete variable is a numerical variable that can take on
only a limited number of values.
• These values are usually whole numbers.
• An example of this is age in years at last birthday.
• Another example is the number of episodes of diarrhea experienced by a child.
Interval Measurement
Interval scale:
• Values have identity, magnitude, and equal intervals, e.g. temperature: every degree Fahrenheit/Celsius is the same interval.
• Hence the distance between attributes has meaning; for example, for temperature (in Fahrenheit) the distance from 30 to 40 is the same as the distance from 70 to 80.
• The zero point is arbitrary rather than absolute, e.g. 0 degrees does not mean there is no temperature.
• Other examples: IQ scores or performance scores.
Ratio Measurement
• Has an absolute zero that is meaningful.
• Can construct a meaningful ratio (fraction), for example, number of clients in the past six months.
• It is meaningful to say that "...we had twice as many clients in this period as we did in the previous six months."
• Examples are weight and height.
Important
• Quantitative variables are often converted to categorical ones using "cut-points".
• Instead of presenting the mean fasting glucose level of male and female subjects, one may prefer to present the proportion of diabetics in the male and female populations, using a fasting glucose level of 110 mg/dL as the cut-point to categorize the subjects as diabetic/non-diabetic.
• Glucose level between 85 and 110 mg/dL: non-diabetic.
• Glucose level between 111 and 150 mg/dL: diabetic.
• However, categorizing a continuous variable leads to loss of information.
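The cut-point idea can be sketched in a few lines of Python. This is only an illustration: the function name `classify_glucose` and the readings are hypothetical, and the 110 mg/dL cut-point is the one used on the slide.

```python
def classify_glucose(level_mg_dl, cut_point=110):
    """Label a fasting glucose level using a single cut-point (mg/dL)."""
    # Levels up to and including the cut-point count as non-diabetic,
    # matching the slide's 85-110 (non-diabetic) and 111-150 (diabetic) bands.
    return "diabetic" if level_mg_dl > cut_point else "non-diabetic"

# Hypothetical readings, invented for illustration only
readings = [85, 98, 110, 111, 130, 150]
labels = [classify_glucose(x) for x in readings]

# The continuous variable is reduced to a proportion; the exact levels are lost
proportion_diabetic = labels.count("diabetic") / len(labels)
print(labels)
print(proportion_diabetic)
```

Once the readings are replaced by labels, 111 and 150 become indistinguishable, which is the loss of information noted above.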
The Hierarchy of Levels
Types of Data
DESCRIPTIVE STATISTICS

ORGANIZATION OF THE DATA
• Data are usually presented in matrix form.
• Each column of the matrix represents a variable.
• Each row of the matrix represents an individual or unit.
The “Normal” distribution of biological continuous variables

• Most biological continuous variables (such as blood pressure, height, weight, etc.) present different values among individuals (some have higher values and others lower values).
• The measurements of biological continuous variables present a characteristic frequency distribution that is called the Normal distribution (or Gaussian distribution).
The Normal frequency distribution of biological variables
[Figure: frequency distribution of the body height of a hypothetical population. x-axis: Height (cm); y-axis: Frequency (no. of observations)]
Parameters of Frequency Distribution
• Frequency distributions of continuous data are described by two types of measures or parameters:
• Measures of Central Tendency
– They allow the whole set of observations to be summarised in a single value.
– We calculate a measure of central location when we need a single value to summarize a set of epidemiological data.
• Measures of Dispersion
– They indicate how widely the observations are spread out.
Measures of Central Tendency
• There are three fundamental measures of central tendency:
• The Mode
• The Median
• The Mean (Arithmetic mean)
• Others: Midrange, geometric mean

• Which measure is best for our use in a particular instance depends on the characteristics of the distribution, such as its shape, and on how we intend to use the measure.
DESCRIPTIVE METHODS FOR CONTINUOUS DATA
NUMERICAL METHODS
MEASURES OF LOCATION
• The most common measures of location are the:
– Mode
– Median
– Mean
• The most common measures of dispersion are the:
– Range
– Variance
– Standard Deviation
– Interquartile Range
MEASURES OF LOCATION - MODE
• The mode is the most frequently occurring value in a set of measurements (set of data), e.g. in 2, 3, 4, 4, 5, 5, 5, 6, 6, 7 the mode is 5.
• For grouped data, the modal interval is defined as the class interval with the highest frequency. The modal value is the midpoint of the modal interval.
• The mode is not used much in statistical analysis because of the ambiguity in its definition.
The Mode

• The following parity data has a mode of 1, because it occurs 4 times, which is more than any other value:
– 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 4, 6
• We find the mode by creating a frequency distribution in
which we tally how often each value occurs. If every value
occurs only once, the distribution has no mode.
• If two or more values are tied as the most common, the
distribution has more than one mode. Eg. bimodal
The Mode
Find the mode of the given data set:
0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 4, 6

Value   Tally (No.)
0       2
1       4
2       3
3       1
4       1
6       1
Total   12

What is the Mode?
ANS: Mode = 1, since it appears more times (4 times) in the set than any other number.
The Mode
Find the mode of the given data set:
3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48

Value   Tally (No.)
3       2
6       1
9       1
15      3
27      2
37      1
48      1
Total   11

What is the Mode?
ANS: Mode = 15, since it appears more times (3 times) in the set than any other number.
The Mode

The following set of numbers: 1, 2, 3, 4, 6, 9

What is the Mode?

Every value occurs only once, so the distribution has no mode.
Note
If two or more values are tied as the most common, the distribution has more than one mode, e.g. bimodal.
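The tally approach used in the three examples above can be sketched in Python as follows. The helper name `modes` and the last (bimodal) data set are illustrative assumptions; the other data sets are the ones from the slides.

```python
from collections import Counter

def modes(values):
    """Return every mode of `values`, or an empty list if no value repeats."""
    counts = Counter(values)          # frequency distribution (the tally)
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value occurs once: no mode
    # Several values may be tied for the highest frequency (bimodal, trimodal, ...)
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 4, 6]))      # parity data
print(modes([3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48]))
print(modes([1, 2, 3, 4, 6, 9]))                        # no repeated value
print(modes([2, 2, 5, 5, 7]))                           # hypothetical bimodal data
```

Returning a list rather than a single number keeps the "no mode" and "more than one mode" cases explicit.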
MEASURES OF LOCATION - MODE
• Some distributions may be bimodal, trimodal, etc.
Use and limits of the Mode
• The mode may have some communication interest but
generally is rarely of statistical value.
• The mode is probably most useful when describing qualitative
data.
• It is not uncommon to have a frequency distribution with more
than one mode, simply as a consequence of chance.
• Sometimes, however, a bimodal frequency distribution is
extremely meaningful because the population contains two
sub-groups, each of which has a different distribution that
peaks at a different point.
MEASURES OF LOCATION - MEDIAN
• The median is the middle observation.
• It divides the data set into equal halves.
• If n (the number of observations) is odd, the median is unique and is defined as the value of the ((n + 1)/2)th observation in the ordered data.
MEASURES OF LOCATION - MEDIAN
• If n is even, the median is obtained by averaging the two middle observations.
Method 1: Even numbers
Using the formula for the ((n + 1)/2)th observation gives a position with a decimal part, e.g. 4.5, which shows that the median lies halfway between the 4th observation and the next one.
Method 2: Even numbers
The median can be obtained as the simple average of the (n/2)th and (n/2 + 1)th terms:
Median = ((n/2)th observation + (n/2 + 1)th observation) / 2
The Median: How do we get the Median

• Median: The median is the middle of a set of data that has been arranged into rank order.
• Divides a set of data into two halves [One half of the
observations being larger than the median value], and [one half
smaller than the median].
• Example: suppose we had the following set of height measures
(in cm): 110, 120, 120, 130, 140, 165, 180
• 3 observations are larger than 130 and 3 observations are
smaller; thus the median is 130 cm.
Median 1 for Even Numbers
• Find the median of the following set of data with n = 10:
15, 7, 13, 9, 10, 11, 16, 12, 5, 11.

1. Arrange the observations in increasing or decreasing order.

5, 7, 9, 10, 11, 11, 12, 13, 15, 16

2. Find the middle rank.

– Middle rank = (n + 1)/2
– (10+1)/2 = 5.5
– Therefore, the median lies halfway between the values of the 5th and 6th observations.

3. Identify the value of the median. Since the median is equal to the average of the values of the 5th and 6th observations, the median is 11: Median = (11 + 11)/2 = 11
Median 2
• Find the median of the following set of data with n = 10:
15, 7, 13, 9, 10, 11, 16, 12, 5, 11.

1. Arrange the observations in increasing or decreasing order.

5, 7, 9, 10, 11, 11, 12, 13, 15, 16

2. Find the middle rank.

$$\text{Median} = \frac{(n/2)\text{th term} + (n/2 + 1)\text{th term}}{2} = \frac{5\text{th term} + 6\text{th term}}{2}$$

– Therefore, the median lies halfway between the values of the 5th and 6th observations.

3. Median = (11+11)/2 = 11
Method 2 for simplifying Even number Median
• Find the median of the following set of data with n = 10:
15, 7, 13, 9, 10, 11, 16, 12, 5, 11.

1. Arrange the observations in increasing or decreasing order.

5, 7, 9, 10, 11, 11, 12, 13, 15, 16

2. Find the middle rank.

– Average of the two middle observations = [(n/2)th + (n/2 + 1)th] / 2
– = [(10/2)th observation + (10/2 + 1)th observation] / 2
– = [5th observation + 6th observation] / 2
– = (11 + 11)/2 = 11
Since the median is equal to the average of the values of the 5th and 6th observations, the median is 11.
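The ranking procedure for odd and even n can be sketched as below. The function name `median_by_rank` is an illustrative choice; the two data sets are the ones used in the slides.

```python
def median_by_rank(values):
    """Median via the middle-rank rule, for odd and even numbers of observations."""
    ordered = sorted(values)            # step 1: arrange in rank order
    n = len(ordered)
    if n % 2 == 1:
        # odd n: the ((n + 1)/2)th observation (lists are 0-indexed, hence -1)
        return ordered[(n + 1) // 2 - 1]
    # even n: average of the (n/2)th and (n/2 + 1)th observations
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

scores = [15, 7, 13, 9, 10, 11, 16, 12, 5, 11]        # n = 10 (even)
heights = [110, 120, 120, 130, 140, 165, 180]         # n = 7 (odd)
print(median_by_rank(scores))
print(median_by_rank(heights))
```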
Use and limitations of the Median

• The main advantage of using a median is that it is “robust” to outliers.

• Example:
– Data set: 24, 25, 29, 29, 30, 31. Median = 29
– Data set: 24, 25, 29, 29, 30, 131. Median = 29

• Therefore in a series of data that have some outliers that may shift the
mean too much, the use of the median may be more meaningful.
• The median is also used in defining the LD50 in experimental animals
(lethal dose that kills 50% of the animals).
• It does not allow complex inferences from medical data as it can not be
used for advanced statistics.
MEASURES OF LOCATION - MEDIAN
• The median is less affected by extreme values.
• However, it has some notable disadvantages compared to the mean:
– It ignores the precise magnitude of most of the observations, which makes it less efficient than the mean.
– In large data sets, the median requires more work to calculate than the mean.
– There is no easy way to combine the medians of two groups of measurements.
– It is not of much use in more elaborate statistical analysis.

The Median
• Is the point above which 50% of the
distribution lies and below which lies 50%
of the distribution

• The median should be used when the distribution is skewed.
Egessa Simon
Determining the median
• This is the middle figure of the distribution.
Given 3,4,6,7,10 the median is 6

Determining the median of grouped data

Median = Lm + c(N/2 – CFb) /fm

Where:
Lm is the lower boundary of the median class
CFb = is the cumulative frequency before the median
class
fm = frequency of the median class
c is the width of the median class
N is the total number of observations
MEDIAN FOR GROUPED DATA – Using class limits
Class   Frequency (F)   Cumulative frequency   Relative frequency (F/n)   Cumulative relative frequency
10-14   8               8                      0.08                       0.08
15-19   8               16                     0.08                       0.16
20-24   29              45                     0.29                       0.45
25-29   12              57                     0.12                       0.57
30-34   10              67                     0.10                       0.67
35-39   11              78                     0.11                       0.78
40-44   8               86                     0.08                       0.86
45-49   8               94                     0.08                       0.94
50-54   6               100                    0.06                       1.00
Total   ∑f = 100                               1.00
MCT - MEDIAN FOR GROUPED DATA
• To simplify, measurements are assumed to be spread evenly over the interval.
• The first interval for which the cumulative relative frequency exceeds 0.50 contains the median.
• The computation of the median value for grouped data is carried out as follows:
MCT - MEDIAN FOR GROUPED DATA

$$\text{Median} = L_m + \left(\frac{\frac{N}{2} - Cf_b}{F_m}\right) \times c$$
MCT ‐ MEDIAN FOR GROUPED DATA
Class width is the difference between the upper (or the lower) limits of two consecutive classes.
• Formula: Class width = Upper class limit - Lower class limit
• Example: For the class interval 163–175, the class width is 12 because 175 – 163 = 12

MCT ‐ MEDIAN FOR GROUPED DATA
Median group = 25-29
Lm = lower class boundary of the median class = 25 - 0.5 = 24.5
Cfb = cumulative frequency of the class before the median class = 45
Fm = frequency of the median class = 12
c = class interval width, i.e. the difference between the lower endpoint of an interval and the lower endpoint of the next interval. For the two groups 20-24 and 25-29, the class interval width is 25 - 20 = 5

$$\text{Median} = L_m + \left(\frac{\frac{N}{2} - Cf_b}{F_m}\right) \times c = 24.5 + \left(\frac{50 - 45}{12}\right) \times 5 \approx 24.5 + 0.42 \times 5 = 24.5 + 2.1 = 26.6$$
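The grouped-median calculation can be checked with a short Python sketch. The function name `grouped_median` is hypothetical, and the code assumes integer class limits with a gap of 1 between classes, so that the lower boundary is the lower limit minus 0.5, as in the worked example.

```python
def grouped_median(classes, freqs):
    """Median of grouped data: Lm + ((N/2 - Cfb) / Fm) * c."""
    n = sum(freqs)
    width = classes[1][0] - classes[0][0]   # c: distance between lower limits
    cum_before = 0                          # Cfb, built up class by class
    for (lower, _upper), f in zip(classes, freqs):
        # the median class is the first one whose cumulative frequency reaches N/2
        if cum_before + f >= n / 2:
            lower_boundary = lower - 0.5    # Lm (assumes integer class limits)
            return lower_boundary + (n / 2 - cum_before) / f * width
        cum_before += f

classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [8, 8, 29, 12, 10, 11, 8, 8, 6]
print(grouped_median(classes, freqs))
```

Without intermediate rounding the result is about 26.58; the slide rounds (50 - 45)/12 to 0.42 first and therefore reports 26.6.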
GROUPED MEAN FOR GROUPS CLASSIFIED BY CLASS
BOUNDARIES

What are class boundaries?
• A class boundary refers to the dividing line between two adjacent
classes or categories in a dataset.
• It helps in determining the range of values that fall within each
class and allows for better analysis and interpretation of data.
• Understanding class boundaries is crucial for effective data
segmentation and can provide valuable insights for business
planning and strategy development.
• Note: Some data are often classified by class boundaries hence
may still be useful for estimation of Mean, Median etc
MCT - MEDIAN FOR GROUPED DATA
• The median of grouped data is cumbersome to compute because the actual measurement values are unknown.
• To simplify, measurements are assumed to be spread evenly over the interval.
• The first interval for which the cumulative relative frequency exceeds 0.50 contains the median.
• The median value is then computed with the grouped-data formula: Median = Lm + c(N/2 – CFb)/fm.
The Mean
The arithmetic Mean

• The arithmetic mean is the most familiar measure of central tendency.
• It is the arithmetic average and is commonly called simply "mean" or "average".
• In formulas, the arithmetic mean is usually represented as x̄, read as "x-bar".
Mean
• The population mean: µ = (sum of all data values in the population) / (population size)
• The sample mean: x̄ = (sum of all data values in the sample) / (sample size)
Mean
• The formula for calculating the mean from individual data is: Mean = x̄ = ∑ x(i) / n
• This formula is read as "x-bar equals the sum of the x's divided by n",
• where n is the number of observations.
MEASURES OF LOCATION - MEAN

• Given a data set of n values x_i, i = 1, 2, 3, …, n, the mean of the x's (denoted by x̄) is given by:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

• For example, if our data are:

Determining the mean of grouped data
• Mean = (∑fx)/∑f
• Where:
• ∑f is the total frequency of the distribution (the sum of the class frequencies).
• ∑fx is the sum of the products of the frequency and the class mark (midpoint) of each class interval.
MEAN FOR GROUPED DATA
• The formula for the mean for grouped data is given by: x̄ = ∑ f.xm / ∑ f
Mean: calculation example – grouped data
• The frequency distribution of the weights of patients at a health facility:
A             B               C               D
Weight (Kg)   Frequency (f)   Midpoint (xm)   f.xm
54-57         5               55.5            277.5
58-61         7               59.5            416.5
62-65         10              63.5            635
66-69         12              67.5            810
70-73         6               71.5            429
74-77         5               75.5            377.5
78-81         4               79.5            318
82-85         1               83.5            83.5
Total         ∑ f = n = 50                    ∑ f.xm = 3347
Mean: calculation example – grouped data
• Solution:
To calculate the mean:
Step 1: Find the midpoint of each class and enter it in column C:
xm = (54 + 57)/2 = 55.5; (58 + 61)/2 = 59.5; etc.
Step 2: For each class, multiply the frequency by the midpoint and enter the product in column D (f.xm):
5 × 55.5 = 277.5; etc.
Step 3: Find the sum of column D, as shown in the table = ∑ f.xm
Step 4: Divide the sum by n to get the mean:
x̄ = ∑ f.xm / n = 3347/50 = 66.94 ≈ 66.9 kilograms
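The four steps can be reproduced with a short Python sketch. The variable names are illustrative; the classes use 66-69 with midpoint 67.5 and 78-81 with midpoint 79.5, which are the values consistent with the column products and totals in the table.

```python
# Class limits (kg) and frequencies from the weight table
classes = [(54, 57), (58, 61), (62, 65), (66, 69),
           (70, 73), (74, 77), (78, 81), (82, 85)]
freqs = [5, 7, 10, 12, 6, 5, 4, 1]

# Step 1: midpoint (class mark) of each class
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Step 2: frequency times midpoint for each class (column D)
products = [f * xm for f, xm in zip(freqs, midpoints)]

# Steps 3 and 4: sum of column D divided by n
n = sum(freqs)
grouped_mean = sum(products) / n

print(midpoints)
print(sum(products), n)
print(grouped_mean)
```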
CALCULATE THE MEAN DEVIATION, VARIANCE AND
STANDARD DEVIATION
• Solution
• The variance of a population for grouped data is: σ² = ∑ f (x − x̄)² / n.
Mean deviation: calculation example – grouped data
• The frequency distribution of the weights of patients at a health facility (x̄ = 66.94):
Weight (Kg)   Freq (f)   Midpoint (xm)   f.xm    |xm − x̄|   f.|xm − x̄|   f.(xm − x̄)²
54-57         5          55.5            277.5   11.44       57.20         654.368
58-61         7          59.5            416.5   7.44        52.08         387.4752
62-65         10         63.5            635     3.44        34.40         118.336
66-69         12         67.5            810     0.56        6.72          3.7632
70-73         6          71.5            429     4.56        27.36         124.7616
74-77         5          75.5            377.5   8.56        42.80         366.368
78-81         4          79.5            318     12.56       50.24         631.0144
82-85         1          83.5            83.5    16.56       16.56         274.2336
Total         50                         3347    65.12       287.36        2560.32
CALCULATE THE MEAN DEVIATION, VARIANCE AND
STANDARD DEVIATION
• Solution
• The variance of a population for grouped data is: σ² = ∑ f (x − x̄)² / n
• = 2560.32/50
• Variance (σ²) = 51.2064
• Standard deviation = √variance
• = √51.2064 ≈ 7.156
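As a check on the arithmetic, the mean deviation, variance and standard deviation of the weight data can be recomputed in Python. This is a sketch using the population formulas (dividing by n), with midpoints 67.5 and 79.5 for the fourth and seventh classes as implied by the f.xm column; the variable names are illustrative.

```python
import math

midpoints = [55.5, 59.5, 63.5, 67.5, 71.5, 75.5, 79.5, 83.5]
freqs = [5, 7, 10, 12, 6, 5, 4, 1]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Mean deviation: frequency-weighted average of absolute deviations from the mean
mean_deviation = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / n

# Population variance for grouped data: sum of f * (x - mean)^2, divided by n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
std_dev = math.sqrt(variance)

print(mean, mean_deviation, variance, std_dev)
```

Dividing by n follows the population formula on the slide; a sample variance would divide by n − 1 instead.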
Mean: calculation example – grouped data
• The frequency distribution of distances in kilometres travelled by patients to a health facility:
A       B               C               D
Class   Frequency (f)   Midpoint (xm)   f.xm
5-9     1               7               7
10-14   2               12              24
15-19   3               17              51
20-24   5               22              110
25-29   4               27              108
30-34   3               32              96
35-39   2               37              74
Total   ∑ f = n = 20                    ∑ f.xm = 470
MEASURES OF LOCATION - MEAN
Advantages of the grouped-data formula:
• The process saves some computational labor.
• The difference between the x̄'s from the two approaches (raw data and grouped data) is very small if:
– the data set is large and
– the interval width is small
Applications and characteristics
• The arithmetic mean is useful when performing analytic manipulation.
With the exception of a situation where extreme scores occur in the
distribution, the mean is generally the best measure of central tendency.
 The values of mean tend to fluctuate least from sample to sample.
 It is amenable to algebraic treatment and it possesses known
mathematical relationships with other statistics.
 Hence, it is used in further statistical calculations. Thus, in most
situations the mean is more likely to be used than either the mode or the
median.
Advantages of the Mean
Simple to calculate.
Has mathematical properties that enable the
development of advanced statistics standard
distribution
Most descriptive analyses of continuous variables and
advanced statistical analyses use the mean as the
measure of central tendency.
Mean advantages
• Advantages
– It summarizes the entire distribution
– It is unbiased, meaning that on average, over repeated random samples, the sample mean equals the population mean μ

Mean Disadvantages
• It is affected by extreme values
• Sometimes the figure obtained is not a value that occurs anywhere in the distribution.
Limitations of the Mean - 1
• The mean is quite sensitive
to extreme values that skew
a distribution.
• Because the mean is so
sensitive to extreme values
(OUTLIERS), it is a poor
summary measure for data
that are severely skewed in
either direction.
Limitations of the Mean - 2
• In summary, the mean is very sensitive to extreme values.
• Example:
– Data set: 29, 31, 24, 29, 30, 25. Mean = 28.0
– Data set: 29, 31, 124, 29, 30, 25. Mean = 44.7
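The effect of a single outlier on the mean, compared with the median, can be seen with Python's standard `statistics` module. The two data sets are the ones in the example; the variable names are illustrative.

```python
from statistics import mean, median

typical = [29, 31, 24, 29, 30, 25]
with_outlier = [29, 31, 124, 29, 30, 25]   # 24 replaced by 124

# The mean shifts a lot when one extreme value is introduced...
print(mean(typical), mean(with_outlier))

# ...while the median, which depends only on the middle ranks, barely moves
print(median(typical), median(with_outlier))
```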

Jokes in statistics
• The Italian poet Trilussa said:
if you eat two chickens a day and I eat none, on average
you and I eat a chicken a day.
• The statistician who drowned in a lake that had an
average depth (the mean of the depth) of 40 cm.
Conclusion
The mean may be the same, but the observations may have a
very different dispersion.
Limitations of the Mean - 3
• Moreover the same mean may be found in two series of
observations that are very different:
• For instance let us consider the average height of 2 groups of
people (in cm):
• Group 1: 170, 168, 167, 165, 165, 165, 164, 163, 163, 163,
162, 161, 160, 160.
Mean : 164 cm
• Group 2: 205, 195, 189, 186, 170, 173, 160, 157, 150, 145,
143, 142, 141, 140.
Mean : 164 cm
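A quick Python check makes the point concrete: both groups share the same mean, but their spread differs considerably. This sketch uses `statistics.pstdev` (population standard deviation) as the measure of dispersion, which is a choice made here for illustration.

```python
from statistics import mean, pstdev

group1 = [170, 168, 167, 165, 165, 165, 164, 163, 163, 163, 162, 161, 160, 160]
group2 = [205, 195, 189, 186, 170, 173, 160, 157, 150, 145, 143, 142, 141, 140]

# Same centre...
print(mean(group1), mean(group2))

# ...very different dispersion
print(max(group1) - min(group1), max(group2) - min(group2))   # ranges
print(pstdev(group1), pstdev(group2))                         # standard deviations
```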
Mean is the centre of gravity of the distribution

Measures of central tendency & Normal Distribution
• In reality, the distribution very often does not look perfectly symmetrical but is extended in one direction (skewness).
• When the curve is stretched on the left side, it is said to be skewed to the left.
• When a distribution is skewed to one side, the mean moves in the same direction. The median also tends to move, but to a lesser extent.
Measures of central tendency & Normal Distribution
[Figure: a frequency distribution skewed to the right (positive skewness), with the positions of the mode, median and mean marked]
MEASURES OF LOCATION - SKEWNESS
• Skewness is a measure of the lack of symmetry.
• The skewness value can be positive or negative, or even undefined.
• In a symmetrical distribution, the mode, median, and mean will all be the same.
• In a skewed distribution:
– the mean is pulled in the direction of the tail
– the median falls between the mode and the mean
MEASURES OF LOCATION - GEOMETRIC MEAN
• Geometric means offer useful summaries for highly skewed data, such as:
– Survival data
– Income distribution
– Ratios
• The geometric mean is obtained by averaging the logarithms of the values and transforming the result back. The effect of this process is to minimize the influence of extreme observations (very large numbers in the data set).
• Don't use a geometric mean, though, if you have any negative or zero values in your data.
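A minimal Python sketch of the geometric mean, computed as the back-transformed mean of the natural logarithms. The function name and the small data set are hypothetical (the values are invented so that one large observation dominates); they are not the data from the slides.

```python
import math

def geometric_mean(values):
    """Geometric mean as exp(mean(log(x))); requires strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is undefined for zero or negative values")
    log_mean = sum(math.log(v) for v in values) / len(values)
    return math.exp(log_mean)

# Hypothetical right-skewed data with one unusually large observation
data = [2, 3, 4, 5, 28]
arithmetic = sum(data) / len(data)

print(arithmetic)
print(geometric_mean(data))   # pulled far less towards the large value
```

The explicit check mirrors the warning above: the logarithm is not defined for zero or negative values.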
MEASURES OF LOCATION - GEOMETRIC MEAN
• For example, consider the data set
The observation 28 is an unusually large measurement, causing a right skew and influencing the mean.
Compare the means for the raw and transformed data:
MEASURES OF LOCATION - GEOMETRIC MEAN
• A clinical trial assessed the effect of the drug 6-mercaptopurine (Purinethol) (6-MP) in maintaining remission.
• Patients were randomized to receive 6-MP or placebo. Below are the times to relapse (weeks) for the 21 patients in the placebo group:
GROUPED MODE


Determining the mode of grouped data

Mode = Lm + [D1/(D1 + D2)] × c
Where:
◦ D1 is the frequency in the modal class minus the frequency
in class before it
◦ D2 is the frequency in the modal class minus the frequency
after it
◦ Lm is the lower class limit of the modal class
◦ C is the width of the modal class
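The grouped-mode formula can be sketched in Python and applied to the frequency table used earlier for the grouped median (classes 10-14 to 50-54). The function name is hypothetical, and the code takes Lm as the lower class boundary (lower limit minus 0.5), as in the grouped-median example; this is an assumption, since the definition above says "lower class limit".

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data: Lm + D1 / (D1 + D2) * c."""
    i = freqs.index(max(freqs))                       # position of the modal class
    before = freqs[i - 1] if i > 0 else 0             # frequency of the class before
    after = freqs[i + 1] if i < len(freqs) - 1 else 0 # frequency of the class after
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    width = classes[1][0] - classes[0][0]             # c: class width
    lower_boundary = classes[i][0] - 0.5              # assumed meaning of Lm
    return lower_boundary + d1 / (d1 + d2) * width

classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [8, 8, 29, 12, 10, 11, 8, 8, 6]
print(grouped_mode(classes, freqs))
```

Here the modal class is 20-24 (frequency 29), so D1 = 29 − 8 = 21 and D2 = 29 − 12 = 17.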
Mode
Advantages:
• It is simple
• Unique
• Useful for qualitative data say the most
handsome man;

Mode
• Disadvantages:
– Cannot be called unbiased
– Cannot be used to reconstruct the distribution
– Can not be further processed
– Some distributions are bimodal

APPLICATION OF MEASURES
Measure                            Formula/Example                   Used for
Arithmetic Mean [Average]          Sum/Size, e.g. (a + b + c)/3      Most situations ("Average Item")
Median [Middle value]              Middle of sorted list             Widely varying samples (houses, incomes)
Mode [Most popular]                Most popular value                No compromises (Winner takes all)
Geometric Mean [Average factor]    ∛(abc)                            Investments, growth, area, volume
Harmonic Mean [Average rate]       3 / (1/a + 1/b + 1/c)             Speed, production, cost
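Each measure in the table is available in Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The sketch below uses small hypothetical data sets chosen only to illustrate the "used for" column.

```python
from statistics import mean, median, mode, geometric_mean, harmonic_mean

# Hypothetical data, invented for illustration
incomes = [20, 25, 30, 35, 400]        # widely varying sample
votes = ["A", "B", "A", "C", "A"]      # winner takes all
growth_factors = [2, 8]                # e.g. doubling, then an eightfold increase
speeds = [40, 60]                      # km/h over two legs of equal distance

print(mean(incomes), median(incomes))  # the median resists the extreme income
print(mode(votes))                     # most popular value
print(geometric_mean(growth_factors))  # average factor
print(harmonic_mean(speeds))           # average rate
```

For the two speeds, the harmonic mean gives the true average speed over equal distances, which is lower than the arithmetic mean of 50.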
What is the best measure of central tendency?
• The "best" measure of central tendency depends on the data you are analyzing:
– Whether to use the median, mean or mode will depend on the type of data you have (categorical or continuous data);
– Whether your data has outliers and/or is skewed;
– What you are trying to show from your data.
In a strongly skewed distribution, what is the best indicator of central tendency?
• It is usually inappropriate to use the mean in situations where your data is skewed.
• You would normally choose the median or mode, with the median usually preferred.
When is the mean the best measure of central tendency?
The mean is usually the best measure of central tendency to use when:
• The data distribution is continuous and symmetrical (e.g. the data are normally distributed).
• It also depends on what you are trying to show from your data.

When is the mode the best measure of central tendency?
• The mode is the least used of the measures of central tendency.
• It is the only measure appropriate for nominal data, and for this reason it is the best measure of central tendency when dealing with nominal data.
• The mean and/or median are usually preferred when dealing with all other types of data, but this does not mean the mode is never used with these data types.
When is the median the best measure of central tendency?
• The median is usually preferred to other measures of central tendency when your data set is skewed (i.e., forms a skewed distribution)
• or when dealing with ordinal data.
• However, the mode can also be appropriate in these situations, but is not as commonly used as the median.
What is the most appropriate measure of central tendency when the data has outliers?
• The median is usually preferred in these situations because the value of the mean can be distorted by the outliers.
• However, it will depend on how influential the outliers are.
• If the outliers do not significantly distort the mean, using the mean as the measure of central tendency will usually be preferred.
In a normally distributed data set, which is greatest: mode, median or mean?
• If the data set is perfectly normal, the mean, median and mode are equal to each other (i.e., the same value).
For any data set, which measures of central tendency have only one value?
• The median and mean can only have one value for a given data set.
• The mode can have more than one value.

Class Exercise 1
• The performance of UMI Participants in Quantitative Methods (QM) for the last five years has not been good, despite the fact that it is a compulsory module.
• In 2019/2020 UMI rolled out a QM performance improvement program that included a new QM curriculum, a change in teaching methods and the provision of the latest reading materials on the subject.
• The data below show the scores out of 50 in Quantitative Methods of 50 UMI DHSMA Participants selected at random in 2022.
Class Exercise cont.
29 22 8 34 24 33 28 37 30 35
17 30 11 23 34 15 18 30 38 38
33 26 31 35 46 21 27 19 29 36
43 36 13 25 32 26 33 32 45 40
44 42 34 38 25 13 20 8 39 31
Class Exercise 1 (cont'd)
Required
a) Starting with the class 5–10, then 11–16, etc., and maintaining equal
class width, construct a frequency distribution table for the
above data.
b) Using the information in the frequency table, calculate
the mean, median, mode and standard deviation of the
participants' scores.
c) Assuming that the mean QM score of participants in 2020
was 27 out of 50 and that scores are normally distributed, test
the hypothesis at the 5% level of significance that the QM
performance improvement program has worked and scores in
QM have increased significantly.
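One possible way to approach part (a) and the grouped mean in part (b) is sketched below in Python. This is an illustrative sketch, not the official solution: the variable names are assumptions, and the class midpoint is taken as the average of the stated class limits.

```python
# Scores out of 50 for the 50 participants (from the exercise data).
scores = [
    29, 22, 8, 34, 24, 33, 28, 37, 30, 35,
    17, 30, 11, 23, 34, 15, 18, 30, 38, 38,
    33, 26, 31, 35, 46, 21, 27, 19, 29, 36,
    43, 36, 13, 25, 32, 26, 33, 32, 45, 40,
    44, 42, 34, 38, 25, 13, 20, 8, 39, 31,
]

# Classes 5-10, 11-16, ... : each class covers 6 integer scores.
lower, width, n_classes = 5, 6, 7
classes = [(lower + i * width, lower + i * width + width - 1)
           for i in range(n_classes)]

# Count how many scores fall inside each class (limits inclusive).
freqs = [sum(lo <= x <= hi for x in scores) for lo, hi in classes]

# Grouped mean: sum(f * midpoint) / sum(f).
midpoints = [(lo + hi) / 2 for lo, hi in classes]
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)

# Modal class: the class with the highest frequency.
modal_class = classes[freqs.index(max(freqs))]

for (lo, hi), f in zip(classes, freqs):
    print(f"{lo:>2}-{hi:<2}  {f}")
print("grouped mean:", grouped_mean)
print("modal class:", modal_class)
```

The grouped median, mode and standard deviation can be built from the same `classes`, `freqs` and `midpoints` lists using the grouped-data formulas covered in the module.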
Class Exercise 2
• The data in the table below present the
monthly income (Shs '000) of 50 randomly selected
villagers after a poverty reduction exercise had been
carried out in Apac District. Before the exercise, those
in charge of the poverty reduction program had
carried out a baseline survey and established that
the villagers' average monthly income was Shs 49,000.
Class Exercise 2 (cont'd)
70 41 34 55 45 66 73 77 80 30
50 45 72 50 27 70 55 70 85 70
30 50 60 53 40 45 35 55 20 81
25 51 35 62 60 30 45 35 50 89
53 23 28 65 68 50 65 34 35 76
Class Exercise 2 (cont'd)
Required
a) Starting with the class 20–29 and using class intervals of
equal width, construct a frequency distribution table for the
data above.
b) Using the information in the frequency table,
calculate the mean, median, mode and standard deviation
of the villagers' income.
c) Assuming that the monthly income of the villagers is
normally distributed, test the hypothesis at the 5% level of
significance that the poverty reduction initiative has worked
and income has increased significantly.
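The test in part (c) could be set up along the following lines. This is a sketch under stated assumptions rather than the expected answer: it works from the raw data instead of the grouped table, expresses the baseline of Shs 49,000 as 49 in thousands to match the data, tests H0: mean = 49 against H1: mean > 49 (one-sided), and uses the large-sample normal critical value from `statistics.NormalDist`. A t table with 49 degrees of freedom gives a slightly larger critical value.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Monthly incomes in Shs '000 (from the exercise data).
incomes = [
    70, 41, 34, 55, 45, 66, 73, 77, 80, 30,
    50, 45, 72, 50, 27, 70, 55, 70, 85, 70,
    30, 50, 60, 53, 40, 45, 35, 55, 20, 81,
    25, 51, 35, 62, 60, 30, 45, 35, 50, 89,
    53, 23, 28, 65, 68, 50, 65, 34, 35, 76,
]

baseline = 49   # baseline mean income, Shs '000
alpha = 0.05    # significance level

n = len(incomes)
xbar = mean(incomes)   # sample mean
s = stdev(incomes)     # sample standard deviation (n - 1 divisor)

# Test statistic: how many standard errors the sample mean lies above baseline.
test_stat = (xbar - baseline) / (s / sqrt(n))

# One-sided critical value at the 5% level (normal approximation).
critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = test_stat > critical

print(f"n={n}, mean={xbar:.2f}, sd={s:.2f}")
print(f"test statistic={test_stat:.3f}, critical value={critical:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The same structure applies to part (c) of Class Exercise 1, with the scores in place of the incomes and 27 as the baseline mean.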
