STAT 309
February 4, 2025
Measures of Central Tendency
Mean
The mean is the arithmetic average of a data set: the sum of the values divided by the number of values.
The mean is easy to compute.
It is the most commonly used measure of central tendency.
Ungrouped Data.
For given data $X_1, X_2, X_3, \dots, X_N$, the population mean is:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$$
where $i = 1, 2, 3, \dots, N$ and
$N$ = population size.
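As a quick illustration of the formula above, here is a minimal Python sketch. The data values and the function name `population_mean` are made up for the example.

```python
def population_mean(values):
    """Return (1/N) * sum of X_i for a list of N values."""
    N = len(values)
    if N == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / N


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = population_mean(data)
print(mu)  # 5.0
```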
Grouped Data.
Considering data $X_1, X_2, X_3, \dots, X_N$ with respective frequencies
$f_1, f_2, f_3, \dots, f_N$, the population mean is:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} f_i X_i$$
where $i = 1, 2, 3, \dots, N$,
$f_i$ = the frequency of $X_i$,
$N$ = the population size.
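Ungrouped Data.
For a sample $x_1, x_2, x_3, \dots, x_n$, the sample mean is:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
where $i = 1, 2, 3, \dots, n$ and
$n$ = sample size.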
Grouped Data.
Given data $x_1, x_2, x_3, \dots, x_n$ with frequencies $f_1, f_2, f_3, \dots, f_n$, the
sample mean can be estimated by:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i f_i$$
where $i = 1, 2, 3, \dots, n$,
$n$ = sample size,
$f_i$ = the frequency of $x_i$.
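The grouped mean can be sketched in Python as below. The sketch assumes that each $x_i$ is a distinct value (or class midpoint) and that the divisor is the total frequency, which is one reading of $n$ in the formula above. The class midpoints, the frequencies and the name `grouped_mean` are hypothetical.

```python
def grouped_mean(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i) for values with frequencies."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    n = sum(freqs)  # assumption: n is the total of the frequencies
    if n == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / n


# Hypothetical class midpoints and frequencies.
midpoints = [5, 15, 25, 35, 45]
freqs = [5, 8, 12, 9, 6]
x_bar = grouped_mean(midpoints, freqs)
print(x_bar)  # 25.75
```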
Median
Grouped Data
To find the median of grouped data, we first locate the median class
(the class containing the $\frac{n}{2}$-th observation) and then use the interpolation method:
$$\text{Median} = l_0 + \left( \frac{\frac{n}{2} - f_{cm}}{f_m} \right) C_w$$
where
$l_0$ = lower class boundary of the median class,
$f_{cm}$ = cumulative frequency of the pre-median class,
$f_m$ = frequency of the median class,
$C_w$ = class width,
$n$ = the data size.
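The interpolation steps above can be sketched in Python as follows. The sketch assumes equal class widths and takes the median class to be the first class whose cumulative frequency reaches $n/2$. The frequency table and the name `grouped_median` are hypothetical.

```python
def grouped_median(lower_bounds, freqs, class_width):
    """Median = l0 + ((n/2 - f_cm) / f_m) * C_w for grouped data."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    half = n / 2
    cum_before = 0  # f_cm: cumulative frequency before the current class
    for l0, fm in zip(lower_bounds, freqs):
        if cum_before + fm >= half:  # this is the median class
            return l0 + (half - cum_before) / fm * class_width
        cum_before += fm
    raise ValueError("no median class found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
median = grouped_median(lower_bounds, freqs, class_width=10)
print(round(median, 2))  # 25.83
```

Here $n = 40$, so the median class is 20-30 with $f_{cm} = 13$ and $f_m = 12$.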
Mode
The mode is the data point that occurs most often in a data set.
There is no symbol for the mode.
Ungrouped Data
For ungrouped data, the mode is the value that occurs with the highest
frequency in the data set.
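A minimal Python sketch of this definition is given below. It returns every value that shares the highest count, since a data set can have more than one mode. The function name `modes` and the data are made up for the example.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in sorted order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())  # highest frequency in the data
    return sorted(v for v, c in counts.items() if c == top)


print(modes([2, 4, 4, 4, 5, 5, 7, 9]))  # [4]
print(modes([1, 1, 2, 2, 3]))           # [1, 2]
```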
Grouped Data
Using the interpolation method:
$$\text{Mode} = l_0 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C_w$$
where
$l_0$ = lower class boundary of the modal class,
$\Delta_1$ = absolute difference between the frequency of the modal class and
the frequency of the pre-modal class,
$\Delta_2$ = absolute difference between the frequency of the modal class and
the frequency of the post-modal class,
$C_w$ = class width.
Note: The modal class is the class with the highest frequency.
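The modal interpolation can be sketched in Python as below. Two choices here are assumptions rather than part of the slides: ties for the highest frequency are resolved by taking the first such class, and a missing neighbouring class is treated as having frequency 0. The data and the name `grouped_mode` are hypothetical.

```python
def grouped_mode(lower_bounds, freqs, class_width):
    """Mode = l0 + (d1 / (d1 + d2)) * C_w for grouped data."""
    m = freqs.index(max(freqs))  # index of the (first) modal class
    f_prev = freqs[m - 1] if m > 0 else 0               # pre-modal class
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # post-modal class
    d1 = abs(freqs[m] - f_prev)
    d2 = abs(freqs[m] - f_next)
    if d1 + d2 == 0:
        raise ValueError("the mode is not defined when d1 + d2 = 0")
    return lower_bounds[m] + d1 / (d1 + d2) * class_width


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
mode = grouped_mode(lower_bounds, freqs, class_width=10)
print(round(mode, 2))  # 25.71
```

The modal class is 20-30, with $\Delta_1 = 12 - 8 = 4$ and $\Delta_2 = 12 - 9 = 3$.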
Properties of Range
The range is quick and easy to compute and is easily understood.
It is affected by one or two extreme values of the data and is not
very sensitive to the number of observations.
It is very crude and is generally not a useful measure of variation.
The range is a rough estimate of dispersion and is unsuitable for further
statistical analysis.
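The sensitivity to extreme values is easy to see in a small Python sketch. It takes the range to be the largest value minus the smallest, the usual definition. The data and the name `data_range` are made up for the example.

```python
def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(data_range(data))         # 7
print(data_range(data + [90]))  # 88: one extreme value dominates
```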
Grouped Data
$$MD = \frac{1}{n} \sum_{i=1}^{n} f_i \lvert x_i - \bar{x} \rvert$$
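The grouped mean deviation (MD) can be sketched in Python as below, assuming $n$ is the total frequency and $\bar{x}$ is the grouped mean. The data and the name `grouped_mean_deviation` are hypothetical.

```python
def grouped_mean_deviation(values, freqs):
    """MD = (1/n) * sum(f_i * |x_i - x_bar|), with n = sum(f_i)."""
    n = sum(freqs)  # assumption: n is the total of the frequencies
    if n == 0:
        raise ValueError("the total frequency must be positive")
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - x_bar) for x, f in zip(values, freqs)) / n


# Values 1, 2, 3 with frequencies 1, 2, 1 (mean 2).
md = grouped_mean_deviation([1, 2, 3], [1, 2, 1])
print(md)  # 0.5
```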
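Variance.
For ungrouped data, the sample variance is:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
and, for grouped data,
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2$$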
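Both forms of the sample variance can be sketched with one Python function, treating ungrouped data as the special case where every frequency is 1. In the grouped case the sketch assumes $n$ is the total frequency. The data and the name `sample_variance` are hypothetical.

```python
def sample_variance(values, freqs=None):
    """s^2 = sum(f_i * (x_i - x_bar)^2) / (n - 1), with n = sum(f_i)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: each value counted once
    n = sum(freqs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / (n - 1)


ungrouped = sample_variance([2, 4, 4, 4, 5, 5, 7, 9])
grouped = sample_variance([1, 2, 3], [1, 2, 1])
print(round(ungrouped, 3))  # 4.571
print(round(grouped, 3))    # 0.667
```

The grouped call gives the same result as `sample_variance([1, 2, 2, 3])`, since the frequencies only record how often each value repeats.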
Standard Deviation.
The standard deviation is defined as the positive square root of the variance:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} f_i (x_i - \mu)^2} \quad \text{or} \quad s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
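The difference between the two divisors can be checked numerically. The sketch below computes $\sigma$ and $s$ for the same made-up ungrouped data (every $f_i = 1$) and compares them with `statistics.pstdev` and `statistics.stdev` from the Python standard library.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n
squared_devs = sum((x - mean) ** 2 for x in data)

sigma = math.sqrt(squared_devs / n)    # population form, divisor N
s = math.sqrt(squared_devs / (n - 1))  # sample form, divisor n - 1

print(sigma)        # 2.0
print(round(s, 3))  # 2.138
print(math.isclose(sigma, statistics.pstdev(data)))  # True
print(math.isclose(s, statistics.stdev(data)))       # True
```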
Quartiles
Quartiles divide a set of data, arranged in order of magnitude, into four equal parts such that:
The first or lower quartile ($Q_1$) has 25% of the observations falling
below it.
The second or middle quartile ($Q_2$), known as the median, has 50%
of the observations falling below it and 50% above it.
The third or upper quartile ($Q_3$) has 75% of the observations falling below it.
Grouped Data
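The interpolation method is used:
$$Q_1 = l_1 + \left( \frac{\frac{n}{4} - f_{cm}}{f_m} \right) C_w$$
$$Q_2 = l_2 + \left( \frac{\frac{n}{2} - f_{cm}}{f_m} \right) C_w$$
$$Q_3 = l_3 + \left( \frac{\frac{3n}{4} - f_{cm}}{f_m} \right) C_w$$
where, for each quartile $Q_j$, $l_j$ is the lower class boundary of the class containing $Q_j$,
$f_{cm}$ is the cumulative frequency of the class before it, $f_m$ is the frequency of that class,
and $C_w$ is the class width.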
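The three quartile formulas differ only in the target position $jn/4$, so they can be sketched as one Python function. The sketch assumes equal class widths and takes the quartile class to be the first class whose cumulative frequency reaches the target. The frequency table and the name `grouped_quartile` are hypothetical.

```python
def grouped_quartile(j, lower_bounds, freqs, class_width):
    """Q_j = l_j + ((j*n/4 - f_cm) / f_m) * C_w for j = 1, 2, 3."""
    if j not in (1, 2, 3):
        raise ValueError("j must be 1, 2 or 3")
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    target = j * n / 4
    cum_before = 0  # f_cm: cumulative frequency before the current class
    for l_j, f_m in zip(lower_bounds, freqs):
        if cum_before + f_m >= target:  # this class contains Q_j
            return l_j + (target - cum_before) / f_m * class_width
        cum_before += f_m
    raise ValueError("no quartile class found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
q1, q2, q3 = (grouped_quartile(j, lower_bounds, freqs, 10) for j in (1, 2, 3))
print(q1)            # 16.25
print(round(q2, 2))  # 25.83
print(round(q3, 2))  # 35.56
```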
Deciles
Ungrouped Data
Deciles divide the data set into ten equal parts.
They are denoted by $D_1, D_2, D_3, \dots, D_9$ and are such that
10%, 20%, 30%, ..., 90% of the observations fall below $D_1, D_2, D_3, \dots, D_9$,
respectively.
Grouped Data
$$D_2 = l_2 + \left( \frac{\frac{2n}{10} - f_{cm}}{f_m} \right) C_w,$$
$$D_5 = l_5 + \left( \frac{\frac{5n}{10} - f_{cm}}{f_m} \right) C_w$$
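As a worked illustration, assume a hypothetical frequency table (not taken from the lecture) with classes 0 to 10, 10 to 20, 20 to 30, 30 to 40 and 40 to 50 and frequencies 5, 8, 12, 9 and 6. Then $n = 40$, $C_w = 10$, and the cumulative frequencies are 5, 13, 25, 34 and 40. Here $l$, $f_{cm}$ and $f_m$ refer to the class containing the decile.

```latex
\begin{align*}
\frac{2n}{10} &= 8
  \;\Rightarrow\; \text{decile class } 10 \text{ to } 20,\quad
  l_2 = 10,\ f_{cm} = 5,\ f_m = 8 \\
D_2 &= 10 + \left( \frac{8 - 5}{8} \right) 10 = 13.75 \\[4pt]
\frac{5n}{10} &= 20
  \;\Rightarrow\; \text{decile class } 20 \text{ to } 30,\quad
  l_5 = 20,\ f_{cm} = 13,\ f_m = 12 \\
D_5 &= 20 + \left( \frac{20 - 13}{12} \right) 10 \approx 25.83
\end{align*}
```

Since $5n/10 = n/2$, $D_5$ is located in the same way as the median.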
Percentiles
Grouped Data
Interpolation Method.
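The $k$-th percentile for grouped data is determined by the formula
$$P_k = l_k + \left( \frac{\frac{kn}{100} - f_{ck}}{f_k} \right) C_w$$
where $l_k$ is the lower class boundary of the class containing $P_k$,
$f_{ck}$ is the cumulative frequency of the class before it,
$f_k$ is the frequency of that class, and $C_w$ is the class width.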
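The percentile formula can be sketched in Python as below. The sketch assumes equal class widths and takes the percentile class to be the first non-empty class whose cumulative frequency reaches $kn/100$. The frequency table and the name `grouped_percentile` are hypothetical.

```python
def grouped_percentile(k, lower_bounds, freqs, class_width):
    """P_k = l_k + ((k*n/100 - f_ck) / f_k) * C_w for 0 < k < 100."""
    if not 0 < k < 100:
        raise ValueError("k must lie strictly between 0 and 100")
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    target = k * n / 100
    cum_before = 0  # f_ck: cumulative frequency before the current class
    for l_k, f_k in zip(lower_bounds, freqs):
        if f_k > 0 and cum_before + f_k >= target:  # class containing P_k
            return l_k + (target - cum_before) / f_k * class_width
        cum_before += f_k
    raise ValueError("no percentile class found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
p25 = grouped_percentile(25, lower_bounds, freqs, 10)
p90 = grouped_percentile(90, lower_bounds, freqs, 10)
print(p25)            # 16.25
print(round(p90, 2))  # 43.33
```

With this table, $P_{25}$ targets the same position, $n/4$, as the first quartile.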
Skewness
Pearson's coefficient of skewness is
$$S_k = \frac{3(\text{mean} - \text{median})}{SD}$$
which ranges from $-3$ to $3$.
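The coefficient above can be sketched in Python with the standard `statistics` module. The formula only says SD, so the choice between the sample and population standard deviation is left as a parameter here. That parameter, the data and the name `pearson_skewness` are assumptions for the example.

```python
import statistics


def pearson_skewness(values, sample=True):
    """Return 3 * (mean - median) / SD for ungrouped data."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Assumption: SD may be either s (sample) or sigma (population).
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    if sd == 0:
        raise ValueError("skewness is undefined when SD is 0")
    return 3 * (mean - median) / sd


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, median 4.5
print(pearson_skewness(data, sample=False))  # 0.75
print(round(pearson_skewness(data), 3))      # 0.702
```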
Kurtosis
Comparing the kurtosis coefficient $K$ to 3:
If $K = 3$, the distribution is said to be mesokurtic: it is as peaked as the normal
distribution.
If $K < 3$, the distribution is flatter at the center than the normal
distribution (the individual observations scatter widely about the
mean).
If $K > 3$, the distribution is more peaked than the normal distribution
(the observations are close to the mean).
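The comparison with 3 can be sketched in Python as below. This section does not show a formula for $K$, so the sketch assumes the moment coefficient of kurtosis, $K = m_4 / m_2^2$, where $m_2$ and $m_4$ are the second and fourth central moments. That coefficient equals 3 for a normal distribution. The function names, the tolerance and the data are hypothetical.

```python
def kurtosis_coefficient(values):
    """Moment coefficient m4 / m2**2 (an assumed definition of K)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2


def describe_peakedness(K, tol=1e-9):
    """Compare K with 3, the value for the normal distribution."""
    if abs(K - 3) <= tol:
        return "as peaked as normal"
    return "flatter than normal" if K < 3 else "more peaked than normal"


flat = kurtosis_coefficient([1, 2, 3, 4, 5])
peaked = kurtosis_coefficient([0] * 8 + [10, -10])
print(round(flat, 2), describe_peakedness(flat))  # 1.7 flatter than normal
print(peaked, describe_peakedness(peaked))        # 5.0 more peaked than normal
```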