
STAT 309

The document provides an overview of numerical measures, focusing on measures of central tendency (mean, median, mode) and measures of dispersion (range, mean deviation, variance, standard deviation). It explains how to calculate these measures for both ungrouped and grouped data, along with additional concepts such as quartiles, deciles, percentiles, skewness, and kurtosis. The document emphasizes the importance of understanding these measures for analyzing and interpreting data effectively.

Uploaded by

bcorneliusdela

NUMERICAL MEASURES

February 4, 2025
Measures of Central Tendency

The Three Measures of Central Tendency


Mean
Median
Mode

Mean
The mean is the average value of a data set: the sum of the observations divided by their number.
The mean is easy to compute.
It is the most commonly used measure of central tendency.



Population Mean (µ)

The symbol for the population mean is µ (mu).

Ungrouped Data.
For given data $X_1, X_2, X_3, \ldots, X_N$, the population mean is:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$$

where i = 1, 2, 3, ..., N
N = population size.



Population Mean (µ).

Grouped Data.
Consider distinct values (or class marks) $X_1, X_2, X_3, \ldots, X_k$ with respective frequencies $f_1, f_2, f_3, \ldots, f_k$. The population mean is:

$$\mu = \frac{1}{N}\sum_{i=1}^{k} f_i X_i$$

where i = 1, 2, 3, ..., k
$f_i$ = the frequency of $X_i$
$N = \sum_{i=1}^{k} f_i$ = the population size



Sample Mean (x̄)

The symbol for the sample mean is x̄.


Ungrouped Data.
Given data $x_1, x_2, x_3, \ldots, x_n$, the sample mean is computed as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where i = 1, 2, 3, ..., n
n = sample size.



Sample Mean (x̄)

Grouped Data.
Given distinct values (or class marks) $x_1, x_2, x_3, \ldots, x_k$ with frequencies $f_1, f_2, f_3, \ldots, f_k$, the sample mean is computed as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i$$

where i = 1, 2, 3, ..., k
$n = \sum_{i=1}^{k} f_i$ = sample size
$f_i$ = the frequency of $x_i$
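
The two mean formulas can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names `mean_ungrouped` and `mean_grouped` and the sample numbers are assumptions chosen for the example.

```python
from typing import Sequence


def mean_ungrouped(values: Sequence[float]) -> float:
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


def mean_grouped(marks: Sequence[float], freqs: Sequence[float]) -> float:
    """Frequency-weighted mean: sum(f_i * x_i) / sum(f_i)."""
    if len(marks) != len(freqs):
        raise ValueError("marks and freqs must have the same length")
    total = sum(freqs)  # n (or N): the total number of observations
    if total <= 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(marks, freqs)) / total


# Hypothetical data, used only for illustration.
print(mean_ungrouped([4, 8, 6, 5, 7]))        # 6.0
print(mean_grouped([10, 20, 30], [2, 5, 3]))  # 21.0
```

The same code serves for a population or a sample, since the two formulas differ only in notation.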



Median

The median is the middle value of a data set.


The median has no symbol.
Ungrouped Data
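Steps to locate the median of ungrouped data:
Arrange the data set in ascending or descending order.
Locate the middle value of the arranged data set; this is the median.

Case 1, n odd:

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value}$$

Case 2, n even:

$$\text{Median} = \frac{1}{2}\left[\left(\frac{n}{2}\right)^{\text{th}} \text{ value} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ value}\right]$$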
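
The two cases can be sketched in Python as below. The function name `median_ungrouped` and the data are hypothetical; note that the 1-based positions in the formulas become 0-based list indices in the code.

```python
def median_ungrouped(values):
    """Median of raw data, following the odd and even cases."""
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th value, which is index n // 2
        return ordered[mid]
    # n even: average of the (n / 2)th and (n / 2 + 1)th values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([7, 3, 9, 1, 5]))  # 5
print(median_ungrouped([7, 3, 9, 1]))     # 5.0
```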



Median

Grouped Data
To find the median of grouped data, we first locate the median class (the class containing the $\frac{n}{2}$th observation) and then use the interpolation method:

$$\text{Median} = l_0 + \left(\frac{\frac{n}{2} - f_{cm}}{f_m}\right) C_w$$

Where,
$l_0$ = lower class boundary of the median class,
$f_{cm}$ = cumulative frequency of the pre-median class,
$f_m$ = frequency of the median class,
$C_w$ = class width,
n = the data size.
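
A minimal Python sketch of the interpolation formula follows. It assumes equal-width classes described by their lower boundaries; the function name `grouped_median` and the frequency table are hypothetical.

```python
def grouped_median(lower_boundaries, freqs, class_width):
    """Median of grouped data: l0 + ((n/2 - f_cm) / f_m) * C_w."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # f_cm: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        # The median class is the first class whose cumulative
        # frequency reaches n / 2.
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * class_width
        cumulative += f
    raise ValueError("no median class found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with n = 40.
print(grouped_median([0, 10, 20, 30], [5, 10, 20, 5], 10))  # 22.5
```

Here n/2 = 20 falls in the class 20-30, whose preceding cumulative frequency is 15, giving 20 + (5/20)(10) = 22.5.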



Mode

The mode is the data point that occurs most in a data set.
There is no symbol for the mode.
Ungrouped Data
For ungrouped data, the mode is the data point with the highest
occurrence or the value that appears most in the data set.

Cases under mode


If no value occurs more often than the others, the data set has no mode.
If there is one mode, the data set is unimodal.
If there are two modes, the data set is bimodal.
If there are three modes, the data set is trimodal.
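
The cases above can be illustrated with a short Python sketch. The function names `modes` and `describe` are hypothetical, and two conventions are assumptions rather than statements from the slides: a data set whose distinct values all occur equally often is treated as having no mode, and more than three modes are labelled "multimodal".

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes (empty list if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumed convention: if several distinct values all tie, no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


def describe(values):
    """Label the data set by its number of modes."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return labels.get(len(modes(values)), "multimodal")


print(modes([1, 2, 2, 3]), describe([1, 2, 2, 3]))        # [2] unimodal
print(modes([1, 1, 2, 2, 3]), describe([1, 1, 2, 2, 3]))  # [1, 2] bimodal
print(modes([1, 2, 3]), describe([1, 2, 3]))              # [] no mode
```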



Mode

Grouped Data
Using the interpolation method:

$$\text{Mode} = l_0 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) C_w$$

Where,
l0 = lower class boundary of the modal class,
∆1 = Absolute difference between the frequency of the modal class and
the frequency of the pre-modal class.
∆2 = Absolute difference between the frequency of the modal class and
the frequency of the post-modal class,
Cw = class width.
Note: The modal class is the class with the highest frequency.
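
A hedged Python sketch of the grouped-mode formula is given below. The function name `grouped_mode` and the frequency table are hypothetical. Treating a missing neighbouring class (when the modal class is first or last) as having frequency 0 is an assumption, as is picking the first class when several share the highest frequency.

```python
def grouped_mode(lower_boundaries, freqs, class_width):
    """Mode of grouped data: l0 + (d1 / (d1 + d2)) * C_w."""
    if not freqs:
        raise ValueError("at least one class is required")
    # Modal class: the (first) class with the highest frequency.
    m = max(range(len(freqs)), key=lambda i: freqs[i])
    f_prev = freqs[m - 1] if m > 0 else 0               # pre-modal class
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # post-modal class
    d1 = abs(freqs[m] - f_prev)
    d2 = abs(freqs[m] - f_next)
    if d1 + d2 == 0:
        raise ValueError("mode is undefined when d1 + d2 = 0")
    return lower_boundaries[m] + d1 / (d1 + d2) * class_width


# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
print(grouped_mode([0, 10, 20, 30], [5, 10, 20, 5], 10))
```

For this table the modal class is 20-30, with differences 10 and 15, so the result is 20 + (10/25)(10) = 24.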



Measures of Dispersion

When an average is used to describe a given set of data, it can give a very misleading picture unless it is accompanied by supplementary information indicating how much the individual observations deviate from the average.
The degree to which numerical data tend to spread about an average is the dispersion or variation of the data. Variation or dispersion is a very important characteristic of data.
A measure of dispersion of a given set of data is important because it shows the degree of variation among the values in the data.
For example, a low dispersion of workers' wages indicates that workers are paid approximately equal wages, while a high dispersion indicates that workers are paid wages which are significantly different.



Range

The range is the simplest measure of dispersion.


The range of a set of measurements $x_1, x_2, x_3, \ldots, x_n$ is defined as the
difference between the largest and smallest measurements.
In the case of grouped data, the range is defined as the difference
between the last and the first class marks.

Properties of Range
The range is quick and easy to compute and easily understood.
It is determined entirely by the one or two extreme values of the data and is
not very sensitive to the number of observations in the data.
It is very crude and generally, not a useful measure of variation.
The range is a rough estimate of dispersion and unsuitable for further
statistical analysis.



The Mean Deviation

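The mean deviation (MD) is a measure of the average amount by which the observations $x_1, x_2, x_3, \ldots, x_n$ forming the data differ from the arithmetic mean $\bar{x}$. It is defined as follows:
Ungrouped Data

$$MD = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$

Grouped Data

$$MD = \frac{1}{n}\sum_{i=1}^{k} f_i\,|x_i - \bar{x}|$$

where, for grouped data, $x_1, \ldots, x_k$ are the distinct values (or class marks), $f_i$ their frequencies, and $n = \sum_{i=1}^{k} f_i$.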
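
Both forms of the mean deviation can be sketched in Python as follows. The function names and the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mean_deviation(values):
    """MD for raw data: average absolute deviation from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("mean deviation of empty data is undefined")
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mean_deviation_grouped(marks, freqs):
    """MD for grouped data: sum(f_i * |x_i - mean|) / sum(f_i)."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    mean = sum(f * x for x, f in zip(marks, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(marks, freqs)) / n


# Deviations from the mean 6 are 4, 2, 0, 2, 4, so MD = 12 / 5.
print(mean_deviation([2, 4, 6, 8, 10]))                 # 2.4
# Mean is 21; weighted absolute deviations sum to 54, so MD = 54 / 10.
print(mean_deviation_grouped([10, 20, 30], [2, 5, 3]))  # 5.4
```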



Properties of the Mean Deviation

The mean deviation is easily understood.


It is not greatly affected by extreme observations.
It is very useful in dealing with simple samples and situations
where elaborate analysis is required.



Variance and Standard Deviation

The variance (or the standard deviation) is the most widely preferred measure of dispersion. The variance of a set of observations $x_1, x_2, x_3, \ldots, x_n$ is the average of the squared deviations from the arithmetic mean.
It is denoted by $\sigma^2$ for population data and $s^2$ for sample data. For sample data the sum of squared deviations is divided by n − 1 rather than n.
That is,

Variance (Ungrouped Data)


$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$



Variance and Standard Deviation

Variance (Grouped Data)


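$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - \mu)^2$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2$$

where k is the number of distinct values (or classes), $f_i$ is the frequency of $x_i$, and N (or n) is the sum of the frequencies.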

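Standard Deviation.
The standard deviation is defined as the positive square root of the variance:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2} \quad \text{or} \quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

For grouped data, each squared deviation is weighted by its frequency $f_i$, as in the grouped variance formulas.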
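
The ungrouped variance and standard deviation can be sketched in Python as below. The function names `variance` and `std_dev` and the `sample` flag are hypothetical choices for this illustration; Python's standard `statistics` module offers `pvariance`, `variance`, `pstdev` and `stdev` for the same quantities.

```python
import math


def variance(values, sample=True):
    """Sample variance (divisor n - 1) or population variance (divisor N)."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def std_dev(values, sample=True):
    """Positive square root of the variance."""
    return math.sqrt(variance(values, sample))


data = [2, 4, 6, 8, 10]  # hypothetical data; squared deviations sum to 40
print(variance(data, sample=False))  # 8.0  (population: 40 / 5)
print(variance(data))                # 10.0 (sample: 40 / 4)
print(round(std_dev(data), 3))       # 3.162
```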



The Coefficient of Variation

The standard deviation is useful as a measure of dispersion within a given set of data.
Sometimes, we may be interested in comparing variations between two or more sets of data.
The standard deviation or the variance can be used for this purpose when the variables are given in the same units and their means are approximately equal.
When the units or the means differ, a relative measure of dispersion, the coefficient of variation, is used instead.

Coefficient of Variation (CV)


$$CV = \frac{SD}{\text{mean } (\bar{x})} \times 100\%$$
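
As a rough illustration of why the CV helps when units differ, the Python sketch below compares two hypothetical data sets measured in centimetres and kilograms. The function name `coefficient_of_variation` is an assumption, and the sample standard deviation (`statistics.stdev`) is used as the SD.

```python
import statistics


def coefficient_of_variation(values):
    """CV = (SD / mean) * 100, using the sample standard deviation."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


heights_cm = [150, 160, 170, 180, 190]  # hypothetical
weights_kg = [50, 60, 70, 80, 90]       # hypothetical

# Both sets have the same SD (about 15.81), but different relative spread.
print(round(coefficient_of_variation(heights_cm), 2))  # 9.3
print(round(coefficient_of_variation(weights_kg), 2))  # 22.59
```

Although the two standard deviations are numerically equal, the weights vary more relative to their mean, which the CV makes visible.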



Measures of Position

Quartiles
Quartiles divide a set of data, arranged in order of magnitude, into four equal parts such that:
The first or lower quartile ($Q_1$) has 25% of the observations falling below it.
The second or middle quartile ($Q_2$), known as the median, has 50% of the observations falling below it and 50% above it.
The third or upper quartile ($Q_3$) has 75% of the observations falling below it.



QUARTILES

Ungrouped Data (Steps)


Determine the median = Q2
Obtain values below Q2 and determine their median = Q1
Obtain values above Q2 and determine their median = Q3
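
The three steps can be sketched in Python as below. The function name `quartiles` is hypothetical. One detail is an assumption not spelled out in the steps: the halves are split by position, and when n is odd the middle observation is left out of both halves. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    q2 = median(ordered)
    lower_half = ordered[: n // 2]        # observations below Q2
    upper_half = ordered[(n + 1) // 2:]   # observations above Q2
    return median(lower_half), q2, median(upper_half)


print(quartiles([13, 1, 9, 5, 3, 11, 7]))    # (3, 7, 11)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.5, 4.5, 6.5)
```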

Grouped Data
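The interpolation method is used:

$$Q_1 = l_1 + \left(\frac{\frac{n}{4} - f_{cm}}{f_m}\right) C_w$$

$$Q_2 = l_2 + \left(\frac{\frac{n}{2} - f_{cm}}{f_m}\right) C_w$$

$$Q_3 = l_3 + \left(\frac{\frac{3n}{4} - f_{cm}}{f_m}\right) C_w$$

Here, for each quartile $Q_j$, $l_j$ is the lower class boundary of the class containing $Q_j$, $f_{cm}$ is the cumulative frequency of the class before it, $f_m$ is the frequency of that class, and $C_w$ is the class width.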



Deciles

Ungrouped Data
Deciles divide the data set into ten equal parts.
They are denoted by $D_1, D_2, D_3, \ldots, D_9$ and are such that
10%, 20%, 30%, ..., 90% of the observations fall below $D_1, D_2, D_3, \ldots, D_9$,
respectively.

Grouped Data
$$D_2 = l_2 + \left(\frac{\frac{2n}{10} - f_{cm}}{f_m}\right) C_w, \qquad D_5 = l_5 + \left(\frac{\frac{5n}{10} - f_{cm}}{f_m}\right) C_w$$



Percentiles
The percentiles divide the data into 100 equal parts. They are denoted by
$P_1, P_2, P_3, \ldots, P_{99}$ and are such that 1%, 2%, ..., 99% of the observations
fall below $P_1, P_2, P_3, \ldots, P_{99}$, respectively. They are used when dealing with
large amounts of data.
The 25th percentile (P25 ) is equal to Q1 .
The 50th percentile (P50 ) is the median (Q2 or M).
The 75th percentile (P75 ) is equal to Q3 .

Grouped Data
Interpolation Method.
The k-th percentile for grouped data is determined by the formula,

$$P_k = l_k + \left(\frac{\frac{kn}{100} - f_{ck}}{f_k}\right) C_w$$
Interpolation Method

$l_k$ = the lower class boundary of the class in which the k-th percentile lies.
$f_{ck}$ = the cumulative frequency of the class just before the k-th percentile class.
$f_k$ = the frequency of the k-th percentile class.
$C_w$ = the class width of the k-th percentile class.
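
Because quartiles and deciles are special percentiles ($Q_1 = P_{25}$, $D_2 = P_{20}$, and so on), a single interpolation routine can cover all three. The Python sketch below assumes equal-width classes given by their lower boundaries; the function name `grouped_percentile` and the frequency table are hypothetical.

```python
def grouped_percentile(k, lower_boundaries, freqs, class_width):
    """k-th percentile of grouped data: l_k + ((k*n/100 - f_ck) / f_k) * C_w."""
    if not 0 < k < 100:
        raise ValueError("k must lie strictly between 0 and 100")
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    target = k * n / 100
    cumulative = 0  # f_ck: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        # The percentile class is the first class whose cumulative
        # frequency reaches k * n / 100.
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * class_width
        cumulative += f
    raise ValueError("no percentile class found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with n = 40.
lowers, freqs, width = [0, 10, 20, 30], [5, 10, 20, 5], 10
for name, k in [("Q1", 25), ("Q2", 50), ("Q3", 75), ("D2", 20)]:
    print(name, round(grouped_percentile(k, lowers, freqs, width), 2))
```

For this table the results are Q1 = 15, Q2 = 22.5, Q3 = 27.5 and D2 = 13.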



Measures of Shape

The shape of a frequency distribution of n data observations, $x_1, x_2, x_3, \ldots, x_n$, represented graphically by a histogram or frequency polygon, can be described using various measures of shape.
Measures of shape determine whether the distribution of the data exhibits a symmetric pattern or stretches out in a particular direction.
Two such measures of shape are skewness and kurtosis.
Skewness
The skewness of a distribution indicates its degree of symmetry or
non-symmetry.
It is measured by the Pearson Coefficient of Skewness ($S_k$), defined by

$$S_k = \frac{3(\text{mean} - \text{median})}{SD}$$

which ranges from -3 to 3.



Skewness

Let m denote the median.
If x̄ = m, then Sk = 0: the distribution is said to be symmetric.
If x̄ > m, then Sk > 0: the distribution is said to be skewed to the right or positively skewed.
If x̄ < m, then Sk < 0: the distribution is said to be skewed to the left or negatively skewed.
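
The three cases can be checked numerically with a short Python sketch. The function name `pearson_skewness` and the data sets are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) as the SD is an assumption.

```python
import statistics


def pearson_skewness(values):
    """Pearson coefficient of skewness: 3 * (mean - median) / SD."""
    sd = statistics.stdev(values)  # sample SD (assumed choice)
    if sd == 0:
        raise ValueError("skewness is undefined when SD is zero")
    return 3 * (statistics.mean(values) - statistics.median(values)) / sd


symmetric = [1, 2, 3, 4, 5]       # mean 3 = median 3
right_skewed = [1, 2, 3, 4, 15]   # mean 5 > median 3
left_skewed = [1, 12, 13, 14, 15] # mean 11 < median 13

for data in (symmetric, right_skewed, left_skewed):
    print(round(pearson_skewness(data), 3))
```

The first value is zero, the second positive and the third negative, matching the symmetric, right-skewed and left-skewed cases.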



Kurtosis
The degree of peakedness, or kurtosis, of a distribution is described by the
coefficient of kurtosis, K, defined by

$$K = \frac{\frac{1}{2}(Q_3 - Q_1)}{P_{90} - P_{10}}$$
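For a normal distribution this percentile coefficient K is approximately 0.263. The comparison with 3 applies instead to the moment coefficient of kurtosis (the fourth central moment divided by the square of the variance).

Comparing to 3
If the moment coefficient equals 3, the distribution has the same peakedness as the normal distribution.
If it is less than 3, the distribution is flatter at the center than the normal distribution (the individual observations scatter widely about the mean).
If it is greater than 3, the distribution is more peaked than the normal distribution (the observations are close to the mean).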
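
The coefficient K defined from quartiles and percentiles can be sketched in Python as below. The function names are hypothetical. The sketch uses `statistics.NormalDist` to evaluate K for the standard normal distribution, where this percentile-based coefficient comes out near 0.263 (the benchmark of 3 belongs to the moment-based coefficient), and `statistics.quantiles` with its default method to estimate percentiles from raw data, which is one of several possible percentile conventions.

```python
from statistics import NormalDist, quantiles


def percentile_kurtosis(q1, q3, p10, p90):
    """K = (1/2)(Q3 - Q1) / (P90 - P10)."""
    if p90 == p10:
        raise ValueError("K is undefined when P90 equals P10")
    return 0.5 * (q3 - q1) / (p90 - p10)


def percentile_kurtosis_from_data(values):
    """Estimate K from raw data; cuts[k - 1] is the k-th percentile."""
    cuts = quantiles(values, n=100)
    return percentile_kurtosis(cuts[24], cuts[74], cuts[9], cuts[89])


z = NormalDist()  # standard normal distribution
k_normal = percentile_kurtosis(
    z.inv_cdf(0.25), z.inv_cdf(0.75), z.inv_cdf(0.10), z.inv_cdf(0.90)
)
print(round(k_normal, 3))  # 0.263

# Hypothetical evenly spread data 1, 2, ..., 999.
print(percentile_kurtosis_from_data(list(range(1, 1000))))  # 0.3125
```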
