
Describing Data Numerically

This document describes various statistical measures used to numerically describe data, including measures of location (minimum, maximum, mean, median, mode), percentiles, quartiles, measures of dispersion (range, interquartile range, standard deviation, variance, coefficient of variation), and measures of distribution shape (skewness and kurtosis). It provides examples calculating these statistics for datasets of star distances and blood test results. Key outputs include minimum, maximum, mean, median, percentiles, quartiles, interquartile range, standard deviation, variance, and coefficient of variation.

Uploaded by

Death Bringer
Copyright © All Rights Reserved

DESCRIBING DATA NUMERICALLY
Sept 23, 24 & 25, 2019

MEASURES OF LOCATION

1.) Minimum (Min)
2.) Maximum (Max)
3.) Measures of Central Tendency
    a.) Mean (arithmetic mean)
    b.) Median (middle value)
    c.) Mode (most frequent value)
4.) Percentile (divides an array into 100 equal parts)
5.) Decile (divides an array into 10 equal parts)
6.) Quartiles (divides an array into 4 equal parts)

Finding the jth percentile (Pj) of N data values:
• Arrange the data values in increasing order
• Find the location of Pj in the arranged list by computing L = (j/100) x N
• If L is a whole number, then Pj is the mean of the values in the Lth and (L+1)th positions
• If L is not a whole number, then Pj is the value at the next higher position

• Deciles correspond to certain percentile positions, e.g. D1 = P10
• Quartiles likewise correspond to percentile positions, e.g. Q1 = P25
• Percentiles, deciles and quartiles are best used for large data sets, i.e. N > 100

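
The percentile location rule above can be sketched in Python as follows. This is a minimal sketch, not a standard library routine: the function name `percentile` is hypothetical, positions are counted from 1 as on the slide, and j is assumed to lie strictly between 0 and 100.

```python
import math

def percentile(values, j):
    """Return P_j using the slide's rule L = (j/100) x N (1-based positions)."""
    if not 0 < j < 100:
        raise ValueError("this sketch assumes 0 < j < 100")
    data = sorted(values)              # arrange the values in increasing order
    # Multiply before dividing so that whole-number locations stay exact.
    loc = j * len(data) / 100          # L = (j/100) x N
    if loc == int(loc):                # L is a whole number
        k = int(loc)
        # mean of the values in the L-th and (L+1)-th positions
        return (data[k - 1] + data[k]) / 2
    # L is not a whole number: take the value at the next higher position
    return data[math.ceil(loc) - 1]

def decile(values, d):
    """D_d is the same as P_(10*d), e.g. D1 = P10."""
    return percentile(values, 10 * d)

def quartile(values, q):
    """Q_q is the same as P_(25*q), e.g. Q1 = P25."""
    return percentile(values, 25 * q)
```

For instance, `percentile([3, 1, 4, 2], 50)` gives L = 2, a whole number, so the result is the mean of the 2nd and 3rd sorted values. Other textbooks and software use different location rules, so results can differ slightly between tools.
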
EXAMPLE 1
Distances of stars. Of the 25 brightest stars, the distances from Earth (in
light-years) of those less than 100 light-years away are given
below. Find the mean, median, mode, maximum, minimum, P50, D5 and
Q2. (Source: New York Times Almanac 2010)
8.6 36.7 42.2 16.8 33.7 77.5 87.9
4.4 25.3 11.4 65.1 25.1 51.5

Answers

Max = 87.9
Min = 4.4
Mean = 37.4
Median = 33.7
P50 = 33.7
D5 = 33.7
Q2 = 33.7
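
As a cross-check, the Example 1 measures of location can be recomputed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative. The answer list does not report a mode, and the tally below shows why: all 13 distances are distinct, so no value is more frequent than the others.

```python
import statistics

# Star distances from Example 1 (light-years)
distances = [8.6, 36.7, 42.2, 16.8, 33.7, 77.5, 87.9,
             4.4, 25.3, 11.4, 65.1, 25.1, 51.5]

summary = {
    "max": max(distances),
    "min": min(distances),
    "mean": round(statistics.mean(distances), 1),   # rounded to one decimal
    "median": statistics.median(distances),         # 7th of 13 sorted values
}

# multimode lists every value tied for the highest frequency;
# when all values are distinct it returns all of them (no single mode).
modes = statistics.multimode(distances)

print(summary)
print("values tied for most frequent:", len(modes))
```
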
EXAMPLE 2
BUN Count. The blood urea nitrogen (BUN) count of 20 individual
patients is given here in milligrams per deciliter (mg/dl). Construct an
ungrouped frequency distribution for the data. Find the mean, median,
mode, maximum, minimum, P50, D5 and Q2.
17 12 13 14 16 18 17 18 16 15
13 11 19 17 19 14 20 17 12 22

Answers

Max = 22
Min = 11
Mean = 16
Median = 16.5
P50 = 16.5
D5 = 16.5
Q2 = 16.5
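
Example 2 also asks for an ungrouped frequency distribution and the mode, which the answer list does not show. A minimal sketch using `collections.Counter` follows; the variable names are illustrative. The tally shows that 17 occurs four times, more than any other value.

```python
from collections import Counter
import statistics

# BUN counts from Example 2 (mg/dl)
bun = [17, 12, 13, 14, 16, 18, 17, 18, 16, 15,
       13, 11, 19, 17, 19, 14, 20, 17, 12, 22]

# Ungrouped frequency distribution: each distinct value with its count
freq = Counter(bun)
print("BUN  Frequency")
for value in sorted(freq):
    print(f"{value:>3}  {freq[value]:>9}")

mode = statistics.mode(bun)        # most frequent value
mean = statistics.mean(bun)        # 320 / 20
median = statistics.median(bun)    # mean of the 10th and 11th sorted values
print("mode:", mode, "mean:", mean, "median:", median)
```
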
MEASURES OF DISPERSION

1.) Range (R), R = max - min
2.) Inter-quartile range (IQR), IQR = Q3 – Q1
    Gives the range of the middle 50% of the observations
3.) Standard deviation (σ), $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
    Measures the average distance of the observations from the mean
4.) Variance (σ²)
5.) Coefficient of variation (CV), CV = (σ / μ) x 100%
    Is useful for comparing two or more data sets with the same or
    different units of measurement
EXAMPLE 3. From the given data in Example 1, compute
the following: IQR, standard deviation, variance and CV

Answers

IQR = 44.2
σ = 25.54
σ² = 652.29
CV = 68.29% or 68%
MEASURE OF SKEWNESS

The skewness (SK) of a distribution indicates whether or not the distribution is symmetric.
• If SK = 0, then the data set is symmetric
• If SK > 0, then the data set is skewed to the right or positively skewed
• If SK < 0, then the data set is skewed to the left or negatively skewed
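
The slides do not state which skewness formula is intended. The sketch below assumes one common choice, the moment coefficient of skewness (third central moment divided by the 1.5 power of the second central moment, both with divisor N); the function name `skewness` is hypothetical. Its sign follows the interpretation listed above.

```python
def skewness(values):
    """Moment coefficient of skewness (an assumed formula, divisor N)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5    # assumes the values are not all equal (m2 > 0)

# Star distances from Example 1 (light-years)
distances = [8.6, 36.7, 42.2, 16.8, 33.7, 77.5, 87.9,
             4.4, 25.3, 11.4, 65.1, 25.1, 51.5]

sk = skewness(distances)
if sk > 0:
    shape = "skewed to the right (positively skewed)"
elif sk < 0:
    shape = "skewed to the left (negatively skewed)"
else:
    shape = "symmetric"
print(f"SK = {sk:.3f}: {shape}")
```

For the Example 1 distances the few large values (65.1, 77.5, 87.9) pull the coefficient above zero, so the data are classified as positively skewed.
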
KURTOSIS

The kurtosis (K) is a measure of the peakedness or flatness of a data distribution. The cut-offs below take K as excess kurtosis, so that a normal distribution has K = 0.
• If K = 0, then the distribution of the data set is mesokurtic or normal
• If K > 0, then the distribution of the data set is leptokurtic
• If K < 0, then the distribution of the data set is platykurtic
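
No kurtosis formula is given on the slides. The sketch below assumes the moment-based excess kurtosis (fourth central moment divided by the squared second central moment, minus 3, with divisor N), which is zero for a normal distribution and so matches the cut-offs above. The names `excess_kurtosis` and `classify` and the two small data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis (an assumed formula, divisor N)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3   # assumes the values are not all equal (m2 > 0)

def classify(k, tol=1e-9):
    """Map K to the terms used on the slide."""
    if abs(k) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 0 else "platykurtic"

flat = [1, 2, 3, 4, 5]                          # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]      # most values at the center

for name, data in (("flat", flat), ("peaked", peaked)):
    k = excess_kurtosis(data)
    print(f"{name}: K = {k:.2f}, {classify(k)}")
```
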
BOX-AND-WHISKERS PLOT AND THE FIVE NUMBER SUMMARY

The five-number summary:
• Minimum (Min)
• 1st Quartile (Q1)
• Median (P50 = Q2 = D5)
• 3rd Quartile (Q3)
• Maximum (Max)

Box length: IQR = Q3 – Q1

[Figure: box-and-whiskers diagram with the whiskers and the box labeled]
[Figure: boxplot for Example 2]