
Numerical Descriptive Measures: Powerpoint To Accompany

Uploaded by

pinky
Chapter 3

Numerical descriptive measures

PowerPoint to accompany:

Cover illustration: © Raw Pixel/Shutterstock.com

Slide 1
Copyright © 2016 Pearson Australia (a division of Pearson Australia Group Pty Ltd) – 9781486018956 / Berenson / Basic Business Statistics 4/E
Learning Objectives

After studying this chapter you should be able to:

1 calculate and interpret numerical descriptive measures of central tendency, variation and shape for numerical data
2 calculate and interpret descriptive summary measures for a population
3 construct and interpret a box-and-whisker plot
4 calculate and interpret the covariance and the coefficient of correlation for bivariate data

Slide 2
Sheldon's calculation of Penny's men

This YouTube clip contains strong language but has been included because of its excellent content. We leave it to your discretion whether to retain or remove it.

Source: http://www.youtube.com/watch?v=-TIgftOZwy0

Slide 3
Describing Data
Describing data by its central tendency, variation and shape

• Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean
• Quartiles
• Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
• Shape: Skewness

Slide 4
Measures of Central Tendency

Central Tendency

• Arithmetic Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
• Median: Midpoint of ranked values
• Mode: Most frequently observed value
Slide 5
The Arithmetic Mean

For a sample of size n the sample mean, denoted $\bar{X}$, is calculated:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Where the $X_i$ are the observed values and Σ means to sum or add up

Slide 6
The Median

In an ordered array, the median is the ‘middle’ number (50% above, 50% below)

[Figure: two dot plots, scale 0 to 10; Median = 3 in both]

Its main advantage over the arithmetic mean is that it is not affected by extreme values

Slide 7
Finding the Median

The location of the median:

$$\text{Median} = \frac{n+1}{2} \text{ ranked value}$$

Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data

Rule 1: If the number of values in the data set is odd, the median is the middle ranked value

Rule 2: If the number of values in the data set is even, the median is the mean (average) of the two middle ranked values
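
The two rules can be sketched in Python. This is a minimal illustration, not part of the original slides; the function name `median_by_position` and the small data sets are made up for the example.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Rule 1: n is odd, so position (n + 1) / 2 is a whole number
        return ranked[middle]
    # Rule 2: n is even, so average the two middle ranked values
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_result = median_by_position([7, 1, 5])       # ranked: 1 5 7
even_result = median_by_position([7, 1, 5, 3])   # ranked: 1 3 5 7
print(odd_result, even_result)
```

Sorting first matters: the position rule applies to the ranked data, not to the order in which the values were collected.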

Slide 8
The Mode
• A measure of central tendency
• Value that occurs most often (the most frequent)
• Not affected by extreme values

Unlike mean and median, there may be no unique (single) mode for a
given data set

Used for either numerical or categorical (nominal) data

An example of no mode:
[Figure: dot plot, scale 0 to 6]

An example of several modes:
[Figure: dot plot, scale 0 to 14; Modes = 5 and 9]
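
A short Python sketch of the "no mode" and "several modes" cases. The helper name `modes` and all three data sets are hypothetical; the dot-plot data from the slide are not available, so the second list is only constructed to have modes 5 and 9.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency (empty list if no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


no_mode = modes([0, 1, 2, 3, 4, 5, 6])
several_modes = modes([0, 2, 5, 5, 5, 9, 9, 9, 12, 14])
categorical_mode = modes(["red", "blue", "red", "green"])
print(no_mode, several_modes, categorical_mode)
```

Counting frequencies works for categorical values as well as numbers, which is why the mode is the only one of the three measures usable on nominal data.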
Slide 9
Review Example

Prices for five houses located near the beach:

$2,000,000
$500,000
$300,000
$100,000
$100,000

Slide 10
Review Example (continued)

House Prices: $2,000,000, $500,000, $300,000, $100,000, $100,000

$$\text{Mean} = \frac{2{,}000{,}000 + 500{,}000 + 300{,}000 + 100{,}000 + 100{,}000}{5} = \frac{3{,}000{,}000}{5} = \$600{,}000$$

Median (position = 6/2 = 3) = $300,000

Mode = $100,000
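
The same three measures can be checked with Python's standard `statistics` module. This is a sketch added for illustration; the variable names are arbitrary.

```python
import statistics

# The five beach house prices from the review example
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # sum of the prices divided by 5
median_price = statistics.median(prices)  # middle value of the ranked prices
mode_price = statistics.mode(prices)      # most frequent price

print(mean_price, median_price, mode_price)
```

The mean is twice the median here because the single $2,000,000 house pulls it upwards.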

Slide 11
Which Measure of Location is the ‘best’ in this situation?

• The mean is generally used most often, unless extreme values (outliers) exist
• The median is often used, since it is not sensitive to extreme values
• The mode is usually the least used of the three
• Since we have an obvious outlier ($2,000,000), it makes sense to use the median in this instance
• Most housing prices are now reported as median housing prices in Australian newspapers due to possible outliers

Slide 12
Quartiles

• Quartiles split the ranked data into four segments, with an equal
number of values per segment

• The first quartile, Q1, is the value for which 25% of the observations
are smaller and 75% are larger

• The second quartile, Q2, is the same as the median (50% are smaller,
50% are larger)

• Only 25% of the observations are greater than the third quartile, Q3
[Figure: ranked data split into four segments of 25% each, divided at Q1, Q2 and Q3]

Slide 13
Quartiles (continued)

Similar to the median, we find a quartile by determining the value in the appropriate position in the ranked data, where:

First quartile position: Q1 = (n+1)/4

Second quartile position: Q2 = (n+1)/2 (the median)

Third quartile position: Q3 = 3(n+1)/4

where n is the number of observed values (sample size)

Slide 14
Quartile Example

First, the data must be arranged in an ordered array (note n = 15)

1 3 5 7 8 10 11 12 13 16 16 17 18 21 22

• Q1 is in the (15+1)/4 = 4th position of the ranked data, so Q1 = 7
• Q2 (the median) is in the (15+1)/2 = 8th position of the ranked data, so Q2 = 12
• Q3 is in the 3(15+1)/4 = 12th position of the ranked data, so Q3 = 17
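
As a cross-check, the quartiles of this data set can be computed in Python. The sketch assumes `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions i(n+1)/4 and interpolates linearly when a position is not a whole number; a textbook rounding rule for fractional positions may give slightly different answers on other data sets.

```python
import statistics

# Ordered array from the quartile example (n = 15)
data = [1, 3, 5, 7, 8, 10, 11, 12, 13, 16, 16, 17, 18, 21, 22]
n = len(data)

# Positions given by the (n + 1) rule
q1_position = (n + 1) / 4
q3_position = 3 * (n + 1) / 4

# method="exclusive" uses the same (n + 1)-based positions
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

print(q1_position, q3_position)
print(q1, q2, q3)
```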

Slide 15
Geometric Mean vs. Geometric Mean Rate of Return

Geometric mean is used to measure the average rate of change of a variable over n periods of time

$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$

Geometric mean rate of return measures the status of an investment over time or average percentage change in a variable

$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$

where $R_i$ is the rate of return in time period i

Slide 16
Geometric Mean and Mean Rate Example

An investment of $100,000 declined to $50,000 at the end of year 1 and rebounded to $100,000 at end of year 2:

$X_1 = \$100{,}000$, $X_2 = \$50{,}000$, $X_3 = \$100{,}000$

Year 1: 50% decrease; Year 2: 100% increase

The overall two-year rate of return is zero, since it started and ended at the same level

Slide 17
Geometric Mean and Mean Rate Example (continued)

Use the 1-year returns to calculate the arithmetic mean and the geometric mean:

Arithmetic mean rate of return:

$$\bar{X} = \frac{(-50\%) + (100\%)}{2} = 25\%$$

Misleading result

Geometric mean rate of return:

$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
$$= [(1 + (-50\%)) \times (1 + (100\%))]^{1/2} - 1$$
$$= [(0.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\%$$

More accurate result
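
The comparison can be reproduced with a few lines of Python. This is a sketch for illustration; the function names are made up, and returns are written as decimals (-0.50 for a 50% decrease).

```python
import math


def arithmetic_mean_return(returns):
    """Plain average of the period returns."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """n-th root of the compounded growth factors, minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Year 1: 50% decrease, year 2: 100% increase
returns = [-0.50, 1.00]

arithmetic = arithmetic_mean_return(returns)
geometric = geometric_mean_return(returns)
print(f"arithmetic: {arithmetic:.0%}, geometric: {geometric:.0%}")
```

The geometric version multiplies growth factors, so it reflects compounding; the arithmetic version ignores that the 100% gain applies to a smaller balance.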


Slide 18
Measures of Variation

Variation

• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation

Measures of variation give information on the spread or variability of the data values

E.g. same centre, different variation
Slide 19
The Range

Simplest measure of variation

Difference between the largest and the smallest values in a set of data

Range = $X_{largest} - X_{smallest}$

Example:
[Figure: dot plot, scale 0 to 14]

Range = 14 - 1 = 13
Slide 20
Disadvantages

Ignores the distribution of the data

[Figure: two dot plots, scale 7 to 12; Range = 12 - 7 = 5 for both]

Like the mean, the range is sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119

Slide 21
The Interquartile Range (IQR)

Like the median and Q1 and Q3, the IQR is a resistant summary measure (resistant to the presence of extreme values)

Eliminates outlier problems by using the interquartile range, as high- and low-valued observations are removed from calculations

IQR = 3rd quartile – 1st quartile

IQR = Q3 – Q1
Slide 22
The Interquartile Range (IQR) (continued)

Example: Range = 200 – 10 = 190 (misleading)

Five-number summary (25% of the values lie in each interval):
X minimum = 10, Q1 = 30, Q2 = Median = 45, Q3 = 60, X maximum = 200

IQR = 60 – 30 = 30

Even if the value of 200 changes to 300, IQR remains the same, hence resistant to changes in extreme values
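
To see the resistance in practice, the sketch below reuses the two 25-value data sets from the range slide (one ending in 5, the other in 120). It assumes `statistics.quantiles` with `method="exclusive"` as the quartile rule, which averages the two neighbouring values at a half position; the function names are illustrative.

```python
import statistics


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 minus Q1, using (n + 1)-based quartile positions."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1


base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

range_base = value_range(base)
range_outlier = value_range(with_outlier)
iqr_base = interquartile_range(base)
iqr_outlier = interquartile_range(with_outlier)

print(range_base, range_outlier)
print(iqr_base, iqr_outlier)
```

Replacing the largest value changes the range from 4 to 119, while the IQR is unchanged because neither quartile position is near the extreme value.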

Slide 23
The Sample Variance – $S^2$

• Measures average scatter around the mean
• Units are also squared

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

where $\bar{X}$ = mean, n = sample size, $X_i$ = ith value of the variable X

Slide 24
The Sample Standard Deviation - S

• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Slide 25
Calculation Example: Sample Standard Deviation

Sample Data ($X_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{X}$ = 16

$$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$$
$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$$
$$= \sqrt{\frac{130}{7}} = 4.3095$$

A measure of the ‘average’ scatter around the mean
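
The calculation can be followed step by step in Python and compared with `statistics.stdev`, which also divides by n - 1. This is an illustrative sketch; the variable names are arbitrary.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean (the numerator under the root)
sum_of_squares = sum((x - mean) ** 2 for x in data)

sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(mean, sum_of_squares, round(sample_sd, 4))
print(statistics.stdev(data))  # library result for comparison
```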
Slide 26
Measuring variation

Small standard deviation

Large standard deviation

Slide 27
Comparing Standard Deviations

• Data A: Mean = 15.5, S = 3.338
• Data B: Mean = 15.5, S = 0.926
• Data C: Mean = 15.5, S = 4.567

[Figure: dot plots of Data A, Data B and Data C, scale 11 to 21]
Slide 28
Variance and Standard Deviation

Advantages
• Each value in the data set is used in the calculation
• Values far from the mean are given extra weight as
deviations from the mean are squared

Disadvantages
• Sensitive to extreme values (outliers)
• Measures of absolute variation not relative variation

Slide 29
The Coefficient of Variation

• Measures relative variation, i.e. shows variation relative to mean
• Can be used to compare two or more sets of data measured in different units
• Always expressed as percentage (%)

$$CV = \left(\frac{S}{\bar{X}}\right) \times 100\%$$

Slide 30
Coefficient of Variation Example

Stock A:
Average price last year = $50; standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\%$$

Stock B:
Average price last year = $100; standard deviation = $5

$$CV_B = \left(\frac{S}{\bar{X}}\right) \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\%$$

Both stocks have the same standard deviation, but Stock B is less variable relative to its price
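
A minimal Python sketch of the same comparison; the function name `coefficient_of_variation` is chosen for this example.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B

print(f"CV_A = {cv_a:.0f}%, CV_B = {cv_b:.0f}%")
```

Because the dollar units cancel in the ratio, the same function can compare data sets measured in different units.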

Slide 31
The Z Score

• The difference between a given observation and the mean, divided by the standard deviation

$$Z = \frac{X - \bar{X}}{S}$$

For example:
• A Z score of 2.0 means that a value is 2.0 standard deviations from the mean
• A value with a Z score above 3.0 or below -3.0 is considered an outlier
Slide 32
Z Score
Example:

If the mean is 14.0 and the standard deviation is 3.0, what is the Z score for the value 18.5?

$$Z = \frac{X - \bar{X}}{S} = \frac{18.5 - 14.0}{3.0} = 1.5$$

• The value 18.5 is 1.5 standard deviations above the mean
• A negative Z score would indicate that a value is below the mean
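
The example translates directly into Python. The sketch also applies the 3.0 cut-off from the previous slide; the function names and the extra value 24.0 are illustrative.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


def is_outlier(z, threshold=3.0):
    """Flag Z scores above +threshold or below -threshold."""
    return abs(z) > threshold


z = z_score(18.5, mean=14.0, std_dev=3.0)
z_extreme = z_score(24.0, mean=14.0, std_dev=3.0)  # hypothetical extra value

print(z, is_outlier(z))
print(round(z_extreme, 2), is_outlier(z_extreme))
```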

Slide 33
The Shape of a Distribution

Describes how data are distributed

Measures of shape:
• symmetric or skewed

• Left-skewed: Mean < Median
• Symmetric: Mean = Median
• Right-skewed: Median < Mean

Slide 34
Using Microsoft Excel

Use menu choice: Data > Data Analysis > Descriptive Statistics

Slide 35
Using Microsoft Excel (continued)

Slide 36
Numerical Measures for a Population

• Population summary measures are called parameters
• The population mean is the sum of the values in the population divided by the population size, N

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

Slide 37
Population Variance vs. Standard Deviation

Population Variance:
• the average of the squared deviations of values from the mean

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

μ = population mean; N = population size; $X_i$ = ith value of the variable X

Population Standard Deviation:
• shows variation about the mean
• is the square root of the population variance
• has the same units as the original data

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
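
The only computational difference from the sample formulas is the divisor: N instead of n - 1. The sketch below makes this concrete under one loud assumption: the eight values from the earlier sample example are treated, purely for illustration, as if they were a complete population.

```python
import math

# ASSUMPTION: these eight values are treated as an entire population
population = [10, 12, 14, 15, 17, 18, 18, 24]

N = len(population)
mu = sum(population) / N
sum_of_squares = sum((x - mu) ** 2 for x in population)

population_variance = sum_of_squares / N        # divide by N
population_sd = math.sqrt(population_variance)

sample_variance = sum_of_squares / (N - 1)      # divide by n - 1 for comparison
sample_sd = math.sqrt(sample_variance)

print(population_variance, round(population_sd, 4))
print(round(sample_variance, 4), round(sample_sd, 4))
```

Dividing by N always gives a slightly smaller result than dividing by n - 1 for the same values.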
Slide 38
The Empirical Rule

If the data distribution is approximately bell-shaped, then the interval μ ± 1σ contains about 68% of the values in the population

[Figure: bell-shaped curve with 68% of the values within μ ± 1σ]

Slide 39
The Empirical Rule (continued)
• μ ± 2σ contains about 95% of the values in the population
• μ ± 3σ contains about 99.7% of the values in the population

[Figure: bell-shaped curves with 95% of the values within μ ± 2σ and 99.7% within μ ± 3σ]

Slide 40
Chebyshev Rule and Examples
Regardless of how the data are distributed, the percentage of values within k standard deviations of the mean must be at least:

$$\left(1 - \frac{1}{k^2}\right) \times 100\% \quad (\text{for } k > 1)$$

• k = 1 (μ ± 1σ): at least (1 - 1/1²) × 100% = 0%
• k = 2 (μ ± 2σ): at least (1 - 1/2²) × 100% = 75%
• k = 3 (μ ± 3σ): at least (1 - 1/3²) × 100% = 89%
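
The bound is easy to tabulate in Python. This is a small sketch; the function name `chebyshev_lower_bound` is chosen for this example.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("the bound is only informative for k >= 1")
    return (1 - 1 / k**2) * 100


bounds = {k: chebyshev_lower_bound(k) for k in (1, 2, 3)}
for k, bound in bounds.items():
    print(f"k = {k}: at least {bound:.1f}%")
```

For k = 3 the exact value is 88.9%, which the slide rounds to 89%. These guarantees are weaker than the 68%, 95% and 99.7% of the empirical rule because they hold for any distribution, not only bell-shaped ones.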

Slide 41
Approximating the Mean
• Sometimes only a frequency distribution is available, not the raw data
• Use the midpoint of a class interval to approximate the values in that class

$$\bar{X} = \frac{\sum_{j=1}^{c} m_j f_j}{n}$$

where n = number of values or sample size
c = number of classes in the frequency distribution
$m_j$ = midpoint of the jth class
$f_j$ = number of values in the jth class

Slide 42
Approximating the Standard Deviation

Assume that all values within each class interval are located at the midpoint of the class

$$S = \sqrt{\frac{\sum_{j=1}^{c} (m_j - \bar{X})^2 f_j}{n - 1}}$$
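
Both grouped-data approximations can be sketched in Python. The frequency distribution below is entirely hypothetical (five classes of width 10 with midpoints 15 to 55); it is not taken from the slides.

```python
import math

# HYPOTHETICAL frequency distribution: class midpoints m_j and counts f_j
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]

n = sum(frequencies)

# Approximate mean: every value in a class is treated as the class midpoint
approx_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Approximate standard deviation: squared deviations weighted by class frequency
weighted_squares = sum(
    (m - approx_mean) ** 2 * f for m, f in zip(midpoints, frequencies)
)
approx_sd = math.sqrt(weighted_squares / (n - 1))

print(n, approx_mean, round(approx_sd, 3))
```

The results are approximations because the true values inside each class are unknown; narrower classes generally bring them closer to the raw-data statistics.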

Slide 43
Exploratory Data Analysis

Box-and-whisker Plot: A graphical display of data using the 5-number summary:

Minimum (Xsmallest) -- Q1 -- Median -- Q3 -- Maximum (Xlargest)

[Figure: box-and-whisker plot marking X minimum, Q1, Q2 = Median, Q3 and X maximum, with 25% of the values in each interval]

Slide 44
Distribution Shape and Box-and-whisker Plot

[Figure: box-and-whisker plots for left-skewed, symmetric and right-skewed distributions, each marking Q1, Q2 and Q3]
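
A rough Python sketch of how a five-number summary can hint at shape. It reuses the 15 values from the quartile example, assumes `statistics.quantiles` with `method="exclusive"` as the quartile rule, and uses a deliberately simple heuristic (comparing the two halves of the box) that is an assumption of this sketch rather than a rule stated on the slides.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)


def shape_hint(q1, q2, q3):
    """Heuristic only: compare the two halves of the box."""
    left_half, right_half = q2 - q1, q3 - q2
    if right_half > left_half:
        return "right-skewed"
    if left_half > right_half:
        return "left-skewed"
    return "symmetric"


data = [1, 3, 5, 7, 8, 10, 11, 12, 13, 16, 16, 17, 18, 21, 22]
summary = five_number_summary(data)
hint = shape_hint(*summary[1:4])

print(summary, hint)
```

A fuller check would also compare the whisker lengths (minimum to Q1 against Q3 to maximum) and the mean against the median.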

Slide 45
The Covariance

The sample covariance measures the strength of the linear relationship between two numerical variables:

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

• Only concerned with the direction of the relationship
• No causal effect is implied
• Is affected by units of measurement
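
The effect of units can be seen in a short Python sketch. The paired data are hypothetical, and X is imagined to be a length recorded first in metres and then in centimetres.

```python
def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# HYPOTHETICAL paired observations
x_metres = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_centimetres = [100 * x for x in x_metres]  # same lengths, different unit

cov_metres = sample_covariance(x_metres, y)
cov_centimetres = sample_covariance(x_centimetres, y)

print(cov_metres, cov_centimetres)
```

The relationship between the variables is identical in both cases, yet the covariance is 100 times larger in centimetres, so its size alone says little about strength.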
Slide 46
Correlation
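Measures the relative strength of the linear relationship between two variables

$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$

where:

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$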
Slide 47
Features of Correlation Coefficient, r

Also called Standardised Covariance, i.e. invariant to units of measure

Ranges between –1 and 1

• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
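
A Python sketch of r computed from the covariance and the two sample standard deviations. The paired data are hypothetical, with X imagined as a length in metres and then in centimetres to illustrate the invariance to units.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y)."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# HYPOTHETICAL paired observations
x_metres = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_centimetres = [100 * x for x in x_metres]

r_metres = correlation(x_metres, y)
r_centimetres = correlation(x_centimetres, y)
r_reversed = correlation(x_metres, list(reversed(y)))

print(round(r_metres, 4), round(r_centimetres, 4), round(r_reversed, 4))
```

Rescaling X multiplies both the covariance and S_X by the same factor, so the ratio is unchanged; reversing the order of Y flips the sign of r.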

Slide 48
Scatter Plots of Data with Various
Correlation Coefficients

[Figure: six scatter plots of Y against X with r = -1, r = -.6, r = 0, r = +1, r = +.3 and r = 0]
Slide 49
Industry Application
Skyscrapers 'linked with impending financial crashes'

• There is an ‘unhealthy correlation’ between the building of skyscrapers and subsequent financial crashes, according to Barclays Capital.
• Examples include the Empire State building, built as the Great Depression was under way; and the current world's tallest, the Burj Khalifa, built just before Dubai almost went bust.
• China is currently the biggest builder of skyscrapers, the bank said.
• India also has 14 skyscrapers under construction.

‘Often the world's tallest buildings are simply the edifice of a broader skyscraper building boom, reflecting a widespread misallocation of capital and an impending economic correction’, Barclays Capital analysts said.
(Source: http://www.bbc.co.uk/news/business-16494013)

Slide 50
Pitfalls and Ethical Issues

Data analysis is objective
• Should report the summary measures that best meet the assumptions about the data set

Data interpretation is subjective
• Should be done in fair, neutral and transparent manner
• Should document both good and bad results
• Results should be presented in a fair, objective and neutral manner
• Should not use inappropriate summary measures to distort facts
• Do not fail to report pertinent findings even if such findings do not support original argument
Slide 51
Chapter Summary
Described measures of central tendency
• Mean, median, mode, geometric mean

Described quartiles

Described measures of variation
• Range, interquartile range, variance and standard deviation, coefficient of variation, Z scores

Illustrated shape of distribution
• Symmetric, skewed, box-and-whisker plots

Discussed covariance and correlation coefficient

Addressed pitfalls in numerical descriptive measures and ethical considerations
Slide 52
End of Chapter

Slide 53
