Statistics For Managers Using Microsoft® Excel 5th Edition: Numerical Descriptive Measures
Chapter 3
Numerical Descriptive Measures
Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc. Chap 3-1
Learning Objectives
Summary Definitions
Measures of Central Tendency
The Arithmetic Mean
The arithmetic mean (mean) is the most common
measure of central tendency
\[ \bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} = \dfrac{X_1 + X_2 + \cdots + X_n}{n} \]
where n is the sample size and the X_i are the observed values.
Measures of Central Tendency
The Arithmetic Mean
The most common measure of central tendency
Mean = sum of values divided by the number of values
Affected by extreme values (outliers)
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
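The effect of a single extreme value on the mean can be checked with a short sketch. It assumes the two data sets read off the slide are 1, 2, 3, 4, 5 and 1, 2, 3, 4, 10; the variable names are illustrative only.

```python
from statistics import mean

# Two data sets that differ only in their largest value (assumed from the slide).
data_without_outlier = [1, 2, 3, 4, 5]
data_with_outlier = [1, 2, 3, 4, 10]

# Mean = sum of values divided by the number of values.
mean_without_outlier = mean(data_without_outlier)  # 15 / 5
mean_with_outlier = mean(data_with_outlier)        # 20 / 5

print(mean_without_outlier, mean_with_outlier)
```

Replacing one value with an outlier is enough to pull the mean upward, which is the sensitivity the slide describes.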
Measures of Central Tendency
The Median
In an ordered array, the median is the “middle” number (50%
above, 50% below)
[Figure: two dot plots on a 0 to 10 axis; Median = 4 in both]
Measures of Central Tendency
Locating the Median
The median of an ordered set of data is located at the
\( \frac{n + 1}{2} \) ranked value.
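The positioning rule can be sketched in Python as follows. The function name `median_by_rank` is hypothetical, and averaging the two middle values when (n + 1)/2 is not a whole number is the usual convention, assumed here because the slide does not state it.

```python
def median_by_rank(values):
    """Locate the median at the (n + 1) / 2 ranked value of the ordered data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based rank of the median
    if position.is_integer():
        # Odd n: the rank is a whole number, so one value sits in the middle.
        return ordered[int(position) - 1]
    # Even n: the rank falls halfway between two values (assumed convention:
    # average the two neighbouring ranked values).
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_by_rank([100_000, 2_000_000, 300_000, 100_000, 500_000]))
print(median_by_rank([10, 12, 14, 15, 17, 18, 18, 24]))
```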
Measures of Central Tendency
The Mode
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: two dot plots; left, on a 0 to 14 axis: Mode = 9; right, on a 0 to 6 axis: No Mode]
Measures of Central Tendency
Review Example
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
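The review example can be reproduced with Python's standard `statistics` module. This is only a cross-check of the slide's arithmetic, not part of the original Excel workflow; the variable names are illustrative.

```python
from statistics import mean, median, mode

# House prices from the review example, in dollars.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(house_prices)      # sum of values / number of values
price_median = median(house_prices)  # middle value of the ranked data
price_mode = mode(house_prices)      # most frequent value

print(price_mean, price_median, price_mode)
```

The $2,000,000 house pulls the mean to twice the median, which is why the median is often preferred for data like these.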
Measures of Central Tendency
Which Measure to Choose?
The mean is generally used, unless extreme
values (outliers) exist.
Then the median is often used, since the median
is not sensitive to extreme values. For
example, median home prices may be
reported for a region because a few very
expensive homes would distort the mean.
Quartile Measures
Quartiles split the ranked data into 4 segments with
an equal number of values per segment.
[Figure: ranked data split into four segments by Q1, Q2, and Q3]
The first quartile, Q1, is the value for which 25% of
the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are
larger)
Only 25% of the values are greater than the third quartile
Quartile Measures
Locating Quartiles
Find a quartile by determining the value in the appropriate
position in the ranked data, where
Quartile Measures
Guidelines
Rule 1: If the result is a whole number, then the
quartile is equal to that ranked value.
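A hedged sketch of quartile location in Python: `statistics.quantiles` with `method="exclusive"` places the k-th quartile at rank k(n + 1)/4 and interpolates linearly when that rank is not a whole number. The interpolation is the library's convention and may differ from the rounding guidelines used in these slides; the data below are hypothetical and chosen so that every rank is a whole number and Rule 1 applies.

```python
from statistics import quantiles

# Hypothetical ranked data with n = 11, so (n + 1) / 4 = 3 is a whole number.
ranked_data = [11, 12, 13, 16, 16, 17, 18, 21, 22, 25, 27]

# The exclusive method uses ranks k * (n + 1) / 4 for k = 1, 2, 3.
q1, q2, q3 = quantiles(ranked_data, n=4, method="exclusive")

# Rule 1: a whole-number rank means the quartile equals that ranked value.
print(q1 == ranked_data[3 - 1])  # 3rd ranked value
print(q2 == ranked_data[6 - 1])  # 6th ranked value (the median)
print(q3 == ranked_data[9 - 1])  # 9th ranked value
```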
Quartile Measures
Locating the First Quartile
Example: Find the first quartile
Measures of Central Tendency
The Geometric Mean
Geometric mean
Used to measure the rate of change of a variable over time
\[ \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \]
Geometric mean rate of return
Measures the status of an investment over time
\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]
Measures of Central Tendency
The Geometric Mean
An investment of $100,000 declined to $50,000 at the end of
year one and rebounded to $100,000 at the end of year two:
\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]
Geometric mean rate of return:
\[ \bar{R}_G = [(1 + (-.5)) \times (1 + (1))]^{1/2} - 1 = [(.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\% \]
More accurate result
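The investment example can be verified with a small sketch. The function name `geometric_mean_return` is hypothetical, and the arithmetic-mean comparison is added here only for context.

```python
import math


def geometric_mean_return(returns):
    """Geometric mean rate of return: [(1 + R1) * ... * (1 + Rn)] ** (1 / n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Year one: $100,000 -> $50,000 is a -50% return.
# Year two: $50,000 -> $100,000 is a +100% return.
yearly_returns = [-0.5, 1.0]

geometric = geometric_mean_return(yearly_returns)
arithmetic = sum(yearly_returns) / len(yearly_returns)

print(f"geometric mean return:  {geometric:.0%}")
print(f"arithmetic mean return: {arithmetic:.0%}")
```

The investment ends where it started, so the 0% geometric figure describes the outcome, whereas simply averaging the two yearly returns would suggest a 25% gain.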
Measures of Central Tendency
Summary
Central Tendency
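Arithmetic Mean: \( \bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} \)
Median: Middle value in the ordered array
Mode: Most frequently observed value
Geometric Mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \)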
Measures of Variation
Measures of Variation
Range
Simplest measure of variation
Difference between the largest and the smallest values:
\[ \text{Range} = X_{\text{largest}} - X_{\text{smallest}} \]
Example:
[Figure: dot plot on a 0 to 14 axis]
Range = 13 - 1 = 12
Measures of Variation
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two differently distributed data sets on a 7 to 12 axis, each with Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
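The outlier example above can be reproduced directly. The helper name `value_range` is illustrative only.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# The two data sets from the slide; they differ only in the last value.
without_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

print(value_range(without_outlier))
print(value_range(with_outlier))
```

Changing a single observation moves the range from 4 to 119 even though the other 24 values are identical.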
Measures of Variation
Interquartile Range
Problems caused by outliers can be eliminated by
using the interquartile range.
Measures of Variation
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Interquartile range = 57 – 30 = 27
Measures of Variation
Variance
The variance is the average (approximately) of
squared deviations of values from the mean.
\[ S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
Measures of Variation
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
\[ S = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
Measures of Variation
Standard Deviation
Steps for Computing Standard Deviation
Measures of Variation
Standard Deviation
Sample Data (X_i): 10 12 14 15 17 18 18 24
n = 8, Mean = \( \bar{X} \) = 16
\[ S = \sqrt{\dfrac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}} \]
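The calculation for this sample can be carried through step by step in Python. This is a sketch of the standard n - 1 computation applied to the slide's data; the variable names are illustrative.

```python
import math

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)

# Step 1: the sample mean.
x_bar = sum(sample) / n

# Step 2: squared deviations of each value from the mean, summed.
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)

# Step 3: divide by n - 1 to get the sample variance.
sample_variance = sum_of_squares / (n - 1)

# Step 4: the square root of the variance is the sample standard deviation.
sample_std_dev = math.sqrt(sample_variance)

print(x_bar, sum_of_squares, round(sample_std_dev, 4))
```

For these data the sum of squared deviations is 130, so S is the square root of 130/7, roughly 4.31.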
Measures of Variation
Comparing Standard Deviation
Data A: Mean = 15.5, S = 3.338
[Figure: dot plot of Data A on an 11 to 21 axis]
Measures of Variation
Comparing Standard Deviation
Small standard deviation
Measures of Variation
Summary Characteristics
The more the data are spread out, the greater
the range, interquartile range, variance, and
standard deviation.
The more the data are concentrated, the
smaller the range, interquartile range,
variance, and standard deviation.
If the values are all the same (no variation),
all these measures will be zero.
None of these measures are ever negative.
Coefficient of Variation
The coefficient of variation is the standard deviation
divided by the mean, multiplied by 100.
It is always expressed as a percentage. (%)
It shows variation relative to mean.
The CV can be used to compare two or more sets of
data measured in different units.
\[ CV = \left( \dfrac{S}{\bar{X}} \right) \cdot 100\% \]
Coefficient of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
\[ CV_A = \left( \dfrac{S}{\bar{X}} \right) \cdot 100\% = \dfrac{\$5}{\$50} \cdot 100\% = 10\% \]
Stock B:
Average price last year = $100
Standard deviation = $5
\[ CV_B = \left( \dfrac{S}{\bar{X}} \right) \cdot 100\% = \dfrac{\$5}{\$100} \cdot 100\% = 5\% \]
Both stocks have the same standard deviation, but stock B is less variable relative to its price.
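The stock comparison can be sketched as a small function. The name `coefficient_of_variation` and its parameter names are hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100, expressed as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev * 100 / mean


cv_stock_a = coefficient_of_variation(std_dev=5, mean=50)   # Stock A
cv_stock_b = coefficient_of_variation(std_dev=5, mean=100)  # Stock B

print(f"CV_A = {cv_stock_a:.0f}%, CV_B = {cv_stock_b:.0f}%")
```

Because the result is unit free, the same function could compare, say, a price series in dollars with a volume series in shares.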
Locating Extreme Outliers
Z-Score
To compute the Z-score of a data value, subtract the
mean and divide by the standard deviation.
Locating Extreme Outliers
Z-Score
\[ Z = \dfrac{X - \bar{X}}{S} \]
Locating Extreme Outliers
Z-Score
Suppose the mean math SAT score is 490,
with a standard deviation of 100.
Compute the z-score for a test score of 620.
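A minimal sketch of this computation, using the subtract-the-mean, divide-by-the-standard-deviation rule stated earlier; the function name `z_score` is illustrative.

```python
def z_score(value, mean, std_dev):
    """Z = (X - mean) / standard deviation."""
    return (value - mean) / std_dev


sat_z = z_score(value=620, mean=490, std_dev=100)
print(sat_z)
```

A score of 620 therefore lies 1.3 standard deviations above the mean.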
Shape of a Distribution
Describes how data are distributed
Measures of shape
Symmetric or skewed
General Descriptive Stats
Using Microsoft Excel
1. Select Tools.
3. Select Descriptive
Statistics and click OK.
General Descriptive Stats
Using Microsoft Excel
4. Enter the cell range.
5. Check the Summary Statistics box.
6. Click OK
General Descriptive Stats
Using Microsoft Excel
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Numerical Descriptive
Measures for a Population
Descriptive statistics discussed previously described
a sample, not the population.
Population Mean
\[ \mu = \dfrac{\sum_{i=1}^{N} X_i}{N} = \dfrac{X_1 + X_2 + \cdots + X_N}{N} \]
Population Variance
The population variance is the average of squared
deviations of values from the mean
\[ \sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
Population Standard Deviation
The population standard deviation is the most
commonly used measure of variation.
It has the same units as the original data.
\[ \sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
Sample statistics versus
population parameters
Measure              Population Parameter    Sample Statistic
Mean                 μ                       X̄
Variance             σ²                      S²
Standard Deviation   σ                       S
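The practical difference between the two columns is the divisor: N for a population, n - 1 for a sample. The sketch below uses Python's `statistics` module, which provides both versions; treating the eight values from the earlier standard deviation example as a complete population is an assumption made purely for illustration.

```python
from statistics import pstdev, pvariance, stdev, variance

values = [10, 12, 14, 15, 17, 18, 18, 24]

# Treated as a sample: divide the sum of squared deviations by n - 1.
sample_var = variance(values)
sample_sd = stdev(values)

# Treated as a whole population: divide by N instead.
population_var = pvariance(values)
population_sd = pstdev(values)

print(round(sample_var, 3), round(sample_sd, 4))
print(round(population_var, 3), round(population_sd, 4))
```

The population figures are always slightly smaller for the same numbers, since the same sum of squares is divided by a larger count.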
The Empirical Rule
The empirical rule approximates the variation of
data in bell-shaped distributions.
[Figure: bell curve with 68% of the data within μ ± 1σ]
The Empirical Rule
Approximately 95% of the data in a bell-shaped
distribution lie within two standard deviations of the
mean, or μ ± 2σ
Approximately 99.7% of the data in a bell-shaped
distribution lie within three standard deviations of the
mean, or μ ± 3σ
[Figure: bell curves showing 95% of the data within μ ± 2σ and 99.7% within μ ± 3σ]
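The empirical-rule percentages can be checked against an exactly normal distribution, which is assumed here as the idealized bell shape. `statistics.NormalDist` is part of the Python standard library (3.8 and later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1 (the idealized bell shape).
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Real data that are only roughly bell-shaped will match these figures only approximately, which is why the rule is stated with "approximately".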
Using the Empirical Rule
Chebyshev Rule
Regardless of how the data are distributed
(symmetric or skewed), at least \( 1 - 1/k^2 \) of the
values will fall within k standard deviations of
the mean (for k > 1)
Examples:
At least within
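The Chebyshev bound is simple to evaluate for any k. The sketch below is illustrative; the function name `chebyshev_lower_bound` and the chosen values of k are assumptions, not taken from the slide.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the Chebyshev rule applies only for k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2%} within mean ± {k} standard deviations")
```

These are guaranteed minimums for any distribution, so they are lower than the empirical-rule figures, which apply only to bell-shaped data.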
Exploratory Data Analysis
The Five Number Summary
The five numbers that describe the spread of
data are:
Minimum
First Quartile (Q1)
Median (Q2)
Third Quartile (Q3)
Maximum
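A five-number summary can be sketched in a few lines of Python. The helper name `five_number_summary` is hypothetical, the data are reused from the earlier standard deviation example, and `statistics.quantiles` with `method="exclusive"` interpolates between ranked values, so its quartiles may differ slightly from those obtained with rounding-based positioning rules.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a data set."""
    ordered = sorted(values)
    # Exclusive method: quartile k sits at rank k * (n + 1) / 4, with
    # linear interpolation when that rank is not a whole number.
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


summary = five_number_summary([10, 12, 14, 15, 17, 18, 18, 24])
minimum, q1, median, q3, maximum = summary

print(summary)
print("interquartile range:", q3 - q1)
```

These five values are exactly what a box-and-whisker plot draws: the whiskers end at the minimum and maximum, and the box spans Q1 to Q3 with a line at the median.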
Exploratory Data Analysis
The Box-and-Whisker Plot
The Box-and-Whisker Plot is a graphical display of
the five number summary.
Exploratory Data Analysis
The Box-and-Whisker Plot
The Box and central line are centered between the
endpoints if data are symmetric around the median.
Exploratory Data Analysis
The Box-and-Whisker Plot
[Figure: three box-and-whisker plots, each marked with Q1, Q2, and Q3]
Sample Covariance
The sample covariance measures the strength of the linear
relationship between two numerical variables.
The sample covariance:
\[ \operatorname{cov}(X, Y) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \]
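The covariance formula translates directly into code. The function name `sample_covariance` and the paired data below are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def sample_covariance(xs, ys):
    """cov(X, Y) = sum((Xi - X-bar) * (Yi - Y-bar)) / (n - 1)."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Hypothetical paired observations.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x_values, y_values)
print(cov_xy)
```

The sign shows the direction of the relationship, but the size depends on the units of X and Y, which is the motivation for the unit-free correlation coefficient.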
Sample Covariance
The Correlation Coefficient
The correlation coefficient measures the relative
strength of the linear relationship between two
variables.
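\[ r = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} = \dfrac{\operatorname{cov}(X, Y)}{S_X S_Y} \]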
The Correlation Coefficient
Unit free
Ranges between –1 and 1
The closer to –1, the stronger the negative linear
relationship
The closer to 1, the stronger the positive linear
relationship
The closer to 0, the weaker any linear
relationship
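These properties can be observed with a small sketch of the correlation coefficient. The function name `correlation` and the data sets are hypothetical; the computation uses sums of squared deviations and cross products, which is algebraically the same as dividing the sample covariance by the product of the two sample standard deviations.

```python
import math


def correlation(xs, ys):
    """r = sum of cross products / sqrt(sum of squares of X * sum of squares of Y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x_values = [1, 2, 3, 4, 5]

r_scattered = correlation(x_values, [2, 4, 5, 4, 5])    # loose upward trend
r_perfect_up = correlation(x_values, [2, 4, 6, 8, 10])  # exact increasing line
r_perfect_down = correlation(x_values, [5, 4, 3, 2, 1]) # exact decreasing line

print(round(r_scattered, 4), r_perfect_up, r_perfect_down)
```

Rescaling either variable (for example, converting dollars to cents) leaves r unchanged, which is what "unit free" means in practice.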
The Correlation Coefficient
[Figure: scatter plots of Y against X illustrating r = -1, r = -.6, r = 0, r = +1, and r = +.3]
The Correlation Coefficient
Using Microsoft Excel
1. Select Tools/Data Analysis
2. Choose Correlation from the selection menu
3. Click OK . . .
The Correlation Coefficient
Using Microsoft Excel
The Correlation Coefficient
Using Microsoft Excel
r = .733
[Figure: Scatter Plot of Test Scores; x-axis: Test #1 Score, y-axis: Test #2 Score, both running from 70 to 100]
There is a relatively strong positive linear relationship between test score #1 and test score #2.
Students who scored high on the first test tended to score high on the second test.
Pitfalls in Numerical
Descriptive Measures
Data analysis is objective
Analysis should report the summary measures that best
meet the assumptions about the data set.
Ethical Considerations
Chapter Summary
In this chapter, we have