Lecture Notes 02
Chapter 2
Describing Data: Numerical
Copyright 2013 Pearson Education, Inc. Publishing as Prentice Hall
Ch. 2-1
Central Tendency: Arithmetic Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Ch. 2-2
2.1
Overview: Central Tendency
Mean: arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Median: midpoint of ranked values
Mode: most frequently observed value
Ch. 2-3
Arithmetic Mean
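Population mean:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
where $x_1, \ldots, x_N$ are the population values and $N$ is the population size.
Sample mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
where $x_1, \ldots, x_n$ are the observed values and $n$ is the sample size.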
Ch. 2-4
Arithmetic Mean
(continued)
Data 1, 2, 3, 4, 5:
$$\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$
Mean = 3
Data 1, 2, 3, 4, 10:
$$\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$$
Mean = 4
Replacing 5 with 10 raises the mean from 3 to 4: the mean is affected by extreme values.
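The two calculations above can be checked with a short Python sketch. The function name `arithmetic_mean` and the variable names are illustrative choices, not part of the slides.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


baseline = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]  # same data with the largest value replaced by 10

mean_baseline = arithmetic_mean(baseline)
mean_outlier = arithmetic_mean(with_outlier)
print(mean_baseline, mean_outlier)
```

A single extreme value is enough to pull the mean upward, which is the point of the second example.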
Ch. 2-5
Median
[Figure: two dot plots on a 0 to 10 axis, each with Median = 3]
Ch. 2-6
Median position = $\frac{n + 1}{2}$ th position in the ranked data
Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data
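A minimal Python sketch of the position rule follows. It assumes the usual convention that, when the position is fractional (even n), the median is the average of the two values on either side of that position. The function names are illustrative.

```python
import math


def median_position(n):
    """Position (1-based) of the median in the ranked data: (n + 1) / 2."""
    return (n + 1) / 2


def median(values):
    """Median taken from the ranked data at the median position."""
    ranked = sorted(values)
    pos = median_position(len(ranked))
    lower = ranked[math.floor(pos) - 1]  # value just at or below the position
    upper = ranked[math.ceil(pos) - 1]   # value just at or above the position
    return (lower + upper) / 2           # equals the middle value when n is odd


print(median_position(5), median([1, 2, 3, 4, 5]))
print(median([1, 2, 3, 4, 10]))  # the extreme value 10 does not move the median
print(median([1, 2, 3, 4]))      # even n: average of the two middle values
```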
Ch. 2-7
Mode
[Figure: dot plot on a 0 to 14 axis] Mode = 9
[Figure: dot plot on a 0 to 6 axis] No Mode
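One way to find the mode in Python is to count occurrences, as sketched below. The data lists are hypothetical (the slide's data points did not survive extraction), and the choice to report "no mode" when every value occurs exactly once is an assumption matching the second plot.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once, so nothing is "most frequent"
    return sorted(v for v, c in counts.items() if c == top)


print(modes([5, 7, 9, 9, 9, 12, 14]))  # hypothetical data with a single mode
print(modes([1, 2, 3, 4, 5, 6]))       # hypothetical data with no mode
print(modes([1, 1, 2, 2, 3]))          # there may be several modes
```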
Ch. 2-8
Ch. 2-9
Shape of a Distribution
Measures of shape: symmetric or skewed
[Figure: three distribution shapes: Left-Skewed, Symmetric (Mean = Median), Right-Skewed]
Ch. 2-10
Geometric Mean
Geometric mean:
$$\bar{x}_g = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$$
Geometric mean rate of return:
$$r_g = (x_1 x_2 \cdots x_n)^{1/n} - 1$$
Ch. 2-11
Example
An investment of $100,000 rose to $150,000 at the
end of year one and increased to $180,000 at end
of year two:
50% increase
20% increase
Ch. 2-12
Example
(continued)
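Arithmetic mean rate of return:
$$\bar{x} = \frac{(150\%) + (120\%)}{2} - 1 = 35\%$$
Misleading result
Geometric mean rate of return:
$$r_g = (x_1 x_2)^{1/n} - 1 = [(150\%)(120\%)]^{1/2} - 1 = (1.80)^{1/2} - 1 \approx 34.16\%$$
Accurate result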
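The comparison can be reproduced with a few lines of Python. This is a sketch using the growth factors from the example (1.50 and 1.20); the variable names are illustrative. Compounding each rate over two years shows why the geometric rate is the accurate one: only it recovers the actual ending value of $180,000.

```python
import math

growth_factors = [1.50, 1.20]  # year one: 150% of start, year two: 120% of year one
n = len(growth_factors)

# Arithmetic mean of the growth factors, minus 1, as a rate of return
arithmetic_rate = sum(growth_factors) / n - 1

# Geometric mean rate of return: nth root of the product, minus 1
geometric_rate = math.prod(growth_factors) ** (1 / n) - 1

start = 100_000
end_with_arithmetic = start * (1 + arithmetic_rate) ** n
end_with_geometric = start * (1 + geometric_rate) ** n

print(f"arithmetic rate: {arithmetic_rate:.4%}, implied end value: {end_with_arithmetic:,.0f}")
print(f"geometric rate:  {geometric_rate:.4%}, implied end value: {end_with_geometric:,.0f}")
```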
Ch. 2-13
Ch. 2-14
Quartiles
[Diagram: the ranked data split into four segments of 25% each by Q1, Q2, and Q3]
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are
larger)
Only 25% of the observations are greater than the third
quartile
Ch. 2-15
Quartile Formulas
Find a quartile by determining the value in the
appropriate position in the ranked data, where
First quartile position: Q1 = 0.25(n+1)
Second quartile position: Q2 = 0.50(n+1)
Third quartile position: Q3 = 0.75(n+1)
Ch. 2-16
Quartiles
(n = 9)
Q1 is in the 0.25(9+1) = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5
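The position rule can be sketched in Python as follows. The sample data is hypothetical (the slide's data row did not survive extraction); it is chosen so that n = 9 and the 2nd and 3rd ranked values are 12 and 13. Linear interpolation for fractional positions other than one half is an assumption, and the sketch assumes every position falls inside the data (n of at least 3).

```python
import math


def value_at_position(ranked, pos):
    """Value at a 1-based, possibly fractional, position in ranked data."""
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    # Interpolate between the neighbouring ranked values
    return ranked[lo - 1] + frac * (ranked[hi - 1] - ranked[lo - 1])


def quartiles(values):
    """Return (Q1, Q2, Q3) using positions 0.25(n+1), 0.50(n+1), 0.75(n+1)."""
    ranked = sorted(values)
    n = len(ranked)
    return tuple(value_at_position(ranked, p * (n + 1)) for p in (0.25, 0.50, 0.75))


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # hypothetical data, n = 9
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)
```

Note that statistical packages use several different quartile conventions, so their results can differ slightly from this position rule.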
Ch. 2-17
Five-Number Summary
The five-number summary refers to five descriptive
measures:
minimum
first quartile
median
third quartile
maximum
minimum < Q1 < median < Q3 < maximum
Ch. 2-18
2.2
Measures of Variability
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Ch. 2-19
Range
Example:
[Figure: dot plot on a 0 to 14 axis]
Range = 14 - 1 = 13
Ch. 2-20
[Figure: two dot plots, each with Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
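The outlier effect on the range is easy to reproduce. The sketch below uses the two data sets from the slide; the function name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


data = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
data_with_outlier = data[:-1] + [120]  # same data, last value replaced by 120

range_plain = value_range(data)
range_outlier = value_range(data_with_outlier)
print(range_plain, range_outlier)
```

Only one of the 25 values changed, yet the range moved from 4 to 119.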
Ch. 2-21
Interquartile Range
Ch. 2-22
Interquartile Range
Ch. 2-23
Population Variance
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where
$\mu$ = population mean
$N$ = population size
$x_i$ = ith value of the variable x
Ch. 2-24
Sample Variance
Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where
$\bar{x}$ = arithmetic mean
$n$ = sample size
$x_i$ = ith value of the variable x
Ch. 2-25
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Ch. 2-26
Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Ch. 2-27
Calculation Example:
Sample Standard Deviation
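Sample data ($x_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
$$s = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$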
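The calculation can be verified step by step in Python. The sketch below computes the sum of squared deviations by hand and then cross-checks against `statistics.stdev` from the standard library, which also divides by n - 1.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)

mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)  # numerator of s^2
sample_variance = squared_deviations / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(mean, squared_deviations, round(sample_sd, 4))
print(round(statistics.stdev(data), 4))  # library cross-check
```

For a population rather than a sample, the divisor is N instead of n - 1; `statistics.pstdev` implements that version.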
Measuring variation
Ch. 2-29
[Figure: dot plots of three data sets on a shared axis from 11 to 21]
Data A: s = 3.338
Data B: s = 0.926
Data C: s = 4.570
Ch. 2-31
Select:
data / data analysis / descriptive statistics
Ch. 2-32
Using Excel
Ch. 2-33
Using Excel
Enter input range details
Ch. 2-34
Excel output
Microsoft Excel descriptive statistics output, using the house price data:
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
Ch. 2-35
Coefficient of Variation
Population coefficient of variation:
$$CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$$
Sample coefficient of variation:
$$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$
Ch. 2-36
Comparing Coefficient
of Variation
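Stock A:
Average price last year = $50
Standard deviation = $5
$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
Stock B:
Average price last year = $100
Standard deviation = $5
$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price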
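The stock comparison can be written as a short Python sketch. The function name `coefficient_of_variation` is illustrative; the inputs are the prices and standard deviations used in the comparison above.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)   # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)  # Stock B
print(cv_a, cv_b)
```

Because the result is a percentage, it allows variability to be compared across series measured at different price levels or in different units.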
Ch. 2-37
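Chebychev's Theorem
For any population with mean $\mu$ and standard deviation $\sigma$, and any $k > 1$, the percentage of observations that fall within the interval $[\mu \pm k\sigma]$ is at least
$$100\left[1 - \frac{1}{k^2}\right]\%$$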
Ch. 2-38
Chebychev's Theorem
(continued)
Examples: at least $100[1 - (1/k^2)]\%$ of the observations fall within $\mu \pm k\sigma$
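The bound is simple to tabulate. In the Python sketch below, the values of k are illustrative choices (the slide's own example table did not survive extraction), and the function name is hypothetical.

```python
def chebychev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 100 * (1 - 1 / k ** 2)


for k in (1.5, 2, 3):  # illustrative values of k
    print(f"k = {k}: at least {chebychev_lower_bound(k):.2f}% within k standard deviations")
```

The bound holds for any distribution shape, which is why it is much looser than the percentages that apply to bell-shaped data.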
Ch. 2-39
For a bell-shaped distribution, approximately 68% of the observations lie within $\mu \pm 1\sigma$
Ch. 2-40
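For a bell-shaped distribution, approximately 95% of the observations lie within $\mu \pm 2\sigma$
Approximately 99.7% lie within $\mu \pm 3\sigma$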
Ch. 2-41
2.3
Weighted Mean
and Measures of Grouped Data
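The weighted mean of a set of data is
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{n} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{n}$$
where $w_i$ is the weight of the ith observation and the denominator $n = \sum w_i$ is the sum of the weights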
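A minimal Python sketch of the weighted mean follows. The values and weights are hypothetical, and the function divides by the sum of the weights, which is the reading of the denominator assumed here.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


scores = [90, 80, 70]  # hypothetical observations
weights = [1, 2, 1]    # hypothetical weights

result = weighted_mean(scores, weights)
print(result)
```

With equal weights the function reduces to the ordinary arithmetic mean.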
Ch. 2-42
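Mean for data grouped into $K$ classes, with class frequencies $f_i$ and class midpoints $m_i$:
$$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n} \quad \text{where } n = \sum_{i=1}^{K} f_i$$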
Ch. 2-43
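Variance for data grouped into $K$ classes, with class frequencies $f_i$ and class midpoints $m_i$:
$$s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n - 1}$$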
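The grouped-data formulas can be sketched in Python as below. The class midpoints and frequencies are hypothetical, chosen only to make the arithmetic easy to follow; the function name is illustrative.

```python
def grouped_mean_and_variance(midpoints, frequencies):
    """Approximate sample mean and variance from class midpoints and frequencies."""
    n = sum(frequencies)  # total number of observations
    mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / (n - 1)
    return mean, variance


midpoints = [5, 15, 25]  # hypothetical class midpoints
frequencies = [2, 5, 3]  # hypothetical class frequencies

mean, variance = grouped_mean_and_variance(midpoints, frequencies)
print(mean, round(variance, 3))
```

These are approximations: each observation is treated as if it sat at the midpoint of its class.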
Ch. 2-44
2.4
Measures of Relationships Between Variables
Covariance
Correlation Coefficient
Ch. 2-45
Covariance
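Population covariance:
$$Cov(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
Sample covariance:
$$Cov(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$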
Ch. 2-46
Interpreting Covariance
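Cov(x,y) > 0: x and y tend to move in the same direction
Cov(x,y) < 0: x and y tend to move in opposite directions
Cov(x,y) = 0: x and y have no linear relationship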
Ch. 2-47
Coefficient of Correlation
Population correlation coefficient:
$$\rho = \frac{Cov(x, y)}{\sigma_X \sigma_Y}$$
Sample correlation coefficient:
$$r = \frac{Cov(x, y)}{s_X s_Y}$$
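The sample covariance and correlation can be computed directly from their definitions, as in the Python sketch below. The paired data is hypothetical and the function names are illustrative.

```python
import math


def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is s_x^2
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]  # hypothetical data

cov_xy = sample_covariance(xs, ys)
r_xy = sample_correlation(xs, ys)
print(cov_xy, round(r_xy, 4))
```

Unlike the covariance, the correlation has no units and always lies between -1 and +1, so its size can be compared across data sets.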
Ch. 2-48
Features of
Correlation Coefficient, r
Unit free
Ch. 2-49
[Figure: scatter plots of Y against X with r = -1, r = -.6, r = 0, r = +.3, and r = +1]
Ch. 2-50
Click OK . . .
Ch. 2-51
Ch. 2-52
r = .733
There is a relatively strong positive linear relationship between test score #1 and test score #2
[Figure: scatter plot of Test #2 Score against Test #1 Score, both axes from 70 to 100]
Ch. 2-53
Chapter Summary
Symmetric, skewed
Ch. 2-54