Chapter 2
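Chapter topics include mode, variance, standard deviation, and coefficient of variation
Measures of central tendency:
Mean: arithmetic average, \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
Median: midpoint of ranked values
Mode: most frequently observed value (if one exists)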
Copyright © 2013 Pearson Education Ch. 2-4
Arithmetic Mean
The arithmetic mean (mean) is the most
common measure of central tendency
For a population of N values:
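\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}
where x_1, ..., x_N are the population values and N is the population size
For a sample of n observations:
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}
where x_1, ..., x_n are the observed values and n is the sample size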
Arithmetic Mean
(continued)
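Example: for the values 1, 2, 3, 4, 5 the mean is (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
For the values 1, 2, 3, 4, 10 the mean is (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
Replacing 5 with the extreme value 10 raises the mean from 3 to 4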
Median (two dot plots, each on a 0 to 10 axis): Median = 3 for both data sets
Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data
Mode (two dot plots): the data set on the 0 to 14 axis has Mode = 9; the data set on the 0 to 6 axis has no mode
Review Example
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
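The review example can be checked with a minimal Python sketch. It uses the standard-library statistics module; the variable names are illustrative and do not come from the slides.

```python
import statistics

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # sum of the values divided by their count
median_price = statistics.median(prices)  # middle value of the ranked data

print(f"Mean:   ${mean_price:,.0f}")
print(f"Median: ${median_price:,.0f}")
```

The single $2,000,000 house pulls the mean well above the median, which is why the two measures differ so much here.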
Quartiles Q1, Q2, Q3
The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are larger)
Only 25% of the observations are greater than the third quartile, Q3
(n = 9)
Q1 is in the 0.25(9 + 1) = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values:
Q1 = 12.5
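The position rule can be sketched in Python as follows. The function name quartile and the nine sample values are assumptions made for illustration: the ranked data for this example is not reproduced in the text, so the list below is hypothetical and was chosen only so that its 2nd and 3rd values are 12 and 13.

```python
def quartile(ranked, p):
    """Return the value at position p * (n + 1) of ranked data.

    Positions are 1-based; a fractional position is resolved by
    linear interpolation between the two neighbouring values.
    """
    n = len(ranked)
    position = p * (n + 1)
    lower = int(position)        # whole part of the position
    fraction = position - lower  # how far to move toward the next value
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])


# Hypothetical ranked sample with n = 9 (not taken from the text).
sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]

q1 = quartile(sample, 0.25)  # position 2.5: halfway between the 2nd and 3rd values
q2 = quartile(sample, 0.50)  # position 5: the median
q3 = quartile(sample, 0.75)  # position 7.5: halfway between the 7th and 8th values
print(q1, q2, q3)
```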
Variation
Same center,
different variation
Range
Example (data plotted on a 0 to 14 axis, smallest value 1, largest value 14):
Range = 14 - 1 = 13
Two data sets, each plotted on a 7 to 12 axis, have the same range:
Range = 12 - 7 = 5
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
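A short Python sketch of the outlier example above. The helper name data_range is an illustrative choice; the two lists reproduce the 25 values shown, differing only in the last value.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# 25 observations: eleven 1s, eight 2s, four 3s, one 4 and one 5.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
# Same data with the largest value replaced by the outlier 120.
with_outlier = base[:-1] + [120]

print(data_range(base))
print(data_range(with_outlier))
```

Changing one observation out of 25 moves the range from 4 to 119, while the other 24 values are untouched.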
IQR = Q3 - Q1
Example (each interval between adjacent values holds 25% of the observations):
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
IQR = 57 - 30 = 27
Population Variance
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
Where μ = population mean
N = population size
x_i = ith value of the variable x
Sample Variance
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
Where \bar{x} = arithmetic mean
n = sample size
x_i = ith value of the variable x
Population Standard Deviation
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
Sample Standard Deviation
Sample standard deviation:
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
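The difference between the population formulas (divide by N) and the sample formulas (divide by n - 1) can be sketched in Python. The function names and the eight data values are assumptions for illustration only; the values are hypothetical and are not taken from the text. The standard-library statistics module is used as a cross-check.

```python
import math
import statistics


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


# Hypothetical data (mean 16, sum of squared deviations 130).
data = [10, 12, 14, 15, 17, 18, 18, 24]

sigma_sq = population_variance(data)
s_sq = sample_variance(data)
sigma = math.sqrt(sigma_sq)  # population standard deviation
s = math.sqrt(s_sq)          # sample standard deviation

print(f"population: variance = {sigma_sq:.4f}, std dev = {sigma:.4f}")
print(f"sample:     variance = {s_sq:.4f}, std dev = {s:.4f}")

# The hand-written formulas agree with the standard library.
assert math.isclose(sigma_sq, statistics.pvariance(data))
assert math.isclose(s_sq, statistics.variance(data))
```

For the same data the sample variance is always the larger of the two, because the same sum of squares is divided by n - 1 instead of N.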
Comparing standard deviations (three data sets plotted on the same 11 to 21 axis):
Data A: s = 3.338 (compare to the two cases below)
Data B: s = 0.926 (values are concentrated near the mean)
Data C: s = 4.570 (values are dispersed far from the mean)
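Stock A:
Average price last year = $50
Standard deviation = $5
CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%
Stock B:
Average price last year = $100
Standard deviation = $5
CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%
Both stocks have the same standard deviation, but stock B is less variable relative to its price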
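The two-stock comparison can be reproduced with a small Python sketch. The function name coefficient_of_variation is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev * 100 / mean


cv_a = coefficient_of_variation(5, 50)   # Stock A: s = $5, mean price = $50
cv_b = coefficient_of_variation(5, 100)  # Stock B: s = $5, mean price = $100

print(f"CV_A = {cv_a:.0f}%")
print(f"CV_B = {cv_b:.0f}%")
```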
Chebychev’s Theorem
For any k > 1, at least 100[1 - 1/k^2]% of the observations lie within μ ± kσ:
At least                     within
(1 - 1/1.5^2) = 55.6%        k = 1.5 (μ ± 1.5σ)
(1 - 1/2^2) = 75%            k = 2 (μ ± 2σ)
(1 - 1/3^2) = 89%            k = 3 (μ ± 3σ)
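The three bounds above follow from the expression 1 - 1/k^2. A minimal Python sketch, with chebyshev_lower_bound as an illustrative function name:

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1f}%")
```

The bound holds for any distribution, which is why it is much looser than percentages that assume a particular shape.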
The Empirical Rule
μ ± 1σ contains about 68% of the values in the population or the sample
The Empirical Rule
(continued)
μ ± 2σ contains about 95% of the values in the population or the sample
μ ± 3σ contains almost all (about 99.7%) of the values in the population or the sample
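The 68%, 95% and 99.7% figures match the coverage of a normal distribution. The sketch below assumes normally distributed data, which the slides do not state explicitly, and uses NormalDist from the Python standard library; normal_coverage is an illustrative function name.

```python
from statistics import NormalDist


def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations of its mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_coverage(k):.2%}")
```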
z-Score
z = \frac{x_i - \mu}{\sigma}
Example (μ = 100, σ = 15, x_i = 121):
z = \frac{x_i - \mu}{\sigma} = \frac{121 - 100}{15} = 1.4
A score of 121 is 1.4 standard
deviations above the mean.
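The same calculation as a short Python sketch; z_score is an illustrative function name.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


z = z_score(121, mean=100, std_dev=15)
print(f"z = {z:.1f}")
```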
Weighted Mean
\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{n} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{n}
Where w_i is the weight of the ith observation and n = \sum w_i
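A minimal Python sketch of the weighted mean. The function name and the example values (three grades weighted by credit hours) are hypothetical and not taken from the text.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical example: grade points weighted by credit hours.
grades = [4.0, 3.0, 2.0]
credits = [3, 4, 2]

print(f"weighted mean = {weighted_mean(grades, credits):.3f}")
```

With equal weights the result reduces to the ordinary arithmetic mean.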
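For data grouped into K classes with frequencies f_i and midpoints m_i:
\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}, where n = \sum_{i=1}^{K} f_i
s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n - 1}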
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
\bar{x} = \frac{\sum x_i}{n} = \frac{34,356}{70} = 490.80
Box Plot (axis from 375 to 625): Q1 = 445, Q2 = 475, Q3 = 525
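The three quartiles can be recomputed from the 70 values listed earlier. The sketch below uses statistics.quantiles from the Python standard library with method="exclusive", which places the quartiles at positions 0.25(n + 1), 0.50(n + 1) and 0.75(n + 1) of the ranked data; the variable names are illustrative.

```python
import statistics

# The 70 ranked observations from the data table.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# n=4 splits the data into four groups, giving three cut points.
q1, q2, q3 = statistics.quantiles(rents, n=4, method="exclusive")
iqr = q3 - q1

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```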
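Variance
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = 2,996.16
Standard Deviation
s = \sqrt{s^2} = \sqrt{2996.16} = 54.74
Coefficient of Variation
CV = \frac{s}{\bar{x}} \cdot 100\% = \frac{54.74}{490.80} \cdot 100\% = 11.15\%
(the standard deviation is about 11% of the mean)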
z-scores of the 70 observations, z = (x_i - 490.80)/54.74:
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
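The summary figures for the 70 observations (sum, mean, sample variance, standard deviation, coefficient of variation and z-scores) can be recomputed with the sketch below. It uses only the Python standard library; the variable names are illustrative.

```python
import statistics

# The 70 observations from the data table.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

total = sum(rents)
x_bar = total / len(rents)
s_sq = statistics.variance(rents)  # sample variance (divides by n - 1)
s = statistics.stdev(rents)        # sample standard deviation
cv = s / x_bar * 100               # coefficient of variation, in percent
z_scores = [(x - x_bar) / s for x in rents]

print(f"sum = {total}, mean = {x_bar:.2f}")
print(f"s^2 = {s_sq:.2f}, s = {s:.2f}, CV = {cv:.2f}%")
print(f"smallest z = {z_scores[0]:.2f}, largest z = {z_scores[-1]:.2f}")
```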
Rent (€)   f_i   m_i   m_i - x̄   (m_i - x̄)^2   f_i (m_i - x̄)^2
420-439 8 429.5 -63.7 4058.96 32471.71
440-459 17 449.5 -43.7 1910.56 32479.59
460-479 12 469.5 -23.7 562.16 6745.97
480-499 8 489.5 -3.7 13.76 110.11
500-519 7 509.5 16.3 265.36 1857.55
520-539 4 529.5 36.3 1316.96 5267.86
540-559 2 549.5 56.3 3168.56 6337.13
560-579 4 569.5 76.3 5820.16 23280.66
580-599 2 589.5 96.3 9271.76 18543.53
600-619 6 609.5 116.3 13523.36 81140.18
Total 70 208234.29
Sample Variance
s^2 = 208,234.29/(70 - 1) = 3,017.89
Sample Standard Deviation
s = \sqrt{3,017.89} = 54.94
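The grouped-data calculation can be sketched in Python from the class frequencies and midpoints in the table. The variable names are illustrative. Because the sketch carries the mean at full precision, individual row values can differ slightly from the rounded table entries, while the totals agree to two decimals.

```python
import math

# (frequency f_i, class midpoint m_i) for each of the ten rent classes.
classes = [
    (8, 429.5), (17, 449.5), (12, 469.5), (8, 489.5), (7, 509.5),
    (4, 529.5), (2, 549.5), (4, 569.5), (2, 589.5), (6, 609.5),
]

n = sum(f for f, _ in classes)                                  # total number of observations
x_bar = sum(f * m for f, m in classes) / n                      # grouped mean
sum_sq = sum(f * (m - x_bar) ** 2 for f, m in classes)          # weighted squared deviations
s_sq = sum_sq / (n - 1)                                         # grouped sample variance
s = math.sqrt(s_sq)                                             # grouped sample std deviation

print(f"n = {n}, mean = {x_bar:.2f}")
print(f"sum of f_i (m_i - mean)^2 = {sum_sq:.2f}")
print(f"s^2 = {s_sq:.2f}, s = {s:.2f}")
```

The grouped results are approximations: every observation in a class is treated as if it sat at the class midpoint.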
Covariance
a measure of the direction of a linear relationship
between two variables
Correlation Coefficient
a measure of both the direction and the strength of a
linear relationship between two variables
The population covariance:
Cov(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}
The sample covariance:
Cov(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
Only concerned with the strength of the relationship
No causal effect is implied
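Interpreting Covariance
Cov(x, y) > 0: x and y tend to move in the same direction
Cov(x, y) < 0: x and y tend to move in opposite directions
Cov(x, y) = 0: x and y have no linear relationship
Features of the Correlation Coefficient, r
Unit free
Ranges between -1 and 1
The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship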
Scatter plots of Y against X: r = -1, r = -.6, r = 0 (first row); r = +1, r = +.3, r = 0 (second row)
Interpreting the Result
Scatter plot of Test #2 Score against Test #1 Score
There is a relatively [...] relationship between test score #1 and test score #2
x   y   (x_i - x̄)   (y_i - ȳ)   (x_i - x̄)(y_i - ȳ)
277.6 69 10.65 -1.0 -10.65
259.5 71 -7.45 1.0 -7.45
269.1 70 2.15 0 0
267.0 70 0.05 0 0
255.6 71 -11.35 1.0 -11.35
272.9 69 5.95 -1.0 -5.95
Mean 267.0 70.0 Total -35.40
Std. Dev. 8.2192 0.8944
Sample Covariance
s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-35.40}{6 - 1} = -7.08
Sample Correlation Coefficient
r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-7.08}{(8.2192)(0.8944)} = -0.9631
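The covariance and correlation for the six (x, y) pairs in the table can be recomputed with the sketch below. The helper names are illustrative, and the formulas are written out by hand so that each step matches the table columns.

```python
import math

# The six paired observations from the table.
x = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]
y = [69, 71, 70, 70, 71, 69]


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sum of the cross-products of deviations (last column of the table).
cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

s_xy = cross_products / (n - 1)               # sample covariance
r_xy = s_xy / (sample_std(x) * sample_std(y)) # sample correlation coefficient

print(f"s_xy = {s_xy:.2f}")
print(f"s_x = {sample_std(x):.4f}, s_y = {sample_std(y):.4f}")
print(f"r_xy = {r_xy:.4f}")
```

The negative sign on both results reflects that larger x values in this table go with smaller y values.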