Statistical Methods I: STOR 455
Jan Hannig
9/16/10
STOR455 Lecture 8
Homework Comments
The test/CI for the standard deviation does not use t but chi-square.
Parameters
$\beta_0$: the intercept
$\beta_1$: the slope
$\sigma^2 = \mathrm{var}(\varepsilon_i)$: the variance of the error term
Estimation of $\sigma^2$
$$s^2 = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum_i e_i^2}{n-2} = \frac{SSE}{df_E} = MSE$$
$$s = \sqrt{s^2} = \text{Root MSE}$$
Alternative expression for SSE
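The estimator of $\sigma^2$ above can be sketched in a few lines of Python (the slides themselves use SAS). This is only an illustration: the data values and the helper names `fit_line` and `error_variance` are hypothetical and do not come from the lecture. The formulas are the ones on the slide, with $df_E = n - 2$.

```python
import math

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def error_variance(x, y):
    """Return (SSE, MSE, Root MSE), dividing SSE by df_E = n - 2."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    mse = sse / (len(x) - 2)
    return sse, mse, math.sqrt(mse)

# Hypothetical toy data, not from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
sse, mse, root_mse = error_variance(x, y)
print(f"SSE={sse:.4f}  MSE={mse:.4f}  Root MSE={root_mse:.4f}")
```

Dividing by n - 2 rather than n reflects the two estimated parameters, which is why SAS reports 438 error degrees of freedom for 440 observations.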
SMSA Data
Information about the 440 most populous counties between 1990 and 1992
From the US Bureau of the Census
Variables: id, area, total population, percent of city population, percent of population 65 or older, number of physicians, number of hospital beds, percent high school graduates, total labor, total income, total crime, region (NE, NC, S, W)
Do it in SAS
(Listing of the first 8 observations; the rotated column headings are not recoverable.)
1 Los_Ange CA 4060 8863164 32.1 9.7 23677 27700 688936 70.0 22.3 11.6 8.0 20786 184230 4
2 Cook IL 946 5105067 29.2 12.4 15153 21550 436936 73.4 22.8 11.1 7.2 21729 110928 2
3 Harris TX 1729 2818199 31.3 7.1 7553 12449 253526 74.9 25.4 12.5 5.7 19517 55003 3
4 San_Dieg CA 4205 2498016 33.5 10.9 5905 6179 173821 81.9 25.3 8.1 6.1 19588 48931 4
5 Orange CA 790 2410556 32.6 9.2 6062 6369 144524 81.2 27.8 5.2 4.8 24400 58818 4
6 Kings NY 71 2300664 28.3 12.4 4861 8942 680966 63.7 16.6 19.5 9.5 16803 38658 1
7 Maricopa AZ 9204 2122101 29.2 12.5 4320 6104 177593 81.5 22.1 8.8 4.9 18042 38287 4
8 Wayne MI 614 2111687 27.4 12.5 3823 9490 193978 70.0 13.7 16.9 10.0 17461 36872 2
SMSA data
Relation between # of physicians and population, land area, income
Relation between crime and population
Is there any regional difference?
Do it in SAS
/* Physicians vs. total population */
proc reg data=smsa;
  model phy=tp;
  plot (phy)*(tp);
run;
Do it in SAS
Number of Observations Read   440
Number of Observations Used   440

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1       1243181164    1243181164   3340.06   <.0001
Error             438        163025135        372204
Corrected Total   439       1406206299

Root MSE         610.08483    R-Square   0.8841
Dependent Mean   987.99773    Adj R-Sq   0.8838
Coeff Var         61.74962

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1           -110.63478         34.74602     -3.18     0.0016
tp           1              0.00280       0.00004837     57.79     <.0001
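The entries of the SAS output for phy against tp are tied together by simple arithmetic, and checking it is a quick way to read the table. The Python sketch below (the slides use SAS) recomputes the derived columns from the sums of squares and degrees of freedom printed in the output. The variable names are illustrative only.

```python
import math

# Values read from the Analysis of Variance table for model phy = tp.
ss_model, df_model = 1243181164, 1
ss_error, df_error = 163025135, 438
dependent_mean = 987.99773

ms_model = ss_model / df_model
mse = ss_error / df_error                     # Mean Square for Error
f_value = ms_model / mse                      # F Value
root_mse = math.sqrt(mse)                     # Root MSE
ss_total = ss_model + ss_error                # Corrected Total
r_square = ss_model / ss_total                # R-Square
coeff_var = 100 * root_mse / dependent_mean   # Coeff Var, in percent

print(f"MSE={mse:.0f}  F={f_value:.2f}  Root MSE={root_mse:.5f}")
print(f"R-Square={r_square:.4f}  Coeff Var={coeff_var:.5f}")
```

Each recomputed value can be compared with the corresponding entry in the listing.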
Do it in SAS
Model: MODEL1
Dependent Variable: cri

Number of Observations Read   103
Number of Observations Used   103

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1      2.311376E11   2.311376E11     96.51   <.0001
Error             101      2.418884E11    2394934612
Corrected Total   102       4.73026E11

Root MSE             48938    R-Square   0.4886
Dependent Mean       23086    Adj R-Sq   0.4836
Coeff Var        211.98132

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1               -27958       7088.63230     -3.94     0.0001
tp           1              0.12895          0.01313      9.82     <.0001
Do it in SAS
Model: MODEL1
Dependent Variable: cri

Number of Observations Read   108
Number of Observations Used   108

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1      2.309087E11   2.309087E11   2720.28   <.0001
Error             106       8997722792      84884177
Corrected Total   107      2.399064E11

Root MSE         9213.26095    R-Square   0.9625
Dependent Mean        21781    Adj R-Sq   0.9621
Coeff Var          42.30003

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1          -7159.19264       1045.87028     -6.85     <.0001
tp           1              0.08360          0.00160     52.16     <.0001
Change of Units
Fahrenheit = 32 + (9/5) × Celsius
What happens to the regression line if we change units?
$Y^* = c + dY$, $X^* = a + bX$
The regression functions: $Y(x) = \beta_0 + \beta_1 x$, $Y^*(x^*) = \beta_0^* + \beta_1^* x^*$
$\beta_1^* = (d/b)\,\beta_1$, $\beta_0^* = c + (d/b)(b\,\beta_0 - a\,\beta_1)$, $\sigma^* = |d|\,\sigma$
Same for the estimators
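The change-of-units formulas can be checked numerically. The following Python sketch (the slides use SAS) fits a line before and after a linear change of units and compares the refitted estimates with the values predicted by the formulas. The data, the constants c and d, and the helper `fit_line` are hypothetical choices for illustration. Only a = 32 and b = 9/5 come from the Celsius to Fahrenheit example on the slide.

```python
import math

def fit_line(x, y):
    """Return least-squares (b0, b1, s), where s is the Root MSE on n - 2 df."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, math.sqrt(sse / (n - 2))

# Hypothetical data: X in degrees Celsius, Y an arbitrary response.
x = [0.0, 10.0, 20.0, 30.0, 40.0]
y = [1.2, 2.9, 5.1, 7.2, 8.8]

a, b = 32.0, 9.0 / 5.0   # X* = a + b X (Celsius to Fahrenheit)
c, d = 5.0, -2.0         # Y* = c + d Y (illustrative; d < 0 exercises |d|)

x_star = [a + b * xi for xi in x]
y_star = [c + d * yi for yi in y]

b0, b1, s = fit_line(x, y)
b0_star, b1_star, s_star = fit_line(x_star, y_star)

# Values predicted by the slide's formulas, applied to the estimates.
b1_pred = (d / b) * b1
b0_pred = c + (d / b) * (b * b0 - a * b1)
s_pred = abs(d) * s

print(f"slope:     refit {b1_star:.6f}  formula {b1_pred:.6f}")
print(f"intercept: refit {b0_star:.6f}  formula {b0_pred:.6f}")
print(f"Root MSE:  refit {s_star:.6f}  formula {s_pred:.6f}")
```

Choosing a negative d shows why the error standard deviation scales by |d| rather than d.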
Diagnostics for X
Distribution of X
  outliers, influential cases
  range and concentration
  variance, skewness

Boxplot
Dot plot, stem-leaf plot
Q-Q plot
Time sequence plot
Do it in SAS
data score;
input x;
cards;
21.5
16
20.5
16
;
proc univariate plot: summary statistics, stem-leaf, box, and QQ plots.
Do it in SAS
The UNIVARIATE Procedure
Variable: x

Moments
N                         33    Sum Weights                33
Mean              19.0606061    Sum Observations          629
Std Deviation     3.12439388    Variance           9.76183712
Skewness            0.179907    Kurtosis           -1.3149512
Uncorrected SS       12301.5    Corrected SS       312.378788
Coeff Variation   16.3918916    Std Error Mean     0.54388716

Basic Statistical Measures
Location               Variability
Mean     19.06061      Std Deviation          3.12439
Median   18.50000      Variance               9.76184
Mode     15.50000      Range                 10.00000
                       Interquartile Range    5.50000
Do it in SAS
The UNIVARIATE Procedure
Variable: x

Extreme Observations
----Lowest----    ----Highest---
Value    Obs      Value    Obs
 14.5     28       23.5      3
 15.0     13       23.5     19
 15.0      5       23.5     27
 15.5     32       24.0      6
 15.5     30       24.5     15

Stem Leaf        #   Boxplot
  24 05          2      |
  23 555         3      |
  22 005         3      |
  21 0555        4   +-----+
  20 05          2   |     |
  19 0           1   |  +  |
  18 05555       5   *-----*
  17 555         3   |     |
  16 00          2   +-----+
  15 0055555     7      |
  14 5           1      |
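The stem-and-leaf display carries enough information to rebuild the sample, so the summary statistics reported by PROC UNIVARIATE can be recomputed by hand. The Python sketch below (the slides use SAS) does this under the assumption that the leaf unit is 0.1, which is consistent with the extreme values 14.5 and 24.5 in the listing. The dictionary layout is just one convenient way to type in the display.

```python
import math
import statistics

# Stem-and-leaf rows copied from the display: stem -> leaves.
# Assumption: leaf unit is 0.1, so stem 24 with leaf 5 means 24.5.
stem_leaf = {
    24: "05", 23: "555", 22: "005", 21: "0555", 20: "05", 19: "0",
    18: "05555", 17: "555", 16: "00", 15: "0055555", 14: "5",
}
x = sorted(stem + int(leaf) / 10
           for stem, leaves in stem_leaf.items()
           for leaf in leaves)

n = len(x)
total = sum(x)
mean = statistics.mean(x)
variance = statistics.variance(x)        # sample variance, divisor n - 1
std_dev = math.sqrt(variance)
std_err_mean = std_dev / math.sqrt(n)    # Std Error Mean
coeff_var = 100 * std_dev / mean         # Coeff Variation, in percent

print(f"N={n}  Sum={total}  Mean={mean:.7f}")
print(f"Variance={variance:.8f}  Std Deviation={std_dev:.8f}")
print(f"Std Error Mean={std_err_mean:.8f}  Coeff Variation={coeff_var:.7f}")
print(f"Median={statistics.median(x)}  Mode={statistics.mode(x)}  Range={x[-1] - x[0]}")
```

The rebuilt sample has 33 values summing to 629, matching N and Sum Observations in the Moments table. The interquartile range is left out on purpose, because quantile conventions differ between packages.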
Toluca Example
The Toluca Company tries to find out the relationship between lot size and the labor hours needed to produce the lot.
Do it in SAS
data lot;
  infile 'T:\...\CH01TA01.TXT';
  input size hours;
run;
proc print data=lot;
proc reg data=lot;
  model hours=size;
  plot hours*size;
run;
Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1           252378        252378    105.88   <.0001
Error              23            54825    2383.71562
Corrected Total    24           307203

Root MSE          48.82331    R-Square   ???
Dependent Mean   312.28000    Adj R-Sq   0.8138
Coeff Var         15.63447

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             62.36586         26.17743      2.38     0.0259
size         1              3.57020          0.34697     10.29     <.0001
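The R-Square entry is left as ??? in the Toluca output, but it follows from the sums of squares in the same table. A minimal Python sketch of the arithmetic (the slides use SAS; the variable names are illustrative):

```python
# Sums of squares and degrees of freedom read from the Toluca ANOVA table.
ss_model = 252378   # Model
ss_error = 54825    # Error
ss_total = 307203   # Corrected Total
df_error, df_total = 23, 24

# R-Square is the share of total variation explained by the model.
r_square = ss_model / ss_total
# Adjusted R-Square compares mean squares instead of sums of squares.
adj_r_square = 1 - (ss_error / df_error) / (ss_total / df_total)

print(f"R-Square = {r_square:.4f}, Adj R-Sq = {adj_r_square:.4f}")
# prints: R-Square = 0.8215, Adj R-Sq = 0.8138
```

The adjusted value reproduces the 0.8138 printed in the output, which gives some confidence that the same table yields an R-Square of about 0.8215.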
Residuals
The $e_i$ should be similar to the $\varepsilon_i$
The model assumes $\varepsilon_i$ iid $N(0, \sigma^2)$
Scatter plot of Y vs X
Scatter plot of e vs X
Plot of e vs X emphasizes deviations from the linear pattern
Do it in SAS
Obs   x   y
Do it in SAS
proc reg data=resid noprint;
  model y=x;
  plot y*x r.*x student.*x;
run;
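The two quantities plotted against x in the PROC REG call, the residuals (`r.`) and the studentized residuals (`student.`), can be sketched in Python as follows (the slides use SAS). This is an illustration under stated assumptions: the data are hypothetical, generated from a curved relation so that the residuals show a clear pattern, and the studentized residual is taken to be each residual divided by its estimated standard error, $s\sqrt{1 - h_{ii}}$, with leverage $h_{ii} = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$.

```python
import math

def residual_diagnostics(x, y):
    """Return (residuals, studentized residuals) for the fit y = b0 + b1 x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(ei ** 2 for ei in e) / (n - 2)
    # Leverage of each observation in simple linear regression.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    student = [ei / math.sqrt(mse * (1 - h)) for ei, h in zip(e, leverage)]
    return e, student

# Hypothetical data with a curved relation (y = x^2), not from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [xi ** 2 for xi in x]

e, student = residual_diagnostics(x, y)
for xi, ei, ti in zip(x, e, student):
    print(f"x={xi:.0f}  e={ei:7.3f}  student={ti:7.3f}")
```

For these data the residuals are positive at both ends of the x range and negative in the middle, the kind of systematic departure from a linear pattern that a plot of e against X makes visible.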