Stat 151 Formulas
Numerical Summaries :
• Sample Mean: $\bar{y} = \frac{\sum y_i}{n}$
• Sample Variance: $s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$
• Sample Standard Deviation: $s = \sqrt{s^2}$
• Range: max - min
• Interquartile Range: $IQR = Q_3 - Q_1$
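As a quick check on the numerical summaries above, the following is a minimal Python sketch using only the standard library. The function name `numerical_summaries` and the sample data are illustrative assumptions. Quartile conventions differ between textbooks and software, so the IQR computed here (via the default method of `statistics.quantiles`) may not match a hand calculation done under another convention.

```python
import math
import statistics


def numerical_summaries(data):
    """Return basic numerical summaries for a list of numbers (hypothetical helper)."""
    n = len(data)
    mean = sum(data) / n
    # The sample variance divides by n - 1, not n.
    variance = sum((y - mean) ** 2 for y in data) / (n - 1)
    # Assumption: quartiles come from statistics.quantiles with its default method.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": mean,
        "variance": variance,
        "sd": math.sqrt(variance),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
    }


summary = numerical_summaries([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```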
• Least-Squares regression: $\hat{y} = b_0 + b_1 x$, where $b_1 = r \frac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$
• Correlation: $r = \frac{1}{n-1} \sum_{i=1}^{n} z_x z_y = b_1 \frac{s_x}{s_y}$
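The correlation and least-squares formulas above can be traced step by step in code. This is a minimal sketch, not a definitive implementation: the function name `least_squares` and the data are illustrative assumptions. It computes r as the average product of z-scores (dividing by n - 1) and then derives the slope and intercept from r.

```python
import math


def least_squares(xs, ys):
    """Return (b0, b1, r) from paired data (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # r = (1 / (n - 1)) * sum of products of z-scores
    r = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)
    ) / (n - 1)
    b1 = r * s_y / s_x          # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1, r


b0, b1, r = least_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y-hat = {b0:.3f} + {b1:.3f} x, r = {r:.3f}")
```

Multiplying the returned slope by $s_x / s_y$ recovers r, which matches the second form of the correlation formula.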
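$SD(\hat{p}) = \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$
For large n ($np \ge 10$ and $n(1-p) \ge 10$), the sampling distribution of $\hat{p}$ is approximately normal.
Probability Formulas:
Complement: $P(A^C) = 1 - P(A)$
Conditional probability: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
Inference for one population mean:
Sample size: For a $100(1-\alpha)\%$ confidence with margin of error E, $n = \left(\frac{z^*_{\alpha/2}\,\sigma}{E}\right)^2$
Case 1: $\sigma$ is known
$SD(\bar{Y}) = \frac{\sigma}{\sqrt{n}}$
If the population distribution is normal, then $\bar{Y}$ has a normal distribution.
Central Limit Theorem: For large n (n > 30), the sampling distribution of $\bar{Y}$ is approximately normal.
A $100(1-\alpha)\%$ C.I. for $\mu$: $\bar{Y} \pm z^*_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Test-Statistic for testing $H_0: \mu = \mu_0$: $TS = \frac{\bar{Y} - \mu_0}{\sigma / \sqrt{n}}$
Case 2: $\sigma$ is unknown
A $100(1-\alpha)\%$ C.I. for $\mu$: $\bar{Y} \pm t^*_{n-1,\,\alpha/2} \frac{S}{\sqrt{n}}$
Test-Statistic for testing $H_0: \mu = \mu_0$: $TS = \frac{\bar{Y} - \mu_0}{S / \sqrt{n}}$ ~ t distribution with $df = n - 1$, if $H_0$ is true.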
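The one-mean formulas above can be exercised with Python's standard library. This is a sketch under stated assumptions: the helper names are illustrative, $z^*_{\alpha/2}$ is taken from `statistics.NormalDist().inv_cdf`, and the sample size is rounded up to a whole observation (a common convention that the sheet itself does not state). The standard library has no t quantile function, so the t-based helper returns only the test statistic and degrees of freedom; the critical value would come from a t table.

```python
import math
from statistics import NormalDist, mean, stdev


def z_critical(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_for_mean(sigma, margin, alpha=0.05):
    # Assumption: round up so the margin of error is not exceeded.
    return math.ceil((z_critical(alpha) * sigma / margin) ** 2)


def z_interval(data, sigma, alpha=0.05):
    """Confidence interval for the mean when sigma is known (Case 1)."""
    half_width = z_critical(alpha) * sigma / math.sqrt(len(data))
    return mean(data) - half_width, mean(data) + half_width


def t_statistic(data, mu0):
    """Test statistic and df when sigma is unknown (Case 2)."""
    n = len(data)
    ts = (mean(data) - mu0) / (stdev(data) / math.sqrt(n))
    return ts, n - 1


print(sample_size_for_mean(sigma=15, margin=3))
print(z_interval([5, 7, 9], sigma=2))
print(t_statistic([5, 7, 9], mu0=5))
```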
Large Sample Inference for one proportion:
Sample size: For a $100(1-\alpha)\%$ confidence with margin of error E, $n = \left(\frac{z^*_{\alpha/2}}{E}\right)^2 p(1-p)$
An approximate $100(1-\alpha)\%$ C.I. for p: $\hat{p} \pm z^*_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Test-Statistic for testing $H_0: p = p_0$: $TS = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$ ~ Normal distribution if $H_0$ is true.
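The one-proportion interval and test statistic above can be sketched as follows. The helper names and the counts (60 successes in 100 trials) are illustrative assumptions. Note that the interval uses $\hat{p}$ in its standard error while the test statistic uses the hypothesized $p_0$.

```python
import math
from statistics import NormalDist


def large_sample_ok(n, p):
    """Check the rule of thumb np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


def proportion_interval(successes, n, alpha=0.05):
    """Approximate confidence interval for p, using p-hat in the standard error."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def proportion_test_statistic(successes, n, p0):
    """Test statistic for H0: p = p0, using p0 in the standard error."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


print(large_sample_ok(100, 0.5))
print(proportion_interval(60, 100))
print(proportion_test_statistic(60, 100, p0=0.5))
```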
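Independent samples:
Case 1: Assuming $\sigma_1 = \sigma_2$
$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$
C.I.: $\bar{Y}_1 - \bar{Y}_2 \pm t^*_{n_1 + n_2 - 2,\,\alpha/2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$TS = \frac{\bar{Y}_1 - \bar{Y}_2 - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ ~ t distribution with $df = n_1 + n_2 - 2$ if $H_0$ is true.
Case 2: Not assuming $\sigma_1 = \sigma_2$
C.I.: $\bar{Y}_1 - \bar{Y}_2 \pm t^*_{Min(n_1 - 1,\, n_2 - 1),\,\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
$TS = \frac{\bar{Y}_1 - \bar{Y}_2 - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$ ~ t distribution with $df = Min(n_1 - 1, n_2 - 1)$ if $H_0$ is true.
Paired samples: $D = Y_1 - Y_2$
$TS = \frac{\bar{D} - 0}{S_D / \sqrt{n}}$ ~ t distribution with $df = n - 1$ if $H_0$ is true.
Pooled sample proportion: $\hat{p} = \frac{y_1 + y_2}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$
Standard error of $\hat{p}_1 - \hat{p}_2$: $\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$
Overall mean: $\bar{Y} = \frac{n_1 \bar{Y}_1 + n_2 \bar{Y}_2 + \cdots + n_k \bar{Y}_k}{N}$, where $N = n_1 + n_2 + \cdots + n_k$
Testing $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$:
$TS = \frac{MS(B)}{MS(W)}$ ~ F distribution, if $H_0$ is true, with $df_1 = k - 1$ and $df_2 = N - k$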
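The two-sample and F test statistics above can be computed directly from raw data. The sketch below is illustrative only: the function names are assumptions, the hypothesized difference is fixed at 0 as in the formulas, and MS(B) and MS(W) are computed from their usual definitions (between-group and within-group sums of squares divided by k - 1 and N - k), which the sheet does not spell out.

```python
import math
from statistics import mean, variance


def pooled_t(y1, y2):
    """Case 1: equal population SDs assumed. Returns (TS, df)."""
    n1, n2 = len(y1), len(y2)
    s_p = math.sqrt(
        ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    )
    ts = (mean(y1) - mean(y2) - 0) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return ts, n1 + n2 - 2


def unpooled_t(y1, y2):
    """Case 2: equal SDs not assumed, conservative df. Returns (TS, df)."""
    n1, n2 = len(y1), len(y2)
    se = math.sqrt(variance(y1) / n1 + variance(y2) / n2)
    return (mean(y1) - mean(y2) - 0) / se, min(n1 - 1, n2 - 1)


def anova_f(groups):
    """One-way ANOVA. Returns (TS, df1, df2)."""
    k = len(groups)
    big_n = sum(len(g) for g in groups)
    # Overall mean is the size-weighted average of the group means.
    overall = sum(len(g) * mean(g) for g in groups) / big_n
    # Assumed standard definitions of the mean squares.
    ms_between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups) / (k - 1)
    ms_within = sum((len(g) - 1) * variance(g) for g in groups) / (big_n - k)
    return ms_between / ms_within, k - 1, big_n - k


print(pooled_t([1, 2, 3], [3, 4, 5]))
print(unpooled_t([1, 2, 3], [3, 4, 5]))
print(anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```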
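Model: $\mu\{Y \mid X\} = \beta_0 + \beta_1 X$, $\beta_0$ = Intercept, $\beta_1$ = Slope
Estimated model: $\hat{\mu}\{Y \mid X\} = b_0 + b_1 X$, where $b_1 = r \frac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$
$\hat{\sigma} = s_e = \sqrt{MS_{Error}} = \sqrt{\frac{SS(Error)}{n - 2}}$ - the standard error of the model
$SE(b_1) = \frac{s_e}{\sqrt{(n - 1) s_x^2}} = \frac{s_e}{\sqrt{n - 1}\; s_x}$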
Standard error of the estimated mean response at x: $s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{(n - 1) s_x^2}}$
Coefficient of Determination:
$R^2 = r^2 = \frac{SS(Total) - SS(Error)}{SS(Total)} = \frac{SS(Regression)}{SS(Total)}$
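To see how the standard error of the model, SE(b1), and the coefficient of determination fit together, here is a minimal Python sketch. The function name `regression_summary` and the data are illustrative assumptions. The slope is computed as $S_{xy} / S_{xx}$, which is algebraically the same as $r\, s_y / s_x$, and $S_{xx}$ equals $(n - 1) s_x^2$.

```python
import math


def regression_summary(xs, ys):
    """Simple linear regression summaries (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)  # equals (n - 1) * s_x^2
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx            # same value as r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_error = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(ss_error / (n - 2))       # standard error of the model
    return {
        "b0": b0,
        "b1": b1,
        "s_e": s_e,
        "se_b1": s_e / math.sqrt(s_xx),
        "r_squared": (ss_total - ss_error) / ss_total,
    }


print(regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```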
Chi-Square Test:
$TS = \sum_{i=1}^{k} \frac{(Obs - Exp)^2}{Exp}$ ~ $\chi^2$ distribution with $df = k - 1$ if $H_0$ is true.
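The chi-square statistic above is a sum over the k categories, which is short to write out in code. This sketch assumes the expected counts have already been computed under $H_0$; the function name and the counts are illustrative.

```python
def chi_square_statistic(observed, expected):
    """Return (TS, df) for a chi-square test over k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    ts = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return ts, len(observed) - 1


ts, df = chi_square_statistic([10, 20, 30], [20, 20, 20])
print(ts, df)
```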
Expected Values :
• $\mu = E[X] = \sum x\, p(x)$
• $\sigma^2 = Var[X] = \sum (x - \mu)^2 p(x)$
• $E[aX \pm bY \pm c] = aE[X] \pm bE[Y] \pm c$
• If X and Y are independent, then $Var[aX \pm bY \pm c] = a^2 Var[X] + b^2 Var[Y]$
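The expected-value rules above can be checked numerically for a discrete distribution. In this sketch the function names are illustrative assumptions, a fair six-sided die serves as the example distribution, and a negative `b` covers the subtraction case. The variance rule in `linear_combination` holds only when X and Y are independent.

```python
def expected_value(values, probs):
    """E[X] for a discrete distribution."""
    return sum(x * p for x, p in zip(values, probs))


def discrete_variance(values, probs):
    """Var[X] for a discrete distribution."""
    mu = expected_value(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


def linear_combination(a, b, c, mean_x, var_x, mean_y, var_y):
    """Mean and variance of aX + bY + c, assuming X and Y are independent."""
    # The constant c shifts the mean but does not affect the variance.
    return a * mean_x + b * mean_y + c, a ** 2 * var_x + b ** 2 * var_y


faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
print(expected_value(faces, probs), discrete_variance(faces, probs))
print(linear_combination(2, -3, 1, 0.5, 0.25, 0.5, 0.25))
```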
Standardized values :
$z\text{ value} = \frac{\text{Observation} - \text{Mean}}{\text{Standard Deviation}}$