Formulasheet
Statistical Formulas
Vrije Universiteit
School of Business and Economics
Formula Sheets
$\mu = \mu_X = E(X) = \sum x_i \, P(X = x_i)$
$E\,h(X) = \sum h(x_i) \, P(X = x_i)$, e.g. $E(X - 3)^2 = \sum (x_i - 3)^2 \, P(X = x_i)$
$E(aX) = a \, E(X)$
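$\sigma^2 = \sigma_X^2 = \operatorname{var} X = \sum (x - \mu)^2 \, P(X = x) = E(X - \mu)^2 = E(X - EX)^2 = E X^2 - (EX)^2$
$\operatorname{var}(aX) = a^2 \operatorname{var} X$; $\sigma_{aX} = |a| \, \sigma_X$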
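As a quick numerical check of the expectation and variance formulas, here is a minimal sketch. The helper name `mean_var` and the fair-die example are illustrative assumptions, not part of the sheet.

```python
def mean_var(values, probs):
    """Return (E X, var X) for a discrete distribution given as value/probability lists."""
    mean = sum(x * p for x, p in zip(values, probs))
    # Shortcut form: var X = E X^2 - (E X)^2
    second_moment = sum(x**2 * p for x, p in zip(values, probs))
    return mean, second_moment - mean**2

# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
mu, var = mean_var(faces, probs)

# Scaling rule: var(aX) = a^2 var X, here with a = -2.
_, var_scaled = mean_var([-2 * x for x in faces], probs)
```

For the die, `mu` is 3.5 and `var` is 35/12, and `var_scaled` is four times `var`.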
$\sigma_{X,Y} = \operatorname{cov}(X, Y) = \sum_{i,j} (x_i - \mu_X)(y_j - \mu_Y) \, P(X = x_i, Y = y_j)$
$= \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y)$
$s_{X,Y} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
$r_{X,Y} = \frac{s_{X,Y}}{s_X \, s_Y}$
$\operatorname{cov}(aX, bY) = ab \operatorname{cov}(X, Y)$
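The sample covariance and correlation can be computed directly from paired data. The sketch below uses made-up data and hypothetical function names; it divides by n - 1 as in the sample formula.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation r = s_XY / (s_X s_Y)."""
    s_x = math.sqrt(sample_cov(xs, xs))
    s_y = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (s_x * s_y)

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_xy = sample_cov(xs, ys)
r_xy = sample_corr(xs, ys)
```

With these data `s_xy` is 1.5 and `r_xy` is about 0.7746. Rescaling both variables by constants a and b multiplies the covariance by ab.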
$\mu_{X+Y} = \mu_X + \mu_Y$
$\sigma_{X+Y}^2 = \operatorname{var}(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{X,Y}$
Variables X and Y are independent if and only if for all x and for all y:
$P(X = x, Y = y) = P(X = x) \, P(Y = y)$
If X and Y are independent:
$\sigma_{X+Y}^2 = \operatorname{var}(X + Y) = \sigma_X^2 + \sigma_Y^2$; $\operatorname{cov}(X, Y) = 0$
Special Distributions:
${}_nC_r = \binom{n}{r} = \frac{n!}{r! \, (n-r)!}$; ${}_nP_r = \frac{n!}{(n-r)!}$
Binomial:
$P(X = x) = \binom{n}{x} \pi^x (1 - \pi)^{n-x}$; $EX = n\pi$; $\operatorname{var} X = n\pi(1 - \pi)$
(normal approximation if $n\pi \ge 5$ and $n(1 - \pi) \ge 5$)
(Poisson approximation if $n \ge 20$ and $\pi \le 0.05$)
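The binomial formulas and the two rules of thumb can be checked numerically. This is a minimal sketch with assumed example values n = 10 and π = 0.3; `binom_pmf` is a hypothetical helper name.

```python
from math import comb

def binom_pmf(x, n, pi):
    """P(X = x) for a binomial(n, pi) variable."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

n, pi = 10, 0.3  # illustrative values
pmf = [binom_pmf(x, n, pi) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))               # should equal n * pi
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))  # should equal n * pi * (1 - pi)

# Rules of thumb from the sheet.
normal_ok = n * pi >= 5 and n * (1 - pi) >= 5
poisson_ok = n >= 20 and pi <= 0.05
```

Here neither approximation qualifies: nπ = 3 is below 5, and n is below 20.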
Hypergeometric ($\pi = S/N$):
$P(X = x) = \frac{\binom{S}{x} \binom{N-S}{n-x}}{\binom{N}{n}}$; $EX = n\pi$; $\operatorname{var} X = n\pi(1 - \pi) \frac{N-n}{N-1}$
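A short sketch, under assumed values N = 20, S = 8 and n = 5, comparing the hypergeometric probabilities with the mean and variance formulas. The function name is hypothetical.

```python
from math import comb

def hypergeom_pmf(x, N, S, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

N, S, n = 20, 8, 5  # illustrative values
pi = S / N
pmf = [hypergeom_pmf(x, N, S, n) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

# Closed forms from the sheet, including the finite population factor.
mean_formula = n * pi
var_formula = n * pi * (1 - pi) * (N - n) / (N - 1)
```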
Geometric:
$P(X = x) = \pi(1 - \pi)^{x-1}$ for $x = 1, 2, \ldots$; $EX = \frac{1}{\pi}$; $\operatorname{var} X = (1 - \pi)/\pi^2$
Uniform discrete:
$P(X = x) = \frac{1}{b - a + 1}$ for $x = a, a + 1, \ldots, b$
$EX = \frac{1}{2}(a + b)$; $\operatorname{var} X = \frac{1}{12}\left[(b - a + 1)^2 - 1\right]$
Uniform continuous:
$f(x) = \frac{1}{b - a}$ for $a \le x \le b$; $EX = \frac{1}{2}(a + b)$; $\operatorname{var} X = \frac{1}{12}(b - a)^2$
Exponential:
Normal:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)$; $EX = \mu$; $\operatorname{var} X = \sigma^2$
Estimators.
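Estimator for $\mu$: $\bar{X}$
$E\bar{X} = \mu$; $\operatorname{var} \bar{X} = \frac{\sigma^2}{n}$ or $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$
$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ (normal population; or symmetric population and $n \ge 15$; or $n \ge 30$)
$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$ (normal population; or symmetric population and $n \ge 15$; or $n \ge 30$)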
Estimator for $\sigma^2$: $S^2$
$\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}$ (normal populations)
Estimator for $\pi$: $p$
$Ep = \pi$; $\operatorname{var} p = \frac{\pi(1 - \pi)}{n}$ or $\frac{\pi(1 - \pi)}{n} \cdot \frac{N-n}{N-1}$
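Confidence intervals:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ or $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
$\bar{x} \pm t_{n-1;\alpha/2} \frac{s}{\sqrt{n}}$ or $\bar{x} \pm t_{n-1;\alpha/2} \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
$p \pm z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$ or $p \pm z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}}$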
$x \pm z_{\alpha/2} \sqrt{x}$
$\frac{(n-1) s^2}{\chi^2_{n-1;\alpha/2}} \le \sigma^2 \le \frac{(n-1) s^2}{\chi^2_{n-1;1-\alpha/2}}$
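The z-based intervals for a mean and a proportion, with the optional finite population factor, can be sketched as follows. The function names and example numbers are assumptions for illustration. The t and chi-square intervals need critical values that the Python standard library does not provide, so they are left out.

```python
from math import sqrt
from statistics import NormalDist

def fpc(n, N):
    """Finite population correction sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def z_interval_mean(xbar, sigma, n, alpha=0.05, N=None):
    """xbar +/- z_{alpha/2} * sigma / sqrt(n), times the fpc if N is given."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    if N is not None:
        se *= fpc(n, N)
    return xbar - z * se, xbar + z * se

def z_interval_proportion(p, n, alpha=0.05, N=None):
    """p +/- z_{alpha/2} * sqrt(p (1 - p) / n), times the fpc if N is given."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p * (1 - p) / n)
    if N is not None:
        se *= fpc(n, N)
    return p - z * se, p + z * se

# Illustrative numbers only.
mean_ci = z_interval_mean(100, 15, 36)
mean_ci_finite = z_interval_mean(100, 15, 36, N=100)
prop_ci = z_interval_proportion(0.4, 150)
```

The interval with the finite population factor is narrower than the one without it, since the factor is below 1 whenever n > 1.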
Sample size:
$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ or $E = z_{\alpha/2} \sqrt{\frac{\pi(1 - \pi)}{n}}$
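Solving the margin-of-error formulas for n gives n = (z σ / E)² for a mean and n = z² π(1 - π) / E² for a proportion. The sketch below rounds up to the next integer; the function names and inputs are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def n_for_mean(sigma, E, alpha=0.05):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= E."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / E) ** 2)

def n_for_proportion(pi, E, alpha=0.05):
    """Smallest n with z_{alpha/2} * sqrt(pi (1 - pi) / n) <= E."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 * pi * (1 - pi) / E**2)

n_mean = n_for_mean(sigma=15, E=2)          # 95% confidence
n_prop = n_for_proportion(pi=0.5, E=0.03)   # pi = 0.5 is the most conservative choice
```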
Testing hypotheses
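$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ or $z = \frac{\bar{x} - \mu_0}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}}$; $n = \frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_0 - \mu_1)^2}$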
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ or $t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}}$
$t = r \sqrt{\frac{n-2}{1 - r^2}}$; $df = n - 2$
$z = r_S \sqrt{n - 1}$
$z = \frac{x - n\pi_0 \pm \frac{1}{2}}{\sqrt{n\pi_0(1 - \pi_0)}}$ or $z = \frac{x - n\pi_0 \pm \frac{1}{2}}{\sqrt{n\pi_0(1 - \pi_0)} \sqrt{\frac{N-n}{N-1}}}$
$\chi^2 = \frac{(n-1) s^2}{\sigma_0^2}$ (normal populations)
$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
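$t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$; $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$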
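The unequal-variance t statistic and its degrees of freedom are easy to get wrong by hand, so a small sketch may help. The function name `welch_t` and the summary statistics are assumptions for illustration.

```python
from math import sqrt

def welch_t(x1bar, s1, n1, x2bar, s2, n2, diff0=0.0):
    """t statistic and degrees of freedom for two samples with unequal variances.
    diff0 is the hypothesized value of mu_1 - mu_2."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    t = (x1bar - x2bar - diff0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Illustrative summary statistics.
t_stat, df = welch_t(5.0, 2.0, 10, 3.0, 2.0, 10)
```

With equal standard deviations and equal sample sizes, the degrees of freedom reduce to n1 + n2 - 2, which is 18 in this example.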
$t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{s_p^2/n_1 + s_p^2/n_2}}$; $df = n_1 + n_2 - 2$; $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}$
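A minimal sketch of the pooled-variance version, assuming equal population variances. The names and numbers are illustrative.

```python
from math import sqrt

def pooled_t(x1bar, s1, n1, x2bar, s2, n2, diff0=0.0):
    """Pooled-variance t statistic, its degrees of freedom, and s_p^2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))
    t = (x1bar - x2bar - diff0) / sqrt(sp2 / n1 + sp2 / n2)
    return t, n1 + n2 - 2, sp2

# Illustrative summary statistics.
t_stat, df, sp2 = pooled_t(12.0, 2.0, 10, 10.0, 3.0, 15)
```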
$z = \frac{p_1 - p_2 - (\pi_1 - \pi_2)}{\sqrt{p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2}}$
$z = \frac{p_1 - p_2 - 0}{\sqrt{\bar{p}(1 - \bar{p})/n_1 + \bar{p}(1 - \bar{p})/n_2}}$; $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$
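For testing equality of two proportions, the pooled proportion replaces both sample proportions in the standard error. A sketch with made-up counts and a hypothetical function name:

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: pi_1 = pi_2, using the pooled proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    return (p1 - p2) / se

z = two_proportion_z(45, 100, 30, 100)  # illustrative counts
```

Here the pooled proportion is 0.375 and `z` is about 2.19.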
$F = \frac{s_1^2}{s_2^2}$; $(df_1, df_2) = (n_1 - 1, n_2 - 1)$ (normal populations)
$W = \sum_{i=1}^{n'} R_i^+$; $EW = \frac{n'(n' + 1)}{4}$; $\operatorname{var} W = \frac{n'(n' + 1)(2n' + 1)}{24}$
$T_1 = \sum_{i=1}^{n_1} R_i$; $E T_1 = \frac{n_1(n_1 + n_2 + 1)}{2}$; $\operatorname{var} T_1 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$
$\chi^2 = \sum \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$; $df = (r - 1)(c - 1)$; $e_{jk} \ge 5$
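The chi-square statistic for an r × c table can be sketched as below. The sheet does not state how the expected counts are obtained; the code assumes the usual independence formula (row total × column total / n). The function name and the example table are illustrative.

```python
def chi_square_independence(table):
    """Return (chi2, df, smallest expected count) for a table of observed counts.
    Assumes expected counts e_jk = row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for j, row in enumerate(table):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n
            chi2 += (f - e) ** 2 / e
            min_expected = min(min_expected, e)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected

chi2, df, min_e = chi_square_independence([[20, 30], [30, 20]])
```

The third return value makes it easy to check the condition that every expected count is at least 5.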
Simple Regression:
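$b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$
$b_0 = \bar{y} - b_1 \bar{x}$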
$\sum e_i^2 = SSE = \sum (y_i - \hat{y}_i)^2$
$\hat{\sigma}^2 = s^2 = MSE = \frac{\sum e_i^2}{n - k - 1}$
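The slope, intercept, SSE and MSE formulas for simple regression (k = 1) can be tied together in one short sketch. The data and the function name are illustrative assumptions.

```python
def simple_regression(xs, ys):
    """Least-squares fit of y = b0 + b1 x; returns (b0, b1, SSE, MSE) with k = 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    ss_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    ss_xx = sum((x - xbar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx
    b0 = ybar - b1 * xbar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e**2 for e in residuals)
    mse = sse / (n - 1 - 1)  # n - k - 1 with k = 1 predictor
    return b0, b1, sse, mse

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, sse, mse = simple_regression(xs, ys)
```

For these data the fitted line is ŷ = 2.2 + 0.6x, with SSE = 2.4 and MSE = 0.8.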
$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
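The Durbin-Watson statistic only needs the residuals in time order. A minimal sketch with made-up residuals and a hypothetical function name:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e**2 for e in residuals)
    return numerator / denominator

# Illustrative residuals, in time order.
e = [-0.8, 0.6, 1.0, -0.6, -0.2]
dw = durbin_watson(e)
```

Values near 2 suggest little first-order autocorrelation in the residuals; here `dw` is about 2.02.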
$t = \frac{b_1 - \beta_1}{s_{B_1}}$ where $s_{B_1} = \frac{\hat{\sigma}}{\sqrt{\sum (x_i - \bar{x})^2}}$; $df = n - k - 1$
$t = r \sqrt{\frac{n-2}{1 - r^2}}$; $df = n - k - 1$
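$\hat{y}_i \pm t_{n-2} \, \hat{\sigma} \sqrt{h_i}$ resp. $\hat{y}_i \pm t_{n-2} \, \hat{\sigma} \sqrt{1 + h_i}$ where $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$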
Multiple Regression
             Sum of Squares                    df            Mean Squares
Regression   $\sum (\hat{y}_i - \bar{y})^2$    $k$
Residual     $\sum (y_i - \hat{y}_i)^2$        $n - k - 1$   $\hat{\sigma}^2_{error}$
Total        $\sum (y_i - \bar{y})^2$          $n - 1$
$R^2 = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} = 1 - \frac{\sum e_i^2}{\sum (y_i - \bar{y})^2}$
             B        Std.Error   Beta   t
(Constant)   $b_0$    $s(B_0)$           $b_0 / s(B_0)$
X1           $b_1$    $s(B_1)$    ...    $b_1 / s(B_1)$
X2           $b_2$    $s(B_2)$    ...    $b_2 / s(B_2)$
...          ...      ...         ...    ...
$R^2_{adj} = 1 - \frac{n-1}{n-k-1}\left(1 - R^2\right) = 1 - \frac{\frac{1}{n-k-1} \sum e_i^2}{\frac{1}{n-1} \sum (y_i - \bar{y})^2}$
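The two goodness-of-fit measures can be computed from observed and fitted values. In this sketch the fitted values are assumed to come from a model with k predictors; the numbers and the function name are illustrative.

```python
def r_squared(ys, fitted, k):
    """Return (R^2, adjusted R^2) for observed ys, fitted values, and k predictors."""
    n = len(ys)
    ybar = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - ybar) ** 2 for y in ys)
    r2 = 1 - sse / sst
    r2_adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)
    return r2, r2_adj

# Illustrative values: fitted values from the line 2.2 + 0.6 x at x = 1, ..., 5.
ys = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2, r2_adj = r_squared(ys, fitted, k=1)
```

Adjusted R² is never larger than R², because the factor (n - 1)/(n - k - 1) is at least 1.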
partial F
$h_i > \frac{2(k + 1)}{n}$
ANOVA ($c$ columns, $n_1, n_2, \ldots, n_c$ observations per column)
Tukey
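$T_{calc} = \frac{\bar{y}_j - \bar{y}_k}{\sqrt{MSE \left(\frac{1}{n_j} + \frac{1}{n_k}\right)}}$; $T_{c,n-c}$; Crit.Range $= T_{c,n-c} \sqrt{MSE \left(\frac{1}{n_j} + \frac{1}{n_k}\right)}$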
Kruskal-Wallis
$H = \left(\frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right) - 3(n + 1)$; $df = c - 1$; $n_j \ge 5$
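The Kruskal-Wallis statistic ranks all observations together and sums the ranks per group. The sketch below assigns average ranks to tied values but, like the formula above, applies no tie correction. The helper names and the data are illustrative assumptions.

```python
def average_ranks(values):
    """Ranks 1..n of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic and df = c - 1 for a list of groups (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        t_j = sum(ranks[start:start + len(g)])  # rank sum T_j of group j
        total += t_j**2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1), len(groups) - 1

# Illustrative data: three fully separated groups of five observations.
h, df = kruskal_wallis_h([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
```

In this example the rank sums are 15, 40 and 65, which gives H = 12.5 with 2 degrees of freedom.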