Final formulae sheet
SK = 3(mean − median) / standard deviation
index of the p th percentile = (p/100)·(n + 1)
IQR = Q3 − Q1
z = (x − µ)/σ  or  z = (x − x̄)/s
Coefficient of Variation: CV = (s/x̄)·100%  or  CV = (σ/µ)·100%
Chebyshev's Rule: at least 1 − 1/k² of the data fall within k standard deviations of the mean
Median for grouped data (similar for any fractile): x̃ ≈ L + (j/f)·c
Where L is the lower boundary of the class into which the median must fall, f is the frequency of this class, c is the class interval, and j is the number of values we still lack when we reach L.
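A minimal Python sketch of the grouped-data median formula above (the class boundaries and frequencies are hypothetical):

    # Grouped-data median: x ≈ L + (j/f)*c
    classes = [(10, 20, 4), (20, 30, 7), (30, 40, 6), (40, 50, 3)]  # (lower, upper, frequency), hypothetical

    n = sum(f for _, _, f in classes)
    half = n / 2                      # position of the median
    cum = 0
    for lower, upper, f in classes:
        if cum + f >= half:           # the median falls in this class
            L = lower                 # lower boundary of the median class
            c = upper - lower         # class interval
            j = half - cum            # values still lacking when we reach L
            median = L + (j / f) * c
            break
        cum += f

    print(median)                     # ≈ 28.57 for the data above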
P(A) ≥ 0 for any event A;  P(A) ≤ 1 for any event A;  P(∅) = 0;  P(A) + P(A′) = 1
If A and B are mutually exclusive events, then P(A ∪ B) = P(A) + P(B).
If P(B) is not equal to zero, then: P(A|B) = P(A ∩ B) / P(B)
If A and B are independent: P(A) = P(A|B) and P(A ∩ B) = P(A) P(B)
Bayes' Theorem: P(A|B) = P(B|A)·P(A) / [P(B|A)·P(A) + P(B|A′)·P(A′)]
P(at least one success) = 1 – P(zero successes)
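A small Python check of Bayes' Theorem and the at-least-one rule (all probabilities below are hypothetical):

    # Bayes' Theorem: P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A')P(A')]
    p_A = 0.01           # prior P(A), hypothetical
    p_B_given_A = 0.95   # P(B|A), hypothetical
    p_B_given_Ac = 0.10  # P(B|A'), hypothetical

    p_A_given_B = (p_B_given_A * p_A) / (p_B_given_A * p_A + p_B_given_Ac * (1 - p_A))
    print(round(p_A_given_B, 4))      # 0.0876

    # P(at least one success) = 1 - P(zero successes), e.g. 5 independent trials with p = 0.3
    p_at_least_one = 1 - (1 - 0.3) ** 5
    print(round(p_at_least_one, 4))   # 0.8319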
If the probabilities of obtaining the amounts a1, a2, a3, …, or ak are p1, p2, p3, …, and pk,
where p1 + p2 + p3 + … + pk = 1, then the mathematical expectation is E = a1p1 + a2p2 + … + akpk
Binomial: f(x) = nCx · p^x · (1 − p)^(n−x) for x = 0, 1, 2, …, or n;  µ = np,  σ² = np(1 − p)
Hypergeometric: P(X = x) = f(x) = (aCx)·(bC(n−x)) / ((a+b)Cn) for x = 0, 1, 2, …, or n; x ≤ a, (n − x) ≤ b
Poisson: P(X = x) = f(x) = e^(−λ)·λ^x / x!,  µ = λ
Poisson Approx. to the Binomial: P(X = x) ≈ f(x) = e^(−np)·(np)^x / x!,  assumes np < 10, n > 100
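A short Python sketch (standard library only) evaluating the binomial, hypergeometric and Poisson formulas above; the parameter values are hypothetical:

    from math import comb, exp, factorial

    n, p, x = 10, 0.3, 4
    binom_pmf = comb(n, x) * p**x * (1 - p)**(n - x)          # Binomial f(x)

    a, b = 6, 14                                              # a successes, b failures in the lot (hypothetical)
    hyper_pmf = comb(a, x) * comb(b, n - x) / comb(a + b, n)  # Hypergeometric f(x)

    lam = 2.5
    poisson_pmf = exp(-lam) * lam**x / factorial(x)           # Poisson f(x)

    print(round(binom_pmf, 4), round(hyper_pmf, 4), round(poisson_pmf, 4))  # 0.2001 0.2438 0.1336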
Normal distribution: Z = (x − µ)/σ
Sampling distribution for means – infinite population:
µ_x̄ = µ,  σ²_x̄ = σ²/n,  σ_x̄ = σ/√n,  Z = (x̄ − µ)/σ_x̄ = (x̄ − µ)/(σ/√n)
Sampling distribution for means – finite population:
E(X̄) = µ_x̄ = µ = E(X),  σ_x̄ = (σ/√n)·√((N − n)/(N − 1))
Finding the sample size in estimation of means situations:
Formula in text: n = [ z_(α/2)·σ / E ]²  or use  E = z_(α/2)·σ/√n
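A quick Python computation of the sample-size formula (σ, E and the confidence level below are hypothetical):

    from math import ceil
    from statistics import NormalDist

    sigma, E = 15.0, 2.0                      # hypothetical population sd and margin of error
    alpha = 0.05
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_(alpha/2) ≈ 1.96

    n = (z * sigma / E) ** 2
    print(ceil(n))                            # round up: 217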
Estimating µ, σ known, (1 − α)·100%, (large sample case, use s for σ if σ not known):
x̄ − z_(α/2)·σ/√n ≤ µ ≤ x̄ + z_(α/2)·σ/√n   equivalent to:  µ = x̄ ± z_(α/2)·σ/√n
Estimating µ, σ unknown, (1 − α)·100%, (small sample case):
x̄ − t_(α/2, n−1)·s/√n ≤ µ ≤ x̄ + t_(α/2, n−1)·s/√n   equivalent to:  µ = x̄ ± t_(α/2, n−1)·s/√n
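A minimal Python sketch of the t-based confidence interval for µ (scipy is assumed available for the critical value; the sample is hypothetical):

    from math import sqrt
    from statistics import mean, stdev
    from scipy.stats import t

    sample = [12.1, 11.8, 12.6, 12.0, 11.5, 12.3, 12.4, 11.9]   # hypothetical data
    n = len(sample)
    x_bar, s = mean(sample), stdev(sample)

    alpha = 0.05
    t_crit = t.ppf(1 - alpha / 2, df=n - 1)        # t_(alpha/2, n-1)

    half_width = t_crit * s / sqrt(n)
    print(x_bar - half_width, x_bar + half_width)  # 95% confidence interval for µ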
Estimating σ², (1 − α)·100%, (small sample case): (n − 1)s²/χ²_(α/2) < σ² < (n − 1)s²/χ²_(1−α/2)
Estimating σ, (1 − α)·100%, (large sample case): s / (1 + z_(α/2)/√(2n)) < σ < s / (1 − z_(α/2)/√(2n))
Estimating p, (1 − α)·100%: p̂ − z_(α/2)·√(p̂(1 − p̂)/n) < p < p̂ + z_(α/2)·√(p̂(1 − p̂)/n)
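A short Python sketch of the confidence interval for a proportion (the counts are hypothetical):

    from math import sqrt
    from statistics import NormalDist

    x, n = 130, 400                      # hypothetical successes and sample size
    p_hat = x / n
    alpha = 0.05
    z = NormalDist().inv_cdf(1 - alpha / 2)

    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    print(p_hat - half_width, p_hat + half_width)   # 95% confidence interval for p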
Test about µ, σ known, (large sample case, use s for σ if σ not known): z = (x̄ − µ0)/(σ/√n)
Test about µ, σ unknown, (small sample case): t_(n−1) = (x̄ − µ0)/(s/√n)
Test about µ1 and µ2, σ's known (or CLT applicable: large indep. samples case, use si for σi):
z = (x̄1 − x̄2 − δ) / √(σ1²/n1 + σ2²/n2)
Pooled variance t-test about µ1 and µ2, σ's unknown, (small independent samples case):
t_(n1+n2−2) = (x̄1 − x̄2 − δ) / (sp·√(1/n1 + 1/n2))  where  sp = √( ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) )
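A Python sketch of the pooled-variance t statistic (hypothetical samples; δ = 0, i.e. testing equal means):

    from math import sqrt
    from statistics import mean, variance

    x1 = [5.1, 4.8, 5.4, 5.0, 4.9, 5.2]          # hypothetical sample 1
    x2 = [4.6, 4.9, 4.7, 4.5, 5.0, 4.8, 4.4]     # hypothetical sample 2
    n1, n2 = len(x1), len(x2)
    delta = 0.0                                  # hypothesized difference of means

    sp = sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2))
    t_stat = (mean(x1) - mean(x2) - delta) / (sp * sqrt(1 / n1 + 1 / n2))
    print(t_stat)                                # compare with the t critical value, df = n1 + n2 - 2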
Test about the means of Paired or Matched Data: Find differences, and then use either
the one-sample Z-test or t-test about µ, whichever is applicable.
Test concerning one standard deviation (Small Sample Case): χ²_(n−1) = (n − 1)s² / σ0²
Test concerning one standard deviation (Large Sample Case): z = (s − σ0) / (σ0/√(2n))
Test concerning two standard deviations: F = s1²/s2²  or  F = s2²/s1², whichever is > 1
Test concerning one proportion: z = (x − np0) / √(np0(1 − p0));  (np > 5, n(1 − p) > 5)
Test concerning two proportions: z = (x1/n1 − x2/n2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )  where  p̂ = (x1 + x2)/(n1 + n2)
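A Python sketch of the two-proportion z statistic (the counts are hypothetical):

    from math import sqrt

    x1, n1 = 52, 200          # hypothetical successes / sample size, group 1
    x2, n2 = 38, 180          # hypothetical successes / sample size, group 2

    p_hat = (x1 + x2) / (n1 + n2)                 # pooled proportion
    z = (x1 / n1 - x2 / n2) / sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    print(z)                  # compare with the z critical value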
Test for r by c contingency table: χ²_((r−1)(c−1)) = ∑ (o − e)²/e = ∑ o²/e − n
Test for goodness of fit: χ²_(k−m−1) = ∑ (o − e)²/e = ∑ o²/e − n
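A Python sketch of the chi-square statistic for a 2 by 3 contingency table (hypothetical counts; expected counts use the usual row total × column total / n rule):

    observed = [[20, 30, 25],
                [30, 20, 25]]            # hypothetical 2 x 3 table of observed counts

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n    # expected count
            chi2 += (o - e) ** 2 / e

    print(chi2)        # compare with the chi-square critical value, df = (r-1)(c-1) = 2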
ANOVA (unequal sample size)
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square               F(k−1, N−k)
Treatment             k − 1                SS(Tr)           MS(Tr) = SS(Tr)/(k − 1)   MS(Tr)/MSE
Error                 N − k                SSE              MSE = SSE/(N − k)
Total                 N − 1                SST
SST = ∑i ∑j xij² − (1/N)·T••²,  where N = ∑i ni and Ti• = ∑j xij (j runs from 1 to ni within group i)
T•• = ∑i Ti• = ∑i ∑j xij,   SS(Tr) = ∑i (Ti•²/ni) − (1/N)·T••²
SSE = SST − SS(Tr)
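A Python sketch of the one-way ANOVA sums of squares for unequal group sizes, following the T-notation above (the groups are hypothetical):

    groups = [[6.2, 5.9, 6.5, 6.1],          # hypothetical treatment 1
              [5.4, 5.8, 5.6],               # hypothetical treatment 2
              [6.8, 7.0, 6.6, 6.9, 7.1]]     # hypothetical treatment 3

    k = len(groups)
    N = sum(len(g) for g in groups)
    T_i = [sum(g) for g in groups]           # group totals T_i.
    T_dd = sum(T_i)                          # grand total T..

    SST = sum(x ** 2 for g in groups for x in g) - T_dd ** 2 / N
    SSTr = sum(t ** 2 / len(g) for t, g in zip(T_i, groups)) - T_dd ** 2 / N
    SSE = SST - SSTr

    MSTr = SSTr / (k - 1)
    MSE = SSE / (N - k)
    F = MSTr / MSE
    print(SST, SSTr, SSE, F)                 # compare F with the F critical value, df = (k-1, N-k)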
Regression: ŷ = a + bx,  ei = yi − ŷi
The standard error of estimate:
se = √( (Syy − b·Sxy) / (n − 2) );   Syy = ∑y² − (1/n)(∑y)²,  Sxy = ∑xy − (1/n)(∑x)(∑y),  Sxx = ∑x² − (1/n)(∑x)²
Tests for α = 0, β = 0, and ρ = 0:
t_(n−2) = a / ( se·√(1/n + x̄²/Sxx) ),   t_(n−2) = b·√(Sxx) / se,   t_(n−2) = r·√(n − 2) / √(1 − r²)
(1 − α)·100% confidence interval for the estimate of the mean of y when x = x0:
µ_y = ŷ0 ± t_(α/2, n−2)·se·√( 1/n + (x0 − x̄)²/Sxx )
(1 − α)·100% prediction interval for y when x = x0:
y = ŷ0 ± t_(α/2, n−2)·se·√( 1 + 1/n + (x0 − x̄)²/Sxx )
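A Python sketch of simple linear regression using the S-notation above, including the standard error of estimate and the slope t-test. The data are hypothetical, and b = Sxy/Sxx, a = ȳ − b·x̄ are the usual least-squares estimates (not stated explicitly on the sheet):

    from math import sqrt

    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]           # hypothetical predictor
    y = [2.1, 2.9, 3.8, 5.2, 5.9, 7.1]           # hypothetical response
    n = len(x)

    Sxx = sum(v ** 2 for v in x) - sum(x) ** 2 / n
    Syy = sum(v ** 2 for v in y) - sum(y) ** 2 / n
    Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

    b = Sxy / Sxx                                # least-squares slope
    a = sum(y) / n - b * sum(x) / n              # least-squares intercept
    se = sqrt((Syy - b * Sxy) / (n - 2))         # standard error of estimate

    t_slope = b * sqrt(Sxx) / se                 # test of beta = 0, df = n - 2
    print(a, b, se, t_slope)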
ANOVA (Simple Linear Regression)
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square         F(1, n−2)
Regression            1                    SSR              MSR = SSR           MSR/MSE
Error                 n − 2                SSE              MSE = SSE/(n − 2)