Stats formula sheet
Sample variance
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Sample standard deviation
$$s_x = \sqrt{s_x^2}$$

Inter-quartile range
$$\mathrm{IQR} = Q_3 - Q_1$$

1.5IQR rule
$$Q_1 - 1.5\,\mathrm{IQR}, \quad Q_3 + 1.5\,\mathrm{IQR}$$

Chapter 2

For any event $A$,
$$0 \le P(A) \le 1 \quad \text{and} \quad P(A^c) = 1 - P(A).$$

For an empty set $\emptyset$,
$$P(\emptyset) = 0.$$

For any two events $A$ and $B$,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$

or a special case
$$P(A \mid B) = \frac{P(A)P(B \mid A)}{P(A)P(B \mid A) + P(A^c)P(B \mid A^c)}$$

De Morgan's Law
$$(A \cup B)^c = A^c \cap B^c \quad \text{and} \quad (A \cap B)^c = A^c \cup B^c$$

For a discrete r.v. $X$, its pmf is
$$p(x) = P(X = x).$$
$p(x)$ is non-negative and
$$\sum_{\text{all } x} p(x) = 1.$$

The mean and variance of $X$ are
$$\mu_X = \sum_{\text{all } x} x \cdot p(x)$$
$$\sigma_X^2 = \sum_{\text{all } x} (x - \mu_X)^2 \cdot p(x) = \Big[\sum_{\text{all } x} x^2 \cdot p(x)\Big] - \mu_X^2.$$

For a continuous r.v. $X$ with a pdf $f(x)$ and two real numbers $a$ and $b$ where $a < b$,
$$P(a < X < b) = \int_a^b f(x)\,dx.$$
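The mean and variance formulas for a discrete r.v. and the integral for a continuous r.v. can be checked numerically in R. The sketch below is illustrative only: the pmf values and the exponential pdf with rate 1 are assumptions made up for the example, not part of the sheet.

```r
# Hypothetical pmf for a discrete r.v. X (values chosen for illustration)
x <- c(0, 1, 2, 3)
p <- c(0.1, 0.2, 0.3, 0.4)

# Check the pmf conditions: non-negative and summing to 1
stopifnot(all(p >= 0), isTRUE(all.equal(sum(p), 1)))

# Mean: sum of x * p(x)
mu_X <- sum(x * p)

# Variance, computed both ways from the sheet
var_def      <- sum((x - mu_X)^2 * p)
var_shortcut <- sum(x^2 * p) - mu_X^2

# Continuous case: P(a < X < b) as the integral of an assumed pdf
f <- function(x) exp(-x)   # assumed pdf: exponential with rate 1
a <- 0.5
b <- 2
prob_ab <- integrate(f, lower = a, upper = b)$value

print(c(mean = mu_X, var_def = var_def, var_shortcut = var_shortcut))
print(prob_ab)
```

Both variance expressions should agree, and for this assumed pdf the integral can be cross-checked with pexp(b) - pexp(a).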
STAT 201 Formula Sheet Winter 2023
1 −(x−µ)2
Binomial Distributions

If $X \sim \mathrm{Bin}(n, p)$, then
$$p(x) = \frac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \dots, n$$
$$\mu_X = np \quad \text{and} \quad \sigma_X^2 = np(1-p).$$
P(X = x) = dbinom(x, n, p)
P(X ≤ x) = pbinom(x, n, p)

Sampling Distribution of $S_n$

Let $X_1, \dots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Let $S_n$ be the sample total. Then
$$\mu_{S_n} = n\mu \quad \text{and} \quad \sigma_{S_n}^2 = n\sigma^2.$$
CLT: $S_n$ follows a normal distribution approximately when $n$ is sufficiently large.
z-score is
$$z = \frac{s_n - n\mu}{\sqrt{n}\,\sigma}$$
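As a rough check of the binomial formulas and the z-score for a sample total, the following R sketch compares the hand formula with dbinom() and pbinom(), then applies the CLT approximation. All numeric values are assumptions chosen for illustration, and the sample size for the total is named `m` here only to avoid clashing with the binomial `n`.

```r
# Illustrative binomial parameters (assumed, not from the sheet)
n <- 10
p <- 0.3
x <- 4

# Binomial pmf from the formula, compared with dbinom()
pmf_formula <- factorial(n) / (factorial(x) * factorial(n - x)) *
  p^x * (1 - p)^(n - x)
pmf_builtin <- dbinom(x, n, p)

# P(X <= x): sum of the pmf over 0..x, compared with pbinom()
cdf_sum     <- sum(dbinom(0:x, n, p))
cdf_builtin <- pbinom(x, n, p)

# Mean and variance of X
mu_X  <- n * p
var_X <- n * p * (1 - p)

# Sample total S_n from a population with mean mu and sd sigma
m     <- 50     # sample size for the total (assumed)
mu    <- 2      # population mean (assumed)
sigma <- 1.5    # population standard deviation (assumed)
s_n   <- 110    # observed sample total (assumed)

# z-score of the sample total
z <- (s_n - m * mu) / (sqrt(m) * sigma)

# Approximate P(S_n <= s_n) via the CLT
p_approx <- pnorm(z)

print(c(pmf_formula = pmf_formula, pmf_builtin = pmf_builtin))
print(c(z = z, p_approx = p_approx))
```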
If $H_1: \mu > \mu_0$, then
p-value = $P(T \ge t_{ts})$ = pt(tts, df, lower.tail = F).
If $H_1: \mu < \mu_0$, then
p-value = $P(T \le t_{ts})$ = pt(tts, df).
If $H_1: \mu \ne \mu_0$, then
p-value = $2P(T \ge |t_{ts}|)$ = 2*pt(abs(tts), df, lower.tail = F).

reject a false null hypothesis when the alternative hypothesis is true.

To test $H_0: \mu_X - \mu_Y = 0$:

If two samples are independent with unequal population variances, then the test statistic is
$$t_{ts} = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}},$$
with
$$df = \frac{\left(\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}\right)^2}{\frac{(s_X^2/n_X)^2}{n_X - 1} + \frac{(s_Y^2/n_Y)^2}{n_Y - 1}}.$$

If two samples are independent with equal population variances, then the test statistic is
$$t_{ts} = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}},$$
where
$$df = n_X + n_Y - 2 \quad \text{and} \quad s_p = \sqrt{\frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}}.$$

To test $H_0: \sigma_X^2 / \sigma_Y^2 = 1$, the test statistic is
$$F_{ts} = \frac{s_X^2}{s_Y^2}$$
p-value = 2*pf(Fts, nX-1, nY-1, lower.tail = F), if $F_{ts} > 1$;
p-value = 2*pf(Fts, nX-1, nY-1), if $F_{ts} < 1$.

If you use the F distribution table to find the p-value, then you need to refer to the lecture notes method to form the test statistic.

If a set of data has been given, you may use var.test() in R to perform a test. Please refer to R documentation for more info.

Chapter 9

You will always need to check proper conditions so that the following formulae can be used.

One way ANOVA model is
$$X_{ij} = \mu_i + \epsilon_{ij}$$
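A minimal sketch of fitting the one-way model in R follows. The data are simulated under assumed group means and a common error standard deviation, and the names `dat`, `group`, `resp`, and `I_groups` are hypothetical. It cross-checks the ANOVA table from lm() and anova() against the SSE, critical value, and confidence interval formulas that this chapter lists for the one-way case.

```r
set.seed(1)

# Simulated data: 3 groups with assumed means and a common error sd
group <- factor(rep(c("A", "B", "C"), times = c(5, 6, 7)))
mu_i  <- c(A = 10, B = 12, C = 15)   # assumed group means
resp  <- unname(mu_i[as.character(group)]) + rnorm(length(group), sd = 2)
dat   <- data.frame(group = group, resp = resp)

# Fit X_ij = mu_i + eps_ij and get the ANOVA table
fit <- lm(resp ~ group, data = dat)
tab <- anova(fit)

# SSE from the group sample variances: sum of (J_i - 1) * s_i^2
by_group <- split(dat$resp, dat$group)
J_i <- sapply(by_group, length)
s2  <- sapply(by_group, var)
SSE <- sum((J_i - 1) * s2)

# Critical value of the F test at level alpha
alpha    <- 0.05
I_groups <- nlevels(dat$group)
N        <- nrow(dat)
f_crit   <- qf(alpha, I_groups - 1, N - I_groups, lower.tail = FALSE)

# CI for each mu_i: group mean +/- t * sqrt(MSE / J_i)
MSE  <- SSE / (N - I_groups)
xbar <- sapply(by_group, mean)
half <- qt(alpha / 2, N - I_groups, lower.tail = FALSE) * sqrt(MSE / J_i)
ci   <- cbind(lower = xbar - half, upper = xbar + half)

print(tab)
print(f_crit)
print(ci)
```

The factor effect is judged significant at level alpha when the F value in the table exceeds `f_crit`.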
$$SSE = \sum_{i=1}^{I} (J_i - 1) s_i^2$$
$$SSTotal = SSTr + SSE$$

The critical value for one way ANOVA F test is
qf(α, I − 1, N − I, lower.tail = F).

If the factor/treatment effect is significant, compute the CI for each $\mu_i$ as
$$\bar{x}_i \pm t_{N-I,\alpha/2} \sqrt{\frac{MSE}{J_i}}.$$

Two way ANOVA model is
$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$
where $i = 1, \dots, I$ for levels of factor A, $j = 1, \dots, J$ for levels of factor B and $k = 1, \dots, K$ for replicates.

To test interaction effects and main effects, use the two way ANOVA table as follows:

The estimated effects are
$$\hat{\alpha}_i = \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot}$$
$$\hat{\beta}_j = \bar{x}_{\cdot j \cdot} - \bar{x}_{\cdot\cdot\cdot}$$

$$\bar{x}_{\cdot j \cdot} \pm t_{IJ(K-1),\alpha/2} \sqrt{\frac{MSE}{IK}}.$$

Block design and $2^3$ factorial experiments use similar techniques as shown in one-way or two-way factor analysis. Please study these two sections for more information.

If data are given, you may use lm(), anova() and confint(), etc. in R to perform ANOVA tests. Please refer to R documentation for more info.

Chapter 7

You will always need to check proper conditions so that the following formulae can be used.

Correlation coefficient is
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_X}\right)\left(\frac{y_i - \bar{y}}{s_Y}\right) = \texttt{cor(x, y)}.$$

The fitted regression line is
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x,$$
where
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = r\,\frac{s_y}{s_x}$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$

The coefficient of determination is
$$r^2 = \frac{SSR}{SSTotal} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
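The correlation and regression formulas above can be verified against cor() and lm() on a small data set. The sketch below uses made-up paired values, so the numbers carry no meaning beyond showing that the hand formulas and the built-in functions agree.

```r
# Made-up paired data (for illustration only)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
n <- length(x)

# Correlation coefficient from standardized values, compared with cor()
r_formula <- sum(((x - mean(x)) / sd(x)) * ((y - mean(y)) / sd(y))) / (n - 1)
r_builtin <- cor(x, y)

# Slope (two equivalent forms) and intercept of the fitted line
b1     <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b1_alt <- r_builtin * sd(y) / sd(x)
b0     <- mean(y) - b1 * mean(x)

# Coefficient of determination from the sums of squares
y_hat   <- b0 + b1 * x
SSTotal <- sum((y - mean(y))^2)
SSE     <- sum((y - y_hat)^2)
r2      <- (SSTotal - SSE) / SSTotal

# Cross-check against lm()
fit <- lm(y ~ x)
print(coef(fit))
print(c(b0 = b0, b1 = b1, r2 = r2, r_squared = r_builtin^2))
```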
is
$$\hat{y} \pm t_{n-2,\alpha/2}\, s_{pred}$$

mean(x)      sd(x)
var(x)       median(x)
min(x)       max(x)
quantile(x)  summary(x)
hist(x)      boxplot(x)
fit <- lm(Resp ~ FactorA + FactorB, data)
anova(fit)

If functions that you would like to use aren't provided in this list, please feel free to search for them in the lecture notes or R documentation.
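As a sketch of how a few of the listed functions fit together, the example below computes the sample variance, the quartiles, and the 1.5IQR fences, then runs the two-factor fit from the list. The sample values and the simulated data frame are assumptions made up for illustration. Note that quantile() uses R's default quantile method, which may differ slightly from a hand calculation.

```r
# Made-up sample (for illustration only)
x <- c(12, 15, 11, 18, 14, 13, 16, 40, 15, 14)

# Sample variance and standard deviation: sd(x) is sqrt(var(x))
s2 <- var(x)
s  <- sd(x)

# Quartiles, IQR, and the 1.5IQR fences
q      <- quantile(x, c(0.25, 0.75))
iqr    <- unname(q[2] - q[1])
fences <- c(lower = unname(q[1]) - 1.5 * iqr,
            upper = unname(q[2]) + 1.5 * iqr)

# Observations outside the fences are flagged as potential outliers
outliers <- x[x < fences["lower"] | x > fences["upper"]]

# Two-factor additive fit, following fit <- lm(Resp ~ FactorA + FactorB, data)
set.seed(2)
data <- expand.grid(FactorA = factor(c("a1", "a2")),
                    FactorB = factor(c("b1", "b2", "b3")),
                    rep = 1:3)
data$Resp <- rnorm(nrow(data), mean = 10)   # simulated response
fit <- lm(Resp ~ FactorA + FactorB, data)

print(fences)
print(outliers)
print(anova(fit))
```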