Stats Cheat Sheets
Sample covariance and correlation:

s_xy = (1/(n-1)) Σ_{i=1}^{n} (x_i - x̄)(y_i - ȳ)        r_xy = s_xy / (s_x s_y)

Covariance and correlation summarize how strong the linear relationship is between two variables.
The covariance and the correlation always have the same sign.
Facts: -1 ≤ r_xy ≤ 1. The closer r is to 1, the stronger the linear relationship with a positive slope: when one goes up, the other tends to go up. The closer r is to -1, the stronger the linear relationship with a negative slope: when one goes up, the other tends to go down. The closer r is to zero, the weaker the linear relationship.
Linearly Related Variables: y = c0 + c1*x
The variable y is a linear function of the variable x. c0: the intercept; c1: the slope.
We think of the c's as constants (fixed numbers) while x and y vary.
ȳ = c0 + c1*x̄
s_y = |c1| * s_x
s_y² = c1² * s_x²
Portfolio: the c's are the portfolio weights, the x's are the returns on the individual assets, and y is the return on the portfolio:
y = c0 + c1*x1 + c2*x2 + ... + ck*xk
ȳ = c0 + c1*x̄1 + c2*x̄2 + ... + ck*x̄k    This is the avg. return on the portfolio.
For two assets:
s_y² = c1²*s_x1² + c2²*s_x2² + 2*c1*c2*s_x1x2
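These portfolio formulas can be checked numerically. A minimal sketch with NumPy; the weights and return series here are made-up illustrative numbers, not from the notes:

```python
import numpy as np

# Two-asset portfolio: weights c, per-period asset returns x (rows = periods).
c = np.array([0.6, 0.4])
x = np.array([[0.05, 0.02],
              [0.10, 0.03],
              [0.01, 0.04],
              [-0.02, 0.01]])

y = x @ c                               # portfolio return each period
S = np.cov(x, rowvar=False, ddof=1)     # sample covariance matrix of the assets

# The two routes to the portfolio variance agree:
# s_y^2 = c1^2*s1^2 + c2^2*s2^2 + 2*c1*c2*s12, i.e. c' S c
print(np.var(y, ddof=1))
print(c @ S @ c)
```

The matrix form c' S c is the same formula, just written so it extends to any number of assets.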
A Discrete Random Variable is a numerical quantity we are unsure about. We quantify our uncertainty by (1) listing the numbers it could turn out to be and (2) assigning each number a probability.
Probabilities are numbers between 0 and 1 that SUM to one.
P(X=x), p_X(x), and p(x) all mean the same thing.
Ex. A fair die:

x    | 1    2    3    4    5    6
p(x) | 1/6  1/6  1/6  1/6  1/6  1/6
Mean (expected value): μ_X = Σ_{all x} p(x)*x
  — each value of X weighted by its probability; the expected prediction of X.
Variance: σ_X² = Σ_{all x} p(x)*(x - μ_X)²
  — the weighted average of the squared prediction error.
Standard deviation (the notes call it the standard error): σ_X = sqrt(σ_X²)
Example:

s    | 1     2    3    4
p(s) | .095  .23  .44  .235

E(s) = 1(.095) + 2(.23) + 3(.44) + 4(.235) = 2.815
Var(s) = .095(1-2.815)² + .23(2-2.815)² + .44(3-2.815)² + .235(4-2.815)² = .811
Standard error around the expected value 2.815: sqrt(.811) ≈ .90
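The weighted-average calculations above are short enough to spell out in code; a sketch using the table's values:

```python
# Discrete mean and variance from the table above (values from the notes).
values = [1, 2, 3, 4]
probs = [0.095, 0.23, 0.44, 0.235]

mean = sum(p * x for p, x in zip(probs, values))                # probability-weighted average
var = sum(p * (x - mean) ** 2 for p, x in zip(probs, values))   # weighted squared prediction error

print(round(mean, 3))   # 2.815
print(round(var, 3))    # 0.811
```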
Example tree for two random variables S and E (E = 1 "up", E = 0 "down"), giving the joint p(S,E):

P(E=1) = 0.7 (up),  P(E=0) = 0.3 (down)
Given E=1: P(S=4)=0.25, P(S=3)=0.5, P(S=2)=0.2, P(S=1)=0.05
Given E=0: P(S=4)=0.2,  P(S=3)=0.3, P(S=2)=0.3, P(S=1)=0.2

Joint probabilities (s,e) = conditional times marginal:
(4,1): .25*.7 = .175     (4,0): .2*.3 = .06
(3,1): .5*.7  = .35      (3,0): .3*.3 = .09
(2,1): .2*.7  = .14      (2,0): .3*.3 = .09
(1,1): .05*.7 = .035     (1,0): .2*.3 = .06
The conditional probability that a random variable Y turns out to be y given that X is x:
P(Y=y|X=x) or p(y|x)
Conditional = Joint/Marginal
P(S=4|E=1) = p(S=4 and E=1)/p(E=1) = .175/.7 = .25
P(E=1|S=4) = p(S=4 and E=1)/p(S=4) = .175/.235 = .745
Example Table (joint distribution of S and E, with marginals):

        E=0    E=1    | p(s)
S=1     .06    .035   | .095
S=2     .09    .14    | .23
S=3     .09    .35    | .44
S=4     .06    .175   | .235
        .3     .7     | 1

Conditional from joint and marginal: p(x|y) = p(y,x)/p(y)
Marginal from conditionals: p(y) = Σ_x p(x)*p(y|x)
Bayes theorem: p(x|y) = p(x)*p(y|x) / Σ_x p(x)*p(y|x)
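The Conditional = Joint/Marginal arithmetic can be done straight from a joint table; a minimal sketch using the S,E numbers:

```python
# Joint distribution of (S, E) from the notes' table.
joint = {  # (s, e): probability
    (1, 0): 0.06, (1, 1): 0.035,
    (2, 0): 0.09, (2, 1): 0.14,
    (3, 0): 0.09, (3, 1): 0.35,
    (4, 0): 0.06, (4, 1): 0.175,
}

# Marginals: sum the joint over the other variable.
p_e1 = sum(p for (s, e), p in joint.items() if e == 1)   # P(E=1) = .7
p_s4 = sum(p for (s, e), p in joint.items() if s == 4)   # P(S=4) = .235

# Conditional = Joint / Marginal
print(round(joint[(4, 1)] / p_e1, 3))   # P(S=4|E=1) = .175/.7 = 0.25
print(round(joint[(4, 1)] / p_s4, 3))   # P(E=1|S=4) = .175/.235 ≈ 0.745
```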
Sampling without Replacement
10 voters, 5 of each type; Y_i = 1 if the ith draw is type 1, 0 otherwise. Each draw removes a voter, so the conditional probabilities change as we go. The joint probability of an outcome multiplies the per-draw conditionals, counting how many of the drawn type remain:

Outcome (y1,y2,y3)   P(Y1)    P(Y2|Y1)   P(Y3|Y1,Y2)   p(y1,y2,y3)
(0,0,0)              5 of 10  4 of 9     3 of 8        0.0833
(0,0,1)              5 of 10  4 of 9     5 of 8        0.1389
(0,1,0)              5 of 10  5 of 9     4 of 8        0.1389
(0,1,1)              5 of 10  5 of 9     4 of 8        0.1389
(1,0,0)              5 of 10  5 of 9     4 of 8        0.1389
(1,0,1)              5 of 10  5 of 9     4 of 8        0.1389
(1,1,0)              5 of 10  4 of 9     5 of 8        0.1389
(1,1,1)              5 of 10  4 of 9     3 of 8        0.0833

e.g. p(1,1,1) = (5/10)(4/9)(3/8) = 0.0833. The eight joint probabilities sum to 1.
iid
Independent: the joint is the product of the marginals.
Identically Distributed: the marginals are all the same.
Sampling with Replacement
10 Voters: 6 Dems, 4 Rep. Y = 1 if Dem, 0 if Rep.
We choose 3 randomly, with replacement. What is P(Y1=1, Y2=1, Y3=1) = p(1,1,1)?
Even though all 3 chosen were Dems, each time every voter had an equal chance of being chosen, so
p(1,1,1) = .6 * .6 * .6 = .216
Yi ~ Bernoulli(.6), iid.
We also assume iid for a small sample from a large group, even without replacement.
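The with-replacement claim p(1,1,1) = .6³ = .216 can be checked by simulation; a minimal sketch (the seed and trial count are arbitrary choices, not from the notes):

```python
import random

# 6 Dems (1) and 4 Reps (0); draw 3 with replacement, many times.
random.seed(1)
voters = [1] * 6 + [0] * 4

trials = 200_000
hits = sum(
    all(random.choice(voters) == 1 for _ in range(3))  # all three draws Dem?
    for _ in range(trials)
)
print(hits / trials)   # close to .6 * .6 * .6 = 0.216
```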
Bernoulli(p):
p(0) = 1-p,  p(1) = p

Joint table for two iid Bernoulli draws X1, X2:

        X2=0      X2=1
X1=0    (1-p)²    (1-p)p
X1=1    p(1-p)    p²

The probability of a flip, 2p(1-p), is the probability that the next draw differs from the previous one.
1.875-.4(.125)+.6(.125)=1.9
Continuous Distribution
We give the probability of intervals, e.g. P(a < X < b) = .1.
The probability of an interval is the area under the pdf (curve). {Probability Density Function.}
Ex. For a standard normal, the probability that X is in the interval (0,2) is .477 (mean to 2 std above the mean).
Uniform Distribution
Ex. The pdf is flat over the interval (0,.5); therefore the height must be 2 so the total area equals 1, and every value is equally likely. X ~ U(a,b): for a < x < b the height is 1/(b-a).
Normal Distribution
X ~ N(μ, σ²)      z = (x - μ)/σ
P(μ - 2σ < X < μ + 2σ) ≈ .95      P(μ - σ < X < μ + σ) ≈ .68
P(0 < Z < 1) = .34
P(-1 < Z < 1) = .68
P(-2 < Z < 2) = .954
P(-1.96 < Z < 1.96) = .95
P(-3 < Z < 3) = .9974
Ex. R ~ N(.1, .01): mean .1, variance .01, SD .1. The 95% interval is .1 ± 2(.1) = (-.1, .3).
The area under the curve over an interval is the probability of that interval.
Rt ~ N(.01, .04²) iid: the 95% interval is .01 ± 2(.04) = .01 ± .08.
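The normal probabilities quoted above can be reproduced with only the standard library, since the normal cdf can be written with the error function; a sketch:

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf: P(Z <= z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(1) - phi(-1), 3))        # P(-1 < Z < 1)  ≈ 0.683
print(round(phi(2) - phi(-2), 3))        # P(-2 < Z < 2)  ≈ 0.954
print(round(phi(1.96) - phi(-1.96), 3))  # ≈ 0.95

# R ~ N(.1, .01): probability of the mean ± 2 SD interval (-.1, .3)
mu, sd = 0.1, 0.1
print(round(phi((0.3 - mu) / sd) - phi((-0.1 - mu) / sd), 3))  # ≈ 0.954
```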
Cumulative Distribution Function (cdf): F_X(x) = P(X ≤ x)
P(a < X ≤ b) = P(X ≤ b) - P(X ≤ a) = F_X(b) - F_X(a)
For Z (standard normal) we have: P(-1 < Z < 1) = F(1) - F(-1) = .84 - .16 = .68
The Expected Value is the Long Run Value, or the Expectation as a Long Run Average.
We can interpret Expectation as the long run average from iid draws. (Sample Mean)
Ex. Two coin tosses, x = number of heads (2 Tails, 1 Head 1 Tail, 2 Heads):

x     | 0    1   2
Pr(x) | .25  .5  .25

E(X) = .25(0) + .5(1) + .25(2) = 1

X̄ = (1/n) Σ_{i=1}^{n} X_i ≈ E(X)   Expectation as long run average. (same as the mean to me)
This works for X continuous and discrete.
Variance Formula
Theoretical: Var(X) = σ² = Σ p(x)(x - μ)²
Ex. with the coin toss: Var(X) = .25(0-1)² + .5(1-1)² + .25(2-1)² = .5
As a long run average: Var(X) ≈ (1/n) Σ_{i=1}^{n} (X_i - μ)²
Sample Variance (μ unknown, use X̄): s² = (1/(n-1)) Σ_{i=1}^{n} (X_i - X̄)²
  X_i: observation,  μ: mean,  X̄: sample mean
Expected Value and Variance for Continuous Random Variables
X̄ = (1/n) Σ_{i=1}^{n} X_i ≈ E(X)
whether X is discrete or continuous, for iid X_i all having the same distribution as X.
Same for Variance.
Ex. 500 iid N(0,1) draws: E(Z) = 0, Var(Z) = 1, Z ~ N(0,1)
Var(X) = E((X - μ)²),  E(X) = μ
σ_y = |c1| σ_x
Example: C = 20000 + 2000t, with E(t) = 3.2 and Var(t) = 1.06.
E(C) = 20000 + 2000 E(t) = 20000 + 2000(3.2) = 26400
Var(C) = 2000² Var(t) = 4,240,000
σ_C = sqrt(4,240,000) = 2000 * sqrt(1.06) ≈ 2059
Another Example of Random Variables and Formulas
L ~ Bernoulli(p). I place a $10 bet that the Leafs will win; if they do I get $20. W = -10 + 20L
W = -10 + 20(0) = -10 if they lose; W = -10 + 20(1) = 10 if they win.
Now suppose p = .5. Then E(L) = .5, Var(L) = .5(1-.5) = .25 (same as p(1-p)),
σ_L = .5 (same as sqrt(p(1-p)))
E(W) = -10 + 20E(L) = -10 + 20(.5) = -10 + 10 = 0
Var(W) = 20² Var(L) = 20²(.25) = 100
σ_W = 20 σ_L = 20(.5) = 10
The variance always drops additive constants.
If you scale X by a constant, the expected value scales by the same constant; remember to square the constant in the variance equation.
Simple Special Cases
Assume Y = a + X:
  E(Y) = E(a + X) = a + E(X)
  Var(Y) = Var(X)
Assume Y = aX:
  E(Y) = E(aX) = aE(X)
  Var(Y) = Var(aX) = a² Var(X)
  sd(Y) = sd(aX) = |a| sd(X)
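These rules can be checked by long-run averaging; a minimal simulation sketch for the betting example W = -10 + 20L with p = .5 (seed and sample size are arbitrary choices):

```python
import random

# Simulate many bets W = -10 + 20L, L ~ Bernoulli(.5);
# long-run mean should approach E(W) = 0 and variance Var(W) = 100.
random.seed(0)
n = 1_000_000
draws = [-10 + 20 * (random.random() < 0.5) for _ in range(n)]

mean = sum(draws) / n
var = sum((w - mean) ** 2 for w in draws) / (n - 1)   # sample variance

print(round(mean, 2))   # near 0
print(round(var, 2))    # near 100
```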
Formula: X ~ N(μ, σ²):
  E(X) = μ
  Var(X) = σ²
  sd(X) = σ
Covariance for Pairs of Random Variables:
Cov(X,Y) = σ_XY = Σ_{all (x,y)} p(x,y)(x - μ_X)(y - μ_Y)

Joint distribution of two returns X and Y (each has mean .1; one std is .05):

            Y=.05   Y=.15
X=.05       .4      .1     | .5
X=.15       .1      .4     | .5
            .5      .5

μ_X = .1, σ_X = .05, μ_Y = .1, σ_Y = .05

Cov(X,Y) = .4(.05-.1)(.05-.1) + .1(.05-.1)(.15-.1) + .1(.15-.1)(.05-.1) + .4(.15-.1)(.15-.1) = .0015
Our intuition tells us there is an 80% chance that X and Y are both above or both below their means together.

Correlation for Pairs of Random Variables (discrete or continuous):
ρ_XY = σ_XY / (σ_X σ_Y)    (cov./(std*std)),  -1 ≤ ρ ≤ 1. The closer to 1, the closer the points lie to a line with positive slope, and vice versa.
ρ_XY = .0015/(.05*.05) = .6
Another example: notice that for independent (iid) variables Cov and Corr are ZERO.

        Y=0    Y=1
X=0     .25    .25   | .5
X=1     .25    .25   | .5
        .5     .5    | 1

Cov = .25(0-.5)(0-.5) + .25(0-.5)(1-.5) + .25(1-.5)(0-.5) + .25(1-.5)(1-.5) = 0
ρ = 0/(σ_X σ_Y) = 0: zero over anything is zero.
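The covariance-from-a-joint-table recipe above is mechanical; a sketch reproducing the returns example:

```python
# Cov(X,Y) = sum over all (x,y) of p(x,y)*(x - mu_x)*(y - mu_y),
# using the returns joint table from the notes.
joint = {  # (x, y): p(x, y)
    (0.05, 0.05): 0.4, (0.05, 0.15): 0.1,
    (0.15, 0.05): 0.1, (0.15, 0.15): 0.4,
}

mx = sum(p * x for (x, y), p in joint.items())
my = sum(p * y for (x, y), p in joint.items())
cov = sum(p * (x - mx) * (y - my) for (x, y), p in joint.items())
sx = sum(p * (x - mx) ** 2 for (x, y), p in joint.items()) ** 0.5
sy = sum(p * (y - my) ** 2 for (x, y), p in joint.items()) ** 0.5

print(round(cov, 4))              # 0.0015
print(round(cov / (sx * sy), 2))  # correlation 0.6
```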
Linear combinations:
Y = c0 + c1 X1 + c2 X2
μ_Y = c0 + c1 μ_1 + c2 μ_2
σ_Y² = c1² σ_X1² + c2² σ_X2² + 2 c1 c2 σ_X1X2    Variance of a linear combination
NOTICE we dropped the constant: the constant does NOT vary.
Example:
μ_X1 = .05, μ_X2 = .1, μ_X3 = .15,  σ²_X1 = .01, σ²_X2 = .009, σ²_X3 = .008,
ρ_X1X2 = .3, ρ_X1X3 = .2, ρ_X2X3 = .2
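The notes give the three-asset parameters but stop before computing a portfolio. A sketch finishing the calculation with NumPy; the equal weights (1/3 each) are an assumption for illustration, since the notes don't give weights:

```python
import numpy as np

# Parameters from the example above.
mu = np.array([0.05, 0.10, 0.15])
var = np.array([0.01, 0.009, 0.008])
sd = np.sqrt(var)
rho = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.2],
                [0.2, 0.2, 1.0]])
cov = rho * np.outer(sd, sd)   # Cov_ij = rho_ij * sd_i * sd_j
w = np.array([1/3, 1/3, 1/3])  # assumed equal weights

port_mean = w @ mu             # c1*mu1 + c2*mu2 + c3*mu3
port_var = w @ cov @ w         # squared weights on variances plus all the 2*ci*cj*cov_ij terms
print(round(port_mean, 3))     # 0.1
print(round(port_var, 4))      # 0.0044
```

Note the positive correlations make the portfolio variance larger than it would be with independent assets.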
Special Cases
E(X1 + X2) = E(X1) + E(X2)
E(X1 - X2) = E(X1) - E(X2)
Var(X1 + X2) = Var(X1) + Var(X2) + 2 cov(X1, X2)
Var(X1 - X2) = Var(X1) + Var(X2) - 2 cov(X1, X2)
If the correlation is 0:
Var(X1 + X2) = Var(X1) + Var(X2)
Var(X1 - X2) = Var(X1) + Var(X2)
Example:
I am about to toss 100 coins. Xi = 1 if the ith is a head, 0 else (same as lots of voters).
Y = X1 + X2 + ... + X100
E(Y) = E(X1) + E(X2) + ... + E(X100) = 100(.5) = 50
Var(Y) = Var(X1) + Var(X2) + ... + Var(X100) = 100(.25) = 25    (each Var is p(1-p))
σ_Y = 5
Since Y is approximately normal, mean ± 2σ gives 50 ± 10.
Binomial notation: Y ~ B(n, p), n = trials, p = probability. Ex: Y ~ B(20, .84).
Special case:
Suppose the probability of a defect is .01 and you make 100 parts. What is the probability they are all good? .99^100 = .366 (scary).
Now suppose you are about to make 100 parts and defects are iid Bernoulli(.1):
n = 100, p = .1
E(Y) = np = 100(.1) = 10
Var(Y) = np(1-p) = 100(.1)(1-.1) = 9, so σ = 3
Y ~ N(10, 3²) approximately (mean and variance as above).
Therefore 10 ± 2σ = 10 ± 6 = (4, 16) is a 95% interval for the number of defects.
Because of the Central Limit Theorem the curve is approximately normal, so we have a pretty good idea of the defect interval.
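The normal curve here is an approximation; the exact binomial numbers can be checked with the standard library, a sketch:

```python
import math

# P(all 100 parts good) when each is good with probability .99:
print(round(0.99 ** 100, 3))   # 0.366

# Exact probability that Y ~ Binomial(100, .1) lands in the notes'
# approximate 95% interval (4, 16), i.e. P(4 <= Y <= 16).
n, p = 100, 0.1

def binom_pmf(k: int) -> float:
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

prob = sum(binom_pmf(k) for k in range(4, 17))
print(round(prob, 3))   # close to .95, as the normal approximation suggests
```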
Standardization
X ~ N(μ, σ²): the z value corresponding to an x value is z = (x - μ)/σ
Z ~ N(0, 1)
Sample proportion p̂:
E(p̂) = p
Var(p̂) = p(1-p)/n
p̂ ~ N(p, p(1-p)/n)   approximately

95%: P( p - 2 sqrt(p(1-p)/n) ≤ p̂ ≤ p + 2 sqrt(p(1-p)/n) ) ≈ .95
95% CI: p̂ ± 2 sqrt(p̂(1-p̂)/n)

Ex. 48% support in a poll of 700:
2 * sqrt(.48(1-.48)/700) = .038   (approx 4 pts.)
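The poll margin is one line of arithmetic; a sketch:

```python
# 95% CI for a proportion: p_hat +/- 2*sqrt(p_hat*(1 - p_hat)/n),
# using the poll example: 48% support, n = 700.
p_hat, n = 0.48, 700
half_width = 2 * (p_hat * (1 - p_hat) / n) ** 0.5

print(round(half_width, 3))   # 0.038, i.e. about 4 points
print(round(p_hat - half_width, 3), round(p_hat + half_width, 3))   # (0.442, 0.518)
```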
Example:
We have 200 parts, 40 of which are defective.
Our model is X1, ..., Xn ~ Bernoulli(p) iid.
p̂ = 40/200 = .2
2 * sqrt(.2(1-.2)/200) = .057
The 95% CI is .2 ± .057, which is pretty big.
Sample mean X̄ of n iid draws:
E(X̄) = μ
Var(X̄) = σ²/n
X̄ ~ N(μ, σ²/n)
se(X̄) = s_x / sqrt(n)    Standard error of the sample mean
Example: Cereal
Sample mean 345, SD 15.
se(X̄) = s_x/sqrt(n) = 15/sqrt(n) = .67
95% CI: 345 ± 2(.67) = 345 ± 1.34
If we claim 360 we are not on target: 360 is far outside the interval.
Now suppose instead we had 10 observations, a sample deviation of 14.6, and a sample mean of 348.5:
se(X̄) = 14.6/sqrt(10) = 4.6
With small n use the t value: tval = tinv(.05, 9) = 2.262
(95% CI → 100 - 95 = .05, and 10 observations → 10 - 1 = 9 degrees of freedom)
95% CI: 348.5 ± 2.262(4.6) = 348.5 ± 10.4
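A sketch of the small-sample interval; the t critical value 2.262 is taken from the notes' tinv(.05, 9) rather than computed, since the standard library has no t quantile:

```python
import math

# Cereal follow-up: n = 10, sample mean 348.5, sample sd 14.6.
n, mean, sd = 10, 348.5, 14.6
t_crit = 2.262                      # t(.05, df=9), from the notes

se = sd / math.sqrt(n)              # standard error of the sample mean
lo, hi = mean - t_crit * se, mean + t_crit * se

print(round(se, 1))                 # 4.6
print(round(lo, 1), round(hi, 1))   # roughly 348.5 +/- 10.4
```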
se( X )
p(1 p )
)
n
.1(1 .1)
N (.1,
)
100
N (.1,.032 )
Now we need to standardize these values to see what they really represent.
p .1 .22 .1 x
True value
Proposed value
So we REJECT
p N ( p,
Ho: p = p0
Ha: p ≠ p0
z = (p̂ - p0) / sqrt(p0(1-p0)/n)
This is the same "x minus mean over standard deviation" formula: p̂ is the data, p0 is the claim, and sqrt(p0(1-p0)/n) is the standard deviation of p̂ under the null.
We reject if the test statistic is bigger than 2 or smaller than -2; that will happen 5% of the time when the null is true (i.e. the probability of making that mistake is .05).
Example: Daily prices of GE
The stock price went down 17 of 40 times: 17/40 = .425, so estimated p̂ = .425 for a down day.
Claim: it has an equal chance of going up or down, i.e. Bernoulli(.5). WE TEST p = .5.
The test statistic:
z = (.425 - .5) / sqrt(.5(1-.5)/40) = -.075/.08 = -.94
|z| = .94 < 2, so we FAIL to REJECT: the claim is plausible.
P-Value
The p-value for the null hypothesis is P(Z ≤ -|z| or Z ≥ |z|) for Z ~ N(0,1).
The p-value is the probability of getting a test statistic z as far out or farther than the one we got.
So for the test statistic z of 4 we got in the defects question, the p-value is .0000634: we would REJECT.
If the test statistic is less than |2|, the p-value is greater than .05: Fail to Reject.
If the test statistic is greater than |2|, the p-value is less than .05: Reject.
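Both worked tests above fit one small routine; a sketch computing the z statistic and its two-sided p-value with the standard library:

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_test(p_hat: float, p0: float, n: int):
    """z statistic and two-sided p-value for Ho: p = p0."""
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

# GE example: 17 downs in 40 days, testing p = .5.
z, pv = z_test(17 / 40, 0.5, 40)
print(round(z, 2), round(pv, 2))   # z ≈ -0.95, p-value ≈ 0.34 -> fail to reject

# Defects question: p_hat = .22, claim p = .1, n = 100.
z2, pv2 = z_test(0.22, 0.1, 100)
print(round(z2, 1), pv2)           # z = 4.0, p-value about .0000633 -> reject
```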
The basic formula for confidence interval estimation:
X̄ ± 2 se(X̄),  se(X̄) = s_x / sqrt(n)
or, for Bernoulli(p):
p̂ ± 2 se(p̂),  se(p̂) = sqrt(p̂(1-p̂)/n)