Stats Cheat Sheets

The document discusses key concepts in statistics including:
- How to calculate the mean, variance, and standard deviation of a sample of data.
- How to calculate the covariance and correlation between two variables to assess their linear relationship.
- How discrete and continuous random variables are defined, and how their key properties like expected value and variance are calculated.
- Key probability distributions like the normal, uniform, and Bernoulli distributions.
- Concepts of independence, identical distribution, and how to calculate probabilities of events for sampling with and without replacement.

Uploaded by Claudia Yang


The sample mean of X1, ..., Xn is just the average of all the Xs:

    X̄ = (1/n) Σ Xi   (i = 1..n)

The sample variance:

    sx² = (1/(n-1)) Σ (Xi - X̄)²

The sample standard deviation:

    sx = sqrt(sx²)

The sample covariance:

    sxy = (1/(n-1)) Σ (Xi - X̄)(Yi - Ȳ)

The sample correlation:

    rxy = sxy / (sx sy)
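As a quick sketch, the sample formulas above can be computed directly in Python (the data here are made up for illustration, not from the sheet):

```python
# Hypothetical data for illustration only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 5.0, 9.0]
n = len(x)

xbar = sum(x) / n                                   # sample mean
s2x = sum((xi - xbar) ** 2 for xi in x) / (n - 1)   # sample variance (n-1)
sx = s2x ** 0.5                                     # sample standard deviation

ybar = sum(y) / n
s2y = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
sy = s2y ** 0.5

# sample covariance and correlation
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
rxy = sxy / (sx * sy)
```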
Covariance and correlation summarize how strong a linear relationship there is between two variables. The covariance and the correlation always have the same sign.
Facts: -1 ≤ rxy ≤ 1. The closer r is to 1, the stronger the linear relationship, with a positive slope: when one goes up, the other tends to go up. The closer r is to -1, the stronger the linear relationship, with a negative slope: when one goes up, the other tends to go down. The closer r is to zero, the weaker the linear relationship.
Linearly Related Variables: y = c0 + c1 x
The variable y is a linear function of the variable x. c0: the intercept; c1: the slope.
We think of the cs as constants (fixed numbers) while x and y vary.

    ȳ = c0 + c1 x̄
    sy = |c1| sx
    sy² = c1² sx²

Portfolio: the cs are portfolio weights, the xs are the returns on the assets, and the output y is the return on the portfolio.

    y = c0 + c1 x1 + c2 x2 + ... + ck xk
    ȳ = c0 + c1 x̄1 + c2 x̄2 + ... + ck x̄k   This is the avg. return on the portfolio

    Rp = w1 x1 + w2 x2 + ... + wm xm = Σ wi xi   (xi = return on an asset)

    sy² = c1² sx1² + c2² sx2² + 2 c1 c2 sx1x2   Variance of a linear combination (two inputs)

NOTICE: We dropped the constant. The constant does NOT vary.

    sy² = c1² sx1² + c2² sx2² + c3² sx3² + 2 c1 c2 sx1x2 + 2 c1 c3 sx1x3 + 2 c2 c3 sx2x3   Multiple inputs
A Discrete Random Variable is a numerical quantity we are unsure about. We quantify our uncertainty by 1. listing the numbers it could turn out to be and 2. assigning to each number a probability.
Probabilities are numbers between 0 and 1 that SUM to one.
P(X=x), pX(x), and p(x) all mean the same thing.

Probabilities of Subsets of Outcomes

    Pr(a < X < b) = Σ p(x), summed over all x with a < x < b

Example, dice:

    x      1    2    3    4    5    6
    p(x)  1/6  1/6  1/6  1/6  1/6  1/6

Ex. Pr(2 < X < 5) = p(3) + p(4) = 2/6 = 1/3

The Expected Value of a Discrete Random Variable:

    E(X) = Σ p(x) x, summed over all x

Weight the possible values by how likely they are. E(X) is written μX, the expected value of X.
Example:

    s     1     2    3     4
    p(s) .095  .23  .44  .235

    E(S) = .095*1 + .23*2 + .44*3 + .235*4 = 2.815   The expected value of S


The Variance of a Discrete Random Variable:

    Var(X) = σX² = Σ p(x)(x - μX)², summed over all x

Each squared prediction error (x - μX)², the error from the expected prediction of x, is weighted by its probability: the variance is the weighted average of the squared prediction error.
Example (the same values s and weights p(s) as above):

    s     1     2    3     4
    p(s) .095  .23  .44  .235

    Var(S) = .095*(1-2.815)² + .23*(2-2.815)² + .44*(3-2.815)² + .235*(4-2.815)² = .811

Each term measures the error from the expected value 2.815.
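The expected-value and variance tables above can be checked directly; a minimal sketch using the sheet's own s and p(s):

```python
# The discrete distribution from the example above.
s_vals = [1, 2, 3, 4]
p_vals = [0.095, 0.23, 0.44, 0.235]

# E(S): weight each value by its probability.
mu = sum(p * s for p, s in zip(p_vals, s_vals))

# Var(S): weighted average of squared prediction error.
var = sum(p * (s - mu) ** 2 for p, s in zip(p_vals, s_vals))
sd = var ** 0.5
```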

Standard Deviation of a Discrete Random Variable: σX = sqrt(σX²)

Bernoulli Distribution: X ~ Bernoulli(p)

If X is a Bernoulli random variable we have:

    x     1    0
    p(x)  p   1-p

Coin toss: X = 1 if heads, 0 if tails, so X ~ Bernoulli(.5).
The Mean and the Variance of a Bernoulli(p):
E(X) = p
Var(X) = p(1-p)
Conditional Distribution
p(Y|X=1): given what we know about X, what is Y?
Marginal Distribution: the probability of one variable alone, p(X) or p(Y).
Joint Distribution: p(X,Y), the probability of each (x,y) pair. It equals the product of the marginals only when X and Y are independent.

Economy and Sales

E is the economy (1 = up, 0 = down) and S is sales, with p(E=1) = .7 and p(E=0) = .3.

Given E=1: p(S=4|E=1) = .25, p(S=3|E=1) = .5, p(S=2|E=1) = .2, p(S=1|E=1) = .05
Given E=0: p(S=4|E=0) = .2, p(S=3|E=0) = .3, p(S=2|E=0) = .3, p(S=1|E=0) = .2

Multiplying along each branch of the tree (Joint = Marginal * Conditional):

    p(S=4 and E=1) = .7*.25 = .175
    p(S=3 and E=1) = .7*.5  = .35
    p(S=2 and E=1) = .7*.2  = .14
    p(S=1 and E=1) = .7*.05 = .035
    p(S=4 and E=0) = .3*.2  = .06
    p(S=3 and E=0) = .3*.3  = .09
    p(S=2 and E=0) = .3*.3  = .09
    p(S=1 and E=0) = .3*.2  = .06

The marginal probability p(S=4) is .175+.06 = .235; likewise p(S=3) is .35+.09 = .44.

The conditional probability that a random variable Y turns out to be y given that X is x is written P(Y=y|X=x) or p(y|x).

Conditional = Joint / Marginal
P(S=4|E=1) = p(S=4 and E=1)/p(E=1) = .175/.7 = .25
P(E=1|S=4) = p(S=4 and E=1)/p(S=4) = .175/.235 = .745
Example Table (the joint p(S,E) with its marginals):

    S \ E    0      1     p(S)
    1       .06   .035   .095
    2       .09   .14    .23
    3       .09   .35    .44
    4       .06   .175   .235
    p(E)    .3    .7     1

    p(x|y) = p(y,x) / p(y)

Bayes theorem:

    p(x|y) = p(x) p(y|x) / Σx p(x) p(y|x)
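A sketch of the marginal and conditional calculations above, using the sales/economy joint table:

```python
# Joint distribution p(S, E) from the example above, keyed by (s, e).
joint = {(4, 1): 0.175, (3, 1): 0.35, (2, 1): 0.14, (1, 1): 0.035,
         (4, 0): 0.06,  (3, 0): 0.09, (2, 0): 0.09, (1, 0): 0.06}

# Marginals: sum the joint over the other variable.
p_S = {s: sum(p for (si, e), p in joint.items() if si == s) for s in (1, 2, 3, 4)}
p_E = {e: sum(p for (s, ei), p in joint.items() if ei == e) for e in (0, 1)}

# Conditional = Joint / Marginal
p_S4_given_E1 = joint[(4, 1)] / p_E[1]   # .175/.7
p_E1_given_S4 = joint[(4, 1)] / p_S[4]   # .175/.235
```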

Sampling without replacement

Suppose we have 10 voters, 5 Reps and 5 Dems, and we sample three voters without replacement. Let Yi = 1 if the ith voter chosen is a Dem and 0 if Rep, for i = 1, 2, 3.
What is P(Y1=1, Y2=1, Y3=1) = p(1,1,1)?
Ans: (5/10)(4/9)(3/8) = 1/12 ≈ .083
As usual, let p(y1,y2,y3) denote P(Y1=y1, Y2=y2, Y3=y3).
Give the joint distribution of (Y1, Y2, Y3) by giving p(y1,y2,y3) for all possible (y1,y2,y3):

    (y1,y2,y3)   Y1         Y2        Y3        p(y1,y2,y3)
    (0,0,0)      5 of 10    4 of 9    3 of 8    60/720  ≈ 0.0833
    (0,0,1)      5 of 10    4 of 9    5 of 8    100/720 ≈ 0.1389
    (0,1,0)      5 of 10    5 of 9    4 of 8    100/720 ≈ 0.1389
    (0,1,1)      5 of 10    5 of 9    4 of 8    100/720 ≈ 0.1389
    (1,0,0)      5 of 10    5 of 9    4 of 8    100/720 ≈ 0.1389
    (1,0,1)      5 of 10    5 of 9    4 of 8    100/720 ≈ 0.1389
    (1,1,0)      5 of 10    4 of 9    5 of 8    100/720 ≈ 0.1389
    (1,1,1)      5 of 10    4 of 9    3 of 8    60/720  ≈ 0.0833

Each draw is the count of whichever group matches yi out of however many voters remain, so each probability is the product along the row; the eight probabilities sum to 1.

iid
Independence: the joint is the product of the marginals.
Identically Distributed: the marginals are all the same.

Sampling with Replacement

10 voters: 6 Dems, 4 Reps. Y = 1 if Dem, 0 if Rep. We choose 3 randomly, with replacement.
What is P(Y1=1, Y2=1, Y3=1) = p(1,1,1)?
Even though all 3 chosen were Dems, each time every voter had an equal chance of being chosen, so the draws are iid:

    P(1,1,1) = .6*.6*.6 = .216
    Yi ~ Bernoulli(.6)

Assume iid for a small sample from a large group, even without replacement.
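The two sampling schemes above can be compared exactly with Python's Fraction type:

```python
from fractions import Fraction

# Without replacement, 5 Dems / 5 Reps: each Dem draw removes a Dem.
p_without = Fraction(5, 10) * Fraction(4, 9) * Fraction(3, 8)   # 1/12

# With replacement, 6 Dems / 4 Reps: the draws are iid Bernoulli(6/10).
p_with = Fraction(6, 10) ** 3                                    # 27/125 = .216
```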

X ~ Bernoulli(p): p stands for the probability of the 1, not the 0.

    x     1 (heads)   0 (tails)
    p(x)      p          1-p

For two independent tosses X1 and X2, the joint probabilities are:

    X1 \ X2     0         1
    0        (1-p)²    (1-p)p
    1        p(1-p)      p²

The probability of a "flip", 2p(1-p), is the probability of the next toss being different from the previous one.

What is the prob. of 1000 heads in a row? .5^1000 = 9.332636e-302


Monty Hall: 3 doors, I pick one, the host opens a losing door, then I stay or switch.
The probability of winning is 1/3 if I stay and 2/3 if I switch.
Joint/Marginal = Conditional
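A small simulation backs up the 1/3 vs 2/3 claim (door numbering and seed are arbitrary choices for illustration):

```python
import random

random.seed(0)

def play(switch):
    """One Monty Hall game; returns True if the final pick wins."""
    doors = [0, 1, 2]
    prize = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that is neither the pick nor the prize.
    opened = random.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

n = 20000
stay_rate = sum(play(False) for _ in range(n)) / n      # ~ 1/3
switch_rate = sum(play(True) for _ in range(n)) / n     # ~ 2/3
```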

iid Defect Data ex.: 20% defective, Y ~ Bernoulli(.2)

Suppose we are about to make 10 parts. What is the probability that they are all good?
.8^10 = 0.107
Marginal * Conditional = Joint
Conditional = Joint / Marginal
Marginal = Joint / Conditional

The Random Walk Model

The price changes (differences) Dt are iid, but the prices Pt are not.

    Dt = Pt - Pt-1
    Pt = Pt-1 + Dt

The next value Pt+1 is the current value Pt plus a random increment Dt+1 which is independent of previous increments.
Ex. The last price in our series is 1.875, and the change is -.4 or +.6 with probability .125 each (0 otherwise), so the expected next price is 1.875 - .4(.125) + .6(.125) = 1.9.
Continuous Distributions
We give the probability of intervals, e.g. P(a < X < b) = .1.
The probability of an interval is the area under the pdf (curve). {Probability Density Function.}
Ex. For a standard normal, the probability that X is in the interval (0,2) is .477 (mean to 2 std above the mean).

Uniform Distribution
X ~ Uniform(a,b): the pdf is flat over (a,b) with height 1/(b-a), so the total area equals 1 and any value is equally likely. Ex. over the interval (0,.5) the height must be 2.

Normal Distribution: X ~ N(μ, σ²)

    z = (x - μ)/σ
    P(μ - 2σ < X < μ + 2σ) = .95
    P(μ - σ < X < μ + σ) = .68

    P(0 < Z < 1) = .34
    P(-1 < Z < 1) = .68
    P(-2 < Z < 2) = .954
    P(-1.96 < Z < 1.96) = .95
    P(-3 < Z < 3) = .9974

Ex. R ~ N(.1,.01): the 95% interval is .1 ± 2(.1) = (-.1,.3).
Mean .1, Variance .01, StD .1.
The area under the curve tells me the probability of that interval.
Rt ~ N(.01,.04²) iid: the 95% interval would be .01 ± .08.
Cumulative Distribution Function (cdf) FX:

    P(a < X < b) = P(X < b) - P(X < a) = FX(b) - FX(a)

For Z (standard normal) we have: P((-1,1)) = F(1) - F(-1) = .84 - .16 = .68
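The z-table facts above can be reproduced without tables, using the standard-normal cdf built from math.erf:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p_0_1 = phi(1) - phi(0)     # P(0 < Z < 1)  ~ .34
p_1_1 = phi(1) - phi(-1)    # P(-1 < Z < 1) ~ .68
p_2_2 = phi(2) - phi(-2)    # P(-2 < Z < 2) ~ .954
```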
The Expected Value as a Long Run Average
We can interpret expectation as the long run average of iid draws (like a sample mean).
Ex. X = number of heads in two tosses (2 tails, 1 head + 1 tail, 2 heads):

    x     0    1    2
    p(x) .25  .5   .25

    E(X) = .25(0) + .5(1) + .25(2) = 1

    E(X) ≈ (1/n) Σ Xi   Expectation as long run average (the same idea as the sample mean).

This works for X continuous and discrete.

Variance as the Long Run Average of iid draws

For the function f(x) = (x - μ)²:

    Var(X) = E(f(X)) ≈ (1/n) Σ (Xi - μ)²

Variance Formula, example with the coin tosses above (E(X) = 1):

    Var(X) = .25(0-1)² + .5(1-1)² + .25(2-1)² = .5

    Theoretical: Var(X) ≈ (1/n) Σ (Xi - μ)²    uses the true mean μ
    Sample:      (1/n) Σ (Xi - X̄)²             uses the sample mean X̄
    Usual sample variance: (1/(n-1)) Σ (Xi - X̄)²

Expected Value and Variance for Continuous Random Variables
The long run average (1/n) Σ Xi estimates E(X) whether X is discrete or continuous, for iid Xi all having the same distribution as X. Same for the variance.
Ex. 500 iid draws of Z ~ N(0,1): the long run average ≈ E(Z) = 0 and the long run variance ≈ Var(Z) = 1.

    E(X) = μ,   Var(X) = E((X - μ)²)
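A sketch of expectation as a long run average, simulating the coin-pair example (seed and sample size chosen arbitrarily):

```python
import random

random.seed(1)

# X = number of heads in two tosses: E(X) = 1, Var(X) = 0.5.
vals, probs = [0, 1, 2], [0.25, 0.5, 0.25]
draws = random.choices(vals, weights=probs, k=50000)

# Long run average approaches E(X); long run average of squared
# deviations approaches Var(X).
long_run_mean = sum(draws) / len(draws)
long_run_var = sum((d - long_run_mean) ** 2 for d in draws) / len(draws)
```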

Random Variables and Formulas

Ex. t = number of days to finish a job:

    t     1    2    3    4    5
    p(t) .05  .2   .35  .3   .1

What is the probability it will take less than three days? .05 + .2 = .25
Ex. Fixed cost 20000 plus an additional 2000 per day:
C = Cost
C = 20000 + 2000t
The mean of a single random variable is not the same as the mean of a sample.
Mean and Variance of a Linear Function (these stand for single RVs, not a list of numbers):

    Y = c0 + c1 X
    E(Y) = c0 + c1 E(X)
    Var(Y) = c1² Var(X)
    σY = |c1| σX

For the cost example, E(t) = 3.2 and Var(t) = 1.06, so:
E(C) = 20000 + 2000 E(t) = 20000 + 2000(3.2) = 26400
Var(C) = 2000² Var(t) = 4,240,000
σC = sqrt(4,240,000) = 2000 sqrt(1.06) ≈ 2059
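The cost example above, computed step by step:

```python
# Distribution of t (days to finish) from the example above.
t_vals = [1, 2, 3, 4, 5]
p_vals = [0.05, 0.2, 0.35, 0.3, 0.1]

E_t = sum(p * t for p, t in zip(p_vals, t_vals))               # 3.2
Var_t = sum(p * (t - E_t) ** 2 for p, t in zip(p_vals, t_vals))  # 1.06

# C = 20000 + 2000 t is a linear function of t.
E_C = 20000 + 2000 * E_t        # 26400
Var_C = 2000 ** 2 * Var_t       # 4,240,000 (the constant drops out)
sd_C = Var_C ** 0.5             # ~ 2059
```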

Another Example of Random Variables and Formulas
L ~ Bernoulli(p). I place a $10 bet that the Leafs will win; if they win I get $20. W = -10 + 20L.
W = -10 + 20(0) = -10 if they lose; W = -10 + 20(1) = 10 if they win.
Now suppose p = .5. Then E(L) = .5, Var(L) = .5(1-.5) = .25 (same as p(1-p)), and σL = .5 (same as sqrt(p(1-p))).
E(W) = -10 + 20 E(L) = -10 + 20(.5) = -10 + 10 = 0
Var(W) = 20² Var(L) = 20²(.25) = 100
σW = 20 σL = 20(.5) = 10
The variance equation always drops the constants.
If you double your money you double the expected value (2 E(X)); remember to square the multiplier in the variance equation (2² Var(X)).
Simple Special Cases
Assume Y = a + X:
E(Y) = E(a + X) = a + E(X)
Var(Y) = Var(X)
Assume Y = aX:
E(Y) = E(aX) = a E(X)
Var(Y) = Var(aX) = a² Var(X)
sd(Y) = sd(aX) = |a| sd(X)

Formula, for X ~ N(μ, σ²):
E(X) = μ
Var(X) = σ²
sd(X) = σ

Covariance for Pairs of Random Variables

Is X related to Y, or is it independent? Look at the (X,Y) joint distribution.

The Covariance between bivariate discrete RVs X and Y:

    cov(X,Y) = σXY = Σ p(x,y)(x - μX)(y - μY), summed over all (x,y)

Positive cov if they move together, negative cov if they move opposite.

Ex. joint distribution of two returns (long run means μX = μY = .1; σX = σY = .05):

    Y \ X    .05   .15
    .05      .4    .1
    .15      .1    .4

    Cov(X,Y) = .4(.05-.1)(.05-.1) + .1(.05-.1)(.15-.1) + .1(.15-.1)(.05-.1) + .4(.15-.1)(.15-.1) = .0015

Our intuition tells us there is an 80% chance that X and Y are both above or both below the mean together.

Correlation for Pairs of Random Variables (discrete or continuous)

    ρXY = σXY / (σX σY),   -1 ≤ ρ ≤ 1   (cov / (std * std))

The closer ρ is to 1, the closer the points lie to a line with a positive slope, and vice versa.
For the returns example (mean .1, std .05 for each):

    ρXY = .0015 / (.05 * .05) = .6

Another example: notice that for independent variables the Cov and Corr are ZERO.

    Y \ X    0     1
    0       .25   .25   | .5
    1       .25   .25   | .5
            .5    .5    | 1

    Cov = .25(0-.5)(0-.5) + .25(0-.5)(1-.5) + .25(1-.5)(0-.5) + .25(1-.5)(1-.5) = 0

and 0 over anything is 0, so ρ = 0.

Correlation captures DEPENDENCE.
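The returns example above can be computed straight from the joint table:

```python
# Joint distribution p(x, y) of the two returns from the example above.
joint = {(0.05, 0.05): 0.4, (0.05, 0.15): 0.1,
         (0.15, 0.05): 0.1, (0.15, 0.15): 0.4}

mu_x = sum(p * x for (x, y), p in joint.items())   # .1
mu_y = sum(p * y for (x, y), p in joint.items())   # .1

cov = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())   # .0015
sd_x = sum(p * (x - mu_x) ** 2 for (x, y), p in joint.items()) ** 0.5   # .05
sd_y = sum(p * (y - mu_y) ** 2 for (x, y), p in joint.items()) ** 0.5   # .05
rho = cov / (sd_x * sd_y)                                               # .6
```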


Mean and Variance of a Linear Combination

    Y = c0 + c1 X1 + c2 X2 + c3 X3 + ... + ck Xk
    μY = c0 + c1 μ1 + c2 μ2 + ...
    σY² = c1² σx1² + c2² σx2² + 2 c1 c2 σx1x2   Variance of a linear combination

NOTICE: We dropped the constant. The constant does NOT vary.
Example:
μX1 = .05, μX2 = .1, μX3 = .15, σ²X1 = .01, σ²X2 = .009, σ²X3 = .008,
ρx1x2 = .3, ρx1x3 = .2, ρx2x3 = .2,
so σx1x2 = .002846, σx1x3 = .001789, σx2x3 = .001697.

Y = .2X1 + .5X2 + .3X3
μY = .2(.05) + .5(.1) + .3(.15) = .105

σY² = (.2)²(.01) + (.5)²(.009) + (.3)²(.008) + 2(.2)(.5)(.002846) + 2(.2)(.3)(.001789) + 2(.5)(.3)(.001697) ≈ .004663
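A sketch of the three-asset calculation above, rebuilding each covariance from its correlation as σij = ρij σi σj:

```python
from math import sqrt

# Weights, means, variances, and pairwise correlations from the example.
w = [0.2, 0.5, 0.3]
mu = [0.05, 0.10, 0.15]
var = [0.01, 0.009, 0.008]
rho = {(0, 1): 0.3, (0, 2): 0.2, (1, 2): 0.2}

sd = [sqrt(v) for v in var]
mu_Y = sum(wi * mi for wi, mi in zip(w, mu))   # .105

# Diagonal terms plus 2*ci*cj*sigma_ij for each pair.
var_Y = sum(wi ** 2 * vi for wi, vi in zip(w, var))
for (i, j), r in rho.items():
    var_Y += 2 * w[i] * w[j] * r * sd[i] * sd[j]
```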


Special Cases
E(X1 + X2) = E(X1) + E(X2)
E(X1 - X2) = E(X1) - E(X2)
Var(X1 + X2) = Var(X1) + Var(X2) + 2 cov(X1, X2)
Var(X1 - X2) = Var(X1) + Var(X2) - 2 cov(X1, X2)
If the correlation is 0:
Var(X1 + X2) = Var(X1) + Var(X2)
Var(X1 - X2) = Var(X1) + Var(X2)
Example:
I am about to toss 100 coins. Xi = 1 if the ith is a head, 0 else (same as lots of voters).
Y = X1 + X2 + ... + X100
E(Y) = E(X1) + E(X2) + ... + E(X100) = 100*.5 = 50
Var(Y) = Var(X1) + Var(X2) + ... + Var(X100) = 100*.25 = 25   (each Var is p(1-p))
σY = 5
If Y is approximately normal, it lands in 50 ± 10 (mean ± 2σ) about 95% of the time.

The Binomial Distribution

Notice this is not Bernoulli.

Let Y be the number of successes in n iid Bernoulli(p) trials (this is a discrete distribution): Y ~ Binomial(n,p).
n = trials, p = probability of success. Don't get n and p confused with the mean and variance of a Normal.

E(Y) = np, Var(Y) = np(1-p)

Example:
Suppose returns on 20 assets X1, ..., X20 are iid N(.1,.01); that makes the StD .1.
The probability of a positive return is .84 (and of a negative return, .16), so each asset is Bernoulli(.84).
Let Y denote the number of positive returns: Y ~ Binomial(20, .84).
E(Y) = 20*.84 = 16.8, Var(Y) = 20*.84(1-.84) = 2.688

The Central Limit Theorem: a combination of a large number of independent variables is approximately normal.


Special case:
Suppose the prob of a defect is .01 and you make 100 parts. What is the probability they are all good? .99^100 = .366 (scary).

Now suppose you are about to make 100 parts and defects are iid Bernoulli(.1):
n = 100, p = .1
μ = E(Y) = np = 100*.1 = 10
Var = np(1-p) = 100*.1(1-.1) = 9, so σ = 3
Y ≈ N(10, 3²)   (mean and variance)
Therefore 10 ± 6 = (4, 16) is an approximate 95% interval: because of the Central Limit Theorem, if our curve is normal we have a pretty good idea of the defect interval.
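A sketch of the Bernoulli(.1), n = 100 example above:

```python
from math import sqrt

# Defects iid Bernoulli(0.1) over n = 100 parts: Y ~ Binomial(100, 0.1).
n, p = 100, 0.1

mean_Y = n * p              # 10
var_Y = n * p * (1 - p)     # 9
sd_Y = sqrt(var_Y)          # 3

# CLT: Y is approximately N(10, 3^2), so ~95% of runs land in mean +/- 2 sd.
approx_95 = (mean_Y - 2 * sd_Y, mean_Y + 2 * sd_Y)   # (4, 16)

# The "all good" probability from the special case above.
p_all_good = 0.99 ** 100
```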
Standardization
For X ~ N(μ, σ²), the z value corresponding to an x value is z = (x - μ)/σ.
Z can be interpreted as the number of standard deviations from the mean.

Example: Crash (or: weird as Gretzky)
Dow mean .0127 and StD .0437. The crash month return was -.2176.

    z = (-.2176 - .0127)/.0437 = -5.27   more than 5 standard deviations from the mean

Another formula to be comfortable with (important in hypothesis testing): for X ~ N(μ, σ²),

    Pr(a < X < b) = Pr((a - μ)/σ < Z < (b - μ)/σ),   Z ~ N(0,1)


Estimating p: population and aggregate sample values

p̂ is the sample proportion, our estimator; p is the population proportion.
We DON'T write p = .2 when we mean the estimate; we write p̂ = .2.

    E(p̂) = p
    Var(p̂) = p(1-p)/n
    p̂ ~ N(p, p(1-p)/n)

95% CI:

    P(p̂ - 2 sqrt(p(1-p)/n) < p < p̂ + 2 sqrt(p(1-p)/n)) ≈ .95

so the interval is p̂ ± 2 sqrt(p̂(1-p̂)/n). We know plugging in p̂ for p is wrong, but we hope not too wrong.

The Confidence Interval

Example: Voters
700 likely voters, 48% would not like to see Bush re-elected. The survey has a 4 percentage point margin of error:

    2 sqrt(.48*(1-.48)/700) = .038 (approx 4 pts.)

so the interval is .48 ± .038, which is the estimate plus or minus the error.
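The voter-poll margin of error above, computed directly:

```python
from math import sqrt

# 700 likely voters, sample proportion 0.48.
n, p_hat = 700, 0.48

se = sqrt(p_hat * (1 - p_hat) / n)      # standard error of p-hat
margin = 2 * se                          # ~ .038, the "4 points"
ci = (p_hat - margin, p_hat + margin)    # estimate plus or minus error
```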


Estimating p for Bernoullis

    E(p̂) = p
    Var(p̂) = p(1-p)/n
    p̂ ~ N(p, p(1-p)/n)

We don't know p, but we can estimate the STANDARD ERROR:

    se(p̂) = sqrt(p̂(1-p̂)/n)

and the interval is p̂ ± 2 se(p̂).
Example:
We have 200 parts, 40 of which are defective.
Our model is X1, ..., Xn ~ Bernoulli(p) iid.

    p̂ = 40/200 = .2
    2 sqrt(.2(1-.2)/200) = .057

The 95% CI is .2 ± .057, which is pretty big.

Now we are ABOUT to make 10 parts.
What is the probability of NO defects? (1-p)^10.
If p = .2 ± .057, i.e. p in (.143, .257):
we get (1-.257)^10 = .05
we get (1-.143)^10 = .214
The tragedy of root n:
If p̂ is .2 and n = 100, se = sqrt(p̂(1-p̂)/n) = .04.
If p̂ is .2 and n = 10000, se = .004.
If we want half the se, we have to increase the sample size by a factor of 4.

Confidence Interval for the Mean of a Numeric Variable

X1, X2, ..., Xn ~ N(μ, σ²) iid

    X̄ = (1/n) Σ Xi
    E(X̄) = μ
    Var(X̄) = σ²/n
    X̄ ~ N(μ, σ²/n)

Our estimator is unbiased, and this is its exact sampling distribution.

    se(X̄) = sx / sqrt(n)   standard error of the sample mean

and the interval is X̄ ± 2 se(X̄).

Example: Cereal
Sample mean 345, SD 15, n = 500 boxes:

    se(X̄) = sx/sqrt(n) = 15/sqrt(500) ≈ .67

    95% CI: 345 ± 2(.67) = 345 ± 1.34

If we claim the boxes hold 360, we are not on target.
Now suppose we had a sample deviation of 14.6, a mean of 348.5, and only 10 observations:

    se(X̄) = 14.6/sqrt(10) = 4.6
    t-value: tinv(.05, 9) = 2.262   (.05 because 100 - 95 = 5%, and 10 - 1 = 9 degrees of freedom)

    CI: 348.5 ± 2.262*4.6 = 348.5 ± 10.4

Hypothesis Tests and p-values for p

Example:
We claim we have a defect rate of .1, but we get 220 defects out of 1000.
The question we are interested in answering is: could we plausibly get p̂ = .22 if p = .1? NO, Reject!
Another way to look at this:
n = 100, 22% defects, and we want to test the claim that p could really be .1.
If p = .1:

    p̂ ~ N(p, p(1-p)/n) = N(.1, .1(1-.1)/100) = N(.1, .03²)

Now we standardize to see what the observed value really represents:

    z = (p̂ - .1)/.03 = (.22 - .1)/.03 = 4

4 standard deviations from the proposed value is pretty unlikely, so we REJECT.
FAILING TO REJECT does not prove the claim is true; it only means the data are consistent with it.


To Test the Null Hypothesis

    Ho: p = po
    Ha: p ≠ po

We would fail to reject if p̂ is within 2 standard deviations of po:

    z = (p̂ - po) / sqrt(po(1-po)/n)

(this formula is the same shape as z = (x - μ)/σ). We reject at .05, i.e. 95% confidence, if |z| > 2.

p̂ is data; po is the claim.
We reject if the test statistic is bigger than 2 or smaller than -2. That will happen 5% of the time when the null is true (i.e. the probability of making this mistake is .05).
Example: Daily prices of GE
The stock price went down 17 of 40 times: 17/40 = .425, so the estimated p̂ = .425.
If the price has an equal chance of going up or down, a down day is Bernoulli(.5), so WE TEST p = .5.
The test statistic's denominator:

    sqrt(po(1-po)/n) = sqrt(.5(1-.5)/40) = .08   the standard deviation (or, more accurately, the standard error)

    z = (p̂ - po)/se = (.425 - .5)/.08 = -.94

|z| < 1 < 2, so we FAIL to REJECT: the data are consistent with p = .5.

P-Value
The p-value for the null hypothesis is P(Z < -|z| or Z > |z|) for Z ~ N(0,1):
the probability of getting a test statistic as far out or farther than the one we got.
So if we got a test statistic z of 4, as in the defects question, the p-value is .0000634 and we would REJECT.
If |z| < 2, then the p-value is greater than .05: Fail to Reject.
If |z| > 2, then the p-value is less than .05: Reject.
The basic formula for confidence interval estimation:

    X̄ ± 2 se(X̄), where se(X̄) = sx/sqrt(n)

or, for Bernoulli(p),

    p̂ ± 2 se(p̂), where se(p̂) = sqrt(p̂(1-p̂)/n)
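A sketch of the defect-rate test above, with the p-value computed from the normal cdf via math.erf:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Claim po = 0.1; observed p-hat = 0.22 with n = 100.
po, p_hat, n = 0.1, 0.22, 100

z = (p_hat - po) / sqrt(po * (1 - po) / n)   # 4 sd from the claim
p_value = 2 * (1 - phi(abs(z)))              # two-sided tail probability
reject = abs(z) > 2                          # reject at the 5% level
```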
