S2 Revision Notes
Var(X) = npq
Poisson Distribution
The Poisson probability distribution is defined as:
o $P(X = r) = \frac{e^{-\lambda} \lambda^{r}}{r!}$
o $\text{Var}(X) = \lambda$
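The Poisson formula can be checked numerically. The following is a minimal sketch in Python; the function names `poisson_pmf` and `poisson_cdf` and the example value λ = 3 are assumptions chosen for illustration, not part of the notes.

```python
import math

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam): e^(-lam) * lam^r / r!"""
    return math.exp(-lam) * lam ** r / math.factorial(r)

def poisson_cdf(r, lam):
    """P(X <= r), found by adding the probabilities for 0, 1, ..., r."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))

# Hypothetical example: X ~ Po(3)
p_two = poisson_pmf(2, 3.0)          # P(X = 2), about 0.2240
p_at_most_two = poisson_cdf(2, 3.0)  # P(X <= 2), about 0.4232
print(round(p_two, 4), round(p_at_most_two, 4))
```

The cumulative version mirrors what the statistical tables give: P(X ≤ r) rather than P(X = r).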
Karina Gahir
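o $P(a < X < b) = \int_a^b f(x)\,dx$
o $F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx$
o To find probabilities, substitute the numbers into F(x), but check the limits first so that you use the correct part of F(x)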
In general:
o To find the mode, find dy/dx (where y = f(x)), make dy/dx = 0 and solve
o $E(X) = \int_{\text{lower limit}}^{\text{upper limit}} x f(x)\,dx$
o $\text{Var}(X) = \int_{\text{lower limit}}^{\text{upper limit}} x^2 f(x)\,dx - [E(X)]^2$
o $E(aX+b) = aE(X) + b$
o $\text{Var}(aX+b) = a^2\,\text{Var}(X)$
o $SD(X) = \sqrt{\text{Var}(X)}$
For a 3 part
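The integrals for E(X) and Var(X) can be approximated numerically as a check on hand calculations. This is a rough sketch only: the p.d.f. f(x) = 3x² on 0 ≤ x ≤ 1 is a hypothetical example, and the helper `integrate` (a simple midpoint rule) is an assumption rather than a method from the notes.

```python
def integrate(g, lower, upper, steps=10_000):
    """Midpoint-rule approximation to the integral of g from lower to upper."""
    width = (upper - lower) / steps
    return sum(g(lower + (i + 0.5) * width) for i in range(steps)) * width

# Hypothetical p.d.f. for illustration: f(x) = 3x^2 on 0 <= x <= 1, 0 otherwise
def f(x):
    return 3 * x ** 2

lower, upper = 0.0, 1.0

total_area = integrate(f, lower, upper)                   # should be close to 1
mean = integrate(lambda x: x * f(x), lower, upper)        # E(X), exact value 3/4
mean_of_square = integrate(lambda x: x ** 2 * f(x), lower, upper)  # E(X^2), exact 3/5
variance = mean_of_square - mean ** 2                     # Var(X), exact value 3/80
sd = variance ** 0.5                                      # SD(X) = sqrt(Var(X))

print(round(total_area, 4), round(mean, 4), round(variance, 4))
```

Checking that the total area is 1 first is a quick way to confirm the limits have been entered correctly.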
Sampling
Key Definitions
o A statistical model is a statistical process devised to describe or make predictions about
the expected behaviour of a real-world problem
o A population is a collection/ group/ set of individuals or items
o A sample is any subset of a population
o A sampling frame is a complete list or complete identification of the population (e.g. a
list, index register, database, map or file)
o A sampling unit is an individual member of the population
o A census is a survey of every member of a population
o A sample survey is a survey of a sample of a population
o A random sample is one in which every member of the population has an equal chance of
being selected
o Statistical inference comprises the methods by which one makes generalisations about a
population, based on information obtained from samples selected from that population
o A statistic is a random variable that depends only on the observed sample (e.g. the
sample mean)
o A sampling distribution is the probability distribution of a statistic (e.g. the sampling
distribution of the mean is the probability distribution of all possible means of samples of
a fixed size)
Advantages and Disadvantages of a sample survey
Disadvantages of a census (a survey of every member of the population):
o May be too expensive
o May be time consuming
o May involve testing till destruction (e.g. to find out how long batteries last, you
have to run them until they are flat, so a census would use up every battery)
o May be impossible to carry out for every member of the population
Therefore a sample survey:
o Is less time consuming
o Is not as expensive
o HOWEVER it may not be representative of the whole population, and can be biased.
Common sources of bias include:
   o Subjective choice by the person taking the sample
   o Non-response
   o Sampling from an incomplete sampling frame
Sampling Distribution of a statistic
o Parameters are quantities that describe characteristics of a population (e.g. the mean,
variance or proportion that satisfies certain criteria)
o They can be estimated from sample data using quantities called statistics:
For example, if a random sample of size n, $X_1, \dots, X_n$, is taken from a population, the sample mean is
$\bar{X} = \frac{\sum X_i}{n}$
o The probability distribution of a statistic is called its sampling distribution; it gives
all the possible values that the statistic can take, together with their probabilities
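A sampling distribution can be built by listing every possible sample and the mean it gives. The sketch below does this in Python for a small made-up population; the values 1, 2, 3, their probabilities, and the function name `sampling_distribution_of_mean` are all hypothetical, and the observations are assumed to be independent.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical population distribution: value -> probability
population = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

def sampling_distribution_of_mean(population, n):
    """Return {sample mean: probability} over all ordered samples of size n,
    assuming the n observations are independent."""
    dist = Counter()
    for sample in product(population, repeat=n):
        prob = Fraction(1)
        for value in sample:
            prob *= population[value]          # multiply independent probabilities
        dist[Fraction(sum(sample), n)] += prob  # add to the total for this mean
    return dict(dist)

dist = sampling_distribution_of_mean(population, 2)
for mean in sorted(dist):
    print(mean, dist[mean])
```

Using `Fraction` keeps the probabilities exact, so it is easy to confirm that they add up to 1.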
Continuous Distributions
o For the uniform distribution X~U[a,b], the probability density function (p.d.f.) is
$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
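o The area under the graph is always equal to 1, as for any p.d.f.; because the distribution is uniform, the graph is a rectangle of width b - a and height 1/(b - a)
o When finding probabilities, use the fact that the total area is 1: a probability is the area of the rectangle over the required interval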
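o $E(X) = \frac{a+b}{2}$
o $\text{Var}(X) = \frac{(b-a)^2}{12}$
o The cumulative distribution function is
$F(x) = \begin{cases} 0 & x < a \\ \frac{x-a}{b-a} & a \le x \le b \\ 1 & x > b \end{cases}$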
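The uniform distribution results can be collected into a few short functions. This is an illustrative sketch: the function names and the example X ~ U[2, 8] are assumptions, not taken from the notes.

```python
def uniform_pdf(x, a, b):
    """f(x) = 1/(b - a) for a <= x <= b, and 0 otherwise."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """F(x) = 0 for x < a, (x - a)/(b - a) for a <= x <= b, 1 for x > b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

# Hypothetical example: X ~ U[2, 8]
a, b = 2, 8
p = uniform_cdf(5, a, b) - uniform_cdf(3, a, b)  # P(3 < X < 5) = 2/6
print(uniform_mean(a, b), uniform_variance(a, b), round(p, 4))
```

The probability is simply the area of a rectangle of width 5 - 3 = 2 and height 1/6.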
Continuity Corrections
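When a discrete random variable X is approximated by a continuous random variable Y:
o $P(X = b) \approx P(b - \tfrac{1}{2} < Y < b + \tfrac{1}{2})$
o $P(a \le X \le b) \approx P(a - \tfrac{1}{2} \le Y \le b + \tfrac{1}{2})$
o $P(a < X < b) \approx P(a + \tfrac{1}{2} < Y < b - \tfrac{1}{2})$
o $P(a \le X < b) \approx P(a - \tfrac{1}{2} \le Y < b - \tfrac{1}{2})$
o $P(a < X \le b) \approx P(a + \tfrac{1}{2} < Y \le b + \tfrac{1}{2})$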
$z = \frac{x\ \text{value} - \text{mean}}{\text{standard deviation}}$
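A continuity correction and the z formula are typically used together when a discrete distribution is approximated by a normal one. The sketch below assumes a hypothetical X ~ B(100, 0.5) approximated by Y ~ N(np, npq); the function names are illustrative, and the exact binomial sum is included only to compare against the approximation.

```python
import math

def normal_cdf(z):
    """P(Z <= z) for the standard normal, written using the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def standardise(x, mean, sd):
    """z = (x value - mean) / standard deviation"""
    return (x - mean) / sd

def binomial_pmf(r, n, p):
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

# Hypothetical example: X ~ B(100, 0.5), so mean = np = 50 and variance = npq = 25
n, p = 100, 0.5
mean = n * p
sd = math.sqrt(n * p * (1 - p))

# P(45 <= X <= 55) becomes P(44.5 < Y < 55.5) after the continuity correction
a, b = 45, 55
z_upper = standardise(b + 0.5, mean, sd)   # (55.5 - 50) / 5 = 1.1
z_lower = standardise(a - 0.5, mean, sd)   # (44.5 - 50) / 5 = -1.1
approx = normal_cdf(z_upper) - normal_cdf(z_lower)

exact = sum(binomial_pmf(r, n, p) for r in range(a, b + 1))
print(round(approx, 4), round(exact, 4))
```

In an exam the value of P(Z < 1.1) would be read from tables rather than computed, but the standardising step is the same.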
Hypothesis Testing
Carrying out the tests
o
When