Edexcel S2 Cheat Sheet
1 Binomial Distribution
www.drfrostmaths.com
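$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Remember the two edge cases where you don't need to use the full formula:
a. 0 successes: $P(X = 0) = (1 - p)^n$
b. $n$ successes: $P(X = n) = p^n$
c. Thus at least 1 success: $P(X \ge 1) = 1 - (1 - p)^n$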
For worded questions, always start your working by writing out your distribution, e.g. $X \sim B(n, p)$.
Sometimes you require the probability of a range when the cumulative table can't be used, because the value of $p$ is not a nice number. This involves subtracting the opposite cases from 1, e.g. $P(X \ge 2) = 1 - P(X = 0) - P(X = 1)$.
Remember your table requires $\le$, so if you have $P(X < k)$, use $P(X \le k - 1)$. See the note on switching from successes to failures regarding problems of flipping your inequality.
Assumptions of a Binomial Distribution:
a. Fixed number of trials.
b. Probability of success constant.
c. Each trial is independent (ensure you put in context!)
d. Each trial has two outcomes (success and failure)
When $p$ is unknown: An unfair coin with probability $p$ of heads is tossed 20 times. The probability of seeing no heads is 0.1. Determine $p$.
Since this is an edge case: $P(X = 0) = (1 - p)^{20} = 0.1$. Thus:
$1 - p = 0.1^{1/20}$, so $p = 1 - 0.1^{1/20} \approx 0.109$
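The unknown-$p$ edge case can be checked numerically. This is only a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
# Solve (1 - p)^20 = 0.1 for p, using the "0 successes" edge case.
n_tosses = 20
prob_no_heads = 0.1

# Take the 20th root of both sides, then rearrange for p.
p_heads = 1 - prob_no_heads ** (1 / n_tosses)

# Substitute back in as a sanity check: this should reproduce 0.1.
check = (1 - p_heads) ** n_tosses
print(round(p_heads, 3), round(check, 3))
```

Substituting the answer back into the edge-case formula is a quick way to catch a slip in the rearrangement.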
When $n$ is unknown: I play a game for which the probability of winning is 0.7. What is the smallest number of times I play such that the probability of winning every game is less than 0.01?
Again an edge case so: $P(X = n) = 0.7^n < 0.01$
$n \log 0.7 < \log 0.01$
$n > \dfrac{\log 0.01}{\log 0.7} = 12.9\ldots$
Thus at least 13 games required. Notice that the direction of the inequality reversed because we divided by a negative number ($\log 0.7 < 0$).
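As a rough check of the unknown-$n$ working, the sketch below finds the answer both by direct search and by the logarithm bound. The function name is hypothetical, chosen only for this illustration.

```python
import math

def smallest_n_all_wins_below(p_win, threshold):
    """Smallest n with p_win**n < threshold, found by direct search."""
    n = 1
    while p_win ** n >= threshold:
        n += 1
    return n

n_search = smallest_n_all_wins_below(0.7, 0.01)

# Log method: n > log(0.01) / log(0.7). Dividing by log(0.7), which is
# negative, is what reverses the inequality.
n_bound = math.log(0.01) / math.log(0.7)

print(n_search, n_bound)
```

The search result should be the first whole number above the log bound.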
Sometimes you'll get a part of a question which requires some non-Binomial probabilistic calculation, particularly involving some number of failures before a success is obtained, e.g. Bob keeps firing arrows at a target until he gets a Bullseye. The probability he gets a Bullseye is 0.4. What's the probability he hits the Bullseye on the 4th shot?
$0.6 \times 0.6 \times 0.6 \times 0.4 = 0.6^3 \times 0.4 = 0.0864$
This is related to something called the Geometric Distribution, which isn't formally covered in the syllabus.
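The "failures then a success" calculation generalises easily. A small sketch, with a hypothetical helper name, assuming independent shots with a constant success probability:

```python
def first_success_on_trial(k, p):
    """P(first success occurs on trial k): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p

# Bob's Bullseye on the 4th shot, with p = 0.4 per shot.
prob_fourth_shot = first_success_on_trial(4, 0.4)
print(round(prob_fourth_shot, 4))
```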
When switching from the number of successes $X$ to the number of failures $Y$ (so that the probability is less than 0.5 for the purposes of using tables), flip the inequality (but preserve $<$ vs $\le$) and if the number of successes was $k$, use $n - k$ for number of failures:
$P(X \le k) = P(Y \ge n - k)$
$P(X < k) = P(Y > n - k)$
$P(X \ge k) = P(Y \le n - k)$
$P(X > k) = P(Y < n - k)$
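e.g. In Joe's café 70% of customers buy a cup of tea. In a random sample of 20 customers find the probability that more than 15 buy a cup of tea.
$X \sim B(20, 0.7)$ (number who buy tea), $Y \sim B(20, 0.3)$ (number who don't)
$P(X > 15) = P(Y < 5) = P(Y \le 4) = 0.2375$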
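The successes-to-failures switch can be sanity-checked without tables. The sketch below computes the café probability both directly and via the number of failures; the helper names are illustrative only.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); this is what the cumulative table lists."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Directly: P(X > 15) with X ~ B(20, 0.7).
direct = 1 - binom_cdf(15, 20, 0.7)

# Via failures: P(Y <= 4) with Y ~ B(20, 0.3), the form the table supports.
via_failures = binom_cdf(4, 20, 0.3)

print(round(direct, 4), round(via_failures, 4))
```

Both routes should agree, which confirms the inequality was flipped correctly.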
Given $X \sim B(n, p)$, find the smallest value of $k$ such that a probability such as $P(X \le k)$ or $P(X \ge k)$ meets a stated bound.
We can only use the table if the probability is less than 0.5. This question requires a great deal of care, particularly with the effect of switching from $X$ to $Y$ and getting $<$ vs $\le$ right!
2 Poisson Distribution
$P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$
While this is in the formula booklet, the easy way to remember it is that reading left to right and then down, the $\lambda$ repeats consecutively, as does the $x$.
$E(X) = \lambda$ and $Var(X) = \lambda$. The fact these are the same sometimes provides a justification for why a Poisson distribution would be suitable to model certain data. See Wordy Questions page.
As with the Binomial Distribution, ensure you state the distribution for any wordy question, e.g. $X \sim Po(\lambda)$.
Conditions required for Poisson:
a. Events occur independently (e.g. volcano eruptions might not be modelled using Poisson
because volcano less likely to erupt immediately after previous eruption, thus eruptions
not independent).
b. Events occur singly in time.
c. A fixed rate for which events occur.
A very common occurrence is that you will need to scale $\lambda$ to another time period. e.g. A printer jams on average 0.3 times an hour. Find the probability that over a 5 hour period the printer jams at least 4 times.
Just scaling the 0.3 times an hour to 1.5 times every 5 hours:
$X \sim Po(1.5)$
$P(X \ge 4) = 1 - P(X \le 3) = 1 - 0.9344 = 0.0656$
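A short sketch of the rate-scaling step, assuming the rate of 0.3 jams per hour stays constant over the 5 hours. The helper names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

rate_per_hour = 0.3
hours = 5
lam_scaled = rate_per_hour * hours  # scale the rate to the new period

# "At least 4" is the complement of "at most 3".
prob_at_least_4 = 1 - poisson_cdf(3, lam_scaled)
print(round(prob_at_least_4, 4))
```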
Another common question is to feed the value calculated from a Poisson question into a Binomial Distribution. e.g. Defects occur in planks of wood with rate 0.5 per 10cm. If Bob buys 6 planks each of length 100cm, find the probability that fewer than 2 of the planks contain at most 3 defects.
First find the probability a plank of 100cm contains at most 3 defects:
$X \sim Po(5)$
$P(X \le 3) = 0.2650$
Then feed into a Binomial Distribution:
Let $Y$ be the number of planks with at most 3 defects.
$Y \sim B(6, 0.2650)$
$P(Y < 2) = P(Y = 0) + P(Y = 1) = 0.7350^6 + 6(0.2650)(0.7350)^5 \approx 0.4987$
Notice that we couldn't use tables for the Binomial here because of the non-nice $p$ value.
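The two-stage calculation (Poisson probability fed into a Binomial) might be sketched as follows. The helper names are illustrative, and the code uses the unrounded Poisson probability, so the last decimal place may differ slightly from working with a rounded table value.

```python
from math import comb, exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# Stage 1: rate 0.5 per 10cm scales to 5 per 100cm plank.
p_plank = poisson_cdf(3, 5)       # P(plank has at most 3 defects)

# Stage 2: 6 planks, each "succeeds" with probability p_plank.
prob_fewer_than_2 = binom_cdf(1, 6, p_plank)   # P(Y < 2) = P(Y <= 1)

print(round(p_plank, 4), round(prob_fewer_than_2, 4))
```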
As with the Binomial Distribution, we can get nasty double inequality questions: While a popcorn bag is in the microwave, an average of 5 pops can be heard per second. What's the minimum number of pops heard such that there's less than a 10% chance of hearing more than this number of pops?
$X \sim Po(5)$
$P(X > k) < 0.1$
$1 - P(X \le k) < 0.1$
$P(X \le k) > 0.9$
From tables, $P(X \le 7) = 0.8666$ and $P(X \le 8) = 0.9319$, so $k = 8$.
Note we had to take care with $<$ vs $\le$ and $>$ vs $\ge$.
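The table lookup for the popcorn question amounts to a search over $k$. A minimal sketch with hypothetical function names:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))

def smallest_k_with_upper_tail_below(lam, threshold):
    """Smallest k such that P(X > k) < threshold."""
    k = 0
    while 1 - poisson_cdf(k, lam) >= threshold:
        k += 1
    return k

k_min = smallest_k_with_upper_tail_below(5, 0.1)
print(k_min, round(poisson_cdf(k_min - 1, 5), 4), round(poisson_cdf(k_min, 5), 4))
```

Printing the cumulative probabilities either side of the answer mirrors the "borderline values" you would quote from the table.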
For Binomial → Poisson approximations, see notes on Normal Approximations.
3 Continuous Random Variables
Remember that $f(x)$ is the probability density function for continuous variables, and that this is not a probability as such: we only get a probability when we integrate $f(x)$ over some range.
Relatedly, if $X$ is a continuous variable, then $P(X = x) = 0$ because the probability of a specific value is infinitely small (e.g. no one has an exact height of 1.5m).
Think of $F(x)$ as the running total of the probability up to $x$:
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
To find a probability over a range:
$P(a < X < b) = \int_a^b f(x)\,dx$
$P(X > a) = \int_a^{\infty} f(x)\,dx$
For the latter, if you know the probability is 0 after some value $b$ (this will always be the case in exams), we can use $b$ instead of $\infty$.
If $F(x)$ is known, then you could calculate say $P(a < X < b)$ using $P(a < X < b) = F(b) - F(a)$.
When asked to find the value of some constant used in a p.d.f., use the fact that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e. the area under the whole probability function is 1.
However, if the cumulative distribution function is given, then there is no need to integrate, just use $F(b) = 1$ where $b$ is the highest possible value (since $P(X \le b) = 1$).
Remember that it doesn't matter for continuous variables if you use $<$ or $\le$ (but it does matter for discrete variables!).
( )
( )
( )
( )
( )
( )
range:
( )
Note the extra "1" row required since the running total of the probability by the time you get to 2 is 1. Note that the use of $t$ was to avoid a clash with the $x$ used as the upper limit of the integral. But the mark scheme permits integrating with respect to $x$ as well, so do it this way if you find it less confusing.
When $f(x)$ has multiple rows, see the note on common mistakes about ensuring you add the running total up to that range.
To go from $F(x)$ to $f(x)$ just differentiate. Don't forget that the "1" row disappears.
$f(x) = \dfrac{d}{dx} F(x)$
To find median or quartiles: use $F(m) = 0.5$ for the median, and likewise $F(Q_1) = 0.25$ and $F(Q_3) = 0.75$ for the quartiles. Either $F(x)$ will already be given, or you will have to determine it from $f(x)$ first.
Sometimes you have to determine which range the quartile/median occurs in first by evaluating the borderline values (although this is not necessary if you only have one range).
e.g. If $F(x)$ is defined piecewise with one range ending at $x = 1$ and the next range starting there, and we wished to find the median $m$, then it might be in either range. However if $F(1) = 0.25$, i.e. the running total of the probability up to 1 is 0.25, then the median wouldn't have yet occurred, and thus it's in the later range.
Then solve $F(m) = 0.5$ using the expression for $F(x)$ in that range.
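To make the borderline check concrete, here is a sketch using a made-up p.d.f. that is not from the original notes: $f(x) = x/2$ on $0 \le x \le 1$ and $f(x) = 3/4$ on $1 < x \le 2$. It was chosen so that $F(1) = 0.25$, matching the situation described above. The solver name is hypothetical.

```python
def F(x):
    """Cumulative distribution function for the assumed piecewise p.d.f."""
    if x < 0:
        return 0.0
    if x <= 1:
        return x ** 2 / 4                # integral of t/2 from 0 to x
    if x <= 2:
        return 0.25 + 0.75 * (x - 1)     # F(1) plus integral of 3/4 from 1 to x
    return 1.0                           # the extra "1" row

def solve_cdf(target, lo=0.0, hi=2.0, tol=1e-10):
    """Find x with F(x) = target by bisection (F is non-decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Borderline value: F(1) = 0.25 < 0.5, so the median lies in the second range.
median = solve_cdf(0.5)
lower_quartile = solve_cdf(0.25)
print(F(1), round(median, 4), round(lower_quartile, 4))
```

By hand, the second-range equation $0.25 + 0.75(m - 1) = 0.5$ gives $m = 4/3$, which the bisection should reproduce.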
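The mode is the value of $x$ such that $f(x)$ is at its maximum. The mode can be calculated in two different ways (and usually you can only use one of the two):
a. For curved graphs, finding the turning points using $f'(x) = 0$.
b. Otherwise (e.g. straight-line graphs), sketching $f(x)$ and reading off where it is highest, which will be at one end of a range.
A common mistake when finding $F(x)$ from $f(x)$ is evaluating the integral without proper limits. This will result in a missing constant.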
A common mistake when finding the cumulative distribution function from $f(x)$ where there are multiple ranges is forgetting to add on the running total up to the start of the range being considered. e.g. If you had ranges $a \le x \le 1$ and $1 < x \le b$ in $f(x)$, then when finding $F(x)$ in the latter range, $F(x) = F(1) + \int_1^x f(t)\,dt$. This is because you want the area up to 1 and then the area between 1 and $x$. i.e. Don't forget the $F(1)$!
When finding the mode, a common mistake is accidentally giving the probability density of the mode as the answer rather than the mode itself (e.g. Jan 2011 Q5d: answer is 0 not 4).
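4 Continuous Uniform Distribution
If $X \sim U[a, b]$ then:
$f(x) = \dfrac{1}{b - a}$ for $a \le x \le b$ (and 0 otherwise)
$E(X) = \dfrac{a + b}{2}$
$Var(X) = \dfrac{(b - a)^2}{12}$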
Suppose that $X \sim U[a, 5]$. Then what is $P(c < X < d)$ when $d > 5$? You might be tempted to calculate $\dfrac{d - c}{5 - a}$, but the probability above a value of 5 is 0, thus: $P(c < X < d) = P(c < X < 5) = \dfrac{5 - c}{5 - a}$, i.e. we truncate any part of the range which is outside the range of the uniform distribution.
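I pick 10 real numbers randomly from 12 to 17. Find the probability that at least 5 of these numbers are greater than 15.5.
If $X$ is the number picked each time, $X \sim U[12, 17]$ and $P(X > 15.5) = \dfrac{17 - 15.5}{17 - 12} = 0.3$. Then if $Y$ is the number of times a number greater than 15.5 was picked, $Y \sim B(10, 0.3)$, and calculate $P(Y \ge 5) = 1 - P(Y \le 4) = 1 - 0.8497 = 0.1503$.
Don't forget your rules of coding from S1:
$E(aX + b) = aE(X) + b$
$Var(aX + b) = a^2 Var(X)$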
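The uniform-into-binomial example can be checked with a few lines. This is only an illustrative sketch; the variable and function names are not from the original.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# X ~ U[12, 17]: probability is proportional to length of interval.
a, b = 12, 17
p_above = (b - 15.5) / (b - a)          # P(X > 15.5)

# Y ~ B(10, p_above): number of picks (out of 10) that exceed 15.5.
prob_at_least_5 = 1 - binom_cdf(4, 10, p_above)

print(round(p_above, 2), round(prob_at_least_5, 4))
```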
5 Normal Approximations
Be able to approximate a Binomial Distribution using a Normal Distribution.
Be able to approximate a Poisson Distribution using a Normal Distribution.
Be able to approximate a Binomial Distribution using a Poisson Distribution.
Be able to give the conditions under which such approximations can be made.
Justify why we need continuity corrections.
This diagram may seem like quite a lot to memorise, but all you need to memorise for carrying out the majority of approximations is this: If you have a Binomial Distribution, is $n$ large and $p$ small? If yes use Poisson, else use Normal. In terms of converting between the distributions, the mean and variance of the Normal/Poisson approximation are just the mean and variance of the original distribution.
If asked why a continuity correction is needed (and suppose the original distribution is Poisson), say: "Poisson is discrete, but Normal is continuous."
For continuity corrections, we want to go from a discrete variable to a continuous version of it.
You will never get a continuity correction wrong if you carry out these two simple steps:
a. Make sure your inequality uses $\le$ or $\ge$ instead of < or >, i.e. ensure the inequality is non-strict.
b. Extend your range by 0.5 at each end, i.e. if you visualise your inequality as a line on the number line, it should be 0.5 longer at each end.
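Examples (with discrete $X$ approximated by continuous $Y$):
$P(X \le k) \approx P(Y < k + 0.5)$
$P(X < k) = P(X \le k - 1) \approx P(Y < k - 0.5)$
$P(X \ge k) \approx P(Y > k - 0.5)$
$P(X > k) = P(X \ge k + 1) \approx P(Y > k + 0.5)$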
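I prefer to do the continuity correction immediately, i.e. before you either reverse the direction of the inequality or standardise. e.g. for $X$ approximated by $Y \sim N(\mu, \sigma^2)$:
$P(X > k) = P(X \ge k + 1) \approx P(Y > k + 0.5) = P\left(Z > \dfrac{k + 0.5 - \mu}{\sigma}\right) = 1 - P\left(Z < \dfrac{k + 0.5 - \mu}{\sigma}\right)$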
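The two-step continuity correction rule (make the inequality non-strict, then extend by 0.5) can be written as a small function. This is a sketch with a hypothetical function name, assuming $X$ takes integer values.

```python
def continuity_correct(op, k):
    """Convert P(X op k) for integer-valued X into (op, bound) for continuous Y."""
    # Step a: make the inequality non-strict.
    if op == "<":
        op, k = "<=", k - 1
    elif op == ">":
        op, k = ">=", k + 1
    # Step b: extend the range by 0.5 at the end that has a bound.
    if op == "<=":
        return ("<", k + 0.5)
    if op == ">=":
        return (">", k - 0.5)
    raise ValueError("op must be one of <, <=, >, >=")

for op in ("<", "<=", ">", ">="):
    print(f"P(X {op} 25) ->", continuity_correct(op, 25))
```

For a continuous variable the strictness of the returned inequality makes no difference; it is shown as strict only for readability.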
The number of marks effectively tells you what approximation you are using: if at least 6 marks, it's a Normal Approximation (because of the many steps of converting the distribution, standardising and continuity corrections), otherwise it's Binomial → Poisson.
Example Normal Approximation: The number of houses sold by an estate agent follows a Poisson
distribution, with a mean of 2 per week. The estate agent will receive a bonus if he sells more
than 25 houses in the next 10 weeks. Use a suitable approximation to estimate the probability
that the estate agent receives a bonus.
a. Note first that you might be tempted to scale the 25 houses in 10 weeks down to 5 houses in 2 weeks, rather than scaling $\lambda$ up to cover the 10 weeks. The catastrophic flaw in doing this is that the continuity correction affects the range differently depending on whether you're using the original or scaled value.
If not scaling: $P(X > 25) = P(X \ge 26) \approx P(Y > 25.5)$
If scaling: $P(X > 5) = P(X \ge 6) \approx P(Y > 5.5)$
In the latter incorrect case the 0.5 has a greater effect on the smaller value of 6 compared with the larger value of 26, so the probability will be too high.
b. Step 1: Determine what approximation to use.
In this example we have a Poisson Distribution, which always goes to Normal. If it were Binomial, you'd first determine whether $n$ is large and $p$ is small.
c. Step 2: Identify original distribution.
As discussed, we scale $\lambda$ (rather than the 25), so:
$X \sim Po(20)$
d. Step 3: Write the approximation, potentially with reference to a new continuous variable $Y$ which is the continuous version of $X$, i.e.:
$Y \sim N(20, 20)$
As discussed, use the mean and variance of the original distribution.
e. Step 4: If necessary, carry out continuity correction to get a probability in terms of $Y$:
$P(X > 25) = P(X \ge 26) \approx P(Y > 25.5)$
f. Step 5: Use your S1 knowledge and find the probability by first standardising. Don't forget that you're dividing by the standard deviation, not the variance:
$P(Y > 25.5) = P\left(Z > \dfrac{25.5 - 20}{\sqrt{20}}\right) = P(Z > 1.23) = 1 - 0.8907 = 0.1093$
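The estate agent example can be followed step by step in code. This sketch assumes $\lambda = 2 \times 10 = 20$, builds the Normal c.d.f. from `math.erf`, and also computes the exact Poisson tail for comparison. The exact value is not part of the original notes; it is included only to show how close the approximation is.

```python
from math import erf, exp, factorial, sqrt

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

lam = 2 * 10                      # scale the rate, not the 25 houses
mu, sigma = lam, sqrt(lam)        # mean and variance both equal lam

# P(X > 25) = P(X >= 26), continuity-corrected to P(Y > 25.5).
z = (25.5 - mu) / sigma           # divide by standard deviation, not variance
approx = 1 - normal_cdf(25.5, mu, sigma)

# Exact Poisson tail: 1 - P(X <= 25).
exact = 1 - sum(exp(-lam) * lam ** k / factorial(k) for k in range(26))

print(round(z, 2), round(approx, 4), round(exact, 4))
```

The code does not round $z$ to 2 decimal places before looking up the probability, so its answer can differ from the table-based one in the fourth decimal place.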
6 Populations and Samples
Key definitions:
a. Statistic: A random variable (1) which is some function of the sample and not dependent on any population parameters (1). I think the "random variable" bit is a bit pernicious (as does Wikipedia), but c'est la vie! If 1 mark, the second part is important.
b. Population: The collection of all items.
c. Sample: Some subset of the population which is intended to be representative of the
population.
d. Census: When the entire population is sampled.
e. Sampling unit: Individual member or element of the population or sampling frame.
f. Sampling frame: A list of all sampling units or all the population.
g. Sampling distribution: All possible samples are chosen from a population (1); the values
of a statistic and the associated probabilities is a sampling distribution (1).
It's important you get your head around what the sampling distribution actually is: it gives the distribution over possible values of the statistic as we take different samples. So if for example the statistic was the range of the sample, then this range is likely to vary as we take different samples. As these ranges vary across samples, they form a distribution.
The sampling frame is the list of things in the population that are available for sampling, e.g. "The ID numbers", "The list of car registration numbers". The mark scheme seems to particularly like it when you refer to some identifying property of the things in the sampling frame.
The sampling frame may be different from the population, because some things in the population may not be available for sampling. e.g. If sampling people who've visited a medical practice, some people may have left the area but not deregistered.
When listing outcomes, it helps to be systematic in listing them so you don't miss any. Note that different orderings count as distinct possibilities.
e.g. You have a large collection of 1p, 2p and 5p coins, and take 3 coins. Find all samples in which the maximum is 5. We may want to first list the possibilities where 5 appears once, then where 5 appears twice, and so on:
(5,1,1), (1,5,1), (1,1,5), (5,1,2), (5,2,1), (1,5,2), (2,5,1), (1,2,5), (2,1,5), (5,2,2), (2,5,2), (2,2,5),
(5,5,1), (5,1,5), (1,5,5), (5,5,2), (5,2,5), (2,5,5),
(5,5,5)
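A systematic listing like this can be cross-checked by brute force. The sketch below enumerates every ordered sample of 3 coins and keeps those whose maximum is 5; it is an illustration, not an exam method.

```python
from itertools import product

coins = [1, 2, 5]

# All ordered samples of size 3 (different orderings are distinct).
all_samples = list(product(coins, repeat=3))

# Keep only the samples whose largest coin is the 5p.
max_is_five = [s for s in all_samples if max(s) == 5]

print(len(all_samples), len(max_is_five))
print(max_is_five)
```

The count can also be found as $3^3 - 2^3$: all samples minus those containing no 5p.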
e.g. A bag contains a large number of 1p and 2p coins, of which 40% are 1p and 60% are 2p. A sample of 2 coins is taken. Find the sampling distribution of the sample maximum.
When finding the sampling distribution, it may help to have a table as follows to organise your working, such that the outcomes for each possible value of the statistic are grouped:
Possibilities          Statistic (Maximum)    Probability
(1,1)                  1                      $0.4^2 = 0.16$
(1,2), (2,1), (2,2)    2                      $1 - 0.16 = 0.84$
Notice that we didn't need to do any complicated calculation for the last probability, because it was just 1 minus the others! Had we had to calculate it fully, then $P(\text{max} = 2) = 2(0.4)(0.6) + 0.6^2 = 0.84$.
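The grouping in the table can be reproduced by enumerating the samples and summing probabilities per value of the statistic. A minimal sketch, assuming the two coins are drawn independently (reasonable since the bag holds a large number of coins); the names are illustrative.

```python
from collections import defaultdict
from itertools import product

coin_probs = {1: 0.4, 2: 0.6}      # value of coin -> probability of drawing it

sampling_dist = defaultdict(float)
for sample in product(coin_probs, repeat=2):
    # Probability of this ordered sample, assuming independent draws.
    prob = 1.0
    for coin in sample:
        prob *= coin_probs[coin]
    # Group by the value of the statistic (here, the sample maximum).
    sampling_dist[max(sample)] += prob

for value, prob in sorted(sampling_dist.items()):
    print(value, round(prob, 2))
```

Swapping `max` for another function of the sample (e.g. the range or the mean) gives the sampling distribution of that statistic instead.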
On rare occasions you get a question asking for the sampling distribution where you don't actually have to do any calculation, but just have to consider what well-known distribution you get as the sample varies, e.g.:
A factory produces components. Each component has a unique identity number and it is assumed that 2% of the components are faulty. On a particular day, a quality control manager wishes to take a random sample of 50 components. A statistic $F$ represents the number of faulty components in the sample. Specify the sampling distribution of $F$.
We know a sampling distribution is the possible values of the statistic as we take different samples of 50 components. If the statistic is the count of faulty components, we can see this count varies Binomially between 0 and 50. Thus
$F \sim B(50, 0.02)$
7 Hypothesis Testing
If we had been asked for the closest values to 0.025 and 0.975, then we'd get different critical values instead.
It is vitally important you specify the probability of being in each tail to evidence that you have used the table, e.g. by writing out $P(X \le c_1)$ and $P(X \ge c_2)$ with their values.
For the critical region, don't forget to provide a lower limit or upper limit in the case of the Binomial Distribution, as the outcomes are finite (e.g. $0 \le X \le c_1$ or $c_2 \le X \le n$). Mark schemes usually condone the lack of these limits, but don't take any chances.
The actual level of significance is the actual probability of being in the critical region. You should have already written out the probabilities of being in each part of the critical region, so it's then just a case of adding the two probabilities.
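To illustrate how the critical region and the actual significance level fit together, here is a sketch for a hypothetical two-tailed test at the 5% level with $X \sim B(20, 0.3)$ under the null hypothesis. These numbers are made up for illustration and are not from the original example. It assumes the convention that each tail probability must be at most 2.5%; a question asking for the closest value to 0.025 would need a different rule.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def two_tailed_critical_region(n, p, alpha):
    """Return (c1, c2, lower_prob, upper_prob) with each tail <= alpha / 2."""
    tail = alpha / 2
    # Lower tail: largest c1 with P(X <= c1) <= tail.
    c1 = None
    for k in range(n + 1):
        if binom_cdf(k, n, p) <= tail:
            c1 = k
        else:
            break
    # Upper tail: smallest c2 with P(X >= c2) <= tail.
    c2 = None
    for k in range(n + 1):
        if 1 - binom_cdf(k - 1, n, p) <= tail:
            c2 = k
            break
    lower_prob = binom_cdf(c1, n, p) if c1 is not None else 0.0
    upper_prob = 1 - binom_cdf(c2 - 1, n, p) if c2 is not None else 0.0
    return c1, c2, lower_prob, upper_prob

c1, c2, lower_prob, upper_prob = two_tailed_critical_region(20, 0.3, 0.05)
actual_significance = lower_prob + upper_prob   # add the two tail probabilities
print(f"Critical region: 0 <= X <= {c1} or {c2} <= X <= 20")
print(round(lower_prob, 4), round(upper_prob, 4), round(actual_significance, 4))
```

Note that the actual significance level comes out below the nominal 5%, because the Binomial Distribution is discrete and the tails cannot hit 2.5% exactly.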
The mark scheme for a hypothesis test without a normal approximation is as follows:
a. Specifying $H_0$ and $H_1$ (1 mark)
b. Specifying the distribution for $X$ under the null hypothesis, e.g. $X \sim B(n, p)$ (1 mark). The $p$ or $\lambda$ will be your population parameter under the null hypothesis.
c. 2 marks for either: Determining the probability of the observed value or more extreme (e.g. $P(X \ge x)$ or $P(X \le x)$), or determining the critical region.
d.
Wordy/interpretation questions: