CH 8 Probability Distributions

Uploaded by

FJ
Edexcel GCSE (9 – 1)

Statistics
Mr M Dominguez
[email protected]
Chapter 8 Probability distributions
§ 8.1 Binomial distributions
A binomial expansion is what we get when a two-term expression is raised to a power.
a) Expand $(a + b)^0$: $1$
b) Expand $(a + b)^1$: $1a + 1b$
c) Expand $(a + b)^2$: $1a^2 + 2ab + 1b^2$
d) Expand $(a + b)^3$: $1a^3 + 3a^2b + 3ab^2 + 1b^3$
e) Expand $(a + b)^4$: $1a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + 1b^4$

What do you notice about:

The coefficients: They follow Pascal’s triangle.

The powers of a and b: The power of a decreases each time (starting at the power of the expansion). The power of b increases each time (starting at 0).
Quickfire Pascal
What coefficients in your expansion do you use if the power is:

2: 1 2 1
3: 1 3 3 1
4: 1 4 6 4 1
5: 1 5 10 10 5 1
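The rows quoted in the quickfire questions can be generated rather than memorised. The sketch below is one possible way to do it, assuming the usual rule that each inner entry is the sum of the two entries above it; the function name `pascal_row` is only illustrative.

```python
def pascal_row(power):
    """Return the row of Pascal's triangle used when expanding to this power."""
    row = [1]
    for _ in range(power):
        # Each inner entry is the sum of the two neighbouring entries above it.
        inner = [row[i] + row[i + 1] for i in range(len(row) - 1)]
        row = [1] + inner + [1]
    return row


for power in range(2, 6):
    print(power, pascal_row(power))
```

Calling `pascal_row(4)` gives the coefficients used for a power of 4.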
Step 1: Write the first term with decreasing powers.
Step 2: Write the second term with increasing powers, starting from 0
(i.e. so that 2y doesn’t appear in the first term of the expansion, because its power is 0).
Step 3: Add the coefficients according to Pascal’s Triangle.
* Remember to use brackets to remind yourself to raise each term to the correct power.

Expand $(x + 2y)^4$

$= 1x^4 + 4x^3(2y) + 6x^2(2y)^2 + 4x(2y)^3 + 1(2y)^4$

$= x^4 + 8x^3y + 24x^2y^2 + 32xy^3 + 16y^4$
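As a check on the three steps, here is a minimal sketch that builds the numerical coefficients of an expansion of the form (ax + by)^n. The helper name `expand_coefficients` is an assumption made for illustration; it uses Python's built-in `math.comb` for the Pascal's triangle entries.

```python
from math import comb


def expand_coefficients(a, b, n):
    """Coefficients of x^(n-k) y^k in (a*x + b*y)^n, for k = 0, 1, ..., n."""
    # Pascal coefficient times the matching powers of a and b.
    return [comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1)]


# (x + 2y)^4: a = 1, b = 2, n = 4
print(expand_coefficients(1, 2, 4))
```

The list printed is the set of coefficients of x^4, x^3y, x^2y^2, xy^3 and y^4 in that order.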


Find the coefficient of the term in the expansion of

We need a better, more efficient way of calculating the numbers in the 12th row of Pascal’s triangle.

We could think of the coefficient of each term as the number of ways of ordering its letters.
E.g. in the expansion of $(x + y)^2$ the coefficient of $xy$ is 2, as we have $xy$ and $yx$.

In the expansion of $(x + y)^4$ the coefficient of $xy^3$ can be found by writing down all the different ways of getting $xy^3$: $xyyy$, $yxyy$, $yyxy$, $yyyx$.

Hence the coefficient is 4. This can also be called 4 choose 1, as there are 4 positions and we are choosing the position of the one $x$.

But again this will take too long for


We can however think of this as 11 choose 7. Note that it does not matter whether we define it as 11 choose 7 or 11 choose 4, due to the symmetry of Pascal’s triangle.
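The symmetry can be checked directly. This short sketch uses Python's `math.comb` to compare 11 choose 7 with 11 choose 4 and to list the whole row for a power of 11 (which has 12 entries).

```python
from math import comb

# Choosing which 7 of the 11 positions to fill is the same as
# choosing which 4 positions to leave out.
print(comb(11, 7), comb(11, 4))

# The full row of Pascal's triangle for a power of 11.
row = [comb(11, r) for r in range(12)]
print(row)
```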
How are the rows of Pascal’s Triangle generated?
How many ways are there of choosing 0 items from 4? 4C0 = 1
How many ways are there of choosing 1 item from 4? 4C1 = 4
How many ways are there of choosing 2 items from 4? 4C2 = 6
How many ways are there of choosing 3 items from 4? 4C3 = 4
How many ways are there of choosing 4 items from 4? 4C4 = 1
Binomial Coefficients – n choose r
$\binom{n}{r}$ is known as a binomial coefficient. It can also be written as nCr
(said: “n choose r”).

$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$
Binomial Coefficients – non calculator
To calculate Binomial Coefficients easily:

$\binom{8}{2} = \frac{8!}{2!\,6!} = \frac{8 \times 7}{2!} = 28$

This works because when we divide 8! by 6!, we cancel out all the numbers between 1 and 6 in the product.
i.e. the bottom number of the binomial coefficient (2) tells us how many consecutive numbers we multiply together.
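The non-calculator shortcut can be written as a small routine. This is a sketch of that idea only: multiply r consecutive numbers counting down from n, then divide by r!. The function name `choose` is an illustrative choice.

```python
def choose(n, r):
    """n choose r: r consecutive numbers from n downwards, divided by r!."""
    top = 1
    bottom = 1
    for i in range(r):
        top *= n - i      # n, n - 1, ..., n - r + 1
        bottom *= i + 1   # 1, 2, ..., r
    return top // bottom  # the division is always exact


print(choose(8, 2))   # (8 x 7) / (1 x 2)
print(choose(11, 7))
```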
In general we can calculate any binomial expansion using the formula

$(1 + x)^n = \binom{n}{0} 1^n + \binom{n}{1} 1^{n-1} x^1 + \binom{n}{2} 1^{n-2} x^2 + \dots + \binom{n}{r} 1^{n-r} x^r + \dots$

$= 1 + nx + \frac{n(n - 1)}{2!} x^2 + \frac{n(n - 1)(n - 2)}{3!} x^3 + \dots$
Find the coefficient of the term in the expansion of


1) Find the expansion of $(x + 2y)^3$

2) Find the expansion of $(2x - 5)^4$

3) The coefficient of $x^2$ in the expansion of $(2 - cx)^3$ is 294. Find the value of c.


Using Binomial Expansions for approximations

Q7 Write down the first four terms in the expansion of $\left(1 - \frac{x}{10}\right)^6$.

By substituting an appropriate value for x, find an approximate value of $(0.99)^6$.
Use your calculator to determine the degree of accuracy of your approximation.

$1 - 0.6x + 0.15x^2 - 0.02x^3$

Substituting $x = 0.1$ gives 0.94148, which is accurate to 5 d.p.

Q8 Write down the first four terms in the expansion of $\left(2 + \frac{x}{5}\right)^{10}$.

By substituting an appropriate value of x, find an approximate value of $(2.1)^{10}$.
Use your calculator to determine your approximation’s degree of accuracy.

$1024 + 1024x + 460.8x^2 + 122.88x^3$

Substituting $x = 0.5$ gives 1666.56, which is accurate to 3 s.f.
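The two approximations can be checked numerically. The terms listed for Q7 and Q8 are consistent with expanding (1 - x/10)^6 and (2 + x/5)^10; the sketch below takes those two expressions as an assumption, rebuilds the first four terms and compares each approximation with the calculator value. The helper names are illustrative.

```python
from math import comb


def first_terms(a, b, n, count=4):
    """Coefficients of x^0 .. x^(count-1) in the expansion of (a + b*x)^n."""
    return [comb(n, k) * a ** (n - k) * b ** k for k in range(count)]


def evaluate(coefficients, x):
    """Substitute x into a truncated expansion."""
    return sum(c * x ** k for k, c in enumerate(coefficients))


q7 = first_terms(1, -0.1, 6)    # assumed expression: (1 - x/10)^6
q8 = first_terms(2, 0.2, 10)    # assumed expression: (2 + x/5)^10

approx_q7 = evaluate(q7, 0.1)   # 1 - 0.1/10 = 0.99
approx_q8 = evaluate(q8, 0.5)   # 2 + 0.5/5 = 2.1

print(approx_q7, 0.99 ** 6)
print(approx_q8, 2.1 ** 10)
```

Comparing the printed pairs shows how many decimal places or significant figures the truncated expansion gets right.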
Using Binomial Expansions for approximations

a) $1 + 5(-2x) + 10(-2x)^2 + 10(-2x)^3 = 1 - 10x + 40x^2 - 80x^3$

b) We discard the $x^2$ and $x^3$ terms above.
$(1 + x)(1 - 10x) = 1 - 10x + x - 10x^2 = 1 - 9x - 10x^2$
$\approx 1 - 9x$ (since we can discard the $x^2$ term again)
The Binomial Distribution - Probability

If events A and B are independent (the outcome of one does not affect the outcome of the other) then the probability that they both occur is found by multiplying the probabilities of each one.

If events A and B are mutually exclusive (there is no overlap) then the probability of A or B occurring is equal to the sum of the probabilities of A and B.
8.1 Binomial distributions
In this chapter we will need to be able to use two probability distributions: The
Binomial distribution and the Normal distribution.

A probability distribution is a list of all possible outcomes together with their probabilities. The Binomial and Normal distributions are used to make predictions about the outcomes of an experiment.
Probability distributions
You are already familiar with the concept of a variable in statistics: a collection of values (e.g. favourite colour of students in the room):

Outcome                                red    green    blue    orange
Probability that the outcome occurs    0.3    0.4      0.1     0.2

If each value is assigned a probability of occurring, it becomes a random variable.

A random variable represents a single experiment/trial. It consists of outcomes with a probability for each.

$P(X = x)$: “The probability that the outcome of the random variable $X$ was the specific outcome $x$.”

i.e. $X$ is a random variable (capital letter), but $x$ is a particular outcome.

A shorthand for $P(X = x)$ is $p(x)$ (note the lowercase $p$).

It’s like saying “the probability that the outcome of my coin throw was heads” ($P(X = \text{heads})$) vs “the probability of heads” ($p(\text{heads})$). In the latter the coin throw was implicit, so we can skip the ‘$X =$’.
The Binomial Distribution
The binomial distribution is a suitable model to calculate probabilities if these conditions are true:
• there is a fixed number of trials, $n$,
• the trials are independent of each other,
• there are two possible outcomes: ‘success’, with probability $p$, and ‘failure’, with probability $q$ ($p + q = 1$).
If all of the above are true we use the following notation: $X \sim B(n, p)$

To then calculate the probabilities we use the expansion of $(p + q)^n$.
At a table of 8 people, 6 people are left-handed.

a) Suggest a suitable model for a random variable $X$: the number of left-handed people in a group of 8, where the probability of being left-handed is 0.1.
b) Find the probability that 6 people are left-handed.
c) Suggest why the chosen model may not have been appropriate.

a) $X \sim B(8, 0.1)$
b) $P(X = 6) = \binom{8}{6} \times 0.1^6 \times 0.9^2 \approx 0.0000227$
c) In using a Binomial distribution, we assumed that each person being left-handed is independent of the others. However, left-handedness is partially genetic and many people at the table were from the same family.
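The left-handedness example can be worked through in code. This is a minimal sketch, assuming the model B(8, 0.1) implied by the question (8 people, probability 0.1 each); `binomial_pmf` is an illustrative helper name, not a library function.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) when X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# Probability that exactly 6 of the 8 people are left-handed.
p_six = binomial_pmf(6, 8, 0.1)
print(p_six)

# The probabilities of 0, 1, ..., 8 left-handed people should total 1.
total = sum(binomial_pmf(r, 8, 0.1) for r in range(9))
print(total)
```

The very small value of `p_six` is what makes 6 left-handed people at one table so surprising under the independence assumption.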
Is it Binomially Distributed?
Is a Binomial Distribution appropriate as a model?

Situations:
A. Some number out of 8 people being left-handed.
B. Number of throws of a die until a 6 is obtained.
C. Number of girls in a family of 4 children.
D. The number of red balls selected when 3 balls are drawn from a bag of 15 white and 5 red balls.

1. We have a fixed number of trials.
A: Yes.
B: No, not fixed. This is known as a ‘Geometric Distribution’ (which we won’t cover).
C: Yes.
D: Yes.

2. Each trial has two possibilities, “success” and “failure”.
A: Yes. B: Yes. C: Yes. D: Yes.

3. The trials are independent.
A: Usually. But in my story, genetics has an influence on handedness.
B: Yes.
C: Technically the probability of having a girl increases if you previously had a girl, and vice versa. But the probability is still close to 0.5, so a Binomial Distribution is appropriate.
D: Only if balls are drawn with replacement.

4. There is a fixed probability of success in each trial.
A: Yes. B: Yes. C: Yes.
D: Only if balls are drawn with replacement.
$X \sim B(n, p)$

$X$: the random variable, e.g. the number of ones obtained when rolling a die.
$n$: how many trials, e.g. how many times I am going to roll my die.
$p$: the success probability on each trial, e.g. the probability of getting a one.

$P(X = r) = {}^{n}C_{r} \times p^{r} (1 - p)^{n - r}$

This is the probability of $r$ successes, e.g. the probability of getting a one $r$ times after $n$ tries.

Quickfire Questions
Show the calculation required to find the indicated probability given the distribution.

?
What is ?

What is ? ?

2. I have a bag of 2 red and 8 white balls. $X$ represents the number of red balls I chose after 5 selections (with replacement).

a) How is $X$ distributed?

b) Determine the probability that I chose 3 red balls.
1. The random variable … Find …

2. The random variable … Find …

3. A balloon manufacturer claims that 95% of his balloons will not burst when blown up. You have 20 balloons.
What is the probability that none of them burst?
What is the probability that exactly 2 burst?

4. A student suggests using a binomial distribution to model the following situations. Give a description of the random variable, state any assumptions that must be made and give possible values for $n$ and $p$.

a) A sample of 20 bolts is checked for defects from a large batch. The production process should produce 1% of defective bolts.
$X$ is the number of defective bolts in the sample: $X \sim B(20, 0.01)$, assuming bolts being defective are independent of each other.

b) Some traffic lights have three phases: stop 48% of the time, wait or get ready 4% of the time and go 48% of the time. Assuming that you only cross a traffic light when it is in the go position, model the number of times that you have to wait or stop on a journey passing through 6 sets of traffic lights.
$X$ is the number of lights at which you wait or stop: $X \sim B(6, 0.52)$, assuming lights operate independently.

c) When Stephanie plays tennis with Tim, on average one in eight of her serves is an ‘ace’. How many ‘aces’ does Stephanie serve in the next 30 serves against Tim?
$X$ is the number of aces: $X \sim B(30, \frac{1}{8})$, assuming serves are independent and the probability of an ace is constant.
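For the balloon question, one way to set up the calculation is sketched below. It assumes that each balloon bursts independently with probability 0.05 (from the 95% claim), so the number that burst is modelled as B(20, 0.05); `binomial_pmf` is an illustrative helper.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) when X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n_balloons = 20
p_burst = 0.05   # assumed from the manufacturer's 95% claim

p_none_burst = binomial_pmf(0, n_balloons, p_burst)
p_two_burst = binomial_pmf(2, n_balloons, p_burst)

print(round(p_none_burst, 4))
print(round(p_two_burst, 4))
```

The same function handles the bolts, traffic lights and tennis situations once n and p have been chosen.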
§ 8.2 Normal distributions
The Normal distribution is a suitable model to calculate probabilities if
• The data is continuous
• The distribution is symmetrical
Remember that if the data is symmetrical then mean = median = mode.
Conversely if the data is skewed the normal distribution is not suitable. (The
Poisson distribution can be used)

The normal distribution is defined by its mean $\mu$ and standard deviation $\sigma$. For a probability distribution $X$ with mean $\mu$ and standard deviation $\sigma$ we can use the notation

$X \sim N(\mu, \sigma^2)$

$\mu$: the mean of the population.
$\sigma^2$: the standard deviation squared, which is also called the variance.
There are a few very important standard results and probabilities that you are
expected to memorise.
68% of the observations lie within one standard deviation of the mean.
95% of the observations lie within two standard deviations of the mean.
99.8% (virtually all) the observations lie within three standard deviations of the mean
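These results can be compared with the exact normal model. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The percentages to memorise are rounded working values; in particular the exact figure for three standard deviations is about 99.7%, so 99.8% should be read as the convention used here rather than an exact result.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)


def within(k):
    """Probability of lying within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * within(k), 1))
```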

Sketching the Normal distribution

Key point: The area underneath the graph is the probability. Hence the area under every graph must add up to one.

The smaller the standard deviation, the taller the ‘peak’ will be.
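1. Students take two maths papers, paper A and paper B. The summary statistics are:

           Mean mark    Standard deviation
Paper A    52           12
Paper B    48           8

a) On the same set of axes sketch the above distributions.
Paper A runs from 16 to 88 and Paper B from 24 to 72 (three standard deviations either side of the mean). Paper B is taller as it has the smaller standard deviation.
b) The pass mark for paper A was 28. What percentage of students achieved a pass in paper A?
28 is two standard deviations below the mean. Hence 97.5%.
c) What percentage of students achieved between 40 and 56 marks in paper B?
40 is one standard deviation below the mean and 56 is one standard deviation above it. Hence 68%.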
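The answers to the paper A and paper B question can be compared with exact normal probabilities. This is a sketch using `statistics.NormalDist` (Python 3.8 or later) with the means and standard deviations from the table; the exact values differ slightly from the rounded rule-of-thumb percentages.

```python
from statistics import NormalDist

paper_a = NormalDist(mu=52, sigma=12)
paper_b = NormalDist(mu=48, sigma=8)

# b) Proportion of paper A marks at or above the pass mark of 28.
pass_a = 1 - paper_a.cdf(28)

# c) Proportion of paper B marks between 40 and 56.
between_b = paper_b.cdf(56) - paper_b.cdf(40)

# a) Height of each curve at its mean: the smaller spread gives the taller peak.
peak_a = paper_a.pdf(52)
peak_b = paper_b.pdf(48)

print(round(100 * pass_a, 1), round(100 * between_b, 1))
print(peak_a < peak_b)
```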
The mean and standard deviation of two maths papers are shown in the table below.

           Mean    Standard deviation
Paper A    65      5
Paper B    55      10

a) On the same set of axes sketch the above distributions.
Paper A runs from 50 to 80 and Paper B from 25 to 85.
b) The pass mark for paper A was 55. What percentage of people passed? 97.5%
c) What percentage of people who sat paper B achieved more than 65 marks? 16%
d) If we wanted only 84% of people to pass paper B, what should the pass mark be? 45
2. One month, the mean height of applicants for the Royal Marines was 1.78 m. The standard deviation of their heights was 6 cm.
a) Estimate the height of
(i) the shortest applicant: $\mu - 3\sigma = 1.78 - 0.18 = 1.60$ m
(ii) the tallest applicant: $\mu + 3\sigma = 1.78 + 0.18 = 1.96$ m

b) Work out an estimate of the heights within which 95% of the applicants are likely to be.
95% is $\mu \pm 2\sigma$, hence 1.66 m to 1.90 m.
3. Students sat two examination papers. Paper 1 had a mean mark of 52 and a standard deviation of 9.5. Paper 2 had a mean mark of 60 and a standard deviation of 2.5.

Julia achieved a mark of 70 in one of the papers. Using a sketch of the distributions or otherwise, explain in which of the papers this mark is more likely to have been achieved.

Paper 1: three standard deviations above the mean is 80.5, so 70 is likely.
Paper 2: three standard deviations above the mean is 67.5, so 70 is unlikely.
Hence Paper 1, as 70 is less than 3 standard deviations away from the mean.

Then go to page 349 and 352 etc and answer all questions.
§ 8.3 Standardised scores
Student    English Test Score    Art Test Score
Abby       30                    45
Max        25                    31
Sarah      45                    24
Joe        15                    23
Claire     50                    19
Lee        20                    39
Jan        42                    30
Matthew    17                    42
Aimee      38                    47
Ethan      41                    28

The table shows the scores (out of 50) of 10 students for an English test and an art test.

Why is it unwise to simply compare the raw scores for each student?

The results for the English test and Art test are symmetrical. We can therefore model the two sets of data using the normal distribution.

When we are asked to compare two sets of data (most often exam data) we will be expected to compare standardised scores. It may of course be possible to compare the sets of data without standardising the scores.

The advantages of standardised scores are:
• They are unbiased, as no human interpretation is required.
• They can be used to compare data sets of any range and value (large or small).

A standardised score has a mean of zero and a standard deviation of one. We can therefore standardise a score by ‘shifting’ each piece of data down by the mean and ‘scaling’ the size of the data by the standard deviation.

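Standardised score, $z = \frac{x - \mu}{\sigma}$

We may be required to first calculate the mean $\mu$ and standard deviation $\sigma$.

$\mu = \frac{\sum x}{n}$

$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$

For a frequency table

$\mu = \frac{\sum fx}{\sum f}$

$\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$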
In order to be considered for a place on a mechanics course at a local college,
Wing and Mia took tests in English and Mathematics. Each test had a maximum
mark of 100.
The table shows some information about the tests.
                    Wing    Mia    Overall mean    Standard deviation
English mark        48      55     50              10
Mathematics mark    59      51     55              8

(a) Calculate Wing’s standardised scores in
(i) English: $\frac{48 - 50}{10} = -0.2$
(ii) Mathematics: $\frac{59 - 55}{8} = 0.5$

Mia’s standardised score in English is 0.5 and in Mathematics it is –0.5.

(b) What is meant by a negative standardised score?
Their mark was below the mean for the class.

(c) Who do you think did best overall? Give a reason for your answer.
Wing, as their total standardised score in English and Mathematics was higher than Mia’s. They both achieved 0.5 in one exam, and in the other Wing’s standardised score of -0.2 is better than Mia’s -0.5.
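The comparison between Wing and Mia can be laid out as a short calculation. This is a sketch that applies z = (x - mean) / standard deviation to the marks in the table; the dictionary layout and the helper name `standardised_score` are illustrative choices.

```python
def standardised_score(mark, mean, sd):
    """z = (x - mean) / standard deviation."""
    return (mark - mean) / sd


# (overall mean, standard deviation) for each test, from the table.
tests = {"English": (50, 10), "Mathematics": (55, 8)}

marks = {
    "Wing": {"English": 48, "Mathematics": 59},
    "Mia": {"English": 55, "Mathematics": 51},
}

scores = {
    name: {subject: standardised_score(mark, *tests[subject])
           for subject, mark in results.items()}
    for name, results in marks.items()
}
totals = {name: sum(z.values()) for name, z in scores.items()}

print(scores)
print(totals)
```

The totals make the argument in part (c) explicit: Wing's combined standardised score is higher than Mia's.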
Student    English Test    Art Test     English Standardised    Art Standardised
           Score (x)       Score (y)    Score                   Score
Abby       30              45           -0.18                   1.24
Max        25              31           -0.58                   -0.18
Sarah      45              24           1.01                    -0.89
Joe        15              23           -1.38                   -1.00
Claire     50              19           1.41                    -1.40
Lee        20              39           -0.98                   0.63
Jan        42              30           0.77                    -0.29
Matthew    17              42           -1.22                   0.94
Aimee      38              47           0.45                    1.44
Ethan      41              28           0.69                    -0.49

1) For each subject, calculate the mean and standard deviation of the test scores.
2) For each student, calculate their standardised scores for each subject. (2 d.p)
3) Decide which subject each student did better in.
4) Which student(s) did the best overall and why?
Given that:

      English    Art
μ     32.3       32.8
σ     11.9       9.3
Student    English Test    Art Test    English    Art        English Standardised    Art Standardised
           Score           Score       Squared    Squared    Score                   Score
Abby       30              45
Max        25              31
Sarah      45              24
Joe        15              23
Claire     50              19
Lee        20              39
Jan        42              30
Matthew    17              42
Aimee      38              47
Ethan      41              28
Student    English Test    Art Test    English    Art        English Standardised    Art Standardised
           Score           Score       Squared    Squared    Score                   Score
Abby       30              45          900        2025       -0.183                  1.240
Max        25              31          625        961        -0.581                  -0.183
Sarah      45              24          2025       576        1.011                   -0.894
Joe        15              23          225        529        -1.377                  -0.996
Claire     50              19          2500       361        1.409                   -1.402
Lee        20              39          400        1521       -0.979                  0.630
Jan        42              30          1764       900        0.772                   -0.285
Matthew    17              42          289        1764       -1.218                  0.935
Aimee      38              47          1444       2209       0.454                   1.443
Ethan      41              28          1681       784        0.693                   -0.488
Total      323             328         11853      11630
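The totals in the table are enough to recompute the means and standard deviations. The sketch below does this from the raw scores. One observation, offered as a check rather than a correction: the formula with n in the denominator gives standard deviations of about 11.9 and 9.3, while the standardised scores printed in the table (for example -0.183 and 1.240 for Abby) appear to match division by the sample standard deviation (n - 1 in the denominator, about 12.56 and 9.84). Both versions are computed so the difference can be seen.

```python
from math import sqrt

english = [30, 25, 45, 15, 50, 20, 42, 17, 38, 41]
art = [45, 31, 24, 23, 19, 39, 30, 42, 47, 28]


def mean(values):
    return sum(values) / len(values)


def population_sd(values):
    """sqrt(sum of x^2 / n - mean^2): divides by n."""
    n = len(values)
    return sqrt(sum(v * v for v in values) / n - mean(values) ** 2)


def sample_sd(values):
    """Divides the squared deviations by n - 1 instead of n."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


for name, data in (("English", english), ("Art", art)):
    print(name, round(mean(data), 1),
          round(population_sd(data), 2), round(sample_sd(data), 2))

# Abby's English score standardised both ways.
abby_population = (30 - mean(english)) / population_sd(english)
abby_sample = (30 - mean(english)) / sample_sd(english)
print(round(abby_population, 3), round(abby_sample, 3))
```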
§ 8.4 Quality assurance and control charts
Quality Assurance:
• Collecting a sample
• Comparing the sample with the required standards, (mean and range)
• Actions may then need to be taken if the limits are not met.

Why is it important that we look at both the mean and the range of the sample?

Samples taken at regular intervals can be plotted on a control chart along with warning and action limits.
If the sample is between the warning limits, the product is acceptable.
If the sample is between the warning and action limits, another sample should be taken.
If the sample is outside the action limits, production is stopped.

95% of the samples should lie between the two warning limits, so 1 in 20 will fall outside the warning limits.
Assuming that the samples are normally distributed, the warning limits will be two standard deviations from the mean.
Upper warning limit = mean + 2 standard deviations
Lower warning limit = mean − 2 standard deviations

99.8% (almost all) of the samples should lie between the two action limits, so 2 in 1000 will fall outside the action limits.
Upper action limit = mean + 3 standard deviations
Lower action limit = mean − 3 standard deviations

You do not need to know how to calculate warning and action limits for a control chart showing the range.
Sample      1        2        3        4        5        6        7        8
Mass (g)    88.24    90.18    89.88    90.04    90.07    90.12    89.82    90.04
            90.04    90.25    94.35    90.00    90.01    89.78    89.86    98.81
            89.46    89.66    93.52    89.59    90.06    87.08    90.28    88.92
            90.43    89.50    92.43    89.48    89.93    90.10    93.80    90.03
Mean        89.54    89.90    92.55    89.78    90.02    89.27    90.94    91.95
Range       2.19     0.75     4.47     0.56     0.14     3.04     3.98     9.89
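The mean and range rows can be recomputed from the four masses listed for each sample, which is a quick check on the summary rows before plotting them on a control chart. The sketch below assumes the masses are read down each column of the table, one column per sample.

```python
# Four masses (g) per sample, read down each column of the table.
samples = {
    1: [88.24, 90.04, 89.46, 90.43],
    2: [90.18, 90.25, 89.66, 89.50],
    3: [89.88, 94.35, 93.52, 92.43],
    4: [90.04, 90.00, 89.59, 89.48],
    5: [90.07, 90.01, 90.06, 89.93],
    6: [90.12, 89.78, 87.08, 90.10],
    7: [89.82, 89.86, 90.28, 93.80],
    8: [90.04, 98.81, 88.92, 90.03],
}

summary = {}
for number, masses in samples.items():
    sample_mean = sum(masses) / len(masses)
    sample_range = max(masses) - min(masses)
    summary[number] = (sample_mean, sample_range)
    print(number, f"{sample_mean:.2f}", f"{sample_range:.2f}")
```

Each pair in `summary` gives the two values that would be plotted for that sample: the mean on the chart for means and the range on the chart for ranges.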
