Chapter 6 S 1
Chapter No.: 6
Class: BSc
Subject: Statistics-1
Important Points:
6.1) The Random Variable:
A numerical quantity, whose value is determined by the outcome of a random
experiment, is called a Random Variable.
Random variables come in two types: discrete and continuous. Discrete random
variables are those associated with count data, such as the score on a die, while
continuous random variables refer to measurable data, such as height, weight and
time.
x 1 2 3 4 5 6
$\sigma^2 = V(X) = \sum x^2 P(x) - \left(\sum x P(x)\right)^2$
Or
$\sigma^2 = V(X) = E(X^2) - \left(E(X)\right)^2$
Or
𝑵
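As a minimal sketch of how the variance formula E(X²) − (E(X))² can be applied, the Python snippet below computes the mean and variance of a discrete distribution with exact fractions. The function name `mean_and_variance` and the fair-die input are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """Return (E(X), V(X)) for a discrete distribution given as {x: P(x)}."""
    if sum(dist.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in dist.items())         # E(X)   = sum of x P(x)
    mean_sq = sum(x * x * p for x, p in dist.items())  # E(X^2) = sum of x^2 P(x)
    return mean, mean_sq - mean ** 2                   # V(X) = E(X^2) - (E(X))^2

# Assumed example: the score on a fair die, each face with probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
die_mean, die_var = mean_and_variance(die)
print(die_mean, die_var)  # 7/2 35/12
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with answers given as fractions.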
The Normal Distribution is a two-parameter distribution, the parameters being μ, its mean,
and σ, its standard deviation. The practical implication of this link is beyond the scope of
this course, so it will be sufficient for you to refer to the random variable X as having a
normal distribution by using the notation:
$X \sim N(\mu, \sigma^2)$
If you want to evaluate the probabilities associated with this distribution it is necessary to
evaluate the areas under the normal curve.
Since any normal random variable X can be transformed into a standard normal variable
using Z = (X − μ)/σ, once probabilities for the standard normal variable can be found, this
property can be used to evaluate probabilities associated with any normal distribution.
If, for example, you wish to evaluate P(12 < X < 16) where X ~ N(10, 4²), then using
$Z = \frac{X - 10}{4}$ gives:
$P(12 < X < 16) = P\left(\frac{12 - 10}{4} < Z < \frac{16 - 10}{4}\right) = P(0.5 < Z < 1.5)$
To evaluate such a probability it is necessary to introduce a special symbol, Φ(z),
where z is a particular value of Z and Φ(z) represents the area to the left of that z
value. Thus Φ(1.5) represents the probability P(Z < 1.5) and is equal to the area under
the standard normal curve between −∞ and 1.5. Hence
$P(12 < X < 16) = P(0.5 < Z < 1.5) = \Phi(1.5) - \Phi(0.5)$
Evaluating Φ(z) directly is very difficult but, fortunately, values of Φ(z) have been
extensively tabulated and you will find them in Table 4 (it will be given in past papers).
Thus if you now turn to this table and evaluate
$= \Phi(1.5) - \Phi(0.5)$
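The table lookup for P(12 < X < 16) with X ~ N(10, 4²) can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) for the area to the left of z; the helper name `prob_between` is an assumption made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), found by standardising to Z."""
    z_low = (a - mu) / sigma    # (12 - 10) / 4 = 0.5 in the worked example
    z_high = (b - mu) / sigma   # (16 - 10) / 4 = 1.5 in the worked example
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

p = prob_between(12, 16, mu=10, sigma=4)
print(round(p, 4))  # 0.2417
```

This agrees with the four-figure table values 0.9332 − 0.6915 = 0.2417.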
(-3.30, 0.00) you need to remember that any normal distribution is symmetrical about its
mean and in particular the Standard Normal Distribution is symmetrical about zero.
Keeping this in view, we may convert the situation discussed above into seven cases,
which are as follows:
$P(\bar{X} \le x) = P\left(Z \le \frac{x - \mu}{\sigma / \sqrt{n}}\right)$
as n → ∞.
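A minimal sketch of this result: the sample mean is standardised with the standard error σ/√n rather than σ. The function name `prob_sample_mean_at_most` and the numbers (μ = 50, σ = 10, n = 25, x = 52) are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_at_most(x, mu, sigma, n):
    """Approximate P(sample mean <= x) for a sample of size n."""
    standard_error = sigma / sqrt(n)   # standard deviation of the sample mean
    z = (x - mu) / standard_error      # standardised value
    return NormalDist().cdf(z)         # area to the left of z

# Hypothetical inputs: z = (52 - 50) / (10 / 5) = 1
p_mean = prob_sample_mean_at_most(52, mu=50, sigma=10, n=25)
print(round(p_mean, 4))  # 0.8413
```

For a normal population the result is exact for any n; otherwise it is an approximation that improves as n grows.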
Solution:
Since the population is normally distributed, the sampling distribution of $\bar{X}$ is exactly
$N\left(25, \frac{9}{16}\right)$. If Z is the standard normal random variable, then
= 1 − Φ(1.33)
= 1 − 0.9082
= 0.0918
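The final arithmetic of this solution can be checked without tables. The sketch below only verifies the upper-tail area beyond z = 1.33 and the standard deviation implied by the variance 9/16; it does not assume anything about the part of the question that is not shown here.

```python
from math import sqrt
from statistics import NormalDist

sampling_sd = sqrt(9 / 16)         # standard deviation of the sample mean
phi_133 = NormalDist().cdf(1.33)   # area to the left of z = 1.33
upper_tail = 1 - phi_133           # area to the right of z = 1.33

print(sampling_sd)                                # 0.75
print(round(phi_133, 4), round(upper_tail, 4))    # 0.9082 0.0918
```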
Exercise:
Find the mean and variance of each of the following distributions of X:
Q 1)
x:         1    2    3
P(X = x):  1/3  1/2  1/6
Q 2)
x:         -1   0    1
P(X = x):  1/4  1/2  1/4
Q 3)
x:         -2   -1   1    2
P(X = x):  1/3  1/3  1/6  1/6
Q 5) A fair coin is tossed two times and the random variable H represents the number of
heads recorded.
(a) Find the distribution of H. (b) Write down E(H).
(c) Find V (H).
Q 9) The random variable X has a normal distribution with mean 16 and variance 0.64. Find
x such that P(X < x) = 0.025.
Q 10) Tyre pressures on a certain type of car independently follow a normal distribution with
mean 1.9 bars and standard deviation 0.15 bars. Safety regulations state that the pressures
must be between 1.9 − b bars and 1.9 + b bars. It is known that 80% of tyres are within these
safety limits. Find the safety limits.
Q 11) Suppose that a population of men’s heights is normally distributed with a mean of 68
inches, and standard deviation of 3 inches. Find the proportion of men who are:
a) under 66 inches
b) over 72 inches
c) between 66 and 72 inches.
Q 12) Two statisticians disagree about the distribution of IQ scores for a population under
study. Both agree that the distribution is normal, and that the standard deviation is 15, but A
says that 5% of the population have IQ scores greater than 134.6735, whereas B says that
10% of the population have IQ scores greater than 109.224. What is the difference between
the mean IQ score as assessed by A and that as assessed by B?
Q 14) The following six observations give the time taken, in seconds, to complete a 100-metre
sprint by all six individuals competing in the race, hence this is population data.
Measurements of a certain characteristic have a standard normal distribution. What can you
say about an individual’s position (with respect to this characteristic) in the population if their
score is
x can take values 1, 2, 3 and 5 with probability 0.1, 0.3, 0.4 and 0.2 respectively. Find E (X)
Measurements of a certain characteristic have a standard normal distribution. What can you
say about an individual’s position (with respect to this characteristic) in the population if their
z score is
x can take values 1, 4, 5 and 6 with probability 0.2, 0.4, 0.3 and 0.1 respectively. Find E (X)
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply)
i) The chance that a normally distributed statistic is less than two standard deviations
from its mean is greater than 99%.
Find E(X)
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply)
i) The chance that a normally distributed statistic is less than two standard deviations
from its mean is greater than 95%.
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply)
iv) The larger a sample, the larger the variation in its sample mean.
Three balls are thrown at random into five bowls so that each ball has the same chance of
going into any bowl independently of wherever the other 2 balls fall. Determine the
probability distribution of the number of empty bowls.
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply)
iv) The larger a sample, the smaller the variation in its sample mean.
In an examination the scores of students who attend schools of type A are normally distributed
about a mean of 55 with a standard deviation of 6. The scores of students who attend type B
schools are normally distributed about a mean of 60 with a standard deviation of 5. Which
type of school would have a higher proportion of students with marks above 70?
A charity believes that when it puts out an appeal for charitable donations the donations it
receives will be normally distributed with a mean of £50 and a standard deviation of £6, and
it is assumed that donations will be independent of each other.
i) Find the probability that the first donation it receives will be greater than £40.
ii) Find the probability that it will be between £55 and £60.
iii) Find the value x such that 5% of donations are more than £x.
iv) Find the probability that the first donation is at least £3 more than the second.
A charity believes that when it puts out an appeal for charitable donations the donations it
receives will be normally distributed with a mean of £50 and a standard deviation of £6, and
it is assumed that donations will be independent of each other.
i) Find the probability that the first donation it receives will be less than £40.
ii) Find the probability that it will be between £40 and £45.
iii) Find the value x such that 5% of donations are more than £x.
iv) Find the probability that the first donation is at least £3 more than the second.
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply)
i) When using a large random sample, we cannot assume that its mean forms part of a
normal distribution.
Assume that the number of weekly study hours for students at a certain university is
approximately normally distributed with a mean of 22 and a standard deviation of 6.
i) Find the probability that a randomly chosen student studies less than 12 hours.
ii) Estimate the percentage of students that study more than 37 hours.
iii) A certain lecture group consists of 225 students. You may assume that this class group
forms a simple random sample from the students in the university. Find the probability
that the average number of study hours for this group is between 21 and 23 hours.
State whether each of these statements is true or false, giving brief reasons why this is so
(Note that no marks will be awarded for a simple true/false reply).
i) When using a large random sample, the distribution of the sample mean may be
regarded as normal due to the central limit theorem.
Assume that the number of weekly study hours for students at a certain university is
approximately normally distributed with a mean of 20 and a standard deviation of 8.
i) Find the probability that a randomly chosen student studies less than 10 hours.
ii) Estimate the percentage of students that study more than 35 hours.
iii) A certain lecture group consists of 100 students. You may assume that this class group
forms a simple random sample from the students in the university. Find the probability
that the average number of study hours for this group is between 19 and 21 hours.
State whether the following are possible or not and give a brief explanation. (Note that no
marks will be awarded for a simple possible/not possible reply).
i) Random sampling causes high variability in estimates.
i 1 2 3 4
X -1 0 +1 +2
Calculate: E(X)
The starting annual salaries for students graduating from two departments, X and Y, of a
university are being investigated. Two random samples of last year's intake have been selected
and the results are as follows:
i) What proportion of new graduates from Department X earn more than $22,000 per month?
State whether the following are possible or not. Give a brief explanation. (Note that no marks
will be awarded for a simple possible/not possible reply)
X -1 0 +1 +2
Calculate: E(X)
The starting annual salaries for students graduating from two departments, X and Y, of a
university are being investigated. Two random samples of last year's intake have been selected
and the results are as follows:
What proportion of new graduates from Department X earn more than $22,000 per month?
Assume that the marks of students at a certain university are normally distributed with mean
52 and variance 100. Consider a randomly chosen student from that university and find the
probability
i) of failing the class (pass mark is 34).
ii) of obtaining a mark between 60 and 70.
Assume that the marks of students at a certain university are normally distributed with mean
55 and variance 81. Consider a randomly chosen student from that university and find the
probability
i) of getting a first (70 or above).
ii) of obtaining a mark between 50 and 60.
A test is taken by some students, their marks are recorded and we are interested in the
properties of the sample mean. Under the assumption that the marks follow a Normal
distribution with exact mean 60 and variance 81, calculate the probability that the mark of a
randomly selected student
A test is taken by some students, their marks are recorded and we are interested in the
properties of the sample mean. Under the assumption that the marks follow a Normal
distribution with exact mean 65 and variance 144, calculate the probability that the mark of a
randomly selected student
In country B it is also normally distributed but with a mean of £960 per week and a standard
deviation of £200 per week.
Which country has a higher proportion of households spending less than £800?
Weekly household expenditure in country A is normally distributed with a mean of $300 per
week and a standard deviation of $100 per week.
In country B it is also normally distributed but with a mean of $240 per week and a standard
deviation of $50 per week.
Which country has a higher proportion of households spending less than $200?
State whether the following are true or false and give a brief explanation. (Note that no marks
will be awarded for a simple true/false answer.)
i) The chance that a normal random variable is less than one standard deviation from its
mean is 95%.
State whether the following are true or false and give a brief explanation. (Note that no marks
will be awarded for a simple true/false answer.)
i) The chance that a normal random variable is less than two standard deviations from its
mean is 99%.
State whether the following are true or false and give a brief explanation (no marks will be
awarded for a simple true/false answer).
Answers:
Q 1) E(X) = 11/6   V(X) = 17/36
Q 4) a)
m:        -5    -4    -3    -2    -1    0     1     2     3     4     5
P(M=m):   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
b) E(M) = 0 c) V(M) = 35⁄6
Q 7) a) 1.64 b) 0.31
Q 9) 14.432
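As a hedged cross-check of the answer to Q 9 (X normal with mean 16 and variance 0.64, P(X < x) = 0.025), the sketch below inverts the normal distribution function with `NormalDist.inv_cdf` from the Python standard library. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# The variance is 0.64, so the standard deviation is sqrt(0.64) = 0.8.
x_dist = NormalDist(mu=16, sigma=sqrt(0.64))

# Value x with area 0.025 to its left; equivalent to 16 - 1.96 * 0.8.
x_cut = x_dist.inv_cdf(0.025)
print(round(x_cut, 3))  # 14.432
```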
(ii)
(iii)
Q 15) 1.96 is above the mean, 0.3 is very close to the mean, and -5.5 is well below the mean.
Q 17) 1.66 is above the mean, 6.5 is well above the mean, and -0.5 is very close to the mean.
Q 20) False, because the probability of lying within two standard deviations of the mean is about 95.45%.
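The 95.45% figure quoted in Q 20 can be reproduced numerically. The sketch below computes the probability that a normal variable lies within k standard deviations of its mean for k = 1, 2, 3; the function name `prob_within` is an assumption made for illustration.

```python
from statistics import NormalDist

def prob_within(k):
    """P(-k < Z < k) for a standard normal variable Z."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Because any normal variable can be standardised, these proportions hold for every normal distribution, whatever its mean and standard deviation.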
Q 25) False: the larger the sample size, the smaller the variation of the sample mean.
Q 27)
No. of empty bowls:  2      3      4
Probability:         12/25  12/25  1/25
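The distribution in Q 27 (three balls thrown independently into five equally likely bowls) can be confirmed by brute-force enumeration of all 5³ = 125 equally likely placements. This is only a sketch; the names `BOWLS`, `BALLS` and `empty_dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

BOWLS, BALLS = 5, 3

# Each placement is a tuple giving the bowl chosen by each ball.
# The number of empty bowls is 5 minus the number of distinct bowls used.
counts = Counter(
    BOWLS - len(set(placement))
    for placement in product(range(BOWLS), repeat=BALLS)
)

total = BOWLS ** BALLS  # 125 equally likely outcomes
empty_dist = {k: Fraction(v, total) for k, v in sorted(counts.items())}
for k, p in empty_dist.items():
    print(k, p)
# 2 12/25
# 3 12/25
# 4 1/25
```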
Q 28) True: the larger the sample size, the smaller the variation of the sample mean.
Q 32) False, because when using a large random sample we can assume that its mean forms
part of a normal distribution, by the Central Limit Theorem.
Q 36) There is no reason why random sampling should have high or low variability in its
estimates. But, in general, the lower the sample size n, the higher the variability, and so the
statement is 'possible'.
Q 39) This is indeed possible; an example is 'we know that the mean of sample
means will be unbiased if we sample randomly'.
Q 46) (i)
Q 49) (i)
Q 53) 1.7
Q 55) 2.9
Q 58) 5.2
(ii) The researcher's statement is justified since we can apply the central limit
theorem given the large sample size n.
Q 63) False; a normal distribution is symmetric about its mean, hence P(X < 5) = 0.5.
(ii) The researcher's statement is justified since we can apply the central limit
theorem given the large sample size n.
Q 66) False; a normal distribution is symmetric about its mean, hence P(X ≥ 7) = 0.5.
Q 67) True; a normal distribution is symmetric about its mean, hence P(X ≤ 3) = 0.5.
Q 70) True; a normal distribution is symmetric about its mean, hence P(X ≥ 8) = 0.5.
The End
Good Luck