Unit 4 - Probability and Statistics

Uploaded by

Bharat

1.1. Measures of Central Tendency


1.1.1. The Arithmetic Mean
Introduction:
This chapter describes the most commonly used average, the arithmetic mean. It is initially
defined in words with an accompanying simple example, then some important notation for
describing sets of data is given. Techniques for calculating the mean for (discrete) sets of data and
frequency distributions are then demonstrated and the place of so-called “weighted” means is
shown.

Definition of the Arithmetic Mean:

The ARITHMETIC MEAN of a set of values is defined as “the sum of the values” divided
by “the number of values”. The arithmetic mean is normally abbreviated to just the “mean”.

Example 1: (ARITHMETIC MEAN for a SET)


(a) If a firm received orders worth £151, £52 and £280 in three consecutive months, the
mean value of orders per month would be calculated as:
$$\frac{\pounds(151 + 52 + 280)}{3} = \frac{\pounds 483}{3} = \pounds 161$$
(b) The mean of the values 12, 8, 25, 26 and 10 is calculated as:
$$\frac{12 + 8 + 25 + 26 + 10}{5} = \frac{81}{5} = 16.2$$

Formula for the mean of a set of values:


The Arithmetic Mean is the most commonly used average and is defined (for a set of values) as
follows:
$$\text{Arithmetic mean} = \frac{\text{the sum of all the values}}{\text{the number of values}}$$
It is commonly known as the “mean”, which it will generally be called from here on in this manual.
Using the notation of the previous section, the mean of a set of values $x_1, x_2, \ldots, x_n$ is calculated
as follows:

Mean for a set:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}, \quad \text{i.e.} \quad \bar{x} = \frac{\sum x}{n}$$

Example 2: (MEAN for a SET):


To calculate the mean for the set: 43, 75, 50, 51, 51, 47, 50, 47, 40, 48
Here, $n = 10$ and $\sum x = 502$.

Therefore: $\bar{x} = \frac{\sum x}{n} = \frac{502}{10} = 50.2$
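
The calculation in Example 2 can be sketched in a few lines of Python. The function name `arithmetic_mean` is our own choice for illustration; it simply divides the sum of the values by their number.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty set is undefined")
    return sum(values) / len(values)


# The set used in Example 2.
data = [43, 75, 50, 51, 51, 47, 50, 47, 40, 48]
print(sum(data))              # 502
print(arithmetic_mean(data))  # 50.2
```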
The Mean of a Simple Frequency Distribution:
Large sets of data will normally be arranged into a frequency distribution, and thus the
formula for the mean given in the last section is not quite appropriate, since no account is taken of
frequencies. In the case of a simple (discrete) frequency distribution such as:

x   10   12   13   14   16   19
f    2    8   17    5    1    1

each value x must be counted f times, so the mean is calculated as $\bar{x} = \frac{\sum fx}{\sum f}$.

Example 3: (MEAN for a SIMPLE FREQUENCY DISTRIBUTION)


Calculate the mean of the following distribution:

No. of vehicles serviceable (x)   0   1    2   3   4   5
Number of days (f)                2   5   11   4   4   1

Solution:
  x      f     fx
  0      2      0
  1      5      5
  2     11     22
  3      4     12
  4      4     16
  5      1      5
Total   27     60

Thus $\bar{x} = \frac{\sum fx}{\sum f} = \frac{60}{27} = 2.2$
Hence, the mean number of vehicles serviceable is 2.2.
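
As a small sketch of the same layout in Python, the fx column can be built by multiplying each value by its frequency. The name `frequency_mean` is an illustrative choice, not notation from this manual.

```python
def frequency_mean(values, freqs):
    """Mean of a simple frequency distribution: sum(f*x) / sum(f)."""
    total_f = sum(freqs)
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / total_f


# Data from Example 3: vehicles serviceable (x) and number of days (f).
x = [0, 1, 2, 3, 4, 5]
f = [2, 5, 11, 4, 4, 1]
mean = frequency_mean(x, f)
print(round(mean, 1))  # 2.2
```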

The Mean of a Grouped Frequency Distribution:


One of the disadvantages of arranging discrete data into the form of a grouped frequency
distribution is the fact that individual values of items are lost. This is particularly inconvenient
when a mean needs to be calculated since, clearly, it is impossible to find the total of the values
of the items, which means, in effect, that it is impossible to calculate the mean exactly. However,
it is possible to estimate it. This is done by:
(a) using the group (or class) mid-points as representative x-values,
(b) estimating the total of the values in each group using x times f (the group frequency),
(c) adding these totals together to form an estimate of the total of all values (i.e. $\sum fx$),
(d) dividing by the total number of items ($\sum f$).

Notice that this gives an estimate of the mean as $\bar{x} = \frac{\sum fx}{\sum f}$, which is exactly the same formula as
for a simple frequency distribution.

Formula for the Mean of a frequency distribution:


The mean for a frequency distribution is calculated using the following formula.
Mean for a frequency distribution:
$$\bar{x} = \frac{\sum fx}{\sum f}$$
Note: For a grouped (discrete or continuous) frequency distribution, x is the class mid-point.

Example 4: (MEAN for a GROUPED DISCRETE FREQUENCY DISTRIBUTION)


The following data relate to the number of successful sales made by the salesmen employed
by a large microcomputer firm in a particular quarter.

Number of sales      0 – 4   5 – 9   10 – 14   15 – 19   20 – 24   25 – 29
Number of salesmen     1      14       23        21        15         6

Calculate the mean number of sales.

Solution: The standard layout and calculations are as follows:

Number of sales   Number of salesmen (f)   Class mid-point (x)   (fx)
0 to 4                     1                        2               2
5 to 9                    14                        7              98
10 to 14                  23                       12             276
15 to 19                  21                       17             357
20 to 24                  15                       22             330
25 to 29                   6                       27             162
Totals                    80                                     1225

Here, $\sum fx = 1225$ and $\sum f = 80$.

Mean number of sales, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{1225}{80} = 15.3$
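
The estimation steps for a grouped distribution can be sketched as follows. The function name `grouped_mean` and the representation of each class as a `(lower, upper)` pair are assumptions made for this illustration.

```python
def grouped_mean(classes, freqs):
    """Estimate the mean of a grouped distribution from class mid-points."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * m for m, f in zip(midpoints, freqs))
    return total_fx / sum(freqs)


# Data from Example 4: number of sales per salesman.
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
freqs = [1, 14, 23, 21, 15, 6]
print(grouped_mean(classes, freqs))  # 15.3125, i.e. 15.3 to one decimal place
```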

Example 5: (MEAN of a GROUPED CONTINUOUS FREQUENCY DISTRIBUTION)


A machine produces circular bolts and, for a quality control test, 250 were selected randomly and
the diameter of their heads measured. Find the mean of the following resulting diameters.
Diameter of head (cm)   Number of components
0.9747 – 0.9749                   2
0.9750 – 0.9752                   6
0.9753 – 0.9755                   8
0.9756 – 0.9758                  15
0.9759 – 0.9761                  42
0.9762 – 0.9764                  68
0.9765 – 0.9767                  49
0.9768 – 0.9770                  25
0.9771 – 0.9773                  18
0.9774 – 0.9776                  12
0.9777 – 0.9779                   4
0.9780 – 0.9782                   1

Solution: Since there are many classes and the data is fairly unwieldy, the given classes have not
been repeated in the following table of calculation. This would be perfectly acceptable in an
examination.

Mid Points x f fx
0.9748 2 1.9496
0.9751 6 5.8506
0.9754 8 7.8032
0.9757 15 14.6355
0.9760 42 40.9920
0.9763 68 66.3884
0.9766 49 47.8534
0.9769 25 24.4225
0.9772 18 17.5896
0.9775 12 11.7300
0.9778 4 3.9112
0.9781 1 0.9781
250 244.1041

Here $\sum fx = 244.1041$ and $\sum f = 250$.

Therefore, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{244.1041}{250} = 0.97642$
That is, the mean diameter of head = 0.97642 cm.

Note:
(a) The arithmetic mean is the best-known example of a measure of location, or average,
which aims to represent a set of items numerically.
(b) The special notation $x_1, x_2, x_3, \ldots$ is used as a method of describing the individual
values of the items in a group in general terms, without specifying their actual values.
(c) The summation operator, ∑, is used to represent the addition of a set of values in general
terms.
(d) The mean for a set of values is found by dividing the sum of the values by their number.
(e) The mean is the most popular average, being well understood and taking all items into
account. Its main disadvantage is the fact that it takes extreme values too much into account
and can be considered unrepresentative where such values occur.

1.1.2. The Median


Introduction
The median is generally considered as an alternative average to the mean. This section defines
the median and shows how to find its value for a set and for simple and grouped frequency
distributions. In the case of a grouped frequency distribution, two equivalent methods are
demonstrated, one using a formula, the other a graphics method. Also described are the standard
situations where the median is most effectively used.

Definition of the Median


Suppose a machine produces 5, 3, 5, 21 and 2 defective items each day over a five-day period.
The mean number of defectives per day would be calculated as:
$$\text{Mean} = \frac{5 + 3 + 5 + 21 + 2}{5} = \frac{36}{5} = 7.2$$
An objection to using 7.2 as an average here is that it is unrepresentative, both of the four lower
values (5, 3, 5 and 2) and the largest value (21). The mean takes extreme items into account and
thus is sometimes not very useful as a practical average. In cases such as these, the median is used.
This is found by placing the values in size order and picking the middle value as the average. The
above five values, in order of size, can be written as:
2, 3, 5, 5, 21
and the median (the middle value, here the third item) is seen to be 5, which is more useful as a working
average.

The MEDIAN of a set of data is the value of that item which lies exactly half-way along the set
(which must be arranged into size order).

Note 1: When a set of data contains an even number of items, there is no unique middle or central
value. The convention used in this situation is to use the mean of the middle two items to give a
(practical) median.
Note 2: For a set with an odd number (n) of items, the median can be precisely identified as the
value of the $\left(\frac{n+1}{2}\right)$th item. Thus in a size-ordered set of 15 items, the median would be the
$\left(\frac{15+1}{2}\right)$th = the 8th item along.

Example 6: (MEDIAN of a set of values)


(a) The median of 43, 75, 48, 51, 51, 47, 50 is determined by size-ordering the set as:
43, 47, 48, 50, 51, 51, 75 and then: median = middle item = 50.
(b) The median of 2, 4, 6, 1, 2, 3, 3, 2 is found by size-ordering the set as: 1, 2, 2, 2, 3, 3, 4, 6
(noticing that there is an even number of items), which gives median = mean of
middle two = (2 + 3)/2 = 2.5

Median for a Simple Frequency Distribution


Where there are a large number of discrete items in a data set, but the range of values is limited,
a simple frequency distribution will probably have been compiled. For example, if records had
been kept of the number of vehicles not available for hire on each of 80 consecutive days for a
large taxi fleet, the results might appear as follows.

Number of vehicles unavailable   Number of days
              0                        15
              1                        24
              2                        18
              3                        12
              4                         8
              5                         2
              6                         1

Procedure for calculating the Median


To calculate the median for a simple (discrete) frequency distribution, the following procedure
should be followed.

Step 1: Calculate the value of $\frac{\sum f + 1}{2}$ (identifying the central item).
Step 2: Form an F (cumulative frequency) column.
Step 3: Find that F value which first equals or exceeds $\frac{\sum f + 1}{2}$.
Step 4: The median is the x-value corresponding to the F value identified in Step 3.

For the vehicle data above, $\frac{\sum f + 1}{2} = \frac{81}{2} = 40.5$:

x     f     F
0    15    15
1    24    39
2    18    57   (first F to exceed 40.5, so Median = 2)
3    12    69
4     8    77
5     2    79
6     1    80

Note: Sometimes $\sum f$ is replaced by N for convenience.

Example 7: Calculate the median for the following distribution of delivery times of orders sent
out from a firm.

Delivery time (days)   0   1    2    3    4    5    6   7   8   9   10   11
No. of orders          4   8   11   12   21   15   10   4   2   2    1    1

Solution:
The median is the $\left(\frac{N+1}{2}\right)$th = $\left(\frac{92}{2}\right)$th = 46th item.
The F column is shown in the following table.
The first F value to exceed 46 is F = 56.

Delivery time (days) x No. of orders f Cum. Freq. (F)


0 4 4
1 8 12
2 11 23
3 12 35
4 21 56
5 15 71
6 10 81
7 4 85
8 2 87
9 2 89
10 1 90
11 1 91

The median is thus 4 (days)
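
The four-step procedure can be sketched in Python as below. The function name `frequency_median` is an illustrative choice, and the comparison is written as "reaches or exceeds" so that an F value exactly equal to the central position is also picked up.

```python
from itertools import accumulate


def frequency_median(values, freqs):
    """Median of a simple frequency distribution via cumulative frequencies."""
    n = sum(freqs)
    central = (n + 1) / 2                          # Step 1: central item
    for x, big_f in zip(values, accumulate(freqs)):  # Step 2: F column
        if big_f >= central:                       # Step 3: first F reaching it
            return x                               # Step 4: corresponding x
    raise ValueError("empty distribution")


# Example 7: delivery times (days) and number of orders.
days = list(range(12))
orders = [4, 8, 11, 12, 21, 15, 10, 4, 2, 2, 1, 1]
print(frequency_median(days, orders))  # 4

# Taxi fleet data: vehicles unavailable and number of days.
print(frequency_median(range(7), [15, 24, 18, 12, 8, 2, 1]))  # 2
```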

Median for a Grouped frequency distribution:


As mentioned earlier, the penalty paid for grouping values is the loss of their individual
identities, and thus there is no way that a median can be calculated exactly in this situation.
However, there is a method commonly employed for estimating the median: using an interpolation
formula. Interpolation in this context is a simple mathematical technique which estimates an unknown value by
utilizing immediately surrounding known values.

Steps for estimating the median by formula:


Step 1: Form a cumulative frequency (F) column.
Step 2: Find the value of N/2 (where $N = \sum f$).
Step 3: Find that F value that first exceeds N/2, which identifies the median class M.
Step 4: Calculate the median using the following interpolation formula:
$$\text{Median} = L_M + \left(\frac{\frac{N}{2} - F_{M-1}}{f_M}\right) c_M$$
where $L_M$ is the lower bound of the median class,
$F_{M-1}$ is the cumulative frequency of the class immediately prior to the median class,
$f_M$ is the actual frequency of the median class,
$c_M$ is the width of the median class.

Example 8: Estimate the median for the following data, which represents the ages of a set of 130
representatives who took part in the statistical survey.
Age in years       No. of representatives
20 and under 25              2
25 and under 30             14
30 and under 35             29
35 and under 40             43
40 and under 45             33
45 and under 50              9

Solution:
Age (in years) Number of Representatives f F
20 and under 25 2 2
25 and under 30 14 16
30 and under 35 29 45
35 and under 40 43 88
40 and under 45 33 121
45 and under 50 9 130

N/2 = 130/2 = 65

The median class is the class that has the first F greater than 65, i.e. 35 and under 40.

The median can now be estimated using the interpolation formula with

$L_M = 35$, $F_{M-1} = 45$, $f_M = 43$, $c_M = 5$:

$$\text{Median} = L_M + \left(\frac{\frac{N}{2} - F_{M-1}}{f_M}\right) c_M = 35 + \left(\frac{65 - 45}{43}\right) \times 5 = 37.33$$
Therefore the median is 37.33 years.
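
A minimal sketch of the interpolation steps in Python follows. The function name `grouped_median` and the choice to pass lower bounds and class widths as separate lists are assumptions made for the illustration.

```python
from itertools import accumulate


def grouped_median(lower_bounds, widths, freqs):
    """Estimate the median of a grouped distribution by interpolation."""
    half = sum(freqs) / 2                  # N / 2
    cumulative = list(accumulate(freqs))   # the F column
    for m, big_f in enumerate(cumulative):
        if big_f > half:                   # first F exceeding N / 2
            f_before = cumulative[m - 1] if m > 0 else 0
            return lower_bounds[m] + (half - f_before) / freqs[m] * widths[m]
    raise ValueError("empty distribution")


# Example 8: ages of 130 representatives.
lower = [20, 25, 30, 35, 40, 45]
width = [5, 5, 5, 5, 5, 5]
f = [2, 14, 29, 43, 33, 9]
print(round(grouped_median(lower, width, f), 2))  # 37.33
```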

Median for a simple continuous frequency distribution:

Occasionally, continuous data will be measured to a particular value rather than naturally
allocated to true continuous groups. For example, during a work study exercise, the times taken
by 46 workers to complete a particular job were measured to give the following:

No. of minutes   11   12   13   14   15   16   17   18   19
No. of workers    2    6   18   12    5    0    1    1    1

Notice that, although at first sight the data might appear discrete, they are strictly continuous. In order
to calculate the median, the values given for number of minutes must be translated into true continuous
groups rather than discrete values.

Characteristics of the Median:


(i) It is an appropriate alternative to the mean when extreme values are present at one or both ends of
a set or distribution.
(ii) It can be used when certain end values of a set or distribution are difficult, expensive
or impossible to obtain, which makes it particularly appropriate to “life” data.
(iii) It can be used with non-numeric data if desired, providing the measurements can be
naturally ordered.
(iv) It will often assume a value equal to one of the original items, which is considered
an advantage over the mean.
(v) The main disadvantage of the median is that it is difficult to handle theoretically in
more advanced statistical work.

1.1.3. The Mode


Introduction:
Although the mean and median will be the averages used in most circumstances, there are
situations in which other averages are particularly appropriate. Whereas the mean can be said to
find the centre of gravity, and the median the middle, of a set of items, the mode identifies the
most popular item and is described in the following sections.

Definition of Mode:
The mode of a set of data is the value that occurs most often, or equivalently has the largest
frequency.

The mode of the set 2, 1, 4, 3, 3, 1, 1, 2, 1 is 1, since this value occurs most often.

Example 9: The mode of the following simple discrete frequency distribution

x   4   5    6    7   8   9   10
f   2   5   21   18   9   2    1

is 6, since this value has the largest frequency of 21.

The mode for grouped data


For a grouped frequency distribution, the mode (in line with the mean and median) cannot
be determined exactly and so must be estimated. The technique used is one of interpolation,
similar to that used to estimate the median of a frequency distribution.

Steps for estimating the mode by an interpolation formula


Step 1: Determine the modal class (that class which has the largest frequency).
Step 2: Calculate D1 = difference between the largest frequency and the frequency immediately
preceding it.
Step 3: Calculate D2 = difference between the largest frequency and the frequency immediately
following it.
Step 4: Use the following interpolation formula:
$$\text{Mode} = L + \left(\frac{D_1}{D_1 + D_2}\right) C$$
where L is the lower bound of the modal class and
C is the modal class width.

Example 10: Estimate the mode of the following distribution of ages.


Age in years 20-25 25-30 30-35 35-40 40-45 45-50
No. of employees 2 14 29 43 33 9

Solution:
Age (Years) No. of Employees
20-25 2
25-30 14
30-35 29
→ 35-40 43
40-45 33
45-50 9

The modal class is 35-40

D1 = 43 – 29 = 14
D2 = 43 – 33 = 10

The lower class bound of the modal class is L = 35.

The class width of the modal class is C = 5 (from 35 – 40).

Thus $\text{Mode} = L + \left(\frac{D_1}{D_1 + D_2}\right) C = 35 + \left(\frac{14}{14 + 10}\right) \times 5 = 37.92$
Therefore the mode is 37.92 years.
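
The interpolation steps for the mode can be sketched as follows. The function name `grouped_mode` is our own, and treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption of this sketch rather than a rule stated in the text.

```python
def grouped_mode(lower_bounds, widths, freqs):
    """Estimate the mode of a grouped distribution by interpolation."""
    m = max(range(len(freqs)), key=lambda k: freqs[k])  # Step 1: modal class
    f_before = freqs[m - 1] if m > 0 else 0              # assumption: 0 if absent
    f_after = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: 0 if absent
    d1 = freqs[m] - f_before                             # Step 2
    d2 = freqs[m] - f_after                              # Step 3
    if d1 + d2 == 0:
        raise ValueError("no single modal class can be identified")
    return lower_bounds[m] + d1 / (d1 + d2) * widths[m]  # Step 4


# Example 10: ages of employees.
lower = [20, 25, 30, 35, 40, 45]
width = [5, 5, 5, 5, 5, 5]
f = [2, 14, 29, 43, 33, 9]
print(round(grouped_mode(lower, width, f), 2))  # 37.92
```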

Characteristics of the Mode:


(i) Occasionally used as an alternative to the mean or median when the situation calls for
the most popular value to represent some data
(ii) Easy to understand, not difficult to calculate and can be used when a distribution has
open ended classes
(iii) Although the mode usefully ignores isolated extreme values, it is thought to be too
much affected by the most popular class when a distribution is significantly skewed.
(iv) Like the median, the mode is not used in advanced statistical work.

Empirical Relationship between Mean, Median and Mode.

Mode = 3 Median – 2 Mean.

Exercise 1.1:
1. Find the arithmetic mean of the following sets:
(a) 84, 92, 73, 67, 88, 74, 91, 74
(b) 0.53, 0.46, 0.50, 0.49, 0.52, 0.53, 0.44, 0.55, 0.54

2. Find the mean of the following frequency distributions:


(a)
x f
18.5 5
19.5 12
20.5 20
(b)

x f
1 2
2 8
3 24
4 52
5 31
6 11

3. A firm recorded the number of orders received for each of 58 successive weeks to give the
following distribution:

Number of Orders Received   Number of Weeks
10 – 14 3
15 – 19 7
20 – 24 15
25 – 29 20
30 – 34 9
35 – 39 4
Calculate the mean weekly number of orders received.

4. The ages of a company’s employees are tabulated below. Calculate the mean employee age in years.

Age in years   Number of Employees
20 and under 25 2
25 and under 30 14
30 and under 35 29
35 and under 40 43
40 and under 45 33
45 and under 50 9

5. A quality control section of a cannery inspected the contents of 130 randomly selected tins of
cooked spaghetti from output. As part of their measurements, the following net weights (in grams)
were tabulated:

Weight (in grams)    Number of Tins
under 424.9 1
424.900 – 424.925 1
424.925 – 424.950 6
424.950 – 424.975 18
424.975 – 425.000 33
425.000 – 425.025 46
425.025 – 425.050 14
425.050 – 425.075 5
425.075 – 425.100 5
425.1 and over 1

Calculate the mean net weight of the contents of the tins and say whether you think that the
consumer is getting reasonable value if the label on the tin advertises the contents as 425gms.

6. The following is an extract from a business report.


“….. Over the past 15 months, the number of orders received has averaged 24 per month with the
best three months averaging 35. The lowest months saw only 14, 14, 16 and 22 orders
respectively……”.
(a) Find the average number of orders that were received in the middle 8 months.
(b) If the target over 16 months is an average of 25, how many orders must be received in
month 16 to achieve this?

7. During the 1984 – 85 session, a college ran 70 different classes of which 44 were “science”,
with a mean class size of 15.2, and 26 were “arts”, with a mean class size of 19.2. The frequency
distribution of class sizes is given:

Size of Class (Number of students)   Number of Science Classes   Number of Arts Classes
1–6 4 0
7 – 12 15 3
13 – 18 11 10
19 – 24 8 8
25 – 30 5 4
31 – 36 1 1
No student belonged to more than one class.
(a) Calculate the mean class size of the college.
(b) Suppose now that no class of 12 students or less had been allowed to run. Calculate what
the mean class size for the college would have been if the students in such classes:
(i) had been transferred to the other classes;
(ii) had not been admitted to the college.
(c) The number of students enrolling in 1986-87 on science and arts courses is expected to rise by
20% and to fall by 10% respectively, compared with 1984-85. Calculate the maximum
number of classes the college should run if the mean class size is to be not less than 20.

8. Find the median of the following sets of data:


(a) 2.52, 3.96, 3.28, 9.20, 3.75
(b) 84, 91, 72, 68, 87, 78, 78, 82, 79

9. The following figures were obtained by sampling the output of bags of walnuts which were
ready to be distributed to a national chain of supermarkets.

No. of walnuts 19 20 21 22 23

No. of Bags 2 11 29 36 10
Find the median number of walnuts per bag.

10. Use the interpolation formula to estimate the median of the following data, which relate to the
IQ of a special group of an organisation’s employees.
IQ 98-106 107-115 116-124 125-133 134-142 143-151 152-160
No. of employees 3 5 9 12 5 4 2

11. The following figures relate to the length of time spent by cars in a particular car park during
one day.
Time Parked   Up to 1   1-2   2-3   3-4   4-5   5-6   6-9   9-12
No. of Cars     450     730   640   120    40    30    20     20
Estimate the median parking time.

12. Determine the value of the mode for the following sets of data:
(a) 10, 11, 10, 12, 11, 10, 11, 11, 11, 12, 13, 11, 12
(b) 2, 1, 1, 2, 3, 2, 3, 4, 6, 4, 1, 2, 3
(c)
x 14 15 16 17 18 19 20
f 14 26 18 9 2 1 1

13. Calculate a modal value for the following data of age at commitment of crime of 500 male
criminals.
Age (years)     Under 16   16-17   18   19-20   21-27   28-36
Number of men       8        70    95    133     161      33

14. Find the mode of the following distribution.


No. of children 0 1 2 3 4 5 6 or more
No. of families 11 47 28 9 4 1 1

1.2 Measures of Dispersion


The absolute measures can be divided into following four positional measures.
1. Standard Deviation.
The relative measures in each of the above four cases are called the coefficients of the
respective measures, such as the coefficient of standard deviation etc. The relative measures are used
only for the purpose of comparison between two or more series with varying size or number of
items, varying central values or varying units of calculation.

1.2.1 Standard Deviation

Standard Deviation is the most important and commonly used measure of dispersion. It
measures the absolute dispersion or variability of a distribution. A small standard deviation means
a high degree of uniformity of the observations as well as homogeneity of a series. It is extremely
useful in judging the representativeness of the mean.
Standard deviation is the positive square root of the average of the squared deviations taken from
the arithmetic mean. It is generally denoted by the Greek letter σ, or by S.D. or s.d. Let x be a
random variate which takes on n values, viz., $x_1, x_2, \ldots, x_n$; then the standard deviation of these n
observations is given by
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
where $\bar{x}$ is the mean of the observations.
When the items are very small, the following formula is used:
$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$

Example 19: Find the standard deviation of 3, 4, 5, 6


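Solution: Here $n = 4$, $\sum x = 3 + 4 + 5 + 6 = 18$
$\sum x^2 = 3^2 + 4^2 + 5^2 + 6^2 = 9 + 16 + 25 + 36 = 86$
$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{86}{4} - \left(\frac{18}{4}\right)^2} = \sqrt{21.5 - 20.25} = \sqrt{1.25} = 1.12$$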

Merits, Demerits and Uses of Standard Deviation:


Merits:
1. It is based on all the observations.
2. It is rigidly defined.
3. It has a greater mathematical significance and is capable of further mathematical
treatment.
4. It represents the true measurement of dispersion of a series.
5. It is least affected by fluctuations of sampling.
6. It is a reliable and dependable measure of dispersion.
7. It is extremely useful in correlation etc.
Demerits:
1. It is difficult to compute, unlike other measures of dispersion.
2. It is not simple to understand.
3. It gives more weightage to extreme values.
4. It consumes much time and labour while computing it.
Uses:
1. It is widely used in biological studies.
2. It is used in fitting a normal curve to a frequency distribution.
3. It is the most widely used measure of dispersion.

Calculation of Standard Deviation – Individual observations:


When the data under consideration consists of individual observations, the standard deviation
may be computed by any of the following two methods.
(a) By taking deviations of the items from the actual mean.
(b) By taking deviations of the items from an assumed mean.

Direct Method:
In case of simple series, the standard deviation can be obtained by the formula
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum d^2}{n}}, \quad \text{where } d = x_i - \bar{x}$$
and $x_i$ is the value of the variable or observation,
$\bar{x}$ is the arithmetic mean,
n is the total number of observations.

Steps of calculation
Step 1: Calculate the arithmetic mean $\bar{x}$.
Step 2: Take the deviations of the items from the mean,
i.e. calculate $d = x_i - \bar{x}$.
Step 3: Take the sum of the squares of all these deviations,
i.e. $\sum d^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$.
Step 4: Find the mean of the squared deviations obtained in step 3,
i.e. $\frac{\sum d^2}{n}$, where n is the total number of observations. It is known as the
variance.
Step 5: Take the square root of the variance to get the desired standard deviation.
Example 20: Find the standard deviation of 16, 13, 17, 22.
Solution: Here A.M. = Mean = $\frac{16 + 13 + 17 + 22}{4} = \frac{68}{4} = 17$.
Let us prepare the following table in order to calculate the standard deviation.

 x     d = x – x̄ = x – 17     d² = (x – x̄)²
16            –1                    1
13            –4                   16
17             0                    0
22             5                   25
                              Σd² = 42

Now, $\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} = \sqrt{\frac{\sum d^2}{n}} = \sqrt{\frac{42}{4}} = \sqrt{10.5} = 3.24$
Short cut method:
This method is applied to calculate the standard deviation when the mean of the data comes
out to be a fraction. In that case it is very difficult and tedious to find the deviations of all
observations from the mean by the above method. The formula used is
$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$$
where d = x – A and A is the assumed mean.
Steps in calculation:
Step 1: Take any arbitrary number as the assumed mean A.
Step 2: Take the deviations from the assumed mean and denote them by d,
i.e. d = x – A. Take the total of these deviations, i.e. obtain $\sum d$.
Step 3: Square these deviations and obtain $\sum d^2$.
Step 4: Calculate $\frac{\sum d}{n}$, $\frac{\sum d^2}{n}$ and $\left(\frac{\sum d}{n}\right)^2$, where n is the total number of observations.
Step 5: Find $\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2$. Take its square root to get the standard deviation of the given data.

Example 21: Find the standard deviation of the following data:


48, 43, 65, 57, 31, 60, 37, 48, 59, 78.
Solution: Let us prepare the following table in order to calculate the value of S.D. by assuming
the value of A as 50.
Value x d=x–A d2
48 –2 4
43 –7 49
65 15 225
57 7 49
31 – 19 361
60 10 100
37 – 13 169
48 –2 4
59 9 81
78 28 784
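n = 10     Σd = 26     Σd² = 1826

Here, $\bar{x} = A + \frac{\sum d}{n} = 50 + \frac{26}{10} = 52.6$, which is a fraction.
Let us apply the short cut formula in order to calculate S.D.
$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{1826}{10} - \left(\frac{26}{10}\right)^2} = \sqrt{182.6 - 6.76} = 13.26$$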
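
The short cut steps can be sketched in Python as below. The function name `sd_assumed_mean` is an illustrative choice; it returns both the mean and the standard deviation, and the last lines show that a different assumed mean A leads to the same result.

```python
from math import sqrt


def sd_assumed_mean(values, assumed_mean):
    """Mean and standard deviation via deviations d = x - A from an assumed mean."""
    n = len(values)
    d = [x - assumed_mean for x in values]   # Step 2
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)           # Step 3
    mean = assumed_mean + sum_d / n
    sd = sqrt(sum_d2 / n - (sum_d / n) ** 2)  # Steps 4 and 5
    return mean, sd


data = [48, 43, 65, 57, 31, 60, 37, 48, 59, 78]  # Example 21
mean, sd = sd_assumed_mean(data, 50)
print(round(mean, 1), round(sd, 2))  # 52.6 13.26

# The choice of A does not affect the answer.
_, sd_other = sd_assumed_mean(data, 40)
print(round(sd_other, 2))  # 13.26
```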

Standard Deviation for Discrete Series or Grouped Data:


The standard deviation of a discrete series or grouped data can be calculated by any one of
the following methods.
(a) Actual Mean Method or Direct Method
(b) Assumed Mean Method or Short cut method
(a) Direct Method: The standard deviation for the discrete series is given by the formula
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} f (x_i - \bar{x})^2}{n}}$$
where $\bar{x}$ is the arithmetic mean, x is the size of the item, f is the
corresponding frequency and $n = \sum f$.
However, in practice, this method is rarely used because if the arithmetic mean is a
fraction, the calculations take a lot of time and are cumbersome.

(b) Short cut Method: In this method we use the following formula to calculate the standard
deviation:
$$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2}$$
where d = x – A, A is the assumed mean and $n = \sum f$.

Steps in calculation
Step 1: Take any item of the given series as the assumed mean A.
Step 2: Take the deviations of the items from the assumed mean A and denote them by d.
Step 3: Multiply the deviations by the respective frequencies and denote the products by fd. Obtain the total
$\sum fd$.
Step 4: Calculate d², where the d’s are obtained in step 2.
Step 5: Multiply the squared deviations by the respective frequencies to get $\sum fd^2$.
Step 6: Find the value of $\sigma^2 = \frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2$.
Step 7: Take the square root of σ² obtained in step 6 to get the value of the standard deviation.

Example 22: Find the standard deviation from the following data:
Size of the item: 10 11 12 13 14 15 16
Frequency: 2 7 11 15 10 4 1
Solution:
Size of the item x   Frequency f   d = x – A (A = 13)    fd    d²   fd²
       10                 2               –3             –6     9    18
       11                 7               –2            –14     4    28
       12                11               –1            –11     1    11
       13                15                0              0     0     0
       14                10                1             10     1    10
       15                 4                2              8     4    16
       16                 1                3              3     9     9
                   n = Σf = 50                     Σfd = –10       Σfd² = 92

Now A.M. = $\bar{x} = A + \frac{\sum fd}{n} = 13 + \frac{(-10)}{50} = 12.8$, a fraction.
S.D. = $\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2} = \sqrt{\frac{92}{50} - \left(\frac{-10}{50}\right)^2} = \sqrt{1.84 - 0.04} = 1.342$
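
The frequency version of the short cut method can be sketched in the same way. The function name `sd_frequency` is hypothetical; the only change from the ungrouped case is that each deviation is weighted by its frequency.

```python
from math import sqrt


def sd_frequency(values, freqs, assumed_mean):
    """Mean and S.D. of a discrete frequency distribution (short cut method)."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, freqs))
    mean = assumed_mean + sum_fd / n
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd


# Example 22: size of item (x) and frequency (f), with A = 13.
x = [10, 11, 12, 13, 14, 15, 16]
f = [2, 7, 11, 15, 10, 4, 1]
mean, sd = sd_frequency(x, f, 13)
print(round(mean, 1), round(sd, 3))  # 12.8 1.342
```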

Calculation of Standard Deviation for a continuous series:


The standard deviation of a continuous series can be calculated by any one of the methods
discussed for discrete frequency distributions. However, in practice the Step Deviation Method
is mostly used. In this method the formula used is
$$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2} \times i, \quad \text{where } d = \frac{m - A}{i},$$
i is the class interval, m is the mid value of the
interval and A is the assumed mean.

Steps in calculation:
Step 1: Find the mid values or mid points of the various classes and denote them by m.
Step 2: Take any one of the values of m as the assumed mean A.
Step 3: Take the deviations of the mid points from the assumed mean A and divide them by the class
interval or common factor i. Denote the results by d.
Step 4: Multiply the respective frequencies f by the corresponding deviations d and obtain $\sum fd$.
Step 5: Square the deviations d and multiply them by their respective frequencies. Obtain $\sum fd^2$.
Step 6: Substitute the values of $\sum fd$, $\sum fd^2$ and i in the formula
$$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2} \times i, \quad \text{where } n = \sum f.$$

Example 23: Find the standard deviation of the following distribution:


Marks             10 – 20   20 – 30   30 – 40   40 – 50   50 – 60   60 – 70   70 – 80
No. of Students      5        12        15        20        10         4         2

Solution: Assume A = 45

Class interval   No. of students f   Mid value x   d = (x – 45)/10    fd    fd²
10 – 20                  5               15              –3          –15    45
20 – 30                 12               25              –2          –24    48
30 – 40                 15               35              –1          –15    15
40 – 50                 20               45               0            0     0
50 – 60                 10               55               1           10    10
60 – 70                  4               65               2            8    16
70 – 80                  2               75               3            6    18
                 Σf = n = 68                                   Σfd = –30   Σfd² = 152

$$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2} \times i = \sqrt{\frac{152}{68} - \left(\frac{-30}{68}\right)^2} \times 10 = 14.3$$
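
A minimal sketch of the step deviation method in Python follows. The function name `sd_step_deviation` and its argument names are assumptions made for the illustration; the mid values are passed in directly.

```python
from math import sqrt


def sd_step_deviation(midpoints, freqs, assumed_mean, interval):
    """Standard deviation of a continuous series by the step deviation method."""
    n = sum(freqs)
    d = [(m - assumed_mean) / interval for m in midpoints]  # Step 3
    sum_fd = sum(f * di for f, di in zip(freqs, d))         # Step 4
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))   # Step 5
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * interval  # Step 6


# Example 23: marks of 68 students, with A = 45 and i = 10.
mid = [15, 25, 35, 45, 55, 65, 75]
f = [5, 12, 15, 20, 10, 4, 2]
print(round(sd_step_deviation(mid, f, 45, 10), 1))  # 14.3
```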
4. Theory of Probability
4.1.1. Introduction

If an experiment is repeated under essentially homogeneous and similar conditions we
generally come across two types of situations:
(i) The result or what is usually known as the ‘outcome’ is unique or certain.
(ii) The result is not unique but may be one of the several possible outcomes.
The phenomena covered by (i) are known as deterministic. For example, for a perfect gas, PV =
constant.
The phenomena covered by (ii) are known as probabilistic. For example, in tossing a coin we are
not sure if a head or tail will be obtained.

In the study of statistics we are concerned basically with the presentation and interpretation
of chance outcomes that occur in a planned study or scientific investigation.

4.1.2. Definition of various terms

Trial and event: Consider an experiment which, though repeated under essentially identical
conditions, does not give unique results but may result in any one of the several possible outcomes.
The experiment is known as a trial and outcomes are known as events or cases. For example,
throwing of a die is a trial and getting 1(or 2 or … 6) is an event.

Exhaustive events: The total number of possible outcomes in any trial is known as exhaustive
events or exhaustive cases. For example, in tossing of a coin there are two exhaustive cases, viz.
Head and Tail (the possibility of the coin standing on an edge being ignored).

Favourable events or cases: The number of cases favourable to an event in a trial is the number
of outcomes which entail the happening of the event. For example, in throwing of two dice, the
number of cases favourable to getting the sum 3 is: (1,2) and (2,1)

Mutually exclusive events: Events are said to be mutually exclusive or incompatible if the
happening of any one of them precludes the happening of all the others, that is if no two or more
of them can happen simultaneously in the same trial. For example, in tossing a coin the events
head and tail are mutually exclusive.

Equally likely events: Outcomes of a trial are said to be equally likely, if taking into consideration
all the relevant evidences, there is no reason to expect one in preference to the others. For example,
in throwing an unbiased die, all the six faces are equally likely to come.

Sample Space: Consider an experiment whose outcome is not predictable with certainty.
However, although the outcome of the experiment will not be known in advance, let us suppose
that the set of all possible outcomes is known. This set of all possible outcomes of an experiment
is known as the sample space of the experiment and is denoted by S.

Some examples follow.


1. If the outcome of an experiment consists in the determination of the sex of a newborn child,
then
S = { g,b}
where the outcome g means that the child is a girl and b that it is a boy.

2. If the experiment consists of flipping two coins, then the sample space consists of the following
four points:
S = {(H,H), (H,T), (T,H), (T,T)}
The outcome will be (H,H) if both coins are heads, (H,T) if the first coin is heads and the
second tails, (T,H) if the first is tails and the second heads, and (T,T) if both coins are tails.

3. If the experiment consists of tossing two dice, then the sample space consists of the 36 points
S = { (i,j): i, j = 1, 2, 3, 4, 5, 6 }
= { (1,1), …, (1,6), …, (6,1), …, (6,6) }
where the outcome (i,j) is said to occur if i appears on the leftmost die and j on the other die.
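
The sample spaces of examples 2 and 3 can also be listed mechanically. The following is a minimal Python sketch; the variable names two_coins and two_dice are illustrative only and are not part of the notes.

```python
from itertools import product

# Sample space for flipping two coins: ordered pairs of H/T.
two_coins = list(product("HT", repeat=2))

# Sample space for tossing two dice: ordered pairs (i, j) with i, j = 1..6.
two_dice = [(i, j) for i in range(1, 7) for j in range(1, 7)]

print(two_coins)      # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(two_dice))  # 36
```

Keeping the outcomes as ordered pairs matters: (H,T) and (T,H) are distinct points, which is what makes the four (or 36) points equally likely.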

4.2. Definitions of Probability

1. Mathematical or Classical or a priori probability:


If a trial results in n exhaustive, mutually exclusive and equally likely cases and m of them
are favourable to the happening of an event E, then the probability ‘p’ of happening of E is given
by,
\[ p = P(E) = \frac{\text{Favourable number of cases}}{\text{Exhaustive number of cases}} = \frac{m}{n} \]
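
As a hedged illustration of the classical definition, the sketch below counts favourable cases m among n equally likely cases. The helper name classical_probability and the chosen event (sum of two dice equal to 7) are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, is_favourable):
    """Return m/n: favourable cases over exhaustive cases (all equally likely)."""
    outcomes = list(sample_space)
    m = sum(1 for outcome in outcomes if is_favourable(outcome))
    return Fraction(m, len(outcomes))

# Two dice: 36 exhaustive, mutually exclusive and equally likely cases.
dice = product(range(1, 7), repeat=2)
p_sum_7 = classical_probability(dice, lambda roll: sum(roll) == 7)
print(p_sum_7)  # 1/6
```

Using Fraction keeps the answer exact, so it can be compared directly with hand-computed values such as 6/36 = 1/6.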

2. Statistical or empirical probability:


If a trial is repeated a number of times under essentially homogeneous and identical
conditions, then the limiting value of the ratio of the number of times the event happens to the
number of trials, as the number of trials becomes indefinitely large, is called the probability of
happening of the event. Symbolically, if in n trials an event E happens m times, then the
probability ‘p’ of the happening of E is given by,
\[ p = P(E) = \lim_{n \to \infty} \frac{m}{n} \]
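
The limiting ratio m/n can be watched numerically. The sketch below simulates throws of a fair die and reports the relative frequency of an even number. The function name relative_frequency, the fixed seed and the trial counts are assumptions chosen for illustration, and a simulation only suggests the limit rather than proving it.

```python
import random

def relative_frequency(trials, seed=0):
    """Fraction of `trials` simulated die throws that show an even number."""
    rng = random.Random(seed)  # fixed seed only so that runs are repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) % 2 == 0)
    return hits / trials

# The ratio m/n should settle near 1/2 as the number of trials grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```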
3. Axiomatic Definition:
Consider an experiment whose sample space is S. For each event E of the sample space S,
we assume that a number P(E) is defined and satisfies the following three axioms.

Axiom 1: 0 ≤ P(E) ≤ 1
Axiom 2: P(S) = 1
Axiom 3: For any sequence of mutually exclusive events E1, E2, … (that is, events for which
Ei ∩ Ej = Φ when i ≠ j),
\[ P\left( \bigcup_{i=1}^{\infty} E_i \right) = \sum_{i=1}^{\infty} P(E_i) \]

4.2.1. Some Important Formulas

1. If A and B are any two events, then


\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
This rule is known as the addition rule of probability.

For three events A, B and C, we have,


\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C) \]

2. If A and B are mutually exclusive events, then


\[ P(A \cup B) = P(A) + P(B) \]

In general, if A1, A2, … , An are mutually exclusive, then


\[ P(A_1 \cup A_2 \cup A_3 \cup \dots \cup A_n) = P(A_1) + P(A_2) + \dots + P(A_n) \]

3. If A and Ac are complementary events, then


P(A) + P(Ac) = 1

4. P(S) = 1

5. P(Φ) = 0

6. If A and B are any two events, then


\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

7. If A and B are independent events, then


\[ P(A \cap B) = P(A) \times P(B) \]
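
The addition and complement rules listed above can be checked by direct counting on a small sample space. In the sketch below, the two-dice sample space and the particular events A and B are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two dice, 36 equally likely points

def P(event):
    """Classical probability of an event given as a subset of S."""
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == 6}           # first die shows 6
B = {s for s in S if sum(s) >= 10}        # sum is at least 10

lhs = P(A | B)                            # P(A or B)
rhs = P(A) + P(B) - P(A & B)              # addition rule
print(lhs, rhs)
print(P(S - A), 1 - P(A))                 # complement rule
```

Python's set operators map directly onto the set-theoretic notation: `|` is union, `&` is intersection and `S - A` is the complement of A in S.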
Glossary of Probability terms:

Statement                                          Meaning in terms of set theory

1. At least one of the events A or B occurs        $A \cup B$
2. Both the events A and B occur                   $A \cap B$
3. Neither A nor B occurs                          $A^c \cap B^c$
4. Event A occurs and B does not occur             $A \cap B^c$
5. Exactly one of the events A or B occurs         $A \,\Delta\, B = (A \cap B^c) \cup (A^c \cap B)$
6. If event A occurs, so does B                    $A \subset B$
7. Events A and B are mutually exclusive           $A \cap B = \Phi$
8. Complementary event of A                        $A^c$
9. Sample space                                    Universal set S

4.2.2. Solved Examples

Example 1: Find the probability of getting a head in tossing a coin.


Solution: When a coin is tossed, we have the sample space Head, Tail
Therefore, the total number of possible outcomes is 2
The favourable number of outcomes is 1, that is the head.
The required probability is ½.

Example 2: Find the probability of getting two tails in two tosses of a coin.
Solution: When two coins are tossed, we have the sample space HH, HT, TH, TT
Where H represents the outcome Head and T represents the outcome Tail.
The total number of possible outcomes is 4.
The favourable number of outcomes is 1, that is TT
The required probability is ¼.

Example 3: Find the probability of getting an even number when a die is thrown
Solution: When a die is thrown the sample space is 1, 2, 3, 4, 5, 6
The total number of possible outcomes is 6
The favourable number of outcomes is 3, that is 2, 4 and 6
The required probability is 3/6 = ½.

Example 4: What is the chance that a leap year selected at random will contain 53 Sundays?
Solution: In a leap year (which consists of 366 days) there are 52 complete weeks and 2 days over.
The following are the possible combinations for these two over days:
(i) Sunday and Monday (ii) Monday and Tuesday (iii) Tuesday and Wednesday (iv) Wednesday
and Thursday (v) Thursday and Friday (vi) Friday and Saturday (vii) Saturday and Sunday.
In order that a leap year selected at random should contain 53 Sundays, one of the two over
days must be Sunday. Since out of the above 7 possibilities, 2, viz. (i) and (vii), are favourable to
this event,
Required probability = 2/7

Example 5: If two dice are rolled, what is the probability that the sum of the upturned faces will
equal 7?
Solution: We shall solve this problem under the assumption that all of the 36 possible outcomes
are equally likely. Since there are 6 possible outcomes, namely (1,6), (2,5), (3,4), (4,3), (5,2) and
(6,1), that result in the sum of the dice being equal to 7, the desired probability is 6/36 = 1/6.

Example 6: A bag contains 3 Red, 6 White and 7 Blue balls. What is the probability that two balls
drawn are one white and one blue?
Solution: Total number of balls = 3 + 6 + 7 = 16.
Out of 16 balls, 2 can be drawn in 16C2 ways.
Therefore exhaustive number of cases is 16C2 = 120.
Out of 6 white balls 1 ball can be drawn in 6C1 ways and out of 7 blue balls 1 ball can be
drawn in 7C1 ways. Since each of the former cases can be associated with each of the latter cases,
total number of favourable cases is 6C1 × 7C1 = 6 × 7 = 42.
The required probability is 42/120 = 7/20.

Example 7: A lot consists of 10 good articles, 4 with minor defects and 2 with major defects. Two
articles are chosen from the lot at random (without replacement). Find the probability that (i) both
are good, (ii) both have major defects, (iii) at least 1 is good, (iv) at most 1 is good, (v)exactly 1 is
good, (vi) neither has major defects and (vii) neither is good.
Solution: Although the articles may be drawn one after the other, we can consider that both articles
are drawn simultaneously, as they are drawn without replacement.
(i) P(both are good) = (No. of ways of drawing 2 good articles) / (Total no. of ways of drawing 2 articles)
\[ = \frac{{}^{10}C_2}{{}^{16}C_2} = \frac{3}{8} \]

(ii) P(both have major defects) = (No. of ways of drawing 2 articles with major defects) / (Total no. of ways)
\[ = \frac{{}^{2}C_2}{{}^{16}C_2} = \frac{1}{120} \]

(iii) P(at least 1 is good) = P(exactly 1 is good or both are good)
= P(exactly 1 is good and 1 is bad or both are good)
\[ = \frac{{}^{10}C_1 \times {}^{6}C_1 + {}^{10}C_2}{{}^{16}C_2} = \frac{7}{8} \]

(iv) P(at most 1 is good) = P(none is good or 1 is good and 1 is bad)
\[ = \frac{{}^{10}C_0 \times {}^{6}C_2 + {}^{10}C_1 \times {}^{6}C_1}{{}^{16}C_2} = \frac{5}{8} \]

(v) P(exactly 1 is good) = P(1 is good and 1 is bad)
\[ = \frac{{}^{10}C_1 \times {}^{6}C_1}{{}^{16}C_2} = \frac{1}{2} \]

(vi) P(neither has major defects) = P(both are non-major defective articles)
\[ = \frac{{}^{14}C_2}{{}^{16}C_2} = \frac{91}{120} \]

(vii) P(neither is good) = P(both are defective)
\[ = \frac{{}^{6}C_2}{{}^{16}C_2} = \frac{1}{8} \]
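
The seven answers of Example 7 can be recomputed with binomial coefficients. This is a sketch only, assuming Python 3.8 or later for math.comb; the dictionary answers and its labels are illustrative names, not part of the notes.

```python
from fractions import Fraction
from math import comb

good, minor, major = 10, 4, 2
bad = minor + major                           # 6 articles that are not good
total = comb(good + minor + major, 2)         # 16C2 = 120 equally likely pairs

answers = {
    "both good": Fraction(comb(good, 2), total),
    "both major": Fraction(comb(major, 2), total),
    "at least 1 good": Fraction(good * bad + comb(good, 2), total),
    "at most 1 good": Fraction(comb(bad, 2) + good * bad, total),
    "exactly 1 good": Fraction(good * bad, total),
    "neither major": Fraction(comb(good + minor, 2), total),
    "neither good": Fraction(comb(bad, 2), total),
}
for label, p in answers.items():
    print(label, p)
```

Note that "at least 1 good" and "neither good" are complementary, so their probabilities add up to 1, which is a quick sanity check on parts (iii) and (vii).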

Example 8: From 6 positive and 8 negative numbers, 4 numbers are chosen at random (without
replacement) and multiplied. What is the probability that the product is positive?
Solution: If the product is to be positive, all the 4 numbers must be positive or all the 4 must be
negative or 2 of them must be positive and the other 2 must be negative.
No. of ways of choosing 4 positive numbers = 6C4 = 15.
No. of ways of choosing 4 negative numbers = 8C4 = 70.
No. of ways of choosing 2 positive and 2 negative numbers = 6C2 × 8C2 = 420.
Total no. of ways of choosing 4 numbers from all the 14 numbers = 14C4 = 1001.
\[ P(\text{the product is positive}) = \frac{\text{No. of ways by which the product is positive}}{\text{Total no. of ways}} = \frac{15 + 70 + 420}{1001} = \frac{505}{1001} \]

Example 9: If 3 balls are “randomly drawn” from a bowl containing 6 white and 5 black balls,
what is the probability that one of the drawn balls is white and the other two black?
Solution: If we regard the order in which the balls are selected as being relevant, then the sample
space consists of 11∙ 10 ∙ 9 = 990 outcomes. Furthermore, there are 6∙ 5∙ 4 = 120 outcomes in
which the first ball selected is white and the other two black; 5 ∙ 6∙ 4 = 120 outcomes in which the
first is black, the second white and the third black; and 5∙ 4 ∙ 6 = 120 in which the first two are
black and the third white. Hence, assuming that “randomly drawn” means that each outcome in
the sample space is equally likely to occur, we see that the desired probability is
\[ \frac{120 + 120 + 120}{990} = \frac{4}{11} \]
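
Example 9 treats the order of drawing as relevant. The sketch below repeats that ordered count by brute force and compares it with the unordered count using combinations; both views should give the same probability. The variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

balls = ["W"] * 6 + ["B"] * 5

# Ordered draws: 11 * 10 * 9 = 990 equally likely outcomes.
ordered = list(permutations(range(len(balls)), 3))
favourable = sum(
    1 for draw in ordered
    if sorted(balls[i] for i in draw) == ["B", "B", "W"]
)
p_ordered = Fraction(favourable, len(ordered))

# Unordered draws: choose 1 of 6 white and 2 of 5 black out of 11C3 sets.
p_unordered = Fraction(comb(6, 1) * comb(5, 2), comb(11, 3))

print(len(ordered), favourable, p_ordered, p_unordered)
```

Either sample space is valid as long as numerator and denominator are counted in the same way; mixing an ordered numerator with an unordered denominator is the usual source of error.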

Example 10: In a large genetics study utilizing guinea pigs, Cavia sp., 30% of the offspring
produced had white fur and 40% had pink eyes. Two-thirds of the guinea pigs with white fur had
pink eyes. What is the probability of a randomly selected offspring having both white fur and pink
eyes?
Solution: P(W) = 0.30, P(Pi) = 0.40, and P(Pi | W) = 0.67. Utilizing the multiplication law of
probability (Section 4.3.1),
P(Pi ∩ W) = P(Pi | W) × P(W) = 0.67 × 0.30 = 0.20.
Twenty percent of all offspring are expected to have both white fur and pink eyes.

Example 11: Consider three gene loci in tomato. The first locus affects fruit shape with the oo
genotype causing oblate or flattened fruit and OO or Oo normal round fruit. The second locus
affects fruit color with yy having yellow fruit and YY or Yy red fruit. The final locus affects leaf
shape with pp having potato or smooth leaves and PP or Pp having the more typical cut leaves.
Each of these loci is located on a different pair of chromosomes and, therefore, acts independently
of the other loci. In the following cross OoYyPp × OoYypp, what is the probability that an
offspring will have the dominant phenotype for each trait? What is the probability that it will be
heterozygous for all three genes? What is the probability that it will have round, yellow fruit and
potato leaves?
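Solution: Genotypic array:
\[ \left(\tfrac{1}{4}OO + \tfrac{2}{4}Oo + \tfrac{1}{4}oo\right)\left(\tfrac{1}{4}YY + \tfrac{2}{4}Yy + \tfrac{1}{4}yy\right)\left(\tfrac{1}{2}Pp + \tfrac{1}{2}pp\right) \]

Phenotypic array:
\[ \left(\tfrac{3}{4}O\text{-} + \tfrac{1}{4}oo\right)\left(\tfrac{3}{4}Y\text{-} + \tfrac{1}{4}yy\right)\left(\tfrac{1}{2}P\text{-} + \tfrac{1}{2}pp\right) \]
The probability of the dominant phenotype for each trait from the phenotypic array above is
\[ P(O\text{-}Y\text{-}P\text{-}) = P(O\text{-}) \times P(Y\text{-}) \times P(P\text{-}) = \frac{3}{4} \times \frac{3}{4} \times \frac{1}{2} = \frac{9}{32}. \]
The probability of being heterozygous for all three genes from the genotypic array above is
\[ P(OoYyPp) = P(Oo) \times P(Yy) \times P(Pp) = \frac{2}{4} \times \frac{2}{4} \times \frac{1}{2} = \frac{4}{32} = \frac{1}{8}. \]
The probability of a round, yellow-fruited plant with potato leaves from the phenotypic
array above is
\[ P(O\text{-}yypp) = P(O\text{-}) \times P(yy) \times P(pp) = \frac{3}{4} \times \frac{1}{4} \times \frac{1}{2} = \frac{3}{32}. \]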
Each answer applies the probability rules for independent events to the separate gene loci.

Example 12: (a) Two cards are drawn at random from a well shuffled pack of 52 playing cards.
Find the chance of drawing two aces.
(b) From a pack of 52 cards, three are drawn at random. Find the chance that they are a
king, a queen and a knave.
(c) Four cards are drawn from a pack of cards. Find the probability that (i) all are diamond
(ii) there is one card of each suit (iii) there are two spades and two hearts.
Solution: (a) From a pack of 52 cards 2 can be drawn in 52C2 ways, all being equally likely.
Exhaustive number of cases is 52C2.
In a pack there are 4 aces and therefore 2 aces can be drawn in 4C2 ways.
\[ \therefore \text{Required probability} = \frac{{}^{4}C_2}{{}^{52}C_2} = \frac{1}{221} \]
(b) Exhaustive number of cases = 52C3.
A pack of cards contains 4 kings, 4 queens and 4 knaves. A king, a queen and a knave can
each be drawn in 4C1 ways and since each way of drawing a king can be associated with each of
the ways of drawing a queen and a knave, the total number of favourable cases = 4C1 × 4C1 × 4C1.
\[ \therefore \text{Required probability} = \frac{{}^{4}C_1 \times {}^{4}C_1 \times {}^{4}C_1}{{}^{52}C_3} = \frac{16}{5525} \]

(c) Exhaustive number of cases = 52C4.
\[ \text{(i) Required probability} = \frac{{}^{13}C_4}{{}^{52}C_4} \]
\[ \text{(ii) Required probability} = \frac{{}^{13}C_1 \times {}^{13}C_1 \times {}^{13}C_1 \times {}^{13}C_1}{{}^{52}C_4} \]
\[ \text{(iii) Required probability} = \frac{{}^{13}C_2 \times {}^{13}C_2}{{}^{52}C_4} \]

Example 13: What is the probability of getting 9 cards of the same suit in one hand at a game of
bridge?
Solution: One hand in a game of bridge consists of 13 cards.
∴ Exhaustive number of cases = 52C13.
The number of ways in which, in one hand, a particular player gets 9 cards of one suit is 13C9 and
the number of ways in which the remaining 4 cards are of some other suit is 39C4. Since there
are 4 suits in a pack of cards, total number of favourable cases is 4 × 13C9 × 39C4.
\[ \therefore \text{Required probability} = \frac{4 \times {}^{13}C_9 \times {}^{39}C_4}{{}^{52}C_{13}} \]
Example 14: A committee of 4 people is to be appointed from 3 officers of the production
department, 4 officers of the purchase department, two officers of the sales department and 1
chartered accountant. Find the probability of forming the committee in the following manner:
(i) There must be one from each category
(ii) It should have at least one from the purchase department
(iii) The chartered accountant must be in the committee.
Solution: There are 3 + 4 + 2 + 1 = 10 persons in all and a committee of 4 people can be formed
out of them in 10C4 ways. Hence exhaustive number of cases is 10C4 = 210.
(i) Favourable number of cases for the committee to consist of 4 members, one from each category,
is 4C1 × 3C1 × 2C1 × 1 = 24.
Required probability = 24/210 = 4/35
(ii) P(Committee has at least one purchase officer) = 1 – P(Committee has no purchase officer)
In order that the committee has no purchase officer, all the four members are to be selected
from amongst the officers of the production department, the sales department and the chartered
accountant, that is out of 3 + 2 + 1 = 6 members, and this can be done in 6C4 = 15 ways. Hence,
P(Committee has no purchase officer) = 15/210 = 1/14
P(Committee has at least one purchase officer) = 1 – 1/14 = 13/14

(iii) Favourable number of cases that the committee consists of a chartered accountant as a member
and three others is 1 × 9C3 = 84, since a chartered accountant can be selected out of one
chartered accountant in only 1 way and the remaining 3 members can be selected out of the
remaining 10 – 1 = 9 persons in 9C3 ways. Hence the required probability = 84/210 = 2/5.
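
The committee counts of Example 14 can be recomputed as a check. In this sketch (Python 3.8 or later assumed for math.comb, variable names illustrative), part (i) evaluates to 24/210 = 4/35, and the count for part (ii) is taken from the 6 people outside the purchase department.

```python
from fractions import Fraction
from math import comb

production, purchase, sales, accountant = 3, 4, 2, 1
people = production + purchase + sales + accountant
total = comb(people, 4)                               # 10C4 = 210 committees

# (i) one member from each of the four categories
p_one_each = Fraction(production * purchase * sales * accountant, total)

# (ii) complement rule: 1 - P(no purchase officer on the committee)
p_no_purchase = Fraction(comb(people - purchase, 4), total)
p_at_least_one_purchase = 1 - p_no_purchase

# (iii) accountant fixed, 3 more chosen from the remaining 9 people
p_accountant_in = Fraction(comb(people - 1, 3), total)

print(p_one_each, p_at_least_one_purchase, p_accountant_in)
```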

Example 15: A box contains 6 red, 4 white and 5 black balls. A person draws 4 balls from the
box at random. Find the probability that among the balls drawn there is at least one ball of each
colour.
Solution: The required event E that in a draw of 4 balls from the box at random there is at least
one ball of each colour can materialize in the following mutually disjoint ways:
(i) 1 Red, 1 White and 2 Black balls
(ii) 2 Red, 1 White and 1 Black balls
(iii) 1 Red, 2 White and 1 Black balls
Hence by addition rule of probability, the required probability is given by,
P(E) = P(i) + P(ii) + P(iii)
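\[ = \frac{{}^{6}C_1 \times {}^{4}C_1 \times {}^{5}C_2}{{}^{15}C_4} + \frac{{}^{6}C_2 \times {}^{4}C_1 \times {}^{5}C_1}{{}^{15}C_4} + \frac{{}^{6}C_1 \times {}^{4}C_2 \times {}^{5}C_1}{{}^{15}C_4} \]
\[ = \frac{240 + 300 + 180}{1365} = \frac{720}{1365} = 0.5275 \]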

Example 16: A problem in Statistics is given to the three students A, B and C whose chances of
solving it are 1/2, 3/4 and 1/4 respectively. What is the probability that the problem will be solved
if all of them try independently?
Solution: Let A, B and C denote the events that the problem is solved by the students A, B and C
respectively. Then
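P(A) = 1/2, P(B) = 3/4, P(C) = 1/4
\[ P(\bar{A}) = 1 - \tfrac{1}{2} = \tfrac{1}{2}, \qquad P(\bar{B}) = 1 - \tfrac{3}{4} = \tfrac{1}{4}, \qquad P(\bar{C}) = 1 - \tfrac{1}{4} = \tfrac{3}{4} \]

P(Problem solved) = P(At least one of them solves the problem)
= 1 – P(None of them solves the problem)
\[ = 1 - P(\overline{A \cup B \cup C}) \]
\[ = 1 - P(\bar{A} \cap \bar{B} \cap \bar{C}) \]
\[ = 1 - P(\bar{A})\, P(\bar{B})\, P(\bar{C}) \quad \text{(by independence)} \]
\[ = 1 - \frac{1}{2} \times \frac{1}{4} \times \frac{3}{4} = \frac{29}{32} \]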
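
The complement argument of Example 16 generalises to any number of independent events. The helper below, with the hypothetical name p_at_least_one, is a minimal sketch of that idea (math.prod requires Python 3.8 or later).

```python
from fractions import Fraction
from math import prod

def p_at_least_one(success_probabilities):
    """P(at least one of several independent events occurs)."""
    # 1 - P(none occurs); independence lets the complements multiply.
    return 1 - prod(1 - p for p in success_probabilities)

p_solved = p_at_least_one([Fraction(1, 2), Fraction(3, 4), Fraction(1, 4)])
print(p_solved)  # 29/32
```

The product formula is only valid under independence; for dependent events the joint probability of the complements has to be found some other way.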

Example 17: Three groups of children contain respectively 3 girls and 1 boy, 2 girls and 2 boys
and 1 girl and 3 boys. One child is selected at random from each group. Find the probability that
the three selected consist of 1 girl and 2 boys.
Solution: The required event of getting 1 girl and 2 boys among the three selected children can
materialize in the following three mutually exclusive cases:

Group No. →    I       II      III

(i)            Girl    Boy     Boy
(ii)           Boy     Girl    Boy
(iii)          Boy     Boy     Girl

By addition rule of probability,


Required probability = P(i) + P(ii) + P(iii)

Since the probability of selecting a girl from the first group is 3/4, of selecting a boy from
the second is 2/4, and of selecting a boy from the third group is 3/4, and since these three events of
selecting children from the three groups are independent of each other, we have,
\[ P(\text{i}) = \frac{3}{4} \times \frac{2}{4} \times \frac{3}{4} = \frac{9}{32} \]
\[ P(\text{ii}) = \frac{1}{4} \times \frac{2}{4} \times \frac{3}{4} = \frac{3}{32} \]
\[ P(\text{iii}) = \frac{1}{4} \times \frac{2}{4} \times \frac{1}{4} = \frac{1}{32} \]
\[ \text{Hence the required probability} = \frac{9}{32} + \frac{3}{32} + \frac{1}{32} = \frac{13}{32} \]
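
Example 17 can also be verified by listing every possible selection of one child per group. In the sketch below the encoding of each group as a string of G and B characters is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

groups = ["GGGB", "GGBB", "GBBB"]   # one character per child in each group

# Each child in a group is equally likely, so all 4 * 4 * 4 = 64 selections are too.
selections = list(product(*groups))
favourable = sum(1 for pick in selections if pick.count("G") == 1)
p_one_girl_two_boys = Fraction(favourable, len(selections))
print(favourable, len(selections), p_one_girl_two_boys)
```

Enumeration works here because every group has the same size; with unequal group sizes the selections are still equally likely, but the per-group probabilities no longer share a denominator.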

Exercise 4.1.

1. From a bag containing 3 red and 2 black balls, 2 balls are drawn at random. Find the probability
that they are of the same colour.

2. A card is drawn from a well-shuffled pack of playing cards. What is the probability that it is
either a spade or an ace?

3. The probability that a contractor will get a plumbing contract is 2/3 and the probability that he
will get an electric contract is 4/9. If the probability of getting at least one contract is 4/5, what
is the probability that he will get both?

4. What is the probability of getting at least 1 head when 2 coins are tossed?

5. If the probability that A solves a problem is ½ and that for B is ¾ and if they aim at solving a
problem independently, what is the probability that the problem is solved?
6. An urn contains 3 white balls, 4 red balls, and 5 black balls. Two balls are drawn from the urn
at random. Find the probability that (i) both of them are of the same colour and (ii) they are of
different colours.

7. Ten chips numbered 1 through 10 are mixed in a bowl. Two chips are drawn from the bowl
successively and without replacement. What is the probability that their sum is 10?

8. A box contains 4 white, 5 red and 6 black balls. Four balls are drawn at random from the box.
Find the probability that among the balls drawn, there is at least 1 ball of each colour.

9. (i) Four persons are chosen at random from a group consisting of 4 men, 3 women and 2 children.
Find the chance that the selected group contains at least 1 child.
(ii) A committee of 6 is to be formed from 5 lecturers and 3 professors. If the members of the
committee are chosen at random, what is the probability that there will be a majority of
lecturers in the committee?

10. Suppose that A and B are mutually exclusive events for which P(A) = .3 and P(B)= .5. What
is the probability that
(a) either A or B occurs;
(b) A occurs but B does not;
(c) both A and B occur?
11. Sixty percent of the students at a certain school wear neither a ring nor a necklace. Twenty
percent wear a ring and 30 percent wear a necklace. If one of the students is chosen randomly,
what is the probability that this student is wearing
(a) a ring or a necklace;
(b) a ring and a necklace? Ans: 0.4; 0.1

12. An urn contains 5 red, 6 blue, and 8 green balls. If a set of 3 balls is randomly selected, what
is the probability that the balls will be (a) all of the same color; (b) all of different colors?
Ans: 0.0888

13. There are 30 psychiatrists and 24 psychologists attending a certain conference. Three of these
54 people are randomly chosen to take part in a panel discussion. What is the probability that
at least one psychologist is chosen? Ans: 0.8363

14. Two cards are chosen at random from a deck of 52 playing cards. What is the probability that
they
(a) are both aces;
(b) have the same value? Ans: 0.0045; 0.0588

15. An instructor gives her class a set of 10 problems with the information that the final exam will
consist of a random selection of 5 of them. If a student has figured out how to do 7 of the
problems, what is the probability that he or she will answer correctly
(a) all 5 problems;
(b) at least 4 of the problems? Ans: 0.0833; 0.5
16. If there are 12 strangers in a room, what is the probability that no two of them celebrate their
birthday in the same month?
17. A group of 6 men and 6 women is randomly divided into 2 groups of size 6 each. What is the
probability that both groups will have the same number of men?
Ans: 0.4329

18. If a zoologist has 6 male guinea pigs and 9 female guinea pigs, and randomly selects 2 of them
for an experiment, what are the probabilities that
(a) both will be males?
(b) both will be females?
(c) there will be one of each sex? Ans: 0.143; 0.343; 0.514
19. Suppose you are planning to study a species of crayfish in the ponds at a wildlife preserve.
Unknown to you 15 of the 40 ponds available lack this species. Because of time constraints
you feel you can survey only 12 ponds. What is the probability that you choose 8 ponds with
crayfish and 4 ponds without crayfish?
Ans:0.264

20. In a study of the effects of acid rain on fish populations in Adirondack mountain lakes, samples
of yellow perch, Perca flavescens, were collected. Forty percent of the fish had gill filament
deformities and 70% were stunted. Twenty percent exhibited both abnormalities.
(a) Find the probability that a randomly sampled fish will be free of both symptoms.
(b) If a fish has a gill filament deformity, what is the probability it will be stunted?
(c) Are the two symptoms independent of each other? Explain.

4.3. Conditional Probability and Bayes’ Theorem

4.3.1. Conditional Probability and Multiplication Law


For two events A and B
P(A∩B) = P(A) . P(B/A), P(A) > 0
= P(B) . P(A/B), P(B) > 0
where P(B/A) represents the conditional probability of occurrence of B when the event A has
already happened and P(A/B) is the conditional probability of occurrence of A when the event B
has already happened.

4.3.2. Theorem of Total Probability:


If B1, B2, … , Bn be a set of exhaustive and mutually exclusive events, and A is another event
associated with (or caused by) Bi, then
\[ P(A) = \sum_{i=1}^{n} P(B_i)\, P(A/B_i) \]

4.3.3. Solved Examples

Example 18 : A box contains 4 bad and 6 good tubes. Two are drawn out from the box at a time.
One of them is tested and found to be good. What is the probability that the other one is also
good?
Solution: Let A = one of the tubes drawn is good and B = the other tube is good.
P(A∩B) = P(both tubes drawn are good)
\[ = \frac{{}^{6}C_2}{{}^{10}C_2} = \frac{1}{3} \]
Knowing that one tube is good, the conditional probability that the other tube is also
good is required, i.e., P(B/A) is required.
By definition,
\[ P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{6/10} = \frac{5}{9} \]

Example 19: A bolt is manufactured by 3 machines A, B and C. A turns out twice as many items
as B, and machines B and C produce equal numbers of items. 2% of bolts produced by A and B are
defective and 4% of bolts produced by C are defective. All bolts are put into 1 stock pile and one
is chosen from this pile. What is the probability that it is defective?
Solution: Let A = the event in which the item has been produced by machine A, and so on.
Let D = the event of the item being defective.
\[ P(A) = \frac{1}{2}, \qquad P(B) = P(C) = \frac{1}{4} \]
P(D/A) = P(an item is defective, given that A has produced it)
= 2/100 = P(D/B)
P(D/C) = 4/100
By theorem of total probability,
P(D) = P(A) × P(D/A) + P(B) × P(D/B) + P(C) × P(D/C)
\[ = \frac{1}{2} \times \frac{2}{100} + \frac{1}{4} \times \frac{2}{100} + \frac{1}{4} \times \frac{4}{100} = \frac{1}{40} \]
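
The theorem of total probability is a weighted sum, which the sketch below applies to the machine data of Example 19. The function name total_probability and its argument names are assumptions introduced for this illustration.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Sum of P(B_i) * P(A/B_i) over a partition B_1, ..., B_n."""
    # The B_i must be exhaustive and mutually exclusive, so the priors sum to 1.
    assert sum(priors) == 1, "priors must sum to 1"
    return sum(p * l for p, l in zip(priors, likelihoods))

# Example 19: shares of machines A, B, C and their defect rates.
priors = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
defect_rates = [Fraction(2, 100), Fraction(2, 100), Fraction(4, 100)]
p_defective = total_probability(priors, defect_rates)
print(p_defective)  # 1/40
```

The same helper covers any problem in which an outcome is reached through one of several mutually exclusive routes, provided the route probabilities and the conditional probabilities are known.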

Example 20: In a coin tossing experiment, if the coin shows head, one die is thrown and the result
is recorded. But if the coin shows tail, 2 dice are thrown and their sum is recorded. What is the
probability that the recorded number will be 2?
Solution: When a single die is thrown, P(2) = 1/6.
When 2 dice are thrown, the sum will be 2 only if each die shows 1.
\[ P(\text{getting 2 as sum with 2 dice}) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36} \quad \text{(by independence)} \]
By theorem of total probability,
\[ P(2) = P(H) \times P(2/H) + P(T) \times P(2/T) = \frac{1}{2} \times \frac{1}{6} + \frac{1}{2} \times \frac{1}{36} = \frac{7}{72} \]

Example 21: An urn contains 10 white and 3 black balls. Another urn contains 3 white and 5
black balls. Two balls are drawn at random from the first urn and placed in the second urn and then
one ball is taken at random from the latter. What is the probability that it is a white ball?
Solution: The two balls transferred may be both white or both black or one white and one black.
Let B1 = event of drawing 2 white balls from the first urn, B2 = event of drawing 2
black balls from it and B3 = event of drawing one white and one black ball from it.
Clearly B1, B2 and B3 are exhaustive and mutually exclusive events.
Let A = event of drawing a white ball from the second urn after transfer.
\[ P(B_1) = \frac{{}^{10}C_2}{{}^{13}C_2} = \frac{15}{26} \]
\[ P(B_2) = \frac{{}^{3}C_2}{{}^{13}C_2} = \frac{1}{26} \]
\[ P(B_3) = \frac{10 \times 3}{{}^{13}C_2} = \frac{10}{26} \]
P(A/B1) = P(drawing a white ball / 2 white balls have been transferred)
= P(drawing a white ball / urn II contains 5 white and 5 black balls)
= 5/10
Similarly, P(A/B2) = 3/10 and P(A/B3) = 4/10.
By theorem of total probability,
P(A) = P(B1) × P(A/B1) + P(B2) × P(A/B2) + P(B3) × P(A/B3)
\[ = \frac{15}{26} \times \frac{5}{10} + \frac{1}{26} \times \frac{3}{10} + \frac{10}{26} \times \frac{4}{10} = \frac{59}{130} \]

Example 22: In 1989 there were three candidates for the position of principal, Mr. Chatterji, Mr.
Ayangar and Mr. Singh, whose chances of getting the appointment are in the proportion 4 : 2 : 3
respectively. The probability that Mr. Chatterji if selected would introduce co-education in the
college is 0.3. The probabilities of Mr. Ayangar and Mr. Singh doing the same are respectively 0.5
and 0.8. What is the probability that there will be co-education in the college?
Solution: Let the events and probabilities be defined as follows:
A: Introduction of co-education
E1: Mr.Chatterji is selected as principal
E2: Mr.Ayangar is selected as principal
E3: Mr.Singh is selected as principal
Then,
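\[ P(E_1) = \frac{4}{9}, \qquad P(E_2) = \frac{2}{9}, \qquad P(E_3) = \frac{3}{9} \]
P(A/E1) = 0.3, P(A/E2) = 0.5, P(A/E3) = 0.8

\[ P(A) = P[(A \cap E_1) \cup (A \cap E_2) \cup (A \cap E_3)] \]
\[ = P(A \cap E_1) + P(A \cap E_2) + P(A \cap E_3) \]
= P(E1) P(A/E1) + P(E2) P(A/E2) + P(E3) P(A/E3)
\[ = \frac{4}{9} \times \frac{3}{10} + \frac{2}{9} \times \frac{5}{10} + \frac{3}{9} \times \frac{8}{10} = \frac{23}{45} \]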

4.3.4. Bayes’ theorem

If E1, E2, … , En are mutually disjoint events with P(Ei) ≠ 0 (i = 1, 2, … , n), then for any
arbitrary event A which is a subset of E1 ∪ E2 ∪ … ∪ En such that P(A) > 0, we have,
\[ P(E_i/A) = \frac{P(E_i)\, P(A/E_i)}{\sum_{j=1}^{n} P(E_j)\, P(A/E_j)}, \qquad i = 1, 2, \dots, n \]

4.3.5. Solved Examples


Example 23. A bag contains 5 balls and it is not known how many of them are white. Two balls
are drawn at random from the bag and they are noted to be white. What is the chance that all the
balls in the bag are white?
Solution: Since 2 white balls have been drawn out, the bag must have contained 2, 3, 4 or 5 white
balls.
Let B1 = Event of the bag containing 2 white balls, B2 = Event of the bag containing 3
white balls, B3 = Event of the bag containing 4 white balls and B4 = Event of the bag containing 5
white balls.

Let A = Event of drawing 2 white balls.

\[ P(A/B_1) = \frac{{}^{2}C_2}{{}^{5}C_2} = \frac{1}{10}, \qquad P(A/B_2) = \frac{{}^{3}C_2}{{}^{5}C_2} = \frac{3}{10} \]
\[ P(A/B_3) = \frac{{}^{4}C_2}{{}^{5}C_2} = \frac{6}{10} = \frac{3}{5}, \qquad P(A/B_4) = \frac{{}^{5}C_2}{{}^{5}C_2} = 1 \]
Since the number of white balls in the bag is not known, the Bi’s are equally likely.
\[ P(B_1) = P(B_2) = P(B_3) = P(B_4) = \frac{1}{4} \]
By Bayes’ theorem,
\[ P(B_4/A) = \frac{P(B_4) \times P(A/B_4)}{\sum_{i=1}^{4} P(B_i) \times P(A/B_i)} = \frac{\frac{1}{4} \times 1}{\frac{1}{4} \times \left( \frac{1}{10} + \frac{3}{10} + \frac{3}{5} + 1 \right)} = \frac{1}{2} \]

Example 24: There are 3 true coins and 1 false coin with ‘head’ on both sides. A coin is chosen at
random and tossed 4 times. If ‘head’ occurs all the 4 times, what is the probability that the false
coin has been chosen and used?
Solution:
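\[ P(T) = P(\text{the coin is a true coin}) = \frac{3}{4} \]
\[ P(F) = P(\text{the coin is a false coin}) = \frac{1}{4} \]

Let A = Event of getting all heads in 4 tosses.

\[ \text{Then } P(A/T) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{16} \text{ and } P(A/F) = 1 \]
By Bayes’ theorem,
\[ P(F/A) = \frac{P(F) \times P(A/F)}{P(F) \times P(A/F) + P(T) \times P(A/T)} = \frac{\frac{1}{4} \times 1}{\frac{1}{4} \times 1 + \frac{3}{4} \times \frac{1}{16}} = \frac{16}{19} \]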

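Example 25: The contents of urns I, II and III are as follows:

Urn I: 1 white, 2 black and 3 red balls
Urn II: 2 white, 1 black and 1 red balls
Urn III: 4 white, 5 black and 3 red balls
One urn is chosen at random and two balls are drawn. They happen to be white and red.
What are the probabilities that they came from urn I, II or III, respectively?

Solution: Let E1, E2 and E3 denote the events that the urn I, II and III is chosen, respectively, and
let A be the event that the two balls taken from the selected urn are white and red. Then
\[ P(E_1) = P(E_2) = P(E_3) = \frac{1}{3} \]
\[ P(A/E_1) = \frac{1 \times 3}{{}^{6}C_2} = \frac{1}{5} \]
\[ P(A/E_2) = \frac{2 \times 1}{{}^{4}C_2} = \frac{1}{3} \]
\[ P(A/E_3) = \frac{4 \times 3}{{}^{12}C_2} = \frac{2}{11} \]

\[ \text{Hence } P(E_2/A) = \frac{P(E_2)\, P(A/E_2)}{\sum_{i=1}^{3} P(E_i)\, P(A/E_i)} = \frac{\frac{1}{3} \times \frac{1}{3}}{\frac{1}{3} \times \frac{1}{5} + \frac{1}{3} \times \frac{1}{3} + \frac{1}{3} \times \frac{2}{11}} = \frac{55}{118} \]

\[ \text{Similarly, } P(E_3/A) = \frac{\frac{1}{3} \times \frac{2}{11}}{\frac{1}{3} \times \frac{1}{5} + \frac{1}{3} \times \frac{1}{3} + \frac{1}{3} \times \frac{2}{11}} = \frac{30}{118} \]

\[ \text{Therefore } P(E_1/A) = 1 - \frac{55}{118} - \frac{30}{118} = \frac{33}{118} \]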
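
The posterior probabilities of Example 25 can be recomputed in one pass. The sketch below is a minimal illustration; the function name posterior and its arguments are assumptions, and the printed fractions appear in lowest terms (so 30/118 is shown as 15/59).

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return the list of P(E_i/A) given P(E_i) and P(A/E_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(E_i) * P(A/E_i)
    evidence = sum(joint)            # P(A), by the theorem of total probability
    return [j / evidence for j in joint]

# Example 25: three urns chosen with equal probability.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1, 5), Fraction(1, 3), Fraction(2, 11)]
urn_posteriors = posterior(priors, likelihoods)
print(urn_posteriors)
```

Because the priors are equal here, they cancel and the posteriors are simply the likelihoods rescaled to sum to 1; with unequal priors, as in Example 24, the weighting changes the result.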

Exercise 4.2
1. Bag I contains 2 white and 3 black balls and bag II contains 4 white and 1 black balls. A ball
chosen at random from one of the bags is white. What is the probability that it has come from
bag I?

2. Five men out of 100 and 25 women out of 1000 are colour-blind. A colour-blind person is chosen
at random. What is the probability that the person is a male? (Assume males and females are
in equal numbers).

3. There are 2 bags, one of which contains 5 red and 8 black balls and the other 7 red and 10 black
balls. A ball is drawn from one or the other of the 2 bags. Find the chance of drawing a red
ball.

4. In a bolt factory, machines A, B and C produce 25%, 35% and 40% of the total output,
respectively. Of their outputs, 5, 4 and 2%, respectively, are defective bolts. If a bolt is chosen
at random from the combined output, what is the probability that it is defective? If a bolt chosen
at random is found to be defective, what is the probability that it was produced by B or C?

5. There are 4 candidates for the office of the highway commissioner; the respective probabilities
that they will be selected are 0.3, 0.2, 0.4 and 0.1, and the probabilities for a project’s approval
are 0.35, 0.85, 0.45 and 0.15, depending on which of the 4 candidates is selected. What is the
probability of the project getting approved?

6. Urn I has 2 white and 3 black balls, urn II has 4 white and 1 black balls and urn III has 3 white
and 4 black balls. An urn is selected at random and a ball drawn at random is found to be white.
Find the probability that urn I was selected.

7. Three urns contain 3 white, 1 red and 1 black balls; 2 white, 3 red and 4 black balls; 1 white, 3
red and 2 black balls respectively. One urn is chosen at random and from it 2 balls are drawn
at random. If they are found to be 1 red and 1 black ball, what is the probability that the first
urn was chosen?

8. Find the probability of drawing a queen and a king from a pack of cards in two consecutive
draws, the cards drawn not being replaced.

9. Police plan to enforce speed limits by using radar traps at 4 different locations within the city
limits. The radar traps at the locations L1, L2, L3 and L4 are operated 40%, 30%,
20% and 30% of the time, respectively. If a person who is speeding on his way to work has
probabilities of 0.2, 0.1, 0.5 and 0.2, respectively, of passing through these locations, what is the
probability that he will receive a speeding ticket?

10. The blood type distribution in the United States at the time of World War II was thought to be
type A, 41%; type B, 9%; type AB, 4% and type O, 46%. It is estimated that during World War
II, 4% of inductees with type O blood were typed as having type A, 88% of those with type A
blood were correctly typed; 4% with type B blood were typed as A and 10% with type AB
were typed as A. A soldier was wounded and brought to surgery. He was tested and typed as
having type A blood. What is the probability that this was his correct blood type?
