Statistics: Ankon Gopal Banik Umme Hasunat Toafia Kanij Fatema Koli Puja Sutradhar
Contents
Chapter 01: Introduction
Chapter 02: Collection and presentation of data
Chapter 03: Measure of central tendency and location
Chapter 04: Measures of dispersion
Chapter 06: Probability
Chapter 01: Introduction
Statistics is the science that deals with methods of collecting, classifying, presenting, comparing,
and interpreting numerical data collected to throw light on any sphere of inquiry.
Variable: A characteristic whose value varies from unit to unit is called a variable.
Variables are of two types:
1. Qualitative variable: A variable that cannot normally be measured by a numerical figure
is known as a qualitative variable. For example, occupation of a person, type of disease of a
patient, qualification of a doctor, type of computer, experience of a computer scientist, rank
of a nurse, etc.
2. Quantitative variable: A variable that is measured by a numerical figure is known as a
quantitative variable. For example, age of a patient, height of a patient, blood pressure level of
a patient, sugar level of a patient, duration of disease of a patient, number of patients admitted
on different days, etc.
Quantitative variables are of two types:
a) Discrete variable: A variable that takes only integral values is known as a discrete
variable. For example, number of children ever born to a mother, number of dead children of a
mother, number of patients admitted per day in a hospital, number of doctors/nurses in a
hospital, number of computers in a computer lab, number of expert programmers, etc.
b) Continuous variable: A variable that can take integral as well as fractional
values is known as a continuous variable. For example, age of a patient, height of a patient, blood
pressure level of a patient, sugar level of a patient, time to develop programs, time to send
emails, daily consumption of electricity, etc.
The classification of variables can be displayed as follows:
Variable
   1. Qualitative
   2. Quantitative
      a) Discrete
      b) Continuous
A variable, discrete or continuous, can again be classified as -
(i) Random variable: If the values of a variable are observed from different units
selected by a random process, then the variable is known as a random variable. For
example, let there be N = 500 patients in a hospital, of whom n = 100 patients are
selected by a random process for investigation. If the variable is observed
from those 100 selected patients, then it is a random variable.
(ii) Non-random variable: If the variable is observed from all units in the population,
then it is known as a non-random variable.
•••Importance of statistics: Statistical methods give a clearer picture of any problem
affecting the welfare of mankind. The importance of statistics can be pointed out as follows -
1. Statistics of birth, death, marriage, migration etc. are important for planning the welfare of
people.
2. Statistics of family planning adoption, service provided for family planning activities, current
fertility level, child mortality level, level of education etc. are important for sound population
policy.
3. Statistics of child health care, health care of pregnant mothers, neonatal and post neonatal
death etc. are needed for planning improved family planning activities.
4. Statistics of current number of doctors, nurses, health care units, hospitals, clinics, system of
service provided in hospitals/clinics etc. are needed for planning efficient hospital
management.
5. Statistics of current academic institutes, trained students and the ability of trained teachers are
needed to develop the educational system of the country.
6. Statistics of the current status of the digital system will help in planning the future IT system.
7. Statistics of production, import and distribution are also needed to improve the health care system.
•••Limitations of statistics: Despite its wide application in medical science, population science,
the industrial sector and the administrative sector, statistics has some limitations which restrict its scope
and utility. These limitations are-
1. Qualitative characters are very prominent in medical and biological science and in administrative
science. But, except for some classification by qualitative character, such a variable cannot be
used for mathematical treatment. However, a qualitative variable can be used for
mathematical treatment if its values are scored by assigning appropriate numbers.
2. Statistics draws conclusions about groups of population units. So, it needs a mass of data instead of data
from a single unit of the population.
3. In analyzing bio-statistical data, statistical inference is drawn with a certain level of accuracy
based on probability theory. Nothing is concluded with 100% confidence.
4. Statistics cannot prove anything. It plays an auxiliary role in summarizing facts.
5. Statistical measurement shows an error due to the difference between the true population value
and the value estimated from a sample survey, though this error can be reduced by efficient
planning of the survey.
Chapter 02: Collection and presentation of data
Population: By population we mean the total of units which are under investigation according to a
pre-determined objective and are available in a specified area during a specified time period.
For example, suppose we want to study the performance of computer centers working in Dhaka city. Here
each computer center of any area is a unit and all computer centers of the city constitute the
population of computer centers. The total number of units is denoted by 'N'.
Population is of two types-
1. Finite population: If a population has a definite number of units, it is called a finite population.
For example, the population of mobile operators' servicing centers of an area, the
population of nurses working in different private clinics etc.
2. Infinite population: If the number of units of a population is not known to the
researcher, it is known as an infinite population.
Unit: A unit in a statistical analysis is one member of a set of entities being studied.
Common examples of a unit would be a single person, animal, plant, manufactured item, or country
that belongs to a larger collection of such entities being studied.
Sample: A representative part of the population units that is under investigation is known as a
sample. Sample size is denoted by n (≤ N).
For example, if we want to know the condition of the fish in a pond, we cannot check every fish, but we
can take some of the fish and check them.
Sampling unit: The population units which are to be investigated for a pre-determined objective
are known as sampling units. For example, when studying a group of college students, a single
student could be a sampling unit.
Data: Any measurement of one or more variables recorded either from population units or from
sample units is known as data.
For example, the number of computers of a selected group of computer centers are as follows -
Number of computers: 5, 8, 7, 12, 6, 10, 8, 7, 11, 12, 10, 9.
*** The following data represent the number of workers in different small scale industries in the
country-
16 25 28 32 26 25 25 20 20 22 24 26 28 30 35 32 17 20 22 22 24 25 26 28 20 18 26 28
30 30 32 34 31 36 30 35 28 27 21 24 20 18 15 15 15 18 20 22 36 26 21 23 24 26 28 30
32 34 28 27 15 20 19 26 16 24 20 18 20 20 24 27 25 25 25 26 20 21 20 28 17 30 32 33
30 28 26 24 26 24 20 18 19 18 15 16 23 18 15 17 18 20 20 18 18 19 20 21 27 25 26 19
29 20 24 26 27 29 30 32 34 28 30 27 26 28 28 33
I. Prepare a frequency table of discontinuous type (ungrouped).
II. Find number of industries having 25 workers or more.
III. Find number of industries having 20 workers or less.
IV. Prepare a frequency table (grouped).
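The four tasks above can be sketched in Python as follows. This is only an illustrative sketch: the data are the worker counts listed above, while the class width of 5 starting at 15 (classes 15-19, 20-24, ...) is an assumption made here for the grouped table, not a choice stated in the text.

```python
from collections import Counter

# Number of workers in each industry, as listed in the exercise.
workers = [
    16, 25, 28, 32, 26, 25, 25, 20, 20, 22, 24, 26, 28, 30, 35, 32, 17, 20, 22, 22, 24, 25, 26, 28, 20, 18, 26, 28,
    30, 30, 32, 34, 31, 36, 30, 35, 28, 27, 21, 24, 20, 18, 15, 15, 15, 18, 20, 22, 36, 26, 21, 23, 24, 26, 28, 30,
    32, 34, 28, 27, 15, 20, 19, 26, 16, 24, 20, 18, 20, 20, 24, 27, 25, 25, 25, 26, 20, 21, 20, 28, 17, 30, 32, 33,
    30, 28, 26, 24, 26, 24, 20, 18, 19, 18, 15, 16, 23, 18, 15, 17, 18, 20, 20, 18, 18, 19, 20, 21, 27, 25, 26, 19,
    29, 20, 24, 26, 27, 29, 30, 32, 34, 28, 30, 27, 26, 28, 28, 33,
]

# I. Ungrouped (discontinuous) frequency table: one row per distinct value.
freq = Counter(workers)
for value in sorted(freq):
    print(f"{value:>3} workers: {freq[value]:>3} industries")

# II and III. Counts read off the frequency table.
at_least_25 = sum(count for value, count in freq.items() if value >= 25)
at_most_20 = sum(count for value, count in freq.items() if value <= 20)
print("25 workers or more:", at_least_25)
print("20 workers or less:", at_most_20)

# IV. Grouped table; class width and starting point are assumptions.
width, start = 5, 15
grouped = Counter(start + width * ((v - start) // width) for v in workers)
for lower in sorted(grouped):
    print(f"{lower}-{lower + width - 1}: {grouped[lower]}")
```

A different class width would give a different grouped table; the ungrouped table and the two counts do not depend on that choice.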
Solve: I.
Title: The distribution of industries by number of workers
Title: The distribution of industries by number of workers
***The following data represent the production of tea (in kg) per day in a tea garden-
8.8 4.8 6.2 8.4 10.0 12.5 6.7 3.8 8.8 10.0 8.5 3.6 5.4 8.8
9.6 9.7 10.4 11.6 12.8 8.9 10.2 14.0 6.6 7.2 8.6 5.8 6.7 9.7
7.2 11.2 4.8 5.0 5.6 6.6 7.2 3.4 7.6 8.8 9.0 10.2 6.7 11.0
11.4 12.8 13.4 8.6 4.7 7.2 9.8 10.4 6.5 7.8 9.7 5.8 6.7 11.5
10.4 13.0 14.4 10.4 5.5 5.6 6.8 7.4 9.4 12.0 12.5 11.2 14.0 10.0
6.8 9.5 5.8 4.7 6.6 5.2 9.2 10.2 9.0 8.0 7.4 10.0 8.2 9.5
Title: The distribution representing tea production (in kg) in different days
variable. On the other hand, diagrams furnish only approximate information relating
to figures and do not add anything to the meaning of the data.
• Graphs are helpful to indicate the further statistical analysis, while diagrams cannot
do so.
Example: The following data represent the number of dead patients of a hospital in different
years-
Year:                  2006   2007   2008   2009   2010   2011
No. of dead patients:   725    680    550    650    540    500
❖ Bar diagram: This diagram is similar to a line diagram except that the value of a variable is
shown by a rectangle instead of a line.
It is specially used to present the values of different levels of a qualitative variable, where the
levels of the qualitative variable are plotted on the X-axis and the values of the different levels are
plotted on the Y-axis. The value of each level is shown by a rectangle parallel to the Y-axis.
This rectangle is also called a bar, and hence the diagram is called a bar diagram. The distance from
bar to bar should be half of the width of the bar.
*** The following data represent the monthly rainfall (in mm) in August and September
of 1998 in different Meteorological stations of Bangladesh-
Rainfall (in mm)
Station:     Chittagong   Noakhali   Comilla   Dhaka   Rajshahi   Khulna
August:            1194        897       336     552        268      258
September:          224        313       281     246        310      300
Solve: Bar diagram representing the rainfall of two months in 1998 in different
Meteorological stations of Bangladesh-
[Figure: grouped bar diagram of rainfall (in mm, Y-axis from 0 to 1400) for Chittagong, Noakhali, Comilla, Dhaka, Rajshahi and Khulna, with one bar for August and one for September at each station]
❖ Pie Diagram: This diagram is also used to present the values of different levels of a
qualitative variable, where the values are transformed to angles and the angles are drawn within
a circle. The total angle of a circle is 360°. Accordingly, the value of a particular level of the
qualitative variable is transformed to an angle in proportion to its share of the total, so that the
angles add up to 360°. All the angles, except the last one, are drawn within the circle one after
another; the remaining sector gives the last angle.
***The following data represent the number of diabetic patients visiting in a week in
different centers in a city-
Center   No. of patients   % of patients   Angle (degrees)
A        1500              15.6            1500 × 360 / 9625 = 56.1
B        3660              38.0            3660 × 360 / 9625 = 136.9
C        1890              19.6            1890 × 360 / 9625 = 70.7
D        2575              26.8            2575 × 360 / 9625 = 96.3
Total    9625              100.0           360.0
Solve: Pie diagram representing the number of diabetic patients visiting in a week in
different centers in a city-
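The conversion from counts to pie-chart angles can be checked with a few lines of Python. This is a minimal sketch using the patient counts from the table; the variable names are illustrative only.

```python
# Weekly number of patients per center, taken from the table.
patients = {"A": 1500, "B": 3660, "C": 1890, "D": 2575}
total = sum(patients.values())

# Each angle is the center's share of the total, scaled to 360 degrees.
angles = {center: count * 360 / total for center, count in patients.items()}
percents = {center: count * 100 / total for center, count in patients.items()}

for center in patients:
    print(f"{center}: {percents[center]:.1f}% -> {angles[center]:.1f} degrees")
print("Sum of angles:", round(sum(angles.values()), 6))
```

Because every angle is the same fraction of 360° as the count is of the total, the angles necessarily add up to a full circle.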
❖ Frequency Curve: This graph is also used to represent the frequency of a continuous class of
the quantitative variable, where class intervals are plotted in X-axis and frequencies are
plotted in Y-axis with appropriate scales.
Chapter 03: Measure of Central Tendency and Location
Measure of central tendency: A measure which reflects the complete data set and falls in
its center is known as a measure of central tendency, since it tends to lie in the center. Some
measures of central tendency are as follows-
Arithmetic mean: Let x1, x2, …, xn be n observations recorded from any statistical investigation.
Then the arithmetic mean (A.M.) is defined by-
A.M., $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
For example, let us consider that the following are the incomes per day (in taka) of some laborers
selected by a random process-
125.50, 100.00, 110.00, 115.50, 90.00, 95.00, 110.00, 125.00, 120.00, 115.00
The mean income of these n = 10 laborers is-
A.M., $\bar{x} = \frac{1}{n} \sum x_i = \frac{1106}{10} = 110.60$ taka
Again, let us consider that the income (in taka) data are recorded from 100 randomly selected
laborers and are classified as follows-
Income (in taka), xi: 90.00 95.00 100.00 110.00 115.00 120.00 125.00
No. of laborers, fi: 5 18 32 25 12 5 3
The average income of these 100 laborers is calculated by-
A.M., $\bar{x} = \frac{1}{N} \sum_{i=1}^{k} f_i x_i$, where $N = \sum_{i=1}^{k} f_i = 100$
$= \frac{10465.00}{100} = 104.65$ taka
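Both arithmetic means above can be reproduced with a short Python sketch. The data are the incomes and frequencies given in the text; the variable names are illustrative.

```python
from statistics import mean

# Raw data: daily income (in taka) of 10 laborers.
incomes = [125.50, 100.00, 110.00, 115.50, 90.00, 95.00, 110.00, 125.00, 120.00, 115.00]
raw_mean = mean(incomes)

# Frequency distribution: income values x_i with frequencies f_i.
values = [90, 95, 100, 110, 115, 120, 125]
freqs = [5, 18, 32, 25, 12, 5, 3]
N = sum(freqs)
grouped_mean = sum(f * x for f, x in zip(freqs, values)) / N

print(f"Mean of raw data: {raw_mean:.2f} taka")
print(f"Mean from frequency table: {grouped_mean:.2f} taka (N = {N})")
```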
***The following are the number of customers of a bank who are served within 10 minutes of
their arrival in different days-
***The following data represent the production of garments (in 1000 pieces) of different industries
in a day and the working hours of the industry per day.
Industry 1 2 3 4
Production, Xi 5.0 3.0 5.5 4.0
Working hours, Wi 10 8 12 10
Calculate average production per day of the industries.
Solve: The average production, weighted by working hours, is-
$\bar{x} = \frac{1}{\sum w_i} \sum_{i=1}^{4} w_i x_i = \frac{180}{40} = 4.5$ (in 1000 pieces)
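The weighted mean used in this solution can be written as a small helper. This is a sketch only; `weighted_mean` is a name introduced here for illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

production = [5.0, 3.0, 5.5, 4.0]   # in 1000 pieces
hours = [10, 8, 12, 10]             # working hours, used as weights

average_production = weighted_mean(production, hours)
print(average_production)
```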
Properties of arithmetic mean:
1. The sum of the deviations of the observations from the mean is zero.
Proof: Let xi be the mid-value of the i-th class and fi the frequency of that class; then the
arithmetic mean is
$\bar{x} = \frac{1}{N} \sum_{i=1}^{k} f_i x_i$
Now, writing $x_i = a + h z_i$, that is $z_i = (x_i - a)/h$,
$\frac{1}{N} \sum f_i x_i = \frac{1}{N} \sum f_i a + \frac{h}{N} \sum f_i z_i$
Or, $\bar{x} = a + h\bar{z}$, where $\bar{z} = \frac{1}{N} \sum_{i=1}^{k} f_i z_i$
Thus the arithmetic mean depends on change of origin and scale. Usually, the middle or
approximately middle value of xi is taken as 'a' and h is the width of the class interval.
***The average blood pressure (mm Hg) of a group of people is 130. The distribution of people by
blood pressure is shown below-
***Find the weighted average of first n natural numbers, where weights are the respective values
of the numbers.
Solve: We have, xi : 1, 2, 3, …, n
wi : 1, 2, 3, …, n
The weighted mean of x is-
$\bar{x} = \frac{1^2 + 2^2 + 3^2 + \dots + n^2}{1 + 2 + 3 + \dots + n} = \frac{n(n+1)(2n+1)/6}{n(n+1)/2} = \frac{1}{3}(2n + 1)$
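The closed form (2n + 1)/3 can be checked numerically for a few values of n. The sketch below uses exact fractions so that no rounding is involved; the function name is illustrative.

```python
from fractions import Fraction

def weighted_mean_first_n(n):
    """Weighted mean of 1..n where each number is weighted by itself."""
    numbers = range(1, n + 1)
    return Fraction(sum(k * k for k in numbers), sum(numbers))

for n in (1, 2, 5, 10, 100):
    direct = weighted_mean_first_n(n)
    closed_form = Fraction(2 * n + 1, 3)
    print(n, direct, closed_form, direct == closed_form)
```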
Solve: Given, $13.2 = \frac{5 f_1 + 10 (f_1 + 2) + 15 (f_1 - 1) + 40 f_1}{f_1 + (f_1 + 2) + (f_1 - 1) + 2 f_1}$
Geometric mean: Let x1, x2, …, xn be a set of observations recorded in a statistical investigation.
Then the G.M. of these n observations is defined by
$\mathrm{G.M.} = (x_1 x_2 \cdots x_n)^{1/n}$
Or, $\log \mathrm{G.M.} = \frac{1}{n} \log(x_1 x_2 \cdots x_n) = \frac{1}{n} \sum_{i=1}^{n} \log x_i$
∴ $\mathrm{G.M.} = \text{antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right]$
Let xi be the mid-value of the i-th class of a frequency distribution and fi be the corresponding
frequency (i = 1, 2, …, k).
Then the geometric mean is defined by
$\mathrm{G.M.} = (x_1^{f_1} x_2^{f_2} \cdots x_k^{f_k})^{1/N}$, where $N = f_1 + f_2 + \dots + f_k$
Or, $\log \mathrm{G.M.} = \frac{1}{N} \log(x_1^{f_1} x_2^{f_2} \cdots x_k^{f_k})$
Or, $\log \mathrm{G.M.} = \frac{1}{N} \sum_{i=1}^{k} f_i \log x_i$
∴ $\mathrm{G.M.} = \text{antilog}\left[\frac{1}{N} \sum_{i=1}^{k} f_i \log x_i\right]$
***The following data represent the rate of change of production of rice in different years
compared to the production of 1980. Calculate geometric mean of the rate of change of
production.
Solve: We have, $\log \mathrm{G.M.} = \frac{1}{N} \sum_{i=1}^{k} f_i \log x_i = \frac{47.21016}{21} = 2.248103$
∴ G.M. = antilog 2.248103 = 177.05
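Proof:
The geometric mean of all observations is defined by
$G = (x_{11} x_{12} \cdots x_{1n_1} \, x_{21} x_{22} \cdots x_{2n_2})^{1/(n_1 + n_2)}$
$\log G = \frac{1}{n_1 + n_2} \log(x_{11} x_{12} \cdots x_{1n_1} \, x_{21} x_{22} \cdots x_{2n_2})$
$= \frac{1}{n_1 + n_2} \left[\log(x_{11} x_{12} \cdots x_{1n_1}) + \log(x_{21} x_{22} \cdots x_{2n_2})\right]$
We know, $\log G_1 = \frac{1}{n_1} \log(x_{11} x_{12} \cdots x_{1n_1})$ and, similarly, $\log G_2 = \frac{1}{n_2} \log(x_{21} x_{22} \cdots x_{2n_2})$
Then, $\log G = \frac{1}{n_1 + n_2} \left[n_1 \log G_1 + n_2 \log G_2\right]$
∴ $G = \text{antilog}\left[\frac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}\right]$
In a similar way, if the sample observations are divided into k groups, where the geometric
mean of the $n_i$ observations of the i-th group is $G_i$ (i = 1, 2, …, k), then the geometric mean of all
observations is given by
$\log G = \frac{n_1 \log G_1 + n_2 \log G_2 + \dots + n_k \log G_k}{n_1 + n_2 + \dots + n_k}$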
2. A.M. ≥ G.M.
Proof: Let there be two positive observations $x_1$ and $x_2$. The arithmetic and geometric means of
these two observations are, respectively,
$\mathrm{A.M.} = \frac{1}{2}(x_1 + x_2)$ ; $\mathrm{G.M.} = \sqrt{x_1 x_2}$
Now, $\mathrm{A.M.} - \mathrm{G.M.} = \frac{1}{2}(x_1 + x_2) - \sqrt{x_1 x_2}$
$= \frac{1}{2}(x_1 + x_2 - 2\sqrt{x_1 x_2})$
$= \frac{1}{2}(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$
Demerits:
I. It is neither easy to understand nor easy to calculate.
II. The geometric mean is not calculated if any of the observations is zero, since the result is
zero.
III. It is not calculated if observations are negative: when the number of observations is even and
an odd number of them are negative, the geometric mean is imaginary.
***The arithmetic mean and geometric mean of two observations are 15 and 9 respectively. Find
the two observations.
Solve: Let x and y be the two observations. Then,
$\frac{x + y}{2} = 15$ and $\sqrt{xy} = 9$
or, x + y = 30 and xy = 81
We know, $(x - y)^2 = (x + y)^2 - 4xy$
$= 30^2 - 4 \times 81$
= 576
∴ x - y = ±24
We have, x + y = 30 …… (i)
x - y = 24 …… (ii)
Solving (i) and (ii), we get x = 27, y = 3
Again, x + y = 30 …… (iii)
x - y = -24 …… (iv)
Solving (iii) and (iv), we get x = 3, y = 27
Harmonic mean (H.M.): Let x1, x2, …, xn be a set of n observations recorded in any statistical
investigation. Then the harmonic mean of this set of observations is defined by
$H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
***A train moves the first 50 km at a speed of 60 km/hour, the second 50 km at a speed of 75 km/hour,
the third 50 km at a speed of 65 km/hour and the fourth 50 km at a speed of 80 km/hour. What is the
average speed of the train throughout the journey?
Solution: Given x1 = 60, x2 = 75, x3 = 65, x4 = 80.
The train covers the same distance at each step, so the distance can be ignored in calculating the
average speed. The average speed is given by
$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$ ; n = 4
$H = \frac{4}{\frac{1}{60} + \frac{1}{75} + \frac{1}{65} + \frac{1}{80}} = 69.10$ km/h.
***A plane moves first 800 km at a speed of 600 km/hour, second 400 km at a speed of 800
km/hour and last 200 km at a speed of 500 km/hour. Find the average speed of the plane.
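Solution: Let the speeds be x1 = 600, x2 = 800 and x3 = 500 and the distances covered by the plane be
d1 = 800, d2 = 400 and d3 = 200. The times taken to cover the distances are
$t_1 = \frac{d_1}{x_1} = \frac{800}{600}$, $t_2 = \frac{d_2}{x_2} = \frac{400}{800}$ and $t_3 = \frac{d_3}{x_3} = \frac{200}{500}$
The average speed is the total distance divided by the total time:
$H = \frac{d_1 + d_2 + d_3}{\frac{d_1}{x_1} + \frac{d_2}{x_2} + \frac{d_3}{x_3}} = \frac{1400}{1.3333 + 0.5 + 0.4} = 626.87$ km/hour
The above average speed is known as the weighted harmonic mean, where the weights are the distances (di)
covered and the speeds are x1, x2, …, xk (k = 3).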
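Both speed problems can be checked in Python. This is a minimal sketch: the train uses the plain harmonic mean because every leg has the same length, while the plane is computed as total distance over total time, which is the weighted harmonic mean with distances as weights.

```python
from statistics import harmonic_mean

# Train: four equal 50 km legs, so the unweighted harmonic mean applies.
train_speeds = [60, 75, 65, 80]
train_average = harmonic_mean(train_speeds)

# Plane: legs of different length, so weight each speed by its distance.
plane_speeds = [600, 800, 500]      # km/hour
plane_distances = [800, 400, 200]   # km
total_time = sum(d / x for d, x in zip(plane_distances, plane_speeds))
plane_average = sum(plane_distances) / total_time

print(f"Train: {train_average:.2f} km/h")
print(f"Plane: {plane_average:.2f} km/h")
```

Using the arithmetic mean of the speeds instead would overstate the average, because more time is spent on the slower legs.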
Harmonic mean from frequency table: Similar to the weighted harmonic mean, this mean can also
be calculated from a frequency distribution, where the weight or importance of any value
(mid-value) is the frequency of that value. Let xi be the mid-value of the i-th class of a frequency
distribution and fi be the corresponding frequency (i = 1, 2, …, k). Then
$H = \frac{N}{\sum_{i=1}^{k} \frac{f_i}{x_i}}$, where $N = \sum_{i=1}^{k} f_i$
***The following data represent the distribution of workers in a garments industry according to
the rate of bonus they received during a festival.
There are 165 workers who received bonus at the rate of 70% and above.
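Theorem: A.M. ≥ G.M. ≥ H.M.
Proof: The theorem is proved for a series of two positive values.
However, the theorem is true for any number of observations.
Let the observations be x1 and x2. Then
$\mathrm{A.M.} = \frac{1}{2}(x_1 + x_2)$, $\mathrm{G.M.} = \sqrt{x_1 x_2}$, $\mathrm{H.M.} = \frac{2}{\frac{1}{x_1} + \frac{1}{x_2}} = \frac{2 x_1 x_2}{x_1 + x_2}$
$\mathrm{A.M.} - \mathrm{G.M.} = \frac{1}{2}(x_1 + x_2) - \sqrt{x_1 x_2} = \frac{1}{2}(x_1 + x_2 - 2\sqrt{x_1 x_2})$
$= \frac{1}{2}(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$
Again,
$\mathrm{G.M.} - \mathrm{H.M.} = \sqrt{x_1 x_2} - \frac{2 x_1 x_2}{x_1 + x_2} = \frac{\sqrt{x_1 x_2}}{x_1 + x_2} \left[x_1 + x_2 - 2\sqrt{x_1 x_2}\right]$
$= \frac{\sqrt{x_1 x_2}}{x_1 + x_2} (\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$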
***Find arithmetic mean, geometric mean and harmonic mean of the series
1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
And show that A.M. > G.M. > H.M.
Solution: The arithmetic mean (A.M.) is given by
$\mathrm{A.M.} = \frac{1}{10}(1 + 2 + 3 + \dots + 10)$ [∵ n = 10]
$= \frac{55}{10} = 5.5$
∴ A. M. > G. M. > H.M.
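The three means of the series 1, 2, …, 10 can be computed with Python's standard `statistics` module (the `geometric_mean` function requires Python 3.8 or later). This is a sketch for checking the ordering, not a replacement for the hand calculation.

```python
from statistics import mean, geometric_mean, harmonic_mean

series = list(range(1, 11))  # 1, 2, ..., 10

am = mean(series)
gm = geometric_mean(series)
hm = harmonic_mean(series)

print(f"A.M. = {am:.4f}")
print(f"G.M. = {gm:.4f}")
print(f"H.M. = {hm:.4f}")
print("A.M. > G.M. > H.M.:", am > gm > hm)
```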
Median: The median is that value of the series which divides the ordered array of the series into two
equal parts such that half of the observations are below it and the other half are above it. For
example, let x: 1, 3, 7, 12, 13. Here, 7 is the median as it is the middle value of the array.
***The following data represent the distribution of cows of a dairy farm according to the amount
of milk (in kg) given per day:
h = 5,
The class for which c.f ≥ 61 is 15-20, where
the lower limit of the class is l = 15,
the frequency of this class is f = 48,
and the c.f preceding the class is c = 35.
We know,
$Me = l + \frac{h}{f}\left(\frac{n}{2} - c\right) = 15 + \frac{5}{48}(61 - 35) = 17.71$ kg
(ii) The c.f calculated from bottom shows that 97 cows are giving milk 15kg or above.
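The grouped-data median formula used in part (i) can be sketched as a small function. The argument names follow the symbols in the text; the value n = 122 is an assumption inferred from n/2 = 61, since the frequency table itself is not reproduced here.

```python
def grouped_median(l, h, f, n, c):
    """Median from a grouped frequency table.

    l: lower limit of the median class
    h: class width
    f: frequency of the median class
    n: total frequency
    c: cumulative frequency before the median class
    """
    return l + (h / f) * (n / 2 - c)

# Values from the milk example; n = 122 is inferred from n/2 = 61.
me = grouped_median(l=15, h=5, f=48, n=122, c=35)
print(f"Median = {me:.2f} kg")
```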
IV. Median is a suitable measure of central tendency if the frequency distribution is prepared
with upper end open classes.
V. Median is the only average to be used to deal with qualitative variable.
VI. It is not affected by extreme value in the series.
Disadvantages
I. It is not based on all observations.
II. It is not suitable for further mathematical treatment.
III. Median is not found out properly from ungrouped data if number of observations is even.
IV. It is affected more by sampling fluctuations.
V. Median is not found out from a frequency table with lower end open classes.
VI. If there are several groups of observations and median of each group is available, then the
median of the combined observations is not available by combining the medians of different
groups.
Uses of median:
I. It is a good measure of central tendency for a markedly skewed distribution such as income
distribution.
II. It is used to find the average wage rate of a group of workers.
III. It is used to find average value, of a set of observations related to qualitative variable such
as intelligence, health condition, socioeconomic condition etc.
IV. Median is a good measure of central tendency if the characteristic under study is in ranks or
scores.
***Two frequencies of two classes are missing but the median value of the distribution is known to be
Me = 80, where the frequency distribution is related to the number of packages received on
different days in a post office. Find the missing frequencies.
Solve: Me = 80
f2 + f7 = 246 - 174 = 72
$Me = l + \frac{h}{f}\left(\frac{n}{2} - c\right)$
Here, $\frac{n}{2} = 123$
Or, 42 = 123 - f2 - 54
∴ f2 = 27
We have,
f2 + f7 = 72
∴ f7 = 72 - 27 = 45
Mode: The mode is that value of the variable which occurs most frequently in the series of observations
of the variable. For example, let us consider the ages (in years) of some children investigated in a
small locality:
5, 2, 2, 8, 7, 6, 5, 4, 3, 4, 5, 2, 2
In the above example, the age 2 years is recorded 4 times (the maximum number of times). So, 2 years
is the mode of the distribution of ages of children.
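Finding the mode of raw data amounts to counting how often each value occurs. A minimal Python sketch with the ages listed above:

```python
from collections import Counter

ages = [5, 2, 2, 8, 7, 6, 5, 4, 3, 4, 5, 2, 2]

counts = Counter(ages)
# most_common(1) returns the value with the highest frequency and its count.
mode_age, mode_count = counts.most_common(1)[0]

print(f"Mode = {mode_age} years (recorded {mode_count} times)")
```

If two values shared the highest frequency, `most_common(1)` would report only one of them, so a bimodal series needs the full table of counts.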
***The following data represent the distribution of female workers in different garments
industries according to their monthly salary (in taka).
(ii) From the c.f it is observed that the salary of 540 female workers is less than 1000.00 taka.
Solve: Since 25 is the frequency of two classes, there are two modes of the given distribution.
The first mode is-
$Mo = l + \frac{h(f_1 - f_0)}{2 f_1 - f_0 - f_2} = 20 + \frac{5(25 - 12)}{2 \times 25 - 12 - 24} = 24.64$
Merits and demerits of mode:
Merits
I. It is easy to understand and easy to calculate.
II. It has a definite formula if it is calculated from frequency distribution.
III. It can be found out easily from graph.
IV. It can be calculated if frequency distribution is with upper end open classes.
V. It is not affected by extreme value unless it occurs frequently.
VI. It is obtained by inspection from raw data.
Demerits
I. It is not rigidly defined, specially if the data are in raw form.
II. It is not based on all the observations of a distribution.
III. It is affected to a greater extent by an extreme value if the extreme value occurs most frequently.
IV. Mode is ill defined if the modal class is the first or last class of the frequency distribution.
V. It is also ill defined if maximum frequency occurs repeatedly.
VI. Modes of different sets of observations cannot be combined to get mode of combined
observations of all sets.
Uses of mode:
I. Mode is used to handle economic data such as daily sales in a shop, daily output of an
industry, daily wages of workers, daily export of a company etc. to know the value of the
variable with maximum frequency.
II. It is also used in market research, where a businessman or company needs to know the type
or quality of commodities which are most frequently demanded.
III. It is also used in analyzing data related to weather, where mode provides number of days
with maximum temperature during a summer, number of days having maximum rainfall
during a rainy season, number of days having maximum or minimum humidity etc.
IV. It is also used to handle social data. Mode provides information related to maximum
number of road accidents in days, maximum number of suicide cases in days, maximum
number of people killed due to illegal agents in days etc.
Chapter 04: Measures of dispersion
The term dispersion means the scatteredness of observations from some central value.
Data set 1: x1i : 4, 5, 6, 7, 6, 8; $\bar{x}_1 = 6$, n1 = 6
Data set 2: x2i : 2, 4, 6, 8, 4, 12; $\bar{x}_2 = 6$, n2 = 6
The amount of scatteredness can be evaluated by absolute deviations as follows:
$|x_{1i} - \bar{x}_1|$ : 2, 1, 0, 1, 0, 2
$|x_{2i} - \bar{x}_2|$ : 4, 2, 0, 2, 2, 6
Measures of dispersion
Let us consider the age of some children as follows-
x (in years) : 2, 5, 4, 3, 4, 5, 2, 7; 𝑥̅ = 4years, n = 8
Consider another set of observations which indicate electric failure in different days as follows-
Electric failure, y (in hours) : 2, 5, 4, 3, 4, 5, 2, 7; 𝑦̅ = 4hours, n = 8
The absolute deviation of x from $\bar{x}$ is,
$|x - \bar{x}|$ : 2, 1, 0, 1, 0, 1, 2, 3
Mean deviation = $\frac{10}{8} = 1.25$
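The mean deviation of the children's ages can be reproduced step by step. This sketch uses the ages given above and mirrors the hand calculation: mean, absolute deviations, then their average.

```python
ages = [2, 5, 4, 3, 4, 5, 2, 7]  # ages in years

mean_age = sum(ages) / len(ages)
abs_deviations = [abs(x - mean_age) for x in ages]
mean_deviation = sum(abs_deviations) / len(ages)

print("Mean:", mean_age)
print("Absolute deviations:", abs_deviations)
print("Mean deviation:", mean_deviation)
```

The same numbers arise for the electric-failure data, since the two series have identical values; only the unit (years or hours) differs.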
IV. Semi-interquartile Range or Quartile Deviation (Q.D).
2. Relative Measure of Dispersion: It is a measure which depicts the average amount of
scatteredness of observations but free of unit of the variable under study. It measures the
percentage variation of observations from some central value. The measures are-
I. Coefficient of Range,
II. Coefficient of Mean Deviation,
III. Coefficient of Standard Deviation,
IV. Coefficient of Variation (c.v),
V. Coefficient of Quartile Deviation.
Range
Consider a set of observations of size n, where the observations are arranged in ascending order
as follows:
x(1) ≤ x(2) ≤ x(3) ≤ … ≤ x(n)
Here x(1) = the lowest observation in the series, and x(n) = the highest observation in the series.
Then the range, R, is defined by R = x(n) - x(1)
For example, let us consider the total annual rainfall (in mm) recorded in some Meteorological
stations in Bangladesh in 1998, where the rainfall data are as follows:
3863, 3914, 4672, 4139, 4435, 4245, 3216, 2518, 3368, 4388, 2312, 1819, 2200, 2858, 2548, 1490,
1994, 3217, 2852, 2601, 2391, 1636, 1540, 2365, 3139.
Here, n = 25,
x(25) = 4672, the highest amount of rainfall,
x(1) = 1490, the lowest amount of rainfall.
Therefore, the range of rainfall is,
R = x(25) - x(1) = 4672 - 1490 = 3182 mm
This R is an absolute measure of dispersion. The corresponding relative measure of dispersion is
the coefficient of range, given by-
Coefficient of range = $\frac{x_{(n)} - x_{(1)}}{x_{(n)} + x_{(1)}}$
This coefficient is multiplied by 100 to express the result as a percentage. In our given example, the
coefficient of range is,
Coefficient of range = $\frac{4672 - 1490}{4672 + 1490} \times 100\% = 51.64\%$
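The range and coefficient of range for the rainfall data can be verified with a short Python sketch; the list below is the 25 annual rainfall figures from the example.

```python
rainfall = [
    3863, 3914, 4672, 4139, 4435, 4245, 3216, 2518, 3368, 4388, 2312, 1819, 2200,
    2858, 2548, 1490, 1994, 3217, 2852, 2601, 2391, 1636, 1540, 2365, 3139,
]  # total annual rainfall in mm

highest = max(rainfall)
lowest = min(rainfall)

value_range = highest - lowest
coefficient_of_range = (highest - lowest) / (highest + lowest) * 100

print(f"n = {len(rainfall)}")
print(f"Range = {value_range} mm")
print(f"Coefficient of range = {coefficient_of_range:.2f}%")
```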
***Find the range and coefficient of range for the following frequency distribution-
Solve: R = 35 - 5 = 30
Coefficient of range = $\frac{35 - 5}{35 + 5} \times 100\% = 75.0\%$
USE OF RANGE
I. Range is used for statistical quality control of industrial products.
II. It is used to measure the variation of data where small variations are observed in the data set.
Such data with small variations are (a) stock market data throughout the day, (b) rate of
exchange of money, (c) rate of interest in call money.
III. It is used in weather forecasts to estimate the difference between maximum and minimum
temperature or between maximum rainfall and minimum rainfall during the rainy season.
IV. It is used in quoting interest rates and security prices at the stock exchange.
MEAN DEVIATION
One of the important absolute measures of dispersion is the mean deviation, since it is based on all
observations. The deviation can be measured from the mean, median or mode. Let us consider a set
of observations as follows:
x1, x2, x3, …, xn
Let the mean, median and mode of this set of observations be $\bar{x}$, Me and Mo, respectively. Then the
mean deviation from the mean is defined by -
M.D. (mean) = $\frac{1}{n} \sum |x_i - \bar{x}|$
Similarly, the mean deviations from the median and mode are given, respectively, by-
M.D. (median) = $\frac{1}{n} \sum |x_i - Me|$ and
M.D. (mode) = $\frac{1}{n} \sum |x_i - Mo|$
The corresponding relative measures of dispersion are the coefficients of mean deviation, given by-
Coefficient of M.D. (mean) = $\frac{\text{M.D. (mean)}}{\text{mean}} \times 100\%$
Coefficient of M.D. (median) = $\frac{\text{M.D. (median)}}{\text{median}} \times 100\%$
Coefficient of M.D. (mode) = $\frac{\text{M.D. (mode)}}{\text{mode}} \times 100\%$
The mean deviation from the mean, median or mode can also be found from a frequency distribution. The
formulae are-
M.D. (mean) = $\frac{1}{N} \sum_{i=1}^{k} f_i |X_i - \bar{X}|$
M.D. (median) = $\frac{1}{N} \sum_{i=1}^{k} f_i |X_i - Me|$ and
M.D. (mode) = $\frac{1}{N} \sum_{i=1}^{k} f_i |X_i - Mo|$
where Xi is the mid-value of the i-th class of a frequency distribution and fi is the corresponding
frequency, $\bar{X}$ is the mean of the distribution, i = 1, 2, …, k. The coefficient of mean deviation is
calculated by the formulae shown above.
***The monthly average temperatures (°C) in different months of 1998 recorded at the Dhaka
Meteorological station are-
12.7, 16.1, 18.3, 22.9, 25.3, 28.1, 26.4, 26.8, 26.3, 25.4, 20.6, 14.8
Find the percentage variation in average temperature over the months.
Solution: The percentage variation in average temperature is
found by the coefficient of mean deviation, where the deviation can be measured from the mean or
median. The mean and median of the given set of observations are-
Mean, $\bar{x} = \frac{1}{n} \sum x = \frac{263.70}{12} = 21.97$ °C
As n is even,
Median, Me = $\frac{1}{2}$ × value of [6th + 7th] ordered observation; n = 12
$= \frac{1}{2}[22.9 + 25.3] = 24.10$ °C
Now, the absolute deviations of the temperature data from the mean and median are shown below-
$|x_i - \bar{x}|$ : 9.27 5.87 3.67 0.93 3.33 6.13 4.43 4.83 4.33 3.43 1.37 7.17
$|x_i - Me|$ : 11.4 8.0 5.8 1.2 1.2 4.0 2.3 2.7 2.2 1.3 3.5 9.3
∴ M.D. (mean) = $\frac{1}{n} \sum |x_i - \bar{x}| = \frac{54.76}{12} = 4.56$ °C
And, M.D. (median) = $\frac{1}{n} \sum |x_i - Me| = \frac{52.9}{12} = 4.41$ °C
∴ Coefficient of M.D. (mean) = $\frac{\text{M.D. (mean)}}{\text{mean}} \times 100\% = \frac{4.56}{21.97} \times 100\% = 20.76\%$
And, Coefficient of M.D. (median) = $\frac{\text{M.D. (median)}}{\text{median}} \times 100\% = \frac{4.41}{24.10} \times 100\% = 18.30\%$
The average temperature varies by 20.76% from the mean temperature. It varies by 18.30%
from the median.
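The temperature example can be recomputed in Python as a cross-check. This sketch works with unrounded intermediate values, so its results may differ from the hand calculation in the second decimal place; `mean_deviation` is a helper name introduced here.

```python
from statistics import mean, median

# Monthly average temperatures (degrees Celsius), January to December 1998.
temps = [12.7, 16.1, 18.3, 22.9, 25.3, 28.1, 26.4, 26.8, 26.3, 25.4, 20.6, 14.8]

def mean_deviation(data, center):
    """Average absolute deviation of the data from a chosen central value."""
    return sum(abs(x - center) for x in data) / len(data)

m = mean(temps)
me = median(temps)

md_mean = mean_deviation(temps, m)
md_median = mean_deviation(temps, me)

print(f"Mean = {m:.2f}, Median = {me:.2f}")
print(f"M.D. (mean) = {md_mean:.2f}, coefficient = {md_mean / m * 100:.2f}%")
print(f"M.D. (median) = {md_median:.2f}, coefficient = {md_median / me * 100:.2f}%")
```

The mean deviation about the median comes out smaller than the one about the mean, which is a general property of the median.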
***The following data represent the distribution of the amount of fertilizer (in kg) sold in a shop
on different days during the boro rice season-
Mode, $Mo = l + \frac{h(f_1 - f_0)}{2 f_1 - f_0 - f_2} = 200 + \frac{50(18 - 17)}{2 \times 18 - 17 - 10} = 205.56$ kg
M.D. (mean) = $\frac{1}{N} \sum f_i |X_i - \bar{X}| = \frac{5637.5}{80} = 70.47$ kg
Coefficient of M.D. (mean) = $\frac{\text{M.D. (mean)}}{\text{mean}} \times 100\% = \frac{70.47}{234.375} \times 100\% = 30.07\%$
Therefore, the daily sale is scattered by around 30% from the mean sale.
M.D. (median) = $\frac{1}{N} \sum f_i |X_i - Me| = \frac{5519.48}{80} = 68.99$ kg
Coefficient of M.D. (median) = $\frac{\text{M.D. (median)}}{\text{median}} \times 100\% = \frac{68.99}{222.22} \times 100\% = 31.05\%$
The daily sale is dispersed by around 31% from the median amount of sale.
M.D. (mode) = $\frac{1}{N} \sum f_i |X_i - Mo| = \frac{5761.04}{80} = 72.013$ kg
Coefficient of M.D. (mode) = $\frac{\text{M.D. (mode)}}{\text{mode}} \times 100\% = \frac{72.013}{205.56} \times 100\% = 35.03\%$
The daily sale is scattered by around 35% from the modal amount of sale.
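The grouped-data mode formula and the three coefficients in this solution can be checked with a short sketch. The function names are illustrative, and the inputs are the summary figures quoted in the solution, since the frequency table itself is not reproduced here.

```python
def grouped_mode(l, h, f1, f0, f2):
    """Mode from a grouped frequency table.

    l: lower limit of the modal class, h: class width,
    f1: frequency of the modal class,
    f0, f2: frequencies of the classes before and after it.
    """
    return l + h * (f1 - f0) / (2 * f1 - f0 - f2)

def coefficient_of_md(md, center):
    """Coefficient of mean deviation, expressed as a percentage."""
    return md / center * 100

mo = grouped_mode(l=200, h=50, f1=18, f0=17, f2=10)
c_mean = coefficient_of_md(70.47, 234.375)
c_median = coefficient_of_md(68.99, 222.22)
c_mode = coefficient_of_md(72.013, 205.56)

print(f"Mode = {mo:.2f} kg")
print(f"Coefficients of M.D.: {c_mean:.2f}%, {c_median:.2f}%, {c_mode:.2f}%")
```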
Variance:
Let x1, x2, …, xn be a set of observations recorded in any statistical investigation. Then the
variance of x is defined by-
$V(x) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{SS(x)}{n}$, where SS(x) = sum of squares of x.
It is called the mean square deviation about the mean. This mean square deviation is minimum (a property of
the arithmetic mean).
The variance of x from a frequency table is calculated by-
$V(x) = \frac{1}{N} \sum_{i=1}^{k} f_i (X_i - \bar{X})^2 = \frac{1}{N} \left[\sum f_i X_i^2 - \frac{(\sum f_i X_i)^2}{N}\right]$, where $N = \sum_{i=1}^{k} f_i$ = total frequency.
Here, $X_i$ is the mid-value of the i-th class of a frequency distribution and $f_i$ is the corresponding
frequency.
Standard deviation, $\sigma = \sqrt{V(x)} = \sqrt{\frac{1}{N} \left[\sum f_i X_i^2 - \frac{(\sum f_i X_i)^2}{N}\right]}$
$\sigma = \sqrt{V(x)} = \sqrt{\frac{1}{n} \left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]}$
By standard deviation we measure the average distance of the observations from the mean, and hence
we consider only the positive square root of the variance.
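The two equivalent forms of the variance formula can be compared numerically. As an illustration, this sketch reuses the children's ages (2, 5, 4, 3, 4, 5, 2, 7) from the start of this chapter; `pvariance` and `pstdev` from the standard library divide by n, matching the definition used here.

```python
from statistics import pstdev, pvariance

ages = [2, 5, 4, 3, 4, 5, 2, 7]
n = len(ages)
mean_age = sum(ages) / n

# Definition: mean of squared deviations about the mean.
variance_direct = sum((x - mean_age) ** 2 for x in ages) / n

# Shortcut form: (sum of squares - (sum)^2 / n) / n.
variance_shortcut = (sum(x * x for x in ages) - sum(ages) ** 2 / n) / n

sd = variance_direct ** 0.5

print(variance_direct, variance_shortcut, pvariance(ages))
print(sd, pstdev(ages))
```

The shortcut form avoids computing each deviation separately, which is why it is preferred for hand calculation.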
***The following observations represent the prices of mango sold in different markets of a city-
Price of mango (in taka): 45.40, 50.65, 50.00, 45.00, 46.00, 48.00, 47.00, 55.00, 50.00, 54.00
Find the standard deviation of the price of mango.
Solution: The variance of prices is calculated by-
V(x) = $\sigma^2 = \frac{1}{n} \left[ \sum x_i^2 - \frac{(\sum x_i)^2}{n} \right]$; n = 10
= $\frac{1}{10} \left[ 24221.5825 - \frac{(491.05)^2}{10} \right]$
= $\frac{1}{10}$ [24221.5825 - 24113.0103]
= 10.8572 (taka)$^2$
∴ Standard deviation of price is-
$\sigma = \sqrt{10.8572}$ = 3.30 taka
This indicates that the price of mango varies from the mean price by about ±3.30 taka.
***The distribution of working hours of some female workers in different garments industries
is shown below-
= $\frac{1}{285} \left[ 26771.25 - \frac{(2752.5)^2}{285} \right]$
= $\frac{1}{285}$ [26771.25 - 26583.36]
= 0.6593 (hour)$^2$
$(x - \bar{x})^2$: 4, 1, 0, 1, 4
$(y - \bar{y})^2$: 16, 4, 0, 4, 16
Therefore, V(x) = $\frac{1}{n} \sum (x - \bar{x})^2$ = $\frac{10}{5}$ = 2 and V(y) = $\frac{1}{n} \sum (y - \bar{y})^2$ = $\frac{40}{5}$ = 8.
***The distribution of mothers by their number of ever born children is shown below. Show
that, for this distribution, $\sigma \geq$ M.D (mean).
Solve: $\bar{X} = \frac{1}{N} \sum f_i X_i$ = $\frac{939.0}{254}$ = 3.7
M.D (mean) = $\frac{1}{N} \sum f_i |X_i - \bar{X}|$ = $\frac{232.0}{254}$ = 0.91
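$\sigma^2 = \frac{1}{N} \sum f_i (X_i - \bar{X})^2 = \frac{1}{N} \left[ \sum f_i X_i^2 - \frac{(\sum f_i X_i)^2}{N} \right]$
= $\frac{1}{254} \left[ 3829.5 - \frac{(939)^2}{254} \right]$
= $\frac{1}{254}$ [3829.5 - 3471.34]
= 1.41
∴ $\sigma = \sqrt{\sigma^2} = \sqrt{1.41}$ = 1.19
∴ $\sigma$ > M.D (mean), since 1.19 > 0.91.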
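The inequality $\sigma \geq$ M.D (mean) holds for any distribution, because the root of the mean of squared deviations is never smaller than the mean of the absolute deviations. A minimal Python sketch on a hypothetical frequency table (the values and frequencies below are assumptions, not the table of this example) illustrates the check:

```python
import math

def mean_deviation_and_sd(values, freqs):
    """Return (M.D. about the mean, standard deviation) for a frequency table."""
    N = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / N
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / N
    var = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / N
    return md, math.sqrt(var)

# Hypothetical table: number of children and number of mothers.
children = [1, 2, 3, 4, 5, 6]
mothers = [10, 25, 40, 30, 15, 5]

md, sd = mean_deviation_and_sd(children, mothers)
print(round(md, 3), round(sd, 3), sd >= md)
```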
This coefficient is multiplied by 100 to express the variation of a set of observations as a
percentage. This percentage variation of a set of observations is called the coefficient of
variation and is given by-
C.V = $\frac{\sigma}{\bar{x}} \times 100\%$
This measure is free of the unit of the variable under study. Hence it can be used to compare the
dispersion of two or more distributions.
***Calculate the percentage variation (coefficient of variation) of the following two sets of
observations and compare the two sets.
Set-1, $x_{1i}$: 4, 8, 10, 12, 18, 8
Set-2, $x_{2i}$: 10, 10, 11, 11, 12, 6
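Solve: For the 1st set of observations,
$\bar{x}_1 = \frac{1}{n_1} \sum x_{1i}$ = $\frac{60}{6}$ = 10
$\sigma_1^2 = \frac{1}{n_1} \sum (x_{1i} - \bar{x}_1)^2 = \frac{1}{n_1} \left[ \sum x_{1i}^2 - \frac{(\sum x_{1i})^2}{n_1} \right]$
= $\frac{1}{6} \left[ 712 - \frac{(60)^2}{6} \right]$
= 18.67
∴ $\sigma_1$ = 4.32
C.V1 = $\frac{\sigma_1}{\bar{x}_1} \times 100\%$ = $\frac{4.32}{10} \times 100\%$ = 43.2%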
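For the 2nd set of observations, $\bar{x}_2$ = $\frac{60}{6}$ = 10 and
$\sigma_2^2 = \frac{1}{n_2} \sum (x_{2i} - \bar{x}_2)^2 = \frac{1}{n_2} \left[ \sum x_{2i}^2 - \frac{(\sum x_{2i})^2}{n_2} \right]$
= $\frac{1}{6} \left[ 622 - \frac{(60)^2}{6} \right]$
= 3.67
∴ $\sigma_2$ = 1.91
C.V2 = $\frac{\sigma_2}{\bar{x}_2} \times 100\%$ = $\frac{1.91}{10} \times 100\%$ = 19.1%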
It is observed that C.V2 < C.V1. This implies that, though the means of the two sets of observations
are the same, the first set is more scattered about its mean than the second set. The second set of
observations is more homogeneous.
***The cumulative grade point averages (CGPA) of students in different semesters are shown below-
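$\sigma_1^2 = \frac{1}{n_1} \sum (x_{1i} - \bar{x}_1)^2 = \frac{1}{n_1} \left[ \sum x_{1i}^2 - \frac{(\sum x_{1i})^2}{n_1} \right]$
= $\frac{1}{8} \left[ 86.5 - \frac{(26)^2}{8} \right]$
= 0.25
∴ $\sigma_1 = \sqrt{0.25}$ = 0.5
C.V1 = $\frac{\sigma_1}{\bar{x}_1} \times 100\%$ = $\frac{0.5}{3.25} \times 100\%$ = 15.38%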
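For student B-
$\bar{x}_2 = \frac{1}{n_2} \sum x_{2i}$ = $\frac{26}{8}$ = 3.25
$\sigma_2^2 = \frac{1}{n_2} \sum (x_{2i} - \bar{x}_2)^2 = \frac{1}{n_2} \left[ \sum x_{2i}^2 - \frac{(\sum x_{2i})^2}{n_2} \right]$
= $\frac{1}{8} \left[ 89.5 - \frac{(26)^2}{8} \right]$
= 0.625
∴ $\sigma_2 = \sqrt{0.625}$ = 0.79
C.V2 = $\frac{\sigma_2}{\bar{x}_2} \times 100\%$ = $\frac{0.79}{3.25} \times 100\%$ = 24.31%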
It is observed that the average CGPA of both students is the same, but the C.V of A is less than the C.V
of B (C.V1 < C.V2). This implies that student A is better than B throughout the course of studies, in
the sense that the CGPA of A is more homogeneous across semesters.
***The production of jute goods (in tons) on different days of the first and second halves of the year is
shown below-
Class interval   f1i   f2i   Mid-value   f1i Xi    f1i Xi²     f2i Xi    f2i Xi²
of production                Xi
2.0-2.5          12    5     2.25        27.00     60.75       11.25     25.3125
2.5-3.0          48    38    2.75        132.00    363.00      104.50    287.375
3.0-3.5          70    80    3.25        227.50    739.375     260.00    845.00
3.5-4.0          35    50    3.75        131.25    492.1875    187.50    703.125
4.0-4.5          15    7     4.25        63.75     270.9375    29.75     126.4375
Total            180   180               581.50    1926.25     593.00    1987.25
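For the first half of the year-
$\sigma_1^2 = \frac{1}{N_1} \left[ \sum f_{1i} X_i^2 - \frac{(\sum f_{1i} X_i)^2}{N_1} \right]$
= $\frac{1}{180} \left[ 1926.25 - \frac{(581.50)^2}{180} \right]$
= 0.2649
∴ $\sigma_1 = \sqrt{0.2649}$ = 0.51 tons
C.V1 = $\frac{\sigma_1}{\bar{X}_1} \times 100\%$ = $\frac{0.51}{3.23} \times 100\%$ = 15.79%, where $\bar{X}_1$ = $\frac{581.50}{180}$ = 3.23 tons.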
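For the second half of the year-
$\sigma_2^2 = \frac{1}{N_2} \left[ \sum f_{2i} X_i^2 - \frac{(\sum f_{2i} X_i)^2}{N_2} \right]$
= $\frac{1}{180} \left[ 1987.25 - \frac{(593.00)^2}{180} \right]$
= 0.1869
∴ $\sigma_2 = \sqrt{0.1869}$ = 0.43 tons
C.V2 = $\frac{\sigma_2}{\bar{X}_2} \times 100\%$ = $\frac{0.43}{3.29} \times 100\%$ = 13.07%, where $\bar{X}_2$ = $\frac{593.00}{180}$ = 3.29 tons.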
The results show that in the second half of the year the production is, on average, higher and
more homogeneous over days, since C.V2 < C.V1.
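The grouped-data calculation for the jute table can be reproduced with a short Python sketch. The function name `grouped_stats` is an illustrative assumption; the mid-values and frequencies are taken from the table. Because the code does not round intermediate values, its percentages may differ slightly in the second decimal from figures worked out by hand with rounded $\sigma$ and mean.

```python
import math

mids = [2.25, 2.75, 3.25, 3.75, 4.25]   # class mid-values X_i
f1 = [12, 48, 70, 35, 15]               # first half of the year
f2 = [5, 38, 80, 50, 7]                 # second half of the year

def grouped_stats(mids, freqs):
    """Return (mean, variance, sd, C.V. in percent) for a frequency table."""
    N = sum(freqs)
    sfx = sum(f * x for x, f in zip(mids, freqs))
    sfx2 = sum(f * x * x for x, f in zip(mids, freqs))
    mean = sfx / N
    var = (sfx2 - sfx ** 2 / N) / N
    sd = math.sqrt(var)
    return mean, var, sd, sd / mean * 100

mean1, var1, sd1, cv1 = grouped_stats(mids, f1)
mean2, var2, sd2, cv2 = grouped_stats(mids, f2)
print(round(mean1, 2), round(var1, 4), round(cv1, 2))
print(round(mean2, 2), round(var2, 4), round(cv2, 2))
print("second half more homogeneous:", cv2 < cv1)
```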
Chapter 06: Probability
Probability: If in any random experiment the n outcomes are exhaustive, mutually exclusive and
equally likely and m of these are favorable to an event A, then the probability of A is defined by-
P(A) = $\frac{m}{n}$
Let $\bar{A}$ be the complementary event to A. The number of outcomes favorable to $\bar{A}$ is n - m. Then, the
probability of $\bar{A}$ is given by-
P($\bar{A}$) = $\frac{n - m}{n}$
= $1 - \frac{m}{n}$
= 1 - P(A)
∴ P(A) + P($\bar{A}$) = 1
***In a family there are 3 male and 2 female members. A family work is to be finished by any two
of them. Find the probability that-
a) The work will be finished by one male and one female member.
b) The work will be finished by either two males or two females.
Solve: (a) Let A be the event that the work will be finished by one male and one female member.
Since the work is to be finished by any two members, it can be finished in-
n = ${}^{5}C_{2}$ = 10 ways
The work can be finished by one male and one female member in-
m = ${}^{3}C_{1} \times {}^{2}C_{1}$ = 6 ways
∴ P(A) = $\frac{m}{n}$ = $\frac{6}{10}$ = 0.6
(b) Let $\bar{A}$ be the event that the work will be finished by either two males or two females. Here $\bar{A}$ is
the complementary event of A, since any two members who are not one male and one female must be
either two males or two females.
∴ P($\bar{A}$) = 1 - P(A) = 1 - 0.6 = 0.4
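The counts in this example are small enough to enumerate directly. The following Python sketch lists every pair of family members and counts the favorable ones; the member labels are hypothetical names introduced only for the enumeration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical labels: three male (M) and two female (F) members.
members = ["M1", "M2", "M3", "F1", "F2"]
pairs = list(combinations(members, 2))

def is_mixed(pair):
    """True when the pair has one male and one female member."""
    return {person[0] for person in pair} == {"M", "F"}

n = len(pairs)
m = sum(1 for pair in pairs if is_mixed(pair))
p_a = Fraction(m, n)          # one male and one female
p_not_a = 1 - p_a             # complementary event
same_sex = sum(1 for pair in pairs if not is_mixed(pair))
print(n, m, p_a, p_not_a, same_sex)
```

Counting the same-sex pairs directly gives the same value as the complement rule, which is the point of part (b).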
***Two unbiased dice are thrown once. Find the probability that
(i) Both dice show the same number,
(ii) The first die shows an even number,
(iii) Both dice show even numbers,
(iv) The sum of the upper faces of the dice is 8 or more,
(v) The sum of the upper faces of the dice is above 10,
(vi) The sum of the upper faces of the dice is less than 7,
(vii) The second die shows a number 5 or more.
Solution: The sample space of the experiment is -
S = { 11 21 31 41 51 61
      12 22 32 42 52 62
      13 23 33 43 53 63
      14 24 34 44 54 64
      15 25 35 45 55 65
      16 26 36 46 56 66 } ; n = 36
(i) Let A be the event that both dice show the same number. Favorable cases to A, m = 6
∴ P(A) = $\frac{m}{n}$ = $\frac{6}{36}$ = $\frac{1}{6}$
(ii) Let B be the event that the first die shows an even number. Favorable cases to B, m = 18
∴ P(B) = $\frac{m}{n}$ = $\frac{18}{36}$ = $\frac{1}{2}$
(iii) Let C be the event that both dice show even numbers. Favorable cases to C, m = 9
∴ P(C) = $\frac{m}{n}$ = $\frac{9}{36}$ = $\frac{1}{4}$
(iv) Let D be the event that the sum of the upper faces of the dice is 8 or more. Favorable cases to D,
m = 15
∴ P(D) = $\frac{m}{n}$ = $\frac{15}{36}$ = $\frac{5}{12}$
(v) Let E be the event that the sum of the upper faces of the dice is above 10. Favorable cases to E, m = 3
∴ P(E) = $\frac{m}{n}$ = $\frac{3}{36}$ = $\frac{1}{12}$
(vi) Let F be the event that the sum of the upper faces of the dice is less than 7. Favorable cases to F,
m = 15
∴ P(F) = $\frac{m}{n}$ = $\frac{15}{36}$ = $\frac{5}{12}$
(vii) Let G be the event that the second die shows a number 5 or more. Favorable cases to G, m = 12
∴ P(G) = $\frac{m}{n}$ = $\frac{12}{36}$ = $\frac{1}{3}$
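Counting favorable cases by eye is error-prone, so it can help to enumerate the 36 outcomes. The Python sketch below is one way to do this; the helper `prob` and the dictionary keys are illustrative names, and exact fractions are used so the results are in lowest terms.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (number on first die, number on second die).
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))

results = {
    "same number": prob(lambda o: o[0] == o[1]),
    "first even": prob(lambda o: o[0] % 2 == 0),
    "both even": prob(lambda o: o[0] % 2 == 0 and o[1] % 2 == 0),
    "sum >= 8": prob(lambda o: sum(o) >= 8),
    "sum > 10": prob(lambda o: sum(o) > 10),
    "sum < 7": prob(lambda o: sum(o) < 7),
    "second >= 5": prob(lambda o: o[1] >= 5),
}
for name, p in results.items():
    print(name, p)
```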
*** Three biased coins are tossed once. It is known that each coin shows head with probability P(H) = $\frac{2}{3}$
[P(T) = $\frac{1}{3}$]. Find the probability that-
Permutation: It is a technique to arrange r objects taken together from n distinguishable objects. The
total number of such arrangements is denoted by ${}^{n}P_{r}$, where
${}^{n}P_{r} = \frac{n!}{(n - r)!}$
If n objects are arranged taken all n together, the total number of arrangements is
${}^{n}P_{n} = \frac{n!}{(n - n)!} = \frac{n!}{0!} = n (n - 1) (n - 2) (n - 3) \cdots 3 \cdot 2 \cdot 1$
***In a box there are 10 books. The books are to be arranged in 2 shelves each of which can
contain 5 books. Find the number of arrangements of the books.
Solution: Let us consider that the books are numbered 1, 2, ..., 10. We need to find the
value of ${}^{10}P_{5}$, where
${}^{10}P_{5} = \frac{10!}{(10 - 5)!}$ = 30240
In particular, when repetition is allowed, ${}^{n}P_{n} = n^n$.
***In a box there are 3 balls numbered 1, 2, 3. Find the number of arrangements of balls-
(i) Taken 2 together with repetition,
(ii) Taken 3 at a time with repetition.
Solution: (i) Given n = 3, r = 2. We need ${}^{n}P_{r}$ with repetition, where
${}^{n}P_{r} = n^r = 3^2$ = 9
(ii) Here r = n = 3, so
${}^{n}P_{n} = n^n = 3^3$ = 27
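Arrangements with and without repetition can be counted by listing them. The Python sketch below uses `itertools.permutations` for arrangements without repetition and `itertools.product` for arrangements with repetition, and compares the counts with `math.perm` (available from Python 3.8) and with $n^r$. The variable names are illustrative.

```python
import math
from itertools import permutations, product

balls = [1, 2, 3]

# Without repetition: ordered selections of 2 different balls (3P2).
without_rep = len(list(permutations(balls, 2)))

# With repetition: each of the r places can hold any of the n balls.
with_rep_2 = len(list(product(balls, repeat=2)))   # n^r with r = 2
with_rep_3 = len(list(product(balls, repeat=3)))   # n^n with r = n = 3

print(without_rep, math.perm(3, 2))
print(with_rep_2, 3 ** 2)
print(with_rep_3, 3 ** 3)
print(math.perm(10, 5))  # value of 10P5
```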
***Find the number of arrangements of the letters in the word 'STATISTICS' taken all together.
Solution: Given n = 10, n1 = 3 (S), n2 = 3 (T), n3 = 2 (I), n4 = 1 (A), n5 = 1 (C).
The number of arrangements of the letters is-
$\frac{n!}{n_1! \, n_2! \, n_3! \, n_4! \, n_5!}$ = $\frac{10!}{3! \, 3! \, 2! \, 1! \, 1!}$ = 50400
***In an office there are 5 chairs with handles and 5 others without handles. In how many ways
can these chairs be arranged for sitting?
Solution: Given, n = 5 + 5 = 10 objects, n1 = 5, n2 = 5
The number of arrangements of these chairs is-
$\frac{n!}{n_1! \, n_2!}$ = $\frac{10!}{5! \, 5!}$ = 252
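When some objects are alike, the count $n!/(n_1! \, n_2! \cdots)$ can be computed from the frequency of each kind of object. A minimal Python sketch follows; the function name `distinct_arrangements` and the letters "H" and "N" used to stand for chairs with and without handles are assumptions made for illustration.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(items):
    """n! divided by the factorial of each group of identical items."""
    total = factorial(len(items))
    for count in Counter(items).values():
        total //= factorial(count)
    return total

print(distinct_arrangements("STATISTICS"))
print(distinct_arrangements("H" * 5 + "N" * 5))
```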
In this technique of arrangement the order of arrangement of objects is not considered. Thus it is
different from permutation.
*** In an industry there are 4 engineers, 2 technicians and 3 machine operators. A committee of
3 is to be formed to run the machines of the industry efficiently. In how many ways can the
committee be formed?
Solution: Given n = 4 + 2 + 3 = 9. A committee of r = 3 members is to be formed.
This can be done in-
${}^{n}C_{r} = {}^{9}C_{3} = \frac{9!}{3! \, (9 - 3)!}$ = 84 ways
***In a box there are 3 red balls, 2 white balls and 3 black balls.
(i) In how many ways can 3 balls be drawn from the box?
(ii) In how many ways can one ball of each color be drawn?
Solution: Total number of balls in the box is n = 8
(i) Three balls from 8 balls can be drawn in-
${}^{n}C_{r} = {}^{8}C_{3} = \frac{8!}{3! \, (8 - 3)!}$ = 56 ways.
(ii) Drawing one ball of each color means to draw 1 red, 1 white and 1 black ball from the box.
Number of ways to draw one ball of each color is-
${}^{3}C_{1} \times {}^{2}C_{1} \times {}^{3}C_{1}$ = 3 × 2 × 3 = 18
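These combination counts can be checked with `math.comb` (available from Python 3.8). The sketch below only restates the two examples above; the variable names are illustrative.

```python
from math import comb

committee = comb(9, 3)                             # 3 members out of 9 people
any_three = comb(8, 3)                             # any 3 balls out of 8
one_each = comb(3, 1) * comb(2, 1) * comb(3, 1)    # 1 red, 1 white, 1 black

print(committee, any_three, one_each)
```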
***In a packet there are 6 books, three of which are on mathematics and 3 are on statistics. Two
books are taken at random. Find the probability that-
(i) The drawn books are on mathematics,
(ii) The drawn books are on statistics,
(iii) One of the drawn books is on mathematics and the other is on statistics.
Solution: Two books can be drawn in n = ${}^{6}C_{2}$ = 15 ways.
(i) Let A be the event that the drawn books are on mathematics. Two mathematics books can be
drawn from 3 mathematics books in m = ${}^{3}C_{2}$ = 3 ways.
∴ P(A) = $\frac{m}{n}$ = $\frac{3}{15}$ = $\frac{1}{5}$
(ii) Let B be the event that the drawn books are on statistics. Two statistics books can be drawn
from 3 statistics books in m = ${}^{3}C_{2}$ = 3 ways.
∴ P(B) = $\frac{m}{n}$ = $\frac{3}{15}$ = $\frac{1}{5}$
(iii) Let C be the event that one of the drawn books is on mathematics and the other is on
statistics. The event C can occur in m = ${}^{3}C_{1} \times {}^{3}C_{1}$ = 9 ways.
∴ P(C) = $\frac{m}{n}$ = $\frac{9}{15}$ = $\frac{3}{5}$
***From a pack of 52 cards two cards are drawn at random. Find the probability that the cards
are-
(i) Aces,
(ii) Kings,
(iii) Spades,
(iv) One spade and one club,
(v) Of same color,
(vi) Of same number.
Solution: Two cards from 52 cards can be taken in n = ${}^{52}C_{2}$ = 1326 ways.
(i) Let A be the event that the two cards are aces.
There are 4 aces. Two aces can be taken from 4 aces in m = ${}^{4}C_{2}$ = 6 ways.
∴ P(A) = $\frac{m}{n}$ = $\frac{6}{1326}$ = $\frac{1}{221}$
(ii) Let B be the event that the cards are kings.
There are 4 kings. Two kings can be taken in m = ${}^{4}C_{2}$ = 6 ways.
∴ P(B) = $\frac{m}{n}$ = $\frac{6}{1326}$ = $\frac{1}{221}$
(iv) Let D be the event that one card is a spade and the other is a club.
There are 13 spades and 13 clubs. One spade and one club can be taken in m = ${}^{13}C_{1} \times {}^{13}C_{1}$ =
169 ways.
∴ P(D) = $\frac{m}{n}$ = $\frac{169}{1326}$ = $\frac{13}{102}$
(v) Let E be the event that the cards are of the same color.
There are 26 black cards and 26 red cards. Two black cards can be drawn in ${}^{26}C_{2}$ =
325 ways. Similarly, two red cards can be drawn in 325 ways.
Favorable cases to E are m = 325 + 325 = 650
∴ P(E) = $\frac{m}{n}$ = $\frac{650}{1326}$ = $\frac{325}{663}$ = $\frac{25}{51}$
(vi) Let F be the event that the cards are of the same number.
Each number appears on 4 cards, one in each of the four suits, and each suit has 13 different numbers.
Two cards of any one number can be drawn in ${}^{4}C_{2}$ = 6 ways. Since there are 13 numbers, two
cards of the same number can be drawn in m = 13 × 6 = 78 ways.
∴ P(F) = $\frac{m}{n}$ = $\frac{78}{1326}$ = $\frac{1}{17}$
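The card probabilities all have the form (favorable pairs)/(all pairs), so they can be checked with `math.comb` and exact fractions. The Python sketch below covers all six parts of the question; the variable names are illustrative, and "same number" is read as "same rank" (13 ranks of 4 cards each), which is an interpretation of the wording.

```python
from fractions import Fraction
from math import comb

n = comb(52, 2)  # all ways to take 2 cards from 52

p_aces = Fraction(comb(4, 2), n)
p_kings = Fraction(comb(4, 2), n)
p_spades = Fraction(comb(13, 2), n)           # both cards are spades
p_spade_club = Fraction(13 * 13, n)           # one spade and one club
p_same_color = Fraction(2 * comb(26, 2), n)   # both black or both red
p_same_number = Fraction(13 * comb(4, 2), n)  # same rank, any of 13 ranks

for name, p in [("aces", p_aces), ("kings", p_kings), ("spades", p_spades),
                ("spade and club", p_spade_club), ("same color", p_same_color),
                ("same number", p_same_number)]:
    print(name, p)
```

`Fraction` reduces each result to lowest terms automatically.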
***In a family 3 babies are born. The birth of a boy or a girl is equiprobable. Find the probability
that-
(i) All three babies are boys,
(ii) There are exactly two boys,
(iii) There are at least two boys,
(iv) There are at most two boys.
Solution: The birth of 3 babies can occur in n = $2^3$ = 8 ways.
(i) Let A be the event that all 3 babies are boys. Out of 3 babies, 3 can be boys in m = ${}^{3}C_{3}$ = 1 way.
∴ P(A) = $\frac{m}{n}$ = $\frac{1}{8}$
(ii) Let B be the event that there are exactly 2 boys. Out of 3 babies, 2 can be boys in m = ${}^{3}C_{2}$
= 3 ways.
∴ P(B) = $\frac{m}{n}$ = $\frac{3}{8}$
(iii) Let C be the event that there are at least 2 boys. The number of boys is either 2 or 3. Two boys
can be born in ${}^{3}C_{2}$ = 3 ways and 3 boys can be born in ${}^{3}C_{3}$ = 1 way. Therefore, m = 3 + 1 = 4
∴ P(C) = $\frac{m}{n}$ = $\frac{4}{8}$ = $\frac{1}{2}$
(iv) Let D be the event that there are at most 2 boys. The number of boys is either 0, 1 or 2,
but not 3. The event $\bar{D}$ indicates that there are 3 boys.
P($\bar{D}$) = $\frac{1}{8}$ [from (i)]
∴ P(D) = 1 - P($\bar{D}$) = 1 - $\frac{1}{8}$ = $\frac{7}{8}$
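With only 8 equally likely outcomes, the whole sample space can be listed. The Python sketch below labels each birth "B" (boy) or "G" (girl) and counts boys in each outcome; the helper `prob` and the labels are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# All 2^3 equally likely birth sequences, e.g. ('B', 'G', 'B').
outcomes = list(product("BG", repeat=3))

def prob(event):
    """Probability of an event defined on the number of boys."""
    favorable = sum(1 for o in outcomes if event(o.count("B")))
    return Fraction(favorable, len(outcomes))

p_all_boys = prob(lambda boys: boys == 3)
p_exactly_two = prob(lambda boys: boys == 2)
p_at_least_two = prob(lambda boys: boys >= 2)
p_at_most_two = prob(lambda boys: boys <= 2)

print(p_all_boys, p_exactly_two, p_at_least_two, p_at_most_two)
```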
***An urn contains 4 red and 3 white balls. Two balls are drawn one after another (a) with
replacement, (b) without replacement. Find the probability that-
(1) Both balls are white,
(2) One ball is white and the other is red.
Solution: The possible outcomes of drawing two balls one after another are S: {WW, WR, RW, RR},
where R = red ball, W = white ball.
(1) Let A be the event that both balls are white. Favorable point to A: {WW}
(a) P(A) = P(WW)
= $\frac{{}^{3}C_{1}}{{}^{7}C_{1}} \cdot \frac{{}^{3}C_{1}}{{}^{7}C_{1}}$ = $\frac{9}{49}$
(2) Let B be the event that one ball is white and the other is red. Favorable points to B: {WR, RW}.
(a) P(B) = P(RW) + P(WR)
= $\frac{{}^{4}C_{1}}{{}^{7}C_{1}} \cdot \frac{{}^{3}C_{1}}{{}^{7}C_{1}} + \frac{{}^{3}C_{1}}{{}^{7}C_{1}} \cdot \frac{{}^{4}C_{1}}{{}^{7}C_{1}}$ = $\frac{12}{49} + \frac{12}{49}$ = $\frac{24}{49}$
***An urn contains 6 red and 4 black balls. Three balls are taken at random from the urn. Find
the probability that-
(a) All three are red
(b) Two balls are red
(c) One ball is red.
Solution: The urn contains 6 + 4 = 10 balls. Three balls can be taken from the urn in n = ${}^{10}C_{3}$ = 120 ways.
(a) Let A be the event that all 3 balls are red. Three red balls can be drawn in m = ${}^{6}C_{3}$ = 20 ways.
∴ P(A) = $\frac{m}{n}$ = $\frac{20}{120}$ = $\frac{1}{6}$
(b) Let B be the event that two balls are red and 1 ball is black. Two red balls and 1 black ball can be
drawn in m = ${}^{6}C_{2} \times {}^{4}C_{1}$ = 60 ways.
∴ P(B) = $\frac{m}{n}$ = $\frac{60}{120}$ = $\frac{1}{2}$
(c) Let C be the event that one ball is red and the other two are black. One red and 2 black balls can be
drawn in m = ${}^{6}C_{1} \times {}^{4}C_{2}$ = 36 ways.
∴ P(C) = $\frac{m}{n}$ = $\frac{36}{120}$ = $\frac{3}{10}$
***The letters of the word 'MATHEMATICS' are arranged at random. Find the probability that the
vowels occupy only odd positions.
Solution: The letters of the word 'MATHEMATICS' (M, A and T each occur twice) can be arranged in-
n = $\frac{11!}{2! \, 2! \, 2! \, 1! \, 1! \, 1! \, 1! \, 1!}$ = 4989600 ways
There are 11 letters, out of which 4 are vowels (A, A, E, I). The odd positions are the 1st, 3rd, 5th, 7th,
9th and 11th places, and the 4 places for the vowels can be chosen from these 6 in ${}^{6}C_{4}$ = 15 ways.
Since A occurs twice, the 4 vowels can be arranged among themselves in $\frac{4!}{2!}$ = 12 ways, and the remaining 7
consonants (M, M, T, T, H, C, S) can be arranged among themselves in $\frac{7!}{2! \, 2!}$ = 1260 ways.
Therefore, the 4 vowels can be placed in odd places only in-
m = 15 × 12 × 1260 = 226800 ways
Hence the required probability is-
$\frac{226800}{4989600}$ = $\frac{1}{22}$ ≈ 0.0455