Maths - Class - 12 - Statistics and Probability
STATISTICS:
MEASURES OF CENTRAL TENDENCY:
An average value or central value of a distribution is the value of the variable
that is representative of the entire distribution. Such representative values are
called measures of central tendency.
Three types of mathematical averages are (i) arithmetic mean, (ii) geometric
mean, and (iii) harmonic mean.
Five types of positional averages are (i) median, (ii) quartiles, (iii) deciles, (iv)
percentiles, and (v) mode.
Arithmetic mean
Individual observations: $\bar{X} = \frac{\Sigma x}{N}$, where $\Sigma x$ = sum of items and $N$ = number of observations.
Discrete series: $\bar{X} = \frac{\Sigma f x}{N}$, where $f$ = frequency and $N = \Sigma f$.
Short-cut method: $\bar{X} = A + \frac{\Sigma f d}{N}$, where $d = X - A$, $N = \Sigma f$, and $A$ = assumed mean.
Continuous series: $\bar{X} = \frac{\Sigma f m}{N}$, where $m$ = mid-values of the various classes and $N = \Sigma f$.
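The direct and short-cut formulas above can be checked against each other with a minimal Python sketch. The function names and the sample data are illustrative assumptions, not part of the original notes.

```python
def mean_direct(values, freqs):
    """Arithmetic mean of a discrete series: sum(f * x) / sum(f)."""
    total_freq = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / total_freq


def mean_shortcut(values, freqs, assumed_mean):
    """Short-cut method: A + sum(f * d) / N, with d = x - A."""
    total_freq = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + sum_fd / total_freq


# Made-up data for illustration only
values = [10, 20, 30, 40]
freqs = [1, 2, 3, 4]
print(mean_direct(values, freqs))                     # 30.0
print(mean_shortcut(values, freqs, assumed_mean=30))  # 30.0
```

Any choice of assumed mean should give the same result; a value near the middle of the data simply keeps the deviations small. For a continuous series, pass the class mid-values as `values`.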
Combined mean: If a series of $N$ observations consists of two components with
means $\bar{X}_1$, $\bar{X}_2$ and numbers of items $N_1$, $N_2$, then the combined mean is
\[ \bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2} \]
GEOMETRIC MEAN:
(i) For ungrouped distribution: If $x_1, x_2, \dots, x_n$ are $n$ positive values of
the variate, then their geometric mean $G$ is given by
\[ G = (x_1 \, x_2 \cdots x_n)^{1/n} \;\Rightarrow\; G = \operatorname{antilog}\left[ \frac{1}{n} \sum_{i=1}^{n} \log x_i \right] \]
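(ii) For frequency distribution: If $x_1, x_2, \dots, x_n$ are $n$ positive values with
corresponding frequencies $f_1, f_2, \dots, f_n$ respectively, then their G.M. is
\[ G = \left( x_1^{f_1} \times x_2^{f_2} \times \cdots \times x_n^{f_n} \right)^{1/N} \;\Rightarrow\; G = \operatorname{antilog}\left[ \frac{1}{N} \sum_{i=1}^{n} f_i \log x_i \right] \]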
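Note: If $G_1$ and $G_2$ are the geometric means of two series containing $n_1$ and
$n_2$ positive values respectively, and $G$ is the geometric mean of their combined series, then
\[ G = \left( G_1^{n_1} \times G_2^{n_2} \right)^{\frac{1}{n_1 + n_2}} \;\Rightarrow\; G = \operatorname{antilog}\left[ \frac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2} \right] \]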
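The log/antilog form of the geometric mean translates directly into code. The sketch below assumes base-10 logarithms (so the antilog is a power of 10); the function names and test values are hypothetical.

```python
import math


def geometric_mean(values, freqs=None):
    """G = antilog[(1/N) * sum(f_i * log x_i)] for positive values."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every frequency is 1
    total_freq = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / total_freq)


def combined_geometric_mean(g1, n1, g2, n2):
    """G = antilog[(n1 log G1 + n2 log G2) / (n1 + n2)]."""
    return 10 ** ((n1 * math.log10(g1) + n2 * math.log10(g2)) / (n1 + n2))


# Illustrative check: combining the series [2, 8] and [4]
g_a = geometric_mean([2, 8])
g_b = geometric_mean([4])
print(combined_geometric_mean(g_a, 2, g_b, 1))
print(geometric_mean([2, 8, 4]))
```

The two printed values should agree up to floating-point rounding, which is what the note on combined series asserts.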
HARMONIC MEAN:
(i) For ungrouped distribution: If $x_1, x_2, \dots, x_n$ are $n$ non-zero values of
the variate, then their harmonic mean $H$ is defined as
\[ H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
(ii) For frequency distribution: If $x_1, x_2, \dots, x_n$ are $n$ non-zero values of the variate
with corresponding frequencies $f_1, f_2, \dots, f_n$ respectively, then their H.M. is
\[ H = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_n}{x_n}} = \frac{N}{\sum_{i=1}^{n} \frac{f_i}{x_i}} \]
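Both harmonic mean formulas can be covered by one small function, treating ungrouped data as the case where every frequency is 1. This is a sketch with a hypothetical function name and made-up inputs.

```python
def harmonic_mean(values, freqs=None):
    """H = N / sum(f_i / x_i) for non-zero values; N = sum of frequencies."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is defined only for non-zero values")
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


print(harmonic_mean([1, 2, 4]))       # ungrouped: 3 / (1 + 1/2 + 1/4)
print(harmonic_mean([2, 4], [1, 2]))  # grouped: 3 / (1/2 + 2/4)
```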
Median: Median is defined as the central value of a set of observations. To
calculate the median, first arrange the data in ascending or descending order of
magnitude.
Individual observation: If $N$ is odd, then median = size of the $\frac{N+1}{2}$th item.
If $N$ is even, median = average of the $\frac{N}{2}$th and $\left(\frac{N}{2} + 1\right)$th items.
Discrete series: First, arrange the data in ascending or descending order and find the
cumulative frequencies; the median is then the size of the observation whose
cumulative frequency is just greater than $N/2$.
Continuous series: Median class is the class corresponding to the cumulative
frequency just greater than $N/2$, and the median is given by the formula
\[ \text{Median} = l + \frac{\frac{N}{2} - c.f.}{f} \times i \]
where $l$ = lower limit of the median class, c.f. = cumulative frequency of the class
preceding the median class, $f$ = frequency of the median class, and $i$ = class
interval of the median class.
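The median formula for a continuous series can be sketched as follows. The function name is hypothetical, and the sketch assumes all classes have the same width and are listed in ascending order; the frequency table is made up.

```python
def grouped_median(lower_limits, width, freqs):
    """Median = l + ((N/2 - c.f.) / f) * i for equal-width classes."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cum + f > n / 2:  # first class whose cumulative frequency exceeds N/2
            return lower + (n / 2 - cum) / f * width
        cum += f
    raise ValueError("distribution has no observations")


# Classes 0-10, 10-20, 20-30, 30-40 with made-up frequencies
print(grouped_median([0, 10, 20, 30], 10, [5, 8, 12, 5]))
```

Here $N = 30$, so $N/2 = 15$; the median class is 20-30 with c.f. = 13 and $f = 12$, giving $20 + \frac{15 - 13}{12} \times 10$.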
Mode:
It is the value of the variable which occurs the greatest number of times, i.e.,
the value with maximum frequency.
In case of a discrete frequency distribution, the value of the mode is determined by
the method of grouping.
In case of a grouped or continuous frequency distribution, the mode is given by
the formula
\[ \text{Mode} = l + \frac{f - f_1}{2f - f_1 - f_2} \times h \]
where $l$ = lower limit of the modal class, $h$ = width of the modal class, $f_1$ =
frequency of the class preceding the modal class, $f_2$ = frequency of the class
following the modal class, and $f$ = frequency of the modal class.
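A minimal sketch of the grouped mode formula follows. The function name is hypothetical; two things are assumptions of this sketch and not stated in the notes: a missing neighbouring class is given frequency 0, and when several classes tie for the maximum frequency the first one is used.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode = l + (f - f1) / (2f - f1 - f2) * h for equal-width classes."""
    k = max(range(len(freqs)), key=lambda i: freqs[i])  # index of modal class
    f = freqs[k]
    f1 = freqs[k - 1] if k > 0 else 0               # assumption: no class before -> 0
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0  # assumption: no class after -> 0
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when f = f1 = f2")
    return lower_limits[k] + (f - f1) / denominator * width


# Classes 0-10, 10-20, 20-30, 30-40 with made-up frequencies
print(grouped_mode([0, 10, 20, 30], 10, [5, 8, 12, 5]))
```

With modal class 20-30, $f = 12$, $f_1 = 8$, $f_2 = 5$, the formula gives $20 + \frac{4}{11} \times 10$.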
Note:
If there are two observations (or modal classes) with the same maximum
frequency, then the mode can be found by using the empirical formula
\[ \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} \]
Measures of Dispersion
Dispersion may be defined as the extent to which the items are scattered around a
measure of central tendency.
Methods of measuring dispersion
The following are the methods of measuring dispersion: (i) the range; (ii) the
semi-interquartile range or quartile deviation; (iii) the mean deviation; and (iv) the
standard deviation.
Range: It is the difference between the highest and the lowest value in the series,
i.e., Range = $x_h - x_l$, where $x_h$ is the highest value and $x_l$ is the lowest value. The
coefficient of range = $(x_h - x_l) / (x_h + x_l)$.
Mean deviation
Individual series:
\[ MD = \frac{1}{n} \sum_{i=1}^{n} |x_i - M| \]
where $M$ = median, mean, or mode, and $n$ = number of observations.
Discrete series:
\[ MD = \frac{1}{N} \sum_{i=1}^{n} f_i |x_i - M|, \quad N = \sum_{i=1}^{n} f_i \]
Note: Unless stated otherwise, mean deviation (MD) stands for the mean deviation about
the median.
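The two mean deviation formulas differ only in the frequencies, so one function can cover both. This is a sketch with a hypothetical function name and made-up data; `statistics.median` from the Python standard library supplies the central value $M$.

```python
import statistics


def mean_deviation(values, about, freqs=None):
    """MD = (1/N) * sum(f_i * |x_i - M|), with M passed in as `about`."""
    if freqs is None:
        freqs = [1] * len(values)  # individual series
    n = sum(freqs)
    return sum(f * abs(x - about) for x, f in zip(values, freqs)) / n


data = [2, 4, 6, 8, 10]
m = statistics.median(data)     # central value M (the median here)
print(mean_deviation(data, m))  # individual series, about the median

# Discrete series: values with frequencies, deviations taken about M = 2
print(mean_deviation([1, 2, 3], 2, freqs=[1, 2, 1]))
```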
Standard deviation: The arithmetic mean of the squares of the deviations of the
variable values from their arithmetic mean is known as the variance, and its
square root is known as the standard deviation ($\sigma$).
Individual series:
\[ \sigma^2 = \text{variance} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 \]
Discrete series:
\[ \sigma^2 = \text{variance} = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{N} \sum_{i=1}^{n} f_i x_i^2 - \left( \frac{1}{N} \sum_{i=1}^{n} f_i x_i \right)^2 \]
Standard deviation
\[ \sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{n} f_i x_i^2 - \left( \frac{1}{N} \sum_{i=1}^{n} f_i x_i \right)^2 } \]
Note: If $d_i = \frac{x_i - A}{h}$, then
\[ \sigma^2 = h^2 \left[ \frac{1}{N} \sum_{i=1}^{n} f_i d_i^2 - \left( \frac{1}{N} \sum_{i=1}^{n} f_i d_i \right)^2 \right] \]
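The definition of variance and the step-deviation shortcut should give the same number. The sketch below compares them on made-up data; the function names, the assumed mean $A = 25$, and the scale $h = 10$ are illustrative choices.

```python
import math


def variance_direct(values, freqs):
    """sigma^2 = (1/N) * sum(f_i * (x_i - mean)^2)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n


def variance_step_deviation(values, freqs, assumed_mean, h):
    """sigma^2 = h^2 * [mean of f*d^2 - (mean of f*d)^2], d = (x - A) / h."""
    n = sum(freqs)
    d = [(x - assumed_mean) / h for x in values]
    mean_d2 = sum(f * di ** 2 for di, f in zip(d, freqs)) / n
    mean_d = sum(f * di for di, f in zip(d, freqs)) / n
    return h ** 2 * (mean_d2 - mean_d ** 2)


values = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]
var_a = variance_direct(values, freqs)
var_b = variance_step_deviation(values, freqs, assumed_mean=25, h=10)
print(var_a, var_b, math.sqrt(var_a))  # both variances, then sigma
```

Both routes divide by $N$, so this is the population variance used throughout these notes, not the sample variance with $N - 1$.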
Properties of Mean, Median, and Mode
Mean:
1 The sum of the squares of deviations from the mean is minimum, i.e., $\Sigma (X - \bar{X})^2$ is least.
2 The sum of deviations of items from their mean is equal to zero, i.e.,
$\Sigma (X - \bar{X}) = 0$.
3 If every observation is increased, decreased, multiplied, or divided by the same
constant, the mean is changed in the same way.
4 The value of the arithmetic mean does not depend on the choice of origin
(assumed mean $A$) used to compute it.
Median
1 The sum of the absolute values of the deviations of the items from the median is
minimum.
2 It is a positional average: it is influenced by the position of the items, not by their size.
Mode
It is not affected by the presence of extremely large or small items.
Combined standard deviation:
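If there are two sets of observations containing $n_1$ and $n_2$ items with respective
means $\bar{X}_1$ and $\bar{X}_2$ and standard deviations $\sigma_1$ and $\sigma_2$, then the mean $\bar{X}_{12}$ and
standard deviation $\sigma$ of the $n_1 + n_2$ observations, taken together, are
\[ \bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} \;\Rightarrow\; \sigma^2 = \frac{1}{n_1 + n_2} \left[ n_1 \left( \sigma_1^2 + d_1^2 \right) + n_2 \left( \sigma_2^2 + d_2^2 \right) \right] \]
where $d_1 = \bar{X}_{12} - \bar{X}_1$ and $d_2 = \bar{X}_{12} - \bar{X}_2$.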
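The combined formulas can be verified by pooling two small data sets and comparing with a direct calculation. This is a sketch: the function name and the data are made up, and `statistics.pstdev` (population standard deviation, dividing by $N$) is used because it matches the definition in these notes.

```python
import math
import statistics


def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and standard deviation of two groups."""
    mean12 = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean12 - mean1
    d2 = mean12 - mean2
    var12 = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return mean12, math.sqrt(var12)


group_a = [1, 3]
group_b = [5, 7, 9]
mean12, sd12 = combined_mean_sd(
    len(group_a), statistics.mean(group_a), statistics.pstdev(group_a),
    len(group_b), statistics.mean(group_b), statistics.pstdev(group_b),
)
pooled = group_a + group_b
print(mean12, statistics.mean(pooled))   # combined mean vs direct mean
print(sd12, statistics.pstdev(pooled))   # combined sigma vs direct sigma
```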
Properties of Standard Deviation:
1 The standard deviation of the first $n$ natural numbers $1, 2, 3, \dots, n$ is
$\sqrt{(n^2 - 1)/12}$.
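This closed form can be spot-checked numerically. The sketch below compares it with `statistics.pstdev` (population standard deviation) for a few values of $n$; the function name is hypothetical.

```python
import math
import statistics


def sd_first_n_naturals(n):
    """Closed form sqrt((n^2 - 1) / 12) for the numbers 1, 2, ..., n."""
    return math.sqrt((n * n - 1) / 12)


for n in (2, 5, 10):
    direct = statistics.pstdev(range(1, n + 1))  # computed from the data
    print(n, sd_first_n_naturals(n), direct)
```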
$P(B) = P(A) \cdot P(B/A) + P(A') \cdot P(B/A')$
$\left[ P(A_1) \cdot P(A/A_1) + P(A_2) \cdot P(A/A_2) + \cdots + P(A_n) \cdot P(A/A_n) \right]$
Probability distribution
Let $S$ be the sample space associated with a given random experiment. Then a
random variable is a real-valued function whose domain is the sample
space of the experiment. If the random variable takes the values
$x_1, x_2, \dots, x_n$ (for example $0, 1, 2, \dots, n$) with probabilities $p_1, p_2, \dots, p_n$,
its probability distribution is given by the table
x       x_1   x_2   x_3   …   x_n
P(x)    p_1   p_2   p_3   …   p_n
Binomial distribution
A probability distribution representing binomial trials is said to be a binomial
distribution. Consider a binomial experiment which has been repeated $n$
times. Let the probabilities of success and failure in any trial be $p$ and $q$
respectively. The number of ways of choosing $r$ successes in $n$
trials is ${}^{n}C_r$, and the probability of $r$ successes and $(n - r)$ failures in one such
arrangement is $p^r \cdot q^{n-r}$. Thus the probability of having exactly $r$ successes is
${}^{n}C_r \cdot p^r \cdot q^{n-r}$.
Let $X$ be the random variable representing the number of successes, then
\[ P(X = r) = {}^{n}C_r \cdot p^r \cdot q^{n-r}, \quad r = 0, 1, 2, \dots, n \]
Notes:
Probability of at most $r$ successes in $n$ trials
\[ = \sum_{\lambda=0}^{r} {}^{n}C_\lambda \, p^\lambda \, q^{n-\lambda} \]
Probability of at least $r$ successes in $n$ trials
\[ = \sum_{\lambda=r}^{n} {}^{n}C_\lambda \, p^\lambda \cdot q^{n-\lambda} \]
Probability of having the first success at the $r$th trial $= p \cdot q^{r-1}$
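The binomial formulas above map onto a few short Python functions. This is a sketch: the function names are hypothetical, $q$ is taken as $1 - p$ (the usual relation between the two probabilities), and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binom_pmf(n, r, p):
    """P(X = r) = nCr * p^r * q^(n - r), with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p ** r * q ** (n - r)


def prob_at_most(n, r, p):
    """Sum of P(X = k) for k = 0, 1, ..., r."""
    return sum(binom_pmf(n, k, p) for k in range(0, r + 1))


def prob_at_least(n, r, p):
    """Sum of P(X = k) for k = r, r + 1, ..., n."""
    return sum(binom_pmf(n, k, p) for k in range(r, n + 1))


def prob_first_success_at(r, p):
    """p * q^(r - 1): r - 1 failures followed by one success."""
    return p * (1 - p) ** (r - 1)


# Four tosses of a fair coin, "success" = heads (illustrative example)
print(binom_pmf(4, 2, 0.5))           # exactly 2 heads
print(prob_at_most(4, 1, 0.5))        # at most 1 head
print(prob_at_least(4, 3, 0.5))       # at least 3 heads
print(prob_first_success_at(3, 0.5))  # first head on the 3rd toss
```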