
PERIYAR UNIVERSITY

NAAC 'A++' Grade with CGPA 3.61 (Cycle - 3)


State University - NIRF Rank 56 - State Public University Rank 25
Salem-636011, Tamilnadu, India.

CENTRE FOR DISTANCE AND ONLINE EDUCATION


(CDOE)

BACHELOR OF COMPUTER SCIENCE


SEMESTER – IV

ELECTIVE COURSE: STATISTICAL METHODS


AND ITS APPLICATIONS – II
(Candidates admitted from 2024 onwards)

PERIYAR UNIVERSITY
CENTRE FOR DISTANCE AND ONLINE EDUCATION
(CDOE)

BACHELOR OF COMPUTER SCIENCE


SEMESTER – IV

ELECTIVE COURSE: STATISTICAL METHODS


AND ITS APPLICATIONS – II

Prepared by:
Centre for Distance and Online Education (CDOE)
Periyar University, Salem – 11.



TABLE OF CONTENTS
UNIT TOPICS PAGE
Syllabus
1 Random variable and Mathematical Expectation 09
2 Discrete Probability Distribution 35
3 Continuous Probability Distribution 53
4 Test of Significance (Large Samples Tests) 69
5 Test of Significance (Small Samples Tests) 88



Section Topic Page Nos.
UNIT - I
Unit Objectives
Section 1.1 RANDOM VARIABLES 09
1.1.1 Definition of Random Variable 09
1.1.2 Discrete Random Variable 09
1.1.3 Continuous Random Variable 10
1.1.4 Cumulative Distribution Function 10
1.1.5 Properties of Distribution Function 10
Let Us Sum Up 11
Check Your Progress 11
Section 1.2 TWO-DIMENSIONAL RANDOM VARIABLE 11
1.2.1 Definition of Two-Dimensional Random Variable 12
1.2.2 Example 12
1.2.3 Definition of Two-Dimensional Discrete Random Variable 13
1.2.4 Definition of Two-Dimensional Continuous Random Variable 14
Let Us Sum Up 14
Check Your Progress 14
Section 1.3 MARGINAL PROBABILITY DISTRIBUTION 15
1.3.1 Definition 15
1.3.2 Definition 15
Let Us Sum Up 15
Check Your Progress 16
Section 1.4 CONDITIONAL PROBABILITY DISTRIBUTION 16
1.4.1 Definition 16
1.4.2 Definition 16
Let Us Sum Up 17
Check Your Progress 17
Section 1.5 INDEPENDENT RANDOM VARIABLE 18
1.5.1 Definition 18
1.5.2 Definition 18
1.5.3 Example 18
Let Us Sum Up 21
Check Your Progress 21
Section 1.6 MATHEMATICAL EXPECTATION 22
1.6.1 Expectation 22
1.6.2 Definition 22
1.6.3 Example 24



1.6.4 Example 24
Let Us Sum Up 25
Check Your Progress 25
Section 1.7 FUNCTION OF A RANDOM VARIABLE 25
1.7.1 Definition 26
1.7.2 Theorem 26
1.7.3 Example 27
1.7.4 Definition 28
1.7.5 Theorem 28
1.7.6 Example 29
1.7.7 Example 30
Let Us Sum Up 30
Check Your Progress 30
1.8 Unit- Summary 30
1.9 Glossary 31
1.10 Self- Assessment Questions 31
1.11 Activities / Exercises / Case Studies 32
1.12 Answers for Check your Progress 34
1.13 References and Suggested Readings 34
UNIT - II
Unit Objectives
Section 2.1 DISCRETE PROBABILITY DISTRIBUTION 35
2.1.1 Binomial Distribution 35
2.1.2 Definition of Binomial Distribution 36
2.1.3 Example 37
2.1.4 Example 38
2.1.5 Additive Property 39
2.1.6 Recurrence relation for Binomial Distribution 39
2.1.7 Mean and Variance 41
2.1.8 Fitting of Binomial Distribution 43
Let Us Sum Up 44
Check Your Progress 44
Section 2.2 Poisson Distribution 45
2.2.1 Definition of Poisson Distribution 45
2.2.2 Mean and Variance 45
2.2.3 Additive Property 46
2.2.4 Fitting of Poisson Distribution 46
Let Us Sum Up 48
Check Your Progress 48



2.3 Unit Summary 49
2.4 Glossary 49
2.5 Self - Assessment Questions 49
2.6 Activities / Exercises / Case Studies 50
2.7 Answers for Check your Progress 51
2.8 References and Suggested Readings 52
UNIT - III
Unit Objectives 53
Section 3.1 Normal Distribution 53
3.1.1 Characteristics of Normal Distribution 53
3.1.2 Example 54
3.1.3 Fitting of Normal Distribution 55
3.1.4 Curve Fitting 55
3.1.5 Types 56
3.1.6 Example 56
Let Us Sum Up 58
Check Your Progress 58
Section 3.2 SECOND DEGREE PARABOLA 59
3.2.1 Definition 59
3.2.2 Example 59
Let Us Sum Up 62
Check Your Progress 62
Section 3.3 EXPONENTIAL DISTRIBUTION 63
3.3.1 Definition 63
3.3.2 Example 63
Let Us Sum Up 65
Check Your Progress 65
3.4 Unit summary 65
3.5 Glossary 66
3.6 Self- Assessment Questions 66
3.7 Activities / Exercises / Case Studies 67
3.8 Answers for Check your Progress 68
3.9 References and Suggested Readings 68
UNIT - IV
Unit Objectives 69
Section 4.1 STATISTICAL HYPOTHESIS 69
4.1.1 Population 69
4.1.2 Sample 70
4.1.3 Parameter 70



4.1.4 Statistic 70
Let Us Sum Up 70
Check Your Progress 70
Section 4.2 TESTING OF HYPOTHESIS 71
4.2.1 Types 71
4.2.2 Critical Region (Or) Rejection Region 71
4.2.3 Errors 71
4.2.4 Level of Significance 71
4.2.5 Degrees of Freedom 71
4.2.6 Small Sample Test 72
4.2.7 Test Procedure 72
Let Us Sum Up 72
Check Your Progress 72
Section 4.3 TEST FOR POPULATION MEAN 73
4.3.1 Test Procedure 73
4.3.2 Problem 73
4.3.3 Test for Difference of Population Mean 74
4.3.4 Problem 74
Let Us Sum Up 76
Check Your Progress 76
Section 4.4 PAIRED t-TEST 77
4.4.1 Test Procedure 77
4.4.2 Problem 77
Let Us Sum Up 79
Check Your Progress 79
Section 4.5 CHI-SQUARE TEST OF INDEPENDENT ATTRIBUTES 80
4.5.1 Test Procedure 80
4.5.2 Example 81
Let Us Sum Up 82
Check Your Progress 82
4.6 Unit Summary 83
4.7 Glossary 83
4.8 Self- Assessment Questions 83
4.9 Activities / Exercises / Case Studies 85
4.10 Answers for Check your Progress 87
4.11 References and Suggested Readings 87
UNIT - V
Unit Objectives 88
Section 5.1 TESTING OF SIGNIFICANCE - SMALL SAMPLES 88



5.1.1 Test Procedure 88
5.1.2 Test for Population Mean 89
5.1.3 Problem 89
5.1.4 Test for Difference of Population Mean 90
5.1.5 Problem 91
Let Us Sum Up 92
Check Your Progress 92
Section 5.2 PAIRED t-TEST 93
5.2.1 Test Procedure 93
5.2.2 Problem 93
5.2.3 Test for Specified Population Mean 95
5.2.4 Problem 96
5.2.5 Test for Difference of Population Mean 97
5.2.6 Problem 98
Let Us Sum Up 99
Check Your Progress 99
Section 5.3 TESTING OF HYPOTHESIS ABOUT A POPULATION PROPORTION 100
5.3.1 Test for Single Proportion 100
5.3.2 Test for the Difference Between two Population Proportions 101
Let Us Sum Up 102
Check Your Progress 102
Section 5.4 CHI-SQUARE TEST OF INDEPENDENCE OF ATTRIBUTES 103
5.4.1 Example 104
5.4.2 Chi Square Test of Goodness of Fit 105
5.4.3 Problem 106
5.4.4 Test for Equality of Population Variances 106
5.4.5 Problem 107
Let Us Sum Up 108
Check Your Progress 108
5.5 Unit Summary 109
5.6 Glossary 109
5.7 Self- Assessment Questions 109
5.8 Activities / Exercises / Case Studies 111
5.9 Answers for Check your Progress 113
5.10 References and Suggested Readings 113



SYLLABUS
STATISTICAL METHODS AND ITS APPLICATIONS – II

Unit–I: Random Variable and Mathematical Expectation: Definitions –


Random variable – Discrete and Continuous Random variable –
Distribution functions and Density function – Mathematical Expectation and
its Properties - Simple Problems.

Unit–II: Discrete Probability Distribution: Binomial and Poisson Distributions


– Mean and Variance of Distributions – Recurrence formula – Fitting of
Binomial and Poisson Distributions – Simple Problems.

Unit-III: Continuous Probability Distribution and Curve Fitting: Definition of


Normal distribution – Characteristics of Normal distribution (Simple
Problems) – Curve fitting – Fitting of Straight line and Second degree
Parabola - Simple Problems.

Unit-IV: Test of Significance (Large Samples Tests): Concept of Statistical


Hypothesis – Simple and Composite Hypothesis – Null and Alternative
Hypothesis – Critical region – Type I and Type II Errors – Sampling
distribution and Standard Error – Test of Significance: Large Sample Tests
for Proportion, Difference of Proportions, Mean and Difference of Means -
Simple Problems.

Unit-V: Test of Significance (Small Samples Tests): Small sample tests with
regard to Mean, Difference between Means and Paired t- test, F-test -
Definition of Chi-square test – Assumptions – Characteristics – Chi-square
tests for Goodness of fit and Independence of attributes – Simple
Problems.



UNIT 1- RANDOM VARIABLE AND MATHEMATICAL
EXPECTATION
Unit–I: Random Variable and Mathematical Expectation Definitions –
Random variable – Discrete and Continuous Random variable –
Distribution functions and Density function – Mathematical Expectation
and its Properties - Simple Problems.

Unit Objectives

1. To impart statistical concepts with rigorous mathematical treatment.

2. To introduce the basic concepts of discrete and continuous random variables.

3. To introduce the concepts of Distribution functions and Density function.

4. To introduce the concepts of Mathematical Expectation.

SECTION 1.1: RANDOM VARIABLE

1.1.1 – DEFINITION OF RANDOM VARIABLE

Let E be an experiment and S a sample space associated with the experiment.

A function X assigning to each element s ∈ S a real number X(s) is called a random variable.

1.1.2 - DISCRETE RANDOM VARIABLE

A random variable X is said to be discrete if it assumes only a finite or countably infinite set of values; that is, the range space R_X contains only a finite or countably infinite number of points.



Definition: Let x₁, x₂, ... be the possible values of a discrete random variable X. Then p(xᵢ) is called the probability function of the discrete random variable X if

(i) p(xᵢ) ≥ 0 for i = 1, 2, ..., and (ii) Σᵢ p(xᵢ) = 1.

1.1.3 CONTINUOUS RANDOM VARIABLE

A random variable is said to be continuous if it can assume all values in an interval (or the whole real line).

Definition: A function f is said to be the pdf (probability density function) of a continuous random variable X if the following conditions are satisfied:

(i) f(x) ≥ 0 for all x, and (ii) ∫_{−∞}^{∞} f(x) dx = 1.

1.1.4 CUMULATIVE DISTRIBUTION FUNCTION

Let X be a random variable, discrete or continuous. Then the function F(x) defined by F(x) = P(X ≤ x) is called the cumulative distribution function of X. If X is a discrete random variable, then F(x) = Σ_{xᵢ ≤ x} p(xᵢ). If X is a continuous random variable with pdf f, then F(x) = ∫_{−∞}^{x} f(t) dt.


1.1.5 PROPERTIES OF DISTRIBUTION FUNCTION

1. F(−∞) = 0 and F(∞) = 1.

2. F(x) is a non-decreasing function of x.

3. F(x) is continuous from the right.

4. If X is a continuous random variable with pdf f(x), then F′(x) = f(x).
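These conditions and properties can be checked numerically. The short Python sketch below is an added illustration (not part of the original text); the fair-die probability function is an assumed example used to build the distribution function F(x) = P(X ≤ x).

```python
from fractions import Fraction

# Probability function of a fair six-sided die: p(x) = 1/6 for x = 1,...,6
p = {x: Fraction(1, 6) for x in range(1, 7)}

# The two defining conditions of a probability function
assert all(px >= 0 for px in p.values())
assert sum(p.values()) == 1

# Cumulative distribution function: F(x) = sum of p(x_i) over x_i <= x
def F(x):
    return sum(px for xi, px in p.items() if xi <= x)

print(F(0))   # 0    (F tends to 0 on the far left)
print(F(3))   # 1/2
print(F(6))   # 1    (F tends to 1 on the far right)
```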

Let Us Sum Up

Learners, in this section we have seen that definition of random variable, discrete

and continuous random variables cumulative distribution function and its properties of

distribution function.

Check your Progress


1. Two random variables X and Y are said to be independent if:
A. E(XY)=1
B. E(XY)=0
C. E(XY)=E(X) E(Y)
D. E(XY)= any constant value
2. If X is a random variable E(etx) is known as:
A. Characteristic function
B. Moment generating function
C. Probability generating function
D. All the above

SECTION 1.2: TWO-DIMENSIONAL RANDOM VARIABLE


We have so far considered random experiments with a single characteristic. Now we

consider random experiment with more than one characteristic. For example, we study

the heights and weights of some persons resulting in the outcomes (H, W). here the

outcome has two characteristics height and weight. This can regard as a two-

dimensional random variable.



1.2.1 DEFINITION TWO-DIMENSIONAL RANDOM VARIABLE

Let S be the sample space associate with a random experiment E. Let X = X(s),

Y = Y(s) be two functions, each assigning a real number to every outcome s  S. Then

(X, Y) is called a two-dimensional random variable.

1.2.2 EXAMPLE

Consider the experiment of throwing a coin twice. Define X as the number of heads obtained in the first toss and Y as the number of heads observed in the second toss.

Then,

Points of s (H, H) (H, T) (T, H) (T, T)


X(s) 1 1 0 0
Y(s) 1 0 1 0

The sample space of the two-dimensional random variable (X, Y) is {(1,1), (1,0), (0,1), (0,0)}. If the values of (X, Y) are finite or countably infinite, then (X, Y) is called a two-dimensional discrete random variable. If (X, Y) assumes all values in some region R of the (x, y)-plane, then (X, Y) is called a two-dimensional continuous random variable. In order to introduce the concept of the joint probability distribution of a two-dimensional random variable, we consider the following example. Consider the experiment of

throwing a perfect coin twice. The sample points are.

s₁ = (H, H), s₂ = (H, T), s₃ = (T, H), s₄ = (T, T), with p(s₁) = p(s₂) = p(s₃) = p(s₄) = 1/4.

Define X as the number of heads in two throws and Y as the number of tails in two

throws. The joint probability distribution is given below;



x \ y      0       1       2       p(xᵢ)
0          0       0       1/4     1/4
1          0       1/2     0       1/2
2          1/4     0       0       1/4
q(yⱼ)      1/4     1/2     1/4     1

If the joint probability function of X and Y is denoted by p(xᵢ, yⱼ), then p(xᵢ, yⱼ) represents P(X = xᵢ, Y = yⱼ). Clearly each p(xᵢ, yⱼ) ≥ 0 and Σᵢ Σⱼ p(xᵢ, yⱼ) = 1. Let us now give the definition of the joint probability function of a two-dimensional discrete random variable.

1.2.3 DEFINITION OF TWO-DIMENSIONAL DISCRETE RANDOM


VARIABLE

Let (X, Y) be a two-dimensional discrete random variable. Let p(xᵢ, yⱼ) be a real number associated with each (xᵢ, yⱼ), i, j = 1, 2, 3, .... Then p is called the joint probability function of (X, Y) if the following conditions are satisfied:

(i) p(xᵢ, yⱼ) ≥ 0 for all i, j = 1, 2, ...

(ii) Σᵢ Σⱼ p(xᵢ, yⱼ) = 1.



1.2.4 DEFINITION OF TWO-DIMENSIONAL CONTINUOUS RANDOM
VARIABLE

Let (X, Y) be a two-dimensional continuous random variable. Then f(x, y) is called the pdf of (X, Y) if the following conditions are satisfied:

(i) f(x, y) ≥ 0 for all x, y

(ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

Let Us Sum Up
Learners, in this section we have seen the definition of two-dimensional random
variable and its examples.

Check Your Progress


1. If X and Y are two random variables, the covariance between the variables

aX+b and cY+d in terms of COV(X,Y) is:

A. COV(aX+b, cY+d)= COV (X,Y)

B. COV(aX+b, cY+d)= abcd* COV(X,Y)

C. COV(aX+b,cY+d) = ac COV (X,Y)+bd

D. COV(aX+b, cY+d)= acCOV (X,Y)

2. Let X ∈ {0, 1} and Y ∈ {0, 1} be two independent binary random variables. If

P(X = 0) = p and P(Y = 0) = q, then P(X + Y ≥ 1) is equal to:

A. pq+(1-p)(1-q)

B. pq

C. p(1-q)

D.1-pq



SECTION 1.3: MARGINAL PROBABILITY DISTRIBUTION
1.3.1 DEFINITION

If (X, Y) is a two-dimensional discrete random variable with joint probability function p(xᵢ, yⱼ), then the marginal distribution of X is defined by p(xᵢ) = Σⱼ p(xᵢ, yⱼ) and the marginal distribution of Y is defined by q(yⱼ) = Σᵢ p(xᵢ, yⱼ).

1.3.2 DEFINITION

If (X, Y) is a two-dimensional continuous random variable with pdf f, then the marginal distribution of X is defined by g(x) = ∫_{−∞}^{∞} f(x, y) dy and the marginal distribution of Y is defined by h(y) = ∫_{−∞}^{∞} f(x, y) dx.

Note: P(a ≤ X ≤ b) = P(a ≤ X ≤ b, −∞ < Y < ∞)

= ∫_a^b ∫_{−∞}^{∞} f(x, y) dy dx

= ∫_a^b g(x) dx

Similarly, P(c ≤ Y ≤ d) = ∫_c^d h(y) dy.

Let Us Sum Up
Learners, in this section we have seen about definition of marginal probability

distribution.



Check Your Progress
1. Bivariate normal distribution is also named as:

A. Bravais distribution

B. Laplace gauss distribution

C. Gaussian distribution

D. All the above

2. If f(x, y) is a bivariate normal density function, then the surface z = f(x, y) is called:

A. Normal correlation surface

B. Bivariate normal density surface

C. Neither of (a) nor (b)

D. Both (a) and (b)

SECTION 1.4: CONDITIONAL PROBABILITY DISTRIBUTION


1.4.1 DEFINITION
Let (X, Y) be a two-dimensional random variable with joint probability function

p(xᵢ, yⱼ). Let p(xᵢ) and q(yⱼ) be the marginal probability functions of X and Y respectively. Then the conditional distribution of X given Y = yⱼ is defined by

p(xᵢ | yⱼ) = P(X = xᵢ | Y = yⱼ) = p(xᵢ, yⱼ) / q(yⱼ), provided q(yⱼ) > 0.

1.4.2 DEFINITION

Let (X, Y) be a two-dimensional continuous random variable with joint pdf f(x, y). Then the conditional probability density function of X given that Y = y is defined by g(x | y) = f(x, y) / h(y) if h(y) > 0, and the conditional probability density function of Y given that X = x is defined by h(y | x) = f(x, y) / g(x) if g(x) > 0.

Let Us Sum Up
Learner, in this we have seen that definition of conditional probability distribution.

Check Your Progress


1. If E(Y/x) is the conditional expectation of Y given X = x, then E(XY) in terms of

conditional expectation can be expressed as:

A. E(XY)=E(X) E(Y/x)

B. E(XY)=E(Y) E(Y/x)

C. E(XY)=X E(Y/x)

D. E(XY)=E[X E(Y/x)]

2. For any continuous variables X and Y, if a variable Z which is linear

combinations of X and Y follows normal distribution then X and Y jointly follow:

A. Jointly discrete distribution

B. Jointly continuous distribution

C. Bivariate normal distribution

D. Circular normal distribution



SECTION 1.5: INDEPENDENT RANDOM VARIABLE
1.5.1 DEFINITION

Let (X, Y) be a two-dimensional discrete random variable with joint probability function p(xᵢ, yⱼ). Let p(xᵢ) and q(yⱼ) be the marginal probability functions of the random variables X and Y respectively. Then the random variables X and Y are said to be independent if p(xᵢ, yⱼ) = p(xᵢ) · q(yⱼ) for all i, j.

1.5.2 DEFINITION

Let (X, Y) be a two-dimensional continuous random variable with joint pdf f(x, y). Let g(x) and h(y) be the marginal probability density functions of the random variables X and Y respectively. Then the random variables X and Y are said to be independent if f(x, y) = g(x) · h(y).

1.5.3 EXAMPLE

The joint probability distribution of two random variables X, Y is given by

y \ x      1        2        3        4
1          4/36     2/36     5/36     1/36
2          1/36     3/36     1/36     2/36
3          3/36     3/36     1/36     1/36
4          2/36     1/36     1/36     5/36
Find (a) the marginal probability functions of X and Y;

(b) the conditional probability function of X when Y = 1 and that of Y when X = 2.

Solution: The marginal probabilities p(xᵢ) and q(yⱼ) are given by

p(xᵢ) = Σⱼ p(xᵢ, yⱼ) and q(yⱼ) = Σᵢ p(xᵢ, yⱼ). Thus

p(1) = p(1,1) + p(1,2) + p(1,3) + p(1,4) = (4 + 1 + 3 + 2)/36 = 10/36
p(2) = (2 + 3 + 3 + 1)/36 = 9/36
p(3) = (5 + 1 + 1 + 1)/36 = 8/36
p(4) = (1 + 2 + 1 + 5)/36 = 9/36

q(1) = p(1,1) + p(2,1) + p(3,1) + p(4,1) = (4 + 2 + 5 + 1)/36 = 12/36
q(2) = (1 + 3 + 1 + 2)/36 = 7/36
q(3) = (3 + 3 + 1 + 1)/36 = 8/36
q(4) = (2 + 1 + 1 + 5)/36 = 9/36

2
 9 fory  1

 1 fory  2
3

1
q ( x / y1 )   fory  3
6
1
 fory  4
9
0otherwise

(b) The conditional probability function of X given Y = y is

p(x | y) = p(x, y) / q(y), provided q(y) > 0.

For Y = 1:

p(X = 1 | Y = 1) = p(1,1)/q(1) = (4/36)/(12/36) = 1/3

p(X = 2 | Y = 1) = p(2,1)/q(1) = (2/36)/(12/36) = 1/6

p(X = 3 | Y = 1) = p(3,1)/q(1) = (5/36)/(12/36) = 5/12

p(X = 4 | Y = 1) = p(4,1)/q(1) = (1/36)/(12/36) = 1/12

Hence p(x | y = 1) = 1/3 for x = 1, 1/6 for x = 2, 5/12 for x = 3, 1/12 for x = 4, and 0 otherwise.

Similarly, the conditional probability function of Y given X = 2 is q(y | x = 2) = p(2, y) / p(2), which gives

q(y | x = 2) = 2/9 for y = 1, 1/3 for y = 2, 1/3 for y = 3, 1/9 for y = 4, and 0 otherwise.
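The marginal and conditional probabilities above can also be recovered mechanically from the joint table. The Python sketch below is an added illustration (not part of the original solution); it reproduces p(xᵢ), q(yⱼ) and the conditional distribution of X given Y = 1.

```python
from fractions import Fraction

# Joint probabilities p(x, y) times 36, rows y = 1..4, columns x = 1..4
table = {1: [4, 2, 5, 1],
         2: [1, 3, 1, 2],
         3: [3, 3, 1, 1],
         4: [2, 1, 1, 5]}
p = {(x, y): Fraction(table[y][x - 1], 36) for y in table for x in range(1, 5)}

# Marginal of X: sum over y; marginal of Y: sum over x
p_x = {x: sum(p[(x, y)] for y in range(1, 5)) for x in range(1, 5)}
q_y = {y: sum(p[(x, y)] for x in range(1, 5)) for y in range(1, 5)}
print(p_x)   # 10/36, 9/36, 8/36, 9/36 (printed in lowest terms)
print(q_y)   # 12/36, 7/36, 8/36, 9/36 (printed in lowest terms)

# Conditional distribution of X given Y = 1: p(x | 1) = p(x, 1) / q(1)
cond_x_given_1 = {x: p[(x, 1)] / q_y[1] for x in range(1, 5)}
print(cond_x_given_1)   # 1/3, 1/6, 5/12, 1/12
```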

Let Us Sum Up
Learners, in this section we have seen definition of independent random variable

and examples.

Check Your Progress


1. For independent random variables X and Y which of the following is not
necessarily true?

A. E[X+Y]=E[X]+E[Y]

B. Var(X+Y)=Var(X)+Var(Y)

C. Cov(X,Y)=0

D. E[X.Y]=E[X].E[Y]

2. If X and Y are independent random variables, what is the correlation coefficient between X and Y?

A. 1

B. -1

C. 0

D. It depends on the variances of X and Y.



SECTION 1.6: MATHEMATICAL EXPECTATIONS

1.6.1 EXPECTATION OR MEAN VALUE

An average of a probability distribution is usually called the expectation or the expected value of the random variable. Before giving the definition, let us consider an example. Suppose we get, from someone, one rupee per dot that appears when a perfect die is tossed. If this experiment is repeated 6,000 times (a large number of times), we would obtain approximately 1,000 1's, 1,000 2's, and so on. Therefore the total amount of money received would be approximately Rs. (1·1000 + 2·1000 + ... + 6·1000), and the average that we would expect to win per toss would be

Rs. [1·(1000/6000) + 2·(1000/6000) + ... + 6·(1000/6000)]

= Rs. [1·(1/6) + 2·(1/6) + ... + 6·(1/6)]

= Rs. 3.5,

which is the sum of the amounts associated with the outcomes multiplied by the corresponding probabilities. Accordingly, where x designates the outcome of a toss, we say the average or mean value of x is 3.5. Here we have to note that the expected value need not be a possible outcome of the experiment.
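The relative-frequency argument can be checked with a short simulation; the Python sketch below is an added illustration (the 6,000-toss figure follows the text, and the random seed is arbitrary).

```python
import random

random.seed(1)

# Exact expectation: sum of outcome times probability for a fair die
exact = sum(x * (1 / 6) for x in range(1, 7))

# Simulated average winnings per toss over 6,000 tosses
n = 6000
simulated = sum(random.randint(1, 6) for _ in range(n)) / n

print(exact)       # 3.5
print(simulated)   # close to 3.5 for large n
```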

1.6.2 DEFINITION

Let X be a discrete random variable with possible values x₁, x₂, ..., xₙ, .... Let p(xᵢ) = P(X = xᵢ), i = 1, 2, .... Then the expected value of X, denoted by E(X), is defined as

E(X) = Σᵢ xᵢ p(xᵢ),

provided the series converges absolutely. If X is a continuous random variable with pdf f(x), then

E(X) = ∫_{−∞}^{∞} x f(x) dx,

provided the integral is absolutely convergent.

Note 1: Let us note the similarity between the expected value (when X assumes a finite number of values) and the notion of the average of a set of numbers x₁, x₂, ..., xₙ. We usually denote the average of this set of numbers by x̄ = Σᵢ xᵢ / n. Suppose a sample of n observations produces n₁ observations each equal to x₁, n₂ observations each equal to x₂, and so on. Then

x̄ = (n₁x₁ + n₂x₂ + ... + nₖxₖ) / (n₁ + n₂ + ... + nₖ)

  = Σᵢ xᵢ (nᵢ / n)

  = Σᵢ xᵢ fᵢ,   letting fᵢ = nᵢ / n.

If n is sufficiently large, fᵢ ≈ p(xᵢ), so that

x̄ ≈ Σᵢ xᵢ p(xᵢ) = E(X).


Note 2: We should also observe the analogy between the expected value and the concept of 'centre of mass' in mechanics. If a unit mass is distributed along a line at the discrete points x₁, x₂, ..., xₙ, ... and p(xᵢ) is the mass at the point xᵢ, then Σᵢ xᵢ p(xᵢ) represents the centre of mass about the origin. Similarly, if a unit mass is distributed continuously over a line and f(x) represents the mass density at x, then ∫ x f(x) dx may again be interpreted as the centre of mass. In this sense E(X) represents the centre of the probability distribution. E(X) is also called a measure of central tendency and has the same units as X.


Note 3: Expectation need not exist for all the probability distributions.

1.6.3 EXAMPLE

Suppose a random variable X takes the values 0!, 1!, 2!, ... with probabilities

P(X = xᵢ) = e⁻¹ / i!,  where xᵢ = i!, i = 0, 1, 2, ....

Then E(X) = 0!·e⁻¹/0! + 1!·e⁻¹/1! + 2!·e⁻¹/2! + ...

= e⁻¹ [1 + 1 + 1 + ...],

which is divergent. Hence E(X) does not exist.

1.6.4 EXAMPLE

The Cauchy distribution is defined by the pdf f(x) = 1/[π(1 + x²)], −∞ < x < ∞.

Then E(X) = ∫_{−∞}^{∞} x f(x) dx = (1/π) ∫_{−∞}^{∞} x/(1 + x²) dx = (1/2π) [log(1 + x²)]_{−∞}^{∞},

which does not exist. Hence E(X) does not exist for the Cauchy distribution.

Let Us Sum Up
Learners, in this section we have seen definition of mathematical
expectations and examples.
Check Your Progress
1. If X and Y are independent random variables, what is the expected

value of their product E[XY]?

A. E[X] E[Y]

B. E[X+Y]

C. E[X]+E[Y]

D. Cov(X,Y)

2. The expected value of a discrete random variable X is known as?

A. The median

B. The mode

C. The mean

D. The variance

SECTION 1.7: FUNCTION OF A RANDOM VARIABLE


If X is a random variable and Y is a function of X, then Y is also a random variable. If the probability function of X is known, the probability function of Y can be determined and hence E(Y) can be evaluated. E(Y) can also be evaluated without knowing the probability distribution of Y. Let us now see these two approaches.

1.7.1 DEFINITION

Let X be a random variable and Y = H(X). If Y is a discrete random variable with possible values y₁, y₂, ... and probabilities q(y₁), q(y₂), ..., the expectation of Y is defined as

E(Y) = Σᵢ yᵢ q(yᵢ),

provided the series on the R.H.S. is absolutely convergent. If Y is a continuous random variable with pdf q(y), then we define

E(Y) = ∫_{−∞}^{∞} y q(y) dy,

if it exists.

The following theorem gives the method on finding E(y) without the knowledge of the

probability distribution of Y.

1.7.2 THEOREM

Let X be a random variable and Y = H(X).

(i) If X is discrete with probability function p(x), then

E(Y) = Σᵢ H(xᵢ) p(xᵢ).

(ii) If X is a continuous random variable with pdf f(x), then

E(Y) = ∫_{−∞}^{∞} H(x) f(x) dx.




1.7.3 EXAMPLE

A random variable X has probability function as follows.


Values of X -1 0 1
Probability 0.2 0.3 0.5
Evaluate (i) (3 X  1) (ii) E ( X 2 )

(i) E (3 X  1)   (3x  1) p( x)

= (-3+1) 0.2 +(0+1)0.3+(3+1)0.5

=1.9

Note: It is easily seen that E(3X+1) =3E(X)+1

(ii ) E ( X 2 )   x 2 p( x)

=1(0.2)+0(0.3)+1(0.5)

=0.7
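The same computation can be carried out directly from the probability function of X using the theorem of Section 1.7.2; the following Python lines are an added illustration, not part of the original example.

```python
# Probability function of X from the example
p = {-1: 0.2, 0: 0.3, 1: 0.5}

# E[H(X)] = sum of H(x) p(x), without finding the distribution of Y = H(X)
def expect(H):
    return sum(H(x) * px for x, px in p.items())

E_X = expect(lambda x: x)                    # 0.3
E_3X_plus_1 = expect(lambda x: 3 * x + 1)    # 1.9
E_X_squared = expect(lambda x: x ** 2)       # 0.7

# Both equal 1.9 (up to floating point), illustrating E(3X + 1) = 3E(X) + 1
print(E_3X_plus_1, 3 * E_X + 1)
print(E_X_squared)
```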

So far, we have seen the expectation of a random variable and the expectation of a

function of a random variable of one dimension. These concepts can be extended for

higher dimensional random variables. Let us now give the definition in the case of a

two-dimensional random variable. If (X, Y) is a two dimensional random variable,

Z=H(X,Y) is a one dimensional random variable. Consider the example of throwing a

perfect die twice.

Define X = number that appears in the first throw.

Y = number that appears in the second throw.

Then (X, Y) is a two-dimensional random variable. Now define Z = sum of the numbers in the two throws of the die. Clearly Z is a function of (X, Y), i.e., Z = H(X, Y). Let



us now give the definition for finding the expectation of a two-dimensional random

variable.

1.7.4 DEFINITION

Let (X, Y) be a two-dimensional random variable and Z = H(X, Y) be a real-valued function of (X, Y). Then Z is a one-dimensional random variable and E(Z) is defined as follows.

(i) If Z is a discrete random variable with possible values z₁, z₂, ... and p(zᵢ) = P(Z = zᵢ), then E(Z) = Σᵢ zᵢ p(zᵢ), if the series on the R.H.S. is absolutely convergent.

(ii) If Z is a continuous random variable with pdf f(z), then E(Z) = ∫_{−∞}^{∞} z f(z) dz, if the integral is absolutely convergent.

We shall now state the theorem (without proof), with the help of which we can

determine E(Z), without knowing the probability distribution of Z.

1.7.5 THEOREM

Let (X, Y) be a two-dimensional random variable and Z = H(X, Y). If (X, Y) is a discrete random variable with probability function p(xᵢ, yⱼ), then E(Z) = Σⱼ Σᵢ H(xᵢ, yⱼ) p(xᵢ, yⱼ), if it exists.

If (X, Y) is a continuous random variable with joint pdf f(x, y), then

E(Z) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} H(x, y) f(x, y) dx dy, if it exists.



For instance, if Z is the sum of the numbers obtained in n throws of a fair die, then E(Z) = 7/2 + 7/2 + ... (n times) = 7n/2.

1.7.6 EXAMPLE

Three urns contain respectively 3 green and 2 white balls, 5 green and 6 white

balls and 2 green and 4 white balls. One ball is drawn from each urn. Find the

expected number of white balls drawn out.

Solution: Let X be a random variable denoting the number of white balls drawn. Then

the possible values of x are 0,1,2,3.

        G   W
U1      3   2
U2      5   6
U3      2   4

P(X = 0) = (3/5)(5/11)(2/6) = 30/330

P(X = 1) = (2/5)(5/11)(2/6) + (3/5)(6/11)(2/6) + (3/5)(5/11)(4/6)

         = (20 + 36 + 60)/330 = 116/330

P(X = 2) = (24 + 40 + 72)/330 = 136/330

P(X = 3) = (2/5)(6/11)(4/6) = 48/330

E(X) = 0·(30/330) + 1·(116/330) + 2·(136/330) + 3·(48/330)

     = (116 + 272 + 144)/330 = 532/330 = 266/165.
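As a check, the same expectation can be obtained by enumerating all possible draws from the three urns; the Python sketch below is an added illustration using the urn contents from the problem statement.

```python
from fractions import Fraction
from itertools import product

# (green, white) counts in the three urns
urns = [(3, 2), (5, 6), (2, 4)]

# Enumerate every combination of colours drawn, one ball per urn (1 = white, 0 = green)
E = Fraction(0)
for colours in product((0, 1), repeat=3):
    prob = Fraction(1)
    for (g, w), c in zip(urns, colours):
        prob *= Fraction(w if c else g, g + w)
    E += sum(colours) * prob

print(E)   # 266/165, about 1.61 white balls on average
```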



1.7.7 EXAMPLE

Show that the expected number of failures preceding the first success in a series of Bernoulli trials with a constant probability of success p is (1 − p)/p.
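A brief derivation (added here; the original states the result without proof): let X be the number of failures preceding the first success, so that P(X = k) = q^k p with q = 1 − p. Then

E(X) = \sum_{k=0}^{\infty} k\,q^{k}p = pq\sum_{k=1}^{\infty} k\,q^{k-1} = \frac{pq}{(1-q)^{2}} = \frac{q}{p} = \frac{1-p}{p}.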

Let Us Sum Up
Learners, this section discusses the definition of function of a random variable and
given example.

Check Your Progress


1. If X is a random variable and Y = g(X) where g is a function, what is

the expected value of Y?

A. E[g(X)]

B. G(E[x])

C. E[x]

D. G(E[g(X)])

2. If X is a random variable and Y = aX + b, where a and b are constants, what is the variance of Y?

A. a2 Var(X)

B. Var(X)

C. aVar(X)+b

D. Var(x)+b2

1.8 Unit Summary


This unit covered the definitions of random variable, discrete and continuous random variables, distribution functions and density functions, and mathematical expectation and its properties, with simple problems.



1.9 Glossary
KEYWORDS MEANING

E(X) Expectation of X
Var(X) Variance of X
Cov(X,Y) Covariance of X and Y
E(e^{tX}) Moment generating function of X

1.10 Self-Assessment Questions


Short Answers: (5 Marks)

1. Explain random variables and its types.

2. Explain the properties of distribution function.

3. Define marginal probability density function.

4. Write properties of conditional probability distribution.

5. Define function of a random variable.

Long Answers: (8 Marks)

1. Describe the two-dimensional random variable with an example.

2. The joint probability distribution of two random variables X, Y is given by

y \ x      1        2        3        4
1          4/36     2/36     5/36     1/36
2          1/36     3/36     1/36     2/36
3          3/36     3/36     1/36     1/36
4          2/36     1/36     1/36     5/36



Find (a) The marginal probability functions of X and Y

(b) The conditional probability density functions of X when Y = 1 and that of Y when X = 2.

3. A random variable X has probability function as follows.

Values of X -1 0 1

Probability 0.2 0.3 0.5

Evaluate (i) (3 X  1) (ii) E ( X 2 )

4. Three urns contain respectively 3 green and 2 white balls, 5 green and 6 white balls

and 2 green and 4 white balls. One ball is drawn from each urn. Find the expected

number of white balls drawn out.

5. Explain independence of two random variables with an example.

1.11 EXERCISES
1. A bag contains 40 blue marbles and 60 red marbles. Choose 10 marbles (without

replacement) at random. Let X be the number of blue marbles and Y be the number

of red marbles. Find the joint PMF of X and Y.

2. In a large consignment of oranges, a random sample of 64 oranges revealed that 14

oranges were bad. Is it reasonable to assume that 20% of the oranges were bad?

3. A die is thrown 9000 times and throw of 3 or 4 observed 3240 times. Show that the die

cannot be regarded as unbiased one.

4. The following table gives the number of aircraft accidents that occurred during the various

days of the week. Test whether the accidents are uniformly distributed over the week.

Days : Mon Tue Wed Thur Fri Sat

No. of Accidents: 14 18 12 11 15 14



5. Suppose that the number of customers visiting a fast-food restaurant in a given day is N, where N follows a Poisson distribution. Assume that each customer purchases a drink with probability p, independently of the other customers and independently of the value of N. Let Y be the number of customers who purchase drinks and X the number of customers who do not, so that X + Y = N.

a. Find the marginal PMFs of X and Y.

b. Find the joint PMF of X and Y.

6. I toss a coin twice. Let X be the number of observed heads. Find the CDF of X.

7. Let X be a discrete random variable with range R_X = {1, 2, 3, …}. Suppose the PMF of X is given by

P_X(k) = 1/2^k for k = 1, 2, 3, …

Find and plot the CDF of X, F_X(x).

8. Let X be a discrete random variable with the following PMF:

P_X(x) = 0.1 for x = 0.2
         0.2 for x = 0.4
         0.2 for x = 0.5
         0.3 for x = 0.8
         0.2 for x = 1
         0   otherwise

Find R_X, the range of the random variable X.



1.12 Answers for Check Your Progress
Modules S. No. Answers
1. C. E(XY) = E(X)E(Y)
Module 1
2. B. Moment generating function
1. D. Cov(aX+B,cY+d) = acCov(X,Y)
Module 2
2. D. 1-pq
1. D. All the above
Module 3
2. D. Both a and b
1. D. E(XY)=E[XE(Y/x)]
Module 4
2. C. Bivariate normal distribution
1. C. COV(X,Y)=0
Module 5
2. C. 0
1. A. E(X)E(Y)
Module 6
2. C. The mean
1. A. E[g(x)]
Module 7
2. A. a2 var(X)

1.13 References and Suggested Readings

1. Gupta S. P. (2001), Statistical Methods, Sultan Chand and Sons, New Delhi.
2. Gupta. S. C. and Kapoor. V. K. Fundamentals of Applied Statistics, Sultan Chand and Sons,
New Delhi
3. Pillai R. S. N. And Bagavathi. V. (2005), Statistics, S. Chand and Company Ltd., New Delhi.
4. Sancheti D. C. And Kapoor. V. K (2005), Statistics (7th Edition), Sultan Chand & Sons, New
Delhi.
5. Arora P. N, Comprehensive Statistical Methods, Sultan Chand & Sons, New Delhi.
6. Murthy M. N (1978), Sampling Theory and Methods, Statistical Publishing Society, Kolkata.
7. Pillai R. S. N. And Bagavathi. V. (1987), Practical Statistics, S. Chand & Company Ltd., New
Delhi.
8. Agarwal B. L, Basic Statistics, Wiley Eastern Ltd., Publishers, New Delhi.
9. Gupta C. B (1978), An Introduction to Statistical Methods, Vikas Publishing House, New
Delhi.
10. Snedecor G.W and Cochran W.G., Statistical Methods, Oxford Press and IBH.



UNIT 2 - DISCRETE PROBABILITY DISTRIBUTION
Discrete Probability Distribution - Binomial and Poisson Distributions –
Mean and Variance of Distributions – Recurrence formula – Fitting of
Binomial and Poisson Distributions – Simple Problems.

Unit objectives

1. To impart statistical concepts of discrete probability distribution.

2. To introduce the definition of Binomial distribution and its mean and variance.

3. To introduce the concepts of recurrence relation of Binomial distribution.

4. To introduce the definition of Poisson distribution and the concepts of recurrence

relation of Poisson distribution.

SECTION 2.1: DISCRETE PROBABILITY DISTRIBUTION

2.1.1: BINOMIAL DISTRIBUTION

In probability theory and statistics, the binomial distribution with

parameters n and p is the discrete probability distribution of the number of successes in

a sequence of n independent experiments, each asking a yes–no question, and each

with its own Boolean-valued outcome: success (with probability p) or failure (with probability q = 1 − p). A single success/failure experiment is also called a Bernoulli trial or

Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a

single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial



distribution is the basis for the popular binomial test of statistical significance. The

binomial distribution is frequently used to model the number of successes in a sample of

size n drawn with replacement from a population of size N. If the sampling is carried out

without replacement, the draws are not independent and so the resulting distribution is

a hypergeometric distribution, not a binomial one. However, for N much larger than n,

the binomial distribution remains a good approximation, and is widely used. We are

interested in the probability of getting x successes in n trials, or, in other words, x

successes and (n − x) failures in n attempts. We refer to a series of trials as Bernoulli trials if the following assumptions hold:

1. There are only two possible outcomes for each trial (arbitrarily called “success” and

“failure”, without inferring that a success is necessarily desirable).

2. The probability of success is the same for each trial.

3. The outcomes from different trials are independent.

2.1.2 – DEFINITION OF BINOMIAL DISTRIBUTION

Let X be the random variable that equals the number of successes in n trials. To obtain

probabilities concerning X, we proceed as follows: If p and (1 – p) are the probabilities

of success and failure on any one trial, then the probability of getting x successes and

(n−x) failures, in some specific order, is p^x (1 − p)^(n−x). Clearly, in this product of p's and (1−p)'s there is one factor p for each success and one factor 1−p for each failure. The x factors p and n−x factors 1−p are all multiplied together by virtue of the generalized multiplication rule for more than two independent events. Since this probability applies to any point of the sample space that represents x successes and (n−x) failures (in any specific order), we have only to count how many points of this kind there are, and then multiply p^x (1 − p)^(n−x) by this. The number of ways in which we can select the x trials on which there is to be a success is C(n, x), the number of combinations of x objects selected from a set of n objects. Multiplying, we arrive at the following result:

b(x; n, p) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n.

This probability distribution is called the binomial distribution because for x= 0, 1,2, . . . ,

and n, the values of the probabilities are the successive terms of the binomial expansion

of [p+(1− p)]n. For the same reason, the combinatorial quantities C(n, x) are referred to

as binomial coefficients.

2.1.3 EXAMPLE

It has been claimed that in 60% of all solar-heat installations the utility bill is

reduced by at least one-third. Accordingly, what are the probabilities that the utility

bill will be reduced by at least one-third in

(a) four of five installations. (b) at least four of five installations?

Solution: (a) Substituting x = 4, n = 5, and p = 0.60 into the formula for the binomial

distribution, we get

b(4; 5, 0.60) = C(5, 4)(0.60)^4 (1 − 0.60)^(5−4)

= 0.259;

(b) Substituting x = 5, n = 5, and p = 0.60 into the formula for the binomial distribution, we get

b(5; 5, 0.60) = C(5, 5)(0.60)^5 (1 − 0.60)^(5−5)

= 0.078

and the answer is b(4; 5, 0.60) + b(5; 5, 0.60) = 0.259 + 0.078 = 0.337.
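The same probabilities can be computed directly from the binomial formula; the short Python check below is an added illustration (it assumes nothing beyond the formula itself).

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)"""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p4 = binom_pmf(4, 5, 0.60)   # about 0.259
p5 = binom_pmf(5, 5, 0.60)   # about 0.078
print(round(p4, 3), round(p5, 3), round(p4 + p5, 3))   # 0.259 0.078 0.337
```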

2.1.4 EXAMPLE
Let X have the binomial distribution with probability distribution

b(x | n, p) = C(n, x) p^x (1 − p)^(n−x) for x = 0, 1, ..., n.

Show that

(a) M(t) = (1 − p + p e^t)^n for all t

(b) E(X) = np and Var(X) = np(1 − p)

(a) By definition of the moment generating function,

M(t) = Σ_{x=0}^{n} e^{tx} C(n, x) p^x (1 − p)^(n−x)

     = Σ_{x=0}^{n} C(n, x) (p e^t)^x (1 − p)^(n−x)

     = (p e^t + 1 − p)^n for all t,

where we have used the binomial formula (a + b)^n = Σ_{x=0}^{n} C(n, x) a^x b^(n−x).

(b) Differentiating M(t), we find

M′(t) = n p e^t (p e^t + 1 − p)^(n−1)

M″(t) = n(n − 1) p² e^{2t} (p e^t + 1 − p)^(n−2) + n p e^t (p e^t + 1 − p)^(n−1)

Evaluating these derivatives at t = 0, we obtain the moments

E(X) = np

E(X²) = n(n − 1)p² + np

Also, the variance is

Var(X) = E(X²) − [E(X)]² = np(1 − p).



2.1.5 - ADDITIVE PROPERTY

If X ~ B(n₁, p) and Y ~ B(n₂, p) and they are independent, then their sum X + Y also follows B(n₁ + n₂, p).

Since X ~ B(n₁, p), M_X(t) = (q + p e^t)^{n₁},

and Y ~ B(n₂, p), M_Y(t) = (q + p e^t)^{n₂}.

Since X and Y are independent,

M_{X+Y}(t) = M_X(t) · M_Y(t) = (q + p e^t)^{n₁} × (q + p e^t)^{n₂} = (q + p e^t)^{n₁+n₂},

which is the mgf of B(n₁ + n₂, p). Therefore,

X + Y ~ B(n₁ + n₂, p).

2.1.6 - RECURRENCE RELATION FOR BINOMIAL DISTRIBUTION

If X ~ B(n, p), then

μ_{r+1} = pq [ nr μ_{r−1} + dμ_r/dp ].

Proof: We have

μ_r = E[X − E(X)]^r = E(X − np)^r = Σ_{x=0}^{n} (x − np)^r C(n, x) p^x q^{n−x}.

Differentiating with respect to p (remembering that q = 1 − p),

dμ_r/dp = Σ_{x=0}^{n} C(n, x) [ −nr (x − np)^{r−1} p^x q^{n−x} + (x − np)^r x p^{x−1} q^{n−x} − (x − np)^r p^x (n − x) q^{n−x−1} ]

= −nr Σ_{x=0}^{n} (x − np)^{r−1} C(n, x) p^x q^{n−x} + Σ_{x=0}^{n} (x − np)^r C(n, x) p^x q^{n−x} [ x/p − (n − x)/q ]

= −nr μ_{r−1} + (1/pq) Σ_{x=0}^{n} (x − np)^{r+1} C(n, x) p^x q^{n−x},

since x/p − (n − x)/q = (xq − np + xp)/(pq) = (x − np)/(pq).

That is, dμ_r/dp = −nr μ_{r−1} + (1/pq) μ_{r+1}.

Therefore,

μ_{r+1} = pq [ nr μ_{r−1} + dμ_r/dp ].

Using the information μ₀ = 1 and μ₁ = 0, we can determine the values of μ₂, μ₃, μ₄, etc. by this relation.

Recurrence relation for Binomial Distribution

b(x + 1; n, p) = [(n − x)/(x + 1)] (p/q) b(x; n, p)

Proof:

We have

b(x; n, p) = C(n, x) p^x q^{n−x}

b(x + 1; n, p) = C(n, x + 1) p^{x+1} q^{n−(x+1)}

b(x + 1; n, p) / b(x; n, p) = [C(n, x + 1) p^{x+1} q^{n−x−1}] / [C(n, x) p^x q^{n−x}]

= [n! / ((x + 1)!(n − x − 1)!)] × [x!(n − x)! / n!] × (p/q)

= [(n − x)/(x + 1)] (p/q)

Therefore,

b(x + 1; n, p) = [(n − x)/(x + 1)] (p/q) b(x; n, p).



2.1.7 MEAN AND VARIANCE

Mean:

E(X) = Σ_x x f(x) = Σ_{x=0}^{n} x C(n, x) p^x q^{n−x} = Σ_{x=0}^{n} x [n!/(x!(n − x)!)] p^x q^{n−x}

= np Σ_{x=1}^{n} [(n − 1)!/((x − 1)!(n − x)!)] p^{x−1} q^{n−x}

= np (p + q)^{n−1} = np, since p + q = 1.

E(X²) = Σ_{x=0}^{n} x² C(n, x) p^x q^{n−x}

= Σ_{x=0}^{n} [x(x − 1) + x] [n!/(x!(n − x)!)] p^x q^{n−x}

= n(n − 1)p² Σ_{x=2}^{n} [(n − 2)!/((x − 2)!(n − x)!)] p^{x−2} q^{n−x} + E(X)

= n(n − 1)p² (p + q)^{n−2} + np

= n(n − 1)p² + np.

Therefore the variance is

V(X) = E(X²) − [E(X)]²

= [n(n − 1)p² + np] − (np)²

= n²p² − np² + np − n²p²

= np − np² = np(1 − p) = npq.

E(X³) = Σ_{x=0}^{n} x³ C(n, x) p^x q^{n−x}

= Σ_{x=0}^{n} [x(x − 1)(x − 2) + 3x² − 2x] [n!/(x!(n − x)!)] p^x q^{n−x}

= n(n − 1)(n − 2)p³ Σ_{x=3}^{n} [(n − 3)!/((x − 3)!(n − x)!)] p^{x−3} q^{n−x} + 3E(X²) − 2E(X)

= n(n − 1)(n − 2)p³ + 3[n(n − 1)p² + np] − 2np

= n(n − 1)(n − 2)p³ + 3n(n − 1)p² + np.

Similarly,

E(X⁴) = n(n − 1)(n − 2)(n − 3)p⁴ + 6n(n − 1)(n − 2)p³ + 7n(n − 1)p² + np.

Beta and Gamma coefficients:

β₁ = μ₃² / μ₂³ = (q − p)² / (npq)

γ₁ = √β₁ = (q − p) / √(npq)

A binomial distribution is positively skewed, symmetric or negatively skewed according as γ₁ > 0, = 0 or < 0, i.e. according as q > p, q = p or q < p.

β₂ = μ₄ / μ₂² = 3 + (1 − 6pq)/(npq)

A binomial distribution is leptokurtic, mesokurtic or platykurtic according as β₂ > 3, = 3 or < 3, i.e. according as pq < 1/6, pq = 1/6 or pq > 1/6.

The mgf:

M_X(t) = E(e^{tX}) = Σ_x e^{tx} P(X = x) = Σ_{x=0}^{n} e^{tx} C(n, x) p^x q^{n−x} = Σ_{x=0}^{n} C(n, x) (p e^t)^x q^{n−x} = (q + p e^t)^n.

2.1.8 FITTING OF BINOMIAL DISTRIBUTION

A set of three similar coins are tossed 100 times with the following results

Number of heads 0 1 2 3
Frequency 36 40 22 2

Fit a binomial distribution and estimate the expected frequencies.

Solution:

(i) Mean: x̄ = Σfx / Σf = 90/100 = 0.9

(ii) p = x̄ / n = 0.9/3 = 0.3

(iii) q = 1 − p = 1 − 0.3 = 0.7

(iv) P(x) = C(n, x) p^x q^{n−x} = C(3, x)(0.3)^x (0.7)^{3−x}

(v) P(0) = C(3, 0)(0.3)^0 (0.7)^3 = (0.7)³ = 0.343

(vi) F(0) = N·P(0) = 100 × 0.343 = 34.3

(vii) Using the recurrence F(x + 1) = [(n − x)/(x + 1)] (p/q) F(x):

F(1) = [(3 − 0)/(0 + 1)] × (0.3/0.7) × 34.3 = 44.1

F(2) = [(3 − 1)/(1 + 1)] × (0.3/0.7) × 44.1 = 18.9

F(3) = [(3 − 2)/(2 + 1)] × (0.3/0.7) × 18.9 = 2.7

(1) The fitted binomial distribution is P(X = x) = C(3, x)(0.3)^x (0.7)^{3−x}, x = 0, 1, 2, 3.

(2) The expected frequencies are:

x                         0     1     2     3     Total
Observed frequency (Oi)   36    40    22    2     100
Expected frequency (Ei)   34    44    19    3     100
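The whole fitting procedure can be reproduced programmatically using the recurrence F(x + 1) = [(n − x)/(x + 1)](p/q)·F(x), starting from F(0) = N·q^n. The Python sketch below is an added illustration mirroring the hand computation; the function name fit_binomial is a hypothetical helper.

```python
def fit_binomial(freq, n):
    """Fit a binomial distribution to observed frequencies freq[x], x = 0..n,
    and return (p_hat, expected frequencies) via the recurrence relation."""
    N = sum(freq)
    mean = sum(x * f for x, f in enumerate(freq)) / N
    p = mean / n
    q = 1 - p
    expected = [N * q ** n]                  # F(0) = N * P(0)
    for x in range(n):                       # F(x+1) = ((n - x)/(x + 1)) (p/q) F(x)
        expected.append(expected[-1] * (n - x) / (x + 1) * p / q)
    return p, expected

p, exp_freq = fit_binomial([36, 40, 22, 2], n=3)
print(round(p, 2))                          # 0.3
print([round(e, 1) for e in exp_freq])      # [34.3, 44.1, 18.9, 2.7]
```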

Let Us Sum Up
Learners, in this section we have seen that definition of Binomial distribution,

additive property, recurrence relation of Binomial distribution, mean and variance and

fitting of Binomial distribution.

Check your Progress


1. The characteristic function of the binomial distribution for the binomial variate
X follows b(n,p) is:
A. (q+peit)
B. (p+qeit)n
C. (p+qeit)n
D. (q+peit)n

2. If for a binomial distribution, b(n,p) , n=4 and also P(X=2)=3P(X=3), the value of
p is:
A. 9/11
B. 1
C. 1/3
D. None of the above.



SECTION 2.2: POISSON DISTRIBUTION

2.2.1 - DEFINITION OF POISSON DISTRIBUTION

The Poisson distribution often serves as a model for counts which do not have a natural upper bound. The Poisson distribution with mean λ (lambda) has probabilities given by

P(X = x) = e^{−λ} λ^x / x!,  x = 0, 1, 2, ....

2.2.2 - MEAN AND VARIANCE

Let X have the Poisson distribution with probability function

f(x) = e^{−λ} λ^x / x!  for x = 0, 1, 2, ....

Show that

(a) M(t) = e^{λ(e^t − 1)} for all t

(b) E(X) = λ and Var(X) = λ,

i.e. the mean and variance of the Poisson distribution are equal.

Solution:

(a) By definition of the moment generating function,

M(t) = Σ_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x! = e^{−λ} Σ_{x=0}^{∞} (λe^t)^x / x! = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)} for −∞ < t < ∞,

where we have used the series e^y = Σ_{k=0}^{∞} y^k / k!.

(b) Differentiating M(t), we find

M′(t) = λ e^t e^{λ(e^t − 1)}

M″(t) = λ e^t e^{λ(e^t − 1)} + λ² e^{2t} e^{λ(e^t − 1)}

Evaluating these derivatives at t = 0, we obtain the moments

E(X) = λ

E(X²) = λ + λ²

Also, the variance is

Var(X) = E(X²) − [E(X)]² = λ.

2.2.3 - ADDITIVE PROPERTY

Additive property of Poisson distribution: Let 𝑋1 and 𝑋2 be two independent Poisson random

variables with parameters 𝜆1 and 𝜆2 respectively. Then 𝑋 = 𝑋1 + 𝑋2 follows Poisson distribution

with parameter 𝜆1 + 𝜆2 .

Proof:

Since X₁ ~ P(λ₁), M_{X₁}(t) = e^{λ₁(e^t − 1)},

and X₂ ~ P(λ₂), M_{X₂}(t) = e^{λ₂(e^t − 1)}.

Since X₁ and X₂ are independent,

M_X(t) = M_{X₁+X₂}(t) = M_{X₁}(t) · M_{X₂}(t) = e^{(λ₁+λ₂)(e^t − 1)},

which is the mgf of a Poisson distribution with parameter λ₁ + λ₂. Thus

X = X₁ + X₂ ~ P(λ₁ + λ₂).

2.2.4 - FITTING OF POISSON DISTRIBUTION

The following mistakes per page were observed in a book

Number of Mistakes (per pages) 0 1 2 3 4


Number of pages 211 90 19 5 0

Fit a Poisson distribution and estimate the expected frequencies.



Solution:

x f fx
0 211 0
1 90 90
2 19 38
3 5 15
4 0 0
Total 325 143

(i) Mean: x̄ = Σfx / Σf = 143/325 = 0.44

(ii) λ = x̄ = 0.44

(iii) P(X = x) = e^{−λ} λ^x / x! = e^{−0.44} (0.44)^x / x!

(iv) P(0) = e^{−0.44} (0.44)^0 / 0! = e^{−0.44} = 0.6440 (from the Poisson table)

(v) F(0) = N·P(0) = 325 × 0.6440 = 209.3

(vi) Using the recurrence F(x + 1) = [λ/(x + 1)] F(x):

F(1) = (0.44/1) × 209.3 = 92.1

F(2) = (0.44/2) × 92.1 = 20.3

F(3) = (0.44/3) × 20.3 = 2.97

F(4) = (0.44/4) × 2.97 = 0.33

(1) The fitted Poisson distribution is P(X = x) = e^{−0.44} (0.44)^x / x!, x = 0, 1, 2, ....



(2) Expected frequencies are given below:

x 0 1 2 3 4 Total
Observed Frequencies (Oi) 211 90 19 5 0 325
Expected Frequencies (Ei) 210 92 20 3 0 325
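An equivalent computation in Python, using λ̂ = x̄ and the recurrence F(x + 1) = [λ/(x + 1)]·F(x); this is an added sketch, not part of the original solution, and fit_poisson is a hypothetical helper name.

```python
from math import exp

def fit_poisson(freq):
    """Fit a Poisson distribution to observed frequencies freq[x], x = 0, 1, 2, ...
    and return (lambda_hat, expected frequencies)."""
    N = sum(freq)
    lam = sum(x * f for x, f in enumerate(freq)) / N   # lambda estimated by the mean
    expected = [N * exp(-lam)]                         # F(0) = N * e^(-lambda)
    for x in range(len(freq) - 1):                     # F(x+1) = lambda/(x+1) * F(x)
        expected.append(expected[-1] * lam / (x + 1))
    return lam, expected

lam, exp_freq = fit_poisson([211, 90, 19, 5, 0])
print(round(lam, 2))                        # 0.44
print([round(e, 1) for e in exp_freq])      # approximately [209.3, 92.1, 20.3, 3.0, 0.3]
```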

Let Us Sum Up

Learners, in this section we have seen the definition of Poisson distribution, mean and

variance, additive property and fitting of Poisson distribution.

Check Your Progress

1. A family of parametric distribution in which mean is equal to variance is:

A. Binomial distribution

B. Gamma distribution

C. Normal distribution

D. Poisson distribution

2. If X and Y are two Poisson variates such that X follows P(1) and Y follows P(2),

the probability P(X + Y < 3) is:

A. e-3

B. 3e-3

C. 4e-3

D. 8.5e-3



2.3 Unit Summary

This unit covered the discrete probability distributions: the Binomial and Poisson distributions, their means and variances, their additive properties and recurrence relations, and the fitting of Binomial and Poisson distributions to data.

2.4 Glossary

KEYWORDS MEANING

n Total number of events

r(or)x Total number of successful events

p Probability of success on a single trial

1-p Probability of failure

λ Average rate of occurrence (mean of the Poisson distribution)

2.5 Self-Assessment Questions

Short Answers: (5 Marks)

1. Define Binomial distribution and its important features.

2. A machine produces 10 percent defective item. Ten items are selected at random. Find the

probability of not more than two items being defective.

3. Describe Poisson distribution and its properties.

4. Prove that the additive property of Binomial distribution.

5. Derive the additive property of Poisson distribution.



Long Answers: (8 Marks)

1. Prove the recurrence relation for central moments of Binomial distribution.

2. Derive mean and variance for Binomial distribution.

3. Let X have the binomial distribution with probability distribution

b(x | n, p) = C(n, x) p^x (1 − p)^(n−x) for x = 0, 1, ..., n.

Show that

(a) M(t) = (1 − p + p e^t)^n for all t

(b) E(X) = np and Var(X) = np(1 − p)

4. A set of three similar coins are tossed 100 times with the following results

Number of heads 0 1 2 3
Frequency 36 40 22 2
Fit a binomial distribution and estimate the expected frequencies.

5. The following mistakes per page were observed in a book

Number of Mistakes (per pages) 0 1 2 3 4


Number of pages 211 90 19 5 0

Fit a Poisson distribution and estimate the expected frequencies.

2.6 EXERCISES

1. A coin that is fair in nature is tossed n number of times. The probability of the occurrence of

a head six times is the same as the probability that a head comes 8 times, the find the

value of n.

2. The probability that a person can achieve a target is 3/4. The count of tries is 5. What is the

probability that he will attain the target at least thrice?

3. There are four fused bulbs in a lot of 10 good bulbs. If three bulbs are drawn at random with

replacement, find the probability of distribution of the number of fused bulbs drawn.



4. The probability of a boy guessing a correct answer is 1/4. How many questions must he

answer so that the probability of guessing the correct answer at least once is greater than

2/3?

5. A set of three similar coins are tossed 100 times with the following results.

Number of heads: 0 1 2 3

Frequency : 36 40 22 2

Fit a binomial distribution and estimate the expected frequencies.

6. The following mistakes per page were observed in a book

Number of mistakes (per page) : 0 1 2 3 4

Number of pages : 211 90 19 5 0

Fit a Poisson distribution and estimate the expected frequencies.

7. As only 3 students came to attend the class today, find the probability for exactly 4

students to attend the classes tomorrow.

8. Telephone calls arrive at an exchange according to the Poisson process at a rate of

2/min. Calculate the probability that exactly two calls will be received during each of

the first time 5 minutes of the hour.

2.7 Answers for Check Your Progress


Modules S. No. Answers
1. D.(q+peit)n
Module 1
2. C. 1/3
1. D. Poisson distribution
Module 2
2. D. 8.5e-3



2.8 References and Suggested Readings

1. Gupta S. P. (2001), Statistical Methods, Sultan Chand & Sons, New Delhi.

2. Gupta. S. C. and Kapoor. V. K. Fundamentals of Applied Statistics, Sultan Chand and

Sons, New Delhi

3. Pillai R. S. N. And Bagavathi. V. (2005), Statistics, S. Chand and Company Ltd., New

Delhi.

4. Pillai R. S. N. And Bagavathi. V. (1987), Practical Statistics, S. Chand & Company Ltd.,

New Delhi.

5. Gupta C. B (1978), An Introduction to Statistical Methods, Vikas Publishing House, New

Delhi.



UNIT 3 - CONTINUOUS PROBABILITY DISTRIBUTION
UNIT- 3 Continuous Probability Distribution and Curve Fitting- Definition
of Normal distribution – Characteristics of Normal distribution (Simple
Problems) – Curve fitting – Fitting of Straight line and Second-degree
Parabola - Simple Problems

Unit objectives

1. To impart statistical concepts for continuous probability distribution.

2. To introduce the basic definition of Normal distribution and its characteristics.

3. To introduce the concepts of curve fitting of Normal distribution.

4. To introduce the concepts of fitting of straight line and second-degree parabola and

simple problems.

SECTION 3.1 NORMAL DISTRIBUTION

The normal distribution is the most widely known and used of all distributions.

Because the normal distribution approximates many natural phenomena so well, it

has developed into a standard of reference for many probability problems.

3.1.1 – CHARACTERISTICS OF NORMAL DISTRIBUTION

1. It is bell shaped and is symmetrical about its mean.

2. It is asymptotic to the axis, i.e., it extends indefinitely in either direction from the

mean.

3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation

defines a different normal distribution. Thus, the normal distribution is completely

described by two parameters: mean and standard deviation. See the following

figure.

5. Total area under the curve sums to 1, i.e., the area of the distribution on each side

of the mean is 0.5.

6. It is unimodal, i.e., values mound up only in the center of the curve.

7. The probability that a random variable will have a value between any two points is

equal to the area under the curve between those points.

3.1.2 - EXAMPLE

Find P(Z < a) for a = 1.65, −1.65, 1.0, −1.0.

To solve: for positive values of a, look up and report the tabulated value of F(a). For negative values of a, look up the value of F(−a) (that is, F of the absolute value of a) and report 1 − F(−a).

P(Z < 1.65) = F(1.65) = 0.95

P(Z < −1.65) = 1 − F(1.65) = 0.05

P(Z < 1.0) = F(1.0) = 0.84

P(Z < −1.0) = 1 − F(1.0) = 0.16
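These table look-ups can be reproduced with the standard normal CDF expressed through the error function; the small Python check below is an added illustration.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, F(z) = P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

for a in (1.65, -1.65, 1.0, -1.0):
    print(a, round(phi(a), 2))
# 1.65 -> 0.95, -1.65 -> 0.05, 1.0 -> 0.84, -1.0 -> 0.16
```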



3.1.3 - FITTING OF NORMAL DISTRIBUTION


3.1.4 - CURVE FITTING

Let (xᵢ, yᵢ), i = 1, 2, 3, ..., n be a given set of n pairs of values, X being the independent variable and Y the dependent variable. The general problem in curve fitting is to find, if possible, an analytic expression of the form y = f(x) for the functional relationship suggested by the given data.



3.1.5 TYPES

1. Fitting of a straight line: Y = a + bX

2. Fitting of a second-degree parabola: Y = a + bX + cX²

3. Fitting of a power curve: Y = aX^b

4. Fitting of an exponential curve: (i) Y = ae^{bX} (ii) Y = ab^X

3.1.6 EXAMPLE

Fitting a straight-line using Least square method for the following data.

x 5 4 3 2 1
y 1 2 3 4 5

Solution:

Method-1 of solution:

Straight line equation is y = a+bx.

The normal equations are

∑y = an + b∑x

∑xy = a∑x + b∑x²
The values are calculated using the following table

x y x2 xy
5 1 25 5
4 2 16 8
3 3 9 9
2 4 4 8
1 5 1 5
--- --- --- ---
∑x=15 ∑y=15 ∑x2=55 ∑x⋅y=35



Substituting these values in the normal equations:

5a + 15b = 15

15a + 55b = 35

Dividing the first equation by 5: a + 3b = 3 → (1)

Dividing the second equation by 5: 3a + 11b = 7 → (2)

Equation (1) × 3: 3a + 9b = 9

Subtracting this from equation (2): 2b = −2, so b = −1.

Putting b = −1 in equation (1): a + 3(−1) = 3, so a = 6.

∴ a = 6 and b = −1.

Substituting these values in y = a + bx, the fitted straight line is

y = 6 − x
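The same least-squares line can be obtained numerically by solving the two normal equations; the short Python sketch below is an added illustration (pure Python, no libraries assumed).

```python
xs = [5, 4, 3, 2, 1]
ys = [1, 2, 3, 4, 5]

n = len(xs)
Sx, Sy = sum(xs), sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

# Solve the normal equations  a*n + b*Sx = Sy  and  a*Sx + b*Sxx = Sxy
b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)
a = (Sy - b * Sx) / n

print(a, b)   # 6.0 -1.0, i.e. the fitted line y = 6 - x
```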

Let Us Sum Up

Learners, in this section we have seen that definition of Normal distribution,

characteristics of normal distribution, fitting of normal distribution, curve fitting and its

types.

Check your Progress

1. An approximate relation between the Q.D. and S.D. of a normal distribution is:

A. 5 Q.D=4 S.D

B. 4 Q.D=5 S.D

C. 2 Q.D=3 S.D

D.3 Q.D=2 S.D

2. For a normal curve the Q.D, M.D, and S.D are in the ratio

A. 5:6:7

B. 10:12:15

C. 2:3:4

D. None of the above



SECTION 3.2: SECOND DEGREE PARABOLA

3.2.1 DEFINITION

The equation is y = a + bx + cx² and the normal equations are

∑y = na + b∑x + c∑x²

∑xy = a∑x + b∑x² + c∑x³

∑x²y = a∑x² + b∑x³ + c∑x⁴

3.2.2 EXAMPLE

Fit a second-degree parabola using the least squares method for the following data.

X 1 2 3 4 5 6 7
Y -5 -2 5 16 31 50 73

Solution: The equation is y = a + bx + cx² and the normal equations are

∑y = na + b∑x + c∑x²,  ∑xy = a∑x + b∑x² + c∑x³,  ∑x²y = a∑x² + b∑x³ + c∑x⁴

The values are calculated using the following table

x y x2 x3 x4 x⋅y x2y
1 -5 1 1 1 -5 -5
2 -2 4 8 16 -4 -8
3 5 9 27 81 15 45
4 16 16 64 256 64 256
5 31 25 125 625 155 775
6 50 36 216 1296 300 1800
7 73 49 343 2401 511 3577
--- --- --- --- --- --- ---
∑x=28 ∑y=168 ∑x2=140 ∑x3=784 ∑x4=4676 ∑x⋅y=1036 ∑x2y=6440



Substituting these values in the normal equations

7a+28b+140c=168

28a+140b+784c=1036

140a+784b+4676c=6440

Solving these 3 equations,

Total Equations are 3

7a+28b+140c=168→(1)

28a+140b+784c=1036→(2)

140a+784b+4676c=6440→(3)

Select the equations (1) and (2), and eliminate the variable a.

7a+28b+140c=168 ×4→ 28a + 112b + 560c = 672

28a+140b+784c=1036 ×1→ 28a + 140b + 784c = 1036

- 28b - 224c = -364 →(4)

Select the equations (1) and (3), and eliminate the variable a.

7a+28b+140c=168 ×20→ 140a + 560b + 2800c = 3360

140a+784b+4676c=6440 ×1→ 140a + 784b + 4676c = 6440



- 224b - 1876c = -3080 →(5)

Select the equations (4) and (5), and eliminate the variable b.

-28b-224c=-364 ×8→ - 224b - 1792c = -2912

-224b-1876c=-3080 ×1→ - 224b - 1876c = -3080

84c = 168 →(6)

Now use back substitution method

From (6)

84c=168

⇒ c = 168/84 = 2

From (4)

-28b-224c=-364

⇒-28b-224(2)=-364

⇒-28b-448=-364

⇒ -28b = -364 + 448 = 84

⇒ b = 84/(-28) = -3

From (1)

7a+28b+140c=168

⇒7a+28(-3)+140(2)=168

⇒7a+196=168

⇒7a=168-196=-28



⇒ a = -28/7 = -4

Solution using Elimination method.

a=-4,b=-3,c=2

Now substituting these values in the equation y = a + bx + cx², we get

y = -4 - 3x + 2x²
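
The same fit can be verified numerically. The sketch below, assuming Python with NumPy, solves the three normal equations for the data in this example.

# Least squares fit of y = a + b*x + c*x^2 for the parabola example.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([-5, -2, 5, 16, 31, 50, 73], dtype=float)

# Normal equations in matrix form: each row is one equation in (a, b, c).
A = np.array([
    [len(x),       x.sum(),       (x**2).sum()],
    [x.sum(),      (x**2).sum(),  (x**3).sum()],
    [(x**2).sum(), (x**3).sum(),  (x**4).sum()],
])
rhs = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])
a, b, c = np.linalg.solve(A, rhs)
print(a, b, c)   # expected: a = -4.0, b = -3.0, c = 2.0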

Let Us Sum Up

Learners, in this section we have seen the definition of second degree parabola

and simple problems.

Check Your Progress

1. What is the equation of the parabola with vertex (0,0) and focus (0,2)?

A. Y2=8x

B. X2=8y

C. Y2=4x

D. X2=4y

2. What is the vertex of the parabola given by the equation y=2(x-3)2 +4?

A. (3,4)

B. (-3,-4)

C. (3,-4)

D. (-3,4)



SECTION 3.3 EXPONENTIAL DISTRIBUTION

3.3.1 DEFINITION

The exponential equation is y = ae^(bx).

Taking logarithms on both sides, we get

log₁₀y = log₁₀(ae^(bx))

log₁₀y = log₁₀a + log₁₀e^(bx)

log₁₀y = log₁₀a + bx log₁₀e

Y = A + Bx, where Y = log₁₀y, A = log₁₀a, B = b log₁₀e,

which is linear in Y and x.

So the corresponding normal equations are

∑Y = nA + B∑x

∑xY = A∑x + B∑x²

3.3.2 EXAMPLE

Fit an exponential equation (y = ae^(bx)) using the least squares method for the following data.

X 0 0.5 1 1.5 2 2.5


Y 0.10 0.45 2.15 9.15 40.35 180.75

Solution:

The curve to be fitted is y = ae^(bx).

Taking logarithms on both sides, we get

log₁₀y = log₁₀a + bx log₁₀e

Y = A + Bx, where Y = log₁₀y, A = log₁₀a, B = b log₁₀e,

which is linear in Y and x.

So the corresponding normal equations are

∑Y=nA+B∑x

∑xY=A∑x+B∑x2

The values are calculated using the following table

x y Y=log10(y) x2 x⋅Y
0 0.1 -1 0 0
0.5 0.45 -0.3468 0.25 -0.1734
1 2.15 0.3324 1 0.3324
1.5 9.15 0.9614 2.25 1.4421
2 40.35 1.6058 4 3.2117
2.5 180.75 2.2571 6.25 5.6427
--- --- --- --- ---
∑x=7.5 ∑y=232.95 ∑Y=3.81 ∑x2=13.75 ∑x⋅Y=10.4556

Substituting these values in the normal equations

6A+7.5B=3.81

7.5A+13.75B=10.4556

Solving these two equations using Elimination method,

we obtain A=-0.9916,B=1.3013

∴ a = antilog₁₀(A) = antilog₁₀(-0.9916) = 0.102

and b = B/log₁₀(e) = 1.3013/0.4343 = 2.9963

Now substituting these values in the equation y = ae^(bx), we get

y = 0.102e^(2.9963x)
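
The log-linearisation above is easy to reproduce in code. The sketch below, assuming Python with NumPy, fits Y = log₁₀y against x and converts the coefficients back to a and b.

# Least squares fit of y = a*exp(b*x) via the substitution Y = log10(y) = A + B*x.
import numpy as np

x = np.array([0, 0.5, 1, 1.5, 2, 2.5])
y = np.array([0.10, 0.45, 2.15, 9.15, 40.35, 180.75])

Y = np.log10(y)
B, A = np.polyfit(x, Y, 1)        # straight-line fit Y = A + B*x (polyfit returns slope, then intercept)
a = 10 ** A                       # a = antilog10(A)
b = B / np.log10(np.e)            # b = B / log10(e)
print(round(a, 3), round(b, 3))   # expected: roughly a = 0.102, b = 2.996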



Let Us Sum Up

Learners, in this section we have seen the definition of exponential distribution

and simple problems.

Check Your Progress

1. The exponential distribution is often used to model:

A. The number of successes in a fixed number of trials.

B. The time between events in a Poisson process

C. The average value of a sample

D. The proportion of defects in a batch

2. Which of the following properties is true for an exponential distribution?

A. The exponential distribution is symmetric.

B. The exponential distribution is memoryless.

C. The exponential distribution has a finite variance

D. The exponential distribution is bounded on the right.

3.4 Unit Summary

The third unit covers the definition and characteristics of the normal distribution, curve fitting, and the fitting of a straight line, a second-degree parabola and an exponential curve, with simple problems.



3.5 Glossary

KEYWORDS MEANING
N Signifies that the distribution is normal.
μ Mean
σ² Variance
Y = a+bx Straight line
Exp(m) Exponential distribution

3.6 Self-Assessment Questions

Short Answers: (5 Marks)

1. Define normal distribution.

2. Find P(Z < a) for a = 1.65, -1.65, 1.0, -1.0.

3. It has been claimed that in 60% of all solar-heat installations the utility bill is

reduced by at least one-third. Accordingly, what are the probabilities that the

utility bill will be reduced by at least one-third in

(a) four of five installations. (b) at least four of five installations?

4. Define exponential distribution.

Long Answers: (8 Marks)

1. Explain the characteristics of normal distribution.

2. Fit a straight line using Least square method for the following data.

X 5 4 3 2 1
Y 1 2 3 4 5



3. Fit a second-degree parabola using Least square method.

X 1 2 3 4 5 6 7
Y -5 -2 5 16 31 50 73

4. Fit an exponential equation (y = aebx) using Least square method.

X 0 0.5 1 1.5 2 2.5


Y 0.10 0.45 2.15 9.15 40.35 180.75

3.7 EXERCISES

1. For some computers, the time period between charges of the battery is normally

distributed with a mean of 50 hours and a standard deviation of 15 hours. Rohan

has one of these computers and needs to know the probability that the time period

will be between 50 and 70 hours.

2. The speeds of cars on a motorway are measured using a radar unit. The speeds are normally distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the probability that a car selected at random is moving at more than 100 km/hr?

3. The length of human pregnancies from conception to birth approximates a normal

distribution with a mean of 266 days and a standard deviation of 16 days. What

proportion of all pregnancies will last between 240 and 270 days (roughly between 8

and 9 months)?

4. Fit a second-degree parabola using the least squares method for the following data.

X 1 2 3 4 5 6 7
Y -5 -2 5 16 31 50 73



5. Fit a second-degree parabola using the least squares method for the following data.

X 1996 1997 1998 1999 2000


Y 40 50 62 58 50
6. Assume that you usually get 2 phone calls per hour. Calculate the probability that a phone call will come within the next hour.

7. The mileage which car owners get with a certain kind of radial tire is a random

variable having an exponential distribution with mean 40,000 km. Find the

probabilities that one of these tires will last (i) at least 20,000 km and (ii) at most

30,000 km.

8. The length of time a person speaks over phone follows exponential distribution with

mean 6. What is the probability that the person will talk for (i) more than 8 minutes

(ii) between 4 and 8 minutes?

3.8 Answers for Check Your Progress

Modules S.No. Answers

1. D. 3 Q.D = 2 S.D
Module 1
2. B. 10:12:15

1. B. x² = 8y
Module 2
2. A. (3,4)

1. B. The time between events in a Poisson process


Module 3
2. B. The exponential distribution is memoryless

3.9 References and Suggested Readings

1. Gupta S. P. (2001), Statistical Methods, Sultan Chand & Sons, New Delhi.

2. Gupta. S. C. and Kapoor. V. K. Fundamentals of Applied Statistics, Sultan Chand and Sons,

New Delhi



UNIT 4 - TEST FOR SIGNIFICANCE (LARGE SAMPLE TESTS)
Test of Significance (Large Samples Tests) Concept of Statistical
Hypothesis – Simple and Composite Hypothesis – Null and Alternative
Hypothesis – Critical region – Type I and Type II Errors – Sampling
distribution and Standard Error – Test of Significance: Large Sample
Tests for Proportion, Difference of Proportions, Mean and Difference of
Means - Simple Problems.

Unit objectives

1. To impart concept of Statistical Hypothesis and its types.

2. To introduce the basic concepts errors and its types.

3. To introduce the concepts of sampling distribution and standard error.

4. To introduce the concepts of large sample tests for proportion and difference of

proportions.

SECTION 4.1: STATISTICAL HYPOTHESIS

4.1.1 POPULATION

A group of objects under study is called population or universe.

e.g. 1) The number of books in national library.

2) The no. of students studying in University of Madras.



4.1.2 - SAMPLE

A part selected from the population is called a sample.

e.g. 1) The no. of statistics books in national library.

2) The no. of statistics students studying in University of Madras

4.1.3 - PARAMETER

The statistical constants about population are called parameters. e.g. Population

mean, population variance etc.

4.1.4 - STATISTIC

The statistical constants about sample are called statistic. e.g. Sample mean, sample

variance etc.

Let Us Sum Up

Learners, in this section we have seen that definition of population, parameter,

statistic and sample.

Check your Progress

1. A population consisting of all the items which are physically present is called:

A. Hypothetical population

B. Real population

C. Infinite population

D. None of the above

2. A sample consists of:

A. All units of the population

B. 50 per cent unit of the population



C. 5 per cents units of the population

D. Any fraction of the population

SECTION 4.2: TESTING OF HYPOTHESIS


A statement about the population parameter is called hypothesis.

4.2.1 TYPES

Null Hypothesis: A hypothesis which is tested for possible rejection under the

assumption that it is true is called Null hypothesis and it is denoted by H 0.

Alternative Hypothesis: Any hypothesis which contradicts the null hypothesis H0 is

called alternative hypothesis and it is denoted by H1.

4.2.2 CRITICAL REGION (OR) REJECTION REGION

A region, corresponding to a statistic, in the sample space which amounts to rejection of the null hypothesis H0 is called the critical region. The region complementary to the critical region is called the acceptance region.

4.2.3 ERRORS

Type I error: Rejecting the hypothesis H0 when it is true is called Type I error.

Type II error: Accepting the Hypothesis H0 when it is wrong is called type II error.

4.2.4 LEVEL OF SIGNIFICANCE

Maximum probability of making type I error is called level of significance and it is

denoted by α.

4.2.5 DEGREES OF FREEDOM

The number which we can choose freely is called degrees of freedom.



4.2.6 SMALL SAMPLE TEST

When the sample size is less than 30, the sample is said to be a small sample. Otherwise, it is said to be a large sample.

4.2.7 TEST PROCEDURE

Step 1: State the Null hypothesis H0.

Step 2: Decide the Alternative hypothesis H1.

Step 3: Choose the level of significance and state the degrees of freedom.

Step 4: Write the critical value.

Step 5: Find the test statistic.

Step 6: Decide whether the hypothesis is to be rejected or accepted.

Let Us Sum Up

Learners, in this section we have seen the definition of hypothesis, types, critical

region, errors, level of significance, degrees of freedom, small sample test and its test

procedure.

Check Your Progress

1.The errors in a survey other than sampling errors are called:

A. Formula errors

B. Planning error

C. Non sampling error

D. None of the above

2. A hypothesis under test is:

A. Simple hypothesis



B. Alternative hypothesis

C. Null hypothesis

D. None of the above

SECTION 4.3: TEST FOR POPULATION MEAN


4.3.1 TEST PROCEDURE

Let x1, x2, …, xn be a random sample from a population with mean µ. We want to test whether there is any significant difference between the population mean and the sample mean or not.

Null hypothesis H0: µ=µ0

Alternative hypothesis H1 : µ ≠ µ0(Two tailed test)

Level of significant = 0.05

Reject H0 if |t|> t0.025


The test statistic is given by t = (x̄ − µ)/(s/√(n−1)), which follows a t distribution with (n−1) degrees of freedom.

4.3.2 PROBLEM

A soap manufacturing company was distributing a particular brand of soap through a large number of retail shops. Before a heavy ad campaign, the mean sales per week per shop were 140 dozen. After the campaign, a sample of 26 shops was taken and the mean sales were found to be 147 dozen with a standard deviation of 16 dozen. Can you say the ad was effective?



Solution: We are given

n = 26, x̄ = 147, s = 16

H0: μ = 140 [The ad is not effective]

H1: μ > 140 [The ad is effective] (Right-tailed test)

α = 0.05, d.f = n-1 = 25

Table value at 5% level for 25 d.f = 1.708

Reject H0 if t>1.708

The test statistic is given by

t = (x̄ − μ)/(s/√(n−1)) = (147 − 140)/(16/√25) = 2.1875

Conclusion: Since the calculated value is greater than the tabulated value, we reject the hypothesis H0. Therefore, the ad is effective.
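
For readers who want to verify the arithmetic, here is a minimal sketch, assuming Python with SciPy is available, of the same one-sample t calculation from the summary statistics.

# One-sample t-test from summary statistics (ad-campaign example).
from math import sqrt
from scipy.stats import t

n, xbar, s, mu0 = 26, 147.0, 16.0, 140.0
t_stat = (xbar - mu0) / (s / sqrt(n - 1))        # divisor n-1 follows the text's formula
crit = t.ppf(0.95, df=n - 1)                      # one-tailed 5% critical value, 25 d.f.
print(round(t_stat, 4), round(crit, 3), t_stat > crit)   # 2.1875, about 1.708, True -> reject H0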

4.3.3 TEST FOR DIFFERENCE OF POPULATION MEAN

Suppose we want to test whether two independent samples x1, x2, …, xn1 and y1, y2, …, yn2 of sizes n1 and n2 have been drawn from two normal populations with population means μ1 and μ2 respectively.

Null hypothesis H0: μ1 = μ2 (there is no significant difference between the two population means).

Alternative hypothesis H1: μ1 ≠ μ2 (there is a significant difference between the two population means).

The test statistic is given by

t = (x̄ − ȳ) / [S √(1/n₁ + 1/n₂)],  where S = √[(n₁s₁² + n₂s₂²)/(n₁ + n₂ − 2)],

which follows a t distribution with (n₁ + n₂ − 2) d.f.

4.3.4 - PROBLEM

Samples of two types of electric bulbs were tested for length of life and the following data were obtained.

               Type - I     Type - II
Sample No.     8            7
Sample Mean    1234 hours   1036 hours
Sample S.D.    36 hours     40 hours

Is the difference in the means sufficient to warrant that Type I is superior to Type II regarding length of life?

Sol:

Null Hypothesis H0: μ1= μ2

Alternative Hypothesis H1: μ1>μ2(Right tailed test)


We are given n₁ = 8, x̄₁ = 1234 hrs, s₁ = 36 hrs

n₂ = 7, x̄₂ = 1036 hrs, s₂ = 40 hrs

d.f = n₁ + n₂ − 2 = 13, level of significance = 0.05

Table value at 5% level for 13 d.f = 1.771

We may reject H0 if t > 1.771



The test statistic is given by

t = (x̄₁ − x̄₂) / [S √(1/n₁ + 1/n₂)]

where S = √[(n₁s₁² + n₂s₂²)/(n₁ + n₂ − 2)] = 40.73

t = 9.37 (on simplification)

Since the calculated value of t is much greater than the table value, H0 is rejected. So, we conclude that Type I is superior to Type II.
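
The pooled two-sample calculation can be checked with a short script. This is a sketch only, assuming Python, using the text's pooled estimate S (n₁s₁² + n₂s₂² divided by n₁ + n₂ − 2).

# Two-sample t-test from summary statistics (electric-bulb example).
from math import sqrt

n1, x1, s1 = 8, 1234.0, 36.0
n2, x2, s2 = 7, 1036.0, 40.0

S = sqrt((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))   # pooled S as defined in the text
t_stat = (x1 - x2) / (S * sqrt(1 / n1 + 1 / n2))
print(round(S, 2), round(t_stat, 2))   # roughly 40.73 and 9.4; table value for 13 d.f. is 1.771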

Let Us Sum Up

Learners, in this section we have seen the test for population mean ,test

procedure ,test for difference of population mean with simple problems.

Check Your Progress

1. When performing a one sample-t test for the population mean, which

distribution do you use if the population standard deviation is unknown?

A. Normal distribution

B. t-distribution

C. Chi-square distribution

D. F-distribution

2. Which test is appropriate when the population standard deviation is known and

the sample size is large?

A. One sample z-test



B. One sample t-test

C. Paired t-test

D. Chi-square test

SECTION 4.4: PAIRED t-TEST


4.4.1 TEST PROCEDURE
Let x1, x2, …, xn be the sales of a product in n independent stores for a certain period before an ad, and y1, y2, …, yn the corresponding sales of the same product for the same period after the ad. Now we want to test the significance of the difference between sales before and after the ad, so we apply the paired t-test.

Let di = xi − yi; i = 1, 2, 3, …, n

Let the null hypothesis be H0: μ1 = μ2 (there is no significant difference between the sales before and after the ad).

Alternative hypothesis H1: μ1 < μ2

The test statistic is given by

t = d̄/(s/√n), which follows a t distribution with (n−1) d.f.

4.4.2 PROBLEM

An I.Q. test was conducted on 5 persons before and after they were trained; the results are given below.



candidates I II III IV V

Before training
I.Q 110 120 123 132 125
After training
120 118 125 136 121

Test whether there is any change in IQ after the training programme at 5% level.

Solution:

The given data is dependent, so let us apply paired t-test.

Null hypothesis H0: μ1 = μ2 (there is no change in IQ)

Alternative hypothesis H1: μ1 < μ2 (there is a change in IQ)

Level of significance = 5%. Degrees of freedom = n − 1 = 4. Table value at 5% level (one-tailed) for 4 d.f. = 2.132. We may reject H0 if t < −2.132. Assuming H0 is true, the test statistic is given by

t = d̄/(s/√n)

x     y     d = x − y     d²
110   120   -10           100
120   118    2             4
123   125   -2             4
132   136   -4             16
125   121    4             16
              ∑d = -10     ∑d² = 140




d̄ = ∑d/n = -10/5 = -2

s² = [∑d² − n d̄²]/(n − 1) = (140 − 20)/4 = 30, so s = 5.48

t = d̄/(s/√n) = -2/(5.48/√5) = -0.82

Conclusion: Since |t| is less than the table value, we accept the hypothesis H0. We conclude that there is no significant change in IQ after the training.
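
The paired calculation is easy to verify in code. Below is a minimal sketch, assuming Python with SciPy, that reproduces d̄, s and t for the IQ data.

# Paired t-test for the before/after IQ data.
import numpy as np
from scipy.stats import ttest_rel

before = np.array([110, 120, 123, 132, 125], dtype=float)
after  = np.array([120, 118, 125, 136, 121], dtype=float)

d = before - after
dbar = d.mean()
s = d.std(ddof=1)                      # sample standard deviation of the differences
t_manual = dbar / (s / np.sqrt(len(d)))
t_scipy, p = ttest_rel(before, after)  # should agree with the manual value
print(round(dbar, 2), round(s, 2), round(t_manual, 2), round(t_scipy, 2))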

Let Us Sum Up

Learners, in this section we have seen the test procedure for Paired t-test with

simple problems.

Check Your Progress

1. The degrees of freedom for statistic-t for paired t-test based on n pairs of

observations is:

A. 2(n-1)

B. n-1

C. 2n-1

D. None of the above

2. Paired t-test is applicable when observations in the two samples are:

A. paired

B. correlated

C. Equal in number

D. All the above



SECTION 4.5: CHI-SQUARE TEST OF INDEPENDENT
ATTRIBUTES
4.5.1 TEST PROCEDURE

Consider the cross-tabulation of some characteristic across two categorical

variables; the resulting table is called a two-way frequency table or a contingency table.

One characteristic or attribute is shown along the rows and the other is shown along the

column. Each cell of the table gives the count, or the number of cases, corresponding to that cell. We wish to test whether the two attributes are independent or not.

The null hypothesis is as follows

H0: The two attributes are independent.

H1: The two attributes are not independent.

Let Oij and Eij denote the observed and expected frequencies respectively in the

ith row and the jth column. When the null hypothesis is true, the expected frequencies

are calculated according to the formula.

Expected frequency = (Row total × Column total) / Grand total

After finding the expected frequencies of all the cells, we calculate the chi-square value

as follows.

(Oi  Ei )2
 
2
follows Chi-Square distribution with (r-1)x(c-1) df.
Ei

The degrees of freedom are (r−1)×(c−1), where r is the number of rows and c is the number

of columns. The critical value can be seen from the statistical tables corresponding to

these degrees of freedom at the chosen level of significance. If the chi-square value

exceeds the critical value, we reject the null hypothesis, the conclusion is that the two

characteristics are not independent and they are dependent with each other.



4.5.2 EXAMPLE

From the following table, test whether the son's eye colour is dependent on the father's eye colour.

                                   Eye colour of father
                                   Not light      Light
Eye colour of son    Not light     230            148
                     Light         151            471
Solution:

Null Hypothesis H0: The eye colour of father and eye colour of son are independent.

Alternative HyothesisH1: The eye colour of father and eye colour of son are not

independent.

Expected frequency = (Row total X Column total)/Grand total

Using above formula,

Exp. Freq of 230 = 144 Exp. Freq of 148 =234

Exp. Freq of 471 = 385 Exp. Freq of 151 =237

O     E     (O−E)    (O−E)²    (O−E)²/E
230   144    86      7396      51.36
151   237   -86      7396      31.21
148   234   -86      7396      31.61
471   385    86      7396      19.21

χ² = 133.39

df = (r-1)(c-1) = (2-1)(2-1) = 1.

Level of significance = 0.05



The table value of chi-square at 0.05 level for 1 D.F = 3.84

CONCLUSION: Since the calculated value of chi-square is greater than the table value, we reject the null hypothesis H0. Hence we conclude that the eye colour of the son is not independent of the eye colour of the father.
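
A short script can confirm the expected frequencies and the chi-square value. The sketch below, assuming Python with SciPy, uses the 2×2 table from this example without the continuity correction, to match the hand calculation.

# Chi-square test of independence for the 2x2 eye-colour table.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[230, 148],
                     [151, 471]])
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(expected)             # roughly [[144, 234], [237, 385]]
print(round(chi2, 2), dof)  # roughly 133 with 1 d.f.; table value at 5% is 3.84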

Let Us Sum Up

Learners, in this section we have seen the chi-square test of independence of attributes with a simple problem.

Check Your Progress

1. Degrees of freedom for statistic chi-square in case of contingency table of

order(2X2) is

A. 3

B. 4

C. 2

D. 1

2. The hypothesis that the population variance has a specified value can be

tested by:

A. F-test

B. Z-test

C. Chi-square test

D. None of the above



4.6 Unit Summary
The fourth unit covers the definition of a statistical hypothesis and its types, the critical region, errors and their types, sampling distribution and standard error, and tests of significance for large samples: proportion, difference of proportions, mean and difference of means, with simple problems.

4.7 Glossary
KEYWORDS MEANING
P Population proportion (parameter).
p Sample proportion (statistic).
S² Sample variance (statistic).
σ² Population variance (parameter).
H0 Null hypothesis.
H1 Alternative hypothesis.

4.8 Self-Assessment Questions


Short Answers: (5 Marks)

1. Define (i) Hypothesis and its types (ii) errors and its types

2. Explain the test procedure for small sample test.

3. Explain the test for difference of population mean.

4. Explain the test procedure for paired t-test.

5. Explain chi-square test of independent attributes.



Long Answers: (8 Marks)

1. Explain the test procedure for population mean.

2. A soap manufacturing company was distributing a particular brand of soap through a large number of retail shops. Before a heavy ad campaign, the mean sales per week per shop were 140 dozen. After the campaign, a sample of 26 shops was taken and the mean sales were found to be 147 dozen with a standard deviation of 16 dozen. Can you say the ad was effective?

3. Samples of two types of electric bulbs were tested for length of life and following

data were obtained.

               Type - I     Type - II
Sample no.     8            7
Sample Mean    1234 hours   1036 hours
Sample SD      36 hours     40 hours

Is the difference in the means sufficient to warrant that Type I is superior to Type II regarding length of life?

4. An I.Q test was conducted to 5 persons before and after they were trained, the

results are given below.

Candidates I II III IV V

Before training 110 120 123 132 125


I.Q
After training 120 118 125 136 121

Test whether there is any change in IQ after the training programme at 5% level.



5. From the following table, test whether the son's eye colour is dependent on the father's eye colour.

                                   Eye colour of father
                                   Not light      Light
Eye colour of son    Not light     230            148
                     Light         151            471

4.9 EXERCISES

1. A survey of 320 families with 5 children each reveals the following distribution:

No. of Boys: 5 4 3 2 1 0

No. of Girls: 0 1 2 3 4 5

No. of families: 14 56 110 88 40 12

Is this result consistent with the hypothesis that male and female births were

equally probable?

2. Fit a Poisson distribution to the following data which gives the number of yeast cells

per square for 400 squares.

No. of cells

per square: 0 1 2 3 4 5 6 7 8 9 10

No. of

squares 103 143 98 42 8 4 2 0 0 0 0

Also test the goodness of fit.

3. 15000 random numbers were taken from some logarithmic table and the following

frequencies of each digit were obtained.



Digit: 0 1 2 3 4 5 6 7 8 9

Freq: 1493 1441 1461 1552 1494 1454 1613 1491 1482 1519

Using chi-square test, test the hypothesis that each digit has an equal chance

of being chosen.

4. The theory predicts the proportion of beans in the groups A, B, C and D should be

9:3:3:1. In an experiment among the beans, the numbers in the four groups were

882, 313, 287, and 118. Does the experimental report support the theory?

5. Samples of two types of electric bulbs were tested for length of life and the following

data were obtained

Mean life S.D


No of samples.
Type I 8 1134 35
Type II 7 1024 40

Test the significant difference between the average life of the bulbs.

6. A group of five patients treated with medicine A weigh 42, 39, 48, 60 and 41 kg. A second group of 7 patients from the same hospital treated with medicine B weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do you agree with the claim that medicine 'B' increases the weight significantly?

7. The mean life of a sample of 10 electric bulbs was found to be 1456 hours with a

S.D of 423 hours. A second sample of 17 bulbs chosen from a different batch showed a

mean life of 1280 hours with S.D 398 hours. Is there significant difference between

the means of the two batches?



8. The average numbers of articles produced by two machines per day are 200 and 250 with S.D. 20 and 25 respectively, on the basis of records of 25 days' production. Can you regard both machines as equally efficient at the 1% level of significance?

4.10 Answers for Check Your Progress


Modules S.No. Answers
1. B. real population
Module 1
2. D. any fraction of the population
1. C. non sampling error
Module 2
2. B. Null hypothesis
1. B.t-distribution
Module 3
2. A. One sample z-test
1. B. n-1
Module 4
2. D. all the above
1. D.1
Module 5
2. C. Chi-square test

4.11 References and Suggested Readings


1. Gupta S. P. (2001), Statistical Methods, Sultan Chand & Sons, New Delhi.

2. Gupta. S. C. and Kapoor. V. K. Fundamentals of Applied Statistics, Sultan Chand and

Sons, New Delhi

3. Pillai R. S. N. And Bagavathi. V. (2005), Statistics, S. Chand and Company Ltd., New

Delhi.

4. Sancheti D. C. And Kapoor. V. K (2005), Statistics (7th Edition), Sultan Chand & Sons,

New Delhi.

5. Arora P. N, Comprehensive Statistical Methods, Sultan Chand & Sons, New Delhi.

6. Murthy M. N (1978), Sampling Theory and Methods, Statistical Publishing Society,

Kolkata.



UNIT 5 - SMALL SAMPLE TESTS
Test of Significance (Small Samples Tests) Small sample tests with
regard to Mean, Difference between Means and Paired t- test, F-test -
Definition of Chi-square test – Assumptions – Characteristics – Chi-
square tests for Goodness of fit and Independence of attributes – Simple
Problems.

Unit objectives

1. To impart statistical concepts with small sample tests with regard to mean.

2. To introduce the basic concepts of F-test.

3. To introduce the definition of Chi-square, test its assumptions and its characteristics.

4. To introduce the concepts of Chi-square test for goodness of fit and independence of

attributes with simple problem.

SECTION 5.1: TESTING OF SIGNIFICANCE SMALL SAMPLES

5.1.1 TEST PROCEDURE

Step 1: State the Null hypothesis H0.

Step 2: Decide the Alternative hypothesis H1.

Step 3: Choose the level of significance and state the degrees of freedom.

Step 4: Write the critical value.

Step 5: Find the test static.

Step 6: Decide whether the hypothesis is to be rejected or accepted.



5.1.2 - TEST FOR POPULATION MEAN

Test procedure: Let x1, x2, …, xn be a random sample from a population with mean µ. We want to test whether there is any significant difference between the population mean and the sample mean or not.

Null hypothesis H0: µ=µ0

Alternative hypothesis H1 : µ ≠ µ0(Two tailed test)

Level of significant = 0.05

Reject H0 if |t|> t0.025


The test statistic is given by t = (x̄ − µ)/(s/√(n−1)) ~ t₍ₙ₋₁₎, i.e., it follows a t distribution with (n−1) degrees of freedom.

5.1.3 PROBLEM

A soap manufacturing company was distributing a particular brand of soap through a large number of retail shops. Before a heavy ad campaign, the mean sales per week per shop were 140 dozen. After the campaign, a sample of 26 shops was taken and the mean sales were found to be 147 dozen with a standard deviation of 16 dozen. Can you say the ad was effective?

Solution: We are given

n = 26, x̄ = 147, s = 16

H0: μ = 140 [The ad is not effective]

H1: μ > 140 [The ad is effective] (Right-tailed test)

α = 0.05, d.f = n-1 = 25



Table value at 5% level for 25 d.f = 1.708

Reject H0 if t>1.708

The test statistic is given by

t = (x̄ − μ)/(s/√(n−1)) = (147 − 140)/(16/√25) = 2.1875

Conclusion: Since the calculated value is greater than the tabulated value, we reject the hypothesis H0. Therefore, the ad is effective.

5.1.4 TEST FOR DIFFERENCE OF POPULATION MEAN

Suppose we want to test whether two independent samples x1, x2, …, xn1 and y1, y2, …, yn2 of sizes n1 and n2 have been drawn from two normal populations with population means μ1 and μ2 respectively.

Null hypothesis H0: μ1 = μ2 (there is no significant difference between the two population means).

Alternative hypothesis H1: μ1 ≠ μ2 (there is a significant difference between the two population means).

The test statistic is given by

t = (x̄ − ȳ) / [S √(1/n₁ + 1/n₂)] ~ t₍ₙ₁₊ₙ₂₋₂₎

where S = √[(n₁s₁² + n₂s₂²)/(n₁ + n₂ − 2)]


5.1.5 PROBLEM

Samples of two types of electric bulbs were tested for length of life and the following data were obtained.

               Type - I     Type - II
Sample no.     8            7
Sample Mean    1234 hours   1036 hours
Sample s.d     36 hours     40 hours

Is the difference in the means sufficient to warrant that Type I is superior to Type II regarding length of life?

Sol:

Null Hypothesis H0: μ1= μ2

Alternative Hypothesis H1: μ1>μ2(Right tailed test)



We are given n₁ = 8, x̄₁ = 1234 hrs, s₁ = 36 hrs

n₂ = 7, x̄₂ = 1036 hrs, s₂ = 40 hrs

d.f = n₁ + n₂ − 2 = 13, level of significance = 0.05

Table value at 5% level for 13 d.f = 1.771

We may reject H0 if t > 1.771

The test statistic is given by

t = (x̄₁ − x̄₂) / [S √(1/n₁ + 1/n₂)] ~ t₍ₙ₁₊ₙ₂₋₂₎

where S = √[(n₁s₁² + n₂s₂²)/(n₁ + n₂ − 2)] = 40.73

t = 9.37 (on simplification)

Since the calculated value of t is much greater than the table value, H0 is rejected. So, we conclude that Type I is superior to Type II.

Let Us Sum Up

Learners, in this section we have seen that Testing of significance for small

sample tests, test procedure, test for population mean, test for difference of population

mean.

Check your Progress

1. Which of the following tests is typically used for hypothesis testing with large

samples?

A. T-test

B. Chi-square test

C. Z-test

D. Wilcoxon test

2. Which test is appropriate for hypothesis testing with small sample sizes?

A. Z-test

B. Chi-square test

C. T-test

D. F-test



SECTION 5.2: PAIRED t-TEST

5.2.1 TEST PROCEDURE

Let x1, x2, …, xn be the sales of a product in n independent stores for a certain period before an ad, and y1, y2, …, yn the corresponding sales of the same product for the same period after the ad. Now we want to test the significance of the difference between sales before and after the ad, so we apply the paired t-test.

Let di = xi − yi; i = 1, 2, 3, …, n

Let the null hypothesis be H0: μ1 = μ2 (there is no significant difference between the sales before and after the ad).

Alternative hypothesis H1: μ1 < μ2

The test statistic is given by

t = d̄/(s/√n), which follows a t distribution with (n−1) d.f.

5.2.2 PROBLEM

An I.Q. test was conducted on 5 persons before and after they were trained; the results are given below.



candidates I II III IV V

Before training
I.Q 110 120 123 132 125
After training
120 118 125 136 121

Test whether there is any change in IQ after the training programme at 5% level.

Solution:

The given data is dependent, so let us apply paired t-test.

Null hypothesis H0: μ1 = μ2 (there is no change in IQ)

Alternative hypothesis H1: μ1 < μ2 (there is a change in IQ)

Level of significance = 5%

Degrees of freedom = n − 1 = 4

Table value at 5% level (one-tailed) for 4 d.f = 2.132

We may reject H0 if t < −2.132

Assuming H0 is true, the test statistic is given by

t = d̄/(s/√n)



x     y     d = x − y     d²
110   120   -10           100
120   118    2             4
123   125   -2             4
132   136   -4             16
125   121    4             16
              ∑d = -10     ∑d² = 140


d̄ = ∑d/n = -10/5 = -2

s² = [∑d² − n d̄²]/(n − 1) = (140 − 20)/4 = 30, so s = 5.48

t = d̄/(s/√n) = -2/(5.48/√5) = -0.82

Conclusion: Since |t| is less than the table value, we accept the hypothesis H0. We conclude that there is no significant change in IQ after the training.

5.2.3 TEST FOR SPECIFIED POPULATION MEAN

Test procedure: We have proved that if xᵢ (i = 1, 2, 3, …, n) is a random sample of size n from a normal population with mean µ and variance σ², then the sample mean is distributed normally with mean µ and variance σ²/n, i.e., x̄ ~ N(µ, σ²/n).



Under the null hypothesis H0: µ = µ0 (the sample has been drawn from the population with mean µ0 and variance σ²),

the test statistic is given by Z = (x̄ − µ)/(σ/√n) ~ N(0,1).

Then compare the calculated value with the tabulated value at given level of

significance, reject H0 if the calculated value is greater than the tabulated value.

Otherwise accept it.

5.2.4 PROBLEM

A sample of 900 members has a mean of 3.4 cm with S.D. 2.61 cm. Is the sample from a population with mean 3.25 cm?

Null hypothesis H0: The sample has been drawn from the population with mean 3.25 cm.

The test statistic is given by

Z = (x̄ − µ)/(σ/√n) ~ N(0,1), since n is large.

Here we are given x̄ = 3.4 cm, n = 900, µ = 3.25, σ = 2.61.

Substituting the values in the above equation, we get

Z = (3.40 − 3.25)/(2.61/√900)

= 1.73

The table value for 5% level of significance = 1.96.



Since the calculated value is less than the tabulated value, we accept the hypothesis H0.

Hence, we conclude that the sample comes from the population with mean 3.25.
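
Here is a minimal sketch, assuming Python with SciPy, of the same large-sample Z calculation from the summary statistics.

# One-sample Z-test from summary statistics (mean 3.25 example).
from math import sqrt
from scipy.stats import norm

n, xbar, sigma, mu0 = 900, 3.4, 2.61, 3.25
z = (xbar - mu0) / (sigma / sqrt(n))
crit = norm.ppf(0.975)                 # two-tailed 5% critical value, about 1.96
print(round(z, 2), round(crit, 2))     # roughly 1.72, below 1.96 -> accept H0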

5.2.5 TEST FOR DIFFERENCE OF POPULATION MEAN


Let x̄₁ be the mean of a sample of size n₁ from a population with mean μ₁ and variance σ₁², and x̄₂ be the mean of an independent sample of size n₂ from another population with mean μ₂ and variance σ₂². Then, when the sample sizes are large,

x̄₁ ~ N(μ₁, σ₁²/n₁) and x̄₂ ~ N(μ₂, σ₂²/n₂).

Under the null hypothesis H0: μ1 = μ2 (there is no significant difference between the two population means).

Alternative hypothesis H1: μ1 ≠ μ2 (there is a significant difference between the two population means).

Assuming that the hypothesis H0 is true, the test statistic is given by

z = (x̄₁ − x̄₂) / √(σ₁²/n₁ + σ₂²/n₂) ~ N(0,1)

Then compare the calculated value with the tabulated value at given level of

significance, reject H0 if the calculated value is greater than the tabulated value.

Otherwise accept it.

Remark 1: If both the samples are drawn from populations with a common S.D. σ, then σ₁² = σ₂² = σ² and we get

z = (x̄₁ − x̄₂) / [σ √(1/n₁ + 1/n₂)] ~ N(0,1)

Remark 2: If the population variances are not known, the sample variances are used in their place, i.e., σ₁² ≈ S₁² and σ₂² ≈ S₂².

5.2.6 - PROBLEM

The means of two simple large samples of 1000 and 2000 members are 67.5 and 68 inches respectively. Can the samples be regarded as drawn from the same population with standard deviation 2.5 inches?

 
Solution: We are given n₁ = 1000, n₂ = 2000, x̄₁ = 67.5, x̄₂ = 68.

Null hypothesis H0: μ1 = μ2 (there is no significant difference between the two population means).

Alternative hypothesis H1: μ1 ≠ μ2 (there is a significant difference between the two population means).

Under the null hypothesis, the test statistic is given by

z = (x̄₁ − x̄₂) / [σ √(1/n₁ + 1/n₂)] ~ N(0,1)

z = (67.5 − 68) / [2.5 √(1/1000 + 1/2000)] = -5.1



The table value at 5% level of significance = 1.96.

Since |Z| = 5.1 > 1.96, we reject the null hypothesis H0.

Let Us Sum Up

Learners, in this section we have seen the Paired t-test, test procedure, test for

specified population mean, test for difference of population mean.

Check Your Progress

1. In a t-test for small samples, what assumption is crucial regarding the

population distribution?

A. The population distribution must be normal.

B. The sample size must be greater than 30

C. The variance of the two samples must be equal.

D. The population standard deviation must be known.

2.When comparing the means of two independent samples with equal variances,

which test is appropriate?

A. Paired t-test

B. Two sample t-test

C. Z-test

D.ANOVA

SECTION 5.3: TESTING OF HYPOTHESIS ABOUT A POPULATION PROPORTION


Often the parameter of interest is the proportion of elements in the population having a particular characteristic of interest. For example, the researcher may be concerned with the proportion of defective items turned out by the production process, or a market researcher may be interested in studying the proportion of consumers expressing a preference for one packaging over another. In such situations, the appropriate sampling distribution is the distribution of the proportion.

5.3.1 TEST FOR SINGLE PROPORTION

Test procedure: If p represents the proportion of success, then using the binomial distribution, the mean and the standard deviation of the proportion of success are given by p and √(pq/n) respectively. For testing the null hypothesis H0: p = p0 against the alternative hypothesis H1: p ≠ p0 (two-tailed),

the test statistic is given by

Z = (p̂ − p0)/√(p0q0/n), which follows N(0,1),

where p̂ is the sample proportion of success.

Problem: A fruit marketing federation claims that no more than 4% of the apples supplied are damaged. A random sample of 600 apples contained 36 damaged apples. Using a 5% level of significance, test the claim of the federation.

Solution:

Null Hypothesis: H0: p = p0 (=.04) Vs Alternative hypothesis H1: p>p0(Right Tailed)


The sample proportion is p̂ = 36/600 = 0.06. The test statistic is given by

Z = (p̂ − p0)/√(p0q0/n)

Substituting the values in the above equation, we get

Z = (0.06 − 0.04)/√((0.04)(0.96)/600) = 2.5

The table value at 5% level of significance = 1.645.

Since the calculated value is greater than the table value, we reject the hypothesis H0.

Thus, the federation's claim of 4% damage cannot be accepted.
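
A minimal sketch, assuming Python, of the same one-proportion Z calculation:

# One-sample Z-test for a proportion (damaged apples example).
from math import sqrt

n, x, p0 = 600, 36, 0.04
p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
print(round(p_hat, 2), round(z, 2))   # 0.06 and 2.5; 2.5 > 1.645, so the claim is rejected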

5.3.2 TEST FOR THE DIFFERENCE BETWEEN TWO POPULATION

PROPORTIONS

In the previous section, we discussed hypothesis tests related to a single proportion. In this section, we will discuss tests associated with the difference between two population proportions. We have two populations and a sample is drawn from each of them. We are interested in testing the significance of the difference between the two population proportions, or equivalently whether the proportions (of success) of the two populations are equal or not. Representing p1 and p2 as the proportions of success of the first and second populations, respectively, the null and alternative hypotheses are

H0: p1=p2 Vs H1 p1 ≠ p2

The test statistic is given by

Z = (p̂₁ − p̂₂) / √[PQ(1/n₁ + 1/n₂)] ~ N(0,1)

In general, we do not have any information about the proportion of the attribute in the populations from which the samples have been taken. Under H0, an unbiased estimate of the common population proportion P based on both the samples is given by

P = (X₁ + X₂)/(n₁ + n₂) and Q = 1 − P

Then compare the calculated value of Z with the table value at the given level of significance; reject H0 if the calculated value is greater than the tabulated value, otherwise accept H0.
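
As an illustration of this procedure, the sketch below, assuming Python and using made-up sample counts (they are not from the text), computes the pooled proportion and the Z statistic.

# Z-test for the difference between two proportions.
# The counts below are illustrative placeholders, not data from the text.
from math import sqrt

x1, n1 = 120, 400    # successes and sample size from population 1 (hypothetical)
x2, n2 = 90, 350     # successes and sample size from population 2 (hypothetical)

p1, p2 = x1 / n1, x2 / n2
P = (x1 + x2) / (n1 + n2)          # pooled estimate under H0
Q = 1 - P
z = (p1 - p2) / sqrt(P * Q * (1 / n1 + 1 / n2))
print(round(P, 3), round(z, 2))    # compare |z| with 1.96 at the 5% level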

Let Us Sum Up

Learners, in this section we have seen that testing of hypothesis about a

population proportion, test for single proportion, test for the difference between two

population proportions.

Check Your Progress

1. You are performing a hypothesis test for a proportion and obtain a z-score of -

1.75. if the significance level is 0.10 for a two-tailed test, what is the p-value?

A. 0.0818

B. 0.0898

C. 0.10

D. 0.05

2. A hypothesis test for a population proportion result in a p-value of 0.03, and the

significance level is 0.05. What is the decision regarding the null hypothesis?

A. Reject the null hypothesis



B. Accept the null hypothesis

C. Fail to reject the null hypothesis

D. The decision cannot be made with the given information.

SECTION 5.4: CHI-SQUARE TEST OF INDEPENDENCE OF ATTRIBUTES

Consider the cross-tabulation of some characteristic across two categorical variables;

the resulting table is called a two-way frequency table or a contingency table. One

characteristic or attribute is shown along the rows and the other is shown along the

column. Each cell of the table gives the count, or the number of cases, corresponding to

that cell. We wish to test whether the two attributes are independent or not.

The null hypothesis is as follows

H0: The two attributes are independent.

H1: The two attributes are not independent.

Let Oij and Eij denote the observed and expected frequencies respectively in the

ith row and the jth column. When the null hypothesis is true, the expected frequencies

are calculated according to the formula.

Row total  Column total


Expected frequency 
Grand total

After finding the expected frequencies of all the cells, we calculate the chi-square

value as follows.

(Oi  Ei )2
2   ~ (2r 1)(c 1)
Ei



The degrees of freedom are (r−1)×(c−1), where r is the number of rows and c is the number

of columns. The critical value can be seen from the statistical tables corresponding to

these degrees of freedom at the chosen level of significance. If the chi-square value

exceeds the critical value, we reject the null hypothesis, the conclusion is that the two

characteristics are not independent and they are dependent with each other.

5.4.1 EXAMPLE

From the following table, test whether the son's eye colour is dependent on the father's eye colour.

                                   Eye colour of father
                                   Not light      Light
Eye colour of son    Not light     230            148
                     Light         151            471

Solution:

Null Hypothesis H0: The eye colour of father and eye colour of son are independent.

Alternative HyothesisH1: The eye colour of father and eye colour of son are not

independent.

Row total  Column total


Expected frequency 
Grand total

Using above formula,

Exp. Freq of 230 = 144 Exp. Freq of 148 =234



Exp. Freq of 471 = 385 Exp. Freq of 151 =237

O     E     (O−E)    (O−E)²    (O−E)²/E
230   144    86      7396      51.36
151   237   -86      7396      31.21
148   234   -86      7396      31.61
471   385    86      7396      19.21

χ² = 133.39

D.F = (r-1)(c-1) = (2-1)(2-1) = 1.

Level of significance = 0.05

The table value of chi-square at 0.05 level for 1 D.F = 3.84

CONCLUSION

Since the calculated value of chi-square is greater than the table value, we reject the null hypothesis H0.

Hence we conclude that the eye colour of the son is not independent of the eye colour of the father.

5.4.2 CHI-SQUARE TEST OF GOODNESS OF FIT

The chi-square test of goodness of fit, given by Karl Pearson, is used to test the significance of the discrepancy between theory and experiment. It enables us to find whether the deviation of the experiment from theory is just by chance or is really due to the inadequacy of the theory to fit the observed data.



Test Procedure:

If Oᵢ, i = 1, 2, 3, …, n is a set of observed (experimental) frequencies and Eᵢ is the corresponding set of expected (theoretical) frequencies, then Karl Pearson's chi-square test statistic is given by

χ² = ∑ᵢ₌₁ⁿ (Oᵢ − Eᵢ)²/Eᵢ, which follows a chi-square distribution with (n−1) d.f.


5.4.3 TEST FOR EQUALITY OF POPULATION VARIANCES

Test procedure:

Let two independent random samples of sizes n₁ and n₂ be drawn from two normal populations. We want to test whether the two population variances are equal or not.

Null hypothesis H0: σ₁² = σ₂²

Alternative hypothesis H1: σ₁² ≠ σ₂²

The estimates of σ₁² and σ₂² are given by

S₁² = n₁s₁²/(n₁ − 1),  S₂² = n₂s₂²/(n₂ − 1)

The test statistic is given by

F = S₁²/S₂² (if S₁² > S₂²) or F = S₂²/S₁² (if S₂² > S₁²),

which follows an F-distribution with (n₁ − 1, n₂ − 1) or (n₂ − 1, n₁ − 1) d.f. respectively.

Reject H0 if the calculated value is greater than the tabulated value; otherwise accept it.

5.4.4 - PROBLEM

In a sample of 8 observations, the sum of the squared deviations of items from the mean was 94.5. In another sample of 10 observations, the value was found to be 101.7. Test whether the difference in variances is significant at the 5% level.

Solution:

Null hypothesis H0: σ₁² = σ₂²

Alternative hypothesis H1: σ₁² ≠ σ₂²

We have S₁² = ∑(Xᵢ − X̄)²/(n₁ − 1); S₂² = ∑(Yᵢ − Ȳ)²/(n₂ − 1)

We are given n₁ = 8, ∑(Xᵢ − X̄)² = 94.5; n₂ = 10, ∑(Yᵢ − Ȳ)² = 101.7

S₁² = 94.5/7 = 13.5; S₂² = 101.7/9 = 11.3

Since S₁² > S₂²,

F = S₁²/S₂² = 13.5/11.3 = 1.1947

d.f = (n₁ − 1, n₂ − 1) = (7, 9)

Table value at 5% level for (7, 9) d.f = 3.29.

Since the calculated value is less than the table value, we accept the hypothesis H0. The two population variances may be regarded as equal.
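
A minimal sketch, assuming Python with SciPy, of the same variance-ratio calculation:

# F-test for equality of two population variances (sums of squared deviations given).
from scipy.stats import f

n1, ss1 = 8, 94.5      # sample size and sum of squared deviations, sample 1
n2, ss2 = 10, 101.7    # sample size and sum of squared deviations, sample 2

S1, S2 = ss1 / (n1 - 1), ss2 / (n2 - 1)     # 13.5 and 11.3
F = S1 / S2                                 # S1 > S2 here, so S1 goes in the numerator
crit = f.ppf(0.95, n1 - 1, n2 - 1)          # upper 5% point for (7, 9) d.f., about 3.29
print(round(F, 4), round(crit, 2), F > crit)   # 1.1947, 3.29, False -> accept H0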

Let Us Sum Up

Learners, in this section we have seen the chi-square test of independence of attributes, the chi-square test of goodness of fit, and the test for equality of population variances.

Check Your Progress

1. In a chi-square test of independence what is the null hypothesis?

A. The variables are related.

B. The variables are independent

C. The variables are normally distributed.

D. The sample means are equal.

2. If you have a contingency table with 3 rows and 4 columns, how many degrees

of freedom should you use for the chi-square test of independence?

A. 6

B. 8

C. 12

D.9



5.5 Unit Summary

The fifth unit covers small sample tests with regard to the mean, the difference between means and the paired t-test, the F-test, the definition of the chi-square test, its assumptions and characteristics, and the chi-square tests for goodness of fit and independence of attributes, with simple problems.

5.6 Glossary

KEYWORDS MEANING
t-test Small sample test
z-test Large sample test
x̄ Sample mean
μ̂ Estimate of the population value from the sample
d Difference between two paired observations
∑d Sum of the differences

5.7Self-Assessment Questions

Short Answers: (5 Marks)

1. Explain the test procedure for testing of significance of large samples.

2. Write the test for specified population mean.

3. Write the test for single proportion.

4. Explain the chi-square test of goodness of fit.



Long Answers: (8 Marks)

1. Samples of two types of electric bulbs were tested for length of life and the following data were obtained.

               Type - I     Type - II
Sample no.     8            7
Sample Mean    1234 hours   1036 hours
Sample s.d     36 hours     40 hours

Is the difference in the means sufficient to warrant that Type I is superior to Type II regarding length of life?

2. An I.Q test was conducted to 5 persons before and after they were trained, the

results are given below.

candidates I II III IV V

Before training
I.Q 110 120 123 132 125
After training
120 118 125 136 121

Test whether there is any change in IQ after the training programme at 5% level.

3. A sample of 900 members has a mean of 3.4 cm with S.D. 2.61 cm. Is the sample from a population with mean 3.25 cm?

4. The means of two simple large samples of 1000 and 2000 members are 67.5 and 68 inches respectively. Can the samples be regarded as drawn from the same population with standard deviation 2.5 inches?



5. A fruit marketing federation claims that no more than 4% of the apples supplied are damaged. A random sample of 600 apples contained 36 damaged apples. Using a 5% level of significance, test the claim of the federation.

5.8 EXERCISES

1. Given a sample mean of 83, a sample standard deviation of 12.5 and a sample size

of 22, test the hypothesis that the value of the population mean is 70 against the

alternative that is more than 70. Use the 0.025 significance level.

2. A certain injection administered to each of 12 patients resulted in the following

increases of blood pressure:

5,2,8,-1,3,0,6,-2,1,5,0,4

Can it be concluded that the injection will, in general, be accompanied by an increase in B.P.?

3. The mean lifetime of a sample of 25 bulbs is found as 1550 hours with a S.D of 120

hours. The company manufacturing the bulbs claims that the average life of their bulbs

is 1600 hours. Is the claim acceptable at 5% level of significance?

4. Conduct an F-test on the following samples:

Sample 1: variance = 109.63, sample size = 41.

Sample 2: variance = 65.99, sample size = 21.

5. A sample of coin flips is collected from three different coins. The results are below.

Use one hypothesis test to test the claim that all three coins have the same probability

of landing heads. Use the critical value method with significance level 0.10.



Coin A Coin B Coin C
Heads 88 93 110
Tails 112 107 90

6. To test the claim that snack choices are related to the gender of the customer, a

survey at a ball park shows this selection of snacks purchased. Write the null

hypothesis and check the assumptions. Do not do the rest of the hypothesis test.

Burger Peanuts Popcorn


Male 6 12 9
Female 5 5 8

7. In an experiment in breeding mice, a geneticist has obtained 120 brown mice with

pink eyes, 48 brown mice with brown eyes, 36 white mice with pink eyes and 13 white

mice with brown eyes. Theory predicts that these types of mice should be obtained in

the ratios 9:3:3:1. Test the compatibility of the data with theory, using a 5% critical

value.

8. A small component in an electronic device has two small holes where another tiny

part is fitted. In the manufacturing process the average distance between the two holes

must be tightly controlled at 0.02 mm, else many units would be defective and wasted.

Many times throughout the day quality control engineers take a small sample of the

components from the production line, measure the distance between the two holes,

and make adjustments if needed. Suppose at one time four units are taken and the distances are measured as: 0.021, 0.019, 0.023 and 0.020. Test whether the mean distance differs from 0.02 mm at the 5% level of significance.



5.9 Answers for Check Your Progress

Modules S.No. Answers

Module 1 1. C. Z-test

2. C. T-test

Module 2 1. A. The population distribution must be normal.

2. B. Two-sample t-test.

1. A. 0.0818
Module 3
2. A. Reject the null hypothesis.

1. B. The variables are independent.


Module 4
2. A.6

5.10 References and Suggested Readings

1. Gupta S. P. (2001), Statistical Methods, Sultan Chand & Sons, New Delhi.

2. Gupta. S. C. and Kapoor. V. K. Fundamentals of Applied Statistics, Sultan Chand and

Sons, New Delhi

3. Pillai R. S. N. And Bagavathi. V. (2005), Statistics, S. Chand and Company Ltd., New

Delhi.

4. Sancheti D. C. And Kapoor. V. K (2005), Statistics (7th Edition), Sultan Chand & Sons,

New Delhi.

5. Arora P. N, Comprehensive Statistical Methods, Sultan Chand & Sons, New Delhi.

6. Pillai R. S. N. And Bagavathi. V. (1987), Practical Statistics, S. Chand & Company Ltd.,

New Delhi.

7. Agarwal B. L, Basic Statistics, Wiley Eastern Ltd., Publishers, New Delhi.

8. Gupta C. B (1978), An Introduction to Statistical Methods, Vikas Publishing House, New

Delhi.

