Toh Solved
REGULATION : 2017
Statistical Hypothesis:
In making a statistical decision, we make assumptions about the population, which may be true or
false. Such assumptions are called statistical hypotheses.
Null Hypothesis ($H_0$):
To apply a test of significance, we first set up a hypothesis, which is a statement about the
population parameter. This statement is usually a hypothesis of no true difference between the
sample statistic and the population parameter under consideration, so it is called the null
hypothesis and is denoted by $H_0$.
Alternative Hypothesis ($H_1$):
If the null hypothesis is false, then something else must be true. This is called the
alternative hypothesis and is denoted by $H_1$.
E.g. if $H_0$ is population mean $\mu = 300$, then $H_1$ is $\mu \neq 300$ (i.e. $\mu > 300$ or $\mu < 300$), or
$H_1$ is $\mu > 300$, or $H_1$ is $\mu < 300$. Any of these may be taken as the alternative hypothesis.
Error in sampling:
After applying a test of significance a decision is to be taken to accept or reject the null
hypothesis H 0 .
Type I error: The rejection of the null hypothesis H 0 when it is true is called type I error.
Type II error: The acceptance of the null hypothesis H 0 when it is false is called type II error.
Level of significance:
The probability of type I error is called level of significance of the test and it is denoted by α.
We usually take either α=5% or α=1%.
One-tailed and Two-tailed test:
If $\theta_0$ is a population parameter and $\theta$ is the corresponding sample statistic, and if we set up
the null hypothesis $H_0: \theta = \theta_0$, then the alternative hypothesis, which is complementary to $H_0$, can
be any one of the following:
(i) $H_1: \theta \neq \theta_0$ ($\theta > \theta_0$ or $\theta < \theta_0$) (ii) $H_1: \theta < \theta_0$ (iii) $H_1: \theta > \theta_0$
$H_1$ given in (i) is called a two-tailed alternative hypothesis, whereas $H_1$ given in (ii) is called
a left-tailed test and (iii) is called a right-tailed test.
Critical region:
For a test statistic, the area under the probability curve (which is normal) is divided into two
regions, namely the region of acceptance of $H_0$ and the region of rejection of $H_0$. The region in
which $H_0$ is rejected is called the critical region. The region in which $H_0$ is accepted is called
the acceptance region.
Procedure of Testing of Hypothesis:
(i) State the null hypothesis $H_0$.
(ii) Decide the alternative hypothesis $H_1$ (i.e. one-tailed or two-tailed).
(iii) Choose the level of significance α (α = 5% or α = 1%).
(iv) Determine a suitable test statistic:
$$\text{Test statistic } z = \frac{t - E(t)}{\text{S.E. of } (t)}$$
(v) Compare the computed value of z with the table value of z and decide the acceptance or the
rejection of $H_0$.
For a single-tailed test (right tail or left tail) we compare the computed value of $|z|$ with 1.645 (at
5% level) and 2.33 (at 1% level) and accept or reject $H_0$ accordingly.
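The table values quoted in step (v) can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function name `z_critical` is an illustrative assumption, not part of the notes.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = False) -> float:
    """Upper critical value of the standard normal for significance level alpha.

    For a two-tailed test, alpha is split equally between the two tails.
    """
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


one_tail_5 = z_critical(0.05)                    # close to the table value 1.645
one_tail_1 = z_critical(0.01)                    # close to the table value 2.33
two_tail_5 = z_critical(0.05, two_tailed=True)   # close to the table value 1.96
two_tail_1 = z_critical(0.01, two_tailed=True)   # close to the table value 2.58
```

The computed values differ from the printed table only in rounding.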
Test of significance of small sample:
When the size of the sample (n) is less than 30, the sample is called a small sample.
The following are some important tests for small samples:
(I) Student's t-test
(II) F-test
(III) $\chi^2$-test
I Student's t-test
(i) Test of significance of the difference between sample mean and population mean
(ii) Test of significance of the difference between means of two small samples
(i) Test of significance of the difference between sample mean and population mean:
Student's 't' is defined by the statistic
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$$
where $\bar{x}$ = sample mean, $\mu$ = population mean, $S$ = standard deviation of sample,
$n$ = sample size.
Note:
If the standard deviation of the sample is not given directly, the statistic is given by $t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$, where
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
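Confidence Interval:
The confidence interval for the population mean for a small sample is $\bar{x} \pm t_\alpha \frac{s}{\sqrt{n}}$, i.e.
$$\left( \bar{x} - t_\alpha \frac{s}{\sqrt{n}},\ \bar{x} + t_\alpha \frac{s}{\sqrt{n}} \right)$$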
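Working Rule:
(i) Let $H_0: \bar{x} = \mu$ (there is no significant difference between sample mean and population
mean)
$H_1: \bar{x} \neq \mu$ (there is a significant difference between sample mean and population
mean) (two-tailed test)
Find $t = \frac{\bar{x} - \mu}{S/\sqrt{n-1}}$, where $S$ here is the sample standard deviation computed with divisor $n$;
if $S$ is computed with divisor $n - 1$, use $S/\sqrt{n}$ in the denominator instead.
Let $t_\alpha$ be the table value of t with $v = n - 1$ degrees of freedom at α% level of significance.
Conclusion:
If $|t| < t_\alpha$, $H_0$ is accepted at α% level of significance.
If $|t| > t_\alpha$, $H_0$ is rejected at α% level of significance.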
Problem:
1. The mean lifetime of a sample of 25 bulbs is found to be 1550 h, with a standard deviation of
120 h. The company manufacturing the bulbs claims that the average life of their bulbs is
1600 h. Is the claim acceptable at 5% level of significance?
Solution:
Given sample size $n = 25$, mean $\bar{x} = 1550$, S.D. $S = 120$, population mean $\mu = 1600$
Let $H_0: \mu = 1600$ (the claim is acceptable)
$H_1: \mu \neq 1600$ (two-tailed test)
Under $H_0$, the test statistic is
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}} = \frac{1550 - 1600}{120/\sqrt{25}} = -2.0833, \qquad |t| = 2.0833$$
From the table, for $v = 24$, $t_{0.05} = 2.064$. Since $|t| > t_{0.05}$,
$H_0$ is rejected.
Conclusion: The claim is not acceptable.
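The arithmetic of Problem 1 can be checked with a short script. This is a minimal sketch; the function name `one_sample_t` is an illustrative assumption, and the table value 2.064 is taken from the solution above rather than computed.

```python
from math import sqrt


def one_sample_t(sample_mean: float, pop_mean: float, s: float, n: int) -> float:
    """Student's t statistic: (x_bar - mu) / (S / sqrt(n))."""
    return (sample_mean - pop_mean) / (s / sqrt(n))


# Bulb data from Problem 1: n = 25, x_bar = 1550, S = 120, mu = 1600
t_value = one_sample_t(1550, 1600, 120, 25)

# Table value for v = 24 at the 5% level (two-tailed), as quoted in the notes
t_table = 2.064
reject_h0 = abs(t_value) > t_table
```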
2. Tests made on the breaking strength of 10 pieces of a metal wire gave the following results:
578, 572, 570, 568, 572, 570, 570, 572, 596 and 584 kg. Test if the mean breaking strength of the
wire can be assumed as 577 kg.
Solution:
Let us first compute the sample mean $\bar{x}$ and sample standard deviation $S$ and then test if $\bar{x}$ differs
significantly from the population mean $\mu = 577$.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{5752}{10} = 575.2$$
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{681.6}{9} = 75.733, \qquad S = \sqrt{75.733} = 8.7025$$
Let $H_0: \bar{x} = \mu$,
$H_1: \bar{x} \neq \mu$ (two-tailed test)
Under $H_0$, the test statistic is
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}} = \frac{575.2 - 577}{\sqrt{75.733}/\sqrt{10}} = \frac{-1.8}{2.752} = -0.654, \qquad |t| = 0.654$$
Tabulated value of t for $v = 9$ degrees of freedom: $t_{0.05} = 2.262$
Since $|t| < t_{0.05}$, $H_0$ is accepted.
Conclusion:
The mean breaking strength of the wire can be assumed as 577 kg at 5% level of significance.
3. A random sample of 10 boys had the following I.Q.s: 70, 120, 110, 101, 88, 83, 95,
98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100? Find a
reasonable range in which most of the mean I.Q. values of samples of 10 boys lie.
Solution:
Given $\mu = 100$, $n = 10$
Null Hypothesis:
$H_0: \mu = 100$, i.e. the data are consistent with the assumption of a mean I.Q. of 100 in the population.
Alternative Hypothesis:
$H_1: \mu \neq 100$, i.e. the data are not consistent with the assumption of a mean I.Q. of 100 in the population.
$$\bar{x} = \frac{\sum x}{n} = \frac{70 + 120 + 110 + 101 + 88 + 83 + 95 + 98 + 107 + 100}{10} = \frac{972}{10} = 97.2$$
$$S^2 = \frac{1}{10 - 1}\left[ (70 - 97.2)^2 + (120 - 97.2)^2 + (110 - 97.2)^2 + (101 - 97.2)^2 + (88 - 97.2)^2 + (83 - 97.2)^2 + (95 - 97.2)^2 + (98 - 97.2)^2 + (107 - 97.2)^2 + (100 - 97.2)^2 \right]$$
$$S^2 = \frac{1}{9}(1833.6) = 203.73, \qquad S = 14.2734$$
$$t = \frac{97.2 - 100}{14.2734/\sqrt{10}} = \frac{-2.8}{4.5136} = -0.6203$$
Table value: $t_{\alpha, n-1} = t_{5\%, 10-1} = t_{0.05, 9} = 2.262$ (two-tailed test)
Conclusion:
Here $|t| < t_\alpha$,
i.e. the table value > calculated value, so
we accept the null hypothesis and conclude that the data are consistent with the assumption of
mean I.Q. of 100 in the population.
To find the confidence limits:
$$\bar{x} \pm t_\alpha \frac{S}{\sqrt{n}} = 97.2 \pm 2.262 \cdot \frac{14.2734}{\sqrt{10}} = 97.2 \pm (2.262)(4.514) = (86.99, 107.41)$$
A reasonable range in which most of the mean I.Q. values of samples of 10 boys lie is
(86.99, 107.41).
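The confidence limits of Problem 3 can be reproduced from the raw data. This is a minimal sketch using the standard library; the variable names are illustrative, and the table value 2.262 is taken from the solution above.

```python
from math import sqrt
from statistics import mean, stdev

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]

n = len(iq)
x_bar = mean(iq)      # sample mean, 97.2
s = stdev(iq)         # sample S.D. with divisor n - 1, about 14.27

t_table = 2.262       # t(0.05, 9), two-tailed, as quoted in the notes
half_width = t_table * s / sqrt(n)

lower, upper = x_bar - half_width, x_bar + half_width
t_value = (x_bar - 100) / (s / sqrt(n))   # test statistic against mu = 100
```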
4. A random sample of 16 values from a normal population showed a mean of 41.5 inches and
the sum of squares of deviations from this mean equal to 135 square inches. Show that the
assumption of a mean of 43.5 inches for the population is not reasonable. Obtain 95 percent
and 99 percent confidence limits for the same.
Solution:
Given $\bar{x} = 41.5$, $\mu = 43.5$, $n = 16$
Sum of squares of deviations from the mean: $\sum (x - \bar{x})^2 = 135$
The parameter of interest is $\mu$.
Null Hypothesis $H_0$: $\mu = 43.5$, i.e. the assumption of a mean of 43.5 inches for the population is
reasonable.
Alternative Hypothesis $H_1$: $\mu \neq 43.5$, i.e. the assumption of a mean of 43.5 inches for the
population is not reasonable.
Level of significance: (i) α = 5% = 0.05, degrees of freedom = 16 - 1 = 15
(ii) α = 1% = 0.01, degrees of freedom = 16 - 1 = 15
Test Statistic: $t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$
where $S^2 = \frac{1}{n - 1}\sum (x - \bar{x})^2 = \frac{1}{16 - 1}(135) = 9$, so $S = 3$
$$t = \frac{41.5 - 43.5}{3/\sqrt{16}} = -\frac{8}{3} = -2.667, \qquad |t| = 2.667$$
Conclusion:
(i) Since $|t| = 2.667 > 2.131$, we reject $H_0$ at 5% level of significance.
So we conclude that the assumption of a mean of 43.5 inches for the population is not
reasonable.
(ii) Since $|t| = 2.667 < 2.947$, we accept $H_0$ at 1% level of significance.
So we conclude that the assumption is reasonable.
95% confidence limits:
$$\bar{x} \pm t_\alpha \frac{S}{\sqrt{n}} = 41.5 \pm 2.131 \cdot \frac{3}{4} = 41.5 \pm 1.5983 = (39.90, 43.10)$$
$39.902 < \mu < 43.098$
99% confidence limits:
$$\bar{x} \pm t_\alpha \frac{S}{\sqrt{n}} = 41.5 \pm 2.947 \cdot \frac{3}{4} = 41.5 \pm 2.2101 = (39.29, 43.71)$$
$39.29 < \mu < 43.71$
5. Ten oil tins are taken at random from an automatic filling machine. The mean weight of the
tins is 15.8 kg and the standard deviation is 0.5 kg. Does the sample mean differ significantly
from the intended weight of 16 kg?
Solution:
Given $\bar{x} = 15.8$, $\mu = 16$, $s = 0.50$, $n = 10$
Null Hypothesis $H_0$: $\mu = 16$, i.e. the sample mean weight is not different from the intended weight.
Alternative Hypothesis $H_1$: $\mu \neq 16$, i.e. the sample mean weight is different from the
intended weight.
Level of significance: α = 5% = 0.05, degrees of freedom = 10 - 1 = 9
Test Statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
$$t = \frac{15.8 - 16}{0.50/\sqrt{10}} = \frac{-0.2}{0.1581} = -1.27, \qquad |t| = 1.27$$
Critical value: The critical value of t at 5% level of significance with 9 degrees of freedom is
2.26.
Conclusion:
Here calculated value < table value,
so we accept $H_0$ at 5% level of significance.
Hence the sample mean weight is not different from the intended weight.
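(ii) Test of significance of the difference between means of two small samples:
To test the significance of the difference between the means $\bar{x}_1$ and $\bar{x}_2$ of samples of sizes $n_1$
and $n_2$:
Under $H_0$, the test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} \quad \text{or} \quad S^2 = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2} \ \text{(if } s_1, s_2 \text{ are not given directly)}$$
Degrees of freedom (df) $v = n_1 + n_2 - 2$
Note:
If $n_1 = n_2 = n$ and if the pairs of values $x_1$ and $x_2$ are associated in some way (or correlated),
the t-test for paired values is applied to the differences $d = x_1 - x_2$, using $\bar{d}$ and $\sum (d - \bar{d})^2$.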
2. Two independent samples of sizes 8 and 7 contained the following values:
Sample I    19  17  15  21  16  18  16  14
Sample II   15  14  15  19  15  18  16
Is the difference between the sample means significant?
Solution:
$x_1$     $x_1 - \bar{x}_1$   $(x_1 - \bar{x}_1)^2$   $x_2$     $x_2 - \bar{x}_2$   $(x_2 - \bar{x}_2)^2$
19      2       4       15      -1      1
17      0       0       14      -2      4
15     -2       4       15      -1      1
21      4      16       19       3      9
16     -1       1       15      -1      1
18      1       1       18       2      4
16     -1       1       16       0      0
14     -3       9
136     0      36       112      0      20
$$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{136}{8} = 17, \qquad \bar{x}_2 = \frac{\sum x_2}{n_2} = \frac{112}{7} = 16$$
$$S^2 = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2} = \frac{36 + 20}{8 + 7 - 2} = 4.3076, \qquad S = 2.0754$$
Let $H_0: \bar{x}_1 = \bar{x}_2$,
$H_1: \bar{x}_1 \neq \bar{x}_2$ (two-tailed test)
Under $H_0$, the test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{17 - 16}{2.0754\sqrt{\frac{1}{8} + \frac{1}{7}}} = 0.9309, \qquad |t| = 0.9309$$
Degrees of freedom $v = n_1 + n_2 - 2 = 13$
From the 't' table, for $v = 13$ degrees of freedom at 5% level of significance, $t_{0.05} = 2.16$
Since $|t| < t_{0.05}$, $H_0$ is accepted.
Conclusion:
The two sample means do not differ significantly at 5% level of significance.
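The pooled two-sample calculation above can be sketched in a few lines. The function name `pooled_t` is an illustrative assumption; it follows the second form of $S^2$ given in the notes (sums of squared deviations over $n_1 + n_2 - 2$).

```python
from math import sqrt
from statistics import mean


def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent small samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    # Sums of squared deviations about each sample mean
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    # Pooled standard deviation S
    s = sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / (s * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


sample_1 = [19, 17, 15, 21, 16, 18, 16, 14]
sample_2 = [15, 14, 15, 19, 15, 18, 16]
t_value, df = pooled_t(sample_1, sample_2)
```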
3. The following data represent the biological values of protein from cow's milk and buffalo's
milk:
Cow's milk       1.82  2.02  1.88  1.61  1.81  1.54
Buffalo's milk   2.00  1.83  1.86  2.03  2.19  1.88
Examine whether the average values of protein in the two samples significantly differ at
5% level.
Solution:
Given $n_1 = n_2 = 6$
$H_0: \mu_1 = \mu_2$. There is no significant difference between the means of the two samples.
$H_1: \mu_1 \neq \mu_2$. There is a significant difference between the means of the two samples.
Test Statistic: $t = \frac{\bar{x} - \bar{y}}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
With $\bar{x} = 1.78$ and $\bar{y} = 1.965$:
$x$       $y$       $x - \bar{x}$   $(x - \bar{x})^2$   $y - \bar{y}$   $(y - \bar{y})^2$
1.82    2.00     0.04    0.0016     0.035    0.00123
2.02    1.83     0.24    0.0576    -0.135    0.01823
1.88    1.86     0.10    0.0100    -0.105    0.01102
1.61    2.03    -0.17    0.0289     0.065    0.00425
1.81    2.19     0.03    0.0009     0.225    0.05060
1.54    1.88    -0.24    0.0576    -0.085    0.00723
Total
10.68   11.79            0.1566              0.09256
$$\bar{x} = \frac{\sum x}{n_1} = \frac{10.68}{6} = 1.78; \qquad \bar{y} = \frac{\sum y}{n_2} = \frac{11.79}{6} = 1.965$$
$$S^2 = \frac{1}{6 + 6 - 2}(0.1566 + 0.09256) = (0.1)(0.2492) = 0.0249, \qquad S = 0.1578$$
$$t = \frac{1.78 - 1.965}{(0.1578)\sqrt{\frac{1}{6} + \frac{1}{6}}} = \frac{-0.185}{(0.1578)(0.5774)} = \frac{-0.185}{0.0911} = -2.03$$
Critical value: The critical value of t at 5% level of significance with 10 degrees of freedom is
2.228.
Here calculated value $|t| = 2.03$ < table value, so we accept $H_0$,
i.e. the difference between the mean protein values of the two varieties of milk is not
significant at 5% level.
4. The following data relate to the marks obtained by 11 students in 2 tests, one held at the
beginning of a year and the other at the end of the year after intensive coaching.
Test 1   19  23  16  24  17  18  20  18  21  19  20
Test 2   17  24  20  24  20  22  20  20  18  22  19
Do the data indicate that the students have benefited by coaching?
Solution:
The given data relate to the marks obtained in 2 tests by the same set of students. Hence the
marks in the 2 tests can be regarded as correlated.
We use the t-test for paired values.
Let $H_0: \bar{x}_1 = \bar{x}_2$,
$H_1: \bar{x}_1 < \bar{x}_2$ (one-tailed test)
$x_1$    $x_2$    $d = x_1 - x_2$    $d^2$    $d - \bar{d}$    $(d - \bar{d})^2$
19     17      2      4      3      9
23     24     -1      1      0      0
16     20     -4     16     -3      9
24     24      0      0      1      1
17     20     -3      9     -2      4
18     22     -4     16     -3      9
20     20      0      0      1      1
18     20     -2      4     -1      1
21     18      3      9      4     16
19     22     -3      9     -2      4
20     19      1      1      2      4
Total        -11                   58
$$\bar{d} = \frac{\sum d}{n} = \frac{-11}{11} = -1, \qquad S^2 = \frac{\sum (d - \bar{d})^2}{n} = \frac{58}{11} = 5.272$$
The test statistic is
$$t = \frac{\bar{d}}{S/\sqrt{n - 1}} = \frac{-1}{\sqrt{5.272}/\sqrt{10}} = -1.377, \qquad |t| = 1.377$$
From the table, for $v = n - 1 = 10$ (d.f.), $t_{0.05} = 1.812$ (one-tailed)
Since $|t| < t_{0.05}$, $H_0$ is accepted.
Conclusion:
The students have not benefited by coaching.
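The paired calculation in Problem 4 can be sketched as follows. The function name `paired_t` is an illustrative assumption; it mirrors the form used in this solution, where $S^2$ has divisor $n$ and the statistic uses $\sqrt{n - 1}$.

```python
from math import sqrt


def paired_t(x1, x2):
    """Paired t statistic on d = x1 - x2, with S^2 = sum((d - d_bar)^2) / n."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = sum(d) / n
    s2 = sum((di - d_bar) ** 2 for di in d) / n   # divisor n, as in the notes
    return d_bar / (sqrt(s2) / sqrt(n - 1))


test_1 = [19, 23, 16, 24, 17, 18, 20, 18, 21, 19, 20]
test_2 = [17, 24, 20, 24, 20, 22, 20, 20, 18, 22, 19]
t_value = paired_t(test_1, test_2)
```

Using divisor $n - 1$ for $S^2$ together with $\sqrt{n}$ in the statistic gives the same value, which is why the two conventions in these notes agree.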
5. Ten persons were appointed in the officer cadre in an office. Their performance was noted
by giving a test and the marks were recorded out of 100.
Employee          A   B   C   D   E   F   G   H   I   J
Before training   80  76  92  60  70  56  74  56  70  56
After training    84  70  96  80  70  52  84  72  72  50
By applying the t-test, can it be concluded that the employees have benefited from the
training?
Solution:
Null Hypothesis $H_0$: $\mu_1 = \mu_2$, i.e. the employees have not benefited from the training.
Alternative Hypothesis $H_1$: $\mu_1 < \mu_2$, i.e. the employees have benefited from the training.
Level of significance: α = 5% = 0.05 (one-tailed test)
Test Statistic: $t = \frac{\bar{d}}{S/\sqrt{n}}$
where $S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2$ and $\bar{d} = \frac{\sum d}{n}$
Employee   Before   After   $d$     $(d - \bar{d})^2$
A          80       84      -4     0
B          76       70       6     100
C          92       96      -4     0
D          60       80     -20     256
E          70       70       0     16
F          56       52       4     64
G          74       84     -10     36
H          56       72     -16     144
I          70       72      -2     4
J          56       50       6     100
Total                      -40     720
$$\bar{d} = \frac{\sum d}{n} = \frac{-40}{10} = -4$$
$$S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2 = \frac{1}{9}(720) = 80, \qquad S = 8.94$$
$$t = \frac{\bar{d}}{S/\sqrt{n}} = \frac{-4}{8.94/\sqrt{10}} = -1.414, \qquad |t| = 1.414$$
Critical value: The critical value of t at 5% level of significance with 9 degrees of freedom is
1.83.
Conclusion:
Here calculated value < table value,
so we accept $H_0$.
Hence the employees have not benefited from the training.
6. The weight gains in pounds under two systems of feeding of calves of 10 pairs of identical
twins are given below.
Twin pair                       1   2   3   4   5   6   7   8   9   10
Weight gains under System A     43  39  39  42  46  43  38  44  51  43
Weight gains under System B     37  35  34  41  39  37  37  40  48  36
Discuss whether the difference between the two systems of feeding is significant.
Solution:
Null Hypothesis $H_0$: $\mu_1 = \mu_2$, i.e. there is no significant difference between the two systems of
feeding.
Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$, i.e. there is a significant difference between the two systems
of feeding.
Level of significance: α = 5% = 0.05 (two-tailed test)
Test Statistic: $t = \frac{\bar{d}}{S/\sqrt{n}}$
where $S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2$ and $\bar{d} = \frac{\sum d}{n}$
Twin pair   System A ($x$)   System B ($y$)   $d = x - y$   $(d - \bar{d})^2$
1           43             37             6           2.56
2           39             35             4           0.16
3           39             34             5           0.36
4           42             41             1           11.56
5           46             39             7           6.76
6           43             37             6           2.56
7           38             37             1           11.56
8           44             40             4           0.16
9           51             48             3           1.96
10          43             36             7           6.76
Total                                     44          44.4
$$\bar{d} = \frac{\sum d}{n} = \frac{44}{10} = 4.4$$
$$S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2 = \frac{1}{9}(44.4) = 4.93, \qquad S = 2.22$$
$$t = \frac{\bar{d}}{S/\sqrt{n}} = \frac{4.4}{2.22/\sqrt{10}} = 6.26$$
Critical value: The critical value of t at 5% level of significance with 9 degrees of freedom is
2.262.
Conclusion:
Here calculated value > table value,
so we reject $H_0$.
Hence there is a significant difference between the two systems of feeding.
II F-test
The F-test is used
(i) to test whether there is any significant difference between two estimates of population
variance;
(ii) to test whether two samples have come from the same population.
The test statistic is given by $F = \frac{S_1^2}{S_2^2}$, if $S_1^2 > S_2^2$,
where $S_1^2 = \frac{n_1 s_1^2}{n_1 - 1}$ [$n_1$ is the first sample size] and $S_2^2 = \frac{n_2 s_2^2}{n_2 - 1}$ [$n_2$ is the second sample size].
The degrees of freedom are $(v_1, v_2) = (n_1 - 1, n_2 - 1)$.
Note:
1. If $S_1^2 < S_2^2$ then $F = \frac{S_2^2}{S_1^2}$ (always F > 1).
2. To test whether two independent samples have been drawn from the same normal population,
we have to test
i) equality of population means using the t-test or z-test, according to sample size;
ii) equality of population variances using the F-test.
Problem:
1. A sample of size 13 gave an estimated population variance of 3.0, while another sample of
size 15 gave an estimate of 2.5. Could both samples be from populations with the same
variance?
Solution:
Given $n_1 = 13$, $n_2 = 15$, $S_1^2 = 3.0$, $S_2^2 = 2.5$
Let $H_0: \sigma_1^2 = \sigma_2^2$ (the two samples have been drawn from populations with the same variance)
$H_1: \sigma_1^2 \neq \sigma_2^2$
The test statistic is $F = \frac{S_1^2}{S_2^2} = \frac{3}{2.5} = 1.2$
From the table, with degrees of freedom $v = (n_1 - 1, n_2 - 1) = (12, 14)$,
$F_{0.05} = 2.53$. Since $F < F_{0.05}$, $H_0$ is accepted.
Conclusion:
The two samples could have come from two normal populations with the same variance.
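The rule "larger variance estimate in the numerator" can be sketched as below. The function name `f_statistic` and its argument order are illustrative assumptions; the table value 2.53 is taken from the solution above.

```python
def f_statistic(var1: float, n1: int, var2: float, n2: int):
    """Return (F, (df_numerator, df_denominator)) with the larger estimate on top.

    var1 and var2 are the population-variance estimates S1^2 and S2^2.
    """
    if var1 >= var2:
        return var1 / var2, (n1 - 1, n2 - 1)
    return var2 / var1, (n2 - 1, n1 - 1)


# Problem 1: n1 = 13, S1^2 = 3.0; n2 = 15, S2^2 = 2.5
f_value, dof = f_statistic(3.0, 13, 2.5, 15)
accept_h0 = f_value < 2.53   # F(0.05; 12, 14) from the table
```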
2. Two samples of sizes 9 and 8 give the sums of squares of deviations from their respective
means equal to 160 and 91 respectively. Could both samples be from populations with the
same variance?
Solution:
Given $n_1 = 9$, $n_2 = 8$, $\sum (x - \bar{x})^2 = 160$, $\sum (y - \bar{y})^2 = 91$
$$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{160}{8} = 20, \qquad S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{91}{7} = 13$$
Let $H_0: \sigma_1^2 = \sigma_2^2$ (the two normal populations have the same variance)
$H_1: \sigma_1^2 \neq \sigma_2^2$
The test statistic is $F = \frac{S_1^2}{S_2^2} = \frac{20}{13} = 1.538$
From the table, with degrees of freedom $v = (n_1 - 1, n_2 - 1) = (8, 7)$,
$F_{0.05} = 3.73$. Since $F < F_{0.05}$, $H_0$ is accepted.
Conclusion:
The two samples could have come from two populations with the same variance.
3. Two random samples gave the following data:
Sample   Size   Mean   Variance
I        8      9.6    1.2
II       11     16.5   2.5
Can we conclude that the two samples have been drawn from the same normal
population?
Solution:
To conclude that the two samples have been drawn from the same normal population, we have to check
(i) that the variances of the populations do not differ significantly, by the F-test;
(ii) that the sample means do not differ significantly, by the t-test.
(i) F-test:
Given $n_1 = 8$, $n_2 = 11$, $s_1^2 = 1.2$, $s_2^2 = 2.5$, $\bar{x}_1 = 9.6$, $\bar{x}_2 = 16.5$
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{8(1.2)}{7} = 1.37, \qquad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{11(2.5)}{10} = 2.75$$
Let $H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
The test statistic is $F = \frac{S_2^2}{S_1^2}$ (since $S_1^2 < S_2^2$) $= \frac{2.75}{1.37} = 2.007$
From the table, $F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(10, 7) = 3.63$
Since $F < F_{0.05}$, $H_0$ is accepted.
(ii) t-test (equality of means):
Let $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Under $H_0$, the test statistic is $t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$,
where $S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{8(1.2) + 11(2.5)}{8 + 11 - 2}} = 1.4772$
$$t = \frac{9.6 - 16.5}{1.4772\sqrt{\frac{1}{8} + \frac{1}{11}}} = -10.0525, \qquad |t| = 10.0525$$
From the table, with degrees of freedom $n_1 + n_2 - 2 = 17$, $t_{0.05} = 2.110$
Since $|t| > t_{0.05}$, $H_0$ is rejected, i.e. $\mu_1 \neq \mu_2$.
Conclusion:
The two samples could not have been drawn from the same normal population.
4. Two independent samples of 5 and 6 items respectively had the following values of the
variable:
Sample 1:  21  24  25  26  27
Sample 2:  22  27  28  30  31  36
Can you say that the two samples came from the same population?
Solution:
Let $H_0: \sigma_1^2 = \sigma_2^2$ and $\mu_1 = \mu_2$ (the two samples have been drawn from the same population)
$H_1: \sigma_1^2 \neq \sigma_2^2$ and $\mu_1 \neq \mu_2$
(i) F-test (equality of variance):
$x_1$     $x_1 - \bar{x}_1$   $(x_1 - \bar{x}_1)^2$   $x_2$     $x_2 - \bar{x}_2$   $(x_2 - \bar{x}_2)^2$
21     -3.6    12.96     22     -7     49
24     -0.6     0.36     27     -2      4
25      0.4     0.16     28     -1      1
26      1.4     1.96     30      1      1
27      2.4     5.76     31      2      4
                         36      7     49
123            21.2      174           108
$$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{123}{5} = 24.6, \qquad \bar{x}_2 = \frac{\sum x_2}{n_2} = \frac{174}{6} = 29$$
$$s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{21.2}{4} = 5.3, \qquad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{108}{5} = 21.6$$
$$t = \frac{15 - 14}{3.146\sqrt{\frac{1}{10} + \frac{1}{12}}} = 0.742$$
Critical value: The critical value of t at 5% level of significance with degrees of freedom
$n_1 + n_2 - 2 = 10 + 12 - 2 = 20$ is 2.086.
Conclusion: calculated value < table value, so
$H_0$ is accepted.
(ii) F-test to test equality of population variances:
Null Hypothesis $H_0$: $\sigma_1^2 = \sigma_2^2$. The population variances are equal.
Alternative Hypothesis $H_1$: $\sigma_1^2 \neq \sigma_2^2$. The population variances are not equal.
Level of significance: α = 5%
Test Statistic: $F = \frac{S_1^2}{S_2^2}$
where $S_1^2 = \frac{1}{n_1 - 1}\sum (x - \bar{x})^2 = \frac{1}{10 - 1}(90) = 10$
and $S_2^2 = \frac{1}{n_2 - 1}\sum (y - \bar{y})^2 = \frac{1}{12 - 1}(108) = 9.818$
Here $S_1^2 > S_2^2$, so $F = \frac{S_1^2}{S_2^2} = \frac{10}{9.818} = 1.02$
Critical value: The critical value of F at 5% level of significance with degrees of freedom
$(n_1 - 1, n_2 - 1) = (9, 11)$ is 2.90.
Here calculated value < table value, so we accept $H_0$.
Conclusion: Both null hypotheses $\mu_1 = \mu_2$ and $\sigma_1^2 = \sigma_2^2$ are accepted.
Hence we may conclude that the two samples are drawn from the same normal population.
III $\chi^2$-test:
(i) $\chi^2$-test for a specified population variance.
(ii) The $\chi^2$-test is used to test whether differences between observed and expected frequencies are
significant (goodness of fit).
(iii) The $\chi^2$-test is used to test the independence of attributes.
$\chi^2$-test for a specified population variance:
The test statistic is $\chi^2 = \frac{n s^2}{\sigma^2}$,
which follows the $\chi^2$-distribution with (n - 1) degrees of freedom.
Problem:
1. The lapping process used to grind certain silicon wafers to the proper thickness is
acceptable only if $\sigma$, the population S.D. of the thickness of dice cut from the wafers, is at
most 0.5 mil. Use the 0.05 level of significance to test the null hypothesis $\sigma = 0.5$ against the
alternative hypothesis $\sigma > 0.5$, if the thickness of 15 dice cut from such wafers has an S.D. of
0.64 mil.
Solution:
Given $n = 15$, $s = 0.64$, $\sigma = 0.5$
$H_0: \sigma = 0.5$, $H_1: \sigma > 0.5$
Under $H_0$, the test statistic is $\chi^2 = \frac{n s^2}{\sigma^2} = \frac{15 \times (0.64)^2}{(0.5)^2} = 24.576$
From the $\chi^2$ table, with degrees of freedom = 14, $\chi^2_{0.05} = 23.685$
Since $\chi^2 > \chi^2_{0.05}$,
$H_0$ is rejected. Hence $\sigma > 0.5$.
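As a quick check of the wafer example, the statistic $\chi^2 = n s^2 / \sigma^2$ can be evaluated directly. This is a minimal sketch; the function name `chi_square_variance` is an illustrative assumption, and the critical value is the standard table value for 14 degrees of freedom at the 0.05 level.

```python
def chi_square_variance(n: int, s: float, sigma0: float) -> float:
    """Chi-square statistic n * s^2 / sigma0^2 for a specified population S.D."""
    return n * s ** 2 / sigma0 ** 2


chi2_value = chi_square_variance(15, 0.64, 0.5)

chi2_table = 23.685          # chi-square(0.05) with 14 degrees of freedom
reject_h0 = chi2_value > chi2_table
```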
The $\chi^2$-test is used to test whether differences between observed and expected frequencies are
significant (goodness of fit):
The test statistic is $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$,
where $O_i$ is the observed frequency and $E_i$ is the expected frequency.
If the data are given in a series of n numbers, then the degrees of freedom = n - 1.
Note: In the case of the binomial distribution d.f. = n - 1, Poisson distribution d.f. = n - 2, normal
distribution d.f. = n - 3.
Problem:
1. The following data give the number of aircraft accidents that occurred during the various
days of a week:
Days:               Mon  Tue  Wed  Thu  Fri  Sat
No. of accidents:   15   19   13   12   16   15
Test whether the accidents are uniformly distributed over the week.
Solution:
The expected number of accidents on any day $= \frac{90}{6} = 15$
Let $H_0$: Accidents occur uniformly over the week
$H_1$: Accidents do not occur uniformly over the week
Days   Observed frequency ($O_i$)   Expected frequency ($E_i$)   $O_i - E_i$   $(O_i - E_i)^2 / E_i$
Mon    15     15      0     0
Tue    19     15      4     1.067
Wed    13     15     -2     0.267
Thu    12     15     -3     0.600
Fri    16     15      1     0.067
Sat    15     15      0     0
Total  90                   2.000
Now, $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} = \frac{30}{15} = 2.0$
Here 6 observations are given, so degrees of freedom = n - 1 = 6 - 1 = 5
From the $\chi^2$ table, with 5 degrees of freedom, $\chi^2_{0.05} = 11.07$
Since $\chi^2 < \chi^2_{0.05}$,
$H_0$ is accepted.
Conclusion: Accidents occur uniformly over the week.
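The goodness-of-fit sum for the accident data can be sketched as follows; the function name `chi_square_gof` is an illustrative assumption, and the critical value 11.07 is the standard table value for 5 degrees of freedom at the 5% level.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [15, 19, 13, 12, 16, 15]                   # Mon .. Sat
# Under H0 (uniform), every day has the same expected count
expected = [sum(observed) / len(observed)] * len(observed)

chi2_value = chi_square_gof(observed, expected)
df = len(observed) - 1
accept_h0 = chi2_value < 11.07
```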
2. A survey of 320 families with 5 children each revealed the following distribution:
No. of boys:       5    4    3    2    1    0
No. of girls:      0    1    2    3    4    5
No. of families:   14   56   110  88   40   12
Is the result consistent with the hypothesis that male and female births are equally
probable?
Solution:
Let $H_0$: Male and female births are equally probable
$H_1$: Male and female births are not equally probable
Probability of male birth $p = \frac{1}{2}$, probability of female birth $q = \frac{1}{2}$
The probability of x male births in a family of 5 is $p(x) = {}^5C_x\, p^x q^{5-x}$, $x = 0, 1, 2, \ldots, 5$
Expected number of families with x male births
$$= 320 \times {}^5C_x\, p^x q^{5-x} = 320 \times {}^5C_x \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x} = 320 \times {}^5C_x \left(\frac{1}{2}\right)^5 = 10 \times {}^5C_x, \qquad x = 0, 1, 2, \ldots, 5$$
The $\chi^2$ is calculated using the following table:
No. of boys   Observed frequency ($O_i$)   Expected frequency ($E_i$)   $O_i - E_i$   $(O_i - E_i)^2 / E_i$
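x:      0     1    2    3   4   5   6
f(x):   275   72   30   7   5   2   1
Solution:
Mean of the given distribution $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{189}{392} = 0.482$
To fit a Poisson distribution to the given data:
We take the parameter of the Poisson distribution equal to the mean of the given distribution:
$\lambda = \bar{x} = 0.482$
The Poisson distribution is given by $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$; $x = 0, 1, 2, \ldots$
and the expected frequencies are obtained by
$$f(x) = \sum f_i \times \frac{e^{-\lambda} \lambda^x}{x!} = 392 \times \frac{e^{-0.482} (0.482)^x}{x!}$$
We get $f(0) = 392 \times \frac{e^{-0.482} (0.482)^0}{0!} = 242.1$, $f(1) = 392 \times \frac{e^{-0.482} (0.482)^1}{1!} = 116.69$,
$f(2) = 28.12$, $f(3) = 4.518$, $f(4) = 0.544$, $f(5) = 0.052 \approx 0.1$, $f(6) = 0.004 \approx 0$
x:                    0       1        2       3       4       5       6       Total
Expected frequency:   242.1   116.69   28.12   4.518   0.544   0.052   0.004   392
The last four classes are pooled because their expected frequencies are small:
x       Observed frequency ($O_i$)   Expected frequency ($E_i$)   $(O_i - E_i)^2 / E_i$
0       275      242.1      4.471
1       72       116.7      17.122
2       30       28.1       0.128
3       7        4.5
4       5        0.5        (pooled: $O = 15$, $E = 5.1$)   19.218
5       2        0.1
6       1        0
Total   392      392        40.939
$\chi^2 = 40.939$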
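The Poisson fit above can be reproduced with a short script. This is a minimal sketch; the variable names are illustrative, and the mean is rounded to three decimals to match the hand calculation (an assumption made here so the figures agree with the notes).

```python
from math import exp, factorial

observed = [275, 72, 30, 7, 5, 2, 1]      # frequencies for x = 0 .. 6
total = sum(observed)                      # 392

# Mean of the distribution, rounded to 0.482 as in the hand calculation
lam = round(sum(x * f for x, f in enumerate(observed)) / total, 3)

# Expected frequencies N * e^(-lam) * lam^x / x!
expected = [total * exp(-lam) * lam ** x / factorial(x)
            for x in range(len(observed))]
```

Because the hand calculation rounds the expected frequencies before forming $(O_i - E_i)^2 / E_i$, a $\chi^2$ value computed from the unrounded figures will differ slightly from 40.939.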
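                       Attribute B
                 B1      B2      B3      Total
Attribute A
A1               a11     a12     a13     R1
A2               a21     a22     a23     R2
A3               a31     a32     a33     R3
Total            C1      C2      C3      N
Now, under the null hypothesis $H_0$: the attributes A and B are independent, we calculate
the expected frequency $E_{ij}$ for the various cells using the following formula:
$$E_{ij} = \frac{R_i \times C_j}{N}, \qquad i = 1, 2, \ldots, r, \quad j = 1, 2, \ldots, s$$
$E(a_{11}) = \frac{R_1 C_1}{N}$    $E(a_{12}) = \frac{R_1 C_2}{N}$    $E(a_{13}) = \frac{R_1 C_3}{N}$    $R_1$
$E(a_{21}) = \frac{R_2 C_1}{N}$    $E(a_{22}) = \frac{R_2 C_2}{N}$    $E(a_{23}) = \frac{R_2 C_3}{N}$    $R_2$
$E(a_{31}) = \frac{R_3 C_1}{N}$    $E(a_{32}) = \frac{R_3 C_2}{N}$    $E(a_{33}) = \frac{R_3 C_3}{N}$    $R_3$
$C_1$                  $C_2$                  $C_3$                  $N$
and we compute
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
which follows the $\chi^2$ distribution with $(r - 1)(s - 1)$ degrees of freedom at 5% or 1% level of
significance.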
1. Calculate the expected frequencies for the following data presuming the two attributes, viz.
condition of home and condition of child, to be independent.
                               Condition of home
                               Clean    Dirty
Condition of child   Clean     70       50
                     Fair      80       20
                     Dirty     35       45
Use the Chi-Square test at 5% level of significance to state whether the two attributes are
independent.
Solution:
Null hypothesis $H_0$: Condition of home and condition of child are independent.
Alternative hypothesis $H_1$: Condition of home and condition of child are not independent.
Level of significance: α = 0.05
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Analysis:
                               Condition of home
                               Clean    Dirty    Total
Condition of child   Clean     70       50       120
                     Fair      80       20       100
                     Dirty     35       45       80
                     Total     185      115      300
$$\text{Expected frequency} = \frac{\text{Corresponding row total} \times \text{Column total}}{\text{Grand total}}$$
Expected frequency for 70 $= \frac{120 \times 185}{300} = 74$, expected frequency for 80 $= \frac{100 \times 185}{300} = 61.67$,
expected frequency for 35 $= \frac{80 \times 185}{300} = 49.33$, expected frequency for 50 $= \frac{120 \times 115}{300} = 46$,
expected frequency for 20 $= \frac{100 \times 115}{300} = 38.33$, expected frequency for 45 $= \frac{80 \times 115}{300} = 30.67$
$O_{ij}$    $E_{ij}$     $O_{ij} - E_{ij}$    $(O_{ij} - E_{ij})^2$    $(O_{ij} - E_{ij})^2 / E_{ij}$
70     74        -4        16        16/74 = 0.216
50     46         4        16        0.348
80     61.67     18.33     335.99    5.448
20     38.33    -18.33     335.99    8.766
35     49.33    -14.33     205.35    4.163
45     30.67     14.33     205.35    6.695
Total                                25.636
$\chi^2 = 25.636$
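The expected-frequency rule $E_{ij} = R_i C_j / N$ and the resulting statistic can be sketched as below. The function names are illustrative assumptions. Because the script does not round the expected frequencies to two decimals, it gives about 25.65 rather than the hand value 25.636.

```python
def expected_frequencies(table):
    """Expected cell counts R_i * C_j / N for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for independence."""
    expected = expected_frequencies(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: child clean / fair / dirty; columns: home clean / dirty
home_child = [[70, 50], [80, 20], [35, 45]]
expected = expected_frequencies(home_child)
chi2_value, df = chi_square_independence(home_child)
```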
2. The following contingency table presents the reactions of legislators to a tax plan according
to party affiliation. Test whether party affiliation influences the reaction to the tax plan at
0.01 level of significance.
                    Reaction
Party       In favour   Neutral   Opposed   Total
Party A     120         20        20        160
Party B     50          30        60        140
Party C     50          10        40        100
Total       220         60        120       400
Solution:
Null hypothesis $H_0$: Party affiliation and reaction to the tax plan are independent.
Alternative hypothesis $H_1$: Party affiliation and reaction to the tax plan are not independent.
Level of significance: α = 0.01
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Analysis:
$O_{ij}$    $E_{ij}$     $O_{ij} - E_{ij}$    $(O_{ij} - E_{ij})^2$    $(O_{ij} - E_{ij})^2 / E_{ij}$
3. From a poll of 800 television viewers, the following data have been accumulated as to their
levels of education and their preference of television stations. We are interested in
determining if the selection of a TV station is independent of the level of education.
                              Educational Level
                        High School   Bachelor   Graduate   Total
Public Broadcasting     50            150        80         280
Commercial Stations     150           250        120        520
Total                   200           400        200        800
(i) State the null and alternative hypotheses.
(ii) Show the contingency table of the expected frequencies. (iii) Compute the test statistic.
(iv) The null hypothesis is to be tested at 95% confidence. Determine the critical value for
this test.
Solution:
(i) Null Hypothesis: Selection of TV station is independent of level of education
Alternative Hypothesis: Selection of TV station is not independent of level of education
(ii) Level of significance: α = 0.05
Expected frequency for 50 $= \frac{280 \times 200}{800} = 70$, expected frequency for 150 $= \frac{280 \times 400}{800} = 140$,
expected frequency for 80 $= \frac{280 \times 200}{800} = 70$, expected frequency for 150 $= \frac{520 \times 200}{800} = 130$,
expected frequency for 250 $= \frac{520 \times 400}{800} = 260$, expected frequency for 120 $= \frac{520 \times 200}{800} = 130$
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Analysis:
$O_{ij}$    $E_{ij}$     $O_{ij} - E_{ij}$    $(O_{ij} - E_{ij})^2$    $(O_{ij} - E_{ij})^2 / E_{ij}$
Large sample:
If the size of the sample n > 30, then that sample is said to be a large sample. There are four
important tests of significance for large samples.
Note:
(i) The sampling distribution of a statistic is approximately normal, irrespective of whether the
distribution of the population is normal or not.
(ii) The sample statistics are sufficiently close to the corresponding population parameters and
hence may be used to calculate the standard errors of the sampling distribution.
(iii) Critical values for some standard levels of significance (for large samples):
Nature of test     1% (0.01)   2% (0.02)   5% (0.05)   10% (0.1)
                   (99%)       (98%)       (95%)       (90%)
Note:
If the standard deviation of the population is not known, then the statistic is $z = \frac{\bar{x} - \mu}{S/\sqrt{n}}$,
where S = standard deviation of the sample.
Confidence Interval:
The confidence interval for $\mu$ when $\sigma$ is known and sampling is done from a normal population or
with a large sample is $\bar{x} \pm z_\alpha \frac{\sigma}{\sqrt{n}}$, i.e.
$$\left( \bar{x} - z_\alpha \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_\alpha \frac{\sigma}{\sqrt{n}} \right)$$
If s is known ($\sigma$ is not known): $\bar{x} \pm z_\alpha \frac{s}{\sqrt{n}}$
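1. A sample of 100 students is taken from a large population; the mean height in the sample is
160 cm. Can it be reasonably regarded that in the population the mean height is 165 cm and the
S.D. is 10 cm? Also find the confidence limits. Use a level of significance of 1%.
Solution:
Given $n = 100$, $\bar{x} = 160$ cm, $\mu = 165$ cm, $\sigma = 10$ cm
Let $H_0: \mu = 165$
$H_1: \mu \neq 165$ (two-tailed test)
Under $H_0$, the test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{160 - 165}{10/\sqrt{100}} = -5, \qquad |z| = 5$$
From the table, $z_{0.01} = 2.58$. Since $|z| > z_{0.01}$, $H_0$ is rejected. Hence $\mu \neq 165$.
Confidence Interval:
$$\left( \bar{x} - z_\alpha \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_\alpha \frac{\sigma}{\sqrt{n}} \right) = \left( 160 - 2.58 \cdot \frac{10}{\sqrt{100}},\ 160 + 2.58 \cdot \frac{10}{\sqrt{100}} \right) = (157.42, 162.58)$$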
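The large-sample test and confidence limits of this problem can be sketched as follows. The function names are illustrative assumptions; the table value 2.58 is taken from the solution above.

```python
from math import sqrt


def one_sample_z(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """Large-sample test statistic z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))


def z_interval(x_bar: float, sigma: float, n: int, z_table: float):
    """Confidence limits x_bar -/+ z_table * sigma / sqrt(n)."""
    half_width = z_table * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


z_value = one_sample_z(160, 165, 10, 100)
reject_h0 = abs(z_value) > 2.58            # two-tailed, 1% level
lower, upper = z_interval(160, 10, 100, 2.58)
```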
2. The mean breaking strength of the cables supplied by a manufacturer is 1800 with an S.D. of
100. By a new technique in the manufacturing process, it is claimed that the breaking
strength of the cable has increased. In order to test this claim, a sample of 50 cables is tested
and it is found that the mean breaking strength is 1850. Can we support the claim at 1%
level of significance?
Solution:
Given $n = 50$, $\bar{x} = 1850$, $\mu = 1800$, $\sigma = 100$
Let $H_0: \bar{x} = \mu$
$H_1: \bar{x} > \mu$ (one-tailed test)
Under $H_0$, the test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{1850 - 1800}{100/\sqrt{50}} = 3.535, \qquad |z| = 3.535$$
From the table, $z_{0.01} = 2.33$. Since $|z| > z_{0.01}$, $H_0$ is rejected. Hence $\bar{x} > \mu$.
3. A sample of 900 members has a mean of 3.4 cm and an S.D. of 2.61 cm. Is the sample from a
large population of mean 3.25 cm and S.D. 2.61 cm? If the population is normal and its
mean is unknown, find the 95% confidence limits of the true mean.
Solution:
Given $n = 900$, $\mu = 3.25$, $\bar{x} = 3.4$ cm, $\sigma = 2.61$, $s = 2.61$
Null Hypothesis $H_0$: Assume that there is no significant difference between sample mean and
population mean, i.e. $\mu = 3.25$
Alternative Hypothesis $H_1$: Assume that there is a significant difference between sample mean
and population mean, i.e. $\mu \neq 3.25$
Level of significance: α = 5%
Test Statistic:
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{3.4 - 3.25}{2.61/\sqrt{900}} = 1.724$$
Critical value: The critical value of z for a two-tailed test at 5% level of significance is 1.96.
Conclusion:
$z = 1.724 < 1.96$, i.e. calculated value < tabulated value.
Therefore we accept the null hypothesis $H_0$,
i.e. the sample has been drawn from the population with mean $\mu = 3.25$.
Solution:
Given $n = 121$, $\bar{x} = 6.08$, $\mu = 6$, $S = 0.44$
Null Hypothesis $H_0$: $\mu = 6$, i.e. assume that the lathe is in perfect adjustment.
Alternative Hypothesis $H_1$: $\mu \neq 6$, i.e. assume that the lathe is not in perfect adjustment.
Level of Significance: α = 0.05
ii) Test Statistic:
$$z = \frac{\bar{x} - \mu}{S/\sqrt{n}} = \frac{6.08 - 6}{0.44/\sqrt{121}} = \frac{0.08}{0.04} = 2$$
Table value: The table value at 5% level of significance is 1.96.
iii) Conclusion:
Here calculated value > tabulated value.
Hence we reject $H_0$.
5. The mean lifetime of a sample of 100 light tubes produced by a company is found to be
1580 hours with a standard deviation of 90 hours. Test the hypothesis that the mean lifetime
of the tubes produced by the company is 1600 hours.
Solution:
Given $n = 100$, $\bar{x} = 1580$, $\mu = 1600$, $S = 90$
Null Hypothesis $H_0$: $\mu = 1600$, i.e. there is no significant difference between the sample mean
and population mean.
Alternative Hypothesis $H_1$: $\mu \neq 1600$, i.e. there is a significant difference between the
sample mean and population mean.
Level of Significance: α = 5% = 0.05
Test Statistic:
$$z = \frac{\bar{x} - \mu}{S/\sqrt{n}} = \frac{1580 - 1600}{90/\sqrt{100}} = \frac{-20}{9} = -2.22, \qquad |z| = 2.22$$
Table value: The table value at 5% level of significance is 1.96 (two-tailed test).
Conclusion:
Here calculated value > tabulated value.
Hence we reject $H_0$.
Hence the mean lifetime of the tubes produced by the company may not be 1600 hrs.
Test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{20 - 15}{4\sqrt{\frac{1}{500} + \frac{1}{400}}} = 18.6$$
Critical value: The critical value of z at 1% level of significance is 2.58.
Conclusion: calculated value > table value, so
$H_0$ is rejected.
The samples could not have been drawn from the same population.
2. Test significance of the difference between the means of the samples, drawn from two
normal populations with the same SD using the following data:
Size Mean Standard Deviation
Sample I 100 61 4
Sample II 200 63 6
Solution:
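Given $\bar{x}_1 = 61$, $\bar{x}_2 = 63$, $s_1 = 4$, $s_2 = 6$, $n_1 = 100$, $n_2 = 200$
Null hypothesis $H_0: \mu_1 = \mu_2$, i.e., there is no significant difference between the means of the samples.
Alternate hypothesis $H_1: \mu_1 \neq \mu_2$, i.e., there is a significant difference between the means of the samples.
Level of Significance: $\alpha = 5\% = 0.05$ (two tailed test)
Test statistic (populations with the same S.D.):
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_2} + \frac{s_2^2}{n_1}}} = \frac{61 - 63}{\sqrt{\frac{4^2}{200} + \frac{6^2}{100}}} = -3.02, \quad |z| = 3.02$$
Critical value: The critical value of z at 5% level of significance is 1.96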
Conclusion: calculated value > table value
$H_0$ is rejected. Therefore the two normal populations from which the samples are drawn may
not have the same mean, though they may have the same S.D.
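The denominator used in this problem pairs $s_1^2$ with $n_2$ and $s_2^2$ with $n_1$, which can look like a misprint. A short derivation shows where it plausibly comes from, assuming (this is not stated in the notes) that the common variance is estimated by the usual large-sample pooled estimate $\hat{\sigma}^2$:

```latex
\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}
\quad\Longrightarrow\quad
\hat{\sigma}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)
= \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}\cdot\frac{n_1 + n_2}{n_1 n_2}
= \frac{s_1^2}{n_2} + \frac{s_2^2}{n_1}
```

With the data of this problem, $\frac{4^2}{200} + \frac{6^2}{100} = 0.08 + 0.36 = 0.44$ and $\sqrt{0.44} \approx 0.6633$, so $z = -2/0.6633 \approx -3.02$, in agreement with the value used in the conclusion.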
3. A sample of heights of 6400 Englishmen has a mean of 170 cm and a S.D. of 6.4 cm, while a
simple sample of heights of 1600 Americans has a mean of 172 cm and a S.D. of 6.3 cm. Do the
data indicate that Americans are, on the average, taller than Englishmen?
Solution:
Given $\bar{x}_1 = 170$, $\bar{x}_2 = 172$, $s_1 = 6.4$, $s_2 = 6.3$, $n_1 = 6400$, $n_2 = 1600$
Null hypothesis $H_0: \mu_1 = \mu_2$, i.e., there is no significant difference between the heights of Americans
and Englishmen.
Alternate hypothesis $H_1: \mu_1 < \mu_2$, i.e., Americans are, on the average, taller than Englishmen.
Level of Significance: $\alpha = 5\% = 0.05$ (one tailed test)
Test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{170 - 172}{\sqrt{\frac{6.4^2}{6400} + \frac{6.3^2}{1600}}} = -11.32, \quad |z| = 11.32$$
Critical value: The critical value of z at 5% level of significance is 1.645
Conclusion: calculated value > table value
H 0 is rejected. We conclude that the data indicate that Americans are on the average, taller than
Englishmen.
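The heights comparison can be checked numerically. The sketch below assumes the usual large-sample statistic with separate sample variances, $z = (\bar{x}_1 - \bar{x}_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2}$, which reproduces the value 11.32 quoted above; the function name `two_sample_z` is hypothetical.

```python
import math

def two_sample_z(x1, x2, s1, s2, n1, n2):
    """Large-sample z statistic for the difference of two means
    (separate sample variances, no common-S.D. assumption)."""
    return (x1 - x2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Sample 1: Englishmen, sample 2: Americans (figures from the problem)
z = two_sample_z(170, 172, 6.4, 6.3, 6400, 1600)

# One-tailed (left) test at the 5% level: reject H0 if z < -1.645
Z_TABLE = 1.645
reject_h0 = z < -Z_TABLE

print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Because the alternative is $\mu_1 < \mu_2$, the rejection region lies in the left tail, so the sign of $z$ matters here and not only its magnitude.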
4. The average marks scored by 32 boys is 72 with a S.D. of 8, while that for 36 girls is 70 with a
S.D. of 6. Test at 1% level of significance whether the boys perform better than girls.
Solution:
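Given $\bar{x}_1 = 72$, $\bar{x}_2 = 70$, $s_1 = 8$, $s_2 = 6$, $n_1 = 32$, $n_2 = 36$
$H_0: \mu_1 = \mu_2$ (Both perform equally)
$H_1: \mu_1 > \mu_2$ (Boys are better than girls) (one tailed test)
The test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{72 - 70}{\sqrt{\frac{8^2}{32} + \frac{6^2}{36}}} = 1.15$$
Critical value: The critical value of z at 1% level of significance is 2.33
Conclusion: calculated value < table value
$H_0$ is accepted. The data do not indicate that the boys perform better than the girls.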
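An equivalent way to reach the decision for the boys and girls problem is to compare a p-value with the level of significance instead of comparing $z$ with a table value. The sketch below uses `statistics.NormalDist` from the Python standard library; the p-value approach is an addition for illustration and is not used in the notes.

```python
import math
from statistics import NormalDist

# Figures from Problem 4: boys (sample 1) and girls (sample 2)
x1, s1, n1 = 72, 8, 32
x2, s2, n2 = 70, 6, 36

z = (x1 - x2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Right-tailed test (H1: mu1 > mu2): p-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)
alpha = 0.01

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "accept H0")
```

Since the p-value is well above 0.01, the data do not support rejecting $H_0$ at the 1% level, which matches $z = 1.15 < 2.33$.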
Confidence Interval:
The confidence interval for the population proportion for a large sample is
$$p \pm z\sqrt{\frac{PQ}{n}}$$
1. In a big city 325 men out of 600 men were found to be smokers. Does this information
support the conclusion that the majority of men in this city are smokers?
Solution:
Given $n = 600$, number of smokers = 325
$p$ = sample proportion of smokers = 325/600 = 0.5417
$P$ = population proportion of smokers in the city = 1/2 = 0.5, $Q = 1 - P = 0.5$
Null Hypothesis $H_0$: $P = 0.5$, i.e., the numbers of smokers and non-smokers in the city are equal.
Alternative Hypothesis $H_1$: $P > 0.5$ (Right Tailed)
Test Statistic:
$$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.5417 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{600}}} = 2.04$$
Critical value:
Tabulated value of z at 5% level of significance for right tail test is 1.645.
Conclusion:
Since Calculated value of z > tabulated value of z.
We reject the null hypothesis. The majority of men in the city are smokers.
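The single-proportion statistic $z = (p - P)/\sqrt{PQ/n}$ used for the smokers problem can be sketched as follows. The function name `one_proportion_z` is an illustrative assumption.

```python
import math

def one_proportion_z(successes, n, P0):
    """Large-sample z statistic for a single proportion under H0: P = P0."""
    p = successes / n          # sample proportion
    Q0 = 1 - P0
    return (p - P0) / math.sqrt(P0 * Q0 / n)

z = one_proportion_z(325, 600, 0.5)

# Right-tailed test at the 5% level: reject H0 if z > 1.645
Z_TABLE = 1.645
print(f"z = {z:.2f}, reject H0: {z > Z_TABLE}")
```

Note that the standard error is computed from the hypothesised $P$ and not from the sample proportion $p$, as in the hand calculation.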
2. 40 people were attacked by a disease and only 36 survived. Will you reject the hypothesis
that the survival rate, if attacked by this disease, is 85% at 5% level of significance?
Solution:
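Given $n = 40$
The sample proportion, $p = \frac{36}{40} = 0.90$
Population proportion $P = 0.85$, $Q = 1 - P = 1 - 0.85 = 0.15$
Null Hypothesis $H_0$: $P = 0.85$, i.e., there is no significant difference in survival rate.
Alternative Hypothesis $H_1$: $P \neq 0.85$,
i.e., there is a significant difference in survival rate.
Level of Significance: $\alpha = 0.05$
Test Statistic:
$$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.90 - 0.85}{\sqrt{\frac{0.85 \times 0.15}{40}}} = 0.886$$
Table value: Tabulated value of z at 5% level of significance is 1.96
Conclusion: Since calculated value of z < tabulated value of z, we accept the null hypothesis.
The hypothesis that the survival rate is 85% is not rejected.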
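The table values quoted throughout these problems (1.96, 1.645, 2.58, 2.33) are quantiles of the standard normal distribution. As a sketch, they can be recovered with `statistics.NormalDist.inv_cdf` from the Python standard library; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def critical_z(alpha, two_tailed):
    """Critical value of z: for a two-tailed test alpha is split between both tails."""
    tail_area = alpha / 2 if two_tailed else alpha
    return std_normal.inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: "
          f"two-tailed {critical_z(alpha, True):.3f}, "
          f"one-tailed {critical_z(alpha, False):.3f}")
```

The two-tailed value is always larger than the one-tailed value at the same level, which is why the choice of $H_1$ has to be fixed before the table is consulted.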
The confidence interval for the difference between two population proportions for large samples is
$$(p_1 - p_2) \pm z\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
1. Before an increase in excise duty on tea, 800 people out of a sample of 1000 were consumers
of tea. After the increase in duty, 800 people were consumers of tea in a sample of 1200
persons. Find whether there is a significant decrease in the consumption of tea after the
increase in duty. Also find the confidence limits.
Solution:
Given $n_1 = 1000$, $n_2 = 1200$
$p_1$ = proportion of tea drinkers before the increase in excise duty = 800/1000 = 0.8
$p_2$ = proportion of tea drinkers after the increase in excise duty = 800/1200 = 0.6667
Null hypothesis $H_0: P_1 = P_2$, i.e., there is no significant difference in the consumption of tea before
and after the increase in excise duty.
Alternate hypothesis $H_1: P_1 > P_2$, i.e., there is a significant decrease in the consumption of tea
after the increase in excise duty.
Level of significance: $\alpha = 5\% = 0.05$ (one tailed test)
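Test Statistic:
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(0.8)(1000) + (0.6667)(1200)}{1000 + 1200} = 0.7273, \quad Q = 1 - P = 1 - 0.7273 = 0.2727$$
$$z = \frac{0.8 - 0.6667}{\sqrt{(0.7273)(0.2727)\left(\frac{1}{1000} + \frac{1}{1200}\right)}} = \frac{0.1333}{0.01907} = 6.99$$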
Critical value: the critical value of z at 5% level of significance is 1.645
Conclusion:
Here calculated value > table value
We reject $H_0$.
Hence there is a significant decrease in the consumption of tea after the increase in excise
duty.
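Confidence Interval:
The confidence interval for the difference between two population proportions for large samples is
$$(p_1 - p_2) \pm z\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (0.8 - 0.667) \pm 1.645\sqrt{(0.7273)(0.2727)\left(\frac{1}{1000} + \frac{1}{1200}\right)}$$
$$= 0.133 \pm 0.0314 = (0.1016, 0.1644)$$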
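The tea problem combines a two-proportion test with confidence limits. The sketch below follows the same recipe as the notes (pooled proportion $P$, table value 1.645) but works from the raw counts, so the limits differ slightly in the third decimal place from the hand calculation, which rounds $p_1 - p_2$ to 0.133. The function name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (p1 - p2, standard error, z) using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)      # pooled proportion, same as (n1*p1 + n2*p2)/(n1 + n2)
    Q = 1 - P
    se = math.sqrt(P * Q * (1 / n1 + 1 / n2))
    return p1 - p2, se, (p1 - p2) / se

# 800 of 1000 tea drinkers before the duty increase, 800 of 1200 after
diff, se, z = two_proportion_z(800, 1000, 800, 1200)

# Confidence limits with the table value used in the notes
z_table = 1.645
lower, upper = diff - z_table * se, diff + z_table * se

print(f"z = {z:.2f}")
print(f"limits: ({lower:.4f}, {upper:.4f})")
```

The interval lies entirely above zero, which is consistent with rejecting $H_0$ in favour of a decrease in consumption.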
2. Random samples of 400 men and 600 women were asked whether they would like to have a
flyover near their residence. 200 men and 325 women were in favor of the proposal. Test the
hypothesis that the proportions of men and women in favor of the proposal are the same, against
the alternative that they are not, at 5% level.
Solution:
Given $n_1 = 400$, $n_2 = 600$
$p_1$ = proportion of men in favor = 200/400 = 0.5
$p_2$ = proportion of women in favor = 325/600 ≈ 0.541
Null hypothesis $H_0: P_1 = P_2$, i.e., assume that there is no significant difference between the
opinions of men and women as far as the proposal of the flyover is concerned.
Alternate hypothesis $H_1: P_1 \neq P_2$, i.e., assume that there is a significant difference between the
opinions of men and women as far as the proposal of the flyover is concerned.
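Level of significance: $\alpha = 5\% = 0.05$ (two tailed)
Test Statistic:
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(400)(0.5) + (600)(0.541)}{400 + 600} = 0.525, \quad Q = 1 - P = 1 - 0.525 = 0.475$$
$$z = \frac{0.5 - 0.541}{\sqrt{(0.525)(0.475)\left(\frac{1}{400} + \frac{1}{600}\right)}} = \frac{-0.041}{0.0322} = -1.27, \quad |z| = 1.27$$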
Critical value: the critical value of z at 5% level of significance is 1.96
Conclusion:
Here calculated value < table value
We accept H 0 at 5% level of significance.
Hence there is no significant difference between the opinions of men and women as far as the
proposal of the flyover is concerned.
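For a two-tailed test such as the flyover problem, the decision can also be read from a two-sided p-value. The sketch below works from the raw counts (200 of 400 and 325 of 600), so the statistic comes out near $-1.29$; small differences from hand-rounded figures do not change the decision. The p-value step is an illustrative addition and is not part of the notes.

```python
import math
from statistics import NormalDist

# Counts in favor of the flyover: men (sample 1), women (sample 2)
x1, n1 = 200, 400
x2, n2 = 325, 600

p1, p2 = x1 / n1, x2 / n2
P = (x1 + x2) / (n1 + n2)                      # pooled proportion
se = math.sqrt(P * (1 - P) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Two-tailed test: area in both tails beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
alpha = 0.05

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "accept H0")
```

Doubling the tail area is what distinguishes this from the one-tailed problems: a difference in either direction counts as evidence against $H_0$.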
3. A machine puts out 16 imperfect articles in a sample of 500. After the machine is
overhauled, it puts out 3 imperfect articles in a batch of 100. Has the machine improved?
Solution:
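Hypothesis:
$H_0: P_1 = P_2$ (the machine has not improved)
$H_1: P_1 > P_2$ (the machine has improved)
Level of Significance: $\alpha = 0.05$
Test Statistic:
$$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Analysis:
The sample proportions are
$$p_1 = \frac{16}{500} = 0.032, \quad p_2 = \frac{3}{100} = 0.03, \quad P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \approx 0.032, \quad Q = 1 - P = 0.968$$
$$Z = \frac{0.032 - 0.03}{\sqrt{0.032 \times 0.968\left(\frac{1}{500} + \frac{1}{100}\right)}} = 0.1037$$
Table value: $Z_\alpha = 1.645$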
Conclusion:
Calculated value < table value
Hence we accept the null hypothesis and conclude that the machine has not improved after
overhauling.
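In the machine problem the pooled proportion $(16 + 3)/600$ is rounded to 0.032 before it is used. The sketch below, with a hypothetical helper `z_stat`, evaluates the statistic with both the rounded and the unrounded $P$ to show that the rounding has no effect on the decision.

```python
import math

n1, n2 = 500, 100
p1, p2 = 16 / n1, 3 / n2   # proportions of imperfect articles before and after overhaul

def z_stat(P):
    """Two-proportion z statistic for a given pooled proportion P."""
    return (p1 - p2) / math.sqrt(P * (1 - P) * (1 / n1 + 1 / n2))

z_rounded = z_stat(0.032)                 # P as rounded in the hand calculation
z_exact = z_stat((16 + 3) / (n1 + n2))    # unrounded pooled proportion

Z_TABLE = 1.645  # one-tailed table value at the 5% level
for label, z in [("rounded P", z_rounded), ("exact P", z_exact)]:
    print(f"{label}: Z = {z:.4f}, reject H0: {z > Z_TABLE}")
```

Both values are far below 1.645, so the conclusion that the machine has not improved is insensitive to the rounding.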