Unit 5. Test of Significance

The document discusses statistical concepts related to hypothesis testing including population, parameter, sample, statistic, sampling, statistical inference, tests of significance, hypothesis, null hypothesis, alternative hypothesis, hypothesis testing, errors in sampling, level of significance, critical region, one tailed and two tailed tests, critical values, and procedure for testing hypothesis. It also provides examples of testing hypothesis for single mean and discusses large sample tests.

Uploaded by

khandaitprajwal

KOLHAPUR INSTITUTE OF TECHNOLOGY’S,

COLLEGE OF ENGINEERING (AUTONOMOUS), KOLHAPUR

DEPARTMENT OF BIOTECHNOLOGY ENGINEERING


Second Year B.Tech. (SEM - III)
APPLIED MATHEMATICS (UBIO0301)

Unit No.5: Test of Significance

• Population: The group of individuals under study is called the population. For
example, to get an idea of the average per capita (monthly) income of the people in
Maharashtra, we would have to enumerate all the earning individuals in Maharashtra.
• Parameter: A parameter is any numerical quantity that characterizes a given
population or some aspect of it. This means the parameter tells us something about
the whole population.
• Sample: A finite subset of the individuals in a population is called a sample,
and the number of individuals in a sample is called the sample size (n).
For example, we select 100 people in Kolhapur district to estimate the average per capita
income in Maharashtra state.
• Statistic: A statistic is a numerical quantity computed from a sample and used as an
estimate of a population parameter.
• Sampling: The process of drawing random samples from a population.
For example, in a shop we assess the quality of wheat by taking a handful of it from
the bag and then decide whether to purchase it. A cook normally tastes the cooked
food to find out whether it is properly cooked and contains the proper quantity of salt.
• Statistical Inference: The process of selecting and using a sample to draw
inferences about the population from which the sample is drawn is called statistical inference.
For example, we measure the diameters of 10 screws drawn from a large lot of screws. Sampling is
done in order to see whether a model of the population is accurate enough for practical
purposes.
• Tests of Significance: The methods of statistical inference used to accept or reject
claims based on sample data are known as tests of significance.
• Hypothesis: A statement or a claim about a property of a population is called a
hypothesis. There are two types of hypothesis: the null hypothesis and the alternative
hypothesis.
• Null Hypothesis: A definite statement about the value of a population parameter.
Such a hypothesis, which is usually a hypothesis of no difference, is called the null
hypothesis. Generally it is denoted by H0. For e.g. H0: μ = 3
• Alternative Hypothesis: Any hypothesis which is complementary to the null
hypothesis is called an alternative hypothesis. It is a statement about the value of a population
parameter that must be true if the null hypothesis is false. Generally it is denoted
by H1. For e.g. H1: μ > 3 or μ < 3 or μ ≠ 3
• Hypothesis Testing: Hypothesis testing is the procedure for testing a claim or statement
about a population.
For example, a statement is made that “the average starting salary for an Engineering
Graduate student from KITCoEK is Rs. 35000 per month”.
• Errors in Sampling: The main objective is to draw valid inferences about the
population parameters on the basis of the sample results. In doing so we are liable to
commit the following two types of errors.

Mr. Amar Tikole (9272319199), KIT’s College of Engineering (Autonomous), Kolhapur Page 2
  - Type I error: Rejecting the null hypothesis when it is true. The probability of a Type I
    error is denoted by α.
  - Type II error: Accepting the null hypothesis when it is false. The probability of a
    Type II error is denoted by β.
• Level of significance: The probability of a Type I error (α) is called the level of
significance. The level of significance is always fixed in advance, before collecting the
sample information. Generally it is taken as 1% or 5%.
• Critical Region: The region (of values of the test statistic) in which the null
hypothesis is rejected is called the critical region or rejection region. If the test statistic
falls in this region, we reject the null hypothesis. The remaining region, in which we
accept the null hypothesis, is called the acceptance region.
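
In symbols, the two error probabilities defined above can be written as conditional probabilities. This is only a restatement of the definitions in the notation already used (H0, α, β), not an additional result:

```latex
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}), \qquad
\beta  = P(\text{accept } H_0 \mid H_0 \text{ is false})
```

For instance, with α = 0.05, in the long run about 5 out of every 100 samples drawn from a population for which H0 holds would lead us to reject H0 wrongly.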

One tailed and two tailed tests:


• One tailed: A one-tailed test is a statistical test in which the critical region lies
entirely in one tail of the distribution of the test statistic, i.e. H1 states that the
parameter is either greater than or less than a certain value.
For e.g. H0: μ = 3 vs. H1: μ > 3 (Right-tailed test)
For e.g. H0: μ = 3 vs. H1: μ < 3 (Left-tailed test)

• Two tailed: A two-tailed test is a statistical test in which the critical region
lies in both tails of the probability curve of the test statistic.
For e.g. H0: μ = 3 vs. H1: μ ≠ 3 (Two-tailed test)
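
The three cases can be summarised by their rejection regions for a test statistic Z. The symbol z_p, meaning the value exceeded by a standard normal variable with probability p, is a notation assumed here for illustration; the notes themselves write Zα for every case.

```latex
\begin{aligned}
\text{Two-tailed } (H_1:\ \mu \neq \mu_0):&\quad \text{reject } H_0 \text{ if } |Z| > z_{\alpha/2},\\
\text{Right-tailed } (H_1:\ \mu > \mu_0):&\quad \text{reject } H_0 \text{ if } Z > z_{\alpha},\\
\text{Left-tailed } (H_1:\ \mu < \mu_0):&\quad \text{reject } H_0 \text{ if } Z < -z_{\alpha}.
\end{aligned}
```

At the 5% level this gives z_{0.025} = 1.96 for the two-tailed test and z_{0.05} = 1.645 for a one-tailed test, which are the critical values tabulated later in these notes.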

• Critical Values: The value of the test statistic which separates the critical (or rejection)
region from the acceptance region is called the critical value. It depends on:
(1) The level of significance α.
(2) The alternative hypothesis, i.e. whether the test is one-tailed or two-tailed.
• Procedure (Steps) for Testing of Hypothesis:
Describe in words the population characteristic about which hypotheses are to be
tested.
1. Null hypothesis: Set up the null hypothesis H0.
2. Alternative hypothesis: Set up the alternative hypothesis H1; this will enable us to
decide whether we have to use a right-tailed, left-tailed or two-tailed test.
3. Level of Significance: Choose the appropriate α in advance.
4. Critical Value: Determine the critical value associated with the alternative
hypothesis and the level of significance.
5. Test Statistic: Compute the test statistic (e.g. Z-test, t-test or Chi-square test).
6. Test Criteria: Decide whether to reject the null hypothesis H0 or fail to reject the
null hypothesis. This depends on the critical value and the value of the test statistic.
7. Conclusion: State your result in the context of the specific problem.
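
Steps 4 and 6 of the procedure above can be sketched in Python for the large-sample (Z) case. This is a minimal illustration rather than part of the original notes: the function names `critical_value` and `reject_h0` and the tail labels `'two'`, `'right'`, `'left'` are assumptions made for this sketch, and the critical values come from the standard library's `statistics.NormalDist` instead of a printed table.

```python
from statistics import NormalDist


def critical_value(alpha: float, tail: str) -> float:
    """Step 4: positive critical value for a Z-test.

    tail is 'two', 'right' or 'left' (labels chosen for this sketch).
    """
    if tail == "two":
        # alpha is split equally between the two tails
        return NormalDist().inv_cdf(1 - alpha / 2)
    if tail in ("right", "left"):
        return NormalDist().inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


def reject_h0(z: float, alpha: float, tail: str) -> bool:
    """Step 6: True if the test statistic z falls in the critical region."""
    c = critical_value(alpha, tail)
    if tail == "two":
        return abs(z) > c
    if tail == "right":
        return z > c
    return z < -c


print(round(critical_value(0.05, "two"), 2))    # 1.96
print(round(critical_value(0.05, "right"), 3))  # 1.645
print(reject_h0(5.0, 0.05, "two"))              # True
```

Keeping the tail as an explicit argument mirrors step 2: the alternative hypothesis decides which comparison is made in step 6.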

Large Sample Test
Introduction:
If the sample size n is at least 30 (n ≥ 30), the sample is known as a large sample. For large
samples the sampling distribution of a statistic is approximately normal, so the Z-test is used.
A test of significance based on a large sample is known as a large sample test.
Test 1: Test of Significance for Single Mean:
To test the null hypothesis H0: μ = μ0, that the sample has been drawn from a
population with mean μ0 and variance σ², i.e. that there is no significant difference between the
sample mean x̄ and the population mean μ0, the test statistic (for large samples) is:

$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
Note:
1. For large samples, if the population standard deviation (S.D.) σ is unknown, we use its
estimate provided by the sample standard deviation s, i.e. σ̂² = s², so the test statistic is:

$$Z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

2. Confidence Limits for μ:

95% confidence limits for μ are $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$.
99% confidence limits for μ are $\bar{x} \pm 2.58\,\dfrac{\sigma}{\sqrt{n}}$.
3. Table of Critical Values for Large Samples:

Critical value Zα       Level of significance (α)
                        1% (0.01)        5% (0.05)
Two-tailed test         |Zα| = 2.58      |Zα| = 1.96
Right-tailed test       Zα = 2.33        Zα = 1.645
Left-tailed test        Zα = -2.33       Zα = -1.645

4. If the calculated |Z| is less than the critical value |Zα|, accept the null hypothesis;
otherwise reject the null hypothesis.
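
The test statistic and the decision rule in the notes above can be sketched as a short Python function. The name `z_single_mean` is hypothetical, and the numbers are those of Solved Example 1 (n = 625, x̄ = 3.5, μ0 = 3.2, σ² = 2.25).

```python
import math


def z_single_mean(xbar: float, mu0: float, sd: float, n: int) -> float:
    """Large-sample statistic Z = (xbar - mu0) / (sd / sqrt(n)).

    sd is the population S.D. sigma, or the sample S.D. s when sigma is unknown.
    """
    if n < 30:
        raise ValueError("the Z-test in these notes assumes a large sample (n >= 30)")
    return (xbar - mu0) / (sd / math.sqrt(n))


# Solved Example 1: sigma = sqrt(2.25) = 1.5
z = z_single_mean(3.5, 3.2, math.sqrt(2.25), 625)
print(round(z, 2))     # 5.0
print(abs(z) > 1.96)   # True: reject H0 in a two-tailed test at the 5% level
```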

Solved Examples

Example 1: A sample of 625 members has a mean of 3.5 cm. Can it be reasonably
regarded as a truly random sample from a large population with mean 3.2 and
variance 2.25 at 5% level of significance?
Solution: Here n = 625, x̄ = 3.5, μ = 3.2, σ² = 2.25, so σ = 1.5.
(i) Null Hypothesis: H0: μ = 3.2 i.e. sample is selected from a large population with
mean = 3.2.
(ii) Alternative Hypothesis: H1: μ ≠ 3.2 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv)Critical value: For two-tailed test, the value of Zα at 5% level of significance from
the table = 1.96 i.e. | Zα | =1.96
(v) Test Statistic: Under H0, the test statistic for the given sample is

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3.5 - 3.2}{1.5/\sqrt{625}} = 5$$
(vi) Test Criteria: Since test statistic |Z| > critical value | Zα |, the null hypothesis is
rejected.
(vii) Conclusion: It cannot be regarded as a random sample from a large
population with mean = 3.2.

Example 2: A sample of 40 sugarcanes selected at random on a farm was checked for
length, and the mean and the variance of the length were found to be 58.5 inches and 4.2
square inches respectively. Is it reasonable to say that the average length of the sugarcane is 60
inches? Use 1% level of significance.

Solution: Here n = 40, x̄ = 58.5, sample variance s² = 4.2 so that the sample standard deviation is s = √4.2 ≈ 2.0494, and μ = 60.
(i) Null Hypothesis: H0: μ = 60 i.e. the average length of the sugarcane is 60 inches.
(ii) Alternative Hypothesis: H1: μ ≠ 60 (Two-tailed)
(iii) Level of Significance: α = 0.01
(iv) Critical value: For two-tailed test, the value of Zα at 1% level of significance
from the table = 2.58 i.e. | Zα | =2.58
(v) Test Statistic: Since the population S.D. is unknown but the sample S.D. s is known
and the sample is large,

$$Z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{58.5 - 60}{2.0494/\sqrt{40}} = -4.6291, \qquad |Z| = 4.6291$$
(vi)Test Criteria: Since test statistic |Z| > critical value | Zα |, the null hypothesis is
rejected.
(vii) Conclusion: It is not reasonable to say that the average length of the sugarcane
is 60 inches.

Example 3: 64 observations are selected at random from a normal distribution whose
variance is 25. Their mean is calculated and found to be 11.1. Test the hypothesis that
the true value of the population mean is 10. Use 5% level of significance.
Solution: Here n = 64, x̄ = 11.1, σ² = 25 so that σ = 5, and μ = 10.
(i) Null Hypothesis: H0: μ = 10 (i.e. the true value of the population mean is 10.)
(ii) Alternative Hypothesis: H1: μ ≠ 10 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For two-tailed test, the value of Zα at 5% level of significance
from the table = 1.96 i.e. | Zα | =1.96
(v) Test Statistic: Since the population variance is given (σ² = 25, so σ = 5) and the
sample is large,

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{11.1 - 10}{5/\sqrt{64}} = 1.76$$

(vi) Test Criteria: Since test statistic |Z| < critical value | Zα |, the null hypothesis is
accepted.
(vii) Conclusion: The data are consistent with a true population mean of 10.

Example 4: A tire company claims that lives of tires have mean 42000 km with S.D of
4000 km. A change in production process is believed to give better product. A test
sample of 81 new tires has mean 42500 km. Test at 5% l. o. s that new product is better
than current one.
Solution: Here n = 81, x̄ = 42500, μ = 42000 and σ = 4000.
(i) Null Hypothesis: H0: μ = 42000 (i.e. the sample mean and the population mean do
not differ significantly.)
(ii) Alternative Hypothesis: H1: μ > 42000 (Right-tailed alternative)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a right-tailed test, the value of Zα at 5% level of significance from
the table = 1.645, i.e. Zα = 1.645
(v) Test Statistic:

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{42500 - 42000}{4000/\sqrt{81}} = 1.125$$

(vi) Test Criteria: Since test statistic Z < critical value Zα, the null hypothesis is
accepted.
(vii) Conclusion: The sample gives no significant evidence that the new product is better
than the current one.

Example 5: A sample of 900 members has a mean of 3.4 cm and S.D. 2.61 cm. Is the sample
from a large population of mean 3.25 cm and S.D. 2.61 cm? (Use α = 0.05.) If the population
is normal and its mean is unknown, find the 95% confidence limits for the true mean μ.
Solution: Here n = 900, x̄ = 3.4, μ = 3.25, σ = s = 2.61.

(i) Null Hypothesis: H0: μ = 3.25(i.e. sample is selected from a large population with
mean = 3.25.)
(ii) Alternative Hypothesis: H1: μ ≠ 3.25 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv)Critical value: For two-tailed test, the value of Zα at 5% level of significance from
the table = 1.96 i.e. | Zα | =1.96
(v) Test Statistic: The test statistic for the given sample is

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3.4 - 3.25}{2.61/\sqrt{900}} = \frac{0.15}{0.087} \approx 1.72$$
(vi) Test Criteria: Since test statistic |Z| < critical value | Zα |, the null hypothesis is
accepted.
(vii) Conclusion: It can be regarded as a random sample from a large population with
mean = 3.25.
If μ is unknown, then the 95% confidence limits for μ are as follows:

$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}} = 3.4 \pm 1.96 \times \frac{2.61}{\sqrt{900}} = 3.4 \pm 0.1705, \quad \text{i.e. } 3.2295 \text{ and } 3.5705$$

Hence 3.2295 < μ < 3.5705.
Example 6: The mean muscular endurance score of a random sample of 60 subjects
was found to be 145 with a variance of 1600. Construct a 95% confidence interval for the
mean. Assume the sample size to be large enough for normal approximation.

Solution: Here n = 60, x̄ = 145, s² = 1600, so σ ≈ s = 40.
95% confidence limits for μ are

$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}} = 145 \pm 1.96 \times \frac{40}{\sqrt{60}} = 145 \pm 10.12, \quad \text{i.e. } 134.88 \text{ and } 155.12$$

The 95% confidence interval for μ is (134.88, 155.12).
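
The confidence-limit formula used in Examples 5 and 6 can be checked with a few lines of Python. The function name `confidence_limits` is an assumption made for this sketch; the multipliers 1.96 and 2.58 are the ones given in the notes.

```python
import math


def confidence_limits(xbar: float, sd: float, n: int, z: float = 1.96):
    """Return (lower, upper) limits xbar -/+ z * sd / sqrt(n) for a large sample.

    z = 1.96 gives 95% limits and z = 2.58 gives 99% limits.
    """
    margin = z * sd / math.sqrt(n)
    return xbar - margin, xbar + margin


# Example 6: n = 60, mean 145, variance 1600 (so sd = 40)
low, high = confidence_limits(145, math.sqrt(1600), 60)
print(round(low, 2), round(high, 2))   # 134.88 155.12
```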

Examples for Practice

Example 1: Describe test of significance for single mean with suitable examples. Define its
confidence interval.
Example 2: The average marks in mathematics of a sample of 100 students were 51 with a
S.D. of 6 marks. Could this have been a random sample from a population with average
marks 50?
Example 3: The following results are obtained from a sample of 100 boxes of biscuits: mean
weight content = 490 gm, S.D. of the weight = 9 gm. Could the sample come from a
population having a mean of 500 gm?
Example 4: The mean and the standard deviation of the eyesight of 50 different persons
were found to be 22 feet and 1.5 feet respectively. From these sample results, can we say that
the hypothesis "Average eyesight of a person is 20 feet" is true?
Example 5: From a survey of 105 people it was found that the mean and the standard
deviation of night sleep time are 6.75 hrs and 0.35 hrs respectively. Using the Z-test, is it
reasonable to say that "Avg. sleep time is more than 6 hrs"?
Example 6: A machine is set to produce metal plates of thickness 1.5 cm with S.D. 0.2 cm. A
sample of 100 plates produced by the machine gave an average thickness of 1.52 cm. Is the
machine fulfilling the purpose?
Example 7: A sample of 50 items gives the mean 6.2 and S.D. 10.24. Can it be regarded as
drawn from a normal population with mean 5.4 at 5% level of significance?
Example 8: Can it be concluded that the average life span of an Indian is more than 70
years, if a random sample of 100 Indians has an average life span of 71.8 years with a standard
deviation of 7.8 years?
Example 9: A sample of 50 pieces of a certain type of string was tested. The mean breaking
strength turned out to be 14.5 pounds. Test whether the sample is from a batch of string
having a mean breaking strength of 15.6 pounds and standard deviation of 2.2 pounds.
Example 10: A random sample of size 36 has 53 as mean, and the sum of squares of deviations
from the mean is 150. Can this sample be regarded as drawn from a population having 54 as
mean?
Example 11: The mean height of a random sample of 100 individuals from a population is
160 cm. The S.D. of the sample is 10 cm. Would it be reasonable to suppose that the mean
height of the population is 165 cm?
Example 12: The mean weight obtained from a random sample of size 100 is 64 gm. The
S.D. of the weight distribution of the population is 3 gm. Test the statement that the mean
weight of the population is 67 gm, at 5% l.o.s.

Test 2: Test of Significance for Difference of Means:
Let x̄1 be the mean of a sample of size n1 from a population with mean μ1 and
variance σ1², and x̄2 be the mean of a sample of size n2 from a population with mean μ2 and
variance σ2².
To test the null hypothesis H0: μ1 = μ2, i.e. that there is no significant difference between
the two population means, the test statistic (for large samples) is:

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$$
Note: 1. If σ1² = σ2² = σ², i.e. if the samples have been drawn from populations with the same
population standard deviation (S.D.) σ, then under the null hypothesis the test statistic is:

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0,1)$$

2. If σ1² = σ2² = σ² and σ is unknown, then use its estimate

$$\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$

then under the null hypothesis the test statistic is:

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0,1)$$

3. If σ1² ≠ σ2² and σ1 and σ2 are unknown, then for large samples use the estimates σ̂1² = s1²
and σ̂2² = s2²; under the null hypothesis the test statistic is:

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim N(0,1)$$

4. Confidence Limits for the difference between means (μ1 − μ2):

95% confidence limits for μ1 − μ2 are $(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
99% confidence limits for μ1 − μ2 are $(\bar{x}_1 - \bar{x}_2) \pm 2.58\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.

5. Table of Critical Values for Large Samples:

Critical value Zα       Level of significance (α)
                        1% (0.01)        5% (0.05)
Two-tailed test         |Zα| = 2.58      |Zα| = 1.96
Right-tailed test       Zα = 2.33        Zα = 1.645
Left-tailed test        Zα = -2.33       Zα = -1.645

6. If the calculated |Z| is less than the critical value |Zα|, accept the null hypothesis;
otherwise reject the null hypothesis.
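
The statistic for the difference of two means, and the pooled estimate of Note 2, can be sketched in Python as follows. The function names are hypothetical; the first call uses the figures of Solved Example 1 below (sizes 400 and 400, means 250 and 220, S.D. 40 and 55).

```python
import math


def z_two_means(x1, x2, sd1, sd2, n1, n2):
    """Z = (x1 - x2) / sqrt(sd1^2/n1 + sd2^2/n2) for two large samples."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (x1 - x2) / standard_error


def pooled_variance(s1, s2, n1, n2):
    """Estimate of the common variance from Note 2: (n1*s1^2 + n2*s2^2) / (n1 + n2)."""
    return (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2)


z = z_two_means(250, 220, 40, 55, 400, 400)
print(round(z, 2))   # 8.82
```

When the two populations share one S.D., passing the same value for `sd1` and `sd2` reproduces the formula of Note 1, since σ²/n1 + σ²/n2 = σ²(1/n1 + 1/n2).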

Solved Examples

Example 1: The means of two simple samples of sizes 400 and 400 are 250 and 220
respectively. Can the samples be regarded as drawn from two populations with S.D. 40 and 55
respectively and the same population mean, at 5% level of significance? Also find the 95%
confidence limits for the difference between the means.
Solution: Here n1 = 400, n2 = 400, x̄1 = 250, x̄2 = 220, σ1 = 40 and σ2 = 55.
(i) Null Hypothesis: H0: μ1 = μ2 (i.e. samples are drawn from populations with same
mean)
(ii) Alternative Hypothesis: H1: μ1 ≠ μ2 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv)Critical value: For two-tailed test, the value of Zα at 5% level of significance from
the table = 1.96 i.e. | Zα | =1.96
(v) Test Statistic: Under H0, the test statistic for the given samples is

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{250 - 220}{\sqrt{\dfrac{40^2}{400} + \dfrac{55^2}{400}}} \approx 8.8226$$

(vi) Test Criteria: Since test statistic |Z| > critical value | Zα |, the null hypothesis is
rejected.
(vii) Conclusion: The samples are not drawn from populations with same mean.

Confidence interval for difference between means:
95% confidence limits for μ1 − μ2 are

$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (250 - 220) \pm 1.96\sqrt{\frac{40^2}{400} + \frac{55^2}{400}} = 30 \pm 6.6647$$

i.e. 23.3353 and 36.6647. Hence 23.3353 ≤ μ1 − μ2 ≤ 36.6647.
Example 2: The means of two samples of sizes 400 and 1600 are 70 cm and 65 cm respectively. Can
the samples be regarded as drawn from the same population of variance 4 at 1% level of
significance?
Solution: Here n1 = 400, n2 = 1600, x̄1 = 70, x̄2 = 65, σ² = 4 so that σ = 2.
(i) Null Hypothesis: H0: μ1 = μ2 (i.e. samples are drawn from the same populations)
(ii) Alternative Hypothesis: H1: μ1 ≠ μ2 (Two-tailed)
(iii) Level of Significance: α = 0.01
(iv) Critical value: For two-tailed test, the value of Zα at 1% level of significance from
the table = 2.58, i.e. | Zα | = 2.58
(v) Test Statistic: Under H0, the test statistic for the given samples is

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{70 - 65}{2\sqrt{\dfrac{1}{400} + \dfrac{1}{1600}}} \approx 44.7214$$
(vi) Test Criteria: Since test statistic |Z| > critical value | Zα |, the null hypothesis is
rejected.
(vii) Conclusion: The samples are not drawn from the same populations.

Example 3: The average of marks scored by 32 boys is 72 with standard deviation 8 while
that of 36 girls is 70 with standard deviation 6. Test at 1% level of significance whether the
boys perform better than the girls.
Solution: Since the population S.D.s are unknown but the sample S.D.s are known and the samples
are large, we take σ1 ≈ s1 = 8 and σ2 ≈ s2 = 6. Here n1 = 32, n2 = 36, x̄1 = 72, x̄2 = 70.
(i) Null Hypothesis: H0: μ1 = μ2 (i.e. there is no significant difference between the
performance of boys and girls.)
(ii) Alternative Hypothesis: H1: μ1 > μ2 (right – tailed test)
(iii) Level of Significance: α = 0.01
(iv) Critical value: For a right-tailed test, the value of Zα at 1% level of significance from
the table = 2.33, i.e. Zα = 2.33
(v) Test Statistic: Under H0, the test statistic for the given samples is

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{72 - 70}{\sqrt{\dfrac{8^2}{32} + \dfrac{6^2}{36}}} = 1.1547$$

(vi) Test Criteria: Since test statistic |Z| < critical value | Zα |, the null hypothesis is
accepted.
(vii) Conclusion: There is no significant difference between the performance of boys
and girls, i.e. the boys do not perform better than the girls.

Example 4: The mean height of 50 male students who showed above average participation
in college athletics was 68.2 inches with a standard deviation of 2.5 inches, while 50 male
students who showed no interest in such participation had a mean height of 67.5 inches with a
standard deviation of 2.8 inches.
(i) Test the hypothesis that male students who participate in college athletics are taller than
other male students.
(ii) By how much should the sample size of each of the two groups be increased in order that
the observed difference of 0.7 inches in the mean heights be significant at the 5% level of
significance?
Solution: Here n1 = 50, x̄1 = 68.2, s1 = 2.5, n2 = 50, x̄2 = 67.5, s2 = 2.8.
(i) Null Hypothesis: H0: μ1 = μ2 (i.e. male students who participate in college
athletics are not taller than other male students.)
(ii) Alternative Hypothesis: H1: μ1 > μ2 (right – tailed test)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a right-tailed test, the value of Zα at 5% level of significance from
the table = 1.645, i.e. Zα = 1.645
(v) Test Statistic: Under H0, the test statistic for the given samples is

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{68.2 - 67.5}{\sqrt{\dfrac{2.5^2}{50} + \dfrac{2.8^2}{50}}} = 1.3186$$

(vi) Test Criteria: Since test statistic |Z| < critical value | Zα |, the null hypothesis is
accepted.
(vii) Conclusion: Therefore the male students who participate in college athletics are
not taller than other male students.
To find n:
The difference between the mean heights of the two groups, each of size n, will be
significant at the 5% level of significance if Z ≥ 1.645, i.e. if

$$\frac{68.2 - 67.5}{\sqrt{\dfrac{2.5^2}{n} + \dfrac{2.8^2}{n}}} \ge 1.645 \quad \text{or} \quad \frac{0.7}{\sqrt{\dfrac{14.09}{n}}} \ge 1.645$$

$$n \ge \left(\frac{1.645 \times 3.754}{0.7}\right)^2 = 77.83 \approx 78$$

where 3.754 = √14.09. Hence, the size of each sample should be increased by at least 78 − 50 = 28.
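
The sample-size calculation above can be reproduced in Python by solving the inequality for n and rounding up. The helper name `min_common_sample_size` is an assumption made for this sketch.

```python
import math


def min_common_sample_size(diff, s1, s2, z_crit):
    """Smallest n per group such that diff / sqrt((s1^2 + s2^2) / n) >= z_crit."""
    n_exact = (z_crit ** 2) * (s1 ** 2 + s2 ** 2) / diff ** 2
    return math.ceil(n_exact)


# Example 4: difference 0.7, S.D. 2.5 and 2.8, right-tailed critical value 1.645
n = min_common_sample_size(0.7, 2.5, 2.8, 1.645)
print(n, n - 50)   # 78 28
```

Rounding up with `math.ceil` matters here: n must be a whole number, and rounding down could leave Z just below the critical value.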

Examples for Practice

Example 1: Two samples from large populations gave the following results.

              Mean    S.D.    Sample Size
Sample I      250     40      400
Sample II     220     55      400

Test the null hypothesis H0: μ1 = μ2 against H1: μ1 ≠ μ2 at 1% level of significance.
Example 2: The means of two random samples of size 300 and 400 drawn from two
populations having standard deviation 9 and 10 resp. are 50 and 52. Test the equality of the
means of the two populations.
Example 3: The means of two random samples of size 1000 and 2000 respectively are 67.5
and 68 inches. Can the samples be regarded as drawn from the same population of standard
deviation 2.5 inches?
Example 4: The following data relate to the weights of apples collected from two
different farms:

                         Farm-I    Farm-II
No. of apples weighed    75        120
Mean weight              60 gm     55 gm
Standard deviation       2.5 gm    4 gm

Using the above data, can we say that the average weight of apples of Farm-I is more than the
average weight of apples of Farm-II?
Example 5: An intelligence test of two groups of boys and girls gives the following results:
Girls: Mean = 84, S.D. = 10, N = 121
Boys: Mean = 81, S.D. = 12, N = 81
Is the difference in mean scores significant?
Example 6: The mean life of a sample of 100 electric bulbs was found to be 1456 hours
with S.D. 400 hours. A second sample of 225 bulbs chosen from a different batch showed a
mean life of 1400 hours. Assuming that the two populations have the same S.D., is there any
significant difference between the means of the two batches? Also find the 95% confidence limits for
the difference between the means.

Example 7: Test the significance of the difference between the means of two normal
populations with the same standard deviation from the following results.

              Mean    S.D.    Sample Size
Sample I      64      6       100
Sample II     67      8       200

Test the null hypothesis H0: μ1 = μ2 against H1: μ1 ≠ μ2 at 5% level of significance.
Example 8: Two samples drawn from two different populations gave the following results.

              Mean    S.D.    Sample Size
Sample I      340     25      125
Sample II     380     30      150

Test the hypothesis at 5% l.o.s. that the difference of the means of the two populations is 45.
Example 9: The number of accidents per day was studied for 144 days in town A and for 100
days in town B, and the following information was obtained.

              Mean    S.D.
Town A        4.5     1.2
Town B        5.4     1.5

Is the difference between the mean accidents of the two towns statistically significant?
Example 10: Intelligence tests of two groups of boys and girls obtained from two normal
populations having the same standard deviation gave the following results.

              Mean    S.D.    Sample Size
Girls         84      10      121
Boys          81      12      81

Are the differences between the means significant?
Example 11: For sample I, n1 = 1000, Σx = 49000, Σ(x − x̄)² = 784000.
For sample II, n2 = 1500, Σy = 70500, Σ(y − ȳ)² = 2400000.
Discuss the significance of the difference of the sample means.

Small Sample Test
Introduction:
The entire large sample theory was based on the application of the normal test. However,
if the sample size n is small (n < 30), the distributions of the various statistics are far from
normal, and the normal test cannot be applied.
Student’s t - Distribution:
Consider a small sample of size n drawn from a normal population with mean μ and variance
σ². Then Student’s t is defined as

$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}} \sim t_{n-1} \text{ d.f.}$$

Here S denotes the sample standard deviation computed with divisor n − 1, i.e.
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$. With the divisor-n definition of s used in
Test 3 of these notes, the same statistic is written with √(n − 1) in place of √n.
If we calculate t for each sample, we obtain the sampling distribution of t. This distribution
is known as Student’s t - distribution.
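
The idea of "calculating t for each sample" can be illustrated with a small simulation. This is a sketch under assumed values: the population mean 50, S.D. 5, sample size 10, the 5000 repetitions and the seed are chosen only for illustration. 2.262 is the tabulated two-tailed 5% value of t for 9 d.f. used later in these notes.

```python
import math
import random


def t_statistic(sample, mu):
    """t = (xbar - mu) / (S / sqrt(n)), with S computed using divisor n - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    s_unbiased = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu) / (s_unbiased / math.sqrt(n))


random.seed(0)   # fixed seed so that the run is repeatable
mu, sigma, n, repeats = 50.0, 5.0, 10, 5000   # illustrative values
t_values = [
    t_statistic([random.gauss(mu, sigma) for _ in range(n)], mu)
    for _ in range(repeats)
]

# Share of samples whose |t| exceeds the tabulated 5% two-tailed value for 9 d.f.
share = sum(abs(t) > 2.262 for t in t_values) / repeats
print(round(share, 3))   # should come out close to 0.05
```

If the normal critical value 1.96 were used instead of 2.262, the share of rejections would exceed 5%, which is why the normal test is not applied to small samples.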

Test 3: Test of Significance for Single Mean in Small sample test:
Let xi (i = 1, 2, …, n) be a random sample of size n drawn from a normal population.
To test the null hypothesis H0: μ = μ0, that the sample has been drawn from a population with
mean μ0 and variance σ², i.e. that there is no significant difference between the sample mean x̄
and the population mean μ0, the test statistic (for small samples) is:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n-1}} \sim t_{n-1} \text{ d.f.}$$

where the sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and

$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - (\bar{x})^2$$

hence

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - (\bar{x})^2}$$

The statistic t follows Student’s t – distribution with n – 1 degrees of freedom (d.f.).
We now compare the calculated value of t with the tabulated value at a certain level of
significance. If calculated | t | < tabulated t, the null hypothesis is accepted; otherwise it is rejected.

Note: Confidence Limits for μ:

95% confidence limits for μ are $\bar{x} \pm t_{0.05}\,\dfrac{s}{\sqrt{n-1}}$.
99% confidence limits for μ are $\bar{x} \pm t_{0.01}\,\dfrac{s}{\sqrt{n-1}}$.

Assumptions for Student’s t – test:
1. The parent population from which the sample is drawn is normal.
2. The sample observations are independent.
3. The population standard deviation (σ) is unknown.
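
The small-sample statistic can be sketched in Python using the convention of these notes (s computed with divisor n, hence the factor √(n − 1)). The function names are hypothetical. The tabulated value of t is passed in by hand because the Python standard library has no t-distribution table. The figures are those of Solved Example 1 below.

```python
import math


def t_single_mean(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n - 1)), where s uses divisor n."""
    return (xbar - mu0) / (s / math.sqrt(n - 1))


def accept_h0(t, t_tabulated):
    """Two-tailed rule from the notes: accept H0 if calculated |t| < tabulated t."""
    return abs(t) < t_tabulated


# Solved Example 1: mu0 = 0.700, xbar = 0.742, s = 0.040, n = 10, tabulated t = 2.262
t = t_single_mean(0.742, 0.700, 0.040, 10)
print(round(t, 2))           # 3.15
print(accept_h0(t, 2.262))   # False, so H0 is rejected
```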

Solved Examples
Example 1: A machinist is making engine parts with axle diameters of 0.700 inch. A
random sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of
0.040 inch. Compute the statistic you would use to test whether the work is meeting the
specifications.
Solution: Here μ = 0.7, x̄ = 0.742, s = 0.04 and n = 10.
(i) Null Hypothesis, H0: μ = 0.7 (i.e. the product is conforming to the specifications.)
(ii) Alternative Hypothesis, H1: μ ≠ 0.7 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For two-tailed test, at 5% level of significance the value of tα for n − 1 d.f.,
i.e. t0.05 for 9 d.f., from the table = 2.262.
(v) Test Statistic: Under H0, the test statistic is

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{0.742 - 0.7}{0.04/\sqrt{9}} = 3.15$$

(vi) Test Criteria: Since calculated | t | > tabulated t, the null hypothesis is
rejected.
(vii) Conclusion: We conclude that the product is not meeting the specifications.

Example 2: The mean weekly sales of soap bars in departmental stores were 146.3 bars per
store. After an advertising campaign the mean weekly sales in 22 stores for a typical week
increased to 153.7 and showed a s.d. of 17.2. Was the advertising campaign successful?
Solution: Given n = 22, x̄ = 153.7, s = 17.2 and μ = 146.3.
(i) Null Hypothesis, H0: μ = 146.3 (i.e. the advertising campaign is not successful.)
(ii) Alternative Hypothesis, H1: μ > 146.3 (Right-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For right-tailed test, at 5% level of significance the value of
tα for n − 1 d.f., i.e. t0.05 for 21 d.f., from the table = 1.721
(v) Test Statistic: Under H0, the test statistic is

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{153.7 - 146.3}{17.2/\sqrt{21}} \approx 1.9716$$

(vi) Test Criteria: Since calculated t > tabulated t, the null hypothesis is
rejected.
(vii) Conclusion: We conclude that the advertising campaign was successful
in promoting sales.

Example 3: A random sample of 10 boys had the following I.Q.’s: 70, 120, 110, 101, 88, 83,
95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100?
Find a reasonable range in which most of the mean I.Q. values of samples of 10 boys lie.
Solution: Let X = I.Q. of a boy.

From the given data, n = 10, μ = 100, Σx = 972 and Σx² = 96312.

$$\bar{x} = \frac{972}{10} = 97.2, \qquad s = \sqrt{\frac{\sum x_i^2}{n} - (\bar{x})^2} = \sqrt{\frac{96312}{10} - (97.2)^2} = \sqrt{183.36} = 13.5410$$
(i) Null Hypothesis, H0: μ = 100(i.e. the data are consistent with the assumption of a
population mean I.Q. of 100)
(ii) Alternative Hypothesis, H1: μ ≠ 100 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For two-tailed test, at 5% level of significance the value of tα for n − 1 d.f.,
i.e. t0.05 for 9 d.f., from the table = 2.262
(v) Test Statistic: Under H0, the test statistic is

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{97.2 - 100}{13.5410/\sqrt{9}} = -0.6203, \qquad |t| = 0.6203$$
(vi)Test Criteria: Since calculated | t | < tabulated tα, hence the null hypothesis is
accepted.
(vii) Conclusion: We conclude that the data are consistent with the assumption of
mean I.Q. of 100 in the population.
Confidence interval for μ:
The 95% confidence limits within which the mean I.Q. values of samples of 10 boys will lie are given by:
$$\bar{x} \pm t_{0.05}\,\frac{s}{\sqrt{n-1}} = 97.2 \pm 2.262\left(\frac{13.5410}{\sqrt{9}}\right) = 86.9901 \ \text{and}\ 107.4099$$
Hence the required 95% confidence interval is 86.9901 < μ < 107.4099.
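
The same steps can be reproduced from the raw I.Q. values. The sketch below uses only the Python standard library and follows the notes' convention (s with divisor n, standard error s/√(n − 1)); the variable names are illustrative and the critical value 2.262 is the tabulated figure quoted above rather than something the code computes.

```python
import math

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu0 = 100
t_table = 2.262  # tabulated two-tailed t(0.05, 9 d.f.)

n = len(iq)
mean = sum(iq) / n
# standard deviation with divisor n, matching the formula used in the solution
s = math.sqrt(sum(x * x for x in iq) / n - mean ** 2)
se = s / math.sqrt(n - 1)

t = (mean - mu0) / se
lower, upper = mean - t_table * se, mean + t_table * se

print(f"mean = {mean}, s = {s:.4f}, t = {t:.4f}")
print(f"95% limits: {lower:.4f} to {upper:.4f}")
```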

Example 4: A random sample of 16 values from a normal population showed a mean of 41.5 inches and the sum of squares of deviations from this mean equal to 135 square inches. Show that the assumption of a mean of 43.5 inches for the population is not reasonable. Obtain 95% and 99% confidence limits for the same.

Solution: Given $n = 16$, $\bar{x} = 41.5$ inches and $\sum (x - \bar{x})^2 = 135$ sq. inches; hence,
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} = \sqrt{\frac{135}{16}} = 2.9047$$
(i) Null Hypothesis, H0: μ = 43.5(i.e. the data are consistent with the assumption that
mean of the population is 43.5 inches.)
(ii) Alternative Hypothesis, H1: μ ≠ 43.5 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a two-tailed test at the 5% level of significance, the tabulated value of $t_{\alpha,\,n-1}$, i.e. $t_{0.05,\,15}$, is 2.131.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} \sim t_{(n-1)}, \qquad t = \frac{41.5 - 43.5}{2.9047/\sqrt{15}} = -2.6667, \qquad |t| = 2.6667$$
(vi)Test Criteria: Since calculated | t | > tabulated tα, hence the null hypothesis is
rejected.
(vii) Conclusion: We conclude that the assumption of a mean of 43.5 inches for the
population is not reasonable.
Confidence interval for μ:
The 95% confidence limits for μ are given by:
$$\bar{x} \pm t_{0.05}\,\frac{s}{\sqrt{n-1}} = 41.5 \pm 2.131\left(\frac{2.9047}{\sqrt{15}}\right) = 41.5 \pm 1.5982$$
Hence the required 95% confidence interval is 39.9018 < μ < 43.0982.
The 99% confidence limits for μ are given by:
$$\bar{x} \pm t_{0.01}\,\frac{s}{\sqrt{n-1}} = 41.5 \pm 2.947\left(\frac{2.9047}{\sqrt{15}}\right) = 41.5 \pm 2.2102$$
Hence the required 99% confidence interval is 39.2898 < μ < 43.7102.

Examples for Practice

Example 1: Define the t distribution. State the assumptions used for testing a specified mean.
Example 2: A random sample of 10 observations of animal growth under standard conditions gives 96.7, 84.8, 101.8, 78.3, 110.6, 93.4, 87.8, 91.3, 98.2, 88.7 cm. Test the hypothesis that they came from a distribution with mean 100.
Example 3: Tests made on the breaking strength of 10 pieces of a metal wire gave the results 578, 572, 570, 568, 572, 570, 570, 572, 596, and 584 kg. Test if the mean breaking strength of the wire can be assumed to be 577 kg.
Example 4: On average the gain in weight due to a certain drug is 2.5 kg. If the gains in weight of five different subjects after applying this drug are 3, 2.8, 3.6, 3.2, 3, can we say, using the t-test, that the gain in weight is more than expected?
Example 5: 10 subjects tested for blood clotting time have shown a mean and standard deviation of 95 and 5 sec respectively. Using the t-test, test the hypothesis that "the average blood clotting time is 100 sec".
Example 6: A company claims that a plant grown from its seed will give on average 2kg of production. A sample of 10 such plants has mean and standard deviation of production of 17 kg and 0.25 kg respectively. From the sample results can we say that the claim of the company is not true?
Example 7: A fertilizer mixing machine is set to give 12 kg of nitrate per quintal bag of fertilizer. Ten 100 kg bags are examined. The percentages of nitrate per bag are as follows: 11, 14, 13, 12, 13, 12, 13, 14, 11, 12. Is there any reason to believe that the machine is defective?
Example 8: The nine items of a sample have the following values: 45, 47, 50, 52, 48, 47, 49, 53, and 51. Does the mean of these differ significantly from the assumed mean of 47.5?
Example 9: The following values give the lengths of 12 samples of Egyptian cotton taken from a consignment: 48, 46, 49, 46, 52, 45, 43, 47, 47, 46, 45, and 50. Test if the mean length of the consignment can be taken as 46.
Example 10: A company supplies toothpaste in packs of 100 gm. A sample of 10 packs gave the following weights in gm: 100.5, 100.3, 100.1, 99.8, 99.7, 100.3, 100.4, 99.2, 99.3, and 99.7. Does the sample support the claim of the company that the pack weight is 100 gm?
Example 11: A sample of 20 items has mean 42 units and S.D. 5 units. Test the hypothesis that it is a random sample from a normal population with mean 45.
Example 12: The lifetime of electric bulbs for a random sample of 10 from a large consignment gave the following data:
Item               1    2    3    4    5    6    7    8    9    10
Life in '000 hrs   4.2  4.6  3.9  4.1  5.2  3.8  3.9  4.3  4.4  5.6
Can we accept the hypothesis that the average life of a bulb is 4000 hrs?

Test 4: Test of Significance for Difference of Means (small samples):
Suppose we want to test if two independent samples Xi (i = 1, 2, …, n1) and Yj (j = 1, 2, …, n2) of sizes n1 and n2 have been drawn from two normal populations with means μx and μy respectively.
Under the null hypothesis H0: μx = μy, i.e. that the samples have been drawn from normal populations with the same mean, and under the assumption that the population variances are equal, i.e. σ1² = σ2² = σ², the test statistic (for small samples)
$$t = \frac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$
follows Student's t-distribution with n1 + n2 – 2 degrees of freedom (d.f.), where the sample means are
$$\bar{x} = \frac{\sum_{i=1}^{n_1} x_i}{n_1} \quad \text{and} \quad \bar{y} = \frac{\sum_{j=1}^{n_2} y_j}{n_2}$$
and
$$S^2 = \frac{\sum_{i=1}^{n_1} (x_i - \bar{x})^2 + \sum_{j=1}^{n_2} (y_j - \bar{y})^2}{n_1 + n_2 - 2} = \frac{\left[\sum x_i^2 - n_1(\bar{x})^2\right] + \left[\sum y_j^2 - n_2(\bar{y})^2\right]}{n_1 + n_2 - 2}$$
or, in terms of the sample variances $S_1^2$ and $S_2^2$ (computed with divisors $n_1$ and $n_2$),
$$S^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2}$$



We now compare the calculated value of t with the tabulated value at certain level of
significance. If calculated | t | < tabulated t, null hypothesis is accepted otherwise rejected.

Assumptions for Student's t-test:

1. The parent populations from which the samples are drawn are normal.
2. The two samples are random and independent of each other.
3. The population variances are equal and unknown, i.e. σ1² = σ2² = σ², where σ² is unknown.

Solved Examples
Example 1: For a random sample of 10 pigs fed on diet A, the increases in weight in a certain period were 10, 6, 16, 17, 13, 12, 8, 14, 15, 9 lbs. For another random sample of 12 pigs fed on diet B, the increases in weight in the same period were 7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17 lbs. Test whether diets A and B differ significantly as regards their effect on increase in weight.
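Solution: Here, X = Diet A and Y = Diet B
$$n_1 = 10,\ n_2 = 12,\ \sum x = 120,\ \sum x^2 = 1560,\ \sum y = 180,\ \sum y^2 = 3014$$
$$\bar{x} = \frac{\sum_{i=1}^{n_1} x_i}{n_1} = \frac{120}{10} = 12 \quad \text{and} \quad \bar{y} = \frac{\sum_{j=1}^{n_2} y_j}{n_2} = \frac{180}{12} = 15$$
$$S^2 = \frac{\left[\sum x_i^2 - n_1(\bar{x})^2\right] + \left[\sum y_j^2 - n_2(\bar{y})^2\right]}{n_1 + n_2 - 2} = \frac{\left[1560 - 10(12)^2\right] + \left[3014 - 12(15)^2\right]}{10 + 12 - 2} = 21.7$$
$$S = \sqrt{21.7} = 4.6583$$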
(i) Null Hypothesis, H0: μx = μy (i.e. there is no significant difference between the
mean increase in weight due to diets A and B.)
(ii) Alternative Hypothesis, H1: μx ≠ μy (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a two-tailed test, the critical value of t at the 5% level of significance for 20 degrees of freedom from the table is 2.086, i.e. $t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,20} = 2.086$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}, \qquad t = \frac{12 - 15}{4.6583\sqrt{\dfrac{1}{10} + \dfrac{1}{12}}} = -1.504, \qquad |t| = 1.504$$
(vi) Test Criteria: Since calculated | t | < tabulated tα, the null hypothesis is accepted.
(vii) Conclusion: We conclude that the data are consistent with the assumption that there is no significant difference between the mean increase in weight due to diets A and B.
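
As a cross-check on the hand computation, the pooled-variance t statistic can be computed directly from the two samples. The following is a minimal sketch using only the standard library; `pooled_t` is a hypothetical helper name, and the critical value 2.086 is the tabulated figure quoted in step (iv).

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic with pooled variance S^2 on n1 + n2 - 2 d.f."""
    n1, n2 = len(x), len(y)
    mx, my = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - mx) ** 2 for v in x)  # sum of squared deviations, sample X
    ss_y = sum((v - my) ** 2 for v in y)  # sum of squared deviations, sample Y
    s2 = (ss_x + ss_y) / (n1 + n2 - 2)
    t = (mx - my) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t, s2, n1 + n2 - 2

diet_a = [10, 6, 16, 17, 13, 12, 8, 14, 15, 9]
diet_b = [7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17]

t, s2, df = pooled_t(diet_a, diet_b)
t_table = 2.086  # tabulated two-tailed t(0.05, 20 d.f.)

print(f"S^2 = {s2:.1f}, d.f. = {df}, t = {t:.3f}")
print("accept H0" if abs(t) < t_table else "reject H0")
```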

Example 2: The heights of 6 randomly chosen sailors in inches are 63, 65, 68, 69, 71, and
72. Those of 10 randomly chosen soldiers are 61, 62, 65, 66, 69, 69, 70, 71, 72 and 73. Test
whether the sailors are on the average taller than soldiers.
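Solution: Here, X = Height of sailors and Y = Height of soldiers.
$$n_1 = 6,\ n_2 = 10,\ \sum x = 408,\ \sum x^2 = 27804,\ \sum y = 678,\ \sum y^2 = 46122$$
$$\bar{x} = \frac{\sum_{i=1}^{n_1} x_i}{n_1} = \frac{408}{6} = 68 \quad \text{and} \quad \bar{y} = \frac{\sum_{j=1}^{n_2} y_j}{n_2} = \frac{678}{10} = 67.8$$
$$S^2 = \frac{\left[\sum x_i^2 - n_1(\bar{x})^2\right] + \left[\sum y_j^2 - n_2(\bar{y})^2\right]}{n_1 + n_2 - 2} = \frac{\left[27804 - 6(68)^2\right] + \left[46122 - 10(67.8)^2\right]}{6 + 10 - 2} = 15.2571$$
$$S = \sqrt{15.2571} = 3.9060$$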
(i) Null Hypothesis, H0: μx = μy (i.e. there is no significant difference between the mean heights of sailors and soldiers.)
(ii) Alternative Hypothesis, H1: μx > μy (Right-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a right-tailed test, the critical value of t at the 5% level of significance for 14 degrees of freedom from the table is 1.761, i.e. $t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,14} = 1.761$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}, \qquad t = \frac{68 - 67.8}{3.9060\sqrt{\dfrac{1}{6} + \dfrac{1}{10}}} = 0.099$$
(vi) Test Criteria: Since calculated | t | < tabulated tα, the null hypothesis is accepted.
(vii) Conclusion: We conclude that the data are consistent with the assumption that the sailors are not on the average taller than soldiers.

Example 3: Samples of two types of electric light bulbs were tested for length of life and the following results were obtained:
           Size   Sample mean   Sample S.D.
Type 1     8      1234 hrs      36 hrs
Type 2     7      1036 hrs      40 hrs
Is the difference in the means significant enough to conclude that type 1 is superior to type 2 regarding length of life?
Solution: Here $n_1 = 8$, $\bar{x}_1 = 1234$, $s_1 = 36$, $n_2 = 7$, $\bar{x}_2 = 1036$, $s_2 = 40$.
$$S^2 = \frac{1}{n_1 + n_2 - 2}\left(n_1 s_1^2 + n_2 s_2^2\right) = \frac{1}{8 + 7 - 2}\left[8(36)^2 + 7(40)^2\right] = 1659.0769$$
$$S = \sqrt{1659.0769} = 40.7317$$
(i) Null Hypothesis, H0: μx = μy (i.e. two types of electric light bulbs are identical.)
(ii) Alternative Hypothesis, H1: μx > μy (Right-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a right-tailed test, the critical value of t at the 5% level of significance for 13 degrees of freedom from the table is 1.771, i.e. $t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,13} = 1.771$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}, \qquad t = \frac{1234 - 1036}{40.7317\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.3924$$
(vi) Test Criteria: Since calculated | t | > tabulated tα, the null hypothesis is rejected.
(vii) Conclusion: We conclude that type 1 is superior to type 2 regarding length of life of bulb.
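
When only the sample sizes, means and standard deviations are available, as in this example, the pooled t statistic can be computed from the summary figures alone. The sketch below assumes, as the solution does, that the given standard deviations use divisors n1 and n2; `pooled_t_from_summary` is a hypothetical helper name and 1.771 is the tabulated value for 13 d.f.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled two-sample t from summary statistics.

    sd1 and sd2 are assumed to be computed with divisors n1 and n2, so that
    S^2 = (n1*sd1^2 + n2*sd2^2) / (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    s2 = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / df
    t = (mean1 - mean2) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t, s2, df

t, s2, df = pooled_t_from_summary(8, 1234, 36, 7, 1036, 40)
t_table = 1.771  # tabulated right-tailed t(0.05, 13 d.f.)

print(f"S^2 = {s2:.4f}, d.f. = {df}, t = {t:.4f}")
print("reject H0" if t > t_table else "accept H0")
```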

Examples for Practice

Example 1: Two types of drugs, A and B, were used on 5 and 7 patients respectively for reducing their weight. Drug A was imported and drug B indigenous. The decrease in weight after using the drugs for six months was as follows:
Drug A:  10  12  13  11  14
Drug B:  8   9   12  14  15  10  9
Is there a significant difference in the efficacy of the two drugs? If not, which drug should you buy?
Example 2: Two samples of sodium vapour bulbs were tested for length of life and the following results were obtained:
           Size   Sample mean   Sample S.D.
Type 1     8      1026 hrs      32 hrs
Type 2     7      1065 hrs      39 hrs
Is the difference in the means significant enough to conclude that type 1 is superior to type 2 regarding length of life?
Example 3: The heights of 5 randomly chosen sailors in inches are 63, 65, 68, 71, and 72. Those of 9 randomly chosen soldiers are 61, 62, 65, 66, 69, 70, 71, 72 and 73. Test whether the sailors are on the average taller than soldiers.
Example 4: Following are the results of growth in weight due to two different diets.
                     Diet A    Diet B
No. of subjects      10        15
Mean weight          2.2 kg    3.5 kg
Standard deviation   0.75 kg   1.20 kg
Using the t-test, is it reasonable to accept the hypothesis "Average weight is the same for both diets"?
Example 5: A group of 10 boys fed on diet A and another group of 8 boys fed on a different diet B recorded the following increase in weight (kg).
Diet A:  5  6  8  1  12  4  3  9  6  10
Diet B:  2  3  6  8  10  1  2  8
Does it show the superiority of diet A over diet B?

Example 6: The following are the numbers of sales which a sample of 6 sales people of gas lighters in city A and a sample of 8 sales people of gas lighters in another city B made over a certain fixed period of time.
City A:  63  48  54  44  59  52
City B:  41  52  38  50  66  54  44  61
Assuming that the populations sampled can be approximated closely by normal distributions having the same variance, test H0: µ1 = µ2 against H1: µ1 ≠ µ2 at the 5% level of significance.
Example 7: 7 plants of wheat grown in pots and given a standard fertilizer treatment yield respectively 8.4, 4.5, 3.8, 6.1, 4.7, 11.2, 9.6 gm dry weight of seed. A further 8 plants from the same source are grown in similar conditions but with a different fertilizer and yield respectively 1.6, 7.5, 10.4, 8.4, 13.0, 7.0, 9.6, 13.2 gm. Test whether the two fertilizer treatments have different effects on seed production.

Test 5: Paired t-test for Difference of Means (samples are not independent):
Let us now consider the case when
i) the sample sizes are equal, i.e. n1 = n2 = n, and
ii) the two samples are not independent but the sample observations are paired together.
Let (Xi, Yi) (i = 1, 2, …, n) be pairs of observations recorded on n individuals at two different times. Assuming normality, we want to test the null hypothesis H0: μD = 0, i.e. that the two means do not differ significantly, where μD refers to the mean difference. Let the differences of the paired data be denoted by di = xi − yi (i = 1, 2, …, n). The test statistic
$$t = \frac{\bar{d}}{S/\sqrt{n}} \sim t_{n-1}$$
follows Student's t-distribution with n – 1 degrees of freedom (d.f.), where
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}, \qquad S^2 = \frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1} = \frac{\sum d_i^2 - n(\bar{d})^2}{n - 1}$$
We now compare the calculated value of t with the tabulated value at a certain level of significance. If calculated | t | < tabulated t, the null hypothesis is accepted; otherwise it is rejected.

Solved Examples

Example 1: A certain stimulus administered to each of 12 patients resulted in the following increases of blood pressure: 5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4 and 6. Can it be concluded that the stimulus will, in general, be accompanied by an increase in blood pressure?
Solution: Here we are given the increments in blood pressure, i.e. di = xi − yi (i = 1, 2, …, 12).
$$n = 12,\ \sum d = 31,\ \sum d^2 = 185$$
$$\bar{d} = \frac{\sum d}{n} = \frac{31}{12} = 2.5833, \qquad S^2 = \frac{\sum d_i^2 - n(\bar{d})^2}{n - 1} = \frac{185 - 12(2.5833)^2}{11} = 9.5380$$
$$S = \sqrt{9.5380} = 3.0883$$

(i) Null Hypothesis, H0: μD = 0 (i.e. there is no significant change in the blood pressure readings of the patients after the stimulus.)
(ii) Alternative Hypothesis, H1: μD > 0 (Right-tailed) (i.e. the stimulus results in an increase in blood pressure.)
(iii) Level of Significance: α = 0.05

(iv) Critical value: For a single-tail test, the critical value of t at the 5% level of significance for 11 degrees of freedom from the table is 1.796, i.e. $t_{\alpha,\,n-1} = t_{0.05,\,11} = 1.796$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{d}}{S/\sqrt{n}} \sim t_{n-1}, \qquad t = \frac{2.5833}{3.0883/\sqrt{12}} = 2.8976$$
(vi) Test Criteria: Since calculated | t | > tabulated tα, the null hypothesis is rejected.
(vii) Conclusion: We conclude that the stimulus will, in general, be accompanied by an increase in blood pressure.
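
The paired t statistic for this example can be verified from the twelve differences. This is a small illustrative sketch (the helper name `paired_t` is hypothetical); it uses the divisor n − 1 for S², as in the formula for this test, and the tabulated value 1.796 quoted above.

```python
import math

def paired_t(d):
    """Paired t statistic from the list of differences d_i."""
    n = len(d)
    d_bar = sum(d) / n
    s2 = sum((v - d_bar) ** 2 for v in d) / (n - 1)  # divisor n - 1
    return d_bar / math.sqrt(s2 / n), d_bar, s2

increase = [5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6]
t, d_bar, s2 = paired_t(increase)
t_table = 1.796  # tabulated right-tailed t(0.05, 11 d.f.)

print(f"d_bar = {d_bar:.4f}, S^2 = {s2:.4f}, t = {t:.4f}")
print("reject H0" if t > t_table else "accept H0")
```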
Example 2: 10 soldiers visit a rifle range for two consecutive weeks. For the first week their
scores are: 67, 24, 57, 55, 63, 54, 56, 68, 33, and 43 and during the second week they score
in the same order 70, 38, 58, 58, 56, 67, 68, 72, 42, and 38. Examine if there is any
significant difference in their performance.
Solution: Let the rifle range scores of the first and second week be represented by the variables X and Y respectively.
Here the difference in their performance is di = xi − yi (i = 1, 2, …, 10).
$$n = 10,\ \sum d = -47,\ \sum d^2 = 699$$
$$\bar{d} = \frac{\sum d}{n} = \frac{-47}{10} = -4.7, \qquad S^2 = \frac{\sum d_i^2 - n(\bar{d})^2}{n - 1} = \frac{699 - 10(-4.7)^2}{9} = 53.1222$$
$$S = \sqrt{53.1222} = 7.2884$$

(i) Null Hypothesis, H0: μD = 0 (i.e. there is no significant difference in their performance.)
(ii) Alternative Hypothesis, H1: μD ≠ 0 (Two-tailed)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a two-tailed test, the critical value of t at the 5% level of significance for 9 degrees of freedom from the table is 2.262, i.e. $t_{\alpha,\,n-1} = t_{0.05,\,9} = 2.262$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{d}}{S/\sqrt{n}} \sim t_{n-1}, \qquad t = \frac{-4.7}{7.2884/\sqrt{10}} = -2.0392, \qquad |t| = 2.0392$$
(vi) Test Criteria: Since calculated | t | = 2.0392 < tabulated tα = 2.262, the null hypothesis is accepted.
(vii) Conclusion: We conclude that there is no significant difference in their performance over the two weeks.

Example 3: 11 school boys were given a test in statistics. They were given a month’s extra
coaching and a second test was held at the end of it. Do the marks give the evidence that the
students have benefited by extra coaching?
Boys 1 2 3 4 5 6 7 8 9 10 11
Marks in 1st test 23 20 19 21 18 20 18 17 23 16 19
Marks in 2nd test 24 19 22 18 20 22 20 20 23 20 18

Solution: Let the marks in the 1st test and 2nd test be represented by the variables X and Y respectively; then the difference in their marks is di = xi − yi (i = 1, 2, …, 11).
$$n = 11,\ \sum d = -12,\ \sum d^2 = 58$$
$$\bar{d} = \frac{\sum d}{n} = \frac{-12}{11} = -1.0909, \qquad S^2 = \frac{\sum d_i^2 - n(\bar{d})^2}{n - 1} = \frac{58 - 11(-1.0909)^2}{10} = 4.4909$$
$$S = \sqrt{4.4909} = 2.1191$$

(i) Null Hypothesis, H0: μD = 0 (i.e. there is no significant difference in their marks after extra coaching.)
(ii) Alternative Hypothesis, H1: μD < 0 (Left-tailed) (i.e. with di = xi − yi, the marks in the second test are higher.)
(iii) Level of Significance: α = 0.05
(iv) Critical value: For a single-tail test, the critical value of t at the 5% level of significance for 10 degrees of freedom from the table is 1.812, i.e. $t_{\alpha,\,n-1} = t_{0.05,\,10} = 1.812$.
(v) Test Statistic: Under H0, the test statistic is
$$t = \frac{\bar{d}}{S/\sqrt{n}} \sim t_{n-1}, \qquad t = \frac{-1.0909}{2.1191/\sqrt{11}} = -1.7073, \qquad |t| = 1.7073$$
(vi) Test Criteria: Since calculated | t | = 1.7073 < tabulated tα = 1.812, the null hypothesis is accepted.
(vii) Conclusion: We conclude that there is no significant difference in their marks after extra coaching.
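
The computation can also be run from the two columns of marks. The sketch below keeps the solution's definition d = x − y (first test minus second test), so a benefit from coaching shows up as a negative mean difference and the calculated t is compared with −1.812 in the left tail. Variable names are illustrative only.

```python
import math

first = [23, 20, 19, 21, 18, 20, 18, 17, 23, 16, 19]
second = [24, 19, 22, 18, 20, 22, 20, 20, 23, 20, 18]

d = [x - y for x, y in zip(first, second)]  # d_i = x_i - y_i
n = len(d)
d_bar = sum(d) / n
s = math.sqrt((sum(v * v for v in d) - n * d_bar ** 2) / (n - 1))
t = d_bar / (s / math.sqrt(n))

t_table = 1.812  # tabulated one-tailed t(0.05, 10 d.f.)
# Improvement means second > first, i.e. a negative mean difference (left tail).
decision = "reject H0" if t < -t_table else "accept H0"

print(f"sum d = {sum(d)}, S = {s:.4f}, t = {t:.4f}: {decision}")
```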

Examples for Practice

Example 1: The following data represent the marks obtained by 12 students in 2 tests, one held before coaching and the other after coaching.
Test 1:  55  60  65  75  49  25  18  30  35  51  61  72
Test 2:  63  70  70  81  54  29  21  38  32  50  70  80
Do the data indicate that the coaching was effective in improving the performance of the students?
Example 2: The memory capacity of 9 students was tested before and after a course of meditation for a month. State whether the course was effective or not from the data below (in the same units):
Before:  10  15  9  3  7  12  16  17  4
After:   12  17  8  5  6  11  18  20  3

Example 3: Eleven students were given a test in statistics. They were given a month's further tuition and a second test of equal difficulty was held at the end of it. Do the marks give evidence that the students have benefitted by extra coaching?
Boys:          1   2   3   4   5   6   7   8   9   10  11
Marks, test 1: 23  20  19  21  18  20  18  17  23  16  19
Marks, test 2: 24  19  22  18  20  22  20  20  23  20  17

Example 4: A drug was administered to 5 persons and the systolic blood pressure before and after was measured. The results are given below:
Candidates:   1    2    3    4    5
B.P. before:  140  130  132  150  140
B.P. after:   132  126  133  144  133
Test whether the drug is effective in lowering the systolic blood pressure.

Example 5: A certain injection administered to 12 patients resulted in the following changes of blood pressure: 5, 2, 8, -1, 3, 0, 6, -2, 1, 5, 0, 4.
Can it be concluded that the injection will, in general, be accompanied by an increase in blood pressure?
Example 6: The sales data of an item in six shops before and after a special promotional campaign are as under.
Shops:            A   B   C   D   E   F
Before campaign:  53  28  31  48  50  42
After campaign:   58  29  30  55  56  45
Can the campaign be judged to be a success at the 5% level of significance?

Chi-square distribution

Introduction:
The square of a standard normal variate is known as a chi-square variate with 1 degree of freedom (d.f.).
Thus, if $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$ and $Z^2 = \left(\dfrac{X - \mu}{\sigma}\right)^2$ is a chi-square variate with 1 d.f.
In general, if Xi (i = 1, 2, 3, …, n) are n independent normal variates with means μi and variances σi² (i = 1, 2, 3, …, n), then
$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$
is a chi-square variate with n d.f.
Applications of Chi - Square distribution:
1. To test whether the population variance has a specified (hypothetical) value, i.e. σ² = σ0².
2. To test the goodness of fit.
3. To test the independence of the attributes.
Test 6: Chi-square test for population variance:
Suppose we want to test if a random sample xi (i = 1, 2, …, n) has been drawn from a normal population with a specified variance σ² = σ0², i.e. H0: σ² = σ0², using the test statistic
$$\chi^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma_0^2} = \frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2_{n-1}$$
where
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{and} \quad s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum x^2 - n(\bar{x})^2}{n-1}$$
Test criteria:

1. If H0: σ² = σ0² vs. H1: σ² ≠ σ0², then accept the null hypothesis for $\chi^2_{1-\alpha/2,\,n-1} < \chi^2 < \chi^2_{\alpha/2,\,n-1}$.

2. If H0: σ² = σ0² vs. H1: σ² > σ0², then accept the null hypothesis for $\chi^2 < \chi^2_{\alpha,\,n-1}$.

3. If H0: σ² = σ0² vs. H1: σ² < σ0², then accept the null hypothesis for $\chi^2 > \chi^2_{1-\alpha,\,n-1}$.

4. 100(1 − α)% confidence limits for σ²:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

Note: If the sample standard deviation s is computed with divisor n, then use the test statistic
$$\chi^2 = \frac{n s^2}{\sigma_0^2} \sim \chi^2_{n-1}$$


Solved Examples
Example 1: It is believed that the precision (as measured by the variance) of an instrument is
no more than 0.16. Write down the null and alternative hypothesis for testing this belief.
Carry out the test at 1% level given 11 measurement of the same subject on the instrument,
2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, and 2.5.

Solution: Here we are given the measurements, $n = 11$, $\sum x = 27.6$, $\sum x^2 = 69.44$.
$$\bar{x} = \frac{\sum x}{n} = \frac{27.6}{11} = 2.5090, \qquad s^2 = \frac{\sum x^2 - n(\bar{x})^2}{n-1} = \frac{69.44 - 11(2.5090)^2}{10} = 0.0189$$

(i) Null Hypothesis, H0: σ² = 0.16 (i.e. the sample is drawn from a population with variance 0.16.)
(ii) Alternative Hypothesis, H1: σ² > 0.16 (Right-tailed)
(iii) Level of Significance: α = 0.01
(iv) Critical value: For a single-tail test, the critical value of $\chi^2$ at the 1% level of significance for 10 degrees of freedom from the table is 23.209, i.e. $\chi^2_{\alpha,\,n-1} = \chi^2_{0.01,\,10} = 23.209$.
(v) Test Statistic: Under H0, the test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2_{n-1}, \qquad \chi^2 = \frac{10 \times 0.0189}{0.16} = 1.1818$$
(vi) Test Criteria: Since calculated $\chi^2$ < tabulated $\chi^2$, the null hypothesis is accepted.
(vii) Conclusion: We conclude that the data are consistent with the belief that the variance of the instrument is no more than 0.16.
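
A short Python sketch of the same variance test from the eleven measurements is given below. It assumes the (n − 1)s²/σ0² form of the statistic with s² computed using divisor n − 1; the critical value 23.209 is the tabulated figure quoted above and the variable names are illustrative.

```python
x = [2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, 2.5]
sigma0_sq = 0.16
chi2_table = 23.209  # tabulated chi-square(0.01, 10 d.f.)

n = len(x)
mean = sum(x) / n
s2 = sum((v - mean) ** 2 for v in x) / (n - 1)  # sample variance, divisor n - 1
chi2 = (n - 1) * s2 / sigma0_sq

print(f"s^2 = {s2:.4f}, chi-square = {chi2:.4f}")
print("accept H0" if chi2 < chi2_table else "reject H0")
```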

Example 2: 15 bean plants are grown in a pot in a greenhouse and their heights are measured after a standard period of growth. In previous experiments, the variance of height has been 27.45 cm². For these 15 plants the S.D. of height is 5.81 cm. 1) Test the null hypothesis that the variance has not changed. 2) Find a 95% confidence interval for σ².

Solution: Here we are given $n = 15$, $\sigma_0^2 = 27.45$, $S = 5.81$.

(i) Null Hypothesis, H0: σ2 = 27.45 (i.e. Samples are drawn from population variance
27.45.)
(ii) Alternative Hypothesis, H1: σ2 ≠ 27.45 (Two-tailed)
(iii) Level of Significance: α = 0.05

(iv) Critical value: For a two-tail test, the critical value of $\chi^2$ at the 5% level of significance for 14 degrees of freedom from the table is 26.119, i.e. $\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,14} = 26.119$.
(v) Test Statistic: Under H0, the test statistic is
$$\chi^2 = \frac{n s^2}{\sigma_0^2} \sim \chi^2_{n-1}, \qquad \chi^2 = \frac{15 \times 5.81^2}{27.45} = 18.4459$$
(vi) Test Criteria: Since calculated $\chi^2$ < tabulated $\chi^2$, the null hypothesis is accepted.
(vii) Conclusion: We conclude that the sample is consistent with a population variance of 27.45.
95% confidence limits for σ²:
$$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,14} = 26.119, \qquad \chi^2_{1-\alpha/2,\,n-1} = \chi^2_{0.975,\,14} = 5.629$$
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
$$\frac{14 \times 5.81^2}{26.119} < \sigma^2 < \frac{14 \times 5.81^2}{5.629}$$
$$\Rightarrow\ 18.0935 < \sigma^2 < 83.9554$$
Examples for Practice

Example 1: A random sample of size 20 from a normal population gives a mean of 42 and a variance of 25. Test the hypothesis that the population standard deviation is 8 at the 5% l.o.s. Also find a 95% confidence interval for the variance.

Example 2: A sample of 10 observations 4.5, 4.9, 3.5, 6.4, 1.3, 6.7, 2.4, 8.9, 3.4, and 2.9 is thought to be taken from a normal population with variance 2.1. Test whether the hypothesis is reasonable.

Example 3: A random sample of 15 values shows that the standard deviation is 6.4. Is this compatible with the hypothesis that the sample is drawn from a normal population with variance 25?

Example 4: Test the hypothesis that σ = 8, given that S = 10 for a random sample of size 51 drawn from a normal population.

Example 5: A manufacturer claims that the lifetime of a certain brand of batteries produced by his factory has a variance of 5000 hours². A sample of size 26 has a variance of 7200 hours². Assuming that it is reasonable to treat these data as a random sample from a normal population, test the manufacturer's claim.

Example 6: A manufacturer recorded the cut-off bias (volt) of a sample of 10 tubes as follows: 12.1, 12.3, 11.8, 12.0, 12.4, 12.0, 11.9, 12.2, and 12.2. The variability of cut-off bias for tubes of a standard type, as measured by the standard deviation, is 0.208 volts. Is the variability of the new tube with respect to cut-off bias less than that of the standard type?

Test 7: Goodness of Fit Test:
The goodness of fit test is a very powerful test for testing the significance of the discrepancy between theory and experiment. It enables us to find out whether the deviation of the experiment from theory is just due to chance or is really due to the inadequacy of the theory to fit the observed data.
If Oi (i = 1, 2, 3, …, n) is a set of observed frequencies and Ei is the corresponding set of expected frequencies, then the test statistic
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{n-1}$$
follows the chi-square distribution with (n – 1) degrees of freedom, where Oi = observed frequency, Ei = expected frequency, and $\sum_{i=1}^{n} O_i = \sum_{i=1}^{n} E_i$.
Test criteria:
If calculated $\chi^2$ is less than tabulated $\chi^2_{\alpha,\,n-1}$, then accept the null hypothesis; otherwise reject it.
Solved Examples

Example 1: The demand for a particular spare part in a factory was found to vary from day-
to-day. In a sample study the following information was obtained:
Days: Mon Tue Wed Thu Fri Sat
No. of part demanded: 1124 1125 1110 1120 1126 1115
Test the hypothesis that the number of parts demanded does not depend on the day of the
week.
Solution:
(i) Null Hypothesis: H0: The number of parts demanded does not depend on the day
of the week.
(ii) Alternative Hypothesis: H1: The number of parts demanded depends on the day
of the week.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (6-1) = 5 degrees of freedom the critical value of $\chi^2$ at the 5% level of significance from the table is 11.071, i.e. $\chi^2_{0.05,\,5} = 11.071$.
(v) Test Statistic: On the basis of the null hypothesis, the expected frequency of the spare part on each of the six days would be:
$$\frac{1}{6}(1124 + 1125 + 1110 + 1120 + 1126 + 1115) = \frac{6720}{6} = 1120$$

Calculation of test statistic

O        E        (O − E)²/E
1124     1120     0.0142
1125     1120     0.0223
1110     1120     0.0892
1120     1120     0
1126     1120     0.0321
1115     1120     0.0223
Total             0.1803

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 0.1803$$
(vi) Test Criteria: Since calculated chi-square is less than the critical value hence,
accept the null hypothesis.
(vii) Conclusion: We conclude that number of parts demanded does not depend on the
day of the week.

Example 2: The following figures show the distribution of digits in numbers chosen at
random from a telephone directory:
Digits: 0 1 2 3 4 5 6 7 8 9
Frequency: 1026 1107 997 966 1075 933 1107 972 964 853
Test whether the digits may be taken to occur equally frequently in the directory.
Solution:
(i) Null Hypothesis: H0: The digits occur equally frequently in the directory.
(ii) Alternative Hypothesis: H1: The digits will not occur equally frequently in the
directory.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (10-1) = 9 degrees of freedom the critical value of $\chi^2$ at the 5% level of significance from the table is 16.919, i.e. $\chi^2_{0.05,\,9} = 16.919$.
(v) Test Statistic: On the basis of the null hypothesis, the expected frequency for each of the digits is:
$$\frac{1}{10}(1026 + 1107 + 997 + 966 + 1075 + 933 + 1107 + 972 + 964 + 853) = \frac{10000}{10} = 1000$$

Calculation of test statistic

O        E        (O − E)²/E
1026     1000     0.676
1107     1000     11.449
997      1000     0.009
966      1000     1.156
1075     1000     5.625
933      1000     4.489
1107     1000     11.449
972      1000     0.784
964      1000     1.296
853      1000     21.609
Total             58.542

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 58.542$$
(vi) Test Criteria: Since calculated chi-square is greater than the critical value
hence, reject the null hypothesis.
(vii) Conclusion: We conclude that the digits are not uniformly distributed in the
directory.
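
The table of (O − E)²/E values above is easy to reproduce programmatically. The sketch below is a minimal illustration using only the standard library; `chi_square_gof` is a hypothetical helper name and 16.919 is the tabulated critical value for 9 d.f. at the 5% level.

```python
def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [1026, 1107, 997, 966, 1075, 933, 1107, 972, 964, 853]
# Under H0 every digit is equally likely, so each expected frequency is total / 10.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_gof(observed, expected)
chi2_table = 16.919  # tabulated chi-square(0.05, 9 d.f.)

print(f"chi-square = {chi2:.3f}")
print("reject H0" if chi2 > chi2_table else "accept H0")
```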
Example 3: A sample analysis of examination results of 200 students was made. It was
found that 46 students had failed, 68 secured a third division, 62 secured a second division
and the rest were placed in first division. Are these figures commensurate with the general
examination result which is in the ratio of 4:3:2:1 for various categories respectively?
Solution:
(i) Null Hypothesis: H0: The observed figures do not differ significantly from the
general examination result which is in the ratio 4:3:2:1.
(ii) Alternative Hypothesis: H1: The observed figures differ significantly from the
general examination result which is in the ratio 4:3:2:1.

(iii) Level of Significance: α = 0.05

(iv) Critical value: For (4-1) = 3 degrees of freedom the critical value of $\chi^2$ at the 5% level of significance from the table is 7.815, i.e. $\chi^2_{0.05,\,3} = 7.815$.
(v) Test Statistic: On the basis of the null hypothesis, the expected frequencies for each of the divisions are:
$$E(46) = \frac{4}{10} \times 200 = 80, \qquad E(68) = \frac{3}{10} \times 200 = 60$$
$$E(62) = \frac{2}{10} \times 200 = 40, \qquad E(24) = \frac{1}{10} \times 200 = 20$$
Calculation of test statistic

O      E      (O − E)²/E
46     80     14.45
68     60     1.0666
62     40     12.1
24     20     0.8
Total         28.4166

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 28.4166$$
(vi) Test Criteria: Since calculated chi-square is greater than the critical value
hence, reject the null hypothesis.
(vii) Conclusion: We conclude that the data are not commensurate with the general
examination result.
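When the null hypothesis specifies a ratio rather than equal frequencies, the expected counts come from splitting the total in that ratio. The sketch below reproduces this example under that reading; the helper names `expected_from_ratio` and `chi_square_gof` are illustrative assumptions.

```python
def expected_from_ratio(ratio, total):
    """Split `total` across classes in proportion to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Failed, third division, second division, first division (from the example).
observed = [46, 68, 62, 24]
expected = expected_from_ratio([4, 3, 2, 1], sum(observed))
chi2 = chi_square_gof(observed, expected)

critical_value = 7.815  # 3 d.f. at the 5% level, as quoted in the text
reject_h0 = chi2 > critical_value

print("Expected:", expected)
print("Chi-square:", round(chi2, 4))
print("Reject H0:", reject_h0)
```

The script rounds where the worked table truncates, so the last digit of the statistic can differ slightly from 28.4166 without affecting the decision.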

Examples for Practice

Example 1: A die was thrown 132 times and the following frequencies were observed,
No. obtained 1 2 3 4 5 6 Total
Frequency 15 20 25 15 29 28 132
Test the hypothesis that the die is unbiased.
Example 2: In a cross breeding experiment with plants of certain species, 240 offspring
were classified into 4 classes with respect to the structures of their leaves as follows:
Class: I II III IV Total
Frequency: 21 127 40 52 240
According to the theory of heredity, the proportions of the four classes should be in the ratio
1:9:3:3. Are these data consistent with the theory?
Example 3: The following data represents the monthly sales in Rs. of a certain retail store
in a leap year. Examine if daily sales are uniform through the year.
6100, 5600, 6350, 6050, 5250, 5200, 6300, 6250, 5800, 6000, 6150, 6150.
Example 4: In experiments on pea breeding, Mendel obtained the following frequencies of
seeds.
Round and Yellow    Wrinkled and Yellow    Round and Green    Wrinkled and Green
315                 101                    108                32
Theory predicts that the frequencies should be in proportion 9:3:3:1. Examine the
correspondence between theory and experiment.
Example 5: According to a theory the proportion of a commodity in the four classes A, B,
C, D should be 9:4:2:1. In a survey of 1600 items of this commodity the numbers in the four
classes were 882, 432, 168 and 118. Does the survey support the theory?

Test 8: Test of Independence of Attributes – Contingency Tables:
Let us consider two attributes A and B, A divided into r classes A1, A2…Ar and B divided
into c classes B1, B2…Bc. Such a classification in which attributes are divided into more than
two classes is known as manifold classification. The various cell frequencies can be
expressed in the following table known as r x c contingency table where (Ai) is the number
of persons possessing the attribute Ai (i = 1, 2, 3, ..., r), (Bj) is the number of persons
possessing the attribute Bj (j = 1, 2, 3, ..., c), and (Ai Bj) is the number of persons possessing
both the attributes Ai and Bj. Also, $\sum_{i=1}^{r} (A_i) = \sum_{j=1}^{c} (B_j) = N$, where N is the total frequency.

r x c Contingency Table

A \ B    B1         B2         ...   Bj         ...   Bc         Total
A1       (A1 B1)    (A1 B2)    ...   (A1 Bj)    ...   (A1 Bc)    (A1)
A2       (A2 B1)    (A2 B2)    ...   (A2 Bj)    ...   (A2 Bc)    (A2)
...      ...        ...              ...              ...        ...
Ai       (Ai B1)    (Ai B2)    ...   (Ai Bj)    ...   (Ai Bc)    (Ai)
...      ...        ...              ...              ...        ...
Ar       (Ar B1)    (Ar B2)    ...   (Ar Bj)    ...   (Ar Bc)    (Ar)
Total    (B1)       (B2)       ...   (Bj)       ...   (Bc)       N

The problem is to test if the two attributes A and B under consideration are independent
or not.
Under the null hypothesis that the attributes are independent, i.e. there is no
association between the two attributes A and B, the expected number of persons
possessing both the attributes Ai and Bj is calculated using the formula
$E(A_i B_j) = \frac{(A_i) \times (B_j)}{N}$
Then the test statistic under the null hypothesis is
$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

Where,
Oij = Observed frequency for contingency table category in row i and column j.
Eij = Expected frequency for contingency table category in row i and column j.
Under the null hypothesis, this statistic is distributed as a chi-square variate with
(r - 1)(c - 1) degrees of freedom (d.f.).
Test criteria:
1. The critical value of chi-square with (r - 1)(c - 1) degrees of freedom (d.f.) at the α level of
significance is written as $\chi^2_{\alpha,\,(r-1)(c-1)}$.
2. If the calculated chi-square is less than the critical value, then accept the null hypothesis
and conclude that there is no association between the two attributes A and B, i.e.
attributes A and B are independent.
3. If the calculated chi-square is greater than the critical value, then reject the null
hypothesis and conclude that there is an association between the two attributes A and B, i.e.
attributes A and B are dependent.
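The procedure above (expected frequencies from the marginal totals, the chi-square sum, the degrees of freedom and the comparison with the critical value) can be sketched in a few lines of Python. The function name `independence_test` and the small 2 x 2 table are hypothetical, chosen only for illustration; the critical value still has to be read from a chi-square table and passed in.

```python
def independence_test(table, critical_value):
    """Chi-square test of independence for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # E_ij = (row total i) * (column total j) / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return {
        "expected": expected,
        "chi2": chi2,
        "dof": dof,
        "reject_h0": chi2 > critical_value,
    }


# Hypothetical 2 x 2 table, for illustration only (not data from these notes).
result = independence_test([[10, 20], [20, 10]], critical_value=3.841)
print(result["expected"], round(result["chi2"], 4), result["dof"], result["reject_h0"])
```

Passing the critical value as an argument keeps the sketch free of external libraries; the same function applies unchanged to any r x c table.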
Solved Examples

Example 1: Following results were collected by a doctor from his 500 patients.
                 Preference
Sex        Liquid    Tablet    Injection
Males      60        90        100
Females    110       75        65

Using chi-square test, can we say that the preference of the medicine depends on the sex of
the person?
Solution:
(i) Null Hypothesis: Preference of the medicine is independent of the sex of the
person (i.e. there is no association between preference of the medicine and the sex
of the person).
(ii) Alternative Hypothesis: Preference of the medicine is dependent on the sex of the
person.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (2 - 1)(3 - 1) = 2 degrees of freedom, the critical value of $\chi^2$ at the
5% level of significance from the table is 5.991, i.e. $\chi^2_{0.05,\,(2-1)(3-1)} = 5.991$.

(v) Test Statistic: Under the null hypothesis, the expected frequencies are computed
from the row and column totals of the observed table:
                 Preference
Sex        Liquid    Tablet    Injection    Total
Males      60        90        100          250
Females    110       75        65           250
Total      170       165       165          500

$E(60) = \frac{170 \times 250}{500} = 85$,  $E(90) = \frac{165 \times 250}{500} = 82.5$,  $E(100) = \frac{165 \times 250}{500} = 82.5$
$E(110) = \frac{170 \times 250}{500} = 85$,  $E(75) = \frac{165 \times 250}{500} = 82.5$,  $E(65) = \frac{165 \times 250}{500} = 82.5$
Calculation of test statistic

O      E       $(O - E)^2 / E$
60     85      7.3529
90     82.5    0.6818
100    82.5    3.7121
110    85      7.3529
75     82.5    0.6818
65     82.5    3.7121
Total          23.4936

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 23.4936$

(vi) Test Criteria: Since the calculated chi-square is greater than the critical value,
we reject the null hypothesis.
(vii) Conclusion: We conclude that the preference of the medicine is dependent on
the sex of the person.

Example 2: Two samples of polls of votes for two candidates A and B for a public office are
taken, one from the residents of rural areas and one from urban areas. The results are given
in the following table. Examine whether the nature of the area is related to voting preference
in this election.

               Votes
Area       A       B
Rural      620     380
Urban      550     450
Solution:
(i) Null Hypothesis: The nature of the area is independent of the voting preference in
this election.
(ii) Alternative Hypothesis: The nature of the area is dependent on the voting
preference in this election.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (2 - 1)(2 - 1) = 1 degree of freedom, the critical value of $\chi^2$ at the
5% level of significance from the table is 3.841, i.e. $\chi^2_{0.05,\,(2-1)(2-1)} = 3.841$.

(v) Test Statistic: Under the null hypothesis, the expected frequencies are computed
from the row and column totals of the observed table:
               Votes
Area       A       B       Total
Rural      620     380     1000
Urban      550     450     1000
Total      1170    830     2000

$E(620) = \frac{1170 \times 1000}{2000} = 585$,  $E(380) = \frac{830 \times 1000}{2000} = 415$
$E(550) = \frac{1170 \times 1000}{2000} = 585$,  $E(450) = \frac{830 \times 1000}{2000} = 415$

Calculation of test statistic

O      E      $(O - E)^2 / E$
620    585    2.0940
380    415    2.9518
550    585    2.0940
450    415    2.9518
Total         10.0916

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 10.0916$

(vi) Test Criteria: Since the calculated chi-square is greater than the critical value,
we reject the null hypothesis.
(vii) Conclusion: We conclude that the nature of the area is dependent on the voting
preference in this election.
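For a 2 x 2 table with cells a, b (first row) and c, d (second row), the same statistic can be obtained without computing expected frequencies, using the standard identity $\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$. This shortcut is not used in the solution above; the sketch below is offered only as a cross-check, without any continuity correction, and `chi_square_2x2` is an illustrative name.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rural: 620 for A, 380 for B; Urban: 550 for A, 450 for B.
chi2 = chi_square_2x2(620, 380, 550, 450)
print(round(chi2, 4))
```

The result agrees with the value 10.0916 obtained from the table of (O - E)^2 / E terms, up to rounding in the last digit.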

Example 3: The following table gives number of good and bad parts:

                 Good parts    Bad parts    Total
Day shift        960           40           1000
Evening shift    940           50           990
Night shift      950           45           995
Total            2850          135          2985

Test whether production of bad parts is independent of shift?

Solution:
(i) Null Hypothesis: The nature of the product (good or bad) is independent of the shift.
(ii) Alternative Hypothesis: The nature of the product (good or bad) is dependent on the shift.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (3 - 1)(2 - 1) = 2 degrees of freedom, the critical value of $\chi^2$ at the
5% level of significance from the table is 5.991, i.e. $\chi^2_{0.05,\,(3-1)(2-1)} = 5.991$.

(v) Test Statistic: Under the null hypothesis, the expected frequencies are as
follows:
$E(960) = \frac{2850 \times 1000}{2985} = 954.7738$,  $E(40) = \frac{135 \times 1000}{2985} = 45.2261$
$E(940) = \frac{2850 \times 990}{2985} = 945.2261$,  $E(50) = \frac{135 \times 990}{2985} = 44.7738$
$E(950) = \frac{2850 \times 995}{2985} = 950$,  $E(45) = \frac{135 \times 995}{2985} = 45$
Calculation of test statistic
O      E           $(O - E)^2 / E$
960    954.7738    0.0286
40     45.2261     0.6039
940    945.2261    0.0288
50     44.7738     0.6100
950    950         0
45     45          0
Total              1.2713

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 1.2713$

(vi) Test Criteria: Since the calculated chi-square is less than the critical value,
we accept the null hypothesis.
(vii) Conclusion: We conclude that the nature of the product is independent of the shift,
i.e. production of bad parts is independent of shift.
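As a side check on the tabulated critical value: for 2 degrees of freedom only, the upper-tail probability of the chi-square distribution has the closed form $P(\chi^2 > x) = e^{-x/2}$. The notes work with critical values from the table rather than tail probabilities, so the following is just a sketch showing that 5.991 corresponds to a tail area of about 0.05, and that the calculated statistic leaves a much larger tail area. The function name is illustrative.

```python
import math


def chi2_tail_2df(x):
    """Upper-tail probability P(X > x) for a chi-square variate with 2 d.f."""
    return math.exp(-x / 2)


tail_at_critical = chi2_tail_2df(5.991)   # tabulated critical value
tail_at_statistic = chi2_tail_2df(1.2713)  # statistic from this example

print(round(tail_at_critical, 3))
print(round(tail_at_statistic, 3))
```

The closed form holds only for 2 d.f.; other degrees of freedom still need the table.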

Example 4: Out of 8000 graduates in a town 800 are females; out of 1600 graduate
employees 120 are females. Use $\chi^2$ to determine if any distinction is made in appointment
on the basis of gender.
Solution: Observed frequencies are,
                    Status
Sex       Employed    Not Employed    Total
Male      1480        5720            7200
Female    120         680             800
Total     1600        6400            8000

(i) Null Hypothesis: No distinction is made in appointment on the basis of gender.
(ii) Alternative Hypothesis: The distinction is made in appointment on the basis of
gender.
(iii) Level of Significance: α = 0.05

(iv) Critical value: For (2 - 1)(2 - 1) = 1 degree of freedom, the critical value of $\chi^2$ at the
5% level of significance from the table is 3.841, i.e. $\chi^2_{0.05,\,(2-1)(2-1)} = 3.841$.

(v) Test Statistic: Under the null hypothesis, the expected frequencies are computed
from the row and column totals of the observed table:
                    Status
Sex       Employed    Not Employed    Total
Male      1480        5720            7200
Female    120         680             800
Total     1600        6400            8000

$E(1480) = \frac{1600 \times 7200}{8000} = 1440$,  $E(5720) = \frac{6400 \times 7200}{8000} = 5760$
$E(120) = \frac{1600 \times 800}{8000} = 160$,  $E(680) = \frac{6400 \times 800}{8000} = 640$
Calculation of test statistic
O       E       $(O - E)^2 / E$
1480    1440    1.1111
5720    5760    0.2777
120     160     10
680     640     2.5
Total           13.8888

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 13.8888$

(vi) Test Criteria: Since the calculated chi-square is greater than the critical value,
we reject the null hypothesis.
(vii) Conclusion: We conclude that a distinction is made in appointment on the
basis of gender.
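In this example the problem statement gives only totals and female counts, so the 2 x 2 table has to be built by subtraction before the test can be applied. A minimal sketch of that step, followed by the same expected-frequency and chi-square calculation, is given below; all variable names are illustrative.

```python
# Figures stated in the problem.
total_graduates, female_graduates = 8000, 800
total_employed, female_employed = 1600, 120

# Remaining cells follow by subtraction.
male_graduates = total_graduates - female_graduates
male_employed = total_employed - female_employed

# Rows: Male, Female; columns: Employed, Not Employed.
observed = [
    [male_employed, male_graduates - male_employed],
    [female_employed, female_graduates - female_employed],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(observed)
print(expected)
print(round(chi2, 4))
```

Because the script rounds rather than truncates, the printed statistic can differ from 13.8888 in the last digit.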

Examples for Practice

Example 1: Describe Chi-square test for testing independence of attributes.


Example 2: Test the hypothesis at 5% level of significance that the presence or absence of
hypertension is independent of smoking habits from the following experimental data on 180
persons.
Non Smokers Moderate Smokers Heavy Smokers
Hypertension 21 36 30
No Hypertension 48 26 19

Example 3: Test the independence of attributes using following data and write conclusion.

                 Drugs
Sex        Liquid    Pills    Injection
Males      62        84       70
Females    70        75       50
Example 4: In a survey of 200 boys of which 75 were intelligent, 40 had educated fathers,
while 85 of the unintelligent boys had uneducated fathers. Do these figures support the
hypothesis that educated fathers have intelligent boys?
Example 5: Medical Council Collected following data regarding preference of medicine.
Male Female Total
Allopathic 55 65 120
Homeopathic 65 75 140
Ayurvedic 102 58 160
Total 222 198 420
Use Chi-square test and decide whether preference of medicine depends on sex of person.

Example 6: Following is the distribution of 500 people according to drug preference.


Tablet Liquid Injection
Below 10 90 50 25
10-20 45 70 27
20 and above 23 60 110
Using Chi-square test check the independence of above attributes and write conclusion.

Example 7: Using following data examine whether eye colour of son and eye colour of
father are independent of each other
                       Eye colour of father
Eye colour of son      Black    Blue
Black                  620      380
Blue                   550      450

Example 8: A drug manufacturing company collected following data regarding preference
of drug. Use chi-square test and decide whether preference of drug depends on sex of the
person.
Male Female
Tablet 55 48
Injection 62 55

Example 9: For the following data, apply the chi-square test to test the independence of age
and type of food.
                       Food
Age             Chinese    Panjabi    South Indian
Below 15        50         54         60
15 - 35         70         65         45
35 and above    35         65         55

Example 10: A certain drug was administered to 500 people out of a total of 800 included in
the sample to test its efficiency against typhoid. The results are given below:

Typhoid No Typhoid
Drug 200 300
No Drug 280 20

On the basis of these data, can it be concluded that the drug is effective in preventing
typhoid.

Example 11: Certain medical council collected following data regarding preference of
medicine.
Male Female
Allopathic 55 65
Homeopathic 65 75
Ayurvedic 102 58
Use chi-square test and decide whether preference of medicine depends on sex of the
person.
Example 12: Test the independence of attributes from the following data on height of father
and height of son, and comment on it.

                    Height of father
Height of son       Tall     Short
Tall                145      75
Short               60       120

Example 13: In an investigation into the health and nutrition of two groups of children of
different social status, the following results were registered:
                        Health
Social status    Below Normal    Normal    Above Normal
Poor             130             102       24
Rich             20              108       96

Discuss the relation between the health and their social status.

**************************************************************************

