
AgStat 2.22019 Mannula PDF

The document discusses statistical methods and provides the syllabus for Ag. Stat. 2.2 (Statistical Methods) course. It includes topics such as graphical representation of data, measures of central tendency and dispersion, probability, normal distribution, correlation, regression, tests of significance, analysis of variance, experimental designs, and sampling methods. The practical manual provides guidance and examples for students to apply these statistical concepts and techniques to solve problems in agriculture and related fields.

Uploaded by

Drvijay Kalpande

As per the Fifth Dean Committee Recommendations

NAVSARI AGRICULTURAL UNIVERSITY


DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE
BHARUCH, GUJARAT

PRACTICAL MANUAL

AG. STAT. 2. 2
STATISTICAL METHODS
SECOND SEMESTER B.Sc.(Agri.) CLASS
Name :
University Seat No. :
Registration No. :

Prepared By :
Dr. Alok Shrivastava, Dr. Y. A. Garde, Dr. H. R. Pandya

DEPARTMENT OF AGRICULTURAL STATISTICS


COLLEGE OF AGRICULTURE
BHARUCH, GUJARAT

DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE
Bharuch

CERTIFICATE

This is to certify that Mr./Miss ………………………………………. of Second
Semester, University Seat No. ………. and
Registration No. ..………............., has
satisfactorily completed his/her practical
requirement in the subject Ag. Stat. 2.2 (Statistical
Methods) for the year 201 .

Date : - - 201

Associate Professor & Head        Asstt. Professor
Dept. of Agril. Statistics        Dept. of Agril. Statistics
College of Agriculture            College of Agriculture
Bharuch, Gujarat                  Bharuch, Gujarat
Acknowledgment:

The author is thankful to Dr. H. R. Pandya, Dr. M. L. Lakhera, Dr. H. N.
Chatrola, Dr. B. K. Bhat, Dr. J. P. Patel & Dr. Yogesh A. Garde for
technical help in compiling this manual.

Publisher:
Principal and Dean
College of Agriculture
Navsari Agricultural University
Campus Bharuch
Bharuch - 392012

Edition: FIRST

Year : 2018

Funded by: ICAR (Strengthening and development grant 2018-19)

University Publication No.: …./2018-19


College of Agriculture
Navsari Agricultural University
Bharuch
Ph. No.-02642-246152
Dr.K.G.Patel, Dean Email:[email protected] Mo.9601283390
No.NAU/COAB/ 29/12/2018 Date : 29/12/2018
No.ACB/ACD./T-8/ 5316 Dt: / /2018

FOREWORD
Uncertainty and variation are two major components which govern the laws of
nature. Because of the analytical power of statistical science in such situations, it has
been widely used in diverse fields to analyze behavior and to increase the precision of
findings. There is hardly any branch of science where statistical methods are not in use.
Agriculture and related fields have themselves led to the development and
discovery of many statistical theories that increase the precision of inference.
The Manual on Statistical Methods is intended to be a source of reference.
Undergraduate students, especially of this college, will benefit from it in the fields of
Agriculture, Livestock, Horticulture, Forestry and other allied disciplines, as will researchers
and extension workers who need some basic concepts of Statistics before going into the matter
more deeply. Though it is a challenging task, I am happy that Dr. Alok Shrivastava has taken
steps in this direction.
As a matter of fact, the explanations, examples and inferences drawn in each
chapter of this Practical Manual/Notes will help the user greatly. I hope this
collection will draw the attention of students, researchers and other users from diverse
agricultural and allied fields as a guide to solving a variety of real-life problems.

(K.G.Patel )
(As per 5th Dean Committee Recommendation)

: SYLLABUS:

Ag. Stat. 2.2 Statistical Methods Credit hours (2+1=3)

Theory:

Introduction to Statistics and its Applications in Agriculture.
Graphical Representation of Data, Measures of Central Tendency &
Dispersion. Definition of Probability, Addition and Multiplication Theorem
(without proof). Simple Problems Based on Probability. Normal
Distribution. Definition of Correlation, Scatter Diagram. Karl Pearson’s
Coefficient of Correlation. Linear Regression Equations. Introduction to
Test of Significance, One sample & two sample t-test for Means, Large
sample test (Z test), Chi-Square Test of Independence of Attributes in
2 × 2 Contingency Table. Introduction to Analysis of Variance, Principles of
experimental designs, Analysis of One Way Classification (CRD and
RBD). Introduction to Sampling Methods, Sampling versus Complete
Enumeration, Simple Random Sampling with and without replacement,
Use of Random Number Tables for selection of Simple Random
Sample.
Practical:

Graphical Representation of Data. Measures of Central Tendency
(Ungrouped data) with Calculation of Quartiles, Deciles & Percentiles.
Measures of Central Tendency (Grouped data) with Calculation of
Quartiles, Deciles & Percentiles. Measures of Dispersion (Ungrouped
Data). Measures of Dispersion (Grouped Data). Moments, Measures of
Skewness & Kurtosis (Ungrouped Data). Moments, Measures of
Skewness & Kurtosis (Grouped Data). Correlation & Regression
Analysis. Application of One Sample t-test. Application of Two Sample
Fisher’s t-test. Chi-Square test of Goodness of Fit. Chi-Square test of
Independence of Attributes for 2 × 2 contingency table. Analysis of
Variance One Way Classification. Selection of random sample using
Simple Random Sampling.
INDEX
Sr. No.  Date  Topics                                                  Page No.  Sign
1              Problems on Frequency Distribution & Frequency Table
2              Problems on measures of Central Tendency
3              Problems on measures of Dispersion
4              Problems on Probability
5              Problems on Normal Distribution
6              Problems on Large Sample Test (Z-test)
7              Problems on Small Sample Test (t-test)
8              Problem on F-test
9              Problem on χ² test
10             Problems on Correlation and Regression
11             Problems on Rank Correlation
12             Completely Randomized Design (CRD)
13             Randomized Block Design (RBD)
14             Latin Square Design (LSD)
15             Simple Random Sampling with and without replacement
16             APPENDICES: Statistical Table
**************
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[A single death is a tragedy. A million deaths is a statistic. ~ Joseph Stalin]
------------------------------------------------------------------------------------------------------------

Introduction
The term “statistics” is used in two senses: first in the plural sense, meaning a
collection of numerical facts or estimates, i.e. the figures themselves. It is in this sense
that the public usually think of statistics, e.g., figures relating to population, profits
of different units in an industry etc.
Secondly, as a singular noun, the term ‘statistics’ denotes the various methods
adopted for the collection, analysis and interpretation of the facts numerically
represented. In singular sense, the term ‘statistics’ is better described as statistical
methods. In our study of the subject, we shall be more concerned with the second
meaning of the word ‘statistics’.
Definition
Statistics has been defined differently by different authors and each author has
assigned new limits to the field which should be included in its scope. We can do no
better than give selected definitions of statistics by some authors and then come to
the conclusion about the scope of the subject.
A.L. Bowley defines, “Statistics may be called the science of counting”. At
another place he defines, “Statistics may be called the science of averages”. Both
these definitions are narrow and throw light only on one aspect of Statistics.
According to King, “The science of statistics is the method of judging
collective, natural or social, phenomena from the results obtained from the analysis
or enumeration or collection of estimates.”
Frequency Distribution & Frequency Table
Types of data: There are two types of data (a) Primary data (b) Secondary data
Collection of primary data: Primary data are collected through following methods
(i) Direct Personal Investigation
(ii) Indirect oral investigation
(iii)Information through correspondents
(iv) Information through schedules to be filled in by informants
(v) Information through Schedules in Charge of Enumerators
Classification: The process of arranging data in groups or classes according to resemblances
and similarities is called classification.
Objectives of classification: (i) To make data easy and comprehensive (ii) To clear
similarities and dissimilarities (iii) To support comparison (iv) To make scientifically
reasonable arrangement (v) to make basis for tabulation.
Types of Classification: (a) Qualitative (b) Quantitative
Tally mark: A bar (|) put against the class/variable for each occurrence is called a tally mark.
Frequency: Total number of tally marks put against a particular class/ variable is called its
frequency.
Frequency distribution: Arrangement of classes of values and their frequencies in a
systematic manner is called frequency distribution.
Frequency distribution table: A table showing the distribution of the frequencies in the
different classes is called a frequency distribution table.
Types of class:
(a) Inclusive classes: In which both the upper and lower limits are included, and
(b) Exclusive classes: In which the upper limit of the class is not included in the class.

No. of classes: Sturges' formula for determining the number of classes is
$$K = 1 + 3.322 \log_{10} N$$
where $K$ = number of classes and $N$ = number of observations.
Class interval: The difference of upper and lower limits of a class is called the class interval.
Magnitude of class interval: Sturges' formula for determining the magnitude of class interval is
$$CI = \frac{\text{Range}}{1 + 3.322 \log_{10} N}$$
where $CI$ = magnitude of class interval and Range = largest value - smallest value.
Frequency polygon: Frequency polygon is drawn by plotting the frequency of each class as a
dot at the class midpoint and then connecting each adjacent pair of dots by a straight line. All
points are connected. It is commonly used for discrete distribution.
Frequency curve: Frequency curve is drawn by plotting the frequency of each class as a dot
at the class midpoint and then connecting each adjacent pair of dots by a free hand curve.
Some points may be left out. It is commonly used for continuous distribution.
EXERCISE NO. 1

Problems on Frequency Distribution & Frequency Table

------------------------------------------------------------------------------------------------
Classification and tabulation of data: The process of reduction of data to a manageable
size is called classification OR The process by which the data are arranged in groups or
classes according to similarities is known as classification and the process by which the
classified data are presented in an orderly manner by being placed in proper rows and
columns of a table in order to bring out their essential features or characteristics is known as
tabulation.
Objectives of classification:
1. To reduce data in groups/classes according to similarity.
2. To facilitate comparison through statistical analysis.
3. To point out most significant features of the data at a glance.
4. To give importance to a particular item by dropping out the unnecessary elements.
5. To enable a statistical treatment of the material collected.
Types of classification:
1. Geographical: When the classification is made on area basis e.g. district, taluka, city.
2. Chronological: When the classification is made on the basis of time e.g. production of
wheat in past 10 years.
3. Qualitative: When classification is made on the basis of some attributes. This
classification is further divided into four types.
(A) Simple classification: Only one attribute is considered e.g. blindness or sex.
(B) Two way classification: Two attributes are considered e.g. blindness & deafness,
colour & shape of flowers.
(C) Three way classification: Three attributes are considered e.g. sex, education level
and residing location.
(D) Manifold classification: More than three attributes are considered.
e.g.
Classification
One way:   Sex
Two way:   Sex & Marital status
Three way: Sex, Marital status & Education level

Population
    Male
        Married:   High, Medium, Low
        Unmarried: High, Medium, Low
    Female
        Married:   High, Medium, Low
        Unmarried: High, Medium, Low
4. Quantitative: When the classification is made in the form of magnitude e.g. cows are
classified according to milk yield. This classification is further divided into two types.
(A) Discrete classification: Specific value in the range is considered e.g.
no. of petal, no. of insects etc.
(B) Continuous classification: Any value in the range of variation is
considered e.g. length, width etc.
FREQUENCY DISTRIBUTION
Objectives:
1. To condense the mass of data in such a manner that similarities and dissimilarities can be
easily understood.
2. To enable statistical treatment of the data collected.
Frequency: The number of individuals or items occurring in each class is termed the frequency.
Frequency distribution: The manner in which the frequencies are distributed over the
different class is called frequency distribution of the character under study and the table
indicating frequency distribution is called frequency table.
Class limit: It is the lowest and highest values of the distribution that can be included in the
class e.g 10-20, 20-30 etc. Two boundaries of a class are known as the lower limit and upper
limit of a class.
Class interval: The width of a class that is the difference of upper and lower limit of the
class is known as class interval.
Class mid point: It is the value lying half way between the lower limit (LL) and upper limit
(UL) of a class interval i.e. (LL + UL)/2.
Points while deciding class interval/classes:
1. It should be of uniform width which facilitates the statistical computation.
2. Range of the class should cover the data and should be continuous.
3. It should be convenient to make the mid-point of a class.
4. It should not be overlapping.
Types of frequency distribution:
(1) Discrete frequency distribution
(2) Continuous frequency distribution
Methods of classifying the data according to class interval:
Exclusive method: When the class intervals are so fixed that the upper limit of one class is
the lower limit of the next class, the method is known as the exclusive method e.g. 0-10, 10-20.
Usually this method is preferred for continuous type of data. Observations up to 9.99
would be included in the 0-10 class, while 10 or greater than 10 will be included in the 10-20 class.
Inclusive method: In this method of classification, the upper limit of one class is included in
that class itself e.g. 100-199, 200-299. The value of 100 and 199 will be included in the class
of 100-199. This method is preferred for discrete type of data.
Procedure to form frequency distribution:
Step 1: Find range of the data. Range = Highest value – Lowest value.
Step 2: Fix the number of classes. The number of classes should preferably be between 5 and 15,
and should not be less than 5 or more than 30.
Approximate no. of classes = K = 1 + 3.322 log N (Sturges' rule), where N = no. of
observations under study.
Step 3: Fix the class interval = CI = Range/No. of classes or (L-S)/K, where L = largest
value and S = smallest value.
Step 4: Arrange different classes in ascending order of magnitude
Step 5: Pick up the values of observation and make tally mark against respective classes.
Step 6: Find total tally mark of each class which will give the no. of frequencies in the
respective classes.
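
The six steps above can be sketched in Python. This is only an illustration and not part of the original manual: the function names `sturges_classes` and `frequency_table` and the sample observations are hypothetical, and exclusive classes (lower limit included, upper limit excluded) are assumed.

```python
import math

def sturges_classes(n):
    """Approximate number of classes by Sturges' rule: K = 1 + 3.322 log10(N)."""
    return 1 + 3.322 * math.log10(n)

def frequency_table(values, start, width, n_classes):
    """Count observations falling in the exclusive classes
    [start, start + width), [start + width, start + 2*width), ..."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)      # Step 5: locate the class of v
        if not 0 <= idx < n_classes:
            raise ValueError(f"{v} lies outside the chosen classes")
        counts[idx] += 1                     # one tally mark
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical observations, used only to illustrate the procedure
obs = [12, 7, 3, 18, 25, 9, 14, 21, 5, 16, 11, 28]

data_range = max(obs) - min(obs)             # Step 1: range = 28 - 3 = 25
k = sturges_classes(len(obs))                # Step 2: about 4.6 classes
ci = data_range / k                          # Step 3: about 5.5, rounded to 5 here
table = frequency_table(obs, start=0, width=5, n_classes=6)   # Steps 4 to 6

for lower, upper, freq in table:
    print(f"{lower}-{upper}\t{'|' * freq}\t{freq}")
```

The class width is rounded to a convenient value by hand, as the manual does; Sturges' rule only gives a starting point.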
1.1 The following data represent the milk yield in litres/day of Surti buffaloes.
5.5, 6.7, 8.9, 9.6, 12.5, 12.0, 13.7, 14.5, 10.8, 9.6 (ungrouped data)
1.2 Find out the frequency distribution table from the following data.

4,7,6,5,4,2,10,1,3,6,4,3,5,7,8,3,4,5,2,6,4,7,5,4,1,6,2,5,4,10,4,5,2,7,4,5,4,8,5,4,7
,9,5,6,5,9,6,5,7,0,8,1,5,6,3,9,3,1,5,3,7,8,5,3,0,6,8,4,3,2 (Discrete frequency
distribution)
Marks (X) Tally marks Frequency (fi)
0 ││ 2
1 ││││ 4
2 ││││ 5
3 ││││ │││ 8
4 ││││ ││││ ││ 12
5 ││││ ││││ ││││ 14
6 ││││ │││ 8
7 ││││ ││ 7
8 ││││ 5
9 │││ 3
10 ││ 2
Total 70
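
The discrete frequency table of problem 1.2 can be checked with a short Python sketch. It is an illustration added here, not the manual's method; the variable names are arbitrary, and `collections.Counter` simply replaces the manual tally.

```python
from collections import Counter

# The 70 marks listed in problem 1.2
marks = [4, 7, 6, 5, 4, 2, 10, 1, 3, 6, 4, 3, 5, 7, 8, 3, 4, 5, 2, 6,
         4, 7, 5, 4, 1, 6, 2, 5, 4, 10, 4, 5, 2, 7, 4, 5, 4, 8, 5, 4, 7,
         9, 5, 6, 5, 9, 6, 5, 7, 0, 8, 1, 5, 6, 3, 9, 3, 1, 5, 3,
         7, 8, 5, 3, 0, 6, 8, 4, 3, 2]

freq = Counter(marks)                       # frequency of each distinct mark
frequencies = [freq[x] for x in range(min(marks), max(marks) + 1)]
total = sum(frequencies)

for x, f in zip(range(min(marks), max(marks) + 1), frequencies):
    print(f"{x}\t{'|' * f}\t{f}")
print("Total\t\t", total)
```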
1.3 The following data represent the fat yield in kg/lactation of buffaloes of Mehsana
breed. Prepare the frequency distribution table using the class intervals 160-170, 170-180,
etc. (Continuous frequency distribution)
295 199 195 192 209 197 200 189 177 195
169 205 202 204 165 206 207 189 201 203
187 208 191 203 226 212 172 182 207 217
213 221 214 222 244 221 180 229 215 219
216 223 231 253 225 237 225 230 227 228
218 234 240 260 267 242 243 232 236 279
251 252 261 268 246 284 283 257 233 162
262 178 247 273 239 259 231 173 245 266
270 175 212 211 214 196 215 243 216 244
274 194 225 223 224 226 228 255 227 198
248 218 233 234 235 238 242 265 258 211
252 235 249 254 255 241 256 271 264 272
224 286 253 263 264 254 269 285 282 281
210 298 188 175 276 277 278 289 287 288
182 170 299 208 181 193 271 205 221 291

Class interval Tally marks Frequency


160-170 │││ 3
170-180 ││││ ││ 7
180-190 ││││ │││ 8
190-200 ││││ ││││ 10
200-210 ││││ ││││ ││││ 14
210-220 ││││ ││││ ││││ │ 16
220-230 ││││ ││││ ││││ │││ 18
230-240 ││││ ││││ ││││ 15
240-250 ││││ ││││ │││ 13
250-260 ││││ ││││ ││ 12
260-270 ││││ ││││ │ 11
270-280 ││││ ││││ 10
280-290 ││││ ││││ 9
290-300 ││││ 4
Total 150
Example 1: Present the following data in the form of a frequency table with
suitable class intervals and then draw the frequency curve.
124 26 165 113 42 149 175 133 69 30 104 161 198
110 195 121 6 187 157 151 93 138 184 155 141 104
113 143 66 40 108 103 140 167 87 164 150 144 34
88 124 40 128 162 71 164 122 114 149 94 146 137
18 33 8 79 87 116 148 137 86 81 192 66 68
144 111 124 68 155 40 136 77 134 134 86 164 158
Solution: Using Sturges' formula, No. of classes $K = 1 + 3.322 \log_{10} N$.
Here $N = 78$ and $K = 1 + 3.322 \log_{10} 78 \approx 7.29$, i.e. about 8 classes.

Magnitude of class interval $= \dfrac{\text{Range}}{K} = \dfrac{198 - 6}{7.29} \approx 26$, which is conveniently rounded to 25.

Since the minimum observation is 6 and the maximum is 198, it is convenient to take the classes
as 1-25, 26-50 and so on if we consider inclusive classes; similarly, if we consider the exclusive
method, it will be convenient to take classes as 0-25, 25-50 and so on. Let us take
inclusive classes; the frequency table will then be
Class      Tally marks            Frequency
1-25       |||                    3
26-50      |||| |||               8
51-75      |||| |                 6
76-100     |||| ||||              10
101-125    |||| |||| ||||         15
126-150    |||| |||| |||| |||     18
151-175    |||| |||| |||          13
176-200    ||||                   5
Total                             78

Steps for making frequency table
Step 1: Find the number of classes using Sturges' formula.
Step 2: Mark the tally marks one by one according to the observations from the raw table,
e.g. mark | in class 101-125 for 124, then mark | in class 26-50 for 26, and so on.
Step 3: Now count the tally marks. These are the frequencies of the respective classes.
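
The frequency table of Example 1 can be reproduced with a few lines of Python. This is a sketch for checking the tally, not the manual's procedure; the variable names are hypothetical, and the inclusive classes 1-25, 26-50, ..., 176-200 chosen in the solution are assumed.

```python
import math

# The 78 observations of Example 1
data = [124, 26, 165, 113, 42, 149, 175, 133, 69, 30, 104, 161, 198,
        110, 195, 121, 6, 187, 157, 151, 93, 138, 184, 155, 141, 104,
        113, 143, 66, 40, 108, 103, 140, 167, 87, 164, 150, 144, 34,
        88, 124, 40, 128, 162, 71, 164, 122, 114, 149, 94, 146, 137,
        18, 33, 8, 79, 87, 116, 148, 137, 86, 81, 192, 66, 68,
        144, 111, 124, 68, 155, 40, 136, 77, 134, 134, 86, 164, 158]

n = len(data)
k = 1 + 3.322 * math.log10(n)              # Sturges' rule, about 7.29
width_hint = (max(data) - min(data)) / k   # about 26, rounded to 25 in the solution

# Inclusive classes: both limits belong to the class
classes = [(lo, lo + 24) for lo in range(1, 200, 25)]
freqs = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

for (lo, hi), f in zip(classes, freqs):
    print(f"{lo}-{hi}\t{f}")
print("Total\t", sum(freqs))
```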

Steps for making frequency curve

Step 1: Draw a histogram or bar diagram according to the frequency table.
Step 2: Draw a free-hand curve joining the mid points of the tops of the bars of the
histogram/bar diagram. This free-hand curve is the frequency curve.
Difference between frequency polygon and frequency curve

No. | Frequency polygon | Frequency curve
1 | Drawn by plotting the frequency of each class as a dot at the class midpoint and then connecting each adjacent pair of dots by a straight line. | Drawn by plotting the frequency of each class as a dot at the class midpoint and then connecting each adjacent pair of dots by a free-hand smooth curve.
2 | All points are connected. | Some points may be left out.
3 | End points should be connected with the X-axis. | End points are not connected with the X-axis.
4 | It is commonly used for discrete distribution. | It is commonly used for continuous distribution.
5 | Not a true representative in comparison with frequency curve. | True representative in comparison with frequency polygon.

Difference between Classification and Tabulation

No. | Classification | Tabulation
1 | Process of classification is prior to tabulation, hence it is the basis of tabulation. | Tabulation is done after the classification.
2 | The data are classified according to the characters. | Tabulation is the presentation of data in rows and columns. Hence it is a mechanical process.
3 | It is the process of statistical computation of data. | It is the process of presentation of data.
4 | The data are classified in class-sub class. | The data are presented under heading-subheadings.
5 | Percentage, ratio, coefficients are not used in classification. | Percentage, ratio, coefficients are used in tabulation.
Exercise 1: The numbers of dairy entrepreneurs recorded in a survey of a district were as
follows. Show the data in the form of a frequency table with suitable class intervals and then
draw the frequency curve.
8 10 6 13 12 16 15 13 9 20 10 22 8
6 16 11 22 18 10 15 9 18 8 15 18 10
13 13 16 24 28 20 10 26 8 14 15 14 24
8 14 20 28 20 17 24 12 11 9 9 26 22
33 27 36 29 35 36 34 18 34 8 28 38 20
Solution:

Some Preliminary
[ I could prove God statistically ~ George Gallup]
------------------------------------------------------------------------------------------------------------
Objective: To Compute Arithmetic Mean for Grouped and Un-Grouped data.
Arithmetic Mean: The arithmetic mean of a set of observations is the quantity obtained by dividing the
sum of the values of the observations by their number.
The arithmetic mean $\bar{X}$ of $n$ observations $X_1, X_2, \ldots, X_n$ is given by
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
and, in case of a frequency distribution, by
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
where $f_i$ is the frequency of the variable $X_i$.
In case of grouped data $X_i$ is taken as the mid value of the corresponding class.
In a frequency table, if the class intervals are of equal width, say $I$, then it is convenient to use this class
width as a divisor to make the calculations simpler.
(i) Select an assumed mean $A$ and take the deviations of the given values from $A$. Now divide the
deviations by an arbitrary number (preferably the class width $I$):
$$d_i = \frac{X_i - A}{I}$$
(ii) Multiply $d_i$ by the respective frequencies $f_i$ to get $f_i d_i$.
(iii) Compute the mean of the variable $d$:
$$\bar{d} = \frac{\sum f_i d_i}{\sum f_i}$$
(iv) Multiply $\bar{d}$ by the divisor $I$ and add it to the assumed mean $A$. The resulting value is the required
mean:
$$\bar{X} = A + I\,\bar{d}$$
This result shows that if each score X is multiplied by a number $c$, the net effect of this operation is to
multiply the mean by $c$. This process is known as the step-deviation method.
Combined mean: If $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ are the means of $k$ distributions whose corresponding sizes are
$n_1, n_2, \ldots, n_k$, then the mean $\bar{X}$ of the combined distribution obtained on combining the distributions is given
by
$$\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + \cdots + n_k\bar{X}_k}{n_1 + n_2 + \cdots + n_k}$$
Properties of mean: (1) The sum of deviations from the mean is always zero. (2) The sum of squares of deviations
from the mean is least. (3) Mean is not independent of change of origin and scale.
Central tendency: Generally it is found that in any distribution, values of
the variable tend to cluster around a central value or centrally located
observation of the distribution. This characteristic is known as central tendency.
This centrally located value which represents the group of values is termed as the
measure of central tendency e.g. an average is called measure of central tendency.

Objectives

1) To get one single value that describes the characteristics of the entire
series/group.
2) To compare two or more distributions.

Different measures of Central tendency

1) Arithmetic mean (A.M.): algebraic average
2) Median: positional average
3) Mode: positional average
4) Geometric mean (G.M.): algebraic average
5) Harmonic mean (H.M.): algebraic average
6) Weighted mean (W.M.): algebraic average

(1) Arithmetic mean or Mean

It is the most common and ideal measure of central tendency.

It is defined as the sum of the observed values of the character (or variable)
divided by the number of observations considered in obtaining the sum (total).

Symbolically, $\bar{X}$ : Sample mean
$\mu$ : Population mean

Let $X_1, X_2, \ldots, X_n$ be the values of $n$ observations. Then,

For ungrouped data
$$\text{A.M.}\,(\bar{X}) = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \qquad (i = 1, 2, \ldots, n)$$

For grouped data

Let $X_1, X_2, \ldots, X_k$ be the values of X and $f_1, f_2, \ldots, f_k$ their corresponding frequencies; then
$$\bar{X} = \frac{f_1X_1 + f_2X_2 + \cdots + f_kX_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$

Method of computation : Raw data or ungrouped data

(i) Direct method :
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
(ii) Assumed mean method :
$$\bar{X} = A + \frac{\sum_{i=1}^{n} d_i}{n}, \qquad d_i = X_i - A \quad (A = \text{Assumed mean})$$

Grouped data

(i) Direct method
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}, \qquad \sum_{i=1}^{k} f_i = n$$
where, $f_i$ = freq. of the ith class, $X_i$ = class mid value,
k = no. of classes

(ii) Assumed mean method
$$\bar{X} = A + \frac{\sum_{i=1}^{k} f_i d_i}{n}, \qquad d_i = X_i - A$$
A = Assumed mean

(iii) Step deviation method
$$\bar{X} = A + \frac{\sum_{i=1}^{k} f_i d_i}{n} \times I$$
where, $d_i = \dfrac{X_i - A}{I}$, A = Assumed mean, I = Class interval,
$X_i$ = Class mid value.
EXERCISE NO. 2
Problems On Measures Of Central Tendency
------------------------------------------------------------------------------------------------
2.1 Work out the mean from the following information.

5.5, 6.7, 8.9, 9.6, 12.5, 12.0, 13.7, 14.5, 10.8, 9.6
2.2 Calculate mean from the following distribution.

Class Interval   Freq. (f)   Mid Value (X)   fX   fX²   D = (X-A)/I   fD   fD²
200-300          3
300-400          8
400-500          9
500-600          13
600-700          7
700-800          5
800-900          3
900-1000         2
2.3 Find out missing frequencies of the classes from the following. The mean value of
these data is 50.40.
Marks 0-20 20-40 40-60 60-80 80-100 Total
Students 10 22 36 f4 f5 100

Marks     No. of Students (fi)   Xi   Ci   fiXi
00-20     10
20-40     22
40-60     36
60-80     f4
80-100    f5
Total     100
2.4 Workout mean value from the following information.

(a) $\sum f_i X_i = 1250$, $\quad \sum f_i (X_i - 40)/10 = 25$

(b) $\sum (Y_i - 6) = 40$, $\quad \sum (Y_i - 10.5) = -50$

2.5 The following data represent the grades obtained by a student. Calculate the weighted mean
value.

Sr. No.   Course No.       Credits (wi)   Grade Point (Xi)   wiXi
1         Eng. 1.2         2              5.4
2         Agron. 1.2       5              6.6
3         Ag. Chem. 1.2    5              6.3
4         Ag. Bot. 1.2     3              6.7
5         Pl. Path. 1.2    3              7.6
6         Hort. 1.2        2              8.1
Example : The microbial counts per Petri plate of a milk sample are given below.
Find the average number of microbes per plate.
No. of microbes per plate (X)   1   2    3    4    5    6     7     8    9    10
No. of plates (f)               6   16   25   38   85   108   100   65   28   9
Solution
No. of microbes (X)   No. of plates (f)   fX
1                     6                   6
2                     16                  32
3                     25                  75
4                     38                  152
5                     85                  425
6                     108                 648
7                     100                 700
8                     65                  520
9                     28                  252
10                    9                   90
Total                 480                 2900

Here $\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{2900}{480} = 6.04$.
Since a microbial count cannot be a fraction, we can say the average number of microbes
per plate of the given sample = 6.
Example: Compute the average income (Rs.) of dairy farmers of a village from the
following distribution of income from sale of milk and milk products per day.
Income (Rs.)     90-110  110-130  130-150  150-170  170-190  190-210  210-230  230-250
No. of farmers   15      42       60       65       112      174      150      82

Solution:
Income     No. of dairy farmers (f)   Mid value (X)   d = (X-A)/I   fd
90-110     15                         100             -3            -45
110-130    42                         120             -2            -84
130-150    60                         140             -1            -60
150-170    65                         160             0             0
170-190    112                        180             1             112
190-210    174                        200             2             348
210-230    150                        220             3             450
230-250    82                         240             4             328
Total      700                                                      1049

Here, assumed mean $A = 160$ and divisor (class interval) $I = 20$.
Required mean
$$\bar{X} = A + \frac{\sum fd}{\sum f} \times I = 160 + \frac{1049}{700} \times 20 = 189.97 \text{ Rs.}$$
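
As a cross-check of the step-deviation arithmetic, the same mean can be computed both directly and by step deviation. The sketch below is illustrative only: the function names are hypothetical, and the frequencies are those of the solution table above (65 and 82 in the fourth and last classes), with A = 160 and I = 20 assumed as the assumed mean and class interval.

```python
def mean_direct(mids, freqs):
    """Direct method: sum(f * X) / sum(f)."""
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)

def mean_step_deviation(mids, freqs, assumed_mean, interval):
    """Step-deviation method: A + (sum(f * d) / n) * I with d = (X - A) / I."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / interval for f, x in zip(freqs, mids))
    return assumed_mean + sum_fd / n * interval

mids = [100, 120, 140, 160, 180, 200, 220, 240]      # class mid values
freqs = [15, 42, 60, 65, 112, 174, 150, 82]          # no. of dairy farmers

direct = mean_direct(mids, freqs)
step = mean_step_deviation(mids, freqs, assumed_mean=160, interval=20)
print(round(direct, 2), round(step, 2))
```

Both methods give the same value because the step deviation is only a change of origin and scale that is undone in the last step.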
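Example : The average monthly incomes of 30 male and 20 female workers in a milk plant
are Rs. 3000.00 and Rs. 2450.00, respectively. Find the average monthly income of all the
workers in the milk plant.
Solution: Given that $n_1 = 30$, $\bar{X}_1 = 3000$, $n_2 = 20$ and $\bar{X}_2 = 2450$.
Average monthly income of all workers
$$\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2} = \frac{30 \times 3000 + 20 \times 2450}{30 + 20} = \frac{139000}{50} = \text{Rs. } 2780.00$$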
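
The combined-mean formula is a size-weighted average of the group means. A minimal Python sketch, with the hypothetical helper name `combined_mean`, applied to the figures of this example:

```python
def combined_mean(sizes, means):
    """Combined mean: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

# 30 male workers at Rs. 3000 and 20 female workers at Rs. 2450
all_workers = combined_mean([30, 20], [3000.0, 2450.0])
print(all_workers)   # 2780.0
```

The same function handles any number of groups, which is what the exercises that follow require.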
Exercise : The average weights of milk samples from four herds of the same farm, consisting
of 15, 20, 10 and 18 animals, were 30.5, 36.2, 28.0 and 26.5 litres, respectively.
Find the mean weight of the milk.
Solution:

Exercise : In an agribusiness company, the average monthly incomes of 18 field workers and 5
office assistants at Raipur centre are Rs. 20000.00 and Rs. 17000.00, and those of 32 field workers
and 10 office assistants at Bilaspur centre are Rs. 16500.00 and Rs. 14500.00, respectively. Find the
average monthly income of all the workers in the company.

Solution:
Objective: To Compute Median for Grouped and Un-Grouped data.
Median: The median is that value of the variable which divides the group into two equal parts, one part
comprising all values greater and the other all values less than the median. (Connor)
Median of ungrouped data: Let $n$ be the number of values of the variate; arrange the series in
order (ascending or descending), then

(i) If $n$ is odd, the median is the $\left(\frac{n+1}{2}\right)$th term of the array.

(ii) If $n$ is even, the median is the average of the two middle values, i.e. of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th terms.

Median of grouped data: The class corresponding to the cumulative frequency just greater than $N/2$
is called the median class and the value of the median is obtained by
$$\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times I$$
where $l$ = lower limit of median class, $I$ = class interval,
$f$ = frequency of median class, $N$ = total frequencies,
$C$ = cumulative frequency up to the class preceding the median class.

Median can also be located graphically by making a ‘more than’ or/and ‘less than’ cumulative frequency
curve. Different steps of this method are
(i) Draw an ogive (cumulative frequency curve).
(ii) Mark the median point $N/2$ along the Y-axis.
(iii) Draw a line from the point $N/2$ parallel to the X-axis, meeting the curve at a point P.
(iv) Draw a perpendicular on the X-axis from point P.
(v) The point of intersection with the X-axis will be the median.
Different steps for finding the median when both ‘more than’ and ‘less than’ cumulative frequency curves are
drawn.
(i) Plot the ‘less than’ and ‘more than’ ogives.
(ii) The intersecting point is P.
(iii) Draw a perpendicular on the X-axis from point P. The point of intersection with the X-axis will be the
median.
Uses of median: Median is useful when information is desired on the relative position and is the most
appropriate average in dealing with rates, ranks, scores and items that are not counted or measured, i.e. it
is very useful when data are to be measured qualitatively and in groups, e.g. it is the most suitable
measure for comparing data on health, intelligence, honesty etc. It is especially useful and more
representative when the distribution is highly skewed or open ended.

Merits: (i) It can be located by inspection. (ii) It is easy to understand and compute. (iii) Its value is
not affected by extreme values. (iv) It can be calculated for distributions with open end classes. (v) It is
suitable for qualitative studies. (vi) It can be exactly computed.

Demerits: (i) It necessitates arraying of data before it can be found. (ii) In case of an even number of
observations, the median cannot be determined exactly. We merely estimate it by taking the mean of the two
middle terms. (iii) As compared with the mean, it is much affected by fluctuations of sampling. (iv) The median
may not be representative if the distribution is irregular and abnormal. (v) It is not amenable to
algebraic and arithmetical treatment.
Example 1: Calculate median from the following data
(i) 18 22 6 25 32 35 15 50 45 43
(ii) Class           30-35  35-40  40-45  45-50  50-55  55-60  60-65
     Frequency (f)   14     16     18     23     18     8      3
Solution: (i) Arranging the terms in ascending order
6 15 18 22 25 32 35 43 45 50

Here $n = 10$ (even), therefore the median will be the mean of the 5th and 6th terms.

Median $= \dfrac{25 + 32}{2} = 28.5$
(ii)
Class    Frequency (f)   Cumulative frequency
30-35    14              14
35-40    16              30
40-45    18              48
45-50    23              71
50-55    18              89
55-60    8               97
60-65    3               100
Total    100

Median item $= N/2 = 100/2 = 50$. The 50th item lies in class 45-50, therefore this
is the median class. Hence $l = 45$, $I = 5$, $f = 23$, $C = 48$ and
$$\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times I = 45 + \frac{50 - 48}{23} \times 5 = 45.43$$
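
Both parts of Example 1 can be checked with a short Python sketch. The function names `median_ungrouped` and `median_grouped` are hypothetical, and classes of equal width are assumed; the grouped version applies the interpolation formula for the class whose cumulative frequency is the first to exceed N/2.

```python
def median_ungrouped(values):
    """Middle term for odd n, mean of the two middle terms for even n."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

def median_grouped(lower_limits, width, freqs):
    """Median = l + (N/2 - C) / f * I for the median class."""
    half = sum(freqs) / 2
    cum = 0                                   # cumulative frequency before the class
    for l, f in zip(lower_limits, freqs):
        if cum + f > half:                    # first cumulative frequency above N/2
            return l + (half - cum) / f * width
        cum += f
    raise ValueError("empty frequency distribution")

part_i = median_ungrouped([18, 22, 6, 25, 32, 35, 15, 50, 45, 43])
part_ii = median_grouped([30, 35, 40, 45, 50, 55, 60], 5,
                         [14, 16, 18, 23, 18, 8, 3])
print(part_i, round(part_ii, 2))
```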

Example 2: The following frequency distribution has some missing frequencies. If the total
frequency is 685 and the median is 42.6, find the missing frequencies.
Class       10-20  20-30  30-40  40-50  50-60  60-70  70-80
Frequency     180     -      34    180    136     -      50
Solution: Let the missing frequencies be f1 and f2 respectively.
Class    Frequency   Cumulative frequency
10-20       180        180
20-30       f1         180 + f1
30-40        34        214 + f1
40-50       180        394 + f1
50-60       136        530 + f1
60-70       f2         530 + f1 + f2
70-80        50        580 + f1 + f2
Total       685

We have 580 + f1 + f2 = 685, so f1 + f2 = 105.
As the median 42.6 lies in the class 40-50, the median class is 40-50.
We have L = 40, f = 180, cf = 214 + f1, h = 10 and N/2 = 342.5, so

$$42.6 = 40 + \frac{342.5 - (214 + f_1)}{180} \times 10$$

or 46.8 = 128.5 - f1, i.e. f1 = 81.7.
As the frequency of an item is always a whole number, we take f1 = 82 and hence f2 = 105 - 82 = 23.
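
The algebra of Example 2 can be verified numerically. The sketch below simply rearranges the grouped-median formula for the unknown frequency; the variable names are illustrative.

```python
total = 685
median = 42.6
lower, width, f_median = 40, 10, 180   # median class 40-50
known_before = 180 + 34                # known frequencies below the median class

# median = lower + ((total/2 - (known_before + f1)) / f_median) * width
f1_exact = total / 2 - known_before - (median - lower) * f_median / width
f1 = round(f1_exact)                   # a frequency must be a whole number
f2 = total - (180 + 34 + 180 + 136 + 50) - f1
print(round(f1_exact, 1), f1, f2)      # 81.7 82 23
```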
Example 3: Find the median of the following data graphically.

Groups      0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency      4      8     11     15     11      7      4

Solution:

Group    Upper limit   Frequency   Cumulative frequency
0-10         10            4              4
10-20        20            8             12
20-30        30           11             23
30-40        40           15             38
40-50        50           11             49
50-60        60            7             56
60-70        70            4             60
Total                     60

(i) Draw an ogive (cumulative frequency against upper class limit).
(ii) Mark the midpoint N/2 = 60/2 = 30 along the y-axis.
(iii) From this point draw a line parallel to the x-axis, meeting the ogive.
(iv) From the point on the ogive draw a perpendicular to the x-axis.
(v) The perpendicular meets the x-axis at 35, so Median = 35.

[Figure: Locating the median]
Exercise 1: The lengths (in cms.) of 50 panner sample in a trial data is given below. Find
the median length of the panner.
31.0, 23.3, 28.2, 18.4, 23.5, 30.0, 27.5, 22.7, 15.6, 23.4, 20.5, 22.5, 26.5, 35.4, 20.0, 24.5, 23.2,
25.0, 23.0, 26.5, 29.4, 19.5, 19.4, 17.5, 23.5, 29.0, 22.5, 18.4, 17.5, 20.4, 37.1, 29.0, 23.5, 23.5,
29.1, 21.0, 32.6, 27.0, 23.5, 28.5, 27.2, 30.0, 23.0, 23.5, 24.5, 22.4, 25.2, 28.2, 31.0, 26.0.
Solution:

Exercise 2: Find median of the following data


Pocket money (Rs.) 300-350 350-400 400-450 450-500 500-550 550-600
No. of students 15 18 25 40 30 2
Solution:
Objective: To Compute Mode for Grouped and Un-Grouped data.

Mode: In grouped data, if the series is an inclusive series, the first step is to convert it into an
exclusive series. The class with the maximum frequency is known as the modal class. If the
maximum frequency occurs in more than one class, the modal class is found through the
method of grouping. Once the modal class is selected, the mode of the distribution is computed
through the following formula

$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$

where h = class interval,
L = lower limit of the modal class, f0 = frequency of the class preceding the modal class,
f1 = frequency of the modal class, f2 = frequency of the class succeeding the modal class.
In situations where the mode is ill defined, the following empirical relationship should be used:

Mode = 3 Median - 2 Mean, or Mean - Mode = 3 (Mean - Median)
Mode can also be located graphically by drawing histogram of the frequency distribution.
Different steps are as follows
(i) Draw histogram.
(ii) Join the top right corner of the modal class's rectangle to the top right corner of the previous
class's rectangle, and its top left corner to the top left corner of the succeeding class's rectangle.
(iii) From the point where these two lines intersect, draw a perpendicular to the x-axis.
(iv) The point where the perpendicular meets the x-axis is the mode of the distribution.
Maximum frequency is not always a correct indication of mode. The concentration may be
around two or more points. In these cases, we have to find the point of maximum concentration.
In determining the point, we use the method of grouping.
The series, in which the concentration of items is around two or more than two values, are
called bimodal, trimodal or multimodal series depending upon the number of values around
which items concentrate.
Method of grouping is also used to determine the mode for the distribution in which the
maximum frequency occurs in the very beginning or at the end of the distribution.
Merits: (i) It can be located merely by inspection. (ii) It is easily comprehensible and commonly
used. (iii) It is not affected by extreme variation. (iv) Open end classes also do not pose any
problem in the location of mode. (v) Mode can be conveniently located even if the frequency
distribution has classes of unequal intervals provided the modal class and the class preceding
and succeeding it are of same magnitude.
Demerits: (i) It is not based on all the observations. (ii) It is not suitable for further
mathematical treatment. (iii) It is ill defined, and not always possible to find a clearly defined
mode.
Uses of mode: The mode is very useful for dealing with qualitative data. It is widely
used in business, in forecasting weather changes, in biological studies and in market studies.
Example 1: Find the mode of the following data set.
26 23 24 20 18 26 24 20 24 19 25 24 24 28 30
25 32 30 24 18 20 24 30 22 24 22 28 30 24 18
Solution: Arrange the data in order (either ascending or descending).
18 18 18 19 20 20 20 22 22 23 24 24 24 24 24
24 24 24 24 25 25 26 26 28 28 30 30 30 30 32
Make a frequency table.
Value      18 19 20 22 23 24 25 26 28 30 32
Frequency   3  1  3  2  1  9  2  2  2  4  1
Here 24 is repeated the maximum number of times (9).
Mode = 24.
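
The frequency table of Example 1 can be built mechanically. The following is a small sketch using Python's standard `collections.Counter`; it returns every value that shares the highest frequency, so a bimodal series would give two modes.

```python
from collections import Counter

data = [26, 23, 24, 20, 18, 26, 24, 20, 24, 19, 25, 24, 24, 28, 30,
        25, 32, 30, 24, 18, 20, 24, 30, 22, 24, 22, 28, 30, 24, 18]

counts = Counter(data)                 # value -> frequency
top = max(counts.values())             # highest frequency
modes = sorted(v for v, c in counts.items() if c == top)
print(modes, top)                      # [24] 9
```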

Example 2: The agricultural holdings of 362 families of a village are given below. Find out the
modal size of holdings.
Holdings (ha) 0-5 5-10 10-15 15-20 20-25
No. of families 25 36 180 89 32

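Solution
Holding (ha)   No. of families
0-5                 25
5-10                36
10-15              180
15-20               89
20-25               32
Total              362

Here the modal class is 10-15, as it has the maximum frequency (180). Therefore L = 10 (lower limit
of the modal class), f1 = 180 (modal frequency), f0 = 36 (preceding class), f2 = 89 (succeeding class),
h = 5 (class interval), and

$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h = 10 + \frac{180 - 36}{2(180) - 36 - 89} \times 5 = 10 + \frac{720}{235} = 13.06 \text{ ha}$$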
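
As a check on Example 2, the grouped-mode formula can be written as a short function. This is a sketch under the assumption of equal class widths and a single, clearly defined modal class; if several classes tie for the maximum frequency it simply takes the first, which is where the method of grouping would be needed instead.

```python
def mode_grouped(lower_limits, width, freqs):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for equal-width classes."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0       # succeeding class
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


holdings = [0, 5, 10, 15, 20]
families = [25, 36, 180, 89, 32]
print(round(mode_grouped(holdings, 5, families), 2))     # 13.06
```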

Example 3: Find the mode of the following frequency distribution through the graphical method.
Class       0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency      5     10     12     15     13      8      4

Solution: Class 30-40 is the modal class, as it has the maximum frequency.

Steps for locating the mode
(i) Draw the histogram.
(ii) Join the top right corner of the modal class's rectangle to the top right corner of the
previous class's rectangle, and its top left corner to the top left corner of the succeeding
class's rectangle.
(iii) From the point where these two lines intersect, draw a perpendicular to the x-axis.
(iv) The point where the perpendicular meets the x-axis is the mode of the distribution.

[Figure: Locating the mode]
Exercise 1: The following are the numbers of baskets of milk drawn from a milk plant on every
alternate day for a period of two months. Find the mode.
30 35 40 32 35 40 42 28 25 35 37 42 45 30 35
36 28 35 20 26 28 35 32 38 35 40 28 27 35 30
Solution:

Exercise 2: From the data given below, find the mode.


Milk quantity
(in ltr) (below) 25 30 35 40 45 50 55 60

No. of cows 8 23 51 81 103 113 117 120

Solution:
Exercise 3: Find mode from the following table by the use of graph and check the results by
calculations.
Class 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Frequency 2 18 30 45 35 20 6 3
Solution:
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[I can prove anything by statistics - except the truth. ~ George Canning]

------------------------------------------------------------------------------------------------------------
Definition
Dispersion may be defined as the extent of the scatter of observations around a
measure of central tendency, and a measure of such scatter is called a measure of dispersion.

The different measures of dispersion are as under :

1) Range.
2) Absolute mean deviation or Absolute deviation ( A.D.)
3) Standard deviation ( S)
4) Variance (S²)
5) Standard error of mean ( SEm.)
6) Coefficient of variation ( C.V.%)

Standard deviation (S)


The standard deviation, or "root mean square deviation", is the most common and
efficient measure of dispersion used in statistics. It is based on deviations from the arithmetic
mean and is denoted by S or σ, where S = standard deviation of a sample and σ = standard
deviation of a population.

Definition
"It is the square root of the ratio of the sum of squares of deviations, calculated from the
arithmetic mean, to the total number of observations minus one."

The square of the standard deviation is known as the variance. It is also known as the second
moment of dispersion.

Properties of standard deviation: (i) Standard deviation is based on all observations of the
distribution. (ii) It depends on change of scale: it changes by as much as the scale is
changed. (iii) It is independent of change of origin. (iv) It is never negative. (v) It is affected
by extreme values.

The unit of standard deviation is same as the unit of the variable.

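Combined variance: If $\bar{x}_1$ and $\bar{x}_2$ are the means of two series of sizes n1 and n2 with variances
$\sigma_1^2$ and $\sigma_2^2$, respectively, then the variance of the series formed by combining the two
given series is given by

$$\sigma^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$ is the combined mean.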
Standard deviation is the best measure of dispersion because of following reasons:
(i) Arithmetic mean, which is the best measure of central tendency, is used for
computing standard deviation.
(ii) It is based on all observations.
(iii)It is not much affected by fluctuations of sampling.
(iv) Algebraic sign is not ignored for computing standard deviation.

Methods of calculation

Raw data or ungrouped data

(1) Deviation method

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

(2) Variable square method

$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}{n - 1}}$$

where: $X_i$ = variate value, n = number of observations

(3) Assumed mean method

$$S = \sqrt{\frac{\sum_{i=1}^{n} d_i^2 - \dfrac{\left(\sum_{i=1}^{n} d_i\right)^2}{n}}{n - 1}}$$

where: $d_i = X_i - A$, A = assumed mean

Grouped data or frequency distribution

(1) Deviation method

$$S = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n - 1}}$$

where $n = \sum_{i=1}^{k} f_i$; $f_i$ = frequency of the ith class

(2) Variable square method

$$S = \sqrt{\frac{\sum_{i=1}^{k} f_i X_i^2 - \dfrac{\left(\sum_{i=1}^{k} f_i X_i\right)^2}{n}}{n - 1}}$$

(3) Assumed mean method

$$S = \sqrt{\frac{\sum_{i=1}^{k} f_i d_i^2 - \dfrac{\left(\sum_{i=1}^{k} f_i d_i\right)^2}{n}}{n - 1}}$$

(4) Step deviation method

$$S = \sqrt{\frac{\sum_{i=1}^{k} f_i d_{xi}^2 - \dfrac{\left(\sum_{i=1}^{k} f_i d_{xi}\right)^2}{n}}{n - 1}} \times I$$

where: $d_{xi} = (X_i - A)/I$, I = class interval
Variance
Variance is the square of the standard deviation. It is also called the "mean square
deviation". It is used very extensively in the analysis of variance of results from field
experiments. It is symbolically denoted by
S² = sample variance and σ² = population variance.
Method of computation

Raw data or ungrouped data

(1) Deviation method

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

(2) Variable square method

$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}{n - 1} = \frac{SS}{df}$$

where: $X_i$ = variate value, SS = sum of squares,
n = number of observations, df = degrees of freedom

(3) Assumed mean method

$$S^2 = \frac{\sum_{i=1}^{n} d_i^2 - \dfrac{\left(\sum_{i=1}^{n} d_i\right)^2}{n}}{n - 1}$$

where: $d_i = X_i - A$, A = assumed mean

Grouped data or frequency distribution

(1) Deviation method

$$S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n - 1}$$

(2) Variable square method

$$S^2 = \frac{\sum_{i=1}^{k} f_i X_i^2 - \dfrac{\left(\sum_{i=1}^{k} f_i X_i\right)^2}{n}}{n - 1}$$

(3) Assumed mean method

$$S^2 = \frac{\sum_{i=1}^{k} f_i d_i^2 - \dfrac{\left(\sum_{i=1}^{k} f_i d_i\right)^2}{n}}{n - 1}$$

where: $d_i = X_i - A$, A = assumed mean, $f_i$ = frequency of the ith class

(4) Step deviation method

$$S^2 = \frac{\sum_{i=1}^{k} f_i d_{xi}^2 - \dfrac{\left(\sum_{i=1}^{k} f_i d_{xi}\right)^2}{n}}{n - 1} \times I^2$$

where: $d_{xi} = (X_i - A)/I$
Standard error of mean ( SEm.)
The standard deviation is the standard error of a single variate, whereas the standard error
of the mean is the standard deviation of the sampling distribution of the sample mean. In other
words, it refers to the average magnitude of the difference between the sample estimate and the
population parameter, taken over all possible samples from the population.

Definition
It is defined as the square root of the ratio of the variance to the total number of observations in a
given set of data.

Symbolically it is written as $S_{\bar{x}}$ for a sample and $\sigma_{\bar{x}}$ for a population.

$S_{\bar{x}} = \sqrt{S^2/n}$, where S² = variance; n = number of observations

In statistical analysis the use of $S_{\bar{x}}$ is common. It is also used to set confidence
limits on the population mean and in tests of significance.

Coefficient of variation (C.V.%)


It is a relative measure of variation and is widely used to compare two or more statistical
series.
Statistical series may differ from one another with respect to their mean or standard
deviation or both. Sometimes they may also differ with respect to their units, and then their
direct comparison is not possible. To get a comparable idea of the variability present in them,
C.V.% is used. It was developed by Karl Pearson.
Definition
"It is the percentage ratio of the standard deviation to the arithmetic mean of a given
series." It is unitless.

$$\text{C.V.\%} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{S}{\bar{X}} \times 100$$

The series for which the C.V.% is greater is said to be more variable, i.e.
less consistent, less homogeneous or less stable, while the series having the lower C.V.%
is called more consistent or more homogeneous.
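
The measures defined above can be tied together in a few lines of Python. The five observations below are a hypothetical sample chosen for easy arithmetic (they do not come from the manual), and the variable names are illustrative. The sketch shows that the deviation method and the variable square method give the same sum of squares.

```python
import math

x = [10, 12, 14, 16, 18]          # hypothetical sample, not from the manual
n = len(x)
mean = sum(x) / n

# Deviation method: sum of squared deviations from the mean
ss_dev = sum((xi - mean) ** 2 for xi in x)
# Variable square method: sum(X^2) - (sum X)^2 / n
ss_sq = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

variance = ss_dev / (n - 1)       # S^2 = SS / df
sd = math.sqrt(variance)          # S
se_mean = math.sqrt(variance / n) # standard error of the mean
cv = sd / mean * 100              # coefficient of variation, in percent

print(ss_dev, ss_sq)                       # 40.0 40.0
print(variance, round(sd, 4))              # 10.0 3.1623
print(round(se_mean, 4), round(cv, 2))     # 1.4142 22.59
```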

Difference between mean deviation and standard deviation

     Mean deviation                                Standard deviation
1-   Deviations can be computed from any           Deviations can be computed only from the
     measure (mean, median or mode).               arithmetic mean.
2-   Negative signs of the values are ignored,     Negative signs of the values are treated
     which is a mathematical fault.                mathematically (by squaring) to make them
                                                   positive, hence there is no mathematical fault.
3-   Affected more by sampling fluctuations.       Affected less by sampling fluctuations.
4-   Further conclusions based on mean             Further conclusions based on standard
     deviation are not possible.                   deviation are possible.
5-   Mean deviation is less stable.                Standard deviation is more stable.
6-   Mean deviation is not used where more         Standard deviation is used where more
     accuracy is needed.                           accuracy is needed.
7-   Easy to compute.                              Comparatively difficult to compute.
EXERCISE NO. 3
PROBLEMS ON MEASURES OF DISPERSION
------------------------------------------------------------------------------------------------
3.1 Twenty seeds of Tur were planted on each of 20 agar plots. The numbers of seeds
germinated were observed as under. Work out the various measures of dispersion.

 X    (X - X̄)   (X - X̄)²    X²    di = X - 12   di²
 6     -5.55      30.80      36       -6         36
 7     -4.55      20.70      49       -5         25
 8     -3.55      12.60      64       -4         16
12      0.45       0.20     144        0          0
18      6.45      41.60     324        6         36
10     -1.55       2.40     100       -2          4
12      0.45       0.20     144        0          0
11     -0.55       0.30     121       -1          1
 0    -11.55     133.40       0      -12        144
13      1.45       2.10     169        1          1
13      1.45       2.10     169        1          1
10     -1.55       2.40     100       -2          4
 9     -2.55       6.50      81       -3          9
12      0.45       0.20     144        0          0
14      2.45       6.00     196        2          4
15      3.45      11.90     225        3          9
16      4.45      19.80     256        4         16
16      4.45      19.80     256        4         16
15      3.45      11.90     225        3          9
14      2.45       6.00     196        2          4
3.2 The following table shows the frequency distribution of tomato plants according to
the number of tomatoes per plant. Find the different measures of dispersion.

No. of tomatoes/   No. of       (Xi - X̄)   fiXi   fiXi²   (Xi - X̄)²   fi(Xi - X̄)²
plant (X)          plants (f)
 0                   2            -5          0       0       25           50
 1                   3            -4          3       3       16           48
 2                   7            -3         14      28        9           63
 3                  11            -2         33      99        4           44
 4                  18            -1         72     288        1           18
 5                  24             0        120     600        0            0
 6                  12             1         72     432        1           12
 7                   8             2         56     392        4           32
 8                   8             3         64     512        9           72
 9                   4             4         36     324       16           64
10                   3             5         30     300       25           75
Total 55           100             0        500    2978      110          478
3.3 The following frequency distribution shows the number of cross-bred cows by
monthly milk production in litres. Calculate the mean, standard deviation, variance, standard
error of mean and coefficient of variation (step deviation method).

Milk yield   Cum. freq.   No. of cows   Mid value   D = (X - A)/I   fiDi   fiDi²
(lits)       (c.f.)       (f)           (X)
200-250          5             5          225           -4          -20      80
250-300         16            11          275           -3          -33      99
300-350         32            16          325           -2          -32      64
350-400         56            24          375           -1          -24      24
400-450         87            31          425            0            0       0
450-500        101            14          475            1           14      14
500-550        110             9          525            2           18      36
550-600        118             8          575            3           24      72
600-650        125             7          625            4           28     112
Total                        125                                    -25     501

Here A = 425 (assumed mean) and I = 50 (class interval).

Range = Max. obs. - Min. obs.
      = 650 - 200 = 450

Step deviation method

(1) Mean

$$\bar{X} = A + \frac{\sum_{i=1}^{k} f_i d_{xi}}{n} \times I = 425 + \frac{(-25)}{125} \times 50 = 425 - 10 = 415 \text{ litres}$$

(2) Standard deviation

$$S = \sqrt{\frac{\sum_{i=1}^{k} f_i d_{xi}^2 - \dfrac{\left(\sum_{i=1}^{k} f_i d_{xi}\right)^2}{n}}{n - 1}} \times I
= \sqrt{\frac{501 - (-25)^2/125}{124}} \times 50
= \sqrt{\frac{501 - 5}{124}} \times 50
= \sqrt{\frac{496}{124}} \times 50
= 2 \times 50 = 100$$
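
The step deviation working for problem 3.3 can be reproduced as follows. This is a sketch only: the manual shows the mean and S, while the variance, standard error and C.V.% lines simply apply the formulas given earlier in this chapter to the same totals.

```python
import math

mid = [225, 275, 325, 375, 425, 475, 525, 575, 625]   # class mid values
f = [5, 11, 16, 24, 31, 14, 9, 8, 7]                  # number of cows per class
A, I = 425, 50                                        # assumed mean and class interval

d = [(m - A) // I for m in mid]                       # step deviations -4 ... 4
n = sum(f)
sum_fd = sum(fi * di for fi, di in zip(f, d))
sum_fd2 = sum(fi * di ** 2 for fi, di in zip(f, d))

mean = A + sum_fd / n * I
variance = (sum_fd2 - sum_fd ** 2 / n) / (n - 1) * I ** 2
sd = math.sqrt(variance)
se_mean = math.sqrt(variance / n)
cv = sd / mean * 100

print(n, sum_fd, sum_fd2)                   # 125 -25 501
print(round(mean, 2), round(sd, 2))         # 415.0 100.0
print(round(se_mean, 2), round(cv, 2))      # 8.94 24.1
```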
3.4 Compare the variability in intelligence of two classes A and B from the following
information.

Class No. of Students Av. mark S.S.


A 26 19.6 1024
B 17 21.0 1225
3.5 Compare the variability of two series and draw your conclusion.

X-series                      Y-series

(1) Σ(Xi - X̄)² = 96          (1) ΣYi = 120

(2) ΣXi² = 2596               (2) S.S. = 81

(3) n = 25                    (3) Σ(Yi - 12) = 0

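3.6: A shopkeeper mixes a batch of 200 apples of mean mass 150 g and standard
deviation 30 g with another batch of 300 apples of mean mass 100 g and standard
deviation 20 g. Find the mean and standard deviation of the combined batch of 500
apples.

Solution: We have n1 = 200, x̄1 = 150 g, σ1 = 30 g and n2 = 300, x̄2 = 100 g, σ2 = 20 g.

$$\text{Combined mean } \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{200(150) + 300(100)}{500} = 120 \text{ g}$$

so that d1 = 150 - 120 = 30 and d2 = 100 - 120 = -20, and

$$\sigma^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2} = \frac{200(900 + 900) + 300(400 + 400)}{500} = 1200, \qquad \sigma = \sqrt{1200} = 34.64 \text{ g}$$

Hence the mean weight of the combined batch is 120 g with a standard deviation of 34.64 g.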
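
A quick numerical check of problem 3.6, written as a sketch with illustrative variable names. It uses the combined-variance formula in which each batch contributes its own variance plus the squared distance of its mean from the combined mean.

```python
import math

n1, mean1, sd1 = 200, 150.0, 30.0     # first batch of apples
n2, mean2, sd2 = 300, 100.0, 20.0     # second batch of apples

mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
d1, d2 = mean1 - mean, mean2 - mean   # distances of batch means from combined mean
variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
sd = math.sqrt(variance)

print(mean, variance, round(sd, 2))   # 120.0 1200.0 34.64
```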
3.7 From the performance of the following two plant characters of a rice variety, state which
character is more stable.
Panicle length (cm)    15 20 16 22 20 14 21 24 30 18 15 25
100-seed weight (g)    28 30 22 32 35 20 28 35 42 28 22 38

Solution:

3.8: In a sensory evaluation experiment, two judges accorded the following ranks to eight
milk products. State which judge is more stable.
Judge A 8 7 6 3 1 1 5 4
Judge B 7 5 4 1 3 2 6 8
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[ Facts are stubborn things, but statistics are pliable. ― Mark Twain ]

------------------------------------------------------------------------------------------------
Background: Most of the decision-making situations in business management involve
uncertainty. Since uncertainty is present and is an important aspect in determining the
consequences of various alternative courses of action, it is imperative to get proper
appreciation of it, draw a mathematical picture of it and attempt to measure it in numerical
terms
Basic terminology

1) Random experiment
If in each trial of an experiment conducted under identical conditions the outcome is
not unique but may be any one of the possible outcomes, then such an experiment is called a
random experiment.
Examples of random experiments are: tossing a coin, throwing a die, etc.
2) Trial and event
Any particular performance of a random experiment is called a trial, and an outcome or
combination of outcomes is termed an event.
For example, if a coin is tossed repeatedly, the result is not unique. We may get
either of the two faces. Thus tossing a coin is a random experiment and getting a head or a tail is an event.
3) Exhaustive events
The total number of possible outcomes of a random experiment is known as the
exhaustive events or cases.
Example: In tossing a coin, there are two exhaustive cases, viz. head and tail.
4) Favorable events
The number of cases favorable to an event in a trial is the number of outcomes which
entail the happening of the event.
Example: In throwing two dice, the cases favorable to getting a sum of 5 are
(1, 4), (4, 1), (2, 3), (3, 2), i.e. 4.
5) Mutually exclusive events
Events are said to be mutually exclusive or incompatible if the happening of any one
of them precludes the happening of all the others, e.g. in throwing a die all the 6 faces
numbered 1 to 6 are mutually exclusive.
6) Equally likely events
Outcomes of a trial are said to be equally likely if, taking into consideration all the
relevant evidence, there is no reason to expect one in preference to the others.
Example: In a random toss of an unbiased or uniform coin, head and tail are equally
likely events.
7) Independent events
Several events are said to be independent if the happening of an event is not affected
by the supplementary knowledge concerning the occurrence of any number of the remaining
events.
Example: In tossing an unbiased coin, the event of getting a head in the first toss is
independent of getting a head in the second toss.
Definitions

Mathematical (classical) definition:

If a random experiment or a trial results in "n" exhaustive, mutually exclusive and
equally likely outcomes, out of which "m" are favorable to the occurrence of an event E,
then the probability "P" of the occurrence of E, usually denoted by P(E), is given by

$$P = P(E) = \frac{\text{Number of favorable cases}}{\text{Total number of exhaustive cases}} = \frac{m}{n}$$

Statistical (or empirical) probability:
By Von Mises. If an experiment is performed repeatedly under essentially
homogeneous and identical conditions, then the limiting value of the ratio of the number of
times the event occurs to the number of trials, as the number of trials becomes indefinitely
large, is called the probability of happening of the event, it being assumed that the limit is finite
and unique. Symbolically,

$$P(E) = \lim_{n \to \infty} \frac{m}{n}$$
Example: A ball is drawn at random from a box containing 6 red, 4 white and 5 blue balls.
Determine the probability that it is (a) red, (b) white, (c) blue, (d) not red.

Solution: Possible outcomes = 6 + 4 + 5 = 15

(a) Probability of getting red = 6/15
(b) Probability of getting white = 4/15
(c) Probability of getting blue = 5/15
(d) Since the total probability is 1,
P(getting red ball) + P(not getting red ball) = 1
6/15 + P(not getting red ball) = 1
P(not getting red ball) = 1 - 6/15 = 9/15
Example: Two cards are drawn at random from a pack of 52 cards. Find the chance of
drawing 2 aces.

Solution: Exhaustive number of cases = 52C2 = 1326
Favorable number of cases = 4C2 = 6
Required probability = 4C2 / 52C2 = 6/1326 = 1/221

Example: From a pack of 52 cards, 3 are drawn at random. Find the chance that they are
a king, a queen and a knave.

Solution:
Exhaustive number of cases = 52C3 = 22100
Favorable number of cases for a king = 4C1
Favorable number of cases for a queen = 4C1
Favorable number of cases for a knave = 4C1

$$P(E) = \frac{{}^{4}C_1 \times {}^{4}C_1 \times {}^{4}C_1}{{}^{52}C_3} = \frac{64}{22100} = \frac{16}{5525}$$

Example: Show that the probability of obtaining a total of 9 in a single throw with 2
dice is 1/9.

Solution: Exhaustive number of cases when 2 dice are thrown = 6² = 36

Favorable cases = 4 [(5,4), (4,5), (6,3), (3,6)]

Hence, probability of obtaining a total of 9 = 4/36 = 1/9
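
The classical definition (favorable cases divided by exhaustive cases) lends itself to direct counting. The sketch below re-derives the three answers above with Python's standard library (`math.comb` needs Python 3.8 or later); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Two dice: count ordered outcomes whose faces add up to 9
outcomes = list(product(range(1, 7), repeat=2))
favourable = [o for o in outcomes if sum(o) == 9]
p_nine = Fraction(len(favourable), len(outcomes))
print(p_nine)            # 1/9

# Two aces when 2 cards are drawn from 52
p_two_aces = Fraction(comb(4, 2), comb(52, 2))
print(p_two_aces)        # 1/221

# One king, one queen and one knave when 3 cards are drawn from 52
p_kqj = Fraction(comb(4, 1) ** 3, comb(52, 3))
print(p_kqj)             # 16/5525
```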


THEOREMS OF PROBABILITY

Addition theorem of probability

If E1, E2, E3, ..., Ek are k mutually exclusive events with respective
probabilities P1, P2, ..., Pk, then the probability that any one of them will happen is
P(E1 + E2 + E3 + ... + Ek) = P1 + P2 + ... + Pk
(Trick: The addition theorem is applied to problems where "or" is used.)
If A and B are any two events that are not disjoint, then
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Multiplication theorem of probability
If E1, E2, E3, ..., Ek are k independent events with probabilities P1, P2, ..., Pk of
their occurrence respectively, then the probability of their simultaneous occurrence is
P(E1 E2 E3 ... Ek) = P1 · P2 · ... · Pk
(Trick: The multiplication theorem is applied to problems where "and" is used.)
For any two events A and B,
P(A ∩ B) = P(A) · P(B/A), P(A) > 0
         = P(B) · P(A/B), P(B) > 0
where P(B/A) represents the conditional probability of occurrence of B when event A
has already taken place, and vice versa for P(A/B).
Rule A: When events are mutually exclusive

If two events A and B are mutually exclusive with probabilities P1 and P2 respectively, then
the probability of occurrence of either of them (A or B) is equal to the sum of the individual
probabilities (A and B).

In symbols P(A or B) = P (A) + P(B) = P1 + P2

Proof: If an event A can happen in m1 ways and B in m2 ways, then the number of ways in
which either event can happen is m1 + m2. If the total number of possibilities is n, then by
definition the probability of either the first or the second event happening is

$$P(A \text{ or } B) = \frac{m_1 + m_2}{n} = \frac{m_1}{n} + \frac{m_2}{n} = P(A) + P(B) = P_1 + P_2$$

where $P(A) = \dfrac{m_1}{n} = P_1$ and $P(B) = \dfrac{m_2}{n} = P_2$.

Example: If A is the event of drawing an ace from a pack of cards and B is the event of drawing a
king, then P(ace) = P(A) = 4/52 and P(king) = P(B) = 4/52. Find the probability of drawing either an
ace or a king in a single draw.

Solution: P(ace or king) = P(A or B) = P(ace) + P(king)

P(ace or king) = P(A) + P(B)

= 4/52 + 4/52 = 8/52

since an ace and a king cannot both be drawn in a single draw and are thus mutually exclusive
events.

From the above explanation, two facts can be pointed out:

(i) The probability of an event lies between zero and one.

(ii) The sum of the probabilities of mutually exclusive and exhaustive events is one.


Rule B: When events are not mutually exclusive

If A and B are not mutually exclusive events, then the probability of either of them is equal to
the sum of their probabilities less the probability of their simultaneous occurrence.

Symbolically

P(A or B) = P(A) + P(B) - P(AB)

where P(AB), i.e. P(A and B), is the probability of the joint occurrence of A and B.

Example: This will serve as the proof also.

If A is the event "drawing an ace" from a pack of cards, and B is the
event "drawing a spade card", then A and B are not mutually exclusive events, since the ace
of spades can be drawn. Thus the probability of drawing either an ace or a spade or both is

P(A or B) = P(A) + P(B) - P(A∩B)

P(ace or spade) = P(ace) + P(spade) - P(ace of spades)

= 4/52 + 13/52 - 1/52 = 16/52
Similarly, we can generalize the rule for more than two events also.

P(A + B + C) = P(A) + P(B) + P(C) – P(AB) - P(AC) – P(BC) + P(ABC)
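
Rules A and B can both be confirmed by listing the 52 cards and counting. The deck representation and the helper `prob` below are illustrative assumptions, not part of the manual.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spade", "heart", "diamond", "club"]
deck = list(product(ranks, suits))          # 52 (rank, suit) pairs


def prob(event):
    """Classical probability: favorable cards / total cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_ace(card):
    return card[0] == "A"


def is_king(card):
    return card[0] == "K"


def is_spade(card):
    return card[1] == "spade"


# Rule A: ace and king are mutually exclusive, so probabilities simply add
p_ace_or_king = prob(lambda c: is_ace(c) or is_king(c))
print(p_ace_or_king, prob(is_ace) + prob(is_king))        # 2/13 2/13

# Rule B: ace and spade overlap in the ace of spades, so subtract the overlap
p_ace_or_spade = prob(lambda c: is_ace(c) or is_spade(c))
p_overlap = prob(lambda c: is_ace(c) and is_spade(c))
print(p_ace_or_spade, prob(is_ace) + prob(is_spade) - p_overlap)   # 4/13 4/13
```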

LAW OF MULTIPLICATION

Independent and dependent events: Events are said to be dependent or
independent according as the occurrence of one does or does not affect the
occurrence of the others.

Two events, the drawing of a king and the drawing of a queen, will be independent if the
card is replaced after the first draw. But if the card is not replaced after the first
draw and another card is drawn for the second event, the probability
of occurrence of the second event will depend on the occurrence of
the first. Hence in the latter case the second event will be dependent on the
first.

Rule C: When events are independent

If A and B are two independent events with individual probabilities P1
and P2 respectively, then the probability of both happening at a time is the product
of their respective probabilities (P1 · P2):

P(AB) = P(A) · P(B) = P1 · P2

Proof

Let n1 and m1 be the possible and favorable numbers of cases for the
event A, and n2 and m2 those for the event B. Then

P(A) = m1/n1 and P(B) = m2/n2

Since the two events are independent, we can associate the n2 possible cases
for B with each of the n1 possible cases for A, so that the total number of possible
cases is n1 · n2.

Similarly, the total number of favorable cases for "A and B" is m1 · m2.

Thus,

$$P(A \text{ and } B, \text{ both at a time}) = \frac{m_1 m_2}{n_1 n_2} = \frac{m_1}{n_1} \cdot \frac{m_2}{n_2} = P_1 \cdot P_2$$

i.e. P(A and B) = P(A) · P(B)

Rule D: When events are dependent

Example

Suppose a box contains 3 white balls and 2 black balls.

Let A be the event “first ball drawn is black” and B the event “second ball
drawn is black”, where the balls are not replaced after being drawn. Here A and B
are dependent events.
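
The manual's example stops at the set-up. As a hedged illustration of how the multiplication theorem P(A and B) = P(A) · P(B/A) applies to it, the sketch below enumerates every ordered pair of draws without replacement; the numerical values are computed here and are not stated in the manual.

```python
from fractions import Fraction
from itertools import permutations

balls = ["W", "W", "W", "B", "B"]              # 3 white, 2 black
draws = list(permutations(range(5), 2))        # ordered pairs of distinct balls

first_black = [d for d in draws if balls[d[0]] == "B"]
both_black = [d for d in first_black if balls[d[1]] == "B"]

p_a = Fraction(len(first_black), len(draws))               # P(A)
p_b_given_a = Fraction(len(both_black), len(first_black))  # P(B/A)
p_ab = Fraction(len(both_black), len(draws))               # P(A and B)

print(p_a, p_b_given_a, p_ab)    # 2/5 1/4 1/10
print(p_ab == p_a * p_b_given_a) # True
```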
EXERCISE NO. 4
PROBLEMS ON PROBABILITY
------------------------------------------------------------------------------------------------------------
4.1 A die is rolled, find the probability that an even number is obtained.

4.2 Two coins are tossed, find the probability that two heads are obtained
4.3 i)A die is rolled, find the probability that the number obtained is
greater than 4.

ii) Two dice are rolled, find the probability that the sum is equal to 5

4.4. If the probability of solving a problem by two students Ram and Shyam
are 1/2 and 1/3 respectively then what is the probability of the problem to
be solved.
4.5 A number is selected from the first 30 natural numbers. What is the
probability that it would be divisible by 4 or 7 ?

4.6 A single card is chosen at random from a standard deck of 52 playing cards.
What is the probability of choosing a king or a club?
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[How do you nurture a positive attitude when all the statistics say you're a dead
man? You go to work: Patrick Swayze]
------------------------------------------------------------------------------------------------------------

NORMAL DISTRIBUTION:

The most important continuous probability distribution used in the entire field of
statistics is normal distribution. The normal curve is bell-shaped that extends
infinitely in both directions coming closer and closer to the horizontal axis without
touching it. The mathematical equation of normal curve was developed by De
Moivre in 1733. A continuous random variable x is said to be normally distributed if it
has the probability density function represented by the equation of normal curve

The normal distribution, also called the normal probability distribution, is the most
useful theoretical distribution for continuous variables. The data of many biological
phenomena follow the normal distribution. The area under the curve represents the total
number of observations. The distribution is represented mathematically by

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}}$$

where σ = standard deviation, μ = mean, e = 2.71828

The quantities μ and σ are the parameters of this distribution. The above equation
takes the following form under the assumption that

μ = 0, σ = 1 and (X - μ)/σ = Z

$$f(Z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{Z^2}{2}}$$

This is the standard form of the normal distribution. The variate Z is normally
distributed with mean zero and standard deviation unity. It is called the normal deviate.
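
The density can be evaluated directly from its equation. A minimal sketch, with a function name chosen here for illustration, showing the peak height at the mean and the symmetry of the curve:

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and s.d. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


print(round(normal_pdf(0.0), 4))                       # 0.3989  (peak of the standard curve)
print(round(normal_pdf(3.0, mu=3.0, sigma=2.0), 4))    # 0.1995  (peak when sigma = 2)
print(normal_pdf(1.0) == normal_pdf(-1.0))             # True    (symmetry about the mean)
```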
PROPERTIES OF NORMAL DISTRIBUTION

1. It is a symmetrical, bell-shaped, single-peaked curve. It is an asymptotic curve, i.e. it
approaches closer and closer to the base line towards the ends but never touches the base
line.

2. The curve is concave towards the x-axis at the center and convex at the
ends. The curve changes its shape at a distance of σ from the
mean.

3. There are two parameters, viz. μ (mean) and σ (standard deviation). The curve
can be drawn if we have the values of both the parameters of the
population.

When μ = 0 and σ = 1, the normal curve is termed the standard normal
curve and the variate is called the standard normal variate.

4. The normal curve is symmetrical about the mean; therefore the mean divides the
entire area of the curve into two equal parts and hence the mean is also the
median. The maximum frequencies are also at the center of the curve and
therefore the mode is also equal to the median. Thus mean, mode and median
coincide at the center.

5. If two ordinates are erected at a distance of σ on both sides of the mean,
the area of the curve so cut off is equal to 68.26 percent, i.e. about
2/3 of the entire curve.

6. Similarly, if two perpendiculars are erected at a distance of 2σ on both
sides of the mean, the area between these two perpendiculars will
be 95.44 percent.

7. If two ordinates are erected on both sides of the mean at a distance of 3σ,
the area so cut off will be 99.74 percent of the entire area of the curve.

8. For practical purposes, the range of the normal distribution is equal to 6σ.

9. Range = Maximum value - Minimum value = (μ + 3σ) - (μ - 3σ) = 6σ

10. The absolute mean deviation about the mean = 0.7979 σ ≈ (4/5) σ

11. The coefficients of skewness and kurtosis are 0 and 3, respectively.
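
The areas quoted in properties 5 to 7 can be recomputed from the error function in Python's `math` module. This is a sketch; exact computation gives 68.27, 95.45 and 99.73 percent, which differ in the last digit from values read off four-figure tables. The last line evaluates the mean-deviation constant √(2/π) from property 10.

```python
import math


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73

print(round(math.sqrt(2 / math.pi), 4))   # 0.7979
```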


Example:

If X is normally distributed with mean 3 and standard deviation 2, find

(i) P (0 ≤ X ≤ 4) (ii) P (| X − 3 | < 4).

Solution:

Given μ = 3, σ = 2

(i) P (0 ≤ X ≤ 4)

We know that Z = (X – μ) / σ

When X = 0, Z = (0 – 3) / 2 = −3 / 2 = −1.5

When X = 4, Z = (4 – 3) / 2 = 1 / 2 = 0.5

P (0 ≤ X ≤ 4) = P (−1.5 < Z < 0.5)

= P (0 < Z < 1.5) + P (0 < Z < 0.5)

= 0.4332 + 0.1915 = 0.6247

(ii) P (| X − 3| < 4) = P (−4 < (X − 3) < 4) = P (−1 < X < 7)

When X = −1, Z = (−1 – 3)/2 = −4/2 = −2

When X = 7, Z = (7 – 3)/2 = 4/2 = 2

P (−1 < X < 7) = P (−2 < Z < 2)

= P (0 < Z < 2) + P (0 < Z < 2)

= 2(0.4772) = 0.9544
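
The table look-ups in this example can be cross-checked without a printed table. The sketch below builds the standard normal cumulative distribution from `math.erf`; the helper names `phi` and `to_z` are illustrative. The second answer comes out as 0.9545 rather than 0.9544 because the table entry 0.4772 is itself rounded.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


mu, sigma = 3.0, 2.0


def to_z(x):
    """Convert X to the standard normal deviate Z = (X - mu) / sigma."""
    return (x - mu) / sigma


p1 = phi(to_z(4)) - phi(to_z(0))     # P(0 <= X <= 4)
p2 = phi(to_z(7)) - phi(to_z(-1))    # P(|X - 3| < 4)
print(round(p1, 4), round(p2, 4))    # 0.6247 0.9545
```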
EXERCISE NO. 5

PROBLEMS ON NORMAL DISTRIBUTION


-------------------------------------------------------------------------------------------------------

5.1 The average number of acres burned by forest and range fires in a large New
Mexico county is 4,300 acres per year, with a standard deviation of 750 acres. The
distribution of the number of acres burned is normal. What is the probability that
between 2,500 and 4,200 acres will be burned in any given year?

5.2 If mean of a given data for a random value is 81.1 and standard deviation is
4.7, then find the probability of getting a value more than 83.
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[There are three types of lies -- lies, damn lies, and statistics. ― Benjamin Disraeli]
------------------------------------------------------------------------------------------------------------
TEST OF SIGNIFICANCE

Test of significance: It is a kind of test which enables us to decide the opinion about the
population parameter on the basis of sample results that whether
(i) the deviation between the observed sample statistic and the hypothetical parameter value
(ii) the deviation between two sample statistics,
is significant or might be attributed to chance or the fluctuations of sampling.

Null hypothesis: It is a statement about the population parameter which is tested for possible
rejection under the assumption that it is true.
It is usually denoted by H0. Generally, it is called the no difference hypothesis, because we
hypothesize that the statistic about which the statement has been developed is
not different from the population parameter.
Alternative hypothesis: Any hypothesis complementary to the null hypothesis is called an
alternative hypothesis. It is generally denoted by H1. The alternative hypothesis may be
single tailed or two tailed. For example, if H0: μ = μ0, then
H1: (i) μ ≠ μ0 or (ii) μ > μ0 or (iii) μ < μ0
Here, alternative hypothesis (i) is a two tailed hypothesis whereas hypotheses (ii) and (iii) are
single tailed hypotheses.
The table value is read at level α in one tail for a single tailed test and at α/2 in each tail
for a two tailed test.

Errors in sampling: There are two types of errors in testing of hypothesis: Type I error and
Type II error.
Type I error: Rejecting H0 when it is true. Its probability is denoted by α. It is also known as
producer’s risk.
Type II error: Accepting H0 when it is wrong. Its probability is denoted by β. It is also known as
consumer’s risk.
Level of significance: The probability of committing a Type I error is known as the level of
significance; it is also called the size of the critical region. The levels of significance
usually employed in testing of hypothesis are 5% and 1%. The level is fixed in advance, before
collecting the sample information.

Degrees of freedom: The number of ‘degrees of freedom’ refers to the number of
independent observations in the sample (i.e. the sample size n) minus the number of
population parameters which must be estimated from sample observations.
Procedure for testing of hypothesis
1. Set up the null hypothesis, e.g. H0: The sample has been drawn from the population
which has mean μ0. Or there is no difference between sample mean x̄ and population
mean μ0, i.e. H0: μ = μ0
2. Set the alternative hypothesis. H1: (i) μ ≠ μ0 or (ii) μ > μ0 or (iii) μ < μ0
3. Set the level of significance α. This is to be decided before the sample is drawn.
4. Compute the test statistic under the null hypothesis.
5. Observe the table value, at level of significance α and respective degrees of freedom,
for the test applied.
6. Draw conclusions. Compare the computed value of Z in step 4 with the table value Zα at the
given level of significance.
If |Z| < Zα, i.e. if the calculated value of Z is less than the table value, we say it is not
significant. By this we conclude that the difference is just due to the
fluctuations of sampling and the sample does not provide us sufficient evidence
against the null hypothesis which may, therefore, be accepted.
If |Z| ≥ Zα, i.e. if the calculated value of Z is greater than or equal to the table value, we
say it is significant and the null hypothesis is rejected.
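
Steps 5 and 6 of the procedure can be sketched in Python as below. This is only an illustration under stated assumptions: the function names table_z and conclude are hypothetical, the table value is obtained from NormalDist in the standard library instead of a printed table, and for a single tailed test the sketch assumes the deviation is already in the direction stated by H1.

```python
from statistics import NormalDist

def table_z(alpha, two_tailed=True):
    """Table value of Z: alpha/2 in each tail (two tailed) or alpha in one tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def conclude(z_calculated, alpha=0.05, two_tailed=True):
    """Step 6: compare |Z| with the table value and state the conclusion."""
    if abs(z_calculated) < table_z(alpha, two_tailed):
        return "not significant: H0 may be accepted"
    return "significant: H0 is rejected"

print(round(table_z(0.05), 2), round(table_z(0.01), 2))
print(conclude(1.20), "|", conclude(2.75, alpha=0.01))
```

The printed table values correspond to the familiar two tailed points at the 5% and 1% levels.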
EXERCISE NO. 6
PROBLEMS ON LARGE SAMPLE TEST (Z-TEST)
------------------------------------------------------------------------------------------------------------
Z - TEST
It is a large sample test and can be utilized for testing the hypothesis if the following
conditions are satisfied.

(1) Data follow normal distribution


(2) Sample size should be large ( n > 30 ) or
(3) The standard deviation of population should be known if sample is not large.

The Z test statistic can be defined as "the ratio of the difference between the estimated
population mean and the hypothetical mean to the standard error of the mean, based on the
population standard deviation or its estimate from a large sample."

Standard Error: The standard deviation of the sampling distribution of a statistic is known as
its Standard Error. It forms the basis of the testing of hypothesis.

SND (Standard Normal Deviate) test for single sample: Under the null hypothesis H0
that the sample has been drawn from a population with mean μ and variance σ², i.e., there is
no difference between the sample mean x̄ and population mean μ, the test statistic (for
large samples) is
Z = (x̄ − μ) / (σ/√n)
where σ/√n is known as the standard error of the mean.
If the population standard deviation σ is unknown then we use its estimate provided by the
sample variance s²:
Z = (x̄ − μ) / (s/√n)
where s² = Σ(xi − x̄)² / (n − 1)
The steps for SND test are:
1. Compute the test statistic Z under null hypothesis H0.
2. If |Z| > 3, H0 is always rejected.
3. If |Z| ≤ 3, we test its significance at a pre-fixed level of significance. In agriculture, it is
generally 5% and sometimes 1%. Thus for two tailed test
(a) If |Z| > 1.96, H0 is rejected at 5% level of significance.
(b) If |Z| > 2.58, H0 is rejected at 1% level of significance.
(c) If |Z| ≤ 2.58, H0 is accepted at 1% level of significance.
Similarly, for one-tailed test Z is compared with 1.645 (at 5% level) and 2.33 (at 1% level)
and H0 is accepted or rejected accordingly.
Sample may be regarded as large if n > 30.
Confidence limits for μ
99% confidence limits (1% level of significance) for μ are x̄ ± 2.58 σ/√n
98% confidence limits (2% level of significance) for μ are x̄ ± 2.33 σ/√n
95% confidence limits (5% level of significance) for μ are x̄ ± 1.96 σ/√n
90% confidence limits (10% level of significance) for μ are x̄ ± 1.645 σ/√n
Example 1: An automatic packing machine was designed to pack exactly 2.0 kg of paneer.
A sample of 100 packets was examined to test the machine. The average weight was found to
be 1.94 kg with standard deviation 0.10 kg. Was the machine working properly?
Solution: Given sample size n = 100, sample mean x̄ = 1.94 kg,
Sample standard deviation s = 0.10 kg
It is required to test the hypothesis that the population mean is 2.0 kg.
H0: μ = 2.0 kg against H1: μ ≠ 2.0 kg
Since sample size is large, the sample mean is approximately normally distributed with mean
μ and S.E. σ/√n. However, since the population s.d. σ is not known, an approximate value of

S.E. is S.E. = s/√n = 0.10/√100 = 0.01 kg

Therefore, Z = (x̄ − μ) / S.E. = (1.94 − 2.0) / 0.01 = −6

Since |Z| = 6 exceeds 2.58, we reject the null hypothesis at 1% level of significance and
conclude that the machine is not functioning properly.
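
The arithmetic of Example 1 can be reproduced with a short Python sketch. The function name z_single_mean is an assumption made for illustration; the 95% confidence limits use the multiplier 1.96 given in the confidence limit formulas of this section.

```python
import math

def z_single_mean(xbar, mu0, sd, n):
    """Large sample test of a single mean: Z = (xbar - mu0) / (sd / sqrt(n))."""
    se = sd / math.sqrt(n)          # standard error of the mean
    return (xbar - mu0) / se, se

# Data of Example 1 (packing machine)
z, se = z_single_mean(xbar=1.94, mu0=2.0, sd=0.10, n=100)

# 95% confidence limits for the population mean
lower = 1.94 - 1.96 * se
upper = 1.94 + 1.96 * se
print(round(z, 2), round(lower, 4), round(upper, 4))
```

The hypothetical value 2.0 kg lies outside the computed limits, which agrees with the rejection of the null hypothesis.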
EXERCISE: PROBLEMS ON LARGE SAMPLE TEST (Z-TEST)

4.1 Average milk yield of a cow in a year is estimated as 1750 Kg. To test this,
a random sample of 100 cows was taken. The results of which are given in
the following table. Test whether average yield of cow is 1750 Kg. or not.
Also work out the confidence limits at 95%.

Milk production   No. of cows   Mid value   Di = (Xi − A)/I   fiDi   fiDi²
(kg)              (fi)          (Xi)
800-1100            5             950          -3             -15      45
1100-1400          12            1250          -2             -24      48
1400-1700          30            1550          -1             -30      30
1700-2000          26            1850           0               0       0
2000-2300          10            2150           1              10      10
2300-2600           8            2450           2              16      32
2600-2900           5            2750           3              15      45
2900-3200           4            3050           4              16      64
Total             100                                         -12     274
where A = 1850 (assumed mean) and I = 300 (class interval).
Test for means of two large samples.
Inferences about the difference μ1 − μ2 are naturally based on its estimate x̄1 − x̄2, the difference
between the sample means. When both sample sizes n1 and n2 are large (say, greater than 30), x̄1 and
x̄2 are each approximately normal and their difference x̄1 − x̄2 is approximately normal.
A test of the null hypothesis H0: μ1 = μ2, that the two population means are the same, employs the
test statistic
Z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2)

If the samples have been drawn from the population with common standard deviation σ, then under
H0, the test statistic becomes
Z = (x̄1 − x̄2) / (σ √(1/n1 + 1/n2))

The divisors √(σ1²/n1 + σ2²/n2) and σ √(1/n1 + 1/n2) are the standard error of the difference of
means x̄1 − x̄2.
If σ is not known then its estimate based on the sample variances is used. If sample sizes are not
sufficiently large then an unbiased estimate of σ² is given by
σ̂² = [(n1 − 1) s1² + (n2 − 1) s2²] / (n1 + n2 − 2)
where s1² and s2² are the sample variances.

But, since sample sizes are large, n1 − 1 ≈ n1 and n2 − 1 ≈ n2. Therefore, in practice, for large
samples,
σ̂² = (n1 s1² + n2 s2²) / (n1 + n2)

Confidence limits for difference of means μ1 − μ2

95% confidence limits (5% level of significance) are (x̄1 − x̄2) ± 1.96 S.E.(x̄1 − x̄2)
90% confidence limits (10% level of significance) are (x̄1 − x̄2) ± 1.645 S.E.(x̄1 − x̄2)
99% confidence limits (1% level of significance) are (x̄1 − x̄2) ± 2.58 S.E.(x̄1 − x̄2)
98% confidence limits (2% level of significance) are (x̄1 − x̄2) ± 2.33 S.E.(x̄1 − x̄2)
Example : The mean yield of milk from district A was 210 ltr with standard deviation 10 ltr from a
sample of 100 dairies. In district B, the mean yield was 220 ltr with standard deviation 12 ltr from a
sample of 150 dairies. Assuming that the standard deviation of yield in the entire state was 11 ltr, test
whether there is any significant difference between the mean yields of milk in the two districts.
Solution: Let μ1 and μ2 denote the mean yields of milk in districts A and B, respectively.
H0: μ1 = μ2 against H1: μ1 ≠ μ2
Given n1 = 100, n2 = 150, x̄1 = 210 ltr, x̄2 = 220 ltr, s1 = 10 ltr, s2 = 12 ltr
Population s.d. σ = 11 ltr
We may assume that the s.d. of yield in the whole state is the s.d. of yields in the two districts. That
is, the two populations have the same s.d. σ = 11 ltr.

We know Z = (x̄1 − x̄2) / (σ √(1/n1 + 1/n2)) = (210 − 220) / (11 √(1/100 + 1/150)) = −10 / 1.42 = −7.04

Since |Z| = 7.04 exceeds 2.58, we reject null hypothesis at 1% level and conclude that there is a
significant difference in the mean yields of milk in the two districts.
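
The two forms of the standard error used for two large samples can be compared with a small Python sketch. The function name z_two_means and its keyword arguments are illustrative assumptions; the data are those of the milk yield example above.

```python
import math

def z_two_means(xbar1, xbar2, n1, n2, sigma=None, s1=None, s2=None):
    """Z for the difference of two large sample means.

    If a common population s.d. `sigma` is given it is used for both samples;
    otherwise the separate sample s.d.s s1 and s2 are used.
    """
    if sigma is not None:
        se = sigma * math.sqrt(1 / n1 + 1 / n2)
    else:
        se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (xbar1 - xbar2) / se, se

# Common s.d. of 11 ltr assumed for the whole state (as in the example)
z_common, se_common = z_two_means(210, 220, 100, 150, sigma=11)
# Alternative: separate sample standard deviations of the two districts
z_separate, se_separate = z_two_means(210, 220, 100, 150, s1=10, s2=12)
print(round(z_common, 2), round(z_separate, 2))
```

Both versions give a value of |Z| well above 2.58, so the conclusion does not depend on which standard error is used here.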
4.2 A random sample of 200 villages have been taken from one district and average
population per village was found to be 495 with a standard deviation of 50.
Another random sample of 200 villages from another district gave an average
population 510 with S.D. 40. Test whether there is any difference between the
average of the two samples. Also work out the confidence limits at 95 percent.

4.3 Test Ho : μ1 = μ2 from the following and interpret the results.

X-series            Y-series
Sx̄² = 0.2           ΣYi = 300
S² = 8              ΣYi² = 3145
ΣXi² = 9312         n = 30
4.4 Head diameter of 125 sunflowers was measured and found that the average
diameter was 38 cm. Is it true to say that the sample was taken from the
population having average diameter 43.5 cm with variance 20.25 sq cm?

Some Preliminary

[Statistics are used much like a drunk uses a lamppost: for support, not illumination: Vin Scully]
------------------------------------------------------------------------------------------------------------
"t" test

When the sample is large and σ is not known, we estimate it by S, and this estimate can be
used in the Z test. But if n is small, the error from replacing σ by S is larger, and in that
situation the statistic no longer follows the normal distribution but another distribution named "t".
The "t" distribution was derived by W. S. Gosset, who published it under the pen name 'Student' in 1908.
The table value of t depends on the degrees of freedom and, for any finite degrees of freedom, is
always greater than its limiting value Z. When the d.f. is large, t --> Z. The difference between
t and Z becomes more and more marked as n becomes smaller and smaller.

Definition: It is the ratio of the deviation between the sample mean and the hypothetical mean
to the standard error of the mean estimated from the small sample.

CONDITION FOR APPLYING THE 't' TEST

1) Data follow normal distribution.


2) The sample is small (n < 30) and the standard deviation of the population is estimated
from the sample.

(1) Test of single mean : To test if the sample mean differs significantly from the
hypothetical value of the population mean.
Assumptions: (i) The parent population from which the sample is drawn is normal. (ii) The
sample observations are independent, i.e. the sample is random.
Conditions: (i) The population standard deviation is unknown. (ii) Sample size is small.
Let x1, x2, ..., xn be a random sample of size n from a normal population with mean μ and
variance σ². Then Student’s t is defined by the statistic
t = (x̄ − μ) / (S/√n)
where x̄ = Σxi/n is the sample mean and S² = Σ(xi − x̄)²/(n − 1) is an unbiased estimate of
the population variance σ², and it follows Student’s t-distribution with (n − 1) degrees of
freedom.

(2) Test of difference of two sample means


Assumptions: (i) The parent populations from which the samples have been drawn are
normally distributed. (ii) The two samples are random and independent of each other.
Conditions: (i) The population variances are equal and unknown. (ii) Sizes of the samples are
small.
Thus before applying t-test for testing the equality of means it is theoretically desirable to test
the equality of population variances by applying F-test. If the variances do not come out to be
equal then t-test becomes invalid.
Case 1: When the samples are related (Paired)
While comparing the means of two groups of observations, there could be instances where the
groups are not independent but paired. In such situations, two sets of observations come from a
single set of experimental units. The observations obtained from such pairs could be
correlated. The statistical test used for comparing means of paired samples is generally called
paired t-test. The test statistic is
t = d̄ / (S/√n)
where di = xi − yi, d̄ = Σdi/n, and S² = Σ(di − d̄)²/(n − 1);
t follows Student’s t-distribution with (n − 1) degrees of freedom.

Case 2: When the samples are independent

Step1: H0: Two samples have been drawn from the populations with the same mean. Or there is
no difference between the two sample means, i.e. H0: μ1 = μ2
Step2: Set the alternative hypothesis. H1: (i) μ1 ≠ μ2 or (ii) μ1 > μ2 or (iii) μ1 < μ2
Step3: Set the level of significance α.
Step4: The statistic t = (x̄1 − x̄2) / (S √(1/n1 + 1/n2))
where S² = [Σ(x1i − x̄1)² + Σ(x2j − x̄2)²] / (n1 + n2 − 2)
Compare the calculated t with the significant points of the t-distribution with (n1 + n2 − 2)
degrees of freedom.
Applications of t-distribution
1. To test if the sample mean differs significantly from the hypothetical value of the
population mean.
2. To test the significance of the difference between two sample means.
3. To test the significance of an observed sample correlation coefficient.
4. To test the significance of observed partial correlation coefficient.
5. To test the significance of observed multiple correlation coefficients.
6. To test the significance of the sample regression coefficient.
7. To test the significance of difference of two regression coefficients.
Example of single mean
Example : Following are the girths at breast-height (gbh) of trees attained 6 years after planting.
Tree No.    1      2      3      4      5      6      7      8
gbh (cm)    30.85  30.24  30.94  29.89  21.52  25.38  22.89  29.44
Can we say that the trees selected are taken from a population of trees having mean gbh 25.50 cm?
Solution: H0: μ = 25.50 cm against H1: μ ≠ 25.50 cm. Level of significance = 5%
Tree No.   gbh (cm) (x)   x²
1          30.85          951.7225
2          30.24          914.4576
3          30.94          957.2836
4          29.89          893.4121
5          21.52          463.1104
6          25.38          644.1444
7          22.89          523.9521
8          29.44          866.7136
Total      221.15         6214.7963
x̄ = 221.15/8 = 27.64 cm
S² = [Σx² − (Σx)²/n] / (n − 1) = (6214.7963 − 6113.4153) / 7 = 14.483, so S = 3.806 cm
t = (x̄ − μ) / (S/√n) = (27.64 − 25.50) / (3.806/√8) = 2.14 / 1.346 = 1.59
Observe table value at 7 degrees of freedom at 5% level of significance: t = 2.365.
Since calculated t (1.59) is less than the table value (2.365) at 7 degrees of freedom, it may therefore be
concluded that the trees selected are taken from the population of trees having mean gbh 25.50 cm.
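
As a cross-check of the single mean example, the t statistic can be computed in Python from the raw gbh values. This is a minimal sketch using the standard library only; the variable names are chosen for illustration, and the table value still has to be read from a t table at n − 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

# gbh (cm) of the 8 trees in the example
gbh = [30.85, 30.24, 30.94, 29.89, 21.52, 25.38, 22.89, 29.44]
mu0 = 25.50                        # hypothetical population mean

n = len(gbh)
xbar = mean(gbh)
s = stdev(gbh)                     # sample s.d. with divisor n - 1
t = (xbar - mu0) / (s / math.sqrt(n))
print(round(xbar, 3), round(s, 3), round(t, 2), "d.f. =", n - 1)
```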
Example of paired sample
Example: Following are the observations on organic carbon (%) obtained from soil core samples
drawn from two different layers of a number of soil pits in a natural forest. Compare the organic
carbon status of soil at the two soil depth levels.
Soil pit                          1     2     3     4     5     6     7     8     9     10
Layer 1 (x) organic carbon (%)    1.59  1.39  1.64  1.17  1.27  1.58  1.64  1.53  1.21  1.48
Layer 2 (y) organic carbon (%)    1.21  0.92  1.31  1.52  1.62  0.91  1.23  1.21  1.58  1.18

Solution: H0: μd = 0 against H1: μd ≠ 0. Level of significance = 5%

Soil pit   Layer 1 (x)   Layer 2 (y)   Difference d = x − y   d²
1          1.59          1.21           0.38                  0.1444
2          1.39          0.92           0.47                  0.2209
3          1.64          1.31           0.33                  0.1089
4          1.17          1.52          -0.35                  0.1225
5          1.27          1.62          -0.35                  0.1225
6          1.58          0.91           0.67                  0.4489
7          1.64          1.23           0.41                  0.1681
8          1.53          1.21           0.32                  0.1024
9          1.21          1.58          -0.37                  0.1369
10         1.48          1.18           0.30                  0.0900
Total                                   1.81                  1.6655
d̄ = 1.81/10 = 0.181
S² = [Σd² − (Σd)²/n] / (n − 1) = (1.6655 − 0.3276) / 9 = 0.1487, so S = 0.3856
t = d̄ / (S/√n) = 0.181 / (0.3856/√10) = 0.181 / 0.1219 = 1.48
Observe table value at 9 degrees of freedom at 5% level of significance: t = 2.262.
Since calculated t (1.48) is less than the table value (2.262), it may therefore be concluded that
there is no significant difference between the mean organic carbon content of the two layers of soil.
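
The paired t statistic of this example can be verified with the following Python sketch, which works on the differences d = x − y exactly as in the table. The variable names are illustrative assumptions; only the standard library is used.

```python
import math
from statistics import mean, stdev

# Organic carbon (%) in the two soil layers of the 10 pits
layer1 = [1.59, 1.39, 1.64, 1.17, 1.27, 1.58, 1.64, 1.53, 1.21, 1.48]
layer2 = [1.21, 0.92, 1.31, 1.52, 1.62, 0.91, 1.23, 1.21, 1.58, 1.18]

d = [x - y for x, y in zip(layer1, layer2)]   # paired differences
n = len(d)
d_bar = mean(d)
se = stdev(d) / math.sqrt(n)                  # S.E. of the mean difference
t = d_bar / se
print(round(d_bar, 3), round(se, 4), round(t, 2), "d.f. =", n - 1)
```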
Example on independent samples
Example : A farmer has introduced a new variety of wheat in ten fields along with his
standard variety in 14 similar fields, so that his full sets of yields (q/ac) are
New variety        25 21 24 20 26 22 23 26 27 22
Standard variety   24 21 19 21 21 17 23 20 17 22 18 20 18 19
Test whether the new variety differs in yield from the standard variety.
Solution: H0: There is no difference between the two means, i.e. μ1 = μ2
H1: μ1 ≠ μ2
We have n1 = 10, n2 = 14, x̄1 = 236/10 = 23.6, x̄2 = 280/14 = 20.0,
Σ(x1i − x̄1)² = 50.4, Σ(x2j − x̄2)² = 60.0

Hence, pooled S² = (50.4 + 60.0) / (10 + 14 − 2) = 5.018, so S = 2.240
Statistic t = (23.6 − 20.0) / (2.240 √(1/10 + 1/14)) = 3.6 / 0.9275 = 3.88

Comparison of this value with the tabulated value of t-distribution with 10 + 14 − 2 = 22
degrees of freedom shows that the observed value is greater than the 1% significance point
(2.819). Hence the observed difference is significant at 1% level.
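
The pooled variance and the t statistic for two independent samples can be sketched in Python as follows. The function name pooled_t is hypothetical; it follows the formula of Case 2 (common variance assumed) and is applied to the wheat yields of the example above.

```python
import math
from statistics import mean

def pooled_t(x, y):
    """t for two independent small samples with a common (pooled) variance."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    ss1 = sum((v - m1) ** 2 for v in x)        # sum of squares, sample 1
    ss2 = sum((v - m2) ** 2 for v in y)        # sum of squares, sample 2
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, n1 + n2 - 2

new_variety = [25, 21, 24, 20, 26, 22, 23, 26, 27, 22]
standard = [24, 21, 19, 21, 21, 17, 23, 20, 17, 22, 18, 20, 18, 19]

t, pooled_var, df = pooled_t(new_variety, standard)
print(round(t, 2), round(pooled_var, 3), "d.f. =", df)
```

The same function can be reused for the other independent sample problems in this section by changing the two lists.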
Example : A yield trial on onions with two methods of propagation was conducted in 15 localities.
The following are the yields (gms) from the two methods. Test whether the methods of propagation
were significantly different.
Locality 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Method of propagation I
28 14 30 70 36 35 60 22 50 99 53 72 47 68 24
Method of propagation II
20 28 22 23 25 22 31 22 26 23 27 31 32 26 20

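Solution: H0: There is no difference between the two means, i.e. μ1 = μ2
H1: μ1 ≠ μ2
We have n1 = n2 = 15, x̄1 = 708/15 = 47.2, x̄2 = 378/15 = 25.2,
Σ(x1i − x̄1)² = 7670.4, Σ(x2j − x̄2)² = 220.4

Hence, pooled S² = (7670.4 + 220.4) / (15 + 15 − 2) = 281.81, so S = 16.79

Statistic t = (47.2 − 25.2) / (16.79 √(1/15 + 1/15)) = 22 / 6.13 = 3.59. Comparison of this value (3.59) with
the tabulated value of t-distribution with 15 + 15 − 2 = 28 degrees of freedom shows that the
observed value is greater than the 1% significance point (2.763). Hence the observed difference is
significant at 1% level.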
EXERCISE NO. 7
PROBLEMS ON SMALL SAMPLE TEST (t-test)
------------------------------------------------------------------------------------------------------------

5.1 Ten plants are chosen at random from a population; their heights in inches are
given below.
62, 65, 67, 71, 74, 75, 77, 78, 80, 81.
In the light of the above data, discuss the suggestion that the mean height of plants
in the population is 70 inches. Also work out the confidence limits at 95%.

Plant height (Xi)   (Xi − X̄)   (Xi − X̄)² = S.S.
62                  -11         121
65                   -8          64
67                   -6          36
71                   -2           4
74                    1           1
75                    2           4
77                    4          16
78                    5          25
80                    7          49
81                    8          64
Total 730             0         384
5.2 An experiment was conducted in different villages of Bharuch Taluka to test
whether the improved variety of tobacco gives higher yield as compared to the local
variety or not. The results are given as under. Test whether the improved variety of
tobacco is superior to the local variety or not. Also work out the confidence limits at
95 percent.

Sr. No.   Local (Xi)   Improved (Yi)   (Xi − X̄)   (Xi − X̄)²   (Yi − Ȳ)   (Yi − Ȳ)²
1         18.0         19.4             0.4        0.16         0.4        0.16
2         16.5         18.3            -1.1        1.21        -0.7        0.49
3         20.5         22.0             2.9        8.41         3.0        9.00
4         19.7         21.8             2.1        4.41         2.8        7.84
5         15.3         16.0            -2.3        5.29        -3.0        9.00
6         16.2         18.3            -1.4        1.96        -0.7        0.49
7         17.3         18.0            -0.3        0.09        -1.0        1.00
8         18.0         19.5             0.4        0.16         0.5        0.25
9         19.5         20.3             1.9        3.61         1.3        1.69
10        15.0         16.4            -2.6        6.76        -2.6        6.76
Total     176          190                        32.06                   36.68
5.3 The following table gives the weight of chicks in two series A & B.

A   9   17   14   13   15   10   11   13
B   8   15   11   10   13    9    -    -
Examine whether the two series differ significantly in mean weight.
A (X)   (Xi − X̄)   (Xi − X̄)²   B (Y)   (Yi − Ȳ)   (Yi − Ȳ)²
9       -3.75      14.0625      8       -3         9
17       4.25      18.0625      15       4         16
14       1.25       1.5625      11       0         0
13       0.25       0.0625      10      -1         1
15       2.25       5.0625      13       2         4
10      -2.75       7.5625      9       -2         4
11      -1.75       3.0625      -        -         -
13       0.25       0.0625      -        -         -
Total 102   0      49.50        66       0         34
5.4 To test the effectiveness of a new drug, the blood pressure of 12 patients was
recorded before and after the drug administration. The data are given in the table.
Also work out the confidence limits at 95 percent.

Sr. No.   Before (X)   After (Y)   di = Y − X   di²
1         100          105          5           25
2         102          104          2            4
3         120          128          8           64
4         130          129         -1            1
5         135          132         -3            9
6         140          140          0            0
7         125          123         -2            4
8         120          121          1            1
9         130          135          5           25
10        128          128          0            0
11        132          136          4           16
12        104          109          5           25
Total                               24          174

(i) Set the null hypothesis : Ho : μd = 0 ; Ha : μd ≠ 0
(ii) d̄ = Σdi / n = 24/12 = 2
(iii) S.E. of d̄ = √{ [Σdi² − (Σdi)²/n] / [n (n − 1)] }
5.6 : Crown diameters (m) have been reduced by cutting 10% of the basal branches of trees. Test
whether the crown diameter is significantly reduced.
Tree No.                                           1    2    3    4    5     6     7     8     9     10
Crown diameter (m), before                         5.9  8.3  8.5  9.3  10.6  11.4  11.9  12.3  12.6  13.0
Crown diameter (m), after cut of basal branches    5.9  8.3  8.3  9.1  10.2  11.3  11.6  12.2  12.2  12.7

Solution:

Some Preliminary
---------------------------------------------------------------------------------------------------------------
t and Z tests are used for comparing two population means. When the populations are to
be compared with respect to their variances, the F test is used.

F-test: The F-statistic is the ratio of two independent unbiased estimators of the population
variance. F-test is also known as variance ratio test.

Let s1² be the sample variance of a random sample of size n1 drawn from a normal population
with variance σ1² and let s2² be the sample variance of another independent random sample of
size n2 drawn from a normal population with variance σ2². We are interested to test the null
hypothesis H0: σ1² = σ2².
If the test statistic

F = s1²/s2² > Fα (n1 − 1, n2 − 1) [If s1² > s2²] or F = s2²/s1² > Fα (n2 − 1, n1 − 1) [If s2² > s1²]
then we reject H0 in favour of H1 at α level of significance; otherwise not.
Assumption for F-test
(i) Both samples drawn are simple random samples. (ii) Parent populations of both samples
are normal.
Uses of F-test
1. F-test for equality of two population variances.
2. F-test for the significance of an observed multiple correlation coefficient.
3. F-test for significance of an observed correlation ratio.
4. F-test for testing the linearity of regression.
5. F-test for equality of several means.
6. To test the significance of observed partial correlation coefficient.

Example : The standard deviations calculated from two random samples of sizes 9 and 13 are
2.23 and 1.87 respectively. May the samples be regarded as drawn from normal populations
with the same standard deviation? Given that F0.05 (8, 12) = 2.85.
Solution: Here n1 = 9, n2 = 13, s1 = 2.23 and s2 = 1.87
H0: σ1² = σ2²
F = s1²/s2² = (2.23)²/(1.87)² = 4.9729/3.4969 = 1.42
Degrees of freedom are (8, 12)

Since the observed value of F (=1.42) is less than the 5% tabulated value (2.85)
corresponding to d.f. (8, 12), we cannot reject the null hypothesis at 5% level of significance.
The conclusion is that the population standard deviations may be equal.
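
The variance ratio of this example can be checked with a few lines of Python. The function name variance_ratio is an illustrative assumption; it places the larger variance in the numerator, as described for the F-test above, and the table value 2.85 is the one quoted in the example.

```python
def variance_ratio(s1, n1, s2, n2):
    """F = larger variance / smaller variance, with (numerator, denominator) d.f."""
    v1, v2 = s1 ** 2, s2 ** 2
    if v1 >= v2:
        return v1 / v2, (n1 - 1, n2 - 1)
    return v2 / v1, (n2 - 1, n1 - 1)

f, df = variance_ratio(s1=2.23, n1=9, s2=1.87, n2=13)
f_table = 2.85                      # F0.05 (8, 12), as given in the example
print(round(f, 2), df, "reject H0" if f > f_table else "cannot reject H0")
```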
EXERCISE: PROBLEMS ON F-TEST

6.1 If S1 = 1.2, S2 = 1.5, n1 = 15 and n2 = 16 then calculate the F statistic and draw your
conclusion.
6.2 Two random samples of bottle gourd are drawn from two populations and the following
lengths (cm) were obtained
Sample I 40.0 41.6 42.3 44.0 45.4 45.8 46.0 46.2 46.5 47.0
Sample II 35.6 36.2 36.8 37.1 37.4 38.4 39.6 40.0 42.5 44.0 45.2 46.0
Find the variances of two samples and test whether the two populations have the same
variance.
Given that , .
Solution:

6.3: Tree diameters (cm) at breast height recorded on two different samples of trees are
Sample     Tree diameter at breast height (cm)
Sample 1   14.8 12.0 10.5 14.2 11.8 13.6 13.8 14.5 10.0 12.2
Sample 2   10.0 10.1 9.6 9.5 10.1 11.6 14.1 12.7 12.6 8.5
Can it be said that both samples are drawn from populations of equal variances?
Solution:

Some Preliminary

[Smoking is one of the leading causes of statistics: Fletcher Knebel]

Chi-square (χ²) was introduced by Karl Pearson in the year 1900. It is calculated by

χ² = Σ (Oi − Ei)² / Ei, summed over i = 1, 2, ..., k
Where, Oi = Observed frequency of ith class
Ei = Expected frequency of ith class
k = number of classes , i = 1,2,..,k
Definition: "It is the sum of the ratio of the square of deviations obtained between observed
and expected frequency to the expected frequency of the respective class of the
frequency distribution."
1) Testing goodness of fit
When chi-square test is used to know whether the given sampling distribution is in
agreement with the theoretical or expected frequency distribution the test is known as
test of goodness of fit.
Procedure:
Step I : Set the appropriate null hypothesis.
Ho : Given sampling distribution is in the agreement with theoretical or expected
frequency distribution.
Ha : Given sampling distribution is not in agreement with theoretical or expected
frequency distribution.
Step II : Fix the level of significance.
Step III : Work out expected frequency according to given ratio or expectation.
Step IV : Calculate Chi square as
χ² = Σ (Oi − Ei)² / Ei, summed over i = 1, 2, ..., k
where, Oi = Observed frequency of ith class
Ei = Expected frequency of ith class
k = number of classes , i = 1,2,..,k
Step V : Compare cal χ² with table χ² value at 5% level of significance and (k-1) degrees
of freedom.
Step VI : If cal χ² ≤ table χ² 0.05, (k-1) d.f., the observed difference is not significant at 5% level of
significance. Ho accepted.
If cal χ² > table χ² 0.05, (k-1) d.f., the observed difference is significant at 5% level of
significance. Ho rejected.
Step VII : Conclusion : Non significance difference indicates that the given sampling
distribution is in agreement with theoretical distribution and the fit is good.
Significant difference indicates that the given sampling distribution is not
in agreement with theoretical distribution and the fit is poor.
2) Test of Independence
Another common use of the chi square test is in testing independence of
classifications.
Independence: The two attributes A and B are said to be independent to each other if
the proportion of A's among B's is the same as that in not - B's.
Contingency table : When the individuals in a sample have two characters or attributes
and a frequency distribution is made classifying them according to both so as to show the
relation between the characters, the resulted table is termed as contingency table.
Yates' correction
In order to avoid irregularities caused by small frequencies in a 2 x 2
contingency table, a correction for continuity known as Yates' correction is to be applied.
When the product of the principal diagonal (ad) is greater than that of the off diagonal (bc), i.e. ad > bc,
then 1/2 is to be subtracted from the principal diagonal cell frequencies and 1/2
is to be added to the off diagonal cell frequencies so that the marginal totals remain unchanged.
Similarly, if bc > ad, then 1/2 is to be added to the principal diagonal and 1/2 is to be
subtracted from the off diagonal cell frequencies.
If ad > bc,

χ² = (ad − bc − N/2)² × N / (R1 × R2 × C1 × C2)

and in general, χ² = (|ad − bc| − N/2)² × N / (R1 × R2 × C1 × C2)

Procedure for test of Independence of attribute in case of 2 x 2 contingency table:

Step I : Set the appropriate null hypothesis.


Ho : The two classifications of the group of individuals are independent of each other.
Ha : The two classifications of the group of individuals are not independent of each
other.
Step II : Fix the level of significance.
Step III : Let a group of individuals be classified in two ways (A and B); the results of the
classification can be set out in the following table.

Class A1 A2 Total
B1 a b R1
B2 c d R2
Total C1 C2 N

Step IV : Calculate Chi square as

χ² = (ad − bc)² × N / (R1 × R2 × C1 × C2)
Where, a, b, c and d are the observed frequencies of the respective cells; R1, R2,
C1 and C2 are the row and column totals. N is the grand total.
Step V : Compare calculated χ² with table χ² value at 5% level of significance and (r-1)(c-1)
degrees of freedom.
Step VI : If cal χ² ≤ table χ² 0.05, (r-1)(c-1) d.f., the observed difference is not significant at 5%
level of significance. Ho accepted.
If cal χ² > table χ² 0.05, (r-1)(c-1) d.f., the observed difference is significant at 5% level of
significance. Ho rejected.
Step VII : Acceptance of Ho means the two characters are independent of each other.
Rejection of Ho means the two characters are not independent of each other.
Yates’ Correction: If the expected cell frequencies are small (say less than 5), the combining
of classes becomes meaningless in a 2 x 2 table as it will give zero degrees of freedom; in that case add
0.5 to the cell frequency which is less than 5 and then adjust the other cell frequencies so that the
observed marginal totals are unchanged. The adding of 0.5 to the minimum frequency of a
contingency table is called Yates’ Correction. Or calculate the χ² by the following
modified formula:
χ² = (|ad − bc| − N/2)² × N / (R1 × R2 × C1 × C2)
Procedure for test of Independence of attribute in case of r x c contingency table :
Step I : Set the appropriate null hypothesis.
Ho : The given classification of group of individuals is independent to each other.
Ha : The given classification of group of individuals is not independent to each other.
Step II : Fix the level of significance.
Step III : Let a group of individuals be classified in two ways in 'r' rows and ‘c’ columns as in the
following table.
Class    I     II    III   ...   Total
1st      a11   a12   a13   ...   R1
2nd      a21   a22   a23   ...   R2
:        :     :     :     aij   :
Total    C1    C2    C3    ...   N

Step IV : Work out expected frequency of each cell as follows.

E(a11) = R1C1 / N        E(a12) = R1C2 / N

E(a21) = R2C1 / N        E(a22) = R2C2 / N

In general, E(aij) = RiCj / N
Step V : Calculate Chi square as
χ² = Σ (Oij − Eij)² / Eij, summed over all cells (i, j)

where, Oij = Observed frequency of ith row and jth column
Eij = Expected frequency of ith row and jth column

Step VI : Compare calculated χ² with table χ² value at 5% level of significance and (r-1)(c-1) degrees
of freedom.
Step VII : If cal χ² ≤ table χ² 0.05, (r-1)(c-1) d.f., the observed difference is not significant at 5% level
of significance. Ho accepted.
If cal χ² > table χ² 0.05, (r-1)(c-1) d.f., the observed difference is significant at 5% level of
significance. Ho rejected.

Step VIII : Acceptance of Ho means the two characters are independent of each other. Rejection of
Ho means the two characters are not independent of each other.
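
Steps IV and V for an r x c table can be sketched in Python as shown below. The function name chi_square_independence is an assumption made for illustration, and the 2 x 3 table of counts is hypothetical data invented only to demonstrate the calculation; it does not come from the manual.

```python
def chi_square_independence(observed):
    """Chi square and d.f. for an r x c contingency table (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n     # E(aij) = Ri * Cj / N
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical 2 x 3 table of observed frequencies (illustration only)
counts = [[20, 30, 50],
          [30, 30, 40]]
chi2, df = chi_square_independence(counts)
print(round(chi2, 4), "d.f. =", df)
```

The calculated value is then compared with the table value of χ² at (r-1)(c-1) degrees of freedom, as in Steps VI and VII.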
Example :In an orchard of 1750 trees, a record was taken of the number of shaded and un-
shaded trees, and in each of these classes the frequency of high and low yielding trees was
noted as below:

Shaded Un-shaded
Low yielding 640 305
High yielding 510 295
Test whether shading on the tree has any effect on its yielding capacity?
Solution: H0 : The shading on the tree has no effect on its yielding capacity
H1 : The yielding capacity of the tree is affected by shading
                Shaded        Un-shaded     Total
Low yielding    640 (a)       305 (b)       945 (a+b)
High yielding   510 (c)       295 (d)       805 (c+d)
Total           1150 (a+c)    600 (b+d)     1750 (N)

χ² = (ad − bc)² × N / (R1 × R2 × C1 × C2)
   = (640 × 295 − 305 × 510)² × 1750 / (945 × 805 × 1150 × 600) = 3.69

Table value of χ² at 1 d.f. at 5% level of significance is 3.841.

Comparing calculated χ² with table value, since calculated χ² (3.69) < 3.841 we accept the null
hypothesis and conclude that the shading has no effect on yielding capacity of the tree.
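
The shortcut formula for a 2 x 2 table, with and without the continuity correction, can be checked on the orchard data with the Python sketch below. The function name chi_square_2x2 and its yates flag are illustrative assumptions; the corrected version uses the modified formula (|ad − bc| − N/2)² N / (R1 R2 C1 C2).

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square for a 2 x 2 table; set yates=True for the continuity correction."""
    n = a + b + c + d
    r1, r2 = a + b, c + d            # row totals
    c1, c2 = a + c, b + d            # column totals
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return diff ** 2 * n / (r1 * r2 * c1 * c2)

# Orchard example: a = 640, b = 305, c = 510, d = 295
plain = chi_square_2x2(640, 305, 510, 295)
corrected = chi_square_2x2(640, 305, 510, 295, yates=True)
print(round(plain, 3), round(corrected, 3), "table value at 1 d.f. = 3.841")
```

With frequencies as large as these the correction changes the value only slightly, and both values stay below the table value.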
EXERCISE NO. 9
CHI-SQUARE TEST (χ²)

7.1 In a series of experiments, 1342 plants with green foliage and 1138 with yellow foliage
were observed. This was a back cross in which the theoretical ratio was 1:1. Test
whether the observed numbers of plants agree with the
theoretical ratio.
Observed value (O)   Expected value (E)   (O − E)   (O − E)²   (O − E)²/E
1342                 2480/2 = 1240         102       10404      10404/1240 = 8.39
1138                 2480/2 = 1240        -102       10404      10404/1240 = 8.39
Total 2480           2480                                       16.78

7.2: A cross between two varieties of sorghum, one giving high yield and the other a high
amount of fodder, was made. The numbers of plants in the F2 generation were observed as 79,
160, 85. Test whether this sample data is in agreement with the Mendelian ratio 1:2:1 or not.
Solution: H0 : The sample ratio is in agreement with 1:2:1 ratio.
Observed frequency (O)   Expected frequency (E)   (O − E)²/E
79                       324 × 1/4 = 81           0.0494
160                      324 × 2/4 = 162          0.0247
85                       324 × 1/4 = 81           0.1975
Total = 324              324                      0.2716
Conclusion: Calculated χ² = 0.2716 < table χ² = 5.991 with 2 d.f. at 5% level of
significance. Therefore, the null hypothesis is accepted, i.e., the plants are segregating
according to Mendelian ratio, 1:2:1 in F2 generation.
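
The goodness of fit calculation in the two exercises above follows one pattern: expected frequencies from the theoretical ratio, then the sum of (O − E)²/E. A minimal Python sketch is given below; the function name chi_square_goodness_of_fit is a hypothetical helper, not something defined in the manual.

```python
def chi_square_goodness_of_fit(observed, ratio):
    """Chi square, expected frequencies and d.f. for a given theoretical ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected, len(observed) - 1

# Exercise 7.2: sorghum cross, Mendelian ratio 1:2:1
chi2_72, expected_72, df_72 = chi_square_goodness_of_fit([79, 160, 85], (1, 2, 1))
# Exercise 7.1: back cross, theoretical ratio 1:1
chi2_71, expected_71, df_71 = chi_square_goodness_of_fit([1342, 1138], (1, 1))
print(round(chi2_72, 4), df_72, "|", round(chi2_71, 2), df_71)
```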

Some Preliminary
[Facts are stubborn, but statistics are more pliable. Mark Twain]
----------------------------------------------------------------------------------------------------------------
So far we have studied problems relating to one variable only. In practice we come across a
large number of problems involving the use of two or more than two variables.

UNIVARIATE POPULATION

A population that is characterized by a single variable is termed a univariate population, e.g.
population of height of students, weight, yield etc.

BIVARIATE POPULATION
When two variables are simultaneously studied in a single population, it is termed a bivariate
population, e.g. the height and weight of the students, rainfall and yield, the amount of fertilizer used
and the crop yield.
If two quantities vary in such a way that movements in one are accompanied by movements in
the other, these quantities are said to be correlated, e.g. price of commodities and amount
demanded, increase in rainfall up to a point and production of crop. The degree of relationship
between the variables under consideration is measured through the correlation analysis.

CORRELATION
Correlation indicates the association between two or more variables; the
analysis of the covariation of two or more variables is usually called correlation.
TYPES OF CORRELATION
Correlation is described or classified in several different ways. Three of the most important
ways of classifying correlation are:
i) Positive or negative
ii) Simple, partial and multiple
iii) Linear and non-linear
METHODS OF STUDYING CORRELATION
There are four major approaches of ascertaining whether two variables are correlated or not:
1. Scatter diagram method
2. Graphic method
3. Algebraic method: Karl Pearson’s coefficient of correlation
4. Rank method
Computational formula:
Population correlation coefficient: ρ = Cov(X, Y) / (σx × σy)

Sample correlation coefficient: r = Cov(X, Y) / (Sx × Sy) = Σxy / √(Σx² × Σy²) = SP(xy) / √(SSx × SSy)

where, Σxy = ΣXY − (ΣX)(ΣY)/n
Σx² = ΣX² − (ΣX)²/n
Σy² = ΣY² − (ΣY)²/n
PROPERTIES OF CORRELATION COEFFICIENT:

1. A change of origin does not affect the value of the correlation coefficient.
2. A change of scale does not affect the value of the correlation coefficient.
3. The value of the correlation coefficient lies between -1 and +1.
4. The correlation coefficient is unit free.
5. The geometric mean of the two regression coefficients is equal to the correlation coefficient.

Degree of correlation: The correlation coefficient is a mathematical measure that gives the degree of
correlation. Since it is a numerical measure, it may be positive or negative.
Measure of degree Magnitude of Correlation ( )
Perfect
High degree
Moderate degree
Low degree
No correlation

Correlation coefficient of combined sample


If are respectively the sizes of the first, second and combined samples.
their means, their coefficient of correlation;
their standard deviations then , the correlation coefficient of
combined samples is given by

where, , , ,
EXERCISE NO. 10

EXERCISE: PROBLEMS ON CORRELATION


----------------------------------------------------------------------------------------------------------------

Example: Following are the number of seeds per cob and their weight (g) of corn. Find the
coefficient of correlation and test its significance.
Cob No. 1 2 3 4 5 6 7 8 9 10
No. of seeds (X) 278 236 298 275 225 282 290 262 265 239
Seed weight (gm) (Y) 184 151 191 168 160 162 186 158 153 147
Solution: Let X = No. of seeds and Y = Seed weight (g), with $\bar{X} = 2650/10 = 265$, $\bar{Y} = 1660/10 = 166$, $x = X - \bar{X}$ and $y = Y - \bar{Y}$.
Cob     X      Y      x      x^2     y      y^2     xy
1       278    184    13     169     18     324     234
2       236    151    -29    841     -15    225     435
3       298    191    33     1089    25     625     825
4       275    168    10     100     2      4       20
5       225    160    -40    1600    -6     36      240
6       282    162    17     289     -4     16      -68
7       290    186    25     625     20     400     500
8       262    158    -3     9       -8     64      24
9       265    153    0      0       -13    169     0
10      239    147    -26    676     -19    361     494
Total   2650   1660   0      5398    0      2224    2704

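$$r = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} = \frac{2704}{\sqrt{5398 \times 2224}} = 0.780$$
It indicates a high degree of positive correlation.

To test the significance of correlation:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.780 \times \sqrt{8}}{\sqrt{1 - 0.609}} = 3.53$$
Table value of t at the 5% level for 8 d.f. = 2.306

Since calculated t (3.53) > table t (2.306), we conclude that the correlation coefficient is significant.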
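The computation in this example can be checked with a short script. The following is only a minimal sketch in Python (the manual itself works by hand); the function names `pearson_r` and `t_for_r` are illustrative choices, and the formulas are the corrected sums of squares and products given under "Computational formula".

```python
import math

# Data from the worked example: seeds per cob (X) and seed weight in g (Y).
X = [278, 236, 298, 275, 225, 282, 290, 262, 265, 239]
Y = [184, 151, 191, 168, 160, 162, 186, 158, 153, 147]


def pearson_r(xs, ys):
    """Pearson r from corrected sums of squares and products."""
    n = len(xs)
    sp_xy = sum(a * b for a, b in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(a * a for a in xs) - sum(xs) ** 2 / n
    ss_y = sum(b * b for b in ys) - sum(ys) ** 2 / n
    return sp_xy / math.sqrt(ss_x * ss_y)


def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r = pearson_r(X, Y)
t = t_for_r(r, len(X))
print(f"r = {r:.3f}, t = {t:.2f} on {len(X) - 2} d.f.")
```

The calculated t is then compared with the table value of t for n - 2 degrees of freedom, exactly as in the hand calculation.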
10.1: Following are the measurements of certain plant characters
Plant No. 1 2 3 4 5 6 7 8 9
Plant Height (X) (cm) 42.00 48.50 38.20 40.30 42.80 50.30 46.40 44.00 38.50
100 Seed weight (Y) (g) 130.60 146.52 138.56 152.60 139.70 156.48 142.70 144.60 130.25
Find out Karl Pearson coefficient of correlation between the characters and test its
significance.
Solution:
Exercise 10.3 Following table gives the value of soil temperature at 4 inches below ground
in degree F (X) and germination interval in days (Y) for wheat at 12 places. Find the
coefficient of correlation between the two characters and test its significance.
Place 1 2 3 4 5 6 7 8 9 10 11 12
Soil temperature (ºF) 57 42 40 38 42 45 42 44 40 46 44 43
Germination interval (days) 10 26 30 41 29 27 27 19 18 19 31 29
Solution:
REGRESSION

Definition: Regression is a study of the average relationship between two or more variables in terms of the original units of the data.
REGRESSION COEFFICIENT

Regression coefficient can be defined as the average increase or decrease in the
dependent variable for a unit change in the independent variable, i.e. it is the average rate
of change in the dependent variable per unit change in the independent variable. The population
regression coefficients are represented by $\beta_{yx}$ and $\beta_{xy}$. In practice they are
estimated from a sample drawn from the bivariate population under
consideration, and these estimates are generally represented as $b_{yx}$ and $b_{xy}$ respectively.

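Method of computation

$$\beta_{yx} = \frac{\sum (X-\mu_x)(Y-\mu_y)}{\sum (X-\mu_x)^2} = \frac{\mathrm{Cov}(X,Y)}{V(X)}$$

$$b_{yx} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (X-\bar{X})^2} = \frac{\sum XY - (\sum X)(\sum Y)/n}{\sum X^2 - (\sum X)^2/n} = \frac{\sum xy}{\sum x^2}$$

Similarly,

$$\beta_{xy} = \frac{\sum (X-\mu_x)(Y-\mu_y)}{\sum (Y-\mu_y)^2} = \frac{\mathrm{Cov}(X,Y)}{V(Y)}$$

$$b_{xy} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (Y-\bar{Y})^2} = \frac{\sum XY - (\sum X)(\sum Y)/n}{\sum Y^2 - (\sum Y)^2/n} = \frac{\sum xy}{\sum y^2}$$
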

The line indicating the mean relationship between two variables is known as regression line.
Regression coefficient is the rate of change in one variable by changing one unit in the other.
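The two regression lines are

(i) Regression line of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$ or $Y - \bar{Y} = r\,\frac{S_y}{S_x}(X - \bar{X})$

(ii) Regression line of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$ or $X - \bar{X} = r\,\frac{S_x}{S_y}(Y - \bar{Y})$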


Properties of regression line
1. Both lines of regression pass through the point $(\bar{X}, \bar{Y})$.
2. If the value of the correlation coefficient is 0, the regression lines are perpendicular to each
other. In case of perfect correlation both lines coincide and we get only one line. This
leads to the conclusion that the higher the degree of correlation between the variables, the
smaller the angle between the lines. On the other hand, if the lines of regression make a larger
angle, they indicate a poor degree of correlation between the variables.
3. The correlation between the observed and estimated values is the same as the correlation
between X and Y.

PROPERTIES OF REGRESSION COEFFICIENT

1) The geometric mean of the two regression coefficients is the correlation coefficient, i.e.
$r = \pm\sqrt{b_{yx} \cdot b_{xy}}$
a) The arithmetic mean of $b_{yx}$ and $b_{xy}$ is equal to or greater than the correlation
coefficient, i.e. $\frac{b_{yx} + b_{xy}}{2} \ge r$
b) If one regression coefficient is greater than unity, then the other regression
coefficient must be less than unity.
2) The regression coefficient is independent of a change of origin but not of scale.
3) The regression coefficient lies between $-\infty$ and $+\infty$.
4) The regression coefficient has units.
5) The regression coefficient expresses a one-way relationship.

USES OF REGRESSION:

1) To predict the value of Y for a given value of X with the help of regression equation.
2) To know the rate of change in Y for a unit change in X with the help of regression
coefficient.
Relations among r, byx, bxy, Sx and Sy

(i) $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$
(ii) $b_{yx} = r\,\frac{S_y}{S_x}$
(iii) $b_{xy} = r\,\frac{S_x}{S_y}$
DIFFERENCES BETWEEN CORRELATION AND REGRESSION

    CORRELATION                                        REGRESSION
1   It deals with mutual association.                  It deals with cause and effect relationship.
2   It is a two-way relationship.                      It is a one-way relationship.
3   The correlation coefficient is unit free.          The regression coefficient is in the units of the dependent variable.
4   The correlation coefficient lies between           The regression coefficient lies between
    -1 and +1.                                         $-\infty$ and $+\infty$.
5   For a given value of one variable, the other       For a given value of the independent variable, the value
    variable cannot be predicted.                      of the dependent variable can be predicted.
EXERCISE: PROBLEMS ON REGRESSION

10.4 The following table shows the yield of straw (Y) and yield of grain (X) in Kg.
from plots of 10 x 10 m.
Calculate the regression coefficient of Y on X and X on Y. Estimate the grain yield
for plot giving 24.5 Kg. straw. What will be the correlation coefficient value in
between straw and grain yields?
Grain(X) 59 65 62 68 59 67 65 68 67 66 69 65
Straw(Y) 19 27 21 21 20 22 28 23 26 27 22 20

Sr. no.   Grain (X)   x = (X - X̄)   x^2   Straw (Y)   y = (Y - Ȳ)   y^2   xy
1 59 -6 36 19 -4 16 24
2 65 0 0 27 4 16 0
3 62 -3 9 21 -2 4 6
4 68 3 9 21 -2 4 -6
5 59 -6 36 20 -3 9 18
6 67 2 4 22 -1 1 -2
7 65 0 0 28 5 25 0
8 68 3 9 23 0 0 0
9 67 2 4 26 3 9 6
10 66 1 1 27 4 16 4
11 69 4 16 22 -1 1 -4
12 65 0 0 20 -3 9 0
Total 780 0 124 276 0 110 46
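Once the column totals above are available, the regression coefficients, the correlation coefficient and the requested prediction follow directly from the formulas in "Method of computation". The sketch below, in Python, is only an illustration under that assumption; the helper name `grain_from_straw` is hypothetical, and the line of X on Y is used because grain (X) is being estimated from straw (Y).

```python
import math

# Column totals taken from the table above (n = 12 plots).
n = 12
sum_X, sum_Y = 780, 276          # totals of grain (X) and straw (Y)
ss_x, ss_y, sp_xy = 124, 110, 46  # sums of x^2, y^2 and xy

x_bar, y_bar = sum_X / n, sum_Y / n
b_yx = sp_xy / ss_x              # regression coefficient of Y on X
b_xy = sp_xy / ss_y              # regression coefficient of X on Y
# r is the geometric mean of the two coefficients, with their common sign.
r = math.copysign(math.sqrt(b_yx * b_xy), sp_xy)


def grain_from_straw(straw):
    """Estimate grain yield from straw yield using the line of X on Y."""
    return x_bar + b_xy * (straw - y_bar)


print(round(b_yx, 3), round(b_xy, 3), round(r, 3))
print(round(grain_from_straw(24.5), 2))
```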
10.5 From the observation of age (X) and the mean blood pressure (Y) following
quantities were calculated. Work out the regression coefficient of Y on X and
estimate the mean blood pressure for a person having the age of 35 years.
(1) $\bar{X} = 55$ (2) $\bar{Y} = 140$ (3) $\sum X_i^2 = 61500$
(4) $\sum y_i^2 = 1936$ (5) $\sum Y_i^2 = 393936$ (6) $\sum xy = 1380$
10.6 Work out correlation coefficient from the following & indicate the nature of them.
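(1) $\sum X_i = 40$, $\sum X_iY_i = 207.2$, $\sum X_i^2 = 170$, $\sum Y_i^2 = 264.4$, $b_{yx} = 0.72$, n = 10

(2) $\bar{X} = 5$, $\bar{Y} = 8$, $\sum X_iY_i = 407.5$, $\sum X_i^2 = 259$, $\sum Y_i^2 = 665$, n = 10

(3) $\bar{Y} = 10$, $b_{yx} = 4/3$, $\sum X_i^2 = 2314$, $\sum X_iY_i = 1548$, $\sum Y_i^2 = 1036$, n = 10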
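Example: The equations of two regression lines are 4X + 3Y + 7 = 0 and 3X + 4Y + 8 = 0.
Find
(i) the means of X and Y
(ii) the regression coefficients and the correlation coefficient between X and Y
Solution: The given equations are 4X + 3Y + 7 = 0 and 3X + 4Y + 8 = 0.
(i) Both lines pass through the point of means $(\bar{X}, \bar{Y})$, therefore this point satisfies
both equations.
Thus $4\bar{X} + 3\bar{Y} + 7 = 0$ …(1)
$3\bar{X} + 4\bar{Y} + 8 = 0$ …(2)
Multiplying (1) by 4 and (2) by 3, we get
$16\bar{X} + 12\bar{Y} + 28 = 0$ …(3)
$9\bar{X} + 12\bar{Y} + 24 = 0$ …(4)
Subtracting (4) from (3): $7\bar{X} + 4 = 0$, so $\bar{X} = -4/7$
Substituting the value of $\bar{X}$ in (1) we get $3\bar{Y} = -7 + 16/7 = -33/7$, so $\bar{Y} = -11/7$

(ii) Let 4X + 3Y + 7 = 0 be the equation of Y on X.
Thus $Y = -\frac{4}{3}X - \frac{7}{3}$, so $b_{yx} = -4/3$
and 3X + 4Y + 8 = 0 is the equation of X on Y.
Thus $X = -\frac{4}{3}Y - \frac{8}{3}$, so $b_{xy} = -4/3$
Now $r^2 = b_{yx} \cdot b_{xy} = 16/9$,
which is greater than 1, therefore our assumption was wrong and
4X + 3Y + 7 = 0 is the equation of X on Y.
Thus $X = -\frac{3}{4}Y - \frac{7}{4}$, so $b_{xy} = -3/4$
and 3X + 4Y + 8 = 0 is the equation of Y on X.
Thus $Y = -\frac{3}{4}X - 2$, so $b_{yx} = -3/4$
Now $r^2 = b_{yx} \cdot b_{xy} = 9/16$,
which is less than 1. Therefore $b_{yx} = -3/4$ and $b_{xy} = -3/4$.
(iii) Correlation coefficient between X and Y: $r = \pm\sqrt{b_{yx} \cdot b_{xy}} = \pm\sqrt{9/16} = \pm 3/4$

Since the correlation coefficient and the regression coefficients have the same sign, the
correlation coefficient between X and Y is $r = -0.75$.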
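The trial-and-error step in this example (assume which line is Y on X, then check that the product of the regression coefficients does not exceed 1) can be sketched in code. This is only an illustrative Python sketch, not part of the manual's method; the function names are hypothetical, and each line is stored as the coefficients (a, b, c) of aX + bY + c = 0.

```python
import math
from fractions import Fraction as F

# Each line is a*X + b*Y + c = 0, stored as (a, b, c).
line1 = (F(4), F(3), F(7))
line2 = (F(3), F(4), F(8))


def means(l1, l2):
    """Point of intersection of the two lines, i.e. (mean of X, mean of Y)."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    return (b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det


def coefficients(y_on_x, x_on_y):
    """Slopes when the first line is read as Y on X and the second as X on Y."""
    a, b, _ = y_on_x
    b_yx = -a / b            # Y = -(a/b) X - c/b
    a, b, _ = x_on_y
    b_xy = -b / a            # X = -(b/a) Y - c/a
    return b_yx, b_xy


def assign(l1, l2):
    """Try both assignments; keep the one whose product lies between 0 and 1."""
    for y_on_x, x_on_y in ((l1, l2), (l2, l1)):
        b_yx, b_xy = coefficients(y_on_x, x_on_y)
        if 1 >= b_yx * b_xy >= 0:
            return b_yx, b_xy
    raise ValueError("no valid assignment of the two lines")


x_bar, y_bar = means(line1, line2)
b_yx, b_xy = assign(line1, line2)
# r takes the common sign of the two regression coefficients.
r = math.copysign(math.sqrt(float(b_yx * b_xy)), float(b_yx))
print(x_bar, y_bar, b_yx, b_xy, r)
```

Exact fractions are used so that the means come out as -4/7 and -11/7 rather than rounded decimals.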
Exercise 10.7: The following data are for the amount of water supplied in inches
and the yield of alfalfa in tons per acre
Water 12 18 24 30 36 42 48
Yield 5.3 5.7 6.5 7.2 8.2 9.7 8.4
(i) Find the regression of yield on water.
(ii) Assuming that the relation between the two is linear, calculate the expected
yield when the amount of water supplied is 20 inches.

Solution:

Exercise 10.8: Two lines of regression are given by X  0.268 Y  26.73


and Y  0.65 X  21.64 . It is also given that . Calculate the mean values
of X and Y, and correlation coefficient between X and Y.
Solution
DEPARTMENT OF AGRICULTURAL STATISTICS
COLLEGE OF AGRICULTURE, BHARUCH

Some Preliminary
[All statistics have outliers.― Nenia Campbell, Terrorscape]
---------------------------------------------------------------------------------------------------
The Karl Pearson’s method is based on the assumption that the population being studied
is normally distributed. When it is known that the population is not normal, or when the
shape of the distribution is not known there is a need for a measure of correlation that
involves no assumption about the parameters of the population.

This method was developed by Charles Spearman in 1904. This measure is especially
useful when quantitative measures for certain factors cannot be fixed, e.g. (i) correlation
between the marks obtained in two different subjects by the same group of students; (ii)
correlation between the height and weight of students, which can be worked out without making exact
measurements. We first arrange the students in order of height and assign ranks; the same procedure is
then used for weight. When two or more items are of equal
magnitude, each is given the average of the ranks they would otherwise occupy.

$$R = 1 - \frac{6\left[\sum d_i^2 + \frac{1}{12}\sum (P^3 - P)\right]}{n(n^2 - 1)} \quad \text{or} \quad R = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
where $d_i^2$ = square of the difference of ranks
n = number of pairs
P = number of items sharing a common (tied) rank

Example: Two judges in a rose exhibition-cum-competition rank the 10 competitors in the
following order

Judge 1   4   3   6   1   2   7   9    8   10   5
Judge 2   1   6   4   7   5   8   10   9   3    2
Do the two judges appear to agree in their judgments?
Solution:
Flower set No.            1    2    3    4    5    6    7    8    9    10   Total
Judge 1                   4    3    6    1    2    7    9    8    10   5
Judge 2                   1    6    4    7    5    8    10   9    3    2
Rank difference (d_i)     3    -3   2    -6   -3   -1   -1   -1   7    3    0
d_i^2                     9    9    4    36   9    1    1    1    49   9    128
Here n = 10 and $\sum d_i^2 = 128$, so
$$R = 1 - \frac{6\sum d_i^2}{n(n^2-1)} = 1 - \frac{6 \times 128}{10 \times 99} = 1 - 0.776 = 0.224$$
The two judges appear to agree only to a low degree.
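The rank correlation for the two judges can be verified with a few lines of code. The sketch below is a minimal Python illustration of the formula without ties; the function name `spearman_from_ranks` is an illustrative choice.

```python
# Ranks given by the two judges to the 10 flower sets.
judge1 = [4, 3, 6, 1, 2, 7, 9, 8, 10, 5]
judge2 = [1, 6, 4, 7, 5, 8, 10, 9, 3, 2]


def spearman_from_ranks(r1, r2):
    """Spearman's R = 1 - 6 * sum(d^2) / (n (n^2 - 1)), valid when no ranks are tied."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


R = spearman_from_ranks(judge1, judge2)
print(round(R, 3))
```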


EXERCISE: PROBLEMS ON RANK CORRELATION

11.1 Calculate the rank correlation from the following data.


Students Marks in Rank
Agronomy Statistics Agronomy Statistics
1 75 25
2 40 42
3 52 35
4 65 29
5 60 33
11.2 Marks of 10 students in Maths and Statistics are as under. Calculate Spearman
Rank Correlation.

Students   Marks in Maths   Marks in Statistics   Rank (Maths)   Rank (Statistics)   di   di^2
1 10 5 1 7 -6 36
2 8 3 3.5 10 -6.5 42.25
3 3 6 9 5 4 16
4 6 6 6 5 1 1
5 8 4 3.5 8.5 -5 25
6 4 8 8 2 6 36
7 2 4 10 8.5 1.5 2.25
8 5 6 7 5 2 4
9 9 7 2 3 -1 1
10 7 9 5 1 4 16
179.5

$$R = 1 - \frac{6\left[\sum d_i^2 + \frac{1}{12}\sum (P^3 - P)\right]}{n(n^2 - 1)}$$
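When marks are tied, the ranks are averaged and the correction term $\frac{1}{12}\sum(P^3 - P)$ is added for every group of P tied items. The following Python sketch illustrates this for the Maths and Statistics marks of Exercise 11.2; the function names are hypothetical, and rank 1 is assumed to go to the highest mark, as in the table.

```python
from collections import Counter

maths = [10, 8, 3, 6, 8, 4, 2, 5, 9, 7]
stats = [5, 3, 6, 6, 4, 8, 4, 6, 7, 9]


def average_ranks(marks):
    """Rank 1 = highest mark; tied marks share the average of their ranks."""
    positions = {}
    for pos, mark in enumerate(sorted(marks, reverse=True), start=1):
        positions.setdefault(mark, []).append(pos)
    return [sum(positions[m]) / len(positions[m]) for m in marks]


def tie_correction(marks):
    """Sum of (P^3 - P) / 12 over every group of P tied items."""
    return sum((p ** 3 - p) / 12 for p in Counter(marks).values() if p > 1)


def spearman_with_ties(a, b):
    """Return (sum of d^2, tie-corrected R)."""
    n = len(a)
    ra, rb = average_ranks(a), average_ranks(b)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    corrected = sum_d2 + tie_correction(a) + tie_correction(b)
    return sum_d2, 1 - 6 * corrected / (n * (n * n - 1))


sum_d2, R = spearman_with_ties(maths, stats)
print(sum_d2, round(R, 3))
```

The sum of squared rank differences printed by the script can be compared with the total of the di^2 column in the table.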


Some Preliminary

[“Statistics is the main of all inaccurate studies” ― Edmond Goncourt (de), Jules de Goncourt]
---------------------------------------------------------------------------------------------------

Design of Experiments refers to a plan for assigning subjects to experimental conditions and
the statistical analysis associated with the plan.
Experimental unit: The basic units for which response measurements are collected are
called experimental units or subjects.
Factors: Distinct types of conditions that are manipulated on the experimental units are called
factors.
Factor levels: The different modes of presence of a factor are called factor levels.
Treatment: Each specific combination of the levels of different factors is called a treatment.
Analysis of Variance (ANOVA): The analysis of variance technique divides the total
variance of a set of data into component parts. Each component part has its own source of
variation which the ANOVA procedure will identify and locate. Additionally, the magnitude
of contribution of each source of variation is delimited by this procedure.
Assumptions for validity of ANOVA
(i) The samples drawn are independent of each other and random.
(ii) Parent population from which observations are taken is normal
(iii) Various treatment and environmental effects are additive in nature.
There are three basic principles of experimental design:
(a) Randomization
Random allotment of treatments to different plots is known as randomization.
(b) Replication
Repetition of treatments under test in an experiment is known as replication.
(c) Local control

Randomization: To obtain a better estimate of a treatment mean, the treatment should be exposed to all
types of soil variation existing in the experimental area. This helps to give a real comparison
between treatments. Randomization gives every treatment an equal chance of being allotted to
any plot, and hence an equal chance to show its performance. The statistical procedure employed for
comparing the means of different treatments is valid only when the treatments are
allotted at random.
The method of randomization avoids personal bias in the allotment of
treatments. This is necessary for the validity of the use of standard error. In short,
randomization helps to make an unbiased estimate of (a) treatment means and (b)
experimental error.
Replications: The number of experimental units on which a particular treatment is applied is
called the number of replications of that treatment.
The purpose of replication is to obtain more information (more degrees of freedom) for
estimating and assessing the experimental error and to obtain estimates of effects with
smaller standard errors.
Variation in soil fertility cannot be avoided owing to its unpredictable nature. The
experimenter, therefore, seeks to average out its influence over different treatments by
repetition. If a treatment is repeated n times, the mean of these repetitions will be subject to a
standard error of $\sigma/\sqrt{n}$, where σ = standard deviation of an individual plot estimated from the
experiment. This means that as n increases the standard error of a treatment mean goes on decreasing and smaller
differences between the treatments can be brought out. If the area is limited, it is better to
increase the replications than the plot size. Thus, replications are necessary for
(i) estimating the experimental error,
(ii) increasing the precision of treatment means, and
(iii) increasing the sensitiveness of the test of significance by decreasing the standard error.
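The effect of replication on precision can be illustrated numerically. In the Python sketch below the plot standard deviation `sigma = 2.0` is a purely hypothetical value chosen for illustration; the point is only that the standard error of a treatment mean, $\sigma/\sqrt{n}$, halves each time the number of replications is quadrupled.

```python
import math


def se_of_mean(sigma, n):
    """Standard error of a treatment mean based on n replications."""
    return sigma / math.sqrt(n)


sigma = 2.0  # hypothetical standard deviation of an individual plot
for n in (2, 4, 8, 16):
    print(n, round(se_of_mean(sigma, n), 3))
```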
Local control: Local control means control of all factors except the ones which
we are investigating. Variations do occur if the experimenter is careless during the conduct
of the experiment and does not carry out the operations in all the plots on the same day or
part of the day, e.g. in varietal or manurial experiments, operations like sowing, weeding or
interculturing have to be completed on the same day. Weeding done in some plots on one
day and in the remaining plots on the next day is likely to cause variation, particularly when
it is preceded or followed by rain, cloudy weather or drought conditions. If the experimental
area is big and weeding cannot be completed in one day due to shortage of labour or some such
unavoidable cause, the operation should be completed block-wise on the same day or part of
the day, so that any one block containing all the treatments will have received similar care,
and the variation caused by that operation between blocks will be isolated in the replication
sum of squares.

Some Preliminary

Completely randomized design (C. R. D.)


------------------------------------------------------------------------------------------------
The application of two of the principles, replication and randomization, without the use of
the third principle, local control of error, results in the simplest experimental design, known
as the completely randomized design (C.R.D.). Each treatment is assigned to some fixed no. of
units at random. It is not necessary that the no. of units for each treatment is the same but, if
we are interested equally in all the treatment effects, it is simpler and more efficient to have
each treatment tried on an equal no. of experimental units.
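Random allotment in a C.R.D. can be sketched as follows. This is only an illustrative Python sketch (in practice random number tables or any other chance device may be used); the function name `crd_layout` and the treatment labels are hypothetical, and plots are assumed to be numbered 1, 2, ... in field order.

```python
import random


def crd_layout(treatments, reps, seed=None):
    """Return a random plan: the treatment allotted to plot 1, plot 2, ..."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    plan = [t for t in treatments for _ in range(reps)]
    rng.shuffle(plan)          # every arrangement of the plan is equally likely
    return plan


layout = crd_layout(["A", "B", "C", "D", "E"], reps=4, seed=1)
for plot_no, treatment in enumerate(layout, start=1):
    print(plot_no, treatment)
```

Each treatment appears exactly `reps` times, and no grouping of plots into blocks is imposed, which is what distinguishes the C.R.D. from designs that use local control.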

Statistical Model: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

Where, $Y_{ij}$ = Response or yield from the jth unit receiving the ith treatment
$\mu$ = General mean
$\tau_i$ = Effect of the ith treatment
$\varepsilon_{ij}$ = Uncontrolled variation associated with the jth unit receiving the ith treatment.

Analysis of variance: Completely randomized design with equal replications:

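Source of variation   DF        Sum of Squares (SS)     M.S. (SS/DF)   Cal. F
Treatment             t-1       SST                     MST            MST/MSE
Error                 t(r-1)    SSE (by subtraction)    MSE
Total                 rt-1      TSS

where
$$\mathrm{SST} = \frac{\sum_{i=1}^{t} Y_{i.}^{2}}{r} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}\right)^{2}}{rt}, \qquad \mathrm{TSS} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^{2} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}\right)^{2}}{rt}$$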
Analysis of variance : Completely randomized design with unequal replication.

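Source of variation   DF                         Sum of Squares (SS)     M.S. (SS/DF)   Cal. F
Treatment             t-1                        SST                     MST            MST/MSE
Error                 $\sum_{i=1}^{t} r_i - t$   SSE (by subtraction)    MSE
Total                 $\sum_{i=1}^{t} r_i - 1$   TSS

where
$$\mathrm{SST} = \sum_{i=1}^{t} \frac{Y_{i.}^{2}}{r_i} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r_i} Y_{ij}\right)^{2}}{\sum_{i=1}^{t} r_i}, \qquad \mathrm{TSS} = \sum_{i=1}^{t}\sum_{j=1}^{r_i} Y_{ij}^{2} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r_i} Y_{ij}\right)^{2}}{\sum_{i=1}^{t} r_i}$$
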
$$SEm = \sqrt{\frac{MSE}{r \ \text{or} \ r_0}} \quad \text{or} \quad SEd = \sqrt{MSE\left(\frac{1}{r_i} + \frac{1}{r_j}\right)}$$
Where, r = Number of observations per treatment (equal number of observations)
$r_0$ = Harmonic mean of the number of observations for different treatments (when
the number of observations per treatment is unequal).

$$r_0 = \frac{\text{No. of treatments}}{\frac{1}{r_1} + \frac{1}{r_2} + \dots + \frac{1}{r_t}}$$
Where $r_1, r_2, \dots$ are the number of observations for different treatments.

LSD at 5% (if treatment F is significant) = $SEd \times t_{0.05,\,n_e}$

$$CV\% = \frac{\sqrt{MSE}}{\bar{Y}_{..}} \times 100$$
Statistical analysis of variance

Let there be N units and `t' treatments. If N is a multiple of t, i.e. N=nt, each treatment
can be allotted to n units at random. As the only source of assignable variation is the effect of
treatments, the total variation in the data under C. R. D. can be analysed as one way
classification. The no. of replications per treatment should be such that the degrees of
freedom for the error shall be at least 10. For the actual computation, following example is
calculated.
CRD calculation (with equal replication):

Ex. Yield (kg/plot) of 5 varieties of wheat, each grown on 4 plots allotted at random.

Variety Yield (kg/plot) Total Mean


1 2 3 4
A 8 8 6 10 32 8
B 10 12 13 9 44 11
C 18 17 13 16 64 16
D 12 10 15 11 48 12
E 8 11 9 8 36 9
Grand Total 224
Calculation:
General mean µ = Grand total / No. of observations = 224 / 20 = 11.2

Correction factor (C.F.) = $(\text{Grand total})^2$ / No. of observations = $(224)^2 / 20$ = 2508.8

Total S.S. (19 d.f.) = $\sum (\text{Individual value})^2$ - C.F.
= $(8^2 + 8^2 + \dots + 9^2 + 8^2)$ - 2508.8
= 2716.0 - 2508.8 = 207.2

Treatment S.S. (4 d.f.) = $\sum (\text{Respective treatment total})^2$ / No. of observations in respective treatment - C.F.
= $(32^2 + \dots + 36^2)/4$ - 2508.8
= 2664.0 - 2508.8 = 155.2

Error S.S. (15 d.f.) = Total S.S. - Treatment S.S.
= 207.2 - 155.2 = 52.0
ANOVA TABLE :
Source of variation            Degrees of freedom (d.f.)   Sum of squares (S.S.)   Mean square (m.s.)   Calculated F   Table F (n1 = 4, n2 = 15)
Between treatments (variety)   4 (t-1)                     155.20                  38.80                11.18**        5%: 3.06; 1%: 4.89
Within treatments (error)      15 (t(n-1))                 52.00                   3.47
Total                          19 (nt-1)                   207.20
** Significant at 1% level.
The F test indicates that there are significant differences between the varieties of wheat. Now
we wish to know which variety performed best and which varieties differ among
themselves. This can be found with the least significant difference (L.S.D.), or critical difference, and
confidence intervals.
S.Ed. = $\sqrt{2S^2/r} = \sqrt{2 \times 3.47/4} = \sqrt{1.735} = 1.317$
L.S.D. = $t_{0.05,15} \times \sqrt{2} \times$ S.Em. = $t_{0.05,15} \times$ S.Ed.
= 2.131 x 1.317
= 2.81

Now rank the treatment means ($\bar{Y}$):

Variety   C    D    B    E    A
Mean      16   12   11   9    8

Varieties which do not differ significantly are joined by a common bar: (D, B), (B, E) and (E, A).
Variety C differs significantly from all the other varieties.
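The whole C.R.D. analysis for the wheat example can be reproduced with a short script. The following Python sketch follows the same steps as the hand calculation (correction factor, total S.S., treatment S.S., error by subtraction); the function name `crd_anova` is illustrative, and the table value t = 2.131 for 15 error d.f. is taken from the text above rather than computed.

```python
import math

# Yield (kg/plot) of five wheat varieties, four plots each.
yields = {
    "A": [8, 8, 6, 10],
    "B": [10, 12, 13, 9],
    "C": [18, 17, 13, 16],
    "D": [12, 10, 15, 11],
    "E": [8, 11, 9, 8],
}


def crd_anova(groups):
    """One-way analysis of variance for a completely randomized design."""
    values = [v for obs in groups.values() for v in obs]
    n_total, t = len(values), len(groups)
    cf = sum(values) ** 2 / n_total                      # correction factor
    total_ss = sum(v * v for v in values) - cf
    treat_ss = sum(sum(obs) ** 2 / len(obs) for obs in groups.values()) - cf
    error_ss = total_ss - treat_ss                       # by subtraction
    mst = treat_ss / (t - 1)
    mse = error_ss / (n_total - t)
    return {"CF": cf, "TSS": total_ss, "SST": treat_ss,
            "SSE": error_ss, "MST": mst, "MSE": mse, "F": mst / mse}


res = crd_anova(yields)
r = 4                # replications per variety
t_table = 2.131      # table t at 5% for 15 error d.f. (value quoted in the text)
lsd = t_table * math.sqrt(2 * res["MSE"] / r)
print({k: round(v, 2) for k, v in res.items()}, round(lsd, 2))
```

Because the script keeps the error mean square unrounded, the F ratio may differ in the second decimal place from a hand calculation that rounds the mean square to 3.47 first.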
Normally we would allot the same no. of experimental units to each treatment.
However, on account of death or failure in performance (e.g. an animal that stops giving milk), it can happen that
we end the experiment with an unequal no. of experimental units in different treatments. The
analysis continues to be simple; only the divisors in the sums of squares change.
The variance of the difference between two treatment means depends on the no. of
observations in each mean:
$$V(\bar{Y}_i - \bar{Y}_j) = S^2\left(\frac{1}{r_i} + \frac{1}{r_j}\right)$$
Where $S^2$ = Error m.s., and $r_i$ and $r_j$ are the no. of observations that make up $\bar{Y}_i$ and $\bar{Y}_j$
respectively.
Example: An experiment is conducted to determine the soil moisture deficit resulting from varying amounts of
residual timber left after cutting trees in the forest. The measurements of moisture deficit are given in the
following table: Perform the ANOVA test and construct confidence intervals for treatment differences
Moisture deficit under different treatments
Treatment Moisture deficit in soil
T1 1.44 1.64 1.20 1.48 1.55 1.46
T2 2.65 3.82 2.76 2.12 2.78
T3 1.02 1.22 1.05 1.16 1.32 1.04
T4 0.68 0.82 0.95 0.84
Solution: H0: All treatments retain the soil moisture equally or the effect of all treatments to retain soil
moisture is equal.
Totals of moisture deficit under different treatments
Treatment   Moisture deficit in soil                 Total    Mean
T1          1.44  1.64  1.20  1.48  1.55  1.46       8.77     1.462
T2          2.65  3.82  2.76  2.12  2.78             14.13    2.826
T3          1.02  1.22  1.05  1.16  1.32  1.04       6.81     1.135
T4          0.68  0.82  0.95  0.84                   3.29     0.822

Grand total = 1.44 + 1.64 + ... + 0.84 = 33.00; Grand mean = 33.00/21 = 1.571

Correction factor (CF) = $(33.00)^2/21$ = 51.86

Total Sum of Squares (TSS) = 64.93 - 51.86 = 13.07
Treatment Sum of Squares (SST) = $\frac{8.77^2}{6} + \frac{14.13^2}{5} + \frac{6.81^2}{6} + \frac{3.29^2}{4}$ - CF
= 12.82 + 39.93 + 7.73 + 2.71 - 51.86 = 11.33
Error Sum of Squares (SSE) = 13.07 - 11.33 = 1.74
Analysis of variance
Source of Variation   Degree of Freedom   Sum of Square   Mean Sum of Square   F      F (Table) 5%   F (Table) 1%
Treatments            3                   11.33           3.78                 36.9   3.20           5.18
Error                 17                  1.74            0.1024
Total                 20                  13.07

Since the calculated F value (36.9) > table value of F at the 1% level of significance (5.18), the
difference between the treatments in soil moisture deficit is highly significant. Therefore, the null hypothesis
is rejected.
Critical difference = $t_{(1\%,17)} \times \sqrt{MSE\left(\frac{1}{r_i} + \frac{1}{r_j}\right)}$ [2.898 = table value of t(1%,17)]

Critical difference between treatment 1 and 2, Treatment 2 and 3
= $2.898 \times \sqrt{0.1024\,(1/6 + 1/5)}$ = 0.562
Critical difference between treatment 1 and 3
= $2.898 \times \sqrt{0.1024\,(1/6 + 1/6)}$ = 0.535
Critical difference between treatment 1 and 4, Treatment 3 and 4
= $2.898 \times \sqrt{0.1024\,(1/6 + 1/4)}$ = 0.599
Critical difference between treatment 2 and 4
= $2.898 \times \sqrt{0.1024\,(1/5 + 1/4)}$ = 0.622
Now, keeping the treatment means in descending order
Treatment      T2      T1      T3      T4
Mean deficit   2.826   1.462   1.135   0.822

CV(%) = $\frac{\sqrt{0.1024}}{1.571} \times 100$ = 20.37

Conclusion: Treatment T2 differs significantly from all other treatments, recording the highest moisture deficit.
T1 and T3 are statistically alike, as are T3 and T4, while T1 differs significantly from T4.
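With unequal replication only the divisors change, which is easy to see in code. The Python sketch below works directly from the 21 raw observations of the moisture-deficit example (the six T1 values sum to 8.77); the function names are illustrative, and the table value t = 2.898 for 17 error d.f. at the 1% level is the one quoted in the text.

```python
import math

# Moisture deficit under four treatments (unequal number of observations).
deficit = {
    "T1": [1.44, 1.64, 1.20, 1.48, 1.55, 1.46],
    "T2": [2.65, 3.82, 2.76, 2.12, 2.78],
    "T3": [1.02, 1.22, 1.05, 1.16, 1.32, 1.04],
    "T4": [0.68, 0.82, 0.95, 0.84],
}


def crd_anova_unequal(groups):
    """One-way ANOVA where each treatment total is divided by its own r_i."""
    values = [v for obs in groups.values() for v in obs]
    n_total, t = len(values), len(groups)
    cf = sum(values) ** 2 / n_total
    total_ss = sum(v * v for v in values) - cf
    treat_ss = sum(sum(obs) ** 2 / len(obs) for obs in groups.values()) - cf
    error_ss = total_ss - treat_ss
    mse = error_ss / (n_total - t)
    return treat_ss, error_ss, mse, (treat_ss / (t - 1)) / mse


def critical_difference(mse, r_i, r_j, t_table):
    """CD for comparing two means based on r_i and r_j observations."""
    return t_table * math.sqrt(mse * (1 / r_i + 1 / r_j))


treat_ss, error_ss, mse, f_ratio = crd_anova_unequal(deficit)
t_table = 2.898  # table t at 1% for 17 error d.f. (value quoted in the text)
cd_t1_t2 = critical_difference(mse, len(deficit["T1"]), len(deficit["T2"]), t_table)
print(round(treat_ss, 2), round(error_ss, 2), round(f_ratio, 1), round(cd_t1_t2, 2))
```

A separate critical difference is needed for each pair of replication numbers, so `critical_difference` is called once per pair of treatments being compared.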
EXERCISE NO. 12
PROBLEMS ON CRD
------------------------------------------------------------------------------------------------------------

Exercise 12.1: Wood density (g/cc) observed on a randomly collected set of stems
belonging to different cane species are given below:

Sample No.   Species
             A      B      C      D      E
1            0.58   0.53   0.49   0.53   0.57
2            0.54   0.63   0.55   0.61   0.64
3            0.38   0.68   0.58   0.53   0.63
4            0.46   0.58   0.48   0.55   0.60
Do the data reveal that the cane species have equal wood density?
Solution:

Some Preliminary

[If your experiment needs a statistician, you need a better experiment.


― Ernest Rutherford]
----------------------------------------------------------------------------------------------------------------

Randomized Block Design : (RBD)

When the experimental material is heterogeneous, efforts should be made to group the units into
homogeneous groups of size equal to the no. of treatments; each group constitutes a
replication. The treatments are applied to the units of a group at random. Fresh
randomization is to be followed in assigning treatments to the experimental units of each group.
In case of a field experiment, if it is observed that the fertility gradient of the field is in one
direction, the whole field may be divided into a no. of blocks. The no. of plots in each block
is equal to the no. of treatments, so that each block is a replicate.
The shape of the blocks should be either rectangular or square and the
experimental area should be made as compact as possible. This reduces the difference in soil
fertility within the blocks to a minimum.
The fertility within a block should be as uniform as possible. During the course of the
experiment, a uniform technique should be employed for all the plots of the same block. If
necessary, changes in the technique and other conditions may be made between the blocks, but
within the blocks uniformity should be maintained.

Advantages
(i) Accuracy: Blocking can increase precision by removing one source of variation from the experimental error.
(ii) Flexibility: There is no restriction on the number of treatments or number of blocks, so long as each treatment is replicated the same number of times in each replication.
(iii) Easy computation: Statistical analysis is relatively simple. Moreover, any number of treatments may be omitted from the analysis without complicating it.
(iv) It is possible to separate the sum of squares for error into components corresponding to particular treatment effects.
Disadvantages
(i) The efficiency of the design decreases as the number of treatments and, hence, the block size increases.
(ii) Missing data can cause some difficulty in the analysis.
(iii) The design is less efficient than others in the presence of more than one source of variation.
Statistical model :
Yij = µ + Ti + Bj + Eij
Where,
Yij = Yield of ith treatment in jth replication.
µ = general mean
Ti = Effect due to ith treatment
Bj = Effect due to jth replication
Eij = Uncontrolled variation in plot receiving ith treatment in jth replication.
Analysis of Variance

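Source of variation   DF            Sum of Squares (SS)     M.S. (SS/DF)   Cal. F
Replication           r-1           SSR                     MSR            MSR/MSE
Treatment             t-1           SST                     MST            MST/MSE
Error                 (t-1)(r-1)    SSE (by subtraction)    MSE
Total                 rt-1          TSS

where
$$\mathrm{SSR} = \frac{\sum_{j=1}^{r} Y_{.j}^{2}}{t} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}\right)^{2}}{rt}, \qquad \mathrm{SST} = \frac{\sum_{i=1}^{t} Y_{i.}^{2}}{r} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}\right)^{2}}{rt}$$
$$\mathrm{TSS} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^{2} - \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}\right)^{2}}{rt}$$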

Standard error and critical difference:

The standard error of a mean: $S.Em. = \sqrt{\frac{MSE}{r}}$
The standard error of the difference between two treatment means based on r
replications is estimated by the relation
$$S.Ed. = \sqrt{\frac{2\,MSE}{r}}$$
where, MSE = Error M.S. and r = No. of replications

Critical difference at 5% level of significance

$CD_{0.05} = S.Em. \times \sqrt{2} \times t_{0.05,\,n_e}$ or $S.Ed. \times t_{0.05,\,n_e}$
(when treatment F is significant)
Statistical analysis:

Ex. 13 : The yield of 6 varieties of a wheat in kg/plot, are given below. The no. of
replications is 5, plot size is 1/20 acre and the varieties have been represented by A, B, C, D,
E & F.
Treatment Replication Treat. Treat.
I II III IV V Total means
A 20 26 30 28 23
B 9 12 10 16 7
C 12 15 16 14 14
D 17 10 20 23 20
E 28 26 23 35 30
F 40 50 56 64 70
Total
Statistical analysis :
Treatment Replication Treat. Treat.
I II III IV V Total means
A 20 26 30 28 23 127 25.40
B 9 12 10 16 7 54 10.80
C 12 15 16 14 14 71 14.20
D 17 10 20 23 20 90 18.00
E 28 26 23 35 30 142 28.40
F 40 50 56 64 70 280 56.00
Total 126 139 155 180 164 764
Analysis:
1. General Mean = Grand total / Total no. of observations = 764 / 30 = 25.47

2. Correction Factor (C.F.) = $(\text{Grand total})^2$ / Total no. of observations = $(764)^2 / 30$ = 19456.53

3. Total S.S. (29 d.f.) = $\sum (\text{Individual observation})^2$ - C.F.
= $(20^2 + 9^2 + \dots + 70^2)$ - C.F. = 27000 - 19456.53 = 7543.47

4. Treat. S.S. (5 d.f.) = $\sum (\text{Treatment total})^2$ / No. of replications - C.F.
= $[(127)^2 + \dots + (280)^2]/5$ - 19456.53
= 130750/5 - 19456.53 = 26150.00 - 19456.53 = 6693.47

5. Replication S.S. (4 d.f.) = $\sum (\text{Replication total})^2$ / No. of treatments - C.F.
= $[(126)^2 + \dots + (164)^2]/6$ - 19456.53
= 118518/6 - 19456.53 = 19753.00 - 19456.53 = 296.47

6. Error S.S. = Total S.S. - (Treatment S.S. + Repli. S.S.)
= 7543.47 - (6693.47 + 296.47) = 553.53

ANOVA TABLE :
Source        D.F.             S.S.      M.S.      Cal. F   Tab. F   Result
Replication   4 (r-1)          296.47    74.12     2.67     2.87     NS
Treatment     5 (t-1)          6693.47   1338.69   48.36*   2.71     Sig.
Error         20 (r-1)(t-1)    553.53    27.68
Total         29 (rt-1)        7543.47
Conclusion:
It is clear from the table that treatments are significant at 5% level. There are significant
differences between the treatments means.
Now we have to test the significance of the difference between the individual treatments and
that will be done with the help of Least Significant Difference (L.S.D.) or Critical Difference (C.D.).
Note:-
If the F test reveals that treatments are non-significant, then there is no need to find out value
of L.S.D. or C.D.

L.S.D. = $t_{0.05,\,n_e} \times \sqrt{2} \times$ S.Em. = $t_{0.05,\,n_e} \times$ S.Ed.
Where,
$t_{0.05,\,n_e}$ = Table `t' value at 5% level of significance for error d.f.
S.Em. = Standard error of mean
S.Ed. = Standard error of difference between two means.
Now S.Em. = $\sqrt{\text{Error m.s.} / \text{No. of replications}} = \sqrt{27.68/5} = \sqrt{5.536}$ = 2.353

Thus, L.S.D. = $t_{0.05,\,n_e} \times \sqrt{2} \times$ S.Em.
= 2.086 x 1.414 x 2.353 = 6.94
Conclusions :
The treatments have been compared by setting them in descending order of their yields
in the following manner.

Treatments               F       E       A       D       C       B
Mean yields in kg/plot   56.00   28.40   25.40   18.00   14.20   10.80

The treatments which do not differ significantly are joined by a common bar: (E, A), (D, C) and (C, B). The
treatment F has been found to be the best of all treatments.
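The R.B.D. analysis above differs from the C.R.D. only in that the replication sum of squares is removed from the error. The Python sketch below repeats the six steps for the wheat data; the function name `rbd_anova` is illustrative, and the table value t = 2.086 for 20 error d.f. is the one used in the text.

```python
import math

# Yield (kg/plot): rows are varieties, columns are replications I to V.
table = {
    "A": [20, 26, 30, 28, 23],
    "B": [9, 12, 10, 16, 7],
    "C": [12, 15, 16, 14, 14],
    "D": [17, 10, 20, 23, 20],
    "E": [28, 26, 23, 35, 30],
    "F": [40, 50, 56, 64, 70],
}


def rbd_anova(data):
    """Two-way analysis of variance for a randomized block design."""
    t = len(data)
    r = len(next(iter(data.values())))
    values = [v for row in data.values() for v in row]
    cf = sum(values) ** 2 / (r * t)
    total_ss = sum(v * v for v in values) - cf
    treat_ss = sum(sum(row) ** 2 for row in data.values()) / r - cf
    rep_totals = [sum(col) for col in zip(*data.values())]
    rep_ss = sum(x * x for x in rep_totals) / t - cf
    error_ss = total_ss - treat_ss - rep_ss
    mse = error_ss / ((r - 1) * (t - 1))
    return {"rep_ss": rep_ss, "treat_ss": treat_ss, "error_ss": error_ss,
            "mse": mse, "F_treat": (treat_ss / (t - 1)) / mse,
            "F_rep": (rep_ss / (r - 1)) / mse}


res = rbd_anova(table)
t_table = 2.086  # table t at 5% for 20 error d.f. (value quoted in the text)
cd = t_table * math.sqrt(2 * res["mse"] / 5)
print({k: round(v, 2) for k, v in res.items()}, round(cd, 2))
```

Any two treatment means whose difference is smaller than the printed critical difference would share a common bar in the comparison of means.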
EXERCISE NO. 13
Problems on RBD
------------------------------------------------------------------------------------------------
Example13. 1: An experiment with 10 treatments was carried out in a randomized block
design with three replications for urdbean, variety LBG-17. The seed yield (q/ha) under
different treatments and replications are given in the following table. Analyze the data and
interpret the results
Seed yield (q/ha) by block (rows) and treatment (columns)
Block 1 2 3 4 5 6 7 8 9 10
I 16.22 25.78 28.59 27.47 26.13 23.96 16.99 18.21 37.75 32.15
II 24.59 33.97 26.08 18.32 24.77 23.23 10.05 25.72 39.88 36.71
III 33.39 33.78 34.16 27.88 23.01 28.17 31.62 20.55 39.20 39.83

Some Preliminary
[To consult the statistician after an experiment is finished is often merely to ask
him to conduct a post mortem examination. He can perhaps say what the
experiment died of. : Ronald Fisher]
------------------------------------------------------------------------------------------------
LATIN SQUARE DESIGN

The principle of `local control' was used in the randomized block design by
grouping the units in one way, i.e. according to blocks. This grouping can be carried
one step further: grouping the units in two ways gives the Latin square
design. The design is used with advantage in agricultural field experiments where
the contours are not always known. This design eliminates the initial variability
among the units in two orthogonal directions. It has also been used successfully for
experiments in industry and the laboratory, as well as for experiments with animals.
The no. of replications equals the no. of treatments. Let m stand for the
no. of treatments; then the total no. of experimental units needed for this design is m
x m. These $m^2$ units are arranged in m rows (one source of variation) and m
columns (second source of variation). The m treatments are allotted to these $m^2$ units at
random, subject to the condition that each treatment occurs once and only once in
each row and in each column.
This arrangement of units and allocation of treatments to units makes the m
rows similar to m complete blocks of an RBD (the same is true also of the m
columns).
The Latin square design is actually an incomplete three way layout, where
all the three factors, rows, columns and treatments, are at the same no. of levels (m).
For a complete three way layout with each factor at m levels, we need $m^3$
experimental units. But in the Latin square design we take only $m^2$ of these $m^3$ units
according to the plan stated above.
As an example, let us consider a 4 x 4 Latin square for comparing four
varieties of a crop. We take a rectangular field divided into 4 x 4 = 16 plots, arranged
in four rows and four columns. We represent the varieties by A, B, C, D. The
following is a representative 4 x 4 Latin square.

            Columns
         D   C   B   A
Rows     C   B   A   D
         B   A   D   C
         A   D   C   B
What is a Latin square design, and under what circumstances is it preferred?

While planning an experiment, restrictions are imposed as needed, and the design changes
according to these restrictions.

The design which can simultaneously control variation in two directions is known as the Latin
square design. The name does not mean that the length and breadth of a plot should be equal, but
that the no. of rows and columns are equal. It gives precise results in the following situations.
1. When the experimental material can be divided into homogeneous groups in one way and
also into groups in another way.
Ex. A field having a fertility gradient in two directions.
2. When animals can be divided into groups according to their body weight, age, lactation no. etc.
3. In cross over trials, for which the Latin square design is most suitable.

Randomization in a Latin square :

This involves placing the treatments at random in the square, subject to the
restriction that each treatment occurs once and only once in each row and each column. The basic
principle, as stated by Fisher, is that each plot has an equal probability of receiving any of the
possible treatments, and each pair of plots an equal probability of being treated alike. Yates
discussed in detail the procedures necessary for randomization of Latin squares from 3 x 3 to
12 x 12. In general, if we have all the possible arrangements for a Latin square of a given
dimension, the process of randomization involves the following steps.

(i) Draw one of these at random, for example, for the 5 x 5 square :-

C1 C2 C3 C4 C5
R1 A B C D E
R2 E A B C D
R3 D E A B C
R4 C D E A B
R5 B C D E A

(ii) Randomize the rows :-

C1 C2 C3 C4 C5
R3 D E A B C
R1 A B C D E
R5 B C D E A
R4 C D E A B
R2 E A B C D

(iii) Randomize the columns, keeping the first column fixed :-

C1 C5 C3 C2 C4
D C A E B
A E C B D
B A D C E
C B E D A
E D B A C
(iv) Randomize the letters to obtain the Latin square for actual conduct of the experiment :-

Original letter      A B C D E
Replacement letter   C E B D A

Substitute the replacement letters for the original letters. The final Latin square design to be
used for the experiment is :

1 2 3 4 5
1 D B C A E
2 C A B E D
3 E C D B A
4 B E A D C
5 A D E C B
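Steps (i) to (iv) can be sketched in Python as below. This is a minimal illustration, assuming the standard cyclic square of step (i) as the starting arrangement and Python's `random` module in place of a random number table; the function names are hypothetical.

```python
import random

def standard_square(t):
    """Cyclic t x t square of step (i): each row is the previous row shifted right."""
    letters = [chr(ord("A") + i) for i in range(t)]
    return [[letters[(j - i) % t] for j in range(t)] for i in range(t)]

def randomize_latin_square(t, rng=random):
    square = standard_square(t)
    rng.shuffle(square)                              # (ii) randomize the rows
    cols = [0] + rng.sample(range(1, t), t - 1)      # (iii) randomize columns, first column fixed
    square = [[row[c] for c in cols] for row in square]
    letters = sorted(square[0])
    replacement = letters[:]
    rng.shuffle(replacement)                         # (iv) randomize the letters
    mapping = dict(zip(letters, replacement))
    return [[mapping[x] for x in row] for row in square]

def is_latin_square(square):
    """Check that every symbol occurs exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(t))
    return len(symbols) == t and rows_ok and cols_ok

layout = randomize_latin_square(5)
for row in layout:
    print(" ".join(row))
print(is_latin_square(layout))
```

Permuting rows, columns or letters never breaks the Latin property, so the final check always succeeds; it is included only as a safeguard when a layout is typed in by hand.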

Statistical Model :
$Y_{ij(k)} = \mu + R_i + C_j + T_{(k)} + \varepsilon_{ij(k)}$
i = 1,2, ...,r j = 1,2, ...,c k = 1,2, ...,t and r = c = t

Where :

$Y_{ij(k)}$ = The response of kth treatment in ith row and jth column
$\mu$ = General mean
$R_i$ = Effect of ith row = $\bar{R}_i - \bar{Y}$
$C_j$ = Effect of jth column = $\bar{C}_j - \bar{Y}$
$T_{(k)}$ = Effect of kth treatment = $\bar{T}_k - \bar{Y}$
$\varepsilon_{ij(k)}$ = Experimental error associated with kth treatment in ith row and jth column.
They are assumed to be NID $(0, \sigma^2)$
Advantages
(i) The chief advantage of LSD is that it controls the heterogeneity of soil in two directions
instead of one, as in the case of RBD.
(ii) The precision of the experiment is increased because of compact blocks.
Disadvantages
(i) This design is not as flexible as RBD. The number of replications must equal the number of
treatments, which limits the number of treatments. The number of plots increases as the square
of the number of treatments. So, for a large number of treatments, say beyond 12, LSD is less
efficient, as the block size also increases and introduces heterogeneity as a source of error.
Similarly, if the number of treatments is small, say less than 5, the degrees of freedom for
error become very small.
(ii) The analysis becomes very complicated if there are missing data or if treatments are
mis-assigned.
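How quickly the error degrees of freedom shrink for small squares can be seen from a short Python sketch. The function name `latin_square_df` is an assumption for illustration; the degrees of freedom are those of the ANOVA for a t x t Latin square.

```python
def latin_square_df(t):
    """Degrees of freedom of each source in the ANOVA of a t x t Latin square."""
    return {
        "rows": t - 1,
        "columns": t - 1,
        "treatments": t - 1,
        "error": (t - 1) * (t - 2),
        "total": t * t - 1,
    }

# Tabulate plots needed and error d.f. for squares of increasing size.
for t in range(3, 9):
    df = latin_square_df(t)
    print(t, t * t, df["error"])
```

For t = 3 only 2 degrees of freedom remain for error, while a 5 x 5 square leaves 12.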
Analysis of Variance

Here C.F. = $\left(\sum_{i,j,k=1}^{t} Y_{ij(k)}\right)^2 / t^2$ is the correction factor.

Source    DF             Sum of Squares                                        M.S.                      Cal. F
Row       (t-1)          $\frac{1}{t}\sum_{i=1}^{t} Y_{i.(.)}^2 - \text{C.F.}$     MSR = SSR/DF              MSR/MSE
Column    (t-1)          $\frac{1}{t}\sum_{j=1}^{t} Y_{.j(.)}^2 - \text{C.F.}$     MSC = SSC/DF              MSC/MSE
Treat.    (t-1)          $\frac{1}{t}\sum_{k=1}^{t} Y_{..(k)}^2 - \text{C.F.}$     MST = SST/DF              MST/MSE
Error     (t-1)(t-2)     By difference                                         MSE or $S_e^2$ = SSE/DF
Total     ($t^2$-1)      $\sum_{i,j,k=1}^{t} Y_{ij(k)}^2 - \text{C.F.}$

Standard error of mean : $\text{SEm} = \sqrt{S_e^2 / t}$

Where, $S_e^2$ = Error M.S. and t = No. of treatments

Critical difference at 5 % level of significance, $\text{CD}_{0.05} = \text{SEm} \times \sqrt{2} \times t_{0.05,\,n_e}$

C.V. % = $\frac{\sqrt{S_e^2}}{\bar{Y}} \times 100$
When the experimental material is not completely homogeneous and there is more than one kind
of variation in the material, e.g. the fertility gradient in the field runs in two directions,
we divide the experimental material into small blocks (rows and columns) in such a way that the
variation is controlled in two directions. All the treatments under study are then applied
randomly so that each treatment occurs once in each row and once in each column. An experimental
design of this kind is called a Latin Square Design (LSD).
Method of analysis
Let there be t treatments.
Structure of ANOVA for LSD
Source of Variation   Degree of Freedom   Sum of Square   Mean Sum of Square         F
Rows                  t-1                 SSR             MSR = SSR/(t-1)            MSR/MSE
Columns               t-1                 SSC             MSC = SSC/(t-1)            MSC/MSE
Treatment             t-1                 SST             MST = SST/(t-1)            MST/MSE
Error                 (t-1)(t-2)          SSE             MSE = SSE/[(t-1)(t-2)]
Total                 $t^2$-1             TSS
Analysis: Let there be t treatments. Then total observations (plots) = $t^2$.
Let $T_k$ = total of kth treatment, $R_i$ = total of ith row and $C_j$ = total of jth column.
Grand total (GT) = $\sum Y_{ij(k)}$
Correction factor (CF) = $GT^2 / t^2$
Total Sum of Squares (TSS) = $\sum Y_{ij(k)}^2 - CF$
Treatment Sum of Squares (SST) = $\sum_{k} T_k^2 / t - CF$
Row Sum of Squares (SSR) = $\sum_{i} R_i^2 / t - CF$
Column Sum of Squares (SSC) = $\sum_{j} C_j^2 / t - CF$
Error Sum of Squares (SSE) = TSS - SSR - SSC - SST
Mean Sum of Squares and F test are computed as given in the above Table.
If $F_{cal} > F_{tab}$, then H0 is rejected at the $\alpha$ level of significance and we conclude that
treatments differ significantly. Then compute the standard error of difference between two treatment
means as S.Ed. = $\sqrt{2\,MSE / t}$. Then compute the critical difference as
C.D. = $t_{\alpha,\,n_e} \times \text{S.Ed.}$, where $n_e$ is the error d.f.

If $F_{cal} \le F_{tab}$, H0 may be accepted, i.e. the data do not provide any evidence to
prefer one treatment to the other and as such all of them can be considered alike.
Compare RBD with LSD.
(RBD = Randomized block design, LSD = Latin square design)
1. RBD: Available for a wide range of treatments.
   LSD: Maximum number of treatments is limited to twelve.
2. RBD: No restriction on the number of replications.
   LSD: Number of replications is fixed as per the number of treatments.
3. RBD: Analysis of variance is more flexible.
   LSD: No flexibility in the analysis of variance.
4. RBD: If the data for any block are destroyed, the block can be easily omitted without any
   complication in the analysis.
   LSD: Much more complicated analysis under such circumstances.
5. RBD: Easier to manage in the field.
   LSD: Management is not as easy as for RBD.
6. RBD: Can be accommodated equally well in a rectangular, square or any other shape of field.
   LSD: The shape of the field should necessarily be square or rectangular.
7. RBD: Less efficient when the fertility variation is in two directions.
   LSD: More efficient design for two directional fertility variations.
8. RBD: Less efficient in simultaneously controlling two factors contributing to the
   experimental error.
   LSD: More efficient in such situations.
EXERCISE NO. 14
Problems on LSD
------------------------------------------------------------------------------------------------
Example:-
An experiment was conducted on birds in a poultry farm to test the gain in weight from
feeding various types of feeds (F1 to F5). The data are as under.
Age group   Body weight groups                              Total
            I        II       III      IV       V
1           9 (F2)   5 (F1)   8 (F5)   5 (F4)   4 (F3)      31
2           8 (F5)   5 (F3)   4 (F4)   4 (F2)   4 (F1)      25
3           3 (F1)   6 (F5)   5 (F2)   3 (F3)   2 (F4)      19
4           3 (F4)   5 (F2)   2 (F3)   3 (F1)   4 (F5)      17
5           5 (F3)   2 (F4)   2 (F1)   2 (F5)   2 (F2)      13
Total       28       23       21       17       16          105

1. General Mean (G. M.)

G. M. = $\frac{\text{Total sum of values (Grand total)}}{\text{No. of observations}} = \frac{105}{25} = 4.2$

2. Correction factor (C. F.)

C. F. = $\frac{(\text{G.T.})^2}{\text{No. of observations}} = \frac{(105)^2}{25} = 441.00$

3. Total S. S.
Total S. S. = $\sum (\text{Individual observations})^2$ - C. F.
= $(9^2 + \dots + 2^2)$ - C. F.
= 535 - 441 = 94
4. Row S. S.
Row S. S. = $\frac{\sum R_i^2}{t}$ - C. F.
= $\frac{(31)^2 + \dots + (13)^2}{5}$ - C. F. = 481 - 441 = 40
5. Column S. S.
Column S. S. = $\frac{\sum C_j^2}{t}$ - C. F. = $\frac{(28)^2 + \dots + (16)^2}{5}$ - C. F.
= 459.8 - 441.0 = 18.8

6. Treatment S. S.
Treatment totals:
F1 = 17
F2 = 25
F3 = 19
F4 = 16
F5 = 28
-----
105
Treatment S. S. = $\frac{\sum T_k^2}{t}$ - C.F.
= $\frac{(17)^2 + \dots + (28)^2}{5}$ - C.F.
= 463.0 - 441.0 = 22
7. Error S.S.
Error S. S. = Total S. S. - (Row S.S. + Column S. S. + Treatment S. S.)
= 94.0 - (40.0 + 18.8 + 22.0)
= 94.0 - 80.8
= 13.2
If we let t represent the no. of treatments, rows and columns in a Latin square, the form of the
analysis is :-
Source       Sum of squares                          Degree of freedom
Rows         $\frac{\sum R_i^2}{t} - \frac{G^2}{t^2}$      (t - 1)
Columns      $\frac{\sum C_j^2}{t} - \frac{G^2}{t^2}$      (t - 1)
Treatments   $\frac{\sum T_k^2}{t} - \frac{G^2}{t^2}$      (t - 1)
Error        By difference                           (t - 1) (t - 2)
Total        -                                       ($t^2$ - 1)
Where $R_i$, $C_j$ and $T_k$ represent row, column and treatment totals, and G the G.T.
ANOVA TABLE :
Source         d. f.   S.S.    M.S.   Cal. F   Table F (n1, n2, 0.05)
Age group      4       40.00   10.0   9.09*    3.26
Weight group   4       18.80   4.7    4.27*    3.26
Feeds          4       22.00   5.5    5.00*    3.26
Error          12      13.20   1.1    -
Total          24      94.00   -      -
* Significant.
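The whole analysis of this example can be reproduced with the Python sketch below. It is only an illustration of the hand calculation: the function name `latin_square_anova` and the layout representation (a list of rows holding `(gain, feed)` pairs) are assumptions, while the data are those of the table above.

```python
# Each row is an age group; each entry is (gain in weight, feed) for one body weight group.
layout = [
    [(9, "F2"), (5, "F1"), (8, "F5"), (5, "F4"), (4, "F3")],
    [(8, "F5"), (5, "F3"), (4, "F4"), (4, "F2"), (4, "F1")],
    [(3, "F1"), (6, "F5"), (5, "F2"), (3, "F3"), (2, "F4")],
    [(3, "F4"), (5, "F2"), (2, "F3"), (3, "F1"), (4, "F5")],
    [(5, "F3"), (2, "F4"), (2, "F1"), (2, "F5"), (2, "F2")],
]

def latin_square_anova(layout):
    """Return the ANOVA table (as a dict) and the treatment totals of a t x t Latin square."""
    t = len(layout)
    values = [[y for y, _ in row] for row in layout]
    grand_total = sum(sum(row) for row in values)
    cf = grand_total ** 2 / t ** 2                               # correction factor
    total_ss = sum(y * y for row in values for y in row) - cf
    row_ss = sum(sum(row) ** 2 for row in values) / t - cf
    col_ss = sum(sum(col) ** 2 for col in zip(*values)) / t - cf
    treat_totals = {}
    for row in layout:
        for y, feed in row:
            treat_totals[feed] = treat_totals.get(feed, 0) + y
    treat_ss = sum(v ** 2 for v in treat_totals.values()) / t - cf
    error_ss = total_ss - row_ss - col_ss - treat_ss            # by difference
    error_df = (t - 1) * (t - 2)
    error_ms = error_ss / error_df
    table = {}
    for name, ss in (("rows", row_ss), ("columns", col_ss), ("treatments", treat_ss)):
        ms = ss / (t - 1)
        table[name] = {"df": t - 1, "ss": ss, "ms": ms, "f": ms / error_ms}
    table["error"] = {"df": error_df, "ss": error_ss, "ms": error_ms}
    table["total"] = {"df": t * t - 1, "ss": total_ss}
    return table, treat_totals

table, totals = latin_square_anova(layout)
for source, line in table.items():
    print(source, {key: round(value, 2) for key, value in line.items()})
```

The calculated F values still have to be compared with the table F value for (4, 12) d.f., which this sketch does not look up.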
Standard error of difference
between two means (S.Ed.) = $\sqrt{(2 \times \text{Error m. s.}) / t}$
= $\sqrt{(2 \times 1.1) / 5}$ = 0.6633
C. D. (Critical difference) or least significant difference (L.S.D.)

L.S.D. = $t_{0.05,\,n_e}$ x S.Ed.
= 2.179 x 0.6633
= 1.445 kg
Feed means (feed total / 5):
F1 = 3.4
F2 = 5.0
F3 = 3.8
F4 = 3.2
F5 = 5.6

Feeds in descending order of their means; feeds which do not differ significantly are underlined
by a common bar:
F5    F2    F3    F1    F4
5.6   5.0   3.8   3.4   3.2
---------
      ---------
            ---------------
Interpretation of results :
Among the various feeds, feed F5 gave the best performance and feed F4 the poorest. Feed F5
does not differ significantly from F2, and feeds F3, F1 and F4 have more or less equal effect.
Example 14.1: Yields of six hybrid maize varieties (H1 to H6), as influenced by plant population
and integrated nitrogen management, were recorded from an experiment laid out in LSD. The yields,
with the layout, are as follows.
Grain yield (q/ha)
68.69 (H2) 102.76 (H5) 94.84 (H1) 101.53 (H6) 79.23 (H4) 98.00 (H3)
79.50 (H3) 95.25 (H6) 60.60 (H2) 75.30 (H4) 106.80 (H5) 90.35 (H1)
89.50 (H1) 78.00 (H4) 110.20 (H5) 85.65 (H3) 73.55 (H2) 102.30 (H6)
106.20 (H6) 92.98 (H3) 81.45 (H4) 106.20 (H5) 86.25 (H1) 65.75 (H2)
85.50 (H4) 90.80 (H1) 98.00 (H6) 70.70 (H2) 92.55 (H3) 111.95 (H5)
114.65 (H5) 67.50 (H2) 90.45 (H3) 94.00 (H1) 102.20 (H6) 82.60 (H4)
Analyze the data and draw conclusion.
EXERCISE NO. 15

Some Preliminaries

Simple Random Sampling with and without replacement


-----------------------------------------------------------------------------------------------
Simple random sampling (SRS) is a method of selecting a sample of n sampling units
out of a population of N sampling units such that every
sampling unit has an equal chance of being chosen.
The samples can be drawn in two possible ways.
1. The sampling units are chosen without replacement, in the sense that the units once chosen
are not placed back in the population.
2. The sampling units are chosen with replacement, in the sense that the chosen units are
placed back in the population.

1. Simple random sampling without replacement (SRSWOR):

SRSWOR is a method of selection of n units out of the N units one by one such that at any stage
of selection, any one of the remaining units has the same chance of being selected, i.e. 1/N.
2. Simple random sampling with replacement (SRSWR):
SRSWR is a method of selection of n units out of the N units one by one such that at each stage
of selection each unit has an equal chance of being selected, i.e. 1/N.

Procedure of selection of a random sample:


The procedure of selection of a random sample follows the following steps:
1. Identify the N units in the population with the numbers 1 to N.
2. Choose any random number arbitrarily in the random number table and start reading
numbers.
3. Choose the sampling unit whose serial number corresponds to the random number drawn
from the table of random numbers.
4. In case of SRSWR, all the random numbers are accepted even if repeated more than once.
In case of SRSWOR, if any random number is repeated, then it is ignored and more numbers
are drawn.
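The selection procedure can be imitated in Python as sketched below. This is an illustration only: Python's `random.randint` stands in for reading the random number table, and the function name `draw_srs` and its arguments are assumptions.

```python
import random

def draw_srs(population_size, sample_size, with_replacement, rng=random):
    """Return the serial numbers (1..N) of the selected sampling units."""
    if not with_replacement and sample_size > population_size:
        raise ValueError("SRSWOR cannot select more units than the population holds")
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(1, population_size)   # next random number between 1 and N
        if not with_replacement and number in chosen:
            continue                               # SRSWOR: ignore a repeat and draw again
        chosen.append(number)                      # SRSWR: repeats are accepted
    return chosen

print(draw_srs(20, 5, with_replacement=False))
print(draw_srs(20, 5, with_replacement=True))
```

The first sample never contains a repeated serial number, whereas the second may.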
Example: List all possible simple random samples of size n = 2 that can be selected from the
population {0, 1, 2, 3, 4}, i.e. N = 5.
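One way to list the samples is with Python's `itertools`, as in the sketch below. It assumes that SRSWOR samples are counted as unordered pairs and SRSWR samples as ordered pairs; under other conventions the counts differ.

```python
from itertools import combinations, product

population = [0, 1, 2, 3, 4]
n = 2

# SRSWOR: unordered samples without repeated units, N C n of them.
srswor_samples = list(combinations(population, n))
# SRSWR: ordered samples in which a unit may repeat, N ** n of them.
srswr_samples = list(product(population, repeat=n))

print(len(srswor_samples), srswor_samples)
print(len(srswr_samples), srswr_samples)
```

Under these conventions there are 10 possible samples without replacement and 25 with replacement.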
EXERCISE NO. 16
Appendix: STATISTICAL TABLES
------------------------------------------------------------------------------------------------
B1: t-table                                  B2: Chi-square table

Degrees of   Level of significance           Degrees of   Level of significance
freedom      1%        5%                    freedom      1%        5%
1 63.657 12.706 1 6.635 3.841
2 9.925 4.303 2 9.210 5.991
3 5.841 3.182 3 11.345 7.815
4 4.604 2.776 4 13.277 9.488
5 4.032 2.571 5 15.086 11.070
6 3.707 2.447 6 16.812 12.592
7 3.499 2.365 7 18.475 14.067
8 3.355 2.306 8 20.090 15.507
9 3.250 2.262 9 21.666 16.919
10 3.169 2.228 10 23.209 18.307
11 3.106 2.201 11 24.725 19.675
12 3.055 2.179 12 26.217 21.026
13 3.012 2.160 13 27.688 22.362
14 2.977 2.145 14 29.141 23.685
15 2.947 2.131 15 30.578 24.996
16 2.921 2.120 16 32.000 26.296
17 2.898 2.110 17 33.409 27.587
18 2.878 2.101 18 34.805 28.869
19 2.861 2.093 19 36.191 30.144
20 2.845 2.086 20 37.566 31.410
21 2.831 2.080 21 38.932 32.671
22 2.819 2.074 22 40.289 33.924
23 2.807 2.069 23 41.638 35.172
24 2.797 2.064 24 42.980 36.415
25 2.787 2.060 25 44.314 37.652
26 2.779 2.056 26 45.642 38.885
27 2.771 2.052 27 46.963 40.113
28 2.763 2.048 28 48.278 41.337
29 2.756 2.045 29 49.588 42.557
30 2.750 2.042 30 50.892 43.773
>30 2.58 1.96
B3: F table values at 1% level of significance
Df for Degrees of freedom for greater variance (numerator)
Smaller
variance 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 4052.18 4999.50 5403.35 5624.58 5763.65 5858.99 5928.36 5981.07 6022.47 6055.85 6083.32 6106.32 6125.86 6142.67 6157.28

2 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40 99.41 99.42 99.42 99.43 99.43
3 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23 27.13 27.05 26.98 26.92 26.87
4 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.45 14.37 14.31 14.25 14.20
5 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05 9.96 9.89 9.82 9.77 9.72
6 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.79 7.72 7.66 7.60 7.56
7 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.54 6.47 6.41 6.36 6.31
8 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.73 5.67 5.61 5.56 5.52
9 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 5.18 5.11 5.05 5.01 4.96
10 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.77 4.71 4.65 4.60 4.56
11 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54 4.46 4.40 4.34 4.29 4.25
12 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.22 4.16 4.10 4.05 4.01
13 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10 4.02 3.96 3.91 3.86 3.82
14 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 3.94 3.86 3.80 3.75 3.70 3.66
15 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.73 3.67 3.61 3.56 3.52
16 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69 3.62 3.55 3.50 3.45 3.41
17 8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59 3.52 3.46 3.40 3.35 3.31
18 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60 3.51 3.43 3.37 3.32 3.27 3.23
19 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43 3.36 3.30 3.24 3.19 3.15
20 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37 3.29 3.23 3.18 3.13 3.09
21 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40 3.31 3.24 3.17 3.12 3.07 3.03
22 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26 3.18 3.12 3.07 3.02 2.98
23 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21 3.14 3.07 3.02 2.97 2.93
24 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17 3.09 3.03 2.98 2.93 2.89
25 7.77 5.57 4.68 4.18 3.85 3.63 3.46 3.32 3.22 3.13 3.06 2.99 2.94 2.89 2.85
26 7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18 3.09 3.02 2.96 2.90 2.86 2.81
27 7.68 5.49 4.60 4.11 3.78 3.56 3.39 3.26 3.15 3.06 2.99 2.93 2.87 2.82 2.78
28 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3.23 3.12 3.03 2.96 2.90 2.84 2.79 2.75
29 7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.09 3.00 2.93 2.87 2.81 2.77 2.73
30 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98 2.91 2.84 2.79 2.74 2.70
31 7.53 5.36 4.48 3.99 3.67 3.45 3.28 3.15 3.04 2.96 2.88 2.82 2.77 2.72 2.68
32 7.50 5.34 4.46 3.97 3.65 3.43 3.26 3.13 3.02 2.93 2.86 2.80 2.74 2.70 2.65
33 7.47 5.31 4.44 3.95 3.63 3.41 3.24 3.11 3.00 2.91 2.84 2.78 2.72 2.68 2.63
34 7.44 5.29 4.42 3.93 3.61 3.39 3.22 3.09 2.98 2.89 2.82 2.76 2.70 2.66 2.61
35 7.42 5.27 4.40 3.91 3.59 3.37 3.20 3.07 2.96 2.88 2.80 2.74 2.69 2.64 2.60
B4: F table values at 5% level of significance
Df for Degrees of freedom for greater variance (numerator)
Smaller
variance 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54 241.88 242.98 243.91 244.69 245.36 245.95
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.40 19.41 19.42 19.42 19.43
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.76 8.74 8.73 8.71 8.70
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.94 5.91 5.89 5.87 5.86
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.70 4.68 4.66 4.64 4.62
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.03 4.00 3.98 3.96 3.94
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.60 3.57 3.55 3.53 3.51
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.31 3.28 3.26 3.24 3.22
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.10 3.07 3.05 3.03 3.01
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.94 2.91 2.89 2.86 2.85
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.82 2.79 2.76 2.74 2.72
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.72 2.69 2.66 2.64 2.62
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.63 2.60 2.58 2.55 2.53
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.57 2.53 2.51 2.48 2.46
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.51 2.48 2.45 2.42 2.40
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.46 2.42 2.40 2.37 2.35
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 2.41 2.38 2.35 2.33 2.31
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.37 2.34 2.31 2.29 2.27
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 2.34 2.31 2.28 2.26 2.23
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.31 2.28 2.25 2.22 2.20
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.28 2.25 2.22 2.20 2.18
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.26 2.23 2.20 2.17 2.15
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 2.24 2.20 2.18 2.15 2.13
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.22 2.18 2.15 2.13 2.11
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 2.20 2.16 2.14 2.11 2.09
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 2.18 2.15 2.12 2.09 2.07
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 2.17 2.13 2.10 2.08 2.06
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 2.15 2.12 2.09 2.06 2.04
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 2.14 2.10 2.08 2.05 2.03
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 2.13 2.09 2.06 2.04 2.01
31 4.16 3.30 2.91 2.68 2.52 2.41 2.32 2.25 2.20 2.15 2.11 2.08 2.05 2.03 2.00
32 4.15 3.29 2.90 2.67 2.51 2.40 2.31 2.24 2.19 2.14 2.10 2.07 2.04 2.01 1.99
33 4.14 3.28 2.89 2.66 2.50 2.39 2.30 2.23 2.18 2.13 2.09 2.06 2.03 2.00 1.98
34 4.13 3.28 2.88 2.65 2.49 2.38 2.29 2.23 2.17 2.12 2.08 2.05 2.02 1.99 1.97
35 4.12 3.27 2.87 2.64 2.49 2.37 2.29 2.22 2.16 2.11 2.07 2.04 2.01 1.99 1.96
