Bstat Design

UNIT I

Planning of Experiments

An experiment can be defined as planned research conducted to obtain new facts, or to confirm or refute the results of previous experiments. An experiment helps a researcher to get an answer to some question or to make an inference about some phenomenon. Most generally, observing, collecting or measuring data can be considered as an experiment. In a narrow sense, an experiment is conducted in a controlled environment in order to study the effects of one or more categorical or continuous variables on observations. An experiment is usually planned and can be described in several steps.

1. Introduction to the problem

The planning of an experiment begins with an introduction in which the problem is generally stated, relevant literature including previous results is reviewed, and the importance of solving the problem is explained. After that, the objective of the research is stated. The objective should be precise and can be a question to be answered, a hypothesis to be verified or an effect to be estimated. All further work in the experiment should depend on the stated objective.

2. Choice of factors and levels

The experimenter must select the independent variables or factors to be investigated in the experiment. The factors in the experiment may be either quantitative or qualitative. If they are quantitative, it should be decided how these factors are to be controlled at the desired values and measured. We must also select the values or levels of the factors to be used in the experiment. These levels may be chosen specifically or selected at random from the set of all possible factor levels.

3. Selection of response variable

In choosing a response or dependent variable, the experimenter must be certain that the response to be measured really provides information about the problem under study. How the response is to be measured and the probable accuracy of measurement are also to be considered.

4. Choice of experimental design

This step is of primary importance in the experimental process. The experimenter must determine the difference in true response he wishes to detect and the magnitude of the risks he is willing to tolerate so that an appropriate sample size may be chosen. He must also determine the order in which data will be collected and the method of randomization to be employed. It is always necessary to maintain a balance between statistical accuracy and cost. A mathematical model for the experiment must also be proposed, so that a statistical analysis of the data may be performed.

5. Performing the experiment

This is the actual data collection process. The experimenter should carefully
monitor the progress of experiment to ensure that it is proceeding according to the plan.
Particular attention should be paid to randomization, measurement accuracy and
maintaining as uniform an experimental environment as possible.

6. Data Analysis

Statistical methods should be employed in analyzing the data from the experiment. Numerical accuracy is important. Graphical methods are also useful in the data analysis process.

7. Conclusions and Recommendations

Once the data has been analyzed, the experimenter may draw conclusions or inferences about his results. The statistical inferences must be physically interpreted, and the practical significance of these findings must be evaluated. Then recommendations concerning these findings must be made. These recommendations may involve a further round of experiments, as experimentation is usually an iterative process, with one experiment answering some questions and simultaneously posing others. In presenting results and conclusions, charts and graphs are very effective.

Some important terms (Terminologies)

Treatments: - The different factors whose effects are being compared are called treatments or varieties. Eg: a standard ration, a temperature-humidity combination, insecticides.

Experimental material: The material to which treatment is applied.

Experimental unit: - An experimental unit or experimental plot is the smallest division of the experimental material to which a treatment is applied, and experimental units are independent of each other. The experimental unit can be an animal or a group of animals, for example 10 steers in a pen.

Sample unit: - The sample unit can be identical to the experimental unit or it can be a part of the experimental unit. For example, if we measure the weights of independent calves at the age of 6 months, then a calf is both a sample unit and an experimental unit. On the other hand, if some treatment is applied to 10 chicks in a cage and each chick is weighed, then the cage is the experimental unit and each chick is a sample unit.

Replication: - When a treatment is applied to more than one experimental unit, it is said to be replicated.

Experimental error: - A characteristic of all experimental material is variation. Experimental error is a measure of the variation which exists among observations on experimental units treated alike. Experimental error can consist of two types of errors: systematic and random. Systematic errors are effects which change the measurements under study in a consistent way and can be assigned to some source, for example variability due to lack of uniformity in conducting the experiment, uncalibrated instruments, unaccounted temperature effects, etc.

Random errors occur due to random, unpredictable phenomena. They produce variability that cannot be explained. Over a series of replicates they will cancel out. For example, in an experiment with livestock, the individual animals have different genetic constitutions.

Definition:

The design of experiment means

1. The set of treatments selected for comparison

2. The specification of the experimental material to which the treatments are to be applied

3. The rule according to which the treatments are allocated to the experimental units

4. The specification of the measurements or other records to be made on each experimental unit

Requirements of a good experiment

Let us assume that the treatments, the experimental units, and the nature of the observations have been decided. Then the requirements for a good experiment are that the treatment comparisons should, as far as possible, be free from systematic error; that they should be made sufficiently precisely; that the conclusions should have a wide range of validity; that the experimental arrangement should be as simple as possible; and finally that the uncertainty in the conclusions should be assessable.

1. Absence of systematic error: - Observations from experimental units should not differ in any systematic way, and experimental units should be allowed to respond independently of one another. When it is impossible or impracticable to achieve this, any assumption about the absence of systematic differences should be explicitly recognized and as far as possible checked by supplementary measurements or previous experience.

2. Precision: - Precision means how close the measurements are to one another regardless of how close they are to the true mean; that is, it describes the repeatability of the experiment. Accuracy represents how close the estimated mean of replicated measurements is to the true mean. The closer to the true mean, the more accurate the result. Random errors affect the precision of an experiment and, to a lesser extent, its accuracy. A small random error means greater precision. Systematic errors affect the accuracy of the experiment, but not the precision. In order to have a successful experiment, systematic errors must be eliminated and random errors should be as small as possible. In experiments, precision is expressed as the amount of information I = n/σ², where n is the number of observations in a group or treatment, and σ² is the variance between units in the population. Just as the estimator of the variance σ² is the mean square error, s² = MSE, the estimate of the amount of information is I = n/MSE.

   The reciprocal of I is the square of the estimator of the standard error of the mean, 1/I = MSE/n = s_ȳ²; that is, more information results in a smaller standard error and the estimation of the mean is more precise. More information and greater precision also result in easier detection of possible differences between means.
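As a quick numeric check of these formulas, here is a minimal sketch; the group size and MSE below are assumed values, not taken from any experiment in the text:

```python
# Estimated amount of information I = n / MSE; its reciprocal MSE / n
# is the squared standard error of a treatment mean.
def information(n, mse):
    return n / mse

n, mse = 16, 4.0          # assumed: 16 observations per group, MSE = 4
I = information(n, mse)
se_mean = (1 / I) ** 0.5  # standard error of the mean = sqrt(MSE / n)
print(I, se_mean)         # 4.0 0.5
```

Doubling n doubles I and shrinks the standard error by a factor of √2.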

3. Range of validity:- When we estimate the difference between two treatments, we obtain conclusions referring to the particular set of units used in the experiment and to the conditions investigated in the experiment. If we wish to apply the conclusions to new conditions or units, some additional uncertainty is involved. For example, a new experimental technique that works very well when special attention is devoted to it may be quite unsuited to routine use. So it is important to have not just empirical knowledge of what the treatment differences are, but also some understanding of the reasons for the differences. Such knowledge will indicate what extrapolation of the conclusions is reasonable. In DOE, we should artificially vary conditions, if we can do so without inflating the error. It is also important to recognize explicitly what the restrictions on the conclusions of any particular experiment are.

4. Simplicity:- It is important to retain flexibility; the initial part of the experiment might suggest a much more promising line of enquiry. It would be a bad thing if a large experiment had to be completed before any worthwhile results were obtained. It is also desirable to have a simple design and simple methods of analysis.

5. Calculation of uncertainty:- It is desirable that we should be able to calculate, if possible from the data themselves, the uncertainty in the estimates of the treatment differences. This usually means estimating the standard error of the differences, from which limits of error for the true differences can be calculated at any required level of probability, and from which the statistical significance of the differences between the treatments can be measured.

Basic principles of Design

The observations in any experiment are affected not only by the action of
treatment, but also by some extraneous factors which tend to mask the effect of
treatment. Randomization, replication and local control are the three basic principles of
experimental design, for increasing the precision of experiment and also for drawing
valid inferences from the experiments.

1. Randomization:- After the treatments and experimental units are decided the
treatments are allotted to the experimental units at random to avoid any type of
personal or subjective bias which may be conscious or unconscious. This ensures
validity of the results. It helps to have an objective comparison among the
treatments. It also ensures independence of observation which is necessary for
drawing valid inference from the observations by applying appropriate statistical
techniques.

2. Replication:- The repetition of treatments by applying them to more than one experimental unit is known as replication. The functions of replication are

   1. to provide an estimate of experimental error,

   2. to improve the precision of the experiment by reducing the standard deviation of a treatment mean. That is, if we repeat a single treatment r times, the treatment mean will have a standard error of σ/√r, where σ is the standard deviation of individual experimental units and is estimated from the experiment. If r increases, the standard error decreases,

   3. to increase the scope of inference of the experiment by selection and appropriate use of quite variable experimental units, and

   4. to effect control of the error variance.
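The effect of increasing r on the standard error can be sketched numerically; the standard deviation used here is an assumed illustrative value:

```python
import math

sigma = 6.0  # assumed standard deviation of individual experimental units
# the standard error of a treatment mean falls as sigma / sqrt(r)
for r in (1, 4, 9, 36):
    print(r, sigma / math.sqrt(r))
```

Quadrupling the replications halves the standard error, so precision gains come increasingly slowly.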

3. Local control or Blocking :- The entire experimental material, supposed to be heterogeneous, may be divided into different groups by taking homogeneous units together, and then treatments may be allocated randomly to the different units in each group. This procedure is known as local control or blocking. The main aim of local control is to reduce the error by suitably modifying the allocation of treatments to experimental units.

UNIT II

Uniformity Trials

In field trials every agricultural research worker is interested in ascertaining the relative worth of a set of treatments with reasonable confidence. To achieve this, the efficiency of the experimental design is improved by adopting the principles of randomization, replication and local control. Apart from these, the accuracy of the estimates also depends on the size and shape of the experimental units adopted. To determine a suitable size and shape of the plot and the number of plots in a block, an experiment called a uniformity trial is planned, which consists of growing a crop under uniform conditions on a piece of land.

When a crop is grown in a field under uniform conditions and variability is measured among equal-sized plots for yield or any other trait, it is called a uniformity trial. A fertility map is prepared on the basis of the fertility gradient. Blocks are formed on the patches that have the same fertility.

Analysis of variance (ANOVA)

ANOVA is a technique which enables us to break down the variance of the measured variable into the portions caused by the several factors, varied singly or in combination, and a portion caused by experimental error. More precisely, ANOVA consists of partitioning the total sum of squares of deviations from the mean into two or more component sums of squares, each of which is associated with a particular factor or with experimental error, and a parallel partitioning of the total number of degrees of freedom. With the help of this technique, it is possible to perform certain tests of hypotheses and to provide estimates for components of variation. In ANOVA, we test the significance of several means. The null hypothesis is H0: μ1 = μ2 = … = μt and the alternative hypothesis is H1: μi ≠ μj for at least one pair (i, j). The assumptions of ANOVA are

1. The effects are additive.

2. The experimental errors are independent.

3. The errors are distributed normally with mean zero and common variance σ².

ANOVA is based on a linear statistical model. The ANOVA model for an experiment with different levels of only one factor is

yij = μ + ti + eij ;  i = 1, 2, …, t and j = 1, 2, …, r

where yij is the value of the variable in the jth replicate of the ith treatment, μ is the general mean effect, ti is the effect due to the ith treatment, and eij is the random error, which is assumed to be independently and normally distributed with mean zero and variance σe².

In ANOVA, the criterion for determining the number of replications is that it should ensure at least 10 degrees of freedom (d.f.) for the estimate of the error variance, as F is unstable below 10 d.f.

Completely Randomized Design (CRD)

CRD is the simplest design, using only two basic principles of experimentation, namely replication and randomization. In this design, the whole experimental material, supposed to be homogeneous, is divided into a number of experimental units depending upon the number of treatments and the number of replications for each treatment. The treatments are then allotted randomly to the units in the entire material. This design is useful for laboratory or greenhouse experiments, whereas its use in field experiments is limited. Missing values or unequal replications do not create any difficulty in the analysis of this design.

Advantages
1. The number of replications may be varied from treatment to treatment. Because of this flexibility, all the available experimental material can be used without any wastage.
2. CRD provides the maximum number of degrees of freedom for the estimation of experimental error.
3. The statistical analysis of CRD is very simple, even if information on some units is missing.

Disadvantages

1. It is less accurate than other designs. When a large number of treatments is included, a relatively large amount of experimental material must be used, and the heterogeneity of the experimental material will be increased. This will result in increased experimental error and reduced precision.

Lay out of CRD

The placement of the treatments on the experimental units, along with the arrangement of the experimental units, is known as the lay out of an experiment.

Suppose that there are t treatments, namely T1, T2, …, Tt. Further suppose that the treatments are each replicated r times. Then we require t × r = n experimental units. In case of unequal replications the number of experimental units required will be r1 + r2 + … + rt = n.

The entire experimental material is divided into n experimental units. For example, suppose that there are 5 treatments, each with 4 replications. We need 20 experimental units. The units are numbered as

1 2 3 4 5
10 9 8 7 6
11 12 13 14 15
20 19 18 17 16
Then n distinct three-digit random numbers are selected from random number tables. The random numbers are written in the order drawn and are ranked. The units at the ranks of the first r random numbers are allotted to treatment T1, those at the next r ranks to T2, and so on. The procedure is continued until all treatments have been applied. For example, the selected random numbers and their ranks are as follows:

Random number   Rank   Treatment to be applied
807             18
186              4     T1
410             10
345              9
626             14
340              7     T2
883             19
569             13
341              8
094              2     T3
322              6
252              5
047              1
469             12     T4
632             15
183              3
417             11
782             17     T5
969             20
697             16
That is, treatment T1 is applied to units 18, 4, 10 and 9; treatment T2 is applied to units 14, 7, 19 and 13, and so on. The final layout with unit numbers and allocated treatments will be as follows:

1 T4 2 T3 3 T4 4 T1 5 T3
10 T1 9 T1 8 T3 7 T2 6 T3
11 T5 12 T4 13 T2 14 T2 15 T4
20 T5 19 T2 18 T1 17 T5 16 T5
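The drawing-and-ranking procedure above can be sketched in Python; crd_layout is a hypothetical helper and the seed is arbitrary, so its layout differs from the worked example:

```python
import random

def crd_layout(t, r, seed=1):
    """Allot t treatments, r replications each, to n = t*r units:
    draw n distinct three-digit random numbers, rank them, and give
    the units at the first r ranks to T1, the next r to T2, and so on."""
    n = t * r
    rng = random.Random(seed)
    numbers = rng.sample(range(100, 1000), n)            # n distinct 3-digit numbers
    rank = {x: k + 1 for k, x in enumerate(sorted(numbers))}
    # the d-th drawn number sends treatment T(d//r + 1) to unit rank[x]
    return {rank[x]: f"T{d // r + 1}" for d, x in enumerate(numbers)}

layout = crd_layout(t=5, r=4)
print(sorted(layout.items()))
```

Whatever the seed, every unit 1 to 20 receives exactly one treatment and every treatment appears exactly four times.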
Analysis of CRD

The ANOVA model for CRD is

yij = μ + ti + eij ;  i = 1, 2, …, t and j = 1, 2, …, ri

where yij is the value of the variable in the jth replicate of the ith treatment, μ is the general mean effect, ti is the effect due to the ith treatment, and eij is the random error, which is assumed to be independently and normally distributed with mean zero and variance σe². The observed results from a CRD are arranged as below.

Treatment       1      2      3     …     t
               y11    y21    y31    …    yt1
               y12    y22    y32    …    yt2
                :      :      :           :
               y1r1   y2r2   y3r3   …    ytrt
Total           T1     T2     T3    …     Tt    Grand total G
No. of
replications    r1     r2     r3    …     rt    n
Next, the required sums of squares are computed.

Correction factor (CF) = G²/n, where G is the grand total,

Total sum of squares (TSS) = y11² + y12² + … + ytrt² − CF = Σi Σj yij² − CF,

Treatment sum of squares (SST) = T1²/r1 + T2²/r2 + … + Tt²/rt − CF = Σi Ti²/ri − CF,

Error sum of squares (SSE) = TSS − SST.


We are interested in testing the hypothesis H0: T1 = T2 = … = Tt against the alternative hypothesis that the treatment effects are not all equal. We set up the following ANOVA table, with mean squares and F.

Source of variation     Degrees of   Sum of    Mean sum of        F
                        freedom      squares   squares
Treatment (between      t−1          SST       MST = SST/(t−1)    MST/MSE ~ F(t−1, n−t)
treatments)
Error (within           n−t          SSE       MSE = SSE/(n−t)
treatments)
Total                   n−1          TSS

If the calculated value of F is greater than the table value F(α; t−1, n−t), where α denotes the level of significance, the hypothesis H0 is rejected and it can be inferred that the treatment effects are significantly different from one another.
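The computations above can be collected into a short sketch; the yields are made-up numbers for three treatments with unequal replications, which a CRD permits:

```python
def crd_anova(groups):
    """One-way ANOVA for a CRD; groups holds one list of observations
    per treatment (unequal replications allowed)."""
    n = sum(len(g) for g in groups)
    t = len(groups)
    grand = sum(sum(g) for g in groups)
    cf = grand ** 2 / n                                   # correction factor
    tss = sum(y ** 2 for g in groups for y in g) - cf     # total SS
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # treatment SS
    sse = tss - sst                                       # error SS
    mst, mse = sst / (t - 1), sse / (n - t)
    return {"SST": sst, "SSE": sse, "MSE": mse, "F": mst / mse}

res = crd_anova([[8, 10, 9], [12, 14, 13, 13], [7, 6, 8]])
print(res)  # SST = 66.0, SSE = 6.0
```

With these data, SST = 66 and SSE = 6, giving F = MST/MSE = 38.5 on (2, 7) degrees of freedom.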

A non-significant F may result either from small treatment differences or from a very large experimental error, or both. It does not always mean that all the treatments have the same effect. When the experimental error is large, it is an indication of the failure of the experiment to detect treatment differences. In order to assess the reliability of the experiment, the coefficient of variation (CV) is used. It is computed as

CV = (√MSE / overall mean) × 100

If the CV is 20% or less, it is an indication of better precision of the experiment. When the CV is more than 20%, the experiment may be repeated and efforts made to reduce the error.

In case of a significant F, the null hypothesis is rejected. Then the problem is to know which of the treatment means are significantly different. The most commonly used test for this purpose is the least significant difference (LSD), otherwise known as the critical difference (CD).

The formula is CD = t × SE(d), where SE(d) is the standard error of the difference of means and t is the table value of t for the specified level of significance and the error degrees of freedom. If we are comparing the ith and jth treatment means,

SE(d) = √[MSE (1/ri + 1/rj)]

where ri and rj are the numbers of replications for the ith and jth treatments respectively. If the replications are equal, SE(d) = √(2MSE/r).

Two treatment means are significantly different if the difference between the treatment means is greater than the calculated CD value; otherwise they are not significantly different.
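A minimal sketch of the CD calculation; the tabular t value and MSE below are assumed illustrative numbers, not taken from any table:

```python
import math

def critical_difference(t_table, mse, ri, rj):
    """CD = t * SE(d), with SE(d) = sqrt(MSE * (1/ri + 1/rj))."""
    return t_table * math.sqrt(mse * (1 / ri + 1 / rj))

# assumed: t(0.05; 7 d.f.) ≈ 2.365, MSE = 0.857, replications 3 and 4
cd = critical_difference(2.365, 0.857, 3, 4)
print(round(cd, 3))
# any two treatment means whose difference exceeds cd differ significantly
```

With equal replications ri = rj = r, the call reduces to t × √(2MSE/r), matching the equal-replication formula above.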

Contrast

Sometimes the significance of combinations of treatment means is tested. Such a combination is called a contrast provided it satisfies certain conditions.

A contrast is a linear combination of treatment means or totals with known coefficients such that at least two of the coefficients are non-zero and the sum of the coefficients is always zero. Let T1, T2, T3, …, Tt be the t treatment means, each having an equal number of replications r. The linear combination

Z = c1T1 + c2T2 + … + ctTt

is a contrast if ci ≠ 0 for some i and Σi ci = 0, where i = 1, 2, …, t. In case all treatments do not have the same number of replications, say treatment Ti has ri replications, then for Z to be a contrast the condition is Σi ri ci = 0.

Eg: T1 − T2, T1 − 2T2 + T3

Orthogonal contrast

Two contrasts for the same set of treatment means, each of them having the same number of replications, are said to be orthogonal if the sum of the products of the corresponding coefficients is zero. The contrasts Z1 and Z2,

Z1 = c1T1 + c2T2 + … + ctTt and Z2 = d1T1 + d2T2 + … + dtTt,

are orthogonal if and only if c1d1 + c2d2 + … + ctdt = 0, i.e. Σi ci di = 0 for i = 1, 2, …, t.

Eg: Z1 = T1 − T3 and Z2 = T1 − 2T2 + T3 are orthogonal contrasts since (1)(1) + (0)(−2) + (−1)(1) = 0.

When treatment means are based on ri replications, the contrasts are orthogonal if Σi ri ci di = 0. For t treatments, there cannot be more than t − 1 mutually orthogonal contrasts.

Randomised Block Design (RBD)

For the application of CRD, the experimental material should be homogeneous. Usually, the experimental material is not as homogeneous as required. So the principle of local control is adopted and the experimental material is divided into homogeneous units having common characteristics which may influence the response under study. For example, animals of the same age or litter or weight may form blocks.

In RBD, the whole experimental material is divided into a number of blocks equal to the number of replications for each treatment. Then each block is divided into a number of experimental units equal to the number of treatments. The treatments are then randomly allocated separately within each of these blocks. Let t be the number of treatments and r be the number of blocks. Then RBD is defined as an arrangement of t treatments in r blocks such that each treatment occurs precisely once in each block. The randomization of treatments is done independently in each block.

Advantages

1. It increases the precision of the experiment. This is due to the reduction of experimental error by adoption of local control.
2. The amount of information obtained in RBD is more compared to CRD; hence RBD is more efficient than CRD.
3. Flexibility is another advantage of RBD. Any number of replications can be included in RBD. If large numbers of homogeneous units are available, a large number of treatments can be included in this design. The statistical analyses are simple and easy. Even when some observations are missing for certain treatments, the data can be analyzed by the use of the missing plot technique.

Disadvantage

When the number of treatments is increased, the block size will increase. If the block size is large, it may be difficult to maintain homogeneity within blocks. Consequently, the experimental error will be increased. Hence RBD may not be suitable for a large number of treatments.

Lay out of RBD

Each block is divided in to t units. The units in each block are numbered from 1 to t.
The treatments are also numbered conveniently. By using random number table, we
select t distinct random numbers from 1 to t. These random numbers correspond to the
treatment numbers. The first selected treatment is applied to the first unit of a block, the
second selected treatment to the second unit and so on. The randomization is done in
each block in the same way.

Analysis of RBD

For the analysis of RBD, we have the ANOVA model

yij = μ + ti + bj + eij

where yij is the value of the variate for the ith treatment in the jth block, μ is the general mean effect, ti is the effect due to the ith treatment, bj is the effect due to the jth block, and eij is the random error, which is assumed to be independently and normally distributed with mean zero and variance σe².

The results from an RBD can be arranged in a two-way table according to the replications (blocks) and treatments. There will be rt observations in total. The data arrangement is given below.

Treatments      Replications (blocks)
                1     2     3    …    r     Total
1              y11   y12   y13   …   y1r    T1
2              y21   y22   y23   …   y2r    T2
3              y31   y32   y33   …   y3r    T3
:               :     :     :         :      :
t              yt1   yt2   yt3   …   ytr    Tt
Total          B1    B2    B3    …   Br     G
The total variance is divided into three sources of variation: between blocks, between treatments, and error. The required sums of squares are obtained as follows:

Correction factor (CF) = G²/rt,

Total sum of squares (TSS) = Σi Σj yij² − CF,

Block sum of squares (SSB) = (1/t) Σj Bj² − CF,

Treatment sum of squares (SST) = (1/r) Σi Ti² − CF,

Error sum of squares (SSE) = TSS − SST − SSB.
The ANOVA table for testing the hypothesis H0: T1 = T2 = … = Tt against the alternative that they are not all equal is given below:

Source of variation    Degrees of    Sum of    Mean sum of              F          Table F
                       freedom       squares   squares
Blocks or              r−1           SSB       MSB = SSB/(r−1)          MSB/MSE    F(r−1, (r−1)(t−1))
replications
Treatments             t−1           SST       MST = SST/(t−1)          MST/MSE    F(t−1, (r−1)(t−1))
Error                  (r−1)(t−1)    SSE       MSE = SSE/[(r−1)(t−1)]
Total                  rt−1          TSS

If the calculated value of F = MSB/MSE is less than F(r−1, (r−1)(t−1)), there is no significant difference between blocks, and it is an indication that the RBD will not contribute to precision in detecting treatment differences. In such situations, the adoption of RBD in preference to CRD is not advantageous.

If the calculated value of F = MST/MSE is greater than the table value F(t−1, (r−1)(t−1)), the hypothesis H0 is rejected and it can be inferred that there is a significant difference between treatment means. Now, for comparing pairs of treatments, we can calculate

CD = t × SE(d)

where t is the table value of t for the specified level of significance and the error degrees of freedom, and SE(d) = √(2MSE/r).

The treatment means are given as Ti/r (i = 1, 2, …, t). Any two treatment means are said to differ significantly if their difference is larger than the critical difference.
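The RBD analysis above can be sketched compactly; the data are a made-up t × r table with treatments as rows and blocks as columns:

```python
def rbd_anova(data):
    """Two-way ANOVA for an RBD; data[i][j] is treatment i in block j."""
    t, r = len(data), len(data[0])
    grand = sum(sum(row) for row in data)
    cf = grand ** 2 / (r * t)
    tss = sum(y ** 2 for row in data for y in row) - cf
    sst = sum(sum(row) ** 2 for row in data) / r - cf        # treatments
    ssb = sum(sum(col) ** 2 for col in zip(*data)) / t - cf  # blocks
    sse = tss - sst - ssb
    mse = sse / ((r - 1) * (t - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse,
            "F_treat": (sst / (t - 1)) / mse,
            "F_block": (ssb / (r - 1)) / mse}

res = rbd_anova([[10, 12, 11],
                 [14, 15, 16],
                 [ 9, 10, 11]])  # 3 treatments in 3 blocks
print(res)
```

For these data, F for treatments is 63 and F for blocks is 7, each on (2, 4) degrees of freedom.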

Efficiency of Blocking

If F for replications is significant, blocking is considered to be effective in reducing the experimental error. The effect of blocking can also be ascertained by finding the relative efficiency of RBD over CRD. For this purpose the amounts of information of the two designs are compared. The amount of information is n/MSE. The MSE for CRD has to be estimated from the ANOVA of the RBD. Had the CRD been used instead of the RBD, the variation due to blocks would have been added to the variation due to extraneous factors. Hence the estimate of MSE for CRD is

MSE(CRD) = [nr·MSB + (nt + ne)·MSE] / (nr + nt + ne)

where MSB is the block mean square, MSE is the error mean square, nr is the block degrees of freedom, nt is the treatment degrees of freedom and ne is the error degrees of freedom. Let us denote the amounts of information of RBD and CRD as I(RBD) and I(CRD), respectively. The relative efficiency of RBD over CRD is obtained as RE(RBD) = [I(RBD)/I(CRD)] × 100. If RE is more than 100%, the excess is known as the gain in efficiency due to RBD.

When the error degrees of freedom is less than 20, the relative efficiency (RE) has to be adjusted by multiplying it by the precision factor. The precision factor is computed as

PF = [(ne + 1)(ne1 + 3)] / [(ne + 3)(ne1 + 1)]

where ne is the error degrees of freedom and ne1 = nr + ne is the error degrees of freedom for CRD.
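A sketch of the two formulas above; the mean squares and degrees of freedom are assumed values, not taken from any real analysis:

```python
def mse_crd_from_rbd(msb, mse, nr, nt, ne):
    """Estimated CRD error mean square recovered from an RBD ANOVA."""
    return (nr * msb + (nt + ne) * mse) / (nr + nt + ne)

def precision_factor(ne, nr):
    """Adjustment applied to RE when the error d.f. is below 20."""
    ne1 = nr + ne            # error d.f. the CRD would have had
    return (ne + 1) * (ne1 + 3) / ((ne + 3) * (ne1 + 1))

# assumed values: MSB = 3.0, MSE = 1.0, with nr = 4, nt = 4, ne = 16
mse_crd = mse_crd_from_rbd(3.0, 1.0, nr=4, nt=4, ne=16)
re = 100 * mse_crd / 1.0     # I(RBD)/I(CRD) = MSE_CRD / MSE_RBD
print(mse_crd, re * precision_factor(ne=16, nr=4))  # RE adjusted, as ne < 20
```

Since both designs have the same n, the ratio of informations reduces to the ratio of the two MSE estimates.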

Latin Square Designs (LSD)

To control two-way heterogeneity in the experimental material, LSD is used. To achieve this, two restrictions are imposed by forming blocks in two directions, row-wise and column-wise. For example, the age of animals may form the rows and the weight of animals may form the columns. Then treatments are allocated in such a way that every treatment occurs once and only once in each row and each column. The numbers of rows and columns are equal, hence the arrangement forms a square. Thus, a Latin square of size s is an arrangement of s Latin letters into s² positions such that every row and every column contains every treatment precisely once. Through the elimination of row and column effects, the error variation can be considerably reduced.

LSD is not suitable for fewer than five treatments. For a Latin square with five treatments, written as a 5×5 Latin square, the arrangement may be as follows:

A B C D E A B C D E
B C D E A B A E C D
C D E A B C D A E B
D E A B C D E B A C
E A B C D E C D B A

The selection of squares can be done from Fisher and Yates (1953) statistical tables.

Lay out of LSD

Some terminologies

Standard square:- A standard square is one in which the first row and first column are ordered alphabetically.

Conjugate square:- Two standard squares are said to be conjugate squares when the rows of one square are the columns of the other square.

If the number of treatments is t, then a t×t standard square is selected from Fisher and Yates (1953) statistical tables. The columns of the selected standard square are rearranged randomly by using random number tables. Then, keeping the first row of the rearranged square as such, the remaining rows are randomized. Treatments are then allocated as in the final arrangement.
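The randomization steps above can be sketched as follows; for self-containedness the sketch starts from a cyclic standard square rather than one drawn from Fisher and Yates' tables, and the seed is arbitrary:

```python
import random

def latin_square_layout(t, seed=2):
    """Randomize a t x t standard square: shuffle its columns, then
    shuffle all rows below the first row of the result."""
    rng = random.Random(seed)
    letters = [chr(ord("A") + k) for k in range(t)]
    # cyclic standard square: row i is the alphabet rotated left by i
    square = [[letters[(i + j) % t] for j in range(t)] for i in range(t)]
    cols = list(range(t))
    rng.shuffle(cols)                                # randomize columns
    square = [[row[c] for c in cols] for row in square]
    tail = square[1:]
    rng.shuffle(tail)                                # randomize rows 2..t
    return [square[0]] + tail

for row in latin_square_layout(5):
    print(" ".join(row))
```

Permuting whole columns and whole rows preserves the Latin property, so every row and every column of the result still contains each letter exactly once.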

Analysis of LSD

The ANOVA model for LSD is

yijk = μ + ri + cj + tk + eijk

where yijk is the observation on the kth treatment in the ith row and jth column (i, j, k = 1, 2, …, t), μ is the general mean effect, ri is the effect due to the ith row, cj is the effect due to the jth column, tk is the effect due to the kth treatment, and eijk is the random error, which is assumed to be independently and normally distributed with mean zero and variance σe².

The results of an LSD will be in the form of two-way tables according to rows and columns. The results have to be arranged according to treatments also. Let Ri denote the ith row total, Cj the jth column total, Tk the kth treatment total and G the grand total. The different sums of squares for a t × t LSD can be obtained as follows:

Correction factor (CF) = G²/t²

Total sum of squares (TSS) = Σ yijk² − CF

Sum of squares due to rows (SSR) = Σ Ri²/t − CF

Sum of squares due to columns (SSC) = Σ Cj²/t − CF

Sum of squares due to treatments (SST) = Σ Tk²/t − CF

Sum of squares due to error (SSE) = TSS − SSR − SSC − SST

For testing the hypothesis H0:T1=……..=Tt against the alternative T’s are not all equal,
the ANOVA table is as given below

Source of      Degrees of     Sum of     Mean sum of               F
variation      freedom        squares    squares
Rows           t-1            SSR        MSR = SSR/(t-1)           MSR/MSE
Columns        t-1            SSC        MSC = SSC/(t-1)           MSC/MSE
Treatments     t-1            SST        MST = SST/(t-1)           MST/MSE
Error          (t-1)(t-2)     SSE        MSE = SSE/[(t-1)(t-2)]
Total          t²-1           TSS

If F is not significant for treatments, we conclude that the treatment effects do not
differ significantly. If F is significant, we calculate CD = t x SE(d), where t denotes
the table value of t for the specified level of significance and the error degrees of
freedom, and

SE(d) = √(2MSE/t), where t = number of rows or columns.
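As an illustrative sketch of these computations (the 4 x 4 square and its yields below are invented for the example; NumPy is assumed available):

```python
import numpy as np

# Hypothetical 4 x 4 Latin square: y[i, j] is the yield in row i,
# column j, and trt[i, j] is the treatment applied in that cell.
y = np.array([[10.5, 11.2,  9.8, 12.0],
              [11.0, 10.1, 12.3,  9.5],
              [ 9.9, 12.5, 10.8, 11.1],
              [12.2,  9.7, 11.4, 10.6]])
trt = np.array([[0, 1, 2, 3],
                [1, 2, 3, 0],
                [2, 3, 0, 1],
                [3, 0, 1, 2]])

t = y.shape[0]
G = y.sum()
CF = G**2 / t**2
TSS = (y**2).sum() - CF
SSR = (y.sum(axis=1)**2).sum() / t - CF                # from row totals Ri
SSC = (y.sum(axis=0)**2).sum() / t - CF                # from column totals Cj
Tk = np.array([y[trt == k].sum() for k in range(t)])   # treatment totals
SST = (Tk**2).sum() / t - CF
SSE = TSS - SSR - SSC - SST

MSE = SSE / ((t - 1) * (t - 2))
F = (SST / (t - 1)) / MSE                              # F for treatments
```

The identity TSS = SSR + SSC + SST + SSE provides a check on the arithmetic.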

Efficiency of LSD

In estimating efficiency of LSD over RBD , we have to consider the type of blocks. If
LSD had been RBD with columns as blocks it is termed as column blocking. Similarly,
if LSD had been RBD with rows as blocks, it is termed as row blocking. When we
resort to column blocking, variation due to rows will be added to error variation.
Hence, in case of column blocking the estimate of MSE for RBD is given by

MSE(RBD) = [nr·MSR + (nt + ne)·MSE] / (nr + nt + ne)

Similarly, in case of row blocking,

MSE(RBD) = [nc·MSC + (nt + ne)·MSE] / (nc + nt + ne)

where ne, nt, nr and nc represent the degrees of freedom for error, treatment, rows
and columns respectively.

The RE of LSD over RBD is RE = [I(LSD)/I(RBD)] x 100 = [MSE(RBD)/MSE(LSD)] x 100.

When the error degrees of freedom is less than 20, a precision factor is to be taken
into account. The precision factor is

PF = [(ne + 1)(ne′ + 3)] / [(ne + 3)(ne′ + 1)]

where ne is the error degrees of freedom for LSD and ne′ = nr + ne = nc + ne is the
error degrees of freedom for RBD.

For estimating the efficiency of LSD over CRD, the estimate of MSE for CRD is

MSE(CRD) = [nr·MSR + nc·MSC + (nt + ne)·MSE] / (nr + nc + nt + ne)

The RE of LSD over CRD is RE = [I(LSD)/I(CRD)] x 100 = [MSE(CRD)/MSE(LSD)] x 100.

In the PF formula, the error degrees of freedom for CRD will be ne′ = nr + nc + ne.
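The efficiency computation can be sketched as a small function; the mean squares passed in below are hypothetical values, not from the text:

```python
def re_lsd_over_rbd(msr, msc, mse, t, blocking="column"):
    """Relative efficiency (%) of a t x t LSD over an RBD, following
    the formulas above. A sketch; msr, msc, mse are example values."""
    ne, nt = (t - 1) * (t - 2), t - 1          # error and treatment df (LSD)
    nr = nc = t - 1                            # row and column df
    if blocking == "column":                   # rows pooled into error
        mse_rbd = (nr * msr + (nt + ne) * mse) / (nr + nt + ne)
    else:                                      # row blocking: columns pooled
        mse_rbd = (nc * msc + (nt + ne) * mse) / (nc + nt + ne)
    re = 100 * mse_rbd / mse
    if ne < 20:                                # apply the precision factor
        ne1 = nr + ne                          # error df of the RBD
        re *= ((ne + 1) * (ne1 + 3)) / ((ne + 3) * (ne1 + 1))
    return re

print(round(re_lsd_over_rbd(8.0, 3.0, 2.0, 5), 1))  # 155.0
```

With t = 5 the LSD error df is 12 (< 20), so the precision factor is applied automatically.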

UNIT II

Missing Plot Technique in RBD

With a single missing value in an RBD with t treatments and r replications, the first
step is to estimate the missing value by using the formula

X = (rB′ + tT′ − G′) / [(r − 1)(t − 1)]

where X is the estimate of the missing value, B′ is the total of the available values
in the block with the missing value, T′ is the total of the available values in the
treatment with the missing value, and G′ is the grand total of all available values.

The analysis is then carried out as usual after substituting the estimated value of the
missing observation, with the following changes:

1. One degree of freedom is subtracted from the degrees of freedom corresponding to
   the error sum of squares (SSE) and the total sum of squares (TSS).

2. The treatment sum of squares (SST) is adjusted by subtracting from it the quantity

   Bias = [B′ − (t − 1)X]² / [t(t − 1)]  or, equivalently,  [B′ + tT′ − G′]² / [t(t − 1)(r − 1)²].

3. For comparing the mean of the treatment with the missing value and the mean of any
   other treatment,

   SE(d) = √{ (MSE/r) [2 + t/((r − 1)(t − 1))] }.

   The SE(d) formula for other comparisons is the usual one.
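A minimal sketch of the estimate and the bias correction; the totals below (written B1, T1, G1 for B′, T′, G′) are invented for illustration:

```python
def missing_value_rbd(r, t, B1, T1, G1):
    """Estimate of a single missing value in an RBD (formula above).
    B1, T1, G1: available totals of the affected block, the affected
    treatment, and the whole experiment."""
    return (r * B1 + t * T1 - G1) / ((r - 1) * (t - 1))

def sst_bias(r, t, B1, T1, G1):
    """Upward bias to be subtracted from the treatment SS."""
    return (B1 + t * T1 - G1) ** 2 / (t * (t - 1) * (r - 1) ** 2)

# Hypothetical example: 4 blocks, 5 treatments, one plot missing
X = missing_value_rbd(r=4, t=5, B1=42.0, T1=31.0, G1=190.0)
bias = sst_bias(r=4, t=5, B1=42.0, T1=31.0, G1=190.0)
```

Here X = (4·42 + 5·31 − 190)/12 ≈ 11.08 and the bias is 49/180 ≈ 0.27.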

Missing Plot technique in LSD

The procedure is first to obtain the estimate of the missing value X by the formula

X = [t(R′ + C′ + T′) − 2G′] / [(t − 1)(t − 2)]

where t is the number of treatments, R′ is the total of the available observations in
the row with the missing value, C′ is the total of the available observations in the
column with the missing value, T′ is the total of the available observations for the
treatment with the missing value, and G′ is the grand total of all the available
observations.

The estimated missing value is then inserted and the analysis is carried out according
to the usual procedure for an LSD, except for subtracting one degree of freedom each
from the degrees of freedom for the total sum of squares and the error sum of squares.

The upward bias in SST is computed by using the formula

Upward bias = [G′ − R′ − C′ − (t − 1)T′]² / [(t − 1)(t − 2)]².

The SE of the difference between the mean of the treatment with the missing value and
the mean of any other treatment is given by

SE(d) = √{ (MSE/t) [2 + t/((t − 1)(t − 2))] }.
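The LSD counterpart can be sketched the same way; the totals below (R1, C1, T1, G1 for R′, C′, T′, G′) are invented:

```python
def missing_value_lsd(t, R1, C1, T1, G1):
    """Estimate of a single missing value in a t x t LSD (formula above)."""
    return (t * (R1 + C1 + T1) - 2 * G1) / ((t - 1) * (t - 2))

def sst_upward_bias(t, R1, C1, T1, G1):
    """Upward bias in the treatment SS after inserting the estimate."""
    return (G1 - R1 - C1 - (t - 1) * T1) ** 2 / ((t - 1) * (t - 2)) ** 2

# Hypothetical 5 x 5 square with one value missing
X = missing_value_lsd(t=5, R1=40.0, C1=38.0, T1=36.0, G1=240.0)  # 7.5
```

Here X = (5·114 − 480)/12 = 7.5 and the upward bias is 18²/144 = 2.25.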

Analysis of covariance

In many experiments, for each experimental unit we have observations on one or more
supplementary variables in addition to the response variable. These are usually called
concomitant observations or concomitant variables. If the concomitant variables are
unrelated to the treatments but influence the response variable, the variation in the
response variable caused by them should be eliminated before comparing the
treatments. For example, in animal feeding experiments designed to compare the effect
of different diets on growth, the initial weight of animal is expected to affect the
increase in weight recorded at the end of the experimental period and therefore
comparison of diets should be made after the variation in weight increase resulting
from the difference in initial weights of animal has been eliminated.

The technique of analysis used to eliminate the variation resulting from the influence
of concomitant variable on the response variable is called Analysis of covariance. It is
an adaptation of methods of regression analysis to experimental designs and consists
essentially of fitting a regression of the response variable on the concomitant
variables. The response variables are then 'adjusted' by regression, and comparisons of
treatments are carried out by using the adjusted response variables. At the same time,
variation due to
non-uniformity of the experimental material is eliminated by the use of suitable design.

Assumptions of Analysis of covariance


1. The relationship between covariate X and response variable Y is linear.
2. The relationship is same for each treatment.
3. The covariates are not affected by treatments.
4. The observations are from a normal population.

Analysis of covariance for CRD

For the CRD with t treatments and ri replications for the ith treatment,
i = 1, 2, ..., t, where Σ ri = n, the model is

yij = μ + ti + β(xij − x̄) + eij

where yij is the jth observation on the response variable taken under the ith
treatment, μ is the general effect, ti is the ith treatment effect, β is the linear
regression coefficient indicating the dependence of yij on xij, xij is the observation
on the concomitant variable corresponding to yij, x̄ is the mean of the xij values, and
eij is the random error component, which is independently and normally distributed
with mean zero and variance σ².
The data arrangement is as given below:

Treatment        1             2          ....         t
               y     x       y     x                 y     x
              y11   x11     y21   x21               yt1   xt1
              y12   x12     y22   x22               yt2   xt2
               .     .       .     .                 .     .
Total         Ty1   Tx1     Ty2   Tx2               Tyt   Txt     Grand totals: GTy, GTx

The first step in the analysis of covariance is to compute the sum of squares for the
variable (Y) and the covariate (X) as well as the sum of products for Y and X. The
sums of squares for both Y and X are computed in the usual manner for a CRD. The
sum of products (SP) of Y and X is computed as follows:

CFyx = (GTy · GTx) / n

Total SP, Gyx = Σ yij·xij − CFyx

Treatment SP, Tyx = Σ (Tyi·Txi)/ri − CFyx

Error SP, Eyx = Gyx − Tyx

Analysis on Y:

CFy = GTy²/n

Gyy = TSSy = Σ yij² − CFy

SST = Tyy = Σ Tyi²/ri − CFy

SSE = Eyy = Gyy − Tyy

Analysis on X:

CFx = GTx²/n

Gxx = TSSx = Σ xij² − CFx

Txx = SSTx = Σ Txi²/ri − CFx

Exx = SSEx = TSSx − SSTx = Gxx − Txx

The next step is to verify whether the covariate is affected by the treatments. If X is
not affected by the treatments, there should not be a significant difference between
treatments with reference to X.

Then the regression coefficient within treatments is computed as β = Eyx/Exx. The
significance of β is tested using the F-test. The test statistic is

F = (mean square due to β) / (adjusted error mean square)

  = [Eyx²/Exx] / { [Eyy − Eyx²/Exx] / (n − t − 1) }

This F follows an F distribution with 1 and (n − t − 1) degrees of freedom. If the
regression coefficient is significant, we proceed to make adjustments for the variate.
If it is not significant, it is not worthwhile to make the adjustments.

The adjusted values for the variable Y are then computed as follows:

G′yy = Gyy − Gyx²/Gxx

E′yy = Eyy − Eyx²/Exx

T′yy = G′yy − E′yy

One degree of freedom is lost in error due to fitting the regression line.

The analysis of covariance table is given below:

Source      df     Sums of products       Adjusted values for Y
                   YY     XX     XY       df       Sum of    Mean sum of           F-ratio
                                                   squares   squares
Total       n-1    Gyy    Gxx    Gyx      n-2      G′yy
Treatment   t-1    Tyy    Txx    Tyx      t-1      T′yy      MST = T′yy/(t-1)      MST/MSE
Error       n-t    Eyy    Exx    Eyx      n-t-1    E′yy      MSE = E′yy/(n-t-1)

If F is not significant for treatments, we conclude that the treatment effects do not
differ significantly. If F is significant, we calculate CD = t x SE(d), where t denotes
the table value of t for the specified level of significance and the error degrees of
freedom.

Then the adjusted treatment means are obtained from the formula

ȳ′i = ȳi − β(x̄i − x̄)

and the standard error of the difference between two adjusted means is given by

SE(d) = √{ MSE [ 1/ri + 1/rj + (x̄i − x̄j)²/Exx ] }.
 

When the number of replications is the same for all treatments and when averaged over
all values of (x̄i − x̄j)²,

SE(d) = √{ (2MSE/r) [1 + Txx/((t − 1)Exx)] }.

Two treatments are significantly different if the difference between their adjusted
means is greater than the calculated CD value; otherwise they are not significantly
different.
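The whole ANCOVA computation for a CRD can be sketched end to end. The data below are invented (three hypothetical diets, four animals each, x = initial weight, y = weight gain); NumPy is assumed available:

```python
import numpy as np

# Hypothetical CRD: rows are treatments, columns are replications.
x = np.array([[10., 12., 11., 13.],
              [ 9., 11., 10., 12.],
              [14., 13., 12., 15.]])          # covariate
y = np.array([[20., 24., 22., 25.],
              [18., 21., 20., 23.],
              [27., 25., 24., 29.]])          # response

t, r = y.shape
n = t * r
CFy, CFx = y.sum()**2 / n, x.sum()**2 / n
CFyx = y.sum() * x.sum() / n

Gyy = (y**2).sum() - CFy                      # total SS for Y
Gxx = (x**2).sum() - CFx                      # total SS for X
Gyx = (y * x).sum() - CFyx                    # total SP
Tyy = (y.sum(axis=1)**2).sum() / r - CFy      # treatment SS for Y
Txx = (x.sum(axis=1)**2).sum() / r - CFx      # treatment SS for X
Tyx = (y.sum(axis=1) * x.sum(axis=1)).sum() / r - CFyx
Eyy, Exx, Eyx = Gyy - Tyy, Gxx - Txx, Gyx - Tyx

beta = Eyx / Exx                              # within-treatment regression
Gyy_adj = Gyy - Gyx**2 / Gxx                  # adjusted total SS
Eyy_adj = Eyy - Eyx**2 / Exx                  # adjusted error SS (df n-t-1)
Tyy_adj = Gyy_adj - Eyy_adj                   # adjusted treatment SS
F = (Tyy_adj / (t - 1)) / (Eyy_adj / (n - t - 1))
```

Note that the adjusted error SS is never larger than the unadjusted one, since Eyx²/Exx ≥ 0.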

Assignment: Analysis of covariance for RBD and LSD

UNIT III

Factorial Experiments

Experiments are characterized by the nature of treatments under investigation and the
nature of comparisons required. There are three main types of experiments.

1. Single factor experiments


2. Factorial experiments and
3. Biological assays

The treatments in a single factor experiment are the different levels of the same
factor, for example, several feeds for animals or different doses of a drug. The main
purpose of such an experiment is to compare the treatments in all possible pairs. Thus,
when the treatments consist of different levels of a single variable factor and all
other factors are kept at a single prescribed level, it is known as a single factor
experiment.

When several factors are investigated simultaneously in a single experiment,


such experiments are known as factorial experiments. The comparison required in this
type of experiments are the effect of different levels of the factors and their combined
effect termed as main effects and interaction effects respectively. For example, in an
experiment to test the effect of protein content and type of feed on milk yield of dairy
cows, the first factor is protein content and the second is type of feed. Protein content is
defined at three levels, and five types of feed are used. Each cow in the experiment
receives one of the 15 protein x feed combinations. An objective could be to determine
whether the cows' response to different protein levels differs with different feeds.
This is the analysis of interaction. The experiment is called a 3 x 5 factorial
experiment.

A factorial experiment is named based on the number of factors and the levels
of each factor. For example, if there are four factors at two levels, the experiment is
known as 24 factorial experiment and if there are two factors at three levels, it is known
as 32 factorial experiment. In general if there are n factors each with p levels , then it is
known as pn factorial experiment. If there are three factors one at two levels, second at
3 levels and third at 4 levels, it is a 2 x 3 x 4 factorial experiment.

If the number of levels of each factor in an experiment is the same, the
experiment is called symmetrical factorial; otherwise, it is called asymmetrical factorial
or mixed factorial.

Advantages of Factorial experiments


1. We can study the individual effects as well as their interactions. In many
biological and clinical trails factors are likely to have interaction. Therefore,
factorial types of experiments are more informative in such investigations.
2. The factorial experiment will result in considerable saving of experimental
resources. For example, in the experiment to test the effect of protein content and
feed on milk yield of cows, 3 levels of protein and 4 types of feed are used. For a
satisfactory precision, we require at least 12 degrees of freedom for error variance.
Hence we require 5 replication for experiment on feed alone. The total number of
plots required for the experiment on feed is 20. Similarly, the total number of
plots required for the experiment on protein is 21. We need 41 plots for the two
experiments. But for conducting a factorial experiment with 12 combination of the
two factors we require only 36 plots for similar precision.
3. Time required for factorial experiment is less than that required for separate
experiment.
4. When we have several related factors, single factor experiments might be
unsatisfactory because of changes in environmental conditions. But if factorial
experiment is used, such difficulty will not arise.
Disadvantage
When the number of factors or levels of factors or both are increased, the
number of treatment combination will increase. Consequently, block size will increase.
Then, it may be difficult to ensure the homogeneity of the experimental material. This
will lead to increased experimental error and loss of precision in the experiment.

When several treatment combinations are involved, execution of the experiment and
statistical analysis become complex.

Factorial experiment with factors at two levels


The simplest factorial experiment is 22, an experiment with two factors each at two
levels. The levels of a factor may be its presence and absence or a high and a low dose
or two modes of application of a technique. We denote the factors in a general way by
the letters A and B, and the levels of each factor by 0 and 1. The treatment
combinations can therefore be written as a0b0, a1b0, a0b1 and a1b1, or 00, 10, 01 and
11, or (1), a, b and ab. The symbol (1) denotes that both factors are at the lower
level in the combination and is called the control treatment.

When there are three factors each at two levels, the factorial is denoted by 2³ and
there are eight treatment combinations; the factors are denoted by A, B and C. The
general factorial with n factors each at two levels is denoted by 2^n. For a 2² or 2³
factorial, any of the three designs CRD, RBD or LSD can be used, but the analysis
involves further partitioning of the treatment sum of squares to obtain the main effect
and interaction variation of the factors.

The main effect and interaction effect in factorial experiment can be computed by
many methods.

1. Fisher’s algebraic methods

Suppose that we have two factors A and B, each at two levels. The treatment
combinations are represented as

a0b0, a1b0, a0b1, a1b1   or   (1), a, b, ab

The simple effect of A at level b0 = a1b0 − a0b0 = a − (1)
The simple effect of A at level b1 = a1b1 − a0b1 = ab − b
The simple effect of B at level a0 = a0b1 − a0b0 = b − (1)
The simple effect of B at level a1 = a1b1 − a1b0 = ab − a

The main effect of A = average of the simple effects of A
                     = ½[a − (1) + ab − b]
                     = ½[ab + a − b − (1)] = ½(a − 1)(b + 1)

The main effect of B = ½[b − (1) + ab − a]
                     = ½[ab + b − a − (1)] = ½(a + 1)(b − 1)

The interaction effect AB = ½[simple effect of A at b1 − simple effect of A at b0]
                          = ½[(ab − b) − (a − (1))] = ½[ab − a − b + (1)]
                          = ½(a − 1)(b − 1)
                          or ½[simple effect of B at a1 − simple effect of B at a0]

In general, for a 2^n factorial experiment, we can write

Effect of X = [1/2^(n−1)] (a ± 1)(b ± 1)(c ± 1)(d ± 1) ...

where n is the number of factors, and the sign in each bracket is negative if the
corresponding capital letter is present in X and positive if it is absent.

For example, if there are three factors A, B and C, each at two levels, we have

Main effect of A = (1/2²)(a − 1)(b + 1)(c + 1)
                 = ¼[abc + ab + ac + a − bc − b − c − (1)]

Effect of BC = (1/2²)(a + 1)(b − 1)(c − 1)
             = ¼[abc + bc + a + (1) − ab − ac − b − c]

Effect of ABC = (1/2²)(a − 1)(b − 1)(c − 1)
              = ¼[abc + a + b + c − ab − ac − bc − (1)]

The above method can be simplified using tabular form

Factorial Treatment combination


effect (1) a b ab Divisor
G + + + + 4
A - + - + 2
B - - + + 2
AB + - - + 2

The sign + is given for an effect if the corresponding small letter is present and – if it
is absent. The sign for interaction is the product of the corresponding signs for the
individual letters. If there is replication, then the divisor will be multiplied by r where r
is the number of replications.
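The table of signs can be generated mechanically. The sketch below (with invented treatment totals) implements the rule just stated: a minus sign appears wherever a letter of the effect is absent from the treatment combination:

```python
from itertools import product

def factorial_effect_totals(totals, factors="AB"):
    """Effect totals for a 2^n factorial from the table of signs.
    `totals` maps each treatment combination ('1', 'a', 'b', 'ab', ...)
    to its total over all replications. An illustrative sketch."""
    n = len(factors)
    effects = {}
    for inx in product([0, 1], repeat=n):            # one row per effect
        name = "".join(f for f, s in zip(factors, inx) if s) or "G"
        total = 0.0
        for lev in product([0, 1], repeat=n):        # one column per treatment
            combo = "".join(f.lower() for f, l in zip(factors, lev) if l) or "1"
            # minus sign for every letter of the effect whose small
            # letter is absent from the treatment combination
            absent = sum(s for s, l in zip(inx, lev) if l == 0)
            total += (-1 if absent % 2 else 1) * totals[combo]
        effects[name] = total
    return effects

eff = factorial_effect_totals({"1": 10, "a": 14, "b": 12, "ab": 18})
# eff["A"], eff["B"], eff["AB"] are the effect totals [A], [B], [AB];
# with r replications, SS(X) = [X]**2 / (2**n * r).
```

For the totals above the effect totals come out as [A] = 10, [B] = 6, [AB] = 2.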

For a 2^n experiment with r replications,

SS(X) = [effect total of X]² / (2^n · r)

where the effect total of X is obtained by applying the expansion (a ± 1)(b ± 1)... to
the treatment totals.

2. Yates' technique to determine sums of squares

In the first column, we write the treatment combinations in the standard order. In the
second column, against each treatment combination, we write the corresponding total
yields from all replicates. The entries in the third column are obtained in two parts:
the first half by writing the pairwise sums of the entries in the second column, and
the second half by subtracting the first entry in each pair from the second. In a 2^n
experiment, the procedure is repeated n times. For example, in a 2² experiment:

Treatments   Total   (1)       (2)               Main effect   Sum of squares
(1)          T1      T1+T2     T1+T2+T3+T4       G
a            T2      T3+T4     T2−T1+T4−T3       [A]           [A]²/(2²r)
b            T3      T2−T1     T3+T4−T1−T2       [B]           [B]²/(2²r)
ab           T4      T4−T3     T4−T3−T2+T1       [AB]          [AB]²/(2²r)

In general, the sum of squares is obtained as SS(X) = [X]²/(r·2^n), where [X] is the
entry in the final column for effect X.
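Yates' procedure is easy to mechanize; a sketch (the treatment totals used in the example are invented):

```python
def yates(totals):
    """Yates' algorithm: `totals` are treatment totals in standard
    order ((1), a, b, ab, c, ac, bc, abc, ...). Returns the final
    column of effect totals, G first. A sketch for 2^n factorials."""
    col = list(totals)
    n = len(col).bit_length() - 1                    # number of factors
    for _ in range(n):                               # n passes of sums/diffs
        half = len(col) // 2
        sums = [col[2 * i] + col[2 * i + 1] for i in range(half)]
        diffs = [col[2 * i + 1] - col[2 * i] for i in range(half)]
        col = sums + diffs
    return col

# 2^2 example: totals of (1), a, b, ab over r replications
G, A, B, AB = yates([10, 14, 12, 18])                # -> 54, 10, 6, 2
# SS(X) = X**2 / (r * 2**n) for each effect total except G
```

The same totals fed to the table of signs give identical effect totals, which is a useful cross-check.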

Analysis
Suppose that we have a factorial RBD with two factors each at two levels. Then the
ANOVA model is

yijk = μ + ri + aj + bk + (ab)jk + eijk

where the terms have the usual meaning. The degrees of freedom associated with a factor
equal its number of levels minus one. For an interaction, the degrees of freedom are
the product of the degrees of freedom of the individual factors in that interaction.

Then the split up of degrees of freedom for ANOVA table is given as

Source Degrees of freedom Sum of squares
Blocks r-1
Treatment
A 1
B 1
AB 1
Error (r-1)(22-1)
Total 22.r-1

For a 2^n factorial experiment with r replications, error degrees of
freedom = (2^n − 1)(r − 1) and total degrees of freedom = 2^n·r − 1.

SE(d) of A = √[2MSE/(r·2)]

SE(d) of B = √[2MSE/(r·2)]

SE(d) of AB = √(2MSE/r)

In general, for a factorial experiment,

SE(d) of X = √[2MSE/(r·D)]

where X is the main factor or interaction, D is the product of the levels of the
factors left out of X, and r is the number of replications.

3n Factorial

This is a factorial arrangement with n factors at three levels. The three levels of the
factor can be low, intermediate and high. The levels will be denoted as 0 for low, 1 for
intermediate and 2 for high. Factors and interaction will be denoted by capital letters.
A treatment combination is denoted by n digits, where the first digit gives the level
of factor A, the second digit that of factor B, and so on. For example, 102 denotes A
at the intermediate level, B at the low level and C at the high level.

Sum of squares in Asymmetrical factorial Experiment

In an asymmetrical factorial experiment, all the factors are not at the same number of
levels. Suppose that in an experiment there are p levels of factor A and q levels of
factor B. Then such experiment is a p x q factorial experiment.

The ANOVA model for this experiment is

yijk = μ + ri + aj + bk + (ab)jk + eijk

where the terms have the usual meaning. TSS and the SS due to replications (SSR) are
found in the usual way. The degrees of freedom associated with a factor equal its
number of levels minus one.

In order to compute main effect sum of squares & interaction effect sum of squares in
an asymmetrical factorial experiment, first prepare a two-way table of factors as given
below. The values in the table are the totals of all the treatments over ‘r’ replications
arranged in the form of a table

Factor A Factor B Total


1 2 …………..q
1 y.11 y.12. ……………….y.1q A1
2 y.21 y.22 y.2q A2
.
.
.
.
p y.p1 y.p2 y.pq Ap
Total B1 B2 Bq G
CF = G²/(pqr)

Sum of squares of factor A, SSA = Σ Ai²/(qr) − CF

Sum of squares of factor B, SSB = Σ Bj²/(pr) − CF

A x B total table sum of squares = Σ y.ij²/r − CF

Sum of squares of interaction AB, SSAB = A x B total table SS − SSA − SSB

SSE = TSS − SSR − SSA − SSB − SSAB

The ANOVA table can be completed with these results

Source Degrees of Sum of Mean sum of F-Ratio
freedom squares squares
Blocks r-1 SSR MSR MSR/MSE
Treatment
A p-1 SSA MSA MSA/MSE
B q-1 SSB MSB MSB/MSE
AB (p-1)(q-1) SSAB MSAB MSAB/MSE
Error (pq-1)(r-1) SSE MSE
Total pqr-1

The interpretation of results is done in the usual way.
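These sums of squares can be sketched for a hypothetical 3 x 4 factorial in two replications (the treatment totals below are invented; NumPy assumed):

```python
import numpy as np

# Hypothetical p x q factorial in an RBD: tot[j, k] is the total of
# treatment combination (Aj, Bk) over the r replications.
p, q, r = 3, 4, 2
tot = np.array([[12., 15., 14., 18.],
                [11., 13., 16., 17.],
                [14., 16., 15., 20.]])

G = tot.sum()
CF = G**2 / (p * q * r)
SSA = (tot.sum(axis=1)**2).sum() / (q * r) - CF      # from A totals
SSB = (tot.sum(axis=0)**2).sum() / (p * r) - CF      # from B totals
table_SS = (tot**2).sum() / r - CF                   # A x B total table SS
SSAB = table_SS - SSA - SSB                          # interaction SS
```

The A x B table SS always decomposes exactly into SSA + SSB + SSAB, which is a convenient arithmetic check.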

Asymmetrical factorial experiment with three factors

Suppose that we have a factorial RBD with three factors A, B, C having p, q and s
levels respectively. There will be pqs treatment combination. The ANOVA model for
this experiment is

yijkl = μ + ri + aj + bk + cl + (ab)jk + (ac)jl + (bc)kl + (abc)jkl + eijkl

The terms (ab), (ac), (bc) and (abc) are interaction effects; the other terms have the
usual meaning. The degrees of freedom associated with a factor equal its number of
levels minus one.

For interaction the degrees of freedom will be the product of the degrees of freedom of
the individual factors of that interaction.

Factorial effect Degrees of


freedom
A p-1
B q-1
C s-1
AB (p-1)(q-1)
AC (p-1)(s-1)
BC (q-1)(s-1)
ABC (p-1)(q-1)(s-1)

The ANOVA table can be completed with these results

Source Degrees of Sum of Mean sum of F-Ratio
freedom squares squares
Blocks r-1 SSB
Treatment
A p-1 SSA
B q-1 SSB
C s-1 SSC
AB . SSAB
. .
. .
. .
Error (pqs-1)(r-1) SSE
Total pqsr-1
The general formula is SE(d) of X = √[2MSE/(r·D)], where X is the main factor or
interaction, D is the product of the levels of the factors left out of X, and r is the
number of replications. For example,

SE(d) for A = √[2MSE/(r·q·s)]

SE(d) for AB = √[2MSE/(r·s)]

SE(d) for ABC = √(2MSE/r)

Confounding

In factorial experiments, when the number of factors or the levels of factors are
increased the number of treatment combinations increases rapidly. For example, in 24
factorial experiments, there are 16 treatment combinations. So large blocks have to be
used and it will be difficult to ensure homogeneity within the blocks. In such situations,
we use an incomplete factorial which investigates the main effects of the factors and the
more important interactions under uniform condition by suitably subdividing the
experimental material to smaller homogeneous blocks. The heterogeneity of blocks is
allowed to affect only interactions which are likely to be unimportant. The process by
which unimportant comparisons are deliberately confused or mixed up with block
comparisons for the purpose of assessing more important comparison with greater
precision is called confounding. Confounding may also be defined as a technique for

arranging complete factorial experiment in blocks where the block size is smaller than
the number of treatment combination in one replicate.

In confounded experiment, each replication is split in to a number of blocks.


Then the entire number of treatment combination is divided in to a number of sets. The
number of sets is equal to number blocks. Each block is allotted one set of treatment
combination at random. Within each block, the treatment combinations forming the set
are allotted at random to the plots. For example for a 22 factorial experiment, with
factors A & B, there are 4 treatment combinations and the replication can be divided in
to two blocks. Suppose that interaction AB is confounded. Then the four combinations
can be divided into two sets. Recall that the interaction AB is estimated from
ab − a − b + (1).

Then one set can be ab, (1) and the other set a, b.

The arrangement, in one replication may be

Replication 1
Block 1 Block 2
a ab
b (1)
For a 2³ factorial experiment, suppose that each replicate is divided into two blocks
of four units each and the interaction ABC is confounded. The interaction ABC is
estimated from abc + a + b + c − ab − ac − bc − (1), so the two sets are
{abc, a, b, c} and {ab, ac, bc, (1)}.
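The division into sets follows from the signs of the confounded contrast: the key block consists of the combinations sharing an even number of letters with the confounded interaction. A sketch:

```python
from itertools import product

def confounded_blocks(factors, interaction):
    """Split the 2^n treatment combinations into the two blocks obtained
    by confounding `interaction` (e.g. 'ABC' for factors 'ABC'). The key
    block, containing (1), holds the combinations with an even number of
    letters in common with the interaction. An illustrative sketch."""
    key, other = [], []
    for lev in product([0, 1], repeat=len(factors)):
        combo = "".join(f.lower() for f, l in zip(factors, lev) if l) or "(1)"
        shared = sum(l for f, l in zip(factors, lev) if f in interaction)
        (key if shared % 2 == 0 else other).append(combo)
    return key, other

key, other = confounded_blocks("ABC", "ABC")
# key  -> ['(1)', 'bc', 'ac', 'ab'], other -> ['c', 'b', 'a', 'abc']
```

The same function reproduces the 2² example: confounding AB places (1) and ab in one block, a and b in the other.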

If there are three replicates arrangement can be

Replication 1 Replication 2 Replication 3


abc ab c ac a bc
a ac abc (1) b ab
b bc b ab abc (1)
c (1) a bc c ac

The interaction effect ABC is inseparably mixed up, or completely confounded, with
block effects. When a certain treatment effect is confounded in all replications, the
system of confounding is known as complete confounding. In complete confounding, no
information can be gained about the confounded treatment effects.

If a treatment effect is confounded in some replications and unconfounded in other
replications, the system is known as partial confounding. For example, consider a 23
factorial experiment with three replications. The arrangement of treatment combination
may be as follows.

Replication 1 Replication 2 Replication 3


abc a (1) bc ac c
c ac b c (1) abc
ab b abc a ab b
(1) bc ac ab bc a

In replication 1 , interaction AB is confounded with block effects

In replication 2 , interaction AC is confounded and in replication 3 interaction ABC is


confounded. Hence it is partial confounding. The interaction effect AB can be estimated
from Replications 2 and 3, the interaction effect AC from Replications 1 and 3, and the
interaction ABC from Replications 1 and 2. The block containing (1) is called the key
block. Usually, confounding is adopted only when the number of treatment combinations
exceeds nine.

The analysis of confounded factorial experiments involves the same principles as other
factorial analyses. Among the sources of variation, the component 'blocks within
replications' is added. Let r be the number of replications, b = 2^(n−p) the number of
blocks in a replication, and t = 2^p the number of treatment combinations in a block.

Source of variation           Degrees of      Complete               Partial
                              freedom         confounding            confounding
Replications                  r-1             r-1                    r-1
Blocks within replications    r(b-1)          r(2^(n-p) - 1)         r(2^(n-p) - 1)
Treatments (main effects
and interactions)             b(t-1)          2^(n-p)(2^p - 1)       2^n - 1
Error                         (r-1)b(t-1)     (r-1)2^(n-p)(2^p - 1)  (r-1)2^n - r·2^(n-p) + 1
Total                         2^n·r - 1

Replication sum of squares = Σ Ri²/2^n − CF

Block sum of squares = Σi Σj Bij²/t − CF

Blocks within replications sum of squares = block sum of squares − replication sum of
squares

UNIT IV

Split-Plot Design

In factorial experiments, sometimes some factors have to be applied to large


experimental units in addition to some requiring plots of smaller size. In such situations
split-plot designs are used. In this design, the whole experimental area is initially
divided in to a number of large plots and the different levels of one factor is applied to
these plots known as whole plot treatments or main plot treatments. The whole plots are
then subdivided in to a number of smaller plots known as sub plots and levels of the
second factor are allotted to these smaller plots, known as sub plot treatments. The
main plot and sub plot treatments are allotted at random to the main plots and sub
plots respectively.

This enables testing the effects of the sub plot treatments and the interaction between
whole plot and sub plot treatments more efficiently than the main effects of the main
plot treatments. That is, the effects of the main plot treatments are estimated with
lower precision, while the sub plot treatment and interaction effects are estimated
with higher precision.

The model for a split-plot experiment in randomized blocks is

yijk = μ + ri + mj + eij + sk + (ms)jk + eijk;  i = 1, 2, ..., r; j = 1, 2, ..., m;
k = 1, 2, ..., s

where
yijk = the observation in the ith replication, jth main plot and kth sub plot,
μ = overall mean,
ri = replication effect,
mj = jth main plot treatment effect,
eij = main plot error, or error (a),
sk = kth sub plot treatment effect,
(ms)jk = interaction effect,
eijk = error component for the sub plot and interaction, or error (b).

The ANOVA will have two parts which correspond to the main plots and sub
plots. For the main plot analysis, replication X main plot treatment table is formed.
From this two way table, sum of squares for replication, main plot treatment & error (a)
are computed.

Replication Main Plot treatments
1 2 …… m Total
1 y11. y12. . . . y1m y1..=R1
2 y21. y22. y2m y2..=R2
. .
. .
. .
. .
. .
. .
r yr1. yr2. . . . yrm. yr..=Rr

Total y.1. y.2. y.m. Y...=G


M1 M2 Mm
CF = G²/(rms)

Total sum of squares for the main plot x replication table = Σ yij.²/s − CF

Replication sum of squares, SSR = Σ Ri²/(sm) − CF

Main plot treatment sum of squares, SSM = Σ Mj²/(rs) − CF

Main plot error sum of squares, SSE(a) = total SS for the M x R table − SSR − SSM
For the analysis of sub plot treatments, main plot x sub plot treatment table is formed
Main Plot Sub plot
1 2 . . . s
1 y.11 y.12 y.1s y.1.
2 y.21 y.22 y.2s y.2.
.
.
.
m y.m1 y.m2 . . . y.ms y.m.
y..1 y..2 y..s
s1 s2 ss

From the table, sum of squares for subplot treatments and interaction between main plot
and sub plot treatments are computed. Error (b) sum of square is found out by residual
method.

Total sum of squares for the M x S table = Σ y.jk²/r − CF

Sum of squares due to sub plots, SSS = Σ Sk²/(mr) − CF

Sum of squares due to interaction, SSI = total SS for the M x S table − SSM − SSS

Total sum of squares, TSS = Σ yijk² − CF

Sub plot error sum of squares, SSE(b) = TSS − [SSR + SSM + SSE(a) + SSS + SSI]
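The two-part computation can be sketched as follows (data simulated purely for illustration; NumPy assumed):

```python
import numpy as np

# Hypothetical split-plot data: y[i, j, k] is the observation in
# replication i, main plot treatment j, sub plot treatment k.
r, m, s = 3, 2, 4
rng = np.random.default_rng(7)
y = rng.normal(10.0, 1.0, size=(r, m, s))

G = y.sum()
CF = G**2 / (r * m * s)
TSS = (y**2).sum() - CF

RM = y.sum(axis=2)                              # replication x main plot totals
SS_RM = (RM**2).sum() / s - CF                  # SS of the R x M table
SSR = (y.sum(axis=(1, 2))**2).sum() / (s * m) - CF
SSM = (y.sum(axis=(0, 2))**2).sum() / (r * s) - CF
SSEa = SS_RM - SSR - SSM                        # error (a)

MS = y.sum(axis=0)                              # main plot x sub plot totals
SS_MS = (MS**2).sum() / r - CF
SSS = (y.sum(axis=(0, 1))**2).sum() / (m * r) - CF
SSI = SS_MS - SSM - SSS                         # main x sub interaction
SSEb = TSS - (SSR + SSM + SSEa + SSS + SSI)     # error (b), by residual
```

Error (a) comes from the replication x main plot table, error (b) by residual, exactly as in the derivation above.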

The ANOVA table for a split-plot design in Randomized blocks is given in the
following table

Source                 Degrees of       Sum of      Mean sum of    F
                       freedom          squares     squares
Replication            r-1              SSR         MSR            MSR/MSE(a)
Main plot              m-1              SSM         MSM            MSM/MSE(a)
Error (a)              (r-1)(m-1)       SSE(a)      MSE(a)
Sub plot               s-1              SSS         MSS            MSS/MSE(b)
Interaction
main plot x sub plot   (m-1)(s-1)       SSI         MSI            MSI/MSE(b)
Error (b)              m(r-1)(s-1)      SSE(b)      MSE(b)
Total                  rms-1

For comparing two main plot treatment means:

CD = ta × √[2MSE(a)/(rs)]

For comparing two sub plot treatment means:

CD = tb × √[2MSE(b)/(rm)]

For comparing two sub plot treatment means at a given main plot treatment:

CD = tb × √[2MSE(b)/r]

For comparing two main plot treatments either at a given sub plot treatment or at
different sub plot treatments:

CD = tw × SE(d), where SE(d) = √{ 2[MSE(a) + (s − 1)MSE(b)] / (rs) }

and tw = [ta·MSE(a) + tb·(s − 1)·MSE(b)] / [MSE(a) + (s − 1)MSE(b)]

Here ta and tb denote the table values of t for the error (a) degrees of freedom,
(r − 1)(m − 1), and the error (b) degrees of freedom, m(r − 1)(s − 1), respectively.

Strip-plot design

If two factors are involved and if both the factors require large plot sizes, it is difficult
to carry out the experiment in a split plot design. In some other situations a higher
precision may be required for the interaction than the precision for the two factors. The
strip-plot design is suitable for such experiments. It is also known as split-block design.

In a strip-plot design, each block is divided into a number of vertical and horizontal
strips depending on the levels of the respective factors. The strip treatments are laid
out either in randomized blocks or in a Latin square. The intersection plots provide
information on the interaction of the two factors.

Replication 1 Replication 2
a0 a2 a1 a3 a2 a0 a3 a1
b1 b1
b0 b2
b2 b0

For example, for factors like spacing and ploughing, a block may be divided into strips
in one direction to be allotted to one set of treatments, say different spacings, and
into another set of strips, in a direction at right angles to the first, to be allotted
to the second set of treatments, say ploughing. The allotment of the treatments to the
strips at each stage has to be made at random. The analysis of a strip-plot design is
carried out in three parts: the first part is the vertical strip analysis, the second
part is the horizontal strip analysis, and the third is the interaction analysis.
Suppose that A and B are the vertical strip and horizontal strip factors, respectively.
The data are rearranged in an A × Replication table, a B × Replication table and an
A × B table. From the A × Replication table, the sums of squares for replication, A and
error (a) are computed. From the B × Replication table, the sums of squares for B and
error (b) are computed; and from the A × B table, the A × B sum of squares is computed.
The ANOVA table is formed with these results.

Source of       Degrees of         Sum of     Mean sum of    F
variation       freedom            squares    squares
Replication     r-1                SSR        MSR            MSR/MSE(a)
A               a-1                SSA        MSA            MSA/MSE(a)
Error (a)       (r-1)(a-1)         SSE(a)     MSE(a)
B               b-1                SSB        MSB            MSB/MSE(b)
Error (b)       (r-1)(b-1)         SSE(b)     MSE(b)
AB              (a-1)(b-1)         SSAB       MSAB           MSAB/MSE(c)
Error (c)       (r-1)(a-1)(b-1)    SSE(c)     MSE(c)
Total           rab-1

SE(d) for A = √(2MSE(a)/rb)

SE(d) for B = √(2MSE(b)/ra)

SE(d) for A at levels of B = √(2[MSE(a) + (b-1)MSE(c)]/rb)

SE(d) for B at levels of A = √(2[MSE(b) + (a-1)MSE(c)]/ra)

The CD values are computed using the appropriate t or t_w.
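The four SE(d) formulas above reduce to simple arithmetic once the mean squares are known; the values below are assumed for illustration.

```python
# Sketch: SE(d) values for a strip-plot design from assumed mean squares.
r, a, b = 3, 4, 3                   # replications and levels of A and B (assumed)
MSEa, MSEb, MSEc = 2.0, 1.5, 0.8    # error (a), (b) and (c) mean squares (assumed)

se_A = (2 * MSEa / (r * b)) ** 0.5                       # comparing levels of A
se_B = (2 * MSEb / (r * a)) ** 0.5                       # comparing levels of B
se_A_at_B = (2 * (MSEa + (b - 1) * MSEc) / (r * b)) ** 0.5  # A at levels of B
se_B_at_A = (2 * (MSEb + (a - 1) * MSEc) / (r * a)) ** 0.5  # B at levels of A
```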

Incomplete Block Design (IBD)

If in a block the number of experimental units is smaller than the number of
treatments, then the block is said to be incomplete, and a design constituted of such
blocks is called an incomplete block design. In order to ensure equal or nearly equal
precision of the comparisons of different pairs of treatments, the treatments are so
allotted to the different blocks that each pair of treatments has the same or nearly
the same number of replications and each treatment has an equal number of replications.

Balanced IBD

When the number of replications of all pairs of treatments in a design is the same,
the design is known as a BIBD. The design ensures equal precision of the estimates
of all pairs of treatment effects.

Definition: An IBD is said to be a BIBD if it satisfies the following conditions.

1. The experimental material is divided into b blocks of k units each, different
treatments being applied to the units in the same block.

2. There are t treatments, each of which occurs in r blocks.

3. Any two treatments occur together in exactly λ blocks.

The quantities t, b, r, k and λ are called the parameters of the BIBD. The necessary
relationships between the parameters of a BIBD are

(i) rt = bk

(ii) λ(t-1) = r(k-1)

(iii) b ≥ t; r > λ
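The parameter relations above can be checked programmatically; in the sketch below the function name is mine, and `lam` stands for λ.

```python
# Sketch: verify the necessary parameter relations of a BIBD.
def is_valid_bibd(t, b, r, k, lam):
    """Check rt = bk, lam(t-1) = r(k-1), b >= t and r > lam."""
    return (r * t == b * k
            and lam * (t - 1) == r * (k - 1)
            and b >= t
            and r > lam)

# The classical (t=7, b=7, r=3, k=3, lam=1) design: 7 treatments in 7 blocks of 3.
print(is_valid_bibd(t=7, b=7, r=3, k=3, lam=1))   # True
print(is_valid_bibd(t=6, b=8, r=4, k=3, lam=1))   # False: 1*(6-1) != 4*(3-1)
```

These relations are necessary but not sufficient: a parameter set passing the check need not correspond to an actual design.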

Lattice design

A balanced two-dimensional lattice design with k² treatments having one restriction is
called a simple lattice. In this design, the treatments should be assigned to each
block in such a manner that λ is the same for all pairs of treatments. For the design
to be balanced, the minimum required number of blocks is k(k+1). Thus, at least k+1
replications are needed for a k² simple lattice. This property of having separate
replications for a BIBD holds only when t is a multiple of k, and especially for the
lattice square design. Most BIB designs do not have this property. In general, in an
m-dimensional balanced lattice design, the number of treatments is k^m, where k is a
prime number or a prime power.

Response surface methodology

The response surface methodology seeks to relate an average response to the values of
input factors which are quantitative in nature. If the input factors are amount of
fertilizer, soil moisture, time and the like, the yield of a crop, or response, may be
expressed as a function of the levels of these factors. For example, an experimenter
might want to find out how the yield of a crop depends upon the amounts of N, P and K.
He is interested in the presumed functional relationship Y = f(N, P, K) that expresses
the yield Y as a function of the variables N, P, K. The response actually observed in a
particular experiment differs from Y because of experimental error, which is denoted
by e. Thus,

Y = f(N, P, K) + e

is the required model relating the observed response to the levels of the input
factors. In general, if we have r input factors (variables) X1, X2, …, Xr, the model
can be written as

Y = f(X1, X2, …, Xr) + e

The function f is called the response surface.

There are two uses of a response surface. First, it has been applied to describe how
the response is affected by a number of quantitative variables over some already chosen
levels of interest. The second use is to locate the neighbourhood of the maximal or
minimal response. In agricultural experiments, it is of interest to determine the
optimum level of a factor, or combination of factors, that will maximize the yield.
When the response is the cost of production per unit of output, the objective may be to
minimize the response.
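As an illustrative sketch of both uses, a second-order surface in two factors can be fitted by least squares and its stationary point (a candidate optimum) located. The factor levels, yields and noise below are simulated, not real data.

```python
# Sketch: fitting a quadratic response surface Y = f(N, P) + e by least squares.
import numpy as np

rng = np.random.default_rng(0)
N = np.repeat([0.0, 60.0, 120.0], 3)    # hypothetical nitrogen levels
P = np.tile([0.0, 15.0, 30.0], 3)       # hypothetical phosphorus levels
# simulated yields from an assumed true surface plus experimental error e
Y = (20 + 0.5 * N - 0.003 * N**2 + 0.4 * P - 0.005 * P**2
     + rng.normal(0, 0.5, 9))

# design matrix for a second-order surface in the two factors
X = np.column_stack([np.ones_like(N), N, P, N**2, P**2, N * P])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# stationary point of the fitted surface: solve the gradient equations
b1, b2, b11, b22, b12 = beta[1:6]
A = np.array([[2 * b11, b12], [b12, 2 * b22]])
n_opt, p_opt = np.linalg.solve(A, [-b1, -b2])
```

Here the true optimum nitrogen level is 0.5/(2 × 0.003) ≈ 83, so the fitted stationary point should land nearby despite the noise.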

Cross-over Designs

Cross-over designs are used in situations in which treatments are applied in sequence
over several periods to a group of individuals, so that the number of experimental
units may be less than the number of observations. The design has been used for
comparing two or more treatments in dairy husbandry and other biological studies. The
cross-over design has two restrictions imposed on the randomization of the treatments
to the experimental units. The first restriction is that all the treatments are
included in each replicate or group, and the experimental units are ordered with regard
to the time of application in each replicate or group. The second restriction is that
each treatment must be applied an equal number of times in each period within the
replicates.

For example, suppose that we have to compare the effects of two feeding rations, A and
B, on the amount and quality of milk produced by cows. Since cows vary greatly in their
milk production, each ration is tested on every cow by feeding it in either the first
or the second half of the period of lactation, so that each cow gives a separate
replicate. The rations are allotted to the periods at random, with the restriction that
half of the cows receive ration A and the other half receive ration B in period 1, and
the cows receiving A in period 1 receive B in period 2, and vice versa.

The experimental design for the six replicates (six cows) is of the following form:

                    Cows or Replications
              1     2     3     4     5     6
Period I      B     B     A     A     B     A
Period II     A     A     B     B     A     B
If the above design were applied to an experimental situation which required a separate
experimental unit for each replicate, the analysis would be the same as given above.
For example, suppose that two treatments A and B are applied to dairy cows, that a
single treatment period is used, that twelve cows are grouped into 6 pairs with each
member of a pair rated as superior or inferior, and that one half of the superior and
one half of the inferior cows receive treatment B. The experimental design might be of
the following form:

                    Cows or Replications
              1     2     3     4     5     6
Superior      B     B     A     A     B     A
Inferior      A     A     B     B     A     B

The cross-over design may be used for any number of treatments, with the condition that
the number of replicates must be a multiple of the number of treatments.
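A minimal sketch of the randomization described above for the two-ration, six-cow example; the seed and variable names are mine, chosen only for reproducibility.

```python
# Sketch: randomizing a two-period cross-over layout for six cows so that each
# ration (A or B) is given to half the cows in each period and every cow
# receives both rations.
import random

random.seed(1)                           # arbitrary seed for a reproducible layout
cows = list(range(1, 7))
group_a = set(random.sample(cows, 3))    # cows receiving ration A in period I
period1 = ['A' if c in group_a else 'B' for c in cows]
period2 = ['B' if t == 'A' else 'A' for t in period1]   # rations interchanged
```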

TRANSFORMATIONS

ANOVA has three assumptions:

1. The effects are additive.

2. The experimental errors are independent.

3. The errors are distributed normally with mean zero and common variance σ².

When the above assumptions of ANOVA are violated, we have to transform the data.

Whenever the standard deviations of samples are roughly proportional to the means, an
effective transformation may be a log transformation. Frequency distributions skewed to
the right are often made more symmetrical by transformation to a logarithmic scale.
While logarithms to any base can be used, common logarithms (base 10) or natural
logarithms (base e) are generally the most convenient. The presence of multiplicative
effects and a rough proportionality between standard deviations and means suggest that
a logarithmic transformation may be appropriate. For example, a log transformation is
often appropriate when the dependent variable is a concentration. This cannot be less
than zero, and may have several moderately high observations, but may have a small
number of very high values. Taking logs (one can be added to each observation, if some
are zero) often normalizes the data.

Whenever the response variable is a count of relatively rare events (e.g. insect counts
on a leaf, blood cells within a gridded region of a haemocytometer, etc.), the data
tend to follow a special distribution called a Poisson distribution. In such situations
a square root transformation is used. It is better to use √(y+0.5) instead of √y. If
there are negative values in the data, add an appropriate constant throughout to make
them positive. For example, counts such as the numbers of cells in a haemocytometer
square can sometimes produce data which can be analysed by ANOVA. If the mean count is
low, say less than about five, then the data may have a Poisson distribution. This can
be transformed by taking the square root of the observations.

Another kind of data that may require transformation is that based on counts expressed
as percentages or proportions of the total sample. Such data generally exhibit what is
called a binomial distribution rather than a normal distribution. One of the
characteristics of such a distribution is that the variances are related to the means.
In such situations we apply the arcsine (angular) transformation to the square root of
the proportion or percentage, i.e., sin⁻¹(√p). It is used to stabilize the variance
when the observed proportions are in the range of 0 to 30% or 70 to 100%. When the data
contain 0 or 1, the transformation is improved by replacing 0 by 1/(4n) and 1 by
1 − 1/(4n) before taking angular values, where n is the number of observations on which
p is estimated for each group.

46
A logit transformation, loge(p/(1−p)) where p is the proportion, will often correct
percentages or proportions in which there are many observations less than 0.2 or
greater than 0.8 (assuming the proportions cannot be < 0 or > 1).

When the treatment standard deviations are proportional to the squares of the means,
the appropriate transformation is the reciprocal, x to 1/x. It is mostly used when the
observations are time measurements.
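The transformations discussed above (log, square root, angular, logit and reciprocal) can be sketched as follows; the counts and proportions are made up for illustration.

```python
# Sketch: common variance-stabilizing transformations on illustrative data.
import numpy as np

counts = np.array([0, 3, 8, 15, 40], dtype=float)
props = np.array([0.02, 0.10, 0.50, 0.85, 0.98])

log_y = np.log10(counts + 1)                      # log (1 added because of zeros)
sqrt_y = np.sqrt(counts + 0.5)                    # square root for Poisson-like counts
angular = np.degrees(np.arcsin(np.sqrt(props)))   # arcsine of the root proportion
logit_y = np.log(props / (1 - props))             # logit for extreme proportions
recip_y = 1 / (counts + 1)                        # reciprocal (shifted to avoid 1/0)
```

For instance, a proportion of 0.50 transforms to an angle of 45 degrees, the midpoint of the angular scale.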

In general, transformations are used to reduce the heteroscedasticity of the data or to
make the data more closely resemble the normal distribution.

When a transformation has been made, the analysis is carried out on the transformed
data, and the conclusions are drawn from that analysis. However, while presenting the
results, the means and standard errors are transformed back into the original units.
While transforming back to the original units, some corrections have to be made. In the
case of log-transformed data, if the mean value is X̄, the mean value in the original
units will be Ȳ = antilog(X̄ + 1.15 V(X̄)), where V(X̄) is the variance of the mean X̄.

If the square root transformation has been used, then Ȳ = (X̄ + V(X̄))².
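A small numerical sketch of these back-transformation corrections; the transformed-scale mean and its variance below are assumed values, and the 1.15 multiplier applies to base-10 logarithms.

```python
# Sketch: back-transforming means from the log10 and square-root scales.
xbar = 1.2        # mean on the transformed scale (assumed)
v_xbar = 0.004    # variance of that mean (assumed)

y_from_log = 10 ** (xbar + 1.15 * v_xbar)   # antilog correction for log10 data
y_from_sqrt = (xbar + v_xbar) ** 2          # correction for square-root data
```

Note that the corrected back-transformed mean exceeds the naive antilog of X̄, since the correction term is positive.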

If no suitable transformation can be found, a nonparametric test can often be used.
