Experimental Design I
http://www.unaab.edu.ng
COURSE CONTENT:
Basic concepts of experimentation, Completely randomized design, Randomised
complete block design, Latin Square Design, Graeco Latin Square Design, Simple
factorial Design
COURSE REQUIREMENTS:
This is a compulsory course for all statistics students. Students are expected to have
a minimum of 75% attendance to be able to write the final examination.
READING LIST:
1.) Statistical Design and Analysis of Experiments by P.W.M. John.
2.) Experimental Designs by Cochran and Cox.
3.) Designs and Analysis of Experiments for Biology and Agric. Students by
Oyejola, B.A.
4.) Statistical Methods by Snedecor and Cochran.
5.) Statistical Procedures for Agricultural Research by Gomez and Gomez.
LECTURE NOTES
Introduction
An experiment involves the planning, execution and collection of measurements
or observations.
Examples of simple experiment
1. Comparison of two teaching methods
2. Comparison of two varieties of maize
just due to chance. Clearly every experiment must be designed to have a measure of the
experimental error.
Definitions
Experimental Unit/plot
This is the smallest unit to which a treatment is applied, and on which an observation
is made, e.g. an animal, a bird, an object, a cage, a field plot and so on.
- The definition of a unit depends on the objective of the experiment.
Factors
These are distinct types of condition that are manipulated on the experimental unit
e.g. age, group, gender, variety, fertilizer and so on.
Factor Levels
The different modes of presence of a factor are called factor levels.
- Factors can be quantitative or qualitative.
Treatments
Each specific combination of the levels of different factors is called the treatment.
Replication
This is the number of experimental units to which a given treatment is
applied.
[Figure: field layouts (I) and (II)]
In layout (II), if the field has a fertility gradient so that productivity changes
gradually from top to bottom, the white variety will be at an advantage, being
in a relatively more fertile area. Hence, the comparison between the varieties would be
biased in favour of variety “W”.
A better layout is obtained by randomization, as shown in layout (I).
COMPLETELY RANDOMIZED DESIGN (CRD)
Model:
yij = μi + eij
    = μ + αi + eij
i = 1, 2, …, t and j = 1, 2, …, r
where yij is the observed value for replicate j of treatment i, μi is the population mean
for treatment i, μ is the overall population mean, αi is the effect of treatment i and eij is the
experimental error resulting from replicate j of treatment i.
Assumption: the yij are assumed normally distributed about the mean μi with variance σ2,
or eij ~ N(0, σ2), i.e. the errors are independently and identically normally distributed
with mean 0 and variance σ2.
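The least squares estimates of μ and αi minimise the error sum of squares
\[ S = \sum_i \sum_j e_{ij}^2 = \sum_i \sum_j (y_{ij} - \mu - \alpha_i)^2 . \]
Differentiating with respect to μ:
\[ \frac{dS}{d\mu} = -2 \sum_i \sum_j (y_{ij} - \mu - \alpha_i) \]
\[ \Rightarrow -2 \sum_i \sum_j (y_{ij} - \hat{\mu} - \alpha_i) = 0 \]
\[ \sum_i \sum_j y_{ij} - \sum_i \sum_j \hat{\mu} - \sum_i \sum_j \alpha_i = 0 \]
\[ \sum_i \sum_j y_{ij} - rt\hat{\mu} - r \sum_i \alpha_i = 0 \]
Since ∑αi = 0,
\[ \Rightarrow rt\hat{\mu} = \sum_i \sum_j y_{ij} \]
\[ \therefore \hat{\mu} = \frac{\sum_i \sum_j y_{ij}}{rt} = \bar{y}_{..} \]
Differentiating with respect to αi:
\[ \frac{dS}{d\alpha_i} = -2 \sum_j (y_{ij} - \mu - \alpha_i) \]
\[ -2 \sum_j (y_{ij} - \hat{\mu} - \hat{\alpha}_i) = 0 \]
\[ \sum_j y_{ij} - \sum_j \hat{\mu} - \sum_j \hat{\alpha}_i = 0 \]
\[ \sum_j y_{ij} - r\hat{\mu} - r\hat{\alpha}_i = 0 \]
\[ \hat{\alpha}_i = \frac{\sum_j y_{ij}}{r} - \hat{\mu} = \bar{y}_{i.} - \bar{y}_{..} \]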
Randomization Procedure
1. Determine the total number of experimental units or plots (N) where N = rt with r
being the number of replications and t the number of treatments.
2. Assign a plot number to each experimental unit in any convenient manner
consecutively 1 to N.
3. Assign the treatments to the experimental units by any chosen randomization
scheme e.g. using table of random numbers, random number generator, drawing
of lots and so on.
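The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming a pseudo-random number generator as the randomization scheme; the function name `randomize_crd` and the `seed` argument are hypothetical and not part of the notes.

```python
import random


def randomize_crd(treatments, r, seed=None):
    """Assign each treatment to r plots at random (completely randomized design).

    Returns a dict mapping plot number (1..N, with N = r * t) to a treatment label.
    """
    rng = random.Random(seed)  # seed is optional, only for a reproducible layout
    # Step 1: N = r * t labels, each treatment repeated r times
    labels = [trt for trt in treatments for _ in range(r)]
    # Step 3: random assignment of the labels to the plots
    rng.shuffle(labels)
    # Step 2: plots are numbered consecutively from 1 to N
    return {plot: trt for plot, trt in enumerate(labels, start=1)}


layout = randomize_crd(["A", "B", "C", "D"], r=3, seed=1)
for plot, trt in layout.items():
    print(plot, trt)
```

Every treatment appears exactly r times, but its plot positions change from one randomization to the next.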
Data Structure
                       Treatments
Replicate    1      2      ...    t      Total
1            y11    y21    ...    yt1    y.1
2            y12    y22    ...    yt2    y.2
...          ...    ...    ...    ...    ...
r            y1r    y2r    ...    ytr    y.r
Total        y1.    y2.    ...    yt.    y..
Analysis of Variance
The total variation in CRD is partitioned into two sources of variation, i.e.
variation due to treatment and variation due to error. The relative size of the two
variations is used to indicate whether the observed difference among the treatment means
is significant or due to chance. The treatment difference is said to be significant if the
treatment variation is significantly larger than the experimental error.
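Total sum of squares, SST,
\[ SST = \sum_{i=1}^{t} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{t} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N} \]
Between-treatment sum of squares, SSB,
\[ SSB = \sum_{i=1}^{t} n_i (\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{t} \frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N} \]
where
\[ C.F. = \frac{y_{..}^2}{N} = \text{correction factor} \]
⇒ SST = SSB + SSE
i.e. SSE = SST − SSB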
ASSIGNMENT
• Show that:
\[ \sum_i \sum_j (y_{ij} - \bar{y}_{..})^2 = \sum_i \sum_j (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_{i.})^2 \]
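ANOVA TABLE
Source       Df       SS     MS                   F
Treatment    t − 1    SSB    MSB = SSB/(t − 1)    MSB/MSE
Error        N − t    SSE    MSE = SSE/(N − t)
Total        N − 1    SST
Hypothesis
H0: μ1 = μ2 = … = μt against H1: at least one μi differs
Or
H0: αi = 0 for all i against H1: at least one αi ≠ 0
Reject H0 if Fc > Ft, where Fc is the calculated F-ratio and Ft is the table F-value.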
COMPARISON OF MEANS
If a significant result is declared, there is a need to identify the means that are
different. This can be done using a multiple comparison procedure such as:
LSD – Least Significant Difference
DMRT – Duncan’s Multiple Range Test
Tukey’s test
Scheffé’s test, etc.
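\[ LSD = t_{\alpha/2,\,df_E} \times SED, \qquad SED = \sqrt{\frac{2s^2}{r}} \]
where s² is the error mean square, r is the number of replications per treatment (equal
replication) and the t-value is read from the t-table at the error degrees of freedom.
If the observed difference between any two means is greater than the LSD value then those
two means are said to be significantly different.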
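A minimal sketch of the LSD comparison rule, assuming equal replication and SED = √(2s²/r). The tabulated t-value is passed in by the user (`t_crit`) rather than computed, and all numbers below are hypothetical.

```python
import math


def lsd(t_crit, mse, r):
    """Least significant difference for two means with r replications each.

    t_crit: tabulated t-value at the error degrees of freedom (looked up by the user)
    mse:    error mean square (s^2) from the ANOVA table
    """
    sed = math.sqrt(2.0 * mse / r)  # standard error of a difference of two means
    return t_crit * sed


def significantly_different(mean_a, mean_b, lsd_value):
    """Two means differ significantly if their absolute difference exceeds the LSD."""
    return abs(mean_a - mean_b) > lsd_value


# Hypothetical inputs: t-table value 2.0, MSE = 8.0, r = 4 replications
value = lsd(t_crit=2.0, mse=8.0, r=4)
print(value, significantly_different(10.0, 15.0, value))
```

With unequal replication the standard error of a difference changes from pair to pair, so this sketch would need to be adapted.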
COEFFICIENT OF VARIATION
This is a measure of precision of the estimates obtained from the data. It is also
used to assess the quality of the management of an experiment. A low coefficient of
variation indicates high precision of estimate or efficient management of the experiment.
Example: In an effort to improve the quality of recording tapes, the effect of four kinds of
coating A, B, C, D on the reproducing quality of sound are to be compared. The
measurements of sound distortion are given below.
A 10 15 8 12 15
B 14 8 31 15
C 17 16 14 15 17 15 18
D 12 15 17 15 16 15
Recommend the best coating for the sound production.
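The sums of squares for this example can be computed as follows. This is a sketch only: it uses the data exactly as printed above, the correction-factor formulas for unequal replication, and assumes the usual definition CV = 100 × √MSE / grand mean (the CV formula itself is not shown in the notes). The function name `one_way_anova` is hypothetical.

```python
data = {
    "A": [10, 15, 8, 12, 15],
    "B": [14, 8, 31, 15],
    "C": [17, 16, 14, 15, 17, 15, 18],
    "D": [12, 15, 17, 15, 16, 15],
}


def one_way_anova(groups):
    """One-way (CRD) analysis of variance with possibly unequal replication."""
    all_obs = [y for ys in groups.values() for y in ys]
    n_total = len(all_obs)
    t = len(groups)
    cf = sum(all_obs) ** 2 / n_total                 # correction factor y..^2 / N
    sst = sum(y * y for y in all_obs) - cf           # total sum of squares
    ssb = sum(sum(ys) ** 2 / len(ys) for ys in groups.values()) - cf
    sse = sst - ssb                                  # error SS by subtraction
    msb = ssb / (t - 1)
    mse = sse / (n_total - t)
    grand_mean = sum(all_obs) / n_total
    return {
        "SST": sst, "SSB": ssb, "SSE": sse,
        "df_trt": t - 1, "df_error": n_total - t,
        "MSB": msb, "MSE": mse, "F": msb / mse,
        "CV%": 100.0 * mse ** 0.5 / grand_mean,      # assumed CV definition
    }


result = one_way_anova(data)
for name, value in result.items():
    print(name, round(value, 3))
```

The calculated F is then compared with the table F-value at (t − 1, N − t) degrees of freedom before any comparison of means is attempted.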
ADVANTAGE OF CRD
1. The design is very flexible
2. The statistical analysis is simple
DISADVANTAGE OF CRD
The design is very inefficient if the experimental units are not homogeneous.
ASSIGNMENT
1. Analyze the following data from a field experiment with four treatments using a 1%
significance level. Carry out mean comparison if necessary. How good is the
management of the experiment?
A 14.3 11.6 11.8 14.2
B 20.7 21.0
C 32.6 32.1 33.0
D 24.3 25.2 24.8
RANDOMISED COMPLETE BLOCK DESIGN (RCBD)
Introduction
The design is used when the experimental units can be grouped such that the
number of units in a group is equal to the number of treatments. The groups are called blocks
or replicates, and the purpose of grouping is to have the units in a group as homogeneous as
possible so that observed differences within a group are mainly due to treatments. Variability
within groups is expected to be lower than variability between groups. Since the number
of units per block equals the number of treatments, the blocks are of equal size; hence, the
design is a complete block design. The primary purpose of blocking is to reduce the
experimental error by eliminating the known sources of variability.
Model:
yij = μ + αi + βj + eij
i = 1, 2, …, t and j = 1, 2, …, r
where yij is the observed value for block j of treatment i, μ is the population mean, αi is
the effect of treatment i, βj is the effect of block j and eij is the experimental error
resulting from block j of treatment i.
Assumption:
- block and treatment effects are additive,
- eij ~ N (0, σ2),
- ∑αi = 0, ∑βj = 0.
Estimation of Parameters
A procedure similar to that used in CRD can be utilized here to obtain the desired
estimates.
Randomization and Layout
The randomization process for the randomised complete block design is applied separately
and independently to each of the blocks.
- Divide the experimental area into r blocks.
- Sub-divide each block into t experimental units, where t is the number of
treatments.
- Number the plots consecutively from 1 to t and assign the t treatments at random to
the t units within each block following any randomization scheme.
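The block-by-block randomization described above might look like this in Python. It is a sketch under the assumption that a pseudo-random shuffle is the chosen randomization scheme; `randomize_rcbd` and `seed` are hypothetical names.

```python
import random


def randomize_rcbd(treatments, r, seed=None):
    """Randomize t treatments separately and independently within each of r blocks.

    Returns a dict: block number -> list of treatments, where position p - 1
    holds the treatment assigned to plot p of that block.
    """
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    layout = {}
    for block in range(1, r + 1):
        order = list(treatments)
        rng.shuffle(order)     # a fresh randomization for every block
        layout[block] = order
    return layout


rcbd_layout = randomize_rcbd(["A", "B", "C", "D", "E"], r=4, seed=7)
for block, order in rcbd_layout.items():
    print("Block", block, order)
```

Unlike the CRD, every block here contains each treatment exactly once.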
DATA STRUCTURE
                          Blocks
Treatment    1      2      3      ...    r      Total
1            y11    y12    y13    ...    y1r    y1.
2            y21    y22    y23    ...    y2r    y2.
3            y31    y32    y33    ...    y3r    y3.
...          ...    ...    ...    ...    ...    ...
t            yt1    yt2    yt3    ...    ytr    yt.
Total        y.1    y.2    y.3    ...    y.r    y..
Analysis of Variance
The total variation is partitioned into the variation due to blocks, variation due to
treatments, and variation due to error, i.e. (with N = rt)
\[ SS_{Total} = \sum_i \sum_j (y_{ij} - \bar{y}_{..})^2 = \sum_i \sum_j y_{ij}^2 - \frac{y_{..}^2}{N} \]
\[ SS_{Trt} = \sum_i \sum_j (\bar{y}_{i.} - \bar{y}_{..})^2 = \frac{\sum_i y_{i.}^2}{r} - \frac{y_{..}^2}{N} \]
\[ SS_{Block} = \sum_i \sum_j (\bar{y}_{.j} - \bar{y}_{..})^2 = \frac{\sum_j y_{.j}^2}{t} - \frac{y_{..}^2}{N} \]
\[ SSE = SS_{Total} - SS_{Trt} - SS_{Block} \]
ANOVA TABLE
Source       Df                SS         MS                          F
Block        r − 1             SSB        MSB = SSB/(r − 1)           MSB/MSE
Treatment    t − 1             SST        MST = SST/(t − 1)           MST/MSE
Error        (r − 1)(t − 1)    SSE        MSE = SSE/[(r − 1)(t − 1)]
Total        rt − 1            SSTotal
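The entries of the ANOVA table can be computed directly from a complete treatment-by-block table. The sketch below uses a small hypothetical data set (3 treatments, 4 blocks) that is not taken from the notes; `rcbd_anova` is an illustrative name.

```python
def rcbd_anova(y):
    """ANOVA for a complete RCBD; y[i][j] = observation for treatment i in block j."""
    t, r = len(y), len(y[0])
    n = r * t
    grand = sum(sum(row) for row in y)
    cf = grand ** 2 / n                                   # correction factor
    ss_total = sum(v * v for row in y for v in row) - cf
    ss_trt = sum(sum(row) ** 2 for row in y) / r - cf     # treatment totals y_i.
    block_totals = [sum(y[i][j] for i in range(t)) for j in range(r)]
    ss_block = sum(b * b for b in block_totals) / t - cf  # block totals y_.j
    ss_error = ss_total - ss_trt - ss_block
    df_error = (r - 1) * (t - 1)
    ms_block = ss_block / (r - 1)
    ms_trt = ss_trt / (t - 1)
    ms_error = ss_error / df_error
    return {
        "SS_total": ss_total, "SS_trt": ss_trt, "SS_block": ss_block,
        "SS_error": ss_error, "df_error": df_error,
        "MS_block": ms_block, "MS_trt": ms_trt, "MS_error": ms_error,
        "F_trt": ms_trt / ms_error, "F_block": ms_block / ms_error,
    }


# Hypothetical data: rows are treatments, columns are blocks
example = [
    [10, 12, 11, 13],
    [14, 15, 13, 16],
    [9, 11, 10, 12],
]
table = rcbd_anova(example)
for name, value in table.items():
    print(name, round(value, 4))
```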
Hypothesis
H0: αi = 0 for all i (no treatment differences) against H1: at least one αi ≠ 0
Comparing the calculated F-ratio to the table F-value at a given significance level, we
decide to reject or fail to reject the null hypothesis, i.e. reject H0 if Fc > Ft.
COMPARISON OF MEANS
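Use LSD to compare the treatments if the F-ratio is found to be significant:
\[ LSD = t_{\alpha/2,\,(r-1)(t-1)} \times \sqrt{\frac{2\,MSE}{r}} \]
where MSE is the error mean square and r is the number of blocks.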
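Coefficient of variation
\[ CV = \frac{\sqrt{MSE}}{\bar{y}_{..}} \times 100\% \]
Missing value
A missing observation x is estimated by the value that minimises the error sum of squares,
\[ \Rightarrow \hat{x} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j = 0, \quad \text{i.e. } \hat{x} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j . \]
Note that the degrees of freedom must be adjusted for the number of missing values, i.e.
reduce the total and error degrees of freedom by the number of missing values.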
DISADVANTAGES OF RCBD
1. Not suitable for a large number of treatments.
2. More tasking in the execution of the design than the CRD.
3. Missing values can create problems, especially in estimation and non formal
analysis.
4. The precision will be affected by missing values.
RELATIVE EFFICIENCY
Blocking maximizes the difference among blocks. Hence it is necessary to examine how
much is gained by the introduction of blocking into the design. The magnitude of the
reduction in the experimental error due to blocking over the CRD can be obtained by
\[ R.E. = \frac{(r-1)E_b + r(t-1)E_e}{(rt-1)E_e} \]
where Eb is the block mean square and Ee is the error mean square.
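A short sketch of the relative-efficiency calculation, assuming the standard form R.E. = [(r − 1)Eb + r(t − 1)Ee] / [(rt − 1)Ee] given in texts such as Gomez and Gomez. The mean squares used below are hypothetical.

```python
def re_rcbd_vs_crd(e_b, e_e, r, t):
    """Relative efficiency of an RCBD compared with a CRD.

    e_b: block mean square, e_e: error mean square,
    r: number of blocks, t: number of treatments.
    """
    return ((r - 1) * e_b + r * (t - 1) * e_e) / ((r * t - 1) * e_e)


# Hypothetical mean squares from an RCBD with 4 blocks and 5 treatments
efficiency = re_rcbd_vs_crd(e_b=6.0, e_e=2.0, r=4, t=5)
print(round(100 * efficiency, 1), "%")
```

When the block mean square equals the error mean square the ratio is exactly 1, i.e. blocking gave no gain in precision.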
Block    V1      V2      V3      V4      V5      Total
I        2.73    1.73    3.00    2.13    0.27     9.86
II       *       2.27    1.53    2.40    0.27     6.47
III      2.33    2.00    2.13    1.87    0.10     8.43
IV       2.53    2.00    2.00    1.47    0.13     8.13
Total    7.59    8.00    8.66    7.87    0.77    32.89
(* = missing value)
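For the table above, the missing value (block II, V1) can be estimated with the standard single-missing-value formula for an RCBD, x = (rB0 + tT0 − G0) / [(r − 1)(t − 1)]. This formula is an assumption here, since the notes' own expression did not survive; B0, T0 and G0 denote the block total, treatment total and grand total of the available observations.

```python
def estimate_missing_rcbd(block_total, trt_total, grand_total, r, t):
    """Estimate one missing observation in an RCBD.

    block_total: total of the block containing the missing value (B0)
    trt_total:   total of the treatment containing the missing value (T0)
    grand_total: total of all available observations (G0)
    r, t:        number of blocks and number of treatments
    """
    return (r * block_total + t * trt_total - grand_total) / ((r - 1) * (t - 1))


# Totals read from the table: block II = 6.47, V1 = 7.59, grand total = 32.89
x_hat = estimate_missing_rcbd(block_total=6.47, trt_total=7.59,
                              grand_total=32.89, r=4, t=5)
print(round(x_hat, 2))  # 2.58
```

After inserting the estimate, the analysis proceeds as usual with one degree of freedom removed from both the total and the error.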
ASSIGNMENT
In an experiment to test the effect of five levels of potash (A, B, C, D, E) on the yield of
cotton, the following strength indices were obtained as given below. One of the data points is
missing. Analyze the data, compare the means of the five levels of potash and make the
necessary recommendation. How effectively was the experiment carried out? Was there
any gain in precision in using RCBD over CRD?
LATIN SQUARE DESIGN
Model:
yijk = μ + αi + βj + δk + eijk
where i, j, k = 1, 2, … , t
where yijk is the observed value from row j and column k receiving treatment i.
Assumptions:
The model is completely additive, i.e. there is no interaction between the rows, columns
and treatments.
eijk ~ N (0, σ2)
∑αi = 0, ∑βj = 0 and ∑δk = 0
Estimation of Parameters
A procedure similar to that used in CRD can be utilized here to obtain the desired
estimates.
RANDOMIZATION PROCEDURE
1. Obtain a square field partitioned into t rows and t columns.
2. Arrange the treatments in the units in a standard form.
3. Randomize between the columns
4. Randomize between the rows
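The procedure above can be sketched as follows, assuming the cyclic square as the standard form and a pseudo-random shuffle for steps 3 and 4. The function names are hypothetical.

```python
import random


def randomize_latin_square(treatments, seed=None):
    """Build a t x t Latin square: standard form, then shuffle columns and rows."""
    rng = random.Random(seed)
    t = len(treatments)
    # Step 2: standard (cyclic) square, row i is the treatment list shifted by i
    square = [[treatments[(i + j) % t] for j in range(t)] for i in range(t)]
    # Step 3: randomize between the columns
    cols = list(range(t))
    rng.shuffle(cols)
    square = [[row[c] for c in cols] for row in square]
    # Step 4: randomize between the rows
    rng.shuffle(square)
    return square


def is_latin(square):
    """True if every symbol occurs exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(t)} == symbols for j in range(t))
    return len(symbols) == t and rows_ok and cols_ok


ls_layout = randomize_latin_square(["A", "B", "C", "D"], seed=3)
for row in ls_layout:
    print(" ".join(row))
```

Permuting whole rows and whole columns never breaks the Latin property, which is why the result can be checked with `is_latin`.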
Example: Consider an experiment with four treatments to be compared using a Latin square
design, i.e. a 4 x 4 Latin square.
4 B D C A
ANALYSIS OF VARIANCE
The total variation is partitioned into components for row, column, treatment and
error. The sums of squares are obtained in the usual form.
ANOVA TABLE
Source        Df                SS       MS       F
Rows          t − 1             SSR      MSR      MSR/MSE
Columns       t − 1             SSC      MSC      MSC/MSE
Treatments    t − 1             SSTrt    MSTrt    MSTrt/MSE
Error         (t − 1)(t − 2)    SSE      MSE
Total         t² − 1            SST
BLOCKING EFFICIENCY
The efficiency of both row and column blocking in a Latin square design indicates the
gain in precision relative to either the CRD or the RCBD. Relative to the CRD,
\[ R.E.(CRD) = \frac{E_r + E_c + (t-1)E_e}{(t+1)E_e} \]
where Er, Ec and Ee are the row, column and error mean squares respectively, and t is the
number of treatments. For example, an R.E. of 325% indicates that the use of the Latin square
design is estimated to increase the experimental precision by 225%, while an R.E. of less
than 100% means that there is no gain.
RELATIVE EFFICIENCY LSD TO RCBD
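Relative efficiency of the Latin square design as compared to the RCBD can be computed in two
ways, i.e. when rows are considered as blocks and when columns are considered as blocks
of the RCBD.
1) Rows as blocks (gain from the additional column blocking):
\[ R.E. = \frac{E_c + (t-1)E_e}{t\,E_e} \]
2) Columns as blocks (gain from the additional row blocking):
\[ R.E. = \frac{E_r + (t-1)E_e}{t\,E_e} \]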
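The efficiency calculations for a Latin square can be sketched as below. The formulas are the standard textbook ones, R.E.(CRD) = [Er + Ec + (t − 1)Ee] / [(t + 1)Ee] and [Ex + (t − 1)Ee] / (tEe) for the RCBD comparisons, and are assumptions here because the notes' own expressions did not survive. The mean squares are hypothetical.

```python
def re_ls_vs_crd(e_r, e_c, e_e, t):
    """Relative efficiency of a Latin square compared with a CRD."""
    return (e_r + e_c + (t - 1) * e_e) / ((t + 1) * e_e)


def re_ls_vs_rcbd(e_extra, e_e, t):
    """Relative efficiency of a Latin square compared with an RCBD.

    e_extra is the mean square of the blocking direction the RCBD would NOT use:
    pass the column mean square when rows are the RCBD blocks, and the row mean
    square when columns are the RCBD blocks.
    """
    return (e_extra + (t - 1) * e_e) / (t * e_e)


# Hypothetical mean squares from a 5 x 5 Latin square
e_r, e_c, e_e, t = 10.0, 4.0, 2.0, 5
vs_crd = re_ls_vs_crd(e_r, e_c, e_e, t)
rows_as_blocks = re_ls_vs_rcbd(e_c, e_e, t)   # gain from column blocking
cols_as_blocks = re_ls_vs_rcbd(e_r, e_e, t)   # gain from row blocking
print(round(vs_crd, 3), round(rows_as_blocks, 3), round(cols_as_blocks, 3))
```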
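A single missing observation in a Latin square is estimated by
\[ \hat{x} = \frac{t(R_0 + C_0 + T_0) - 2G_0}{(t-1)(t-2)} \]
where R0, C0 and T0 are the totals of the row, column and treatment respectively that
contain the missing observation, and G0 is the grand total of the available observations.
Again, one degree of freedom is subtracted from both the total and error degrees of
freedom in the case of one missing value.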
ADVANTAGES AND DISADVANTAGES
Advantages.
1. The elimination of two sources of variation often leads to a smaller error mean
square than would be obtained by use of the CRD or RCBD.
2. The ANOVA is simple.
3. Missing values can easily be handled.
Disadvantages.
1. Assumption of no interaction between different factors may not hold.
2. Unlike the CRD and RCBD, the number of treatments is restricted to the number
of replications. Hence it is limited in application.
3. For large number of treatments such as t > 12, the square becomes too large and
does not remain homogeneous.
4. For small number of treatments such as t < 3, degrees of freedom for the error is
usually too small for any meaningful comparison or conclusion.
5. A square field is often required for the design and this may not be practicable.
Example: The following shows the field layout and yield of a 5 x 5 Latin square
experiment on the effect of spacing on the yield of millet. The spacings are: A(2cm), B(4cm),
C(6cm), D(8cm) and E(10cm).
Column
Row 1 2 3 4 5 Total
1 B: 257 E: 230 A: 279 C: 287 D: 202 1255
2 D: 245 A: 283 E: 245 B: 280 C: 260 1313
3 E: 182 B: 252 C: 280 D: 246 A: 250 1210
4 A: 203 C: 204 D: 227 E: 193 B: 259 1086
5 C: 231 D: 271 B: 266 A: 334 E: 339 1440
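The sums of squares for this example can be obtained as sketched below. The code recomputes every total from the cell values as printed in the table, so it does not rely on the printed "Total" column; `latin_square_anova` is a hypothetical name.

```python
cells = [
    [("B", 257), ("E", 230), ("A", 279), ("C", 287), ("D", 202)],
    [("D", 245), ("A", 283), ("E", 245), ("B", 280), ("C", 260)],
    [("E", 182), ("B", 252), ("C", 280), ("D", 246), ("A", 250)],
    [("A", 203), ("C", 204), ("D", 227), ("E", 193), ("B", 259)],
    [("C", 231), ("D", 271), ("B", 266), ("A", 334), ("E", 339)],
]


def latin_square_anova(cells):
    """ANOVA for a t x t Latin square; cells[row][col] = (treatment, value)."""
    t = len(cells)
    values = [y for row in cells for _, y in row]
    cf = sum(values) ** 2 / (t * t)                      # correction factor
    ss_total = sum(y * y for y in values) - cf
    row_totals = [sum(y for _, y in row) for row in cells]
    col_totals = [sum(cells[i][j][1] for i in range(t)) for j in range(t)]
    trt_totals = {}
    for row in cells:
        for trt, y in row:
            trt_totals[trt] = trt_totals.get(trt, 0) + y
    ss_row = sum(x * x for x in row_totals) / t - cf
    ss_col = sum(x * x for x in col_totals) / t - cf
    ss_trt = sum(x * x for x in trt_totals.values()) / t - cf
    ss_error = ss_total - ss_row - ss_col - ss_trt
    df_error = (t - 1) * (t - 2)
    ms_error = ss_error / df_error
    return {
        "row_totals": row_totals, "col_totals": col_totals, "trt_totals": trt_totals,
        "SS_total": ss_total, "SS_row": ss_row, "SS_col": ss_col,
        "SS_trt": ss_trt, "SS_error": ss_error, "df_error": df_error,
        "F_trt": (ss_trt / (t - 1)) / ms_error,
    }


ls_result = latin_square_anova(cells)
for name, value in ls_result.items():
    print(name, value)
```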
GRAECO LATIN SQUARE DESIGN
T=3 A, B, C
I II III
A2 B1 C3
B3 C2 A1
C1 A3 B2
Randomize between columns
T=4 A, B, C, D
I II III IV
I A1 C4 B3 D2
II B2 D3 A4 C1
III C3 A2 D1 B4
IV D4 B1 C2 A3
T=5 A, B, C, D, E
I II III IV V
A1 B3 C5 D2 E4
B2 C4 D1 E3 A5
C3 D5 E2 A4 B1
D4 E1 A3 B5 C2
E5 A2 B4 C1 D3
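A quick way to check that a layout such as the T=5 square above is a valid Graeco Latin square is to verify that the letters form a Latin square, the subscripts form a Latin square, and every letter-subscript pair occurs exactly once. The sketch below does this; the function names are hypothetical.

```python
square = [
    "A1 B3 C5 D2 E4",
    "B2 C4 D1 E3 A5",
    "C3 D5 E2 A4 B1",
    "D4 E1 A3 B5 C2",
    "E5 A2 B4 C1 D3",
]
gls_cells = [row.split() for row in square]


def is_latin(grid):
    """True if every symbol occurs exactly once in each row and each column."""
    t = len(grid)
    symbols = set(grid[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in grid)
    cols_ok = all({grid[i][j] for i in range(t)} == symbols for j in range(t))
    return len(symbols) == t and rows_ok and cols_ok


def is_graeco_latin(cells):
    """True if letters and subscripts are orthogonal Latin squares."""
    t = len(cells)
    letters = [[c[0] for c in row] for row in cells]      # treatment letters
    subscripts = [[c[1:] for c in row] for row in cells]  # subscript factor
    pairs = {c for row in cells for c in row}             # each pair must be unique
    return is_latin(letters) and is_latin(subscripts) and len(pairs) == t * t


print(is_graeco_latin(gls_cells))  # True
```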
MODEL
yijkl = μ + αi + βj + δk + τl + eijkl
where i, j, k, l = 1, 2, … , t
where yijkl is the observed value from row j and column k receiving treatment i and level l
of the subscript factor.
Assumptions:
The model is completely additive, i.e. there is no interaction between the row, the column, the
subscript factor and the treatment.
eijkl ~ N (0, σ2)
∑αi = 0, ∑βj = 0, ∑δk = 0 and ∑τl = 0
Estimation of Parameters
A procedure similar to that used in CRD can be utilized here to obtain the desired
estimates.
STATISTICAL ANALYSIS
The total variation is partitioned into five components, i.e. the row, column, subscript factor,
treatment and error. The row sum of squares is
\[ SSR = \frac{\sum_j y_{.j..}^2}{t} - \frac{y_{....}^2}{N} \]
and the other sums of squares are obtained in the same way.
ANOVA TABLE
Source        Df                SS             MS             F
Rows          t − 1             SSR            MSR            MSR/MSE
Column        t − 1             SSC            MSC            MSC/MSE
Subscript     t − 1             SSSubscript    MSSubscript    MSSubscript/MSE
Treatments    t − 1             SSTrt          MSTrt          MSTrt/MSE
Error         (t − 1)(t − 3)    SSE            MSE
Total         t² − 1            SST
AREA
Order 1 2 3 4 5 Total
1 A1 (3.5) B3 (4.2) C5 (6.7) D2 (6.6) E4 (4.1) 25.1
2 B2 (8.9) C4 (1.9) D1 (5.8) E3 (4.5) A5 (2.4) 23.5
3 C3 (9.6) D5 (1.7) E2 (2.7) A4 (3.7) B1 (6.0) 25.7
4 D4 (10.6) E1 (10.2) A3 (4.6) B5 (3.7) C2 (5.1) 34.1
5 E5 (3.1) A2 (7.2) B4 (4.0) C1 (3.3) D3 (3.5) 21.1
Total 35.6 27.2 23.8 21.8 21.1 129.5
ASSIGNMENT
The data below were obtained from an experiment using a Graeco Latin Square Design with
four diets (A, B, C, D), breeds (I, II, III, IV), weight groups {1, 2, 3, 4} and feed
concentrations {i, ii, iii, iv}. Is there any significant difference between the diets? If so,
compare them and make the necessary recommendation. Also comment on the management of the
experiment.
Breeds
I II III IV
1 A1 (5.9) B3 (4.2) C4 (10.2) D2 (6.6)
2 B2 (8.9) A4 (4.5) D3 (6.0) C1 (3.0)
3 C3 (9.6) D1 (5.8) A2 (7.2) B4 (4.6)
4 D4 (10.5) C2 (4.1) B1 (3.5) A3 (6.7)
SIMPLE FACTORIAL EXPERIMENT
Introduction
Factorial experiments are used in the study of the effects of two or more factors. In
factorial experiments, all the possible combinations of the levels of the factors make up
the treatments. For example, if there are two factors A and B with ‘a’ and ‘b’ levels
respectively, then we have ‘ab’ treatment combinations, e.g.
Factor A: maize variety
Factor B: fertilizer rate
In analyzing data from a factorial experiment, we are interested in the main effects
and the interaction effects of the factors. The main effect of a factor is a measure of the
change in response due to a change in the level of that factor, averaged over all levels of
the other factors.
For example, let two factors A and B be at two levels a0, a1 and b0, b1 respectively, with
treatment combinations a0b0, a0b1, a1b0, a1b1 (each symbol also denoting the corresponding
treatment total). The main effect of A is a measure of the change in response as A changes
from a0 to a1, averaged over the two levels of B,
i.e. at level b0 of B, the simple effect of A is a1b0 – a0b0.
Similarly, at level b1 of B, the simple effect of A is a1b1 – a0b1.
Main effect of A = (1/2r)[(a1b0 – a0b0) + (a1b1 – a0b1)]
Here r represents the number of replications, since each treatment total is the response from r units.
Also, at level a0 of A, the simple effect of B is a0b1 – a0b0.
Similarly, at level a1 of A, the simple effect of B is a1b1 – a1b0.
Thus
Main effect of B = (1/2r)[(a0b1 – a0b0) + (a1b1 – a1b0)]
Each effect of a factor at a given level of the other factor is known as a simple effect. The
interaction effect is the differential response to one factor in combination with varying
levels of a second factor, that is, an additional effect due to the combined influence of
two or more factors. For example, the interaction between A and B (AB) is estimated as the
difference between the two simple effects:
AB = (1/2r)[(a1b1 – a0b1) – (a1b0 – a0b0)]
and
BA = (1/2r)[(a1b1 – a1b0) – (a0b1 – a0b0)] = AB
In the illustration, (i) shows the case of no interaction, (ii) shows the case of mild
interaction and (iii) shows the case of strong interaction.
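The simple, main and interaction effects for a 2 x 2 factorial can be computed from the four treatment totals as sketched below. The divisor 2r is an assumption based on the remark that each total comes from r units, and the totals used are hypothetical.

```python
def effects_2x2(totals, r):
    """Simple, main and interaction effects from the four treatment totals.

    totals: dict with keys 'a0b0', 'a0b1', 'a1b0', 'a1b1' (totals over r units)
    r:      number of replications behind each total
    """
    simple_a_at_b0 = totals["a1b0"] - totals["a0b0"]
    simple_a_at_b1 = totals["a1b1"] - totals["a0b1"]
    simple_b_at_a0 = totals["a0b1"] - totals["a0b0"]
    simple_b_at_a1 = totals["a1b1"] - totals["a1b0"]
    return {
        "main_A": (simple_a_at_b0 + simple_a_at_b1) / (2 * r),
        "main_B": (simple_b_at_a0 + simple_b_at_a1) / (2 * r),
        # interaction = difference between the two simple effects of A ...
        "AB": (simple_a_at_b1 - simple_a_at_b0) / (2 * r),
        # ... which equals the difference between the two simple effects of B
        "BA": (simple_b_at_a1 - simple_b_at_a0) / (2 * r),
    }


# Hypothetical treatment totals, each from r = 2 units
effects = effects_2x2({"a0b0": 20, "a0b1": 30, "a1b0": 40, "a1b1": 70}, r=2)
print(effects)
```

Whichever factor is used to form the simple effects, the interaction estimate is the same, so AB and BA agree.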
LAYOUT
Any of the designs discussed earlier can be used, in particular the RCBD. The treatment
combinations are assigned at random within each block.
For example, consider the case of two factors A and B, each at two levels a0, a1 and b0, b1
respectively. The treatment combinations are a0b0, a0b1, a1b0, a1b1.
[Figure: layout of the treatment combinations within blocks]
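MODEL CRD
yijk = μ + αi + βj + (αβ)ij + eijk
i = 1, 2, …, a; j = 1, 2, …, b; k = 1, 2, …, r
where yijk is the observed value from the kth unit corresponding to level i of factor A and
level j of factor B, αi is the effect of level i of A, βj is the effect of level j of B,
(αβ)ij is the interaction effect and eijk is the experimental error.
MODEL RCBD
yijk = μ + αi + βj + (αβ)ij + ρk + eijk
where ρk is the effect of block k.
Hypotheses
H10: αi = 0 for all i
H11: at least one αi ≠ 0
H20: βj = 0 for all j
H21: at least one βj ≠ 0
H30: (αβ)ij = 0 for all i, j
H31: at least one (αβ)ij ≠ 0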
ANALYSIS OF VARIANCE
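The total variation is partitioned into that due to factor A, factor B, the interaction AB
and the error, i.e.
\[ SST = SSA + SSB + SSAB + SSE \]
where SST, SSA, SSB, SSAB and SSE are obtained in the usual form, i.e.
\[ SST = \sum_i \sum_j \sum_k y_{ijk}^2 - \frac{y_{...}^2}{abr} \]
\[ SSA = \frac{\sum_i y_{i..}^2}{br} - \frac{y_{...}^2}{abr} \]
\[ SSB = \frac{\sum_j y_{.j.}^2}{ar} - \frac{y_{...}^2}{abr} \]
\[ SSAB = \frac{\sum_i \sum_j y_{ij.}^2}{r} - \frac{y_{...}^2}{abr} - SSA - SSB \]
\[ SSE = SST - SSA - SSB - SSAB \]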
ANOVA TABLE
Source    Df                SS      MS      F
A         a − 1             SSA     MSA     MSA/MSE
B         b − 1             SSB     MSB     MSB/MSE
AB        (a − 1)(b − 1)    SSAB    MSAB    MSAB/MSE
ERROR     ab(r − 1)         SSE     MSE
Total     abr − 1           SST
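The table entries can be computed as sketched below for a two-factor factorial laid out as a CRD with r observations per cell. The data are hypothetical (2 x 2 with r = 2) and `factorial_anova` is an illustrative name, not a function from any library.

```python
def factorial_anova(y):
    """Two-factor factorial ANOVA in a CRD.

    y[i][j] is the list of r observations at level i of A and level j of B.
    """
    a, b, r = len(y), len(y[0]), len(y[0][0])
    n = a * b * r
    obs = [v for row in y for cell in row for v in cell]
    cf = sum(obs) ** 2 / n                                   # correction factor
    ss_total = sum(v * v for v in obs) - cf
    a_totals = [sum(sum(cell) for cell in row) for row in y]
    b_totals = [sum(sum(y[i][j]) for i in range(a)) for j in range(b)]
    ss_a = sum(x * x for x in a_totals) / (b * r) - cf
    ss_b = sum(x * x for x in b_totals) / (a * r) - cf
    ss_cells = sum(sum(cell) ** 2 for row in y for cell in row) / r - cf
    ss_ab = ss_cells - ss_a - ss_b                           # interaction SS
    ss_error = ss_total - ss_a - ss_b - ss_ab
    ms_error = ss_error / (a * b * (r - 1))
    return {
        "SSA": ss_a, "SSB": ss_b, "SSAB": ss_ab, "SSE": ss_error, "SST": ss_total,
        "F_A": (ss_a / (a - 1)) / ms_error,
        "F_B": (ss_b / (b - 1)) / ms_error,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }


# Hypothetical data: 2 levels of A, 2 levels of B, 2 observations per cell
example_data = [
    [[4, 6], [7, 9]],
    [[8, 10], [15, 17]],
]
anova = factorial_anova(example_data)
for name, value in anova.items():
    print(name, value)
```

For an RCBD layout a block sum of squares would also be removed from the error, which this sketch does not do.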
Example
An engineer designing a battery for use in a device that would be subjected to some
extreme variation in temperature has three types of plate materials to use. He decided to
test the plate materials under three temperature settings (15°F, 70°F, 125°F) to see their
effect on the life of a battery. Four test runs are to be made at each treatment
combination. Test the effect of temperature and plate material and their possible
interaction on the battery life.
                Temperature
Type        15°F     70°F     125°F
A
Tyres
Car brand A B C
1
3