Lecture-3-With-Computations

The document discusses advanced experimental designs and statistical methods, emphasizing the importance of proper planning and execution in experiments to ensure valid data analysis and conclusions. It covers key concepts such as treatment, response variables, experimental errors, and different models (fixed vs random), along with principles of experimentation like randomization and local control. Additionally, it outlines considerations for choosing experimental designs and approaches for data analysis, including ANOVA assumptions and model building.
Advanced Experimental Designs
Introduction
Statistical Methods

“Most data cannot be efficiently utilized without statistical method and that statistical method is futile unless applied to a data”
Snedecor & Cochran, 1989


Introduction
Statistical method – tools to simplify and interpret data
Experimental design – the way a statistical experiment
is actually planned, including what, when, where and
how the data will be gathered and recorded
• Involves the assignment of treatments to the experimental units
• Requires a thorough understanding of the analysis to be performed once the data become available
• Alternative: simple descriptive statistics
Relationship between statistics and experimental designs
• Experiments involve probability samples
• Experiments lead to inferences that must be accompanied by probability statements
• Experiments should be designed in accordance with the principles of statistics for the probability statements to be valid
Importance of Experimental Designs
• Ensures cost-effective collection of appropriate data
• Provides a roadmap for appropriate and valid analysis of data
• Allows valid conclusions to be drawn using statistical inference
Definition of Terms
• Treatment or Factor – set of experimental procedures or conditions whose effects are to be measured and compared
• Response Variable – characteristic used to measure the effect of the treatment
• Experimental unit (eu) – unit or group of units to which a single treatment is applied
• Sampling unit (su) – portion of the eu on which the response variable is observed or measured
• Experimental error – variation in the observed values of the response variable among experimental units treated alike
• Sampling error – measure of variation among sampling units within an experimental unit
Sources of experimental error
• Inherent variability of the experimental materials used
• Errors in experimentation
• Errors in observations and measurement
• Combined effects of all extraneous uncontrolled factors
Fixed Model vs Random Model
Fixed model – all factors under test are fixed factors
• A factor is considered fixed when its levels are selected on
purpose
Example:
Compare the yield of 4 common rice varieties in the locality
Random model – all factors under test are random factors
• A factor is random when the levels tested are a random sample from a population of levels
Example:
Genetic variability of native chickens randomly collected
within the province
Precision
• The ability of the measurement to be consistently
reproduced
• Measured by the variance
• Ways to increase precision:
• Increase number of samples
• Skillful grouping of the experimental materials
• Proper selection of treatments
Accuracy
• Closeness of the observed values to the true values
• Measured by the bias (difference between the average of the values and the true value)
• Ways to increase accuracy:
• Refinement of experimental technique
• Proper selection of treatments
Principles of Experimentation
Randomization
• Method of allocating the treatment levels to the eu’s using some chance mechanism, such that every eu is equally likely to be assigned to any treatment level
Functions:
• To have a valid measure of experimental error
• To provide a random sample of observation
• To eliminate systematic bias in assigning the treatments to the eu’s
• To satisfy assumption of independent errors
• To minimize errors associated with experimental units that are adjacent in
space and time
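The allocation described above can be sketched in a few lines of Python. This is only an illustration for a completely randomized layout: the treatment labels, the number of replicates, the seed and the function name `randomize_crd` are assumptions made for the example, not values from the lecture.

```python
import random

def randomize_crd(treatments, replicates, seed=None):
    """Randomly allocate each treatment level to `replicates` experimental units."""
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    # One label per eu: every treatment appears `replicates` times
    labels = [trt for trt in treatments for _ in range(replicates)]
    rng.shuffle(labels)        # the chance mechanism
    # eu's are numbered 1..n in their physical order (pens, plots, ...)
    return {eu: trt for eu, trt in enumerate(labels, start=1)}

# Hypothetical example: 3 treatment levels, 4 replicates, 12 eu's
layout = randomize_crd(["T1", "T2", "T3"], replicates=4, seed=2024)
for eu, trt in layout.items():
    print(f"eu {eu:2d} -> {trt}")
```

Because the labels are shuffled as a whole, every eu has the same chance of receiving any treatment level, and each level still appears exactly 4 times.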
Principles of Experimentation
Local Control (Error Control)
• All practices or techniques used to minimize the experimental error, including balancing, blocking and grouping of the experimental units
• Makes the design more efficient, so that the test of significance is more sensitive and the test procedure more powerful
Common techniques of local control:
• Use of most appropriate experimental design
• Use of proper shape and size of eu
• Use of concomitant variable
• Refinement of experimental technique
Principles of Experimental Design
Grouping – placing homogeneous eu’s into groups and comparing the treatments within each group
Blocking – grouping the experimental units into
blocks such that the units within a block are relatively
homogeneous
Balancing – grouping, blocking and assignment of the
treatments to the eu in such a way that a balanced
configuration results
Structure of Experimental Design
Treatment Structure – consists of the set of treatments, treatment combinations or populations that the researcher has selected to study/compare
• One-way treatment – treatments consist of levels of a single factor
• Factorial – treatments consist of all possible combinations of the levels of two or more factors being tested
• Fractional factorial – only a fraction of the full set of treatment combinations in a factorial arrangement is tested
• Factorial arrangement with one or more controls
• Nested treatment structure
Design Structure – refers to the grouping of the experimental units into homogeneous groups or blocks
• Completely randomized design
• Randomized complete block design
• Latin square design
• Split-plot design
• Strip-plot design
• Strip-split plot design
• Incomplete block design
Considerations in choosing the
experimental design
Objectives of the research
Description of responses or animal/farm performance to be evaluated
• Quantitative vs qualitative
• Univariate vs multivariate
Descriptions of factors hypothesized to influence animal or farm performance [independent variables]
• Primary factors, nuisance factors (blocking factors/factor levels)
• Fixed effects vs random effects
Practical restrictions (cost, time, allocations of limited resources)
Approaches in data analysis
• Exploratory data analysis – process of looking at raw data to decide on its important features
  • Rounding-off data
  • Grouping of data in a convenient form
  • Identifying outliers
  • Finding mean and quantile values
  • Determining violations of the assumptions of ANOVA
    • Non-additivity
    • Heterogeneity of variance
    • Non-normality
    • Non-linearity
Approaches in data analysis
Model Building – mathematical (linear or non-linear) model to describe biological processes or relationships.
• Represents theoretical relationships that are observed to exist in the real-life process, situation or problem
Analysis of variances
Comparison among treatment means
Assumptions of ANOVA
• Homogeneity of variance of the experimental errors (heterogeneity may arise from, e.g., erratic effects of treatments, outliers, and non-normal or skewed distribution of data)
Tests:
Hartley’s F-max; Ho: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; DR: Reject Ho if the computed Fmax > tabular Fmax (0.05). Should not be used for heavily skewed distributions.
Bartlett’s test (U); Ho: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$. Most powerful when all treatments are normally distributed.
Levene’s test. Performs an ANOVA on the absolute deviations of the observations from their treatment means. The DR is based on the comparison of Fc and Ft (α = 0.05).
Brown-Forsythe test.
Cochran’s test.
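As a rough illustration of Hartley's F-max, the sketch below computes the ratio of the largest to the smallest treatment variance for the backfat thickness data used in the CRD example of this lecture. The function name `hartley_fmax` is an assumption for the example, and the critical value is not computed here: it still has to be read from an F-max table for t = 3 treatments and r – 1 = 3 degrees of freedom.

```python
from statistics import variance

# Backfat thickness (mm) by protein level, from the CRD example in this lecture
groups = {
    1: [28, 26, 30, 28],
    2: [27, 25, 26, 26],
    3: [24, 24, 26, 25],
}

def hartley_fmax(samples):
    """Largest sample variance divided by the smallest (equal group sizes assumed)."""
    variances = [variance(obs) for obs in samples]
    return max(variances) / min(variances)

fmax = hartley_fmax(groups.values())
print(f"Computed F-max = {fmax:.3f}")
```

The computed value is then compared with the tabular Fmax at α = 0.05, following the decision rule stated above.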
Assumptions of ANOVA
Normality of experimental errors (violations, e.g. extremely skewed distributions, can be reduced by increasing sample sizes)
Tests:
Chi-square goodness of fit
Kolmogorov-Smirnov
Shapiro-Wilk W test
Normal probability plots
Assumptions of ANOVA
Independence of errors
Tests:
Plot of the residuals against the predicted Yij. No pattern suggests that the errors are independent.
Durbin-Watson
Runs test
Additivity of effects
No interaction between treatment and environmental effects
Effects of treatments across
Requires transformation if a violation occurs
Test: Tukey’s test for non-additivity
Model Building
Weight gains (kg) of pigs given varying levels of probiotic Natuphos (phytase)

Level (gm/kg feeds)   Sex   Initial Weight   Weight gain
2                     M     12.0             24
4                     F     10.0             20
6                     F     11.0             21
2                     M     13.0             25
4                     M     11.0             23
6                     F     10.0             22
2                     F     9.5              19.5
4                     M     10.5             21
6                     M     11.5             23
2                     F     12.0             25.5
4                     F     10.5             21.5
6                     M     12.5             26.5
2                     F     11.5             23
4                     F     9.5              20.5
6                     M     10.5             21

The model will be:
$y_{ijk} = \mu + \text{Level}_i + \text{Sex}_j + \text{Initial\_weight}_k + \varepsilon_{(ijk)}$
where
$y_{ijk}$ is the observed value of weight gain
$\mu$ is the overall mean of weight gain
$\varepsilon_{(ijk)}$ is the random error associated with the ith level given to the jth sex with the kth initial weight
SAS codes and output
Linear Model for CRD
• Without subsampling
Example: Consider the backfat thickness (BFT, mm) of pigs with varying protein levels in the diet.

Level   Replication
        I     II    III   IV
1       28    26    30    28
2       27    25    26    26
3       24    24    26    25

treatment: protein level
trt levels: 1 2 3
eu's: pigs
su's: pigs
response variable: BFT

Equation: Y = u + t + e
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$; $i = 1, 2, 3$; $j = 1, 2, \ldots, r_i$
where
$y_{ij}$ = the BFT of the jth pig fed the ith level
$\mu$ = the true mean BFT of pigs
$\tau_i$ = the effect of the ith protein level
$\varepsilon_{ij}$ = experimental error associated with the BFT of the jth pig fed the ith level
The ANOVA Table (CRD)
Analysis of Variance

Sources of          Degrees of   Sum of    Mean      Computed   Tabular F-Value
variation           Freedom      Squares   Squares   F-Value    α = 0.05   α = 0.01
Treatment (Level)   2            21.50     10.750    7.588*     4.256      8.021
Exp'l Error         9            12.75     1.417
Total               11           34.25
* - Significant at α = 0.05.
Equations and Manual Calculations (CRD)
Step 1. Compute the treatment totals and means as follows:

Level   Replication                 Treatment   Treatment
        I     II    III   IV        Totals      means
1       28    26    30    28        112         28.00
2       27    25    26    26        104         26.00
3       24    24    26    25        99          24.75
GT = Σx = 315
GM = µ = (Σx/n) = 26.25
Step 2. Compute the correction factor as:
$CF = (GT)^2/n = 315^2/12 = 8268.75$
Step 3. Degrees of Freedom (df):
Treatment df = t – 1 = 3 – 1 = 2,
Error df = n – t = 12 – 3 = 9, and
Total df = n – 1 = 12 – 1 = 11
Step 4. Compute the Total Sum of Squares (Total SS):
$\text{Total SS} = \sum_i \sum_j y_{ij}^2 - CF$
$= 28^2 + 26^2 + 30^2 + \ldots + 25^2 - CF$
$= 8303 - 8268.75 = 34.25$
Step 5. Treatment Sum of Squares (Treatment SS):
$\text{Treatment SS} = \sum_i T_i^2 / r - CF$, where $T_i$ is the total of the ith treatment and $r = 4$
$= (112^2 + 104^2 + 99^2)/4 - CF$
$= 8290.25 - 8268.75 = 21.50$
Error SS = Total SS – Treatment SS
= 34.25 – 21.50 = 12.75
Step 6. Compute the mean squares for treatment and error, i.e., MST and MSE:
MST = Treatment SS/(t – 1)
= 21.50/2 = 10.75
MSE = Error SS/(n – t)
= 12.75/9 = 1.417
Step 7. Compute the F-value:
Computed F (Fc) = MST/MSE
= 10.75/1.417 = 7.588
Step 8. Compare the computed F with the tabular F-value at α = 0.05 and α = 0.01 and check your assumptions (hypothesis)

Hypothesis testing:
Null hypothesis Ho: All treatment means are equal, i.e., $\mu_1 = \mu_2 = \mu_3$
Alternative hypothesis Ha: At least one of the treatment means is different, i.e., $\mu_i \neq \mu_{i'}$ for at least one pair of treatments
Test Statistic: F-Test → F = MST/MSE
Decision Rule DR: reject Ho if the computed F-value > the tabular F-value at α = 0.05, otherwise do not reject.
• Decision: Since the computed F-value (7.588) > the tabular F-value (α = 0.05 → 4.256), there is enough evidence to reject the Ho that all treatments are equal, i.e., at least one of the treatments is different.
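The manual CRD steps can be checked with a short script. The sketch below follows the same correction-factor approach on the backfat thickness data; the function name `crd_anova` and the dictionary layout are assumptions for the example, and the tabular F-values still have to be looked up separately.

```python
# Backfat thickness (mm) of pigs by protein level (data from the example above)
data = {
    1: [28, 26, 30, 28],
    2: [27, 25, 26, 26],
    3: [24, 24, 26, 25],
}

def crd_anova(groups):
    """One-way ANOVA for a CRD using the correction-factor formulas."""
    all_obs = [y for obs in groups.values() for y in obs]
    n, t = len(all_obs), len(groups)
    cf = sum(all_obs) ** 2 / n                        # Step 2: correction factor
    total_ss = sum(y * y for y in all_obs) - cf       # Step 4: Total SS
    trt_ss = sum(sum(obs) ** 2 / len(obs) for obs in groups.values()) - cf  # Step 5
    err_ss = total_ss - trt_ss
    mst = trt_ss / (t - 1)                            # Step 6: mean squares
    mse = err_ss / (n - t)
    return {"CF": cf, "TotalSS": total_ss, "TrtSS": trt_ss,
            "ErrSS": err_ss, "MST": mst, "MSE": mse, "F": mst / mse}

result = crd_anova(data)
for name, value in result.items():
    print(f"{name:8s} {value:10.3f}")
```

The printed values should agree with the ANOVA table for this example up to rounding.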
Data Formatting and SAS code for CRD
without sub-sampling
Linear model for CRD
• With Sub-sampling
Y = u + t + e + d
$y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \delta_{ijk}$

Soil type    Soil samples   Location
                            1       2       3       4
Bat-ongan    1              9.1     7.3     7.3     10.7
             2              7.3     9.0     8.9     12.7
Cabitan      1              12.6    9.1     10.9    8.0
             2              14.5    10.8    12.8    9.8
Dayao        1              7.3     6.6     5.2     5.3
             2              9.0     8.4     6.9     6.8

Where, based on the table,
$y_{ijk}$ = observed value of phosphorus in the kth sampling unit within the jth location in the ith soil type
$\mu$ = the overall mean
$\tau_i$ = the effect of the ith soil type
$\varepsilon_{ij}$ = the random error associated with the jth location in the ith soil type
$\delta_{ijk}$ = the random error associated with the kth sampling unit in the jth location in the ith soil type
Manual Computation (CRD with Sub-sampling)
Phosphorus Concentration from Location to Location within a Soil Type

Soil Type (ST)   Soil Samples       Locations                          ST Totals   ST Means
                                    1       2       3       4
Bat-ongan        1                  9.10    7.30    7.30    10.70
                 2                  7.30    9.00    8.90    12.70      72.30       9.04
                 Sample Total (B)   16.40   16.30   16.20   23.40
Cabitan          1                  12.60   9.10    10.90   8.00
                 2                  14.50   10.80   12.80   9.80       88.50       11.06
                 Sample Total (C)   27.10   19.90   23.70   17.80
Dayao            1                  7.30    6.60    5.20    5.30
                 2                  9.00    8.40    6.90    6.80       55.50       6.94
                 Sample Total (D)   16.30   15.00   12.10   12.10
GT = 216.30
GM = GT/n = 9.0125
Step 1. Correction Factor:
$CF = (GT)^2/n = (216.30)^2/24 = 1949.40$
Step 2. Degrees of Freedom:
Total df = n – 1 = 23, Treatment df = t – 1 = 2,
Error df = t(r – 1) = 9, and Sampling df = tr(s – 1) = 12
Step 3. Total SS
$\text{Total SS} = \sum_i \sum_j \sum_k y_{ijk}^2 - CF$
$= (9.1^2 + 7.3^2 + 7.3^2 + 10.7^2 + \ldots + 6.8^2) - CF$
$= 2087.21 - 1949.40 = 137.81$
Step 4. Treatment SS (Soil Type):
$\text{Treatment SS} = \sum_i (\text{Soil Type Total}_i)^2/(rs) - CF$
$= (72.30^2 + 88.50^2 + 55.50^2)/8 - 1949.40 = 68.07$
Step 5. Error SS:
$ESS = \sum_i \sum_j (\text{Sample Total}_{ij})^2/s - \sum_i (\text{Soil Type Total}_i)^2/(rs)$
$= 2068.56 - 2017.47 = 51.08$
Step 6. Sub-sampling Error SS (SSD):
SSD = Total SS – Treatment SS – Error SS
≈ 18.65 (18.655 from the unrounded sums)
Note: The MS for each source is derived by dividing the SS by its respective DF.
ANOVA (CRD with Sub-sampling)
ANOVA

Sources              DF    SS       MS       Computed F   F-Value (0.05)
Soil Type            2     68.07    34.035   5.997        4.256
Exp’l Error (E)      9     51.08    5.676    3.652        2.796
Sampling Error (D)   12    18.65    1.554
Total                23    137.81
Hypothesis Testing 1: Test for variability of locations within soil types.
Ho: The phosphorus concentrations of the locations within soil types are the same.
DR: Reject Ho if Fc > Ft (α, edf, sdf) → F(0.05, 9, 12) = 2.796, otherwise do not reject.
Test Statistic: Fc = MSE/MSD = 5.676/1.554 = 3.652.
Decision: Since Fc = 3.652 > Ft = 2.796, we reject Ho.
• Conclusion: At 5% level of significance, the phosphorus level varies from location to location within a soil type.

Hypothesis Testing 2: Test of differences among treatment means:
Ho: There are no differences in the phosphorus concentration among the different soil types.
DR: Reject Ho if Fc > Ft (α, tdf, edf) → F(0.05, 2, 9) = 4.256, otherwise do not reject.
Test Statistic: Fc = MST/MSE = 34.035/5.676 = 5.997.
Decision: Since Fc = 5.997 > Ft = 4.256, we reject Ho.
• Conclusion: At 5% level of significance, the phosphorus level varies among soil types.
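The sums of squares and the two F ratios for the sub-sampling example can be reproduced with the sketch below. It assumes balanced data (the same number of locations per soil type and of samples per location); the function name `crd_subsampling_anova` and the nested-list layout are choices made for this illustration.

```python
# obs[soil type] = one list per location, each holding the two soil-sample values
obs = {
    "Bat-ongan": [[9.1, 7.3], [7.3, 9.0], [7.3, 8.9], [10.7, 12.7]],
    "Cabitan":   [[12.6, 14.5], [9.1, 10.8], [10.9, 12.8], [8.0, 9.8]],
    "Dayao":     [[7.3, 9.0], [6.6, 8.4], [5.2, 6.9], [5.3, 6.8]],
}

def crd_subsampling_anova(obs):
    """ANOVA for a CRD with sub-sampling (balanced data assumed)."""
    groups = list(obs.values())
    t, r, s = len(groups), len(groups[0]), len(groups[0][0])
    values = [y for eus in groups for eu in eus for y in eu]
    cf = sum(values) ** 2 / len(values)
    total_ss = sum(y * y for y in values) - cf
    # Treatment SS from soil-type totals, each based on r*s observations
    trt_ss = sum(sum(map(sum, eus)) ** 2 for eus in groups) / (r * s) - cf
    # SS among all eu (location) totals, each based on s observations
    eu_ss = sum(sum(eu) ** 2 for eus in groups for eu in eus) / s - cf
    err_ss = eu_ss - trt_ss        # experimental error: locations within soil types
    samp_ss = total_ss - eu_ss     # sampling error: samples within locations
    mst = trt_ss / (t - 1)
    mse = err_ss / (t * (r - 1))
    msd = samp_ss / (t * r * (s - 1))
    return {"TotalSS": total_ss, "TrtSS": trt_ss, "ErrSS": err_ss,
            "SampSS": samp_ss, "F_trt": mst / mse, "F_err": mse / msd}

res = crd_subsampling_anova(obs)
for name, value in res.items():
    print(f"{name:8s} {value:9.3f}")
```

Note that the soil-type F ratio uses the experimental error mean square as its denominator, while the experimental error itself is tested against the sampling error, as in the two hypothesis tests above.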
Linear model for RCBD
• Without sub-sampling
Rice yield (tons/ha) at different seeding rate

Seeding rate   Blocks
               1      2      3      4
25             5.1    5.4    5.3    4.7
50             5.3    6.0    4.7    4.3
75             5.3    5.7    5.5    4.3
100            5.2    4.8    5.0    4.7
125            4.8    4.8    4.4    4.4
150            5.3    4.5    4.9    4.1

General equation
$y_{ij} = \mu + \tau_i + \rho_j + \varepsilon_{ij}$; $i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, r$
where
$y_{ij}$ is the observed value from the eu in the jth block given the ith treatment
$\mu$ is the overall mean
$\tau_i$ is the effect of the ith treatment
$\rho_j$ is the effect of the jth block
$\varepsilon_{ij}$ is the random error associated with the eu in the jth block given the ith treatment
RCBD Manual Computation

Seeding        Blocks                        Treatment   Treatment
Rate (kg)      1      2      3      4        Total       Mean
25             5.1    5.4    5.3    4.7      20.5        5.125
50             5.3    6.0    4.7    4.3      20.3        5.075
75             5.3    5.7    5.5    4.3      20.8        5.200
100            5.2    4.8    5.0    4.7      19.7        4.925
125            4.8    4.8    4.4    4.4      18.4        4.600
150            5.3    4.5    4.9    4.1      18.8        4.700
Block Total    31.0   31.2   29.8   26.5
GT = 118.5
GM = 4.938
Step 1. Compute the Correction Factor (CF)
$CF = (GT)^2/n = 118.5^2/24 = 585.09375$
Step 2. Degrees of Freedom
Total df = n – 1 = 24 – 1 = 23
Block df = b – 1 = 4 – 1 = 3
Treatment df = t – 1 = 6 – 1 = 5
Error df = (r – 1)(t – 1) = (4 – 1)(6 – 1) = 3 × 5 = 15.
Step 3. Compute the Sum of Squares
Total SS $= 5.1^2 + 5.4^2 + 5.3^2 + \ldots + 4.1^2 - CF$
$= 590.47 - 585.09375 = 5.376$
Treatment SS $= (20.5^2 + 20.3^2 + 20.8^2 + 19.7^2 + 18.4^2 + 18.8^2)/4 - 585.09375 = 1.174$
Block SS $= (31.0^2 + 31.2^2 + 29.8^2 + 26.5^2)/6 - 585.09375 = 2.361$
Error SS = Total SS – Treatment SS – Block SS
= 5.376 – 1.174 – 2.361
= 1.841
Step 4. Mean Squares (MS): Divide each Sum of Squares by its respective degrees of freedom.
Step 5. Computed F-Values: Divide the Treatment MS and Block MS by the Error MS.
Step 6. Compare the Computed F with the Tabular F (i.e., α = 0.05 and 0.01).
Analysis of variance

Sources of   Degrees of   Sum of    Mean      Computed   Tabular F-Value
Variation    Freedom      Squares   Squares   F-Value    0.05     0.01
Block        3            2.361     0.787     6.412      3.287    5.417
Treatment    5            1.174     0.235     1.912      2.901    4.556
Error        15           1.841     0.123
Total        23           5.376
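As a cross-check of the RCBD computations, the sketch below recomputes the sums of squares and F ratios from the seeding-rate data in the manual-computation table. The function name `rcbd_anova` and the dictionary layout are assumptions for the example; one observation per treatment per block is assumed.

```python
# yields[seeding rate] = yields (tons/ha) in blocks 1 to 4
yields = {
    25:  [5.1, 5.4, 5.3, 4.7],
    50:  [5.3, 6.0, 4.7, 4.3],
    75:  [5.3, 5.7, 5.5, 4.3],
    100: [5.2, 4.8, 5.0, 4.7],
    125: [4.8, 4.8, 4.4, 4.4],
    150: [5.3, 4.5, 4.9, 4.1],
}

def rcbd_anova(table):
    """ANOVA for an RCBD with one observation per treatment-block cell."""
    rows = list(table.values())
    t, b = len(rows), len(rows[0])
    values = [y for row in rows for y in row]
    cf = sum(values) ** 2 / (t * b)
    total_ss = sum(y * y for y in values) - cf
    trt_ss = sum(sum(row) ** 2 for row in rows) / b - cf          # treatment totals
    blk_ss = sum(sum(col) ** 2 for col in zip(*rows)) / t - cf    # block totals
    err_ss = total_ss - trt_ss - blk_ss
    mse = err_ss / ((t - 1) * (b - 1))
    return {"TotalSS": total_ss, "TrtSS": trt_ss, "BlockSS": blk_ss,
            "ErrSS": err_ss, "MSE": mse,
            "F_block": (blk_ss / (b - 1)) / mse,
            "F_trt": (trt_ss / (t - 1)) / mse}

anova = rcbd_anova(yields)
for name, value in anova.items():
    print(f"{name:8s} {value:8.3f}")
```

The results should match the analysis of variance table up to rounding.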
Sample SAS code for RCBD without SS
• Example 1
*rice yield at different seeding rate;
data rice_yield;
input block rate yield;
cards;
(your data lines go here: block rate yield)
;
proc anova data=rice_yield;
class rate block;
model yield=rate block;
means rate/duncan;
title 'RCBD ANOVA';
run;
quit;

• Example 2
*rice yield at different seeding rate;
data rice_yield;
input block rate yield;
cards;
(your data lines go here: block rate yield)
;
proc glm data=rice_yield;
class rate block;
model yield=block rate;
means rate/duncan;
lsmeans rate/stderr;
run;
quit;
Linear model for RCBD
• With sub-sampling
General equation:
$y_{ijk} = \mu + \tau_i + \rho_j + \varepsilon_{ij} + \delta_{ijk}$; $i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, r$; $k = 1, 2, \ldots, s$;
where
$y_{ijk}$ = observed value on the kth su in the eu in the jth block given the ith treatment
$\mu$ = the overall mean
$\tau_i$ = effect of the ith treatment
$\rho_j$ = effect of the jth block
$\varepsilon_{ij}$ = experimental error associated with the eu in the jth block given the ith treatment
$\delta_{ijk}$ = sampling error associated with the kth su in the jth block given the ith treatment
Factorial Experiment
• n-way treatment structure – n factors, say A1, A2, . . ., An at levels a1, a2, . . . , an, respectively. The no. of combinations/treatments is a1 × a2 × . . . × an
Example:
• Compare the effects of 6 N levels, 2 levels of irrigation, 4 var of rice and their interaction
• Compare the effects of 3 protein sources, 2 protein levels and their interaction
• Compare the effects of 4 different machines, 4 operators and 4 work shifts and their interaction
Two-factor factorial
Example: Effects of different dietary levels of calcium and supplemental Vitamin C on growth of broilers
• Factor 1: dietary levels of calcium (3)
• Levels: 1, 2 and 3
• Factor 2: Vitamin C supplement (3)
• Levels: 50ppm, 100ppm and 150ppm
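To make the counting rule concrete, the sketch below lists the treatment combinations of the calcium × Vitamin C example and counts the combinations for the other examples on this slide. The helper name `n_combinations` is an assumption for the illustration.

```python
from itertools import product
from math import prod

# Two-factor example: 3 calcium levels x 3 Vitamin C levels
calcium = [1, 2, 3]
vitamin_c_ppm = [50, 100, 150]
treatments = list(product(calcium, vitamin_c_ppm))
for ca, vc in treatments:
    print(f"calcium level {ca}, Vitamin C {vc} ppm")

def n_combinations(levels):
    """Number of treatments in a full factorial: a1 x a2 x ... x an."""
    return prod(levels)

print(n_combinations([6, 2, 4]))   # N levels x irrigation levels x rice varieties
print(n_combinations([3, 2]))      # protein sources x protein levels
print(n_combinations([4, 4, 4]))   # machines x operators x work shifts
```

Each tuple in `treatments` is one treatment combination to be assigned to experimental units.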
Factorial Experiment
Treatment effects
Simple effect – difference between levels of one factor at each level of the other factor
Main effect – average of the simple effects of a factor across levels of another factor
Interaction effect – difference of the simple effects of one factor at the different levels of the other factors

Calcium level              Vitamin C          Simple effect of V-C
                           50       100
1                          78.7     90.7      12.0
2                          71.3     80.7      9.4
Simple effect of calcium   7.4      10.0      2.60 (interaction effect)

Calcium level              Vitamin C          Main effect of Calcium
                           50       100
1                          78.7     90.7      84.7
2                          71.3     80.7      76.0
Mean of V-C                75.0     85.7

Main effect of calcium: 84.7 – 76.0 = 8.7
Main effect of V-C: 85.7 – 75.0 = 10.7
Interaction effect: (90.7 – 78.7) – (80.7 – 71.3) = 2.6
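The three kinds of effects can be computed directly from the four cell means. The sketch below uses the calcium × Vitamin C means from this slide and follows the slide's definition of the interaction as the plain difference between the simple effects; the variable names are assumptions for the example.

```python
# Cell means: means[calcium level][Vitamin C level in ppm]
means = {
    1: {50: 78.7, 100: 90.7},
    2: {50: 71.3, 100: 80.7},
}

# Simple effect of Vitamin C (100 vs 50 ppm) at each calcium level
se_vc = {ca: row[100] - row[50] for ca, row in means.items()}
# Simple effect of calcium (level 1 vs level 2) at each Vitamin C level
se_ca = {vc: means[1][vc] - means[2][vc] for vc in (50, 100)}

# Main effect = average of the simple effects across the other factor
main_vc = sum(se_vc.values()) / len(se_vc)
main_ca = sum(se_ca.values()) / len(se_ca)

# Interaction = difference between the simple effects
interaction = se_vc[1] - se_vc[2]

print(f"Simple effects of V-C: {se_vc[1]:.1f}, {se_vc[2]:.1f}")
print(f"Simple effects of calcium: {se_ca[50]:.1f}, {se_ca[100]:.1f}")
print(f"Main effect of V-C: {main_vc:.1f}")
print(f"Main effect of calcium: {main_ca:.1f}")
print(f"Interaction effect: {interaction:.1f}")
```

The same interaction value is obtained from the calcium side, since `se_ca[100] - se_ca[50]` is the same contrast of the four cell means.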
Linear Models for Two-Factorial Experiment
Completely Randomized Design
$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
Randomized Complete Block Design
$y_{ijkl} = \mu + \rho_k + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijkl}$
Latin Square Design
$y_{ijklm} = \mu + \rho_k + \gamma_l + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijklm}$

Note: For three or more factor experiments, say factors a, b and c, factor c is just added to the model together with its interactions with a and b, making 4 interactions in the model, i.e., (ab), (ac), (bc) and (abc).
Y = U + A + B + (AB) + E
Y = U + A + B + C + (AB) + (AC) + (BC) + (ABC) + E
Variety 1
Nitrogen
Levels 1 2 3 4
A 5.5 5.7 5.1 5.8
B 4.8 5.2 5.4 5.3
C 5.2 5.1 4.8 5.2
Variety 2
A 5.6 6.1 6.2 6.5
B 5.7 5.6 6.1 6.1
C 5.8 5.7 5.8 6.3
Variety 3
A 4.9 4.9 5.1 4.8
B 4.5 5.1 4.8 4.9
C 4.3 5.1 5.3 5.1
Split-Plot Design
Features:
• Specially suited for two-factor experiments
• Special feature: two sizes of eu’s, i.e., main plot and sub-plot
• Main plots (A) are divided into sub-plots (B) to which the levels of the second factor are assigned
• Precision of the estimates of the main-plot factor effects is sacrificed to improve the precision of the estimates of the sub-plot factor effects
When to use:
• Levels of one factor require larger eu’s than those of the other factor
• Greater precision is desired for comparisons among levels of one factor than for those of the other factor
• Larger variation can be expected among the levels of one factor than among the levels of the other factor
• An additional factor is added to the experiment to increase its scope
Linear Model for Split-plot
• Split-plot in CRD lay-out
$y_{ijk} = \mu + \alpha_i + \delta_{ij} + \beta_k + (\alpha\beta)_{ik} + \varepsilon_{ijk}$

where: $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, r$; $k = 1, 2, \ldots, b$;
$y_{ijk}$ = observation from the sub-plot given the kth level of B in the jth main plot given the ith level of A
$\mu$ = overall mean
$\alpha_i$ = effect of the ith level of factor A
$\delta_{ij}$ = random error associated with the jth main plot given the ith level of A
$\beta_k$ = effect of the kth level of factor B
$(\alpha\beta)_{ik}$ = interaction effect between the ith level of A and the kth level of B
$\varepsilon_{ijk}$ = random error associated with the sub-plot given the kth level of B in the jth main plot given the ith level of A
Sample data (RCBD)
Effects of varying levels of phosphorus on the yield of different varieties of soybean

Block   Variety   Phosphorus
                  0       30      60      90
1       1         53.5    60.5    60.8    59.6
1       2         44.8    51.0    51.5    49.9
1       3         50.7    54.9    59.4    64.7
2       1         62.2    68.8    70.9    67.8
2       2         52.5    58.7    59.4    58.1
2       3         61.4    64.9    70.0    74.4
3       1         53.4    59.5    61.0    60.3
3       2         43.1    49.6    49.7    49.5
3       3         50.6    54.8    60.5    65.0

Main plot factor (A): Variety
Sub-plot factor (B): Phosphorus
Experimental units: soybean plots
Blocks: soil characteristics
Response variable: yield

Model:
$Y_{ijk} = \mu + \text{block}_j + \text{variety}_i + \text{error}_{ij} + \text{phosphorus}_k + (\text{variety} \times \text{phosphorus})_{ik} + \text{error}_{ijk}$
SAS codes for Split-plot (RCBD)
Principles of Experimental Design
Replication
• Number of times a treatment level appears in the experiment
• Repetition of the basic experiment
• Functions:
• To provide estimate of experimental error
• To increase precision of the estimates of the parameter
• To increase the scope of the experiment
• Factors affecting the number of replications:
• Degree of precision required
• Uniformity of eu’s
• Number of treatments
• Experimental design
• Time allotment for the experiment
• Cost and availability of resources
