STATISTICAL ALGORITHMS FOR OPTIMAL EXPERIMENTAL DESIGN
WITH CORRELATED OBSERVATIONS
by
Chang Li
A dissertation submitted in partial fulfillment
of the requirements for the degree
of
DOCTOR OF PHILOSOPHY
in
Mathematical Sciences
Approved:
2013
Abstract
by
Chang Li, Doctor of Philosophy
Utah State University, 2013
This research is in three parts with different although related objectives. The first part
developed an efficient, modified simulated annealing algorithm to solve the D-optimal (de-
terminant maximization) design problem for 2-way polynomial regression with correlated
observations. Much of the previous work in D-optimal design for regression models with
correlated errors focused on polynomial models with a single predictor variable, in large
part because of the complexity of the multi-predictor design space. The modified algorithm
required careful tuning of the annealing cooling parameters, thresholds, and search
neighborhoods for the perturbation scheme, and it finds approximate D-optimal designs for
2-way polynomial regression for a variety of specific correlation structures with a given
correlation coefficient. Results in each correlated-errors case are compared with the best
design selected from the class of designs that are known to be D-optimal in the uncorrelated
case: annealing results had generally higher D-efficiency than the best comparison design,
especially when the correlation was large.
The second research objective, using Balanced Incomplete Block Designs (BIBDs),
was to construct weakly universal optimal block designs for the nearest neighbor correlation
structure and multiple block sizes, for the hub correlation structure with any block size, and
for circulant correlation with odd block size. We also constructed approximately weakly
universal optimal block designs for block-structured correlation.
The third research objective proposed an improved Particle Swarm Optimization (PSO)
algorithm with time varying parameters, and solved the D-optimal design problem for linear
regression with it. Then, based on that improved algorithm, we combined the nonlinear
regression problem and decision making, and developed a nested PSO algorithm that finds
(nearly) optimal experimental designs under each of the pessimistic criterion, the index of
optimism criterion, and the regret criterion for the Michaelis-Menten model and the logistic
regression model.
(79 pages)
Public Abstract
by
Chang Li
In the first part, I developed a modified simulated annealing algorithm that can
successfully determine highly efficient D-optimal designs for second order 2-way polynomial
regression with correlated observations.
In the second part, I constructed weakly universal optimal block designs for the nearest
neighbor correlation structure and multiple block sizes, for the hub correlation structure
with any block size, and for circulant correlation with odd block size.
In the third part, we propose an improved Particle Swarm Optimization (PSO) al-
gorithm with time varying parameters. Then, combining decision-making theory and PSO,
we developed nested PSO algorithms under three decision criteria (pessimistic, index of
optimism, and minimax regret) and compared the quality of the solutions found under each
criterion.
Acknowledgments
I really appreciate my advisor, Professor Daniel C. Coster, for his guidance and help
throughout my graduate study.
I would like to thank my committee members, Professors James Powell, Adele Cutler,
Christopher Corcoran, and Drew Dahl, for their teaching, advice, and help in my study.
I have received a great deal of support, advice, and help from the faculty, staff, and
schoolmates in this department. I am also indebted to the Department of Mathematics and
Statistics for the financial support.
Finally, I also appreciate my parents for their love, help, and encouragement.
Chang Li
Contents

Abstract
Public Abstract
Acknowledgments
List of Tables
1 Introduction
2 Research Problems and Literature Review
  2.1 D-optimality for Polynomial Regression with Correlated Observations
    2.1.1 Model
    2.1.2 Correlation structures
  2.2 Weak universal optimal block design
  2.3 Particle Swarm Optimization algorithm in experimental design and decision making
    2.3.1 Experimental design and the Fisher information matrix
    2.3.2 Models with unknown parameters
    2.3.3 Essential elements of decision making
    2.3.4 Optimization criterion for decision making
3 Simulated Annealing Algorithm for D-optimal Design
  3.1 Improved simulated annealing algorithm for 2-way second-order polynomial regression with correlated observations
    3.1.1 Research Objective
  3.2 The Principle of Simulated Annealing
    3.2.1 Simulated Annealing Algorithm for D-optimal Design for 2-Way Polynomial Regression
  3.3 Improvements from this algorithm compared with a standard simulated annealing algorithm
  3.4 Results and comparison with D-optimal design for 2-way second-order polynomial regression with uncorrelated observations
4 Construction of Weak Universal Optimal Block Design
  4.1 Weak universal optimal block design for nearest neighbor correlation with block size 3 to 6
  4.2 Weak universal optimal block design for hub correlation for any block size
  4.3 Weak universal optimal block design for circulant correlation with odd block size
  4.4 Weak universal optimal block design for block-structured correlation

List of Tables

5.1 Basic PSO for linear regression with circulant correlation structure
5.2 Basic PSO for linear regression with nearest neighbor correlation structure
Chapter 1
Introduction
This dissertation concerns statistical algorithms for optimal design when observations
are correlated rather than independent. Such optimal
experiments are of increasing practical relevance in applied science when responses are
known to be correlated and there is a demand for statistical accuracy (i.e., optimality)
from experiments that are expensive and time consuming to perform. Examples would
include genome mapping experiments and microarray analysis, where genetic association
among observations induces correlation in the responses.
Three separate, although related, research objectives were developed. The first re-
quired the development and implementation of an efficient simulated annealing (SA) al-
gorithm to solve the D-optimal design problem for multi-way polynomial regression with
correlated observations, an important extension of previous work limited to the uncorrelated
errors case.
The creative part of this modified simulated annealing algorithm required the division
of the underlying perturbation scheme into more tractable sub-parts resulting in a more
dynamic scheme and a better defined threshold for searching in the neighborhood of the
target (optimal) solution. This improved algorithm overcomes the limitation of standard
optimization hill-climbing algorithms by allowing the search process to extend beyond lo-
cal optima. The algorithm has been implemented successfully for multiple specifications
of correlation structures, including cyclic, hub, and nearest neighbor (defined elsewhere)
structures.
The second objective of this research continued with the common theme of optimal
design with correlated observations but focused on the design objective of weak universal
optimality in block designs. In correlated optimal design, the order of observations within
blocks is critical for optimality. An efficient way to construct weakly universal optimal
block designs with various correlation structures and block sizes is presented, along with
proofs that the conditions for optimality are satisfied.
The third research objective combined decision making theory and Particle Swarm
Optimization (PSO) and featured nested PSO algorithms and three criterion functions with
application to the Michaelis-Menten model and the two parameter logistic regression model.
Comparisons were made among the quality of solutions found from the three criteria. The
three criteria reflect different levels of “optimism” and “pessimism” associated with the
decision making process in the PSO algorithm and may be adjusted to achieve different
solutions to the design problem. For example, when using the “index of optimism” criterion,
the settings of 0.3 (the decision maker is relatively pessimistic), 0.5 (the decision maker
compromises between the pessimistic and optimistic case) and 0.7 (the decision maker is
relatively optimistic) were used, respectively, and solution quality compared on the design
objective function.
A more complete specification of the three research objectives and accompanying liter-
ature review follows in Chapter 2. Chapter 3 presents results for the first research objective
involving the SA algorithm and D-optimality, Chapter 4 contains theory and applications
for the second research objective, and Chapter 5 deals with the PSO algorithm and results
for two types of models. Discussion and suggestions for future research are in Chapter 6.
Appendices contain annotated examples of Matlab code used to produce numerical results.
Chapter 2
Research Problems and Literature Review

2.1 D-optimality for Polynomial Regression with Correlated Observations

The regression model considered here is

y_i = f_i(x)'β + ε_i   (2.1)

where i = 1...n, β is a k-vector of parameters, and f_i(x) = (f_1i(x), f_2i(x), ..., f_ki(x))
is the vector of regression functions. The errors ε_i are correlated, with various structures
or patterns. Motivation for this research in optimal designs with correlated observations
can be found in [2]. [3] introduced optimal design with correlated observations in detail.
The simulated annealing (SA) algorithm is a probabilistic technique for optimization
in the absence of an analytical solution. This algorithm derives from the principle of
annealing metal: heat the metal to a high temperature first, then decrease the temperature
slowly. As the temperature is decreased, the molecules in the metal tend from unordered
to ordered.
The probabilistic feature of the SA algorithm mimics this behavior in metal by allowing
transitions to less ideal "solutions" during the cooling stage which, in turn, provides the
opportunity to leave local optima, something deterministic algorithms may fail to do.
[4] proposed a simulated annealing algorithm for D-optimal design with uncorrelated
observations; related work appears in [5] and [6]. In [7], Zhu solved the 1-way D-optimal
design problem for polynomial regression with correlated observations. Earlier work also
addressed D-optimal designs with block effects, which can be considered as a special case
of the D-optimal design problem with correlated observations, since the block effects can
be incorporated into the correlation structure.
Most previous work only considered the simplest case, that is, optimal design for
1-way polynomial regression. However, in real world problems, the response variable is
usually influenced by multiple effects and their interactions. This kind of problem is more
complicated, and has not been solved by existing algorithms or their generalizations.
2.1.1 Model:
The full model for second order 2-way polynomial regression is presented in [9] and
[10]. The model for the second order 2-way polynomial regression is:

y_i = β0 + β1 x_1i + β2 x_2i + β11 x_1i^2 + β22 x_2i^2 + β12 x_1i x_2i + ε_i   (2.2)

where i = 1, 2, ..., n, each of the x_1i and x_2i is in [-1, 1], and ε_i has mean 0. The design
matrix is X = (x_ij)_{n×6}: the first column is all 1's, and the other 5 columns correspond
to the values of X_1, X_2, X_1^2, X_2^2, X_1X_2, respectively. That is, each column of X
corresponds to one design variable (or a square or interaction effect) in the model.
The information matrix is

M = X'V^{-1}X   (2.3)

where V is the variance-covariance matrix of the errors.

2.1.2 Correlation structures

We define commonly used correlation structures below for a single correlation parameter ρ.
For the circulant correlation structure,
cov(y_i, y_j) = σ^2,   i = j
             = ρσ^2,  |i − j| = 1 or |i − j| = n − 1
             = 0,     otherwise

so that the correlation matrix is

R = [ 1  ρ  0  0  ...  ρ ]
    [ ρ  1  ρ  0  ...  0 ]
    [ 0  ρ  1  ρ  ...  0 ]
    [ .  .  .  .       . ]
    [ ρ  0  0  0  ...  1 ]
For the nearest neighbor correlation structure,

cov(y_i, y_j) = σ^2,   i = j
             = ρσ^2,  |i − j| = 1
             = 0,     otherwise

so that

R = [ 1  ρ  0  0  ...  0 ]
    [ ρ  1  ρ  0  ...  0 ]
    [ 0  ρ  1  ρ  ...  0 ]
    [ .  .  .  .       . ]
    [ 0  0  0  0  ...  1 ]
For block-structured correlation, the overall correlation matrix is

[ R     R_12  ...  R_1b ]
[ R_21  R     ...  R_2b ]
[ .     .     .    .    ]   (2.5)
[ R_b1  R_b2  ...  R    ]

Here R is a k × k matrix with the elements on the main diagonal equal to 1 and all other
elements equal to ρ, where k is the common block size and ρ is the correlation coefficient
for the observations in the same block. R_ij is a k × k block with all elements equal to
ρ_ij. In this paper we take all of the ρ_ij to be equal.
Note that one commonly used block correlation structure is proposed by [12], with
V = (1 − ρ)I_k + ρJ_k. Here J_k is the k × k matrix with all of the elements equal to 1.
Finally, the hub correlation structure, in which one central unit is correlated with every
other unit, has

R = [ 1  ρ  ρ  ρ  ...  ρ ]
    [ ρ  1  0  0  ...  0 ]
    [ ρ  0  1  0  ...  0 ]
    [ .  .  .  .       . ]
    [ ρ  0  0  0  ...  1 ]
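To make the structures concrete, the following sketch (not from the thesis appendices; the
function name and interface are mine) builds the circulant, nearest neighbor, and hub
correlation matrices in Matlab for a given size n and correlation coefficient rho:

    % corr_structure.m -- build a single-parameter correlation matrix.
    % 'type' is 'circulant', 'nearest', or 'hub'; n is the dimension.
    function R = corr_structure(type, n, rho)
        R = eye(n);
        switch type
            case 'circulant'              % |i-j| = 1 or n-1 (wrap-around)
                for i = 1:n
                    j = mod(i, n) + 1;    % right neighbor, wrapping n -> 1
                    R(i, j) = rho;  R(j, i) = rho;
                end
            case 'nearest'                % |i-j| = 1, no wrap-around
                for i = 1:n-1
                    R(i, i+1) = rho;  R(i+1, i) = rho;
                end
            case 'hub'                    % unit 1 correlated with all others
                R(1, 2:n) = rho;  R(2:n, 1) = rho;
        end
    end

For example, corr_structure('circulant', 9, 0.4) produces the 9 × 9 circulant matrix used
in the n = 9 comparison of Chapter 3.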
2.2 Weak universal optimal block design

[13] present the definition of and a sufficient condition for weak universal optimality
in balanced block designs with correlated observations. The definition is: X** is weakly
universally optimal in a class X* of eligible designs if Ψ(D(X**, V)) ≤ Ψ(D(X*, V)) for
every design X* in X* and for every convex Ψ invariant under permutation of coordinates.
Here X* is usually taken to be all of the BIBDs with certain parameters. A sufficient
condition given by [13] is that X** is weakly universally optimal under covariance matrix
V if (i) D(X**, V) is completely symmetric, and (ii) trace(D(X**, V)) ≤ trace(D(X*, V))
for every X* in X*.
Note that (ii) is also known as the A-optimality criterion. A matrix is completely sym-
metric (CS) if it is of the form aI_k + bJ_k, where a and b are scalars and J_k is a k × k
matrix with all elements equal to 1. [13] also define, for X* in X*, D(X*, V) = cov(t̂_0 | V),
where t̂_0 is the ordinary least squares estimator of the treatment effects, usually known as
the least squares (LS) estimator. cov(t̂_0 | V) is the covariance matrix of the LS estimator
t̂_0 under design X* and covariance matrix V.
Balanced Incomplete Block Designs (BIBD) provide a foundation for this research
about weak universal optimal block designs. A BIBD is a block design with v treatments,
b blocks, each block having size k. "Incomplete" means k < v, and "balanced" means each
treatment appears in exactly r blocks and each pair of treatments appears together in the
same number of blocks; this number is denoted by λ. For a BIBD, the parameters v, k,
b, r, and λ satisfy:

vr = kb   (2.8)

and

λ = r(k − 1)/(v − 1)   (2.9)

The construction of all kinds of BIBDs is discussed in detail in [14], and these construc-
tions are the foundation of our weakly universal optimal block designs. Some other founda-
tional results about the construction of BIBDs with block size k = 3 or 4 are presented by
[15], [16], [8], and [17]. In several earlier papers and books, like [15] and [14], a BIBD is
represented by the triple of parameters (v, k, λ) and denoted by (v, k, λ)-BIBD. In this
paper we retain this notation.
For example, for v = 5 and k = 4, we can take r = 4, b = 5, and λ = 3. The BIBD can
be constructed in this way:

(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 1), (4, 5, 1, 2), (5, 1, 2, 3).   (2.10)
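As a check on this construction, the short Matlab sketch below (an illustration, not the
thesis code) builds the cyclic blocks of (2.10) and verifies the identities (2.8) and (2.9):

    % Build the cyclic (5,4,3)-BIBD of (2.10) and verify vr = kb and
    % lambda = r(k-1)/(v-1).
    v = 5; k = 4;
    blocks = zeros(v, k);
    for j = 1:v
        blocks(j, :) = mod((j-1) + (0:k-1), v) + 1;   % block (j, j+1, j+2, j+3)
    end
    b = size(blocks, 1);
    r = sum(blocks(:) == 1);                          % replication of treatment 1
    % number of blocks containing both treatments 1 and 2
    lambda = sum(arrayfun(@(j) all(ismember([1 2], blocks(j, :))), 1:b));
    fprintf('v=%d k=%d b=%d r=%d lambda=%d\n', v, k, b, r, lambda);
    assert(v*r == k*b && lambda*(v-1) == r*(k-1));    % identities (2.8)-(2.9)

Running it prints v=5 k=4 b=5 r=4 lambda=3, matching the parameters stated above.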
Incomplete block designs are especially useful when the block size is fixed or limited.
For example, if we want to run an eye-drop experiment on several persons and take the
two eyes of each person as a block, then the block size can only be two. Or, if we want to
compare several detergents but we only have 3 operators, and the speed of washing is the
same within any one session but differs from session to session, then the block size can
only be 3.
Optimal block designs have been studied in many papers. [19] introduced optimal
block designs with correlated observations under various circumstances. [1] introduced the
weak universal optimal block design with a circulant correlation matrix in each block. The
circulant correlation matrix is:

R = [ 1  ρ  0  0  ...  ρ ]
    [ ρ  1  ρ  0  ...  0 ]
    [ 0  ρ  1  ρ  ...  0 ]   (2.11)
    [ .  .  .  .       . ]
    [ ρ  0  0  0  ...  1 ]
The core of Zhu's research is Theorem 3 in section 2 of that paper. In that section,

cov(Q) = k^2 Σ_{j=1}^{b} P_j'(I_k − (1/k)J_k) R (I_k − (1/k)J_k) P_j   (2.12)

where P_j is a k × v matrix with (P_j)_{li} = 1 iff treatment i is in the lth position of the
jth block, and all other elements 0. For example, if the jth block is (1, 2, ..., k) in that
order, then

P_j = [ I_k | 0 ]   (2.13)

that is, the k × k identity in the first k columns and zeros in the remaining v − k columns;
if the treatments within the block are arranged in a different order, the corresponding rows
of the identity portion are permuted accordingly (2.14). That is, each row of P_j has one 1
and (v − 1) 0's; k columns of P_j have one 1 and (k − 1) 0's each, and the other (v − k)
columns are all 0's.
Zhu uses cov(Q) to represent the covariance matrix of the LS estimator instead of
cov(t̂_0 | V), and in this paper we retain his notation. From the construction of P_j, we can
see it has k elements equal to 1 and all other elements 0. The core of the right side of the
formula is (denote it as W):

W = (I_k − (1/k)J_k) R (I_k − (1/k)J_k)   (2.15)

so that

cov(Q_i, Q_i') = k^2 Σ_j w_{h(i,j)h(i',j)}   (2.16)

where the sum is over the blocks j in which i and i' appear together. Here w denotes an
element of the matrix W, B_j is block j, and h(i, j) = l if treatment i is in the lth position
of block j.
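The quantities in (2.12) and (2.15) are straightforward to compute numerically. The sketch
below (function name and interface are mine) assembles cov(Q) from the blocks of a design
and a within-block correlation matrix R:

    % covQ.m -- evaluate (2.12): cov(Q) = k^2 * sum_j Pj' * W * Pj, with
    % W = (I - J/k) R (I - J/k) from (2.15). 'blocks' is b-by-k, row j
    % listing the treatments of block j in order; v is the number of
    % treatments.
    function C = covQ(blocks, R, v)
        [b, k] = size(blocks);
        A = eye(k) - ones(k)/k;              % centering matrix I - (1/k)J
        W = A * R * A;                       % formula (2.15)
        C = zeros(v);
        for j = 1:b
            Pj = zeros(k, v);
            for l = 1:k
                Pj(l, blocks(j, l)) = 1;     % (Pj)_{l,i} = 1 iff i in position l
            end
            C = C + k^2 * (Pj' * W * Pj);    % block j's contribution
        end
    end

Complete symmetry of the returned matrix (equal diagonal elements and equal off-diagonal
elements) is exactly condition (i) of weak universal optimality, so this function can be used
to check the constructions of Chapter 4 numerically.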
In section 3, [7] introduced the construction of weak universal optimal block designs
with a circulant correlation matrix and block size 3, based on the Steiner triple systems
introduced in [20].
2.3 Particle Swarm Optimization algorithm in experimental design and decision making

The Particle Swarm Optimization (PSO)
algorithm is a bionic algorithm which simulates the foraging behavior of a bird flock. In
the Particle Swarm Optimization algorithm, each solution of the optimization problem
is considered to be a "bird" in the search space, and we call it a "particle." The whole
population of solutions is termed a "swarm," and all of the particles search by following
the current best particle in the swarm. Each particle has an associated optimization
function, which determines the particle's fitness value, and a velocity, which determines
the direction and distance of the search. As the PSO algorithm proceeds, for each particle,
we track two "best" values: the first is the best solution found by the individual particle
itself so far, denoted by "pbest"; the second is the best solution found by the whole
population so far, denoted by "gbest." When the algorithm terminates, gbest is reported
as the solution.
Associated with each particle is a velocity, v, and position, x. The velocity and position
are updated at each iteration by:

v_{i+1} = ω v_i + c_1 · rand · (pbest_i − x_i) + c_2 · rand · (gbest − x_i)   (2.17)

x_{i+1} = x_i + v_{i+1}   (2.18)

Here v_i is the velocity of the particle in the ith iteration and x_i is the position of the
particle in the ith iteration. ω is called the inertia weight. pbest_i and gbest are the local
best position for particle i and the global best position over all of the particles, respectively.
The term "rand" is a random number in [0, 1], while c_1, c_2 are "learning factors," with
c_1 termed the cognitive learning factor and c_2 the social learning factor.
From these formulas, we can see the update of v is composed of three parts: the first
part is the inertia velocity before the change; the second part is the cognitive learning part,
which represents the learning process of the particle from its own experience; the third part
is the social learning part, which represents the learning process of the particle from the
whole swarm.
2.3.1 Experimental design and the Fisher information matrix

An experimental design ξ which has n support points can be written in the form:

ξ = ( x_1  x_2  ...  x_n )
    ( ξ_1  ξ_2  ...  ξ_n )

Here the x_i, i = 1...n, are the values of the support points within the allowed design region,
and the ξ_i are the weights, which sum to 1 and represent the relative frequency of
observations taken at each support point.
The general form of the regression model can be written as y = f(θ, ξ) + ε. Here f(θ, ξ)
can be either a linear or a nonlinear function, θ is the vector of unknown parameters, and
ξ is the design vector (which includes both the weights and the values of the support
points). The range of θ is Θ, and the range of ξ is Ξ. The value of a design is computed
from the Fisher information matrix, which is usually obtained as the negative of the matrix
of expected second derivatives of the log-likelihood with respect to the parameters.

2.3.2 Models with unknown parameters

In many cases, the Fisher information matrix involves the unknown parameter θ, and
is denoted by I(θ, ξ). One popular criterion is to minimize the function log|I^{-1}(θ, ξ)|.
One typical example of regression with a Fisher information matrix involving unknown
parameters is the Michaelis-Menten model:

y = ax/(b + x) + ε,  x > 0   (2.19)

with θ = (a, b)'. The information matrix of a design ξ with support points x_1, ..., x_k is

I(θ, ξ) = Σ_{i=1}^{k} ξ_i M(x_i, θ)   (2.21)

where M(x, θ) is the information contributed by a single observation at x. For the
Michaelis-Menten model on design space X = [0, x̃], [22] showed that an optimal design
is supported at 2 support points, one of which is x̃. So ξ can be represented by the vector
(x_1, ξ_1)'.
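As an illustration of (2.21), the following sketch evaluates I(θ, ξ) and the criterion
log|I^{-1}(θ, ξ)| for the Michaelis-Menten model at an assumed parameter value and a
two-point design (the numbers are illustrative, not optimal):

    % Fisher information for y = a*x/(b+x) + eps at a two-point design.
    theta = [1; 1];                         % assumed [a; b]
    x = [0.3; 2];                           % support points; x(2) = x_tilde
    w = [0.5; 0.5];                         % design weights, summing to 1
    I = zeros(2);
    for i = 1:numel(x)
        g = [ x(i)/(theta(2) + x(i));                 % df/da
             -theta(1)*x(i)/(theta(2) + x(i))^2 ];    % df/db
        I = I + w(i) * (g * g');            % (2.21): sum of xi_i * M(x_i, theta)
    end
    crit = log(det(inv(I)));                % the D-criterion log|I^{-1}|
    fprintf('log|I^{-1}| = %.4f\n', crit);

Because I(θ, ξ) depends on θ, crit changes with the assumed parameter value; this is
precisely why the decision criteria of the following subsections are needed.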
Another typical example is the two parameter logistic regression model ([23]), in which the
probability of response is assumed to be p(x; θ) = 1/(1 + exp(−b(x − a))). Here θ = (a, b)^T
and the information matrix is

I(θ, ξ) = ∫ [ b^2 p(x,θ)(1 − p(x,θ))             −b(x − a) p(x,θ)(1 − p(x,θ))   ]
            [ −b(x − a) p(x,θ)(1 − p(x,θ))       (x − a)^2 p(x,θ)(1 − p(x,θ))  ]  dξ(x)   (2.22)
2.3.3 Essential elements of decision making

A decision making problem has the following essential elements:
(i) a set of actions available to the decision maker;
(ii) a number of states which cannot be controlled by the decision maker;
(iii) an objective function: a payoff function or loss function which depends on both an action
and a state (our objective is to maximize the payoff function or minimize the loss function);
(iv) a criterion: by a certain criterion, the decision maker decides which action to take.
In the function log|I^{-1}(θ, ξ)|, θ belongs to the set of states, which are out of our control,
and the design ξ plays the role of the action.
2.3.4 Optimization criterion for decision making

Decision-making with loss functions is proposed in several papers, like [25]. Clearly,
our objective is to minimize the loss function. Based on the loss function, there are several
criteria:
(i) Pessimistic criterion: The pessimistic decision maker always considers the worst
case; that is, suppose θ will maximize the loss function. The decision maker will take the
action that minimizes the loss function in the worst case. This criterion is also known as
the minimax criterion:

min_ξ max_{θ∈Θ} log|I^{-1}(θ, ξ)|   (2.23)

(ii) Index of optimism criterion: usually the decision maker will trade off between optimism
and pessimism in decision making. This gives the index of optimism criterion, which takes
a weighted average of the maximum and minimum of the loss function. The weight α is
called the index of optimism; it is between 0 and 1 and reflects the degree of optimism of
the decision maker:

min_ξ [(1 − α) max_{θ∈Θ} log|I^{-1}(θ, ξ)| + α min_{θ∈Θ} log|I^{-1}(θ, ξ)|]   (2.24)
(iii) Minimax regret criterion: in this criterion, our objective is to minimize the max-
imum possible regret value. The regret value is defined as the difference between the loss
under a certain action and the minimum loss possible under the same state. The formula
for this criterion is:

min_ξ max_{θ∈Θ} [log|I^{-1}(θ, ξ)| − min_{ξ'∈Ξ} log|I^{-1}(θ, ξ')|]   (2.25)

The significance of criteria (ii) and (iii) is as follows. Usually the decision maker will trade
off between optimism and pessimism in decision making; this gives the index of optimism
criterion, which takes the weighted average of the maximum and minimum of the possible
loss, with the weight (the index of optimism) reflecting the degree of optimism of the
decision maker. Sometimes, after the decision maker has made a decision, he or she may
regret it when certain states appear. In this case we want to minimize the maximum regret
value, which is the distance between the loss value of the action taken and the minimum
loss value possible in the relevant state. The regret value is also called the opportunity
cost; it represents the regret of the decision maker.
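A small numerical illustration may help. Suppose the loss log|I^{-1}(θ, ξ)| has been
tabulated over a finite set of candidate designs (rows) and parameter values (columns);
the matrix below is made up for illustration. The three criteria then pick actions as follows:

    % Illustrative loss table: rows = candidate designs, cols = states theta.
    L = [4 2 6;
         3 5 3;
         7 1 4];
    alpha = 0.5;                                 % index of optimism
    [~, aPess] = min(max(L, [], 2));             % (i) pessimistic / minimax
    H = (1-alpha)*max(L, [], 2) + alpha*min(L, [], 2);
    [~, aHurwicz] = min(H);                      % (ii) index of optimism
    Reg = L - min(L, [], 1);                     % regret within each state
    [~, aRegret] = min(max(Reg, [], 2));         % (iii) minimax regret
    fprintf('minimax: row %d, index of optimism: row %d, regret: row %d\n', ...
            aPess, aHurwicz, aRegret);

In the continuous design problem, the max over θ and min over ξ are of course not taken
over a small table but searched by the nested PSO algorithm of Chapter 5.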
Chapter 3
Simulated Annealing Algorithm for D-optimal Design

3.1 Improved simulated annealing algorithm for 2-way second-order polynomial regression with correlated observations

In this chapter, an improved simulated annealing algorithm to approximately solve the
D-optimal design problem for 2-way polynomial regression with correlated observations
is proposed. This algorithm is applicable to any number of observations, not necessarily
a multiple of the dimension of the parameter vector. It overcomes the shortcoming of
previous work, which mainly concentrated on the case where n (the number of observations)
is a multiple of the dimension of the parameter β. We also provide a reinforced version of
our simulated annealing algorithm based on reheating.
In [27], the authors found the D-optimal design for 2-way (i.e., 2 predictors, X1 and
X2) 2nd-degree (i.e., a model including quadratic and cross-product terms in each of X1 and
X2) polynomial regression with uncorrelated observations based on 9 factorial points (the
combinations of -1, 0, 1) in detail. The method of Box and Draper provides a convenient way
to approximate the maximum value of the determinant of the information matrix |X'X|.
One way to approximately solve the D-optimal design problem for polynomial regression
with correlated observations is to use the best D-optimal design for polynomial regression
with uncorrelated observations, evaluated under the specified correlation structure. We call
this the
“uncorrelated method.”
However, the uncorrelated method will seldom find the globally D-optimal design for
any specific correlation structure because the support points are unlikely to be in -1, 0, 1
and the order of the observations themselves will impact the D-criterion.
Thus the "uncorrelated method" usually cannot produce ideal results. For example, for
the circulant correlation structure with n=9, when ρ = 0.4, the "best" determinant of the
"uncorrelated method" is 10688, while the "best" determinant of our improved simulated
annealing algorithm is 68277. What this means is that much greater D-efficiency (ratio of
the maximized determinants) is available using our methods versus the "uncorrelated
method" approach.
We use the “uncorrelated method” as a benchmark for the potential or realized improve-
ments obtained from our SA algorithm. In practice, if an experimenter has some idea of
the magnitude of the correlation, ρ, and the structure of the dependency (circulant, hub,
nearest neighbor, and so on), our best designs will be more efficient and lead to more precise
parameter estimates.

3.2 The Principle of Simulated Annealing

Simulated annealing is a stochastic search method that overcomes the limitations of
deterministic hill-climbing algorithms; see [7] and [4]. The SA algorithm attempts to globally maximize an
energy function E(X) for X in a specified state space (a design region for our D-optimality
problem), by moving about the state space according to a transition mechanism defined
on that space. If a candidate design Xn does not improve the energy, i.e. dE = E(Xn) −
E(Xc) < 0, we accept Xn as the current solution with probability exp(dE/Tc), where Tc is
the current
value of a temperature control parameter, T. Thus, there is positive probability that the
algorithm will move to a poorer design, which is the key feature of the SA search algorithm,
as it provides for the possibility that the algorithm will escape a local maximum. As the
algorithm proceeds, the temperature decreases, making it less likely that designs with lower
energy will be accepted. Theoretical convergence of the algorithm depends on reaching
a stationary distribution of the underlying Markov chain, which typically requires a large
number of iterations as well as a suitably chosen transition scheme over the state space.
3.2.1 Simulated Annealing Algorithm for D-optimal Design for 2-Way Poly-
nomial Regression
For 2-way polynomial regression, the n × 6 design matrix is fully determined by the
values of X1 and X2 , each in [-1, 1]. Therefore, at each iteration of our simulated annealing
algorithm, a new design matrix is obtained by perturbing the current values of X1 and X2 .
We denote the current values of X1 and X2 by X1c and X2c and new values by X1n and
X2n , respectively.
In many applications of simulated annealing, the values of only one current design point
are perturbed (by some random mechanism) at each iteration; typically a systematic
pass is made through all design points in this manner, and the process is repeated until
convergence. Alternatively, all design points are perturbed simultaneously. However, both
of these traditional methods
were found to be inefficient for our D-optimal design with correlated errors. Thus, we used
a modification that improved convergence and solution quality. Our modification was to
divide the design points into three parts, of equal or nearly equal size, and perturb all points
in each part in an “inner” loop, while systematically doing this for each of the three parts.
This represented a middle ground for the perturbation scheme between the two traditional
approaches. The algorithm proceeds in the following steps.

Step 1: Choose an initial temperature T, a final temperature Tf, a temperature reduction
parameter r, and a starting design matrix X0, and set the counter c = 0. A control
parameter g is chosen from [0, 1] and is used to adjust the size of the perturbations as the
algorithm proceeds. Calculate the energy function of the current design, E(Xc) = |Xc'V^{-1}Xc|.
Step 2: Divide the n design points (rows of X) into three parts. If n = 3k, for some positive
integer k, then each part has n/3 design points. If n = 3k+1, the first two parts have
k design points and the third part has k+1. Similarly, if n = 3k+2, the first part has k
design points and the other two parts have k+1 each.
Cycle through each of the 3 parts of X systematically, repeating the following inner
loop:
Inner Loop:
(i) Let Z1 and Z2 be n × 1 vectors with each element of Zi (i = 1, 2) sampled at random
from [-1, 1] for those design points belonging to the current part of X under consideration,
and set to 0 elsewhere.
(ii) Generate new candidate design points X1n = X1c + gZ1 and X2n = X2c + gZ2. If
any element of X1n or X2n falls outside [-1, 1], set the value to the closest boundary value
(-1 or 1).
(iii) Calculate the energy E(Xn) of the candidate design.
(iv) If dE = E(Xn) - E(Xc) > 0, accept the new design by setting Xc = Xn. Otherwise,
compare exp(dE/T) with a random number chosen uniformly from [0, 1] multiplied by the
coefficient 1.01^c. If exp(dE/T) is greater than this number, set Xc = Xn; if not, keep Xc
unchanged.
Step 3: If Tc < Tf, stop. Otherwise, increment the counter c to c+1, set Tc = rTc and
g = rg, and return to Step 2.
Reheated simulated annealing: typically, after the usual stopping condition based on the
temperature is reached in Step 3, the process is repeated, often several times, by reheating
to the original starting temperature, and continuing at Step 2. In Table 3.6, we present
results of the algorithm for n = 12 with and without reheating.
Reduction control parameter r: this tuning parameter is chosen by the user, but is
often set at about 0.98 - 0.99 for a geometric rate of reduction in the temperature.
Scale parameter g: a larger value of g allows larger perturbations in design points at early
iterations. As solution quality improves and the temperature drops, g shrinks, which
restricts the search to a neighborhood of the current design; this neighborhood is more
likely to be close to a global optimum when the iteration counter c is large.
3.3 Improvements from this algorithm compared with a standard simulated annealing algorithm
1. There are 2 vectors, X1c and X2c, to be changed. In this case, the standard
simulated annealing algorithm, which treats the perturbation vector Z as a whole, does not
produce satisfactory results. In our modified algorithm, we divide Z (and consequently
the perturbation process) into 3 parts, and make perturbations part by part. This method
ensures that we do not miss any corner of the design region, and is much more precise than
the usual annealing method. Additionally, this part-by-part perturbation scheme allows the
number of observations n to be any integer, not necessarily a multiple of the number of model
coefficients. This makes our algorithm more flexible since it can be applied to experiments
of any size.
2. We shrink the search neighborhood and increase the threshold for accepting a
perturbation each time we lower the temperature. That is, when the temperature is high,
we search in a wide neighborhood and are more likely to jump out of a local optimum. Each
time we lower the temperature, we make the perturbation neighborhood smaller and the
acceptance threshold higher, so it becomes harder to leave a local optimum. We implement
this approach by multiplying the scale number g by the reduction coefficient r and
multiplying the random number to be compared with exp(dE/T) by the coefficient 1.01^c
each time we decrease the temperature. Here c initially is 0, and increases by 1 each time
we lower the temperature.
This approach is in accordance with the idea of simulated annealing: when the
temperature becomes lower, the "molecules" are less active and tend to an equilibrium.
This modification resulted in improved relative efficiency of the final design.
3. In each part of Step 3, we repeat the iterations until the improvement is less than
a small threshold value several times in succession. This guarantees that we go to the next
step only when the remaining improvement in the current step is negligible. In other words,
we do not miss any valuable improvement. We take the threshold as 0.02 times the
determinant of the current design.
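The following Matlab sketch condenses the algorithm of Sections 3.2-3.3 (the tuning
values here are illustrative; the thesis appendices contain the full annotated code):

    % Modified SA for the 2-way second-order model, circulant correlation.
    n = 12; rho = 0.4;
    V = eye(n);                                   % circulant correlation matrix
    for i = 1:n
        j = mod(i, n) + 1;  V(i, j) = rho;  V(j, i) = rho;
    end
    X  = @(x1, x2) [ones(n,1) x1 x2 x1.^2 x2.^2 x1.*x2];
    E  = @(x1, x2) det(X(x1,x2)' / V * X(x1,x2)); % energy |X'V^{-1}X|
    x1 = 2*rand(n,1) - 1;  x2 = 2*rand(n,1) - 1;  % random start in [-1,1]
    T = 1000; Tf = 1e-3; r = 0.98; g = 1; c = 0;
    parts = {1:n/3, n/3+1:2*n/3, 2*n/3+1:n};      % n divisible by 3 here
    while T > Tf
        for p = 1:3                               % perturb part by part
            z1 = zeros(n,1);  z2 = zeros(n,1);
            z1(parts{p}) = 2*rand(numel(parts{p}),1) - 1;
            z2(parts{p}) = 2*rand(numel(parts{p}),1) - 1;
            y1 = min(max(x1 + g*z1, -1), 1);      % clip to design region
            y2 = min(max(x2 + g*z2, -1), 1);
            dE = E(y1, y2) - E(x1, x2);
            if dE > 0 || exp(dE/T) > rand * 1.01^c
                x1 = y1;  x2 = y2;                % accept the perturbation
            end
        end
        T = r*T;  g = r*g;  c = c + 1;            % cool, shrink, tighten
    end
    fprintf('final determinant: %g\n', E(x1, x2));

The loop structure mirrors improvements 1 and 2 above: the perturbation is applied part
by part, the scale g shrinks geometrically, and the acceptance threshold rises through the
factor 1.01^c.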
3.4 Results and comparison with D-optimal design for 2-way second-order polynomial regression with uncorrelated observations

In this paper, we use the D-optimal designs in [10] to compute |X'V^{-1}X| for the
"uncorrelated method" and compare them with the designs found by our algorithm.
Since the most often used correlation parameters are 0.1 and 0.4, in the tables below
we mainly use these 2 parameters in the computation and comparison; results for other
values of ρ are listed in an additional table.
In Table 3.1 through Table 3.4, we present the comparisons of the simulated annealing
results and the "uncorrelated method" when the number of observations n is a multiple of 6,
using each of the autoregressive, circulant, nearest neighbor, and block correlation structures.
Tables 3.1-3.3 present results of the SA algorithm for the autoregressive, circulant, and nearest
neighbor structures for designs of size 6, 12, and 18 and correlation parameters of 0.1 and
0.4, and Table 3.4 presents results of the SA algorithm for the block structure for designs
of size 12 and correlation parameters of 0.1 and 0.4, along with comparisons with the best
uncorrelated designs.
From these tables, we see that when ρ = 0.1, the determinants obtained by simulated an-
nealing and the uncorrelated method are similar. However, when ρ = 0.4, the determinants
from the simulated annealing algorithm are much higher than the results of the uncorre-
lated method. When n gets larger (especially when n = 18), the ratio increases to well above
1, so the D-efficiency of the annealing design is relatively much better than that of the
"uncorrelated method."
For the case where the number of observations n is not a multiple of the dimension of the
parameter vector, we take n = 7 in Table 3.5. We find that in all of these cases, the results of
the simulated annealing algorithm are much better than the results of the uncorrelated method.
Table 3.5: 2-way polynomial regression with n = 7

Correlation Structure    ρ      Uncorrelated Determinant    Annealing Determinant
Nearest Neighbor         0.1    962.3                       1036.8
Nearest Neighbor         0.4    2655.1                      3213.2
Autoregressive           0.1    951.3                       1028.1
Autoregressive           0.4    1757.5                      2382.4
Table 3.6 provides the comparison of reheated simulated annealing with non-reheated simu-
lated annealing. From this table, we can see that with the addition of the reheating process,
the resulting determinant improves further. We also present a comparison of the support
points for the circulant correlation structure.

[Table: support points under the circulant correlation structure; the uncorrelated-method
determinants were 4542.7, 4600.5, and 10688.0.]
From the above table, we can see that for ρ = 0.1, the support points of the simulated
annealing results are very close to those of the uncorrelated method. When ρ becomes larger,
the support points of simulated annealing deviate further from the uncorrelated method
support points.
Chapter 4
Construction of Weak Universal Optimal Block Design
Balanced block designs and in particular balanced incomplete block designs (BIBDs)
have been in widespread use in agricultural, ecological, pharmaceutical, and industrial re-
search for many years. This is, in part, a consequence of the need for efficient estimation
of treatment effects: blocking allows for improved precision, but physical constraints on the
available experimental units dictate block sizes less than the number of treatments
(incomplete blocks). The "balance" achieved
in these designs is reflected in the fact that treatment effects are still estimated with equal
precision (equal variance) after adjustment for block effects. Typically, treatments are as-
signed “at random” to units within each block, as the order of observations does not impact
the variance of treatment effects. However, this “balance” characteristic is only true for
BIBDs with uncorrelated observations. With correlated observations within each block, the
order of the observations matters and this order impacts variance and hence any notion of
"balance". Thus, there is a need for research into the construction of these useful designs when
any one of a number of different correlation structures might exist within each block of
units.
[1] provides a foundation for the research on the construction of weak uni-
versal optimal block designs in the presence of correlations. However, some shortcomings
remain:
1. Other correlation structures and block sizes might be more applicable. Zhu’s re-
search is limited to the construction of weak universal optimal block designs for a circulant
correlation matrix with block size k=3. This is the simplest case because for a circulant
correlation matrix with k=3, the requirement that the correlation between treatments in
each block be the same is automatically satisfied (i.e., the correlation structure is also known
as "complete symmetric").
Additionally, since Zhu’s method is based on a specific property of Steiner triple sys-
tems, this construction approach does not generalize to other correlation structures and
block sizes.
2. For the circulant correlation matrix, from (2.15) (see Chapter 2) Zhu obtained that
W = R − (1/k)(1 + 2ρ)J_k. However, this holds only for the circulant structure. For other
kinds of correlation matrices, the matrix structure of formula (2.12) is more complex, and the
covariance matrix for the adjusted treatment means (cov(Q)) depends on the order (or
arrangement) of the treatments within blocks. In this chapter, we construct
weakly universal optimal block designs with various correlation structures and block sizes.
Lemma 1: For any BIBD, condition (ii) of weak universal optimal design (A-
optimality) is satisfied.

Proof: Let h(i, j) = l if treatment i is in the lth position of the jth block. So

trace(cov(Q)) = k^2 Σ_{i=1...v} Σ_{j: i∈B_j} w_{h(i,j)h(i,j)}   (4.1)

Notice that the element w_ll on the diagonal of W corresponds to position l of each
block. That is, once position l (no matter in which block) is occupied by an element
(no matter which one), w_ll is added to formula (4.1) once. Since there are b blocks,
each position l is occupied b times. That is, each element w_ll is added to (4.1) exactly b
times, so trace(cov(Q)) = bk^2 Σ_{l=1...k} w_ll.
This means that under the A-optimality criterion, all of the BIBDs with the same parameters
are equally good, so condition (ii) for weak universal optimality is satisfied since the trace
of any such design attains this same minimum.
Since all of our constructions are based on BIBDs, by Lemma 1 we only have to
prove that our designs satisfy condition (i) in the proofs below. The main idea is to construct
a block group based on each block of the original BIBD. We split each W into three parts:
one part is a constant times J, another part is R, and the third part is an irregular matrix T.
Our construction makes N_ii' equal to a constant for any i and i', and makes the contribution
of the irregular part T constant as well. Here N_ii' is the number of times that treatments
i and i' are in the same block and are neighbors.
Since the parameters λ and r of the BIBD are changed by our construction, in this
paper we always denote the parameters before the construction as λ_0 and r_0, and the
parameters after by λ and r.
4.1 Weak universal optimal block design for nearest neighbor correlation with
block size 3 to 6
Design 2.1: Weak universal optimal block design for nearest neighbor correlation with
block size 3 to 6
For block size k=3, construct a (v, 3, λ0 )-BIBD by the method in [8]. In each block B,
denote the 3 treatments in the block in order as (1,2, 3). Then we generate another 2 blocks
based on the original one: B2 = (1, 3, 2), B3 = (2, 1, 3). The result is a (v, 3, 3λ_0)-BIBD
design.
For k=4, construct a (v, 4, λ0 )-BIBD by the process in [15]. Then in each block B,
denote the 4 treatments in the block in order as (1,2, 3,4). Then we generate another
block based on the original one in this order: block B' = (2, 4, 1, 3). The result is a
(v, 4, 2λ_0)-BIBD design.
For k=5, based on a (v, 5, λ0 )-BIBD constructed in [14], for each block B, denote
the 5 treatments in the block in order as (1,2,3,4,5). Then we generate another 4 blocks
based on the original one in this order: block B2 = (1, 4, 2, 5, 3), B3 = (3, 1, 5, 2, 4), B4 =
(2, 1, 4, 3, 5), B5 = (4, 5, 1, 3, 2). This result is a (v, 5,5λ0 )-BIBD design.
For k=6, based on a (v, 6, λ0 )-BIBD constructed in [14], for each block B, note the 6
treatments in the block in order as (1,2,3,4,5,6). Then we generate another 2 blocks based
on the original one in this order: block B2 = (2, 4, 6, 1, 3, 5), B3 = (3, 6, 2, 5, 1, 4). This result
is a (v, 6, 3λ_0)-BIBD design.
For each block size, we call the original block together with the blocks constructed from it
a "block group."
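For the k = 4 case, the block-group construction amounts to a fixed within-block
permutation, as in this sketch (variable names are mine):

    % Design 2.1 for k = 4: each base block (t1,t2,t3,t4) generates the
    % companion block (t2,t4,t1,t3); together they form a block group.
    blocks0 = [1 2 3 4; 2 3 4 5; 3 4 5 1; 4 5 1 2; 5 1 2 3];  % (5,4,3)-BIBD
    perm = [2 4 1 3];                     % within-block reordering for k = 4
    blocks = [blocks0; blocks0(:, perm)]; % base blocks plus companions
    disp(blocks);

Applied to the (5,4,3)-BIBD of (2.10), this reproduces the companion blocks (2,4,1,3),
(3,5,2,4), (4,1,3,5), (5,2,4,1), (1,3,5,2) listed in the example following Theorem 2.1.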
Theorem 2.1: Design 2.1 is a weak universal optimal block design for all of the BIBDs
with the same k and r values.

Proof: Expanding formula (2.15), we obtain:
W = R − (1/k)(RJ + JR) + (1/k^2)JRJ
  = R − ((2 + 2ρ)/k)J + ((k + 2(k − 1)ρ)/k^2)J − (1/k)T
  = R − ((k + 2ρ)/k^2)J − (1/k)T   (4.2)

Here T is an irregular matrix with 0 on the four corners, (k − 2) repetitions of ρ on each of
the four sides (excluding the corners), and 2ρ in the interior. Write W' for the constant part
−((k + 2ρ)/k^2)J, so that W splits into the parts W', R, and (1/k)T.
Then

cov(Q_i, Q_i') = k^2 Σ w'_{h(i,j)h(i',j)} + k^2 Σ r_{h(i,j)h(i',j)} − k Σ t_{h(i,j)h(i',j)}   (4.4)

where each sum is over the blocks B_j containing both i and i', and h(i, j) = l
(l = 1, 2, ..., k) if i is in the lth position of the jth block. The first part of the value
of cov(Q_i, Q_i'), which is based on W', is only a function of λ and does not depend on the
arrangement of the treatments in blocks. Here N_ii' is the number of times that treatments
i and i' are in the same block and are neighbors.
In contrast, the second part of the value of cov(Q_i, Q_i'), which is based on R and T,
is related to the arrangement of the treatments in blocks. Thus the part based on R and T
needs specific attention.
Case 1: k is even. Suppose the replication of each treatment for the original BIBD is r_0;
then the replication of each treatment for our construction is r = kr_0/2. From the
construction of the blocks, we can see N_ii' = λ_0 (so Σ_{i,i'∈B_j} r_{h(i,j)h(i',j)} is a
constant) and λ = λ_0 k/2.
(i) i = i'. From our construction, each treatment appears at the head or tail of T once
(where t_{h(i,j)h(i,j)} = 0) and appears in the middle of T (k/2 − 1) times (where
t_{h(i,j)h(i,j)} = 2ρ). So from the structure of the matrix T, for each block group,
Σ_{i∈B_j} t_{h(i,j)h(i,j)} = 2(n − 1)ρ = (k − 2)ρ, and each treatment is included in r_0
block groups.
Thus, from (2.12) and (2.16), and since each treatment appears in exactly r blocks,
cov(Q_i, Q_i) = k^2·r·(1 − (k + 2ρ)/k^2) − k·((k − 2)/k)·rρ = r(k^2 − k − kρ)   (4.5)
That is, all of the elements on the main diagonal of cov(Q) are of the same value.
(ii) i ≠ i'. If the sum of the position order numbers of a pair of treatments is 2k+1
(e.g., they are in positions k and k+1, or k−1 and k+2), we say they are symmetric about
the middle. If a pair of treatments is symmetric about the middle, then they will appear in
the middle of T (n − 1) times (where t_{h(i,j)h(i',j)} = 2ρ) and in a corner of T once
(where t_{h(i,j)h(i',j)} = 0). If a pair of treatments is not symmetric about the middle,
then they will appear on a side but not a corner of T twice (where t_{h(i,j)h(i',j)} = ρ)
and in the middle of T (n − 2) times (where t_{h(i,j)h(i',j)} = 2ρ). So from the structure
of the matrix T, we can see that for each block group,
Σ_{i,i'∈B_j} t_{h(i,j)h(i',j)} = 2(n − 1)ρ = (k − 2)ρ. Then
cov(Q_i, Q_i') = λ_0[k^2·ρ − k^2·(k/2)·((k + 2ρ)/k^2) − k·(k − 2)ρ] = λ_0(kρ − k^2/2)   (4.6)
is a constant (since k and ρ are constants) for i ≠ i'. That is, all of the elements that
are not on the main diagonal of cov(Q) are of the same value.
Combining (4.5) and (4.6), condition (i) for weak universal optimality is satisfied.
Case 2: k is odd. Suppose the replication of each treatment for the original BIBD is r_0;
then the replication of each treatment for our construction is kr_0. We denote it as
r = kr_0. From the construction of the blocks, we can see N_ii' = 2λ_0 (so
Σ_{i,i'∈B_j} r_{h(i,j)h(i',j)} is a constant) and λ = kλ_0.
(i) i = i'. From our construction, each treatment appears at the head or tail of T twice
(where t_{h(i,j)h(i,j)} = 0) and appears in the middle of T (k − 2) times (where
t_{h(i,j)h(i,j)} = 2ρ). So from the structure of the matrix T, we can see that for each
block group, Σ_{i∈B_j} t_{h(i,j)h(i,j)} = 2(k − 2)ρ, and each treatment is included in r_0
block groups. So from (2.12) and (2.16), and noting that each treatment appears in exactly
r blocks,
cov(Q_i, Q_i) = k^2[r(1 − (k + 2ρ)/k^2) − ((2k − 4)/k^2)rρ] = r[k^2 − k − 2(k − 1)ρ]   (4.7)
That is, all of the elements on the main diagonal of cov(Q) are of the same value.
By the same argument as for even k, under the A-optimality criterion, all of the
BIBDs with the same parameters r and k are equally good. So condition (ii) for weak
universal optimality is satisfied, since the trace of our design attains this same minimum.
(ii) i ≠ i'. From our construction, each pair of treatments will appear in the middle
of T (k − 3) times (where t_{h(i,j)h(i',j)} = 2ρ), in a corner of T once (where
t_{h(i,j)h(i',j)} = 0), and on a side but not a corner of T twice (where
t_{h(i,j)h(i',j)} = ρ). So from the structure of the matrix T, we can see that for each
block group, Σ_{i,i'∈B_j} t_{h(i,j)h(i',j)} = 2(k − 2)ρ.
cov(Q_i, Q_i') = λ_0[k^2·2ρ − k^2·k·((k + 2ρ)/k^2) − k·2(k − 2)ρ] = λ_0(−k^2 + 2kρ)   (4.8)
is a constant (since k and ρ are constants) for i ≠ i'. That is, all of the elements that are
not on the main diagonal of cov(Q) are of the same value.
Combining (4.7) and (4.8), condition (i) for weak universal optimality is satisfied. The
proof is completed.
For example, for k=4, v=5, based on the BIBD in formula (2.10), our design will generate
one block based on each original block in this way: (2,4,1,3), (3,5,2,4), (4,1,3,5), (5,2,4,1),
(1,3,5,2).
Expanding (2.15) as in (4.2),

W = R − (1/k)(RJ + JR) + (1/k^2)JRJ = R − ((2 + 2ρ)/k)J + ((k + 2(k − 1)ρ)/k^2)J − (1/k)T
  = R − ((k + 2ρ)/k^2)J − (1/k)T   (4.9)

and for k = 4 we get

W = R − (1/4)(RJ + JR) + (1/16)JRJ = R − ((1 + ρ)/2)J + ((4 + 6ρ)/16)J − (1/4)T   (4.10)
Here, with ρ = 0.4,

T = [ 0    0.4  0.4  0   ]
    [ 0.4  0.8  0.8  0.4 ]   (4.11)
    [ 0.4  0.8  0.8  0.4 ]
    [ 0    0.4  0.4  0   ]

W = [  0.7   0     −0.4  −0.3 ]
    [  0     0.5   −0.1  −0.4 ]   (4.12)
    [ −0.4  −0.1    0.5   0   ]
    [ −0.3  −0.4    0     0.7 ]

cov(Q) = [  83.2  −19.2  −19.2  −19.2  −19.2 ]
         [ −19.2   83.2  −19.2  −19.2  −19.2 ]
         [ −19.2  −19.2   83.2  −19.2  −19.2 ]   (4.13)
         [ −19.2  −19.2  −19.2   83.2  −19.2 ]
         [ −19.2  −19.2  −19.2  −19.2   83.2 ]
4.2 Weak universal optimal block design for hub correlation for any block size
The hub correlation matrix is:

R = [ 1  ρ  ρ  ρ  ...  ρ ]
    [ ρ  1  0  0  ...  0 ]
    [ ρ  0  1  0  ...  0 ]
    [ .  .  .  .       . ]
    [ ρ  0  0  0  ...  1 ]
For a (v,k, λ0 )-BIBD, we can always construct a weak universal optimal block design
with λ = kλ0 . The basic idea is expanding each block to a block group with k blocks.
Design 3.1: Based on a (v,k, λ0 )-BIBD constructed by [14], in each block, denote the
k treatments in the block in the order (1, 2, ..., k); then we construct k−1 blocks based on
the original one, where in the ith (i = 2, ..., k) block the element i is at the top (the hub
position) and the other elements follow in their original order.
Theorem 3.1: Design 3.1 is a weak universal optimal block design for BIBD’s with
the same parameters.
Proof: Suppose the replication of each treatment for the original BIBD is r_0; then the
replication of each treatment for our construction is r = kr_0, a multiple of k. From the
construction of the block group, we can see that N_ii' = 2λ_0 and λ = kλ_0.
As before,

W = R − (1/k)(RJ + JR) + (1/k^2)JRJ = R − ((2 + 2ρ)/k)J + ((k + 2(k − 1)ρ)/k^2)J − (1/k)T
  = R − ((k + 2ρ)/k^2)J − (1/k)T   (4.14)

where now
T = [ (2k − 4)ρ  (k − 2)ρ  (k − 2)ρ  ...  (k − 2)ρ ]
    [ (k − 2)ρ   0         0         ...  0        ]
    [ .          .         .              .        ]
    [ (k − 2)ρ   0         0         ...  0        ]
Then, as in (4.4),

cov(Q_i, Q_i') = k^2 Σ w'_{h(i,j)h(i',j)} + k^2 Σ r_{h(i,j)h(i',j)} − k Σ t_{h(i,j)h(i',j)}   (4.15)

where, as before, h(i, j) = l if i is in the lth position of the jth block. The first part of the
value of cov(Q_i, Q_i'), which is based on W', is only a function
of λ, and does not depend on the arrangement of the treatments in blocks. In contrast,
the second part of the value of cov(Qi , Qi0 ) , which is based on R and T, is related to
the arrangement of the treatments in blocks. Thus the second part which is based on R
and T needs specific attention. So the key step of the proof is to show that under our
construction, Σ_{i,i'∈B_j} r_{h(i,j)h(i',j)} and Σ_{i,i'∈B_j} t_{h(i,j)h(i',j)} are constants,
independent of the arrangement of the treatments within blocks.
If i = i', notice that in our design 3.1, for each treatment, if it appears at the top in
one block, then it appears in other positions in the other k−1 blocks of the same block group.
So

Σ_{i,i'∈B_j} t_{h(i,j)h(i',j)} = λ_0(2k − 4)r_0 ρ = ((2k − 4)/k)λ_0 rρ   (4.16)

for i = 1, ..., v.
So from (2.12) and (2.16), and noting that each treatment appears in exactly r blocks,

cov(Q_i, Q_i) = k^2[r(1 − (k + 2ρ)/k^2) − ((2k − 4)/k^2)rρ] = r[k^2 − k − 2(k − 1)ρ]   (4.17)
That is, all of the elements on the main diagonal of cov(Q) are of the same value.
(ii) i ≠ i'. Notice that in our design 3.1, for each group of k blocks, each treatment appears
at the top once, so N_ii' = 2. Thus in each block group, Σ t_{h(i,j)h(i',j)} is composed of k
elements, with (k − 2)ρ appearing twice and 0 appearing (k − 2) times. That is,
Σ t_{h(i,j)h(i',j)} = 2(k − 2)ρ. So we have

cov(Q_i, Q_i') = λ_0[k^2·2ρ − k^2·k·((k + 2ρ)/k^2) − k·2(k − 2)ρ] = λ_0(−k^2 + 2kρ)   (4.18)
is a constant (since k and ρ are constants) for i ≠ i'. That is, all of the elements that are
not on the main diagonal of cov(Q) are of the same value.
Combining (4.17) and (4.18), condition (i) for weak universal optimality is satisfied.
We confirm the formulas in the proof of Theorem 3.1 with block size k = 4. From those
formulas, we get trace(cov(Q)) = (48 − 24ρ)b and cov(Q_i, Q_i') = 32ρ − 32(1 + ρ) +
4(4 + 6ρ) − 4·4ρ = −16 + 8ρ.
Based on the BIBD in formula (2.10) with k=4 and v=5, our design will generate 3 blocks
based on each original block in this way: (2,4,1,3), (3,1,2,4), (4,1,2,3); (3,5,2,4), (5,3,2,4),
(2,3,1,5).
W = [  0.3   −0.1  −0.1  −0.1 ]
    [ −0.1    0.7  −0.3  −0.3 ]   (4.19)
    [ −0.1   −0.3   0.7  −0.3 ]
    [ −0.1   −0.3  −0.3   0.7 ]

cov(Q) = [  38.4  −38.4  −38.4  −38.4  −38.4 ]
         [ −38.4   38.4  −38.4  −38.4  −38.4 ]
         [ −38.4  −38.4   38.4  −38.4  −38.4 ]   (4.20)
         [ −38.4  −38.4  −38.4   38.4  −38.4 ]
         [ −38.4  −38.4  −38.4  −38.4   38.4 ]
4.3 Weak universal optimal block design for circulant correlation with odd
block size
The circulant correlation matrix is:

R = [ 1  ρ  0  0  ...  ρ ]
    [ ρ  1  ρ  0  ...  0 ]
    [ 0  ρ  1  ρ  ...  0 ]
    [ .  .  .  .       . ]
    [ ρ  0  0  0  ...  1 ]
Design 4.1: Weak universal optimal block design for circulant correlation with odd block
size:
Based on a (v, k, λ_0)-BIBD constructed in [14] with odd block size k (suppose k = 2n − 1),
in each block, denote the k treatments in the block in the order (1, 2, ..., 2n−1); then we
construct (k − 1)/2 − 1 additional blocks, each block constructed from the previous block.
In the ith block, take the (n − 1) treatments in the even positions (positions 2, 4, 6, ..., 2n−2)
of the (i−1)th block and put them in order in the first n−1 positions, and put the remaining
treatments in order in the last n positions.
For example, for k = 5, based on a (v, 5, λ_0)-BIBD, for each block B, denote the 5 treat-
ments in the block in order as (1,2,3,4,5). Then we generate one more block based on the
original one in this order: block B2 = (2, 4, 1, 3, 5). The result is a (v, 5, 2λ_0)-BIBD design.
For k = 7, based on a (v, 7, λ_0)-BIBD, for each block B, denote the 7 treatments in the block
in order as (1,2,3,4,5,6,7). Then we generate another 2 blocks based on the original one in
this order: block B2 = (2, 4, 6, 1, 3, 5, 7), B3 = (4, 1, 5, 2, 6, 3, 7). The result is a (v, 7, 3λ_0)-
BIBD design.
Theorem 4.1: Design 4.1 is a weakly universal optimal block design for all of the BIBDs
with the same r value.

Proof: Suppose the replication of each treatment for the original BIBD is r_0; then the
replication of each treatment for our construction is (k − 1)r_0/2. We denote it as r. From
the construction of the blocks, we can see λ = ((k − 1)/2)λ_0.
W = R − (1/k)(RJ + JR) + (1/k^2)JRJ = R − ((2 + 4ρ)/k)J + ((k + 2kρ)/k^2)J
  = R − ((1 + 2ρ)/k)J   (4.21)

Then

cov(Q_i, Q_i') = k^2 Σ r_{h(i,j)h(i',j)} − k^2 Σ w'_{h(i,j)h(i',j)}   (4.22)
The second part of the value of cov(Q_i, Q_i'), which is based on W', is only a function
of λ, and does not depend on the arrangement of the treatments in blocks. In contrast, the
first part, which is based on R, is related to the arrangement of the treatments. So the key
of the proof is to show that under our construction the first part is constant;
that is, N_ii' is a constant for any i and i'. Here N_ii' is the number of times that
treatments i and i' are circular neighbors in the same block.
We begin the proof with treatment n; since all of the treatments are cyclically
equivalent under the construction, the same argument applies to the other treatments.
Consider the circulant distance between n and the other treatments. In the original block,
the distance between treatment n and n−1 and the distance between treatment n and n+1
is 1, the distance between treatment n and n−2 and the distance between treatment n and
n+2 is 2, ..., and the distance between treatment n and 1 and the distance between
treatment n and 2n−1 is n−1.
After the construction, the distance between treatment n and n−1 and the distance
between treatment n and n+1 is n−1 (that is, the farthest distance), the distance between
treatment n and n−2 and the distance between treatment n and n+2 is 1, ..., and the
distance between treatment n and 1 and the distance between treatment n and 2n−1 is n−2.
That is, except for n+1 and n−1, the distance between n and every other treatment gets
closer by 1 unit. Since we repeat this process (k − 1)/2 − 1 = n − 2 times, this construction
guarantees that n and each other treatment will be circular neighbors exactly once. So
N_ii' = 1 under our construction.
(i) i = i'. From (2.12) and (2.16), and since each treatment appears in exactly r blocks,

cov(Q_i, Q_i) = k^2 r − k^2·r·((1 + 2ρ)/k) = r(k^2 − k − 2kρ)   (4.23)

That is, all of the values on the main diagonal of cov(Q) are of the same value.
(ii) i ≠ i'. Similarly,

cov(Q_i, Q_i') = λ_0(k^2·ρ − k^2·((k − 1)/2)·((1 + 2ρ)/k)) = λ_0(kρ + (k − k^2)/2)   (4.24)

is a constant (since k and ρ are constants) for i ≠ i'. That is, all of the elements that
are not on the main diagonal of cov(Q) are of the same value.
Combining (4.23) and (4.24), condition (i) for weak universal optimality is satisfied.
For example, if we take k = 5, v = 6, then we can take r = 5, b = 6, and λ = 4. The BIBD
can be constructed cyclically: (1,2,3,4,5), (2,3,4,5,6), (3,4,5,6,1), (4,5,6,1,2), (5,6,1,2,3),
(6,1,2,3,4). Then, for circulant correlation, our design will generate one block based on
each original block.
W = [  0.6625   0.0625  −0.3375  −0.3375   0.0625 ]
    [  0.0625   0.6625   0.0625  −0.3375  −0.3375 ]
    [ −0.3375   0.0625   0.6625   0.0625  −0.3375 ]   (4.25)
    [ −0.3375  −0.3375   0.0625   0.6625   0.0625 ]
    [  0.0625  −0.3375  −0.3375   0.0625   0.6625 ]

cov(Q) = [  80  −32  −32  −32  −32  −32 ]
         [ −32   80  −32  −32  −32  −32 ]
         [ −32  −32   80  −32  −32  −32 ]   (4.26)
         [ −32  −32  −32   80  −32  −32 ]
         [ −32  −32  −32  −32   80  −32 ]
         [ −32  −32  −32  −32  −32   80 ]
4.4 Weak universal optimal block design for block-structured correlation

Theorem 5.1: For any kind of block-structured correlation, any BIBD is a weakly universal
optimal block design.

Proof: For the off-diagonal blocks R_ij = ρ_ij J_k,

W = (I_k − (1/k)J_k) ρ_ij J_k (I_k − (1/k)J_k) = 0   (4.27)

while for the diagonal blocks,

W = (I_k − (1/k)J_k) R (I_k − (1/k)J_k) = R − (1/k)(1 + (k − 1)ρ)J_k   (4.28)

So from (2.12), we have cov(Q_i, Q_i) = r(1 − (1 + (k − 1)ρ)/k), and cov(Q_i, Q_i')
is a constant for i ≠ i'. Hence our designs in the preceding sections also work for this
block-structured correlation, since the between-block correlation contributes nothing to
cov(Q).
Chapter 5
Particle Swarm Optimization in Experimental Design
5.1 Motivation
An immediate limitation of the methods of Chapters 3 and 4 is that the models are
linear - that is, linear in the parameters. When models are nonlinear in the parameters,
and the design objective function remains maximization of the (Fisher) information matrix,
the problem becomes theoretically intractable because the information matrix is - itself - a
function of the unknown parameters. Traditionally, non-linear optimal design methods have
been sequential ([10]) in nature, using current results from partial experiments to improve
the selection of the next design point. This has practical limitations. A second limitation
of the methods of Chapters 3 and 4 is that the designs are exact - they have integer weights
on selected design points/treatments. If the design size changes, both the support points
and the weights may change.
Given these two limitations, our third research objective was to develop an algorithm
that could overcome both the linear model and the integer weight limitations and produce
reasonable if not optimal designs, still with the objective of maximizing the (Fisher) infor-
mation matrix or some function of this information matrix. To this end, we implemented modified
and nested Particle Swarm Optimization (PSO) algorithms with multiple decision making
criteria to determine the gain in efficiency that might be achieved. Examples of potential
improvements are presented using two types of non-linear models: the Michaelis-Menten
model and the two parameter logistic regression model.
A basic PSO algorithm for minimization problems and a nested PSO algorithm for
the pessimistic criterion are presented by [28]. Chen's paper is a milestone in the research
of applying particle swarm optimization to experimental design. The combination of
decision making and particle swarm optimization in engineering has been studied in several
previous papers.
However, the combination of the PSO algorithm with other decision making criteria,
including the index of optimism criterion and the minimax regret criterion, has seldom
been considered in previous research.
In this chapter, we combine the PSO algorithm and various decision making criteria,
and use the time varying parameters proposed by [31], [32], and [33] to find efficient designs
for non-linear models and to compare the results of the PSO methodology to those of the
simulated annealing algorithm, to determine which method might provide the better results
even when the models are linear.
The learning factors c_1 and c_2 are varied over the iterations:

c_1 = (c_upper − c_low) × (maxiter − iter)/maxiter + c_low   (5.1)

c_2 = (c_upper − c_low) × iter/maxiter + c_low   (5.2)

Here iter is the current iteration number and maxiter is the maximum number of itera-
tions. c_upper and c_low are the upper and lower bounds of the learning factors, respectively.
In this algorithm, following the result of [32], we take c_upper = 2, c_low = 0.75.
Consequently, as the PSO algorithm proceeds, the cognitive learning factor is linearly
decreased while the social learning factor is linearly increased. The inertia weight ω is
varied similarly:

ω = (ω_1 − ω_2) × (maxiter − iter)/maxiter + ω_2   (5.3)
ω_1 and ω_2 are the upper and lower bounds of ω, respectively. In our algorithm, these
bounds are set following [33].
We use improvements (i) and (ii) because this approach is in accordance with the idea
of particle swarm optimization: at the beginning, each bird has a large cognitive learning
factor and small social learning factor, and each bird searches mainly by its own experience.
After a period of time, as each bird gets more and more knowledge from the bird population,
it relies increasingly on the social knowledge for its search. In addition, the effect of inertia
velocity will decrease over time since the particles get more and more information from
cognitive learning and social learning in the process of searching, so they rely increasingly
on that learned information and less on inertia.
Initialization process:
1.1. For each of the n particles, initialize the particle position xi and velocity vi with random values.
1.2. Evaluate the fitness value of each particle according to the objective function.
Update process:
2.1. Update the velocity of the particles by formula (2.17). Here vi is limited to an interval [vmin, vmax]. If any value of vi is out of bounds, then we take the corresponding boundary value instead.
2.2. Based on the velocity, update the position of the particles by formula (2.18).
2.3. Update the fitness value, then update pbest and gbest based on that.
2.4. If the stopping criterion is satisfied, output gbest and the related fitness value. If not, update c1, c2, and ω by formulas (5.1), (5.2), and (5.3), and repeat the update process from 2.1.
Clearly, this basic algorithm can be used to solve either minimization or maximization problems. For a minimization problem, the update process for pbest and gbest in 2.3 is: for each particle, if the updated fitness value is less than the fitness value of the current pbest, then pbest is updated to the new solution; otherwise, pbest remains unchanged. gbest is the pbest of the particle that achieves the minimum of the pbest fitness values over the whole population of particles.
For a maximization problem, the update process is analogous: for each particle, if the updated fitness value is greater than the fitness value of the current pbest, then pbest is updated to the new solution; otherwise, pbest remains unchanged. gbest is the pbest of the particle that achieves the maximum of the pbest fitness values over the whole population.
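To make these steps concrete, the following is a minimal MATLAB sketch of the basic algorithm for a maximization problem. The routine name basic_pso, the velocity bounds, and the box bounds lb and ub are assumptions for illustration, and fitness stands for any objective function; the time varying parameters follow (5.1)-(5.3) with the assumed inertia bounds 0.9 and 0.4.

function [gbest, gval] = basic_pso(fitness, n, d, maxiter, lb, ub)
% A minimal sketch of the basic PSO algorithm (maximization).
% fitness: handle mapping a 1-by-d position to a scalar; lb, ub: box bounds.
vmax = 0.2*(ub - lb); vmin = -vmax;                % assumed velocity bounds
x = lb + (ub - lb).*rand(n, d);                    % 1.1 random positions
v = vmin + (vmax - vmin).*rand(n, d);              %     and random velocities
pval = zeros(n,1);
for i = 1:n, pval(i) = fitness(x(i,:)); end        % 1.2 initial fitness
pbest = x; [gval, k] = max(pval); gbest = x(k,:);
for iter = 1:maxiter
    c1 = (2 - 0.75)*(maxiter - iter)/maxiter + 0.75;   % (5.1)
    c2 = (2 - 0.75)*iter/maxiter + 0.75;               % (5.2)
    w  = (0.9 - 0.4)*(maxiter - iter)/maxiter + 0.4;   % (5.3), assumed bounds
    for i = 1:n
        v(i,:) = w*v(i,:) + c1*rand*(pbest(i,:) - x(i,:)) ...
                          + c2*rand*(gbest - x(i,:));  % 2.1 velocity update
        v(i,:) = min(max(v(i,:), vmin), vmax);         % clamp to [vmin, vmax]
        x(i,:) = min(max(x(i,:) + v(i,:), lb), ub);    % 2.2 position update
        f = fitness(x(i,:));                           % 2.3 pbest/gbest update
        if f > pval(i), pval(i) = f; pbest(i,:) = x(i,:); end
        if f > gval, gval = f; gbest = x(i,:); end
    end
end
end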
This algorithm is an efficient way to obtain a D-optimal design for linear regression, especially when the observations are correlated. The model for linear regression can be written as in [7]

y_i = f_i(x)′β + ε_i,  i = 1, ..., n,                                    (5.4)

where β is a k-vector of parameters and f_i(x) = (f_1i(x), f_2i(x), ..., f_ki(x)) is the vector of regression functions for the ith observation. The design matrix is X = (x_ij)_{n×d}, and D-optimality aims to maximize the determinant of the information matrix, where the information matrix for these models is

I = X′V⁻¹X,                                                              (5.5)

where V is the variance-covariance matrix of the errors. In this linear regression problem, each particle in the swarm is a design matrix.
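For example, given a design matrix X and the error covariance matrix V, the D-optimality fitness in (5.5) might be evaluated in MATLAB as follows (a sketch; the variable names are illustrative):

% D-optimality fitness for model (5.4): the determinant of I = X'*inv(V)*X.
% X is the n-by-k design matrix; V is the n-by-n error covariance matrix.
dcrit = det(X' * (V \ X));    % V\X is numerically preferable to inv(V)*X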
For regression where the Fisher information matrix involves unknown parameters, we need two "swarms" of particles (one for the design ξ, the other for the parameters θ), and we solve the problem using a nested PSO algorithm. These two swarms of particles are used in different layers of iterations; in each layer, the fitness value is determined by one of the two swarms. For convenience, we denote the swarms corresponding to ξ and θ as swarm 1 and swarm 2, and their positions as x and y, respectively. For the pessimistic criterion, define f(θ, ξ) = max_{θ∈Θ} log|I⁻¹(θ, ξ)|. Then this optimization problem is to find min_ξ f(θ, ξ). Clearly, f(θ, ξ) is evaluated over the particle swarm for θ, and min_ξ f(θ, ξ) is evaluated over the particle swarm for ξ.
Initialization process:
1.1. For each of the n particles in each of the two swarms ξ and θ, initialize the particle position and velocity with random values.
1.2. Evaluate the fitness value f(θ, ξ) of each particle according to the objective function, using the basic algorithm. Then compute the local and global best positions based on that.
Update process:
2.1. Update the velocity xv of the particles in swarm 1 by formula (2.17). Here xv is limited to an interval [vmin, vmax]. If any value is out of bounds, then we take the corresponding boundary value instead.
2.2. Based on the velocity, update the position of the particles in swarm 1 by formula (2.18).
2.3. Based on the new position, update the fitness value f(θ, ξ) using the basic algorithm.
If the stopping criterion is satisfied, output gbest and the related fitness value. If not, update c1, c2, and ω by formulas (5.1), (5.2), and (5.3), and repeat the update process.
From the algorithm, we can see that the process of evaluating f(θ, ξ) is the inner loop, while the process of evaluating min_ξ f(θ, ξ) is the outer loop.
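A minimal sketch of this nesting, reusing the basic_pso routine sketched earlier, might look as follows; infomat is a placeholder for a function returning the Fisher information matrix I(θ, ξ), and the swarm sizes, iteration counts, and bounds are assumed.

function [xi_best, loss] = nested_pso_pessimistic(infomat, xi_lb, xi_ub, th_lb, th_ub)
% A sketch of the nested PSO for the pessimistic criterion.
% Inner loop: maximize log|I^{-1}(theta, xi)| over theta (the worst case).
% Outer loop: minimize that worst-case value over designs xi.
    [xi_best, negloss] = basic_pso(@(xi) -worst_case(xi), 30, numel(xi_lb), 200, xi_lb, xi_ub);
    loss = -negloss;                        % min over xi of the worst case
    function w = worst_case(xi)             % inner PSO over theta
        [~, w] = basic_pso(@(th) log(det(inv(infomat(th, xi)))), ...
                           20, numel(th_lb), 200, th_lb, th_ub);
    end
end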
For the index of optimism criterion, define f(θ, ξ) = (1 − α) max_{θ∈Θ} log|I⁻¹(θ, ξ)| + α min_{θ∈Θ} log|I⁻¹(θ, ξ)|. Our objective is again to find min_ξ f(θ, ξ).
Initialization process:
1.1. For each of the n particles in each of the two swarms ξ and θ, initialize the particle position and velocity with random values.
1.2. Evaluate the fitness values max_{θ∈Θ} log|I⁻¹(θ, ξ)| and min_{θ∈Θ} log|I⁻¹(θ, ξ)| using the basic algorithm. Then initialize f(θ, ξ) and the local and global best positions.
The remainder of the algorithm is the same as for the pessimistic criterion. The only difference is that in the update process we compute both max_{θ∈Θ} log|I⁻¹(θ, ξ)| and min_{θ∈Θ} log|I⁻¹(θ, ξ)| by the basic PSO algorithm and take the weighted average of them as the fitness value.
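Under the same assumptions as above, the fitness of a fixed design xi under this criterion might be computed as the weighted average of the worst and best inner-loop values, as in this sketch:

% Index of optimism fitness for a fixed design xi (a sketch).
% basic_pso maximizes, so the minimum is obtained by maximizing the negative.
[~, fmax] = basic_pso(@(th)  log(det(inv(infomat(th, xi)))), 20, k, 200, th_lb, th_ub);
[~, fneg] = basic_pso(@(th) -log(det(inv(infomat(th, xi)))), 20, k, 200, th_lb, th_ub);
f = (1 - alpha)*fmax + alpha*(-fneg);   % alpha is the index of optimism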
For the minimax regret criterion, define RV(θ, ξ) = log|I⁻¹(θ, ξ)| − min_ξ log|I⁻¹(θ, ξ)|. Then this optimization problem is to find min_ξ max_{θ∈Θ} RV(θ, ξ).
Initialization process:
1.1. For each of the n particles in each of the two swarms ξ and θ, initialize the particle position and velocity with random values.
1.2. Compute the fitness value min_ξ log|I⁻¹(θ, ξ)| using the basic algorithm and, based on that, compute RV(θ, ξ). Then initialize the local and global best positions.
Update process:
2.1. Update the velocity of the particles in swarm 2 by formula (2.17).
2.2. Based on the velocity, update the position of the particles in swarm 2 by formula (2.18).
2.3. Update the fitness value max_{θ∈Θ} RV(θ, ξ) using the basic algorithm.
2.4. Update the velocity of the particles in swarm 1 by formula (2.17).
2.5. Based on the velocity, update the position of the particles in swarm 1 by formula (2.18).
2.6. Update the fitness value (the loss function) min_ξ max_{θ∈Θ} RV(θ, ξ) using the basic algorithm.
If the stopping criterion is satisfied, output gbest and the related fitness value. If not, update c1, c2, and ω by formulas (5.1), (5.2), and (5.3), and repeat the update process.
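Under the same assumptions, the regret RV(θ, ξ) for fixed θ and ξ might be computed as in this sketch, where the inner PSO finds the best attainable value over designs:

% Regret of design xi at parameter theta (a sketch): how much worse xi is
% than the best possible design for that theta.
[~, nbest] = basic_pso(@(x) -log(det(inv(infomat(theta, x)))), 30, d, 200, xi_lb, xi_ub);
RV = log(det(inv(infomat(theta, xi)))) - (-nbest);  % log|I^{-1}| minus its min over xi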
From Tables 5.1 and 5.2, we see that when ρ = 0.1, the determinants obtained by simulated annealing are a little higher than the results of the PSO method. However, when ρ = 0.4, the determinants from the PSO algorithm are much higher than the results of the simulated annealing.
Table 5.1: Basic PSO for linear regression with circulant correlation structure
Table 5.2: Basic PSO for linear regression with nearest neighbor correlation structure
From Tables 5.3 and 5.4, we see that the gbest of the index of optimism criterion is better than that of the pessimistic criterion and the minimax regret criterion, and that the gbest value decreases as α increases. That is because the pessimistic criterion always considers the worst case, while the index of optimism criterion takes a trade-off between the optimistic and pessimistic cases. When α increases, the degree of optimism gets larger, so the loss function gets smaller (and therefore better).
Table 5.3: Different criteria with the Michaelis-Menten model

criterion                        gbest     support point 1 and weight    support point 2 and weight
Index of optimism with α=0.7     5.6371    28.9594   0.5140              200   0.4860
Index of optimism with α=0.5     6.1237    91.5995   0.2158              200   0.7842
Index of optimism with α=0.3     7.5770    118.1915  0.1648              200   0.8352
Table 5.4: Different criteria with the two parameter logistic regression model

criterion                        gbest     design values
Index of optimism with α=0.7     3.3675    0.0227    0.5731   0.3514   -0.4051
Index of optimism with α=0.5     3.4405    -0.4583   0.6799   0.0676   2.2800
Index of optimism with α=0.3     3.6436    2.5594    1.6075   2.2055   -0.2563
Chapter 6
Discussion
The results of Chapter 3 demonstrated that a modified simulated annealing (SA) algorithm can successfully determine highly efficient D-optimal designs for second order polynomial regression on [−1, 1]² for a variety of correlated error structures and with the design size, n, not limited to a multiple of the number of regression parameters. The combination of (i) a "middle ground" perturbation scheme, (ii) the use of a parameter that controls the size of the neighborhood for the perturbations, and (iii) re-heating, leads to designs that, while not likely globally optimal, are better than those obtained by searching among the set of designs known to be D-optimal for the uncorrelated errors case. In particular, when the true correlation parameter is well away from 0, the final SA design has much greater relative efficiency than the "best uncorrelated" comparison design.
The SA algorithm needs only a well-defined energy function to maximize, here the
determinant of the information matrix. Thus, the same algorithm may be used for other
design optimality criteria, for example, A- and E-optimality. In the absence of exact ana-
lytic optimal designs when errors are correlated, the SA algorithm is an attractive, easily
implemented method to find highly efficient designs. Extensions to higher degree polyno-
mial regression models are immediate, except for the likely need for longer run times and
slower reduction of the temperature to allow for more effective searching over a larger design
region.
Limitations of this approach are apparent. First, the value of the correlation parameter is specified in our examples, as is the correlation structure itself. The trend in improved D-efficiency as the correlation moves further from 0 and the design size n increases is generally apparent, and the final design points depart more from the usual vertices of the design region used in the optimal uncorrelated case. Nevertheless, whether the final SA design will be of practical value for the experimenter depends on the correlation parameter, which is usually unknown. If the true correlation is close to 0, the uncorrelated errors optimal designs are likely satisfactory, and the additional SA search offers little benefit.
In the second part, I solved weak universal optimal block designs for the nearest neighbor correlation structure and multiple block sizes, for the hub correlation structure with any block size, and for circulant correlation with odd block size. For circulant correlation with even block size and nearest neighbor correlation with block size more than 6, the problem becomes more complicated. How to give a general construction of a weakly universal optimal block design for circulant correlation with even block size and nearest neighbor correlation with larger block sizes remains an open problem for future research.
In the third part, combining the theory of decision making and PSO, we propose nested PSO algorithms with all of these three criteria applied to the Michaelis-Menten model and the two parameter logistic regression model, and we compare the quality of the solutions found under the three criteria. For the index of optimism criterion, we set the index of optimism to 0.3 (the decision maker is relatively pessimistic), 0.5 (the decision maker compromises between the pessimistic and optimistic cases), and 0.7 (the decision maker is relatively optimistic).
Finally, we compared the PSO algorithm with the simulated annealing algorithm for 2-way linear regression. Specifically, for problems whose decision variable is high dimensional (in our first part the variable usually consists of two 12×1 vectors, or is even more complicated), simulated annealing is more efficient than other algorithms, like PSO. That is because our improved simulated annealing algorithm allows us to improve the solution part by part, so we do not miss any corner of the design region. On the other hand, PSO is not very good at solving such high dimensional problems.
However, the PSO algorithm is an efficient way to solve nonlinear and weighted optimal design problems, which cannot be solved by simulated annealing. For problems whose decision variable has relatively low dimension (for 1-way polynomial regression, for example, the variable is a 6×1 vector), the PSO algorithm can usually obtain results that are a little better than simulated annealing.
References
[1] Zhu, Z., Coster, D. C., and Beasley, L., “Properties of a covariance matrix with an
application to D-optimal design,” Electronic Journal of Linear Algebra, Vol. 10, 2003,
pp. 65–76.
[2] Dette, H., Kunert, J., and Pepelyshev, A., “Exact optimal designs for weighted least
squares analysis with correlated errors,” Statistica Sinica, Vol. 18, 2008, pp. 135–154.
[3] Goos, P., The optimal design of blocked and split-plot experiments, Springer, 2002.
[5] Bertsimas, D. and Nohadani, O., "Robust optimization with simulated annealing," Springer Science+Business Media, LLC, 2009.
[6] Abdullah, S., Golafshan, L., and Nazri, M., “Re-heat simulated annealing algorithm
for rough set attribute reduction,” International Journal of the Physical Sciences,
Vol. 6(8), 2011, pp. 2083–2089.
[7] Zhu, Z., “Optimal experimental designs with correlated observations,” PhD disserta-
tion, Department of Mathematics and Statistics, Utah State University, 2004.
[8] Cheng, C., “Optimal regression designs under random block-effects models,” Statistica
Sinica, 2008, pp. 485–497.
[9] Boon, J. E., "Generating exact D-optimal designs for polynomial models," SpringSim '07, Vol. 2, 2007.
[11] Cadima, J., Calheiros, F., and Preto, I., "The eigenstructure of block-structured correlation matrices and its implications for principal component analysis," Journal of Applied Statistics, Vol. 37, 2010, pp. 577–589.
[12] Atkinsa, J. E. and Cheng, C., “Optimal regression designs in the presence of random
block effects,” Journal of Statistical Planning and Inference, Vol. 77, 1999, pp. 321–335.
[13] Kiefer, J. and Wynn, H. P., “Optimum Balanced Block and Latin Square Designs for
Correlated Observations,” The Annals of Statistics, Vol. 9, 1981, pp. 737–757.
[14] Rosen, K. H. and Michaels, J. G., Handbook of Discrete and Combinatorial Mathematics, CRC Press, 2000.
[15] Chen, K. and Wei, R., "A few more cyclic Steiner 2-designs," The Electronic Journal of Combinatorics, Vol. 13, 2006.
[16] Lam, C. and Miao, Y., “On Cyclically Resolvable Cyclic Steiner 2-Designs,” Journal
of Combinatorial Theory, Vol. A 85, 1999, pp. 194–207.
[17] Chang, Y., “Some Cyclic BIBDs with Block Size Four,” Wiley Periodicals, Journal of
Combinatorial Designs, Vol. 12, 2004, pp. 177–183.
[18] Oehlert, G. W., A first course in design and analysis of experiments, CRC Press, 2000.
[19] Jin, B., “Optimal Block Designs with Limited Resources,” PhD dissertation, Virginia
Polytechnic Institute and State University, 2004.
[21] Kennedy, J. and Eberhart, R., “Particle swarm optimization,” Proceedings of Interna-
tional Conference on Neural Networks, 1995, pp. 1942–1948.
[22] Dette, H. and Wong, W., “E optimal designs for the Michaelis Menten model,” Statis-
tics and Probability Letters, Vol. 44, 1999, pp. 405–408.
[23] King, J. and Wong, W. K., "Minimax D-optimal Designs for the Logistic Model," Biometrics, Vol. 56(4), 2000, pp. 1263–1267.
[24] Diao, Z., Zhen, H. D., Liu, J., and Liu, G., Operations research, Higher Education
Press, 2001.
[25] Fozunbal, M. and Kalker, T., “Decision-Making with Unbounded Loss Functions,”
Information Theory, 2006 IEEE International Symposium, 2006, pp. 2171 – 2175.
[26] Hoel, P., “Minimax Designs in Two Dimensional Regression,” The Annals of Mathe-
matical Statistics, Vol. 36, 1965, pp. 1097–1106.
[27] Box, M. J. and Draper, N. R., “Factorial Designs, the |X 0 X| Criterion, and Some
Related Matters,” Technometrics, Vol. 13, 1971, pp. 731–742.
[28] Chen, R. B., Chang, S. P., Wang, W., and Wong, W. K., “Optimal Experimental
Designs via Particle Swarm Optimization Methods,” Contributed , 2011.
[29] Yang, R., Wang, L., and Wang, Z., “Multi-objective particle swarm optimization for
decision-making in building automation," IEEE/PES General Meeting, 2011, pp. 1–5.
[30] Yang, L. and Shu, L., “Application of Particle Swarm Optimization in the Decision-
Making of Manufacturers’ Production and Delivery,” Electrical, Information Engineer-
ing and Mechatronics 2011 , 2012, pp. 83–89.
[31] Kiranyaz, S., Pulkkinen, J., and Gabbouj, M., “Multi-dimensional particle swarm op-
timization for dynamic environments,” Innovations in Information Technology, 2008,
pp. 34–38.
[32] Cai, X., Cui, Z., Zeng, J., and Tan, Y., “Particle Swarm Optimization with Self-
adjusting Cognitive Selection Strategy,” International Journal of Innovative Comput-
ing, Information and Control , Vol. 4, 2008, pp. 943–952.
[33] Ratnaweera, A., Halgamuge, S., and Watson, H., “Self-organizing hierarchical particle
swarm optimizer with time-varying acceleration coefficients,” Evolutionary Computa-
tion, IEEE Transactions on, Vol. 8, 2004, pp. 240–255.
Appendices
Appendix A
A.1 A Simulated Annealing Algorithm for D-optimal Design for 2-Way Polynomial Regression
% n (the design size), V (the error covariance matrix), T (the annealing
% temperature), c (the threshold exponent), and g (the perturbation
% neighborhood size) are assumed to be initialized before this block.
X=zeros(n,5,100); % storage for the sequence of candidate designs
O=ones(n,1); % column of ones for the intercept
for j=1:100
X(:,:,j)=Z; % Z is an initial n-by-5 design; its generation is elided here
end
Xc= X(:,:,1);
Xc=[O Xc]; %Xc is the current design matrix, intercept column first
result= ones(1,100);
for i=3:100 % inner loop 1: perturb rows 1 to 4 of the design
Z=zeros(n,1); %Z is the vector of perturbations
for j=1:4
Z(j)=g*random('unif',-1,1,1,1); % we perturb the design part by part (rows 1-4 here)
end
X1= Xc(:,2)+Z; % perturb the first predictor
X2=X1.^2;
Z=zeros(n,1);
for j=1:4
Z(j)=g*random('unif',-1,1,1,1);
end
X3= Xc(:,4)+Z; % perturb the second predictor
X4=X3.^2;
X5=X1.*X3; % the interaction term
X(:,:,i+1) = [X1 X2 X3 X4 X5];
Xn=[O X(:,:,i+1)]; % Xn is the new candidate design matrix
Xn(Xn>1)=1;
Xn(Xn<-1)=-1; % truncate the candidate to the design region [-1,1] before evaluating it
dE=det(Xn'*inv(V)*Xn)-det(Xc'*inv(V)*Xc);
result(i)= (abs(det(Xn'*inv(V)*Xn))<1.02*abs(det(Xc'*inv(V)*Xc)));
if result(i-2)+result(i-1)+result(i)==3
% stop once the improvement has been below the 2% threshold 3 times in a row
break
end
if dE>0
Xc= Xn; % always accept an improving candidate
elseif exp(dE/T)>1.01^c*random('unif',0,1)
%Metropolis acceptance step; 1.01^c is the threshold value
Xc= Xn;
end
end
result= ones(1,100);
for i=3:100 % inner loop 2: perturb rows 5 to 8
Z=zeros(n,1);
for j=5:8
Z(j)=g*random('unif',-1,1,1,1);
end
X1= Xc(:,2)+Z;
X2=X1.^2;
Z=zeros(n,1);
for j=5:8
Z(j)=g*random('unif',-1,1,1,1);
end
X3= Xc(:,4)+Z;
X4=X3.^2;
X5=X1.*X3;
X(:,:,i+1) = [X1 X2 X3 X4 X5];
Xn=[O X(:,:,i+1)];
Xn(Xn>1)=1;
Xn(Xn<-1)=-1; % truncate to the design region before evaluating
dE=det(Xn'*inv(V)*Xn)-det(Xc'*inv(V)*Xc);
result(i)= (abs(det(Xn'*inv(V)*Xn))<1.02*abs(det(Xc'*inv(V)*Xc)));
if result(i-2)+result(i-1)+result(i)==3 % same stopping rule as inner loop 1
break
end
if dE>0
Xc= Xn;
elseif exp(dE/T)>1.01^c*random('unif',0,1)
Xc= Xn;
end
end
result= ones(1,100);
for i=3:100 % inner loop 3: perturb rows 9 to 12
Z=zeros(n,1);
for j=9:12
Z(j)=g*random('unif',-1,1,1,1);
end
X1= Xc(:,2)+Z;
X2=X1.^2;
Z=zeros(n,1);
for j=9:12 % was 9:11 in the original listing, evidently a typo
Z(j)=g*random('unif',-1,1,1,1);
end
X3= Xc(:,4)+Z;
X4=X3.^2;
X5=X1.*X3;
X(:,:,i+1) = [X1 X2 X3 X4 X5];
Xn=[O X(:,:,i+1)];
Xn(Xn>1)=1;
Xn(Xn<-1)=-1;
dE=det(Xn'*inv(V)*Xn)-det(Xc'*inv(V)*Xc);
result(i)= (abs(det(Xn'*inv(V)*Xn))<1.02*abs(det(Xc'*inv(V)*Xc)));
if result(i-2)+result(i-1)+result(i)==3 % stopping rule restored for consistency with loops 1 and 2
break
end
if dE>0
Xc= Xn;
elseif exp(dE/T)>1.01^c*random('unif',0,1)
Xc= Xn;
end
end
V=eye(12);
W=zeros(12);
for i=1:11
m=zeros(1,i);
m(1:i)=0.4^(12-i); % entries at offset 12-i equal rho^(12-i) with rho = 0.4
W=diag(m,12-i);
V=V+W;
end
V=V+V'-eye(12); %V is the correlation matrix, with (j,k) entry 0.4^|j-k|
X=zeros(5,12);
O=ones(1,12);
X(1,:)= [-1 -1 -1 0 0 0 1 1 1 -1 -1 1 ];
X(2,:)= X(1,:).^2;
X(3,:)= [-1 0 1 -1 0 1 -1 0 1 -1 1 -1 ]; % values given by Box and Draper [27]
X(4,:)= X(3,:).^2;
X(5,:)= X(1,:).* X(3,:);
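As a usage note, the D-criterion of this comparison design under the correlation matrix V constructed above could then be evaluated as follows (a sketch; the design here is stored with runs in columns, so it is transposed first):

F = [O; X]';                % 12-by-6 design matrix, intercept column first
dcrit = det(F' * (V \ F));  % D-criterion of the comparison design under V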
no_of_particles = 50; % assumed swarm size, consistent with zeros(50,7) in the original
X= zeros(no_of_particles,7);
Xv=zeros(no_of_particles,7);
%X holds the design particles: the first 4 entries of each row are the support
%points and the last 3 are weights (the 4th weight is 1 minus their sum)
Y= zeros(no_of_particles,2);
Yv=zeros(no_of_particles,2); % Y holds the unknown parameter particles
p_currentY=zeros(1,2);
p_currentX=zeros(1,7);
c_upper=2;
c_low=0.75;
for x = 1:no_of_particles % initialize the design swarm (loop header restored)
p_bestX= X(x,:);
current_fitness(x) = 1;
p_best_fitness(x) = 1;
end
%decide on the global best among all the particles
[g_best_val,g_best_index] = min(current_fitness);
g_bestX= X(g_best_index,:);
for i= 1:no_of_particles
p_bestY= Y(i,:);
current_fitness(i) = 1;
p_best_fitness(i) = 1;
end
%decide on the global best among all the particles
[g_best_val,g_best_index] = max(current_fitness);
g_bestY= Y(g_best_index,:);
g=0.9; % inertia weight for the inner loop, decayed toward 0.4 below
for i= 1:no_of_particles
P1 =1/(1+exp(-Y(i,2)*(X(x,1)-Y(i,1))));
M1=[Y(i,2)^2* P1*(1- P1), -Y(i,2)*(X(x,1)-Y(i,1)) * P1*(1- P1);
-Y(i,2)*(X(x,1)-Y(i,1)) * P1*(1- P1), (X(x,1)-Y(i,1))^2* P1*(1- P1)];
P2 =1/(1+exp(-Y(i,2)*(X(x,2)-Y(i,1))));
M2=[Y(i,2)^2* P2*(1- P2), -Y(i,2)*(X(x,2)-Y(i,1)) * P2*(1- P2);
-Y(i,2)*(X(x,2)-Y(i,1)) * P2*(1- P2), (X(x,2)-Y(i,1))^2* P2*(1- P2)];
P3 =1/(1+exp(-Y(i,2)*(X(x,3)-Y(i,1))));
M3=[Y(i,2)^2* P3*(1- P3), -Y(i,2)*(X(x,3)-Y(i,1)) * P3*(1- P3);
-Y(i,2)*(X(x,3)-Y(i,1)) * P3*(1- P3), (X(x,3)-Y(i,1))^2* P3*(1- P3)]; % restored by the pattern of M1 and M2
P4=1/(1+exp(-Y(i,2)*(X(x,4)-Y(i,1))));
M4=[Y(i,2)^2* P4*(1- P4), -Y(i,2)*(X(x,4)-Y(i,1)) * P4*(1- P4);
-Y(i,2)*(X(x,4)-Y(i,1)) * P4*(1- P4), (X(x,4)-Y(i,1))^2* P4*(1- P4)]; % restored by the pattern of M1 and M2
M5= X(x,5)* M1+ X(x,6)* M2+X(x,7)* M3+ (1- X(x,5)- X(x,6)- X(x,7)) * M4;
%M5 is the information matrix of the 4-point design
current_fitness(i) =log(det(inv(M5)));
p_best_fitness(i) = current_fitness(i);
p_bestY= Y(i,:);
end
[g_best_val,g_best_index] = max(current_fitness);
g_bestY= Y(g_best_index,:);
p_currentY= Y(i,:);
%Update the velocity of Y. If the velocity gets out of bounds,
%we keep the position unchanged; otherwise we move the particle.
bv= g*Yv(i,:) + c1*rand*(p_bestY-p_currentY) + c2*rand*(g_bestY-p_currentY);
if length(find(bv<-.2))+ length(find(bv>1))>0
Y(i,:) = p_currentY; % out of bounds: keep the position unchanged
else
Yv(i,:) = bv; % accept the new velocity (else branch restored)
Y(i,:) = p_currentY+Yv(i,:);
end
end
if g>0.4
g=0.99*g; % decay the inertia weight toward its lower bound 0.4
end
fval(x)= current_fitness(g_best_index); % store the inner-loop worst-case value for design x
end
for x = 1:no_of_particles
%we move X (the design vectors) to minimize the maximum of our loss function,
%so the outer loop solves a minimization problem.
current_fitness(x) = fval(x);
if current_fitness(x) <p_best_fitness(x)
p_best_fitness(x) = current_fitness(x);
p_bestX= X(x,:);
end
end
[g_best_val,g_best_index] = min(current_fitness);
g_bestX= X(g_best_index,:);
for x = 1:no_of_particles
%k is the outer-loop inertia weight (assumed initialized earlier)
p_currentX= X(x,:);
av=k*Xv(x,1:4) + c1*rand*(p_bestX(1:4)-p_currentX(1:4)) + c2*rand*(g_bestX(1:4)-p_currentX(1:4));
cv=k*Xv(x,5:7) + c1*rand*(p_bestX(5:7)-p_currentX(5:7)) + c2*rand*(g_bestX(5:7)-p_currentX(5:7));
cv(cv<-.1)=-.1;
cv(cv>.1)=.1; % clamp the weight velocities
av(av<-.5)=-.5;
av(av>1)=1; % clamp the support point velocities
Xv(x,1:4)=av;
Xv(x,5:7)=cv; % store the clamped velocities (assignment restored)
a1=p_currentX(1:4)+Xv(x,1:4);
a2=p_currentX(5:7)+Xv(x,5:7);
if length(find(a1>3))+ length(find(a1<-.5))+ length(find(a1>.26))+ length(find(a2<.23)) >0
X(x,:) = p_currentX; % out of bounds: keep the position unchanged
else
X(x,:) = p_currentX + Xv(x,:); % move the particle (else branch restored)
end
end
Vita
Chang Li
Logan, UT 84341
EDUCATION:
RESEARCH INTERESTS:
• Statistical computing
TEACHING EXPERIENCE:
Intermediate Algebra
College Algebra
Calculus Techniques
Grading National University Entrance Examination, China, July 2004 and July 2005
CERTIFICATES:
COMPUTER SKILLS: