Lecture Notes in Statistics (LNS) includes research work on topics that are more
specialized than volumes in Springer Series in Statistics (SSS).
The series editors are currently Peter Bühlmann, Peter Diggle, Ursula Gather,
and Scott Zeger. Peter Bickel, Ingram Olkin, and Stephen Fienberg were editors of
the series for many years.
Valerii V. Fedorov • Peter Hackl

Model-Oriented Design
of Experiments
Second Edition
Valerii V. Fedorov
Independent Consultant
Newtown Square, PA, USA

Peter Hackl
Department of Statistics
Vienna University of Economics and Business
Vienna, Austria

Lecture Notes in Statistics
ISSN 0930-0325 ISSN 2197-7186 (electronic)
ISBN 978-1-0716-4301-3 ISBN 978-1-0716-4302-0 (eBook)
https://doi.org/10.1007/978-1-0716-4302-0

1st edition: © Springer Science+Business Media New York 1997


2nd edition: © Springer Science+Business Media, LLC, part of Springer Nature 2025

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Science+Business Media, LLC,
part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

If disposing of this product, please recycle the paper.


Preface to the Second Edition

The first edition of this book was published in December 1996, almost 30 years ago.
A good indicator of a book’s relevance is the number of citations it has received
and the evolution of these citations over time. Google Scholar reports more than a
thousand publications citing the book since 1997 and more than a hundred citations
in the last 5 years. The citations refer to both applications and questions of statistical
theory. Judging by their titles and abstracts, the majority of the publications
deal with applications in fields such as engineering, environmental analysis, biology,
medicine, health care, pharmacology, but also social networking, social media,
network analysis, artificial intelligence, and others. The fact that the number of
citations is still at a considerable level has prompted us to follow Springer-Verlag’s
suggestion and consider a second edition.
In the first edition, almost all results are presented in terms of means and
dispersion (or variance-covariance) matrices, or the inverse of the dispersion
matrix, often called the (Pearson) information matrix. This approach simplifies
the methodological effort of the presentation and makes life much easier for the
practitioner: Only the first and second moments need to be known, not the shape of
the distributions.
However, there are situations where practitioners are concerned about potential
loss of information due to the lack of more detailed information about the
distribution of interest. In response to these concerns, we have added the concept
of the elemental (Fisher) information matrix to the presentation. This concept
allows a relatively straightforward adaptation of the main results on the design of
optimal experiments obtained in the previous edition. We have also considered some
inequalities linking the Fisher and Pearson information matrices. Collections of
elemental Fisher information matrices for commonly used distributions are provided
in new Appendices A and B.
The authors would like to thank Springer-Verlag and especially Allison De Ville
for the motivation to work on the second edition of our book. We are grateful to our
colleagues who helped us to correct a number of misprints they found in the first
edition, especially Dr. S. Leonov. We hope that the new edition will be well received
by the community of users of optimal designs of experiments and researchers in this
field.

North Wales, PA, USA Valerii V. Fedorov


Vienna, Austria Peter Hackl

Preface

These lecture notes are based on the theory of experimental design for courses
given by Valerii Fedorov at a number of places, most recently at the University
of Minnesota, the University of Vienna, and the Vienna University of Economics
and Business Administration.
It was Peter Hackl’s idea to publish these lecture notes and he took the lead in
preparing and developing the text. The work continued longer than we expected,
and we realized that a distance of a few thousand miles remains a serious hurdle even in
the age of the Internet and many electronic gadgets.
While we mainly target graduate students in statistics, the book demands only
a moderate background in calculus, matrix algebra and statistics. These are, to
our knowledge, provided by almost any school in business and economics, natural
sciences, or engineering. Therefore, we hope that the material may be easily
understood by a relatively broad readership.
The book does not try to teach recipes for the construction of experimental
designs. It rather aims at creating some understanding—and interest—in the
problems and basic ideas of the theory of experimental design. Over the years,
quite a number of books have been published on that subject with a varying degree
of specialization. This book is organized in five chapters that lay out in a rather
compact form all ingredients of experimental designs: models, optimization criteria,
algorithms, constrained optimization. The last third of the volume covers topics that
are relatively new and rarely discussed in book form: designs for inference
in nonlinear models, in models with random parameters, in stochastic processes,
and in functional spaces; for model discrimination, and for incorrectly specified
(contaminated) models.
Data collected by performing an experiment are based on two elements: (i) a
clearly defined objective and (ii) a piece of real world that generates—under control
of the experimenter—the data. These elements have analogues in the statistical
theory: (i) the optimality criterion to be applied has to be chosen so that it reflects
appropriately the objective of the experimenter, and (ii) the model has to picture—in
adequate accuracy—the data generating process.

When applying the theory of experimental design, it is perhaps more true than
for many other areas of applied statistics that the complexity of the real world
and the ongoing processes can hardly be adequately captured by the concepts and
methods provided by the statistical theory. This theory contains a set of strong and
beautiful results, but it permits in only rare cases closed-form solutions, and only
in special situations is it possible to construct unique and clear-cut designs for an
experiment. Planning an experiment means rather to work out several scenarios
which together yield insights into and understanding of the data generating process,
thereby strengthening the intuition of the experimenter. In that sense, a real-life
experiment is a compromise between results from statistical theory and the a priori
knowledge and intuition of the experimenter.
We have kept the list of references as short as possible; it contains only easily
accessible material. We hope that the collection of monographs given in References
will be sufficient for readers who are interested in the origin and history of the
particular results. A bibliography related to the more recent results can be found in
the papers by Cook and Fedorov (1995) and Fedorov (1996). Note that Volume 13
of the Handbook of Statistics, edited by Ghosh and Rao (1996), consists entirely of
survey-type papers related to experimental design.
We gratefully acknowledge the help and encouragement of friends and colleagues
during the preparation of the text. Debby Flanagan, Grace Montepiedra,
and Chris Nachtsheim participated in the development of some results from Chap. 5;
we are very grateful for their contributions. We are thankful to Agnes Herzberg,
Darryl Downing, Max Morris, Werner Müller, and Bob Wheeler for discussions
and critical reading of various parts of this book. Stelmo Poteet and Christa Hackl
helped us tremendously in the preparation of the text for publication.

Oak Ridge, TN, USA Valerii V. Fedorov


Vienna, Austria Peter Hackl
December, 1996

Introduction

The collection of data requires a certain amount of effort such as time, financial
means, or other material resources. A proper design makes it possible to use
the resources in the most efficient way.
The history of publications and the corresponding statistical theory goes back as
far as 1918, when Smith (1918) published a paper that presents optimal designs
for univariate polynomials up to the sixth degree. However, the need to optimize
experiments under certain conditions was understood by many even earlier. Stigler
(1974) provides an interesting historical survey on this subject. After some isolated
earlier work, the core of the theory of optimal experimental design was developed
during the fifties and sixties. The main contribution made during that time is due
to Jack Kiefer. A survey of Kiefer’s contribution to the theory of optimal design
is contained in the paper by Wynn (1984). Brown et al. (1985) published Kiefer’s
collected papers. Other important names and papers from those early times may be
found in Karlin and Studden (1966), Fedorov and Malyutov (1974), and Atkinson
and Fedorov (1988). Box and coauthors discussed related problems associated with
actual applications; see, e.g., Box and Draper (1987). The work of the Russian
statisticians that covers both mathematical theory and algorithms is surveyed by
Nalimov et al. (1985).
The first comprehensive volume on the theory of optimal experimental design
was written by Fedorov (1972). The book by Silvey (1980) gives a very compact
description of the theory of optimal design for estimation in linear models. Other
systematic monographs were published by Bandemer et al. (1977), Ermakov (1983),
Pázman (1986), Pilz (1993), and Pukelsheim (1993). Helpful introductory textbooks
are Atkinson and Donev (1992) and López-Fidalgo (2023).
Models and Optimization Problems. In the description of experiments we distinguish
between
• variables that are the focus of our interest and response to the experimental
situation and
• variables that state the conditions or design under which the response is obtained.

The former variables are usually denoted by y, often indexed or otherwise supplemented
with information about the experimental conditions. For the latter
we distinguish between variables x that are controlled by the experimenter and
variables t that, like time or ambient temperature, are out of the control of the
experimenter. In real-life experiments, y is often, and x and t are almost always,
a vector of variables. The theory of optimal designs discussed in this book is mainly
concerned with linear regression, but various extensions comprise
• multi-response linear regression,
• nonlinear regression,
• regression models with random parameters,
• models that represent random processes,
and other generalizations of the regression model concept including discrimination
between competing models.
The set of values at which it is possible and intended to observe the response is
called the design region X. In general, X is a finite set with dimension corresponding
to the number of design variables. The classical design theory has been derived for
this case. However, in real-life problems we often encounter design restrictions. In
a time-series context, it is typically not possible to make multiple observations at
the same time point, so that the design region consists of a (in the simplest case,
equidistant) grid in time. Similar restrictions may be required due to geographical
conditions, cost, and ethical constraints as in clinical studies, etc. The most common
restrictions are due to cost limitations: costs often depend on the design
point; e.g., the investment and maintenance costs of a sensor can be strongly
determined by the accessibility of its location.
The classical optimal design problem is the estimation of the model parameters
subject to the condition that a design criterion is optimized. In the case of a
simple linear regression with $E\{y\} = \beta_0 + \beta_1 x$, the variance of the estimate $\hat{\beta}_1$
is proportional to the reciprocal of the sum of squared deviations of the design
points $x_i$ from their mean $\bar{x}$: $\mathrm{Var}\{\hat{\beta}_1\} \propto 1 / \sum_{i=1}^{N} (x_i - \bar{x})^2$.
Consequently, given the number N of observations, we can minimize the variance of the estimate by
shifting one half of the design points to each of the limits of the design region.
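The endpoint argument for the simple linear regression can be checked numerically. The following sketch is an illustration only: the design region [-1, 1], the sample size N = 10, and the helper name slope_variance_factor are assumptions made for the example and do not come from the text.

```python
def slope_variance_factor(xs):
    """Return 1 / sum((x_i - mean)^2), the factor to which the variance
    of the estimated slope is proportional."""
    mean = sum(xs) / len(xs)
    sum_sq = sum((x - mean) ** 2 for x in xs)
    return 1.0 / sum_sq


N = 10                 # assumed number of observations
lo, hi = -1.0, 1.0     # assumed limits of the design region

# Design 1: N equidistant points across the design region.
equidistant = [lo + (hi - lo) * i / (N - 1) for i in range(N)]

# Design 2: one half of the points at each limit of the design region.
endpoints = [lo] * (N // 2) + [hi] * (N // 2)

v_equidistant = slope_variance_factor(equidistant)
v_endpoints = slope_variance_factor(endpoints)

print("equidistant design:", v_equidistant)
print("endpoint design:   ", v_endpoints)
```

Under these assumptions the endpoint design gives the smaller variance factor, because every point then lies as far as possible from the mean.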
In cases where the parameter vector has k > 1 components, a possible design
criterion is a (scalar) function of the corresponding covariance matrix. As
is intuitively obvious, the choice of the design criterion will turn out to be a
crucial part of an optimal design problem, both from the aspect of technical effort
and interpretation of results. The design problems discussed in the literature go
beyond the estimation of model parameters. Design criteria similar to those used
in estimating model parameters can be applied, for instance, for
• Estimating functions of the model parameters
• Constructing a statistical test with optimal power
• Screening for significant variables
• Discriminating between several models

Illustrating Examples

In many practical cases, y, x, and t are vectors in the Euclidean space. For instance,
y may be the yield of crop(s), x are concentrations of fertilizers, and t are weather
conditions. Experimental data let us infer, to the desired degree of accuracy and
reliability, what dosage of a fertilizer is optimal under certain conditions. In
many cases it will help the reader to achieve a better understanding of the general
theoretical results (Chaps. 1–4) if she or he tries to relate these to this or a similar
situation. Recent developments show (see Chap. 5) that the main ideas of
optimal experimental design may be applied to a large number of problems in
which y, x, and t have more complicated structures. We sketch a few examples
that are typical for various situations where design considerations can be used to
economize the experimental effort in one way or another.

Air Pollution

The air pollution that is observed in a certain area is determined, among other
factors, by the time of observation, by the location of the sensor, and by the
direction of wind. The wind direction determines what sources of immission are
effective at the location of observation. The air pollution is measured in terms
of the concentration y of one or several (in general K) pollutants; the location
of the observation station is $x = (x_1, x_2)^T$, the wind direction is described by
$v(t) = (v_1(t), v_2(t))^T$, and time is denoted by t. In some cases, the K locations
$X = \{x_k^*\}_1^K$ of pollution emitters and the respective rates $E(t) = \{e_k(t)\}_1^K$ of
emission are known; in others they should be identified, the so-called inverse
problem. Typical sets of information are:
1. the vector function $y(x, v, X, E, t)$, the concentration of pollutants for a given
location x at time t;
2. the scalar K-vector

$$y(x) = T^{-1} \int_v \int_0^T y(x, v, X, E, t)\, dt\, dv,$$

i.e., the mean air pollution at location x over the period [0, T].
Note that the vectors v and E usually depend on t.
The design problem in this context could be: Where should we locate sensors
so that the result of a certain analysis such as the identification of the location of
polluters has maximal accuracy? Sections 2.6, 5.1, and 5.3 may help to answer such
questions.

Clinical Studies

Understanding of the patient response to a new experimental compound (or more
generally “treatment”) is usually based on the development of a dose-response
model. In these models x represents the dose, or the doses in the case of multiple
compounds; the response y describes the respective disease progression. Both x
and y can depend on time, and the result could be to some extent specific for each
patient.
A very popular model in this context is the logistic model

$$y(x) = \begin{cases} 1, & \text{the response is positive,} \\ 0, & \text{no response,} \end{cases}$$

and

$$\Pr\{y(x) = 1\} = \frac{e^{\eta(x,\theta)}}{1 + e^{\eta(x,\theta)}}, \quad x \in X,$$

where x is the administered dose, θ is a vector of unknown parameters, and X is the
set of admissible doses.
In an early phase of a clinical study, given the number of patients who agreed to
participate in the study, we have to find the allocation of the patients to the various
doses x that provides the “best” estimators for the unknown parameters θ .
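The allocation problem can be sketched numerically for a two-parameter special case. Everything specific here is an assumption made for illustration: the linear predictor η(x, θ) = θ0 + θ1 x, the standard logistic form e^η / (1 + e^η), the guessed parameter values, the dose grid, and the choice of the determinant of the information matrix as the measure of "best". For a binary response with success probability p, the information contributed by one observation at dose x is p(1 − p) times the outer product of (1, x) with itself.

```python
import math


def response_probability(x, theta):
    """Logistic response probability for the assumed predictor theta0 + theta1 * x."""
    eta = theta[0] + theta[1] * x
    return 1.0 / (1.0 + math.exp(-eta))


def information_matrix(doses, weights, theta):
    """Entries (m00, m01, m11) of the 2 x 2 information matrix of a design
    that puts the fraction weights[i] of the patients on doses[i]."""
    m00 = m01 = m11 = 0.0
    for x, w in zip(doses, weights):
        p = response_probability(x, theta)
        v = w * p * (1.0 - p)
        m00 += v
        m01 += v * x
        m11 += v * x * x
    return m00, m01, m11


def d_criterion(doses, weights, theta):
    """Determinant of the information matrix (larger is better)."""
    m00, m01, m11 = information_matrix(doses, weights, theta)
    return m00 * m11 - m01 * m01


theta_guess = (0.0, 1.0)                # hypothetical prior guess for theta
doses = [-3.0, -1.5, 0.0, 1.5, 3.0]     # hypothetical admissible doses (coded)

uniform = [0.2, 0.2, 0.2, 0.2, 0.2]     # equal allocation to all doses
two_point = [0.0, 0.5, 0.0, 0.5, 0.0]   # half of the patients at each of two doses

d_uniform = d_criterion(doses, uniform, theta_guess)
d_two_point = d_criterion(doses, two_point, theta_guess)

print("uniform allocation:  ", d_uniform)
print("two-point allocation:", d_two_point)
```

Because the information matrix depends on θ through p, the comparison holds only for the guessed parameter values; a different guess can change the ranking of the allocations. A design that uses a single dose yields a singular matrix, so both parameters cannot be estimated from it.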

Chemical Reactor

The output y of a chemical process that is going on in a reactor is determined by
the amounts x1 and x2 of two input components, and by the profile x3 (s) of the
temperature over the length s of the reactor. The design problem in this context could
be: Find the set {x1 , x2 , x3 (s)} that provides an optimal approximation of y. Note
that x3 (s) is an element in some functional space. A suitable design technique can
be found in Sect. 5.1; although this section refers to time-dependence, the discussed
methods may be applied to a broader class of problems.

Spectroscopic Analysis

To describe a compound spectrum, knowledge of the concentration $\theta_i$ of the spectral
components $f_i(\nu)$, $i = 1, \ldots, m$, is needed for certain frequencies $\nu$. These
components can be assessed by observing the spectrum in certain “windows,” i.e.,
subsets of the entire frequency domain. Such a window is defined by an indicator
function $x_j(\nu)$, which has the value one for the frequency interval of interest and
zero elsewhere. The observed signal for the given window $x_j(\nu)$ is

$$y(x) = \int \sum_{i=1}^{m} \theta_i f_i(\nu)\, x_j(\nu)\, d\nu.$$

The design problem is to choose windows $x_j(\nu)$, $j = 1, \ldots, n$, in such a way that
the best estimation of all or selected $\theta_i$ is possible.
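The window-selection problem can be illustrated with a small numerical sketch. The signal is linear in the concentrations, so each window contributes a row of regressors obtained by integrating each component over the window. The Gaussian component shapes, the window limits, the assumption of uncorrelated errors with equal variance, and the use of the determinant of the information matrix to compare choices are all hypothetical and serve only as an example with m = 2 components.

```python
import math


def gaussian_peak(center, width):
    """Hypothetical spectral component: an unnormalized Gaussian peak."""
    return lambda nu: math.exp(-0.5 * ((nu - center) / width) ** 2)


def window_integral(f, a, b, steps=2000):
    """Midpoint-rule integral of f over the window [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def information_determinant(components, windows):
    """Determinant of the 2 x 2 information matrix sum_j F_j F_j^T, where
    F_j holds the integrals of the two components over window j."""
    rows = [[window_integral(f, a, b) for f in components] for a, b in windows]
    m00 = sum(r[0] * r[0] for r in rows)
    m01 = sum(r[0] * r[1] for r in rows)
    m11 = sum(r[1] * r[1] for r in rows)
    return m00 * m11 - m01 * m01


components = [gaussian_peak(4.0, 1.0), gaussian_peak(6.0, 1.0)]

# Each window sits on one of the two peaks.
separated = [(3.0, 5.0), (5.0, 7.0)]
# Both windows are centered midway between the peaks.
centered = [(4.0, 6.0), (4.5, 5.5)]

det_separated = information_determinant(components, separated)
det_centered = information_determinant(components, centered)

print("windows on the peaks:       ", det_separated)
print("windows between the peaks:  ", det_centered)
```

In this example the windows centered between the two peaks collect the same amount of signal from both components, so the information matrix is numerically singular and the two concentrations cannot be separated. Windows placed on the individual peaks avoid this.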

The Book’s Outline

The first four chapters cover general material. In particular, Chap. 1 contains a
very short collection of facts from regression analysis. Chapter 2 is essential
for understanding and describes the basic ideas of convex design theory. The
subsequent chapters concern the numerical methods (Chap. 3) and a few theoretical
extensions (Chap. 4). The reader may abstain from detailed reading of the sections
on numerical methods. Some basic algorithms are already available either in widely
used statistical packages, e.g., SAS, JMP, and SYSTAT, or in more specific software
like ECHIP and STAT-EASE; see also numerous R packages, most of which are
cited by Groemping and Morgan-Wall (2023). Chapter 5 is the largest chapter and
describes applications of the convex design theory to various specific models. The
appendix provides the elemental Fisher information for popular distributions and
contains, for the reader’s convenience, a rather standard collection of formulas,
mainly from matrix algebra.

Contents

1 Some Facts from Regression Analysis
  1.1 The Linear Model
  1.2 More About the Information Matrix
  1.3 Generalized Versions of the Linear Regression
  1.4 Nonlinear LSE
    1.4.1 Consistency of θ̂
    1.4.2 The Dispersion Matrix Var{θ̂}
    1.4.3 Iterative Estimation
  1.5 Using the Response Distribution
    1.5.1 The Maximum Likelihood Method
    1.5.2 The Fisher Information Matrix
    1.5.3 A Lower Bound for the Elemental FIM
    1.5.4 Cramer Rao Inequality
  1.6 Example
2 Convex Design Theory
  2.1 Optimality Criteria
  2.2 Some Properties of Optimality Criteria
  2.3 Continuous Optimal Designs
  2.4 Sensitivity Function and Equivalence Theorems
  2.5 Some Examples
  2.6 Complements
3 Numerical Techniques
  3.1 First Order Algorithm: D-Criterion
  3.2 First Order Algorithm: The General Case
  3.3 Finite Sample Size
4 Optimal Design Under Constraints
  4.1 Cost Constraints
  4.2 Constraints for Auxiliary Criteria
  4.3 Directly Constrained Design Measures
