Dose Response Analysis Using R, 1st Edition Unlimited Download
This book contains information obtained from authentic and highly regarded sources. Rea-
sonable efforts have been made to publish reliable data and information, but the author
and publisher cannot assume responsibility for the validity of all materials or the conse-
quences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if
permission to publish in this form has not been obtained. If any copyright material has not
been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted,
reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other
means, now known or hereafter invented, including photocopying, microfilming, and record-
ing, or in any information storage or retrieval system, without written permission from the
publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center,
Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit
organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system
of payment has been arranged.
Preface
1 Continuous data
1.1 Analysis of single dose-response curves
1.1.1 Inhibitory effect of secalonic acid
1.1.1.1 Fitting the model
1.1.1.2 Estimation of arbitrary ED values
1.1.2 Data from a fish test in ecotoxicology
1.1.3 Ferulic acid as an herbicide
1.1.4 Glyphosate in barley
1.1.5 Lower limits for dose-response data
1.1.6 A hormesis effect on lettuce growth
1.1.7 Nonlinear calibration
1.2 Analysis of multiple dose-response curves
1.2.1 Effect of an herbicide mixture on Galium aparine
1.2.2 Glyphosate and bentazone treatment of Sinapis alba
1.2.2.1 A joint dose-response model
1.2.2.2 Fitting separate dose-response models
5 Time-to-event-response data
5.1 Analysis of a single germination curve
5.1.1 Germination of Stellaria media seeds
5.2 Analysis of data from multiple germination curves
5.2.1 Time to death of daphnias
5.2.1.1 Step 1
5.2.1.2 Step 2
5.2.2 A hierarchical three-way factorial design
5.2.2.1 Step 1
5.2.2.2 Step 2
Bibliography
Index
Preface
The history of dose-response analysis goes back many hundreds of years. One of
the more unusual applications is that numerous rulers had cupbearers who
tried the ruler’s food and drink to avoid poisoning and probably the demise
of the regent. The dose-response was the survival/health of the cupbearer.
In more recent times, dose-response analysis was applied to data from
controlled experiments where a limited number of doses of a toxic chemical
compound were to be compared to a control group (dose 0) in terms of binary
responses such as whether or not a treated insect was dead or alive after a
certain time period (Finney, 1949). Later dose-response analysis crystallized
into being a certain type of regression analysis. In the seminal work by Finney
(1971) it is explained how to carry out the estimation in the so-called probit re-
gression model through manual calculations. By the late 1970s dose-response
analysis had been extended to log-logistic models for continuous response
(Finney, 1979). In the beginning, such dose-response data were fitted through
linearization (e.g., Streibig, 1981, 1983). Later nonlinear estimation of such
models became available through add-ons and macros for spreadsheet pro-
grams (e.g., Vindimian et al., 1983; Caux and Moore, 1997). General-purpose
statistical software programs also included nonlinear estimation procedures
but without any specific focus on dose-response analysis.
By 2005 the first version of the extension package drc was developed for
the statistical programming environment R (R Core Team, 2018). Originally,
it was developed for nonlinear fitting of log-logistic models that were routinely
carried out in weed science (Ritz and Streibig, 2005). However, subsequently,
the package has been modified and extended substantially, mostly in response
to inquiries and questions from the user community. It has developed into a
veritable ecosystem for dose-response analysis (Ritz et al., 2015). Currently,
such extensive functionality for dose-response analysis does not exist in any
other statistical software. One of the problems that non-statistical scientists
were facing in the past was that guesstimates of nonlinear regression parameters
had to be provided upfront before any estimation of parameters could take
place; this was an insuperable problem for many practitioners. To a very large
extent this problem has now been resolved in the package drc through the use
of so-called self-starter routines.
The development of dose-response analysis has undergone dramatic
changes, from struggling with cumbersome, more or less manual calculations
and transformations with pen and paper to the blink-of-an-eye estimation of
relevant parameters on any laptop.
A unified framework
The dose does not necessarily need to be a chemical compound. We define a
dose (metameter) as any pre-specified amount of biological, chemical, or ra-
diation stimuli or stress eliciting a certain, well-defined response. Other kinds
of exposure or stress could also be imagined, e.g., time elapsed in germination
experiments. However, in any case, the dose is a non-negative quantity.
Specifically, we define the response evoked by a specific dose as the quan-
tification of a biologically relevant effect, and as such, it is subject to random
variation. The most common type is a continuous response such as biomass,
enzyme activity, or optical density. A binary or aggregated binary (binomial)
response is also frequently used to describe results such as dead/alive, immo-
bile/mobile, or present/absent (Van der Vliet and Ritz, 2013). The response
may also be discrete as in a number of events observed in a specific time inter-
val such as a number of juveniles, offspring, or roots (Ritz and Van der Vliet,
2009). We will have more examples in later chapters.
A key feature of dose-response analysis is that the experimenter or re-
searcher has to have some a priori idea about the type of model function that
would be relevant for the analysis of her/his dose-response data. In principle,
many nonlinear model functions could be considered for describing how the
average response changes over the range of doses considered. In practice, only
a limited number of functions are used in the majority of applications. Specif-
ically, we will focus on modeling average trends through mostly s-shaped or
related biphasic functions. These functions reflect an a priori basic under-
standing of the causal relationship between the dose and the response, e.g.,
when a dose increases the response decreases between certain limits referred
to as the lower and upper limits, respectively. S-shaped functions have turned
out to be extremely versatile for describing various biological mechanisms;
one key feature is that model parameters provide useful interpretations of
observed effects within a biologically plausible framework. Specifically, dose-
response analysis is often used for screening and ranking of compounds using
estimated effective or lethal doses such as ED50 or LD50 (e.g., WHO, 2005).
The full specification of a statistical dose-response model involves both
specifying the parametric model function and assumptions about the distri-
bution of the responses, i.e., how they randomly fluctuate around the average
value determined by an assumed model function. Distributional assumptions
depend on the type of response observed. However, the same model functions
may be meaningful for different types of responses, and this is the unify-
ing feature of dose-response analysis: It involves dose-response models that
are a collection of statistical models that have a certain mean structure in
common. This is not a mathematical definition in any sense, but rather a
definition driven by applications, which actually makes sense for a statisti-
cal methodology. Consequently, dose-response models encompass a range of
statistical models that could be classified as nonlinear regression, generalized
Acknowledgment
We are fortunate in having some colleagues and experts, Florent Baty, Andrew
Kniss, Andrea Onofri, Janine Wong, and Ming Yi, who kindly agreed to read
sections or the entire manuscript. We are grateful for their valuable comments
and correction of the substance and language. We would stress, however, that
all these helpful people are in no way responsible for any mistakes which still
occur; these are ours alone.
1 Continuous data
$$y_i = f(x_i, \beta) + \varepsilon_i, \qquad i = 1, \ldots, n \tag{1.1}$$
with the fixed and random contributions adding up to the observed response
value for each pair of dose and response $(x_i, y_i)$, for a total of $n$ measurements.
It is common to assume that the random contributions, the $\varepsilon_i$'s in Equation (1.1),
follow a mean-zero normal distribution with an unknown residual
standard deviation, which also is a model parameter to be estimated from
the data. The residual standard error is a measure of the variation between
measurements beyond what is explained by the assumed dose-response model
function. We will also address how to deal with dose-response data that do not
fully satisfy the above assumptions (see Subsection 1.1.3 and Subsection 1.1.4
for examples).
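To make Equation (1.1) concrete, the following minimal base-R sketch simulates responses as the sum of a fixed and a random contribution. The mean function `f` (a four-parameter log-logistic curve), the parameter values, the doses, and the residual standard deviation are all assumptions chosen for illustration; none of them are taken from the book's datasets.

```r
# Illustrative simulation from Equation (1.1): y_i = f(x_i, beta) + eps_i.
# All numbers below are made up for illustration.

# Hypothetical mean function: four-parameter log-logistic curve
f <- function(x, beta) {
  unname(beta["c"] + (beta["d"] - beta["c"]) / (1 + (x / beta["e"])^beta["b"]))
}

set.seed(1)
beta  <- c(b = 2, c = 0, d = 5, e = 0.1)               # assumed parameter values
dose  <- rep(c(0, 0.01, 0.03, 0.1, 0.3, 1), each = 3)  # non-negative doses, 3 replicates each
sigma <- 0.3                                           # assumed residual standard deviation

mu  <- f(dose, beta)                                   # fixed contribution
eps <- rnorm(length(dose), mean = 0, sd = sigma)       # random contribution, mean zero
y   <- mu + eps                                        # observed response

sim <- data.frame(dose = dose, mean = mu, response = y)
head(sim)
```

Because the same `sigma` is used at every dose, the simulated data satisfy the assumptions of normality and equal variation between replicates by construction.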
The model specification in Equation (1.1) relies on the assumption that the
variation between replicates is the same for all doses (referred to as variance
homogeneity). In this case, estimation may be carried out using nonlinear
least squares (see Section A.1 for more details) as the dose-response model is
a special case of a nonlinear regression model (Ritz and Streibig, 2008).
In the examples below we will only specify the model function f and im-
plicitly assume that a statistical model is defined through Equation (1.1).
However, we will also address situations where the assumptions of normality
and variance homogeneity are not fulfilled.
In this chapter, we use the following extension packages:
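library(drc)        # dose-response model fitting
# One-time installation of the data package from GitHub
library(devtools)
install_github("DoseResponse/drcData")
library(drcData)    # example datasets
library(boot)       # bootstrap methods
library(lmtest)     # tests on model coefficients
library(metafor)    # meta-analysis
library(sandwich)   # robust covariance estimators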
secalonic
## dose rootl
## 1 0.000 5.5
## 2 0.010 5.7
## 3 0.019 5.4
## 4 0.038 4.6
## 5 0.075 3.3
## 6 0.150 0.7
## 7 0.300 0.4
The first argument supplied to the function drm() is a model formula relating
the response to the predictor. The second argument, data, specifies the
dataset where the variables rootl and dose are found. R will not automatically
look for variables in the dataset secalonic because they are not in
the search path, and it is a good habit to specify the relevant dataset every
time a model is fitted. The third argument, fct, specifies the dose-response
model function that we want to fit. As there are 7 different doses, a four-parameter
model may easily be fitted (usually as many doses as parameters
are said to be required, but fewer will sometimes do, possibly depending
on the choice of model function and the number of replicates). The built-in
function LL.4() in drc provides the four-parameter log-logistic model that
is commonly used in toxicology (see Section B.1 for more details). In short,
this model has four parameters: a lower limit, an upper limit, a parameter
corresponding to ED50, and a parameter for the relative slope at the dose
equal to ED50 (see Figure 1.1).
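The three arguments described above can be put together as in the sketch below. The exact call is not reproduced in this excerpt, so the drm() line is a presumed reconstruction from the description (formula, data, fct) and from the object name secalonic.LL.4 used later; it is run only if drc is installed. As a cross-check that does not depend on drc, the same mean function is also fitted by nonlinear least squares with base R's nls(), which needs explicit starting values; the ones used here are rough guesses, not values from the book.

```r
# Data retyped from the listing of secalonic above (7 doses, no replicates)
secalonic <- data.frame(
  dose  = c(0, 0.010, 0.019, 0.038, 0.075, 0.150, 0.300),
  rootl = c(5.5, 5.7, 5.4, 4.6, 3.3, 0.7, 0.4)
)

# Presumed form of the drm() call described in the text (assumption)
if (requireNamespace("drc", quietly = TRUE)) {
  library(drc)
  secalonic.LL.4 <- drm(rootl ~ dose, data = secalonic, fct = LL.4())
  print(coef(secalonic.LL.4))
}

# Base-R cross-check: four-parameter log-logistic curve fitted by
# nonlinear least squares; b = slope, c = lower limit, d = upper limit, e = ED50
fit_nls <- nls(
  rootl ~ c + (d - c) / (1 + (dose / e)^b),
  data  = secalonic,
  start = list(b = 2, c = 0, d = 5.5, e = 0.08)  # rough guesses read off the data
)
round(coef(fit_nls), 4)
summary(fit_nls)$sigma  # residual standard error
```

Unlike nls(), drm() is called here without starting values; the preface attributes this convenience to the self-starter routines in drc.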
plot(secalonic.LL.4,
bp = 1e-3, broken = TRUE,
ylim = c(0, 7),
xlab = "Dose (mM)",
ylab = "Root length (cm)")
FIGURE 1.1
The four-parameter log-logistic model fitted to dose-response data from the
dataset secalonic is plotted together with the original data (no replicates).
Axes: dose (mM) on a logarithmic scale with a break at 0, against root length (cm).
instance we must choose a smaller value to ensure that the bp value is smaller
than all positive concentrations/doses (otherwise some observations are not
displayed!). The argument log = "" may be used to switch off the default
logarithmic dose axis.
A summary of the fit is obtained by applying the summary method
to the model fit secalonic.LL.4:
summary(secalonic.LL.4)
##
## Model fitted: Log-logistic (ED50 as parameter) (4 parms)
##
## Parameter estimates:
##
##                Estimate Std. Error t-value   p-value
## b:(Intercept) 2.6542086  0.6962333  3.8122 0.0317398 *
## c:(Intercept) 0.0917852  0.3747246  0.2449 0.8223012
## d:(Intercept) 5.5297495  0.2010300 27.5071 0.0001055 ***
## e:(Intercept) 0.0803547  0.0078829 10.1935 0.0020121 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error:
##
## 0.2957497 (3 degrees of freedom)
The output shows the type of model that was fitted and the parameter estimates
for the four model parameters together with the corresponding (estimated)
standard errors. Briefly, the parameters b, c, d, and e refer to the relative
slope at the dose e, the lower limit, the upper limit, and the dose e that results
in a response halfway between the upper limit d and the lower limit c (also called
ED50), respectively. We refer to Subsection B.1.1 for more details about the four-parameter
log-logistic model.
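The roles of the four parameters can be checked numerically. The sketch below assumes the usual parameterization of the four-parameter log-logistic model, f(x) = c + (d - c) / (1 + (x/e)^b), and plugs in the estimates from the summary output above; the helper name ll4 is hypothetical and not part of drc.

```r
# Four-parameter log-logistic mean function (assumed parameterization, see above)
ll4 <- function(x, b, c, d, e) c + (d - c) / (1 + (x / e)^b)

# Estimates copied from the summary output above
est <- list(b = 2.6542086, c = 0.0917852, d = 5.5297495, e = 0.0803547)

# At dose e the curve sits halfway between the limits c and d
at_e    <- do.call(ll4, c(list(x = est$e), est))
halfway <- (est$c + est$d) / 2
c(at_e = at_e, halfway = halfway)

# Dose 0 gives the upper limit d; very large doses approach the lower limit c (b > 0)
at_zero  <- ll4(0,   est$b, est$c, est$d, est$e)
at_large <- ll4(1e6, est$b, est$c, est$d, est$e)
c(at_zero = at_zero, at_large = at_large)
```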
For each parameter, there are also t-values, which are parameter esti-
mates divided by their standard error, and the resulting p-values, looked up
in an appropriate t distribution; each of them corresponds to testing the null
hypothesis that the parameter is equal to 0 (not necessarily a relevant null
hypothesis to consider). The estimated residual standard error is also shown;
it is rarely reported in publications, but it may still be useful
for understanding variation in the experiment. Possibly the most interesting
parameter in the summary output is e, the ED50; its estimate is
0.080 (standard error 0.0079).
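The t-values, p-values, and residual standard error in the summary output can be reproduced by hand in base R. The sketch below copies the estimates and standard errors from the output above and assumes 3 residual degrees of freedom (7 observations minus 4 parameters) and the log-logistic parameterization f(x) = c + (d - c) / (1 + (x/e)^b); small discrepancies from the printed output are due to rounding of the copied values.

```r
# Estimates and standard errors copied from the summary output above
est <- c(b = 2.6542086, c = 0.0917852, d = 5.5297495, e = 0.0803547)
se  <- c(b = 0.6962333, c = 0.3747246, d = 0.2010300, e = 0.0078829)
df  <- 7 - 4  # 7 observations minus 4 parameters

# t-value = estimate / standard error; two-sided p-value for H0: parameter = 0
t_value <- est / se
p_value <- 2 * pt(abs(t_value), df = df, lower.tail = FALSE)
round(cbind(t_value, p_value), 4)

# Residual standard error recomputed from the data and the copied estimates
dose  <- c(0, 0.010, 0.019, 0.038, 0.075, 0.150, 0.300)
rootl <- c(5.5, 5.7, 5.4, 4.6, 3.3, 0.7, 0.4)
fitted_vals <- est[["c"]] +
  (est[["d"]] - est[["c"]]) / (1 + (dose / est[["e"]])^est[["b"]])
rse <- sqrt(sum((rootl - fitted_vals)^2) / df)
rse
```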
We also see that the estimated lower limit, which is equal to 0.0918 with a
standard error of 0.375, is not significantly different from zero (p-value = 0.82),
possibly indicating that a three-parameter log-logistic model (with an assumed
lower limit of 0) would also fit the data. However, such ad hoc data-driven