
Introduction to model parameter estimation

Aaron A. King
with contributions from
Ottar Bjørnstad, Ben Bolker, John Drake, Pej Rohani, and Dave Smith

October 30, 2017

Licensed under the Creative Commons Attribution-NonCommercial license, http://creativecommons.org/licenses/by-nc/3.0/. Please share and remix noncommercially, mentioning its origin.

1 Introduction
This course will focus on the utility of models in understanding, predicting, and controlling
infectious disease systems. Models play a central role in this work because they allow us to
precisely and quantitatively express our ideas about the mechanisms of infectious disease trans-
mission, immunity, and ecology. To the extent our understanding of the important mechanisms
is correct, such models are extremely useful in the design of policy. On the other hand, mod-
els are useful in uncovering the important mechanisms, inasmuch as we can compare model
predictions to data. Specifically, if we can translate competing hypotheses into mathematical
models, these can be compared in terms of their ability to correctly capture the patterns seen
in data. To compare competing models fairly, we must first find out the best each model can do. Practically speaking, this means that we have to find the values of
the models’ parameters that give the closest correspondence between model predictions and
data. Parameter estimation can be important even when we are fairly confident in the ability
of a single model to explain the dynamics. Not surprisingly, the all-important quantity R0 is
frequently the focus of considerable parameter-estimation effort. Here, we’ll try our hand at estimating R0 and other model parameters from an epidemic curve using a couple of different
methods.

2 Estimating R0 in an invasion
We saw in the lecture how, during the early stages of an outbreak, the number of infected
individuals Y is approximately

$$Y \approx Y_0\, e^{(R_0-1)(\gamma+\mu)\,t}$$

where Y0 is the (small) number of infectives at time 0, 1/γ is the infectious period, and 1/µ is the host lifespan. Taking logs of both sides, we get

$$\log Y \approx \log Y_0 + (R_0-1)(\gamma+\mu)\,t,$$

which implies that a semi-log plot of Y vs t should be approximately linear with a slope
proportional to R0 − 1 and the recovery rate.
We can plot the 1977 boarding school influenza data in this way to see if this is the case.

url <- "https://kingaa.github.io/thid/data/bbs.csv"


flu <- read.csv(url)
## or download the file and do:
## flu <- read.csv(file="bbs.csv")
plot(flu˜day,data=flu,type='b',bty='l',
main='boarding school influenza outbreak',
xlab='day',ylab='Influenza cases')

plot(flu˜day,data=flu,type='b',log='y',bty='l',
xlab='day',ylab='Influenza cases')

Plotted on a log scale, the linearity of the first several data points is indeed striking. This
suggests that we can obtain a cheap and cheerful estimate of R0 by a simple linear regression.
fit <- lm(log(flu)~day,data=subset(flu,day<=4))
summary(fit)

##
## Call:
## lm(formula = log(flu) ~ day, data = subset(flu, day <= 4))
##
## Residuals:
## 1 2 3 4
## 0.03073 -0.08335 0.07450 -0.02188
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.02703 0.10218 -0.265 0.81611
## day 1.09491 0.03731 29.346 0.00116 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08343 on 2 degrees of freedom
## Multiple R-squared: 0.9977, Adjusted R-squared: 0.9965
## F-statistic: 861.2 on 1 and 2 DF, p-value: 0.001159

coef(fit)

## (Intercept) day
## -0.02703361 1.09491261

slope <- coef(fit)[2]
slope

## day
## 1.094913

Now, we know that influenza’s infectious period is about 2.5 da. Moreover, since this is far shorter than an average human life (µ ≪ γ), we can neglect µ in our estimating equation. Thus
our estimate of R0 is

R̂0 = slope/γ + 1 ≈ 1.1 × 2.5 + 1 ≈ 3.7.
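
In R, this is a one-liner (using the 2.5 da infectious period just stated):

R0.hat <- unname(slope)*2.5 + 1 # slope/gamma + 1
R0.hat

## [1] 3.737282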

Our strategy in this case has been to redefine the problem so that it fits a standard form, i.e., we
showed how to rearrange the model so that the relevant quantity (R0 ) could be obtained from
linear regression. We can get a rough estimate of the uncertainty in our estimate by looking at
the standard errors in our estimator.

coef(summary(fit))

##                Estimate Std. Error   t value    Pr(>|t|)
## (Intercept) -0.02703361 0.10217980 -0.264569 0.816111672
## day          1.09491261 0.03731079 29.345738 0.001159189

slope.se <- coef(summary(fit))[2,2]
2.5*slope.se

## [1] 0.09327697

So we reckon we’ve got a standard error of about ±0.09 in our estimate of R0; taking the estimate ±2 standard errors, we feel pretty confident that 3.55 < R0 < 3.92.
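
As a quick sketch, reusing slope and slope.se from above:

(slope + c(-2,2)*slope.se)*2.5 + 1 # rough 95% interval for R0

## [1] 3.550728 3.923835
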
A defect of this method is that it uses only a small amount of the data to compute an important
quantity. Moreover, we have to make a subjective judgement as to how much of the data to
use. Further, as we use more data, and presumably obtain more precise estimates, we simultaneously get further from the realm where our approximation is valid, which introduces greater
bias. Let’s see how our estimates of R0 depend on what we choose to be the “initial phase” of
the outbreak. Below, we estimate R0 and its standard error using the first 2, 3, 4, . . . , 10 data
points.

days <- 2:10
slope <- numeric(length=length(days))
slope.se <- numeric(length=length(days))

for (k in seq_along(days)) {
  fit <- lm(log(flu)~day,data=subset(flu,day<=days[k]))
  slope[k] <- coef(summary(fit))[2,1]
  slope.se[k] <- coef(summary(fit))[2,2]
}
R0.hat <- slope*2.5+1
R0.se <- slope.se*2.5
plot(slope~days,type='o')

We’ll plot these estimates against their uncertainties to show this precision-accuracy tradeoff.

plot(slope.se~days,type='o')

plot(range(days),
     range(c(R0.hat-2*R0.se,R0.hat+2*R0.se),na.rm=T),
     type='n',bty='l',
     xlab="length of initial phase (da)",
     ylab=expression("estimated"~R[0]))
lines(R0.hat~days,type='o',lwd=2)
lines(R0.hat+2*R0.se~days,type='l')
lines(R0.hat-2*R0.se~days,type='l')

Exercise 1. Biweekly data for outbreaks of measles in three communities within Niamey, Niger are provided in the file https://kingaa.github.io/thid/data/niamey.csv. Use this method to obtain estimates of R0 for measles using the data from each of the communities of Niamey, assuming that the infectious period is approximately two weeks.
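
One possible scaffold, not a complete solution (nfit, the number of initial points to use, is the same subjective choice discussed above; with time in biweeks and a two-week infectious period, 1/γ = 1 biweek, so R0 = slope + 1):

niamey <- read.csv("https://kingaa.github.io/thid/data/niamey.csv")
nfit <- 5 # hypothetical choice of initial-phase length
for (comm in c("A","B","C")) {
  d <- subset(niamey,community==comm & biweek<=nfit & measles>0)
  fit <- lm(log(measles)~biweek,data=d)
  cat("community",comm,": R0 =",coef(fit)[2]+1,"\n")
}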

3 Fitting deterministic dynamical epidemiological models to data
Now we move on to a much more general but considerably more complicated technique for
estimating R0 . The method of least squares gives us a way to quantify the discrepancy between
the data and a model’s predictions. We can then search over all possible values of a model’s
parameters to find the parameters that minimize this discrepancy.
We’ll illustrate this method using the Niamey data, which we’ll load and plot using the follow-
ing commands:

url <- "https://kingaa.github.io/thid/data/niamey.csv"


niamey <- read.csv(url)
plot(measles˜biweek,data=niamey,type='n')
lines(measles˜biweek,data=subset(niamey,community=="A"),col=1)
lines(measles˜biweek,data=subset(niamey,community=="B"),col=2)
lines(measles˜biweek,data=subset(niamey,community=="C"),col=3)
legend("topleft",col=1:3,lty=1,bty='n',
legend=paste("community",c("A","B","C")))

Since this is a single outbreak, and the number of births and deaths into the population over the
course of the outbreak is small relative to the size of the population, we can treat this outbreak
as if it were occurring in a closed population. We saw earlier how we can formulate the SIR
equations in such a case and how we can solve the model equations to obtain trajectories.
Before, we formulated the model in terms of the fractions, S, I, R of the host population in
each compartment. Here, however, we need to track the numbers, X, Y , Z, of individuals in
the S, I, and R compartments, respectively. In terms of X, Y , Z, our frequency-dependent SIR
model is
$$\frac{dX}{dt} = -\frac{\beta XY}{N} \qquad \frac{dY}{dt} = \frac{\beta XY}{N} - \gamma Y \qquad \frac{dZ}{dt} = \gamma Y$$

where N = X + Y + Z is the total host population size. A function suitable for use with
deSolve for the closed SIR epidemic is:

require(deSolve)

closed.sir.model <- function (t, x, params) {
  X <- x[1]
  Y <- x[2]
  Z <- x[3]

  beta <- params["beta"]
  gamma <- params["gamma"]
  pop <- params["popsize"]

  dXdt <- -beta*X*Y/pop
  dYdt <- beta*X*Y/pop-gamma*Y
  dZdt <- gamma*Y

  list(c(dXdt,dYdt,dZdt))
}

Thus far, we have only considered deterministic models. In the next lab, we will begin to think
about more realistic models that begin to take into account some aspects of the stochastic nature
of real epidemics. For now, under the assumption that the epidemic is deterministic, parameter
estimation is a matter of finding the model trajectory that gives the best fit to the data. The first
thing we need is a function that computes a trajectory given parameters of the model.

prediction <- function (params, times) {
  xstart <- params[c("X.0","Y.0","Z.0")]
  out <- ode(
    func=closed.sir.model,
    y=xstart,
    times=times,
    parms=params
  )
  out[,3] # return the number of infectives
}

Now we set up a function that will calculate the sum of the squared differences (or errors)
between the data and the model predictions.

sse <- function (params, data) {
  times <- c(0,data$biweek/26) # convert to years
  pred <- prediction(params,times)
  discrep <- pred[-1]-data$measles
  sum(discrep^2) # sum of squared errors
}

To get a sense of what this gives us, let’s explore varying some parameters and computing the SSE for community “A” of the Niamey data set. To begin with, we’ll assume we know that γ = 365/13 and that the initial numbers of susceptibles, infectives, and recovereds, X0, Y0, Z0, were 10000, 10, and 39990, respectively (so that the population size is 50000). We’ll write a little function that will plug a value of β into the parameter vector and compute the SSE.

dat <- subset(niamey,community=="A")
params <- c(X.0=10000,Y.0=10,Z.0=39990,popsize=50000,
            gamma=365/13,beta=NA)
f <- function (beta) {
  params["beta"] <- beta
  sse(params,dat)
}
beta <- seq(from=0,to=1000,by=5)
SSE <- sapply(beta,f)

We take our estimate, β̂, to be the value of β that gives the smallest SSE.

beta.hat <- beta[which.min(SSE)]

Fig. 1A shows SSE vs. β. What does the SIR model predict at β = β̂? We compute the model’s
trajectory to find out:

params["beta"] <- beta.hat


plot(measles˜biweek,data=dat)
lines(dat$biweek,prediction(params,dat$biweek/26))

This plot is Fig. 1B.

Exercise 2. Use this method to obtain estimates of R0 for measles from each of the three
communities in Niamey. You may again assume that the infectious period is approximately
two weeks.

Clearly, this fit leaves much to be desired, but recall that we’ve here assumed that we know the
correct values for all parameters but β. In particular, we’ve assumed we know the infectious
period, and the initial conditions. [NB: the initial value of Z is entirely irrelevant. Why?] Let’s
see what happens when we try to estimate two parameters at once.

dat <- subset(niamey,community=="A")
params <- c(X.0=NA,Y.0=10,Z.0=1,popsize=50000,
            gamma=365/13,beta=NA)
f <- function (par) {
  params["beta"] <- par[1]
  params["X.0"] <- par[2]
  sse(params,dat)
}
beta <- seq(from=100,to=300,by=5)
X.0 <- seq(from=4000,to=20000,by=1000)
grid <- expand.grid(beta=beta,X.0=X.0)
grid$SSE <- apply(grid,1,f)

We can visualize this as a surface. A convenient function for this is contourplot from the
lattice package:

require(lattice)
contourplot(sqrt(SSE)~beta+X.0,data=grid,cuts=30)

Figure 1: (A) Discrepancy (sum of squared errors) between SIR model predictions and the measles data from Niamey community “A” as a function of β. The vertical line gives the least-squares estimate, β̂ = 220. (B) Trajectory of the model at β = β̂ (solid curve) compared to the data (circles).

Exercise 3. Discuss the shape of this surface: what does it tell us about the uncertainty in the
model’s parameters?

*Exercise 4. Repeat the estimation using a closed SEIR model. Assume that the infectious
period is 5 da and the latent period is 8 da. How and why does your estimate of R0 differ from the one you obtained using the SIR model?

4 Optimization algorithms
When we have more than two parameters to estimate (as we usually will), we cannot rely on
grid searches or simple graphical techniques to find the region of parameter space with the
best parameters. We need more systematic ways of searching through the parameter space.
Mathematicians have devoted much study to optimization algorithms, many of which are implemented in R.
The first place to go is the function optim, which implements several common, well-studied,
generally-useful optimization algorithms.

?optim

To use it, we have to specify the function we want to minimize and a starting value for the
parameters. Starting from this point, optim’s algorithms will search the parameter space for
the parameter values that minimize our objective function.

We’ll write an objective function to try to estimate β, X0 , and Y0 simultaneously. For the
moment, we’ll continue to assume that the recovery rate γ is known.
dat <- subset(niamey,community=="A")
params <- c(X.0=NA,Y.0=NA,Z.0=1,popsize=50000,
gamma=365/13,beta=NA)
f <- function (par) {
params[c("X.0","Y.0","beta")] <- par
sse(params,dat)
}
optim(fn=f,par=c(10000,10,220)) -> fit

fit

## $par
## [1] 9150.769987 1.017659 271.651714
##
## $value
## [1] 166584.5
##
## $counts
## function gradient
## 273 NA
##
## $convergence
## [1] 0
##
## $message
## NULL

Exercise 5. In the foregoing, we’ve estimated parameters by minimizing the sum of squared differences between the model-predicted number of cases and the data. What would happen if we tried to minimize the squared error on the log scale, i.e., to minimize $(\log(\text{model}) - \log(\text{data}))^2$? What would happen if we minimized the squared error on the square-root scale, i.e., $(\sqrt{\text{model}} - \sqrt{\text{data}})^2$? What’s the “right” scale to choose?
*Exercise 6. Try to estimate all four parameters at once. Start your algorithm from several
places to check that they all converge to the same place. You may find it useful to restart the
optimizer to verify its convergence.
*Exercise 7. Fig. 1 shows a second local minimum of the SSE at a much higher value of β.
Why is this?

Many other optimization algorithms exist. optim implements several of these (see ?optim).
Other functions and packages you might look into include: constrOptim, optimx, nlm,
nlminb, nloptr (from the nloptr package), and subplex (from the subplex package).
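
For instance, here is a sketch of the same three-parameter fit using nlminb, which accepts box constraints (the zero lower bounds are an illustrative assumption):

fit2 <- nlminb(start=c(10000,10,220),objective=f,lower=c(0,0,0))
fit2$par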

5 The likelihood
We have seen that fitting mechanistic models to data is a powerful and general approach to estimating parameters. We saw too that least-squares fitting is a straightforward way to do
this. However, several issues arose. First, there was an element of arbitrariness in the choice of
discrepancy measure. Second, although we could fairly easily obtain point estimates of model
parameters using least-squares, it was not clear how we could obtain concomitant estimates
of parameter uncertainty (e.g., confidence intervals). Finally, we began to see that there are
limits to our ability to estimate parameters. In this lab, we’ll explore these issues, and see that
likelihood offers an attractive resolution to the first and second of these, but that the third is a
fundamental challenge.
Likelihood has many advantages:

1. fidelity to model
2. a deep and general theory
3. a sound theoretical basis for confidence intervals and model selection
4. statistical efficiency

and some disadvantages:

1. fidelity to model
2. fragility (lack of robustness)

General definition

Likelihood is the probability of a given set of data D having occurred under a particular
hypothesis H:
L(H, D) = P [D|H]

A simple example: suppose n individuals participate in a serological survey and k of these individuals are found to be seropositive. One parameter of interest is the true fraction, p, of the population that has seroconverted. Assuming the sample was drawn at random and the population is large, then the probability of the data (k of n individuals seropositive) given the hypothesis that the true probability is p is

$$P[D \mid H] = \binom{n}{k}\, p^k\, (1-p)^{n-k}$$

If the true seroprevalence was, say, p = 0.3, what does the probability of observing k seropos-
itives in a sample of size n = 50 look like?

p <- 0.3
n <- 50
k <- seq(0,50,by=1)
prob <- dbinom(x=k,size=n,prob=p)
plot(k,prob,type='h',lwd=5,lend=1,
ylab="probability")

The likelihood is a function of the unknown parameters. In this case, if we assume n is known,
then the likelihood is a function of p alone:

$$L(p) = \binom{n}{k}\, p^k\, (1-p)^{n-k}$$

Typically the logarithm of this function is more interesting than L itself. Looking at this func-
tion for each of two different surveys:

k1 <- 18
n1 <- 50
p <- seq(0,1,by=0.001)
plot(p,dbinom(x=k1,size=n1,prob=p,log=TRUE),
ylim=c(-10,-2),ylab="log-likelihood",
type='l')
abline(h=dbinom(x=k1,size=n1,prob=k1/n1,log=TRUE)-
0.5*qchisq(p=0.95,df=1),col='red')
abline(v=k1/n1,col='blue')

k2 <- 243
n2 <- 782
p <- seq(0,1,by=0.001)
plot(p,dbinom(x=k2,size=n2,prob=p,log=TRUE),
ylim=c(-10,-2),ylab="log-likelihood",
type='l')
abline(h=dbinom(x=k2,size=n2,prob=k2/n2,log=TRUE)-
0.5*qchisq(p=0.95,df=1),col='red')
abline(v=k2/n2,col='blue')

In the above two plots, the likelihood is a function of the model parameter p. Vertical lines show the maximum likelihood estimate (MLE) of p. Horizontal lines show the critical likelihoods for the likelihood ratio test at the 95% confidence level.

Exercise 8. How do the two curves just plotted differ from one another? What features of the
data are responsible for the differences?

From data points to data sets

Let’s suppose we have three samples, D1 , D2 , D3 , taken by three different researchers, for the
same large population. If these samples are independent, then

P [D|H] = P [D1 |H] × P [D2 |H] × P [D3 |H]

which means that the likelihood of the full data set is the product of the likelihoods from each
of the samples. In other words, the likelihood gives a general recipe for combining data from
different studies. We’d compute the likelihood as follows:

n <- c(13,484,3200)
k <- c(4,217,1118)
dbinom(x=k,size=n,prob=0.2,log=TRUE)

## [1] -1.873761 -79.243371 -197.561806

sum(dbinom(x=k,size=n,prob=0.2,log=TRUE))

## [1] -278.6789

ll.fn <- function (p) {
  sum(dbinom(x=k,size=n,prob=p,log=TRUE))
}
p <- seq(0,1,by=0.001)
loglik <- sapply(p,ll.fn)
plot(p,loglik,type='l',ylim=max(loglik)+c(-10,0))

6 Fitting SIR to an epidemic curve using likelihood
Let’s revisit the model-fitting we did yesterday for the case of measles in Niger. We’ll simplify the model slightly to eliminate some unnecessary and wasteful elements. Our frequency-
dependent SIR model, again, is
$$\frac{dX}{dt} = -\frac{\beta XY}{N} \qquad \frac{dY}{dt} = \frac{\beta XY}{N} - \gamma Y \qquad \frac{dZ}{dt} = \gamma Y$$
Notice that (1) the Z equation is irrelevant for the dynamics of the epidemic, so we can drop it entirely, and (2) β only ever occurs in combination with N, so we can combine these two into a single parameter by defining b = β/N. We can modify the R codes we used before to take account of this.

require(deSolve)

closed.sir.model <- function (t, x, params) {
  inc <- params["b"]*x[1]*x[2] # incidence
  list(c(-inc,inc-params["gamma"]*x[2]))
}

prediction <- function (params, times) {
out <- ode(
func=closed.sir.model,
y=params[c("X.0","Y.0")],
times=c(0,times),
parms=params
)
## return the Y variable only
## and discard Y(0)
out[-1,3]
}

Earlier, we used SSE as a measure of the discrepancy between model predictions and data:

sse <- function (params, data) {
  times <- data$biweek/26 # convert to years
  pred <- prediction(params,times)
  discrep <- pred-data$measles
  sum(discrep^2) # sum of squared errors
}

Now let’s use likelihood instead. Let’s suppose that, when we record cases, we make errors that are normally distributed. Here’s how we can compute the likelihood of the data given the model and its parameters:

loglik <- function (params, data) {
  times <- data$biweek/26
  pred <- prediction(params,times)
  sum(dnorm(x=data$measles,mean=pred,sd=params["sigma"],log=TRUE))
}

dat <- subset(niamey,community=="A")
params <- c(X.0=10000,Y.0=10,gamma=365/13,b=NA,sigma=1)

f <- function (b) {
  par <- params
  par["b"] <- b
  loglik(par,dat)
}

b <- seq(from=0,to=0.02,by=0.0001)
ll <- sapply(b,f)

We plot the results:

plot(b,-ll,type='l',ylab=expression(-log(L)))
b.hat <- b[which.max(ll)]
abline(v=b.hat,lty=2)

The great similarity of the likelihood estimate to our first least-squares estimate is no accident.
Why is this? Let yt be the observed number of infectives at time t and Yt be the model’s
prediction. Then the log likelihood is

$$\log P[y_t \mid Y_t] = \log\left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(y_t-Y_t)^2}{2\sigma^2}\right)\right) = -\tfrac{1}{2}\log\left(2\pi\sigma^2\right) - \tfrac{1}{2}\,\frac{(y_t-Y_t)^2}{\sigma^2}$$

and

$$\log L = -\tfrac{1}{2}\left(\frac{1}{\sigma^2}\sum_t (y_t-Y_t)^2 + \log(\sigma^2) + \log(2\pi)\right)$$

So MLE and least-squares are equivalent if the errors are normal with constant variance!
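
As a quick numerical check (a sketch reusing the objects above): with σ held fixed, the value of b that maximizes the normal log-likelihood is exactly the value that minimizes the SSE.

g <- function (b) {
  par <- params
  par["b"] <- b
  sse(par,dat)
}
b[which.min(sapply(b,g))] == b[which.max(ll)] # TRUE: same argmin over the grid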

Exercise 9. Suppose, alternatively, that the errors are log-normal with constant variance. Under
what definition of SSE will least-squares and maximum likelihood give the same parameter
estimates?

7 Modeling the noise
All this raises the question of what the best model for the errors really is. Of course, the answer
will certainly depend on the nature of the data. The philosophy of likelihood encourages us
to think about the question mechanistically. When the data, yt , are the result of a sampling
process, for example, we can think of them as binomial samples
$$y_t \sim \text{binomial}\left(Y_t,\ \frac{n}{N}\right)$$
where n is the sample size, N the population size, and Yt is the true number of infections at
time t. Alternatively, we might think of yt as Poisson samples
$$y_t \sim \text{Poisson}(p\,Y_t)$$
where the parameter p reflects a combination of sampling efficiency and the detectability of
infections. The latter leads to the following log-likelihood function

poisson.loglik <- function (params, data) {
  times <- data$biweek/26
  pred <- prediction(params,times)
  ## 'prediction' already discards Y(0), so 'pred' lines up with data$measles
  sum(dpois(x=data$measles,lambda=params["p"]*pred,log=TRUE))
}

Let’s see what the MLE parameters are for this model. We’ll start by estimating just one
parameter. Now, we must have b > 0. This is a constraint on the parameter. One way to
enforce this constraint is by transforming the parameter so that it cannot ever be negative.
We’ll log-transform b.

dat <- subset(niamey,community=="A")
params <- c(X.0=20000,Y.0=1,gamma=365/13,b=NA,p=0.2)

## objective function (-log(L))
f <- function (log.b) {
  params[c("b")] <- exp(log.b) # un-transform 'b'
  -poisson.loglik(params,dat)
}

For something new, we’ll use the mle2 function from the bbmle package to maximize the
likelihood. mle2 will employ an iterative algorithm for maximizing the likelihood. To get
started, it needs an initial guess.

require(bbmle)
guess <- list(log.b=log(0.01))

fit0 <- mle2(f,start=guess)
print(fit0)

## An object of class "mle2"
## Slot "call":
## mle2(minuslogl = f, start = guess, lower = -Inf, upper = Inf,
## control = list())
##
## Slot "call.orig":
## mle2(minuslogl = f, start = guess)
##
## Slot "coef":
## log.b
## -5.977186
##
## Slot "fullcoef":
## log.b
## -5.977186
##
## Slot "vcov":
## log.b
## log.b 2.565754e-06
##
## Slot "min":
## [1] 1963.317
##
## Slot "details":
## $par
## log.b
## -5.977186
##
## $value
## [1] 1963.317
##
## $counts
## function gradient
## 46 9
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## [,1]
## [1,] 389749
##
## $maxgrad
## [1] 1.421447
##
## $eratio
## [1] 1
##
##
## Slot "minuslogl":
## function (log.b) {
## params[c("b")] <- exp(log.b) # un-transform 'b'
## -poisson.loglik(params,dat)
## }
## <environment: 0x455be78>
##
## Slot "method":
## [1] "BFGS"
##
## Slot "data":
## list()
##
## Slot "formula":
## [1] ""
##
## Slot "optimizer":
## [1] "optim"

fit <- mle2(f,start=as.list(coef(fit0)))
print(fit)

## An object of class "mle2"
## Slot "call":
## mle2(minuslogl = f, start = as.list(coef(fit0)), lower = -Inf,
## upper = Inf, control = list())
##
## Slot "call.orig":
## mle2(minuslogl = f, start = as.list(coef(fit0)))
##
## Slot "coef":
## log.b
## -5.977186
##
## Slot "fullcoef":
## log.b
## -5.977186
##
## Slot "vcov":

23
## log.b
## log.b 2.565754e-06
##
## Slot "min":
## [1] 1963.317
##
## Slot "details":
## $par
## log.b
## -5.977186
##
## $value
## [1] 1963.317
##
## $counts
## function gradient
## 12 1
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## [,1]
## [1,] 389749
##
## $maxgrad
## [1] 1.421508
##
## $eratio
## [1] 1
##
##
## Slot "minuslogl":
## function (log.b) {
## params[c("b")] <- exp(log.b) # un-transform 'b'
## -poisson.loglik(params,dat)
## }
## <environment: 0x66ee3c0>
##
## Slot "method":
## [1] "BFGS"
##
## Slot "data":

24
## list()
##
## Slot "formula":
## [1] ""
##
## Slot "optimizer":
## [1] "optim"

We can get an idea about the uncertainty and in particular obtain confidence intervals using the profile likelihood. To profile over a parameter, we fix the value of that parameter at each of several values, then maximize the likelihood over the remaining unknown parameters. In bbmle, this is quite easy to obtain.

prof.b <- profile(fit)
plot(prof.b)

Now let’s try to estimate both b and the reporting probability p. Since we have constraints on p
(0 ≤ p ≤ 1), we’ll transform it as well. For this, the logit function and its inverse are useful:
$$\text{logit}(p) = \log\frac{p}{1-p} \qquad\qquad \text{ilogit}(x) = \frac{1}{1+\exp(-x)}$$

dat <- subset(niamey,community=="A")
params <- c(X.0=20000,Y.0=1,gamma=365/13,b=NA,p=NA)

logit <- function (p) log(p/(1-p)) # the logit transform
ilogit <- function (x) 1/(1+exp(-x)) # inverse logit

f <- function (log.b, logit.p) {
par <- params
par[c("b","p")] <- c(exp(log.b),ilogit(logit.p))
-poisson.loglik(par,dat)
}

guess <- list(log.b=log(0.005),logit.p=logit(0.2))

fit0 <- mle2(f,start=guess); fit0

## An object of class "mle2"
## Slot "call":
## mle2(minuslogl = f, start = guess, lower = -Inf, upper = Inf,
## control = list())
##
## Slot "call.orig":
## mle2(minuslogl = f, start = guess)
##
## Slot "coef":
## log.b logit.p
## -5.9918688 -0.1470357
##
## Slot "fullcoef":
## log.b logit.p
## -5.9918688 -0.1470357
##
## Slot "vcov":
## log.b logit.p
## log.b 2.882471e-06 -9.272553e-06
## logit.p -9.272553e-06 6.151853e-04
##
## Slot "min":
## [1] 394.1967
##
## Slot "details":
## $par
## log.b logit.p
## -5.9918688 -0.1470357
##
## $value
## [1] 394.1967
##
## $counts
## function gradient
## 51 10
##

## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## [,1] [,2]
## [1,] 364603.198 5495.584
## [2,] 5495.584 1708.360
##
## $maxgrad
## [1] 0.04106786
##
## $eratio
## [1] 0.00445631
##
##
## Slot "minuslogl":
## function (log.b, logit.p) {
## par <- params
## par[c("b","p")] <- c(exp(log.b),ilogit(logit.p))
## -poisson.loglik(par,dat)
## }
## <environment: 0x60384f8>
##
## Slot "method":
## [1] "BFGS"
##
## Slot "data":
## list()
##
## Slot "formula":
## [1] ""
##
## Slot "optimizer":
## [1] "optim"

fit <- mle2(f,start=as.list(coef(fit0))); print(fit)

## An object of class "mle2"
## Slot "call":
## mle2(minuslogl = f, start = as.list(coef(fit0)), lower = -Inf,
## upper = Inf, control = list())
##

## Slot "call.orig":
## mle2(minuslogl = f, start = as.list(coef(fit0)))
##
## Slot "coef":
## log.b logit.p
## -5.9918687 -0.1470357
##
## Slot "fullcoef":
## log.b logit.p
## -5.9918687 -0.1470357
##
## Slot "vcov":
## log.b logit.p
## log.b 2.882471e-06 -9.272548e-06
## logit.p -9.272548e-06 6.151852e-04
##
## Slot "min":
## [1] 394.1967
##
## Slot "details":
## $par
## log.b logit.p
## -5.9918687 -0.1470357
##
## $value
## [1] 394.1967
##
## $counts
## function gradient
## 11 1
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## [,1] [,2]
## [1,] 364603.158 5495.581
## [2,] 5495.581 1708.360
##
## $maxgrad
## [1] 0.01437765
##
## $eratio
## [1] 0.004456311
##
##
## Slot "minuslogl":
## function (log.b, logit.p) {
## par <- params
## par[c("b","p")] <- c(exp(log.b),ilogit(logit.p))
## -poisson.loglik(par,dat)
## }
## <environment: 0x5519cd8>
##
## Slot "method":
## [1] "BFGS"
##
## Slot "data":
## list()
##
## Slot "formula":
## [1] ""
##
## Slot "optimizer":
## [1] "optim"

## now untransform the parameters:
mle <- with(
  as.list(coef(fit)),
  c(
    b=exp(log.b),
    p=ilogit(logit.p)
  )
)
mle

## b p
## 0.00249899 0.46330717
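
Recall that b = β/N. Taking N = 50000, as in the earlier SSE fits, gives a quick back-calculation of the implied basic reproduction number (this value of N is an assumption carried over from Section 3):

mle["b"]*50000/(365/13) # R0 = beta/gamma = b*N/gamma; roughly 4.45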

prof2 <- profile(fit)
plot(prof2)

We can also get confidence intervals:

ci <- confint(prof2)
ci

## 2.5 % 97.5 %
## log.b -5.9951952 -5.98854089
## logit.p -0.1953059 -0.09803465

ci[1,] <- exp(ci[1,])
ci[2,] <- ilogit(ci[2,])
rownames(ci) <- c("b","p")
ci

## 2.5 % 97.5 %
## b 0.002490691 0.00250732
## p 0.451328148 0.47551095
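
To see what profile is doing, here is a hand-rolled sketch for b: hold log.b fixed at each point of a grid and maximize the likelihood over logit.p alone (the grid limits are illustrative, chosen to bracket the MLE):

log.b.grid <- seq(-6.01,-5.97,length=25)
prof.ll <- sapply(log.b.grid,function (lb) {
  ## minimize -log(L) over logit.p with log.b held fixed
  -optimize(function (lp) f(lb,lp),interval=c(-2,2))$objective
})
plot(log.b.grid,prof.ll,type='l',
     xlab="log(b)",ylab="profile log likelihood")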

Let’s make a contour plot to visualize the likelihood surface.

dat <- subset(niamey,community=="A")

## this time the objective function has to
## take a vector argument
f <- function (pars) {
  par <- params
  par[c("b","p")] <- as.numeric(pars)
  poisson.loglik(par,dat)
}

b <- seq(from=0.001,to=0.005,by=0.0001)
p <- seq(0,1,by=0.05)
grid <- expand.grid(b=b,p=p)
grid$loglik <- apply(grid,1,f)
grid <- subset(grid,is.finite(loglik))
require(lattice)
contourplot(loglik~b+p,data=grid,cuts=20)

The scale over which the log likelihood is varying is clearly huge relative to what is meaningful.
Let’s focus in on the region around the MLE.

b <- seq(from=0.00245,to=0.00255,length=50)
p <- seq(0.44,0.49,length=50)
grid <- expand.grid(b=b,p=p)
grid$loglik <- apply(grid,1,f)
grid <- subset(grid,is.finite(loglik))
require(lattice)
contourplot(loglik~b+p,data=grid,cuts=20)

Let’s look at the model’s predictions at the MLE. The model is a probability distribution, so
we should look at a number of simulations. An important question is: are the data a plausible
sample from the predicted probability distribution?

params[c("b","p")] <- mle


times <- c(dat$biweek/26)
model.pred <- prediction(params,times)

nsim <- 1000


simdat <- replicate(
n=nsim,
rpois(n=length(model.pred),
lambda=params["p"]*model.pred)
)
quants <- t(apply(simdat,1,quantile,probs=c(0.025,0.5,0.975)))
matplot(times,quants,col="blue",lty=c(1,2,1),type='l')
points(measles˜times,data=dat,type='b',col='red')

Clearly the model is not doing a very good job of capturing the pattern in the data. It appears that we will need an error model that has the potential for more variability than does the Poisson. Recall that, under the Poisson assumption, the variance of the error is equal to the mean. The negative binomial distribution allows the variance to exceed the mean. Let’s explore the alternative assumption that $y_t$ is negative-binomially distributed with mean $p\,Y_t$, as before, but larger variance, $p\,Y_t\,(1+\theta\,p\,Y_t)$, i.e.,

$$y_t \sim \text{Negbin}\left(\text{mu} = p\,Y_t,\ \text{size} = \frac{1}{\theta}\right)$$

loglik <- function (params, data) {
  times <- data$biweek/26
  pred <- prediction(params,times)
  ## as above, 'pred' already lines up with data$measles
  sum(dnbinom(x=data$measles,
              mu=params["p"]*pred,size=1/params["theta"],
              log=TRUE))
}

f <- function (log.b, logit.p, log.theta) {
  par <- params
  par[c("b","p","theta")] <- c(exp(log.b),
                               ilogit(logit.p),
                               exp(log.theta))
  -loglik(par,dat)
}

guess <- list(log.b=log(params["b"]),
              logit.p=logit(params["p"]),
              log.theta=log(1))

fit0 <- mle2(f,start=guess)
fit <- mle2(f,start=as.list(coef(fit0)))
print(fit)

## An object of class "mle2"
## Slot "call":
## mle2(minuslogl = f, start = as.list(coef(fit0)), lower = -Inf,
## upper = Inf, control = list())
##
## Slot "call.orig":
## mle2(minuslogl = f, start = as.list(coef(fit0)))
##
## Slot "coef":
## log.b logit.p log.theta
## -5.933817 1.407372 -0.594814
##
## Slot "fullcoef":
## log.b logit.p log.theta
## -5.933817 1.407372 -0.594814
##
## Slot "vcov":
## log.b logit.p log.theta
## log.b 0.0008464986 -0.005478065 0.0004644563
## logit.p -0.0054780649 1.036557802 0.0340473405
## log.theta 0.0004644563 0.034047340 0.1211614730
##
## Slot "min":
## [1] 102.4639
##
## Slot "details":
## $par
## log.b logit.p log.theta
## -5.933817 1.407372 -0.594814
##
## $value
## [1] 102.4639
##
## $counts
## function gradient
## 12 1
##
## $convergence
## [1] 0
##
## $message
## NULL

##
## $hessian
## [,1] [,2] [,3]
## [1,] 1228.367361 6.7083381 -6.5938786
## [2,] 6.708338 1.0103546 -0.3096332
## [3,] -6.593879 -0.3096332 8.3657348
##
## $maxgrad
## [1] 0.005287511
##
## $eratio
## [1] 0.0007843522
##
##
## Slot "minuslogl":
## function (log.b, logit.p, log.theta) {
## par <- params
## par[c("b","p","theta")] <- c(exp(log.b),
## ilogit(logit.p),
## exp(log.theta))
## -loglik(par,dat)
## }
## <environment: 0x59e4f20>
##
## Slot "method":
## [1] "BFGS"
##
## Slot "data":
## list()
##
## Slot "formula":
## [1] ""
##
## Slot "optimizer":
## [1] "optim"

prof3 <- profile(fit)
plot(prof3)

mle <- with(
  as.list(coef(fit)),
  c(
    b=exp(log.b),
    p=ilogit(logit.p),
    theta=exp(log.theta)
  )
)

params[c("b","p","theta")] <- mle
times <- c(dat$biweek/26)
model.pred <- prediction(params,times)

nsim <- 1000
simdat <- replicate(
  n=nsim,
  rnbinom(n=length(model.pred),
          mu=params["p"]*model.pred,
          size=1/params["theta"])
)
quants <- t(apply(simdat,1,quantile,probs=c(0.025,0.5,0.975)))
matplot(times,quants,col="blue",lty=c(1,2,1),type='l')
lines(times,simdat[,1],col='black')
points(measles~times,data=dat,type='b',col='red')

What does this plot tell us? Essentially, the deterministic SIR model, as we’ve written it,
cannot capture the shape of the epidemic. In order to fit the data, the optimization algorithm
has expanded the error variance, to the point of absurdity. The typical model realization (in
black) does not much resemble the data.
Exercise 10. Revisit the other communities of Niamey and/or the British boarding school
influenza data using the Poisson model and bbmle.
*Exercise 11. Try to estimate p, b, and X0 simultaneously.
*Exercise 12. Reformulate the problem using the binomial error model. Modify the parameter
estimation codes appropriately, estimate the parameters, and comment on the results.
*Exercise 13. Reformulate the problem using a normal error model in which the variance is proportional to the mean:

$$y_t \sim \text{normal}\left(p\,Y_t,\ \sigma\sqrt{Y_t}\right).$$
Modify the parameter estimation codes appropriately, estimate the parameters (including both
p and σ), and comment on the results.
*Exercise 14. We’ve been treating the Niamey data as if they were direct (though inaccurate) measurements of the prevalence. Actually, these are incidence data: they are measures of unique infections. It would be more appropriate to model these data by adding another equation

$$\frac{dC}{dt} = \frac{\beta XY}{N}$$

to accumulate new infections and assuming the data are distributed according to, for example,

$$y_t \sim \text{Poisson}(p\,(C_t - C_{t-1})).$$

Modify the codes above to reflect these more appropriate assumptions, estimate the parameters, and comment on the results.
