
Received: 24 March 2022 | Accepted: 7 October 2022

DOI: 10.1111/2041-210X.14030

PERSPECTIVE

A practical guide to understanding and validating complex models using data simulations

Graziella V. DiRenzo1 | Ephraim Hanks2 | David A. W. Miller3

1 U.S. Geological Survey, Massachusetts Cooperative Fish and Wildlife Research Unit, University of Massachusetts, Amherst, Massachusetts, USA
2 Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, USA
3 Department of Ecosystem Science and Management, Pennsylvania State University, University Park, Pennsylvania, USA

Correspondence
Graziella V. DiRenzo
Email: [email protected]

Funding information
U.S. Government

Handling Editor: Holger Schielzeth

Abstract

1. Biologists routinely fit novel and complex statistical models to push the limits of our understanding. Examples include, but are not limited to, flexible Bayesian approaches (e.g. BUGS, stan), frequentist and likelihood-based approaches (e.g. package lme4) and machine learning methods.

2. These software packages and programs afford the user greater control and flexibility in tailoring complex hierarchical models. However, this level of control and flexibility places a higher degree of responsibility on the user to evaluate the robustness of their statistical inference. To determine how often biologists run model diagnostics on hierarchical models, we reviewed 50 recently published papers in 2021 in the journal Nature Ecology & Evolution, and we found that the majority of published papers did not report any validation of their hierarchical models, making it difficult for the reader to assess the robustness of their inference. This lack of reporting likely stems from a lack of standardized guidance for best practices and standard methods.

3. Here, we provide a guide to understanding and validating complex models using data simulations. To determine how often biologists use data simulation techniques, we also reviewed 50 recently published papers in 2021 in the journal Methods in Ecology and Evolution. We found that 78% of the papers that proposed a new estimation technique, package or model used simulations or generated data in some capacity (18 of 23 papers), but very few of those papers (5 of 23 papers) included either a demonstration that the code could recover realistic estimates for a dataset with known parameters or a demonstration of the statistical properties of the approach. To distil the variety of simulation techniques and their uses, we provide a taxonomy of simulation studies based on the intended inference. We also encourage authors to include a basic validation study whenever novel statistical models are used, which, in general, is easy to implement.

4. Simulating data helps a researcher gain a deeper understanding of the models and their assumptions and establish the reliability of their estimation approaches. Wider adoption of data simulations by biologists can improve statistical inference, reliability and open science practices.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,
provided the original work is properly cited.
© 2022 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society. This article has been
contributed to by U.S. Government employees and their work is in the public domain in the USA.

Methods Ecol Evol. 2023;14:203–217.  wileyonlinelibrary.com/journal/mee3 | 203



KEYWORDS
Bayesian, frequentist, goodness-of-fit, hierarchical models, occupancy model, power analysis, statistical properties, study design

1 | INTRODUCTION

Ecologists and evolutionary biologists increasingly use complex hierarchical models to answer novel questions of theoretical and practical importance (e.g. Conn et al., 2018; Hooten & Hobbs, 2015; Kéry & Royle, 2016, 2021; Kéry & Schaub, 2012). For example, 52% of 50 recently published papers in 2021 in the journal Nature Ecology and Evolution use a hierarchical model to analyse their data (Table S1). Examples of hierarchical models include generalized linear mixed models (GLMMs, Bolker et al., 2009), latent state models with some observation process (e.g. occupancy models MacKenzie et al., 2002; Tyre et al., 2003) and mixture models (Kéry & Schaub, 2012). Several factors have contributed to the increased application of hierarchical models in ecology and evolutionary biology (e.g. Bolker et al., 2009; Kéry & Royle, 2016, 2021; Kéry & Schaub, 2012). First, with greater access to community science data, open-source datasets, genomic data and long-term ecological research (e.g. Dryad®, GenBank®, TreeBASE®), biologists can ask bigger and more complicated questions, which typically lead to the use of more complicated modelling methods. Second, more biologists are learning to use flexible programming languages that facilitate writing tailored complex hierarchical models (e.g. BUGS Sturtz et al., 2005, JAGS Plummer, 2003, stan Carpenter et al., 2017, package lme4 Bates et al., 2015, machine learning methods Joseph, 2020). Lastly, policy and conservation decision-makers are increasingly relying on the insights from complex datasets to guide their actions (Runting et al., 2020).

However, when practitioners fit custom-built hierarchical models, their methods are often largely untested (e.g. Conn et al., 2018; Hooten & Hobbs, 2015). As the complexity of hierarchical models increases, it becomes increasingly difficult to intuitively understand the assumptions, uncertainty and potential biases of the specified model. For example, a recent paper published in Science used a hierarchical statistical model to examine the effects of climate change on bumble bee occupancy (Soroye et al., 2020), and a follow-up study using data simulations showed that the hierarchical model used was not robust to violations of model assumptions (Guzman et al., 2021). Although such cases may seem rare, they are likely more common than appreciated. To determine how often biologists are validating the results of the hierarchical models they use to analyse data, we reviewed 50 recently published papers in 2021 in the journal Nature Ecology & Evolution, and we found that the majority of published papers that used hierarchical models did not report any validation of the models (5 of the 26 papers checked the diagnostics and fit of hierarchical models; 19%; Table S1). Similarly, in the journal Ecology, only 25% of articles routinely report model diagnostics (Conn et al., 2018). Even more rarely do biologists report an evaluation of the soundness of their code or the reliability of their novel statistical models (i.e. are the statistical models unbiased? how robust are the models to violations in assumptions? Brown et al., 2018; Link et al., 2018). One reason for this lack of quantitative rigour is the absence of standard guidelines that would make it easy for biologists to evaluate their statistical models (Barraquand et al., 2014; Conn et al., 2018).

The goal of this paper is to lay out a framework for validation when complex hierarchical models are used. By validation, we mean, 'are the estimates we generate from a statistical model providing sound inferences (i.e. can we generalize the results)?' Thus, validation includes everything from whether code is correct, to whether parameters are identifiable and estimates unbiased, to whether our model can be robustly applied when assumptions are violated or new data are collected. In the real world, however, we rarely know true values of ecological parameters of interest (e.g. Kéry & Schaub, 2012); thus, in most cases, our ability to test and validate statistical methods relies on simulating datasets where truth is set and known by the user (e.g. Kéry & Royle, 2016; Kéry & Schaub, 2012). Simulated data provide an opportunity to compare different properties of our statistical estimators to the true parameter values used to generate them and to evaluate model behaviour or performance. To determine how often biologists simulate data, we reviewed 50 recently published papers in 2021 in the journal Methods in Ecology and Evolution (Table S2), and we found that 78% of the papers that proposed a new estimation technique, package or model used simulations or generated data in some capacity (18 of 23 papers). However, even in this journal the approaches used by authors varied greatly. For example, only five of the 23 papers included a basic demonstration that code can recover realistic estimates for a dataset with known parameters. Similarly, only nine of the 23 papers included simulations that demonstrate the statistical properties of an approach (i.e. quantifying accuracy, precision, bias and coverage of the estimator). As demonstrated by our review of papers published in Nature Ecology & Evolution, validation is even less common in journals not focused on methods development, despite most applications of complex hierarchical models using novel methods.

While simulation studies are a natural tool for understanding and validating the statistical properties of a method, model or analysis, there is no clear standard for when ecologists can use simulation studies, and which simulation studies are useful in different scenarios (e.g. Olivetti et al., 2021; Rossman et al., 2016; Smith et al., 2021; Tingley et al., 2020). Therefore, in this paper, we provide a guide to simulation studies for biologists. Specifically, we present a taxonomy of simulation study types based on the intended inference, with two broad divisions: (1) study-specific simulations (i.e. studies focused on a particular ecological system, such as an analysis of an ecological dataset aimed at answering a scientific question relevant to that ecological system) and (2) general property simulations (i.e. studies focused on methods and guidelines for adoption in future studies). We provide general guidelines on what questions each simulation study can help answer, and we encourage authors to at a minimum include
an easily implementable basic validation simulation whenever novel statistical models are used. In an effort to facilitate the implementation of these methods, we provide a running example throughout the text with fully reproducible code in R (R Core Team, 2021) and Nimble (de Valpine et al., 2017, 2022a, 2022b). This running example takes advantage of a common hierarchical model in ecology—the occupancy model (MacKenzie et al., 2002; Tyre et al., 2003)—in which the ecological process is decomposed from the sampling process. We suggest that new statistical models be accompanied by data simulations to avoid erroneous conclusions and to avoid the use of biased models in policy and decision-making, just as it has become standard practice that field studies are accompanied by laboratory experiments to validate conclusions (Kéry & Royle, 2016).

2 | USES OF DATA SIMULATION STUDIES IN ECOLOGY AND EVOLUTIONARY BIOLOGY

Simulation studies are valuable for a wide range of analyses conducted using a wide range of statistical frameworks. Frequentist, Bayesian and optimization-based machine learning approaches all lend themselves equally well to simulation studies (Muff et al., 2020; Weber et al., 2021). Similarly, simulation studies are valuable when inference is conducted analytically (i.e. when using an analytic maximum likelihood estimator and analytic asymptotic confidence intervals) or numerically (i.e. when using numerical optimization Kendall et al., 1997; Kendall & Nichols, 1995; Mackenzie & Royle, 2005).

It is important to clearly identify the goal of any simulation study and to identify the statistics of interest that will help address this goal (see Table 1 for a list of common goals of simulation studies; Kéry & Royle, 2016). The basic steps of conducting any simulation study are to (1) simulate one to many unique datasets using a data generating model (referred to as M), (2) estimate the desired parameters (or other statistics) using the statistical model of interest (referred to as A) and (3) summarize the performance of those estimates using Monte Carlo methods (see Box 1 for a more detailed algorithm). Monte Carlo approaches are those which rely on random variables simulated from a distribution, instead of the theoretical properties of the distribution itself (see Rizzo, 2019 for an introduction). Two very common uses of Monte Carlo methods are Markov chain Monte Carlo (MCMC) methods, used primarily in Bayesian statistical analysis (Gelman et al., 1995) to draw samples from a posterior distribution, and simulation studies, which we focus on here. A simulation study is simply the process of drawing samples from a distribution of a desired statistic, and using those samples to understand the statistical properties, like bias and variance, of that statistic (a function of the random samples). The questions which can be addressed by a given simulation study depend heavily on (1) the way random samples are drawn and (2) the statistics, or quantities, of interest. In this paper, we outline a classification of simulation studies, and provide an illustration of many common types of simulation studies.

The first classification of simulation studies, which we refer to as study-specific simulations, are methods appropriate when the goal is to validate the analysis of a dataset already in hand and the interpretation of the ecological results of the analysis in the context of the

TABLE 1 Summary table of the taxonomy of simulation studies used to understand and validate statistical models. Under each category, there are three types of simulation studies. For each type of simulation study, we summarize the types of questions that each aids in answering alongside the goals of the simulation.

Category: Study-specific simulations
- Basic validation simulation
  Questions to answer: Are the parameters identifiable? Is this model computationally feasible?
  Goal: Explore and verify identifiability of model parameters, given available data.
- Determining statistical properties
  Questions to answer: What are the basic statistical properties of my model under a standard set of conditions? How does the approach perform with respect to parameter accuracy, bias, precision and coverage? What are the computational requirements/time of running the model?
  Goal: Understand statistical properties of an algorithm output.
- Assessing goodness-of-fit
  Questions to answer: Does my model sufficiently and accurately explain my data?
  Goal: Understand if the estimator can sufficiently reproduce the observed data.

Category: General property simulations
- Simulation-based study design
  Questions to answer: How many samples will be needed to generate quality estimates? What is the optimal allocation of samples? Where and when do I collect samples?
  Goal: Understand sampling design requirements for robust inference.
- Assessing statistical robustness
  Questions to answer: What are the properties of the estimator across different parameter spaces? Can the approach be applied to different conditions?
  Goal: Understand model performance under a wide array of parameter conditions.
- Comparing the efficacy of different approaches
  Questions to answer: What happens when data violate model assumptions? How do different approaches perform under non-optimal conditions?
  Goal: Understand model performance when assumptions are violated.

BOX 1 General framework for data simulations

The general procedure for a simulation study can be defined by three steps (Figure B1.1).

FIGURE B1.1 Graphical illustration of the steps of a simulation study. These steps are described in detail in this box. [Correction added on 14 December 2022, after first online publication: Figure B1.1 has been revised].

(1) Simulation. To simulate data, the user specifies a probabilistic model M_s for simulation, parameters p_s of that model and independent variables X_s. The ability to vary the model, parameters and independent variables determines the types of inferences that can be made from a simulation study. Thus, it is common for simulation studies to consider a range of simulation settings. We denote the kth simulated dataset as y_s^(k), with the 's' subscript denoting that this is a 'simulated' dataset, and the '(k)' superscript denoting the replicate number, with k ranging from 1 to K, the total number of simulations.

(2) Model fitting. The goal of most simulation studies is to understand the distributional properties of a 'statistic', which is often an estimate of a parameter or a summary of a set of data. We assume an algorithm, A, is applied to a simulated dataset y_s and returns a set of statistics or outputs O_s; the user has a lot of flexibility in defining the algorithm A. Note that the algorithm is not always the statistical model, M_f, and the statistical model often matches the data generating model. The statistics or algorithm outputs of interest may be an estimate β̂ of a parameter, the width or coverage of a credible interval, or a p value or test statistic associated with a parameter.

(3) Monte Carlo analysis. In most cases, many datasets will be simulated and analysed, and we summarize these results. To accomplish this, after simulating data {y_s^(1), y_s^(2), …, y_s^(K)} and calculating the statistics of interest {O^(1), O^(2), …, O^(K)}, the distributional properties of O | M_s, p_s, X_s are explored by use of the samples {O^(1), O^(2), …, O^(K)} from this distribution. Exploring or estimating properties of a distribution using random samples from that distribution is referred to as a 'Monte Carlo' analysis. For example, the mean of the statistics, E(O | M_s, p_s, X_s), could be approximated using the sample mean, E(O | M_s, p_s, X_s) ≈ (1/K) Σ_k O^(k).
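The three steps above can be sketched in a few lines of code. The paper's Supplement S1 provides the full R/Nimble implementation for the running example; the following is our own minimal stdlib-Python illustration using a deliberately simple data generating model (counts that are Poisson with mean λ): simulate K datasets, apply the algorithm A (here, the maximum likelihood estimator of λ, which is the sample mean) to each, and summarize the Monte Carlo distribution of the resulting statistics.

```python
import math
import random
import statistics

random.seed(1)

K = 1000        # number of simulated datasets (replicates)
n = 50          # observations per dataset
lam_true = 3.0  # true parameter p_s of the data generating model M_s

def rpois(lam):
    # Poisson draw via Knuth's algorithm (adequate for small lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def fit(y):
    # Step 2: algorithm A applied to one dataset returns the statistic O
    # (here, the maximum likelihood estimate of lambda: the sample mean)
    return sum(y) / len(y)

# Step 1 + Step 2: simulate y_s^(k) from M_s and compute O^(k), k = 1..K
O = [fit([rpois(lam_true) for _ in range(n)]) for _ in range(K)]

# Step 3: Monte Carlo analysis of the K statistics
mc_mean = statistics.mean(O)   # approximates E(O | M_s, p_s, X_s)
mc_bias = mc_mean - lam_true   # Monte Carlo estimate of estimator bias
print(f"mean estimate = {mc_mean:.3f}, bias = {mc_bias:+.3f}")
```

The same skeleton applies unchanged to hierarchical models: only the simulator (M_s) and the fitting algorithm (A) become more elaborate.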

General considerations for simulation studies

Monte Carlo approaches are approximate approaches, and their accuracy depends on the number of simulated datasets generated (i.e. accuracy increases as the sample size increases). Often the goal of a simulation study is to estimate a property of a distribution—for example, 'what is the expected distribution of parameter estimates given some true value of the parameter?'. After conducting a simulation study, along with the estimate of interest, it is possible to compute a standard error of that estimate using the Monte Carlo samples. Following Koehler et al. (2009), the Monte Carlo standard error of Ô is:

MCSE(Ô) = √Var(O^(k)),

where the variance is calculated over the K outputs from the simulation. This provides a straightforward way to assess whether or not more simulations are needed. In general, let S be the statistic of interest (i.e. the power of a test, or the upper bound of a 95% CI of a parameter) which will be approximated using simulations. An approximate 95% confidence interval of the statistic S is Ŝ ± 1.96 × MCSE(Ŝ). As the number K of simulation study replicates increases, MCSE(Ŝ) will converge to the standard error of the estimate Ŝ. A good visual check of the effect of the replication size K of the simulation study is a plot of MCSE(Ŝ) for increasing values of K. If the resulting plot shows convergence of the MCSE, then it is clear that the size of the simulation study is high enough that the error in estimation is due mostly to the standard uncertainty associated with any estimate based on a finite dataset, and is not strongly driven by having too few replicates in the simulation study to effectively quantify uncertainty.
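The convergence check described above can be sketched directly. In the following stdlib-Python illustration (variable names and the mock statistic are ours, not from the paper's code), each replicate's statistic O^(k) is mocked by a normal draw with a known spread, the running MCSE is computed for increasing K as the sample standard deviation over the replicates so far, and the checkpoint values are the quantities one would plot against K to confirm convergence.

```python
import random
import statistics

random.seed(42)

# Suppose each replicate k of a simulation study yields a statistic O^(k).
# Here we mock the simulate-and-fit step with a normally distributed
# statistic whose true standard error is 0.5.
K_max = 2000
O = [random.gauss(10.0, 0.5) for _ in range(K_max)]

def mcse(samples):
    # MCSE(O-hat) = sqrt(Var(O^(k))), the variance taken over replicates
    return statistics.stdev(samples)

# Running MCSE for increasing replication size K; plotting these values
# against K is the visual convergence check (here we print checkpoints).
checkpoints = [50, 200, 800, 2000]
running = {K: mcse(O[:K]) for K in checkpoints}
for K, se in running.items():
    print(f"K = {K:4d}  MCSE = {se:.3f}")
```

If the printed values have stabilized near a common value by the largest K, the replication size is adequate; if they are still drifting, more replicates are needed.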

system where the data were collected (McClintock, 2021; Palencia et al., 2021; Santos-Fernandez & Mengersen, 2021). Simulation studies of this type can range from a basic validation of the code and model to extensive explorations of the model properties as they relate to analysing the specific dataset.

The second classification of simulation studies, which we refer to as general property simulations, are methods used when determining the efficacy of applying a novel analytical approach for the design and analysis of future studies (Bellier et al., 2016; Rossman et al., 2016; Tingley et al., 2020; Zipkin et al., 2017). Simulations in this category can validate model performance across a broad parameter space, guide data collection and study design, determine how robust an approach will be to assumption violations and provide guidance regarding the relative performance of multiple analytical approaches.

In the following sections, we describe three types of study-specific simulations and three types of general property simulations. For each type of simulation, we include a worked example using detection/non-detection data for the spatial distribution of Cape Weavers in South Africa (Clark & Altwegg, 2019b; see Box 2 for a description of the dataset). We provide code to reproduce all model fitting and simulation studies discussed in the text, which can be found in Supplement S1 (DiRenzo, Hanks, et al., 2022). In choosing an example dataset we are left with the challenge of choosing an example that easily illustrates concepts, while also meeting our definition of a complex hierarchical model. Our example dataset consists in its simplest form of an exercise in building a regression model to predict the expected probability of observing a specific type of bird at a specific location given specific survey conditions. It also includes two types of structure typical of many hierarchical models. First, there is non-independence in the data that must be controlled for using a random-effects structure. Second, the true state of the system is not observable (i.e. we can only observe whether the bird is detected and not whether it is actually present at a site), and thus it includes a latent variable that is linked to data through an observation model. It mirrors the structure of many other models used by ecological and evolutionary researchers, such as those used for GLMMs (Harrison et al., 2018), phylogenetic analyses (Revell & Harmon, 2022), hierarchical data collection (Miller & Grant, 2015) or predicting system dynamics (Buderman et al., 2020).

Lastly, we note that in this paper we only focus on simulation studies as a tool for understanding and validating statistical models and methods. Simulations are also a critical component of the standard parameter estimation toolkit (e.g. bootstrap approaches to hypothesis tests, simulation-based inference such as particle filters) (Lahiri, 2005; Loh & Stein, 2004) and a critical tool for prediction and forecasting (Bergmeir et al., 2018; Pagel & Schurr, 2012), neither of which are addressed here.

3 | STUDY-SPECIFIC SIMULATIONS

Study-specific simulation studies are appropriate when the focus of the scientific study is the analysis of a single dataset with the goal of understanding the system being studied. Below we review study-specific simulations that accomplish three goals: (1) basic validation simulation, (2) determining statistical properties and (3) assessing goodness-of-fit.

3.1 | Basic validation simulation

3.1.1 | Objective

The goal of a basic validation simulation is to determine whether the model, fitting algorithm and code can generate realistic parameter estimates for an observed dataset (Table 1). This serves as a bare minimum check of code and model validity (Kéry & Schaub, 2012). It also provides a description of the data generating model M (Box 1) that can be used to evaluate the model assumptions and serve as a template for more extensive validation methods, such as evaluating bias and interval coverage statistical properties. A basic validation simulation can help confirm parameter identifiability given the available data, illuminate weaknesses in the model and fitting algorithms for a given dataset, identify when major issues (e.g. coding errors or model non-identifiability) have occurred, and provide a minimum threshold of evidence that these issues are not likely to exist in a particular study. In addition, inclusion of a basic validation simulation in a published paper facilitates a more open, transparent and reproducible approach to the implementation of novel analytical methods. Therefore, a basic validation can be an important contribution when a novel model is fit or when new code is developed for a model. A basic validation requires relatively minor computing, as it comprises only fitting the statistical model twice (once for the analysis of the observed data and once for a single simulated dataset).
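As a concrete toy illustration of a basic validation simulation (separate from the paper's R/Nimble running example, and with parameter values we invented), the sketch below simulates a single detection/non-detection dataset from a simple non-spatial occupancy model with known occupancy probability ψ and per-visit detection probability p, refits the same model by maximum likelihood using a crude grid search, and checks that the code recovers realistic estimates of the known truth.

```python
import math
import random
from collections import Counter

random.seed(7)

# --- Step 1: simulate one dataset from known ("true") parameters ---
psi_true, p_true = 0.6, 0.4    # occupancy and per-visit detection probability
n_sites, n_visits = 400, 5
z = [random.random() < psi_true for _ in range(n_sites)]   # latent occupancy z_s
d = [sum(random.random() < p_true for _ in range(n_visits)) if zs else 0
     for zs in z]                                          # detections per site
cnt = Counter(d)   # tally sites by detection count (speeds up the likelihood)

# --- Step 2: fit the same occupancy model by maximum likelihood ---
def loglik(psi, p):
    ll = 0.0
    for di, m in cnt.items():
        if di > 0:   # detected at least once: the site must be occupied
            ll += m * (math.log(psi) + di * math.log(p)
                       + (n_visits - di) * math.log(1 - p))
        else:        # never detected: occupied-but-missed, or truly unoccupied
            ll += m * math.log(psi * (1 - p) ** n_visits + (1 - psi))
    return ll

grid = [i / 100 for i in range(1, 100)]    # crude grid search over (0, 1)^2
psi_hat, p_hat = max(((a, b) for a in grid for b in grid),
                     key=lambda ab: loglik(*ab))

# --- Step 3: check that the known truth is recovered ---
print(f"psi_hat = {psi_hat:.2f} (truth {psi_true}); "
      f"p_hat = {p_hat:.2f} (truth {p_true})")
```

If the refitted estimates landed far from the values used to simulate, that would flag a coding error or an identifiability problem before any ecological conclusions were drawn, which is exactly the bare-minimum check advocated above.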

BOX 2 Spatial occupancy modelling of Cape Weaver in South Africa

As part of the second Southern African Bird Atlas project, community scientist birders were asked to spend at least 2 h on a check-
list, recording all species they observed and the order in which they were observed. Here, we consider only the data related to the
Cape Weaver Ploceus capensis collected by this community science project, as made available by Clark and Altwegg (2019b) in Clark
and Altwegg (2019a). A total of 9356 recorded detection/non-­detection observations are available for this species (Figure B2.1a,b).
Clark and Altwegg (2019b) use two principal components (PC) to summarize multiple spatial covariates, with PC1 interpretable as
a temperature-­related factor (Figure B2.1c) and PC2 interpretable as a measure of climate intensity (Figure B2.1d). As a measure of
observation accuracy of each individual birder, Clark and Altwegg (2019b) used the total number of species observed by the birder
as a covariate on detection probability.

[Figure B2.1: panels (a) Number of Observation Events, (b) Empirical Percent Observed, (c) PC1, (d) PC2, (e) Estimated Occupancy Prob., (f) Estimated Spatial Random Effect.]

F I G U R E B 2 . 1 Summary of the Cape Weaver dataset from South Africa: (a) Depicts the number of observation events per
location. (b) Shows the empirical percent observed. (c) Shows the spatial mapping of principal component PC1, and (d) shows the
spatial mapping of principal component PC2. (e) Shows the estimated occupancy probability of the Cape Weaver across South
Africa using our spatial occupancy model (Equations (1)–­(5)), and (f) shows the estimated spatial random effect across South Africa.
[Correction added on 14 December 2022, after first online publication: Figure B2.1 has been revised].

In the following, we give a brief ecological description of the model and its structure. The goal of the model is to estimate the
probability the Cape Weaver occurs at a given location across the study area. The model we describe will include two hierarchical
components, each of which add greater structure and complexity as compared to a standard logistic regression analysis. First, the
probability of observing a Cape Weaver during a survey is a function of not only whether the species is present, but also whether it is
detected given it does occur at the location. To account for this, a nested model is used to estimate the probability of observing the
species as a function of both whether it is present and whether it is detected given it is present. Second, we want to account for spa-
tial dependency among observations and this is done by including a spatial random effect. However, as noted in the text, accounting
for spatial dependency in an unbiased manner is a non-­trivial problem.
To fit these data, we considered a spatial occupancy model with binary observations y_si being the ith observation at spatial location s. As noted above, the probability that y_si = 1 depends both on whether the Cape Weaver is present and whether it is observed. Therefore, we model the probability of detecting a bird during a survey as:

y_si ~ Bern(z_s · p_si),    (1)

where z_s = 1 if the Cape Weaver occupies the sth spatial location and z_s = 0 if not (i.e. z_s is the latent true occupancy), and p_si is the probability of detection for the ith observation at location s. We model detection probability using a probit regression model as a function of observer experience, with

Φ⁻¹(psi) = α0 + α1 wo(si), (2)

where wo(si) is the total number of species observed in the second South African Bird Atlas project by the observer o who made the ith
observation at spatial location s (see Clark & Altwegg, 2019b for additional explanation). The second component of the probability of ob-
serving a bird is whether it is actually present at the location, which also happens to be the variable we are most interested in estimating.
This probability is latent, and can be modelled using a probit regression model where:
zs ∼ Bern(us), (3)

Φ⁻¹(us) = β1 + β2 x1s + β3 x2s + ηs, (4)

where x1s and x2s are the first two principal components described above (Figure B2.1c,d). Spatial autocorrelation in occupancy is modelled by a spatial random effect ηs, which we model using a basis function approach (Cressie et al., 2022) with basis vectors constructed from the first M eigenvectors of the inverse of an Intrinsic Conditional Auto-Regressive (ICAR) precision matrix. This differs slightly from Clark and Altwegg (2019b), who used similar eigenvectors, but first removed some correlation between the spatial random effect and the fixed effects.
ηs = Σ_{m=1}^{M} γm vm, γm ∼ N(0, σ²m), (5)

where vm is the mth eigenvector and σ²m is the corresponding mth eigenvalue. All regression parameters are assigned diffuse (variance = 100) zero-mean Gaussian priors.
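In code, Equation (5) is just a weighted sum of basis vectors with independent Gaussian weights. A minimal pure-Python sketch follows; the basis vectors and eigenvalues below are invented for illustration, whereas in the actual analysis they come from the eigendecomposition of the inverse ICAR precision matrix.

```python
import random

def simulate_eta(basis, eigvals, rng):
    """Draw the spatial random effect eta_s = sum_m gamma_m * v_m[s],
    with gamma_m ~ N(0, sigma^2_m), following Equation (5)."""
    # One Gaussian weight per basis vector; the eigenvalue is the variance
    gammas = [rng.gauss(0.0, ev ** 0.5) for ev in eigvals]
    n_sites = len(basis[0])
    # Weighted sum of basis vectors evaluated at each site s
    return [sum(g * v[s] for g, v in zip(gammas, basis))
            for s in range(n_sites)]

rng = random.Random(1)
# Two made-up basis vectors over three sites (illustrative only)
basis = [[0.6, -0.2, 0.1], [0.1, 0.5, -0.4]]
eta = simulate_eta(basis, eigvals=[2.0, 0.5], rng=rng)
```

With the eigenvalues set to zero the random effect collapses to zero everywhere, which is a quick sanity check on the construction.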
We additionally placed diffuse half-normal priors on all variance parameters, and fit this Bayesian hierarchical model (BHM) using MCMC. All computing was done using the NIMBLE R package (de Valpine et al., 2017, 2022a, 2022b). We ran the MCMC sampler for 20,000 iterations and discarded the first 10% of the chain as burn-in. The posterior mean occupancy probabilities are shown in Figure B2.1e, and the estimated spatial random effect is shown in Figure B2.1f.
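To make the generative side of Box 2 concrete, Equations (1)-(4) can be sketched in a few lines. The paper's own analysis is written in R with NIMBLE; the snippet below is a language-neutral illustration in pure Python, with arbitrary parameter values and a single site's worth of data (the function name and inputs are ours, not from the paper).

```python
import math
import random

def Phi(x):
    """Standard normal CDF, i.e. the inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_site(alpha, beta, x1, x2, eta, w, rng):
    """Simulate one site's detection/non-detection data from Eqs (1)-(4):
    latent occupancy z is drawn first, then one y per visit given z."""
    u = Phi(beta[0] + beta[1] * x1 + beta[2] * x2 + eta)   # Eq. (4)
    z = 1 if rng.random() < u else 0                       # Eq. (3)
    ys = []
    for wi in w:                                           # observer covariate per visit
        p = Phi(alpha[0] + alpha[1] * wi)                  # Eq. (2)
        ys.append(1 if (z == 1 and rng.random() < p) else 0)  # Eq. (1)
    return z, ys

rng = random.Random(42)
z, ys = simulate_site(alpha=(-0.5, 0.3), beta=(0.2, -0.7, -0.9),
                      x1=0.1, x2=-0.4, eta=0.0, w=[0.5, 1.2, -0.3], rng=rng)
```

Repeating this draw over all sites, with ηs simulated from Equation (5), yields one complete simulated dataset of the kind used throughout the validation exercises below.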

3.1.2 | Simulation settings

A basic validation simulation consists of first fitting the statistical model, A, used for the scientific analysis using the observed data (y* and X*) to generate parameter estimates (p̂*; Kéry & Schaub, 2012). Then, the parameter estimates are used to simulate a single new dataset from the simulation distribution, such as ys ∼ M(p̂*, X*). This simulated dataset is the same size as the observed data and uses the same covariates, spatial locations and settings (X*) as the observed data.

3.1.3 | Model fitting

The same statistical model used for the original fit, A, is then fit to this simulated dataset (Kéry & Schaub, 2012). The parameter estimates, confidence or credible intervals, and other model diagnostics are checked to make sure that the results are reasonable given the parameter values. For example, do most parameter estimates include the true value in the 95% CI? Are credible interval widths narrow enough to suggest that there is sufficient power to estimate a parameter?

3.1.4 | Example

Here, we perform a basic validation simulation of a spatial occupancy model that is fit to the Cape Weaver dataset (Box 2). First, we fit the spatial occupancy model described in Box 2 (Equations (1)-(5)) to the observations of Cape Weavers, and our model parameter estimates (specifically the posterior means) are shown graphically in Figure 1a as vertical dashed grey lines. For brevity and to highlight interesting results, we only display parameters β2 and β3 in Figure 1. These two parameters capture the relationship between local climate and the presence of the Cape Weaver. They are interesting both because they capture important ecological relationships and because they have a high degree of spatial structure, which, as we explain below, is relevant to our ability to estimate parameters. Full results are found in Appendix Figure A1. After fitting the statistical model to the observation data, we simulated data based on the parameter estimates from the field data to perform a basic validation. Note that simulating data presented a specific challenge in our case, which was to capture the spatial random process estimated in our model. As our spatial occupancy model includes a latent spatial random effect, our simulation was done hierarchically by:

[Figure 1 appears here. Panels (as titled in the figure): Basic Validation; Full Simulation (Spatial); Power By Sample Size; Full Simulation (Nonspatial); # of Spatial Vectors; Misspecified Model. The panels plot posterior density, power or parameter estimates for β2 and β3 against parameter estimate, total number of observation events or number of eigenvectors.]
FIGURE 1 Simulation study results. Panel (a) shows the density plots of the posterior distributions of model estimated parameters from a simulated dataset demonstrating a basic validation simulation. The vertical dashed grey line represents the model parameter estimates when fitting the Cape Weaver dataset using a spatial occupancy model. Results suggest that the spatial occupancy model may have difficulty recovering parameter β2. Panel (b) shows the 95% CIs of the posterior distributions of β2 and β3 for the first 20 of the simulated datasets under the spatial and nonspatial models. The point ranges highlighted in red show simulation runs that do not overlap with the true parameter value, which is represented by the horizontal line. Point ranges in black represent simulation runs that do overlap with the true parameter value. Results suggest that β2 suffers from identifiability issues due to spatial confounding. Panel (c) shows a power analysis for recovering estimates of β2 and β3. The results show that β3 requires more independent samples than β2 for consistent estimation. Panel (d) shows the results of a simulation study with model misspecification. Point ranges in red and black represent those that did not and that did overlap with the true parameter value, respectively. We simulated data with observer heterogeneity in the detection process and analysed the data using a model that assumes homogeneity in detection. Results show that ignoring heterogeneous detection probabilities can lead to bias in the estimates of β2 and β3, as shown by the lack of overlap between the 95% CIs of the posterior distributions of β2 and β3 with the true parameter value (grey dashed vertical line). Panel (e) is a comparison of models with different numbers of spatial basis functions. We find that inference on β2 and β3 is relatively stable when more than 100 basis functions are used. [Correction added on 14 December 2022, after first online publication: Figure 1 has been revised.]
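The coverage and power summaries referenced in this caption reduce to simple proportions computed across simulated datasets. A minimal sketch, with hypothetical interval values invented purely for illustration:

```python
def coverage(cis, truth):
    """Proportion of 95% CIs that contain the true parameter value."""
    return sum(1 for lo, hi in cis if lo <= truth <= hi) / len(cis)

def power(cis, truth):
    """Proportion of CIs that exclude zero with the same sign as the truth."""
    hits = sum(1 for lo, hi in cis
               if (lo > 0 and truth > 0) or (hi < 0 and truth < 0))
    return hits / len(cis)

# Four hypothetical posterior 95% CIs for one parameter with true value -0.6
cis = [(-1.2, -0.4), (-0.9, -0.1), (-0.5, 0.2), (-1.1, -0.3)]
cov = coverage(cis, truth=-0.6)  # 3 of 4 intervals contain -0.6 -> 0.75
pw = power(cis, truth=-0.6)      # 3 of 4 intervals exclude zero -> 0.75
```

In a real study these proportions would be computed over the 100s of simulated datasets described in the text, one pair of values per parameter.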

1. Simulating the parameters in the spatial random effect, γm ∼ N(0, σ²m), and then creating the simulated spatial random effect with ηs = Σ_{m=1}^{M} γm vm.
2. Using this spatial random effect and the existing covariates, we simulated first true spatial occupancy (z) and then observations of detection/non-detection. These simulations used Equations (1)-(5) in Box 2 with all α and β parameters being set to their posterior means from our statistical model fit using a spatial occupancy model to the Cape Weaver dataset.

These specifications create a simulated dataset with the exact same scientific settings as our observed data. We then completed the loop by fitting our spatial occupancy model to this new simulated data, using the same MCMC algorithm used to fit the model to the observed detection/non-detection data.

Density plots of the posterior distribution of model parameters, given the simulated data, are shown in Figure 1a, with full results in Appendix Figure A1a-f. The posterior distributions for most model parameter estimates overlapped the value used to simulate

the data. This is what we hope to see with a basic validation: that our model fit to simulated data generates results that are consistent with the parameters used to simulate that data. However, the posterior distribution for β2 does not overlap the value used to simulate the data, suggesting that the spatial occupancy model may have difficulty in estimating this parameter. This basic validation has no replication, so it is not immediately clear if the results we see are indicative of something systematically wrong with our analysis (i.e. non-identifiability of parameters, an error in our code), or just an extreme resulting from normal sampling variation. Comparing the true occupancy probabilities used to simulate datasets with the corresponding values estimated by the statistical model indicates that we are able to estimate the parameters used to generate the simulated data reasonably well (Figure A1f). Given these results, our basic validation study indicates that we may have some difficulty with the identifiability of β2, given the available data. Note our use of the term 'indicates' in describing these results. Confirming the fit and identifiability of the statistical model for the simulated datasets entails simulating more than a single dataset, as shown in the next section. By completing the basic validation, we have greatly increased the likelihood of identifying major coding errors, issues of identifiability, model misspecification and potential power issues. The next sections outline how that inference can be strengthened using a more comprehensive approach to estimate the distribution of outcomes expected for an estimation approach and to identify whether real-world data used to fit the model are consistent with the assumed distributions underlying our approach.

3.2 | Determining statistical properties

3.2.1 | Objective

A more robust exploration would help determine the full statistical properties of an estimation approach, such as when there are concerns of parameter identifiability, the behaviour of model parameters (e.g. estimating bias, accuracy and precision) or when there are questions about whether interval coverage is well calibrated (Table 1). To do this, we would use a full simulation study. A 'full simulation study' means that 100s to 1000s of simulations are performed, and Monte Carlo methods can be used to understand estimator properties of the simulated datasets, which ought to resemble the observed dataset in hand (e.g. DiRenzo, Miller, et al., 2022; Rossman et al., 2016; Tingley et al., 2020).

When evaluating statistical properties, there are several metrics of potential interest: accuracy, precision, bias and coverage. Each metric can be calculated in several ways; here, we present definitions and a couple of ways to calculate each. Accuracy answers the question 'how close are model estimates to true values?' and can be quantified multiple ways. For example, accuracy can be calculated as the mean error by taking the absolute difference between the model estimate and truth. Alternatively, accuracy can be calculated using mean squared error (MSE) methods, giving greater weight to big differences when assessing performance. Precision answers the question 'how large is the 95% credible interval?'. Again, multiple measures of precision exist, including calculating CI width (1) by subtracting the lower 95% CI estimate from the upper 95% CI estimate or (2) by estimating the standard error of an estimate. Bias answers the question 'what are the patterns of parameter over- versus under-estimation?' For simulation methods, bias can be estimated by subtracting the average model estimate across many simulated datasets from the true parameter value. Lastly, coverage answers the question 'how often does the true parameter value fall within the range of the 95% CI?', and it can be obtained by calculating the proportion of simulations where the true value fell within the 95% CI of the model estimate.

3.2.2 | Simulation settings

The process for simulating data to determine statistical properties is identical to the process of simulating data for a basic validation above, except 100s to 1000s of unique simulated datasets are generated using the estimated parameter values (p̂*) rather than the single dataset (e.g. DiRenzo, Miller, et al., 2022; Rossman et al., 2016; Tingley et al., 2020).

3.2.3 | Model fitting

Again, the statistical model is fit to each of the simulated datasets (e.g. DiRenzo, Miller, et al., 2022; Rossman et al., 2016; Tingley et al., 2020). Once all datasets are fit, Monte Carlo methods are used to examine the frequentist properties of the 100s to 1000s of simulated datasets.

3.2.4 | Example

Continuing the spatial occupancy model example from above, we next conducted a full simulation study to explore the statistical properties of model parameters. We focus again on our ability to estimate the relationship between each of our covariates and the probability a Cape Weaver occurs at a location. As was suggested by Clark and Altwegg (2019b), and also explored by Hanks et al. (2015), Hodges and Reich (2010), Paciorek (2010) and others, parameter identifiability in spatial regression models when the predictor variables are spatially structured (or spatially autocorrelated) can be challenging, and our basic validation suggested that identifiability may be an issue in our case (Figure 1a). Thus, we simulated 100 datasets from the spatial occupancy model using the posterior means obtained for parameters when the model was fit to the observed dataset as the 'true' value. We also simulated 100 datasets from the spatial occupancy model with the spatial component set to zero. Given our suspicion that spatial structure would be an issue, this second set of simulations gave us a

reference to test whether this was the case and to determine if the biases we observed are in fact due to spatial non-independence.

Next, we fit the spatial occupancy model to each of these 200 simulated datasets. Figure 1b shows the 95% CIs of the posterior distributions of β2 and β3 for the first 20 of the simulated datasets under the spatial model and the corresponding 95% CIs for β2 and β3 estimated from the simulated datasets under the nonspatial model. If our estimates were unbiased and we were correctly estimating precision, our expectation was that, on average, 19 out of 20 times the true value would occur in the 95% CI.

When data are simulated without spatial autocorrelation, we see that the 95% CIs are well-calibrated, with the posteriors for α2, β2 and β3 all overlapping the true parameter used for simulation a large proportion of the time (93%, 96% and 98%, respectively; Figure A1j-l). However, when the data are simulated with spatial autocorrelation, we see that the credible intervals for β2 and β3 often do not overlap the true parameter (Figure A1g-i). This simulation study illuminates how spatial confounding, as described by Hodges and Reich (2010), Hanks et al. (2015), Silk et al. (2020) and others, can result in biased parameter estimates, especially when covariates are spatially smooth, as are the temperature and climate covariates associated with β2 and β3. The reason for this confounding boils down to the correlation between the covariate and the spatial autocorrelation (Hanks et al., 2015). The detection covariate associated with α2 is much less spatially smooth, as multiple different individuals (each with different levels of the detection covariate wo) often make observations at locations close in space. We see from the simulation results that spatial confounding is much less pronounced when covariates have less spatial structure, like the detection covariate in this example.

3.3 | Goodness-of-fit assessments

3.3.1 | Objectives

Simulation studies also have an important role in goodness-of-fit assessment. Goodness-of-fit assessments are used to determine if the statistical model applied in the analysis can generate the observed data. In the case of goodness-of-fit assessments, the simulated data are compared to the observed data to determine whether the data fit model assumptions. Examples of commonly used goodness-of-fit assessments include Bayesian posterior predictive checks (Kéry & Schaub, 2012) and those used in the DHARMa package (Hartig, 2020) in R for generalized mixed effects models. Note that the focus on fit using simulations has shifted from whether the estimates are reasonable given the true values of the parameters (step 3 in Figure B1.1) to whether the simulated data are a reasonable approximation of our true data.

Assessing the goodness of fit allows biologists to answer the following questions:

• Can my model replicate or reproduce the patterns in my observed data?
• Does my model do an adequate job of representing my observed data?

Note that lack of fit does not always mean an estimator will perform poorly (or vice versa), in part because goodness-of-fit assessments are highly dependent on sample size to identify lack of fit. If lack of fit is identified, simulation studies can be used to determine how robust the estimator is to violation of assumptions (see 'Assessing statistical robustness' section below).

3.3.2 | Simulation settings

The simulation settings for the goodness-of-fit assessment are identical to those presented above for the 'Determining statistical properties' section. That is, many datasets are simulated from the fitted model. When Bayesian approaches are used, typically simulations are conducted using values from iterations of the posterior distribution.

3.3.3 | Model fitting

For each simulated dataset, estimates of the dataset characteristics (e.g. variance, frequency of zeros, measures of normality) are calculated and the distribution of these values is compared to the observed dataset.

3.3.4 | Example

The appropriate goodness-of-fit test to use for a dataset will vary among applications. For occupancy models, a common approach to assessing goodness of fit is to use the Pearson Chi-square statistic (MacKenzie & Bailey, 2004), X² = Σ_{s,i} (zs − psi)² ∕ psi, where (as in Box 2) zs is the true latent occupancy of site s, and psi is the probability of detection at site s by observer i. This test focuses on determining whether the distribution of times a species is detected at a site is more variable than expected, and it can help identify unexplained variation in detection that can bias results. Previous work has shown that too much heterogeneity among sites can lead to bias in estimating occupancy probabilities (e.g. Ferguson et al., 2015; McNew & Handel, 2015). While in some cases the distribution of X² is known, and the calculated value of X² can be compared with theoretical critical values, we instead illustrate the more general situation where the distribution of the statistic of interest is unknown, and goodness-of-fit assessment is carried out using Monte Carlo methods.

We first calculated the chi-square statistic X̂² using posterior mean estimates for all parameters to obtain p̂si. We then simulated 1000 datasets using the posterior mean estimates of all parameters. Each of these datasets was fit using our Bayesian MCMC approach, and the resulting posterior means for each simulated dataset were used to compute corresponding Chi-square statistics.
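A Monte Carlo p-value of this kind compares the observed statistic against its simulated null distribution. A minimal sketch (the statistic values below are invented, and the add-one convention shown is one common choice, not necessarily the one the authors used):

```python
def monte_carlo_p(observed, simulated):
    """Proportion of null-simulated statistics at least as extreme as the
    observed one, with the add-one convention so p is never exactly 0."""
    n_extreme = sum(1 for s in simulated if s >= observed)
    return (n_extreme + 1) / (len(simulated) + 1)

# Ten hypothetical chi-square statistics from null simulations
sim_stats = [4.1, 5.7, 3.9, 6.2, 5.0, 4.8, 7.3, 5.5, 4.4, 6.0]
p_value = monte_carlo_p(observed=5.2, simulated=sim_stats)  # 6/11, ~0.55
```

A small p-value means the observed statistic sits in the upper tail of the null distribution, i.e. the data are more variable than the fitted model can generate.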

This provides 1000 samples of the chi-square statistic under the null hypothesis that our model is correct. The rank of X̂² compared to these values provides a Monte Carlo p-value to test the null hypothesis that our model is correct. For this dataset, the Monte Carlo p-value was 0.866, indicating that we do not have strong evidence to reject the null hypothesis that our model is reasonable for this data. If the p-value was small (e.g. <0.05), we would have evidence that our model is missing something to accurately capture the variation in the observed dataset. p-values higher than 0.05 (especially p-values much higher than 0.05) indicate that there is no strong evidence that our model is missing something important. We note that, while we are using Bayesian methods, our approach to this simulation study example and calculating p-values is frequentist. We are interested in the statistical properties of a particular statistic (the posterior mean), and thus we are not conducting posterior predictive inference (Gelman et al., 1995), but rather taking a frequentist approach, with the statistics of interest being estimated quantities of a Bayesian posterior distribution.

4 | GENERAL PROPERTY SIMULATIONS

In this section, we review general property simulations, which are used when the goal is to make general recommendations regarding the efficacy of applying a novel analytical approach for the design and analysis of future studies.

4.1 | Simulation-based study design

4.1.1 | Objectives

There are many reasons to perform a simulation-based assessment of study design. First, a biologist may want to understand how the estimator performs across a wide range of parameter values and sample sizes when selecting a new statistical approach to design and analyse data for a new study (e.g. DiRenzo, Miller, et al., 2022). This information could assist others in deciding whether to adopt a method to analyse existing data or design new studies with the approach in mind. The second goal of simulation-based study design falls under the category of a power analysis, with a goal of understanding the effect that sample size has on our power to detect non-zero parameters when they occur, on the accuracy of model parameter estimates or on the predictive performance of the model (e.g. Guillera-Arroita & Lahoz-Monfort, 2012). Biologists are routinely interested in examining the effect of sample size during the study design phase when there are concerns about being able to collect enough observations, especially as they relate to Type I and II error, which occurs regularly during the early stages of an ecological project when designing field studies. Lastly, biologists may use simulation-based study design to evaluate model performance under different study designs (e.g. Wright et al., 2022). In this case, we can explore how varying sampling designs, such as stratified vs random sampling designs, affect the sample size needed to ensure against Type I and II error, the model estimates of parameters in terms of accuracy and bias, and ecological inference across space and time.

4.1.2 | Simulation settings

Depending on the objectives of the simulation study, the values to be varied when simulating datasets may include the model parameters (p), the explanatory variables (X) or the sample sizes of the datasets (n). Varying model parameters allows for evaluation of how the estimator will perform across different ecological scenarios, while varying the distribution of the explanatory variables and the sample sizes will provide inference about optimal study design. Typically, a finite set of values across the range of the parameters are chosen, and many datasets (from 100s to 1000s) are simulated at each of these values. Alternatively, 'space-filling' designs can be used to sample across many combinations of parameter values (Carnell, 2022; DiRenzo, Miller, et al., 2022).

4.1.3 | Model fitting

The simulated datasets are fit to the model that the researcher plans to use for the analysis of the observations. The chosen metric for performance (e.g. root MSE as a measure of accuracy or standard error as a measure of precision) is calculated for each simulated dataset and summarized across the different parameter values that were varied.

4.1.4 | Example

In the context of our spatial occupancy model and Cape Weaver dataset, we consider a simulation study aimed at understanding how the sample size of a spatial occupancy dataset affects estimator performance, which can be used to inform future studies. Here, we were interested in determining the degree to which reducing sampling effort will affect inference from our hierarchical model. Understanding how sample size influences our ability to estimate parameters can guide future survey efforts to ensure that limited monitoring resources are properly allocated. For our simulation study, we considered simulations that included only a subset of the observations in our Cape Weaver dataset. First, we randomly subsampled N of the 9356 observations in the detection/non-detection dataset without replacement, with N varying from 100 up to 7000. We then simulated 100 independent occupancy datasets at each value of N. For each simulated dataset, we fit the spatial occupancy model and estimated the power for each parameter. We define power as the proportion of the posterior 95% credible intervals for each parameter that did not overlap zero and showed the same sign (i.e. positive or negative) as the true, simulated parameter value.

The results for varying dataset sample sizes are shown in Figure 1c and Figure A1m. We see that the effect of heterogeneous

detection (α2) and the temperature-related principal component (β2) are estimated well with a relatively small sample size (N < 1000), while the effect of the second principal component (β3) suggests that a relatively larger sample size is needed to consistently observe an estimate that does not include zero in the CI.

4.2 | Assessing statistical robustness

4.2.1 | Objectives

Simulation studies often focus on simulating datasets using a data generation model that matches the statistical model (e.g. if residual variation in the statistical model is assumed to be normally distributed, the data generating model for the simulations will use a normal distribution). However, it is also important to understand the effect of model misspecification, which occurs when the data generating model does not match the model used for statistical analysis (e.g. Dennis et al., 2019; Dey et al., 2022; DiRenzo, Miller, et al., 2022). Examples include cases where the dataset has extra sources of heterogeneity, there are distributional mismatches between the statistical model and the data generating model, or extra explanatory variables are not included in the statistical model. The goal of this type of simulation study is to determine how impactful such misspecifications are on our desired scientific inference (Table 1). These simulation study approaches address misspecification by considering the case where we define a particular form of misspecification (i.e. ignoring an important explanatory variable). This can provide insight into what forms of misspecification are most impactful on the specific aims of a given study.

4.2.2 | Simulation settings

Data are simulated using a data generating model that does not match the structure of the statistical model used for the scientific analysis. In most cases, these data are compared to data from a data generating model that matches the statistical model as a baseline for comparison. Similarly, it is possible to vary parameter values in the simulations (see 'Simulation-based study design' section) to determine how the study design affects the robustness of the estimator.

4.2.3 | Model fitting

The model fitting procedure is the same as the one described in the 'Simulation-based study design' section above.

4.2.4 | Example

To illustrate how a simulation study can help understand the effects of model misspecification, we conducted a simulation study to explore the effect that ignoring heterogeneity in detection probabilities has on estimating the occupancy parameters using our spatial occupancy model applied to the Cape Weaver dataset. We simulated 100 datasets from our fitted model, but where we also included heterogeneity in detection probabilities for each observer. We then fit our spatial occupancy model, which assumed homogeneous detection probabilities (i.e. all observers had an equal probability of observing a species when present), to this simulated data.

Figure 1d shows the posterior 95% credible intervals for 20 such simulated datasets. In the figure, we see that the credible intervals for most simulated datasets do not overlap the true value used for simulation, indicating that the model does a poor job of recovering true parameter values when we ignore heterogeneity in detection probability. This result highlights the importance of testing for detection heterogeneity using goodness-of-fit methods (see Section 3 'Study-specific simulation studies') and for models that address heterogeneity when it occurs.

4.3 | Comparing the efficacy of different modelling approaches

4.3.1 | Objectives

In many cases, multiple modelling approaches will be available to estimate parameters of interest, and simulations can be used as a tool to compare the efficacy and robustness of the different modelling approaches. Here, a common goal is to compare a proposed approach with an existing approach (i.e. a more simplified version of a model) for estimating parameters or testing hypotheses (Table 1). Often, this is in conjunction with studying the effects of model misspecification (see 'Assessing statistical robustness' above), as we are considering comparing existing approaches to a novel approach that models more complexity.

4.3.2 | Simulation settings

The user can set the simulation settings to compare the efficacy of different modelling approaches to match the most important outcomes of the planned study. For example, simulation settings can follow those of the section 'Determining statistical properties' if interest is in a single set of parameters, 'Simulation-based study design' if interest is in performance across different parameters, or 'Assessing statistical robustness' if interest is in determining whether one approach is more robust to violations.

4.3.3 | Model fitting

For each dataset that is simulated, the data are analysed using multiple statistical models (e.g. DiRenzo, Miller, et al., 2022). Results across simulated datasets are summarized for each statistical model and their output is compared. The comparison among statistical
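The workflow just described (simulate many datasets, analyse each one with several statistical models, then summarize results across datasets) can be sketched in a few lines. The example below is an illustration only: it is written in Python purely for self-containment (the article's own supplements use R), and the generating model, sample sizes and the two competing analyses (a pooled estimator that ignores site-level heterogeneity versus one based on site means) are invented for this sketch rather than taken from the case study.

```python
import random
import statistics

def simulate_dataset(mu, tau, sigma, n_sites, n_per_site):
    """Generating model: site-level heterogeneity (sd tau) plus
    observation noise (sd sigma) around a grand mean mu."""
    data = []
    for _ in range(n_sites):
        site_effect = random.gauss(0.0, tau)
        data.append([mu + site_effect + random.gauss(0.0, sigma)
                     for _ in range(n_per_site)])
    return data

def fit_pooled(data):
    """Analysis 1: ignores site heterogeneity, treats all observations as iid."""
    obs = [y for site in data for y in site]
    est = statistics.mean(obs)
    se = statistics.stdev(obs) / len(obs) ** 0.5
    return est, se

def fit_site_means(data):
    """Analysis 2: respects the grouping by averaging within sites first."""
    means = [statistics.mean(site) for site in data]
    est = statistics.mean(means)
    se = statistics.stdev(means) / len(means) ** 0.5
    return est, se

def compare(n_sims=500, mu=2.0, tau=1.0, sigma=1.0, n_sites=30, n_per_site=5):
    """Fit both analyses to each simulated dataset; summarize MSE and
    95% CI coverage across datasets for each analysis."""
    results = {"pooled": {"sq_err": [], "cover": []},
               "site_means": {"sq_err": [], "cover": []}}
    for _ in range(n_sims):
        data = simulate_dataset(mu, tau, sigma, n_sites, n_per_site)
        for name, fit in [("pooled", fit_pooled),
                          ("site_means", fit_site_means)]:
            est, se = fit(data)
            results[name]["sq_err"].append((est - mu) ** 2)
            # coverage: '1' if the 95% CI covers the true value, '0' if not
            results[name]["cover"].append(
                1 if est - 1.96 * se <= mu <= est + 1.96 * se else 0)
    return {name: {"MSE": statistics.mean(r["sq_err"]),
                   "coverage": statistics.mean(r["cover"])}
            for name, r in results.items()}

if __name__ == "__main__":
    random.seed(42)
    for name, metrics in compare().items():
        print(f"{name:>10}: MSE = {metrics['MSE']:.3f}, "
              f"coverage = {metrics['coverage']:.2f}")
```

Because the two analyses happen to share the same point estimate here, their MSEs agree; the contrast shows up in coverage, where the pooled analysis understates uncertainty and its nominal 95% intervals cover the true value well below 95% of the time. These are the same performance measures (MSE and coverage) discussed in the text.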
The comparison among statistical models is dependent on the practitioner and their needs. Widely used measures of model performance include: MSE (where the mean square difference between truth and a point estimate is calculated), mean square predictive error (where the mean square predictive difference between truth at an unobserved point and the statistical estimate at the unobserved point is calculated) and coverage (where a '1' is assigned if the 95% CI covers the true value and '0' if it does not).

4.3.4 | Example

In our case study of the Cape Weaver dataset, we have focused on a single specific model to account for spatial autocorrelation in the data. To this point, we modelled spatial autocorrelation in occupancy probabilities using a zero-mean Gaussian spatial random effect with an intrinsic conditional autoregressive (ICAR) precision matrix (Box 2). There are many ways to model spatial correlation, and even within the method we chose, the settings can be varied. We now consider different spatial models for our dataset, to see how performance varies based on our choice of statistical model. As is common in the spatial statistical literature, we considered a basis function approximation to this spatial random effect (Cressie et al., 2022), with basis functions being the first m eigenvectors of the inverse of the ICAR precision matrix. These m basis vectors capture the most possible variation in the spatial random effect using only m vectors. For our initial data analysis, we chose m = 100 basis vectors (shown by the vertical line in Figure 1e). This choice was based on a comparison of the results from a varying set of basis vectors. We let m vary from 2 to 500 and fit the model.

Figure 1e shows the posterior mean and 95% credible intervals for different model parameters as a function of m. As m increases, the computational complexity of the model increases and the time to fit increases as well (see Appendix Figure A1). These figures show that when m < 50, the posteriors for multiple parameters are very different from those for larger values of m, but when m > 75, the posteriors are all reasonably similar. This simulation study shows that our approximate approach to modelling spatial autocorrelation by keeping only m eigenvectors provides a good approximation to the full model when m > 75.

5 | CONCLUSIONS

We delineate the uses and purposes of simulation studies to understand and validate hierarchical models. In doing so, we propose a new status quo for reporting of the properties of new and complex statistical models, as well as when applying them to routinely used statistical models. We encourage all studies that use a novel estimation procedure (whether it be an extension of existing methods, development of new code or employing a new procedure for fitting the model) to include at least a basic validation simulation. This step will help avoid many potential pitfalls in fitting a new model (e.g. errors in coding, parameter identifiability issues). In addition, the inclusion of model code increases reproducibility and transparency in an age where open science in ecology and evolutionary biology is gaining traction (Powers & Hampton, 2019). By also including code that simulates a dataset and fits it to a statistical model, authors will open doors to understanding the assumed data generation process underlying our statistical inferences. Simulations have an integral role in testing the reliability and limits of statistical inference, providing information about a statistical model's ability to recover accurate, precise, unbiased parameter estimates (Kéry & Royle, 2016, 2021; Kéry & Schaub, 2012). Simulations can also help answer many common questions asked by biologists, such as 'how many samples do I collect?', 'which model do I use to analyse my data?' and 'does my model do an adequate job of representing my data?'. Simulations may therefore become an integral part of a biologist's tool kit.

AUTHOR CONTRIBUTIONS
Graziella V. DiRenzo, Ephraim Hanks and David A. W. Miller worked together to write the first draft. All co-authors edited the manuscript.

ACKNOWLEDGEMENTS
We thank L. M. Browne for constructive comments on a previous version of this manuscript. Any use of trade, firm or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

CONFLICT OF INTEREST
We have no conflicts of interest.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.14030.

DATA AVAILABILITY STATEMENT
All data for analyses can be found in the Dryad Data Repository: https://doi.org/10.5061/dryad.jt2002k (Clark & Altwegg, 2019a). All code for analyses can be found in the Supplement S1 R code and: https://doi.org/10.5066/P99B0IJ7 (DiRenzo, Hanks, et al., 2022).

ORCID
Graziella V. DiRenzo https://orcid.org/0000-0001-5264-4762
Ephraim Hanks https://orcid.org/0000-0003-0345-7164
David A. W. Miller https://orcid.org/0000-0002-3011-3677

REFERENCES
Barraquand, F., Ezard, T. H. G., Jørgensen, P. S., Zimmerman, N., Chamberlain, S., Salguero-Gómez, R., Curran, T. J., & Poisot, T. (2014). Lack of quantitative training among early-career ecologists: A survey of the problem and potential solutions. PeerJ, 2014(1), 1–14. https://doi.org/10.7717/peerj.285
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Bellier, E., Kéry, M., & Schaub, M. (2016). Simulation-based assessment of dynamic N-mixture models in the presence of density dependence and environmental stochasticity. Methods in Ecology and Evolution, 7(9), 1029–1040. https://doi.org/10.1111/2041-210X.12572
Bergmeir, C., Hyndman, R. J., & Koo, B. (2018). A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics and Data Analysis, 120, 70–83. https://doi.org/10.1016/j.csda.2017.11.003
Bolker, B. M., Brooks, M. E., Clark, C. J., Geange, S. W., Poulsen, J. R., Stevens, M. H. H., & White, J.-S. S. (2009). Generalized linear mixed models: A practical guide for ecology and evolution. Trends in Ecology & Evolution, 24(3), 127–135. https://doi.org/10.1016/j.tree.2008.10.008
Brown, A. W., Kaiser, K. A., & Allison, D. B. (2018). Issues with data and analyses: Errors, underlying themes, and potential solutions. Proceedings of the National Academy of Sciences of the United States of America, 115(11), 2563–2570. https://doi.org/10.1073/pnas.1708279115
Buderman, F. E., Devries, J. H., & Koons, D. N. (2020). Changes in climate and land use interact to create an ecological trap in a migratory species. Journal of Animal Ecology, 89(8), 1961–1977.
Carnell, R. (2022). lhs: Latin hypercube samples. R package version 1.1.5.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1–32. https://doi.org/10.18637/jss.v076.i01
Clark, A. E., & Altwegg, R. (2019a). Data from: Efficient Bayesian analysis of occupancy models with logit link functions. Dryad, Dataset. https://doi.org/10.5061/dryad.jt2002k
Clark, A. E., & Altwegg, R. (2019b). Efficient Bayesian analysis of occupancy models with logit link functions. Ecology and Evolution, 9(2), 756–768. https://doi.org/10.1002/ece3.4850
Conn, P. B., Johnson, D. S., Williams, P. J., Melin, S. R., & Hooten, M. B. (2018). A guide to Bayesian model checking for ecologists. Ecological Monographs, 88(4), 526–542. https://doi.org/10.1002/ecm.1314
Cressie, N., Sainsbury-Dale, M., & Zammit-Mangion, A. (2022). Basis-function models in spatial statistics. Annual Review of Statistics and Its Application, 9(1), 1–28. https://doi.org/10.1146/annurev-statistics-040120-020733
de Valpine, P., Paciorek, C., Turek, D., Michaud, N., Anderson-Bergman, C., Obermeyer, F., Wehrhahn Cortes, C., Rodríguez, A., Temple Lang, D., & Paganin, S. (2022a). NIMBLE: MCMC, particle filtering, and programmable hierarchical modeling. R package version 0.12.2. https://doi.org/10.5281/zenodo.1211190, https://cran.r-project.org/package=nimble
de Valpine, P., Paciorek, C., Turek, D., Michaud, N., Anderson-Bergman, C., Obermeyer, F., Wehrhahn Cortes, C., Rodríguez, A., Temple Lang, D., & Paganin, S. (2022b). NIMBLE user manual. R package manual version 0.12.2. https://doi.org/10.5281/zenodo.1211190, https://r-nimble.org
de Valpine, P., Turek, D., Paciorek, C., Anderson-Bergman, C., Temple Lang, D., & Bodik, R. (2017). Programming with models: Writing statistical algorithms for general model structures with NIMBLE. Journal of Computational and Graphical Statistics, 26, 403–413. https://doi.org/10.1080/10618600.2016.1172487
Dennis, B., Ponciano, J. M., Taper, M. L., & Lele, S. R. (2019). Errors in statistical inference under model misspecification: Evidence, hypothesis testing, and AIC. Frontiers in Ecology and Evolution, 7, 372. https://doi.org/10.3389/fevo.2019.00372
Dey, S., Bischof, R., Dupont, P. P. A., & Milleret, C. (2022). Does the punishment fit the crime? Consequences and diagnosis of misspecified detection functions in Bayesian spatial capture–recapture modeling. Ecology and Evolution, 12(2), 1–15. https://doi.org/10.1002/ece3.8600
DiRenzo, G. V., Hanks, E., & Miller, D. A. W. (2022). A practical guide to understanding and validating complex models using data simulations. Version v1.0.0. U.S. Geological Survey Software Release. https://doi.org/10.5066/P99B0IJ7
DiRenzo, G. V., Miller, D. A. W., & Grant, E. H. C. (2022). Ignoring species availability biases occupancy estimates in single-scale occupancy models. Methods in Ecology and Evolution, 13, 1790–1804. https://doi.org/10.1111/2041-210X.13881
Ferguson, P. F. B., Conroy, M. J., & Hepinstall-Cymerman, J. (2015). Occupancy models for data with false positive and false negative errors and heterogeneity across sites and surveys. Methods in Ecology and Evolution, 6(12), 1395–1406. https://doi.org/10.1111/2041-210X.12442
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1995). Bayesian data analysis. Chapman and Hall/CRC.
Guillera-Arroita, G., & Lahoz-Monfort, J. J. (2012). Designing studies to detect differences in species occupancy: Power analysis under imperfect detection. Methods in Ecology and Evolution, 3(5), 860–869. https://doi.org/10.1111/j.2041-210X.2012.00225.x
Guzman, L. M., Johnson, S. A., Mooers, A. O., & M'Gonigle, L. K. (2021). Using historical data to estimate bumble bee occurrence: Variable trends across species provide little support for community-level declines. Biological Conservation, 257(July 2020), 109141. https://doi.org/10.1016/j.biocon.2021.109141
Hanks, E. M., Schliep, E. M., Hooten, M. B., & Hoeting, J. A. (2015). Restricted spatial regression in practice: Geostatistical models, confounding, and robustness under model misspecification. Environmetrics, 26(4), 243–254. https://doi.org/10.1002/env.2331
Harrison, X. A., Donaldson, L., Correa-Cano, M. E., Evans, J., Fisher, D. N., Goodwin, C. E. D., Robinson, B. S., Hodgson, D. J., & Inger, R. (2018). A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ, 6, e4794. https://doi.org/10.7717/peerj.4794
Hartig, F. (2020). DHARMa: Residual diagnostics for hierarchical (multi-level/mixed) regression models. R package version 0.3.3.0.
Hodges, J. S., & Reich, B. J. (2010). Adding spatially-correlated errors can mess up the fixed effect you love. American Statistician, 64(4), 325–334. https://doi.org/10.1198/tast.2010.10052
Hooten, M. B., & Hobbs, N. T. (2015). A guide to Bayesian model selection for ecologists. Ecological Monographs, 85(1), 3–28. https://doi.org/10.1890/07-1861.1
Joseph, M. B. (2020). Neural hierarchical models of ecological populations. Ecology Letters, 23(4), 734–747. https://doi.org/10.1111/ele.13462
Kendall, W., & Nichols, J. D. (1995). On the use of secondary capture–recapture samples to estimate temporary emigration and breeding proportions. Journal of Applied Statistics, 22(5–6), 751–762. https://doi.org/10.1080/02664769524595
Kendall, W. L., Nichols, J. D., & Hines, J. E. (1997). Estimating temporary emigration using capture-recapture data with Pollock's robust design. Ecology, 78(2), 563–578. https://doi.org/10.1890/0012-9658(1997)078[0563:ETEUCR]2.0.CO;2
Kéry, M., & Royle, J. A. (2016). Applied hierarchical modeling in ecology: Analysis of distribution, abundance and species richness in R and BUGS: Volume 1: Prelude and static models. Academic Press.
Kéry, M., & Royle, J. A. (2021). Applied hierarchical modeling in ecology: Analysis of distribution, abundance and species richness in R and BUGS: Volume 2: Dynamic and advanced models. Academic Press.
Kéry, M., & Schaub, M. (2012). Bayesian population analysis using WinBUGS: A hierarchical perspective. Elsevier.
Koehler, E., Brown, E., & Haneuse, S. J. P. A. (2009). On the assessment of Monte Carlo error in simulation-based statistical analyses. American Statistician, 63(2), 155–162. https://doi.org/10.1198/tast.2009.0030
Lahiri, S. N. (2005). Consistency of the jackknife-after-bootstrap variance estimator for the bootstrap quantiles of a studentized statistic. Annals of Statistics, 33(5), 2475–2506. https://doi.org/10.1214/009053605000000507
Link, W. A., Schofield, M. R., Barker, R. J., & Sauer, J. R. (2018). On the robustness of N-mixture models. Ecology, 99(7), 1547–1551. https://doi.org/10.1002/ecy.2362
Loh, J. M., & Stein, M. L. (2004). Bootstrapping a spatial point process. Statistica Sinica, 14(1), 69–101.
MacKenzie, D. I., & Bailey, L. L. (2004). Assessing the fit of site-occupancy models. Journal of Agricultural, Biological, and Environmental Statistics, 9(3), 300–318. https://doi.org/10.1198/108571104X3361
MacKenzie, D. I., Nichols, J. D., Lachman, G. B., Droege, S., Royle, J. A., & Langtimm, C. A. (2002). Estimating site occupancy rates when detection probabilities are less than one. Ecology, 83, 2248–2255.
MacKenzie, D. I., & Royle, J. A. (2005). Designing occupancy studies: General advice and allocating survey effort. Journal of Applied Ecology, 42(6), 1105–1114. https://doi.org/10.1111/j.1365-2664.2005.01098.x
McClintock, B. T. (2021). Worth the effort? A practical examination of random effects in hidden Markov models for animal telemetry data. Methods in Ecology and Evolution, 12(8), 1475–1497. https://doi.org/10.1111/2041-210X.13619
McNew, L. B., & Handel, C. M. (2015). Evaluating species richness: Biased ecological inference results from spatial heterogeneity in detection probabilities. Ecological Applications, 25(6), 1669–1680. https://doi.org/10.1890/14-1248.1
Miller, D. A., & Grant, E. H. C. (2015). Estimating occupancy dynamics for large-scale monitoring networks: Amphibian breeding occupancy across protected areas in the Northeast United States. Ecology and Evolution, 5(21), 4735–4746.
Muff, S., Signer, J., & Fieberg, J. (2020). Accounting for individual-specific variation in habitat-selection studies: Efficient estimation of mixed-effects models using Bayesian or frequentist computation. Journal of Animal Ecology, 89(1), 80–92. https://doi.org/10.1111/1365-2656.13087
Olivetti, S., Gil, M. A., Sridharan, V. K., & Hein, A. M. (2021). Merging computational fluid dynamics and machine learning to reveal animal migration strategies. Methods in Ecology and Evolution, 12(7), 1186–1200.
Paciorek, C. (2010). The importance of scale for spatial-confounding bias and precision of spatial regression estimators. Statistical Science, 25(1), 107–125. https://doi.org/10.1214/10-STS326
Pagel, J., & Schurr, F. M. (2012). Forecasting species ranges by statistical estimation of ecological niches and spatial population dynamics. Global Ecology and Biogeography, 21(2), 293–304. https://doi.org/10.1111/j.1466-8238.2011.00663.x
Palencia, P., Fernández-López, J., Vicente, J., & Acevedo, P. (2021). Innovations in movement and behavioural ecology from camera traps: Day range as model parameter. Methods in Ecology and Evolution, 12, 1201–1212. https://doi.org/10.1111/2041-210X.13609
Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd International Workshop on Distributed Statistical Computing. Vol. 124, No. 125.10.
Powers, S. M., & Hampton, S. E. (2019). Open science, reproducibility, and transparency in ecology. Ecological Applications, 29, e01822.
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Revell, L. J., & Harmon, L. J. (2022). Phylogenetic comparative methods in R. Princeton University Press.
Rizzo, M. L. (2019). Statistical computing with R. Chapman and Hall/CRC.
Rossman, S., Yackulic, C. B., Saunders, S. P., Reid, J., Davis, R., & Zipkin, E. F. (2016). Dynamic N-occupancy models: Estimating demographic rates and local abundance from detection-nondetection data. Ecology, 97(12), 3300–3307. https://doi.org/10.1002/ecy.1598
Runting, R. K., Phinn, S., Xie, Z., Venter, O., & Watson, J. E. M. (2020). Opportunities for big data in conservation and sustainability. Nature Communications, 11, 1–4.
Santos-Fernandez, E., & Mengersen, K. (2021). Understanding the reliability of citizen science observational data using item response models. Methods in Ecology and Evolution, 12(8), 1533–1548. https://doi.org/10.1111/2041-210X.13623
Silk, M. J., Harrison, X. A., & Hodgson, D. J. (2020). Perils and pitfalls of mixed-effects regression models in biology. PeerJ, 8, e9522. https://doi.org/10.7717/peerj.9522
Smith, S. M., Stayton, C. T., & Angielczyk, K. D. (2021). How many trees to see the forest? Assessing the effects of morphospace coverage and sample size in performance surface analysis. Methods in Ecology and Evolution, 12(8), 1411–1424.
Soroye, P., Newbold, T., & Kerr, J. (2020). Climate change contributes to widespread declines among bumble bees across continents. Science, 367(6478), 685–688. https://doi.org/10.1126/science.aax8591
Sturtz, S., Ligges, U., & Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software, 12, 1–16.
Tingley, M. W., Nadeau, C. P., & Sandor, M. E. (2020). Multi-species occupancy models as robust estimators of community richness. Methods in Ecology and Evolution, 11(5), 633–642. https://doi.org/10.1111/2041-210X.13378
Tyre, A. J., Tenhumberg, B., Field, S. A., Niejalke, D., Parris, K., & Possingham, H. P. (2003). Improving precision and reducing bias in biological surveys: Estimating false-negative error rates. Ecological Applications, 13(6), 1790–1801. https://doi.org/10.1890/02-5078
Weber, F., Knapp, G., Glass, Ä., Kundt, G., & Ickstadt, K. (2021). Interval estimation of the overall treatment effect in random-effects meta-analyses: Recommendations from a simulation study comparing frequentist, Bayesian, and bootstrap methods. Research Synthesis Methods, 12(3), 291–315. https://doi.org/10.1002/jrsm.1471
Wright, A. D., Campbell Grant, E. H., & Zipkin, E. F. (2022). A comparison of monitoring designs to assess wildlife community parameters across spatial scales. Ecological Applications, 32(6), 1–13. https://doi.org/10.1002/eap.2621
Zipkin, E. F., Rossman, S., Yackulic, C. B., Wiens, J. D., Thorson, J. T., Davis, R. J., & Grant, E. H. C. (2017). Integrating count and detection–nondetection data to model population dynamics. Ecology, 98(6), 1640–1650. https://doi.org/10.1002/ecy.1831

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: DiRenzo, G. V., Hanks, E., & Miller, D. A. W. (2023). A practical guide to understanding and validating complex models using data simulations. Methods in Ecology and Evolution, 14, 203–217. https://doi.org/10.1111/2041-210X.14030