
Bayesian Compendium



Marcel van Oijen

Bayesian Compendium

Marcel van Oijen
Edinburgh, UK

ISBN 978-3-030-55896-3 ISBN 978-3-030-55897-0 (eBook)


https://doi.org/10.1007/978-3-030-55897-0
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

Why This Book?

The recurring topic during my 35 years in science has been a struggle with
uncertainties. My primary field is agricultural, environmental and ecological science using both process-based and statistical models. And in each of my studies,
uncertainties in data and models have kept cropping up. I found that the literature
contains many ways for classifying and analysing these uncertainties, but they
mostly seem based on arbitrary criteria: a confusing menagerie of methods.
My encounters with probability theory, and with Bayesian methods in particular,
showed that there is no need for confusion; there is a general perspective that can be
applied in every case. The online papers by Edwin Jaynes, and his posthumously
published book (2003), followed by the more practice-oriented small book by Sivia
(2006) did much to clarify matters. Very striking was Jaynes’ explanation of the
Cox postulates, which proved that consistent rational thinking requires the use
of the rules of probability theory. Jaynes and Sivia took many of their examples
from physics, but it was clear that Bayesian probabilistic methods are completely
generic and can be applied in any field. It also became clear that the methods do
come with some practical problems. They require researchers to think carefully
about their prior knowledge before embarking on an analysis. And the Bayesian
analysis itself tends to be computationally demanding. Much research effort has
gone into resolving these issues, and good methods are available.
So I have happily employed Bayesian methods during the past two decades.
However, when teaching the approach to others, I found that there was no short yet
comprehensive introductory text for scientists who do not have much statistical
background. Especially amongst dynamic modellers, the study of statistical methods is not given high priority. But this book is intended to provide a quick and easy
entry into Bayesian methods to all modellers, irrespective of the types of models
and data that they work with.


Who is This Book for?

This book is primarily for scientists who are newcomers to Bayesian methods, and
who want to use data to parameterise and compare their models while accounting
for uncertainties in data, model parameters and model structure. There are of course
excellent books on Bayesian methods for beginners, but they tend to be very long
texts that neglect the dynamic modeller. There is a need for short and easily
digestible descriptions of Bayesian methods that cover the wide range of
process-based and empirical models that are in common use. All chapters in this
book are short, so it is more a compendium than a detailed manual.
The book is for every scientist who works with models and data, irrespective
of the type of model or data. Although aimed at beginners, I hope that more advanced
users of Bayesian methods will also find material of interest in this book. They may
be unfamiliar with some of the more specialized topics, or find the book useful for
teaching. They may also be interested in the way this book tries to clarify the
terminological confusion that reigns in scientific modelling and in applied statistics.

What is in the Book?

The book begins with a gentle introduction to Bayesian thinking. Once the basic
ideas are in place, it provides explanations of a very wide range of modelling
methods in three languages: English, mathematics and R. I believe that a mixture
of the three languages, complemented with figures, is the best way to teach
quantitative science. You may find English too woolly, mathematics too compact or
R-code too problem-specific. But together they may provide clear, reasonably short
and practical explanations. Some exercises are provided as well.
The code examples in this book are all short, and they mostly use basic R that
can easily be understood even if you are used to programming in a different
computer language. R does come with a wealth of optional add-ons, called
R-packages, but those will be used only occasionally. All essential code is in the
book itself, and the complete code is available online (https://www.springer.com/de/book/9783030558963).
Almost every example will make use of very simple artificial data sets. I mostly
leave out the messiness of real data sets because they are distracting and obscure the
view of how the methods work. Likewise, I focus on brevity and simplicity of code,
rather than on speed of execution. The aim is to explain the diversity of Bayesian
computational techniques that you can find in the literature. The algorithms that we
show, such as MCMC, can be used with any kind of model and data. In fact, one
of the main advantages of Bayesian statistical modelling is that all idiosyncrasies of
data, such as missing data, can be easily handled.

Outline of Chapters

This book has six sections, with the following contents:


• The basics: Chaps. 1–12.
– In these chapters, we introduce the basics of Bayesian thinking to you. In
terms of abstract concepts, there is not too much to learn: it’s all about using
a prior distribution and likelihood function to derive the posterior distribution. For simple models, the mathematics is very easy. But practical application of Bayesian methods can be harder if our models or data are complicated. We then rely on computer algorithms to find representative
samples from the posterior distribution.
• Advanced material: Chaps. 13–18.
– These chapters introduce methods that help us with more complicated prior
distributions and model structures. We also show how Bayesian thinking
helps with risk analysis and decision making.
• Specific models: Chaps. 19–23.
– The models that we discuss in these chapters are quite simple functions, or
built from combinations of simple functions. But the models are highly
versatile and widely applicable, even in situations where we do not know
beforehand what the shape of our function of interest is!
• Outlook to the future: Chap. 24.
– This chapter reviews the state of Bayesian modelling and identifies trends of
future methodological development. We give a few pointers for further
self-study.
• Appendices with tips and tricks.
– In these appendices, we present the notation we use in this book, and very
briefly summarize relevant mathematics, probability theory and software.
• Solutions to exercises, index, references.

Acknowledgements

Science is a highly collaborative affair, and over the years I have discussed
Bayesian methods with a great number of people. I believe that most of them know
how grateful I am to them, so I will do some cluster analysis and thank people by
group—mentioning just a few names explicitly. I thank many statisticians in the
Netherlands and the UK, who taught me over the years, including Ron Smith and
Jonathan Rougier who were co-authors on my first Bayesian paper. To the present
day, my publications with a Bayesian or otherwise probabilistic focus have been with about 140 different co-authors: a large group of inspiring people! Further
inspiration came from colleagues in projects Nitro-Europe, Carbo-Extreme,
GREENHOUSE, MACSUR, UK-SCaPE, PRAFOR and various COST-Actions,
and from the Cambridge gang of July 2019. I thank my colleagues in the UK,
Netherlands, Belgium, Germany, France, Portugal, Hungary, Norway and Finland
who invited me to teach, lecture or work on Bayesian methods, and the many
students. Mats Höglind gave me the opportunity for a sabbatical year in Norway where I spent much time with the writings of Edwin Jaynes. I enjoyed organizing a Bayesian
course in Edinburgh because of the happy collaboration with Lindsay Banin, Kate
Searle, David Cameron and Peter Levy, and the enthusiasm of the students. David
and Peter have been my main discussion partners over the years, so my extra special
thanks go to them.
Finally, with more than gratitude, my wife Netty.

Edinburgh, UK Marcel van Oijen


Contents

1 Introduction to Bayesian Thinking
  1.1 Bayesian Thinking
  1.2 A Murder Mystery
  1.3 Bayes' Theorem
    1.3.1 Implications of Bayes' Theorem
    1.3.2 The Odds-Form of Bayes' Theorem and a Simple Application

2 Introduction to Bayesian Science
  2.1 Measuring, Modelling and Science: The Three Basic Equations
  2.2 Terminological Confusion
  2.3 Process-Based Models Versus Empirical Models
  2.4 Errors and Uncertainties in Modelling
    2.4.1 Errors and Uncertainties in Model Drivers
    2.4.2 Errors and Uncertainties in Model Parameters
    2.4.3 Errors and Uncertainties in Model Structure
    2.4.4 Forward Propagation of Uncertainty to Model Outputs
  2.5 Bayes and Science
  2.6 Bayesian Parameter Estimation

3 Assigning a Prior Distribution
  3.1 Quantifying Uncertainty and MaxEnt
  3.2 Final Remarks for Priors

4 Assigning a Likelihood Function
  4.1 Expressing Knowledge About Data Error in the Likelihood Function
  4.2 What to Measure

5 Deriving the Posterior Distribution
  5.1 Analytically Solving Bayes' Theorem: Conjugacy
  5.2 Numerically 'Solving' Bayes' Theorem: Sampling-Based Methods
  Exercises

6 Sampling from Any Distribution by MCMC
  6.1 MCMC
  6.2 MCMC in Two Lines of R-Code
  6.3 The Metropolis Algorithm
  Exercise

7 Sampling from the Posterior Distribution by MCMC
  7.1 MCMC and Bayes
    7.1.1 MCMC and Models
    7.1.2 The Need for Log-Transformations in MCMC
  7.2 Bayesian Calibration of a 2-Parameter Model Using the Metropolis Algorithm
    7.2.1 The Metropolis Algorithm
    7.2.2 Failed Application of MCMC Using the Default Settings
  7.3 Bayesian Calibration of a 3-Parameter Model Using the Metropolis Algorithm
  7.4 More MCMC Diagnostics
  Exercises

8 Twelve Ways to Fit a Straight Line
  8.1 Hidden Equivalences
  8.2 Our Data
  8.3 The Normal Equations for Ordinary Least Squares Regression (OLS)
    8.3.1 Uncertainty Quantification
  8.4 Regression Using Generalised Least Squares (GLS)
    8.4.1 From GLS to WLS and OLS
  8.5 The Lindley and Smith (LS72) Equations
  8.6 Regression Using the Kalman Filter
  8.7 Regression Using the Conditional Multivariate Gaussian
  8.8 Regression Using Graphical Modelling (GM)
  8.9 Regression Using a Gaussian Process (GP)
  8.10 Regression Using Accept-Reject Sampling
  8.11 Regression Using MCMC with the Metropolis Algorithm
  8.12 Regression Using MCMC with Gibbs Sampling
  8.13 Regression Using JAGS
  8.14 Comparison of Methods
  Exercises

9 MCMC and Complex Models
  9.1 Process-Based Models (PBMs)
    9.1.1 A Simple PBM for Vegetation Growth: The Expolinear Model
  9.2 Bayesian Calibration of the Expolinear Model
  9.3 More Complex Models

10 Bayesian Calibration and MCMC: Frequently Asked Questions
  10.1 The MCMC Algorithm
  10.2 Data and Likelihood Function
  10.3 Parameters and Prior
  10.4 Code Efficiency and Computational Issues
  10.5 Results from the Bayesian Calibration

11 After the Calibration: Interpretation, Reporting, Visualization
  11.1 Interpreting the Posterior Distribution and Model Diagnostics
  11.2 Reporting
  11.3 Visualising Uncertainty

12 Model Ensembles: BMC and BMA
  12.1 Model Ensembles, Integrated Likelihoods and Bayes Factors
  12.2 Bayesian Model Comparison (BMC)
  12.3 Bayesian Model Averaging (BMA)
  12.4 BMC and BMA of Two Process-Based Models
    12.4.1 EXPOL5 and EXPOL6
    12.4.2 Bayesian Calibration of EXPOL6's Parameters
    12.4.3 BMC and BMA of EXPOL5 and EXPOL6

13 Discrepancy
  13.1 Treatment of Discrepancy in Single-Model Calibration
  13.2 Treatment of Discrepancy in Model Ensembles

14 Gaussian Processes and Model Emulation
  14.1 Model Emulation
  14.2 Gaussian Processes (GP)
  14.3 An Example of Emulating a One-Input, One-Output Model
    14.3.1 Analytical Formulas for GP-calibration and Prediction
    14.3.2 Using R-Package geoR for GP-calibration and Prediction
  14.4 An Example of Emulating a Process-Based Model (EXPOL6)
    14.4.1 Training Set
    14.4.2 Calibration of the Emulator
    14.4.3 Testing the Emulator
  14.5 Comments on Emulation
  Exercise

15 Graphical Modelling (GM)
  15.1 Gaussian Bayesian Networks (GBN)
    15.1.1 Conditional Independence
  15.2 Three Mathematically Equivalent Specifications of a Multivariate Gaussian
    15.2.1 Switching Between the Three Different Specifications of the Multivariate Gaussian
  15.3 The Simplest DAG Is the Causal One!
  15.4 Sampling from a GBN and Bayesian Updating
    15.4.1 Updating a GBN When Information About Nodes Becomes Available
  15.5 Example I: A 4-Node GBN Demonstrating DAG Design, Sampling and Updating
  15.6 Example II: A 5-Node GBN in the form of a Linear Chain
  15.7 Examples III & IV: All Relationships in a GBN are Linear
    15.7.1 Example III: A GBN Representing Univariate Linear Dependency
    15.7.2 Example IV: A GBN Representing Multivariate Stochastic Linear Relations
  15.8 Example V: GBNs can do Geostatistical Interpolation
  15.9 Comments on Graphical Modelling
  Exercises

16 Bayesian Hierarchical Modelling (BHM)
  16.1 Why Hierarchical Modelling?
  16.2 Comparing Non-hierarchical and Hierarchical Models
    16.2.1 Model A: Global Intercept and Slope, Not Hierarchical
    16.2.2 Model B: Cv-Specific Intercepts and Slopes, Not Hierarchical
    16.2.3 Model C: Cv-Specific Intercepts and Slopes, Hierarchical
    16.2.4 Comparing Models A, B and C
  16.3 Applicability of BHM
  Exercise

17 Probabilistic Risk Analysis and Bayesian Decision Theory
  17.1 Risk, Hazard and Vulnerability
    17.1.1 Theory for Probabilistic Risk Analysis (PRA)
  17.2 Bayesian Decision Theory (BDT)
    17.2.1 Value of Information
  17.3 Graphical Modelling as a Tool to Support BDT

18 Approximations to Bayes
  18.1 Approximations to Bayesian Calibration
  18.2 Approximations to Bayesian Model Comparison

19 Linear Modelling: LM, GLM, GAM and Mixed Models
  19.1 Linear Models
  19.2 LM
  19.3 GLM
  19.4 GAM
  19.5 Mixed Models
  19.6 Parameter Estimation
    19.6.1 Software

20 Machine Learning
  20.1 The Family Tree of Machine Learning Approaches
  20.2 Neural Networks
    20.2.1 Bayesian Calibration of a Neural Network
    20.2.2 Preventing Overfitting
  20.3 Outlook for Machine Learning
  Exercises

21 Time Series and Data Assimilation
  21.1 Sampling from a Gaussian Process (GP)
  21.2 Data Assimilation Using the Kalman Filter (KF)
    21.2.1 A More General Formulation of KF
  21.3 Time Series, KF and Complex Dynamic Models
  Exercises

22 Spatial Modelling and Scaling Error
  22.1 Spatial Models
  22.2 Geostatistics Using a GP
  22.3 Geostatistics Using geoR
  22.4 Adding a Nugget
  22.5 Estimating All GP-hyperparameters
  22.6 Spatial Upscaling Error

23 Spatio-Temporal Modelling and Adaptive Sampling
  23.1 Spatio-Temporal Modelling
  23.2 Adaptive Sampling
  23.3 Comments on Spatio-Temporal Modelling and Adaptive Sampling

24 What Next?
  24.1 Some Crystal Ball Gazing
  24.2 Further Reading
  24.3 Closing Words

Appendix A: Notation and Abbreviations
Appendix B: Mathematics for Modellers
Appendix C: Probability Theory for Modellers
Appendix D: R
Appendix E: Bayesian Software
Appendix F: Solutions to Exercises
References
Index
Chapter 1
Introduction to Bayesian Thinking

In recent years, there has been a trend toward basing scientific research under conditions of incomplete information, i.e. most of science, on probability theory (e.g.
Hartig et al. 2012; Jaynes 2003; Ogle and Barber 2008; Sivia and Skilling 2006;
Van Oijen et al. 2011). This is the approach that we take in this book too. We
aim to show how defining all uncertainties in modelling as probability distributions
allows for rigorous reduction of those uncertainties when new data become available.
The approach that we are presenting is known in the literature under many different names, including Bayesian calibration, data assimilation, model-data fusion
and inverse modelling. Whilst the different names refer to different applications of
modelling, they all share the idea of specifying probability distributions which are
modified according to the rules of probability theory (in particular, Bayes’ Theorem)
when new data come in. It is this idea that facilitates the comprehensive analysis of
errors and uncertainties.
Lindley (1991) stated the importance of probability theory as follows: “Probabil-
ity, it has been said, is merely common sense reduced to calculation. It is the basic
tool for appreciating uncertainty, and uncertainty cannot be adequately handled without a knowledge of probability." And it is possible to show formally that rational, coherent thinking implies using the rules of probability theory (Jaynes 2003).

1.1 Bayesian Thinking

The basics of Bayesian thinking are simple. There are just three elements, connected
by probability theory. The elements are: (1) your prior belief about a quantity or
proposition, (2) new information, (3) your posterior belief. Probability theory pro-
vides the logical connection from the first two elements to the last. So all we need
to learn is how to express beliefs and new information in the form of probability

© Springer Nature Switzerland AG 2020 1


M. van Oijen, Bayesian Compendium,
https://doi.org/10.1007/978-3-030-55897-0_1
2 1 Introduction to Bayesian Thinking

distributions, and then we can simply follow the rules of probability theory. That is
all!
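
To make these three elements concrete, here is a minimal sketch in base R of a single Bayesian update for two rival hypotheses; the hypotheses and all numbers are invented purely for illustration:

# Three elements of a Bayesian analysis, for two rival hypotheses.
# All numbers are invented for illustration only.
prior      <- c(H1 = 0.5, H2 = 0.5)  # (1) prior belief
likelihood <- c(H1 = 0.2, H2 = 0.6)  # (2) probability of the new information under each hypothesis
posterior  <- prior * likelihood /
              sum(prior * likelihood)  # (3) posterior belief, via Bayes' Theorem
print(posterior)                       # H1 = 0.25, H2 = 0.75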
While the simplicity of Bayesian thinking is a given, that does not mean it is
necessarily easy to learn. Consistently thinking in terms of probability theory is not
second nature to everyone. And there is no unique simplest way to teach Bayesian
thinking to you. It all depends on your background, and your favourite mode of
rational thinking. Do you prefer to begin with abstract concepts, then mathematical
equations, then examples? Or do you wish to begin with puzzles or anecdotes, and
learn how they can all be approached in the same way? Perhaps you like to start
from your knowledge of classical statistics and learn how its methods can always be
interpreted, and often improved, in a Bayesian way? But here we begin with a short
detective story …

1.2 A Murder Mystery

You are called to a country house: the owner has been found murdered in the library.
The three possible suspects are his wife, his son, and the butler. Before reading on,
who do you believe committed the crime? And do not say: “I can’t answer that, I
have not inspected the evidence yet.” You are a Bayesian detective, so you can state
your prior probabilities.
Your assistant says “I bet it’s the wife”, but you find that a silly bias. You see the
butler as the prime suspect, and would give odds of 4:1 that he is the culprit. You
find the wife just as improbable as the son. So your prior probability distribution
for butler-wife-son is 80–10–10%. Of course, you would not really bet money on
the outcome of the case—you’re a professional—but you decide to investigate the
butler first. To your surprise, you find that the butler has a perfect alibi. What is your
probability distribution now? The answer is that the alibi of the butler has no bearing
on the wife or son, so they remain equally likely candidates, and your probability
distribution becomes 0–50–50%.
Next you inspect the library, and find that the murder was committed with a blunt
instrument. How does that change your probabilities? You assess the likelihood of
such a murder weapon being chosen by a man to be twice as high as by a woman.
So that changes your probabilities to 0–33–67%.
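
As a small illustration of how mechanical these updates are, here is one way to write the detective's reasoning in a few lines of R, a sketch using the prior and likelihood values judged in the story above:

# Prior probabilities for butler, wife and son (odds of 4:1 on the butler).
prior <- c(butler = 0.8, wife = 0.1, son = 0.1)

# New information 1: the butler has a perfect alibi.
like.alibi <- c(butler = 0, wife = 1, son = 1)
post1 <- prior * like.alibi / sum(prior * like.alibi)
round(post1, 2)   # 0.00 0.50 0.50

# New information 2: a blunt instrument, judged twice as likely
# to be chosen by a man as by a woman.
like.weapon <- c(butler = 2, wife = 1, son = 2)
post2 <- post1 * like.weapon / sum(post1 * like.weapon)
round(post2, 2)   # 0.00 0.33 0.67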
I leave you to finish the story to a logical conclusion where sufficient evidence
has been processed such that the murderer is identified beyond reasonable doubt. But
what does this murder mystery teach us? Well, our story is fiction, but it contains
the three steps of any Bayesian analysis: assigning a prior probability distribution,
acquiring new information, updating your probability distribution following the rules
of probability theory. Everything in this book, and in Bayesian statistics generally,
is about one or more of these three steps. So if you found the detective’s reasoning
plausible, then you are already a Bayesian! In fact, there is a rich literature in the field
of psychology that shows that human beings at least to some extent make decisions
in a Bayesian way.
