Probabilistic Numerics
Probabilistic numerical computation formalises the connection between machine learning and
applied mathematics. Numerical algorithms approximate intractable quantities from computable
ones. They estimate integrals from evaluations of the integrand, or the path of a dynamical system
described by differential equations from evaluations of the vector field. In other words, they
infer a latent quantity from data. This book shows that it is thus formally possible to think of
computational routines as learning machines, and to use the notion of Bayesian inference to build
more flexible, efficient, or customised algorithms for computation.
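To make this “computation as inference” view concrete, here is a minimal Bayesian-quadrature sketch in the spirit of the book’s Part II: a Gaussian-process prior over the integrand is conditioned on a handful of evaluations, and the posterior over the integral supplies both an estimate and an error bar. The kernel choice, the length-scale, and the helper name bayes_quad are our own illustrative assumptions, not code from the book.

```python
# Minimal Bayesian-quadrature sketch: a GP prior on the integrand turns
# integration into inference, yielding a Gaussian posterior over the integral.
import numpy as np
from scipy.special import erf

def bayes_quad(f, a, b, x, ell=0.3, jitter=1e-10):
    """Posterior mean/std of the integral of f over [a, b], under a
    zero-mean GP prior on f with an RBF kernel of length-scale ell."""
    y = f(x)
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)
    K += jitter * np.eye(len(x))  # numerical stability
    # Kernel mean z_i = integral of k(t, x_i) over [a, b] (closed form for RBF).
    z = ell * np.sqrt(np.pi / 2) * (
        erf((b - x) / (np.sqrt(2) * ell)) - erf((a - x) / (np.sqrt(2) * ell)))
    # Double integral of k over [a, b]^2, also available in closed form.
    h = b - a
    zz = (np.sqrt(2 * np.pi) * ell * h * erf(h / (np.sqrt(2) * ell))
          + 2 * ell**2 * (np.exp(-0.5 * h**2 / ell**2) - 1))
    w = np.linalg.solve(K, z)   # quadrature weights, as in classical rules
    mean = w @ y                # posterior mean estimate of the integral
    var = zz - z @ w            # posterior variance: remaining uncertainty
    return mean, np.sqrt(max(var, 0.0))

mean, std = bayes_quad(np.sin, 0.0, 1.0, np.linspace(0.0, 1.0, 7))
print(mean, std)  # mean is close to 1 - cos(1) ≈ 0.4597; std quantifies the error
```

Note how the posterior mean reduces to a weighted quadrature rule, while the posterior standard deviation is the method’s own quantification of its numerical error.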
The text caters for Master’s and PhD students, as well as postgraduate researchers in artificial
intelligence, computer science, statistics, and applied mathematics. Extensive background material
is provided along with a wealth of figures, worked examples, and exercises (with solutions) to
develop intuition.
Philipp Hennig holds the Chair for the Methods of Machine Learning at the University of
Tübingen, and an adjunct position at the Max Planck Institute for Intelligent Systems. He has
dedicated most of his career to the development of Probabilistic Numerical Methods. Hennig’s
research has been supported by Emmy Noether, Max Planck and ERC fellowships. He is a co-
Director of the Research Program for the Theory, Algorithms and Computations of Learning
Machines at the European Laboratory for Learning and Intelligent Systems (ELLIS).
Michael A. Osborne is Professor of Machine Learning at the University of Oxford, and a co-
Founder of Mind Foundry Ltd. His research has attracted £10.6M of research funding and has
been cited over 15,000 times. He is very, very Bayesian.
Hans P. Kersting is a postdoctoral researcher at INRIA and École Normale Supérieure in Paris,
working in machine learning with expertise in Bayesian inference, dynamical systems, and
optimisation.
‘In this stunning and comprehensive new book, early developments from Kac and Larkin have been
comprehensively built upon, formalised, and extended by including modern-day machine learn-
ing, numerical analysis, and the formal Bayesian statistical methodology. Probabilistic numerical
methodology is of enormous importance for this age of data-centric science and Hennig, Osborne,
and Kersting are to be congratulated in providing us with this definitive volume.’
– Mark Girolami, University of Cambridge and The Alan Turing Institute
‘This book presents an in-depth overview of both the past and present of the newly emerging
area of probabilistic numerics, where recent advances in probabilistic machine learning are used to
develop principled improvements which are both faster and more accurate than classical numerical
analysis algorithms. A must-read for every algorithm developer and practitioner in optimization!’
– Ralf Herbrich, Hasso Plattner Institute
‘Probabilistic numerics spans from the intellectual fireworks of the dawn of a new field to its
practical algorithmic consequences. It is precise but accessible and rich in wide-ranging, principled
examples. This convergence of ideas from diverse fields in lucid style is the very fabric of good
science.’
– Carl Edward Rasmussen, University of Cambridge
‘An important read for anyone who has thought about uncertainty in numerical methods; an
essential read for anyone who hasn’t.’
– John Cunningham, Columbia University
‘This is a rare example of a textbook that essentially founds a new field, re-casting numerics on
stronger, more general foundations. A tour de force.’
– David Duvenaud, University of Toronto
‘The authors succeed in demonstrating the potential of probabilistic numerics to transform the way
we think about computation itself.’
– Thore Graepel, Senior Vice President, Altos Labs
PROBABILISTIC NUMERICS
COMPUTATION AS MACHINE LEARNING
www.cambridge.org
Information on this title: www.cambridge.org/9781107163447
DOI: 10.1017/9781316681411
© Philipp Hennig, Michael A. Osborne and Hans P. Kersting 2022
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2022
Printed in the United Kingdom by TJ Books Limited, Padstow Cornwall
A catalogue record for this publication is available from the British Library.
ISBN 978-1-107-16344-7 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
I Mathematical Background
1 Key Points
2 Probabilistic Inference
3 Gaussian Algebra
4 Regression
5 Gauss–Markov Processes: Filtering and SDEs
6 Hierarchical Inference in Gaussian Models
7 Summary of Part I
II Integration
8 Key Points
9 Introduction
10 Bayesian Quadrature
11 Links to Classical Quadrature
12 Probabilistic Numerical Lessons from Integration
13 Summary of Part II and Further Reading
22 Proofs
23 Summary of Part III
References
Index
Philipp Hennig
Michael A. Osborne
I would like to thank Isis Hjorth, for being the most valuable
source of support I have in life, and our amazing children
Osmund and Halfdan – I wonder what you will think of this
book in a few years?
Hans P. Kersting
Bold symbols (x) are used for vectors, but only where the fact that a variable is a vector is relevant.
Square brackets indicate elements of a matrix or vector: if x = [x_1, …, x_N] is a row vector, then
[x]_i = x_i denotes its entries; if A ∈ ℝ^{n×m} is a matrix, then [A]_{ij} = A_{ij} denotes its entries.
Round brackets (·) are used in most other cases (as in the notations listed below).
Notation Meaning
a∝c a is proportional to c: there is a constant k such that a = k · c.
A ∧ B, A ∨ B The logical conjunctions “and” and “or”; i.e. A ∧ B is true iff
both A and B are true, A ∨ B is true iff ¬ A ∧ ¬ B is false.
A ⊗ B The Kronecker product of matrices A, B. See Eq. (15.2).
A ⊛ B The symmetric Kronecker product. See Eq. (19.16).
A ⊙ B The element-wise product (aka Hadamard product) of two
matrices A and B of the same shape, i.e. [A ⊙ B]_{ij} = [A]_{ij} · [B]_{ij}.
$\vec{A}$, $\overleftarrow{A}$ $\vec{A}$ is the vector arising from stacking the elements of a matrix A
row after row, and $\overleftarrow{A}$ its inverse ($\overleftarrow{\vec{A}} = A$). See Eq. (15.1).
cov p ( x, y) The covariance of x and y under p. That is,
cov p ( x, y) := E p ( x · y) − E p ( x )E p (y).
C q (V, R d ) The set of q-times continuously differentiable functions from
V to R d , for some q, d ∈ N.
δ(x − y) The Dirac delta, heuristically characterised by the property
∫ f(x) δ(x − y) dx = f(y) for functions f : ℝ → ℝ.
δij The Kronecker symbol: δij = 1 if i = j, otherwise δij = 0.
det( A) The determinant of a square matrix A.
diag( x) The diagonal matrix with entries [diag( x)]ij = δij [ x ]i .
dω_t The notation for an Itô integral in a stochastic differential
equation. See Definition 5.4.
erf(x) The error function erf(x) := (2/√π) ∫₀ˣ exp(−t²) dt.
E_p(f) The expectation of f under p. That is, E_p(f) := ∫ f(x) dp(x).
E_{|Y}(f) The expectation of f under p(f | Y).
Γ(z) The Gamma function Γ(z) := ∫₀^∞ x^{z−1} exp(−x) dx. See Eq. (6.1).
G(·; a, b) The Gamma distribution with shape a > 0 and rate b > 0, with
probability density function G(z; a, b) := (b^a z^{a−1} / Γ(a)) e^{−bz}.
GP(f; μ, k) The Gaussian process measure on f with mean function μ and
covariance function (kernel) k. See §4.2.
H_p(x) The (differential) entropy of the distribution p(x).
That is, H_p(x) := −∫ p(x) log p(x) dx. See Eq. (3.2).
H(x | y) The (differential) entropy of the conditional distribution p(x | y).
That is, H(x | y) := H_{p(·|y)}(x).
I ( x ; y) The mutual information between random variables X and Y.
That is, I ( x ; y) := H ( x ) − H ( x | y) = H (y) − H (y | x ).
I, IN The identity matrix (of dimensionality N): [ I ]ij = δij .
I (· ∈ A) The indicator function of a set A.
K_ν The modified Bessel function for some parameter ν ∈ ℂ.
That is, K_ν(x) := ∫₀^∞ exp(−x · cosh(t)) cosh(νt) dt.
L The loss function of an optimization problem (§26.1), or the
log-likelihood of an inverse problem (§41.2).
M The model M capturing the probabilistic relationship between
the latent object and computable quantities. See §9.3.
N, C, R, R + The natural numbers (excluding zero), the complex numbers,
the real numbers, and the positive real numbers, respectively.
N ( x; μ, Σ) = p( x ) The vector x has the Gaussian probability density function
with mean vector μ and covariance matrix Σ. See Eq. (3.1).
X ∼ N(μ, Σ) The random variable X is distributed according to a Gaussian
distribution with mean μ and covariance Σ.
O(·) Landau big-Oh: for functions f, g defined on ℕ, the notation
f(n) = O(g(n)) means that f(n)/g(n) is bounded as n → ∞.
p(y | x) The conditional probability density function for variable Y
having value y, conditioned on variable X having value x.
rk( A) The rank of a matrix A.
span{ x1 , . . . , xn } The linear span of { x1 , . . . , xn }.
St(·; μ, λ₁, λ₂) The Student’s-t probability density function with parameters
μ ∈ ℝ and λ₁, λ₂ > 0. See Eq. (6.9).
tr(A) The trace of matrix A. That is, tr(A) = ∑_i [A]_{ii}.
A^⊤ The transpose of matrix A: [A^⊤]_{ij} = [A]_{ji}.
U a,b The uniform distribution with probability density function
p(u) := I (u ∈ ( a, b)), for a < b.
V_p(x) The variance of x under p. That is, V_p(x) := cov_p(x, x).
V_{|Y}(f) The variance of f under p(f | Y). That is,
V_{|Y}(f) := cov_{p(f|Y)}(f, f).
W(V, ν) The Wishart distribution with probability density function
W(x; V, ν) ∝ |x|^{(ν−N−1)/2} e^{−(1/2) tr(V⁻¹ x)}. See Eq. (19.1).
x ⊥ y x is orthogonal to y, i.e. ⟨x, y⟩ = 0.
x := a The object x is defined to be equal to a.
x ≜ a The object x is equal to a by virtue of its definition.
x ← a The object x is assigned the value of a (used in pseudo-code).
X∼p The random variable X is distributed according to p.
1, 1_d A column vector of d ones, 1_d := [1, …, 1]^⊤ ∈ ℝ^d.
∇ x f ( x, t) The gradient of f w.r.t. x. (We omit subscript x if redundant.)
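For readers who think in code, the short NumPy snippet below cross-checks a few of the conventions above against their standard library counterparts; the mapping to NumPy calls is our own illustration, not part of the book’s notation.

```python
# Sanity-check a few notational conventions with NumPy equivalents.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

kron = np.kron(A, B)   # A ⊗ B: the Kronecker product (Eq. 15.2)
hada = A * B           # A ⊙ B: the element-wise (Hadamard) product
assert hada[0, 1] == A[0, 1] * B[0, 1]         # [A ⊙ B]_ij = [A]_ij · [B]_ij

vec_A = A.reshape(-1)  # stack the rows of A into one vector (Eq. 15.1)
assert np.allclose(vec_A.reshape(A.shape), A)  # un-stacking inverts the stacking

assert np.isclose(np.trace(A), A[0, 0] + A[1, 1])  # tr(A) = Σ_i [A]_ii
assert np.isclose(A.T[0, 1], A[1, 0])              # [A^⊤]_ij = [A]_ji
print("all notation checks passed")
```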
How long all this took, I do not know. The sense of time was, of course, even weaker in the sphere than on the moon. When I touched the bales of goods I woke, as it were, from a trance. I understood at once that if I was to stay awake and alive, I must get light or open a window, so that my eyes would have something to fix on. Besides, I was cold. I sprang up from the bales, seized the thin cords on the inside of the glass, and groped my way along them to the manhole, where I could once more work out in which direction the knobs for the light and the blinds lay. I let go again, and was promptly sent flying once around the sphere. There some large, fragile object, floating loose, struck me. Guided by the cord I made my way to the knobs and first of all lit the electric lamp, to see what it was that had collided with me: it was an old issue of Lloyd’s News that had come loose and taken flight. It brought me back from the moods of infinity to my own finite measures. I began to laugh and to cough, and it occurred to me to let a little oxygen from the cylinder into the sphere. At the same time I set the heating apparatus going until I was warm, after which I took some food. Finally, with the greatest care, I set about handling the Cavorite blinds, so as to be able to guess at least roughly in which direction the sphere was travelling.
The air struck my chest so suddenly that it nearly stopped my breath. The glass screw slipped from my hand. I cried out, pressed my hands to my chest, and sank down to a sitting position. For a moment I was in sharp pain; then I tried to breathe deeply, and at last got up and was able to walk again.
‘So it is.’
I had my doubts.
The grey morning had until then weighed on my spirits, but suddenly the sun peeped out between the clouds and cast its bright light over the world, turning even the leaden sea into glittering waves. My spirits revived. It came to my mind how unspeakably much of importance I had already accomplished, and how much I still had to accomplish. That was the sunshine’s doing. I fairly burst out laughing when the foremost man stumbled under the weight of my gold. Once I rise to my proper place in the world, how that world will be astonished!
‘From the moon?’
‘Be that as it may, I am keeping an eye on that ship over there,’ one of them was heard to whisper to his neighbour.
Bang... whizz!
The sea, which until then had been calm, was now running in whitecaps. At the spot where the sphere had been, the water seethed as in a ship’s wake. In the sky a little scrap of cloud whirled like thinning smoke, and three or four men on the shore stood gaping inquiringly at the outcome of the unexpected bang. That was all! The waiter and the four gentlemen from the hotel came hurrying after me. Shouts came from doors and windows, and people of every sort came running and panting from every side, every one of them open-mouthed.
There I now stood, motionless. This turn of events was so overwhelming that I could not even think of the people around me. I was so dazed that I did not yet grasp the occurrence as an actual disaster; I was dazed like a man who has suddenly received a heavy blow: the damage suffered only began to become clear later.
Of course, I was perfectly clear about what had happened to the boy. He had climbed into the sphere, fingered the knobs, closed the Cavorite windows, and set off into the upper air. It is very unlikely that he had screwed the manhole cover shut, and even if he had, the chance of his returning was one in a thousand. There he now doubtless floats with my bales of goods somewhere about the middle of the sphere, and there he will remain. And should he in the end arrive at some point in space, its inhabitants will not care in the least that he was a lawfully entitled resident of the Earth, however remarkable a phenomenon they may otherwise consider him. Of that I soon became quite firmly convinced.