Probabilistic Numerics
Probabilistic numerical computation formalises the connection between machine learning and
applied mathematics. Numerical algorithms approximate intractable quantities from computable
ones. They estimate integrals from evaluations of the integrand, or the path of a dynamical system
described by differential equations from evaluations of the vector field. In other words, they
infer a latent quantity from data. This book shows that it is thus formally possible to think of
computational routines as learning machines, and to use the notion of Bayesian inference to build
more flexible, efficient, or customised algorithms for computation.
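To make this concrete, here is a minimal sketch (not from the book; the squared-exponential kernel, length-scale, evaluation points, and test integrand are illustrative choices) of Bayesian quadrature in Python: a Gaussian-process prior over the integrand is conditioned on a few evaluations, and the resulting posterior over the integral yields both an estimate and a calibrated uncertainty.

    import numpy as np

    def k(a, b, ell=0.2):
        # Squared-exponential covariance between two sets of inputs.
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

    f = lambda x: np.sin(3 * x) + x          # latent integrand, unknown to the method
    X = np.linspace(0.05, 0.95, 7)           # the few locations where f is evaluated
    y = f(X)                                 # the "data" collected by the algorithm

    grid = np.linspace(0.0, 1.0, 2001)       # dense grid for the kernel-mean integrals
    K = k(X, X) + 1e-10 * np.eye(len(X))     # jitter for numerical stability
    z = np.trapz(k(grid, X), grid, axis=0)   # z_i = integral of k(x, x_i) over [0, 1]
    w = np.linalg.solve(K, z)                # quadrature weights K^{-1} z

    mean = w @ y                             # posterior mean of the integral
    var = max(np.trapz(np.trapz(k(grid, grid), grid, axis=0), grid) - z @ w, 0.0)
    print(f"estimate {mean:.5f} +/- {np.sqrt(var):.5f}")
    print(f"truth    {np.trapz(f(grid), grid):.5f}")

Note how the evaluations of f play exactly the role of data in a regression problem; the quadrature rule is the posterior of that regression.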
The text caters for Masters’ and PhD students, as well as postgraduate researchers in artificial
intelligence, computer science, statistics, and applied mathematics. Extensive background material
is provided along with a wealth of figures, worked examples, and exercises (with solutions) to
develop intuition.
Philipp Hennig holds the Chair for the Methods of Machine Learning at the University of
Tübingen, and an adjunct position at the Max Planck Institute for Intelligent Systems. He has
dedicated most of his career to the development of Probabilistic Numerical Methods. Hennig’s
research has been supported by Emmy Noether, Max Planck and ERC fellowships. He is a co-Director of the Research Program for the Theory, Algorithms and Computations of Learning Machines at the European Laboratory for Learning and Intelligent Systems (ELLIS).
Michael A. Osborne is Professor of Machine Learning at the University of Oxford, and a co-Founder of Mind Foundry Ltd. His research has attracted £10.6M of research funding and has been cited over 15,000 times. He is very, very Bayesian.
Hans P. Kersting is a postdoctoral researcher at INRIA and École Normale Supérieure in Paris,
working in machine learning with expertise in Bayesian inference, dynamical systems, and
optimisation.
‘In this stunning and comprehensive new book, early developments from Kac and Larkin have been comprehensively built upon, formalised, and extended by including modern-day machine learning, numerical analysis, and the formal Bayesian statistical methodology. Probabilistic numerical methodology is of enormous importance for this age of data-centric science and Hennig, Osborne, and Kersting are to be congratulated in providing us with this definitive volume.’
– Mark Girolami, University of Cambridge and The Alan Turing Institute
‘This book presents an in-depth overview of both the past and present of the newly emerging
area of probabilistic numerics, where recent advances in probabilistic machine learning are used to
develop principled improvements which are both faster and more accurate than classical numerical
analysis algorithms. A must-read for every algorithm developer and practitioner in optimization!’
– Ralf Herbrich, Hasso Plattner Institute
‘Probabilistic numerics spans from the intellectual fireworks of the dawn of a new field to its
practical algorithmic consequences. It is precise but accessible and rich in wide-ranging, principled
examples. This convergence of ideas from diverse fields in lucid style is the very fabric of good
science.’
– Carl Edward Rasmussen, University of Cambridge
‘An important read for anyone who has thought about uncertainty in numerical methods; an essential read for anyone who hasn’t.’
– John Cunningham, Columbia University
‘This is a rare example of a textbook that essentially founds a new field, re-casting numerics on
stronger, more general foundations. A tour de force.’
– David Duvenaud, University of Toronto
‘The authors succeed in demonstrating the potential of probabilistic numerics to transform the way
we think about computation itself.’
– Thore Graepel, Senior Vice President, Altos Labs
PROBABILISTIC NUMERICS
COMPUTATION AS MACHINE LEARNING
www.cambridge.org
Information on this title: www.cambridge.org/9781107163447
DOI: 10.1017/9781316681411
© Philipp Hennig, Michael A. Osborne and Hans P. Kersting 2022
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2022
Printed in the United Kingdom by TJ Books Limited, Padstow, Cornwall
A catalogue record for this publication is available from the British Library.
ISBN 978-1-107-16344-7 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
I Mathematical Background 17
1 Key Points 19
2 Probabilistic Inference 21
3 Gaussian Algebra 23
4 Regression 27
5 Gauss–Markov Processes: Filtering and SDEs 41
6 Hierarchical Inference in Gaussian Models 55
7 Summary of Part I 61
II Integration 63
8 Key Points 65
9 Introduction 69
10 Bayesian Quadrature 75
11 Links to Classical Quadrature 87
12 Probabilistic Numerical Lessons from Integration 107
13 Summary of Part II and Further Reading 119
22 Proofs 183
23 Summary of Part III 193
References 369
Index 395
Philipp Hennig
Michael A. Osborne
I would like to thank Isis Hjorth, for being the most valuable
source of support I have in life, and our amazing children
Osmund and Halfdan – I wonder what you will think of this
book in a few years?
Hans P. Kersting
Bold symbols (x) are used for vectors, but only where the fact that a variable is a vector is relevant. Square brackets indicate elements of a matrix or vector: if x = [x_1, …, x_N] is a row vector, then [x]_i = x_i denotes its entries; if A ∈ R^{n×m} is a matrix, then [A]_ij = A_ij denotes its entries. Round brackets (·) are used in most other cases (as in the notations listed below).
Notation              Meaning

a ∝ c                 a is proportional to c: there is a constant k such that a = k · c.
A ∧ B, A ∨ B          The logical conjunctions “and” and “or”; i.e. A ∧ B is true iff both A and B are true, A ∨ B is true iff ¬A ∧ ¬B is false.
A ⊗ B                 The Kronecker product of matrices A, B. See Eq. (15.2).
A ⊗_s B               The symmetric Kronecker product. See Eq. (19.16).
A ⊙ B                 The element-wise product (aka Hadamard product) of two matrices A and B of the same shape, i.e. [A ⊙ B]_ij = [A]_ij · [B]_ij.
A⃗, A⃖                 A⃗ is the vector arising from stacking the elements of a matrix A row after row, and A⃖ is its inverse (applying A⃖ to A⃗ recovers A). See Eq. (15.1).
cov_p(x, y)           The covariance of x and y under p. That is, cov_p(x, y) := E_p(x · y) − E_p(x) E_p(y).
C^q(V, R^d)           The set of q-times continuously differentiable functions from V to R^d, for some q, d ∈ N.
δ(x − y)              The Dirac delta, heuristically characterised by the property ∫ f(x) δ(x − y) dx = f(y) for functions f : R → R.
δ_ij                  The Kronecker symbol: δ_ij = 1 if i = j, otherwise δ_ij = 0.
det(A)                The determinant of a square matrix A.
diag(x)               The diagonal matrix with entries [diag(x)]_ij = δ_ij [x]_i.
dω_t                  The notation for an Itô integral in a stochastic differential equation. See Definition 5.4.
erf(x)                The error function erf(x) := (2/√π) ∫_0^x exp(−t²) dt.
E_p(f)                The expectation of f under p. That is, E_p(f) := ∫ f(x) dp(x).
E_{|Y}(f)             The expectation of f under p(f | Y).
Γ(z)                  The Gamma function Γ(z) := ∫_0^∞ x^{z−1} exp(−x) dx. See Eq. (6.1).
G(·; a, b)            The Gamma distribution with shape a > 0 and rate b > 0, with probability density function G(z; a, b) := (b^a z^{a−1} / Γ(a)) e^{−bz}.
GP(f; μ, k)           The Gaussian process measure on f with mean function μ and covariance function (kernel) k. See §4.2.
H_p(x)                The (differential) entropy of the distribution p(x). That is, H_p(x) := −∫ p(x) log p(x) dx. See Eq. (3.2).
H(x | y)              The (differential) entropy of the conditional distribution p(x | y). That is, H(x | y) := H_{p(·|y)}(x).
I(x; y)               The mutual information between random variables X and Y. That is, I(x; y) := H(x) − H(x | y) = H(y) − H(y | x).
I, I_N                The identity matrix (of dimensionality N): [I]_ij = δ_ij.
I(· ∈ A)              The indicator function of a set A.
K_ν                   The modified Bessel function for some parameter ν ∈ C. That is, K_ν(x) := ∫_0^∞ exp(−x · cosh(t)) cosh(νt) dt.
L                     The loss function of an optimization problem (§26.1), or the log-likelihood of an inverse problem (§41.2).
M                     The model M capturing the probabilistic relationship between the latent object and computable quantities. See §9.3.
N, C, R, R_+          The natural numbers (excluding zero), the complex numbers, the real numbers, and the positive real numbers, respectively.
N(x; μ, Σ) = p(x)     The vector x has the Gaussian probability density function with mean vector μ and covariance matrix Σ. See Eq. (3.1).
X ∼ N(μ, Σ)           The random variable X is distributed according to a Gaussian distribution with mean μ and covariance Σ.
O(·)                  Landau big-Oh: for functions f, g defined on N, the notation f(n) = O(g(n)) means that f(n)/g(n) is bounded as n → ∞.
p(y | x)              The conditional probability density function for the variable Y having value y, conditioned on the variable X having value x.
rk(A)                 The rank of a matrix A.
span{x_1, …, x_n}     The linear span of {x_1, …, x_n}.
St(·; μ, λ_1, λ_2)    The Student’s-t probability density function with parameters μ ∈ R and λ_1, λ_2 > 0. See Eq. (6.9).
tr(A)                 The trace of a matrix A. That is, tr(A) = ∑_i [A]_ii.
A^⊤                   The transpose of a matrix A: [A^⊤]_ij = [A]_ji.
U_{a,b}               The uniform distribution with probability density function p(u) := I(u ∈ (a, b)), for a < b.
V_p(x)                The variance of x under p. That is, V_p(x) := cov_p(x, x).
V_{|Y}(f)             The variance of f under p(f | Y). That is, V_{|Y}(f) := cov_{p(·|Y)}(f, f).
W(V, ν)               The Wishart distribution with probability density function W(x; V, ν) ∝ |x|^{(ν−N−1)/2} e^{−tr(V^{−1} x)/2}. See Eq. (19.1).
x ⊥ y                 x is orthogonal to y, i.e. ⟨x, y⟩ = 0.
x := a                The object x is defined to be equal to a.
x ≜ a                 The object x is equal to a by virtue of its definition.
x ← a                 The object x is assigned the value of a (used in pseudo-code).
X ∼ p                 The random variable X is distributed according to p.
1, 1_d                A column vector of d ones, 1_d := [1, …, 1]^⊤ ∈ R^d.
∇_x f(x, t)           The gradient of f w.r.t. x. (We omit the subscript x if redundant.)
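As a quick numerical sanity check on two of the entries above (a sketch, not from the book; the distributions, correlation structure, and sample size are illustrative), the following verifies cov_p(x, y) = E_p(xy) − E_p(x)E_p(y) by Monte Carlo and evaluates the closed-form differential entropy of a Gaussian, H_p(x) = ½ log(2πeσ²).

    import numpy as np

    rng = np.random.default_rng(0)
    sigma = 1.5
    x = rng.normal(0.0, sigma, size=1_000_000)        # x ~ N(0, sigma^2)
    y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)   # a correlated companion variable

    # cov_p(x, y) via the definition E_p(xy) - E_p(x) E_p(y);
    # both printed values should be close to 2 * sigma^2 = 4.5.
    cov_def = np.mean(x * y) - np.mean(x) * np.mean(y)
    print(cov_def, np.cov(x, y, bias=True)[0, 1])

    # Differential entropy of N(0, sigma^2): 0.5 * log(2*pi*e*sigma^2), about 1.82 nats.
    print(0.5 * np.log(2 * np.pi * np.e * sigma ** 2))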