Introduction to 3+1 Numerical Relativity
THE INTERNATIONAL SERIES OF MONOGRAPHS ON PHYSICS

SERIES EDITORS

J. BIRMAN          CITY UNIVERSITY OF NEW YORK
S. F. EDWARDS      UNIVERSITY OF CAMBRIDGE
R. FRIEND          UNIVERSITY OF CAMBRIDGE
M. REES            UNIVERSITY OF CAMBRIDGE
D. SHERRINGTON     UNIVERSITY OF OXFORD
G. VENEZIANO       CERN, GENEVA
International Series of Monographs on Physics
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© Miguel Alcubierre 2008
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
First published 2008
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose this same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Printed in Great Britain
on acid-free paper by
Biddles Ltd. www.biddles.co.uk
1 3 5 7 9 10 8 6 4 2
To Miguel, Raul and Juan,
for filling my life with beautiful moments.
ACKNOWLEDGEMENTS

There have been a large number of people who, both directly and indirectly, have
helped me get to a point where I could write this book. First of all, I would like
to thank my mentors and advisors, who put me in the path of studying physics
in general, and relativity in particular. Among them I would like to mention Luis
de la Peña and Ana María Cetto who, perhaps somewhat reluctantly, accepted
my change of heart when I decided to leave the study of the foundations of
quantum mechanics in favor of general relativity. I also thank Eduardo Nahmad
for pointing the way, Bernard F. Schutz for guiding me and being there up to
the present time, and Ed Seidel for always being a friend.
I am also in debt to many friends and colleagues who have been there along
the road. My old friends from Cardiff, Gabrielle Allen, Gareth Jones, Mario An-
tonioletti, Nils Anderson, Kostas Kokkotas, as well as my friends from Potsdam,
John Baker, Werner Benger, Steve Brandt, Bernd Bruegmann, Manuela Cam-
panelli, Peter Diener, Tony Font, Tom Goodale, Carsten Gundlach, Ian Hawke,
Scott Hawley, Frank Herrmann, Daniel Holz, Sasha Husa, Michael Koppitz, Gerd
Lanfermann, Carlos Lousto, Joan Masso, Philippos Papadopoulos, Denis Pollney,
Luciano Rezzolla, Alicia Sintes, Nikolaus Stergioulas, Ryoji Takahashi, Jonathan
Thornburg and Paul Walker. There have been other people who have helped me
in many ways, either by giving direct answers to my questions, or from whom
I have simply learned a lot of what has contributed to this book throughout
the years, Carles Bona, Matt Choptuik, Eric Gourgoulhon, Pablo Laguna, Luis
Lehner, Roy Maartens, Mark Miller, Jorge Pullin, Oscar Reula, Olivier Sarbach,
Erik Schnetter, Deirdre Shoemaker, Wai-Mo Suen, Manuel Tiglio and Jeff Wini-
cour, among others.
I also want to thank my friends and colleagues in Mexico, Alejandro Corichi,
Jose Antonio Gonzalez, Francisco S. Guzman, Tonatiuh Matos, Dario Nuñez,
Luis Ureña, Marcelo Salgado, Daniel Sudarsky, Roberto Sussman and Hernando
Quevedo. And of course the students of our numerical relativity group, most
of whom have proof-read different sections of the original manuscript, Antonio
Castellanos, Juan Carlos Degollado, Cesar Fuentes, Pablo Galaviz, David Mar-
tinez, Martha Mendez, Jose Antonio Nava, Bernd Reimann, Milton Ruiz and
Jose Manuel Torres.
Finally, I wish to thank my editor, Sonke Adlung, who suggested the idea
of writing this book in the first place and has been pushing hard ever since. I
wouldn’t have made it without him.
PREFACE
General relativity is a highly successful theory. Not only has it radically modified
our understanding of gravity, space and time, but it also possesses an enormous
predictive power. To date, it has passed with extraordinary precision all the
experimental and observational tests that it has been subjected to. Among its
more important results are the predictions of exotic objects such as neutron
stars and black holes, and the cosmological model of the Big Bang. Also, general
relativity has predicted the existence of gravitational waves, which might be
detected directly for the first time before this decade is out.
General relativity, for all its conceptual simplicity and elegance, turns out
to be in practice a highly complex theory. The Einstein field equations are a
system of ten coupled, non-linear, partial differential equations in four dimen-
sions. Written in fully expanded form in a general coordinate system they have
thousands of terms. Because of this complexity, exact solutions of the Einstein
equations are only known in cases with high symmetry, either in space or in
time: solutions with spherical or axial symmetry, static or stationary solutions,
homogeneous and/or isotropic solutions, etc. If we are interested in studying
systems with astrophysical relevance, which involve strong and dynamical grav-
itational fields with little or no symmetry, it is simply impossible to solve the
field equations exactly. The need to study this type of system has given birth to
the field of numerical relativity, which tries to solve the Einstein field equations
using numerical techniques and complex computational codes.
Numerical relativity appeared as an independent field of research in the mid
1960s with the pioneering efforts of Hahn and Lindquist [158], but it wasn’t
until the mid 1970s that the first truly successful simulations were carried out
by Smarr [271] and Eppley [123, 124] in the context of the head-on collision of
two black holes. At that time, however, the power of the available computers
was very modest, and the simulations that could be performed were limited to
either spherical symmetry or very low resolution axial symmetry. This situation
has changed: during the 1980s and 1990s a true revolution took place in numerical
relativity. Researchers have studied ever more complex
problems in many different aspects of general relativity, from the simulation of
rotating stars and black holes to the study of topological defects, gravitational
collapse, singularity structure, and the collisions of compact objects. Maybe the
most influential result coming from numerical relativity has been the discovery by
Choptuik of critical phenomena in gravitational collapse [98] (for a more recent
review see [153]). A summary of the history of numerical relativity and its more
recent developments can be found in [186].
Numerical relativity has now reached a state of maturity. The appearance of
powerful super-computers, together with an increased understanding of the un-
derlying theoretical issues, and the development of robust numerical techniques,
has finally allowed the simulation of fully three-dimensional systems with strong
and highly dynamic gravitational fields. And all this activity is happening at
precisely the right time, as a new generation of advanced interferometric gravi-
tational wave detectors (GEO600, LIGO, VIRGO, TAMA) is finally coming on
line. The expected gravitational wave signals, however, are so weak that even
with the amazing sensitivity of the new detectors it will be necessary to extract
them from the background noise. As it is much easier to extract a signal if we
know what to look for, numerical relativity has become badly needed in order to
provide the detectors with precise templates of the type of gravitational waves
expected from the most common astrophysical sources. We are living in a truly
exciting time in the development of this field.
The time has therefore arrived for a textbook on numerical relativity that
can serve as an introduction to this promising field of research. The field has
expanded in a number of different directions in recent years, which makes writing
a fully comprehensive textbook a challenging task. In particular, there are several
different approaches to separating the Einstein field equations in a way that
allows us to think of the evolution of the gravitational field in time. I have
decided to concentrate on one particular approach in this book, namely the
3+1 formalism, not because it is the only possibility, but rather because it is
conceptually easiest to understand and the techniques associated with it have
been considerably more developed over the years. To date, the 3+1 formalism
continues to be used by most researchers in the field. Other approaches, such
as the characteristic and conformal formulations, have important strengths and
show significant promise, but here I will just mention them briefly.
This book is aimed particularly at graduate students, and assumes some
basic familiarity with general relativity. Although the first Chapter gives an
introduction to general relativity, this is mainly a review of some basic concepts,
and is certainly not intended to replace a full course on the subject.
Miguel Alcubierre
Mexico City, September 2007.
CONTENTS
3.4 Multiple black hole initial data 105
3.4.1 Time-symmetric data 105
3.4.2 Bowen–York extrinsic curvature 109
3.4.3 Conformal factor: inversions and punctures 111
3.4.4 Kerr–Schild type data 113
3.5 Binary black holes in quasi-circular orbits 115
3.5.1 Effective potential method 116
3.5.2 The quasi-equilibrium method 117
4 Gauge conditions 121
4.1 Introduction 121
4.2 Slicing conditions 122
4.2.1 Geodesic slicing and focusing 123
4.2.2 Maximal slicing 123
4.2.3 Maximal slices of Schwarzschild 127
4.2.4 Hyperbolic slicing conditions 133
4.2.5 Singularity avoidance for hyperbolic slicings 136
4.3 Shift conditions 140
4.3.1 Elliptic shift conditions 141
4.3.2 Evolution type shift conditions 145
4.3.3 Corotating coordinates 151
5 Hyperbolic reductions of the field equations 155
5.1 Introduction 155
5.2 Well-posedness 156
5.3 The concept of hyperbolicity 158
5.4 Hyperbolicity of the ADM equations 164
5.5 The Bona–Masso and NOR formulations 169
5.6 Hyperbolicity of BSSNOK 175
5.7 The Kidder–Scheel–Teukolsky family 179
5.8 Other hyperbolic formulations 183
5.8.1 Higher derivative formulations 184
5.8.2 The Z4 formulation 185
5.9 Boundary conditions 187
5.9.1 Radiative boundary conditions 188
5.9.2 Maximally dissipative boundary conditions 191
5.9.3 Constraint preserving boundary conditions 194
6 Evolving black hole spacetimes 198
6.1 Introduction 198
6.2 Isometries and throat adapted coordinates 199
6.3 Static puncture evolution 206
6.4 Singularity avoidance and slice stretching 209
6.5 Black hole excision 214
6.6 Moving punctures 217
6.6.1 How to move the punctures 217
6.6.2 Why does evolving the punctures work? 219
6.7 Apparent horizons 221
6.7.1 Apparent horizons in spherical symmetry 223
6.7.2 Apparent horizons in axial symmetry 224
6.7.3 Apparent horizons in three dimensions 226
6.8 Event horizons 230
6.9 Isolated and dynamical horizons 234
7 Relativistic hydrodynamics 238
7.1 Introduction 238
7.2 Special relativistic hydrodynamics 239
7.3 General relativistic hydrodynamics 245
7.4 3+1 form of the hydrodynamic equations 249
7.5 Equations of state: dust, ideal gases and polytropes 252
7.6 Hyperbolicity and the speed of sound 257
7.6.1 Newtonian case 257
7.6.2 Relativistic case 260
7.7 Weak solutions and the Riemann problem 264
7.8 Imperfect fluids: viscosity and heat conduction 270
7.8.1 Eckart’s irreversible thermodynamics 270
7.8.2 Causal irreversible thermodynamics 273
8 Gravitational wave extraction 276
8.1 Introduction 276
8.2 Gauge invariant perturbations of Schwarzschild 277
8.2.1 Multipole expansion 277
8.2.2 Even parity perturbations 280
8.2.3 Odd parity perturbations 283
8.2.4 Gravitational radiation in the TT gauge 284
8.3 The Weyl tensor 288
8.4 The tetrad formalism 291
8.5 The Newman–Penrose formalism 294
8.5.1 Null tetrads 294
8.5.2 Tetrad transformations 297
8.6 The Weyl scalars 298
8.7 The Petrov classification 299
8.8 Invariants I and J 303
8.9 Energy and momentum of gravitational waves 304
8.9.1 The stress-energy tensor for gravitational waves 304
8.9.2 Radiated energy and momentum 307
8.9.3 Multipole decomposition 313
9 Numerical methods 318
9.1 Introduction 318
9.2 Basic concepts of finite differencing 318
9.3 The one-dimensional wave equation 322
9.3.1 Explicit finite difference approximation 323
9.3.2 Implicit approximation 325
9.4 Von Neumann stability analysis 326
9.5 Dissipation and dispersion 329
9.6 Boundary conditions 332
9.7 Numerical methods for first order systems 335
9.8 Method of lines 339
9.9 Artificial dissipation and viscosity 343
9.10 High resolution schemes 347
9.10.1 Conservative methods 347
9.10.2 Godunov’s method 348
9.10.3 High resolution methods 350
9.11 Convergence testing 353
10 Examples of numerical spacetimes 357
10.1 Introduction 357
10.2 Toy 1+1 relativity 357
10.2.1 Gauge shocks 359
10.2.2 Approximate shock avoidance 362
10.2.3 Numerical examples 364
10.3 Spherical symmetry 369
10.3.1 Regularization 370
10.3.2 Hyperbolicity 374
10.3.3 Evolving Schwarzschild 378
10.3.4 Scalar field collapse 383
10.4 Axial symmetry 391
10.4.1 Evolution equations and regularization 391
10.4.2 Brill waves 395
10.4.3 The “Cartoon” approach 399
A Total mass and momentum in general relativity 402
B Spacetime Christoffel symbols in 3+1 language 409
C BSSNOK with natural conformal rescaling 410
D Spin-weighted spherical harmonics 413
References 419
Index 437
1
BRIEF REVIEW OF GENERAL RELATIVITY
1.1 Introduction
The theory of general relativity, postulated by Einstein at the end of 1915 [120,
121], is the modern theory of gravitation. According to this theory, gravity is
not a force as it used to be considered in Newtonian physics, but rather a man-
ifestation of the curvature of spacetime. A massive object produces a distortion
in the geometry of spacetime around it, and in turn this distortion controls the
movement of physical objects. In the words of John A. Wheeler, “matter tells
spacetime how to curve, and spacetime tells matter how to move” [298].
When Einstein introduced special relativity in 1905 it became clear that
Newton’s theory of gravity would have to be modified. The main reason for
this was that Newton’s theory implies that the gravitational interaction is
transmitted between different bodies at infinite speed, in clear contradiction
with one of the fundamental results of special relativity: No physical interaction
can travel faster than the speed of light. It is interesting to note that Newton
himself was never happy with the existence of this action at a distance, but he
considered that it was a necessary hypothesis to be used until a more adequate
explanation of the nature of gravity was found. In the years from 1905 to 1915,
Einstein focused his efforts on finding such an explanation.
The basic ideas that guided Einstein in his quest towards general relativity
were the principle of general covariance, which says that the laws of physics must
take the same form for all observers, the principle of equivalence, which says that
all objects fall with the same acceleration in a gravitational field regardless of
their mass, and Mach’s principle, formulated by Ernst Mach at the end of the
19th century, which states that the local inertial properties of physical objects
must be determined by the total distribution of matter in the universe. The
principle of general covariance led Einstein to ask for the equations of physics to
be written in tensor form, the principle of equivalence led him to the conclusion
that the natural way to describe gravity was identifying it with the geometry of
spacetime, and Mach’s principle led him to the idea that such geometry should
be fixed by the distribution of mass and energy.
The discussion that follows will serve to present some of the basic concepts
of general relativity, but it is certainly not intended to be a detailed introduc-
tion to this theory; it is simply too short for that. Readers with no training in
relativity are well advised to read any of the standard textbooks on the subject
(for example: Misner, Thorne and Wheeler [206], Wald [295] and Schutz [259]).
special relativity.1
One can ask what is “special” about special relativity. It is common to hear,
even among physicists, that special relativity is special because it can not de-
scribe accelerating objects or accelerating observers. This is, of course, quite
wrong. Special relativity is essentially a new kinematic framework on which we
can do dynamics, that is, we can study the effects of forces on physical objects,
so that accelerations are included all the time. Accelerating observers, or more
to the point accelerating coordinate systems, can also be dealt with, though the
mathematics becomes more involved. What makes special relativity “special” is
the fact that it assumes the existence of global inertial frames, that is, reference
frames where Newton’s first law holds: Objects free of external forces remain
in a state of uniform rectilinear motion. Inertial frames play a crucial role in
special relativity. In fact, one of the best known results from this theory are
the Lorentz transformations that relate the coordinates in one inertial frame to
those of another. If we assume that we have two inertial frames O and O′, with
O′ moving with respect to O with constant speed v along the x axis, then the
Lorentz transformations are:

t′ = γ (t − v x) ,   x′ = γ (x − v t) ,   y′ = y ,   z′ = z ,

with γ := 1/√(1 − v²) the Lorentz factor (in units where the speed of light is c = 1).
Notice that the Lorentz transformations mix space and time components, some-
thing that was difficult to interpret before special relativity.
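To see this mixing explicitly, here is a minimal numerical sketch (Python with NumPy, in units where c = 1; the particular events and boost speed are arbitrary choices made only for illustration) that applies the boost written above to two events that are simultaneous in the frame O:

import numpy as np

def boost_matrix(v):
    """Lorentz boost along x with speed v, in units where c = 1."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([[ gamma,     -gamma * v, 0.0, 0.0],
                     [-gamma * v,  gamma,     0.0, 0.0],
                     [ 0.0,        0.0,       1.0, 0.0],
                     [ 0.0,        0.0,       0.0, 1.0]])

L = boost_matrix(0.6)                        # O' moves with v = 0.6 relative to O

# Two events that are simultaneous (t = 0) in O, at different values of x.
events = [np.array([0.0, 1.0, 0.0, 0.0]),    # (t, x, y, z)
          np.array([0.0, 2.0, 0.0, 0.0])]

for ev in events:
    t_p, x_p = (L @ ev)[:2]
    print(f"t' = {t_p:+.3f}, x' = {x_p:+.3f}")

Both events come out with t′ = −v x′, so they are no longer simultaneous in O′.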
The Lorentz transformations have a number of important consequences. The
first of these can be easily derived by asking where the events that happen at
t = 0 according to O end up in the frame O′. From the equations above we see
that these events will have coordinates in O′ such that t′ = −vx′. That is, events
that happen at the same time t = 0 in frame O and are thus simultaneous,
happen at times that depend on their spatial positions according to O′ and are
then not simultaneous. Simultaneity is therefore relative, or in other words it has
no absolute meaning.
1 Einstein did not cite Michelson and Morley in his original papers on relativity [223].
Einstein’s second postulate then turns out to be equivalent to saying that the
interval defined above between any two events is absolute, that is, all inertial ob-
servers will find the same value of ∆s2 . This means that we can define a concept
of invariant distance between events, and once we have a measure of distance
we can do geometry. Notice that the ordinary three-dimensional Euclidean distance
∆l² = ∆x² + ∆y² + ∆z² between two events is not absolute, nor is the
time interval ∆t², as can be easily seen from the Lorentz transformations. Only
Minkowski’s four-dimensional spacetime interval is absolute. In Minkowski’s own
words “... henceforth space by itself, and time by itself, are doomed to fade away
into mere shadows, and only a kind of union of the two will preserve an inde-
pendent reality”.
A crucial property of the spacetime interval defined in (1.3.4) is the fact that,
because of the minus sign in front of the first term, it is not positive definite.
Rather than this being a drawback, it has an important physical interpretation
as it allows us to classify the separation of events according to the sign of ∆s2 :
If ∆s² < 0 the two events are said to have a timelike separation: an observer
moving slower than light can be present at both events. If ∆s² = 0 the separation
is called null (or lightlike): the events can be connected only by a signal moving
at the speed of light. If ∆s² > 0 the separation is spacelike: no physical signal can
connect the two events, and there is an inertial frame in which they are simultaneous.
Fig. 1.1: The light-cone of a given event defines its causal relationship with other events,
and divides space into three regions: the causal past (those events that can influence
the event under consideration), the causal future (those events that can be influenced
by the event under consideration), and elsewhere (those events with which there can
be no causal relation).
One important consequence of the Lorentz transformations is that the time order of events is in fact absolute
for events with timelike or null separations – it is only for events with spacelike
separations that there is no fixed time order. This allows us to define a notion
of causality in an invariant way: Only events separated in a timelike or null way
can be causally related. Events separated in a spacelike way must be causally
disconnected, as otherwise in some inertial frames the effect would be found to
precede the cause. In particular this implies that no physical interaction can
travel faster than the speed of light as this would violate causality – this is
one of the reasons why in relativity nothing can travel faster than light. In
fact, in relativity all material objects move following timelike trajectories, while
light moves along null trajectories. Null trajectories also define the light-cone
(see Figure 1.1), which indicates which events can be causally related with each
other.
From the invariance of the interval, or equivalently from the Lorentz trans-
formations, we can derive two other important results. Let us start by defining
the proper time between two events as the time measured by an ideal clock that
is moving at constant speed in such a way that it sees both events happen at
the same place. From the point of view of this clock we have ∆l2 = 0, which
implies ∆s² = −∆t². If we use the Greek letter τ to denote the proper time
we will then have ∆τ = √(−∆s²). We clearly see that the proper time can only
be defined for timelike or null intervals – it has no meaning for spatial intervals
(which makes sense since no physical clock can travel faster than light). Having
defined the proper time, it is not difficult to show that the interval of time ∆t
measured between two events in a given inertial frame is related to the proper
time between those events in the following way:
∆t = γ∆τ ≥ ∆τ . (1.3.8)
This effect is known as time dilation, and implies that in a given reference frame
all moving clocks are measured to go slow. The effect is of course symmetrical:
If I measure the clocks of a moving frame as going slow, someone in that frame
will measure my clocks as going slow.
Another consequence of the Lorentz transformations is related with the mea-
sure of spatial distances. Let us assume that we have a rod of length l as measured
when it is at rest (the proper length). If the rod moves with speed v we will mea-
sure it as being contracted along the direction of motion. The length L measured
will be related to l and the speed of the rod v according to:
L = l/γ ≤ l . (1.3.9)
it was considered a dynamical effect due to the interaction between the moving object and the
luminiferous aether. However, the fact that the contraction was independent of the physical
properties of the moving object was difficult to justify. In special relativity the contraction is
not dynamical in origin, but is instead a purely kinematical consequence of the invariance of
the interval.
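To get a feeling for the size of these kinematical effects, the following small sketch (plain Python, with c = 1; the chosen speeds are arbitrary examples) evaluates the Lorentz factor together with the time dilation (1.3.8) and the length contraction (1.3.9) for a clock with unit proper time and a rod with unit proper length:

import math

def gamma(v):
    """Lorentz factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v**2)

proper_time = 1.0    # time interval measured by the moving clock itself
proper_length = 1.0  # length of the rod measured in its own rest frame

for v in (0.1, 0.5, 0.9, 0.99):
    g = gamma(v)
    print(f"v = {v:4.2f}:  gamma = {g:6.3f},  "
          f"Delta t = {g * proper_time:6.3f},  L = {proper_length / g:6.3f}")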
along the curve, while for timelike curves it is defined instead as the integral of
dτ = √(−ds²) and corresponds directly to the time measured by an ideal clock
following that trajectory. Null curves, of course, have zero length by definition.
We can then easily see that for spacelike curves a straight line between two events
is the one of minimum length, while for timelike curves the straight line has in
fact maximum length. In any case straight lines are always extremal curves, also
known as geodesics. We can use this fact to rewrite Newton’s first law in geomet-
ric terms. Notice first that when thinking about spacetime, Newton’s first law
is simply stated as: Objects free of external forces move along straight timelike
trajectories in spacetime. This simple statement captures both the fact that the
trajectories are straight in three-dimensional space, and also that the motion is
uniform. Using the notion of a geodesic we can rewrite this as: Objects free of
external forces move along timelike geodesics in spacetime. We can even extend
Newton’s first law to light and say: Light rays (photons) in vacuum move along
null geodesics of spacetime. At this point, we should also mention that it is cus-
tomary in relativity to refer to the trajectory of a particle through spacetime as
its world-line, so Newton’s first law states that a free particle moves in such a
way that its world-line corresponds to a geodesic of spacetime.
mapping is nothing more than a set of coordinates that label the different points
in M . Notice that a coordinate system on a given patch of M is not unique,
coordinate systems are in fact arbitrary.
Once we have a manifold, we can consider curves in this manifold defined as
functions from a segment of the real line into the manifold. It is important to
distinguish between the image of a curve (i.e. its trajectory), and the curve itself:
The curve is the function and contains information both about which points on
M we are traversing, and how fast we are moving with respect to the parameter.
In terms of a set of coordinates {x^α} on M , a curve is represented as

x^α = x^α(λ) ,

with λ the parameter along the curve.
Notice that a change of coordinates will alter the explicit functional form above,
but will not alter the curve itself. Working with coordinates helps to make many
concepts explicit, but we should always be careful not to confuse the coordinate
representation of a geometric object with the object itself.
Vectors are defined as derivative operators along a given curve. The precise
definition is somewhat abstract, and although this is certainly convenient from
a mathematical point of view, here we will limit ourselves to working with the
components of a vector. The components of a vector v tangent to a curve x^α(λ)
are given simply by

v^α = dx^α/dλ . (1.4.2)
Vectors are defined at a given point, and on that point they form a vector
space known as the tangent space of M (one should really think of vectors as
representing only infinitesimal displacements on M ).
Since vectors form a vector space, we can always represent them as linear
combinations of some basis vectors {e_α}, where here α is an index that identifies
the different vectors in the basis and not their components. For example, for an
arbitrary vector v we have

v = v^α e_α . (1.4.3)
A common basis choice (though certainly not the only possibility) is the so-
called coordinate basis for which we take as the basis those vectors that are
tangent to the coordinate lines, using as parameters the coordinates themselves.
It is precisely in this basis that the components of a vector are given as defined
above. From here on we will always work with the coordinate basis (but see
Chapter 8 where we will discuss the use of a non-coordinate basis known as a
tetrad frame).
Let us now consider functions of vectors on the tangent space. A linear, real-
valued function of one vector is called a one-form. We will denote one-forms
with a tilde and write the action of a one-form q̃ on a vector v as q̃(v).4 It is not
4 The name one-form comes from the calculus of differential forms and the notion of exterior
derivatives. Forms are very important in the theory of integration in manifolds. Here, however,
we will not consider the calculus of forms, and we will take “one-form” to be just a name –
difficult to show that one-forms also form a vector space of the same dimension
as that of the manifold – this is known as the dual tangent space, and for this
reason one-forms are often called co-vectors.
The components of a one-form are defined as the value of the one-form acting
on the basis vectors:
q_α := q̃(e_α) . (1.4.4)
Notice that while the components of a vector are represented with indices up,
those of a one-form have the indices down.
We can also define a basis for the space of one-forms, known as the dual basis
ω̃^α, and defined as those one-forms such that, when acting on the basis vectors,
they give ω̃^α(e_β) = δ^α_β.

T_{(αβ)} := (1/2!) (T_{αβ} + T_{βα}) , (1.4.9)

T_{[αβ]} := (1/2!) (T_{αβ} − T_{βα}) , (1.4.10)
where the round and square brackets are standard notation for the symmetric
and antisymmetric parts, respectively, and the 1/2! is a normalization factor.
Generalizations to tensors of higher rank are straightforward. The symmetries
of tensors are quite important in general relativity where many of the most
important tensors have very specific symmetry properties.
The scalar product is also required to be non-degenerate, that is, g(u, v) = 0 for all
vectors u only if v = 0 (this also implies that the components g_{αβ} form an invertible matrix).
The tensor g giving this scalar product is called the metric tensor, and allows us
to define the magnitude or norm of a vector as
|v|² := g(v, v) = v · v = g_{αβ} v^α v^β . (1.5.2)
For two nearby points separated by an infinitesimal coordinate displacement dx^α, the
metric gives the interval ds² = g_{αβ} dx^α dx^β,
which allows us to define a notion of distance between these two points. Not
all manifolds have such a metric tensor defined on them, the obvious physical
example of a manifold with no metric being the phase space of classical mechan-
ics. The metric tensor gives an extra degree of structure to a manifold. In the
particular case of special relativity we do in fact have such a notion of distance,
with the metric tensor g given by Minkowski’s interval: gαβ = ηαβ . This also
means that in special relativity we can not only calculate distances, but in fact
we can also construct the scalar product of two arbitrary vectors.
As already mentioned, the metric of special relativity is not positive definite.
In general, the signature of a metric is given by the signs of the eigenvalues of
its matrix of components. We call a positive definite metric, i.e. one with all
eigenvalues positive, Euclidean, while a metric like the one of special relativity
with signature (−, +, +, +) is called Lorentzian. For a Lorentzian metric like the
one in relativity, we classify vectors in the same way as intervals according to
the sign of their magnitude, and talk about spacelike, null and timelike vectors.
The metric tensor can be used to define a one-to-one mapping between vectors
and one-forms. Assume, for example, that we are given a vector v. If we use that
vector in one of the slots of the tensor g, we will have an object g(v, · ) that takes
one arbitrary vector u as an argument and gives us a real number, namely v · u.
That is, g(v, · ) defines a one-form. Since this one-form is associated with the
vector v, we will denote it simply by ṽ. Its components are easy to obtain from
the standard definition

v_α := ṽ(e_α) = g(v, e_α) = g_{µν} v^µ (e_α)^ν = g_{µν} v^µ δ^ν_α , (1.5.4)

and finally,

v_α = g_{αβ} v^β . (1.5.5)
Since by definition g is non-degenerate, we can invert this relation to find
v α = g αβ vβ , (1.5.6)
where g αβ are the components of the inverse matrix to gαβ , that is g αµ gµβ = δβα .
This mapping between vectors and one-forms means that we can think of them
as the same geometric object, that is, just a “vector”. The operations given
by (1.5.5) and (1.5.6) are then simply referred to as lowering and raising the
index of the vector. But it is important to stress the fact that this mapping
between vectors and one-forms can only be defined if we have a metric tensor.
In the case of Euclidean space, the metric tensor is given by the identity
matrix, which implies that the components of a vector and its associated one-
form are identical. This explains why the distinction between vectors and one-
forms is usually not even introduced for Euclidean space. In special relativity,
however, we need to be more careful as the components of a vector and its
associated one-form are no longer the same since the metric is now given by η_{αβ}:

v_0 = η_{0µ} v^µ = −v^0 ,    v_i = η_{iµ} v^µ = v^i . (1.5.7)

We see then that lowering and raising indices of vectors in special relativity
changes the sign of the time components.
Notice that once we have defined the operation of lowering indices, the dot
product between two vectors v and u can be simply calculated as

g(v, u) = g_{αβ} v^α u^β = v_α u^α , (1.5.8)

that is, the direct contraction between one of the vectors and the one-form associated
with the other one.
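These operations are easy to make concrete. The sketch below (Python with NumPy, given only as an illustration) uses the Minkowski metric ηαβ = diag(−1, 1, 1, 1) to lower indices, to compute scalar products by direct contraction as in (1.5.8), and to classify vectors by the sign of their magnitude:

import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])        # Minkowski metric, signature (-,+,+,+)

def lower(v):
    """Components of the associated one-form: v_a = eta_ab v^b."""
    return eta @ v

def dot(v, u):
    """Scalar product g(v,u) = v_a u^a."""
    return lower(v) @ u

def classify(v):
    s = dot(v, v)
    return "timelike" if s < 0 else ("null" if s == 0 else "spacelike")

t_vec = np.array([1.0, 0.0, 0.0, 0.0])      # unit timelike vector
x_vec = np.array([0.0, 1.0, 0.0, 0.0])      # unit spacelike vector
k_vec = np.array([1.0, 1.0, 0.0, 0.0])      # null vector

for v in (t_vec, x_vec, k_vec):
    print(v, "->", lower(v), classify(v))   # note the sign flip of the time component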
We can easily generalize the notion of raising and lowering indices to tensors
of arbitrary rank by simply contracting a given index with the metric tensor gαβ
or its inverse g^{αβ}. For example:

T^α{}_β = g_{βµ} T^{αµ} ,    T_{αβ} = g_{αµ} g_{βν} T^{µν} .
In this way we will think of tensors as objects of a given rank indicated by the
total number of indices, irrespective of where those indices are (but the position
of the indices is important once we assign explicit values to the components).
Notice how in this expression we have already used the components of the one-form
s_µ associated with the vector s. If we now apply the projection operator to an
arbitrary vector v and calculate its dot product with s we find

that is, the norm of a projected vector can be calculated directly with P_{µν}.
If instead of a unit spacelike vector s we consider a unit timelike vector n,
then the projection operator takes the slightly different form

P_{µν} = g_{µν} + n_µ n_ν .

In this case, for a manifold with Lorentzian signature, the induced metric is
positive definite. The metric also allows us to define the angle θ between two
vectors v and u through

v · u = |v| |u| cos θ . (1.5.14)
Notice that the angle between two vectors defined above remains invariant if we
change the metric tensor in the following way:

g_{αβ} → Φ g_{αβ} ,

with Φ some scalar function. Such a change of the metric is called a confor-
mal transformation (since it preserves angles), and the function Φ is called the
conformal factor.
The metric tensor can also be used to measure volumes and not just linear dis-
tances. Consider for a moment a two-dimensional manifold, and assume that we
want to find the area element associated with the infinitesimal coordinate square
defined by dx1 and dx2 . If the coordinate lines are orthogonal at the point consid-
ered, the area element will clearly be given by dA = |e_1| |e_2| dx^1 dx^2, with e_1 and
e_2 the corresponding basis vectors. Of course, in the general case, the coordinate
lines will not be orthogonal, but it is clear that the general expression will be
given by the formula for the area of a parallelogram, dA = |e_1| |e_2| sin θ dx^1 dx^2,
with θ the angle between e_1 and e_2. Using now the definition of the angle θ given
above we find

dA = |e_1| |e_2| sin θ dx^1 dx^2 = |e_1| |e_2| (1 − cos²θ)^{1/2} dx^1 dx^2
   = [ (|e_1| |e_2|)² − (e_1 · e_2)² ]^{1/2} dx^1 dx^2 ,

and therefore

dA = [ g_{11} g_{22} − (g_{12})² ]^{1/2} dx^1 dx^2 = [det(g)]^{1/2} dx^1 dx^2 ,

which in n dimensions generalizes to the volume element dV = |g|^{1/2} dx^1 · · · dx^n.
In the last expression we have introduced the standard convention of using simply
g to denote the determinant of the metric. Also, the absolute value is there to
allow for the possibility of having a non-positive definite metric.
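As a worked example of this formula, the following sketch (written with the SymPy computer algebra package, which is just one possible choice) evaluates [det(g)]^{1/2} for the two-dimensional metric of a sphere of radius R, with components g_11 = R², g_22 = R² sin²θ and g_12 = 0, and recovers the familiar area element R² sin θ dθ dφ:

import sympy as sp

theta, R = sp.symbols('theta R', positive=True)

# Metric of a sphere of radius R in coordinates (x^1, x^2) = (theta, phi).
g = sp.Matrix([[R**2, 0],
               [0, R**2 * sp.sin(theta)**2]])

dA = sp.sqrt(g.det())        # area element density [det(g)]^(1/2)
print(sp.simplify(dA))       # R**2 * |sin(theta)|, i.e. R**2 sin(theta) for 0 < theta < pi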
as the natural volume element. It is also only defined for orientable manifolds, and the standard
definition corresponds to a right handed basis.
Fig. 1.2: Lie dragging of vectors. The integral curves of the vector u are dragged along
the congruence associated with the vector field v an equal distance as measured by the
parameter λ.
parameter that defines the curve. We can then use this mapping to drag tensors
from the point p to the point q. This is known as Lie dragging.
For a scalar function it is easy to see how Lie dragging works: We say that
the dragged function φλ (f ) is such that its value on q is equal to f (p), that is
φλ(f)(q) = f(p). For a vector field u we define the dragging by looking at the
curve associated with u, and dragging the curve from p to q using the congruence
associated with v (see Figure 1.2). The dragged vector φλ(u) at q will be the
tangent to the new curve.
The Lie derivative of a vector field u is then defined in the following way:
Evaluate the vector at q = φλ(p), drag it back to p using the inverse map
φ_λ^{−1} = φ_{−λ}, and take the difference with the original vector at p in the
limit when λ goes to zero:

£_v u := lim_{λ→0} [ φ_{−λ}( u|_{φλ(p)} ) − u|_p ] / λ . (1.6.1)
The notation indicates that the Lie derivative depends on the vector field used
for the dragging. To find the components of the Lie derivative, let µ be the
parameter for the integral curves of the vector field u. If we drag an integral
curve of u an infinitesimal distance λ along the vector field v we will find that

φλ(x^α) = x^α + v^α λ   ⇒   φλ(u^α) = u^α + (dv^α/dµ) λ = u^α + u^β ∂_β v^α λ . (1.6.2)
£_v u^α = lim_{λ→0} [ u^α|_{φλ(p)} − λ u^β|_{φλ(p)} ∂_β v^α − u^α|_p ] / λ
        = du^α/dλ − u^β ∂_β v^α
        = v^β ∂_β u^α − u^β ∂_β v^α . (1.6.3)
£v ωα = v β ∂β ωα + ωβ ∂α v β . (1.6.6)
We can do the same for tensors of arbitrary rank, just adding one more term for
each index, with the adequate sign. For example
£v T α β = v µ ∂µ T α β − T µ β ∂µ v α + T α µ ∂β v µ . (1.6.7)
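The component formula (1.6.3) translates directly into code. The following SymPy sketch (the two vector fields are arbitrary examples chosen only for illustration) computes the Lie derivative of the radial field u = (x, y) along the rotation generator v = (−y, x) in the flat plane, and finds that it vanishes:

import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)

def lie_derivative_vector(v, u):
    """Components of the Lie derivative of u along v, eq. (1.6.3):
    (Lie_v u)^a = v^b d_b u^a - u^b d_b v^a."""
    n = len(coords)
    return [sp.simplify(sum(v[b] * sp.diff(u[a], coords[b])
                            - u[b] * sp.diff(v[a], coords[b]) for b in range(n)))
            for a in range(n)]

v = [-y, x]     # generator of rotations about the origin
u = [x, y]      # radial (dilation) vector field

print(lie_derivative_vector(v, u))    # [0, 0]: the two flows commute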
Consider now a coordinate system adapted to the vector field v, in which e_1 = v,
which implies v^α = δ^α_1. It is then easy to see that the Lie derivative of a
tensor T of arbitrary rank will simplify to

£_v T = ∂_1 T . (1.6.8)
This shows that the Lie derivative is a way to write partial derivatives along the
direction of a given vector field in a way that is independent of the coordinates.
A vector field ξ that generates a symmetry of the metric is called a Killing field;
it is such that Lie dragging the metric along it leaves the metric unchanged:

£_ξ g = 0 . (1.6.9)
From the expression for the Lie derivative of a tensor we find that this implies

£_ξ g_{αβ} = ξ^µ ∂_µ g_{αβ} + g_{µβ} ∂_α ξ^µ + g_{αµ} ∂_β ξ^µ = 0 . (1.6.10)

In a coordinate system adapted to the Killing field, such that ξ = e_1, this reduces to

∂_1 g_{αβ} = 0 , (1.6.11)
so the components of the metric tensor are in fact independent of the coordinate
x1 . We will come back to the condition for the existence of a Killing field in
Section 1.8, where we will rewrite it in a more standard way.
v^ᾱ = Λ^ᾱ_β v^β . (1.7.2)
When we transform the coordinates we clearly have also changed our coor-
dinate basis, as the new basis must refer to the new coordinates. From the fact
say that an expression is covariant if it involves only tensorial quantities, as in the covariant
derivative that we will see in the following section, or in the principle of general covariance.
Because of this, when I refer to indices I will use the form co-variant with a hyphen.
We can use this to find the form of the infinitesimal distance in spherical
coordinates, starting from the Pythagoras rule
dl 2 = dx2 + dy 2 + dz 2 . (1.7.11)
Substituting the expressions for the differentials we find that the distance be-
tween two infinitesimally close points in spherical coordinates is
dl2 = gαβ dxα dxβ = dr2 + r2 dθ2 + r2 sin2 θ dφ2 ≡ dr2 + r2 dΩ2 , (1.7.12)
where the last equality defines the element of solid angle dΩ2 . The components of
the metric tensor are now clearly non-trivial. It is also clear that in order to find
the finite distance between two points in these coordinates we need to do a non-
trivial line integration. We can, of course, also decide to use spherical coordinates
in special relativity, in which case the components of the metric will no longer
be those of the Minkowski tensor. Moreover, we can even transform coordinates
to those of an accelerated observer, which will again give us a non-trivial metric.
The distance element (1.7.12) above still represents the metric of a flat three-
dimensional space. We can use it, however, to find distances on the surface of
a sphere of radius R. It is clear that if we remain on this surface, we will have
dr = 0, so the metric reduces to
dl² = R² ( dθ² + sin²θ dφ² ) . (1.7.13)
The last expression now represents the metric of a curved two-dimensional sur-
face, namely the surface of the sphere. We then see that the fact that a metric is
non-trivial does not in general help us to distinguish between a genuinely curved
space, and a flat space in curvilinear coordinates (we will wait until Section 1.9
below to introduce properly the notion of curvature).
Notice also that in spherical coordinates not only is the metric non-trivial,
but also the basis vectors themselves become non-trivial. Consider the coordinate
basis associated with the spherical coordinates {r, θ, φ}. Written in the spherical
coordinates themselves, the components of this basis are by definition
er → (1, 0, 0) ,
eθ → (0, 1, 0) ,
eφ → (0, 0, 1) . (1.7.14)
In Cartesian coordinates {x, y, z}, on the other hand, these same basis vectors
have components

e_r → (sin θ cos φ, sin θ sin φ, cos θ) , (1.7.15)
e_θ → (r cos θ cos φ, r cos θ sin φ, −r sin θ) , (1.7.16)
e_φ → (−r sin θ sin φ, r sin θ cos φ, 0) , (1.7.17)
as can be found from the transformation law given above. Note, for example,
that not all of these basis vectors are unitary; their square-magnitudes are:
|e_r|² = (sin θ cos φ)² + (sin θ sin φ)² + (cos θ)² = 1 , (1.7.18)
|e_θ|² = (r cos θ cos φ)² + (r cos θ sin φ)² + (r sin θ)² = r² , (1.7.19)
|e_φ|² = (r sin θ sin φ)² + (r sin θ cos φ)² = r² sin²θ , (1.7.20)
where in order to calculate those magnitudes we have simply summed the squares
of their Cartesian components. As we have seen, the magnitude of a vector can
also be calculated directly from the components in spherical coordinates using
the metric tensor:
|v|² = g_{αβ} v^α v^β . (1.7.21)
It is not difficult to see that if we use the above equation and the expression
for the metric in spherical coordinates (equation (1.7.12)) we will find the same
square-magnitudes for the vectors of the spherical coordinate basis given above.
This shows that the coordinate basis vectors in general can not be expected to
be unitary. In fact, they don’t even have to be orthogonal to each other.
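This can be checked explicitly: taking the Cartesian components (1.7.15)–(1.7.17) of the coordinate basis vectors and computing their mutual scalar products with the flat Euclidean metric reproduces the metric components of (1.7.12). A short SymPy sketch (given purely as an illustration; any computer algebra system would do) performs the check:

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

# Cartesian components of the spherical coordinate basis, eqs. (1.7.15)-(1.7.17).
e_r  = sp.Matrix([sp.sin(th)*sp.cos(ph), sp.sin(th)*sp.sin(ph), sp.cos(th)])
e_th = sp.Matrix([r*sp.cos(th)*sp.cos(ph), r*sp.cos(th)*sp.sin(ph), -r*sp.sin(th)])
e_ph = sp.Matrix([-r*sp.sin(th)*sp.sin(ph), r*sp.sin(th)*sp.cos(ph), 0])

basis = [e_r, e_th, e_ph]

# g_ab = e_a . e_b, using the Euclidean dot product of the Cartesian components.
g = sp.Matrix(3, 3, lambda a, b: sp.simplify(basis[a].dot(basis[b])))
print(g)    # diag(1, r**2, r**2*sin(theta)**2), i.e. the metric of eq. (1.7.12)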
of a product, it must reduce to the standard partial derivative for scalar func-
tions, and it must be symmetric in the sense that for a scalar function f we get
∇α ∇β f = ∇β ∇α f .8
Once we have an operator ∇ we can consider, for example, the derivative of
a vector with respect to a given coordinate:
∇_α v = ∇_α ( v^β e_β ) = ( ∇_α v^β ) e_β + v^β ( ∇_α e_β )
      = ( ∂_α v^β ) e_β + v^β ( ∇_α e_β ) , (1.8.1)
where in the last step we used the fact that the components v β are scalar func-
tions. This equation shows that the derivative of a vector is more than just the
derivative of its components. We must also take into account the change in the
basis vectors themselves.
Now, if we choose a fixed direction x^α, the derivative ∇_α e_β must itself also
be a vector, since it represents the change in the basis vector along that
direction. This means that it can be expressed as a linear combination of the
basis vectors themselves. We introduce the symbols Γ^µ_{αβ} to denote the coefficients
of such a linear combination:

∇_α e_β = Γ^µ_{αβ} e_µ . (1.8.2)
The Γµαβ are called the connection coefficients, as they allow us to map vectors
at different points in order to take their derivatives.
Using the above definition and rearranging indices we finally find that
∇_α v = ( ∂_α v^β + v^µ Γ^β_{αµ} ) e_β , (1.8.3)
∇α v β = ∂α v β + v µ Γβαµ . (1.8.4)
This is called the covariant derivative, and is also commonly denoted by a semi-
colon ∇α v β ≡ v β ;α (in an analogous way the partial derivative is often denoted
by a comma ∂α v β ≡ v β ,α ). We can use the above results to show that the
covariant derivative of a one-form p̃ takes the form
∇α pβ = ∂α pβ − pµ Γµαβ . (1.8.5)
And if we now take our one-form to be the gradient of a scalar function we can
find that the symmetry requirement reduces to

Γ^µ_{αβ} = Γ^µ_{βα} . (1.8.6)
It is important to mention the fact that this is only true for a coordinate basis
– for a non-coordinate basis the connection coefficients are generally not sym-
metric even if the derivative operator itself is symmetric. This is because, for a
8 The last requirement defines a manifold with no torsion. This requirement can in fact be
lifted, but we will only consider the case of zero torsion here.
∂v ∂u f − ∂u ∂v f = v µ ∂µ (uν ∂ν f ) − uµ ∂µ (v ν ∂ν f )
= (v µ ∂µ uν − uµ ∂µ v ν ) ∂ν f
= 0 . (1.8.7)
T µν ;α = ∂α T µν + Γµαβ T βν + Γναβ T µβ ,
Tµν;α = ∂α Tµν − Γβαµ Tβν − Γβαν Tµβ ,
T µ ν;α = ∂α T µ ν + Γµαβ T β ν − Γβαν T µ β .
uβ ∇β v α = 0 . (1.8.8)
u^β ∇_β ( v · w ) = 0 , (1.8.9)

whenever

u^β ∇_β v^α = u^β ∇_β w^α = 0 . (1.8.10)
From the definition of the scalar product we can easily show that this requirement
reduces to
∇α gµν = 0 . (1.8.11)
That is, in order for parallel transport to preserve the scalar product the
covariant derivative of the metric must vanish (which in particular means that
the operation of raising and lowering indices commutes with the covariant deriva-
tives). Using now the general expression for the covariant derivative of a tensor
we find that this last condition implies that the connection coefficients must be
given in terms of the metric components as
Γ^α_{βγ} = (g^{αµ}/2) ( ∂g_{βµ}/∂x^γ + ∂g_{γµ}/∂x^β − ∂g_{βγ}/∂x^µ ) . (1.8.12)
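Equation (1.8.12) is straightforward to implement in a computer algebra system. As an illustration (a SymPy sketch; the two-dimensional sphere metric (1.7.13) is used only as an example), the following computes all the Christoffel symbols of the sphere:

import sympy as sp

th, ph, R = sp.symbols('theta phi R', positive=True)
coords = (th, ph)
n = len(coords)

# Metric of a sphere of radius R, eq. (1.7.13).
g = sp.Matrix([[R**2, 0],
               [0, R**2 * sp.sin(th)**2]])
g_inv = g.inv()

# Christoffel symbols Gamma^a_bc from eq. (1.8.12).
Gamma = [[[sp.simplify(sum(g_inv[a, m] * (sp.diff(g[b, m], coords[c])
                                          + sp.diff(g[c, m], coords[b])
                                          - sp.diff(g[b, c], coords[m]))
                           for m in range(n)) / 2)
           for c in range(n)] for b in range(n)] for a in range(n)]

for a in range(n):
    for b in range(n):
        for c in range(n):
            if Gamma[a][b][c] != 0:
                print(f"Gamma^{coords[a]}_{coords[b]}{coords[c]} =", Gamma[a][b][c])

# Only Gamma^theta_phiphi = -sin(theta)cos(theta) and
# Gamma^phi_thetaphi = Gamma^phi_phitheta = cos(theta)/sin(theta) are non-zero.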
v β ∇β v α = 0 . (1.8.13)
d²x^α/dλ² + Γ^α_{βγ} (dx^β/dλ) (dx^γ/dλ) = 0 , (1.8.14)
where λ is the parameter associated with the curve.9 Clearly, in Euclidean space
with Cartesian coordinates this equation simply reduces to v α = constant, but in
spherical coordinates the equation is not that simple because not all the Christof-
fel symbols vanish. As we have seen, geodesics are important as they are the paths
followed by objects free of external forces in special relativity. We will see later
that they also play a crucial role in general relativity.
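As a simple numerical illustration of the geodesic equation (a plain Python/NumPy sketch; the initial data, step size and integrator are arbitrary choices), we can integrate (1.8.14) on the unit sphere using the Christoffel symbols found above, Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cos θ / sin θ. A curve that starts on the equator moving along it stays on the equator, as expected for a great circle:

import numpy as np

def rhs(state):
    """Geodesic equation on the unit sphere as a first order system
    for (theta, phi, dtheta/dlambda, dphi/dlambda)."""
    th, ph, dth, dph = state
    d2th = np.sin(th) * np.cos(th) * dph**2              # -Gamma^th_phph (dph)^2
    d2ph = -2.0 * (np.cos(th) / np.sin(th)) * dth * dph  # -2 Gamma^ph_thph dth dph
    return np.array([dth, dph, d2th, d2ph])

def rk4_step(state, h):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Start on the equator (theta = pi/2) moving purely in the phi direction.
state = np.array([np.pi / 2.0, 0.0, 0.0, 1.0])
h = 0.01
for _ in range(int(2.0 * np.pi / h)):    # integrate for one full revolution
    state = rk4_step(state, h)

print(state)   # theta remains pi/2 (a great circle), phi has advanced by about 2*pi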
Geodesics have another very important property. We can in fact show that
they are extremal curves, i.e. curves that extremize the distance between nearby
points on the manifold (they are not always curves of minimum length since the
metric need not be positive definite, as in the case of special relativity).
9 Strictly speaking, the geodesic equation only has this form when we use a so-called affine
parameter, for which we ask not only that the tangent vector remains parallel to itself but
also that it has constant magnitude. But here we will take the point of view that a geodesic is
a curve and not just its image. We will then say that if we don’t have an affine parameter we
in fact don’t have a geodesic, but rather a different curve with the same image.
this is not the case. Looking at their definition in equation (1.8.2), we can see
that the connection coefficients map a vector v onto a new vector v^β ∇_β e_α, so
they are in fact coefficients of (1,1) tensors, one such tensor for each basis vector
e_α. This becomes more clear if we write the coefficients as (Γ_α)^µ_β, where the
index α identifies the tensor, and the indices {µ, β} identify the components of
that tensor.
indeed components of tensors, only not the tensors we might at first think about.
However, this fact is not very useful in practice since when we change coordinates
we are also changing the basis vectors and hence the tensors associated with the
connection coefficients.
Because of all this we can clearly not expect the Γ^µ_{αβ} to transform as components
of a (1,2) tensor, and indeed they don’t. The transformation law for the
connection coefficients is more complicated and takes the form:
Γµ̄ᾱβ̄ = Λµ̄ν Λγᾱ Λδβ̄ Γνγδ + ∂γ xµ̄ ∂ᾱ ∂β̄ xγ . (1.8.15)
Notice that the first term is precisely what we would expect from the compo-
nents of a tensor, but there is a second term that involves second derivatives of
the coordinate transformation. Another way in which we can understand that
the Christoffel symbols can not transform as tensors is precisely the fact that in
Cartesian coordinates they vanish. And if we have a tensor that has all its com-
ponents equal to zero in a given coordinate system, then the tensor itself must
be zero and its components in any other coordinate system must also vanish.
The transformation law (1.8.15) is in fact independent of any relation be-
tween the connection coefficients and the metric, i.e. it is valid for any derivative
operator ∇. An important consequence of this is the following: Consider the con-
nection coefficients associated with two different derivative operators ∇ and ∇̂,
and define ∆^µ_{αβ} := Γ^µ_{αβ} − Γ̂^µ_{αβ}. It turns out that ∆^µ_{αβ} transforms as a tensor since
the term involving second derivatives of the coordinate transformation in (1.8.15)
cancels out (it is independent of the Γs). In other words, ∆µαβ are the components
of a properly defined tensor. Similarly, assume that we have a manifold with a
given coordinate system and consider two different metric tensors gµν and ĝµν
defined on it. The difference between the Christoffel symbols associated with the
different metrics also transforms as a tensor. As we will see in later Chapters,
this observation has some interesting applications in numerical relativity.
Before finishing this section, there is an important fact about the relation be-
tween covariant derivatives and Lie derivatives that deserves mention. Consider,
for example, the commutator of two vectors but written in terms of covariant
derivatives instead of partial derivatives:
v^β ∇_β u^α − u^β ∇_β v^α = v^β ( ∂_β u^α + Γ^α_{βγ} u^γ ) − u^β ( ∂_β v^α + Γ^α_{βγ} v^γ )
                          = v^β ∂_β u^α − u^β ∂_β v^α = £_v u^α . (1.8.16)
We see that all contributions from the Christoffel symbols in the covariant deriva-
tives have canceled out. In fact, it is easy to convince oneself that this happens
for the Lie derivative of any tensor, i.e. the Lie derivatives can be written
equally well in terms of partial or covariant derivatives.
We can use this last observation to rewrite the condition for the existence of
a Killing field, equation (1.6.10), in the following way

ξ^µ ∇_µ g_{αβ} + g_{µβ} ∇_α ξ^µ + g_{αµ} ∇_β ξ^µ = 0 . (1.8.17)

Using the fact that the covariant derivative of the metric vanishes (so the metric
can move inside the covariant derivatives), we find that this reduces to

∇_α ξ_β + ∇_β ξ_α = 0 . (1.8.18)

This is known as the Killing equation. It is interesting to notice that the condition
for the existence of a Killing vector field is more compact when written in terms
of the co-variant components of ξ instead of the contra-variant ones.
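As a simple check of the Killing equation (a SymPy sketch; the example is flat two-dimensional Euclidean space in Cartesian coordinates, where the Christoffel symbols vanish and covariant derivatives reduce to partial derivatives), the generator of rotations ξ = (−y, x) satisfies ∇_α ξ_β + ∇_β ξ_α = 0, while a generic vector field does not:

import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)
g = sp.eye(2)      # flat Euclidean metric in Cartesian coordinates

def killing_expression(xi_up):
    """nabla_a xi_b + nabla_b xi_a; in Cartesian coordinates of flat space
    the connection vanishes, so covariant derivatives are partial derivatives."""
    xi_dn = g * sp.Matrix(xi_up)      # lower the index
    return sp.Matrix(2, 2, lambda a, b:
                     sp.diff(xi_dn[b], coords[a]) + sp.diff(xi_dn[a], coords[b]))

print(killing_expression([-y, x]))    # zero matrix: the rotation generator is a Killing field
print(killing_expression([x, 0]))     # non-zero (0,0) component: not a Killing field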
1.9 Curvature
As we have seen, the metric tensor is not in itself the most convenient way of
describing a curved manifold as it can become quite non-trivial even in Euclidean
space by considering curvilinear coordinates. The correct way to differentiate
between flat and curved manifolds is by considering what happens to a vector as it
is parallel transported around a closed circuit on the manifold. On a flat manifold,
the vector does not change when this is done, while on a curved manifold it does.
This can be clearly seen if we think about moving a vector on the surface of the
Earth. Assume that we start on the equator with a vector pointing east. We
move north following a meridian all the way up to the north pole, then we come
back south along another meridian that is 90 degrees to the east of the first one
until we get back to the equator. Finally, we move back to our starting point
following the equator itself. If we do this we will find that our vector now points
south. That is, although on a curved manifold parallel transport defines a local
notion of parallelism, there is in fact no global notion of parallelism.
In order to define a tensor that is associated with the curvature of a man-
ifold we must consider the parallel transport of a vector along an infinitesimal
closed circuit. If we take this closed circuit as one defined by the coordinate lines
themselves (see Figure 1.3), we can show that the change of the components of
a vector v as it is parallel transported along this circuit is given by
Fig. 1.3: Parallel transport of a vector around a closed infinitesimal circuit formed
by coordinate lines. On a flat manifold the vector will not change when this is done,
while on a curved manifold it will. The measure of the change is given by the Riemann
curvature tensor.
where Rα βµν are the components of the Riemann curvature tensor, which are
given in terms of the Christoffel symbols as10
R^α_{βµν} := ∂_µ Γ^α_{βν} − ∂_ν Γ^α_{βµ} + Γ^α_{ρµ} Γ^ρ_{βν} − Γ^α_{ρν} Γ^ρ_{βµ} . (1.9.2)
10 The overall sign in the definition of the Riemann tensor is a matter of convention and
changes from one text to another. The sign used here is the same as that of MTW [206].
that is, the fully co-variant Riemann tensor is antisymmetric on the first and
second pair of indices, and symmetric with respect to exchange of these two
pairs. Also, it obeys a cyclic relation on the last three indices of the form

R_{αβµν} + R_{αµνβ} + R_{ανβµ} = 0 . (1.9.5)
Using the anti-symmetry with respect to exchange of the last two indices, we
find that this cyclic relation can also be expressed as
Rα [βµν] = 0 , (1.9.6)
so the completely antisymmetric part of Riemann with respect to the last three
indices vanishes.
The symmetries of the Riemann tensor imply that in the end it has only
n2 (n2 − 1)/12 independent components in n dimensions, that is 20 independent
components in four dimensions. These symmetries also imply that the trace (i.e.
the contraction) of the Riemann tensor over its first and last pairs of indices
vanishes. On the other hand, the trace over the first and third indices does not
vanish and is used to define the Ricci curvature tensor:

R_{µν} := R^α_{µαν} . (1.9.7)
The Ricci tensor is clearly symmetric in its two indices, and in four dimensions
has 10 independent components. Notice that in four dimensions having the Ricci
tensor vanish does not mean that the manifold is flat, as the remaining 10 com-
ponents of Riemann can still be different from zero. However, in three dimensions
it does happen that the vanishing of the Ricci tensor implies that the Riemann
is zero, as in that case both these tensors have only six independent components
and Riemann turns out to be given directly in terms of the Ricci tensor.
Finally, the Ricci scalar, also known as the scalar curvature, is defined as the
trace of the Ricci tensor itself
R := Rµ µ . (1.9.8)
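As a concrete check of these definitions, the following SymPy sketch (an illustration using once more the sphere metric (1.7.13)) builds the Riemann tensor from (1.9.2), contracts it over its first and third indices to obtain the Ricci tensor, and recovers the constant scalar curvature 2/R² of a sphere of radius R:

import sympy as sp

th, ph, R = sp.symbols('theta phi R', positive=True)
coords = (th, ph)
n = len(coords)

g = sp.Matrix([[R**2, 0], [0, R**2 * sp.sin(th)**2]])   # sphere metric, eq. (1.7.13)
g_inv = g.inv()

# Christoffel symbols, eq. (1.8.12).
Gam = [[[sum(g_inv[a, m] * (sp.diff(g[b, m], coords[c]) + sp.diff(g[c, m], coords[b])
             - sp.diff(g[b, c], coords[m])) for m in range(n)) / 2
         for c in range(n)] for b in range(n)] for a in range(n)]

def riemann(a, b, c, d):
    """R^a_bcd as defined in eq. (1.9.2)."""
    expr = sp.diff(Gam[a][b][d], coords[c]) - sp.diff(Gam[a][b][c], coords[d])
    expr += sum(Gam[a][r][c] * Gam[r][b][d] - Gam[a][r][d] * Gam[r][b][c]
                for r in range(n))
    return sp.simplify(expr)

# Ricci tensor R_bd = R^a_bad and Ricci scalar R = g^bd R_bd.
ricci = sp.Matrix(n, n, lambda b, d: sum(riemann(a, b, a, d) for a in range(n)))
ricci_scalar = sp.simplify(sum(g_inv[b, d] * ricci[b, d]
                               for b in range(n) for d in range(n)))

print(ricci.applyfunc(sp.simplify))   # Matrix([[1, 0], [0, sin(theta)**2]])
print(ricci_scalar)                   # 2/R**2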
An important result regarding the form of the metric and the Christoffel
symbols is the fact that any differentiable manifold with a metric is locally flat
in the sense that at any given point on the manifold we can always find a set
of coordinates such that the metric tensor becomes diagonal with ±1 elements,
and all the Christoffel symbols vanish at that point. This means that we can
always choose orthogonal coordinates at the given point p and guarantee that
such coordinates will remain orthogonal in the immediate vicinity of that point
so that ∂µ gαβ |p = 0. What we can not do is guarantee that all second derivatives
of the metric will vanish because there are in general fewer third derivatives of the
coordinate transformation that can be freely specified than second derivatives of
the metric tensor itself. In four dimensions, for example, we have 80 independent
third derivatives of the coordinate transformation ∂µ ∂ν ∂σ xᾱ , while there are
100 second derivatives of the metric ∂α ∂β gµν that we would need to cancel.
This leaves 20 degrees of freedom in the second derivatives of the metric which
correspond precisely with the 20 independent components of the Riemann tensor.
or equivalently
Rα β[µν;λ] = 0 . (1.10.2)
These relations are known as the Bianchi identities, and play a very important
role in general relativity. One of their most important consequences comes from
contracting them twice, which results in
Gµν ;ν = 0 , (1.10.3)
notice that the geodesics of a curved spacetime need not look straight when
projected into three-dimensional space. Notice also that gravity is not considered
a force here – a particle in the presence of a gravitational field with no other
forces acting on it is considered free. This simple law is enough to describe the
motion of the planets around the Sun, for example.
It is usual to parameterize a geodesic using the particle’s own proper time τ ,
so the equation of motion for a free particle in spacetime takes the form11
d²x^α/dτ² + Γ^α_{βγ} (dx^β/dτ) (dx^γ/dτ) = 0 . (1.11.1)
The geodesic equation of motion can also be written as
uµ ∇µ uα = 0 , (1.11.2)
where u is just the tangent vector to the geodesic curve: u^α = dx^α/dτ. This vector
is known as the 4-velocity of the particle, and is the relativistic generalization of
the Newtonian concept of velocity. It is important to notice that the 4-velocity is
by construction a unit timelike vector,

u · u = g_{αβ} u^α u^β = −1 . (1.11.3)

The 4-momentum of a particle of rest mass m is defined as

p := m u , (1.11.4)

and the energy of the particle as measured by an observer with 4-velocity v is given by

E := −v · p = −v^α p_α . (1.11.5)
In the particular case when the observer is static in a given coordinate system
its 4-velocity will have components v^α = (1/√(−g_00), 0, 0, 0), which is easy to see
from the fact that we must have v · v = −1, so that for this static observer
E = −p_0/√(−g_00). In special relativity this reduces to E = mγ, with γ the Lorentz
factor associated with the motion of the particle (or, in SI units, E = mγc²).
Notice that since on a curved spacetime there is no global notion of parallelism
and no preferred family of observers, in general a given observer can only define
an energy for a particle at his own location but not for a particle far away. In
11 Of course, this doesn’t work for a light ray for which proper time is always zero, so in the
implies that the time components of co-variant vectors have the opposite sign to what one
would expect.
1.11 GENERAL RELATIVITY 31
special relativity, however, given a specific inertial frame such a preferred family
of observers does exist, and the energy of a particle on that inertial frame can
therefore be globally defined.
There are, however, some special situations when one can define a useful
notion of energy and other conserved quantities in a global sense on a curved
manifold. Let us assume that we have a manifold with some symmetry. As we
have seen, associated with that symmetry there will be a Killing vector satisfying
equation (1.8.18). If we now take
u to be the 4-velocity of a free particle, or more
generally the tangent vector to a geodesic curve, we will have
uµ ∇µ ξ
·
u = uµ ∇µ (ξν uν ) = ξν uµ ∇µ uν + uν uµ ∇µ ξν = uµ uν ∇µ ξν , (1.11.6)
where we have used the fact that the curve is a geodesic. If we now notice that
the Killing equation implies that ∇µ ξν is antisymmetric so that uµ uν ∇µ ξν = 0,
we finally find
uµ ∇µ ξ
·
u = 0 . (1.11.7)
13 One might find confusing the fact that the conserved energy on a static spacetime is
√
E = −p0 , while the energy measured locally by a static observer is instead E = −p0 / −g00 .
The reason for the difference is that the energy measured locally only corresponds to kinetic
plus rest mass energy without the potential energy contribution. This can be seen more clearly
in the case of weak gravitational fields and small velocities (see Section 1.14 below), for which
g00 −1 − 2φ, 2
√ with φ 1 the Newtonian potential, and where we find E m + mv /2 + mφ
and E = E/ −g00 m + mv2 /2 (neglecting contributions quadratic in φ and v).
32 BRIEF REVIEW OF GENERAL RELATIVITY
∂t T 00 + ∂i T 0i = 0 .
∇β T αβ = 0 . (1.12.5)
Notice that this equation includes the conservation of energy for α = 0, and
the conservation of momentum for α = i. This conservation law for energy and
momentum is a fundamental requirement of any physical theory, so we must
demand the stress-energy tensor of any matter field to satisfy it. As we will see
below, it also plays a crucial role in the field equations of general relativity.
There is another important comment to be made about equation (1.12.5).
Strictly speaking this equation only leads to true conservation of energy, in the
sense that the change of energy in a finite region equals the integrated flux of
energy into that region, when we have a flat spacetime. For a curved spacetime
strict conservation no longer holds as the gravitational field can do work and
change the energy and momentum of the matter (the conservation law (1.12.5)
actually represents local conservation as seen by freely falling observers). We
might think that the obvious thing to do would be to include the stress-energy
tensor for the gravitational field itself in order to recover full conservation of
energy, but it turns out that in general relativity no local expression for the
energy of the gravitational field exists.
The conservation laws can in fact also be derived from a variational principle.
This can be done in the case when we have a matter field that can be specified by
a Lagrangian, that is a scalar function L that depends only on the field variables,
their derivatives, and the metric coefficients. Assume that our system is described
by a series of field variables φa , where here a is just a label that numbers the
different fields, and not a spacetime index. The Lagrangian will then be a scalar
function such that L = L(φa , ∂α φa , gαβ ). We define the action as the integral of
the Lagrangian over an (open) region Ω of our manifold
S (Ω) = L dV , (1.12.6)
Ω
with dV = |g| dn x the volume element. We then assume that the dynami-
cal equations are obtained from a least action variational principle, that is, we
demand that the action is stationary with respect to variation in the fields. No-
tice, in particular, that the metric gαβ must be considered a dynamical field
in this variation. Variation with respect to the field variables φa will give us
the so-called field equations, that is the dynamical equations for the field under
consideration, while variation with respect to the metric gives us an equation
that represents the equilibrium between the geometry and the fields, or in other
words, the conservation laws.
34 BRIEF REVIEW OF GENERAL RELATIVITY
A general result from linear algebra tells us that δ|g| = |g| g αβ δgαβ , so the last
equation can be rewritten as
δg S = T αβ δgαβ dV , (1.12.8)
Ω
∂L g αβ
T αβ := + L
∂gαβ 2
∂L g αβ
= −g αµ g βν + L. (1.12.9)
∂g µν 2
It is usually more convenient to rewrite this definition of the stress-energy tensor
in terms of its co-variant components. The expression then becomes
∂L gαβ
Tαβ := − + L. (1.12.10)
∂g αβ 2
Asking now for the action to be stationary with respect to the variation, i.e. ask-
ing for δg S = 0, can be shown to directly imply the conservation laws (1.12.5).14
There is an important comment to be made here. Instead of working with the
Lagrangian L as the basic dynamical function, we might choose to absorb the
volume element and work with the Lagrangian density L := |g| L. In terms
of the Lagrangian density the expression for the stress-energy tensor is more
compact, and takes the simple form:
1 ∂L
Tαβ = − . (1.12.11)
|g| ∂g αβ
Both expressions for Tαβ are entirely equivalent, but in practice it is faster and
more transparent to use equation (1.12.10) directly to find the stress-energy
tensor of a field for which we have a Lagrangian L.
14 The proof of this involves assuming that the field equations are satisfied, and then consid-
ering an arbitrary change of coordinates inside the region Ω. By making use of the fact that
this change of coordinates can not affect the action integral, and looking at the change in the
form of the integrand itself, the conservation laws follow.
1.12 MATTER AND THE STRESS-ENERGY TENSOR 35
The electric field E and magnetic field B are then defined through
⎛ ⎞
0 Ex Ey Ez
⎜ −Ex 0 Bz −By ⎟
F αβ = ⎜
⎝ −Ey −Bz 0 Bx ⎠ .
⎟ (1.12.15)
−Ez By −Bx 0
This shows that the electric and magnetic fields are in fact not two separate
3-vectors, but rather the six independent components of an antisymmetric rank
2 tensor. In particular, under Lorentz transformations the components of the
electric and magnetic fields mix together. The Lagrangian for the electromagnetic
field turns out to be
1 αβ 1 αµ βν
L=− F Fαβ = − g g Fµν Fαβ , (1.12.16)
8π 8π
which is not surprising since, apart from normalization factors, it is the only
scalar we can form that is quadratic in the field. From this Lagrangian we find
the following stress-energy tensor:
1 gαβ µν
Tαβ = Fαµ Fβ µ − F Fµν . (1.12.17)
4π 4
We can readily verify that this gives the standard expressions for the energy
density and momentum density (Poynting vector) for the electromagnetic field.
36 BRIEF REVIEW OF GENERAL RELATIVITY
A Lagrangian can in fact also be defined for a perfect fluid, but the derivation
of the corresponding stress-energy tensor from such a Lagrangian is considerably
less transparent than in the cases of the Klein-Gordon and electromagnetic fields,
so we will not consider it here.
The conservation laws (1.12.5) are often enough to find the dynamical equa-
tions for the system under consideration. For example, in the case of a perfect
fluid they reduce to the first law of thermodynamics (conservation of energy)
and the relativistic Euler equations (conservation of momentum). In the case of
a scalar field, the conservation laws imply directly the Klein–Gordon equation:
2φ − m2 φ = 0 , (1.12.18)
where 2 := g µν
∇µ ∇ν is the d’Alambertian operator.
theory this symbol should be understood to mean just the standard three-dimensional gradient,
and not a covariant derivative.
1.13 THE EINSTEIN FIELD EQUATIONS 37
We then see that, in flat space, geodesics do not deviate from each other, or in
other words parallel straight lines remain parallel (which is Euclid’s famous fifth
postulate). In a curved space, however, geodesic lines deviate from each other.
Comparing the Newtonian and relativistic expressions for the tidal accel-
eration we see that Rα µβν uµ uν plays the role of ∂i ∂j φ. On the other hand,
the energy density measured by an observer following the geodesic is given by
ρ = Tµν uµ uν , which suggests that the relativistic version of Newton’s field equa-
tion ∇2 φ = 4πρ should have the form Rα µαν uµ uν = 4πTµν uµ uν , and since the
vector uα is arbitrary this reduces to Rµν = 4πTµν . The last equation has the
structure we would expect, as it relates second derivatives of the metric (the
gravitational “potential”) with the energy density, just as in Newton’s theory.
This field equation was in fact considered by Einstein, but he quickly realized
that it can’t be correct since the conservation laws ∇µ T µν = 0 would imply that
∇µ Rµν = 0, which imposes a serious restriction in the allowed geometries of
spacetime.
The solution to the problem of the compatibility of the field equations and
the conservation laws comes from considering the contracted Bianchi identities.
In particular, these identities imply that the Einstein tensor (1.10.4) has zero
divergence, so a consistent version of the field equations would be
1
Gµν ≡ Rµν − gµν R = 8πTµν , (1.13.3)
2
where the factor of 8π is there in order to recover the correct Newtonian limit.
These are the field equations of general relativity, or in short Einstein’s equations.
Note that they are in fact ten equations, as the indices µ and ν take values from
0 to 3, and both the Einstein and stress-energy tensors are symmetric. Notice
also that, as written above, the field equations imply the conservation laws, as
∇µ T µν = 0 follows from pure geometric consistency.
It is sometimes useful to rewrite the field equations in the equivalent form
1
Rµν = 8π Tµν − gµν T , (1.13.4)
2
with T := T µ µ the trace of the stress-energy tensor. This equation can be ob-
tained by simply taking the trace of the original field equations and noticing
that they imply R = −8πT (the trace of the metric is always equal to the total
number of dimensions of the manifold, which in the case of spacetime is 4).
38 BRIEF REVIEW OF GENERAL RELATIVITY
The Einstein equations just introduced could not appear to be more simple.
This simplicity, however, is only apparent since each term is a short-hand for
considerably more complex objects. Written in their most general form, in an
arbitrary coordinate system and with all terms expanded out, the Einstein equa-
tions become a system of ten coupled, non-linear, second order partial differential
equations with thousands of terms.
In the case of vacuum the stress-energy tensor vanishes, and Einstein’s equa-
tions reduce to:
Gµν = 0 , (1.13.5)
or equivalently,
Rµν = 0 . (1.13.6)
Note that, as mentioned before, the fact that the Ricci tensor vanishes does not
imply that spacetime is flat. This is as it should be, since we know that the
gravitational field of an object extends beyond the object itself, which means
that the curvature of spacetime in the empty space around a massive object
can not be zero. The Einstein equations in vacuum have another important
consequence, they describe the way in which the gravitational field propagates in
empty space and, in an analogous way to electromagnetism, predict the existence
of gravitational waves: perturbations in the gravitational field that propagate at
the speed of light. The prediction of the existence of gravitational waves tells
us that, in Einstein’s theory, gravitational interactions do not propagate at an
infinite speed, but rather at the speed of light (see Section 1.14 below).
The field equations (1.13.3) can also be derived from a variational principle,
we only need to introduce a Lagrangian for the gravitational field itself. Since
gravity is associated with the Riemann tensor, and the Lagrangian must be a
scalar, the only possibility is to take the Ricci scalar as the Lagrangian:
LG = R . (1.13.7)
There are two important final comments to be made about the Einstein field
equations. The first has to do with the relation between the field equations and
the conservation laws. Since the conservation laws are a consequence of the field
equations, and they in turn contain all the dynamical information in many cases,
it turns out that often we do not need to postulate the equations of motion for the
matter as they can be derived directly from the field equations. In particular, for
1.14 WEAK FIELDS AND GRAVITATIONAL WAVES 39
a perfect fluid with zero pressure (known as dust in relativity), the field equations
predict that the flow lines should be timelike geodesics. Also, it can be shown
that a small test body with weak self-gravity will necessarily move on a geodesic.
So there is really no need to postulate geodesic motion of test particles in general
relativity – it is a consequence of the field equations.
The second comment has to do with the meaning of a solution to the Ein-
stein equations. Notice that we have ten second-order differential equations for
the ten components of the metric tensor gµν , so naively we might expect that,
given a matter distribution and appropriate boundary conditions, the full met-
ric tensor would be completely determined. But this can not be since the gµν
are components of a tensor in some specific coordinate system. If we change
the coordinates the components of the metric tensor will change, but the phys-
ical content of the solution can not depend on the choice of coordinates. This
means that a “solution” of the field equations should really be understood as
the equivalence class of all four-dimensional metrics related to each other by a
change of coordinates. Since there are four coordinates, there are four arbitrary
degrees of freedom in the metric components, so the field equations should only
fix the remaining six components. But this is precisely the content of the Bianchi
identities ∇ν Gµν = 0 – they provide us with four differential relations between
the ten field equations, so there are in fact only six independent equations. In
general there is no “natural” way of separating the six independent components
of gµν in a clear way. However, such a separation can be achieved by choosing a
specific set of coordinates such as the one used in the 3+1 formulation that we
will study in Chapter 2.
g µν = η µν − hµν . (1.14.3)
The linearized field equations are now easy to derive. We can show that to
linear order in hµν the Riemann tensor turns out to be
1
Rαβµν = (∂β ∂µ hαν + ∂α ∂ν hβµ − ∂β ∂ν hαµ − ∂α ∂µ hβν ) , (1.14.4)
2
and the Ricci tensor becomes
1
Rµν = ∂ α ∂(µ hν)α − (∂µ ∂ν h + ∂α ∂ α hµν ) , (1.14.5)
2
with h := hαα the trace of hµν , and ∂ ≡ η
α αβ
∂β . If we define the trace reversed
field tensor h̄µν := hµν − ηµν h/2, the Einstein tensor turns out to be
1
Gµν = ∂ α ∂(µ h̄ν)α − ∂α ∂ α h̄µν + ηµν ∂ α ∂ β h̄αβ , (1.14.6)
2
and the linearized field equations become
1
∂ α ∂(µ h̄ν)α − ∂α ∂ α h̄µν + ηµν ∂ α ∂ β h̄αβ = 8πTµν . (1.14.7)
2
These equations, however, are still more complicated than they need to be.
In order to simplify them further, consider for a moment a small, but arbitrary,
change of coordinates of the form xµ̄ = xµ + ξ µ , where ξ
is a small vector in the
sense that |∂ν ξ µ | 1. The Jacobian matrix will then be given by
or in terms of h̄µν :
h̄µν → h̄µν − 2∂(µ ξν) + ηµν ∂α ξ α . (1.14.12)
This is entirely analogous to the electromagnetic case, where we find that the
electromagnetic field Fαβ (and hence the physics) remains unaffected by trans-
formations in the potential of the form Aµ → Aµ + ∂µ f , for an arbitrary scalar
function f . In the case of gravity the word gauge is perhaps even more appro-
priate as we are in fact dealing with a change in the conventions for measuring
distances, i.e. the coordinates.
We can now use the gauge freedom to simplify the equations by choosing a
vector ξ
that solves the equation
∂α ∂ α ξ β = ∂α h̄αβ . (1.14.13)
That this equation can always be solved is easy to see from the fact that it is
nothing more than the wave equation for ξ β with a source given by ∂α h̄αβ . If we
do this we find that the gauge transformation implies that
That is, we can always find a gauge such that h̄µν has zero divergence. Such a
gauge is called the Lorentz gauge, a name taken from the analogous condition in
electromagnetism ∂µ Aµ = 0.
If we assume that we are in the Lorentz gauge we will have
∂ν h̄νµ = 0 , (1.14.15)
where now 2 stands for the d’Alambertian operator in flat space, or in other
words the wave operator. These are the field equations for a weak gravitational
field in their standard form.
42 BRIEF REVIEW OF GENERAL RELATIVITY
Moreover, small velocities also imply that time derivatives in the d’Alambertian
operator are much smaller than spatial derivatives, so the equation simplifies to
which is nothing more than the wave equation for waves propagating at the speed
of light (c = 1). So we find that perturbations in the gravitational field behave
16 The Newtonian gravitational field at the surface of the Sun in geometric units is φ ∼ 10−6 ,
as waves that propagate at the speed of light, i.e. the field equations predict the
existence of gravitational waves. The simplest solution to the above equation are
plane gravitational waves of the form
with Aµν the amplitude tensor of the waves and k µ the wave vector. Substituting
this into the wave equation we immediately find that
η αβ kα kβ = kα k α = 0 , (1.14.22)
that is, the wave vector must be null, which is just another way of saying that the
waves propagate at the speed of light. We must remember, however, that the field
equations only take the simplified form (1.14.20) in the Lorentz gauge (1.14.15),
which in this case implies that
Aαβ kβ = 0 , (1.14.23)
that is, the amplitude tensor Aµν must be orthogonal to the wave vector.
At this point we might think that gravitational waves have six independent
degrees of freedom: The ten independent components of the symmetric tensor
Aαβ , minus the four constraints imposing the Aαβ must be orthogonal to k α . This
is, however, not true since there is still considerable gauge freedom left within
the Lorentz gauge. The reason is that when we impose the Lorentz gauge by
choosing a specific gauge transformation vector ξ α , we are in fact only restricting
the value of 2ξ α , so we can add to this ξ α any new vector ξ˜α such that 2ξ˜α = 0
without changing anything. In particular, we can take ξ˜α = B α exp(ikβ xβ ) for
any arbitrary constant vector B α . We then have four extra gauge degrees of
freedom, which reduces the independent components of Aµν to only two. In fact,
we can always choose a B α such that two further conditions are imposed on Aµν
Aµ µ = 0 , Aµν uν = 0 , (1.14.24)
direction of propagation to be along the z axis, then all these conditions imply
that Aµν has the form
⎛ ⎞
0 0 0 0
⎜ 0 A+ A× 0 ⎟
Aµν =⎜ ⎟
⎝ 0 A× −A+ 0 ⎠ , (1.14.26)
0 0 0 0
1
Rµxνx = − ∂µ ∂ν hTxxT , (1.14.29)
2
1
Rµyνy = − ∂µ ∂ν hTyyT , (1.14.30)
2
1
Rµxνy = − ∂µ ∂ν hTxyT , (1.14.31)
2
where here µ and ν only take the values (t, z). This shows that gravitational
waves are not just gauge effects since they produce non-zero curvature.
1.14 WEAK FIELDS AND GRAVITATIONAL WAVES 45
+ polarization
× polarization
x
Fig. 1.4: Effect of gravitational waves on a ring of free particles. In the case of the +
polarization the ring oscillates by being elongated and compressed along the x and y
direction, while for a × polarization the elongations and compressions are along the
diagonal directions.
Consider now the equation for geodesic deviation (1.13.2). For two nearby
particles initially at rest with separation vector z µ this becomes
or in other words
1 TT x k2 + x
ax = ḧxx z + ḧTxyT z y = − 0 A z + A× z y exp(ikα xα ) , (1.14.33)
2 2
1 k 2 × x
ay = ḧTxyT z x + ḧTyyT z y = − 0 A z − A+ z y exp(ikα xα ) . (1.14.34)
2 2
From this we see that gravitational waves have two independent polarizations.
The first one corresponds to A+
= 0 and A× = 0 and is called the + polarization,
while the second one corresponds to A+ = 0 and A×
= 0 and is called the ×
polarization. We find that under the effect of a passing gravitational wave, the
relative position of nearby particles changes even if the particles’ coordinates do
not. If we have a ring of free particles floating in space, and a gravitational wave
moving along the perpendicular direction to the ring passes by, a + polarization
will cause the ring to oscillate by being alternately elongated and compressed
along the x and y directions, while a wave with × polarization will produce
elongations and compressions along the diagonal directions (see Figure 1.4).
The effect that gravitational waves have on a ring of free particles can be
used to build a gravitational wave detector. Two basic types of detectors have
been considered. The first type are resonant bars, essentially large cylindrical
aluminum bars that have longitudinal modes of vibration with frequencies close
to those of the expected gravitational waves coming from astrophysical sources.
A passing gravitational wave should excite those vibrational modes and make
46 BRIEF REVIEW OF GENERAL RELATIVITY
the bars resonate. The first of these bars were constructed by Joseph Weber
in the 1960s, and several modern versions that work at cryogenic temperatures
are still in use today. The second type of detectors are laser interferometers,
where we track the separation of freely suspended masses using highly accurate
interferometry. The first prototype interferometers where build in the 1980s,
and today several advanced kilometer-scale interferometric detectors are either
already working or in the last stages of development: The LIGO project in the
United States, the VIRGO and GEO 600 projects in Europe, and the TAMA
project in Japan.
To this day, there has been no unambiguous detection of gravitational waves,
though this might change in the next few years as the new interferometers slowly
reach their design sensitivities. The main reason why gravitational waves have
not been observed yet has to do with the fact that gravity is by far the weakest of
all known interactions. Estimates of the amplitudes of gravitational waves coming
from violent astrophysical events such as the collisions of neutron stars or black
holes from distances as far away as the Virgo Cluster put the dimensionless
amplitude of the waves as they reach Earth at the level of A ∼ 10−21 . In other
words, a ring of particles one meter in diameter would be deformed a distance
of roughly one millionth of the size of a proton (the effect scales with the size
of the system, which explains why the large interferometric detectors have arms
with lengths of several kilometers). Detecting such small distortions is clearly a
huge challenge. However, with modern technology such a detection has become
not only possible but even likely before this decade is out.
The prediction of the gravitational wave signal coming from highly relativistic
gravitational systems is one of the most important applications of numerical
relativity, and because of this we will come back to the subject of gravitational
waves in later Chapters.
where for simplicity we have dropped the prime from r . In the general case we
have therefore only two unknown functions to determine, f (r) and h(r). The
coordinates (r, θ, φ) are known as Schwarzschild coordinates, and in particular
the radial coordinate is called the areal radius since in this case the area of the
1.15 THE SCHWARZSCHILD SOLUTION AND BLACK HOLES 47
sphere is always given by 4πr2 .17 Equation (1.15.2) for the metric allows for
a huge simplification in the field equations: Instead of having to solve for 10
independent metric components we only need to find two functions of r.
The next step is to substitute the metric (1.15.2) into the Einstein field
equations in order to find f and h. Since we are interested in the exterior field,
we must use the vacuum field equations Rµν = 0. Doing this we find
1 −1/2 d −1/2 df 1 df
0 = Rtt = (f h) (f h) + , (1.15.3)
2 dr dr rf h dr
1 d df 1 dh
0 = Rrr = − (f h)−1/2 (f h)−1/2 + 2 , (1.15.4)
2 dr dr rh dr
1 df 1 dh 1 1
0 = Rθθ = Rφφ = − + 2
+ 2 1− , (1.15.5)
2rf h dr 2rh dr r h
with all other components of Rµν equal to zero. The first two equations imply
d ln f d ln h
+ =0, (1.15.6)
dr dr
which can be trivially integrated to find f = K/h, with K some constant. With-
out loss of generality we can take K = 1, as this only requires rescaling the time
coordinate. Substituting this into the third equation we find
df 1−f d
− + =0 ⇒ (rf ) = 1 , (1.15.7)
dr r dr
whose solution is
f = 1 + C/r , (1.15.8)
with C another constant. The metric then takes the form
−1
C C
ds2 = − 1 + dt2 + 1 + dr2 + r2 dΩ2 . (1.15.9)
r r
The value of the constant C can be obtained by comparing with the Newto-
nian limit (1.14.19), which should correspond to r >> 1. Taking the Newtonian
potential to be φ = −M/r we find C = −2M , which implies
−1
2M 2M
ds2 = − 1 − dt2 + 1 − dr2 + r2 dΩ2 . (1.15.10)
r r
17 Notice, however, that there is no reason why the distance to the origin of the coordinate
system r = 0 should be equal to r. In fact, there is no reason why the point r = 0 should be
part of the manifold at all. For example, consider the geometry of a parabola of revolution.
The areal radius in this case will measure the circumference of the surface at a given place,
but there is clearly no point with r = 0 on the surface. In fact, the areal radius is not always
well behaved. Think of a two-dimensional surface of revolution that resembles a bottle with a
narrow throat (this is called a bag of gold geometry). It is clear that as we go from inside the
bottle towards the throat and then out of the bottle, the areal radius first becomes smaller
and then larger again, i.e. it is not monotone so it is not a good coordinate.
48 BRIEF REVIEW OF GENERAL RELATIVITY
Notice that none of these components are singular at r = 2M , but they are all
singular at r = 0. This means that the gravitational field is singular at r = 0
(as we would expect for a point particle), but perfectly regular at r = 2M . The
only possibility is therefore that at r = 2M something goes wrong with the
Schwarzschild coordinates.
There is something else to notice about Schwarzschild’s solution. For r < 2M
it turns out that the coordinates r and t change roles: r becomes a timelike
coordinate (grr < 0), while t becomes spacelike (gtt > 0). This implies that once
an object has crossed r = 2M , the advance of time becomes equivalent with a
decrease in r, that is, the object must continue toward smaller values of r for
the same reason that time must flow to the future. As nothing can stop the flow
of time, there is no force in the Universe capable of preventing the object from
reaching r = 0, where it will find infinite tidal forces (the Riemann is singular)
that will destroy it. The Schwarzschild radius represents a surface of no return:
Further out it is always possible to escape the gravitational field, but move closer
than r = 2M and the fall all the way down to r = 0 becomes inevitable.
Since the regularity of Riemann implies that the problem at r = 2M must
be caused by a poor choice of coordinates, we can ask the question of whether
or not a better set of coordinates exists to study the Schwarzschild spacetime.
In fact, we can construct several coordinate systems that are regular at r = 2M .
One of the first are the so-called Eddington–Finkelstein coordinates, discovered
1.15 THE SCHWARZSCHILD SOLUTION AND BLACK HOLES 49
which implies
dt = ±(1 − 2M/r)−1 dr . (1.15.16)
This equation can be easily integrated to find
r∗ := r + 2M ln (r/2M − 1) . (1.15.18)
In these coordinates, radially ingoing null lines turn out to have constant coor-
dinate speed dr/dt̃ = −1, just as in Minkowski’s spacetime. The outgoing null
lines, on the other hand, have a speed given by dr/dt̃ = (1 − 2M/r)/(1 + 2M/r),
which implies that their speed is less than 1 for r > 2M , it is zero for r = 2M
(r = 2M is in fact the trajectory of an “outgoing” null geodesic), and becomes
negative for r < 2M (i.e. the “outgoing” null lines move in instead of out). Since
all null geodesics move in for r < 2M , we must conclude that this region can have
no causal influence on the outside, as no physical interaction can travel faster
50 BRIEF REVIEW OF GENERAL RELATIVITY
“outgoing” photons
time
r=0
light-cones
ingoing photon
r=2M
singularity radius
than light and light itself can not escape. These properties of the Schwarzschild
spacetime in Kerr–Schild coordinates are shown in Figure 1.5.
Eddington–Finkelstein coordinates, or the closely related Kerr–Schild version,
behave much better than Schwarzschild coordinates, but they have a major flaw:
The time symmetry of the original solution is now gone, as ingoing and outgoing
geodesics do not behave in the same way. It is in fact possible to construct
coordinates of Eddington–Finkelstein type using the null coordinate Ũ := t − r∗
instead of Ṽ := t+r∗ , and in that case we finds that r = 2M becomes a barrier for
ingoing null lines instead of outgoing ones. We then distinguish between ingoing
and outgoing Eddington–Finkelstein coordinates.
We can construct a regular coordinate system that preserves the time sym-
metry by going from the {t, r} Schwarzschild coordinates to the purely null
coordinates {Ũ , Ṽ }. The metric then becomes
There is still a problem with this metric at r = 2M since the first metric coeffi-
cient vanishes, but this is easy to solve. Define ũ := −e−Ũ/4M and ṽ := +e+Ṽ /4M .
A change of coordinates from {Ũ , Ṽ } to {ũ, ṽ} takes the metric into the form
ds2 = − 32M 3 /r e−r/2M dũ dṽ + r2 dΩ2 . (1.15.22)
This expression is manifestly regular for all r > 0. Notice that here r still mea-
sures areas, but it is now a function of ũ and ṽ. The coordinates {ũ, ṽ} are well
behaved, but being null they are not easy to visualize. However, we can use them
to construct new spacelike and timelike coordinates by defining
1.15 THE SCHWARZSCHILD SOLUTION AND BLACK HOLES 51
The allowed range for {η, ξ} is given by r > 0, which implies η 2 < ξ 2 + 1.
In Kruskal–Szekeres coordinates the radial null lines turn out to be given
by dξ/dη = 1, i.e. they are lines at 45 degrees on the spacetime diagram. This
means that light-cones (and the causal relationships tied to them) behave just as
they do in flat space. Figure 1.6 shows a diagram of the metric (1.15.24). When
looking at this diagram one must remember that only {η, ξ} are represented, the
angular coordinates have been suppressed so that every point in the diagram in
fact corresponds to a sphere.
There are several important things to learn from the Kruskal diagram and the
relations (1.15.25). First, notice that lines of constant r correspond to hyperbolae
in the diagram, vertical for r > 2M and horizontal for r < 2M . This means that
a line of constant r is timelike for r > 2M , and hence an allowed trajectory for a
particle, but becomes spacelike for r < 2M , so objects can not stay at constant r
there. The degenerate hyperbola r = 2M is in fact a null line. The two branches
of the hyperbola at r = 0 mark the boundary of spacetime since there is a
physical singularity there. On the other hand, lines of constant Schwarzschild
time t correspond to straight lines through the origin in the diagram. Infinite
time t = ±∞ corresponds to the lines at 45 degrees and coincides with r = 2M .
We clearly see here the problem with Schwarzschild coordinates: At r = 2M
they collapse the full line into a single coordinate point.
Even more interesting is the fact that the lines r = 2M separate the spacetime
into four regions. In region I we have r > 2M so it is an exterior region, while
region II has r < 2M and is clearly the interior. An object that moves from
region I to region II can never get out again and must reach the singularity
at r = 0 sometime in its future. Since neither light nor any physical influence
can leave region II this region in called a black hole. The line at r = 2M that
separates the black hole from the exterior is called the black hole horizon.
There is a very important relationship between the area of the horizon and
the mass of the Schwarzschild spacetime. Notice that, by the very definition of
52 BRIEF REVIEW OF GENERAL RELATIVITY
2M
t <
ns 2M
co >
d r= nst
0 co
r= r=
rity
gu
la ')
in =
ures (t
fut 2M
r=
on t = const
riz
ho
pa
st
sin
gu
lar
ity
r=
0
the areal radius, the area of a sphere of radius r is given by A = 4πr2 . At the
horizon we have r = 2M , so that
M 2 = AH /16π . (1.15.26)
Let us consider now regions III and IV. Notice first that region IV is equivalent
to II but inverted in time: The singularity is in the past, and nothing can enter
region IV from the outside. This region is called a white hole. Finally, region III
is clearly also an exterior region, but it is completely disconnected from region
I: It is another exterior region, or in other words, another universe.
To understand more clearly the relationship between the two exterior regions
consider the surface t = 0 (the horizontal line in the diagram). If we approach
the origin from region I, we see that r becomes smaller and smaller until it
reaches a minimum value of r = 2M , then as we penetrate into region III it
starts growing again. The resulting geometry is known as an Einstein–Rosen
bridge or a wormhole: two asymptotically flat regions joined through a narrow
tunnel. There are two important things to mention about the Einstein–Rosen
bridge. First, it is impossible to traverse it since objects moving from region I
must always remain inside their light-cones, so if they attempt to reach region
III they will find themselves first in region II and will end up in the future
1.16 BLACK HOLES WITH CHARGE AND ANGULAR MOMENTUM 53
At the risk of boring the reader with yet another coordinate system for the
Schwarzschild spacetime, I will introduce one last set of coordinates that is fre-
quently used in numerical relativity. It turns out that it is possible to rewrite the
metric in such a way that the spatial part is conformally flat, that is, the spatial
metric is just the Minkowski metric times a scalar function. In order to do this
we must define a new radial coordinate r̃ such that
2
r = r̃ (1 + M/2r̃) . (1.15.27)
Finally, in 1965 the charged rotating black hole solution, that contains all other
previous solutions as special cases, was found by Newman et al. [217]. The metric
for this so-called Kerr–Newman black hole is given by
∆ − a2 sin2 θ 2a sin2 θ r2 + a2 − ∆
ds = −
2
dt −
2
dtdφ
ρ2 ρ2
2 2 !
r + a2 − ∆ a2 sin2 θ 2 2 ρ2 2
+ sin θ dφ + dr + ρ2 dθ2 , (1.16.1)
ρ2 ∆
where
∆ = r2 + a2 + Q2 − 2M r , ρ2 = r2 + a2 cos2 θ , (1.16.2)
and with a, M and Q free parameters. For a = Q = 0 the metric reduces to that
of Schwarzschild’s spacetime, for a = 0 and Q
= 0 it reduces to the Reissner–
Nordstrom solution, while for Q = 0 and a
= 0 it reduces to the Kerr solution.
The Kerr–Newman spacetime is clearly stationary (i.e. the metric coefficients are
time independent), and axisymmetric (the metric is independent of the angle φ),
but the spacetime is not time-symmetric for a
= 0.
Though here we will not derive equation (1.16.1), the reader can verify that
for Q = 0 this is indeed a vacuum solution of Einstein’s equations. For Q
= 0,
however, the metric (1.16.1) is not a vacuum solution and corresponds instead to
a solution of the Einstein–Maxwell equations, i.e. the Einstein field equations in
the presence of an electromagnetic field. The potential one-form for this solution
is given by
rQ
Aµ = − 2 1, 0, 0, a sin2 θ . (1.16.3)
ρ
The parameters {a, M, Q} have clear physical interpretations. The parameter
Q is interpreted as the total electric charge of the black hole. To see this, consider
for a moment the case a = 0, then the electromagnetic potential reduces to
Now, since the metric is asymptotically flat, far away we can make the identifica-
tion Aµ = (ϕ, Ai ), with ϕ the electrostatic potential and Ai the magnetic vector
potential. This implies A0 = −ϕ, and comparing with the above equation we
find ϕ = Q/r. This clearly shows that far away observers will measure Q to be
the electric charge of the black hole. The calculation also follows for a
= 0 but
it becomes more involved since in that case we have a non-zero magnetic field.
For the other two parameters, we can use the results about global measures
of mass and momentum discussed in Appendix A to see that M corresponds to
the total mass of the spacetime, while the total angular momentum is given by
J = aM (so that a is the angular momentum per unit mass).
1.16 BLACK HOLES WITH CHARGE AND ANGULAR MOMENTUM 55
The Kerr–Newman metric has some other interesting properties. For example,
it becomes singular when either ∆ = 0 or ρ2 = 0. The singularity at ρ2 = 0 can
be shown to be a true curvature singularity and corresponds to
r2 + a2 cos2 θ = 0 , (1.16.5)
which implies cos θ = 0, so that the singularity lies on the equatorial plane, and
r = 0 which naively would seem to imply that it is just a point at the origin
(but it would still be rather strange to have a singularity only on the equatorial
plane). In fact, the singularity has the structure of a ring, a fact that can be seen
more clearly if we take the case M = Q = 0, a
= 0, for which we find √ that the
circumference of a ring in the equatorial plane is given by C = 2π r2 + a2 , so
that r = 0 corresponds to a ring of radius a (in fact, in this case the spacetime
is Minkowski in spheroidal coordinates).
The singularity at ∆ = 0, on the other hand, can be shown to be only a
coordinate singularity (the curvature tensor is regular there) and corresponds to
r2 + a2 + Q2 − 2M r = 0 ⇒ r=M± M 2 − Q 2 − a2 . (1.16.6)
In the general case we then have two different coordinate singularities corre-
sponding to two distinct horizons. For the case a = Q = 0 the exterior horizon
coincides with the Schwarzschild radius r = 2M , while the interior horizon col-
lapses to r = 0. Notice that for M 2 < a2 + Q2 there are no horizons, and the
geometry describes a naked singularity (i.e. a singularity unprotected by a hori-
zon), a situation which is considered to be unphysical (see the following Section).
The limiting case M 2 = a2 + Q2 is called an extremal black hole.
Finally, there is another interesting surface in the Kerr–Newman geometry.
Notice that for a static observer the metric component g00 changes sign when
corresponding to
r=M+ M 2 − Q2 − a2 cos2 θ . (1.16.8)
This means that for smaller values of r an observer can not remain static since
that would be a spacelike trajectory. Instead, inside this surface an observer with
fixed r and θ must in fact rotate around the black hole in the same direction
as the hole spins. This effect is known as the dragging of inertial frames, and it
shows that to some extent general relativity does incorporate Mach’s principle
(local inertial properties are severely affected close to a rotating black hole).
This static limiting surface lies everywhere outside the horizon, except at the
poles where it coincides with it. The region between the static limiting surface
and the horizon is called the ergosphere, as we can show that inside this region
56 BRIEF REVIEW OF GENERAL RELATIVITY
it is possible to extract rotational energy from the black hole via the so-called
Penrose process.18
The relationship between the area and the radius of the exterior horizon for
a Kerr–Newman black hole is not as simple as in Schwarzschild. In this case we
find that the area of the exterior horizon is
2
AH = 4π r+ + a2 = 4π 2M 2 − Q2 + 2M M 2 − Q2 − a2 . (1.16.9)
AH 4πJ 2
M2 = + . (1.16.10)
16π AH
Since the rotational energy of the black
hole can be extracted, it is usual to
define the irreducible mass as MI := AH /16π, so that the above expression
becomes M 2 = MI2 + J 2 /4MI2 .
t̃ = Ṽ − r , (1.16.11)
x = sin θ r cos φ − a sin φ̃ , (1.16.12)
y = sin θ r sin φ + a cos φ̃ , (1.16.13)
z = r cos θ , (1.16.14)
with
r 2 + a2 a
dṼ = dt + dr , dφ̃ = dφ + dr . (1.16.15)
∆ ∆
In terms of Kerr–Schild coordinates {t̃, x, y, z}, the metric of the Kerr–Newman
spacetime takes the particularly simple form
ing one part into the black hole and allowing the other to escape to infinity carrying part of
the rotational energy of the hole.
1.17 CAUSAL STRUCTURE, SINGULARITIES AND BLACK HOLES 57
x2 + y 2 z2
2 2
+ 2 =1. (1.16.18)
r +a r
There are several interesting things to notice about the metric (1.16.16). First,
this form of the metric is now clearly not singular at the horizon. Second, for
a = Q = 0 it reduces precisely to the Kerr–Schild metric for the Schwarzschild
spacetime (1.16.16). Finally, the one-form lµ defined above turns out to be null
with respect to the Minkowski metric, that is, η µν lµ lν = 0. In particular, this
means that the inverse metric g µν can be written as
with l∗µ := η µν lν .
19 This corresponds to the idea of causal determinism in classical mechanics: If we know the
physical state of the whole Universe now, we can predict completely the future and the past.
1.17 CAUSAL STRUCTURE, SINGULARITIES AND BLACK HOLES 59
about generic solutions to the Einstein field equations unless something is said
first about the properties of the matter that is the source of the gravitational field.
Notice that any spacetime can be considered a “solution” of Einstein’s equations
by simply defining the corresponding stress-energy tensor as Tµν := Gµν /8π, but
doing this we would generally find a stress-energy tensor that corresponds to a
completely nonsensical form of matter. There are a series of conditions that can
be imposed on the stress-energy tensor that are satisfied by all known forms
of matter. These so-called energy conditions are not physical laws as such, but
they are rather assumptions about how any reasonable form of matter should
behave.20 The three most common energy conditions are the following:
1. Weak energy condition: The energy density seen by all observers should be
non-negative. That is, Tµν uµ uν ≥ 0 for any unit timelike vector uµ .
2. Strong energy condition: The energy density plus the sum of the principal
pressures must be non-negative (for a perfect fluid ρ+ 3p ≥ 0). In covariant
terms this condition is stated as Tµν uµ uν + T /2 ≥ 0, with uµ an arbitrary
unit timelike vector and T ≡ T µ µ . The field equations imply that this
condition is equivalent to Rµν uµ uν ≥ 0.
3. Null energy condition: The energy density plus any of the principal pres-
sures must be non-negative (for a perfect fluid ρ+p ≥ 0). In covariant terms
this takes the form Tµν k µ k ν ≥ 0 for any null vector k µ , which through the
field equations is equivalent to Rµν k µ k ν ≥ 0.
The strong and weak energy conditions are independent of each other, but
they both imply the null energy condition.
In order to see the relevance of the energy conditions in the context of gravi-
tational collapse; we will start by considering the two-dimensional boundary S of
a closed region in a three-dimensional spatial hypersurface Σ. Let sµ be the unit
spacelike vector orthogonal to S in Σ, and nµ the timelike unit vector orthogonal
to Σ. Define now
√ √
lµ := (nµ + sµ ) / 2 , k µ := (nµ − sµ ) / 2 . (1.17.1)
The vectors lµ and k µ are clearly null, and correspond to the tangent vectors to
the congruence of outgoing and ingoing null geodesics through S. The projection
operator onto the surface S is given by
20 Energy conditions, however, are notoriously problematic when dealing with quantum fields,
as such fields often violate them. Worse, even quite reasonable looking classical scalar fields can
violate all energy conditions (see e.g. [47]). The issue of the range of applicability and general
relevance of the energy conditions remains a largely open problem.
60 BRIEF REVIEW OF GENERAL RELATIVITY
where the second term projects an arbitrary vector onto Σ, and the third projects
it onto S once it is in Σ. Define now the tensor κµν as the projection of the
covariant derivatives of the null vector lµ :21
The expansion θ measures the increase in the separation of the null geodesics as
they leave the surface. If the expansion is negative everywhere on S, then we say
that the surface is trapped, and if it is zero everywhere we say that the surface
is marginally trapped. Physically this means that if light rays are sent outward
from a trapped surface, then a moment later the volume enclosed by them will
be smaller than the initial volume, i.e they are all getting closer. Notice that
in flat space there can be no trapped surfaces (the outgoing light rays must be
separating somewhere for any closed two-dimensional surface), but in the case
of Schwarzschild’s spacetime we find that all spheres with r < 2M are in fact
trapped (“outgoing” light rays actually fall in).
A crucial result of Penrose states that in a globally hyperbolic spacetime that
satisfies the null energy condition, the existence of a trapped surface implies that
a singularity will form in the future (for a formal proof see [161] or [295]). In other
words, if a trapped surface ever develops in the dynamical evolution of a strong
gravitational field, then gravitational collapse to a singularity is inevitable.
The previous result establishes the fact that singularities are inevitable in
gravitational collapse, however, it does not guarantee that a black hole will form
as the singularity can be naked, i.e. not protected by the presence of a horizon.
If a horizon is absent then the singularity can be causally connected to the rest
of the Universe which will mark the breakdown of predictability, as in principle
anything could come out of a singularity (the field equations make no sense
there). In other words the presence of a naked singularity would imply that
the spacetime is not be globally hyperbolic. It turns out that in most cases
studied so far naked singularities do not develop, which has given rise to the
cosmic censorship conjecture which basically says that all physically reasonable
spacetimes are globally hyperbolic, and that apart from a possible initial Big
Bang type singularity all other singularities are hidden inside black hole horizons.
The cosmic censorship conjecture has not been proven except in some special
cases, and there are in fact a number of counterexamples to it, but all of them are
non-generic in the sense that naked singularities only seem to occur for finely
tuned sets of initial data. Still, the issue has not been settled and there are
21 The tensor κ
µν corresponds to the extrinsic curvature of the null hypersurface generated
by the outgoing null geodesics from S. See Chapter 2 for a more detailed discussion of the
extrinsic curvature.
1.17 CAUSAL STRUCTURE, SINGULARITIES AND BLACK HOLES 61
There is still one final issue to discuss related to the formal definition of a
black hole. The presence of a black hole must imply that there is a horizon, that
is, a boundary that separates a region where light can escape to infinity from a
region where it can not. The precise definition of a black hole involves the notion
of conformal infinity. This notion is very important in the modern theoretical
framework of general relativity, but its precise definition would require a full
Chapter on its own so here we will just mention some of the basic ideas. We
start by considering a conformal transformation of our physical spacetime M to
an unphysical, or conformal, spacetime M̄ such that
i+
J + J +
i0 i0
J < J <
i<
Fig. 1.7: Conformal diagram for Minkowski spacetime. The angular coordinates are
suppressed, so that each point on the diagram represents a sphere in spacetime. The
exception is the point i0 which is really only one point even though it appears twice in
the diagram (the correct picture is obtained by wrapping the diagram around a cylinder
in such a way that the two points i0 touch).
i+ singularity i+
II
J + J +
on
riz
ho
i0 III I i0
ho
riz
J < IV J <
on
The event horizon marks the true boundary of a black hole, but it is clear
that, if one wishes to locate an event horizon (assuming there is one), we must
know the entire history of spacetime in order to be able to decide which outgoing
null lines escape to infinity and which do not. The event horizon is therefore a
non-local notion.
When we are considering the dynamical evolution of a spacetime from some
initial data as is done in numerical relativity, it is convenient to have a more local
criteria that can be used to determine the presence of a black hole at any given
point in time. This is achieved by the notion of an apparent horizon, which is
defined as the outermost marginally trapped surface on the spacetime. A crucial
property of apparent horizons is the fact that if the cosmic censorship conjecture
holds, and the null energy condition is satisfied, then the presence of an apparent
horizon implies the existence of an event horizon that lies outside, or coincides
with, the apparent horizon. In the static Schwarzschild spacetime the apparent
and event horizons in fact coincide, and we can talk simply of the “horizon”.
2
THE 3+1 FORMALISM
2.1 Introduction
The Einstein field equations for the gravitational field described in the previous
Chapter are written in a fully covariant way, where there is no clear distinction
between space and time. This form of the equations is quite natural from the
point of view of differential geometry, and has important implications in our
understanding of the relationship between space and time. However, there are
situations when we would like to recover a more intuitive picture where we can
think of the dynamical evolution of the gravitational field in “time”. For example,
we could be interested in finding the future evolution of the gravitational field
associated with an astrophysical system given some appropriate initial data. On
the other hand, we might also be interested in studying gravity as a field theory
similar to electrodynamics, and define a Hamiltonian formulation that can be
used, for example, as the starting point for a study of the quantization of the
gravitational field.
There exist several different approaches to the problem of separating the
Einstein field equations in a way that allows us to give certain initial data, and
from there obtain the subsequent evolution of the gravitational field. Specific
formalisms differ in the way in which this separation is carried out. Here we
will concentrate on the 3+1 formalism, where we split spacetime into three-
dimensional space on the one hand, and time on the other. The 3+1 formalism
is the most commonly used in numerical relativity, but it is certainly not the
only one. The two main alternatives to the 3+1 approach are known as the
characteristic formalism where spacetime is separated into light-cones emanating
from a central timelike world-tube, and the conformal formalism where we use
hyperboloidal slices that are everywhere spacelike but intersect asymptotic null
infinity, plus a conformal transformation that brings the boundary of spacetime
to a finite distance in coordinate space. Both these alternatives will be discussed
briefly in Section 2.9.
We should also mention yet another approach that is based on evolving the
full four-dimensional spacetime metric directly by simply expanding out the Ein-
stein equations is some adequate coordinate system. Indeed, this was the original
approach taken by Hahn and Lindquist in their pioneering work on numerical
relativity [158], and has also been recently used by Pretorius with considerable
success in the context of the collision of two orbiting black holes [231]. The dif-
ferent formalisms have advantages and disadvantages depending on the specific
physical system under consideration.
64
2.2 3+1 SPLIT OF SPACETIME 65
Y3 t3
Y2 t2
Y1 t1
In the following sections I will introduce the 3+1 formalism of general rela-
tivity. The discussion found here can be seen in more detail in [206] and [305].
xi – `idt t + dt
xi
_ dt
xi t
Fig. 2.2: Two adjacent spacelike hypersurfaces. The figure shows the definitions of the
lapse function α and the shift vector β i .
two hypersurfaces can be determined from the following three basic ingredients
(see Figure 2.2):
• The three-dimensional metric γij (i, j = 1, 2, 3) that measures proper dis-
tances within the hypersurface itself:
dτ = α(t, xi ) dt . (2.2.2)
23 The notation for lapse and shift used here is common, but certainly not universal. A
frequently used alternative is to denote the lapse function by N , and the shift vector by N i .
2.2 3+1 SPLIT OF SPACETIME 67
In terms of the functions {α, β i , γij }, the metric of spacetime can be easily
seen to take the following form:
ds2 = −α2 + βi β i dt2 + 2βi dtdxi + γij dxi dxj , (2.2.4)
where we have defined βi := γij β j (from here on we will assume that indices of
purely spatial tensors are raised and lowered with the spatial metric γij ). The
last equation is known as the 3+1 split of the metric.
More explicitly we have:
−α2 + βk β k βi
gµν = , (2.2.5)
βj γij
−1/α2 β i /α2
g µν = . (2.2.6)
β j /α2 γ ij − β i β j /α2
From the above expressions we can also show that the four-dimensional volume
element in 3+1 language turns out to be given by
√ √
−g = α γ , (2.2.7)
with g and γ the determinants of gµν and γij respectively.
Consider now the unit normal vector nµ to the spatial hypersurfaces. It is
not difficult to show that, in the coordinate system just introduced, this vector
has components given by
nµ = 1/α, −β i /α , nµ = (−α, 0) . (2.2.8)
Note that this unit normal vector corresponds by definition to the 4-velocity of
the Eulerian observers.
We can use the normal vector nµ to introduce the 3+1 quantities in a more
formal way that is not tied up with the choice of a coordinate system adapted
to the foliation. The spatial metric γij is simply defined as the metric induced
on each hypersurface Σ by the full spacetime metric gµν :
γµν = gµν + nµ nν . (2.2.9)
Notice that written in this way the spatial metric is a full four-dimensional tensor,
but when written in the adapted coordinates its time components become trivial.
Also, the last expression shows that the spatial metric is nothing more than the
projection operator onto the spatial hypersurfaces.
Consider now our global time function t associated with the foliation. The
lapse function is defined as
−1/2
α = (−∇t · ∇t) . (2.2.10)
(The vector ∇t is clearly timelike because the level sets of t are spacelike). The
unit normal vector to the hypersurfaces can then be expressed in terms of α and
t as
nµ = −α∇µ t , (2.2.11)
where the minus sign is there to guarantee that
n is future pointing.
68 THE 3+1 FORMALISM
For the definition of the shift vector we start by introducing three scalar
functions β i such that when we move from a given hypersurface to the next
following the normal direction, the change in the spatial coordinates is given as
before by
xit+dt = xit − β i dt , (2.2.12)
from which we can easily find
β i = −α
n · ∇xi , (2.2.13)
Thus defined, the β i are scalars, but we can use them to define a 4-vector β
tµ := αnµ + β µ . (2.2.14)
The vector
t is nothing more than the tangent vector to the time lines, i.e. the
lines of constant spatial coordinates. Notice that, in general, we have tµ
= ∇µ t.
From the above definition we find that
t is such that tµ nµ = −α, which implies
tµ ∇µ t = 1 . (2.2.15)
We then find that the shift is nothing more than the projection of
t onto the
spatial hypersurface
βµ := γµν tν . (2.2.16)
From this we see that we can introduce the shift vector in a completely
coordinate-independent way by first choosing a vector field
t satisfying (2.2.15),
and then defining the shift through (2.2.16). It is important to stress the fact
that the vector field
t does not need to be timelike – it can easily be null or even
spacelike (we will later see that this situation frequently arises in the case of
black hole spacetimes). All we need to ask is that
t is not tangent to the spatial
hypersurfaces, and that it points to the future, which is precisely the content of
equation (2.2.15). It might seem strange to allow
t to be spacelike, since that
would correspond to a faster than light motion of the coordinate lines, that is, a
superluminal or tachionic shift. But we must remember that it is not a physical
effect but only the coordinate lines that are moving “faster than light”, and the
coordinates can be chosen freely.
Fig. 2.3: The extrinsic curvature tensor is defined as a measure of the change of the normal vector under parallel transport.
The intrinsic curvature of the hypersurfaces is given by the three-dimensional Riemann tensor defined in terms of the 3-metric γij. The extrinsic curvature, on the other hand, is defined in terms of what happens to the normal vector n as it is parallel-transported from one point in the hypersurface to another. In general, we will find that as we parallel transport this vector to a nearby point, the new vector will not be normal to the hypersurface anymore. The extrinsic curvature tensor Kαβ is a measure of the change of the normal vector under such parallel transport (see Figure 2.3).
In order to define the extrinsic curvature, we need to introduce the projection operator onto the spatial hypersurfaces,
P^\alpha_\beta := \delta^\alpha_\beta + n^\alpha n_\beta , (2.3.1)
which as we have seen is in fact nothing more than the induced spatial metric, P_{\alpha\beta} = \gamma_{\alpha\beta}. Using this projection operator, the extrinsic curvature tensor is defined as
K_{\mu\nu} := -P^\alpha_\mu \nabla_\alpha n_\nu = -\left( \nabla_\mu n_\nu + n_\mu n^\alpha \nabla_\alpha n_\nu \right) . (2.3.2)
As defined above, the tensor Kµν is clearly a purely spatial tensor, that is, n^µ K_{µν} = n^ν K_{µν} = 0. This means, in particular, that in a coordinate system adapted to the foliation we will have K^{00} = K^{0i} = 0 (though in general we find that K_{00} and K_{0i} are not zero). Because of this, we will usually only consider the spatial components K_{ij}. Moreover, the tensor Kµν also turns out to be symmetric:
K_{\mu\nu} = K_{\nu\mu} . (2.3.3)
A couple of remarks are important regarding the definition of Kµν. First, notice that the projection of ∇_µ n_ν is crucial in order to make Kµν purely spatial. We could argue that because n^µ is a unit vector, its gradient is necessarily orthogonal to it. This is of course true in the sense that n^ν ∇_µ n_ν = 0, but ∇_µ n_ν is in general not symmetric and n^ν ∇_ν n_µ ≠ 0 unless the normal lines are geodesics (which is not always the case). Let us now consider the symmetry of Kµν. As just mentioned, ∇_µ n_ν is not in general symmetric even though n^µ is hypersurface orthogonal. The reason for this is that n_µ is a unit vector and thus in general is not equal to the gradient of the time function t, except when the lapse is unity. However, once
we project onto the hypersurface, it turns out that Pµα ∇α nν is indeed symmetric
(the non-symmetry of ∇µ nν has to do with the lapse, which is not intrinsic to the
hypersurface). In order to see this, consider the congruence of timelike geodesics
orthogonal to Σ, with unit tangent vector ξ.
In the neighborhood of Σ consider a
new foliation of spacetime given by a time function t̃ such that ξµ = ∇µ t̃. Since ξ
where we have used the fact that the covariant derivative of gµν is zero and also
that nα ∇µ nα = 0. We then find that the extrinsic curvature is essentially the
“velocity” of the spatial metric as seen by the Eulerian observers. Notice that
the extrinsic curvature depends only on the behavior of
n within the slice Σ – it
is therefore a geometric property of the slice itself.
Now, since n is normal to the hypersurface, it turns out that for any scalar function φ we have
\pounds_{n} \gamma_{\mu\nu} = \frac{1}{\phi} \pounds_{\phi n} \gamma_{\mu\nu} . (2.3.8)
If, in particular, we take as our scalar function the lapse, we find that
K_{\mu\nu} = -\frac{1}{2\alpha} \pounds_{\alpha n} \gamma_{\mu\nu} = -\frac{1}{2\alpha} \left( \pounds_t - \pounds_\beta \right) \gamma_{\mu\nu} , (2.3.9)
which implies
\left( \pounds_t - \pounds_\beta \right) \gamma_{\mu\nu} = -2\alpha K_{\mu\nu} . (2.3.10)
Concentrating on the spatial components and writing \pounds_t = \partial_t, this gives the evolution equation for the spatial metric
\partial_t \gamma_{ij} = -2\alpha K_{ij} + D_i \beta_j + D_j \beta_i , (2.3.11)
where here D_i represents the three-dimensional covariant derivative, that is, the one associated with the 3-metric γij, which is in fact nothing more than the projection of the full four-dimensional covariant derivative: D_\mu := P^\alpha_\mu \nabla_\alpha.
This brings us half-way to our goal of writing Einstein’s equations as a Cauchy
problem: We already have an evolution equation for the spatial metric γij . In
order to close the system we still need an evolution equation for Kij . It is im-
portant to notice that until now we have only worked with purely geometric
concepts, and we have not used the Einstein field equations at all. It is precisely
from the field equations that we will obtain the evolution equations for Kij . In
other words, the evolution equation (2.3.11) for the 3-metric is purely kinematic,
while the dynamics of the system will be contained in the evolution equations
for Kij .
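The purely kinematic character of the evolution equation for γij can be illustrated with a trivial example: for zero shift, (2.3.11) reduces to K_{ij} = -\partial_t \gamma_{ij} / 2\alpha, so the extrinsic curvature can be read off from the time derivative of the 3-metric alone. The short Python sketch below (a toy metric γij = a(t)² δij chosen only for illustration) checks this with a centered finite difference in time:

    import numpy as np

    def gamma_ij(t):
        a = 1.0 + 0.1 * t**2           # toy scale factor a(t)
        return a**2 * np.eye(3)

    t, dt = 2.0, 1.0e-5
    alpha, a, adot = 1.0, 1.0 + 0.1 * t**2, 0.2 * t

    # K_ij = -(1/2 alpha) d/dt gamma_ij  for zero shift, from eq. (2.3.11)
    dgamma_dt = (gamma_ij(t + dt) - gamma_ij(t - dt)) / (2.0 * dt)
    K_num = -dgamma_dt / (2.0 * alpha)

    K_exact = -a * adot * np.eye(3)                       # since (1/2) d/dt (a^2) = a*adot
    print(np.allclose(K_num, K_exact, atol=1e-8))         # True
    print(np.trace(np.linalg.inv(gamma_ij(t)) @ K_num))   # approximately -3 adot/a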
The projection of the four-dimensional Riemann tensor completely onto the spatial hypersurfaces results in the Gauss–Codazzi equations
P^\delta_\alpha P^\kappa_\beta P^\lambda_\mu P^\sigma_\nu R_{\delta\kappa\lambda\sigma} = {}^{(3)}R_{\alpha\beta\mu\nu} + K_{\alpha\mu} K_{\beta\nu} - K_{\alpha\nu} K_{\beta\mu} . (2.4.1)
Similarly, the projection onto the hypersurfaces of the Riemann tensor contracted
once with the normal vector results in the Codazzi–Mainardi equations
with Gµν the Einstein tensor. On the other hand, the Gauss–Codazzi relations imply that
P^{\alpha\mu} P^{\beta\nu} R_{\alpha\beta\mu\nu} = {}^{(3)}R + K^2 - K_{\mu\nu} K^{\mu\nu} , (2.4.4)
where K := K^µ_µ is the trace of the extrinsic curvature. Using the Einstein field equations we then find the Hamiltonian constraint {}^{(3)}R + K^2 - K_{\mu\nu} K^{\mu\nu} = 16\pi\rho, where we have defined the quantity ρ := n^µ n^ν T_{µν} that corresponds to the local
energy density as measured by the Eulerian observers. Notice that this equation
involves no explicit time derivatives, so it is not an evolution equation but rather
a constraint that must be satisfied at all times. This equation is known as the
Hamiltonian or energy constraint.
Consider now the mixed contraction of the Einstein tensor. We find
\gamma^{\alpha\mu} n^\nu G_{\mu\nu} = D^\alpha K - D_\mu K^{\alpha\mu} , (2.4.8)
and using the field equations this leads to the momentum constraints
D_\mu \left( K^{\alpha\mu} - \gamma^{\alpha\mu} K \right) = 8\pi j^\alpha , (2.4.9)
with
\rho := n^\mu n^\nu T_{\mu\nu} , \qquad j^i := -P^{i\mu} n^\nu T_{\mu\nu} . (2.4.12)
It is important to notice that the constraints not only do not involve time
derivatives, but they are also completely independent of the gauge functions α
and β i . This indicates that the constraints are relations that refer purely to a
given hypersurface.
Notice that having a set of constraint equations is not a feature of general relativity alone. In electrodynamics we have the Maxwell equations which in three-dimensional vector calculus notation, and in Gaussian units, take the form
\nabla \cdot \vec{E} = 4\pi\rho , \qquad \nabla \cdot \vec{B} = 0 , (2.4.13)
\partial_t \vec{E} = \nabla \times \vec{B} - 4\pi \vec{j} , \qquad \partial_t \vec{B} = -\nabla \times \vec{E} , (2.4.14)
where E and B are the electric and magnetic fields respectively, ρ is the charge
density and j the current density (here ∇ stands for the ordinary flat space
gradient operator and should not be confused with a four-dimensional covariant
derivative). The first two equations involving the divergence of the electric and
magnetic fields do not involve time derivatives, so they are in fact constraints,
just as in general relativity. The main difference is that Maxwell’s theory has
only two constraint equations, while general relativity has four (one Hamilto-
nian constraint and three momentum constraints). The remaining two Maxwell
equations (or rather six since they are vector-valued equations) are the true evo-
lution equations for electrodynamics. The corresponding equations for gravity
will be derived in the next Section.
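The reason the Maxwell evolution equations automatically preserve their constraints is simply that the divergence of a curl vanishes (plus charge conservation for the electric constraint). The Python sketch below checks the magnetic case numerically: for an arbitrary smooth field (chosen here purely as an example) and centered finite differences, ∂_t(∇·B) = −∇·(∇×E) vanishes to rounding error, because the difference operators along different axes commute:

    import numpy as np

    n, L = 32, 2.0 * np.pi
    x = np.linspace(0.0, L, n, endpoint=False)
    dx = x[1] - x[0]
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")

    # A smooth electric field, chosen arbitrarily for the test.
    E = np.array([np.sin(Y) * np.cos(Z),
                  np.sin(Z) * np.cos(X),
                  np.sin(X) * np.cos(Y)])

    def curl(F, dx):
        dF = [np.gradient(F[i], dx, axis=(0, 1, 2)) for i in range(3)]  # dF[i][j] = d_j F_i
        return np.array([dF[2][1] - dF[1][2],
                         dF[0][2] - dF[2][0],
                         dF[1][0] - dF[0][1]])

    def div(F, dx):
        return sum(np.gradient(F[i], dx, axis=i) for i in range(3))

    # d/dt (div B) = -div(curl E): vanishes to rounding error here, since the
    # discrete derivative operators along different axes commute exactly.
    print(np.max(np.abs(div(curl(E, dx), dx))))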
The existence of the constraints implies, in particular, that in the 3+1 formu-
lation it is not possible to specify arbitrarily all 12 dynamical quantities {γij , Kij }
as initial conditions. The initial data must already satisfy the constraints, other-
wise we will not be solving Einstein’s equations. We will come back to this issue
in Chapter 3 when we discuss how to find initial data. The constraints also play
other important roles. For example, they are crucial in the Hamiltonian formu-
lation of general relativity (see Section 2.7). They are also very important in the
study of the well-posedness of the system of evolution equations, something we
will have a chance to discuss briefly in Section 2.8 and also in Chapter 5.
In the previous Section we found that four of the ten Einstein field equations are constraints: relations among the dynamical variables that must be satisfied at all times. The evolution of the gravitational field is contained in the remaining six field equations.
In order to find these equations we still need the projection onto the hyper-
surfaces of the Riemann tensor contracted twice with the normal vector. This
will give us the last six independent components of Riemann (the Gauss–Codazzi
and Gauss–Mainardi equations give us 14 components of Riemann). These pro-
jections turn out to be given by
P^\delta_\mu P^\kappa_\nu n^\lambda n^\sigma R_{\delta\lambda\kappa\sigma} = \pounds_{n} K_{\mu\nu} + K_{\mu\lambda} K^\lambda_\nu + \frac{1}{\alpha} D_\mu D_\nu \alpha . (2.5.1)
The first thing to notice is that these relations do involve the lapse function α.
Also, they make reference to the Lie derivative of the extrinsic curvature along
the normal direction, which clearly corresponds to evolution in time.
Now, from the Gauss–Codazzi equations (2.4.1) we also find
P^\delta_\mu P^\kappa_\nu \left( n^\lambda n^\sigma R_{\delta\lambda\kappa\sigma} + R_{\delta\kappa} \right) = {}^{(3)}R_{\mu\nu} + K K_{\mu\nu} - K_{\mu\lambda} K^\lambda_\nu , (2.5.2)
Using now the Einstein equations written in terms of Rµν , equations (1.13.4),
we find
\pounds_t K_{\mu\nu} - \pounds_\beta K_{\mu\nu} = -D_\mu D_\nu \alpha + \alpha \left( {}^{(3)}R_{\mu\nu} + K K_{\mu\nu} - 2 K_{\mu\lambda} K^\lambda_\nu \right)
    + 4\pi\alpha \left[ \gamma_{\mu\nu} (S - \rho) - 2 S_{\mu\nu} \right] , (2.5.4)
where ρ is the same as before, and where we have defined Sµν := Pµα Pνβ Tαβ as
the spatial stress tensor measured by the Eulerian observers (with S := Sµµ ).
Concentrating on the spatial components and again writing £t = ∂t , the last
expression becomes
\partial_t K_{ij} - \pounds_\beta K_{ij} = -D_i D_j \alpha + \alpha \left( {}^{(3)}R_{ij} + K K_{ij} - 2 K_{ik} K^k_j \right)
    + 4\pi\alpha \left[ \gamma_{ij} (S - \rho) - 2 S_{ij} \right] . (2.5.5)
These equations give us the dynamical evolution of the six independent com-
ponents of the extrinsic curvature Kij . Together with equations (2.3.11) for the
evolution of the spatial metric they finally allow us to write down the field equa-
tions for general relativity as a Cauchy problem. It is important to notice that
we do not have evolution equations for the gauge quantities α and β i . As we
have mentioned before, these quantities represent our coordinate freedom and
can therefore be chosen freely.
The evolution equations (2.5.5) are known in the numerical relativity com-
munity as the Arnowitt–Deser–Misner (ADM) equations. However, as written
above, these equations are in fact not in the form originally derived by ADM [31],
but they are instead a non-trivial rewriting due to York [305]. It is important to
mention exactly what the difference is between the original ADM equations and
the ADM equations à la York, which we will call from now on standard ADM.
The two groups of equations differ in two main aspects. In the first place, the
original ADM variables are the spatial metric γij and its canonical conjugate
momentum πij coming from the Hamiltonian formulation of general relativity
(see the following Section), and which is related to the extrinsic curvature as 24
K_{ij} = -\frac{1}{\sqrt{\gamma}} \left( \pi_{ij} - \frac{1}{2} \gamma_{ij} \pi \right) , (2.5.7)
with π := π^i_i and γ the determinant of γij. This change of variables is, of course,
a rather minor detail. However, even if we rewrite the original ADM evolution
equations in terms of Kij , they still differ from (2.5.5) and have the form
\partial_t K_{ij} - \pounds_\beta K_{ij} = -D_i D_j \alpha + \alpha \left( {}^{(3)}R_{ij} + K K_{ij} - 2 K_{ik} K^k_j \right)
    + 4\pi\alpha \left[ \gamma_{ij} (S - \rho) - 2 S_{ij} \right] - \frac{\alpha \gamma_{ij}}{2} H , (2.5.8)
with H the Hamiltonian constraint (2.4.10) written as:
H := \frac{1}{2} \left( {}^{(3)}R + K^2 - K_{ij} K^{ij} \right) - 8\pi\rho = 0 , (2.5.9)
and where the factor 1/2 in the definition of H is there for later convenience.
The difference between the ADM and York evolution equations can be traced
back to the fact that the version of ADM comes from the field equations written in
terms of the Einstein tensor Gµν , whereas the version of York was derived instead
from the field equations written in terms of the Ricci tensor Rµν . It is clear that
both sets of evolution equations for Kij are physically equivalent since they only
differ by the addition of a term proportional to the Hamiltonian constraint, which
must vanish for any physical solution. However, the different evolution equations
for Kij are not mathematically equivalent. There are basically two reasons why
this is so:
24 One must remember that the original goal of ADM was to write a Hamiltonian formulation
for general relativity that could be used as a basis for quantum gravity, and not a system of
evolution equations for dynamical simulations.
1. In the first place, the space of solutions to the evolution equations is differ-
ent in both cases, and only coincides for physical solutions, that is, those
that satisfy the constraints. In other words, both systems are only equiv-
alent in a subset of the full space of solutions. This subset is called the
constraint hypersurface (but notice that this is not a hypersurface in space-
time, but instead a hypersurface in the space of solutions to the evolution
equations). Of course, we could always argue that since in the end we are
only interested in physical solutions, this distinction is irrelevant. This is
strictly true only if we can solve the equations exactly. But in the case of
numerical solutions there will always be some error that will take us out of
the constraint hypersurface, and the issue then becomes not only relevant
but crucial: If we move slightly off the constraint hypersurface, does the
subsequent evolution remain close to it, or does it diverge rapidly away
from it?
2. The second reason why both systems of evolution equations differ math-
ematically is related to the last point and is of greater importance. Since
the Hamiltonian constraint has second derivatives of the spatial metric
(hidden inside the Ricci scalar), then by adding a multiple of it to the evo-
lution equations we are in fact altering the very structure of the differential
equations.
These types of considerations take us to a fundamental observation that has
today become one of the most active areas of research associated with numerical
relativity: The 3+1 evolution equations are highly non-unique since we can al-
ways add to them arbitrary multiples of the constraints. The different systems of
evolution equations will still coincide in the physical solutions, but might differ
dramatically in their mathematical properties, and particularly in the way in
which they react to small violations of the constraints (inevitable numerically).
This observation is crucial, and we will come back to it both in Section 2.8, and
in Chapter 5.
A final consideration about the 3+1 evolution equations has to do with the
propagation of the constraints: If the constraints are satisfied initially, do they
remain satisfied during the evolution? The answer to this question is, not surpris-
ingly, yes, but it is interesting to see how this comes about. In fact it is through
the Bianchi identities that the propagation of the constraints during the evolu-
tion is guaranteed. To see this, we will follow an analysis due to Frittelli [136],
and define the following projections of the Einstein field equations
with ρ, j µ and Sµν defined as before in terms of the stress-energy tensor Tµν . No-
tice that here H = 0 corresponds precisely to the Hamiltonian constraint (2.5.9)
(with the correct 1/2 factor), while Mµ = 0 corresponds to the momentum con-
straints. On the other hand, Eµν = 0 reduces to the original ADM evolution
equations (multiplied by 2). An important observation is that in this context
York’s version of the evolution equations in fact corresponds to Eµν − γµν H = 0.
The Einstein field equations in terms of these quantities then take the form E_{\mu\nu} + n_\mu M_\nu + n_\nu M_\mu + n_\mu n_\nu H = 0, and the twice-contracted Bianchi identities, together with the conservation of the stress-energy tensor, imply
\nabla^\mu \left( E_{\mu\nu} + n_\mu M_\nu + n_\nu M_\mu + n_\mu n_\nu H \right) = 0 . (2.5.14)
By taking the normal projection and the projection onto the hypersurface of these
identities and rearranging terms, we obtain the following system of evolution
equations for the constraints
where the L’s are shorthand for terms proportional to H and Mµ that have
no derivatives of these quantities. That these are evolution equations for the
constraints can be seen from the fact that the terms on the left hand side are
derivatives along the normal direction, i.e. out of the hypersurface. Notice that,
if we have initial data such that H = Mµ = 0, and we use the ADM evolution
equations Eµν = 0, the above equations guarantee that on the next hypersurface
the constraints will still vanish. This proves that the constraints will remain
satisfied. If, on the other hand, we use York’s evolution equations Eµν = γµν H,
the same result clearly follows. In fact, it is clear that we could take Eµν to be
equal to any combination of constraints.
There is, however, an important difference in the structure of the constraint
evolution equations when taking either the ADM or York’s version of the 3+1
evolution equations, and we will come back to it later, in Chapter 5. For the
moment, it is sufficient to mention that the constraint evolution equations are
mathematically well-posed for York’s system, but they are not well posed for the
original ADM system, so York’s system should be preferred.
Ideally, we would like to find a way of discretizing the 3+1 evolution equations that would guarantee that the discrete constraints remain satisfied during evolution. Unfor-
tunately, such a discretized form of the 3+1 equations is not known to exist at
this time. We must then live with the fact that not all 10 Einstein equations will
remain satisfied at the discrete level during a numerical simulation. Of course, we
expect that a good numerical implementation would be such that as we approach
the continuum limit we would recover a solution of the full set of equations.
In practice, we can take two different approaches to the problem of choosing
which set of equations to solve numerically. The first approach is known as free
evolution, and corresponds to the case when we start with a solution of the constraint equations as initial data (see Chapter 3), and then advance in time by solving all 12 evolution equations for γij and Kij. The constraints are then
only monitored to see how much they are violated during evolution, which gives
a rough idea of the accuracy of the simulation. Alternatively, we can choose to
solve some or all of the constraint equations at each time step for a specific
subset of the components of the metric and extrinsic curvature, and evolve the
remaining components using the evolution equations. This second approach is
known as constrained evolution.
Constrained evolution is in fact ideal in situations with a high degree of symmetry, like spherical symmetry for example, but is much harder to use in the
case of fully three-dimensional systems. Also, the mathematical properties of a
constrained scheme are much more difficult to analyze in the sense of studying
the well-posedness of the system of equations (see Chapter 5). And finally, since
the constraints involve elliptic equations, a constrained scheme is also far slower
to solve numerically in three dimensions than a free evolution scheme. Because
of all these reasons, in the remainder of the book we will always assume that we
are using a free evolution scheme.
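The logic of a free evolution can be illustrated with a much simpler system than the Einstein equations. The toy Python sketch below (entirely my own example) evolves the 1D wave equation in first-order form, with the auxiliary variable Phi := du/dx, and simply monitors the residual C := Phi − du/dx during the evolution, in the spirit of monitoring the Hamiltonian and momentum constraints. For this simple linear system the chosen discretization happens to preserve C to rounding error; for the full Einstein equations truncation errors generically violate the constraints, which is precisely why they are monitored:

    import numpy as np

    # Toy "free evolution": 1D wave equation in first-order form,
    #   d/dt u = Pi,  d/dt Pi = dPhi/dx,  d/dt Phi = dPi/dx,
    # with Phi := du/dx.  C := Phi - du/dx plays the role of a constraint that
    # is compatible with the evolution but only monitored, never enforced.
    n = 400
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    dx = x[1] - x[0]
    dt = 0.5 * dx

    def ddx(f):                                   # centered derivative, periodic domain
        return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dx)

    def rhs(state):
        u, Pi, Phi = state
        return np.array([Pi, ddx(Phi), ddx(Pi)])

    u = np.exp(-100.0 * (x - 0.5) ** 2)
    state = np.array([u, np.zeros(n), ddx(u)])    # initial data satisfies C = 0

    for step in range(1, 1001):                   # third-order SSP Runge-Kutta
        s1 = state + dt * rhs(state)
        s2 = 0.75 * state + 0.25 * (s1 + dt * rhs(s1))
        state = state / 3.0 + (2.0 / 3.0) * (s2 + dt * rhs(s2))
        if step % 250 == 0:
            u, Pi, Phi = state
            C = Phi - ddx(u)                      # constraint residual, only monitored
            print(step, float(np.max(np.abs(C))))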
For completeness, we should mention a third alternative known as constrained
metric evolution. In this approach, we choose some extra condition on the metric
tensor, such as for example conformal flatness, and impose this condition during
the whole evolution, thus simplifying considerably the equations to be solved.
Such an approach has been used with some success by Wilson and Mathews in
hydrodynamical simulations [300]. However, imposing extra conditions on the
metric is not in general compatible with the Einstein field equations, and the
results from such simulations should therefore be regarded just as approximations
even in the continuum limit. For example, the condition of conformal flatness
essentially eliminates the gravitational wave degrees of freedom. Whether such an
approximation is good or not will depend on the specific physical system under
study, and the physical information we wish to extract from the simulation.
The Lagrangian for general relativity is the Hilbert Lagrangian
L = R , (2.7.1)
with R the Ricci scalar of the spacetime. As already mentioned, from a variational
principle we can obtain the field equations taking this Lagrangian as the starting
point.
The Lagrangian formulation of a field theory takes a covariant approach. In
the first place, the Lagrangian itself must be a scalar function, and also the field
equations derived from the variational principle come out in fully covariant form.
A different approach is to take instead a Hamiltonian formulation of the theory.
This approach has important advantages, and in particular is the starting point
of quantum field theory. However, a Hamiltonian formulation requires a clear
distinction to be made between space and time, so it is therefore not covariant.
In field theories other than general relativity, and particularly when working
on a flat spacetime background, there is already a natural way in which space
and time can be split. In general relativity, on the other hand, no such natural
splitting exists. However, we can take the 3+1 perspective and use this splitting
as a basis to construct a Hamiltonian formulation of the theory. Of course, we
can not interpret the time function t directly as a measure of the proper time of
any given observer, since the spacetime metric needed in order to do this is the
unknown dynamical variable under study.
The first step in a Hamiltonian formulation is to identify the configuration
variables that describe the state of the field at any given time. For this purpose
we will choose the spatial metric variables γij, together with the lapse α and the covariant shift vector β_i. We now need to rewrite the Hilbert Lagrangian in
terms of these quantities and their derivatives. Notice that, from the definition
of the Einstein tensor we have
n^\mu n^\nu G_{\mu\nu} = n^\mu n^\nu R_{\mu\nu} + \frac{1}{2} R \quad \Rightarrow \quad R = 2 \left( n^\mu n^\nu G_{\mu\nu} - n^\mu n^\nu R_{\mu\nu} \right) . (2.7.2)
The first term on the right hand side of the last equation was already obtained
from the Gauss–Codazzi relations and is given by equation (2.4.5). For the sec-
ond term we use the Ricci identity that relates the commutator of covariant
derivatives to the Riemann tensor (equation (1.9.3)):
n^\mu n^\nu R_{\mu\nu} = n^\mu n^\nu R^\lambda{}_{\mu\lambda\nu}
= n^\nu \left( \nabla_\lambda \nabla_\nu n^\lambda - \nabla_\nu \nabla_\lambda n^\lambda \right)
= \nabla_\lambda \left( n^\nu \nabla_\nu n^\lambda \right) - \nabla_\lambda n^\nu \nabla_\nu n^\lambda - \nabla_\nu \left( n^\nu \nabla_\lambda n^\lambda \right) + \nabla_\nu n^\nu \nabla_\lambda n^\lambda
= \nabla_\lambda \left( n^\nu \nabla_\nu n^\lambda - n^\lambda \nabla_\nu n^\nu \right) - K_{\lambda\nu} K^{\lambda\nu} + K^2 . (2.7.3)
In the previous expression, we have directly identified some terms with the ex-
trinsic curvature even though no projection operator is present. We can readily
verify that the contractions in those expressions guarantee that the result follows.
The Ricci scalar then takes the form
R = {}^{(3)}R + K_{\mu\nu} K^{\mu\nu} - K^2 - 2 \nabla_\lambda \left( n^\nu \nabla_\nu n^\lambda - n^\lambda \nabla_\nu n^\nu \right) . (2.7.4)
The last term in this equation is a total divergence, and since in the end we
are only interested in the action S which is an integral of the Lagrangian over a
given volume Ω, this term can be transformed into an integral over the boundary
of Ω and can therefore be ignored. The Lagrangian of general relativity in 3+1
language can then be written as
L = {}^{(3)}R + K_{ij} K^{ij} - K^2 . (2.7.5)
Notice that L has a similar structure to that of the Hamiltonian constraint, but
with the sign of the quadratic terms in the extrinsic curvature reversed.
To obtain the Lagrangian density \mathcal{L} we first need to remember that the four-dimensional volume element in 3+1 adapted coordinates is given by \sqrt{-g} = \alpha\sqrt{\gamma}, so the Lagrangian density takes the form
\mathcal{L} = \alpha \sqrt{\gamma} \left( {}^{(3)}R + K_{ij} K^{ij} - K^2 \right) . (2.7.6)
The canonical momenta conjugate to the dynamical field variables are defined
now as derivatives of the Lagrangian density with respect to the velocities of
those fields. For the spatial metric we have the following conjugate momenta
\pi^{ij} := \frac{\partial \mathcal{L}}{\partial \dot{\gamma}_{ij}} , (2.7.7)
which, using the fact that \dot{\gamma}_{ij} = -2\alpha K_{ij} + \pounds_\beta \gamma_{ij}, can be reduced to
\pi^{ij} = -\sqrt{\gamma} \left( K^{ij} - \gamma^{ij} K \right) . (2.7.8)
By taking now the trace, we can invert this relation between πij and Kij to
recover equation (2.5.7):
K_{ij} = -\frac{1}{\sqrt{\gamma}} \left( \pi_{ij} - \frac{1}{2} \gamma_{ij} \pi \right) . (2.7.9)
Since the Lagrangian density is independent of any derivatives of the lapse and
shift, it is clear that the momenta conjugate to these variables are zero.
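The change of variables between the conjugate momenta and the extrinsic curvature is easy to verify numerically. The following Python sketch (random symmetric matrices standing in for actual data) builds the momenta through (2.7.8) and then inverts with (2.7.9), recovering the original extrinsic curvature:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(3, 3))
    gamma = A @ A.T + 3.0 * np.eye(3)                 # gamma_ij (positive definite)
    gamma_inv = np.linalg.inv(gamma)
    K_dn = rng.normal(size=(3, 3)); K_dn = 0.5 * (K_dn + K_dn.T)   # K_ij

    detg = np.linalg.det(gamma)
    K_up = gamma_inv @ K_dn @ gamma_inv               # K^ij
    trK = np.trace(gamma_inv @ K_dn)                  # K = gamma^ij K_ij

    pi_up = -np.sqrt(detg) * (K_up - gamma_inv * trK)        # eq. (2.7.8)
    pi_dn = gamma @ pi_up @ gamma                            # pi_ij
    trpi = np.trace(gamma_inv @ pi_dn)                       # pi = pi^i_i

    K_back = -(pi_dn - 0.5 * gamma * trpi) / np.sqrt(detg)   # eq. (2.7.9)
    print(np.allclose(K_back, K_dn))                         # True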
The Hamiltonian density is now defined as
\mathcal{H} = \pi^{ij} \dot{\gamma}_{ij} - \mathcal{L} , (2.7.10)
which after some algebra can be shown to reduce to
\mathcal{H} = -2\sqrt{\gamma} \left( \alpha H + \beta_i M^i \right) , (2.7.12)
with H and Mi defined as before, but without the matter contributions (which
would arise when we add the Hamiltonian density for the matter). The total
Hamiltonian is now defined as
H := \int \mathcal{H} \, d^3x . (2.7.13)
The dynamical evolution is now given by Hamilton's equations, \dot{\gamma}_{ij} = \delta H / \delta \pi^{ij} and \dot{\pi}^{ij} = -\delta H / \delta \gamma_{ij}. From the first of these equations we recover an equation which is nothing more than the standard evolution equation for γij written in terms of π^{ij} instead of Kij, and from the second equation we recover the ADM evolution equations for π^{ij}, which are equivalent to (2.5.8).
There is an interesting observation we can make at this point, due to An-
derson and York [22]. When using the Hamilton equations to derive the 3+1
evolution equations, we usually take the lapse function α as an independent
quantity to be kept constant during the variation with respect to γij . However,
we might take a different point of view and assume that the independent gauge
function is not the lapse as such, but rather the densitized lapse defined as
\tilde{\alpha} := \alpha / \sqrt{\gamma} . (2.7.16)
This change is far from trivial, as keeping α̃ constant alters the dependency of the Hamiltonian density on the metric during the variation (we have effectively multiplied it by a factor of \sqrt{\gamma}). The resulting evolution equations for π^{ij} are
now different, and correspond precisely to those of York, equations (2.5.5). This
observation points to the fact that perhaps the densitized lapse α̃ is a more
fundamental free gauge function than the lapse α itself. We will encounter the
densitized lapse several times throughout the text.
The BSSNOK formulation starts from a conformal rescaling of the spatial metric of the form γ̃ij = ψ^{-4} γij, with the conformal factor chosen as ψ = γ^{1/12}, with γ the determinant of γij, so that the conformal metric has unit determinant. Furthermore, we ask for this relation to remain satisfied during the evolution. Now, from (2.3.11) we find that the evolution equation for the determinant of the metric is
\partial_t \gamma = \gamma \left( -2\alpha K + 2 D_i \beta^i \right) = -2\gamma \left( \alpha K - \partial_i \beta^i \right) + \beta^i \partial_i \gamma , (2.8.3)
which implies
\partial_t \psi = -\frac{1}{6} \psi \left( \alpha K - \partial_i \beta^i \right) + \beta^i \partial_i \psi . (2.8.4)
In practice one usually works with \phi = \ln\psi = \frac{1}{12} \ln\gamma, so that \tilde{\gamma}_{ij} = e^{-4\phi} \gamma_{ij} and
\partial_t \phi = -\frac{1}{6} \left( \alpha K - \partial_i \beta^i \right) + \beta^i \partial_i \phi . (2.8.5)
Recently, however, it has been suggested by Campanelli et al. [93] that evolving
instead χ = 1/ψ 4 = exp(−4φ) is a better alternative when considering black hole
spacetimes for which ψ typically has a 1/r singularity (so that φ has a logarithmic singularity), while χ is a C^4 function at r = 0. For regular spacetimes, of course,
it should make no difference if we evolve φ, ψ or χ.
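The difference in regularity is easy to see with a representative profile: take a conformal factor with a 1/r singularity, say ψ = 1 + 1/(2r) (used here purely to illustrate the behavior just described). Then φ = ln ψ diverges logarithmically as r → 0, while χ = ψ^{-4} goes smoothly to zero (roughly as 16 r^4). A short Python sketch:

    import numpy as np

    # Representative conformal factor with a 1/r singularity (hypothetical
    # profile, used only to illustrate the remark about regularity at r = 0).
    r = np.array([1e-1, 1e-2, 1e-3, 1e-4])
    psi = 1.0 + 1.0 / (2.0 * r)

    phi = np.log(psi)        # diverges logarithmically as r -> 0
    chi = psi ** (-4)        # goes smoothly to zero as r -> 0 (roughly 16 r^4)

    for ri, p, c in zip(r, phi, chi):
        print(f"r = {ri:7.1e}   phi = {p:8.3f}   chi = {c:10.3e}")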
The BSSNOK formulation also separates the extrinsic curvature into its trace
K and its tracefree part
A_{ij} = K_{ij} - \frac{1}{3} \gamma_{ij} K . (2.8.6)
We further make a conformal rescaling of the traceless extrinsic curvature of the
form25
\tilde{A}_{ij} = \psi^{-4} A_{ij} = e^{-4\phi} A_{ij} . (2.8.7)
A crucial point is that BSSNOK also introduces three auxiliary variables known as the conformal connection functions, defined as
\tilde{\Gamma}^i := \tilde{\gamma}^{jk} \tilde{\Gamma}^i_{jk} = -\partial_j \tilde{\gamma}^{ij} , (2.8.8)
25 As we will see in Chapter 3 when we discuss initial data, the “natural” conformal rescaling of the traceless extrinsic curvature is in fact Ã^{ij} = ψ^{10} A^{ij}, which implies Ã_{ij} = ψ^{2} A_{ij} (assuming we raise and lower indices of conformal quantities with the conformal metric). Since I wish to present the standard form of the BSSNOK equations, here I will continue to use the rescaling Ã_{ij} = ψ^{-4} A_{ij}. However, in order to avoid possible confusion later, the reader is advised to keep in mind that this rescaling is different from the one we will use in the next Chapter. It is also important to mention that if we choose to use Ã_{ij} = ψ^{2} A_{ij} instead, some of the equations in the BSSNOK formulation in fact simplify (most notably the momentum constraints and the evolution equations for Γ̃^i and Ã_{ij} itself), and it also becomes clear that the densitized lapse α̃ = αγ^{-1/2} = αψ^{-6} plays an important role (the BSSNOK equations with the natural conformal rescaling can be found in Appendix C).
where Γ̃i jk are the Christoffel symbols of the conformal metric, and where the
second equality comes from the definition of the Christoffel symbols in the case
when the determinant γ̃ is equal to 1 (which must be true by construction). So,
instead of the 12 ADM variables γij and Kij , BSSNOK uses the 17 variables
φ, K, γ̃ij , Ãij and Γ̃i .26 We can take the point of view that there are only 15
dynamical variables since Ãij is traceless and γ̃ij has unit determinant, but here
we will take the point of view that we are freely evolving all components of Ãij
and γ̃ij (however, enforcing the constraint à = 0 during a numerical calculation
does seem to improve the stability of the simulations significantly, so it has
become standard practice in most numerical codes).
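The change of variables itself is straightforward to implement. The Python sketch below (a pointwise transformation with randomly generated input, purely for illustration) constructs φ, γ̃ij, K and Ãij from a given (γij, Kij) and checks that the conformal metric has unit determinant and that Ãij is tracefree by construction; the Γ̃^i require spatial derivatives of γ̃^{ij} and are not computed here:

    import numpy as np

    def adm_to_bssnok(gamma, K_dn):
        """Pointwise ADM -> BSSNOK variables (phi, conformal metric, trK, At_dn)."""
        gamma_inv = np.linalg.inv(gamma)
        detg = np.linalg.det(gamma)

        phi = np.log(detg) / 12.0                    # phi = (1/12) ln gamma
        gt = np.exp(-4.0 * phi) * gamma              # conformal metric, det = 1
        trK = np.trace(gamma_inv @ K_dn)             # K = gamma^ij K_ij
        A_dn = K_dn - gamma * trK / 3.0              # tracefree part, eq. (2.8.6)
        At_dn = np.exp(-4.0 * phi) * A_dn            # eq. (2.8.7)
        return phi, gt, trK, At_dn

    rng = np.random.default_rng(2)
    B = rng.normal(size=(3, 3))
    gamma = B @ B.T + 3.0 * np.eye(3)
    K_dn = rng.normal(size=(3, 3)); K_dn = 0.5 * (K_dn + K_dn.T)

    phi, gt, trK, At_dn = adm_to_bssnok(gamma, K_dn)
    print(np.isclose(np.linalg.det(gt), 1.0))                    # unit determinant
    print(np.isclose(np.trace(np.linalg.inv(gt) @ At_dn), 0.0))  # tracefree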
Up to this point all we have done is redefine variables and introduce three
additional auxiliary variables. The evolution equation for φ was already found
above, while those for γ̃ij , K and Ãij can be obtained directly from the standard
ADM equations. The system of evolution equations then takes the form27
\frac{d}{dt} \tilde{\gamma}_{ij} = -2\alpha \tilde{A}_{ij} , (2.8.9)
\frac{d}{dt} \phi = -\frac{1}{6} \alpha K , (2.8.10)
\frac{d}{dt} \tilde{A}_{ij} = e^{-4\phi} \left[ -D_i D_j \alpha + \alpha R_{ij} + 4\pi\alpha \left( \gamma_{ij} (S - \rho) - 2 S_{ij} \right) \right]^{TF}
    + \alpha \left( K \tilde{A}_{ij} - 2 \tilde{A}_{ik} \tilde{A}^k_j \right) , (2.8.11)
\frac{d}{dt} K = -D_i D^i \alpha + \alpha \left( \tilde{A}_{ij} \tilde{A}^{ij} + \frac{1}{3} K^2 \right) + 4\pi\alpha \left( \rho + S \right) , (2.8.12)
with d/dt := ∂t − £β , and where TF denotes the tracefree part of the expression
inside the brackets. In the previous expressions we have adopted the convention
that indices of conformal quantities are raised and lowered with the conformal
metric so that, for example, Ãij = e4φ Aij . It is also important to notice that, in
the evolution equation for K, the Hamiltonian constraint has been used in order
to eliminate the Ricci scalar:
R = K_{ij} K^{ij} - K^2 + 16\pi\rho = \tilde{A}_{ij} \tilde{A}^{ij} - \frac{2}{3} K^2 + 16\pi\rho . (2.8.13)
We then see how we have already started to add multiples of constraints to
evolution equations.
Notice that in the evolution equations for Ãij and K there appear covariant
derivatives of the lapse function with respect to the physical metric γij . These
26 It should be noted that the formulation of [268] uses, instead of the Γ̃^i, the auxiliary variables F_i := -\sum_j \partial_j \tilde{\gamma}_{ij}.
27 From now on, and where there is no possibility of confusion, we will simply drop the index (3) on three-dimensional quantities such as the Ricci tensor.
can be easily calculated by using the fact that the Christoffel symbols are related
through:
\tilde{\Gamma}^k_{ij} = \Gamma^k_{ij} - \frac{1}{3} \left( \delta^k_i \Gamma^m_{jm} + \delta^k_j \Gamma^m_{im} - \gamma_{ij} \gamma^{kl} \Gamma^m_{lm} \right)
= \Gamma^k_{ij} - 2 \left( \delta^k_i \partial_j \phi + \delta^k_j \partial_i \phi - \gamma_{ij} \gamma^{kl} \partial_l \phi \right) , (2.8.14)
where Γ̃^k_{ij} are the Christoffel symbols of the conformal metric, and where we have used the fact that \partial_i \phi = \frac{1}{12} \partial_i \ln\gamma = \frac{1}{6} \Gamma^m_{im}. This implies, in particular, that
In the evolution equation for Ãij we also need to calculate the Ricci tensor as-
sociated with the physical metric, which can be separated into two contributions
in the following way:
R_{ij} = \tilde{R}_{ij} + R^\phi_{ij} , (2.8.16)
where R̃ij is the Ricci tensor associated with the conformal metric γ̃ij:
\tilde{R}_{ij} = -\frac{1}{2} \tilde{\gamma}^{lm} \partial_l \partial_m \tilde{\gamma}_{ij} + \tilde{\gamma}_{k(i} \partial_{j)} \tilde{\Gamma}^k + \tilde{\Gamma}^k \tilde{\Gamma}_{(ij)k}
    + \tilde{\gamma}^{lm} \left( 2 \tilde{\Gamma}^k_{l(i} \tilde{\Gamma}_{j)km} + \tilde{\Gamma}^k_{im} \tilde{\Gamma}_{klj} \right) , (2.8.17)
and where R^φ_{ij} denotes additional terms that depend on φ:
R^\phi_{ij} = -2 \tilde{D}_i \tilde{D}_j \phi - 2 \tilde{\gamma}_{ij} \tilde{D}^k \tilde{D}_k \phi + 4 \tilde{D}_i \phi \, \tilde{D}_j \phi - 4 \tilde{\gamma}_{ij} \tilde{D}^k \phi \, \tilde{D}_k \phi , (2.8.18)
with D̃i the covariant derivative associated with the conformal metric.
We must also be careful with the fact that in the evolution equations above we are computing Lie derivatives with respect to β of tensor densities, that is, tensors multiplied by powers of the determinant of the metric γ. If a given object is a tensor times γ^{w/2}, then we say that it is a tensor density of weight w. The Lie derivative of a tensor density of weight w is simply given by
\pounds_\beta T = \left. \pounds_\beta T \right|_{w=0} + w \, T \, \partial_i \beta^i , (2.8.19)
where the first term denotes the Lie derivative assuming w = 0, and the second
is the additional contribution due to the density factor. The density weight of ψ = e^φ = γ^{1/12} is clearly 1/6, so the weight of γ̃ij and Ãij is −2/3, and the weight of γ̃^{ij} is +2/3. In particular we have
\pounds_\beta \phi = \beta^k \partial_k \phi + \frac{1}{6} \partial_k \beta^k , (2.8.20)
\pounds_\beta \tilde{\gamma}_{ij} = \beta^k \partial_k \tilde{\gamma}_{ij} + \tilde{\gamma}_{ik} \partial_j \beta^k + \tilde{\gamma}_{jk} \partial_i \beta^k - \frac{2}{3} \tilde{\gamma}_{ij} \partial_k \beta^k . (2.8.21)
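The extra weight term in (2.8.19) is easy to get wrong in an implementation. The Python sketch below codes the Lie derivative of a scalar density of weight w on a periodic one-dimensional grid (a deliberately minimal setting with arbitrary test functions) and checks (2.8.20) by comparing the Lie derivative of ψ, which has weight 1/6, with the expression for φ = ln ψ:

    import numpy as np

    # Lie derivative along beta of a scalar density of weight w, on a periodic
    # 1D grid (a minimal sketch; the profiles below are arbitrary test data).
    n = 256
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    ddx = lambda f: (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dx)

    def lie_density(T, beta, w):
        # eq. (2.8.19) for a scalar density: advection term plus weight term
        return beta * ddx(T) + w * T * ddx(beta)

    beta = 0.3 + 0.1 * np.sin(x)            # a smooth shift component
    psi  = 1.0 + 0.2 * np.cos(2.0 * x)      # psi = e^phi has density weight 1/6
    phi  = np.log(psi)

    lhs = lie_density(psi, beta, 1.0 / 6.0) / psi      # (1/psi) Lie_beta psi
    rhs = beta * ddx(phi) + ddx(beta) / 6.0            # eq. (2.8.20)
    print(np.max(np.abs(lhs - rhs)))                   # small (finite-difference error only)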
There are several motivations for the change of variables introduced above.
First, the conformal transformation and the separating out of the trace of the
86 THE 3+1 FORMALISM
extrinsic curvature are done in order to have better control over the slicing con-
ditions that, as we will see in Chapter 4, are generally related with the trace of
Kij . On the other hand, the introduction of the conformal connection variables
Γ̃i has the important consequence that when these functions are considered as
independent variables, then the second derivatives of the conformal metric that
appear on the right hand side of equation (2.8.11) (contained in the Ricci ten-
sor (2.8.17)) reduce to the simple scalar Laplace operator γ̃ lm ∂l ∂m γ̃ij . All other
terms with second derivatives of γ̃ij have been rewritten in terms of first deriva-
tives of the Γ̃i .
If the Γ̃i are to be considered as independent variables, we are of course still
missing an evolution equation for them. This equation can be obtained directly
from (2.8.8) and (2.3.11):
\partial_t \tilde{\Gamma}^i = -2 \left( \alpha \partial_j \tilde{A}^{ij} + \tilde{A}^{ij} \partial_j \alpha \right) + \beta^j \partial_j \tilde{\Gamma}^i - \tilde{\Gamma}^j \partial_j \beta^i + \frac{2}{3} \tilde{\Gamma}^i \partial_j \beta^j
    + \frac{1}{3} \tilde{\gamma}^{ij} \partial_j \partial_k \beta^k + \tilde{\gamma}^{jk} \partial_j \partial_k \beta^i . (2.8.22)
The last three terms of the first line clearly form the Lie derivative for a vector
density of weight 2/3, while the extra terms involving second derivatives of the
shift arise from the fact that the Γ̃i are not really components of a vector den-
sity, but are rather contracted Christoffel symbols. Bearing this in mind we can
rewrite the last equation in more compact form as
\frac{d}{dt} \tilde{\Gamma}^i = \tilde{\gamma}^{jk} \partial_j \partial_k \beta^i + \frac{1}{3} \tilde{\gamma}^{ij} \partial_j \partial_k \beta^k - 2 \left( \alpha \partial_j \tilde{A}^{ij} + \tilde{A}^{ij} \partial_j \alpha \right) . (2.8.23)
We are still missing one key element of the BSSNOK formulation. In practice it turns out that, in spite of the motivations mentioned above, if we use equations (2.8.9), (2.8.10), (2.8.11), (2.8.12), and (2.8.23) in a numerical simulation the system turns out to be violently unstable. In order to fix this problem we
need to consider the momentum constraints, which in terms of the new variables
take the form
\partial_j \tilde{A}^{ij} = -\tilde{\Gamma}^i_{jk} \tilde{A}^{jk} - 6 \tilde{A}^{ij} \partial_j \phi + \frac{2}{3} \tilde{\gamma}^{ij} \partial_j K + 8\pi \tilde{j}^i , (2.8.24)
with j̃ i := e4φ j i . We can now use this equation to substitute the divergence of
Ãij that appears in the evolution equation for the Γ̃i . We find:
\frac{d}{dt} \tilde{\Gamma}^i = \tilde{\gamma}^{jk} \partial_j \partial_k \beta^i + \frac{1}{3} \tilde{\gamma}^{ij} \partial_j \partial_k \beta^k - 2 \tilde{A}^{ij} \partial_j \alpha
    + 2\alpha \left( \tilde{\Gamma}^i_{jk} \tilde{A}^{jk} + 6 \tilde{A}^{ij} \partial_j \phi - \frac{2}{3} \tilde{\gamma}^{ij} \partial_j K - 8\pi \tilde{j}^i \right) . (2.8.25)
Fig. 2.4: Domain of dependence for the characteristic formulation. (a) A single null
hypersurface has an empty domain of dependence as there are light rays coming from
infinity that pass arbitrarily close to it but never intersect it. (b) The double null
approach uses two null hypersurfaces. (c) A different approach is to use a central
timelike world-tube (or world-line) to have a non-trivial domain of dependence.
A very common choice is to use the Bondi–Sachs null coordinate system, which
in the general three-dimensional case corresponds to a spacetime metric of the
form [71, 247]
ds^2 = -\left( \frac{V}{r} e^{2\beta} - r^2 h_{AB} U^A U^B \right) du^2 - 2 e^{2\beta} du \, dr - 2 r^2 h_{AB} U^B du \, dx^A + r^2 h_{AB} dx^A dx^B , (2.9.3)
where here the radial coordinate r is used instead of λ. We then find hypersurface
equations (i.e. involving only derivatives inside the hypersurface) for the metric
functions {β, V, U A }, and evolution equations (involving derivatives with respect
to the null coordinate u) for the metric functions hAB .
One important advantage of this approach is the fact that there are no elliptic
constraints on the data, so the initial data is free. Additionally, there are no
second derivatives in time (i.e. along the direction u), so there are fewer variables
than in a 3+1 approach. Moreover, null infinity can be compactified and brought to a
finite distance in coordinate space, so that no artificial boundary conditions are
required.28
A lot of work has been devoted to developing characteristic codes in spherical
and axial symmetry, and today there are also well-developed three-dimensional
codes that have been used to study, for example, scattering of waves by a black
hole and even simulations of stars orbiting a black hole. A crucial development
was the evolution of a black hole spacetime in a stable way for an essentially
unlimited time by turning the problem around and considering a foliation of
ingoing null hypersurfaces interior to an outer timelike world-tube [147].
The characteristic formalism has a series of advantages over traditional 3+1 approaches, but it has one serious drawback. As already mentioned, caustics can
easily develop in null hypersurfaces, particularly in regions with strong gravita-
tional fields. In those regions, a 3+1 approach should be much better behaved.
This has led to an idea known as Cauchy-characteristic matching (see e.g. [57]),
which uses a standard 3+1 approach based on a timelike hypersurface in the inte-
rior strong field region, matched to a null hypersurface in the exterior to carry the
gravitational radiation to infinity (see Figure 2.5). Cauchy-characteristic match-
ing has been shown to work well in simple test cases, but the technique has
not yet been fully developed for the three-dimensional case for reasons related
mainly to finding a stable and consistent way of injecting boundary data coming
from the null exterior to the 3+1 interior.
28 Compactifying spatial infinity is generally not a good idea since it implies a reduction in
resolution at large distances: Wave packets get compressed as seen in coordinate space as they
move outward. This gradual reduction in resolution acts on numerical schemes essentially as a
change in the refraction index and causes waves to be “back-scattered” by the numerical grid.
Compactifying null infinity, however, does not have this problem. Nevertheless, some numerical
implementations do compactify spatial infinity but require the use of strong artificial damping
of the waves as they travel outward to avoid this numerical back-scattering (see e.g. [231]).
Fig. 2.5: Cauchy-characteristic matching. An interior region uses the standard 3+1
decomposition, while the exterior region uses a characteristic approach.
equations can also be shown to be symmetric hyperbolic (see Chapter 5), which
guarantees that it is mathematically well posed.
Since in the conformal formulation we are evolving the conformal factor Ω
as an independent function, the position of the boundary of spacetime at null
infinity J is not known a priori (except at t = 0). We then need to extend
the physical initial data in some suitably smooth way and evolve the dynamical
variables “beyond infinity”. This has one important advantage, namely that it is
possible to put an arbitrary (but well behaved) boundary condition at the outer
boundary of the computational region without affecting the physical spacetime,
as anything beyond J is causally disconnected from the interior.
The conformal formulation would seem to be an ideal solution to the weak-
nesses of both the standard 3+1 approach and the characteristic formulation.
Being based on spatial hypersurfaces, it does not have to deal with the problem
of caustics associated with the characteristic formulation. At the same time, by
reaching null infinity, it allows clean extraction of gravitational radiation and
other physical quantities such as total mass and momentum. The main problem
faced by the conformal formulation today is the construction of hyperboloidal initial data. Also, being based on spatial hypersurfaces, it will
have to solve many of the same problems that standard 3+1 formulations are
currently faced with, namely the choice of a good gauge and the stability of
the evolutions against constraint violation. Though important progress has been
made in recent years and numerical simulations of weak data in the full three-
dimensional case have been carried out successfully, the conformal formulation is
still considerably less developed than the standard 3+1 formulation. However, its
conceptual elegance and fundamental strengths mean that this approach repre-
sents a very important promise for the future development of numerical relativity.
3
INITIAL DATA
3.1 Introduction
As we saw in the previous Chapter, out of the ten Einstein field equations only six
contain time derivatives and therefore represent the true evolution equations of
the spacetime geometry. The remaining four equations are constraints that must
be satisfied at all times. These are the Hamiltonian and momentum constraints,
which for concreteness we will rewrite here in a 3+1 adapted coordinate system
{}^{(3)}R + K^2 - K_{ij} K^{ij} = 16\pi\rho , (3.1.1)
D_j \left( K^{ij} - \gamma^{ij} K \right) = 8\pi j^i , (3.1.2)
with ρ and j^i the energy and momentum densities seen by the Eulerian (normal)
observers and defined as
\rho := n^\mu n^\nu T_{\mu\nu} , \qquad j^i := -P^{i\mu} n^\nu T_{\mu\nu} . (3.1.3)
The existence of the constraint equations implies that it is not possible in
general to choose arbitrarily all 12 dynamical quantities {γij , Kij } as initial data.
The initial data has to be chosen in such a way that the constraints are satisfied
from the beginning, otherwise we will not be solving Einstein’s equations. This
means that before starting an evolution, it is necessary to first solve the initial
data problem to obtain adequate values of {γij , Kij } that represent the physical
situation that we are interested in.
The constraints form a system of four coupled partial differential equations
of elliptic type, and in general they are difficult to solve. Still, there are several
well-known procedures to solve these equations in specific circumstances. Until
a few years ago, the most common procedure was the conformal decomposition
of York and Lichnerowicz [189, 303, 304]. More recently, the so-called conformal
thin-sandwich approach [307] has become more and more popular when solving
the constraints, as it allows for a clearer interpretation of the freely specifiable
data. In this Chapter we will consider both these approaches to the problem of
finding initial data. As a particular application of these techniques, we will also
consider the special case of initial data for black hole spacetimes. A recent review
by Cook of the initial data problem in numerical relativity can be found in [102].
In the 3+1 formulation there are 12 dynamical quantities associated with the spatial metric and extrinsic curvature {γij, Kij}. The first question that must be answered is which of those 12 quantities will be taken as
free data, and which will be solved for using the constraints. Except in very simple
cases like that of the linearized theory, there is no natural way of identifying which
are the “true” dynamical components and which are the constrained components.
One must therefore develop some procedure that chooses eight components as
free data and allows one to solve for the remaining four in a clear way. The most
common procedure for doing this is known as the York–Lichnerowicz conformal
decomposition.
The York–Lichnerowicz conformal decomposition starts from a conformal
transformation of the 3-metric of the form29
\gamma_{ij} = \psi^4 \bar{\gamma}_{ij} , (3.2.1)
where the conformal metric γ̄ij is considered as given. It is not difficult to show
that, in terms of the conformal metric, the Hamiltonian constraint takes the form
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^5 \left( K_{ij} K^{ij} - K^2 \right) + 16\pi \psi^5 \rho = 0 , (3.2.2)
where D̄2 and R̄ are the Laplace operator and Ricci scalar associated with γ̄ij .
The extrinsic curvature is also separated into its trace K and its tracefree
part given by
A^{ij} = K^{ij} - \frac{1}{3} \gamma^{ij} K . (3.2.3)
The Hamiltonian constraint then becomes
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^5 \left( A_{ij} A^{ij} - \frac{2}{3} K^2 \right) + 16\pi \psi^5 \rho = 0 . (3.2.4)
Notice that we have transformed the Hamiltonian constraint into an elliptic equa-
tion for the conformal factor ψ, and solving it will clearly allow us to reconstruct
the full physical metric γij from a given conformal metric γ̄ij .
Let us now consider the momentum constraints, which in terms of Aij take
the form
D_j A^{ij} - \frac{2}{3} D^i K - 8\pi j^i = 0 . (3.2.5)
In order to transform the momentum constraints into three equations for
three unknowns, we can now use a general algebraic result that states that any
symmetric-tracefree tensor S ij can be split in the following way
S^{ij} = S^{ij}_* + (LW)^{ij} , (3.2.6)
29 In order to avoid confusion we will denote conformal quantities here with an over-bar
instead of a tilde. The tilde will be used only for the specific conformal transformation that
factors out the volume element for which ψ = γ 1/12 (as in the BSSNOK formulation discussed
in Chapter 2).
where S∗ij is a symmetric, traceless and transverse tensor (i.e. with zero diver-
gence Dj S∗ij = 0), W i is a vector, and L is an operator defined as
(LW)^{ij} := D^i W^j + D^j W^i - \frac{2}{3} \gamma^{ij} D_k W^k . (3.2.7)
The quantity (LW)^{ij} is known as the conformal Killing form associated with the vector W^i, and its contribution is called the longitudinal part of S^{ij}. If the
conformal Killing form vanishes, then the vector W i is called a conformal Killing
vector, since in that case one has
\pounds_W \left( \gamma^{-1/3} \gamma_{ij} \right) = \gamma^{-1/3} \left( D_i W_j + D_j W_i - \frac{2}{3} \gamma_{ij} D_k W^k \right) = \gamma^{-1/3} (LW)_{ij} = 0 , (3.2.8)
that is, the conformal metric γ̃ij = γ −1/3 γij with the volume element factored
out is invariant along the vector field (one should not confuse γ̃ij , which has unit
volume element, with γ̄ij which has a volume element given by γ̄ = γ/ψ 12 ).
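In flat space the conformal Killing vectors include translations, rotations and the dilation W^i = x^i, and it is instructive to check the definition (3.2.7) directly. The Python sketch below does so numerically for a flat 3-metric, where the covariant derivatives reduce to partial derivatives (the vector fields tested are my own choices):

    import numpy as np

    def conformal_killing_form(W, point, eps=1e-6):
        """(LW)^{ij} of eq. (3.2.7) for a flat 3-metric, using numerical
        partial derivatives of the vector field W(x) at the given point."""
        dW = np.empty((3, 3))                       # dW[i, j] = d_j W^i
        for j in range(3):
            dp = np.zeros(3); dp[j] = eps
            dW[:, j] = (W(point + dp) - W(point - dp)) / (2.0 * eps)
        div = np.trace(dW)
        return dW + dW.T - (2.0 / 3.0) * div * np.eye(3)

    p = np.array([0.3, -1.2, 0.7])
    tests = {
        "translation": lambda x: np.array([1.0, 0.0, 0.0]),
        "rotation":    lambda x: np.array([-x[1], x[0], 0.0]),
        "dilation":    lambda x: x.copy(),
        "generic":     lambda x: np.array([x[0] ** 2, 0.0, 0.0]),
    }
    # The first three give (LW) = 0 (conformal Killing vectors of flat space);
    # the generic field does not.
    for name, W in tests.items():
        LW = conformal_killing_form(W, p)
        print(f"{name:12s} max |(LW)^ij| = {np.max(np.abs(LW)):.2e}")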
Notice that the operator L can be defined using any metric tensor. Two
natural choices present themselves at this point for the decomposition of Aij :
One can use the operator L̄ associated with the conformal metric, or the oper-
ator L associated with the physical metric. We will consider both these cases
in turn, and later introduce a different type of tensor splitting that resolves the
incompatibility between the first two approaches.
We first perform a conformal rescaling of the traceless extrinsic curvature of the form
\bar{A}^{ij} = \psi^{10} A^{ij} . (3.2.9)
The factor of ψ^{10} is chosen since one can easily show that for any symmetric-tracefree tensor S^{ij} the following identity holds
D_j S^{ij} = \psi^{-n} \bar{D}_j \left( \psi^n S^{ij} \right) + (10 - n) \, S^{ik} \partial_k \ln\psi , (3.2.10)
where as before D̄i is the covariant derivative associated with the conformal
metric. The choice n = 10 is therefore clearly natural.30 Notice that we will
raise and lower indices of conformal tensors with the conformal metric, so that
in particular we find Āij = ψ 2 Aij . In terms of Āij the momentum constraints
become
\bar{D}_j \bar{A}^{ij} - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.2.11)
We now apply the transverse decomposition to Āij using the operator L̄
associated with the conformal metric γ̄ij :
30 To avoid confusion, the reader is reminded that when discussing the BSSNOK formulation
in the last Chapter the alternative, less “natural”, rescaling Āij = ψ4 Aij was used in order to
recover the standard form of this formulation.
\bar{A}^{ij} = \bar{A}^{ij}_* + \left( \bar{L}\bar{W} \right)^{ij} . (3.2.12)
From this one can easily show that the momentum constraints reduce to
\bar{\Delta}_L \bar{W}^i - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 , (3.2.13)
where we have introduced the operator \bar{\Delta}_L acting on vectors, defined as
\bar{\Delta}_L \bar{W}^i := \bar{D}_j \left( \bar{L}\bar{W} \right)^{ij} = \bar{D}^2 \bar{W}^i + \frac{1}{3} \bar{D}^i \bar{D}_j \bar{W}^j + \bar{R}^i_j \bar{W}^j . (3.2.14)
The conformal Ricci tensor \bar{R}^i_j appears in the last expression when we commute covariant derivatives. Equations (3.2.13) clearly form a set of three coupled elliptic equations for \bar{W}^i.
Let us now assume that we are given the conformal metric γ̄ij , the trace
of the extrinsic curvature K, and the transverse-traceless part of the conformal
extrinsic curvature Āij∗ . We can then use the Hamiltonian constraint (3.2.4) and
momentum constraints (3.2.13) to find the conformal factor ψ and the vector
W̄ i , and thus reconstruct the physical metric γij and extrinsic curvature K ij .
There is still, however, an important point to consider here. Even though it
is a simple task to find a symmetric-tracefree tensor, it is quite a different matter
to construct a transverse tensor. In order to construct such a tensor one needs
to start from an arbitrary symmetric-tracefree tensor M̄ ij that is not necessarily
transverse. Its transverse part can clearly be expressed as
\bar{M}^{ij}_* = \bar{M}^{ij} - \left( \bar{L}\bar{Y} \right)^{ij} , (3.2.15)
for some vector Ȳ i still to be determined. Now, since M̄∗ij is transverse by defi-
nition, the following relation between Ȳ i and M̄ ij must hold
\bar{\Delta}_L \bar{Y}^i = \bar{D}_j \bar{M}^{ij} . (3.2.16)
Given M̄^{ij}, this equation must be solved to find the vector Ȳ^i, which will in turn allow us to construct the transverse tensor M̄^{ij}_*.
The above procedure can in fact be incorporated into the solution of the
constraints. Taking \bar{A}^{ij}_* = \bar{M}^{ij}_* we will have
\bar{A}^{ij} = \bar{M}^{ij}_* + \left( \bar{L}\bar{W} \right)^{ij} = \bar{M}^{ij} + \left( \bar{L}\bar{V} \right)^{ij} , (3.2.17)
with V̄ i := W̄ i − Ȳ i , and where we have used the fact that L̄ is a linear operator.
One can now rewrite the Hamiltonian and momentum constraints in terms
of Āij , V̄ i and M̄ ij to find
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^{-7} \bar{A}_{ij} \bar{A}^{ij} - \frac{2}{3} \psi^5 K^2 + 16\pi \psi^5 \rho = 0 , (3.2.18)
\bar{\Delta}_L \bar{V}^i + \bar{D}_j \bar{M}^{ij} - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.2.19)
3
It is common to define ρ̄ := ψ 8 ρ and j̄ i := ψ 10 j i as the conformally rescaled
energy and momentum densities. The weight of the conformal factor in the def-
inition of j̄ i is chosen in order to eliminate factors of ψ from the matter terms
in the momentum constraints and thus decouple them more easily from the
Hamiltonian constraint. The weight of ψ in the definition of ρ̄ is then fixed for
consistency reasons (for example, ρ^2 and j_k j^k must have the same power of ψ
when written in terms of conformal quantities for the energy conditions to be
independent of ψ).
Equations (3.2.18) and (3.2.19) are to be solved for ψ and V̄ i , with free data
given in the form of the conformal metric γ̄ij , a symmetric-tracefree tensor M̄ ij ,
the trace of the extrinsic curvature K, and the energy and momentum densities
ρ̄ and j̄ i . The physical quantities are then reconstructed as
\gamma_{ij} = \psi^4 \bar{\gamma}_{ij} , (3.2.20)
K^{ij} = \psi^{-10} \bar{A}^{ij} + \frac{1}{3} \gamma^{ij} K , (3.2.21)
with
\bar{A}^{ij} = \left( \bar{L}\bar{V} \right)^{ij} + \bar{M}^{ij} . (3.2.22)
The equations just found are the most common way of writing the constraints
in the York–Lichnerowicz approach.31
Notice how all four equations are coupled with each other. A way to sim-
plify the problem considerably is to simply choose K constant, corresponding to
a constant mean curvature spatial hypersurface, in which case the momentum
constraints decouple completely from the Hamiltonian constraint. One would
then start by solving the momentum constraints (3.2.19) for V i , use this to
reconstruct Āij , and only later solve the Hamiltonian constraint (3.2.18) for ψ.
The problem simplifies even more if, apart from taking K constant, one also
takes the conformal metric to be the one corresponding to flat space (i.e. the
physical metric is conformally flat). The Hamiltonian constraint then reduces to
8 D^2_{\rm flat} \psi + \psi^{-7} \bar{A}_{ij} \bar{A}^{ij} - \frac{2}{3} \psi^5 K^2 + 16\pi \psi^5 \rho = 0 , (3.2.23)
where now D^2_{\rm flat} is just the flat space Laplacian. In the particular case of time-
symmetric initial data, corresponding to Kij = 0 at t = 0, the momentum
31 In fact, Lichnerowicz found the expression for the conformal decomposition of the Hamil-
tonian constraint [189], but the full decomposition of the momentum constraints is due to
York [303, 304].
constraints are trivially satisfied. If, moreover, we assume that we are in vacuum
the Hamiltonian constraint reduces even further to
D^2_{\rm flat} \psi = 0 , (3.2.24)
which is nothing more than the standard Laplace equation. We will come back
to this equation later when we study black hole initial data.
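Even in this trivial case, equation (3.2.24) already has the structure of every initial data problem in this Chapter: an elliptic equation for ψ, with the natural boundary condition for asymptotic flatness being ψ → 1 far away. A minimal Python sketch of such a solve (simple Jacobi iteration on a small Cartesian grid, purely for illustration; real initial data codes use far more efficient elliptic solvers such as multigrid or spectral methods):

    import numpy as np

    # Solve D^2_flat psi = 0 on a cube with psi = 1 on the boundary by Jacobi
    # iteration (a minimal sketch of an elliptic initial-data solve).
    n = 32
    psi = np.ones((n, n, n))
    psi[1:-1, 1:-1, 1:-1] = 0.5        # arbitrary interior guess

    for it in range(2000):
        psi[1:-1, 1:-1, 1:-1] = (psi[2:, 1:-1, 1:-1] + psi[:-2, 1:-1, 1:-1] +
                                 psi[1:-1, 2:, 1:-1] + psi[1:-1, :-2, 1:-1] +
                                 psi[1:-1, 1:-1, 2:] + psi[1:-1, 1:-1, :-2]) / 6.0

    # The iteration converges toward the trivial solution psi = 1.
    print(np.max(np.abs(psi - 1.0)))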
= \psi^{-4} \left[ \bar{D}_j \left( \bar{L}\bar{W} \right)^{ij} + 6 \left( \bar{L}\bar{W} \right)^{ij} \partial_j \ln\psi \right]
= \psi^{-4} \left[ \bar{\Delta}_L \bar{W}^i + 6 \left( \bar{L}\bar{W} \right)^{ij} \partial_j \ln\psi \right] . (3.2.30)
Just as before, we will incorporate the procedure for obtaining the transverse
part of a general symmetric-tracefree tensor into the solution of the constraints.
What we want is to obtain the transverse part of Aij , that is Aij ∗ , which has
not been conformally rescaled. Assuming then that we are given a symmetric-
tracefree tensor M ij , its transverse part can clearly be written as
M^{ij}_* = M^{ij} - (LY)^{ij} , (3.2.32)
for some vector Y^i. As before, the fact that M^{ij}_* is transverse implies that the following relation must hold:
\Delta_L Y^i = D_j M^{ij} , (3.2.33)
but notice that all quantities here are in the physical space. Taking now A^{ij}_* = M^{ij}_*, we then have
A^{ij} = M^{ij} + (LV)^{ij} , (3.2.34)
with V i := W i − Y i .
Combining all previous results, the momentum constraints become,
\bar{\Delta}_L \bar{V}^i + 6 \left( \bar{L}\bar{V} \right)^{ij} \partial_j \ln\psi + \psi^4 D_j M^{ij} - \frac{2}{3} \bar{D}^i K - 8\pi \psi^4 j^i = 0 , (3.2.35)
where as before we have used V̄ i = V i .
The last expression, however, still has the divergence Dj M ij written in terms
of the physical metric. In order to express the momentum constraints completely
in terms of the conformal metric we take M̄ ij = ψ 10 M ij . Using again (3.2.10),
we can rewrite the momentum constraints in the final form
\bar{\Delta}_L \bar{V}^i + 6 \left( \bar{L}\bar{V} \right)^{ij} \partial_j \ln\psi + \psi^{-6} \bar{D}_j \bar{M}^{ij} - \frac{2}{3} \bar{D}^i K - 8\pi \psi^4 j^i = 0 . (3.2.36)
The full system of equations to be solved is then
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^5 \left( A_{ij} A^{ij} - \frac{2}{3} K^2 \right) + 16\pi \psi^5 \rho = 0 , (3.2.37)
\bar{\Delta}_L \bar{V}^i + 6 \left( \bar{L}\bar{V} \right)^{ij} \partial_j \ln\psi + \psi^{-6} \bar{D}_j \bar{M}^{ij} - \frac{2}{3} \bar{D}^i K - 8\pi \psi^4 j^i = 0 , (3.2.38)
with the physical quantities reconstructed as
The mismatched powers of ψ in these transformation rules are at the root of the
non-commutativity of conformal transformation and tensor splitting.
Recently, however, Pfeiffer and York have introduced a new splitting of ten-
sors that completely resolves the problem [226]. The main idea is to split the
traceless extrinsic curvature Aij as
A^{ij} = A^{ij}_* + \frac{1}{\sigma} (LW)^{ij} , (3.2.46)
with σ a positive definite scalar. One then introduces the following conformal
transformations
\bar{A}^{ij}_* = \psi^{10} A^{ij}_* , \qquad \bar{W}^i = W^i , \qquad \bar{\sigma} = \psi^{-6} \sigma . (3.2.47)
The new splitting then has the important property that
A^{ij} = A^{ij}_* + \frac{1}{\sigma} (LW)^{ij} = \psi^{-10} \left[ \bar{A}^{ij}_* + \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{W} \right)^{ij} \right] = \psi^{-10} \bar{A}^{ij} , (3.2.48)
so conformal transformation and tensor splittings are now fully consistent.32
With this splitting the momentum constraints become
\bar{D}_j \left[ \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{W} \right)^{ij} \right] - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.2.50)
Just as in the previous two cases, we can start from an arbitrary symmetric
tracefree tensor M̄ ij which we split as
\bar{M}^{ij} = \bar{M}^{ij}_* + \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{Y} \right)^{ij} . (3.2.51)
If we define again V̄ i := W̄ i − Ȳ i , the momentum constraints take the final form
\bar{D}_j \left[ \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{V} \right)^{ij} \right] + \bar{D}_j \bar{M}^{ij} - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.2.52)
The Hamiltonian constraint has the same form as before, so the final equa-
tions to solve are
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^{-7} \bar{A}_{ij} \bar{A}^{ij} - \frac{2}{3} \psi^5 K^2 + 16\pi \psi^5 \rho = 0 , (3.2.53)
\bar{D}_j \left[ \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{V} \right)^{ij} \right] + \bar{D}_j \bar{M}^{ij} - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.2.54)
The free data is now given by the conformal metric γ̄ij , the symmetric-tracefree
tensor M̄ ij , the trace of the extrinsic curvature K, the energy and momentum
32 It is important to notice that, with this new splitting, the two parts of A^{ij} are orthogonal both before and after the conformal transformation, in the sense that
\int A^{ij}_* \left[ \frac{1}{\sigma} (LW)^{kl} \right] \gamma_{ik} \gamma_{jl} \, dV = \int \bar{A}^{ij}_* \left[ \frac{1}{\bar{\sigma}} \left( \bar{L}\bar{W} \right)^{kl} \right] \bar{\gamma}_{ik} \bar{\gamma}_{jl} \, d\bar{V} = 0 , (3.2.49)
with the volume elements given by dV = \sigma \sqrt{\gamma} \, d^3x and d\bar{V} = \bar{\sigma} \sqrt{\bar{\gamma}} \, d^3x.
densities ρ̄ and j̄ i , plus the weight factor σ̄. Given this data, the constraints are
solved for V̄ i and ψ, and the physical quantities are reconstructed as
In the conformal thin-sandwich approach we take as part of the free data the time derivative of the conformal metric,
\bar{u}_{ij} := \partial_t \bar{\gamma}_{ij} . (3.3.1)
We will further demand that the volume element of the conformal metric remains momentarily fixed (though not necessarily equal to unity), which implies
\bar{\gamma}^{ij} \bar{u}_{ij} = 0 . (3.3.2)
Consider now the tracefree part of the evolution equation for the physical
metric γij , and define
u_{ij} := \partial_t \gamma_{ij} - \frac{1}{3} \gamma_{ij} \left( \gamma^{mn} \partial_t \gamma_{mn} \right) = -2\alpha A_{ij} + (L\beta)_{ij} , (3.3.3)
with α and β i the lapse function and shift vector, respectively. Notice also that
equation (3.3.2) implies in particular that
\partial_t \ln\psi = \partial_t \ln \gamma^{1/12} . (3.3.4)
Let us now go back to the expression for u_{ij}. Solving for A^{ij} and using the fact that (L\beta)^{ij} = \psi^{-4} \left( \bar{L}\beta \right)^{ij} (where we remember that the natural transformation for a vector is \beta^i = \bar{\beta}^i) we find
A^{ij} = \frac{1}{2\alpha} \left[ (L\beta)^{ij} - u^{ij} \right] = \frac{\psi^{-4}}{2\alpha} \left[ \left( \bar{L}\beta \right)^{ij} - \bar{u}^{ij} \right] , (3.3.6)
and taking as before \bar{A}^{ij} = \psi^{10} A^{ij} this becomes
\bar{A}^{ij} = \frac{1}{2\bar{\alpha}} \left[ \left( \bar{L}\beta \right)^{ij} - \bar{u}^{ij} \right] , (3.3.7)
where we have defined the conformal lapse as \bar{\alpha} := \psi^{-6} \alpha, which in the case when ψ = γ^{1/12} corresponds precisely to the densitized lapse \tilde{\alpha} = \gamma^{-1/2} \alpha. Notice that this rescaling for the lapse comes naturally out of the standard rescalings for the other quantities.
We are now almost finished with this approach to construct initial data. The
Hamiltonian constraint takes the same form as in the other approaches, namely
8 \bar{D}^2 \psi - \bar{R} \psi + \psi^{-7} \bar{A}_{ij} \bar{A}^{ij} - \frac{2}{3} \psi^5 K^2 + 16\pi \psi^5 \rho = 0 . (3.3.8)
The momentum constraint, on the other hand, now becomes
\bar{D}_j \left[ \frac{1}{2\bar{\alpha}} \left( \bar{L}\beta \right)^{ij} \right] - \bar{D}_j \left( \frac{1}{2\bar{\alpha}} \bar{u}^{ij} \right) - \frac{2}{3} \psi^6 \bar{D}^i K - 8\pi \psi^{10} j^i = 0 . (3.3.9)
In this case one needs to solve (3.3.8) and (3.3.9) for the conformal factor
ψ and the shift vector β i , given free data in the form of the conformal metric
γ̄ij , its time derivative ūij , the trace of the extrinsic curvature K, the conformal
(densitized) lapse ᾱ, and the matter densities ρ̄ and j̄^i. One then reconstructs the physical quantities as
\gamma_{ij} = \psi^4 \bar{\gamma}_{ij} , (3.3.10)
K^{ij} = \psi^{-10} \bar{A}^{ij} + \frac{1}{3} \gamma^{ij} K , (3.3.11)
\alpha = \psi^6 \bar{\alpha} . (3.3.12)
There are several important points to mention about the conformal thin-
sandwich approach. First, with this approach we end up with a physical metric
γij and extrinsic curvature K ij that satisfy the constraints, plus a shift vector
β i that we obtain from the solution of the momentum constraint, and a physical
lapse function α = ψ 6 ᾱ. That is, we obtain initial data not only for the dynamical
quantities, but also for the gauge functions. Of course, we are perfectly free to
ignore these values for the gauge functions and start the evolution with arbitrary
lapse and shift – the metric and extrinsic curvature obtained will still satisfy
the constraints. But if we choose to keep the values for lapse and shift coming
from the thin-sandwich approach, then we know that the time derivative of the
physical metric will be directly given by
2
∂t γij = ∂t ψ 4 γ̄ij = uij + γij Dk β k − αK
3
2
= ψ ūij + γ̄ij D̄k β + 6 β ∂k ln ψ − ψ ᾱK
4 k k 6
. (3.3.13)
3
We then have a clear interpretation of the free data in terms of their effects on
the initial dynamics of the system.
Another important point to mention is the fact that although we have ob-
tained a value for the shift vector through the solution of the momentum con-
straints, the conformal lapse ᾱ is still free and we might worry about what would
be a good choice for this quantity. Of course, any choice would be equally valid,
for example we could simply take ᾱ = 1, but there is a more natural way to
determine ᾱ. We can take the point of view that the natural free data are really
γ̄ij and its velocity ūij = ∂t γ̄ij , plus K and its velocity K̇ = ∂t K. We can then
use this information to reconstruct ᾱ in the following way: Consider the ADM
evolution equation for K, which can be easily shown to be given by
∂t K = β j ∂j K − D2 α + α R + K 2 + 4πα (S − 3ρ)
= β j ∂j K − D2 α + αKij K ij + 4πα (S + ρ)
1 2
= β ∂j K − D α + α Aij A + K + 4πα (S + ρ) , (3.3.14)
j 2 ij
3
104 INITIAL DATA
where in the second line we have used the Hamiltonian constraint (3.1.1) to
eliminate the Ricci scalar R. The Laplacian of the lapse can be rewritten as
D2 α = ψ −4 D̄2 α + 2 γ̄ mn ∂m α ∂n ln ψ , (3.3.15)
so that we find
and
D̄m ψ 6 ᾱ D̄m ψ = ψ 6 D̄m ᾱ D̄m ψ + 6 ᾱψ 5 D̄m ψ D̄m ψ . (3.3.17)
Substituting this into the time derivative of K, and using the Hamiltonian con-
straint in conformal form (3.3.8) to eliminate D̄2 ψ, we finally find
3 7 1 2
D̄2 ᾱ + ᾱ R̄ − ψ −8 Āij Āij + ψ 4 K 2 + 42 D̄m ln ψ + 14 D̄m ᾱ D̄m ln ψ
4 4 6
+ ψ −2 (∂t K − β m ∂m K) − 4π ᾱ ψ 4 (S + 4ρ) = 0 . (3.3.18)
This gives us an elliptic equation to solve for ᾱ. The equation is coupled
with the other four equations coming from the constraints, so we end up with a
system of five coupled equations from which we can find ψ, ᾱ, and β i given γ̃ij ,
K, and their time derivatives.
There is one final important comment to make concerning the relationship be-
tween the conformal thin-sandwich approach and the York–Lichnerowicz confor-
mal decomposition. Notice that if, in the weighted decomposition of Section 3.2.3,
we make the choices
then the equations to solve become identical with those of the thin-sandwich
approach. This not only reinforces the fact that the weighted decomposition is
the natural decomposition of Āij into transverse and longitudinal parts, but
it also provides a natural interpretation for the free data in that approach. In
particular, it tells us that the natural choice for the weight function σ̄ is twice
the conformal lapse. This observation closes the circle and we find that the
York–Lichnerowicz conformal decomposition and the conformal thin-sandwich
approach are completely consistent with each other.
3.4 MULTIPLE BLACK HOLE INITIAL DATA 105
8D̄2 ψ − R̄ ψ = 0 , (3.4.1)
where we have also used the fact that we are in vacuum, so that ρ = 0.
We will simplify the problem further by choosing a flat conformal metric, so
that R̄ = 0. We are then left with the simple equation
2
Dflat ψ=0, (3.4.2)
2
where again Dflat is the standard flat-space Laplace operator. The boundary
conditions correspond to an asymptotically flat spacetime (far away, the gravi-
tational field goes to zero), so that at infinity we must have ψ = 1. The simplest
solution to this equation that satisfies the boundary conditions is clearly
ψ=1, (3.4.3)
That is, we have recovered initial data for Minkowski spacetime (though through
a rather elaborate route).
The next interesting solution is clearly
ψ = 1 + k/r , (3.4.5)
106 INITIAL DATA
D2 (f ψ n ) = f ψ n−1 D2 ψ
+ ψ n D2 f + n(n − 1) f ∂m ln ψ ∂ m ln ψ + 2n ∂m f ∂ m ln ψ . (3.4.8)
Taking n = 7 and using the fact that D2 ψ = 0, the equation for the lapse
becomes 7
2 2
Dflat ᾱψ = Dflat (αψ) = 0 . (3.4.9)
Again, we have to choose boundary conditions. If we ask for the lapse to be-
come unity far away and vanish at the horizon, which in isotropic coordinates
corresponds to r = M/2, we find
αψ = 1 − M/2r , (3.4.10)
which implies
1 − M/2r
α= . (3.4.11)
1 + M/2r
We have then also recovered the isotropic lapse of equation (1.15.28).
N
mi
ψ =1+ . (3.4.12)
i=1
2 |
r −
ri |
This solution will represent N black holes that are momentarily at rest, located at
the points
ri . The parameters mi are known as the bare masses of each black hole,
and in terms of them the total ADM mass of the spacetime (see Appendix A)
3.4 MULTIPLE BLACK HOLE INITIAL DATA 107
$
turns out to be simply MADM = i mi . The bare mass corresponds with the
individual mass only in the case of a single black hole, for more than one black
hole the definition of the individual masses is somewhat trickier. One possible
definition is obtained by going to the asymptotically flat end associated with each
hole and calculating the ADM mass there. This can be easily done by considering
spherical coordinates around the ith center and using a new radial coordinate
r̃i = Mi2 /4ri that goes to infinity at
r =
ri . We then find that the ADM mass of
each individual black hole is
⎛ ⎞
mj
M i = mi ⎝ 1 + ⎠ , (3.4.13)
2rij
i=j
with rij := |
ri −
rj | the coordinate distance between the black hole centers.
Another way in which we can define the mass of each hole is by locating the in-
dividual apparent horizons and using the relationship between mass and horizon
area for Schwarzschild (equation (1.15.26)). Numerical experiments have in fact
found that this agrees extremely well (to within numerical accuracy) with the
individual ADM masses given above [289].
Since the initial data is time-symmetric, the location of the individual ap-
parent horizons will coincide with the surfaces of minimal area, that is, with
the throats of the wormholes. Strictly speaking, however, the solution will only
represent N black holes if the points ri are sufficiently far apart, as otherwise
we might find that the black hole horizons have merged and there are really
fewer black holes (maybe even only one) with complicated interior topologies.
For example, if we have two equal-mass black holes with m1 = m2 = 1, we find
that there is indeed a common apparent horizon if the centers are separated in
coordinate space by less that |r1 − r2 | ∼ 1.5, so in that case we have in fact a
single distorted black hole [79, 56, 8].
The solution (3.4.12) is known as the Brill–Lindquist initial data [79, 192].
As in the case of Schwarzschild, for Brill–Lindquist data each singular point
represents infinity in a different asymptotically flat region, so that our Universe
is connected with N different universes through Einstein–Rosen bridges (worm-
holes). The points r = ri are strictly speaking not part of the manifold. We can
then think of this solution as given in 3 with N points removed. These removed
points are commonly know as punctures. Since Brill–Lindquist data represents
in fact N + 1 joined asymptotically flat universes, its topological structure is
much more complex than that of Schwarzschild. Still, this non-trivial topology
will be “hidden” inside the black hole horizons, so it should have no effect on
the exterior Universe.
It turns out that we can in fact construct a solution that represents N black
holes but that contains only two isometric (i.e. identical) universes (see Fig-
ure 3.1). In this case all N wormholes will connect the same two universes. This
solution was found by Misner [205] and is known as Misner initial data. The
solution involves an infinite series expansion and is constructed using a method
108 INITIAL DATA
Fig. 3.1: Topology of a spacetime containing two black holes. The left panel shows the
case of Brill-Lindquist data, for which our universe is joined to two distinct universes
via wormholes. The right panel shows Misner type data for which there are only two
isometric universes joined by two wormholes.
of images (as each hole can “feel” the others an infinite number of times through
the wormholes). The construction of Misner data is rather involved, and here I
will just give the result for the case of two equal-mass black holes. In this case
the black hole throats are coordinate spheres whose centers are located on the
z-axis by construction. The solution is given in terms of a parameter µ that is
related to the position of the centers of the spheres z = ±z0 and their coordinate
radius a through
z0 = coth µ , a = 1/ sinh µ . (3.4.14)
In terms of this parameter, the conformal factor turns out to be
∞
1 1 1
ψ =1+ + , (3.4.15)
n=1
sinh(nµ) rn+ rn−
with %
rn± =
2
x2 + y 2 + (z ± coth(nµ)) (3.4.16)
From this expression for the conformal factor we can find that the total ADM
mass of the system is given by
∞
1
MADM = 4 . (3.4.17)
n=1
sinh(nµ)
We can also show that the proper distance L along a straight coordinate line
between the throats is
∞
n
L = 2 1 + 2µ . (3.4.18)
n=1
sinh(nµ)
Notice how, as µ increases, the centers of the holes approach each other and the
coordinate radius of their throats decreases (for infinite µ we find z0 = 1 and
a = 0). At the same time, the total ADM mass becomes smaller and the two
holes move closer also in terms of proper distance, but the ratio L/MADM in fact
3.4 MULTIPLE BLACK HOLE INITIAL DATA 109
increases. For values of µ less than µ ∼ 1.36 we finds a single apparent horizon
around the two throats [86], while numerical evolutions indicate that for values
less than µ ∼ 1.8 there is also a common event horizon around the throats on
the initial time slice [23].
Owing to its topological simplicity, for a long time Misner data was considered
more “natural”. It was frequently used in the numerical simulation of the head-on
collision of two black holes, and as such it has become a reference case. Notice that
precisely because of the property of having an isometry at each of the wormhole
throats, the natural way to evolve Misner type data is to take advantage of the
isometry and evolve in 3 minus N balls on which symmetry boundary conditions
are applied. This approach works well in the case of the head-on collision of two
black holes, where there is rotational symmetry around the axis joining the black
holes and the coordinates can be adapted to the horizons, but it becomes much
more difficult in the case of orbiting black holes where there are no symmetries,
and Cartesian coordinates are used. Because of this, plus the fact that it involves
an infinite number of singular points inside each throat, Misner type initial data
is not often used anymore and approaches based on Brill–Lindquist type data
are preferred.
1 i
1
V̄ i = − 7P + ni nj P j + 2 ijk nj Sk , (3.4.20)
4r r
with P i and S i constant vectors, ni the outward-pointing unit radial vector, and
ijk the completely antisymmetric Levi–Civita tensor in three dimensions. To
110 INITIAL DATA
The constant vectors P i and S i have clear physical interpretations. Using the
expressions for the ADM integrals found in Appendix A, we see that the linear
and angular momenta at spatial infinity can be calculated as
&
1 i
i
P = lim Kl − δli K nl dS , (3.4.23)
8π r→∞
&
1
Ji = lim ijk xj Kkl nl dS , (3.4.24)
16π r→∞
where the integrals are done over spheres of constant r, with ni the unit outward-
pointing normal vector to the sphere, and where the {xi } are taken to be asymp-
totically Cartesian coordinates (notice that in our case K = 0, so the first integral
simplifies somewhat). Assuming now that for large r the conformal factor be-
comes unity, and substituting (3.4.22) in the above expressions, we find after
some algebra that the vectors P i and S i are precisely the linear and angular
momenta for this spacetime. And since the momentum constraints are linear, we
can add solutions of this form at different centers r = ri to represent a set of
“particles” with the given momenta and spins.
The Bowen–York extrinsic curvature can be used directly as a solution of
the momentum constraints, but as it stands it is not isometric when we consider
Misner-type initial data. we can, however, find a somewhat more general solution
that is symmetric with respect to inversions through a coordinate sphere (see [74]
for details about the inversion through a sphere). The more general expression
is
3
± 4 ni Pj + nj Pi − nk P k (5ni nj − δij )
2r
3
− 3 (ilk nj + jlk ni ) nl S k , (3.4.25)
r
3.4 MULTIPLE BLACK HOLE INITIAL DATA 111
The extra term proportional to a2 does not contribute to either the linear or
angular momenta, but guarantees that the solution is symmetric under inversion
through a sphere of radius a, the throat of the Einstein–Rosen bridge, which
can later be identified with the black hole horizon. The two different signs in
the extra term correspond to the cases where the linear momentum on the other
side of the wormhole is either +P i or −P i . When we have more than one black
hole, we can still find solutions that have inversion symmetry with respect to
each throat, but just as before this requires an infinite series expression.
N
mi
ψ = ψBL + u , ψBL = . (3.4.29)
i=1
2 |
r −
ri |
The singular piece is therefore assumed to have the same behavior as in Brill–
Lindquist data (but notice that the additive 1 from the Brill-Lindquist conformal
factor (3.4.12) has now been absorbed in the u). It is clear that the term ψBL
has zero Laplacian on 3 with the points
r =
ri excised, i.e. on a “punctured”
3 . The Hamiltonian constraint then reduces to
−7
2 u
Dflat u+η 1+ =0, (3.4.30)
ψBL
with
1 ij
η= 7 Āij Ā , (3.4.31)
8ψBL
and we have used the fact that K = 0, and also that the spatial metric is
conformally flat so that R̄ = 0. The last equation must now be solved for u.
As before, we must now consider what boundary conditions must be imposed
on u, both at infinity and at the punctures. At infinity, asymptotic flatness again
implies that we must have u = 1 + k/r for some constant k, or in differential
form ∂r u = (1 − u)/r.
The key observation of the puncture method is that we can in fact solve for
u with no special boundary conditions at the punctures. To see this, notice that,
near a given puncture, we have ψBL ∼ 1/|
r −
ri |, while for Bowen–York extrinsic
curvature of the form (3.4.22) we find that Āij Āij diverges for non-zero spin as
|
r −
ri |−6 and for zero spin as |
r −
ri |−4 , so that η goes to zero as |
r −
ri | for
non-zero spin and as |
r −
ri |3 for zero spin. The Hamiltonian constraint then
2
reduces near the punctures to Dflat u = 0. Brandt and Bruegmann show that
under these conditions there exists a unique C 2 solution u to the Hamiltonian
constraint in all of 3 , so that we can ignore the punctures when solving for u.
Another interesting property of this method is that we can show that each
puncture corresponds to a separate asymptotically flat region, and that as seen
from those regions the corresponding black holes have zero linear momentum.
This is because the linear momentum on the other side of a given throat arises
3.4 MULTIPLE BLACK HOLE INITIAL DATA 113
⎛ ⎞
mj
M i = mi ⎝ 1 + u i + ⎠ , (3.4.32)
2rij
i=j
with ui = u(
r = r
i ).
The puncture approach is considerably easier to implement numerically than
the inversion-symmetric approach, and because of this in recent years it has
become more common in practice.
hole initial data.33 It is important to stress the fact that the problem does not
come from the use of the techniques described in Sections 3.2 and 3.3 which
are completely general, but rather from the specific choices made for the free
data: a maximal initial slice (K = 0) with a conformally flat metric and a purely
longitudinal Bowen–York extrinsic curvature.
Although there has been some work on trying to find alternatives to the
Bowen–York extrinsic curvature while still retaining a conformally flat metric
(see for example [106]), most work has centered around the idea of using Kerr
black holes, originally proposed by Matzner, Huq and Shoemaker [201]. The
proposal calls for the use of superposed boosted Kerr black holes as initial data,
but written in Kerr–Schild form to make sure that there are no singularities at
the black hole horizons. Thus, when the black holes are far apart no spurious
gravitational wave content will be present.
Of course, since the Einstein equations are non-linear the principle of super-
position is not satisfied, so that we must “correct” this initial data by solving
the constraints. The method starts by first identifying the 3+1 quantities from
the Kerr–Schild metric (1.16.16), which turn out to be
1
α2 = , (3.4.33)
1 + 2H
2Hl∗i
βi = , βi = 2Hli , (3.4.34)
1 + 2H
2H
γij = δij + 2Hli lj , γ ij = δ ij − 2Hl∗i l∗j 1 − , (3.4.35)
1 + 2H
with l∗i := δ ij lj . Now, for a time-symmetric situation we would want the time
derivative of the metric to vanish at t = 0. Using this we can find that the
extrinsic curvature is given by
1
Kij = (Di βj + Dj βi )
2α
1
= √ [∂i (Hlj ) + ∂j (Hli ) + 2H l∗a ∂a (Hli lj )] . (3.4.36)
1 + 2H
in fact quite small. Both perturbative and fully non-linear numerical simulations indicate that
for an angular momentum of J/M 2 ∼ 0.5 the energy radiated by such a black hole is of order
10−4 M [145, 106, 91], which should be compared with ∼ 10−3 M for the energy radiated by a
head-on collision of two black holes [28], and ∼ 10−2 M for the energy radiated by the inspiral
collision of two orbiting black holes [42, 43].
3.5 BINARY BLACK HOLES IN QUASI-CIRCULAR ORBITS 115
If, on the other hand, we want to have a black hole with some initial linear
momenta, we can use instead a boosted Schwarzschild or Kerr black hole as
starting point and compute the corresponding extrinsic curvature.
In order to find binary black hole initial data we now choose the conformal
metric γ̄ij as a direct superposition of two Kerr–Schild metrics
(1) (1) (2) (2)
γ̄ij dxi dxj = δij + 2H (1) (r1 ) li lj + 2H (2) (r2 ) li lj dxi dxj , (3.4.38)
(a)
where H (a) and li are the functions corresponding to a single black hole. As
before, the physical metric will be related to this metric by γij = ψ 4 γ̄ij , and we
would need to solve the Hamiltonian constraint to find ψ.
For the momentum constraints, we take as a trial extrinsic curvature simply
the direct sum of the extrinsic curvatures for the two Kerr–Schild black holes. We
then take the trace of the resulting extrinsic curvature as K, and the tracefree
part as M̄ ij , and solve the momentum constraints using the techniques described
in Section 3.2 to find the physical extrinsic curvature.
Since the work of Matzner et al., other approaches based on the idea of
using Kerr–Schild data as a starting point have also been proposed [58, 210]. We
might expect that these approaches would result in smaller amounts of spurious
gravitational waves than the Bowen–York approach. Kerr–Schild initial data for
binary black hole collisions have in fact been used in practice, though their total
content of gravitational waves has not yet been estimated.
Below we will briefly discuss each of these two approaches to finding initial data
for black holes in quasi-circular orbits.
Eb := MADM − M1 − M2 , (3.5.1)
where MADM is the ADM mass of the full spacetime, and Mi are the individual
masses of each black hole. The basic problem here is how to define the masses
of the individual black holes, as these are not well-defined concepts for black
holes that are close to each other. The usual approach is to use the relationship
between the area of the horizon A and the mass M of a single Kerr black hole
34 The ISCO also exists for a test particle orbiting a Schwarzschild black hole, and in that
Let us then start from the standard conformal decomposition of the met-
ric γ̄ij = ψ −4 γij . We will now assume that the initial conformal metric is flat
γ̄ij = δij , and also that we start from a maximal slice K = 0. Since we want
quasi-stationary initial data, it is natural to ask for
Notice that with these choices we have now completely exhausted the free data
for the conformal thin-sandwich method. In particular, the previous choices will
fix the lapse and shift through equations (3.3.9) and (3.3.18).
Even though by construction we will have momentarily stationary values for
the conformal metric and extrinsic curvature, in general we will not have ∂t ψ = 0
or ∂t Āij = 0. This is where the existence of an approximate Killing field comes
into play. It implies that it should be possible to choose coordinates such that
∂t ψ ∼ 0 and ∂t Āij ∼ 0. Of course, as already mentioned the lapse and shift
have already been fixed by our previous choices, but only up to the boundary
conditions used for the elliptic equations.
Consider the boundary conditions at infinity. Asymptotic flatness implies
that the lapse must be such that limr→∞ ᾱ = 1. The shift, however, is more
interesting. As already mentioned it is clear that if want to minimize changes in
the geometry we must go to a corotating frame, so that far away the shift vector
must approach a rigid rotation:
lim β
= Ω
eφ , (3.5.5)
r→∞
where
eφ is the basis vector associated with the azimuthal angle φ.
At this point, however, we still don’t have information that will allow us to
fix the angular velocity Ω. Grandclement et al. suggest that this can be done
by comparing two different definitions of the mass of the spacetime, namely the
ADM mass and the Komar mass. As discussed in Appendix A, the total mass and
angular momentum at spatial infinity can be expressed via the ADM integrals:
&
1
MADM = lim γ ij [γik,j − γij,k ] dS k , (3.5.6)
16π r→∞
&
1
JADM = lim Kij eφ i dS j , (3.5.7)
8π r→∞
where the integrals are taken over coordinate spheres, and where dS i = si dA
with si the unit outward-pointing normal vector to the sphere (in the first in-
tegral above, the spatial metric must be expressed in asymptotically Cartesian
coordinates). In the particular case of a conformally flat spatial metric like the
one we are using here, the first integral can be shown to reduce to
&
1
MADM = − lim ∂i ψ dS i . (3.5.8)
2π r→∞
3.5 BINARY BLACK HOLES IN QUASI-CIRCULAR ORBITS 119
we can also
Now, in the case when the spacetime admits a Killing field ξ,
define conserved quantities through the Komar integral (remember that for a
Killing field we have ∇(µ ξν) = 0):
&
1
IK ξ
= − sµ nν ∇µ ξν dA , (3.5.9)
4π
where here nµ is the timelike unit normal to the spacelike hypersurfaces. If ξ µ is
timelike and such that ξµ ξ µ = 1 at infinity, then the above integral corresponds
to the total mass of the spacetime M , while if it is an axial vector associated
with an angular coordinate φ it will correspond to −2J, with J the total angular
momentum (the factor −2 is needed to obtain the correct normalization). The
Komar mass and angular momentum should coincide with the ADM integrals.
In our case we have neither a timelike nor an axial Killing field, but we do
have an (approximate) helical Killing field. If we choose the boundary conditions
on the shift appropriately, this Killing field should correspond to the time vector
ξ µ = tµ = αnµ + β µ , and we would expect the following relation to hold [290]
= MADM − 2ΩJADM ,
Ik α
n + β (3.5.10)
= Ω
eφ . Notice now that
where the factor Ω appears because far away β
sµ nν ∇µ ξν = sµ nν (α∇µ nν + nν ∇µ α + ∇µ βν ) . (3.5.11)
IK ξ = ∇k α − β j Kjk dS k
4π
& &
1 Ω
= ∇k α dS k − Kjk eφ j dS k . (3.5.13)
4π 4π
Comparing the last result with the ADM expressions given above, we see that
the angular momentum part is identical while the mass part differs, being given
in the Komar case in terms of the lapse α and in the ADM case in terms of the
conformal factor ψ. The relation (3.5.10) then reduces to
& &
∇k α dS k = −2 ∇k ψ dS k . (3.5.14)
The last condition must hold if tµ = αnµ + β µ is a Killing vector, but for this
to be the case we must know the correct angular velocity in order to have the
shift correspond to a true corotation frame. The key observation here is that we
120 INITIAL DATA
can turn the condition around and use it to determine the angular velocity. The
idea is to solve the conformal thin-sandwich equations with a given value of Ω
fixing the exterior boundary condition for the shift, and then change the value of
Ω until (3.5.14) is satisfied. Of course, in our case we only have an approximate
Killing field, but we would expect that this way of fixing the angular velocity
will still bring us to the correct corotating frame. Notice that for Schwarzschild
the asymptotic behavior of the lapse and the conformal factor is
which is precisely of the form (3.5.14). In the more general case we will still
have ψ 1 + Mψ /2r and α 1 − Mα /r, but Mψ and Mα can not be expected
to coincide unless we use the correct angular velocity for the shift boundary
condition. This is the fundamental element of the quasi-equilibrium approach.
Of course, we still need to worry about the inner boundary conditions either
at the black hole horizons or at the punctures. This issue is very technical and
we will not discuss it in detail here. However, it should be mentioned that in
the original approach of Grandclement et al. the initial data is assumed to be
inversion-symmetric at the black hole throats, which also correspond to apparent
horizons, a condition that requires the black holes to be corotating (i.e. their spin
is locked with the orbital motion). However, as pointed out by Cook [103], this
set of boundary conditions does not guarantee regularity, and when an artificial
regularization procedure is used it leads to solutions that no longer satisfy the
constraints near the throats. In [103] Cook has generalized the original proposal
to allow for regular boundary conditions by giving up the inversion symmetry,
and has thus been able to obtain solutions for black holes that can have arbitrary
spins. Tichy et al. have also applied the quasi-stationary idea to puncture type
data with Bowen–York extrinsic curvature [289, 290].
4.1 Introduction
As already mentioned in Chapter 2, in the 3+1 formalism the choice of the
coordinate system is given in terms of the gauge variables: the lapse function α
and the shift vector β i . These functions appear in the evolution equations (2.3.11)
and (2.5.5) for the metric and extrinsic curvature. However, Einstein’s equations
say nothing about how the gauge variables should be chosen. This is what should
be expected, since it is clear that the coordinates can be chosen freely.
The freedom in choosing the gauge variables is a mixed blessing. On the one
hand, it allows us to choose the coordinates in a way that either simplifies the
evolution equations or makes the solution better behaved. On the other hand,
we are immediately faced with the following question: What is a “good” choice
for the functions α and β i ?
There are, of course, some guidelines that one would like to follow when
choosing good gauge conditions, which can be summarized in the following “wish
list”:
121
122 GAUGE CONDITIONS
a0 = β m ∂m ln α , (4.2.2)
ai = ∂i ln α . (4.2.3)
We then see that the spatial components of the proper acceleration are given by
the gradient of the lapse.
Another important relationship comes from the evolution of the volume ele-
ments associated with the Eulerian observers. The change in time of these volume
elements is simply given by the divergence of their 4-velocities ∇µ nµ . Using now
the definition of the extrinsic curvature we find that:
∇µ nµ = −K , (4.2.4)
that is, the rate of change of the volume elements in time is just (minus) the
trace of the extrinsic curvature.
4.2 SLICING CONDITIONS 123
∂t K = β i ∂i K − D2 α + α Kij K ij + 4π (ρ + S) , (4.2.5)
where we have used the Hamiltonian constraint (2.4.10) to eliminate the Ricci
scalar. For geodesic slicing this equation reduces to
∂t K − β i ∂i K = α Kij K ij + 4π (ρ + S) . (4.2.6)
The first term on the right hand side is clearly always positive, and so is the
second term if the strong energy condition holds. This means that along the
normal direction the trace of the extrinsic curvature K will increase without
bound, which through (4.2.4) implies that the volume elements associated with
the Eulerian observers will collapse to zero. Because of this, geodesic slicing is
never used in practice except to test numerical codes. For example, it is well
known that the free fall time to the singularity for an observer initially at rest
at the Schwarzschild horizon is t = πM . We can then set up initial data for
Schwarzschild in isotropic coordinates, evolve using geodesic slicing, and expect
the code to crash at that time.
by demanding that the volume elements associated with the Eulerian observers
remain constant. From equation (4.2.4) we can see that this is equivalent to
asking for
K = ∂t K = 0 . (4.2.7)
It is clear that we must require not only that K vanishes initially, but also that
it remains zero during the evolution. Asking now for the above condition to hold,
we find through (4.2.5) that the lapse function must satisfy the following elliptic
equation
D2 α = α Kij K ij + 4π (ρ + S) . (4.2.8)
This condition is known as maximal slicing.35 The name comes from the fact
that we can prove that when K = 0 the volume of the spatial hypersurface is
maximal with respect to small variations in the hypersurface itself.
Maximal slicing was suggested originally by Lichnerowicz [189], and was al-
ready discussed in the classic papers of Smarr and York [272, 305]. It has been
used over the years for many numerical simulations of a number of different sys-
tems, including black holes, and is still in use today. It has the advantage of
being given by a simple equation whose solution is smooth (because it comes
from an elliptic equation), and also of guaranteeing that the Eulerian observers
will not focus. It is important to mention also that this slicing condition can
only be used for asymptotically flat spacetimes and can not be used in the case
of cosmological spacetimes where volume elements always expand or contract
with time (on a closed Universe only one maximal slice exists at the moment
of time symmetry). However, in that case we can relax the condition and use
K = constant, with a different constant for each slice (in fact, in such cases K
can be used as a time coordinate). For asymptotically flat spacetimes we can in
fact also use the condition K = K0 = constant, ∂t K = 0. This corresponds to
hyperboloidal slices that reach null infinity J ± (depending on the sign of K0 ).
Such hyperboloidal slices can be useful to study gravitational radiation at infin-
ity, and are natural choices in Friedrich’s conformal approach (see Section 2.9.2).
They have also been used in numerical simulations based on the standard 3+1
approach to improve the stability of the evolution [142].
time
Event horizon
Singularity
Spacelike slices
radius
Collapsing matter
Fig. 4.1: Schematic representation of the collapse of the lapse when approaching a sin-
gularity. Time slows down in the region close to the singularity, but continues advancing
away from the singularity.
they fail to do so in the case of the spherical collapse of self-similar dust [115].
126 GAUGE CONDITIONS
paid. As time advances outside and freezes inside, the spatial slices become more
and more distorted, leading to a phenomenon known as slice stretching which
results in a rapid growth of the radial metric component and the development
of large gradients, which eventually cause numerical codes to fail.37 As pointed
out by Reimann in [237, 236], slice stretching is in fact a combination of two sep-
arate effects, referred to by Reimann as slice sucking and slice wrapping. Slice
sucking refers to the fact that the mere presence of the black hole results in the
differential infall of coordinate observers, as those closer to the black hole fall
faster, resulting in their radial separations increasing and the radial metric com-
ponent growing. This means that slice stretching will occur even for geodesic
slicing. The second effect of slice wrapping is the one that one usually has in
mind when thinking about singularity avoidance and is precisely a consequence
of the collapse of the lapse and the slices “wrapping” around the singularity.
Slice stretching results in a power law growth of the radial metric. For the max-
imal slicing of Schwarzschild, Reimann finds that at late times the peak in the
radial metric grows as ∼ τ 4/3 , with τ proper time at infinity (the loose statement
often found in the numerical literature indicating that slice stretching results in
“exponential” growth of the radial metric is therefore incorrect).
If there is a major disadvantage to maximal slicing it is the fact that solving
elliptic equations numerically in three dimensions is a very slow process. Even
with fast elliptic solvers, we can find that over 90% of the processor time is used
to solve just the maximal slicing equation (with zero shift; if we also have elliptic
shift conditions things get much worse). Also, setting up boundary conditions
for complicated inner boundaries such as those that can be found when excising
black hole interiors (see Chapter 6) can be a very hard problem. Because of this
in the past few years maximal slicing has been giving place to hyperbolic-type
slicing conditions like those that we will discuss in Section 4.2.4. Still, when
computer time restrictions are not an issue and good boundary conditions are
known, maximal slicing is probably the best slicing condition available.
Since maximal slicing requires the solution of an elliptic problem, the issue
of boundary conditions is very important (here I will consider only the exterior
boundaries and ignore the possible presence of complicated interior boundaries).
For asymptotically flat spacetimes we can assume that far from the sources we
should approach the Schwarzschild solution, in which case the lapse function
behaves asymptotically as
α = 1 − c/r , (4.2.10)
with c some constant (in static coordinates we find c = M ) . We can eliminate
the unknown constant by taking a derivative with respect to r to find
∂r α = (1 − α) /r . (4.2.11)
37 In older literature one often finds that this is called grid stretching, but the name is
misleading as this is a geometric effect that happens at the continuum level, and is quite
independent of the existence of a numerical grid.
4.2 SLICING CONDITIONS 127
This is known as a Robin boundary condition and is the standard condition used
when solving maximal slicing.
nµ = −N ∇µ σ = −N (1, −F ) , (4.2.15)
−2 −1
2M 1 2M
F 2 = 1− − 2 1− . (4.2.17)
r N r
P (r) = r4 − 2M r3 + C 2 . (4.2.22)
at the form of the polynomial P (r) it is not difficult to show that there will be
two distinct real roots for C 2 < 27M 4 /16 (both roots are such that r > 0), and
no roots if C 2 exceeds this value.38 We will therefore ask for C 2 < 27M 4/16.
Define now rC as the position of the largest real root of P (r). The particular
case C 2 = 0 corresponds to rC = 2M and in fact reduces to the Schwarzschild
t = 0 slice (or t = constant for a non-zero value of σ). On the other hand, as C 2
approaches the critical value 27M 4 /16, rC becomes smaller and approaches the
value 3M/2 corresponding to the limiting maximal slice of Schwarzschild.
The function F will now be given by integrating (4.2.21):
r
C
F (r, C) = − 1/2
dx , (4.2.23)
rC (1 − 2M/x) [P (x)]
where the integral over the pole at x = 2M is taken in the sense of the principal
value. In order to see that the surfaces are indeed smooth at r = 2M , define first
dH 8M 4
lim = −1. (4.2.26)
r→2M dr C2
That is, H(r, C) is a smooth function of r at the horizon. But notice now that
one from r = 0 to the left root, and another from the right root to r = ∞. The first family will
clearly reach the singularity at r = 0 so we are not interested in it.
130 GAUGE CONDITIONS
The lapse function α associated with this foliation is then defined through
nµ = −α∇µ τ . (4.2.29)
∂τ
N =α . (4.2.30)
∂t
Solving for α we find
∂t ∂F ∂F dC
α=N =N =N . (4.2.31)
∂τ ∂τ ∂C dτ
The problem now is to find ∂F/∂C, which is a non-trivial task. To see why this
is so, notice that the lower limit of integration in (4.2.23) is rC which clearly
depends on C. We therefore pick up a boundary term when differentiating with
respect to C, given essentially by the integrand itself evaluated at rC , which
clearly diverges (remember that rC is a root of P (r)). There is, however, a trick
that can be used to calculate this derivative (the derivation is rather long so we
will not include it here; the interested reader can see Appendix B of [51]). We
find that
r
∂F (r, C) r2 1 x (x − 3M )
= − dx . (4.2.32)
∂C 2 (r − 3M/2) P (r) 1/2 2 rC (x − 3M/2)2 P (x)1/2
Notice that this expression is regular at r = rC . We can in fact also show that
both α and N are linearly independent spherically symmetric solutions of the
maximal slicing equation (D2 − Kij K ij )f = 0, but with different boundary
conditions: Both N and α go to 1 at the right infinity, but at the left infinity N
goes to −1 while α goes to 1.
4.2 SLICING CONDITIONS 131
We can now use expression (4.2.33) to study the late time behavior of the
lapse
√ at different interesting locations. Notice first that, as C goes to Clim =
3 3M 2 /4, τ (C) goes to infinity and the areal radius r goes to 3M/2. Define
now
δ := rC − 3M/2 , (4.2.34)
that is, δ is the difference between the radius at the throat and its limiting value.
In terms of δ, the parameter C can be rewritten as
3/2 1/2
C = (δ + 3M/2) (M/2 − δ) . (4.2.35)
A long calculation (see [51]) shows that, for small δ, the following relation holds
τ δ
= −Ω ln + A + O(δ) , (4.2.36)
M M
with Ω and A constants given by
√
3 6
Ω= 1.8371 , (4.2.37)
4
√ √ √
3 6 3 3−5
A= ln 18 3 2 − 4 − 2 ln √ −0.2181 . (4.2.38)
4 9 6 − 22
We then have
√ !
dτ dτ dδ 3 6 M 1 3 M
= − −√ √ = √ . (4.2.39)
dC dδ dC 4 δ 2 6δ 4 2 δ2
Notice also that from (4.2.33) the lapse at the throat can be seen to be given by
dC 1
α(rC ) = , (4.2.40)
dτ 2δ
so that we finally find
√ √
2 2 δ 2 2
α(rC ) exp (A/Ω) exp (−τ /ΩM ) . (4.2.41)
3 M 3
This shows that for the Schwarzschild spacetime the value of the lapse at
the throat collapses exponentially with time as the slices approach the limiting
surface r = 3M/2. The time-scale of this√exponential collapse, normalized with
respect to the mass M , is given by Ω = 3 6/4 1.8371. This decay rate was in
fact discovered very early on. For example, by using a combination of numerical
results and a model problem, Smarr and York estimated this time-scale to be
∼ 1.82 [272].
It is also interesting to study the late time behavior of the lapse at the black
hole horizon. For a long time it has been known empirically that for maximal
132 GAUGE CONDITIONS
slicing the lapse at the horizon approaches the value α ∼ 0.3, and in fact in
numerical simulations there is a “rule of thumb” that says that once the collapse
of the lapse is well under way the level surface α ∼ 0.3 is a rough indicator of
the position of the apparent horizon. In [238], Reimann and Bruegmann have
studied the late time behavior of the lapse at the horizon using the techniques
described above. Notice that from equation (4.2.28) we find
dτ ∂F (∞, C)
= . (4.2.42)
dC ∂C
with r
x (x − 3M )
KC (r) := 2 dx . (4.2.44)
rC (x − 3M/2) P (x)1/2
The lapse at the horizon r = 2M will then be
1 2 C
αr=2M =− − KC (2M ) . (4.2.45)
KC (∞) M 4M 2
Reimann and Bruegmann show that at late times KC (∞) blows up as 1/δ 2 , and
KC (2M ) KC (∞) + η + O(δ 2 ), with η a numerical constant whose exact value
is known but is of no consequence here. This implies that at late times
√
Clim 3 3
αr=2M = = ∼ 0.3248 , (4.2.46)
4M 2 16
which is very close to the value ∼ 0.3 found empirically.
There is one last important issue to discuss regarding the maximal slicing
of Schwarzschild. Notice that for the symmetric (even) slices we have been dis-
cussing here, the lapse collapses at the throat but remains equal to one at the
two asymptotic infinities. If we evolve puncture data then this means that the
lapse will remain equal to one at the puncture, which is in fact a point in the
middle of the computational domain corresponding to r̃ = 0 with r̃ the isotropic
radial coordinate. However, if we solve the maximal slicing condition numerically
over the whole computational domain using Robin boundary conditions on the
outer boundary, we find that the lapse in fact collapses at the puncture. That is,
the numerical algorithm does not settle on the “even” solution, but rather on a
solution that corresponds to having ∇i α = 0 at the puncture, the so-called zero
gradient at puncture, or simply the puncture lapse. This solution has also been
4.2 SLICING CONDITIONS 133
r=
r=
r=
2M 2M 2M
r= r= r=
2M
2M
2M
Fig. 4.2: Maximal slicing of Schwarzschild as seen in the Kruskal diagram for odd, even
and puncture lapse (plots courtesy of B. Reimann).
studied analytically by Reimann and Bruegmann in [239]. They find that at the
puncture the lapse also collapses exponentially with time, but is now such that
√ √
2 2 δ2 2 2
αr̃=0 exp (2A/Ω) exp (−2τ /ΩM ) , (4.2.47)
3 M2 3
so the collapse time-scale is twice as fast as at the throat. The limiting value of
the lapse at the horizon, however, remains the same as before.
Figure 4.2 shows the maximal slicing of the Schwarzschild spacetime as seen in
the Kruskal diagram for the case of the odd, even and puncture lapse. Notice that
the odd lapse simply corresponds to the standard Schwarzschild time slices. The
even lapse is symmetric across the throat, while the puncture lapse is asymmetric.
The plots shown here are the true solution t = F (r, C), with F (r, C) given
by (4.2.23). The coordinates {t, r} are later transformed to Kruskal–Szekeres
coordinates {η, ξ} as discussed in Section 1.15. It is important to mention that
in order to find numerical values for F (r, C), we need first to find the root rC of
the polynomial P (r). Also, the integrand in the expression for F (r, C) has poles
at both r = 2M and r = rC , so that the numerical evaluation of this expression
is highly non-trivial. A procedure to perform this evaluation accurately has been
developed by Thornburg and can be found in Appendix 3 of [285].
used the so-called harmonic coordinates, which are defined by asking for the wave
operator acting on the coordinate functions xα to vanish:
2xα = g µν ∇µ ∇ν xα = 0 . (4.2.48)
d
α ≡ ∂t − £β α = −α2 K . (4.2.50)
dt
This is known as the harmonic slicing condition. Notice that, through the ADM
equations, this condition implies that
d
α̃ = 0 , (4.2.51)
dt
√
with α̃ := α/ γ the densitized lapse introduced in Chapter 2. This implies that,
in the case of zero shift, harmonic slicing can also be written in integrated form as
√
α = h(xi ) γ, with h(xi ) an arbitrary (but positive) time independent function.
It is very important to stress the fact that the integrated relation holds only
when moving along the normal direction to the hypersurfaces, and not when
moving along the time lines which will differ from the normal direction for any
non-zero shift vector. That is, harmonic slicing relates the lapse to the volume
elements associated with the Eulerian observers.
A second route to the use of evolution type slicing conditions started with
the first three-dimensional evolution codes in the early 1990s. Since solving the
maximal slicing condition was very time consuming, some attempts were made
to use algebraic slicing conditions, starting from the integrated form of harmonic
√
slicing α = γ. However, it was quickly realized that such a slicing condition
was not very useful for evolving black hole spacetimes as it approached the sin-
gularity very rapidly (we will see below that harmonic slicing is only marginally
4.2 SLICING CONDITIONS 135
singularity avoiding). This led to the empirical search for better behaved alge-
braic slicing conditions [25, 54], and in particular resulted in the discovery that
a slicing condition of the form α = 1 + ln γ, the so-called 1+log slicing, was very
robust in practice and mimicked the singularity avoiding properties of maximal
slicing.
Both routes finally merged with the work on hyperbolic re-formulations of
the 3+1 evolution equations of Bona and Masso in the early and mid 1990s [62,
63, 64]. This resulted in the Bona–Masso family of slicing conditions [65], which
is a generalization of harmonic slicing for which the lapse is chosen to satisfy the
following evolution equation
d
α = −α2 f (α) K , (4.2.52)
dt
with f (α) a positive but otherwise arbitrary function of α (the reason why f (α)
has to be positive will become clear below). Notice that the particular case f = 1
reduces to harmonic slicing, while f = N/α, with N constant, corresponds (in the
case of zero shift) to α = h(xi ) + ln γ N/2 , so that N = 2 reduces to the standard
1+log slicing. The Bona–Masso version of 1+log slicing, i.e. equation (4.2.52)
with f = 2/α, has been found in practice to be extremely robust and well
behaved for spacetimes with strong gravitational fields [13, 15, 30], and in recent
years has supplanted maximal slicing in most three-dimensional evolution codes
dealing with either black holes or neutron stars.
Let us go back to condition (4.2.52). Taking an extra time derivative we find
d2 d
α = −α f
2
K − α(2f + αf )K 2
, (4.2.53)
dt2 dt
with f := df /dα. Using now the evolution equation for K, equation (4.2.5), we
find that (in vacuum)
d2
2
α − α2 f D2 α = −α3 f Kij K ij − (2f + αf ) K 2 . (4.2.54)
dt
The last equation shows that the lapse obeys a wave equation with a quadratic
source term in Kij . It is because of this that we say that the slicing condi-
tion (4.2.52) is a hyperbolic slicing condition: It implies that the lapse evolves
with a hyperbolic equation. The wave speed associated with equation (4.2.54)
along a specific direction xi can be easily seen to be
vg = α f γ ii . (4.2.55)
Notice that this gauge speed will only be real if f (α) ≥ 0, which explains why
we asked for f (α) to be positive. In fact, f (α) must be strictly positive because
if it were zero we would not have a strongly hyperbolic system (see Chapter 5).
136 GAUGE CONDITIONS
To see how the gauge speed vg is related to the speed of light consider for
a moment a null world-line. It is not difficult to find that such a world-line will
have a coordinate speed along the direction xi given by
vl = α γ ii , (4.2.56)
so the gauge speed (4.2.55) can be smaller or larger that the speed of light de-
pending on the value of f . In the particular case of harmonic slicing we have
f = 1, so the gauge speed coincides with the speed of light, but for the 1+log
slicing with f = 2/α the gauge speed can easily become superluminal. Having
a gauge speed that is larger than the speed of light does not in fact lead to any
causality violations, as the superluminal speed is only related with the propaga-
tion of the coordinate system, and the coordinate system can be chosen freely.
Physical effects, of course, still propagate at the speed of light.
A variation of the Bona–Masso slicing condition has also been proposed that
has the property that for static spacetimes it guarantees that the lapse function
will not evolve [10, 17, 302]. This condition can easily be obtained by asking for
the evolution of the lapse to be such that
αf (α)
∂t α = ∂t γ 1/2 , (4.2.57)
γ 1/2
which results in
∂t α = −αf (α) αK − Di β i . (4.2.58)
The modified condition then substitutes the Lie derivative of the lapse with
respect to the shift with the divergence of the shift itself (essentially the Lie
derivative of γ). This has the consequence that it will not result in the same
foliation of spacetime for a different shift vector, i.e. the foliation of spacetime
we obtain will depend on the choice of shift. However, having a slicing condi-
tion that is compatible with a static solution might have advantages in some
circumstances.
with m some constant power (we will see below that the expected order of a
focusing singularity is in fact m = 1). Notice that we must have m > 0 for there
4.2 SLICING CONDITIONS 137
We then see that for n < 0 the volume elements remain finite as the lapse
approaches zero, in other words for this case we have strong singularity avoid-
ance. Notice that 1+log family of slicing conditions f (α) = N/α corresponds to
n = −1, which implies that it is singularity avoiding in a strong sense, explain-
ing why it has been found to mimic maximal slicing in practice. If, on the other
hand, n > 0 then both the lapse and the volume elements go to zero at the same
time so we can at most have marginal singularity avoidance.
Let us now go back now to the case n = 0. Using again (4.2.61) we find that
It is then clear that in this case α and γ 1/2 also vanish at the same time.
We then find that n < 0 guarantees strong singularity avoidance, while for
n ≥ 0 we can have at most marginal singularity avoidance. In order to see if in
this last case we reach the singularity in an infinite or a finite coordinate time we
need to study the behavior of α as a function of proper time τ as we approach
the singularity. Starting from equation (4.2.62) for the elapsed coordinate time
we find τs 0
dτ dτ /dα
∆t = = dα , (4.2.66)
0 α α0 α
with α0 the initial lapse. Equation (4.2.66) implies that if dτ /dα remains different
from zero as the lapse collapses then ∆t will diverge and we will have marginal
singularity avoidance. On the other hand, if dτ /dα vanishes at the singularity as
αp with p > 0 (or faster), then the integral will converge and the singularity will
be reached in a finite coordinate time, i.e. the singularity will not be avoided.
To find the behavior of dτ /dα as we approach the singularity, we notice that
equation (4.2.59) implies
d ln γ 1/2 m
=− . (4.2.67)
dτ (τs − τ )
dα/dτ m
=− , (4.2.68)
αf (α) (τs − τ )
4.2 SLICING CONDITIONS 139
We still have not addressed the issue of what value of m should be expected,
that is, how fast do the volume elements collapse to zero at the singularity. It is
in fact not difficult to give an estimate of this. First, notice that from the ADM
equations the volume elements behave as
∂t K − β i ∂i K = α R + K 2 + 4π (S − 3ρ) . (4.2.75)
Now, according to the well known BKL conjecture (Belinskii, Khalatnikov and
Lifshitz [53]), in the approach to a singularity the velocity terms can be expected
to dominate (the solution is velocity term dominated or VTD), and both the Ricci
scalar and the matter terms can be ignored compared to the quadratic term K 2 ,
so that along the normal lines we have
d 1
K ∼ K2 ⇒ K∼ . (4.2.76)
dτ τs − τ
Substituting this back into the evolution equation for γ 1/2 we finally find
γ 1/2 ∼ τs − τ . (4.2.77)
The expected order of the singularity is therefore m = 1. This means that for
n = 1 we will have marginal singularity avoidance if A ≥ 1. In other words, the
case f = 1 corresponding to harmonic slicing marks the limit of the region with
marginal singularity avoidance.
There is one final important point to consider in the strongly singularity
avoiding case n < 0. From the form of the slicing condition (4.2.52) we see that
if n ≤ −2, then as the lapse approaches zero we can not guarantee that ∂t α will
also approach zero. In fact, ∂t α could remain finite or even become arbitrarily
negative; the slices will therefore not only avoid the singularity, but will in fact
move back away from it as the lapse function becomes negative. This type of
behavior is not desirable as the time slices could easily stop being spacelike.
If we want to guarantee that we have strong singularity avoidance without the
lapse becoming negative we must therefore choose f (α) such that as α approaches
zero n remains in the region −2 < n < 0. Since the 1+log family corresponds
to n = −1 it is just in the middle of this region, making it a very good choice
indeed.
In terms of P µν , the twist ωµν , strain θµν and acceleration ζ µ of the congruence
are defined as
The strain tensor θµν can be further decomposed into its trace θ = ∇µ Z µ , also
know as the expansion, and its free part σµν = θµν − (θ/3)Pµν , known as the
shear. In terms of the quantities defined above, the covariant derivative of Z µ
can be expressed as
∇µ Zν = ωµν + θµν − ζµ Zν . (4.3.5)
Assume now that our congruence corresponds to the world-lines of the Eule-
rian observers. We then have Z µ = nµ , with nµ the normal vector to the spatial
hypersurfaces. This immediately implies that ωµν = 0, since the congruence is
hypersurface orthogonal. We also find ζ µ = aµ , with aµ the acceleration of the
Eulerian observers, and θµν = −Kµν , with Kµν the extrinsic curvature.
The strain tensor θµν = −Kµν = 1/2 £n γµν corresponds to motion along the
normal lines. In an analogous way, we can define a strain tensor along the time
lies tµ in the following way
1 1
Θij := £t γij = −αKij + £β γij , (4.3.6)
2 2
where we have only considered the spatial components, as the strain tensor is
clearly purely spatial.
The strain tensor Θij just defined measures the change in both the size and
form of the volume elements along the time lines. We can then try to use a shift
142 GAUGE CONDITIONS
vector to minimize some measure of the strain tensor in order to reduce as much
as possible the changes in the spatial metric itself. Smarr and York propose to
minimize the non-negative square of the strain tensor Θij Θij in a global sense
over the spatial hypersurface. If we integrate Θij Θij over the slice, variation with
respect to the shift vector β i can be shown to result in the equation
where the Ricci tensor appears when we commute covariant derivatives of the
shift. This equation is known as the minimal strain shift condition and gives us
three elliptic equations that can be used to find the three components of the
shift vector given appropriate boundary conditions.
The minimal strain condition minimizes a global measure of the change in the
volume elements associated with the time lines. However, as the change in the
volume is related to the trace of the extrinsic curvature K, it would seem better
to use the shift vector to minimize only the changes in the shape of the volume
elements, independently of their size. We then define the shear associated with
the time lines as the tracefree part of Θij :
1
Σij := Θij − γij Θ . (4.3.9)
3
Smarr and York call Σij the distortion tensor in order to distinguish it from the
shear tensor σij associated with the normal lines. From the definition above we
find that the distortion tensor is given by
1
Σij = −αAij + (Lβ)ij , (4.3.10)
2
where as before Aij = Kij − (γij /3) K is the tracefree part of the extrinsic
curvature, and (Lβ)ij is the conformal Killing form associated with the shift
(see equation (3.2.7)):
2
(Lβ)ij := Di βj + Dj βi − γij Dk β k . (4.3.11)
3
We can also rewrite the distortion tensor in a way that is perhaps more illustra-
tive by noticing that (4.3.10) is equivalent to
1 1/3
Σij = γ £t γ̃ij , (4.3.12)
2
with γ̃ij = γ −1/3 γij the conformal metric. The distortion tensor is therefore
essentially the velocity of the conformal metric.
4.3 SHIFT CONDITIONS 143
Minimizing now the integral of Σij Σij over the spatial hypersurface with
respect to the shift yields the condition
Dj Σij = 0 , (4.3.13)
which implies
∆L β i = 2Dj αAij , (4.3.14)
where the operator ∆L is defined as
2 i
∆L β i := Dj (Lβ)ij = D2 β i + Dj Di β j − D Dj β j
3
1 i
= D2 β i + D Dj β j + Rji β j . (4.3.15)
3
Equation (4.3.14) is known as the minimal distortion shift condition, and
again gives us three elliptic equations for the three components of the shift.
The minimal distortion condition is a very natural condition that will minimize
changes in the shape of volume elements during an evolution. It can also be
shown that when the gravitational field is weak, it includes the TT gauge used
to study gravitational waves [273]. Still, the fact that it is given through three
coupled elliptic equations has meant that it has not been extensively used in
three-dimensional numerical simulations.
Notice that, in particular, the expression (4.3.12) for the distortion tensor
implies that the minimal distortion condition can also be written as
Dj ∂t γ̃ ij = 0 . (4.3.16)
This is very closely related to a proposal of Dirac for finding an analog of the
radiation gauge in general relativity [111, 273], for which he suggested using
∂j ∂t γ̃ ij = ∂t ∂j γ̃ ij = 0 . (4.3.17)
Remarkably, this equation has been recently rediscovered in the context of the
BSSNOK formulation of the 3+1 evolution equations. Notice that, from equa-
tion (2.8.8), the conformal connection functions are given by Γ̃i = −∂j γ̃ ij , so the
above gauge condition is equivalent to
∂t Γ̃i = 0 , (4.3.18)
This shift condition, known as Gamma freezing, has been proposed as a natural
shift condition in the context of the BSSNOK formulation, as it freezes three of
the independent degrees of freedom. To see the relationship between the minimal
distortion and Gamma freezing conditions we notice that the evolution equation
for the conformal connection functions (2.8.25) can be written in terms of Σij as
∂t Γ̃i = 2∂j γ 1/3 Σij . (4.3.19)
144 GAUGE CONDITIONS
We then see that the minimal distortion condition Dj Σij = 0, and the
Gamma freezing condition ∂t Γ̃i = 0, are equivalent up to terms involving first
spatial derivatives of the conformal metric and the conformal factor. In particu-
lar, all terms involving second derivatives of the shift are identical in both cases
(but not so terms with first derivatives of the shift which appear in the distortion
tensor itself). That the difference between both conditions involves Christoffel
symbols should not be surprising since the minimal distortion condition is covari-
ant while the Gamma freezing condition is not.
Very recently there has been an important development in looking for more
natural shift conditions. Jantzen and York [168] have proposed a modified form
of the minimal distortion shift vector that takes into account the new weighted
conformal decomposition of symmetric tensors, and the role that the densitized
lapse plays in the ADM evolution equations (see Section 3.2.3). The proposal
of Jantzen and York is to minimize the square of a lapse-corrected distortion
tensor Σij /α (corresponding to the change of the conformal metric with respect
to proper time), and use the full spacetime metric determinant as measure in
the integral (thus giving an extra factor α in the volume element). This new
variation results in the modified condition
Dj ( Σ^{ij}/α ) = 0 ⇒ Dj [ (Lβ)^{ij}/2α − A^{ij} ] = 0 . (4.3.21)
The difference between the original minimal distortion condition and the
modified version is essentially the position of the factor α inside the divergence.
However, the new version has a closer relationship to the initial data problem,
in particular in terms of the conformal thin-sandwich approach of Section 3.3.
To see this, we first rewrite the modified minimal distortion condition (4.3.21)
in terms of conformal quantities using the transformations given in Chapter 3.
We find that the new condition is equivalent to
D̄j [ ( L̄β )^{ij}/2ᾱ − Ā^{ij} ] = 0 . (4.3.22)
Compare now this last equation with the expression for the conformal tracefree
extrinsic curvature in terms of the shift and the velocity of the conformal metric,
equation (3.3.7). We see that if the shift vector satisfies the new minimal distor-
tion condition, then the time derivative of the conformal metric ūij will remain
transverse during the evolution, i.e. D̄j ūij = 0, so that only the transverse-
traceless part of the conformal metric evolves (ūij is traceless by construction).
Thus the new shift condition reduces the evolution of the conformal metric to
its “dynamical” part. This new version of minimal distortion is therefore more
natural and is closer to the original motivation of finding a gauge in general rel-
ativity that separates out the evolution of the physical degrees of freedom from
the gauge degrees of freedom.
where the parameter ε > 0 is chosen small enough to guarantee numerical sta-
bility in the range of resolutions considered, but large enough to allow the shift
to react to changes in the geometry.
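The stability problem alluded to here is the standard one for explicitly integrated parabolic equations: the time step must scale with the square of the grid spacing. The sketch below uses a crude toy model, ∂t β = ε ∂x² β with forward-Euler time stepping (the model, the parameter values and the grid are all made-up stand-ins, not the actual driver equations), to show the amplification factor exceeding one once Δt > Δx²/(2ε); larger ε therefore forces smaller time steps.

```python
import numpy as np

def amplification(eps, dt, dx, k):
    """Von Neumann amplification factor of forward Euler applied to the
    toy parabolic driver d_t beta = eps * d_x^2 beta."""
    return 1.0 - 4.0 * eps * dt / dx**2 * np.sin(k * dx / 2.0)**2

dx = 0.1
kmax = np.pi / dx                      # highest mode representable on the grid
for eps in (0.1, 1.0):
    dt_limit = dx**2 / (2.0 * eps)     # standard explicit diffusion limit
    for dt in (0.5 * dt_limit, 1.5 * dt_limit):
        g = amplification(eps, dt, dx, kmax)
        print(f"eps={eps:4.1f} dt={dt:.4f}  |g|={abs(g):.2f}  "
              f"{'stable' if abs(g) <= 1 else 'UNSTABLE'}")
```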
We can now make a second observation. In practice we find that parabolic
drivers like the one introduced above usually do not allow the shift to respond
rapidly enough to changes in the geometry unless ε is large, but in that case the
numerical stability of the whole evolution system rapidly becomes dominated by
the parabolic shift condition. One way to fix this is to use instead a hyperbolic
driver condition of the form
∂t^2 β^i = α^2 ξ [ D^2 β^i + (1/3) D^i Dj β^j + R^i_j β^j − 2 Dj ( α A^{ij} ) ] , (4.3.27)
known as the hyperbolic minimal distortion driver. An analogous hyperbolic driver can be written for the Gamma freezing condition, the hyperbolic Gamma driver,
∂t^2 β^i = α^2 ξ ∂t Γ̃^i − η ∂t β^i , (4.3.29)
where the parameter η > 0 is used to avoid strong oscillations in the shift (a
similar damping term can also be added to the minimal distortion driver).40
Both the hyperbolic minimal distortion driver and the hyperbolic Gamma
driver provide us with shift conditions that will attempt to control the distor-
tion of the volume elements with some time delay. In practice, the hyperbolic
Gamma driver condition has been found to be extremely robust and well be-
haved in black hole simulations with puncture initial data, controlling both the
slice stretching and the shear due to the rotation of the hole. The minimal dis-
tortion hyperbolic driver condition has not been used extensively in practice, but
we could expect that it would be just as robust. In fact, on purely theoretical
grounds the minimal distortion driver condition should probably be preferred as
it results in a condition that is 3-covariant. The Gamma driver condition, on
the other hand, is clearly not 3-covariant implying that if we change our spatial
coordinates (say from Cartesian to spherical), we will obtain a different shift.
Before moving to a more general class of hyperbolic shift conditions, there
is an important point related to the characteristic speeds associated with the
hyperbolic driver conditions that should be mentioned. A formal description of
how to obtain such characteristic speeds will have to wait until Chapter 5, but
here we can anticipate some of the results. Notice that the hyperbolic minimal
distortion and Gamma driver conditions are in fact identical up to the principal
part (i.e. all terms with second derivatives are the same in both cases), so that
their characteristic structure is the same. Consider first the asymptotically flat
region where the covariant derivatives can be substituted by partial derivatives. It
is easy to see that the characteristic speeds in that region along a fixed direction
xi are
λ_l = ±α √( 4 γ^{ii} ξ / 3 ) , λ_t = ±α √( γ^{ii} ξ ) , (4.3.30)
where here λl refers to the speed associated with the longitudinal shift compo-
nent and λt to the speed associated with the transverse shift components, so
that longitudinal and transverse components propagate at different speeds. In
numerical simulations it is usual to take ξ = 3/4 (or ξ = 3α^n/4, with n some
integer), so that the longitudinal components propagate at the speed of light
asymptotically.41
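A quick numerical check of the asymptotic speeds (4.3.30): the snippet below (an illustrative sketch with arbitrary values of α and γ^ii) confirms that with ξ = 3/4 in a flat asymptotic region the longitudinal shift modes propagate at the speed of light, while the transverse modes are slower.

```python
import numpy as np

def driver_shift_speeds(alpha, gamma_ii, xi):
    """Asymptotic characteristic speeds (4.3.30) of the hyperbolic shift
    drivers along a fixed coordinate direction x^i."""
    lam_long = alpha * np.sqrt(4.0 * gamma_ii * xi / 3.0)   # longitudinal modes
    lam_trans = alpha * np.sqrt(gamma_ii * xi)              # transverse modes
    return lam_long, lam_trans

alpha, gamma_ii = 1.0, 1.0          # asymptotically flat values
for xi in (3.0 / 4.0, 1.0):
    l, t = driver_shift_speeds(alpha, gamma_ii, xi)
    print(f"xi={xi:.2f}:  longitudinal ±{l:.3f},  transverse ±{t:.3f}")
```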
There is yet another important point to make, related to the characteristic
speeds. A more formal analysis using the techniques described in Chapter 5
shows that the characteristic cones associated with the shift modes for the driver
conditions (4.3.27) and (4.3.29) have rather strange properties when compared
with the characteristic cones associated with all other modes. For all other modes
40 When using the Gamma driver shift, the specific value of the damping parameter η has an
important effect in achieving long-lasting evolutions of black hole spacetimes. Notice, however,
that in contrast with ξ, the parameter η is not dimensionless, so that it must be scaled with the
total mass of the spacetime M . The typical value used in black hole simulations is η ∼ 2/M .
41 Recently it has been found that taking n = −2, which effectively makes the gauge speeds
independent of the lapse α, is a good choice in black hole simulations since it allows the shift
to evolve even in regions where the lapse has collapsed to zero.
it turns out that the width of the characteristic cones is independent of the shift,
and the shift’s only effect is to tilt the cones by −β i , which is precisely what we
would expect from the geometrical interpretation of the shift vector. In the case
of the driver conditions as given above we find, however, that the characteristic
cones associated with the shift modes are only tilted by −β i /2, and also that
their width depends on the magnitude of the shift. This unexpected behavior
can be easily cured by changing the minimal distortion driver to
∂0 β^i = B^i , (4.3.31)
∂0 B^i = α^2 ξ [ D^2 β^i + (1/3) D^i Dj β^j + R^i_j β^j − 2 Dj ( α A^{ij} ) ] , (4.3.32)
and the Gamma driver to
∂0 β^i = B^i , (4.3.33)
∂0 B^i = α^2 ξ ∂0 Γ̃^i − η B^i , (4.3.34)
where ∂0 := ∂t − β^j ∂j .
Apart from a trivial rewriting of the above conditions in first order in time form,
all we have actually done is added an advection term to the time derivatives to
fix the structure of the light cones. An advection term of this type was in fact
considered initially by the authors of [13] but later rejected (and never published)
as it brings the shift condition to a form very similar to the simple equation
∂t v − v ∂x v = 0, known as Burgers' equation, which is well known to result in
the formation of shocks. However, recent numerical simulations indicate that, at
least in the case of the Gamma driver condition, the advection terms do not give
rise to shocks (the shift condition is coupled to all other dynamical equations, so
that this superficial resemblance to Burgers' equation is misleading). Moreover,
in simulations of binary black hole systems with moving punctures [44] (see
Chapter 6), it has been found that the inclusion of advection terms is important
in order to avoid having a perturbation in the constraints remaining at the initial
location of the punctures.42
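To make the first order form (4.3.33)–(4.3.34) concrete, here is a minimal 1D toy sketch of a single update step for the Gamma driver shift. It assumes a periodic grid, a forward-Euler step used purely for illustration, and that the quantity ∂0 Γ̃ at the current step is supplied by the rest of the evolution system (here just a placeholder array); the one-sided (upwind) treatment of the advection terms follows the practice mentioned later in this section. None of the numerical choices are taken from an actual production code.

```python
import numpy as np

def upwind_derivative(f, beta, dx):
    """One-sided first derivative of f on a periodic grid, upwinded for the
    advection operator d_0 = d_t - beta d_x (information moves with speed -beta)."""
    dplus = (np.roll(f, -1) - f) / dx     # forward difference
    dminus = (f - np.roll(f, 1)) / dx     # backward difference
    return np.where(beta > 0, dplus, dminus)

def gamma_driver_step(beta, B, dtGamma, alpha, dt, dx, xi=0.75, eta=2.0):
    """One forward-Euler step of the first order Gamma driver
       d_0 beta = B ,   d_0 B = alpha^2 xi d_0 Gamma~ - eta B ,
    with d_0 = d_t - beta d_x (1D toy version)."""
    adv_beta = beta * upwind_derivative(beta, beta, dx)
    adv_B = beta * upwind_derivative(B, beta, dx)
    beta_new = beta + dt * (B + adv_beta)
    B_new = B + dt * (alpha**2 * xi * dtGamma - eta * B + adv_B)
    return beta_new, B_new

# Illustrative data on a small periodic grid.
N, dx, dt = 100, 0.1, 0.02
x = np.arange(N) * dx
beta = 0.1 * np.sin(2 * np.pi * x / (N * dx))
B = np.zeros(N)
alpha = np.ones(N)
dtGamma = 0.05 * np.cos(2 * np.pi * x / (N * dx))   # placeholder for d_0 Gamma~
beta, B = gamma_driver_step(beta, B, dtGamma, alpha, dt, dx)
print("max |beta| after one step:", np.abs(beta).max())
```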
The driver conditions described above have been proposed as a way of finding
a shift vector that is close to the solution of the corresponding elliptic equations
and at the same time is easy to solve for. As such these conditions, though cer-
tainly robust in practice, are rather ad hoc. There is, however, a more natural
approach to obtaining hyperbolic shift conditions similar to those used for the
lapse. In Section 4.2.4 we introduced the idea of generalizing the condition for
a harmonic time coordinate to obtain different slicing conditions. This led us
42 Recently there has been both a formal analysis, and a series of numerical tests involving
moving black holes, of variations in the way in which advection terms are added to the
Gamma driver shift condition [156, 293] and their impact on the hyperbolicity of the system
and the accuracy of the numerical simulations.
to the Bona–Masso family of slicing conditions. In the same spirit, we can now consider spatial coordinates that are harmonic, that is
□ x^i = 0 , (4.3.35)
which implies
Γ^i := g^{µν} Γ^i_{µν} = 0 , (4.3.36)
with Γ^α_{µν} the Christoffel symbols associated with the spacetime metric gµν . Using
the expressions found in Appendix B, we can easily show that in 3+1 language
the above condition reduces to
∂t β^i = β^a ∂a β^i − α ∂^i α + α^2 (3)Γ^i + (β^i/α) ( ∂t α − β^a ∂a α + α^2 K ) , (4.3.37)
where now (3)Γ^i is defined in terms of the three-dimensional Christoffel symbols
(3)Γ^i_{jk}. This condition has in fact been known for a long time (see e.g. [140, 305]),
though it is usually written down assuming that the lapse is also harmonic so that
the term in parenthesis vanishes.43 Having the evolution equation for the shift
depend on the time derivative of the lapse is clearly inconvenient if we want to
use harmonic spatial coordinates with a different slicing condition, say maximal
slicing. Remarkably, it turns out that if we rewrite the evolution equation for the
shift in terms of a rescaled shift vector of the form σ i = β i /α, then the spatial
harmonic condition decouples completely from the evolution of the lapse so we
can work with an arbitrary slicing condition. We find
∂t σ^i = α σ^a ∂a σ^i − ∂^i α + α ( σ^i K + (3)Γ^i ) . (4.3.38)
One can generalize this condition further by introducing an arbitrary function of the lapse h(α) > 0 [16]; the resulting condition (4.3.39) is
known as the generalized harmonic shift condition, and is closely related to shift
conditions recently proposed by Lindblom and Scheel [190], and by Bona and
Palenzuela [67]. It is not difficult to see that by choosing the free parameters
in these references appropriately, we can in fact recover the above condition,
but only provided we also take the lapse to evolve via the Bona–Masso slicing
condition (4.2.52) and take h(α) = f (α).
43 Equation (4.3.37) fixes a sign error in [305], and includes a term missing in [140].
There are several properties of the shift condition (4.3.39) that are important
to mention. First, in an analogous way to the Bona–Masso slicing condition, it
leads to a characteristic speed associated with the shift modes of the form
λ = α ( −σ^i ± √( h γ^{ii} ) ) = −β^i ± α √( h γ^{ii} ) , (4.3.40)
so that we clearly must ask for h > 0. In fact, the function h plays exactly the
same role for the shift as the function f did in the case of the Bona–Masso slicing
condition. Notice also that the generalized harmonic shift condition (4.3.39) has
again the Burger’s type structure ∂t v − v∂x v = 0, but a detailed analysis shows
that it does not lead to the formation of shocks (see [16] and also Chapter 5).
There is a final issue to discuss regarding the generalized harmonic shift
condition (4.3.39). Notice that, just as was the case with the Gamma driver,
this condition is not covariant with respect to changes in the spatial coordinates.
That is, starting from exactly the same 3-geometry but with different spatial
coordinates we will get a different evolution of the shift vector. In particular,
for curvilinear systems of coordinates we could find that even starting from a
flat slice of Minkowski spacetime we would still have non-trivial shift evolution
driven by the fact that the initial (3) Γi do not vanish. Worse still, in many cases
it can happen that the (3) Γi of flat space are not only non-zero but are also
singular, as is the case with spherical coordinates for which (3) Γr is of order 1/r.
In response to this it has been suggested in [16] that the generalized harmonic
shift condition should be interpreted as always being applied in a coordinate
system that is topologically Cartesian.
Of course, we would still like to be able to express the condition in a general
curvilinear coordinate system. Let us denote by {xā } the reference topologically
Cartesian coordinates, and by {xi } the general curvilinear coordinates. We find
that in the curvilinear coordinate system the shift condition should in fact be
replaced with
∂t σ^l = α σ^m Dm σ^l − D^l α + α h σ^l K + α ( h γ^{mn} − σ^m σ^n ) ∆^l_{mn} . (4.3.41)
In practice, we can use the fact that for flat space in Cartesian coordinates
the Christoffel symbols vanish, which implies
F^l_{mn} = (3)Γ^l_{mn} |_flat , (4.3.44)
so that
∆^l_{mn} = (3)Γ^l_{mn} − (3)Γ^l_{mn} |_flat . (4.3.45)
The covariant version of the generalized harmonic shift has been used success-
fully in evolutions of spherically symmetric systems, but so far has not been tried
in the full three-dimensional case where the hyperbolic Gamma driver condition
still dominates.
Consider now the problem of going to a corotating coordinate system. Let us relate the original coordinates {t, x, y, z} to new coordinates {t̄, x̄, ȳ, z̄} through a rigid rotation with constant angular velocity Ω around the z axis:
t = t̄ , (4.3.46)
z = z̄ , (4.3.47)
x = x̄ cos (Ωt) − ȳ sin (Ωt) , (4.3.48)
y = x̄ sin (Ωt) + ȳ cos (Ωt) . (4.3.49)
It is not difficult to see that in the new coordinates the lapse and shift become
ᾱ = α , (4.3.50)
β̄^i = β^i + ( Ω × ρ )^i , (4.3.51)
with ρ := (x, y, 0). So in order to go to a corotating frame all we need to do
is add a rigid rotation to the shift vector. Of course, we must know the correct
value of the orbital angular velocity Ω, but this can usually be obtained (at least
approximately) from the initial data.44
Notice that if we are using an elliptic shift condition like minimal distortion,
the rigid rotation term will only fix the boundary condition, and the value of
the shift in the interior will be given by the solution of the elliptic equation.
If, on the contrary, we use an evolution equation for the shift, then the rigid
rotation will fix the initial data for the shift everywhere. There is, however, an
important point that should be mentioned. Having an initial shift given by a pure
rigid rotation term assumes that we are working on a flat spacetime. In practice
we can expect this to work well in the asymptotically flat region far from the
sources, but not in the inner regions where the spacetime can be far from flat.
To see the type of problem we face let us consider for a moment maximal slicing
(K = 0) and conformally flat initial data (γij = ψ 4 δij ) for an orbiting binary,
and an initial shift given by a rigid rotation as above. Now, the ADM evolution
equation for the volume elements implies that the conformal factor ψ evolves as
∂t ψ = − (ψ/6) ( αK − ∂i β^i ) + β^i ∂i ψ . (4.3.52)
But in this case we have K = 0, and also for a rigid rotation it is easy to verify
that ∂i β i = 0. The last equation then reduces to
∂t ψ − β i ∂i ψ = 0 . (4.3.53)
This is nothing more than the advection equation. We then find that ψ will
keep its original profile and will just advect with a speed given by −β i (at
least initially). The curious thing about this result is that naively we could have
expected that some value of Ω would bring us to a corotating frame where ψ
essentially does not evolve, but we have instead found that given a rigid rotation
with any value of Ω, the conformal factor ψ will just start rotating the opposite
way, so we would have been better off having zero shift. Of course, this is only
initially, we would expect that if we have a good value of the angular velocity
everything would settle after some initial transient to the true corotating frame.
An extreme example of the situation just mentioned occurs for binary black
holes that use puncture type data. In that case the initial conformal factor be-
haves as 1/r near the puncture, so that as long as K and ∂i β i remain regular
we will find that near the puncture the advection term β i ∂i ψ will dominate so
that again ψ will just advect with speed −β i . In other words, if the shift is not
zero at the puncture then the puncture will move. Since the whole point of going
to a corotating frame is to have the black holes approximately stationary, we
find that the rigid rotation must be modified at the puncture position to ensure
44 The effective potential method gives an approximate value of the angular velocity as
Ω = ∂Eb /∂J, with Eb the effective potential energy and J the ADM angular momentum. The
quasi-equilibrium approach usually gives a much better value of Ω by asking for the Komar
and ADM masses to be equal (see Chapter 3).
that the shift becomes zero. The standard way to achieve this in practice is to
multiply the initial rigid rotation term with a negative power of ψ to guarantee
that the shift will vanish at the puncture [11, 80]:
β |_{t=0} = (1/ψ^3) Ω × ρ . (4.3.54)
The third power is chosen because this guarantees that the magnitude of the
shift will become zero at the puncture in the non-conformal physical metric, and
also guarantees that the puncture will not evolve [13].
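The following sketch simply evaluates the attenuated corotating shift (4.3.54) for illustrative Brill–Lindquist-type puncture data; the puncture masses, positions and angular velocity are made-up numbers, and the conformal factor used is the familiar 1 + Σ m/(2r) form assumed purely for the sake of the example.

```python
import numpy as np

def puncture_psi(x, y, z, punctures):
    """Brill-Lindquist-type conformal factor psi = 1 + sum_i m_i / (2 r_i)."""
    psi = np.ones_like(x)
    for (m, xc, yc, zc) in punctures:
        r = np.sqrt((x - xc)**2 + (y - yc)**2 + (z - zc)**2)
        psi += m / (2.0 * r)
    return psi

def corotating_shift(x, y, z, Omega, punctures):
    """Initial shift (4.3.54): a rigid rotation Omega x rho attenuated by
    1/psi^3 so that it vanishes at the punctures."""
    psi = puncture_psi(x, y, z, punctures)
    rigid = np.array([-Omega * y, Omega * x, np.zeros_like(z)])  # Omega z_hat x (x,y,0)
    return rigid / psi**3

# Two equal punctures on the x axis; all values are illustrative only.
punctures = [(0.5, -3.0, 0.0, 0.0), (0.5, 3.0, 0.0, 0.0)]
Omega = 0.02
pts = np.array([[-3.0 + 1e-6, 0.0, 0.0],    # essentially at a puncture
                [0.0, 10.0, 0.0]])          # far field
for x, y, z in pts:
    bx, by, bz = corotating_shift(x, y, z, Omega, punctures)
    print(f"point ({x:6.2f},{y:5.1f},{z:3.1f}):  |beta| = {np.hypot(bx, by):.3e}")
```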
There is another important issue to consider when dealing with corotating
coordinates. Notice that as we move away from the center of rotation, the mag-
nitude of the rigid rotation shift vector increases as ρΩ. For a given value of Ω
there will then be a distance such that the shift will become superluminal – this is
known as the light-cylinder. We might then worry that outside the light-cylinder
causality effects will cause problems with a numerical simulation. In fact, some
simple numerical methods do indeed become unstable when the light-cones tip
beyond the vertical, but this problem is in fact very easy to overcome. Methods
using first-order in time evolution equations can be made to remain stable well
beyond the light-cylinder by simply using one-sided differences for the advection
terms on the shift (i.e. terms of the form β i ∂i ). What we do have to worry about
numerically is the fact that, as the magnitude of the corotating shift vector in-
creases with distance, the restrictions on numerical stability associated with the
CFL condition become more severe and the time step must be taken smaller and
smaller.
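A rough estimate of how the corotating shift tightens the CFL restriction can be obtained by bounding the coordinate speed of the fastest modes by α + |β| ≈ α + ρΩ. The sketch below does just that for arbitrary illustrative values of the grid spacing, lapse, Courant factor and angular velocity; real codes would of course use the actual characteristic speeds.

```python
import numpy as np

def max_timestep(dx, alpha, rho, Omega, courant=0.5):
    """Crude CFL bound dt <= C dx / v_max with v_max ~ alpha + |beta| = alpha + rho*Omega."""
    v_max = alpha + rho * Omega
    return courant * dx / v_max

dx, alpha, Omega = 0.25, 1.0, 0.02
for rho in (0.0, 50.0, 200.0, 500.0):
    print(f"rho = {rho:6.1f}  (rho*Omega = {rho*Omega:5.2f})  "
          f"dt_max = {max_timestep(dx, alpha, rho, Omega):.4f}")
```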
There is, however, a serious issue related to causality for superluminal shift
vectors that has to do with the outer boundary conditions. If we are evolving on
a cubical numerical grid and the boundary is located beyond the light-cylinder,
then there will be regions of the boundary whose entire causal past is outside
the computational domain (see Figure 4.3). We then have absolutely no physical
information that can be used to set up boundary conditions at those points.
In real-life simulations, standard practice has been to just keep using the same
boundary conditions we would use for simulations with a small shift at the bound-
ary and “hope for the best”, taking the pragmatic position that as long as the
boundaries remain stable and are sufficiently far away we can live with unphys-
ical boundary conditions. But it is well understood that the boundary issue is a
serious problem for corotating grids. A much better solution would be to use a
cylindrical computational domain.
Fig. 4.3: The light-cylinder marks the boundary of the region where the shift becomes
superluminal (essentially ρΩ > 1); outside this region the light cones are tilted beyond
the vertical. The arrows at the boundaries indicate those points whose causal past is
completely outside the computational domain so that there is no physical information
available to set up boundary conditions.
5.1 Introduction
When discussing the ADM equations in Chapter 2 we encountered the fact that
the 3+1 evolution equations are not unique, as we can add arbitrary multiples
of the constraints to obtain new systems of evolution equations that will have
the same physical solutions but will differ in their unphysical (i.e. constraint
violating) solutions and, more importantly, in their mathematical properties.
The original ADM equations differ from the standard version used in numerical
relativity (due to York) by precisely the addition of a term proportional to the
Hamiltonian constraint to the evolution equation of Kij . Also, the BSSNOK
system uses both the Hamiltonian constraint to modify the evolution equation
for the trace of the extrinsic curvature K, and the momentum constraints to
modify the evolution equations for the conformal connection functions Γ̃i .
There are of course an infinite number of ways in which we can add multiples
of the constraints to the evolution equations. The key question is then: Which
of these possible systems of equations will be better behaved both mathemati-
cally and numerically? Historically, attempts to write down different physically-
equivalent forms of the evolution equations for general relativity have followed
two different routes: an empirical route that started in the late 1980s and looked
for systems of equations that were simply better-behaved numerically than ADM,
and a mathematically formal route that looked for well-posed, particularly hy-
perbolic, reformulations of the evolution equations and that can be traced back
to the work of Choquet-Bruhat in the 1950s on the existence and uniqueness of
solutions to the Einstein equations [84]. In recent years these two approaches have
finally merged, and the traditionally more pragmatic numerical community has
realized that well-posedness of the underlying system of evolution equations is
essential in order to obtain stable and robust numerical simulations. At the same
time, it has been empirically discovered that well-posedness is not enough, as
some well-posed formulations have been found to be far more robust in practice
than others.
From the early 1990s a large number of alternative formulations of the 3+1
evolution equations have been proposed. We have now reached a point where
more formulations exist than there are numerical groups capable of testing them.
Because of this, here we will limit ourselves to discussing a small number of for-
mulations, chosen both because they are a representative sample of the different
approaches used, and because they correspond to the formulations used by the
majority of numerical evolution codes that exist today. The discussion of well-
posedness and hyperbolicity presented here is of necessity brief and touches only
on the main ideas; a more formal discussion can be found in the book by Kreiss
and Lorenz [177] (see also the review paper of Reula [241]).
5.2 Well-posedness
Consider a system of partial differential equations of the form
∂t u = P (D)u , (5.2.1)
We say that such a system is well-posed if there is a norm || · || such that all its solutions satisfy an estimate of the form
|| u(·, t) || ≤ k e^{αt} || u(·, 0) || , (5.2.2)
with k and α constants that are independent of the initial data. That is, the
norm of the solution can be bounded by the same exponential for all initial data.
A classic example of an ill-posed system is the inverse heat equation
∂t u = −∂x^2 u . (5.2.3)
Assume now that as initial data we take a Fourier mode u(0, x) = eikx . In that
case the solution to the last equation can be easily found to be
u(x, t) = e^{k^2 t + ikx} . (5.2.4)
We then see that the solution grows exponentially with time, with an exponent
that depends on the frequency of the initial Fourier mode k. It is clear that by
increasing k we can increase the rate of growth arbitrarily, so the general solution
can not be bounded by an exponential that is independent of the initial data.
This also shows that given any arbitrary initial data, we can always add to it
a small perturbation of the form ε e^{ikx} , with ε ≪ 1 and k ≫ 1, such that after
a finite time the solution can be very different, so there is no continuity of the
solutions with respect to the initial data.
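The frequency-dependent blow-up is easy to see numerically. The sketch below simply evaluates the exact amplification of the Fourier-mode solution (5.2.4) at a fixed time for increasing k (the time and the set of frequencies are arbitrary choices), showing that the growth cannot be bounded independently of the initial data.

```python
import numpy as np

t = 0.1                    # fixed evolution time
for k in (1, 5, 10, 20):
    # Exact amplification of the mode u(0,x) = exp(i k x) under d_t u = -d_x^2 u:
    # u(x,t) = exp(k^2 t + i k x), so |u| grows by exp(k^2 t).
    growth = np.exp(k**2 * t)
    print(f"k = {k:2d}:  |u(t)| / |u(0)| = {growth:.3e}")
```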
A second example is the two-dimensional Laplace equation ∂t^2 φ + ∂x^2 φ = 0, where one of the
two dimensions is taken as representing “time”. Defining u1 := ∂t φ and u2 := ∂x φ, we can write this equation in first order form as
∂t u1 = −∂x u2 , (5.2.6)
∂t u2 = +∂x u1 , (5.2.7)
where the second equation simply states that partial derivatives of φ commute.
Again, consider a Fourier mode as initial data. The solution is now found to grow as u ∼ e^{kt + ikx}.
We again see that the solution grows exponentially with a rate that depends
on the frequency of the initial data k, so it can not be bounded in a way that
is independent of the initial data. This shows that the Laplace equation is ill-
posed when seen as a Cauchy problem, and incidentally explains why numerical
algorithms that attempt to solve the Laplace equation by giving data on one
boundary and then “evolving” to the opposite boundary are bound to fail (nu-
merical errors will explode exponentially as we march ahead).
The two examples above are rather artificial, as the inverse heat equation
is unphysical and the Laplace equation is not really an evolution equation. Our
third example of an ill-posed system is more closely related to the problem of
the 3+1 evolution equations. Consider the simple system
∂t u1 = ∂x u1 + ∂x u2 , (5.2.9)
∂t u2 = ∂x u2 . (5.2.10)
This system can be written in matrix form as
∂t u = M ∂x u , (5.2.11)
with u = (u1 , u2 ) and M the 2 × 2 Jordan block with unit eigenvalue (ones on the diagonal and in the upper right corner).
Again, consider the evolution of a single Fourier mode. The solution of the system
of equations can then be easily shown to be
u1 = ( ũ1 + ikt ũ2 ) e^{ik(x+t)} , u2 = ũ2 e^{ik(x+t)} , (5.2.12)
with ũ1 and ũ2 the initial amplitudes. We see that u1 grows linearly in time, at a rate proportional to the frequency k, so that once more the solutions can not be bounded in a way that is independent of the initial data and the system is ill-posed.
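Both features of this example, the missing eigenvector of the Jordan block and the linear-in-time growth with a rate proportional to k, are easy to check numerically. The sketch below does so for a few arbitrary values of k; the exact mode solution is written out by hand since the Jordan block is nilpotent-plus-identity.

```python
import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])           # Jordan block: real eigenvalue, one eigenvector

# NumPy returns a (numerically) rank-deficient eigenvector matrix for this block.
vals, vecs = np.linalg.eig(M)
print("eigenvalues:", vals)
print("rank of eigenvector matrix:", np.linalg.matrix_rank(vecs, tol=1e-8))

# Evolution of a Fourier mode u = u~(t) exp(ikx):  d_t u~ = i k M u~,
# whose exact solution is  u~(t) = e^{ikt} (I + i k t N) u~(0)  with N = M - I.
def mode_amplitude(k, t, u0=np.array([0.0, 1.0])):
    N = M - np.eye(2)
    return np.exp(1j * k * t) * ((np.eye(2) + 1j * k * t * N) @ u0)

t = 1.0
for k in (1, 10, 100):
    print(f"k = {k:3d}:  |u1(t)| = {abs(mode_amplitude(k, t)[0]):.1f}")
```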
5.3 The concept of hyperbolicity
Consider now a first order system of evolution equations of the form
∂t u + M^i ∂i u = s(u) , (5.3.1)
where M i are n × n matrices, with the index i running over the spatial dimen-
sions, and s(u) is a source vector that may depend on the u’s but not on their
derivatives. In fact, if the source term is linear in the u’s we can show that
the full system will be well-posed provided that the system without sources is
well-posed. We will therefore ignore the source term from now on. Also, we will
assume for the moment that the coefficients of the matrices M i are constant.
There are several different ways of introducing the concept of hyperbolic-
ity of a system of first order equations like (5.3.1).46 Intuitively, the concept
of hyperbolicity is associated with systems of evolution equations that behave
as generalizations of the simple wave equation. Such systems are, first of all,
well-posed, but they also should have the property of having a finite speed of
propagation of signals, or in other words, they should have a finite past domain
of dependence.
We will start by defining the notion of hyperbolicity based on the properties of
the matrices M i , also called the characteristic matrices. Consider an arbitrary
unit vector ni , and construct the matrix P (ni ) := M i ni , also known as the
46 One can in fact also define hyperbolicity for systems of second order equations (see for
example [154, 155, 212]), but here we will limit ourselves to first order systems as we can
always write the 3+1 evolution equations in this form.
principal symbol of the system of equations (one often finds that P is multiplied
with the imaginary unit i, but here we will assume that the coefficients of the
M i are real so we will not need to do this). We then say that the system (5.3.1)
is strongly hyperbolic if the principal symbol has real eigenvalues and a complete
set of eigenvectors for all ni . If, on the other hand, P has real eigenvalues for all
ni but does not have a complete set of eigenvectors then the system is said to
be only weakly hyperbolic (an example of a weakly hyperbolic system is precisely
the Jordan block considered in the previous Section). For a strongly hyperbolic
system we can always find a positive definite Hermitian (i.e. symmetric in the
purely real case) matrix H(ni ) such that
H P − P^T H^T = H P − P^T H = 0 , (5.3.2)
where the superindex T represents the transposed matrix. In other words, the
new matrix HP is also symmetric, and H is called the symmetrizer. The sym-
metrizer is in fact easy to find. By definition, if the system is strongly hyperbolic
the symbol P will have a complete set of eigenvectors ea such that (here the
index a runs over the dimensions of the space of solutions u)
P ea = λa ea . (5.3.3)
If we now construct the matrix R whose columns are the eigenvectors ea , we can define the symmetrizer as
H := ( R^{-1} )^† R^{-1} , (5.3.4)
which is clearly Hermitian and positive definite. To see that HP is indeed sym-
metric notice first that
R−1 P R = Λ , (5.3.5)
with Λ = diag(λa ) (this is just a similarity transformation of P into the basis of
its own eigenvectors). We can then easily see that
H P = ( R^{-1} )^† R^{-1} P = ( R^{-1} )^† Λ R^{-1} , (5.3.6)
which is manifestly Hermitian since Λ is real and diagonal. If the symmetrizer can moreover be chosen to be independent of the vector ni (as happens, in particular, when the matrices M^i are themselves symmetric), the system is called symmetric hyperbolic; this is the form in which hyperbolic systems most commonly arise. We can also define a strictly hyperbolic system as one for which the eigen-
values of the principal symbol P are not only real but are also distinct for all
ni . Of course, this immediately implies that the symbol can be diagonalized, so
strictly hyperbolic systems are automatically strongly hyperbolic. This last con-
cept, however, is of little use in physics where we often find that the eigenvalues
of P are degenerate, particularly in the case of many dimensions.
The importance of the symmetrizer H is related to the fact that we can use
it to construct an inner product and norm for the solutions of the differential
equation in the following way
⟨ u, v ⟩ := u^† H v , (5.3.7)
||u||^2 := ⟨ u, u ⟩ = u^† H u , (5.3.8)
where u^† is the adjoint of u, i.e. its complex-conjugate transpose (we will allow
complex solutions in order to use Fourier modes in the analysis). In geometric
terms the matrix H plays the role of the metric tensor in the space of solutions.
The norm defined above is usually called an energy norm since in some simple
cases it coincides with the physical energy.
We can now use the evolution equations to estimate the growth in the energy
norm. Consider a Fourier mode of the form
u(x, t) = ũ(t) e^{ik x·n} . (5.3.9)
We will then have
∂t ||u||^2 = ∂t ( u^† H u ) = ∂t ( u^† ) H u + u^† H ∂t ( u )
= ik ũ^† P^T H ũ − ik ũ^† H P ũ
= ik ũ^† ( P^T H − H P ) ũ = 0 , (5.3.10)
where on the second line we have used the evolution equation (assuming s = 0).
We then see that the energy norm remains constant in time. This shows that
strongly and symmetric hyperbolic systems are well-posed. We can in fact show
that hyperbolicity and the existence of a conserved energy norm are equivalent,
so instead of analyzing the principal symbol P we can look directly for the
existence of a conserved energy to show that a system is hyperbolic. Notice that
for symmetric hyperbolic systems the energy norm will be independent of the
vector ni , but for systems that are only strongly hyperbolic the norm will in
general depend on ni .
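These definitions translate directly into a small numerical test: build the principal symbol P(n) = n_i M^i, check that its eigenvalues are real and its eigenvectors complete, and construct the symmetrizer H = (R⁻¹)†R⁻¹. The sketch below applies such a generic checker to two illustrative symbols (the 1D wave-equation symbol and the Jordan block of the previous Section); it is only a toy diagnostic, not a fragment of any particular evolution code.

```python
import numpy as np

def check_strong_hyperbolicity(P, tol=1e-10):
    """Return (is_strongly_hyperbolic, H) for a principal symbol P:
    real eigenvalues plus a complete set of eigenvectors, with symmetrizer
    H = (R^{-1})^dagger R^{-1} so that H P is Hermitian."""
    vals, R = np.linalg.eig(P)
    if np.max(np.abs(vals.imag)) > tol:
        return False, None                      # complex speeds: not hyperbolic
    if np.linalg.matrix_rank(R, tol=1e-8) < P.shape[0]:
        return False, None                      # incomplete eigenvectors: only weakly hyperbolic
    Rinv = np.linalg.inv(R)
    H = Rinv.conj().T @ Rinv
    assert np.allclose(H @ P, (H @ P).conj().T)  # HP is Hermitian
    return True, H

# 1D wave equation in first order form, u = (Pi, Psi):  d_t u + P d_x u = 0
v = 1.0
P_wave = np.array([[0.0, -v],
                   [-v, 0.0]])
P_jordan = np.array([[1.0, 1.0],
                     [0.0, 1.0]])

for name, P in (("wave", P_wave), ("Jordan block", P_jordan)):
    ok, H = check_strong_hyperbolicity(P)
    print(f"{name:12s}: strongly hyperbolic = {ok}")
```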
Now, for a strongly hyperbolic system we have by definition a complete set
of eigenvectors and we can construct the matrix of eigenvectors R. We will use
this matrix to define the eigenfunctions wi (also called eigenfields) as
u=Rw ⇒ w = R−1 u . (5.3.11)
Notice that, just as was the case with the eigenvectors, the eigenfields are only
defined up to an arbitrary scale factor. Consider now the case of a single spatial
dimension x. By multiplying equation (5.3.1) with R−1 on the left we find that
∂t w + Λ ∂x w = 0 , (5.3.12)
so that the evolution equations for the eigenfields decouple. We then have a set of
independent advection equations, each with a speed of propagation given by the
corresponding eigenvalue λa . This is the mathematical expression of the notion
that associates a hyperbolic system with having independent “wave fronts” prop-
agating at (possibly different) finite speeds. Of course, in the multidimensional
case the full system will generally not decouple even for symmetric hyperbolic
systems, as the eigenfunctions will depend on the vector ni .
We can in fact use the eigenfunctions also to study the hyperbolicity of a sys-
tem; the idea here would be to construct a complete set of linearly independent
eigenfunctions wa that evolve via simple advection equations starting from the
original variables ua . If this is possible then the system will be strongly hyper-
bolic. For systems with a large number of variables this method is often simpler
than constructing the eigenvectors of the principal symbol directly, as finding
eigenfunctions can often be done by inspection (this is in fact the method we
will use in the following Sections to study the hyperbolicity of the different 3+1
evolution systems).
Up until now we have assumed that the characteristic matrices M i have
constant coefficients, and also that the source term s(u) vanishes. In the more
general case when s(u) ≠ 0 and M^i = M^i (t, x, u) we can still define hyperbol-
icity in the same way by linearizing around a background solution û(t, x) and
considering the local form of the matrices M i , and we can also show that strong
and symmetric hyperbolicity implies well-posedness. The main difference is that
now we can only show that solutions exist locally in time, as after a finite time
singularities in the solution may develop (e.g. shock waves in hydrodynamics,
or spacetime singularities in relativity). Also, the energy norm does not remain
constant in time but rather grows at a rate that can be bounded independently
of the initial data. A particularly important sub-case is that of quasi-linear sys-
tems of equations where we have two different sets of variables u and v such
that derivatives in both space and time of the u’s can always be expressed as
(possibly non-linear) combinations of v’s, and the v’s evolve through equations of
the form ∂t v + M i (u) ∂i v = s(u, v), with the matrices M i functions only of the
u’s. In such a case we can bring the u’s freely in and out of derivatives in the
evolution equations of the v without changing the principal part by replacing all
derivatives of u’s in terms of v’s, and all the theory presented here can be ap-
plied directly. As we will see later, the Einstein field equations have precisely this
property, with the u’s representing the metric coefficients (lapse, shift and spatial
metric) and the v’s representing both components of the extrinsic curvature and
spatial derivatives of the metric.
First order systems of equations of type (5.3.1) are often written instead as
∂t u + ∂i F^i (u) = s(u) , (5.3.13)
where F^i are vector valued functions of the u’s (and possibly the spacetime
coordinates), but not of their derivatives. The vectors F i are called the flux
vectors, and in terms of them the characteristic matrices are simply given by
M^i_{ab} := ∂F^i_a / ∂u_b , (5.3.14)
that is, the matrices M i are simply the Jacobian matrices associated with the
fluxes F i (notice that here i ∈ (1, 2, 3) runs over the spatial dimensions, while
(a, b) ∈ (1, ..., n) run over the dimensions of the space of solutions). A system of
the form (5.3.13) is called a balance law, since the change in u within a small
volume element is given by a balance between the fluxes entering (or leaving) the
volume element and the sources. In the special case where the sources vanish,
the system is called instead a conservation law. This is because in such a case
we can use the divergence theorem to show that if u has compact support then
the integral of u over a volume outside such support is independent of time, i.e.
it is conserved.
As a simple example, consider the wave equation ∂t^2 φ = v^2 ∇^2 φ written in first order form in terms of the variables Π := ∂t φ and Ψi := v ∂i φ. The eigenfields then turn out to be (we have rescaled them for convenience)
w1 = Π − Σi ni Ψi , (5.3.25)
w2 = Π + Σi ni Ψi , (5.3.26)
w3 = Σi pi Ψi , (5.3.27)
w4 = Σi qi Ψi , (5.3.28)
with pi = (−nx nz , −ny nz , nx^2 + ny^2 ) and qi = (−ny nx , nx^2 + nz^2 , −ny nz ). Notice
that both pi and qi are orthogonal to ni . So we see that the longitudinal modes
(parallel to ni ) propagate at speeds ±v, and the transverse modes (orthogonal
to ni ) do not propagate. This example clearly shows that for symmetric hyper-
bolic systems the eigenfields wa in general depend on the vector ni , even if the
symmetrizer does not.
Consider now the energy norm. Since the symmetrizer for the wave equation
is unity (the principal symbol is already symmetric), our energy norm is simply
||u||^2 = Π^2 + Σi (Ψi)^2 = (∂t φ)^2 + v^2 Σi (∂i φ)^2 , (5.3.29)
which in this case coincides with the physical energy density of the wave. Notice
that we should really understand this energy in an integrated sense:
E = ∫ ||u||^2 dV = ∫ [ Π^2 + Σi (Ψi)^2 ] dV . (5.3.30)
The divergence theorem now implies that for initial data with compact support
dE/dt = 0.
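As a check of the statement dE/dt = 0, the sketch below evolves the 1D first order wave system ∂t Π = v ∂x Ψ, ∂t Ψ = v ∂x Π (with the convention Ψ := v ∂x φ used above) on a periodic grid, using centered differences and an RK4 integrator chosen purely for illustration, and monitors the discrete analogue of (5.3.30); the grid size, Courant factor and initial Gaussian are arbitrary, and the energy is preserved to the level of the truncation error.

```python
import numpy as np

# 1D first order wave system:  d_t Pi = v d_x Psi ,  d_t Psi = v d_x Pi.
N, L, v = 200, 10.0, 1.0
dx = L / N
x = np.arange(N) * dx
dt = 0.5 * dx / v

def dx_centered(f):
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dx)

def rhs(state):
    Pi, Psi = state
    return np.array([v * dx_centered(Psi), v * dx_centered(Pi)])

def rk4_step(state):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def energy(state):
    Pi, Psi = state
    return np.sum(Pi**2 + Psi**2) * dx      # discrete version of (5.3.30)

# Initial data: a Gaussian pulse in Pi, with Psi = 0.
state = np.array([np.exp(-(x - L/2)**2), np.zeros(N)])
E0 = energy(state)
for _ in range(400):
    state = rk4_step(state)
print(f"relative energy change after 400 steps: {abs(energy(state)/E0 - 1):.2e}")
```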
As a final comment we should mention the fact that there is one more equation
that we have left out of the above analysis, namely ∂t φ = Π. But this equation
has no fluxes and evolves only through a source term. If we included it in the
analysis we would simply have to add one new row and column to the matrix
P with zeroes as entries. The result would be one more eigenfunction that does
not propagate – namely φ itself. Since this does not add any new information
we will generally leave out of the analysis the evolution equations of lower order
quantities such as φ.
5.4 Hyperbolicity of the ADM equations
Let us now go back to the ADM evolution equations for the spatial metric γij and the extrinsic curvature Kij ,
∂t γij = −2α Kij + £β γij , (5.4.1)
∂t Kij = −Di Dj α + α ( Rij + K Kij − 2 Kik K^k_j ) + £β Kij , (5.4.2)
where for simplicity we have assumed that we are in vacuum (the matter terms
can in any case be considered as sources). This system is first order in time but
second order in space, as the Ricci tensor Rij contains second derivatives of the
spatial metric γij , and also there are second derivatives of the lapse function α
in the evolution equation for Kij . In order to have a purely first order system
we introduce the quantities
ai := ∂i ln α , dijk := (1/2) ∂i γjk . (5.4.3)
We will also assume that the shift vector β i is a known (i.e. a priori given)
function of space and time. The lapse, on the other hand, will be considered a
dynamical quantity that evolves through an equation of the form
∂t α − β i ∂i α = −α2 Q , (5.4.4)
with the explicit form of the gauge source function Q to be fixed later.
Since the characteristic structure is only related to the principal part, from
now on we will ignore all source terms, that is all terms that do not contain
derivatives of ai , dijk , and Kij . In terms of the dijk , the Ricci tensor can be
written as
Rij ≃ −(1/2) γ^{lm} ∂l ∂m γij + γ_{k(i} ∂_{j)} Γ^k
≃ −∂m d^m_{ij} + 2 ∂_{(i} d^m_{m j)} − ∂_{(i} d_{j)m}^m , (5.4.5)
where here the symbol ≃ denotes “equal up to principal part”. There is in fact
an ordering ambiguity in the last expression, as the definition of dijk implies the
constraint
∂i djmn = ∂j dimn , (5.4.6)
so that we can write the Ricci tensor in different ways that are only equivalent if
the constraint above is satisfied. However, here we will ignore this issue and use
the expression for the Ricci tensor given above (known as the De Donder–Fock
decomposition). We are now in a position to write down the evolution equations
for ai , dijk and Kij :
∂0 ai ≃ −α ∂i Q , (5.4.7)
∂0 dijk ≃ −α ∂i Kjk , (5.4.8)
∂0 Kij ≃ −α ∂k Λ^k_{ij} , (5.4.9)
where we have defined
Λ^k_{ij} := d^k_{ij} + δ^k_{(i} ( a_{j)} + d_{j)m}^m − 2 d^m_{m j)} ) . (5.4.10)
Notice that here we have another ordering ambiguity with the second derivatives
of α, but this ambiguity can be resolved in a unique way by taking the explicitly
symmetric expression: ∂i ∂j α = [∂i (αaj ) + ∂j (αai )]/2.
As mentioned in the previous Section, for the characteristic analysis we will
ignore the evolution of the lower order quantities α and γij . We then have a
system of 27 equations to study corresponding to the three components of ai ,
the 18 independent components of dijk , and the six independent components of
Kij .
To proceed with the characteristic analysis we will choose a specific direction
x and ignore derivatives along the other directions; in effect we will only be
analyzing the matrix M x . The reason why we can do this instead of analyzing
the full symbol P = ni M i is that the tensor structure of the equations makes all
spatial directions have precisely the same structure, so it is enough to analyze
just one of them. The idea is then to find 27 independent eigenfunctions that
will allow us to recover the 27 original quantities, where by eigenfunctions here
we mean linear combinations of the original quantities u = (ai , dijk , Kij ) of the
form wa = Σb Cab ub , that up to principal part evolve as ∂t wa + λa ∂x wa ≃ 0,
with λa the corresponding eigenspeeds. Notice that even if the coefficients Cab
should not depend on the u’s, they can in fact be functions of the lapse α and
the spatial metric γij (in order to avoid confusion we will use the Latin indices
at the beginning of the alphabet {a, b, c, ...} to identify the different fields, and
those starting from {i, j, k, ...} to denote tensor indices).
Then taking into account only derivatives along the x direction we immedi-
ately see that there is a set of fields that propagate along the time lines with
velocity λ0 = −β x . These fields are
aq , dqij (q ≠ x) . (5.4.11)
We have therefore already found 14 out of a possible 27 characteristic fields.
Consider now the evolution equation for the Λ^x_{pq} , with p, q ≠ x. Ignoring
again derivatives along directions different from x we find
∂0 Λ^x_{pq} = ∂0 d^x_{pq} ≃ −α γ^{xx} ∂x Kpq . (5.4.12)
Comparing with (5.4.9) we see that we have found another set of characteristic
fields that propagate along the light-cones with speeds47
λ^{light}_± = −β^x ± α √( γ^{xx} ) . (5.4.13)
These eigenfields are
√( γ^{xx} ) Kpq ∓ Λ^x_{pq} . (5.4.14)
This gives us six new characteristic fields, so we now have a total of 20 out of
27.
From this point on things become somewhat more complicated. Notice that
we still need to find characteristic fields that will allow us to recover the seven
fields ax , dxxi , and Kxi . We start by writing down the evolution equation for
Λxxq :
∂0 Λ^x_{xq} = ∂0 [ d^x_{xq} + (1/2) ( aq + d_{qm}^m − 2 d^m_{mq} ) ]
≃ −α γ^{xx} ∂x Kxq + α ∂x Kq^x ≃ α γ^{xp} ∂x Kpq . (5.4.15)
47 Here we have used the observation that a pair of functions (u1 , u2 ) that evolve through
equations of the form
∂t u1 = a ∂x u2 , ∂t u2 = b ∂x u1 ,
have the structure of the simple one-dimensional wave equation. The speeds of propagation for
this system are simply ±√(ab), and the corresponding eigenfields are w± = u1 ∓ (a/b)^{1/2} u2 .
This equation presents us with a serious problem: It shows that while Kxq evolves
through derivatives of Λxxq , the evolution of Λxxq is independent of Kxq and only
involves derivatives of Kpq . This subsystem can therefore not be diagonalized.
Notice in particular that if the metric γij is diagonal then Λxxq will remain con-
stant up to principal part, which implies that Kxq will grow linearly unless Λxxq
vanishes initially (of course, the presence of source terms makes things more
complicated). This shows that the ADM system is not strongly hyperbolic; it
is only weakly hyperbolic as we can in fact show that the eigenvalues of the
principal symbol are all real.
We have then found that we can not recover the four quantities Kxq and Λxxq
(and hence dxxq ) from characteristic fields. What about the quantities ax , dxxx ,
and Kxx? Even though we already know that the full system is not strongly
hyperbolic, there is still something very important to learn from these last three
quantities. As it turns out, it is in fact easier to work with the traces K and
Λx := γ mn Λxmn . The evolution equation for K clearly has the form
∂0 K ≃ −α ∂x Λ^x . (5.4.16)
The evolution equation for Λx , on the other hand, becomes
∂0 Λ^x = ∂0 [ a^x + 2 ( d^{xm}_m − d_m^{mx} ) ]
≃ −α γ^{xx} ∂x Q − 2α ( γ^{xx} ∂x K − ∂x K^{xx} ) . (5.4.17)
Notice first that this is the only equation that involves the gauge source function
Q, so that the explicit form of Q only affects the evolution of K and Λx (and ax
of course). Before discussing the form of Q let us for a moment recall the explicit
form of the momentum constraints:
Dj ( K^{ij} − γ^{ij} K ) = 8π j^i , (5.4.18)
which up to principal part becomes
∂j K^{ij} − γ^{ij} ∂j K ≃ 0 . (5.4.19)
Considering now only derivatives along x we see that this implies, in particular,
∂x K^{xx} ≃ γ^{xx} ∂x K , (5.4.20)
∂x Kq^x ≃ 0 . (5.4.21)
This means that if the momentum constraints are satisfied identically the evo-
lution equations for Λxxq and Λx , equations (5.4.15) and (5.4.17), become
∂0 Λ^x_{xq} ≃ −α γ^{xx} ∂x Kxq , (5.4.22)
∂0 Λ^x ≃ −α γ^{xx} ∂x Q . (5.4.23)
This immediately solves our problem with the pair Kxq and Λxxq , as now they
behave exactly like Kpq and Λ^x_{pq} , so this subsystem would also be strongly hy-
perbolic with the eigenfunctions √( γ^{xx} ) Kxq ∓ Λ^x_{xq} propagating at the speed of
light.
On the other hand, for K and Λx we still need to say something about the
form of Q. The simplest choice is to say that, just like the shift, the lapse is also
an a priori known function of spacetime, in which case Q is just a source term
and can be ignored. We can now clearly see, however, that this is a very bad idea
as we would then have Λx constant up to principal part and K evolving through
derivatives of Λx , which is precisely the same type of problem we had identified
before. So having a prescribed lapse will result in a weakly hyperbolic system
even if the momentum constraints are identically satisfied. Notice, however, that
if we were to take instead the densitized lapse α̃ = α/√γ as a prescribed function
of spacetime, then we would have
( ∂t − £β ) α = ( α̃ / 2√γ ) ( ∂t − £β ) γ + √γ ( ∂t − £β ) α̃
= −α^2 K + √γ F (x, t) , (5.4.24)
with F (x, t) := (∂t − £β ) α̃. This essentially means that Q = K (the second term
can be ignored as it contributes only to a source term). The equations for K and
Λx would now become
∂0 K ≃ −α ∂x Λ^x , (5.4.25)
∂0 Λ^x ≃ −α γ^{xx} ∂x K , (5.4.26)
so that we have another pair of modes propagating at the speed of light, namely
√( γ^{xx} ) K ∓ Λ^x . More generally, we can take a slicing condition of the Bona–Masso
family (4.2.52), which implies Q = f (α)K. The gauge eigenfields would then be
√( f γ^{xx} ) K ∓ Λ^x , and they would propagate with the gauge speeds
λ^{gauge}_± = −β^x ± α √( f γ^{xx} ) . (5.4.27)
These two gauge eigenfields will allow us to recover two of the three quantities
ax , dxxx , and Kxx . The last quantity can be recovered by noticing that
∂0 ( ax − f d_{xm}^m ) ≃ 0 ,
so that we have another independent eigenfield propagating along the time lines.
We have then found that, provided that 1) the momentum constraints can
be guaranteed to be identically satisfied, and 2) either the densitized lapse α̃ is
assumed to be a known function of spacetime (but not the lapse itself), or we
use a slicing condition of the Bona–Masso family, then the ADM system would
be strongly hyperbolic.48
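For reference, the gauge speeds (5.4.27) are easy to tabulate for common members of the Bona–Masso family; the snippet below does this for harmonic slicing (f = 1) and for 1+log slicing (f = 2/α), with arbitrary illustrative values of α, β^x and γ^xx.

```python
import numpy as np

def gauge_speeds(alpha, beta_x, gamma_xx, f):
    """Gauge characteristic speeds (5.4.27): -beta^x +/- alpha sqrt(f gamma^xx)."""
    s = alpha * np.sqrt(f * gamma_xx)
    return -beta_x - s, -beta_x + s

beta_x, gamma_xx = 0.0, 1.0
for alpha in (1.0, 0.5, 0.1):
    for name, f in (("harmonic f=1", 1.0), ("1+log f=2/alpha", 2.0 / alpha)):
        lam_minus, lam_plus = gauge_speeds(alpha, beta_x, gamma_xx, f)
        print(f"alpha={alpha:4.2f}  {name:16s}  speeds = ({lam_minus:+.3f}, {lam_plus:+.3f})")
```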
Taking a slicing condition of the Bona–Masso type is simple enough, but
guaranteeing that the momentum constraints are identically satisfied is alto-
gether a different matter. First, numerically the constraints will inevitably be
48 In fact, we could also use maximal slicing since in that case K = 0 and the ai are obtained
through an elliptic equation. We would then have a coupled elliptic-hyperbolic system.
violated, and second, even at the continuum level strong hyperbolicity would
only be guaranteed for a very specific type of initial data, so the system would
not be well-posed as such. We are then forced to conclude that ADM is in fact
only weakly hyperbolic. Still, we have learned two important lessons here: The
lapse itself can not be assumed to be a known function of spacetime, and the
momentum constraints must play a crucial role.
As a final comment, notice that here we have only considered the standard
(York) version of the ADM equations. The original ADM evolution equations
add a multiple of the Hamiltonian constraint which makes the analysis somewhat
more difficult. Still, we will come back to the original ADM system in the next
Section.
5.5 The Bona–Masso and NOR formulations
The analysis of the previous Section suggests a way of fixing the problems of the ADM equations. Following the Bona–Masso formulation, let us introduce as a short-hand the quantities
Vi := d_{im}^m − d^m_{mi} . (5.5.1)
We then write the evolution equations for ai , dijk and Kij just as before as
49 One often finds some confusion between the Bona–Masso family of slicing conditions de-
scribed in Chapter 4 (and also used here), and the Bona–Masso formulation of the 3+1 evolution
equations. The Bona–Masso formulation uses their slicing condition, but the slicing condition
itself is in fact quite general and can be used with any form of the evolution equations.
∂0 ai ≃ −α ∂i Q , (5.5.3)
∂0 dijk ≃ −α ∂i Kjk , (5.5.4)
∂0 Kij ≃ −α ∂k Λ^k_{ij} , (5.5.5)
where now
Λ^k_{ij} := d^k_{ij} + δ^k_{(i} ( a_{j)} + 2 V_{j)} − d_{j)m}^m ) . (5.5.6)
Up to this point we have done nothing more than introduce a short-hand for
a certain combination of d’s, but now we will take a big leap and consider the
Vi to be independent dynamical quantities in their own right and take their
definition (5.5.1) as a constraint. If the Vi are independent quantities we will
need an evolution equation for them, which can be obtained directly from their
definition. In order to write these evolution equations we first notice that
Vi = (1/2) ( d_{im}^m − Γi ) , (5.5.7)
with Γi := γ lm Γilm the contracted Christoffel symbols. We can show, after some
algebra, that
γ^{lm} ∂l ∂m β^i = D^2 β^i − £β Γ^i − 2 Γ^i_{lm} D^l β^m + R^i_m β^m , (5.5.9)
∂t Γ^i = D^2 β^i + R^i_m β^m − Dl [ α ( 2K^{il} − γ^{il} K ) ] + 2 ( α K^{lm} − D^l β^m ) Γ^i_{lm} . (5.5.10)
Using now the evolution equation for Γi we find that the Vi evolve up to principal
part as
∂0 Vi ≃ α ( ∂^j Kij − ∂i K ) . (5.5.11)
This is a very beautiful result, as it shows that the principal part of the evolution
equation for Vi is precisely the same as the principal part of the momentum
constraints.
Assume now that we add 2αM^i to the evolution equation for Γ^i , where
M^i := Dj ( K^{ij} − γ^{ij} K ) − 8π j^i = 0 are the momentum constraints. We are of
course perfectly free to do this as the physical solutions will remain the same.
We would then have
∂t Γ^i = £β Γ^i + γ^{lm} ∂l ∂m β^i − α γ^{il} ∂l K
+ α al ( 2K^{il} − γ^{il} K ) + 2α K^{lm} Γ^i_{lm} − 16π j^i , (5.5.12)
with ai = ∂i ln α as before. The evolution equation for the Vi up to principal
part now becomes
∂0 Vi ≃ 0 , (5.5.13)
so that Vi evolves along the time lines.
We can now repeat the analysis we did before for ADM, but now for the 30
independent quantities u = (ai , dijk , Kij , Vi ). Considering as before only deriva-
tives along the x direction we immediately see that there are now 17 fields that
propagate along the time lines, namely
aq , dqij , Vi (q ≠ x) . (5.5.14)
For Λ^x_{pq} and Kpq with p, q ≠ x, we find the same thing as in the ADM case, as
independently of the introduction of the Vi we still find that Λ^x_{pq} = d^x_{pq} , so that
again we have the six eigenfields
√( γ^{xx} ) Kpq ∓ Λ^x_{pq} , (5.5.15)
propagating along the light-cones.
The difference with ADM is related to the remaining seven fields ax , dxxi ,
and Kxi . For the evolution of Λxxq we now find
∂0 Λ^x_{xq} = ∂0 [ d^x_{xq} + (1/2) ( aq + 2 Vq − d_{qm}^m ) ]
≃ −α γ^{xx} ∂x Kxq , (5.5.16)
so that the four eigenfields
√( γ^{xx} ) Kxq ∓ Λ^x_{xq} , (5.5.17)
also propagate along the light-cones. Moreover, for the evolution of the trace
Λx := γ mn Λxmn we now have
∂0 Λx = ∂0 ax + 2 ∂0 V x
−αγ xx ∂x Q . (5.5.18)
If we again take a slicing condition of the Bona–Masso type so that Q = f (α)K,
we find the two gauge eigenfields
√( f γ^{xx} ) K ∓ Λ^x , (5.5.19)
that propagate with the gauge speeds
λ^{gauge}_± = −β^x ± α √( f γ^{xx} ) . (5.5.20)
The last eigenfield is again ax − f d_{xm}^m , which also propagates along the
time lines. In summary, we have 18 eigenfields propagating along the time lines,
This is entirely equivalent since changing the Vi for the Γi only corresponds to
a simple change of basis in the characteristic matrices, and does not alter their
eigenvalues or eigenvectors.
where as before M i are the momentum constraints and ξ is also a constant pa-
rameter. Notice that the particular case η = 0, ξ = 2 reduces to the Bona–Masso
formulation.50 Also, the case η = ξ = 0 is essentially equivalent to the standard
ADM formulation (à la York), while ξ = 0 and η = −1/4 corresponds to the
original ADM formulation. The fact that the Γi are introduced as independent
quantities has no effect if we take ξ = 0, since in that case their evolution equa-
tions are not modified and the characteristic structure of the system is identical
to that of ADM with the addition of three extra eigenfields that propagate along
the time lines, namely Γi − 2dm mi + dim m .
50 In fact, the Bona–Masso formulation also considered non-zero values of η, with the special
cases η = 0 and η = −1/4 called the “Ricci” and “Einstein” systems respectively.
The hyperbolicity analysis for the NOR system can be done in a number
of different ways. For example, Nagy, Ortiz, and Reula use a method based on
pseudo-differential operators in order to analyze directly the second order system
without the need to introduce the first order quantities dijk [212]. For simplicity,
however, here we will keep with our first order approach, but will not go into a
detailed analysis since it proceeds in just the same way as before. If we again
use a slicing condition of the Bona–Masso type51 , and consider only derivatives
along the x direction, we find that there are 18 fields that propagate along the
time lines with speed −β x ; they are
aq , dqij , ax − f d_{xm}^m , Γi + (ξ − 2) d^m_{mi} + (1 − ξ) d_{im}^m , (5.5.25)
with q ≠ x. For the rest of the characteristic fields it turns out to be convenient
to define the projected metric onto the two-dimensional surface of constant x as
hij := γij − (sx)i (sx)j , with sx the unit normal vector to the surface:
(sx)i = δ^x_i / √( γ^{xx} ) , (sx)^i = γ^{xi} / √( γ^{xx} ) . (5.5.26)
We can now use this projected metric to define the “surface-trace” of Kij as
K̂ := h^{ij} Kij , (5.5.27)
with an analogous definition for Λ̂^x . Using this we find that the following char-
acteristic fields
√( γ^{xx} ) ( Kpq − (hpq/2) K̂ ) ∓ ( Λ^x_{pq} − (hpq/2) Λ̂^x ) , (p, q) ≠ x , (5.5.28)
propagate along the light-cones with speeds
λ^{light}_± = −β^x ± α √( γ^{xx} ) . (5.5.29)
Notice that these are only four independent eigenfields, since the combination
Kpq − hpq K̂/2 is both symmetric and surface-traceless. This result is to be ex-
pected, as these fields represent the transverse-traceless part associated with the
gravitational waves. The four fields correspond to having two independent po-
larizations that can travel in either the positive or negative x directions. The
traces K̂ and Λ̂x also form an eigenfield pair given by
{ γ^{xx} [ 1 + 2η (2 − ξ) ] }^{1/2} K̂ ∓ Λ̂^x , (5.5.30)
propagating with the speeds
λ^{trace}_± = −β^x ± α { γ^{xx} [ 1 + 2η (2 − ξ) ] }^{1/2} . (5.5.31)
Notice that for ξ = 2 these two fields also propagate at the speed of light.
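The dependence of the trace speed (5.5.31) on the parameters η and ξ can be scanned in a couple of lines; the sketch below (with arbitrary α, β^x and γ^xx) confirms that ξ = 2 gives the speed of light independently of η, and flags parameter choices for which 1 + 2η(2 − ξ) < 0, i.e. for which the speed would become imaginary and hyperbolicity is lost.

```python
import numpy as np

def nor_trace_speed(alpha, beta_x, gamma_xx, eta, xi):
    """Trace characteristic speeds (5.5.31) of the NOR system, or None if non-real."""
    arg = gamma_xx * (1.0 + 2.0 * eta * (2.0 - xi))
    if arg < 0.0:
        return None                       # non-real speed: not hyperbolic
    return -beta_x - alpha * np.sqrt(arg), -beta_x + alpha * np.sqrt(arg)

alpha, beta_x, gamma_xx = 1.0, 0.0, 1.0
for eta in (-1.0, -0.25, 0.0, 0.5):
    for xi in (0.0, 2.0):
        s = nor_trace_speed(alpha, beta_x, gamma_xx, eta, xi)
        label = "not hyperbolic" if s is None else f"({s[0]:+.3f}, {s[1]:+.3f})"
        print(f"eta={eta:+.2f} xi={xi:.1f}:  {label}")
```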
51 The original NOR system in fact does not use the Bona–Masso slicing condition, but rather
takes a general densitized lapse of the form α̃ = αγ^{−σ/2} , with σ = constant, as a fixed function
of spacetime. The results, however, are entirely equivalent to the ones discussed here if we take
σ = f.
5.6 Hyperbolicity of BSSNOK
Let us now consider the hyperbolicity of the BSSNOK formulation introduced in Section 2.8. As before, we define the first order quantities
ai := ∂i ln α , d̃ijk := (1/2) ∂i γ̃jk , Φi := ∂i φ , (5.6.1)
and assume the shift to be a given function of spacetime. For the hyperbolicity
analysis we now have to consider the 30 quantities u = (ai , Φi , d˜ijk , K, Ãij , Γ̃i ).
Notice that these are in fact only 30 quantities and not 34, as Ãij is traceless
by definition, and also γ̃ jk d˜ijk = 0 as a consequence of the fact that γ̃ij has
unit determinant. The evolution equations guarantee that if these constraints
are satisfied initially they will remain satisfied during evolution (numerically,
however, we find that this does not hold exactly and the trace of Ãij drifts, so
that the constraint trÃij = 0 must be actively enforced during a simulation).
52 At the time of writing this text, BSSNOK has become the dominant formulation used
in 3D numerical relativity, and is used in one way or another in practically all 3+1 based
codes. The only real contenders to BSSNOK are hyperbolic formulations that evolve the full
four-dimensional spacetime metric and are not directly based on the 3+1 formalism [191, 231].
Starting from the BSSNOK evolution equations of Section 2.8, we find that
up to principal part the evolution equations for our dynamical quantities become
∂0 ai ≃ −α ∂i Q , (5.6.2)
∂0 Φi ≃ −(1/6) α ∂i K , (5.6.3)
∂0 d̃ijk ≃ −α ∂i Ãjk , (5.6.4)
∂0 K ≃ −α e^{−4φ} γ̃^{mn} ∂m an , (5.6.5)
∂0 Ãij ≃ −α e^{−4φ} ∂k Λ̃^k_{ij} , (5.6.6)
∂0 Γ̃^i ≃ α ∂k [ (ξ − 2) Ã^{ik} − (2/3) ξ γ̃^{ik} K ] , (5.6.7)
where we have now defined
Λ̃^k_{ij} := [ d̃^k_{ij} + δ^k_{(i} ( a_{j)} − Γ̃_{j)} + 2 Φ_{j)} ) ]^{TF} , (5.6.8)
and where we have allowed for an arbitrary multiple of the momentum constraint
to be added to the evolution equation for Γ̃i (standard BSSNOK corresponds to
choosing ξ = 2). In the following we will again assume that we are using a slicing
condition of the Bona–Masso family so that Q = f K.
For the hyperbolicity analysis we again consider only derivatives along the
x direction. Again we immediately find that there are 18 fields that propagate
along the time lines with speed −β x :
aq , Φq , d̃qij , ax − 6f Φx , Γ̃^i + (ξ − 2) d̃_m^{mi} − 4ξ γ̃^{ik} Φk , (5.6.9)
with q ≠ x (again, these are only 18 fields since d̃qij is traceless). Now, the fact
that in the BSSNOK formulation the trace of the extrinsic curvature K has been
promoted to an independent variable and the Hamiltonian constraint has been
used to simplify its evolution equation, makes the gauge eigenfields very easy to
identify. The two gauge eigenfields are
e^{−2φ} √(f γ̃^xx) K ∓ a_x ,   (5.6.10)
which propagate with the gauge speeds
λ^gauge_± = −β^x ± α e^{−2φ} √(f γ̃^xx) .   (5.6.11)
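As a quick concrete check of this eigenstructure (a sketch, not part of the original text), we can build the 2×2 principal symbol of the gauge pair (K, a_x) implied by (5.6.2) and (5.6.5) for some assumed numerical values of α, f, φ and γ̃^xx, and verify that its eigenvalues reproduce the gauge speeds (5.6.11) up to the trivial −β^x shift coming from the operator ∂_0:

import numpy as np

# assumed sample values, chosen arbitrarily for the check
alpha, f, phi, gtxx = 1.3, 2.0, 0.1, 0.8

# principal symbol along x for u = (K, a_x):  d_0 u + M d_x u ~ 0
M = np.array([[0.0, alpha * np.exp(-4.0 * phi) * gtxx],
              [alpha * f, 0.0]])

speeds = np.sort(np.linalg.eigvals(M).real)
expected = alpha * np.exp(-2.0 * phi) * np.sqrt(f * gtxx)
print(speeds, (-expected, expected))   # the two pairs coincide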
There is another set of four eigenfields that again correspond to longitudinal
components and are given by
e^{2φ} √(γ̃^xx ξ/2) Ã_xq ∓ Λ̃^x_xq ,   (5.6.12)
with corresponding characteristic speeds
λ^long_± = −β^x ± α e^{−2φ} √(γ̃^xx ξ/2) .   (5.6.13)
These eigenfields in fact correspond to the fields (5.5.32) of the NOR formulation
discussed in the previous Section. Notice that for standard BSSNOK we have ξ = 2, so that these fields also propagate along the light-cones. The next four eigenfields are the transverse-traceless combinations (5.6.14), built from Ã_pq and Λ̃^x_pq in direct analogy with the NOR fields (5.5.28),
with p, q ≠ x, which propagate with speeds
λ^light_± = −β^x ± α e^{−2φ} √γ̃^xx ,   (5.6.15)
i.e. along the light-cones. There are in fact only four independent eigenfields
here, because we can easily check that
Ã_pq + ( γ̃_pq / 2γ̃^xx ) Ã_xx = e^{−4φ} ( K_pq − (h_pq/2) K̂ ) ,   (5.6.16)
with hij the metric tensor projected onto the surface of constant x introduced in
the previous Section and K̂ = hij Kij . These eigenfields are therefore symmetric
and surface-traceless, so that they have only four independent components.
The final two characteristic fields represent the trace of the transverse com-
ponents and are given by
e^{2φ} √( γ̃^xx (2ξ − 1)/3 ) ( Ã_xx − (2/3) γ̃_xx K ) ∓ ( Λ̃^x_xx − (2/3) γ̃_xx ã^x ) ,   (5.6.17)
In order to show that the system is not only strongly but also symmetric hyperbolic, we must be able to write down a conserved energy norm for this system that is independent of any partic-
ular direction. By “conserved” what we really mean here is that up to principal
part its time derivative can be written as a full divergence, so that its integral
in space vanishes for data of compact support.
The energy norm must be positive definite, and must include all the 30 inde-
pendent fields we are considering. In order to construct this energy norm we will
start by noticing that the terms corresponding to fields that propagate along the
time lines are trivial. For example, for the combination a_n − 6f Φ_n we find ∂_t ( a_n − 6f Φ_n )^2 ≃ 0 ,
where again we are only considering principal terms. Here and in what follows,
the square of an object with indices will be understood as the norm calculated
using the conformal metric, i.e. (Tn )2 ≡ γ̃ mn Tm Tn and so on. In a similar way
we also find that
∂_t ( Γ̃_n + (ξ − 2) d̃^m_mn − 4ξ Φ_n )^2 ≃ 0 .   (5.6.21)
So we already have two terms in the energy norm. The gauge sector is also easily
found, as we have
∂_t K^2 ≃ −2αK e^{−4φ} γ̃^mn ∂_m a_n ,   (5.6.22)
∂_t (a_n)^2 ≃ −2αf γ̃^mn a_n ∂_m K ,   (5.6.23)
so that
∂_t [ K^2 + ( e^{−4φ}/f ) (a_n)^2 ] ≃ −2 ∂_m ( α e^{−4φ} γ̃^mn a_n K ) ,   (5.6.24)
that is, we have a complete divergence.
For the energy terms involving Ãmn and d˜kmn we notice that
∂_t ( d̃_kmn )^2 ≃ −2α d̃^kmn ∂_k Ã_mn ,   (5.6.25)
∂_t ( Ã_mn )^2 ≃ −2α e^{−4φ} Ã^mn [ ∂_k d̃^k_mn + ∂_m ( a_n − Γ̃_n + 2Φ_n ) ] .   (5.6.26)
The time derivative of (Ãmn )2 involves terms other than the divergence of d˜kmn ,
so we need to study how these other terms evolve. Consider then
∂_t ( a_n − Γ̃_n + 2Φ_n )^2 ≃ 2 γ̃^mn ( a_n − Γ̃_n + 2Φ_n ) ∂_t ( a_m − Γ̃_m + 2Φ_m )   (5.6.27)
≃ −2α ( a_m − Γ̃_m + 2Φ_m ) [ (ξ − 2) ∂_l Ã^lm − ( (2ξ − 1)/3 − f ) ∂^m K ] .   (5.6.28)
The first term involves a divergence of Ãmn so that we can in principle use it
together with the time derivative of (Ãmn )2 to form a total divergence, but the
second term involving the gradient of K spoils this. In order to build our total
divergence we must then ask for
(2ξ − 1)/3 − f = 0   ⇒   ξ = (1 + 3f)/2 .   (5.6.29)
This means that if we want to have a symmetric hyperbolic system, the amount of
momentum constraint added to the evolution equation for Γ̃i must be determined
by the gauge source function f . Doing this we find that
∂_t [ ( Ã_mn )^2 + e^{−4φ} ( d̃_kmn )^2 + ( e^{−4φ}/(ξ − 2) ) ( a_n − Γ̃_n + 2Φ_n )^2 ]   (5.6.30)
≃ −2α e^{−4φ} ∂_k [ Ã^mn d̃^k_mn + Ã^kn ( a_n − Γ̃_n + 2Φ_n ) ] ,   (5.6.31)
It is in fact possible to find a different energy norm for BSSNOK (i.e. a different symmetrizer) for which all characteristic speeds can remain causal, and
that, in particular, allows us to have a symmetric hyperbolic system with harmonic slicing [154].
Such an energy norm, however, involves mixed terms of the form γ̃ nl d˜m mn (al − Γ̃l + 2Φl ), so
that showing that it is positive definite becomes considerably more difficult.
We will now proceed to modify the evolution equations for the extrinsic cur-
vature Kij and the metric derivatives dijk by adding multiples of the constraints
in the following way
∂t dijk = (· · · ) + αξγi(j Mk) + αχγjk Mi , (5.7.4)
∂t Kij = (· · · ) + αηγij H , (5.7.5)
with (ξ, χ, η) constant parameters, and where (· · · ) denotes the original ADM
values for the right hand side prior to adding multiples of the constraints, and
where, as before, H and M i are the Hamiltonian and momentum constraints,
which up to principal part become
H ≃ R ≃ −2∂_m V^m ,   (5.7.6)
M_i ≃ ∂_m K^m_i − ∂_i K .   (5.7.7)
For the analysis of this system we will again use a Bona–Masso type slicing
condition ∂t α = −α2 f K, and introduce the derivatives of the lapse ai := ∂i ln α.
There is, however, an important point related to the slicing condition that should
be mentioned. Just as is done in the NOR system, the standard KST formulation
uses a fixed densitized lapse of the form Q = α γ^{−σ/2} instead of the Bona–Masso
slicing condition we will use here. When discussing the NOR system in Section 5.5
we mentioned the fact that these two approaches were actually equivalent; how-
ever, in the case of KST this is not true anymore. The reason for this is that
by densitizing the lapse we are transforming derivatives of ai into derivatives of
dim m . Now, as long as the evolution equations for the d’s are not modified we
find that dim m evolves only through K, so using a Bona–Masso slicing condition
is completely equivalent to densitizing the lapse. But if the evolution equations
for the d’s are modified using the momentum constraints, as is done in KST, then
the evolution of dim m involves other components of Kij , and the two approaches
are not equivalent anymore. Still, the use of a Bona–Masso slicing condition
instead of a densitized lapse results in simpler equations so we will follow this
approach here.
Going back to our evolution equations we find that up to principal part they
take the form
∂_0 a_i ≃ −αf ∂_i K ,   (5.7.8)
∂_0 d_ijk ≃ −α ∂_i K_jk + αξ γ_i(j M_k) + αχ γ_jk M_i ,   (5.7.9)
∂_0 K_ij ≃ −α ∂_m Λ^m_ij ,   (5.7.10)
∂_0 V_i ≃ α (1 − ξ + 2χ) M_i ,   (5.7.12)
so that the combination of d’s represented by Vi evolves only through the mo-
mentum constraints. However, and in contrast with what happened in the Bona–
Masso formulation, it is now not a good idea to choose the parameters in such
a way that the Vi remain constant up to principal part, for a reason that will
become clear below.
The analysis now proceeds in much the same way as before. If we consider
only derivatives along the x direction we find that there are now 15 fields that
propagate along the time lines, namely (q ≠ x)
a_q ,   d_qij − ( 1/(1 − ξ + 2χ) ) [ ξ γ_q(i V_j) + χ γ_ij V_q ] ,   (5.7.13)
a_x − f [ d_xm^m − ( (ξ + 3χ)/(1 − ξ + 2χ) ) V_x ] .   (5.7.14)
From the above expressions it is clear that we must ask for
1 − ξ + 2χ ≠ 0 ,   (5.7.15)
as otherwise these fields will degenerate. In other words, we need a certain combination of d’s, namely the V_i , to evolve through the momentum constraints to be
able to cancel out terms in the evolution of dqij and dxm m that will otherwise
make it impossible to diagonalize the system. It is therefore not possible to ar-
range things in such a way that the Vi remain constant up to principal part and
still obtain a strongly hyperbolic system.
For the remaining 12 fields we start by noticing that the longitudinal com-
ponents again result in four eigenfields given by
[ γ^xx (ξ − χ/2) ]^{1/2} K_xq ∓ Λ^x_xq ,   (5.7.16)
which propagate with speeds −β^x ± α [ γ^xx (ξ − χ/2) ]^{1/2}. For the remaining transverse fields we again define the surface-trace K̂ := h^ij K_ij, with h_ij the projected metric onto the surface of constant x (with an analogous
definition for Λ̂x ). Doing this we find the four transverse-traceless eigenfields
√γ^xx ( K_pq − (h_pq/2) K̂ ) ∓ ( Λ^x_pq − (h_pq/2) Λ̂^x ) ,   p, q ≠ x ,   (5.7.19)
propagating along the light-cones with speeds
λ^light_± = −β^x ± α √γ^xx .   (5.7.20)
The full system then turns out to be strongly hyperbolic provided that
f > 0 ,   (5.7.25)
ξ − χ/2 > 0 , (5.7.26)
1 + 2χ + 4η (1 − ξ + 2χ) > 0 , (5.7.27)
1 − ξ + 2χ ≠ 0 .   (5.7.28)
A particularly interesting choice of parameters is
η = −1/3 ,   ξ = 1 + χ/2 ,   χ ≠ 0 ,   (5.7.29)
in which case F = 0 and all fields other than those associated with the gauge
propagate either along the time lines or the light-cones.
As we have seen, the KST formulation has the advantage of providing us
with a strongly hyperbolic system without the need to introduce extra quantities
like the V i or Γ̃i of the Bona–Masso and BSSNOK formulations. However, by
modifying the evolution equations for the metric derivatives dijk it forces us to go
to a fully first order form of the equations. This is in contrast with formulations
like Bona–Masso or BSSNOK for which we can keep working at second order
without the need to introduce all the dijk as independent variables, so in practical
applications KST has a larger number of independent variables.
All the hyperbolic formulations discussed so far are constructed by introducing either all the first derivatives of the spatial metric γij , or some special combina-
tions of them, as new independent quantities whose evolution equations are then
modified using the momentum constraints. There are, however, other routes to
constructing hyperbolic systems of evolution equations for relativity. Here we will
briefly introduce two such routes, one based on going to higher order derivatives
and another one that in fact starts by modifying the four-dimensional Einstein
equations themselves. Below we will present the basic ideas behind these ap-
proaches, but will not go into a detailed analysis.
It is important to mention that there are still other ways of obtaining
hyperbolic systems of evolution equations for general relativity. In particular, we
can mention the formulations of Friedrich that involve the Bianchi identities [131]
and make use of the conformal Weyl tensor and its decomposition into “electric”
and “magnetic” parts (see Chapter 8). These formulations are very elegant, but
so far have not been used in practical applications, so here we will not discuss
them further and will refer the reader to [131].
Consider first the route that goes to higher order derivatives. Up to principal part, the perturbation of the Ricci tensor of the spatial metric is given by
δR_ij ≃ ( γ^mn/2 ) [ −∂_m ∂_n δγ_ij − ∂_i ∂_j δγ_mn + 2 ∂_(i ∂_|m| δγ_j)n ] ,   (5.8.1)
which implies that
∂_0 R_ij ≃ α [ γ^mn ∂_m ∂_n K_ij + ∂_i ∂_j K − 2 ∂_(i ∂_m K_j)^m ] ,   (5.8.2)
If we now assume that the lapse evolves via a Bona–Masso type slicing condition
we can transform the last expression into
∂_0^2 K_ij ≃ α^2 f ∂_i ∂_j K + α^2 [ γ^mn ∂_m ∂_n K_ij + ∂_i ∂_j K − 2 ∂_(i ∂_m K_j)^m ] .
Assume now that we have harmonic slicing, i.e. f = 1 (or equivalently, take
the densitized lapse α̃ = α γ^{−1/2} as a known function of spacetime). The Kij
now satisfy, once the momentum constraint is used to eliminate the remaining terms, the second order equation
∂_0^2 K_ij ≃ α^2 γ^mn ∂_m ∂_n K_ij ,
which is nothing more than the wave equation with a wave speed given by the
physical speed of light. We now have a system of six simple wave equations for
the independent components of Kij coupled only through source terms, so the
system is clearly not only strongly hyperbolic, but even symmetric hyperbolic.
The price we have paid is that now we have moved to a system that essentially
involves third derivatives in time of the spatial metric. A system of this type
has been proposed by Choquet-Bruhat and York in [100], and also by Abrahams
et al. in [1]. We can in fact relax the gauge choice and take any f > 0 to
obtain a system that, though no longer symmetric, is still strongly hyperbolic.
These results shouldn’t be surprising since we can see them as essentially a time
derivative of the Bona–Masso or NOR systems. The interesting feature of these
third order formulations is the fact that by taking an extra time derivative there
is no need to introduce auxiliary quantities.
∇^ν ∇_ν Z_µ + R_µν Z^ν = 0 .   (5.8.10)
( ∂_t − £_β ) γ_ij = −2αK_ij ,   (5.8.11)
( ∂_t − £_β ) K_ij = −D_i D_j α + α [ R_ij + D_i Z_j + D_j Z_i + · · · ] ,
Γ^0 = −2Z^0 = −2Θ/α ,   (5.8.20)
which, using the expressions of Appendix B, becomes
( ∂_t − £_β ) α = −α^2 f(α) (K − 2Θ) .   (5.8.21)
The final result is that, when using a slicing condition of the form (5.8.17),
the Z4 formulation is strongly hyperbolic for f > 0, but we must take m = 2 in
the particular case when f = 1, so that m = 2 is the preferred choice.
this issue is less important, a property that makes such formulations particularly attractive.
In practice, most codes today attempt to solve the boundary problem by us-
ing some simple ad hoc boundary condition and pushing the boundaries as far
away as possible, ideally sufficiently far away so that they will remain causally
disconnected from the interior regions during the dynamical time-scale of the
problem under consideration. This approach, however, is extremely expensive in
terms of computational resources (though the use of mesh refinement can alle-
viate the problem). Also, if we are solving a coupled elliptic-hyperbolic problem
(for example when using elliptic gauge conditions), boundary effects will propa-
gate at an infinite speed and will affect the interior regardless of how far away
the boundaries are. Because of this, the correct treatment of the boundaries is
rapidly becoming a very important issue in numerical relativity.
A common approach is to impose a simple outgoing-wave (radiative) condition of the form
∂_t f + v ∂_r f + v (f − f_0)/r ∼ 0 ,   (5.9.1)
on each dynamical field f, with f_0 the asymptotic value of the field and v the corresponding wave speed.
Quite apart from the issue of the constraint violation introduced by the ra-
diative boundary condition, we might also worry about it being well-posed. In-
tuitively, it would seem that we are imposing too many conditions, since even
though we can model incoming fields in any way we like, outgoing fields must be
left alone. In order to see what is the effect of the radiative boundary conditions,
let us for a moment consider the case of simple wave equation in flat space
∂t2 ϕ − v 2 ∇2 ϕ = 0 , (5.9.2)
∂t ϕ + v ∂r ϕ + v (ϕ − ϕ0 )/r = 0 , (5.9.6)
pendix A). In practice, we find that “sufficiently far away” means that the boundaries should
be at least at a distance of ∼ 50M in order to have violations of the constraints at the bound-
aries of the order of a few percent. Recently, however, simulations using different forms of
mesh refinement have placed the boundaries at ∼ 500M or more in order to make sure that
the boundaries remain causally disconnected from the central regions during the dynamical
time-scale of the systems under study.
taken into account, but these are typically quadratic in quantities that become
small far away, so that ignoring them is a reasonable approximation. Second, in
the case of the Einstein equations not all fields propagate with the same speed,
so that we must be careful with the value of v chosen. However, as we have seen
in previous sections, the standard BSSNOK formulation has only eigenfields that
propagate along the time lines, at the speed of light (which far away is essentially 1), or at the gauge speed (which far away is √f ). We can therefore take all speeds
equal to 1 except for the fields associated with the gauge, namely the lapse
itself, the conformal factor φ, and the trace of the extrinsic curvature K. What
about the fields that propagate along the time lines? It turns out that due to
the presence of source terms, these fields also behave as pulses moving outward,
but since the sources have contributions from many different fields, choosing one
fixed value for the wave speed might introduce numerical reflections from the
boundaries.56
Let us now rewrite the wave equation in a more “ADM-like” form by defining
Π := ∂t ϕ. We then have
∂t ϕ = Π , (5.9.7)
∂_t Π = v^2 ( ∂_r^2 ϕ + (2/r) ∂_r ϕ ) .   (5.9.8)
Notice that if we have a solution of the form ϕ = u(r − vt)/r, then the behavior
of Π will be
Π = −v u′(r − vt)/r ,   (5.9.9)
which is precisely of the form Π = u_0(r − vt)/r, with u_0 = −v u′. This justifies
the fact that in relativity we can use a radiative boundary condition for the
extrinsic curvature as well as for the metric components. What about the spatial
derivative? Defining Ψ := v∂r ϕ, we now find that if ϕ = u(r − vt)/r then
Ψ = v [ u′(r − vt)/r − u(r − vt)/r^2 ] ,   (5.9.10)
which is not of the same form as ϕ. We therefore should not use the radiative
boundary condition for spatial derivatives.
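To make this concrete, here is a minimal sketch (the names, the one-sided difference and the explicit Euler step are all assumptions, not the scheme of any particular code) of how the radiative condition (5.9.1) is typically applied to a single field at the outermost point of a 1D radial grid:

def radiative_boundary_value(f, f0, r_out, dr, dt, v=1.0):
    """Return the updated field value at the outer boundary, obtained from
    d_t f = -v d_r f - v (f - f0)/r with a one-sided (upwind) derivative."""
    drf = (f[-1] - f[-2]) / dr                      # one-sided radial derivative
    dtf = -v * drf - v * (f[-1] - f0) / r_out       # radiative condition (5.9.1)
    return f[-1] + dt * dtf

In a real evolution this would be applied to every radiated variable with its own asymptotic value f_0 and propagation speed v (for example v ≈ √f for the gauge variables, as discussed above), and with the same time integrator used in the interior.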
Notice that in the case of the BSSNOK formulation, as long as we keep the
evolution equations second order in space, we will in fact be using the radiative
boundary condition for the six components of the extrinsic curvature, and the
three Γ̃i (plus the quantities {α, φ, γ̃ij }). Now, since we know that we have three
quantities associated with the Γ̃i that propagate along the time lines, namely
Γ̃i − 8 γ̃ ik ∂k φ (cf. equation (5.6.9)), giving boundary data for the Γ̃i is clearly
inconsistent, but probably not too bad since the fields are not outgoing (a much
56 A good idea might be to choose the gauge function f such that far away it goes to 1.
Harmonic slicing corresponds precisely to f = 1, but as we saw in Chapter 4 it does not have
good singularity avoiding properties. The standard 1+log slicing takes f = 2/α, so that far
away f ∼ 2, which will cause precisely the boundary reflections mentioned here.
where in the second line we used the fact that H is the same symmetrizer for all
ni , which in particular means that all three matrices HM i are symmetric. This
is precisely the place where the assumption that we have a symmetric hyperbolic
system becomes essential. Using now the divergence theorem we finally find
dE/dt = − ∫_{∂Ω} u† H M^i u n_i dA = − ∫_{∂Ω} u† H P(n̂) u dA ,   (5.9.14)
with n̂ the unit outward-pointing normal vector to the boundary. In contrast to what we have done before, we will now not assume that the
surface integral above vanishes. Instead, we now make use of equation (5.3.6) to write u† H P(n̂) u = w† Λ w, where w := R^{−1} u are the eigenfields. Let w_+, w_−, w_0 now denote the eigenfields corresponding to eigenvalues of P(n̂) that are positive, negative and zero respectively, i.e. eigenfields that propagate outward, inward, and tangential to the boundary respectively.57 We then find
dE/dt = − ∫_{∂Ω} w_+† Λ_+ w_+ dA − ∫_{∂Ω} w_−† Λ_− w_− dA
      = ∫_{∂Ω} w_−† |Λ_−| w_− dA − ∫_{∂Ω} w_+† |Λ_+| w_+ dA ,   (5.9.17)
If we now impose a boundary condition of the form
w_− |_∂Ω = S w_+ |_∂Ω ,   (5.9.18)
with S some matrix that relates incoming fields at the boundary to outgoing ones, we then have
dE/dt = ∫_{∂Ω} w_+† S^T |Λ_−| S w_+ dA − ∫_{∂Ω} w_+† |Λ_+| w_+ dA
      = ∫_{∂Ω} w_+† ( S^T |Λ_−| S − |Λ_+| ) w_+ dA .   (5.9.19)
From this we clearly see that if we take S to be “small enough” in the sense
that w_+† S^T |Λ_−| S w_+ ≤ w_+† |Λ_+| w_+ , then the energy norm will not increase
57 We should be careful with the interpretation of w_+ and w_− , because in many references we find their meaning reversed. This comes from the fact that we often find the evolution system written as ∂_t u = M^i ∂_i u instead of the form ∂_t u + M^i ∂_i u = 0 used here, which of course reverses the signs of all the matrices and in particular of the matrix of eigenvalues Λ.
with time and the full system including the boundaries will remain well-posed.
Boundary conditions of this form are known as maximally dissipative [185]. The
particular case S = 0 corresponds to saying that the incoming fields vanish, and
this results in a Sommerfeld-type boundary condition. This might seem the most
natural condition, but it is in fact not always a good idea, as we might find that
in order to reproduce the physics correctly (e.g. to satisfy the constraints) we
might need to have some non-zero incoming fields at the boundary.
We can in fact generalize the above boundary condition somewhat to allow
for free data to enter the domain. We can then take a boundary condition of the
form
w− |∂Ω = S w+ |∂Ω + g(t) , (5.9.20)
where g(t) is some function of time that represents incoming radiation at the
boundary, and where as before we ask for S to be small. In this case we are
allowing the energy norm to grow with time, but in a way that is bounded by
the integral of |g(t)| over the boundary, so the system remains well-posed. In the
same way we can also allow for the presence of source terms on the right hand
side of the evolution system (5.9.11).
As a simple example of the above results we will consider again the wave
equation in spherical symmetry
∂_t^2 ϕ − v^2 ( ∂_r^2 ϕ + (2/r) ∂_r ϕ ) = 0 .   (5.9.21)
Introducing the first order variables Π := ∂t ϕ and Ψ := v∂r ϕ, the wave equation
can be reduced to the system
∂t ϕ = Π , (5.9.22)
∂_t Π = v ∂_r Ψ + (2v/r) Ψ ,   (5.9.23)
∂_t Ψ = v ∂_r Π .   (5.9.24)
The system above is clearly symmetric hyperbolic, with eigenspeeds {0, ±v} and
corresponding eigenfields
w0 = ϕ , w± = Π ∓ Ψ . (5.9.25)
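A minimal sketch (array names and the scalar form of S are assumptions) of how a maximally dissipative condition of the form (5.9.20) can be imposed on this system at the outer boundary: the outgoing eigenfield w_+ = Π − Ψ is taken from the interior evolution, the incoming one w_− = Π + Ψ is prescribed as S w_+ + g(t), and Π, Ψ at the last grid point are then rebuilt from the two.

def apply_maximally_dissipative_bc(Pi, Psi, S, g):
    """Overwrite Pi and Psi at the outermost grid point using w- = S w+ + g."""
    w_plus = Pi[-1] - Psi[-1]      # outgoing eigenfield, left untouched
    w_minus = S * w_plus + g       # incoming eigenfield, prescribed
    Pi[-1] = 0.5 * (w_plus + w_minus)
    Psi[-1] = 0.5 * (w_minus - w_plus)

Taking S = 0 and g = 0 gives the Sommerfeld-type condition mentioned above, while a non-zero g injects radiation through the boundary.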
In what follows we will only describe the basic idea behind these approaches without considering any specific formulation
of the evolution equations. In order to simplify the discussion further, we will also
assume that the shift vector vanishes at the boundary (the results can be easily
generalized to the case when the shift vector is tangential to the boundary, but
the presence of a component of the shift normal to the boundary complicates
the analysis considerably). The discussion presented here follows closely that
of [154].
Consider an arbitrary strongly or symmetrically hyperbolic formulation of
the 3+1 Einstein evolution equations. The system therefore has the form
∂t u + M i ∂i u = s(u) , (5.9.28)
with M i the characteristic matrices, s(u) some source terms, and u the vector of
main evolution variables, essentially the extrinsic curvature Kij and the spatial
derivatives of the metric dijk = ∂i γjk /2, plus possibly some combinations of these
(such as the Γi of the NOR formulation, or the Γ̃i of the BSSNOK formulation).
Given such a formulation we will have a series of constraints that involve spatial
derivatives of the main evolution variables u. These constraints will include both
the Hamiltonian and the momentum constraints, plus constraints related to the
definition of some first order quantities of the form Cijk := dijk − ∂i γjk /2 = 0.
Let us collectively denote the constraints as ca , where the index a runs over all
the constraints. Using now the evolution equations plus the (twice contracted)
Bianchi identities, it is always possible to derive a system of evolution equa-
tions for the constraints. Since the constraints are compatible with the evolution
equations, this system will have the form
∂t c + N i ∂i c = q(c) , (5.9.29)
with N i some characteristic matrices and q(c) some source terms. This system
is clearly closed in the sense that if all constraints are initially zero they will
remain zero during the subsequent evolution.
Now, if the main evolution system (5.9.28) is strongly or symmetrically hy-
perbolic, the constraint evolution system will inherit this property. This comes
about because the constraints must be compatible with the evolution equations.
Let us denote by w the eigenfields of the main evolution system, and by W those
of the constraint system. Consider now a boundary surface with unit normal
vector ni . The compatibility of the constraints with the main evolution system
implies, in particular, that for every pair of constraint characteristic variables
W± that propagate with speeds ±λ along the normal direction, there will be
an associated pair of main characteristic variables w± that propagate with the
same speeds.58 Since the constraints are given in terms of spatial derivatives of
58 The converse of this statement is clearly not true as there are in general more main variables
than constraints. In particular, both gauge and physical characteristic modes (those associated
with the gravitational waves) can have no counterpart in the constraint system.
the main variables, it will always be possible to choose the normalization such
that these associated pairs of variables are related through
W_± = ∂_n w_± + . . . ,   (5.9.30)
where ∂_n denotes the derivative along the normal direction and the dots stand for tangential derivatives and lower order terms. A constraint-preserving boundary condition can now be obtained by imposing a maximally dissipative condition on the incoming constraint modes,
W_− − S W_+ = 0 ,   (5.9.31)
which up to principal part implies
∂_n w_− − S ∂_n w_+ ≃ 0 .   (5.9.32)
Since the fields w_± propagate along the normal direction with speeds of opposite sign, we can trade normal derivatives for time derivatives to find
∂_t w_− + S ∂_t w_+ ≃ 0 .   (5.9.33)
Defining now X := w_− + S w_+ (for constant S), this is equivalent to
∂_t X ≃ 0 ,   (5.9.34)
i.e. X is constant up to principal part. We can then go back and impose the
following boundary condition on the main system
w− + Sw+ = X . (5.9.35)
This gives us boundary conditions for those characteristic variables of the main
evolution system that are associated with constraint modes. For the remaining
characteristic variables (those associated with the gauge and physical degrees of
freedom) we are of course free to choose any maximally dissipative boundary
conditions we desire.
If the quantities X were truly constant on the boundary, then the above
boundary conditions would in fact be maximally dissipative conditions. How-
ever, in general X is only constant up to principal part, and its evolution is
coupled to the other variables through both source terms and tangential deriva-
tives, so that we do not have a true maximally dissipative boundary condition.
There are in fact some interesting exceptions to this where, for some specific
evolution systems, we can find special choices of the matrix S for which the
boundary system becomes closed in the sense that the evolution of the different
X’s is given only in terms of the X’s themselves (the boundary system decouples
from the “bulk” system). In such a case we can evolve the boundary system first,
and then treat the X’s as a priori given functions so that boundary conditions
for the w− become truly maximally dissipative. Such special cases can be shown
to yield boundary conditions of the Dirichlet or Neumann types, as they reduce
to imposing boundary data either on the components of the extrinsic curvature
or on the components of the metric derivatives (for details see [154] and refer-
ences therein). However, these special cases are very restrictive and will result
in reflections of the constraint violations at the boundaries. A much more inter-
esting case would be that of a Sommerfeld type boundary condition but this is
still a matter for further research.
The requirement of having maximally dissipative boundary conditions might
in fact be too strong, and there are some current efforts to find conditions for
the well-posedness of initial-boundary value problems with more general differ-
ential boundary conditions (see e.g. [251]). At the time of writing this book, the
problem of having well-posed and constraint-preserving boundary conditions for
metric formulations of the 3+1 evolution equations is still not entirely solved and
promises to be a very active area of research in the near future.
6
EVOLVING BLACK HOLE SPACETIMES
6.1 Introduction
The study of black hole spacetimes has been a central theme in numerical rela-
tivity since its origins. Black holes are important for a variety of reasons. First
they correspond to the final state of the gravitational collapse of compact objects
and are therefore expected to form even in situations that start from completely
regular initial data, such as supernova core collapse or the collision of neutron
stars. At the same time, spacetimes containing black holes can be seen as the
simplest representation of gravitating bodies in general relativity. Indeed, be-
ing vacuum solutions, black holes avoid the need to consider non-trivial matter
distributions with their corresponding complications, i.e. dealing with hydro-
dynamics, thermodynamics, micro-physics, electromagnetic fields, etc.59 Also,
we should remember the fact that general relativity is a self-consistent theory
where energy and momentum conservation, and as a consequence the dynamics
of matter, are intimately linked to the Einstein field equations. Because of this,
in relativity it is inconsistent to think of distributions of matter whose motion is
determined “from the outside”. For example, trying to solve for the gravitational
field associated with two solid spheres at rest is not entirely consistent in general
relativity since we would immediately need to ask which forces are responsible
both for the solid nature of the spheres and for keeping them at rest, and what is
the stress-energy tensor associated with those forces, with its corresponding effect
on the gravitational field itself (solutions of this type typically contain artifacts
such as singularity “struts” connecting the two bodies to cancel the gravitational
attraction that would otherwise move them toward each other). Considerations
of this type imply that black holes are in fact the cleanest way to approach the
two body problem in general relativity. Black holes, however, are clearly not the
simple point particles of Newtonian dynamics and they bring extra complica-
tions of their own associated with the presence of horizons and singularities, as
well as the possibility of non-trivial internal topologies.
Historically, the numerical study of binary black hole spacetimes started in
1964 with the work of Hahn and Lindquist [158] (though at that time the term
“black hole” had not yet been coined). The binary black hole problem has con-
tinued as a major area of research in numerical relativity for the past 40 years,
both because of purely theoretical considerations, and because of the fact that
59 We must remember that from the point of view of general relativity, anything other than
the gravitational field itself is seen as “matter”, including physical entities such as scalar fields
and electromagnetic fields.
binary black hole inspiral collisions are considered one of the most promising
sources of gravitational waves, and as such are potentially observable in the next
few years by the large interferometric gravitational wave observatories that are
only just coming on line. Indeed, the binary black hole problem has been con-
sidered for a long time as the “holy grail” of numerical relativity. The problem
has proved a difficult one to solve, hitting serious stumbling blocks and tak-
ing numerous detours over the years. In the past two years, however, enormous
progress has been made following two different routes, first by Pretorius using an
approach based on evolving the full 4-metric using generalized harmonic coordi-
nates [232, 231, 233], and more recently by the Brownsville and Goddard groups
(independently of each other) using a more traditional 3+1 approach based on
the BSSNOK formulation with advanced hyperbolic gauge conditions and the
crucial new ingredient of allowing for moving punctures [44, 45, 92, 93].
In the following Sections we will consider the different issues associated with
the numerical evolution of black hole spacetimes. These issues can roughly be
separated into three areas: 1) How to evolve black holes successfully and in
particular how to deal with the presence of singularities, 2) how to locate the
black hole horizons in a numerically generated spacetime, and 3) how to measure
physical quantities such as mass and angular momentum associated with a black
hole. There are a series of good references where we can look at all of these issues.
In particular, for locating event and apparent horizons there is an excellent review
by Thornburg [287].
dl^2 = ψ^4 ( dr^2 + r^2 dΩ^2 ) .   (6.2.3)
As already mentioned in Chapter 1, for this metric the horizon is located at
r_0 = M/2 and coincides with the throat of the wormhole (the Einstein–Rosen
bridge).60 This metric has an isometry with respect to the transformation r → r_0^2/r = M^2/(4r).
To evolve the initial data (6.2.3) we now place a boundary at the throat r =
r0 , evolve the exterior metric, and impose isometry boundary conditions at the
throat. In order to find the form that these isometry boundary conditions should
take, let us consider a map x̄ = J(x) from one side of the wormhole to the other
that has the following form in spherical coordinates
r̄ = r_0^2/r ,   θ̄ = θ ,   φ̄ = φ ,   (6.2.5)
The isometry now implies that any field should remain identical under the above
map. For example, for a scalar field Φ this implies
Φ(x) = ± Φ(J(x)) .   (6.2.6)
In the previous expression we allow the possibility of a minus sign since the
square of the map J is clearly the identity, plus the fact that considering the
antisymmetric case has some important applications (see below). For a vector v i
and a one-form wi , the isometry condition will take the form
v^i(x) = ± (Λ^{−1})^i_m v^m(J(x)) ,   w_i(x) = ± Λ^m_i w_m(J(x)) ,   (6.2.7)
where
Λ^i_j := ∂J^i / ∂x^j .   (6.2.8)
The transformation above looks very much like a standard coordinate transfor-
mation, the main difference being that here we assume that the components v i
and wi have the same functional form before and after the transformation.
Finally, for a tensor T_ij the isometry condition becomes
T_ij(x) = ± Λ^m_i Λ^n_j T_mn(J(x)) .   (6.2.9)
For the spatial metric γij the minus sign is clearly unphysical, but this is not
necessarily so for other tensors, such as for example the extrinsic curvature.
If we now differentiate the above expressions and evaluate them at the fixed
point of the isometry we will obtain boundary conditions for the different geo-
metric quantities at the throat. For a scalar field the antisymmetric case must
60 For a single Schwarzschild black hole Brill–Lindquist and Misner data are in fact identical.
clearly satisfy Φ|throat = 0, while the boundary condition for the symmetric case
takes the form
(∂_i Φ)|_throat = ( Λ^m_i ∂_m Φ )|_throat ,   (6.2.10)
which for the transformation (6.2.5) reduces to ∂r Φ|r0 = 0. For vectors and
one-forms the boundary conditions become
∂_i v^j |_{r_0} = ± [ v^m ∂_i (Λ^{−1})^j_m + (Λ^{−1})^j_m Λ^n_i ∂_n v^m ] |_{r_0} ,   (6.2.11)
∂_i w_j |_{r_0} = ± [ w_m ∂_i Λ^m_j + Λ^m_j Λ^n_i ∂_n w_m ] |_{r_0} ,   (6.2.12)
For the map (6.2.5) these conditions reduce in the symmetric case to
v^r |_{r_0} = 0 ,   w_r |_{r_0} = 0 ,   ∂_r T_rr |_{r_0} = −2 T_rr |_{r_0} / r_0 ,   (6.2.14)
and in the antisymmetric case to
∂_r v^r |_{r_0} = v^r |_{r_0} / r_0 ,   ∂_r w_r |_{r_0} = − w_r |_{r_0} / r_0 ,   T_rr |_{r_0} = 0 .   (6.2.15)
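In a finite difference code these conditions are most easily imposed through ghost points. The following sketch (grid layout and names are assumptions) uses a logarithmic radial coordinate η = ln(r/r_0), in which the isometry r̄ = r_0^2/r becomes simply η → −η, so that symmetric and antisymmetric fields are obtained by mirror reflection across the throat η = 0:

import numpy as np

def fill_throat_ghosts(field, nghost, parity=+1):
    """Fill the ghost points of a 1D array across the throat eta = 0.

    The grid is assumed staggered, eta_i = (i - nghost + 1/2) * d_eta, so the
    throat lies between field[nghost-1] and field[nghost].
    parity = +1 for symmetric fields, -1 for antisymmetric ones.
    """
    for i in range(nghost):
        field[nghost - 1 - i] = parity * field[nghost + i]
    return field

# example: a symmetric conformal-factor-like field and an antisymmetric lapse-like field
psi = fill_throat_ghosts(np.arange(10.0), nghost=2, parity=+1)
lapse = fill_throat_ghosts(np.arange(10.0), nghost=2, parity=-1)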
Notice that different fields can have different symmetry properties, for example
we can have a symmetric spatial metric and an antisymmetric extrinsic curvature.
Antisymmetric fields are interesting if we wish, for example, to use a lapse
function that vanishes at the throat. In that case we can take the lapse to be an
antisymmetric scalar function, and use an antisymmetric extrinsic curvature to
guarantee that the spatial metric itself remains symmetric. This is precisely what
happens if we choose the lapse associated with the full isotropic metric (6.2.1),
which takes the form
1 − M/2r
α= . (6.2.16)
1 + M/2r
In this case time will move in opposite directions on the different sides of the
wormhole and the throat will not evolve, which implies that the spatial slices
will not penetrate inside the horizon.
Notice also that the conformal factor ψ is not a scalar function but rather a
scalar density. Its boundary conditions are therefore inherited from those for the
spatial metric, and for the symmetric case take the form ∂_r ψ |_{r_0} + ψ |_{r_0} / (2 r_0) = 0.
The isometry boundary conditions described above can also be used in the
case of two or more black holes, the main difference being that we now have a
series of isometry maps, one for each throat. There is, however, one important
issue that must be addressed: The isometry boundary conditions become very
difficult to implement if we use a coordinate system that is not adapted to the
throats of the individual black holes. Ideally we should then use a coordinate sys-
tem such that the individual throats correspond to surfaces where some “radial”
coordinate remains constant, and with the lines associated with the “angular”
coordinates perpendicular to this surface.
In order to construct such throat adapted coordinates, it turns out to be convenient to think of a coordinate transformation starting from the standard cylindrical coordinates (z, ρ = √(x^2 + y^2)) as a map from the complex plane onto
itself (see [271]). In order to do this we define the complex variable ζ = z +iρ and
consider a complex function χ(ζ). The new coordinates (η, ξ) will correspond to
the real and imaginary parts of χ:
η = Re(χ) , ξ = Im(χ) . (6.2.21)
The simplest example corresponds to the function
χ = ln ζ . (6.2.22)
If we rewrite ζ as ζ = re^{iθ}, with r = √(ρ^2 + z^2) and θ = arctan(ρ/z), we immediately find that
χ = ln r + iθ , (6.2.23)
which implies
η = ln r , ξ=θ. (6.2.24)
We then see that this transformation corresponds to polar coordinates on the
(ρ, z) plane, with η a logarithmic radial coordinate and ξ just the standard
angular coordinate.
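A minimal sketch of this map in code (names are assumptions):

import numpy as np

def log_polar_coordinates(rho, z):
    """Map cylindrical (rho, z), with rho >= 0, to (eta, xi) via chi = ln(z + i rho)."""
    chi = np.log(z + 1j * rho)        # principal branch: ln r + i theta
    return chi.real, chi.imag         # eta = ln r, xi = theta in [0, pi]

eta, xi = log_polar_coordinates(1.0, 1.0)   # eta = ln(sqrt(2)), xi = pi/4

The Čadež coordinates discussed next follow the same idea, but with a map χ(ζ) that must be constructed numerically.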
The Čadež coordinates must therefore be constructed numerically for each value of the pa-
rameter µ. Figure 6.2 shows a representation of the Čadež coordinate grid in the
(ρ, z) plane for the case when µ = 2.5.
The advantage of using Čadež coordinates is that these coordinates approach
spherical coordinates close to the black hole throats and also far away from both
holes. The coordinate η goes to 0 at the center of both throats, while ξ goes from
0 to π/2 as we go half way around the throat close to a black hole, or a quarter
of the way around the origin in the faraway region. The coordinates are clearly
singular at the saddle point ρ = z = 0 between both holes, where the coordinate
line ξ = π/2 has a kink. Čadež coordinates have been used extensively for the
simulation of black hole collisions using Misner initial data [27, 28, 85, 123, 270,
271].
Notice that since both for bipolar and Čadež coordinates η behaves as a loga-
rithmic radial coordinate close to the throats, the isometry boundary conditions
simplify just as they did in the case of a single black hole.
There are yet other throat adapted coordinate systems that have been pro-
posed in the literature (see e.g. [24]), but we will not discuss them here. Finally,
we can mention another approach to adapt our coordinates to the black hole
throats: Instead of using a global coordinate system that is adapted to both
black hole throats, we can use multiple coordinate patches with ordinary spher-
ical coordinates close to each hole and a Cartesian patch everywhere else. This
approach has been recently advocated and is being actively pursued by a number
of groups (particularly in the context of black hole excision).
Fig. 6.2: Čadež coordinates. These coordinates are constructed numerically for each
value of the parameter µ associated with Misner initial data. Here we show the partic-
ular case µ = 2.5 (the data for this figure is courtesy of S. Brandt).
The idea of puncture evolutions, i.e. factoring out the singular part of the conformal factor analytically and evolving only the regular conformal metric in all of R^3, was first used by Anninos
et al. in 1994 [25] for a single Schwarzschild black hole. In that reference the punc-
ture idea was arrived at empirically by using a code that was originally designed
to follow the approach of Smarr et al. that consisted of imposing an isometry
condition at the throat and at the same time factoring out the conformal factor,
and then simply turning off the isometry boundary condition and evolving the
conformal metric everywhere. In 1997 Brügmann [82] proposed puncture evolu-
tions as a general method for the evolution of black hole spacetimes constructed
with a conformally flat metric and Bowen–York extrinsic curvature (i.e. puncture
initial data). The idea was later described in detail by Alcubierre et al. in [13],
both for the ADM and BSSNOK formulations of the evolution equations. Since
the BSSNOK formulation already separates out the volume elements from the
conformal metric, it is perhaps more natural to discuss puncture evolutions in
this context.
Starting from the BSSNOK rescaling of the spatial metric:
γ̃ij := e−4φ γij , (6.3.1)
we further split the conformal factor as
φ = φ̂ + ln ψBL , (6.3.2)
where ψBL is the singular Brill–Lindquist conformal factor. For pure Brill–
Lindquist type data initially we will have φ̂(t = 0) = 0, but for more general
puncture data the initial conformal factor has an extra regular piece so that
φ̂(t = 0)
= 0 (see Chapter 3). During the subsequent evolution, the regular
part of the conformal factor φ̂ is evolved, but the singular part is kept constant
in time. Terms in the evolution equations that make reference to the singular
piece of the conformal factor or its derivatives are calculated analytically. Since
the singular piece of the conformal factor is kept constant in time, it is perhaps
better to refer to this method as the static puncture evolution method.
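As an illustration of the static, analytically known piece, the following sketch (puncture parameters and array layout are assumptions) evaluates the Brill–Lindquist conformal factor and the gradient of its logarithm on a Cartesian grid; these are the kind of expressions that are computed analytically wherever the singular part of φ or its derivatives appear:

import numpy as np

def brill_lindquist(x, y, z, punctures):
    """Return psi_BL and the gradient of ln(psi_BL).

    punctures : list of (mass, (x0, y0, z0)) pairs
    x, y, z   : coordinate arrays of the same shape (e.g. from np.meshgrid),
                staggered so that no grid point sits exactly on a puncture
    """
    psi = np.ones_like(x)
    dpsi = [np.zeros_like(x), np.zeros_like(x), np.zeros_like(x)]
    for m, (x0, y0, z0) in punctures:
        dx, dy, dz = x - x0, y - y0, z - z0
        rr = np.sqrt(dx**2 + dy**2 + dz**2)
        psi += m / (2.0 * rr)
        for k, dk in enumerate((dx, dy, dz)):
            dpsi[k] += -m * dk / (2.0 * rr**3)
    grad_ln_psi = [d / psi for d in dpsi]     # d_k ln(psi) = (d_k psi) / psi
    return psi, grad_ln_psi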
We can argue that it is possible to obtain regular evolutions using this ap-
proach by examining the evolution equations for the geometric quantities and
the gauge conditions close to a puncture. For simplicity, let us assume that we
have a single puncture located at the origin with zero linear momentum and
spin, and consider the limit r → 0 (in the more general case the analysis is
more complicated but the results are similar). For the conformal factor we have
ψBL = O(1/r) initially, and the different BSSNOK quantities will have the fol-
lowing initial behavior and assumed subsequent evolution:
φ̂ = O(1) → O(1) , (6.3.3)
γ̃rr = O(1) → O(1) , (6.3.4)
K = 0 → O(r^2) ,   (6.3.5)
Ãrr = 0 → O(r2 ) , (6.3.6)
Γ̃r = 0 → O(r) . (6.3.7)
Assume furthermore that the initial lapse is α = O(1), and the shift is purely
radial at the punctures with β r = O(r). Notice also that first derivatives of O(1)
quantities are O(r) and second derivatives are again order O(1). The BSSNOK
evolution equations (2.8.9), (2.8.10), (2.8.11), (2.8.12), and (2.8.25) can then be
shown to imply the following behavior near the puncture
Substituting this into the above expressions we see that it is compatible with
our assumptions, so that to leading order we have
∂t φ̂ = O(1) , (6.3.14)
∂t γ̃rr = O(1) , (6.3.15)
∂t K = O(r2 ) , (6.3.16)
∂t Ãrr = O(r2 ) , (6.3.17)
∂t Γ̃r = O(r) . (6.3.18)
Notice, in particular, that K, Ãij and Γ̃i will remain zero at the puncture. In
contrast, φ̂ and the conformal metric γ̃ij will evolve at the puncture for non-zero
shift, but will remain at their initial values if the shift vanishes or if it behaves
as O(r3 ) instead of O(r).
In the previous analysis we have assumed that the initial lapse is α = O(1),
but we can start instead the evolution with a lapse of the form α = O(rn ) with
n an even positive number (a “pre-collapsed” lapse). In that case the above
counting arguments will change for vanishing shift, but they will remain the
same if β r = O(r) because the shift terms will dominate.
The static puncture evolution method allows us to deal with the infinities in
the spatial metric in a clean and efficient way. Nevertheless, it has a series of
disadvantages. From a purely numerical point of view, the infinity still exists in
the static conformal factor, so we must be careful to avoid placing a grid point
directly at the puncture. In practice, the numerical grid is set up in such a way
that the puncture lies between grid points, i.e. the puncture is staggered. Still,
as resolution is increased it is clear that we will have to deal with larger and
larger values of the conformal factor, so that convergence near the puncture will
not be achieved.61 Other more serious effects are related to the fact that since
the singular conformal factor is static, we are connected at all times with the
other asymptotic infinities.
Perhaps the worst disadvantage of this approach, however, is the fact that
having a static singular term implies that the punctures can not move by con-
struction, so that the black hole horizons must always surround the original
position of the puncture, even when the individual black holes have non-zero
linear momentum. This is really a serious drawback, as it forces us to use co-
ordinate systems that follow the motion of the individual black holes so that
they remain at an approximately fixed position in coordinate space. For binary
black holes in orbiting configurations, for example, this implies that we must
use a coordinate system that is corotating with the black holes. Still, the static
puncture approach has been used successfully for the simulation of the grazing
collision of two black holes [7, 82], and also of orbiting black holes with corotating
coordinates [12, 110].
The static puncture method has been recently superseded by an approach
that allows the punctures to move, resulting in much more accurate and sta-
ble simulations of black hole spacetimes. We will discuss this moving puncture
method in Section 6.6 below.
61 Empirical experience shows, however, that this lack of convergence does not leak out of
the black hole horizon. In fact, it usually does not extend more than a few grid points away
from the puncture itself.
Fig. 6.3: Evolution of the lapse function α for Schwarzschild using maximal slicing. The
value of α is shown every t = 1M . The collapse of the lapse is clearly visible.
Other slicing conditions are much better at avoiding the singularity by slow-
ing down the evolution in the regions approaching the singularity. The classical
example of a singularity avoiding slicing condition is maximal slicing which was
discussed in detail in Section 4.2.2, and corresponds to asking for the trace of
the extrinsic curvature to remain zero during the evolution, K = ∂t K = 0. This
requirement results in the following elliptic condition on the lapse function
D^2 α = α [ K_ij K^ij + 4π (ρ + S) ] .   (6.4.1)
For Schwarzschild we can in fact show analytically that the maximal slices ap-
proach a limiting surface given by rSchwar = 3M/2, with rSchwar the standard
Schwarzschild areal radius, so that the singularity is avoided. Because of its sin-
gularity avoiding properties, for many years maximal slicing was the standard
slicing condition for evolving black hole spacetimes.
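To give a feeling for what solving this elliptic condition involves, here is a minimal sketch that solves the simplest spherically symmetric, flat-background version of (6.4.1), α'' + (2/r) α' = α S(r), with a zero-gradient condition at the inner boundary and α = 1 at the outer one. The source S(r), standing in for K_ij K^ij + 4π(ρ + S), is an assumed input, and no claim is made that this is how production codes actually implement maximal slicing:

import numpy as np

def solve_maximal_lapse(r, source):
    """Solve alpha'' + (2/r) alpha' - source*alpha = 0 on the 1D grid r."""
    N, dr = len(r), r[1] - r[0]
    A, b = np.zeros((N, N)), np.zeros(N)
    for i in range(1, N - 1):                 # second order centered stencil
        A[i, i - 1] = 1.0 / dr**2 - 1.0 / (r[i] * dr)
        A[i, i]     = -2.0 / dr**2 - source[i]
        A[i, i + 1] = 1.0 / dr**2 + 1.0 / (r[i] * dr)
    A[0, 0], A[0, 1] = -1.0, 1.0              # d(alpha)/dr = 0 at the inner edge
    A[-1, -1], b[-1] = 1.0, 1.0               # alpha -> 1 at the outer edge
    return np.linalg.solve(A, b)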
As already mentioned in Chapter 4, when using maximal slicing with punc-
ture initial data, a specific boundary condition arises naturally at the punctures
that corresponds to the lapse having zero gradient at the puncture, the so-called
puncture lapse. Figure 6.3 shows an example of the evolution of the lapse func-
tion for a numerical simulation of the Schwarzschild spacetime starting from
isotropic (puncture) initial data and using maximal slicing. The collapse of the
lapse is evident.
Maximal slicing is not the only singularity avoiding slicing condition. Another
common choice for a singularity avoiding slicing condition is the 1+log slicing,
which is a member of the Bona–Masso family of slicing conditions (4.2.52) and
has the form
∂t α − β i ∂i α = −2α K . (6.4.2)
Fig. 6.4: Evolution of the lapse function α for Schwarzschild using 1+log slicing. The
value of α is shown every t = 1M . Notice how the lapse remains equal to one at the
puncture and large gradients develop.
The 1+log slicing condition has been found to be very robust in black hole
evolutions. It is also much easier to solve than maximal slicing, and because
of this it has supplanted maximal slicing in recent years. There is, however,
an important point to keep in mind when using 1+log slicing with puncture
data. Since 1+log slicing is a hyperbolic slicing condition, it has a finite speed
of propagation given by v = α √(2γ^rr) = α √(2γ̃^rr)/ψ^2 . At the puncture ψ diverges,
so the speed of propagation vanishes (this reflects the fact that the puncture is
an infinite proper distance away). As a result of this, for standard 1+log slicing
the lapse will collapse in the region around the throat of the wormhole, but will
remain equal to one both at infinity and at the puncture location. Having the
lapse remain equal to one at the puncture will cause extremely large gradients to
develop close to the puncture that can cause the numerical simulation to fail.62
This phenomenon can be clearly seen in Figure 6.4, which shows the evolution
of the lapse function for a numerical simulation of the Schwarzschild spacetime
starting from puncture initial data and using the 1+log slicing.
This problem of 1+log slicing with puncture data can be cured in a number
of different ways. First, we can simply ignore the region close to the puncture and
cut it from the simulation using so-called excision techniques (see Section 6.5).
Another possibility is to modify the 1+log condition so that the speed of prop-
agation of signals becomes infinite in physical space, while remaining finite in
coordinate space. This can be done by changing the 1+log condition to
62 Depending on the numerical method used, we can find instead that the numerical dissipa-
tion is so strong that these large gradients force the lapse to collapse at the puncture. But this
is a purely numerical artifact.
∂t α = β i ∂i α − 2αψ m K , (6.4.3)
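A minimal sketch of one explicit update of this condition on a 1D radial grid (the first order upwind advection, the Euler step and all array names are assumptions, not the scheme used in any particular code):

import numpy as np

def update_lapse_1plus_log(alpha, beta, K, psi, m, dr, dt):
    """One Euler step of d_t alpha = beta d_r alpha - 2 alpha psi^m K."""
    adv = np.zeros_like(alpha)
    # first order one-sided derivative, shifted in the direction of the shift
    adv[1:-1] = np.where(beta[1:-1] > 0.0,
                         (alpha[2:] - alpha[1:-1]) / dr,     # forward for beta > 0
                         (alpha[1:-1] - alpha[:-2]) / dr)    # backward for beta < 0
    return alpha + dt * (beta * adv - 2.0 * alpha * psi**m * K)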
Fig. 6.5: Evolution of the conformal radial metric γ̃rr = γrr /ψ 4 for Schwarzschild using
maximal slicing. The value of the metric is shown every t = 1M . The slice stretching
effect is evident.
When evolving a black hole spacetime using a singularity avoiding slicing and vanishing shift with an isometry boundary condition, there is a large growth of the radial metric
close to the horizon as coordinate observers get separated from each other. This
growth can be cured by pushing coordinate lines from the region around the
throat to the region around the horizon. But this clearly will result in a growth
of the radial metric close to the throat as this region has now been depleted of
coordinate lines. Notice, however, that since the region close to the throat is fast
approaching the limiting surface r = 3M/2, there is very little distortion of the
volume elements there. The depletion of coordinate lines in that region will only
result in an increase of the volume elements with essentially no distortion, to
which the minimal distortion condition is blind.
Because of the negative early results with the use of a shift vector, excision
was considered for a long time as the only viable way to avoid the effects of
grid stretching (though excision also requires a good shift choice in order to
be successful). Recently, however, it has been shown that modern hyperbolic
shift conditions such as the Gamma driver condition (4.3.34) can indeed cure
slice stretching when used in conjunction with the puncture evolution method.
In this type of simulation the shift is initially zero, but as the slice stretching
develops the shift reacts by pulling out points from the inner asymptotically flat
region near the punctures (so the isometry is clearly broken). The region around
the puncture is therefore providing the necessary coordinate lines to counter slice
stretching close to the horizon. The result is that with these new techniques the
slice stretching phenomenon can be controlled and the evolution of the radial
metric component can be virtually frozen [13].
The development of modern gauge conditions like the 1+log slicing and the
Gamma driver shift, coupled with the puncture evolution method, has finally
allowed the use of singularity avoiding slicings without the negative effects of
slice stretching, and this has resulted in the possibility of having long-lived and
accurate evolutions of black hole spacetimes.
the more modern name of “black hole excision” [26, 261]. This older name is now considered
inadequate as no boundary condition is placed at the apparent horizon directly.
need an outward pointing shift vector to counter this effect. In fact, the shift
vector must be such that the coordinate lines move out at the speed of light at
the horizon, and faster than light inside the horizon. This is, of course, no cause
for alarm as we are only talking about the “motion” of the coordinates and not
any physical effect.
As it turns out, some simple numerical methods become unstable when the
shift is superluminal, so in the early 1990s numerical techniques known as “causal
differencing” [261] and “causal reconnection” [20] were developed that tried to
tilt the computational molecules in order for them to follow the physical light-
cones. However, later on it was realized that when using hyperbolic formulations
of the evolution equations this was unnecessary, and quite standard numerical
techniques should be able to cope with superluminal shifts as long as the CFL
stability condition remains satisfied (see Chapter 9).64 Moreover, when using
hyperbolic formulations of the evolution equations we find that inside the black
hole all characteristics move inward, so there is no need for a boundary condition
on the excision surface as all the necessary causal information is in fact known.
In more geometric terms, the excision surface is spacelike, so it is not a boundary
as such, but rather a “time” at which we simply stop integrating – see Figure 6.6
(a possible exception to this would be the existence of gauge modes that move
faster than light and that could have incoming characteristics at the excision
surface, but we can always choose a gauge that does not behave in this way).
Though black hole excision has been successful in spherical symmetry [26,
196, 173, 151, 253, 254, 256, 261], it has been very difficult to implement with a
3+1 approach in three dimensions [25, 81, 104, 107, 286, 296] (black hole excision
using a characteristic formulation, on the other hand, has been very successful in
3D, allowing stable evolutions of perturbed black holes for essentially unlimited
times [147]). The main problem is related to the fact that the excision surface is
typically of spheroidal shape and this is hard to represent on a Cartesian grid. It
is therefore difficult to even define at a given point on the excision surface what
is the “outward” direction. This inevitably leads to the use of extrapolations
in several directions in order to be able to do the finite differencing, and this
extrapolation is typically very unstable. One important development in this area
was the introduction of so-called simple excision techniques [10]. The original
idea behind such techniques was to excise a simple surface in Cartesian coordi-
nates (e.g. a cube), and then to apply very simple boundary conditions on that
surface, in particular simply copying the values of all dynamical variables onto
the excision cube from the point directly adjacent to it. Such techniques have
later been generalized to the case of more complicated excision surfaces (so-
called “lego-spheres”), and have allowed the simulation of black hole collisions
for some limited times [12, 15]. Still, excision techniques in Cartesian grids have
64 A common practice is to use one-sided differences in the shift advection terms (terms involv-
ing β i ∂i ) to improve accuracy and stability, this being the only place where any consideration
about causality enters the numerical code (i.e. the direction of the tilt of the light-cones).
Fig. 6.6: Two different views of excision: a) Excision as seen in coordinate space. The
excision surface is inside the horizon, where the light-cones are tilted beyond the vertical
and all characteristics are outgoing so no boundary condition is needed. b) Excision
as seen locally by the normal observers. Here the light-cones are not distorted and
the horizon is a null surface moving at the speed of light. The excision surface is
spacelike, and again we can see that no boundary condition is needed as we simply
stop integrating when this surface is reached (do notice that this diagram is valid only
locally – in particular the horizon and excision surface do not really intersect since
spacetime is not flat).
had a very difficult time evolving binary black hole spacetimes for more than
about one orbit in the best of cases.
The problems faced by excision algorithms using Cartesian grids suggest an-
other approach. Ideally, the excision surface should correspond to a coordinate
surface where some “radial” coordinate remains fixed, so the use of spherical
coordinates would seem ideal. In such a case defining the outgoing direction be-
comes trivial, so that when used in combination with a hyperbolic formulation
of the evolution equations the excision algorithm should become easy to imple-
ment and can be guaranteed to remain stable. When dealing with more than one
black hole, however, the use of a Cartesian grid is desirable to describe the space
between the black holes. This has led to algorithms that use a series of overlap-
ping coordinate patches: spherical coordinates near each black hole, Cartesian in
between, and possibly spherical again far away. Several such multi-patch codes
are currently under active development [255, 257].
Before finishing this Section it is important to mention that the use of excision
algorithms has lately become less common as there has recently been a crucial
breakthrough in puncture evolution techniques that has allowed the long-term
simulation of binary black hole spacetimes without the need to excise the black
hole interior (see the following Section).
65 The first simulation of several orbits with the black holes moving through the grid was in fact done by Pretorius using an approach based on evolving the full 4-metric of the spacetime in generalized harmonic coordinates [231, 232, 233]. This method seems to be extremely robust and powerful and it is actively being used by several numerical relativity groups, but as it is not directly based on a 3+1 approach we will not discuss it here. Both the approach of Pretorius and the moving puncture method were developed within a year of each other in 2004–05, showing the tremendous amount of progress that the field of black hole simulations has seen within the last few years.
The evolution equation for χ follows from that for φ and has the form

∂_t χ − β^i ∂_i χ = (2/3) χ (αK − ∂_i β^i) .   (6.6.2)
Notice that while ψ has a 1/r pole at the puncture, χ is instead O(r^4) at the puncture. We then do not have to differentiate an infinite quantity at the puncture, but rather a C^4 quantity. Still, since there will be explicit divisions by χ in
the evolution equations, we must ensure that χ is never exactly zero (in practice
we can set the value of χ to some very small positive number whenever it either
gets too close to zero or becomes negative due to numerical error).
Of course, evolving the puncture directly does not immediately imply that it
will move across the grid. From the discussion in Section 6.3 we can see that for
vanishing shift the punctures will simply not evolve. We therefore need to have
a good choice of gauge conditions. Moving puncture approaches typically use a
1+log slicing condition
∂t α − β i ∂i α = −2αK , (6.6.3)
together with a shift vector of the Gamma driver family (4.3.34) of the form

∂_0 β^i = (3/4) B^i ,   ∂_0 B^i = ∂_0 Γ̃^i − η B^i .   (6.6.4)
Notice that no power of the lapse is present in the coefficient in front of ∂0 Γ̃i , as
we want the shift to evolve at the puncture where the lapse will collapse to zero.
The operator ∂0 is taken to be equal to ∂t in some approaches and to ∂t − β i ∂i in
others (and even different combinations are used in the different terms). Notice
that having a non-zero shift at the position of the puncture is precisely what
allows it to move. The position of the puncture x_p^i can be tracked by integrating the equation of motion66

dx_p^i/dt = −β^i(x_p^j) .   (6.6.5)
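As a concrete illustration of how the tracking equation (6.6.5) can be used in practice, the following is a minimal sketch (not taken from the text) of a single fourth order Runge–Kutta update of the puncture position. The routine interp_shift, which returns the shift vector interpolated from the evolution grid at an arbitrary point, is a hypothetical helper that a real code would have to provide.

import numpy as np

def track_puncture(x_p, interp_shift, dt):
    # One RK4 step of dx_p/dt = -beta(x_p), equation (6.6.5).
    # x_p: puncture position as a length-3 array.
    # interp_shift(x): hypothetical interpolator returning beta^i at the point x.
    k1 = -interp_shift(x_p)
    k2 = -interp_shift(x_p + 0.5 * dt * k1)
    k3 = -interp_shift(x_p + 0.5 * dt * k2)
    k4 = -interp_shift(x_p + dt * k3)
    return x_p + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

As mentioned in the footnote below, high accuracy is not really needed here since the puncture position behaves as an attractor, so even a simple Euler step would usually do.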
The shift is typically chosen to vanish initially, but the Gamma driver condi-
tion rapidly causes the shift to evolve in such a way that it not only counteracts
the longitudinal slice stretching effect, but for orbiting black holes it also auto-
matically acquires a tangential component that allows the punctures to orbit one
another. Also, because of the problem of the lack of collapse of the 1+log lapse at
the puncture discussed in the previous Section, the initial lapse is usually taken to be of a pre-collapsed type α(t = 0) = ψ^{−n}, with n equal to 2 or 4.
Both the φ-method and the χ-method have been shown empirically to lead
to robust and stable simulations of black hole spacetimes, allowing us to follow
binary black holes for several orbits through the merger and ring-down, and to
extract accurate gravitational waves, with evolution times that can last for many hundreds of M. A recent review of the moving puncture approach can be found in [83].67

66 In practice, this equation does not need to be integrated very accurately, since with the shift conditions typically used (i.e. the Gamma driver shift) the position of the puncture behaves as an attractor.

67 ... to evolve the black holes accurately. Because of this, modern algorithms use fourth order differencing in both space and time, plus some form of mesh refinement, to allow high resolution close to each of the black holes while keeping the outer boundaries as far away as possible.
resolution is increased, the effect of these errors leaking out of the black hole becomes smaller and smaller so that the simulation converges. At the time of
writing this book such an analysis has not yet been done, but in some simplified
systems it might not be so difficult to address.
There has been more progress on the second issue regarding the final sta-
tionary state reached near each puncture. Recent work by Hannam et al. [159]
has shown that, at least in the case of Schwarzschild, the 1+log slicing condition
together with a Gamma driver shift condition does indeed allow us to find an
explicitly time independent form of the Schwarzschild metric that penetrates the
horizon and corresponds to the final state of the “evolving puncture” simulations
to within numerical error. The surprising result has been that as the simulation
proceeds the numerical puncture ceases to correspond to another asymptotically
flat region and moves to a minimal surface inside the black hole. Also, the singu-
larity in the conformal factor changes character. Without going into the details
of their argument, Hannam et al. show that once the stationary state is reached,
the lapse and conformal factor behave near the puncture as

α ∼ r ,   ψ ∼ 1/√r ,   (6.6.6)

which is in contrast with the initial conformal factor of the isometric initial data ψ ∼ 1/r, and the initial pre-collapsed lapse typically chosen for these simulations α ∼ r^2. More interesting is the fact that since the area of spheres close to the puncture goes as r^2 ψ^4, we now find that at the puncture itself this area remains
finite, so that the puncture no longer corresponds to asymptotic infinity. For
standard 1+log slicing we can in fact show that the puncture corresponds to a
cylinder with areal radius R0 ∼ 1.31M < 2M , i.e. well inside the black hole
horizon (the puncture, however, is still an infinite proper distance away from the
horizon).
We can ask how it is possible that a slice that initially reaches the other
asymptotic infinity ends up instead reaching a cylinder inside the black hole.
What is happening is that, as the evolution proceeds, the slicing condition causes
the throat of the Einstein-Rosen bridge to collapse toward the limiting cylinder
with radius R0 , while at the same time the shift condition makes the throat
approach the puncture exponentially in coordinate space. After a relatively short
time, the throat is less than one grid point away from the puncture, so the whole
space between the throat and the inner asymptotic infinity is squeezed into a
region that can simply not be resolved numerically, and in practice the puncture
can now be assumed to correspond to the throat itself.
As a final comment, the fact that we start with two asymptotically flat regions
but end up with a single region that reaches an inner throat at the puncture
immediately raises the possibility of constructing initial data that already has
this final topology, and using it instead of the standard Brill–Lindquist type data.
This is an issue that will no doubt receive much attention in the near future.
The expansion of the null lines is essentially the change in the area elements of S along l. If we denote by h_{μν} the two-dimensional metric on S induced by the spacetime metric g_{μν} we will have

H = −(1/2) h^{μν} £_l h_{μν} = −(1/2) h^{μν} (£_s h_{μν} + £_n h_{μν}) .   (6.7.3)

68 The normalization of the null vector l is in fact arbitrary. Throughout most of the book we will usually take l = (n + s)/√2, but here we drop the factor 1/√2 to simplify the notation.
But £_s h_{μν} = −2X_{μν}, with X_{μν} the extrinsic curvature of S, so that
where K_{μν} is the extrinsic curvature of Σ. Using now the fact that n^α n_α = −1, s^α s_α = 1 and n^α s_α = 0 we find
so that

h^{μν} £_n h_{μν} = −2 h^{μν} K_{μν} = −2K + 2 K_{mn} s^m s^n ,   (6.7.7)

H = D_m s^m − K + K_{mn} s^m s^n
  = (γ^{mn} − s^m s^n) (D_m s_n − K_{mn}) .   (6.7.8)
Notice that the condition for having a minimal (or rather extremal) surface,
like the throat of a wormhole for example, is simply
D_m s^m = 0 ,   (6.7.10)
so that for time symmetric data with Kij = 0 the apparent horizon will coincide
(at least initially) with the minimal surface, i.e. with the throat of the Einstein–
Rosen bridge.
Fig. 6.7: a) A strahlkörper or ray-body: The rays from the “center” intersect the surface
only once. b) A closed surface with spherical topology that is nevertheless not a ray-
body: There are rays that intersect the surface more than once.
The parameterization above is not in fact completely general, and it does imply
a restriction in the allowed form of the apparent horizon. Namely, the horizon is
assumed to be a so-called strahlkörper or ray-body, which is defined as a surface
with spherical topology with the property that we can always find a “center”
inside it such that all rays leaving this center intersect the surface once and only
once (see Figure 6.7).69 This simplifying assumption is typical for horizon finding
algorithms, and implies that horizons with very complicated shapes will not be
found.
Substituting now the form of F given above into equation (6.7.13), and assum-
ing that we are working in spherical coordinates (r, θ, ϕ), we obtain a non-linear
second order differential equation for h of the form
d²h/dθ² = [γ^{rr} γ^{θθ} − (γ^{rθ})²]^{−1} (u² γ^{ij} − ∂^i F ∂^j F) (Γ^k_{ij} ∂_k F + u K_{ij}) ,   (6.7.21)
with Γ^k_{ij} the Christoffel symbols associated with the spatial metric γ_{ij}, and where

∂_i F = (1, −dh/dθ, 0) ,   ∂^i F = γ^{im} ∂_m F = γ^{ir} − γ^{iθ} dh/dθ ,   (6.7.22)

and

u = (∂_i F ∂^i F)^{1/2} = [γ^{rr} − 2γ^{rθ} (dh/dθ) + γ^{θθ} (dh/dθ)²]^{1/2} .   (6.7.23)
The crucial observation here is the fact that this is an ordinary differential
equation (ODE) for h, which can be solved by giving some appropriate boundary
conditions and using any standard ODE integrator (e.g. Runge–Kutta).
69 The original definition of a strahlkörper is due to Minkowski.
The boundary conditions can be derived from the fact that the apparent
horizon should be a smooth surface. If we integrate from θ = 0 to θ = π,
smoothness requires that h(θ) should have zero derivative at both boundaries,
that is
∂θ h = 0 , θ = 0, π . (6.7.24)
In some cases we have equatorial symmetry as well as axial symmetry, in which
case we can stop the integration at θ = π/2 and impose the boundary condition
∂θ h = 0 there.
The way to find a horizon in practice is to start at some arbitrary point
r = r0 on the symmetry axis (say, a few grid points away from the origin),
and integrate the second order differential equation (6.7.21) with the boundary
conditions h = r0 and ∂θ h = 0 at θ = 0. When we reach θ = π (or θ = π/2
for equatorial symmetry), we check whether the boundary condition ∂θ h = 0
is satisfied at that point to some numerical tolerance. If it is, we have found
an apparent horizon, but if it isn’t we move up along the axis and start again.
If no horizon has been found as we reach the end of the axis, then no horizon
exists in our spacetime. This method is usually called a shooting method in the ODE literature, because we start with some initial guess at one boundary and "shoot" to the other boundary, adjusting the initial guess (we "aim better") until we obtain the desired boundary condition on the other side (we "hit the target").
6.7.3.2 Flow algorithms. Flow algorithms work by evolving an initial trial sur-
face in some unphysical time λ in such a way that the trial surface approaches
the apparent horizon asymptotically. The basic idea behind a flow algorithm is the following: We start with a spherical trial surface that is considerably further out than the expected apparent horizon, and calculate the expansion H
over the whole surface which, being well outside the horizon, should be positive
everywhere. At each point, the trial surface is now moved inward a distance pro-
portional to the value of H at that point. If we have a point on the trial surface
with coordinates xi , it will then move according to the equation
dx^i/dλ = −H s^i ,   (6.7.27)
where si is the unit normal to the surface at that point. A flow algorithm of this
type for finding apparent horizons was first proposed by Tod in 1991 [291]. In
the case when the extrinsic curvature vanishes we have H = ∇i si , the apparent
horizon then reduces to a minimal surface and the flow method is guaranteed
to converge (the method is then known as mean curvature flow). In the more
general case no such proof exists, but in practice flow algorithms do seem to
converge to the apparent horizon.
If we parameterize our family of surfaces as F(x^i, λ) = 0, we then find

dF/dλ = ∂_λ F + ∇_i F (dx^i/dλ) = 0 ,   (6.7.28)

which using the flow equation can be reduced to

∂_λ F = H |∇F| ,   (6.7.29)

where we have used the fact that the normal vector is given in terms of F as s^i = ∇^i F / |∇F|. Taking now as usual F = r − h(θ, φ), the flow equation becomes

∂_λ h = −H |∇F| .   (6.7.30)

We can now use this equation directly to evolve the function h in the unphysical time λ until the right hand side vanishes and the evolution reaches a stationary state, at which point we have found the apparent horizon.
The direct flow algorithm just described has one serious disadvantage: When
we write the expansion H explicitly, it turns out to be a second order elliptic
operator on the function h(θ, φ). The flow equation then has the structure of
a non-linear heat equation, so that numerical stability demands a very small
time-step (the stability condition takes the form ∆λ ∼ (∆x)^2, with x any of the
angular coordinates). For a reasonably high angular resolution the algorithm can
then become too slow to use more than a few times during an evolution.
In order to improve on its speed, the basic flow algorithm can be generalized
in a number of different ways. For example, we can use implicit integrators in
the unphysical time λ (see for example [269]). Gundlach, however, has suggested
a very different way of improving flow algorithms by considering a generalized
flow equation of the form [152]

∂_λ h = −A (1 − B L²)^{−1} ρH ,   (6.7.31)
with ρ some scalar function constructed from {K_{ij}, g_{ij}, s^i}, and where L² denotes the angular part of the flat-space Laplace operator:

L² f := (1/sin θ) ∂_θ (sin θ ∂_θ f) + (1/sin²θ) ∂²_ϕ f .   (6.7.32)

The expression (1 − B L²)^{−1} denotes the inverse of the operator 1 − B L². We
can compute this inverse explicitly by expanding the function h in terms of
spherical harmonics as in (6.7.26). Gundlach’s flow equation then becomes
∂_λ a_{l,m} = − [A/(1 + B l(l + 1))] (ρH)_{l,m} .   (6.7.33)
This can now be solved by forward differencing to obtain the following iterative
procedure
a_{l,m}^{(n+1)} = a_{l,m}^{(n)} − [A/(1 + B l(l + 1))] (ρH)_{l,m}^{(n)} .   (6.7.34)
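A minimal sketch of this iteration is given below (it is not Gundlach's actual implementation). The coefficients a_{l,m} are stored in a dictionary keyed by (l, m), and rho_H_coeffs is a hypothetical helper that returns the spherical harmonic coefficients of ρH for the surface described by the current coefficients; the values of A and B used here are purely illustrative.

import numpy as np

def fast_flow_step(a, rho_H_coeffs, A, B):
    # One iteration of (6.7.34) on the spherical harmonic coefficients of h.
    src = rho_H_coeffs(a)
    return {(l, m): a[(l, m)] - A / (1.0 + B * l * (l + 1)) * src[(l, m)]
            for (l, m) in a}

def fast_flow(a0, rho_H_coeffs, A=1.0, B=0.5, tol=1.0e-10, max_iter=1000):
    # Iterate until the coefficients stop changing, i.e. the surface has
    # stopped flowing and the expansion has been driven to zero.
    a = dict(a0)
    for _ in range(max_iter):
        a_new = fast_flow_step(a, rho_H_coeffs, A, B)
        if max(abs(a_new[k] - a[k]) for k in a) < tol:
            return a_new
        a = a_new
    return a

The division by 1 + B l(l + 1) is what makes the method fast: the high multipoles that force the tiny time-steps of the direct flow are strongly damped, so much larger effective steps can be taken.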
6.7.3.3 Direct elliptic solvers. This family of algorithms takes the standard
parameterization of the level surface F(r, θ, φ) = r − h(θ, φ), and interprets the horizon equation (6.7.13) as a non-linear second order elliptic differential
equation for the function h in the space of the two angular coordinates. The
idea is then to solve this equation using standard techniques for solving non-
linear elliptic problems. Typically, we can calculate all derivatives using finite
differences and later use Newton’s method to solve the resulting system of non-
linear algebraic equations.
Many different versions of these direct elliptic solvers have been used to locate
apparent horizons in three dimensions, and we will not go into the details here (but see Thornburg's review for a more detailed description [287]). These
types of finder have the advantage that they can be very fast indeed (for example,
Thornburg reports that his direct elliptic solver finder can be over 30 times faster
than a horizon finder based on Gundlach’s fast-flow algorithm on the same data
sets). The main disadvantage is that they need a good initial guess to converge
rapidly, or even to converge at all (Newton’s method can in fact diverge if the
initial guess is too far off). Still, in many cases we usually have a reasonably good
idea of the region where an apparent horizon is expected, so this is not such a
serious drawback (in particular, during an evolution we can always use the last
known location of the apparent horizon as an initial guess).
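As an illustration of the general idea (and only as a sketch, using a Jacobian-free Newton–Krylov solver from SciPy rather than the hand-coded Newton iteration of actual finders), one can pose the problem as a root find for the grid function h(θ, φ). Here expansion(h) is a hypothetical routine that evaluates the expansion H on the angular grid for the surface r = h(θ, φ) from the interpolated metric data.

import numpy as np
from scipy.optimize import newton_krylov

def find_horizon_elliptic(expansion, h_guess):
    # Solve H[h] = 0 for the discretized function h(theta, phi).
    # h_guess: initial guess, e.g. the last known horizon or a coordinate sphere.
    return newton_krylov(expansion, h_guess, f_tol=1.0e-10)

# Example usage: start from a coordinate sphere of radius 2 on a 32x64 angular grid.
# h0 = 2.0 * np.ones((32, 64))
# h = find_horizon_elliptic(expansion, h0)

As the text notes, the quality of the initial guess is what decides whether such a solver converges at all.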
Just as we did when looking for apparent horizons, let us assume that we
want to parameterize our null surface as a level set of some function F (xµ ) of
the form F = 0, the main difference being that F (xµ ) is now a function of all
four spacetime dimensions so that its level sets are in fact three-dimensional.
Since we are looking for a horizon, we will also assume that F is constructed in
such a way that the intersections of the different level sets of F with the spatial
hypersurfaces are always closed two-dimensional surfaces. The condition for the
surface to be null will just be
g µν ∂µ F ∂ν F = 0 , (6.8.1)
where gµν is the full spacetime metric. Since we want to integrate F in time, let
us explicitly separate its time and space derivatives:
g^{tt} (∂_t F)² + 2 g^{it} ∂_t F ∂_i F + g^{ij} ∂_i F ∂_j F = 0 .   (6.8.2)
The positive sign in the above equation corresponds to outgoing light rays and
the negative sign to ingoing ones. Since the event horizon is associated with
outgoing light rays, from now on we will keep only the positive sign. In terms of
3+1 quantities, the null surface equation takes the simple form
∂_t F = β^i ∂_i F − α (γ^{ij} ∂_i F ∂_j F)^{1/2} .   (6.8.4)

In order to locate the horizon we then start with a given initial profile for the function F(t = 0) that grows monotonically outward, and then integrate the
above equation back in time. Since the event horizon behaves as an attractor,
the initial surface F (t = 0) = 0 can in principle correspond to any arbitrary
closed surface. However, convergence will be faster if we choose F such that
F (t = 0) = 0 is already close to the expected position of the event horizon. The
location of the apparent horizon on the last slice of our numerical spacetime is
a good initial guess, as for a black hole that is almost stationary it will be very
close to the true event horizon (the apparent horizon will be somewhat inside
the event horizon).
As a simple example, assume that we are in Minkowski spacetime. The above
equation will then reduce to
∂t F = −∂r F . (6.8.5)
If at t = 0 we take F = r − r0, so that F = 0 corresponds to a sphere of
radius r = r0 , then the solution of this equation will be F = r − r0 − t. Our null
surface will then be given by r = r0 + t. It is clear that this solution does not
settle on any stationary surface as the light rays just keep moving out, which
is expected since Minkowski has no event horizon (notice that since we have an
exact solution there is no need to look at the evolution of the surface backward
in time).
It is important to notice that, in contrast with the geodesic equation needed to
integrate null rays, the above equation for the evolution of a null surface makes
no reference to derivatives of the metric (no Christoffel symbols are present).
This means that the numerical integration of this equation will be more accu-
rate than the integration of individual geodesics. Another important property of
this method is the fact that, even if we are focusing on the particular level set
F = 0, all the different level sets of F will correspond to a separate null surface.
This implies that by evolving the single scalar function F we are actually track-
ing a whole sequence of concentric light-fronts as they propagate through our
spacetime.
The idea of tracking null surfaces backward in time in order to locate the
event horizon is extremely robust, but there are two important issues that must
be taken into account. The first has to do with the fact that, since the event
horizon behaves as an attractor as we go back in time, light-fronts that are
already close to it will approach it very slowly, while those further away will
approach it much faster. This has the effect that close to the event horizon the
function F will rapidly become very steep as the different level sets (i.e. different
light-fronts) approach each other, resulting in large gradients that can cause the
numerical integration to lose accuracy.
To see how this happens consider the case of a single Schwarzschild black hole
in Kerr–Schild coordinates described by the metric (1.15.20), which corresponds
to the 3+1 quantities
α = 1/(1 + 2M/r)^{1/2} ,   β^r = (2M/r)/(1 + 2M/r) ,   γ_rr = 1 + 2M/r .   (6.8.6)
The evolution equation for the null surface then becomes
∂_t F = [β^r − α (γ^{rr})^{1/2}] ∂_r F = [(2M/r − 1)/(1 + 2M/r)] ∂_r F .   (6.8.7)
Remember now that we want to evolve this back in time, so we must in fact take
∂_t̄ F = [(1 − 2M/r)/(1 + 2M/r)] ∂_r F ,   (6.8.8)
with t̄ = −t. Figure 6.8 shows the evolution of the function F for a black hole of
mass M = 1, starting from F (t̄ = 0) = r − 2 and evolving up to t̄ = 1. Since in
this case the horizon position is precisely at r = 2, we expect the level set F = 0
to remain fixed at the horizon location. We can clearly see from the figure how
this is indeed the case. Notice also how the other level sets rapidly approach the
horizon from both sides causing the function F to become steeper very rapidly.
Fig. 6.8: Evolution of the function F backward in time for a Schwarzschild black hole
with M = 1 in Kerr–Schild coordinates. The evolution is shown for a total time of
t̄ = 1, with snapshots every ∆t̄ = 0.1.
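A minimal sketch of the kind of integration that produces Figure 6.8 is given below. It uses first order upwind differencing of (6.8.8) on a uniform radial grid; the grid extent, resolution and Courant factor are illustrative choices, and the one-sided treatment at the grid edges is deliberately crude.

import numpy as np

def evolve_null_surface(M=1.0, r_max=4.0, nr=400, t_final=1.0, cfl=0.5):
    # Integrate dF/dtbar = c(r) dF/dr, with c = (1 - 2M/r)/(1 + 2M/r),
    # starting from F(tbar = 0) = r - 2M (a sphere at the horizon).
    r = np.linspace(0.1, r_max, nr)          # avoid r = 0
    dr = r[1] - r[0]
    c = (1.0 - 2.0 * M / r) / (1.0 + 2.0 * M / r)
    dt = cfl * dr / np.max(np.abs(c))
    F = r - 2.0 * M
    t = 0.0
    while t < t_final:
        dFdr = np.empty_like(F)
        # The advection speed is -c, so we difference toward larger r where
        # c > 0 and toward smaller r where c < 0 (upwinding).
        dFdr[1:-1] = np.where(c[1:-1] > 0.0,
                              (F[2:] - F[1:-1]) / dr,
                              (F[1:-1] - F[:-2]) / dr)
        dFdr[0] = (F[1] - F[0]) / dr          # crude edge treatment
        dFdr[-1] = (F[-1] - F[-2]) / dr
        F = F + dt * c * dFdr
        t += dt
    return r, F

Plotting F against r at the final time (or storing snapshots inside the loop) should show the zero level set staying put at r = 2M while the neighbouring level sets pile up against it, which is exactly the steepening discussed next.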
The problem of the steepening of the function F close to the event horizon is
well known and can be solved by regularly re-initializing the function F without
changing its zero level set. For example, in [109] Diener suggests re-initializing
F whenever its gradient becomes too large by evolving the following equation in
the unphysical time λ until a steady state is achieved:
dF/dλ = − [F/√(F² + 1)] (|∇F| − 1) ,   (6.8.9)

with |∇F| := (γ^{ij} ∂_i F ∂_j F)^{1/2}. The last equation will have the effect of driving the magnitude of the gradient of F to 1 everywhere. The factor F/√(F² + 1) is
there to guarantee that the level set F = 0 does not change position during re-
initialization, while at the same time limiting the size of the coefficient in front
of |∇F | since too large a coefficient would require a very small ∆λ to maintain
numerical stability (this is related to the CFL stability condition, see Chapter 9).
Notice, however, that as soon as we start re-initializing F , the level sets different
from F = 0 will stop corresponding to null surfaces, so that in the end we will
only be tracking the single null surface F = 0.
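A one-dimensional, flat-space sketch of this re-initialization procedure is shown below (a simplified illustration, not Diener's actual scheme): a forward Euler relaxation of (6.8.9) with centered differences for the gradient. Production codes would use upwinded gradients and the full spatial metric.

import numpy as np

def reinitialize(F, dx, n_steps=50, dlam=None):
    # Relax dF/dlambda = -F/sqrt(F^2+1) * (|grad F| - 1) on a uniform 1D grid,
    # driving |grad F| toward 1 while leaving the zero level set of F in place.
    F = F.copy()
    if dlam is None:
        dlam = 0.5 * dx                      # simple CFL-like pseudo time-step
    for _ in range(n_steps):
        grad = np.gradient(F, dx)            # centered differences
        F -= dlam * F / np.sqrt(F**2 + 1.0) * (np.abs(grad) - 1.0)
    return F

The prefactor F/√(F² + 1) vanishes at the zero level set, which is what keeps the tracked null surface from moving, and it saturates at 1 far away, which keeps the pseudo time-step restriction mild.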
Another potential problem with the algorithm is related to the possibility of
the event horizon changing topology as black holes merge. As we are integrat-
ing backward in time, what we will in fact see is a single horizon that splits
into two separate ones. Simple geometry indicates that at the point where the
split happens the scalar function F will necessarily develop a saddle point where
the spatial gradient vanishes, so that according to equation (6.8.4) it will stop
evolving (at the saddle point it is simply impossible to decide which direction
is “outward”). Luckily, this seems to be one of those situations where the in-
accuracies of numerical approximations come to our help: Numerically, we will
never be exactly on the saddle point of F , so that the gradient will never vanish
completely. Empirically we find that the surface simply evolves right through a
topology change without any problem.
The discussion presented here has been very brief. For a much more detailed
explanation of the null surface algorithm the reader should look at the recent pa-
per by Diener on a general-purpose three-dimensional event horizon finder [109].
70 One might argue that black holes are by definition non-local. However, the new concepts
apply to geometric objects that are more closely related to the astrophysical notion of a black
hole: something that forms as a result of gravitational collapse and has a horizon around it.
Fig. 6.9: On each spatial slice Σ the intersection of the hypersurface H is a marginally trapped surface S. If s is the unit normal to S in Σ, and n is the timelike unit normal to Σ, then l = (n + s)/√2 is an outgoing null vector. For a non-expanding horizon, l must be tangential to H.
where q_{ij} := γ_{ij} − s_i s_j is the induced metric on the horizon. Notice that ϕ only needs to be defined on the horizon itself.
Once we have a Killing field ϕ, the magnitude of the angular momentum on the horizon can be written in 3+1 terms as

J_H = (1/8π) ∮_S ϕ^l s^m K_{lm} dA ,   (6.9.2)
is somewhat ambiguous since in general we can have many different vector fields
on the horizon that satisfy these conditions, and each will give us a different
value for the angular momentum.
As a final comment, it should be mentioned that the need to study the evo-
lution of the direction as well as the magnitude of the angular momentum of a
black hole during a dynamical simulation has recently led to the introduction of
the so-called coordinate angular momenta, for which we simply take the standard Euclidean rotational vector fields around the coordinate axes, ϕ_x = (0, −z, y),
7.1 Introduction
In the previous Chapter we considered the simulation of black hole spacetimes,
which corresponds to the study of the Einstein field equations in their purest
form, i.e. in vacuum. It is clear, however, that most relativistic astrophysical
systems involve matter sources: stars, accretion flows, jets, gravitational col-
lapse, etc. Since many of these systems involve gases it is natural to model them
using the theory of fluid dynamics. The fluid approximation describes matter as a continuum; this means that when we consider an "infinitesimal" fluid element
we are in effect assuming that the fluid element is small compared to the system
as a whole, but still big enough to contain a very large number of particles. In
order for the fluid model to be a good approximation there are two basic condi-
tions that must be satisfied: Both the mean distance between the particles and
the mean free path between collisions must be much smaller than the macro-
scopic characteristic length of the system. Fluids are therefore particularly bad
at modeling collisionless particles, unless their paths are not expected to cross,
in which case we can use a fluid with zero pressure (known as “dust”).
The state of the fluid is usually described in terms of the velocity field of the
fluid elements v i , the mass-energy density ρ, and the pressure p. There are two
distinct approaches that can be used to follow the fluid motion. One possibility is
to consider a fixed coordinate system and study the motion of the fluid as seen by
observers at rest in this coordinate system – such an approach is called Eulerian
(as in the Eulerian observers of the 3+1 formulation). The other possibility is to
tie the coordinate system to the fluid elements themselves, corresponding to the
so-called Lagrangian approach. The Lagrangian point of view has some advan-
tages in the case of simple fluid motion (e.g. spherical collapse), as conservation
of mass can be trivially guaranteed. However, a Lagrangian scheme becomes in-
adequate when the fluid motion has large shears since in that case the coordinate
system can easily become entangled. Because of this, in the presentation given
here we will always consider a Eulerian approach.
In the following Sections we will derive the dynamical equations governing
the motion of a perfect fluid, known as the Euler equations, both in the special
and general relativistic cases, and will also consider some simple equations of
state. We will later discuss the hyperbolicity properties of the hydrodynamic
system of evolution equations, and consider the concept of weak solutions and
shock waves. Finally, we will discuss how to generalize the Euler equations to the
case of imperfect fluids with heat conduction and viscosity. The description pre-
sented here will be very brief. There exists, however, a large literature dedicated
exclusively to the study of fluid mechanics, most notably the beautiful book of
Landau and Lifshitz [184]. There are also many books that discuss the numeri-
cal treatment of the hydrodynamic equations, particularly in the non-relativistic
case. In the case of relativity there is the recent book by Wilson and Mathews
on relativistic numerical hydrodynamics [300], and the review papers by Marti
and Müller [200], and Font [129].
where uµ is the 4-velocity of the fluid elements (the average 4-velocity of the
particles), ρ and p are the energy density and pressure as measured in the fluid’s
rest frame, and where, for the moment, we have assumed that we are in special
relativity so the metric is given by the Minkowski tensor gµν = ηµν .
The stress-energy tensor above is usually written in a simplified form by first
separating the total energy density ρ into contributions coming from the rest
mass energy density ρ0 and the internal energy:
ρ = ρ_0 (1 + ε) ,   (7.2.2)

where ε is the specific internal energy (internal energy per unit mass) of the fluid.
Let us now introduce the so-called specific enthalpy of the fluid defined as71
h := 1 + ε + p/ρ_0 .   (7.2.3)
The rest mass energy density is also often written in terms of the particle
number density n as
ρ0 = nm , (7.2.5)
with m the rest mass of the fluid particles.
71 In thermodynamics the enthalpy H is defined as the sum of the internal energy U plus
the pressure times the volume, H = U + pV . In other words, the enthalpy represents the total
energy in the system capable of doing mechanical work. In relativity we also add the rest mass
energy M to the definition of enthalpy, so that H = M + U + pV = M(1 + ε) + pV. The specific enthalpy is then defined as the enthalpy per unit mass: h = H/M = 1 + ε + p/ρ_0.
with nµ the unit normal to the spacelike hypersurfaces. Substituting the above
stress-energy tensor here and using the fact that nµ nµ = −1 we find
ρ_ADM = ρ_0 h (u^μ n_μ)² − p = ρ_0 h W² − p ,   (7.2.7)
In the particular case when the local coordinates follow the fluid element we
have W = 1, and the energy densities become equal:
ρ_ADM = ρ_0 h W² − p = ρ_0 h − p = ρ_0 (1 + ε) = ρ .   (7.2.12)
Notice, however, that if the flow is non-uniform we can not adapt the coordinate
system to follow the fluid elements (the Lagrangian approach) without being
forced to replace the Minkowski metric ηµν with a more general metric gαβ ,
since the fluid motion will in general deform the volume elements.
The state of the fluid at any given time is given in terms of the six variables (ρ_0, ε, p, v^i), which from now on will be called the primitive variables. The evo-
lution equations for the fluid now follow from the conservation laws. We have in
fact two sets of conservation laws, namely the conservation of particles and the
conservation of energy-momentum:
∂_μ (ρ_0 u^μ) = 0 ,   (7.2.13)
∂_μ T^{μν} = 0 .   (7.2.14)
Notice that these conservation laws provide us with five equations. In order to
close the system we therefore need an equation of state which can be assumed
to be of the form
p = p(ρ_0, ε) .   (7.2.15)
To proceed let us now introduce the quantity
D := ρ0 W , (7.2.16)
which is nothing more than the rest mass density as seen in the Eulerian frame.
The conservation of particles now implies
∂_t D + ∂_k (D v^k) = 0 .   (7.2.17)
This is known as the continuity equation and has exactly the same form as in the
Newtonian case, but now D includes the relativistic correction coming from the
Lorentz factor W . The continuity equation can be interpreted as an evolution
equation for D.
For the conservation of momentum we first define the quantities
S µ := ρ0 hW uµ . (7.2.18)
∂_μ (S_i u^μ/W + p δ^μ_i) = 0   ⇒   ∂_t S_i + ∂_k (S_i v^k) + ∂_i p = 0 .   (7.2.20)
These are the evolution equations for the momentum density and are known
as the Euler equations. Notice that they have a structure similar to that of the
continuity equation, but now there is an extra term given by the gradient of the
pressure. The momentum density can then change both because of the flow of momentum out of the volume element, represented by the term ∂_j (S_i v^j), and
because of the existence of a force given by the gradient of the pressure ∂i p. The
Euler equations above have again exactly the same form as in the Newtonian
case, but now the definition of the momentum density Si includes the relativistic
corrections.
We are still missing an evolution equation for the energy density. Such an
equation can be obtained in a number of different ways. Experience has shown
that it is in fact convenient to subtract the rest mass energy density in order
to have higher accuracy, since for systems that are not too relativistic the rest
mass can dominate the total energy density. However, there are several non-
equivalent ways to do this. As a first approach, consider the internal energy
density as measured in the Eulerian frame:

𝓔 = ρ_0 ε W .   (7.2.21)
Notice that there is only one Lorentz factor W coming from the Lorentz contrac-
tion of the volume elements, since the specific internal energy can be considered
a scalar (this is by definition the internal energy per particle in the fluid’s frame).
In order to derive an evolution equation for E we first notice that the conserva-
tion equations imply that
∂µ (uν T µν ) = T µν ∂µ uν . (7.2.22)
Substituting here the expression for the stress-energy tensor, and remembering
that uµ uµ = −1 implies uµ ∂ν uµ = 0, we find
∂µ (uν T µν ) = p ∂µ uµ . (7.2.23)
∂_t 𝓔 + ∂_k (𝓔 v^k) + p [∂_t W + ∂_k (W v^k)] = 0 .   (7.2.27)
This equation has been used successfully by Wilson and collaborators to evolve
relativistic fluids (see e.g. [300]). However, as an evolution equation for 𝓔 it has one serious drawback, namely that it also involves the time derivative of W,
so that it can not be written as a balance law, which in particular makes it
impossible to use for analyzing the characteristic structure of the system.
Fortunately, there exists an alternative way of subtracting the rest mass en-
ergy from the system that does yield an equation in balance law form. We can
simply decide to evolve instead the difference between the total energy density
and the mass energy density as measured in the Eulerian frame:

E := ρ_ADM − D = ρ_0 h W² − p − ρ_0 W .   (7.2.28)

Notice that the energies 𝓔 and E differ since 𝓔 does not include contributions from the kinetic energy while E does. To find the evolution equation for E we first
notice that from the definition of S µ we have S 0 = ρ0 hW 2 . The conservation of
energy then takes the form
0 = ∂_μ T^{0μ} = ∂_μ (S^0 u^μ/W + p η^{0μ}) ,   (7.2.29)

which, after using the continuity equation, can be reduced to

∂_t E + ∂_k [(E + p) v^k] = 0 .   (7.2.31)

The full system of special relativistic hydrodynamic equations in conservative form is then

∂_t D + ∂_k (D v^k) = 0 ,   (7.2.32)
∂_t S_i + ∂_k (S_i v^k + p δ^k_i) = 0 ,   (7.2.33)
∂_t E + ∂_k [(E + p) v^k] = 0 ,   (7.2.34)
with the conserved quantities (D, Si , E) given in terms of the primitive quantities
(ρ_0, ε, p, v^i) as

D = ρ_0 W ,   S_i = ρ_0 h W² v_i ,   E = ρ_0 h W² − p − ρ_0 W .   (7.2.35)
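The forward map (7.2.35) from primitive to conserved variables is straightforward to evaluate; the following short sketch (flat space, plain Python) simply implements these three definitions.

import numpy as np

def prim_to_cons(rho0, eps, p, v):
    # Special relativistic conserved variables (7.2.35) from the primitives.
    # v is the 3-velocity (vx, vy, vz); returns (D, S, E).
    v = np.asarray(v, dtype=float)
    W = 1.0 / np.sqrt(1.0 - np.dot(v, v))    # Lorentz factor
    h = 1.0 + eps + p / rho0                 # specific enthalpy (7.2.3)
    D = rho0 * W
    S = rho0 * h * W**2 * v
    E = rho0 * h * W**2 - p - D
    return D, S, E

The inverse map, discussed below, is the part that requires some work.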
Note that the Euler equations are frequently written for the speed vi instead
of the flux Si and have the form (see e.g. [297])
These equations can be easily derived by combining the evolution equations for
D, Si and E. However, they are not as convenient as the evolution equations for
72 The hydrodynamic equations in the conservative form given here were first derived by
Marti, Ibañez, and Miralles at the University of Valencia in Spain and are often called the
Valencia formulation of relativistic hydrodynamics [130, 198] (see also [129, 200]).
244 RELATIVISTIC HYDRODYNAMICS
Si since they are not written as conservation laws, and in particular involve the
time derivative of the pressure.
where d/dτ := uµ ∂µ is the derivative along the trajectory of the fluid elements.
This equation is in fact nothing more than the local version of the first law of
thermodynamics. To see this, consider a fluid element with rest mass M , internal
energy U, and volume V. We then have in general that

ρ_0 = M/V   ⇒   dV = M d(1/ρ_0) ,   (7.2.41)

and similarly

ε = U/M   ⇒   dU = M dε .   (7.2.42)

The first law of thermodynamics then implies that

dQ = dU + p dV = M [dε + p d(1/ρ_0)] .   (7.2.43)
This shows that (7.2.40) is precisely the first law of thermodynamics for a fluid
element for which dQ = 0 (this is to be expected since by definition a perfect
fluid has no heat conduction). And since in general dQ = T dS, with T the
temperature and S the entropy of the fluid, we see that a perfect fluid behaves
in such a way that entropy is preserved along flow lines.
Let us now go back and consider the relation between the primitive and con-
served variables (7.2.35). In the Newtonian limit these relations reduce to D = ρ_0, S_i = ρ_0 v_i and E = ρ_0 (ε + v²/2), so that they are very easy to invert. In the rela-
tivistic case, however, inverting the relations becomes much more difficult since
first W involves v 2 , and also the pressure appears explicitly in the expression for
and change the value of p∗ until this residual vanishes. This can typically be ac-
complished by standard non-linear root-finding techniques (e.g. one-dimensional
Newton–Raphson). For some simple equations of state, such as that of an ideal
gas discussed in Section 7.5, the whole procedure can in fact be done analytically
and involves finding the physically admissible root of a high order polynomial (a
fourth order polynomial in the case of an ideal gas). However, this is typically
more computationally expensive than using the non-linear root finder directly.
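A minimal sketch of this recovery procedure for the ideal gas equation of state p = (γ − 1) ρ_0 ε of Section 7.5 is given below (flat space; real codes add many safeguards). It uses the relations S_i = (E + D + p) v_i and ρ_0 h W² = E + D + p that follow from (7.2.35), together with a standard bracketing root finder on the trial pressure.

import numpy as np
from scipy.optimize import brentq

def cons_to_prim(D, S, E, gamma=5.0/3.0):
    # Recover (rho0, eps, p, v) from (D, S, E) for p = (gamma - 1) rho0 eps.
    S2 = np.dot(S, S)

    def residual(p):
        v2 = S2 / (E + D + p)**2             # since S_i = (E + D + p) v_i
        W = 1.0 / np.sqrt(1.0 - v2)
        rho0 = D / W
        eps = (E + D + p) / (D * W) - 1.0 - p / rho0
        return (gamma - 1.0) * rho0 * eps - p

    p_min = max(1.0e-16, np.sqrt(S2) - (E + D) + 1.0e-10)   # keeps v^2 < 1
    p_max = 2.0 * (E + D) + 1.0              # crude upper bound; adjust as needed
    p = brentq(residual, p_min, p_max)
    v = S / (E + D + p)
    W = 1.0 / np.sqrt(1.0 - np.dot(v, v))
    rho0 = D / W
    eps = (E + D + p) / (D * W) - 1.0 - p / rho0
    return rho0, eps, p, v

For physically admissible conserved data with γ ≤ 5/3 the physical root should lie inside this bracket; with more general equations of state the same structure applies, with the residual evaluated through a call to the equation of state routine.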
where as before uµ is the 4-velocity of the fluid elements, ρ0 is the rest mass
energy density measured in the fluid’s rest frame, p is the pressure and h is the
specific enthalpy

h := 1 + ε + p/ρ_0 ,   (7.3.2)

with ε the specific internal energy. The evolution equations for the fluid again
follow from the conservation laws, which however now take the form
∇µ (ρ0 uµ ) = 0 , (7.3.3)
∇µ T µν = 0 . (7.3.4)
Using the fact that the divergence of a vector can be written in general as
∇_μ ξ^μ = (1/√−g) ∂_μ (√−g ξ^μ) ,   (7.3.5)
with g the determinant of the metric tensor gµν , we can immediately rewrite the
conservation of particles as
∂_μ (√−g ρ_0 u^μ) = 0 ,   (7.3.6)

with Γ^α_{μν} the Christoffel symbols associated with the metric g_{μν}.
We now assume that we are using a standard 3+1 decomposition of spacetime,
in which case we find
g = −α² γ   ⇒   √−g = α √γ ,   (7.3.8)
with α the lapse function and γ the determinant of the spatial metric γij .
Just as we did in special relativity, let us define the scalar parameter W as
W := −uµ nµ , (7.3.9)
with nµ the unit normal to the spatial hypersurfaces. In this case we have
nµ = (−α, 0), so that
W = αu0 . (7.3.10)
Define now

v^i := u^i/(α u^0) + β^i/α ,   (7.3.11)
with β i the shift vector. With this definition v i is precisely the speed of the fluid
elements as seen by the Eulerian observers. To see this notice that ui /u0 is the
coordinate speed of the fluid elements, so we first need to add the shift to go to
the Eulerian reference frame and then divide by the lapse to use the proper time
of the Eulerian observers instead of coordinate time. Notice also that since uµ is
a 4-vector while v i is only a 3-vector, when we lower the indices we have
u_i = g_{iμ} u^μ = β_i u^0 + γ_{ik} u^k = β_i u^0 + γ_{ik} u^0 (α v^k − β^k) = α u^0 v_i = W v_i .   (7.3.12)
Using again the fact that uµ uµ = −1, we find that W takes the simple form
W = 1/√(1 − v²) ,   (7.3.13)
where now v 2 := γij v i v j , i.e. W is again the Lorentz factor as seen by the
Eulerian observers. Define again D as the rest mass density measured by the
Eulerian observers
D := ρ0 W , (7.3.14)
we can then rewrite the conservation of particles as
∂_t (√γ D) + ∂_k [√γ D (α v^k − β^k)] = 0 .   (7.3.15)
This is again a conservation law for D, but in contrast to the case of special
relativity it involves the lapse α, the shift vector β k , and the determinant of the
spatial metric γ (compare with equation (7.2.32)).
For the conservation of momentum we again introduce the quantities
S µ := ρ0 hW uµ , (7.3.16)
E = ρ_0 h W² − p − D .   (7.3.19)

∂_μ (α² √γ T^{0μ}) = ∂_t [√γ (ρ_0 h W² − p)] + ∂_k {√γ [ρ_0 h W² (α v^k − β^k) + p β^k]} ,   (7.3.23)
and using the evolution equation for D we find that the final expression for the
conservation of energy takes the form

∂_t (√γ E) + ∂_k {√γ [E (α v^k − β^k) + α p v^k]} = α² √γ (T^{0μ} ∂_μ ln α − Γ^0_{μν} T^{μν}) .

The final set of evolution equations is then

∂_t (√γ D) + ∂_k [√γ D (α v^k − β^k)] = 0 ,   (7.3.24)
∂_t (√γ S_i) + ∂_k {√γ [S_i (α v^k − β^k) + α p δ^k_i]} = α √γ Γ^μ_{νi} T^ν_μ ,   (7.3.25)
∂_t (√γ E) + ∂_k {√γ [E (α v^k − β^k) + α p v^k]} = α² √γ (T^{0μ} ∂_μ ln α − Γ^0_{μν} T^{μν}) ,   (7.3.26)
where the conserved quantities (D, S_i, E) and primitive variables (ρ_0, ε, p, v^i) are related through

D = ρ_0 W ,   S_i = ρ_0 h W² v_i ,   E = ρ_0 h W² − p − D .   (7.3.27)
Notice that the system of equations (7.3.24)–(7.3.26) reduces to the special relativistic counterpart (7.2.32)–(7.2.34) when we take α = 1, β^i = 0 and γ_{ij} = δ_{ij}, in which case Γ^α_{μν} = 0 and the system is truly conservative. The presence of a non-trivial gravitational field implies that there is no longer true conservation of energy and momentum, but the equations are still in the form of balance laws: ∂_t u + ∂_k F^k(u) = s(u).
Before concluding this Section, it is important to write down the relation
between the quantities (D, Si , E) and the matter terms measured by the Eulerian
observers that appear in the ADM equations, namely the energy density ρ_ADM, the momentum density j^i_ADM and the stress tensor S^{ij}_ADM. Using the expression for T^{μν} we find

ρ_ADM := n^μ n^ν T_{μν} = ρ_0 h W² − p = E + D ,   (7.3.28)
j^i_ADM := −n^μ P^{νi} T_{μν} = ρ_0 h W² v^i = S^i ,   (7.3.29)
S^{ij}_ADM := P^{μi} P^{νj} T_{μν} = ρ_0 h W² v^i v^j + γ^{ij} p ,   (7.3.30)

where P^{μν} = g^{μν} + n^μ n^ν is the standard projection operator onto the spatial hypersurfaces.
∂_t (√γ D) + ∂_k [√γ D (α v^k − β^k)] = 0 .   (7.4.1)

with D_k the covariant derivative associated with the spatial metric γ_{ij}. This implies that

(1/√γ) ∂_k [√γ D (α v^k − β^k)] = D_k (α D v^k) − (D_k β^k) D − β^k ∂_k D .   (7.4.3)
Now, from the ADM evolution equations for the spatial metric (2.3.12), we can
easily find that
∂t ln γ = −2αK + 2Dk β k , (7.4.5)
with K the trace of the extrinsic curvature Kij . Collecting results we find that
the evolution equation for D in 3+1 language takes the final form
∂_t D − β^k ∂_k D + D_k (α D v^k) = αKD .   (7.4.6)
This equation is clearly a scalar equation. The different terms are easy to inter-
pret: The shift appears only in the advection term, as it should, since the only
role of the shift is to move the coordinate lines. The last term on the left hand
side shows that the change in D along the normal lines is given essentially by the
divergence of the flux of particles. Finally, the source term shows that the density
of particles D can also change because of an overall change in the spatial volume
elements. For example, in the case of cosmology the so-called cosmological fluid
is co-moving with the Eulerian observers so that v i = β i = 0, but the density of
particles still becomes smaller with time because of the overall expansion of the
Universe (K < 0).
It is also interesting to note that in the last equation the shift appears as
an advection term that is not in flux-conservative form (the shift is outside the
spatial derivatives). The flux conservative form (7.3.24) comes about because as
we bring the shift vector into the spatial derivative we pick up a term with the
divergence of the shift. This term is canceled by a corresponding term coming
from the time derivative of the volume element. This shows that, quite generally,
advection terms involving the shift can be transformed into flux conservative type terms by bringing a √γ factor into the time derivative.
Consider next the evolution equation for E. Again, since this is by definition
the energy density measured by the Eulerian observers minus the rest mass
density, E = ρADM − D = ρ0 hW 2 − p − D, it is clearly a scalar quantity in 3+1
terms. Its original evolution equation is
∂_t (√γ E) + ∂_k {√γ [E (α v^k − β^k) + α p v^k]} = α² √γ (T^{0μ} ∂_μ ln α − Γ^0_{μν} T^{μν}) .
Let us first look at the source term. Using the expression for the stress-energy ten-
sor (7.3.1), the definition of v i (7.3.11), and the expressions for the 4-Christoffel
symbols in 3+1 language found in Appendix B, it is not difficult to show that
α² (T^{0μ} ∂_μ ln α − Γ^0_{μν} T^{μν}) = ρ_0 h W² (α v^m v^n K_{mn} − v^m ∂_m α) + α p K
   = (E + p + D) (α v^m v^n K_{mn} − v^m ∂_m α) + α p K .   (7.4.7)
We can now rewrite the left hand side of the evolution equation for E in ex-
actly the same way as the evolution equation for D. We then find the following
evolution equation for E in 3+1 form
∂_t E − β^k ∂_k E + D_k [α v^k (E + p)] = (E + p + D) (α v^m v^n K_{mn} − v^m ∂_m α) + αK (E + p) .   (7.4.8)
The last term on the right hand side is interesting. If we assume that we have a fluid that is co-moving with the Eulerian observers, so that v^k = β^k = 0, we then find that ∂_t E = αK(E + p). This shows that the internal energy density changes both
as a reflection of a simple change in the volume elements (αKE), and because of the existence of a non-zero pressure (αpK). But we know that αK = −∂_t ln √γ, so that αpK = −p ∂_t ln √γ, which is nothing more than the work done by the
fluid as space expands. That is, the term αpK in the source term is there in
accordance with the first law of thermodynamics.
Finally, let us consider the evolution equation for the momentum density Si .
Since we have Si = ρ0 hW 2 vi , then we can consider Si a vector with respect to
the 3-geometry. Its original evolution equation is
∂_t (√γ S_i) + ∂_k {√γ [S_i (α v^k − β^k) + α p δ^k_i]} = α √γ Γ^μ_{νi} T^ν_μ ,   (7.4.9)
From the expression for the stress-energy tensor, the right hand side of this
equation can easily be shown to be
α √γ Γ^μ_{νi} T^ν_μ = p ∂_i (α √γ) + α √γ (ρ_0 h/2) u^μ u^ν ∂_i g_{μν} .   (7.4.10)

Substituting now the components of the 4-metric g_{μν} in terms of 3+1 quantities we find, after some algebra, that

(α/2) u^μ u^ν ∂_i g_{μν} = W² (−∂_i α + v^k D_i β_k + (3)Γ^m_{ik} u^k v_m / u^0) ,   (7.4.11)

where (3)Γ^m_{ik} are the Christoffel symbols associated with the 3-geometry. On the
other hand we have

(1/√γ) ∂_t (√γ S_i) = ∂_t S_i − αK S_i + S_i D_k β^k ,   (7.4.12)

and

(1/√γ) ∂_k [√γ S_i (α v^k − β^k)] = D_k [S_i (α v^k − β^k)] + S_m (α v^n − β^n) (3)Γ^m_{in} .   (7.4.13)
Collecting results we find that the evolution equation for S_i becomes

∂_t S_i − αK S_i + D_k (α S_i v^k) − β^k D_k S_i + S_m (α v^k − β^k) (3)Γ^m_{ik} + (1/√γ) ∂_i (α √γ p)
   = (p/√γ) ∂_i (α √γ) + ρ_0 h W² (−∂_i α + v^k D_i β_k + (3)Γ^m_{ik} u^k v_m / u^0) ,   (7.4.14)
which can be simplified to
∂_t S_i − £_β S_i + D_k (α S_i v^k) + ∂_i (αp) = −(E + D) ∂_i α + αK S_i .   (7.4.15)
Notice that the shift vector again only appears in the Lie derivative term, as
expected.
The full set of hydrodynamic equations in 3+1 form can then be written as

∂_t D − β^k ∂_k D + D_k (α D v^k) = αKD ,   (7.4.16)
∂_t S^i − £_β S^i + D_k [α (S^i v^k + γ^{ik} p)] = −(E + D) D^i α + αK S^i ,   (7.4.17)
∂_t E − β^k ∂_k E + D_k [α v^k (E + p)] = (E + p + D) (α v^m v^n K_{mn} − v^m ∂_m α) + αK (E + p) .   (7.4.18)
The above equations are now manifestly 3-covariant when we consider (D, E, p)
as scalars and S_i as a 3-vector.73 The 3+1 equations just derived can also be used in the case of special relativity with curvilinear coordinate systems, in which case we only need to take α = 1, β^i = 0 and K_{ij} = 0.

73 These 3+1 hydrodynamic equations have also been derived previously by Salgado using a
A much more realistic choice is the equation of state for an ideal gas, which has the form75

p = (γ − 1) ρ_0 ε ,   (7.5.2)

74 If we wish to study a system of collisionless particles in cases where the particle paths may cross, then we can't think in terms of a continuous fluid anymore and need to go instead to a description based on kinetic theory and the Boltzmann equation.

75 The terms "perfect fluid" and "ideal gas", though similar, refer in fact to very different things. A perfect fluid is defined as one with no viscosity and no heat conduction, but the equation of state can still be very general. An ideal gas, on the other hand, refers to a very specific equation of state.
with γ a constant known as the adiabatic index (not to be confused with the de-
terminant of the spatial metric of the previous Section). To see how this equation
comes about consider the standard equation of state for an ideal gas:
pV = nkT , (7.5.3)
where V is the volume, n the number of particles, T the temperature and k the
Boltzmann constant. Define now the specific heat of the gas as the amount of
heat Q per unit mass needed to raise the temperature by one degree. The specific
heat is in fact different if we keep the volume or the pressure constant, so we
define the specific heats at constant volume and at constant pressure as
c_V = (1/M) (dQ/dT)|_V ,   c_p = (1/M) (dQ/dT)|_p ,   (7.5.4)
with M = nm the total mass and m the mass of the individual particles. The
adiabatic index γ of the gas is defined as the ratio of these two quantities:
γ := c_p/c_V .   (7.5.5)
Notice that in general we expect to find cp > cV , so that γ > 1. This is because
it takes more heat to increase the temperature at constant pressure than at
constant volume since in the first case part of the heat produces work (pdV ),
while in the second case no work is allowed (dV = 0).
Now, the first law of thermodynamics states that dQ = dU + pdV . This
implies that, at constant volume, dQ = dU , from which we find
c_V = (1/M) dU/dT   ⇒   dU = M c_V dT .   (7.5.6)
If we now assume that cV is constant, this relation can be integrated to yield
U = M cV T . (7.5.7)
so that

c_p = (1/M) (dQ/dT)|_p = (M c_V dT + nk dT)/(M dT) = c_V + k/m .   (7.5.10)
Another very common choice that is closely related to the ideal gas case is
the so-called polytropic equation of state that has the form
p = K ρ_0^Γ ≡ K ρ_0^{1+1/N} ,   (7.5.15)

... polytropic index instead of N. Also, we sometimes find that the polytropic equation of state is defined as p = K ρ^Γ = K ρ_0^Γ (1 + ε)^Γ, which is not equivalent to the standard version discussed here.

p = K ρ_0^γ ,   (7.5.18)
with K some constant. Comparing this with equation (7.5.15) we see that Γ
plays the role of γ. However, we must remember that the thermodynamic rela-
tion (7.5.18) above is only valid for an adiabatic process involving an ideal gas,
while equation (7.5.15) is often promoted to an equation of state valid even when
there is heat exchange, so they are not entirely equivalent and in the general case
Γ will not correspond to the true adiabatic index.
For a polytrope we also find the following relation between the specific internal energy and the rest mass density:

ε = [K/(γ − 1)] ρ_0^{γ−1} .   (7.5.19)

This relation can be easily obtained by substituting the polytropic relation p = K ρ_0^γ into the equation of state for an ideal gas p = (γ − 1) ρ_0 ε. Again,
the relation only holds for an adiabatic situation.
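These relations are simple enough to collect in a few lines of code; the sketch below (with purely illustrative values of K and γ) evaluates (7.5.18) and (7.5.19) and checks that together they reproduce the ideal gas equation of state (7.5.2).

import numpy as np

def polytrope(rho0, K=100.0, gamma=2.0):
    # Pressure and specific internal energy for adiabatic (isentropic) flow.
    p = K * rho0**gamma                                # polytropic relation (7.5.18)
    eps = K * rho0**(gamma - 1.0) / (gamma - 1.0)      # relation (7.5.19)
    assert np.allclose(p, (gamma - 1.0) * rho0 * eps)  # ideal gas EOS (7.5.2)
    return p, eps

The values K = 100 and γ = 2 are often used (in geometrized units) for simple polytropic neutron star models, but any stiff or soft combination can be used in the same way.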
In the particular case of a perfect fluid there is no heat exchange by definition,
so using the polytropic relation (7.5.18) is equivalent to using the equation of
state of an ideal gas (7.5.2). In fact, in this case we can show that the evolution
equation for E can be derived from the evolution equations for D and Si , together
with the equation of state and the polytropic relation p = K ρ_0^γ, so that we can
ignore E completely (remember that the evolution equation for the energy den-
sity E is essentially the local version of the first law of thermodynamics, and the
polytropic relation is an integral of this law). There is, however, one final subtlety
to this argument. The hydrodynamic equations only imply the first law of ther-
modynamics along the trajectory of the fluid elements (cf. equation (7.2.40)), so
that the constant K in the integrated polytropic relation (7.5.18) can in principle
be different along different flow lines. The value of this constant can in fact be
shown to be related to the entropy of the fluid element. To see this, notice that
in the situation where there is heat transfer the first law of thermodynamics has
the general form
T dσ = dε + p d(1/ρ_0) = dε − (p/ρ_0²) dρ_0 ,   (7.5.20)
where T is the temperature and σ the specific entropy (entropy per unit mass)
of the gas. Dividing by T and using the expression for the temperature for an
ideal gas (7.5.14) we find
(m/k) dσ = dε/[(γ − 1) ε] − dρ_0/ρ_0 = d ln (ε^{1/(γ−1)}/ρ_0) ,   (7.5.21)

so that

σ = (k/m) [C + ln (ε^{1/(γ−1)}/ρ_0)] ,   (7.5.22)

with C some integration constant. For an adiabatic process we can substitute the polytropic relation (7.5.19) to find

σ = (k/m) [C + (1/(γ − 1)) ln (K/(γ − 1))] .   (7.5.23)
We then see that taking K to be the same constant everywhere implies that σ
is not only constant along flow lines, but is also uniform. Because of this a fluid
with K constant everywhere is called isentropic. Notice finally that even though
for a perfect fluid we expect the flow to be adiabatic, this is only true as long
as the flow remains smooth. When shocks develop, kinetic energy is transformed
into internal energy (the so-called shock heating), so there is heat transfer and
the polytropic relation is no longer equivalent to the ideal gas equation of state.
The origin of the word polytrope, which comes from the Greek for “many
turns”, is in the study of adiabatic processes in convective gases. Polytropic
models were first studied by R. Emden in 1907 [122] and they are particularly
useful for the study of hydrostatic equilibrium in the Earth’s atmosphere. They
have also been extensively used in astrophysics for studying stellar structure.
In the particular case of cold compact objects such as white dwarfs and
neutron stars that are supported mainly by the Pauli exclusion principle (for
electrons in the first case and neutrons in the second), we can not use the ideal
gas equation of state discussed above since it only applies to classical gases and
does not describe correctly a highly degenerate Fermi gas. However, we can
derive from first principles an equation of state for an ideal Fermi gas at zero
temperature and it turns out that this equation reduces to a polytropic form
both in the non-relativistic and extremely relativistic limits, with γ = 5/3 and
4/3 respectively. More generally, polytropic equations of state with an adiabatic
index in the range 1 < γ < 3 can be used as very simple models of neutron
stars made of “non-ideal” Fermi gases. High values of γ result in stiff equations
of state, and low values in soft equations of state (the words “soft” and “stiff”
relate to the speed of sound in the fluid – see Section 7.6).
The true equation of state for neutron stars is still largely unknown owing to
our lack of knowledge of the interactions of nuclear matter at very high densities.
There are, however, a number of proposed equations of state that are consider-
ably more realistic than a simple polytrope, but we will not discuss them here
(the interested reader can see e.g. [264, 275]).
∂_t E + ∂_k [(E + p) v^k] = 0 ,   (7.6.3)

u_1 := ρ_0 ,   u_2 := S_x ,   u_3 := S_y ,   u_4 := S_z ,   u_5 := E .   (7.6.4)

ρ_0 = u_1 ,   ε = u_5/u_1 − (u_2² + u_3² + u_4²)/(2u_1²) ,   v_x = u_2/u_1 ,   v_y = u_3/u_1 ,   v_z = u_4/u_1 .   (7.6.5)
F_1 = ρ_0 v_x = u_2 ,   (7.6.6)
F_2 = S_x v_x + p = u_2²/u_1 + p ,   (7.6.7)
F_3 = S_y v_x = u_2 u_3/u_1 ,   (7.6.8)
F_4 = S_z v_x = u_2 u_4/u_1 ,   (7.6.9)
F_5 = (E + p) v_x = (u_5 + p) u_2/u_1 .   (7.6.10)
Here we must remember that p = p(ρ_0, ε), so that in general we have

∂p/∂u_i = (∂p/∂ρ_0)(∂ρ_0/∂u_i) + (∂p/∂ε)(∂ε/∂u_i) .   (7.6.11)
The characteristic matrix is now defined as the Jacobian of the fluxes, i.e.
Mij = ∂Fi /∂uj . When we construct this matrix explicitly we find that it has
five real eigenvalues and a complete set of eigenvectors, so the system is strongly
hyperbolic. The characteristic speeds (eigenvalues) are
λ_0 = u_2/u_1 = v_x ,   (with multiplicity 3)   (7.6.12)

λ_± = u_2/u_1 ± √(χ + (p/u_1²) κ) = v_x ± √(χ + (p/ρ_0²) κ) ,   (7.6.13)

where χ := ∂p/∂ρ_0 and κ := ∂p/∂ε. The last two eigenvalues can be written more compactly as

λ_± = v_x ± c_s ,   (7.6.15)

where c_s is given by

c_s² := χ + (p/ρ_0²) κ .   (7.6.16)
The quantity cs is known as the local speed of sound , and measures the speed
at which density perturbations travel as seen in the fluid’s reference frame. The
speed of sound cs can also be rewritten in a different way by remembering that
for an adiabatic process the first law of thermodynamics has the form

0 = dε + p d(1/ρ_0) = dε − (p/ρ_0²) dρ_0 ,   (7.6.17)

which implies

dε = (p/ρ_0²) dρ_0 .   (7.6.18)
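As a simple worked example (added here for illustration), consider the ideal gas equation of state p = (γ − 1) ρ_0 ε of Section 7.5. In that case χ = ∂p/∂ρ_0 = (γ − 1) ε and κ = ∂p/∂ε = (γ − 1) ρ_0, so that (7.6.16) gives

c_s^2 = (\gamma - 1)\,\epsilon + \frac{p}{\rho_0^2}\,(\gamma - 1)\,\rho_0
      = (\gamma - 1)\left( \epsilon + \frac{p}{\rho_0} \right)
      = \frac{\gamma p}{\rho_0} ,

where in the last step we used p = (γ − 1) ρ_0 ε again. This is the classical expression for the adiabatic speed of sound of an ideal gas.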
The corresponding eigenvectors are found to be

e_1 = (1, v_x, v_y, v_z, ε + v²/2 − ρ_0 χ/κ) ,   (7.6.22)
e_2 = (v_y, v_y v_x, v_y² − v²/2, v_y v_z, v_y (ε − ρ_0 χ/κ)) ,   (7.6.23)
e_3 = (v_z, v_z v_x, v_z v_y, v_z² − v²/2, v_z (ε − ρ_0 χ/κ)) ,   (7.6.24)
e_± = (1, v_x ± c_s, v_y, v_z, ε + v²/2 + p/ρ_0 ± v_x c_s) .   (7.6.25)
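It is easy to check the characteristic structure described above numerically. The sketch below (an illustration, with an arbitrary fluid state and γ = 1.4) builds the Jacobian M_ij = ∂F_i/∂u_j of the fluxes (7.6.6)–(7.6.10) by finite differences and computes its eigenvalues, which should come out as v_x (three times) and v_x ± c_s with c_s² = γp/ρ_0.

import numpy as np

def flux(u, gamma=1.4):
    # Fluxes (7.6.6)-(7.6.10) in the conserved variables u = (rho0, Sx, Sy, Sz, E),
    # closed with the ideal gas equation of state p = (gamma - 1) rho0 eps.
    rho0, Sx, Sy, Sz, E = u
    vx, vy, vz = Sx / rho0, Sy / rho0, Sz / rho0
    eps = E / rho0 - 0.5 * (vx**2 + vy**2 + vz**2)
    p = (gamma - 1.0) * rho0 * eps
    return np.array([Sx, Sx * vx + p, Sy * vx, Sz * vx, (E + p) * vx])

def jacobian(u, gamma=1.4, d=1.0e-7):
    # Characteristic matrix M_ij = dF_i/du_j by centered finite differences.
    M = np.zeros((len(u), len(u)))
    for j in range(len(u)):
        du = np.zeros(len(u)); du[j] = d
        M[:, j] = (flux(u + du, gamma) - flux(u - du, gamma)) / (2.0 * d)
    return M

u = np.array([1.0, 0.5, 0.2, 0.0, 2.0])      # illustrative state (rho0, Sx, Sy, Sz, E)
print(np.sort(np.linalg.eigvals(jacobian(u)).real))

This kind of check is also a convenient way to debug analytic eigenvectors before using them in the shock capturing methods discussed in Chapter 9.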
As we will see in Chapter 9, knowing the form of the eigenvectors can be very
useful for developing numerical methods that can adequately handle shocks. On
the other hand, the concept of eigenfunction is not particularly useful in this
case since the equations are nonlinear, so that we can’t move the matrix of
eigenvectors in and out of derivatives without changing the principal part of the
system.78 In other words, there is in general no set of functions that simply prop-
agate along characteristic lines. Of course, we can always linearize the solution
around a given background state (ρˆ0 , Sˆi , Ê). In this case the eigenfunctions can
78 In the case of the Einstein field equations we can define eigenfunctions meaningfully since
even though those equations are also non-linear, they are nevertheless quasi-linear and the
nonlinearities appear in the source terms.
260 RELATIVISTIC HYDRODYNAMICS
be defined in a meaningful way and one does find that small perturbations do
travel along characteristic lines. However, even if we do this explicitly the form
of the eigenfunctions is somewhat complicated and not particularly illuminating.
There is, however, at least one simple case where looking at the eigenfunctions
is interesting. Consider a perturbation around a state in which the fluid is at rest
with uniform density and internal energy. In that case we have:
ρ0 = ρˆ0 + δρ0 , Si = δSi , E = Ê + δE , (7.6.26)
with ρˆ0 and Ê constant. We then find that the eigenvalues are λ0 = 0 with
multiplicity three (since the background speed vanishes), and λ± = ±ĉs , with ĉs
the speed of sound on the background. The three eigenfunctions associated with
λ0 turn out to be
p̂
w1 = δSy , w2 = δSz , w3 = δE − ˆ + δρ0 . (7.6.27)
ρˆ0
In other words, δSy and δSz remain at their initial values, and δE −(ˆ
+ p̂/ρˆ0 ) δρ0
also remains constant. On the other hand, the eigenfunctions associated with λ±
are found to be !
κ̂Ê κ̂
w± = χ̂ − 2 δρ0 + δE ± ĉs δSx . (7.6.28)
ρ̂0 ρˆ0
Notice now that since w± obey the evolution equations
∂t w± ± ĉs ∂x w± = 0 , (7.6.29)
then we can take an extra time derivative to find that
∂t2 w± − ĉ2s ∂x2 w± = 0 . (7.6.30)
That is, w± obey a simple scalar wave equation with speed ĉs . And since both
obey the same wave equation, it is clear that their sum and difference also do.
Finally, since (7.6.27) implies that δρ0 and δE are linearly related to each other
with constant coefficients, then we can easily deduce that all three perturbations
δρ0 , δE, and δSx obey the same scalar wave equation.
7.6.2 Relativistic case
Let us now consider the relativistic hydrodynamic equations. We will start from
the 3+1 version of these equations which for concreteness we rewrite here:
∂t D − β k ∂k D + Dk αDv k = αKD , (7.6.31)
i k
∂t S − £β S + Dk α S v + γ p = − (E + D) D α + αKS ,
i i ik i i
(7.6.32)
k
∂t E − β ∂k E + Dk αv (E + p) = (E + p + D) (αv v Kmn − v ∂m α)
k m n m
+ αK (E + p) , (7.6.33)
with D = ρ0 W , S i = ρ0 hW 2 v
i
, E = ρ0 hW 2 − p − D, and where h = 1 + + p/ρ0
is the enthalpy and W = 1/ 1 − γij v i v j is the Lorentz factor. As before, for
7.6 HYPERBOLICITY AND THE SPEED OF SOUND 261
where u is the vector of main variables (in our case the primitive variables),
each F µ is a vector of fluxes, and s is a source vector that does not involve
derivatives. Construct now the Jacobian matrices Aµij = ∂Fiµ /∂uj . The system
will be strongly hyperbolic if given any arbitrary pair of vectors ξ µ and ζ µ such
that79
ξµ ξ µ = −1 , ζµ ζ µ = 1 , ξµ ζ µ = 0 , (7.6.35)
then the matrix Aµ ξµ is invertible (its determinant is different from zero), and the
characteristic matrix M := (Aµ ξµ )−1 (Aµ ζµ ) has real eigenvalues and a complete
set of eigenvectors. Notice that the matrix M just defined plays the role of the
principal symbol in the standard definition of hyperbolicity.
In the case of the relativistic hydrodynamic equations we will take as main
variables u = (ρ0 , v x , v y , v z , ) (in that order). Notice that now it is important
to distinguish between contra-variant and co-variant indices in the speed v i , and
for the main variables we have chosen to use the contra-variant components. The
fluxes along the time direction are just the definition of the conserved quantities
themselves:
79 The notation used here would seem to suggest that F µ is a 4-vector in the language of
differential geometry. However, we are only interested in the fact that it is a collection of four
flux vectors, one of which is associated with the time direction. Also, the norm of the vectors
ξ µ and ζ µ is constructed using the Minkowski metric, as all that really matters is that ξ µ is
timelike and ζ µ spacelike.
262 RELATIVISTIC HYDRODYNAMICS
where v 2 := γij v i v j , and where now the local speed of sound cs is defined as
2 1 p
cs = χ+ 2 κ , (7.6.44)
h ρ0
with χ = ∂p/∂ρ0 and κ = ∂p/∂ as before. Notice that, in contrast to the
Newtonian case, the local speed of sound cs is now divided by the enthalpy
(compare with equation (7.6.16)).
The eigenvalues λ± have the interesting property that even though they rep-
resent the speed of modes propagating along the x direction, they nevertheless
involve the tangential speeds v y and v z through the combination v 2 (this is a con-
sequence of the relativistic length contraction). One interesting limit is obtained
when v y = v z = 0 and the spacetime metric is flat (α = 1, β i = 0, γij = δij ),
corresponding to the case of one-dimensional special relativity. In that case the
eigenvalues reduce to
v x ± cs
λ0 = v x , λ± = . (7.6.45)
1 ± v x cs
We then see that λ± is nothing more than the standard expression for the rela-
tivistic composition of the velocities v x and cs .
Another special case corresponds to the situation where the fluid is at rest
as seen by the Eulerian observers so that v x = v y = v z = 0, but the metric is
still quite general. In this case the eigenvalues take the simple form
√
λ0 = −β x , λ± = −β x ± αcs γ xx . (7.6.46)
√
The local speed of sound cs is now just corrected by the geometric factor α γ xx
and shifted by −β x .
7.6 HYPERBOLICITY AND THE SPEED OF SOUND 263
Let us again look at the expression for the speed of sound cs for an adiabatic
process. As we found in the Newtonian case, for an adiabatic process we have
p
d = dρ0 , (7.6.47)
ρ20
p
dp = χ + 2 dρ0 (7.6.48)
ρ0
This is identical to the Newtonian expression (7.6.21), except for the fact that
the relativistic expression involves the total energy density and not just the rest
mass energy density. For the particular case of an ideal gas we find again from
the polytropic relation
γp γp
c2s = = . (7.6.52)
ρ0 h ρ+p
In contrast with the Newtonian case, we now need to worry about the fact that
at high temperature the speed of sound might be larger than the speed of light
for some values of the adiabatic index γ, so in general we should ask for the value
of γ to be such that cs < 1. For an ultra-relativistic mono-atomic gas we have
γ = 4/3 so that p = (γ √ − 1)ρ0 = ρ0 /3, and also ρ ρ0 . The speed of sound
then becomes cs 1/ 3 0.58 which is still lower than the speed of light but
already quite large. More generally, inserting the equation of state for an ideal
gas into the above expression for c2s we find that
γ (γ − 1)
c2s = . (7.6.53)
1 + γ
In the limit of very high temperatures ( >> 1) this reduces to c2s γ − 1 < 1,
so that we must ask for 1 < γ < 2. Of course, if we choose γ to be a slowly
varying function of temperature, we can still have values larger than 2 at low
temperatures (this is the reason why we can sometimes consider models for
neutron stars at T = 0 that have values of γ as high as 3).
264 RELATIVISTIC HYDRODYNAMICS
∂t u + u∂x u = 0 . (7.7.1)
Notice that this is essentially an advection equation where the wave speed is
now the unknown function itself. Burgers’ equation is also the limit of the Eu-
ler equations for the case of a pressureless fluid (dust), as can be seen from
equation (7.2.36).
The behavior of the solutions to Burgers’ equation can be easily understood
by noticing that since the characteristic speed is u itself, then the change in u
along a characteristic line will be given by
d
u = ∂t u + u∂x u = 0 . (7.7.2)
dt
We can use the integral form of the conservation law to define the so-called
weak solutions of the conservation law.80 To see how this works, assume then
that we have a conservation law of the form
∂t u + ∂x F (u) = 0 . (7.7.3)
We now multiply this equation with a smooth test function φ(x, t) with compact
support (both in space and time), and integrate over space and time
∞ ∞
φ (∂t u + ∂x F ) dxdt = 0 . (7.7.4)
0 −∞
Integrating by parts the first term in time and the second term in space, and
using the fact the φ has compact support, we find
∞ ∞ ∞
(u∂t φ + F ∂x φ) dxdt = − φ(x, 0)u(x, 0)dx . (7.7.5)
0 −∞ −∞
We now say that u is a weak solution of the conservation law if the last identity
holds for all test functions φ.
To understand how weak solutions behave we must consider the so-called
Riemann problem, which is the solution corresponding to piecewise constant
initial data with a single discontinuity. As a simple example, assume that we
have initial data for Burgers’ equation of the form
'
ul x<0,
u(x, 0) = (7.7.6)
ur x>0.
We can consider two distinct possibilities. Assume first that ul > ur . In this case
there is a unique weak solution corresponding to the discontinuity propagating
with a speed s = (ul + ur )/2. To see that this is the case assume that we have a
discontinuity propagating at a speed s. We then have
xr xr
d xr
u dx = ∂t u dx = − ∂x F (u) dx = F (ul ) − F (ur ) , (7.7.7)
dt xl xl xl
where xl and xr are such that the discontinuity remains in the interior region
during the time considered, and F (u) is the flux function. On the other hand,
for a discontinuity propagating with speed s we clearly have
d xr d
u dx = [(xr − xl ) ur + st (ul − ur )] = s (ul − ur ) , (7.7.8)
dt xl dt
Fig. 7.1: Shocks and rarefaction waves. The left panel shows a shock wave, i.e. a trav-
eling discontinuity for which the speed on the left is larger than the speed on the right.
The discontinuity travels at a speed given by the Rankine–Hugoinot jump condition.
The right panel shows a rarefaction wave corresponding to the case where the speed
on the left is lower than that on the right. The solution is then an interpolating line
between states of constant u.
F (ur ) − F (ul ) [F ]
F (ul ) − F (ur ) = s (ul − ur ) ⇒ s= = , (7.7.9)
ur − ul [u]
where [·] denotes the jump of a given quantity across the discontinuity. The last
expression is known as the Rankine–Hugoinot jump condition and governs the
behavior of conservation laws across discontinuities.
In the particular case of Burgers’ equation we can easily see that the flux is
given by F (u) = u2 /2. The jump condition then implies that the speed of prop-
agation of the discontinuity must be s = (u2r /2 − u2l /2)/(ur − ul ) = (ur + ul )/2.
We have so far assumed that ul > ur . However, nothing in the derivation of
the jump condition used this assumption, so that we can expect that a propa-
gating discontinuity with speed s = (ur + ul )/2 would also be a weak solution
to Burgers’ equation when ul < ur . This is indeed true, but in that case the
weak solution turns out not to be unique. In fact, it is not even stable as we can
show that by adding even a small amount of viscosity the solution will change
completely. The stable weak solution in this case is in fact very different and has
the form ⎧
⎨ ul x < ul t ,
u(x, t) = x/t ul t ≤ x ≤ ur t , (7.7.10)
⎩
ur x > ur t .
This solution interpolates two regions of constant u with a straight line whose
slope decreases with time.
We then have two different types of weak solutions depending on the relative
sizes of ur and ul : A traveling discontinuity for ul > ur known as a shock wave,
and an interpolating solution for ur > ul known as a rarefaction wave (see
Figure 7.1).
In the case of a rarefaction wave, we have already mentioned that the weak
solution is in fact not unique. Physically, the way to choose the correct weak
solution is to consider the equation with non-vanishing viscosity, by adding a
term of the form ∂x2 u to the right hand side of the conservation law (with > 0
the viscosity coefficient), and then taking the limit of the solution when → 0. In
practice, however, this is very difficult to do so that some simpler physical criteria
7.7 WEAK SOLUTIONS AND THE RIEMANN PROBLEM 267
are required to choose among the different weak solutions. Such simpler criteria
are commonly known as entropy conditions, since they are based on an analogy
with the hydrodynamic equations where the relevant physical condition we have
to apply is to say that the entropy of the fluid elements must not decrease.
There are many different formulations of entropy conditions for conservation
laws, but in essence they can all be reduced to the statement that for a traveling
discontinuity to be an “entropy solution” (i.e. a physically acceptable solution)
the characteristic lines must converge at the discontinuity. If the characteristic
lines instead diverge from an initial discontinuity, then the entropy solution is a
rarefaction wave instead.81
In the previous discussion we have assumed that we have a single non-linear
scalar conservation law, in which case the solution of the Riemann problem is
either a shock wave or a rarefaction wave. There is in fact another possible
solution known as a contact discontinuity that corresponds to the case of a
linear conservation law for which the characteristic speed is constant and the
initial discontinuity simply moves along characteristic lines.
This implies that [u] must be an eigenvector of the characteristic matrix M , with
s the corresponding eigenvalue. That is, only jumps corresponding to eigenvectors
of M will result in a single propagating discontinuity, and they will move with
the corresponding characteristic speed.
What happens if we take as initial data a jump [u] that does not correspond
to an eigenvector of M ? After all, we are allowed to choose the initial data
freely. In this case the initial discontinuity will “split” into a group of separate
discontinuities traveling at different speeds. The original jump [u] needs to be
decomposed into a linear combination of eigenvectors of M , with each component
traveling at its own characteristic speed. Since we are considering a linear system,
all these discontinuities will correspond to simple contact discontinuities.
In the case of non-linear systems of equations like the Euler equations the
situation is more complicated, but the general idea is the same. The general
81 Rarefaction waves are often also called rarefaction fans, as characteristic lines “fan out”
solution of the Riemann problem involves separating an initial jump [u] into a
series of jumps, but now the different jumps might develop into shock waves that
satisfy the entropy condition, rarefaction waves, or simple contact discontinuities.
Contact discontinuities, in particular, can be expected to be present whenever
we have a situation where a given eigenvalue λi (u) is constant along the integral
curves of the corresponding eigenvector ei (u) (i.e the curves that are tangent to
the eigenvector), that is, if
∇u λi · ei (u) = 0 , (7.7.12)
with ∇u the gradient of λi (u) with respect to the u’s (not with respect to the
spatial coordinates). When this happens we say that the corresponding eigenfield
wi is linearly degenerate. If, on the other hand, the above expression is non-zero
for some i, then the corresponding eigenfield wi is called genuinely non-linear.
This situation does in fact occur for the hydrodynamic equations, in which case
we can see that all three eigenfields that propagate along the flow lines are
linearly degenerate, while the sound waves are genuinely non-linear. For the
linearly degenerate fields we can have at most contact discontinuities, for which
the density ρ0 and specific internal energy are discontinuous, while the pressure
p and flow speed v i remain continuous.
We will not describe here in any detail the theory behind the general solution
of the Riemann problem for non-linear systems; the interested reader can see for
example [187] and references therein. It is sufficient to say that the solution to
the Riemann problem for the hydrodynamic equations in the Newtonian case is
well known, see e.g. [105]. The corresponding solution in the relativistic case was
recently found in the case of special relativity by Marti and Müller [199], and in
the case of general relativity by Pons, Marti and Müller [229].82 A particular case
of the Riemann problem for the hydrodynamic equations, known as the shock
tube problem and consisting of a fluid that is initially at rest with a discontinuity
in both the density ρ0 and the specific internal energy (or equivalently in the
pressure p), is often used as a test of numerical hydrodynamic codes.
case can be found in the recent book by Toro [292], and another one for the relativistic case
can be found in the online version of the review paper by Marti [200].
7.7 WEAK SOLUTIONS AND THE RIEMANN PROBLEM 269
ρl M 2 (γ + 1)
= , (7.7.16)
ρr 2 + M 2 (γ − 1)
l 2γM 2 − (γ − 1) 2 + M 2 (γ − 1)
= 2 , (7.7.17)
r M 2 (γ + 1)
2 M 2 − 1 cr
vl = . (7.7.18)
M (γ + 1)
with cr = γpr /ρr = γ(γ − 1)r the local speed of sound on the right of the
shock, and where we have introduced the Mach number M defined as the ratio
of the shock speed s over cr
s s
M := = . (7.7.19)
cr γ(γ − 1)r
Using the above expressions, the jump in the pressure can also be found to be
pl 2γM 2 − (γ − 1)
= . (7.7.20)
pr γ+1
We can now invert the last relation to solve for the shock speed s in terms of the
pressure ratio to find
1/2
γ+1 pl γ−1
s = cr + . (7.7.21)
2γ pr 2γ
Now, for a shock to be moving to the right we must clearly have pl > pr , which
from the above expression implies that s > cr (M > 1), or in other words the
shock must be supersonic as in enters the region to the right. As the shock moves,
fluid elements that were initially at rest acquire a speed vl > 0 (see Figure 7.2).
270 RELATIVISTIC HYDRODYNAMICS
Fluid elements
Shock front
time
Fig. 7.2: Motion of a single shock wave. As fluid elements that are initially at rest cross
the shock front they acquire a speed vl > 0. The shock, however, still moves faster than
the fluid elements so that s > vl . The entropy of the fluid elements also increases as
they pass through the shock.
Finally, from the discussion of Section 7.5 we can find that the change in
entropy as a fluid element crosses the shock is given by
σl pl ρl
= ln − γ ln . (7.7.22)
σr pr ρr
Using the expressions above for pl /pr and ρl /ρr we can show that if M > 1 and
γ > 1, then σl > σr so that the entropy of the fluid elements increases as they
move through the shock.
assuming that the state of a fluid element is always close to equilibrium so that
we can still define a local temperature T . The state of the fluid will still be
described by the rest mass energy density ρ0 , the specific internal energy , and
the 4-velocity of the fluid elements uµ , but now the pressure is assumed to include
an extra contribution coming from the viscosity and known as the bulk viscous
pressure Π,
p→p+Π, (7.8.1)
The non-viscous part of the pressure p is still assumed to be given by the same
equation of state as before.
The stress-energy tensor of the fluid is generalized by considering the general
decomposition of a symmetric tensor in terms of parallel and normal projections
along the fluid’s 4-velocity field uµ . Doing this we find the following expression
for the stress-energy tensor
with hµν = gµν + uµ uν the projection operator into the fluid’s rest frame, and
where q µ and πµν are a vector and a symmetric tensor such that
The vector q µ can be interpreted as the energy flux in the fluid’s frame, or
in other words the heat flow. The tensor πµν , on the other hand, represents
the presence of possible anisotropic stresses (the pressure corresponds to the
isotropic stresses). The quantities Π, q µ and παβ together are usually called the
thermodynamic fluxes.
The dynamics of the fluid are still obtained from the conservation laws
∇µ (ρ0 uµ ) = 0 , ∇µ T µν = 0 . (7.8.4)
with ρ = ρ0 (1 + ), and where σµν is the shear tensor associated with the fluid
motion, i.e the symmetric tracefree part of the projection of ∇µ uν :
1
σµν := hα h
µ ν
β
∇ u
(α β) − h ∇
αβ λ u λ
. (7.8.7)
3
It is clear that the energy-momentum conservation involves the thermodynamic
fluxes Π, q µ and πµν . The system of equations will therefore not be closed unless
we can relate these fluxes to quantities associated with the fluid state, also known
as the thermodynamic potentials.
The standard approach due to Eckart starts by considering the entropy cur-
rent S µ and asking for its covariant divergence (i.e. the entropy production) to
be non-negative, in accordance with the second law of thermodynamics:
∇µ S µ ≥ 0 . (7.8.8)
In the case of a perfect fluid the entropy current is simply given by S = σρ0 uµ ,
µ
with σ the specific entropy, and as we have seen the conservation equations
directly imply ∇µ S µ = 0. When there is heat flow q µ , however, it contributes to
the entropy flux with a term given q µ /T (dS = dQ/T ), with T the temperature
of the fluid element, so that
qµ
S µ = σρ0 uµ + . (7.8.9)
T
Calculating the divergence and using the conservation of particles we find that
dσ
T ∇µ S µ = ρ0 T − q µ ∇µ ln T + ∇µ q µ . (7.8.10)
dτ
Using now the conservation equations (7.8.5) and (7.8.6), together with the first
law of thermodynamics in the form
1
T dσ = d + p d , (7.8.11)
ρ0
we can rewrite the entropy divergence as
duµ
T ∇µ S = − Π ∇µ u + q
µ µ µ
+ ∇µ ln T + π σµν .
µν
(7.8.12)
dτ
It is now clear that we can guarantee that this divergence will always be posi-
tive if we postulate a linear relation between the thermodynamic fluxes and the
thermodynamic potentials of the form
duµ
qµ = −χ T + hµ ∇ν T ,
ν
(7.8.13)
dτ
Π = −ζ∇µ uµ , (7.8.14)
πµν = −2ησµν , (7.8.15)
with χ, ζ and η positive parameters known respectively as the coefficients of heat
conduction, bulk viscosity and shear viscosity (the projection operator applied
7.8 IMPERFECT FLUIDS 273
The Eckart theory just presented has a serious drawback in the sense that it
is not causal. This can be seen from the fact that if a thermodynamic potential
is turned off, the corresponding flux vanishes instantaneously. To make this more
concrete consider the problem of heat flow for a fluid at rest in flat spacetime. In
that case we have uµ = δ0µ and also q 0 = 0, so the conservation equation (7.8.6)
reduces to
∂t ρ = − ∂i qi , (7.8.16)
i
For a fluid at rest we also have ∂t ρ = ρ0 ∂t = ρ0 cV ∂t T , which implies
ρ0 cV ∂t T = − ∂i qi . (7.8.17)
i
On the other hand, the heat flux is now given by qi = −χ∂i T , so that we find
ρ0 cV ∂t T = χ∇2 T , (7.8.18)
with ∇ the ordinary flat space Laplacian. This is nothing more than the stan-
2
be constant and might in fact depend on the fluid variables ρ and , but for
simplicity we will take them to be constant in what follows.
We can now calculate the divergence of the entropy flux (7.8.19) in the same
way as before to obtain
µ
dΠ β0 T u
T ∇µ S µ = −Π ∇µ uµ + β0 + ∇µ
dτ 2 T
ν
duµ dqµ β1 T u
−q µ + ∇µ ln T + β1 + qµ ∇ν
dτ dτ 2 T
λ
dπµν β2 T u
−π µν σµν + β2 + πµν ∇λ . (7.8.20)
dτ 2 T
with the coefficients χ, ζ and η the same as before, and where we have again
introduced the projection operator hµν to guarantee that q µ and πµν remain
orthogonal to uµ . The terms proportional to ∇ν (uν /T ) are usually omitted as
it can be argued that they are small compared to the other terms. In that case
we can rewrite the last expressions as
dΠ
τ0 + Π = −ζ∇µ uµ , (7.8.24)
dτ
ν dqν duµ
τ1 hµ + qµ = −χ T + hµ ∇ν T ,
ν
(7.8.25)
dτ dτ
β dπαβ
τ2 hα
µ hν + πµν = −2ησµν , (7.8.26)
dτ
where we have introduced the shorthand
τ0 = ζβ0 , (7.8.27)
τ1 = χT β1 , (7.8.28)
τ2 = 2ηβ2 . (7.8.29)
they are evolution equations instead of simple algebraic relations as before. The
parameters τi play the role of relaxation times. We now see that if a thermody-
namic potential is suddenly switched off, the corresponding flux will die away
slowly in a time of order τi . The values of the τi , or correspondingly the βi , can
be roughly estimated as the mean collision time between particles. The theory
just presented is known as the (truncated) Israel–Stewart theory of relativistic
irreversible thermodynamics [167].
To see how this theory solves the causality problem, consider again heat
flow in a static fluid in flat spacetime. As before, the conservation equation still
implies
ρ0 cV ∂t T = − ∂i qi . (7.8.30)
i
τ1 ∂t qi = −χ∂i T − qi . (7.8.31)
and finally
χ
τ1 ∂t2 T − ∇2 T + ∂t T = 0 . (7.8.33)
ρ 0 cV
Instead of the standard parabolic heat equation we have now obtained a damped
wave equation. The new equation is clearly hyperbolic, so the speed of propaga-
tion is now finite.
8
GRAVITATIONAL WAVE EXTRACTION
8.1 Introduction
Gravitational waves, that is perturbations in the geometry of spacetime that
propagate at the speed of light, are one of the most important predictions of
general relativity. Though such gravitational radiation has not yet been detected
directly, there is strong indirect evidence of its existence in the form of the
now famous binary pulsar PSR 1913+16, whose change in orbital period over
time matches to very good accuracy the value predicted by general relativity
as a consequence of the emission of gravitational waves [164, 282]. Moreover,
there is every reason to believe that the new generation of large interferometric
observatories (LIGO, VIRGO, GEO 600, TAMA) will finally succeed in detecting
gravitational radiation within the next few years.
Gravitational waves are one of the most important physical phenomena as-
sociated with the presence of strong and dynamic gravitational fields, and as
such they are of great interest in numerical relativity. Gravitational radiation
can carry energy and momentum away from an isolated system, and it encodes
information about the physical properties of the system itself. Predicting the
gravitational wave signal coming from astrophysical systems has been one of the
main themes in numerical relativity over the years, because such predictions can
be used as templates that can significantly improve the possibility of detection.
There are two main approaches to the extraction of gravitational wave in-
formation from a numerical simulation. For a number of years the traditional
approach has been based on the theory of perturbations of a Schwarzschild
spacetime developed originally by Regge and Wheeler [235], Zerilli [308, 309],
and a number of other authors, and later recast as a gauge invariant framework
by Moncrief [209]. In recent years, however, it has become increasingly common
in numerical relativity to extract gravitational wave information in terms of the
components of the Weyl curvature tensor with respect to a frame of null vectors,
using what is known as the Newman–Penrose formalism [218]. In the following
Sections, I will present a brief introduction to both these approaches, and will de-
scribe how we can calculate the energy and momentum radiated by gravitational
waves in each case.
Finally, a word of warning. Unfortunately, though the main ideas and results
presented here are well known, there are no standard conventions in either the
definitions or the notation used for many of the concepts introduced here (in
particular, sign conventions in the definitions of Weyl scalars, spacetime invari-
ants, etc., can be very different from author and to author). Here I will use of
276
8.2 PERTURBATIONS OF SCHWARZSCHILD 277
notation and definitions that are common, though not universal, and I will try to
keep the conventions consistent throughout. However, the reader is advised not
to mix expressions taken from different sources without first carefully comparing
the conventions used.
with Ωab the metric of the unit two-sphere, Ωab = diag(1, sin2 θ). We will also
distinguish covariant derivatives in the full spacetime from covariant derivatives
in the sub-manifolds in the following way: ∇µ will represent covariant deriva-
tives in M, while DA and Da will denote covariant derivatives in M 2 and S 2
respectively.
l,m 1
Db Zab = − (l − 1)(l + 2) Yal,m , (8.2.9)
2
where in order to derive this expression we need to use the fact that the Ricci
tensor for the two-sphere is just equal to the angular metric Rab = Ωab .
8.2 PERTURBATIONS OF SCHWARZSCHILD 279
Odd parity tensor harmonics, on the other hand, can be constructed in only
one way, namely
l,m 1
Xab = Da Xbl,m + Db Xal,m . (8.2.10)
2
From the definition of the Xal,m we can easily see that the odd tensor harmonics
l,m
are again traceless. We also find for the divergence of Xab
l,m 1
Db Xab = − (l − 1)(l + 2) Xal,m . (8.2.11)
2
At this point it is important to mention the fact that, since Y 00 is a constant,
both vector and tensor harmonics vanish for l = 0. On the other hand, for l = 1
the vector harmonics do not vanish, but the tensor harmonics can still be easily
shown to vanish from the explicit expressions for Y 1m . This means that vector
harmonics are only non-zero for l ≥ 1, and tensor harmonics for l ≥ 2. The scalar
mode with l = 0 can be interpreted as a variation in the mass of the Schwarzschild
spacetime, the odd vector mode with l = 1 as an infinitesimal contribution to the
angular momentum (the “Kerr” mode [252]), while the scalar and even vector
modes with l = 1 are just gauge [197]. Also, for any given value of l, the index
m can only take integer values from −l to l.
Having defined the vector and tensor harmonics, the perturbed metric is
expanded in multipoles, and separated into its even sector given by
(e)
hl,m
AB
l,m
= HAB Y l,m , (8.2.12)
(e)
hl,m
Ab = HAl,m
Ybl,m , (8.2.13)
(e)
hl,m
ab
l,m
= r2 K l,m Ωab Y l,m + Gl,m Zab , (8.2.14)
l,m l,m
where the coefficients (HAB , HA , K l,m , Gl,m , hl,m
A ,h
l,m
) are in general functions
of r and t.
The vector and tensor harmonics introduced above are related to the spin-
weighted spherical harmonics defined in Appendix D. In order to find this rela-
tion, consider the two unit complex vectors (these vectors will appear again in
Section 8.5 when we discuss the Newman–Penrose formalism)
280 GRAVITATIONAL WAVE EXTRACTION
1 1
ma := √ (1, i sin θ) , m̄a := √ (1, −i sin θ) , (8.2.18)
2 2
where z̄ denotes the complex conjugate of z. In terms of (ma , m̄a ), we find that
the vector harmonics Yal,m and Xal,m can be written as
1/2
l(l + 1)
Yal,m = −1 Y
l,m
ma − 1 Y l,m m̄a , (8.2.19)
2
1/2
l(l + 1)
Xal,m = −i −1 Y
l,m
ma + 1 Y l,m m̄a . (8.2.20)
2
Xab =− −2 Y
l,m
ma mb − 2 Y l,m m̄a m̄b , (8.2.22)
2 (l − 2)!
where in order to derive the last expressions we have again used the fact that
the scalar harmonics are eigenfunctions of the Laplace operator on the sphere.
Notice again that, since the spin-weighted harmonics are defined only for l > |s|,
the above expressions are consistent with the fact that vector harmonics start
with l = 1 and tensor harmonics with l = 2.
xµ → xµ + ξ µ , (8.2.23)
The only difference between this expression and that of Chapter 1 is the fact that
we now have a non-trivial background metric and, as a consequence, the gauge
transformation involves covariant derivatives with respect to the background
metric.
It turns out that we can in fact construct gauge invariant combinations of
l,m l,m
the coefficients (HAB , HA , K l,m , Gl,m ). In order to do this, let us consider an
8.2 PERTURBATIONS OF SCHWARZSCHILD 281
We now need to use this expression to find the transformation of the coefficients
corresponding to an even perturbation. We start from the expression for the
mixed Christoffel symbols which can be easily shown to be
rB a
ΓA bc = −rr Ωbc ,
a
Bc = ΓBC = 0 , ΓA A
ΓaBc = δ , (8.2.26)
r c
where rA := DA r. Using this we find that
l,m
∇A ξB = DA EB Y l,m , (8.2.27)
l,m
rA l,m l,m
∇A ξb = DA E l,m − E Yb , (8.2.28)
r
l,m
l,m rB
∇a ξB = EB − E l,m Yal,m , (8.2.29)
r
l,m
∇a ξb = Da Ybl,m E l,m + rΩab rC EC
l,m l,m
Y . (8.2.30)
l,m
l,m l,m
The gauge transformation of (HAB , HA , K l,m , Gl,m ) then takes the form
l,m l,m l,m l,m
HAB → HAB − DA EB − DB EA , (8.2.31)
l,m l,m l,m 2r A
HA → HA − EA − DA E l,m + E l,m , (8.2.32)
r
2 l,m l(l + 1) l,m
K l,m → K l,m − rA EA + E , (8.2.33)
r r2
2
Gl,m → Gl,m − 2 E l,m . (8.2.34)
r
We can now use these results to construct gauge invariant combinations of coef-
ficients. Two such invariant combinations are [143]
1 2
K̃ l,m := K l,m + l(l + 1) Gl,m − rA εl,m
A , (8.2.35)
2 r
l,m l,m
H̃AB := HAB − DA εl,m l,m
B − DB ε A , (8.2.36)
with
1 2
εl,m l,m
A := HA − r DA Gl,m . (8.2.37)
2
Instead of looking for gauge invariant combinations we can choose to work
on a specific gauge. For example, from the transformations above it is clear that
282 GRAVITATIONAL WAVE EXTRACTION
l,m l,m
we can choose E l,m and EA such that Gl,m = HA = 0. This is known as
the Regge–Wheeler gauge, and with that choice we clearly has K̃ l,m = K l,m ,
l,m l,m
H̃AB = HAB .
l,m
In terms of the gauge invariant perturbations K̃ l,m and H̃AB we define the
so-called Zerilli–Moncrief master function as
2r 2 A B l,m
Ψl,m
even := K̃ l,m + r r H̃AB − rrA DA K̃ l,m , (8.2.38)
l(l + 1) Λ
The Zerilli equation is specially useful in the frequency domain, for which
we assume that Ψl,m even (t, r) = exp(iωn t)F (r), with ωn some complex frequency.
Substituting this into the Zerilli equation, and imposing boundary conditions
such that we have purely outgoing waves at spatial infinity (r∗ → ∞) and purely
ingoing waves at the black hole horizon (r∗ → −∞), we obtain an eigenvalue
problem for the complex frequencies ωn . The different values of ωn found in this
way give us the allowed frequencies of oscillation of the perturbations (the real
part of ωn ) and their corresponding decay rates (the imaginary part of ωn ). Such
solutions are known as the quasi-normal modes of the black hole.
We will not go into a discussion of the quasi-normal modes of Schwarzschild,
but we will mention a few important points. First, all the resulting modes are
damped, with the higher frequency (shorter wavelength) modes damped the
fastest. The so-called fundamental mode l = 2, which is often the most exited
and decays the slowest, has a wavelength of Re(ω) ∼ 16.8M . When the black
hole is excited (e.g. by throwing matter into it), it behaves much like a bell in
the sense that it oscillates at its characteristic frequencies, losing energy in the
form of gravitational waves, and eventually settling down to a stationary state.
Such behavior is known as “ringing”, and we typically find that the late time
gravitational wave signal coming from a system that has collapsed to a single
black hole can always be matched to a superposition of the quasi-normal modes
of the final black hole. The reader interested in studying these issues in more
detail is directed to references [96, 175].
8.2 PERTURBATIONS OF SCHWARZSCHILD 283
∇A ξB = 0 , (8.2.41)
rA l,m l,m
∇A ξb = DA E l,m − E Xb , (8.2.42)
r
l,m
rB
∇a ξB = − E l,m Xal,m , (8.2.43)
r
l,m
∇a ξb = Da Xbl,m E l,m , (8.2.44)
l,m
1 rA l,m
h̃l,m l,m
A := hA − DA hl,m + h . (8.2.47)
2 r
Notice that, from the transformations above, it is clear that we can always choose
E l,m such that hl,m = 0, in which case we will have h̃l,m l,m
A = hA , corresponding
to the Regge–Wheeler gauge.
In terms of h̃l,m
A we define the Cunningham–Price–Moncrief master function:
2r 2rA l,m
Ψl,m := AB
D h̃
A B
l,m
− h̃
odd
(l − 1)(l + 2) r B
2r 2rA l,m
= AB DA hl,m − h . (8.2.48)
(l − 1)(l + 2) B
r B
l,m
But we have already seen that Zab is itself traceless, so the last condition reduces
to K l,m = 0. The Zerilli–Moncrief function defined in (8.2.38) then simplifies to
Ψl,m
even = r G
l,m
. (8.2.55)
8.2 PERTURBATIONS OF SCHWARZSCHILD 285
We then see that the contribution from the even perturbations to the TT metric
functions is given in terms of Ψl,m
even as
!
l,m
Ψl,m l,m Zϕϕ
+ l,m
(h )even = even
Zθθ −
2r sin2 θ
Ψl,m
even ∂2 1
= + l(l + 1) Y l,m , (8.2.56)
r ∂θ2 2
l,m
!
× l,m Ψl,m
even
Zθϕ
(h )even =
r sin θ
Ψl,m im ∂
= even
− cot θ Y l,m , (8.2.57)
r sin θ ∂θ
We now need to relate hl,m to Ψl,m odd . In this case, however, we can not just
l,m
ignore hA in favor of h l,m
since from the definition of Ψl,m odd , equation (8.2.48),
we see that it in fact depends only on hl,m
A and not on h l,m
. However, in the TT
gauge these quantities are related to each other. In order to see this, consider
the transverse condition on hµa :
∇µ hµa = 0 . (8.2.60)
Using now the multipole expansion and the expression for the mixed Christoffel
symbols we find that this implies
Xal,m DA r2 hl,m
A
l,m
+ hl,m Db Xab =0. (8.2.61)
l,m
Substituting now the divergence of Xab from equation (8.2.11), this becomes
1
DA r2 hl,m
A = (l − 1)(l + 2) hl,m . (8.2.62)
2
Now, remember also that in the TT gauge we have the extra freedom of taking
hµν to be purely spatial, so that hµ0 = 0. Also, asymptotically the metric gAB
is just the Minkowski metric. The previous expression then reduces to
286 GRAVITATIONAL WAVE EXTRACTION
1
∂r r2 hl,m
r = (l − 1)(l + 2) hl,m . (8.2.63)
2
2r
Ψl,m
odd = ∂t hl,m . (8.2.64)
(l − 1)(l + 2) r
For an outgoing wave we have ∂t hl,m ∼ −∂r hl,m , so that we can integrate the
above expression to find
hl,m ∼ −r Ψl,m
odd . (8.2.66)
We can then rewrite the odd metric perturbations as
!
+ l,m Ψl,m l,m Xϕϕl,m
(h )odd = − odd
Xθθ −
2r sin2 θ
Ψl,m im ∂
= − odd − cot θ Y l,m , (8.2.67)
r sin θ ∂θ
l,m
!
l,m
Ψodd Xθϕ
(h× )l,m
odd = −
r sin θ
l,m 2
Ψ ∂ 1
= odd + l(l + 1) Y l,m . (8.2.68)
r ∂θ2 2
Notice that since the above expressions involve only tensor harmonics, the sum
over l starts with l = 2, while the sum over m goes from −l to l. In what follows
we will always assume this to be the case, and will never write the summation
limits explicitly.84
84 For summations involving coefficients with indices l = l ± 1 and m = m ± 1, such as those
that will appear when discussing the linear and angular momentum carried by gravitational
waves in Section 8.9, we should only remember that the expansion coefficients vanish whenever
l < 2 and |m | > l.
8.2 PERTURBATIONS OF SCHWARZSCHILD 287
Before leaving this section it is important to mention one very common con-
vention used in numerical relativity. We start by considering a different odd
master function originally introduced by Moncrief [209]:
l,m 2rA h̃l,m 2rA l,m 1 rA l,m
QM := A
= hA − D A h + l,m
h . (8.2.73)
r r 2 r
Notice that with this definition Ql,m M is clearly gauge invariant and a scalar.
The Moncrief function just defined has traditionally been the most common
choice for studying odd perturbations of Schwarzschild, and because of this,
many existing numerical implementations use this function instead of Ψl,modd . Us-
ing again (8.2.63), we can show that asymptotically and in the TT gauge Q̃l,m M
reduces to
2r 2r
Ql,m
M ∼− ∂r2 hl,m ∼− ∂ 2 hl,m , (8.2.74)
(l − 1)(l + 1) r
(l − 1)(l + 1) t r
so that
Ql,m l,m
M ∼ −∂t Ψodd . (8.2.75)
We finally introduce the following rescaling
1/2
l,m (l + 2)!
Qeven := Ψl,m
even , (8.2.76)
2(l − 2)!
1/2 1/2
l,m (l + 2)! l,m (l + 2)!
Qodd := QM = − ∂t Ψl,m
odd . (8.2.77)
2(l − 2)! 2(l − 2)!
Again, the fact that h+ and h× are real implies that
Q̄l,m m l,−m
even = (−1) Qeven , Q̄l,m m l,−m
odd = (−1) Qodd . (8.2.78)
l,m
In terms of Ql,m
even and Qodd we now find for the complex coefficient H:
t
1 l,m
H= √ Qeven − i Ql,m
odd dt −2 Y
l,m
. (8.2.79)
2r l,m −∞
288 GRAVITATIONAL WAVE EXTRACTION
not apply for dimensions less than three, since being constructed from the Riemann tensor,
the Ricci tensor can not have more independent components. In the particular case of only one
dimension the Riemann tensor in fact vanishes, i.e. a single line has no intrinsic curvature.
8.3 THE WEYL TENSOR 289
1
∇α C α βµν = ∇[µ Rν]β + gβ[µ ∇ν] R , (8.3.5)
6
which through the Einstein field equations reduces to
1
∇α C α βµν = 8π ∇[µ Tν]β + gβ[µ ∇ν] T . (8.3.6)
3
We then see that, in vacuum, the Weyl tensor has zero divergence.
Given an arbitrary timelike unit vector nµ , we define the electric Eµν and
magnetic Bµν parts of the Weyl tensor as
∗ 1
Cαβµν := Cαβλσ λσ µν , (8.3.9)
2
with αβµν the Levi–Civita completely antisymmetric tensor. The word “dual”
refers in general to a transformation such that when applied twice it returns you
to the original situation with at most a change of sign (or a factor ±i). In this
case we find that
∗∗
Cαβµν = −Cαβµν . (8.3.10)
To prove this we need to use the fact that the contraction of two Levi–Civita
symbols in an n-dimensional space is given in general by
µ1 ···µp α1 ···αn−p µ1 ···µp β1 ···βn−p = (−1)s p! (n − p)! δβ11 · · · δβn−p
[α α ]
n−p
, (8.3.11)
where s is the number of negative eigenvalues of the metric (which in the case
of a Lorentzian spacetime is 1).
The names of the electric and magnetic tensors come from the fact that when
we rewrite the Bianchi identities in terms of Eij and Bij , the resulting equations
have the same structure as Maxwell’s equations.
The symmetries of Weyl imply that the electric and magnetic tensors are
both symmetric, traceless, and spacelike in the sense that
The symmetry of the magnetic part is not evident from its definition, but it is
in fact not difficult to prove (see below). This means that both Eµν and Bµν
have five independent components, giving a total of 10. Since this is precisely
290 GRAVITATIONAL WAVE EXTRACTION
Cαβµν = 2 lα[µ Eν]β − lβ[µ Eν]α − n[µ Bν]λ λ αβ − n[α Bβ]λ λ µν , (8.3.13)
with
lµν := gµν + 2nµ nν . (8.3.14)
If we now take the vector nµ to be the unit normal vector to the spacelike
hypersurfaces in the 3+1 formalism, we can use the Gauss–Codazzi and Codazzi–
Mainardi equations (2.4.1) and (2.4.2) to write the electric and magnetic tensors
in 3+1 language as
γij
Eij = Rij + KKij − Kim K m j − 4π Sij + (4ρ − S) , (8.3.15)
3
Bij = i mn [Dm Knj − 4πγjm jn ] , (8.3.16)
with ρ, Sij and ji the energy density, stress tensor and momentum density mea-
sured by the Eulerian observers, and where ijk is now the Levi–Civita tensor in
three dimensions which is constructed from αβµν as 86
Notice that the Hamiltonian constraint (2.4.10) guarantees that the trace of Eij
vanishes, while the trace of Bij can be seen to vanish trivially from the anti-
symmetry of the Levi–Civita tensor. In contrast, the symmetry of Eij is evident,
while the symmetry of Bij comes about through the momentum constraints. To
see this consider the contraction of Bij with the Levi–Civita tensor
Now, from the general expression for the contraction of two Levi–Civita symbols
given above we find
iab imn = δm δn − δm
a b b a
δn , (8.3.19)
which implies
j a
= [Dm Kma
− Da K − 8πj a ] = 0 , (8.3.20)
where the last equality follows from the momentum constraints (2.4.11). This
shows that Bij is indeed a symmetric tensor.
86 The definition of the three-dimensional
ijk is one of those few places where it actually
makes a difference to define the spacetime coordinates as (x0 , x1 , x2 , x3 ) = (t, x, y, z) or as
(x1 , x2 , x3 , x4 ) = (x, y, z, t). Notice that, even in Minkowski spacetime, if we take the first
choice we find 123 = nµ µ123 = 0123 = 1, while with the second choice we have instead
123 = nµ µ123 = 4123 = −1. So that when we work with (x1 , x2 , x3 , x4 ) = (x, y, z, t) it is in
fact better to define αβµ := nν αβµν .
8.4 THE TETRAD FORMALISM 291
e(a) ·
e(b) = η(a)(b) , (8.4.1)
e (a) = η (a)(b)
e (a) . (8.4.3)
From this definition it is easy to show that these vectors are such that
(a)
e (a) ·
e(b) = δ(b) . (8.4.4)
Figure 8.1 shows a graphical representation of the relationship between the two
sets of vectors.
The tetrad vectors {
e(a) } can be used as a basis, so that we can express any
arbitrary vector
v as
v = v (a)
e(a) . (8.4.5)
In order to solve for v (a) , we multiply both sides of the above expression with
{
e (b) } to obtain
e (b) ·
v =
e (b) · v (a)
e(a) = v (a)
e (b) ·
e(a) . (8.4.6)
e (2)
e (1)
e (2)
e (1)
Fig. 8.1: Starting from the two vectors (e(1) , e(2) ), we construct a new vectors (e (1) , e (2) )
such that e(1) · e (2) = 0, e(2) · e (1) = 0, e(1) · e (1) = 1, e(2) · e( 2) = 1.
v = v (a)
e(a) = v(a)
e (a) , (8.4.8)
with
v (a) =
v ·
e (a) , v(a) =
v ·
e(a) . (8.4.9)
This implies, in particular, that the scalar product of any two vectors
v and
u
takes the form
v ·
u = v µ uµ = v (a) u(a) . (8.4.10)
The scalar product can then be expressed either as the contraction of the co-
variant and contra-variant spacetime components, or the contraction of the cor-
responding components in terms of the tetrad.
Using the above results we also find that
v = v (a)
e(a) = (
v ·
e (a) )
e(a) = (v µ e (a) µ )(e(a) ν
eν ) , (8.4.11)
In this expression, the partial derivatives can in fact be changed for covariant
derivatives since the Christoffel symbols cancel out. We can then rewrite the
directional derivative as
From the definition of the Ricci rotation coefficients it is also easy to show that
∇(a)
e(b) ≡ e(a) µ ∇µ
e(b) = γ (c) (b)(a)
e(c) . (8.4.22)
We then see that the γ (c) (b)(a) are nothing more than the connection coefficients
in the tetrad basis. Notice, however, that since the tetrad is not a coordinate
basis, in general we will have that
γ (c) (a)(b)
= γ (c) (b)(a) . (8.4.23)
∇µ e(a) ν e(b)ν = 0 ⇒ e(a) ν ∇µ e(b)ν = −e(b)ν ∇µ e(a) ν , (8.4.24)
which implies
γ(a)(b)(c) = −γ(b)(a)(c) . (8.4.25)
The Ricci rotation coefficients are therefore antisymmetric in their first two in-
dices. This means that in a general four-dimensional spacetime there are only 24
independent γ(a)(b)(c) , which is in contrast with the 40 independent components
of the Christoffel symbols Γα µν in a coordinate basis. This is another advantage
of the tetrad approach.
We now define the intrinsic derivative of a vector A
as
eµ(1) = eµr , and (eµ(2) , eµ(3) ) as unit vectors in the angular directions. Notice, how-
ever, that even in flat space the coordinate vectors eµθ and eµϕ are not unit vectors,
so they have to be normalized. Moreover, in a general spacetime eµθ and eµϕ can
not be expected to be orthogonal to eµr or to each other, so a Gram–Schmidt
orthogonalization procedure is required in order to construct eµ(2) and eµ(3) .
Once we have an orthonormal basis, we can construct the two null vectors:
1 1
lµ := √ eµ(0) + eµ(1) , k µ := √ eµ(0) − eµ(1) . (8.5.2)
2 2
With the above choices for eµ(0) and eµ(1) , lµ is an outgoing null vector while k µ
is ingoing.87 As long as we only consider real quantities we can not construct
two more null vectors that are at the same time independent of lµ and k µ and
orthogonal to them, but this is no longer the case if we allow for complex vectors,
in which case we can define the vectors:
1 1
mµ := √ eµ(2) + ieµ(3) , m̄µ := √ eµ(2) − ieµ(3) . (8.5.3)
2 2
The four complex vectors (lµ , k µ , mµ , m̄µ ) form what is known as a null tetrad,
and are such that
D := lµ ∇µ , ∆ := k µ ∇µ , δ := mµ ∇µ , δ̄ := m̄µ ∇µ . (8.5.9)
The Ricci rotation coefficients are now called spin coefficients, and are also given
individual names. We will separate them into two groups:
87 It is standard notation to use nµ instead of k µ as the ingoing null vector. Here, however,
and
1 1 µ
:= γ(2)(1)(1) − γ(4)(3)(1) = (k Dlµ − m̄µ Dmµ ) , (8.5.18)
2 2
1 1 µ
γ := γ(2)(1)(2) − γ(4)(3)(2) = (k ∆lµ − m̄µ ∆mµ ) , (8.5.19)
2 2
1 1 µ
β := γ(2)(1)(3) − γ(4)(3)(3) = (k δlµ − m̄µ δmµ ) , (8.5.20)
2 2
1 1 µ
α := γ(2)(1)(4) − γ(4)(3)(4) = k δ̄lµ − m̄µ δ̄mµ . (8.5.21)
2 2
We then have 12 complex spin coefficients, corresponding to 24 independent real
quantities, as expected. The different names introduced above have historical
origins (notice in particular that the four coefficients {τ , κ , ρ , σ } have in fact
two different names).
Several of the spin coefficients have clear geometrical interpretations. For
example, since lν ∇ν lµ is a vector we can expand it in terms of the tetrad as
The values of the coefficients can be obtained by taking contractions with all
four tetrad vectors and using conditions (8.5.4)–(8.5.6). We then find that
This implies that if κ = 0, then the flow lines of lµ are geodesics. Moreover, if
we also have + ¯ = 0 then those geodesics are parameterized with an affine
parameter.88 If we now change lµ for k µ , we find in a similar way that the k µ
flow lines are geodesics if κ = 0, and have an affine parameter if γ + γ̄ = 0.
Moreover, by expanding ∇µ lν as a linear combination of products of tetrad
vectors we can also show that if + ¯ = 0, then
∇µ lµ = (ρ + ρ̄) . (8.5.24)
88 If a geodesic is not parameterized with an affine parameter then under parallel transport
the tangent vector is allowed to change as long as it keeps its direction (locally) fixed, so that
we have lµ ∇ν lµ ∝ lµ .
8.5 THE NEWMAN–PENROSE FORMALISM 297
This shows that the real part of ρ corresponds to the expansion of the lµ flow
lines. Similarly, we can show that the shear of the lµ flow lines (i.e. the change
in shape) is related to σσ̄, and the rotation is related to (ρ − ρ̄).
8.5.2 Tetrad transformations
The definition of the null tetrad (lµ , k µ , mµ , m̄µ ) is based on the choice of the
original orthonormal tetrad {
e(a) }. We can, of course, change this tetrad by an
arbitrary rotation in space plus a Lorentz boost in a given direction while still
keeping it orthonormal. We then have six degrees of freedom corresponding to
possible transformations of the tetrad that will not change the formalism just
developed. Such transformations are usually separated into three distinct classes:
Notice that, as with all other Newman–Penrose quantities, the Ψa are scalars
with respect to coordinate transformations, but they clearly depend on the choice
of the null tetrad. These five complex scalars are enough to specify all 10 inde-
pendent components of the Weyl tensor. The symmetries of the Weyl tensor
imply that all other possible contractions of Cαβµν with combinations of the
tetrad vectors either vanish or can be expressed as combinations of the Ψa .
Similarly, the 10 independent components of the Ricci tensor can also be
described in terms of the Ricci scalars, which are separated into four real scalars
(again, the sign convention is not universal)
1
Φ00 := R(1)(1) , (8.6.6)
2
1
Φ11 := R(1)(2) + R(3)(4) , (8.6.7)
4
1
Φ22 := R(2)(2) , (8.6.8)
2
1 1
Λ := R= R(3)(4) − R(1)(2) , (8.6.9)
24 12
1 1
Φ01 := R(1)(3) = R̄(1)(4) = Φ̄10 , (8.6.10)
2 2
1 1
Φ02 := R(3)(3) = R̄(4)(4) = Φ̄20 , (8.6.11)
2 2
1 1
Φ12 := R(2)(3) = R̄(2)(4) = Φ̄21 . (8.6.12)
2 2
We have introduced these Ricci scalars here for completeness, but will not use
them in what follows.
8.7 THE PETROV CLASSIFICATION 299
Using the definition of the electric and magnetic parts of the Weyl tensor,
equations (8.3.7) and (8.3.8), we can rewrite the Weyl scalars as
Ψ0 = Qij mi mj , (8.6.13)
1
Ψ1 = − √ Qij mi ejr , (8.6.14)
2
1
Ψ2 = Qij eir ejr , (8.6.15)
2
1
Ψ3 = √ Qij m̄i ejr , (8.6.16)
2
Ψ4 = Qij m̄i m̄j , (8.6.17)
with
Qij := Eij − iBij , (8.6.18)
and where
er is the unit radial vector. These expressions can be easily obtained
starting from equation (8.3.13) and using the fact that, for any arbitrary three-
dimensional vector v i , the following relations hold:
The expressions given above for the Ψa in terms of the electric and magnetic
tensors provide us with a particularly simple way of calculating these scalars in
the 3+1 approach: We start from the 3+1 expressions for Eij and Bij , equa-
tions (8.3.15) and (8.3.16), and then use these tensors to construct the Ψa .
necessary for the following discussion). For a class I transformation the different
Weyl scalars can be shown to transform as
Ψ0 → Ψ0 , (8.7.1)
Ψ1 → Ψ1 + āΨ0 , (8.7.2)
Ψ2 → Ψ2 + 2āΨ1 + ā2 Ψ0 , (8.7.3)
Ψ3 → Ψ3 + 3āΨ2 + 3ā2 Ψ1 + ā3 Ψ0 , (8.7.4)
Ψ4 → Ψ4 + 4āΨ3 + 6ā2 Ψ2 + 4ā3 Ψ1 + ā4 Ψ0 , (8.7.5)
There is one very important point to notice about the above transformations.
Simple inspection shows that the transformations of (Ψ3 , Ψ2 , Ψ1 , Ψ0 ) can be
obtained by taking subsequent derivatives with respect to ā of the transformation
of Ψ4 , with appropriate rescalings.
Similarly, under a class II transformation we find that
The above equation has in general four complex roots. The resulting directions
associated with the new vector lµ ,
are known as the principal null directions of the Weyl tensor. When some of
the roots of equation (8.7.11) coincide the spacetime is said to be algebraically
special. This leads to the Petrov classification that separates different spacetimes
into six types according to the number of distinct root of (8.7.11):
Petrov type I: All four roots are distinct: b1 , b2 , b3 , b4 . In this case we can make
a transformation of class II, with b equal to any of the distinct roots, that will
result in Ψ0 = 0. Furthermore, we can later make a transformation of class
I to make Ψ4 also vanish while keeping Ψ0 = 0. This is because, for a class I
8.7 THE PETROV CLASSIFICATION 301
Notice that this equation will in fact be cubic since after the first class II trans-
formation we already have Ψ0 = 0. For a Petrov type I spacetime we can then
always choose a tetrad such that only (Ψ1 , Ψ2 , Ψ3 ) are different from zero.
Petrov type II: Two roots coincide: b1 = b2 , b3 , b4 . In this case it is clear that
the derivative of (8.7.11) with respect to b must also vanish for b = b1 (since it is
a double root). But looking at the transformation of Ψ1 we see that this implies
that Ψ1 will also become zero. We can then make both Ψ0 and Ψ1 vanish just by
taking b = b1 . And as before, we can now use a transformation of class I to make
Ψ4 also vanish (but notice that equation (8.7.13) will now be quadratic since
Ψ0 = Ψ1 = 0). Now, a class I transformation leaves Ψ0 unaltered but in general
changes Ψ1 according to (8.7.2), but the new Ψ1 is a linear combination of the
previous values of Ψ0 and Ψ1 , and since both vanish then Ψ1 will still vanish
after the transformation. For a type II spacetime we can then always choose a
tetrad such that only (Ψ2 , Ψ3 ) are different from zero.
Ψ4
Ψ1 → (b − b1 )(b − b2 )(2b − b1 − b2 ) , (8.7.15)
2
Ψ4 2
We can now substitute these new values into a class I transformation to find that
after both transformations Ψ4 becomes
2 2
Ψ4 → Ψ4 [ā(b − b1 ) + 1] [ā(b − b2 ) + 1] . (8.7.19)
This shows that the quadratic equation needed to make Ψ4 vanish has a double
root itself, namely ā = 1/(b2 −b1 ). By taking now this value for ā we can therefore
make both Ψ4 and Ψ3 vanish while still keeping Ψ0 = Ψ1 = 0. The final result
is that for a type D spacetime we can always choose a tetrad such that only Ψ2
remains non-zero.
Petrov type O: The Weyl tensor vanishes identically (the spacetime is confor-
mally flat).
Examples of D type spacetimes include both the Schwarzschild and Kerr so-
lutions, while a plane gravitational wave spacetime is of type N. Both Minkowski,
and the Friedmann–Robertson–Walker cosmological spacetimes are of type O.
1
Ψn ∼ , (8.7.21)
r5−n
so that the Riemann tensor behaves as
N III II I
R∼ + 2 + 3 + 4 . (8.7.22)
r r r r
This is known as the “peeling theorem”, and in particular implies that the far
field of any source of gravitational waves behaves locally as a plane wave.
8.8 INVARIANTS I AND J 303
The expression for J can in fact be written in more compact form as the following
determinant ⎛ ⎞
Ψ0 Ψ1 Ψ2
J = ⎝ Ψ1 Ψ2 Ψ3 ⎠ . (8.8.6)
Ψ2 Ψ3 Ψ4
The invariants I and J can be very useful in the characterization of numerical
spacetimes. For example, they can be used to compare a numerical solution to
an exact solution, or two numerical solutions corresponding to the same phys-
ical system but computed in different gauges. In particular, for Petrov type D
spacetimes like Schwarzschild and Kerr it is possible to choose a null tetrad such
304 GRAVITATIONAL WAVE EXTRACTION
that the only non-zero Weyl scalar is Ψ2 . In such a case the above expressions
imply that
I = 3Ψ22 , J = −Ψ32 ⇒ I 3 = 27J 2 . (8.8.7)
Notice that even if the Ψ’s depend on the choice of the null tetrad, the I and J
invariants do not (they are true scalars), so that for a type D spacetime we will
always have I 3 = 27J 2 , regardless of the choice of tetrad. This result is in fact
more general, as it is not difficult to see that for a spacetime of type II we also
have I 3 = 27J 2 , while for spacetimes of types III and N we have I = J = 0.
Following Baker and Campanelli [41], we define the speciality index S as
S := 27J 2 /I 3 . (8.8.8)
We then find that S = 1 for type II or D spacetimes (for type III or N spacetimes
S can not be defined as I = J = 0). More generally, we can use the deviation
of S from unity as a measure of how close we are to a type II or D spacetime.
For isolated systems the regions where S is close to unity will correspond to the
wave zone where we have small perturbations of a Kerr background. Baker and
Campanelli show that for small perturbations of a type D spacetime, the index
S differs from unity by terms that are quadratic on the perturbation parameter,
which means that when S is significantly different from unity we can no longer
trust quantities derived from a first order theory.
The first of these equations is automatically satisfied, while the second describes
the dynamics of the gravitational waves. The third equation, on the other hand,
can be rewritten in terms of the Einstein tensor as
1 (0) (1) (2)
Fµν (h ) − gµν
(1) (2)
F (h ) = 8π tµν , (8.9.10)
2
with
1 1 (0) (2) (1)
tµν := − Fµν (h ) − gµν
(2) (1)
F (h ) . (8.9.11)
8π 2
When it is written in this way we see that the contribution to the Einstein tensor
(2)
coming from hµν has tµν as its source. This suggests that tµν can be interpreted
(1)
as the stress-energy tensor of the gravitational waves described by hµν . There
are several reasons why this is in fact a good definition. First, tµν is clearly
(1)
symmetric and quadratic in hµν , as expected. But moreover, from the Bianchi
identities to order 2 we can in fact show that
tµν |µ = 0 , (8.9.12)
of any type of wave). Through a rather lengthy calculation we can show that the
averaged stress-energy tensor for a gravitational wave is given by (from now on
(1)
we will drop the super-index (1) from hµν )
, -
1 1
Tµν := tµν = h̄αβ|µ h̄ |ν − h̄|µ h̄|ν − 2h̄ |β h̄α(µ|ν) ,
αβ αβ
(8.9.14)
32π 2
where denotes an average over several wavelengths, and where h̄µν is the
trace-reverse of hµν defined as
1 (0)
h̄µν := hµν − gµν h. (8.9.15)
2
The tensor Tµν just defined is known as the Isaacson stress-energy ten-
sor [166]. It is also not difficult to show that Tµν is in fact gauge invariant.
In the particular case where we consider the transverse-traceless (TT) gauge
corresponding to h = hαβ |β = 0, the Isaacson stress-energy tensor reduces to
1 . /
Tµν = hαβ|µ hαβ |ν . (8.9.16)
32π
× ×
hµν = h+ A+
µν + h Aµν , (8.9.17)
where h+,× are the amplitudes of the two independent wave polarizations, and
A+,×
µν are constant symmetric polarization tensors such that
with lµ a null vector (the wave vector), and uµ an arbitrary unit timelike vector.
Let us now consider the spatial orthonormal basis (êr , êθ , êϕ ) induced by the
spherical coordinates (r, θ, ϕ) (notice that this is not the spherical coordinate
basis). If we choose uµ = (1, 0, 0, 0) (the unit timelike vector in the flat back-
ground), and take lµ as an outgoing radial null vector (since for an isolated source
308 GRAVITATIONAL WAVE EXTRACTION
we only expect outgoing waves), we find that the only non-zero components of
the polarization tensors are
A+
θ̂ θ̂
= −A+
φ̂φ̂
=1, (8.9.19)
A×
θ̂ φ̂
= A×
φ̂θ̂
=1, (8.9.20)
Using this we can rewrite the Isaacson stress-energy tensor (8.9.16) in locally
Cartesian coordinates as
1 . /
Tµν = ∂µ h+ ∂ν h+ + ∂µ h× ∂ν h× , (8.9.22)
16π
or equivalently
1 . /
Tµν = Re ∂µ H∂ν H̄ , (8.9.23)
16π
with H = h+ − ih× , and where Re(z) denotes the real part of z.
It turns out that the complex quantity H can in fact also be written in terms
of the Weyl scalar Ψ4 . In order to see this, notice first that if we are in vacuum
far from the source of the gravitational waves the Weyl and Riemann tensors
coincide. Using now the expression for the Riemann tensor in the linearized
approximation from Chapter 1, equation (1.14.4), we can easily show that, for
plane waves in the TT gauge traveling along the r direction, the Weyl scalars
Ψa become
Ψ1 = Ψ2 = Ψ3 = 0 , (8.9.24)
1
i 2 ×
Ψ0 = − ∂t2 h+ + 2∂t ∂r h+ + ∂r2 h+ − ∂ h + 2∂t ∂r h× + ∂r2 h× , (8.9.25)
4 4 t
1 2 + i 2 ×
Ψ4 = − ∂t h − 2∂t ∂r h+ + ∂r2 h+ + ∂t h − 2∂t ∂r h× + ∂r2 h× . (8.9.26)
4 4
Now, for outgoing waves we have h = h(r − t), so that ∂r h = −∂t h. The Weyl
scalars then reduce to
Ψ0 = Ψ1 = Ψ2 = Ψ3 = 0 , (8.9.27)
×
Ψ4 = −ḧ + iḧ = −Ḧ .
+
(8.9.28)
(For ingoing waves we would have instead that ∂r h = ∂t h, so that the only
¨ .)
non-vanishing Weyl scalar would be Ψ0 = −H̄
8.9 ENERGY AND MOMENTUM OF GRAVITATIONAL WAVES 309
dE 1 . / 1 . /
= T 0r = Re ∂ 0 H∂ r H̄ = − Re ∂t H∂r H̄ , (8.9.30)
dtdA 16π 16π
with dA the area element orthogonal to the radial direction. Since for outgoing
waves we have ∂r h = −∂t h, we find that
dE 1 0 ˙1 1 0 21
= Ḣ H̄ = |Ḣ| . (8.9.31)
dtdA 16π 16π
The last expression is manifestly real, so there is no need to ask for the real part.
If we now want the total flux of energy leaving the system at a given time we
need to integrate over the sphere to find (see, e.g. [90])
& & 2
dE r2 r2 t
= lim |Ḣ|2 dΩ = lim Ψ4 dt dΩ , (8.9.32)
dt r→∞ 16π r→∞ 16π
−∞
where we have taken dA = r2 dΩ, with dΩ the standard solid angle element,
and where the limit of infinite radius has been introduced since the Isaacson
stress-energy tensor is only valid in the weak field approximation. Notice also
that we have dropped the averaging since the integral over the sphere is already
performing an average over space, plus the expression above is usually integrated
over time to find the total energy radiated, which again eliminates the need to
take an average.
Consider next the flux of momentum which corresponds to the spatial com-
ponents of the stress-energy tensor Tij . The flux of momentum i along the radial
direction will then be given by
dPi 1 . /
= Tir = Re ∂i H∂r H̄ . (8.9.33)
dtdA 16π
Now, if we are sufficiently far away the radiation can be locally approximated
as a plane wave, so that ∂i H (xi /r) ∂r H (in effect we are assuming that for
310 GRAVITATIONAL WAVE EXTRACTION
large r the angular dependence can be neglected). Using again the fact that for
outgoing waves ∂r h = −∂t h we find that, for large r,
dPi 1 0 1
li |Ḣ|2 , (8.9.34)
dtdA 16π
where
l is the unit radial vector in flat space
l = (sin θ cos ϕ, sin θ sin ϕ, cos θ) . (8.9.35)
Notice that the above expression implies that the magnitude of the momentum
flux is equal to the energy flux, which is to be expected for locally plane waves.
The total flux of momentum leaving the system will again be given by an
integral over the sphere as (again, see [90])
& & t 2
dPi r2 r2
= lim li |Ḣ|2 dΩ = lim li Ψ4 dt dΩ . (8.9.36)
dt r→∞ 16π r→∞ 16π −∞
Finally, let us consider the flux of angular momentum. Locally, the flux of the
i component of the angular momentum along the radial direction should corre-
spond to ijk xj T kr , with ijk the three-dimensional Levi–Civita antisymmetric
tensor and T ij the stress-energy tensor of the field under study (this corresponds
to
r × p
in vector notation). However, in the case of gravitational waves such an
expression is in fact wrong since the averaging procedure that is used to derive
the Isaacson stress-energy tensor ignores terms that go as 1/r3 , and it is precisely
such terms the ones that contribute to the flux of angular momentum. A correct
expression for the flux of angular momentum due to gravitational waves was first
derived by DeWitt in 1971, and in the TT gauge has the form (see e.g. [288])
&
dJ i r2
= lim ijk (xj ∂k hab + 2δaj hbk ) ∂r hab dΩ . (8.9.37)
dt r→∞ 32π
Remember that in the TT gauge all the time components of hαβ vanish, so the
summations above are only over spatial indices.
The last expression can be rewritten in a more compact and easy to interpret
form if we introduce the angular vectors ξ
i associated with rotations around the
three-coordinate axis. These vectors are clearly Killing fields of the flat metric,
and in Cartesian coordinates have components given by ξik = i jk xj , where ξik
represents the k component of the vector ξ
i . In terms of the vectors ξ
i the flux
of angular momentum can now be written as
&
dJi r2
= − lim (£ξi hab ) ∂t hab dΩ , (8.9.38)
dt r→∞ 32π
ξ
x = (0, − sin ϕ, − cos ϕ cot θ) , (8.9.39)
ξ
y = (0, + cos ϕ, − sin ϕ cot θ) , (8.9.40)
ξ
z = (0, 0, 1) . (8.9.41)
ξ
± = e±iϕ (0, ±i, − cot θ) . (8.9.42)
We will furthermore introduce an orthonormal spherical basis (êr , êθ , êϕ ), and
define the two unit complex vectors (notice the counterintuitive signs in this
definition)
1
ê± := √ (êθ ∓ iêϕ ) , (8.9.43)
2
so that
1
ê± = √ (0, 1, ∓i csc θ) . (8.9.44)
r 2
The vectors ê± clearly correspond to the vectors m̄ and m of the Newman–
Penrose formalism. We can then show after some algebra that the Lie derivative
of ê± with respect to ξ
± is given by
£ξ± ea± = ∓ ie±iϕ csc θ ea± . (8.9.45)
Let us now rewrite the metric perturbation hab in the TT gauge in terms of
the orthonormal basis as:
We are now in a position to calculate the Lie derivative of hab with respect to
ξ
± . We find
312 GRAVITATIONAL WAVE EXTRACTION
Jˆ± := ξ±
a
∂a − is e±iϕ csc θ = e±iϕ [±i∂θ − cot θ ∂ϕ − is csc θ] , (8.9.48)
with s the spin weight of the function on which the operator is acting: s = −2
for H, and s = +2 for H̄ (see Appendix D). The last result implies that
with Jˆx := (Jˆ+ + Jˆ− )/2 and Jˆy := −i(Jˆ+ − Jˆ− )/2.
Collecting results, the flux of angular momentum becomes
'& (
dJi r2
= − lim Re Jˆi H ∂t H̄ dΩ , (8.9.52)
dt r→∞ 16π
1ˆ
Jˆx = J+ + Jˆ− = − sin ϕ ∂θ − cos ϕ (cot θ ∂ϕ − is csc θ) , (8.9.53)
2
i ˆ
Jˆy = J− − Jˆ+ = + cos ϕ ∂θ − sin ϕ (cot θ ∂ϕ − is csc θ) , (8.9.54)
2
Jˆz = ∂ϕ . (8.9.55)
Notice that, except for a factor of −i, these are just the quantum mechanical
angular momentum operators with the correct spin weight (see Appendix D).
Finally, the flux of angular momentum can be calculated in terms of Ψ4 as
(see [90, 246])
2 &
t
dJi r2
= − lim Re Ψ̄4 dt
dt r→∞ 16π −∞
t t ! 3
× Jˆi Ψ4 dt dt dΩ . (8.9.56)
−∞ −∞
8.9 ENERGY AND MOMENTUM OF GRAVITATIONAL WAVES 313
Notice that since we are expanding over the harmonics of spin weight s = −2,
the sum over l also starts at l = 2. Comparing the multipole expansion for Ψ4
with the expansion for the metric perturbation (8.2.72), and using the fact that
l,m
asymptotically Ψ4 = −Ḧ, we can relate the coefficients Al,m to (Ψl,m even , Ψodd )
l,m
and (Ql,m
even , Qodd ) in the following way
1/2
1 (l + 2)! l,m
A l,m
=− Ψ̈l,m + i Ψ̈
2r (l − 2)! even odd
1
= −√ l,m
even − iQ̇odd .
Q̈l,m (8.9.59)
2r
Using this we can translate expressions in terms of the Al,m directly into expres-
l,m l,m
sions in terms of (Ψl,m l,m
even , Ψodd ) and/or (Qeven , Qodd ).
Consider first the radiated energy given by equation (8.9.32). If we substi-
tute the multipole expansion for Ψ4 , and use the orthogonality of the s Y l,m
(equation (D.28) of Appendix D), we immediately find that
2
dE r2 t l,m
= lim A dt , (8.9.60)
dt r→∞ 16π −∞l,m
where, in order to derive these last expressions, we must use the fact that, as a
consequence of (8.2.71), we have
Ψ̇l,m Ψ̄ ˙ l,m Ψ̇l,m = 0 .
˙ l,m − Ψ̄ (8.9.62)
even odd even odd
m
The calculation for the linear momentum flux is somewhat more complicated.
Substituting the multipole expansion of Ψ4 in (8.9.36) we find
&
dPi r2
= lim li −2 Y l,m −2 Ȳ l ,m dΩ
dt r→∞ 16π
l,m l ,m
t t
× l,m
A dt Āl ,m dt . (8.9.63)
−∞ −∞
In order to calculate the integral over the sphere, notice first that the components
of the radial unit vector li can be expressed in terms of scalar (i.e. spin zero)
spherical harmonics as
+
2π 1,−1
lx = sin θ cos ϕ = Y − Y 1,1 , (8.9.64)
3
+
2π 1,−1
ly = sin θ sin ϕ = i Y + Y 1,1 , (8.9.65)
3
+
π 1,0
lz = cos θ = 2 Y . (8.9.66)
3
We then see that the flux of linear momentum involves integrals over three
spin-weighted spherical harmonics. Such integrals can be calculated using equa-
tion (D.30) of Appendix D. They involve the Wigner 3-lm symbols with l3 = 1,
which are also explicitly given in Appendix D.
Instead of Px and Py it turns out to be easier to work with the complex
quantity P+ := Px + iPy . After a straightforward, but rather long, calculation
we finally arrive at the following expressions for the multipole expansion of the
flux of linear momentum in terms of the Al,m coefficients
dP+ r2 t
= lim dt Al,m
dt r→∞ 8π
l,m −∞
t
× dt al,m Āl,m+1 + bl,−m Āl−1,m+1 − bl+1,m+1 Āl+1,m+1 , (8.9.67)
−∞
dPz r2 t
= lim dt Al,m
dt r→∞ 16π −∞
l,m
t
× dt cl,m Āl,m + dl,m Āl−1,m + dl+1,m Āl+1,m , (8.9.68)
−∞
8.9 ENERGY AND MOMENTUM OF GRAVITATIONAL WAVES 315
Using now (8.9.59) we can also rewrite these expressions in terms of the multipole
expansion for H. The calculation is again quite long, and in order to simplify the
expressions we must make use several times of (8.2.71). The final result is 90
2
dJi r2 t
t t
= − lim Re Āl ,m dt Al,m dt dt
dt r→∞ 16π
l,m l m −∞ −∞ −∞
& 3
× −2 Ȳ l ,m Jˆi −2 Y l,m dΩ . (8.9.77)
The action of the angular momentum operators Jˆi on the spin-weighted spherical
harmonics can be found in Appendix D. We again obtain integrals that involve
products of two spin-weighted spherical harmonics which satisfy the usual ortho-
normalization relations. We can then easily find the following expressions for the
angular momentum carried by the gravitational waves 91
2
dJx ir2 t t
= − lim Im Al,m dt dt
dt r→∞ 32π −∞ −∞
l,m
t 3
× fl,m Āl,m+1
+ fl,−m Āl,m−1
dt , (8.9.78)
−∞
2
dJy r2 t t
= − lim Re Al,m dt dt
dt r→∞ 32π
l,m −∞ −∞
t 3
× fl,m Āl,m+1
− fl,−m Āl,m−1
dt , (8.9.79)
−∞
2
dJz ir2 t t
= − lim Im m Al,m dt dt
dt r→∞ 16π −∞ −∞
l,m
3
t
× Āl,m dt , (8.9.80)
−∞
with
fl,m := (l − m)(l + m + 1) = l(l + 1) − m(m + 1) , (8.9.81)
and where we use the convention that Im(a + ib) = ib, for a and b real. Again,
we can rewrite the last expressions in terms of gauge invariant perturbations
using (8.9.59). We find
91 These expressions for the flux of angular momentum can also be found in [193] (with an
extra factor of 4 coming from a different normalization of the null tetrad used to define Ψ4 ).
8.9 ENERGY AND MOMENTUM OF GRAVITATIONAL WAVES 317
Notice that the expressions for dJx /dt and dJy /dt are manifestly real. On the
other hand, for dJz /dt the term inside the sum can be easily shown to be purely
imaginary, so that the final result is also real.
As a final comment, the above expressions for the radiated energy, linear
momentum and angular momentum can also be shown to be equivalent to the
expressions derived by Thorne in [288] by noticing that
1 (l+2) l,m
Al,m = √ I − i (l+2) S l,m ,
2r
(−1)m (l+2) l,−m
Āl,m = √ I + i (l+2) S l,−m , (8.9.86)
2r
where, in Thorne’s notation, the coefficients I l,m are the mass multipole momenta
of the radiation field, S l,m are the current multipole momenta, and where (l) I l,m
and (l) S l,m denote the lth time derivative of these quantities.
9
NUMERICAL METHODS
9.1 Introduction
Field theories play a fundamental role in modern physics. From Maxwell’s clas-
sical electrodynamics, to quantum field theories, through the Schrödinger equa-
tion, hydrodynamics and general relativity, the notion of a field as a physical
entity on its own right has had profound implications in our understanding of
the Universe. Fields are continuous functions of space and time, and the math-
ematical description of their dynamics must be done in the context of partial
differential equations.
The partial differential equations associated with physical theories are in gen-
eral impossible to solve exactly except in very idealized cases. This difficulty can
have different origins, from the presence of irregular boundaries, to the existence
of non-linear terms in the equations themselves. In order to solve this type of
equation in general dynamical situations it becomes inevitable to use numerical
approximations.
There are many different ways in which we can solve partial differential
equations numerically. The most popular methods are three: Finite differenc-
ing [178, 208, 243], finite elements [179, 207] and spectral methods [281]. In this
Chapter I will describe the main ideas behind finite differencing methods, since
this is the most commonly used approach in numerical relativity (though not
the only one; in particular spectral methods have become increasingly popular
in recent years [70, 171, 173]).
Finally, I should mention the fact that, for simplicity, in this Chapter, I
will only discuss methods for the numerical solution of systems of evolution
equations of essentially “hyperbolic” type, and will not discuss the solution of
elliptic equations, such as those needed for obtaining initial data. The numerical
solution of elliptic equations is a very important subject in its own right, and
even a very simple introduction would demand a full Chapter on its own. The
interested reader can find a discussion of some of the basic techniques for solving
elliptic equations in [230], but we should point out that even if simple methods
are quite easy to code they tend to be extremely slow in practice. Fast and
efficient algorithms for solving elliptic equations (such as e.g. multi-grid) are
usually much more complex.
318
9.2 BASIC CONCEPTS OF FINITE DIFFERENCING 319
t 6x
6t
x
Fig. 9.1: Discretization of spacetime used in finite differencing.
field at every point of space and for all times. In order to find the value of the
field using numerical approximations, the first thing that needs to be done is
to reduce these unknowns to a finite number. There are several different ways
of doing this. Spectral methods, for example, expand the solution as a finite
linear combination of some appropriate basis functions. The variables to solve
for are then the coefficients of such an expansion. A different approach is taken
by finite differencing and finite elements. In both cases the number of variables
is reduced by discretizing the domain of dependence of the functions, although
using different strategies in each case.
The basic idea of finite differencing approximations is to substitute the con-
tinuous spacetime with a set of discrete points. This set of points is known as the
computational grid or mesh. The distances in space between points on the grid
do not necessarily have to be uniform, but in this Chapter we will assume for sim-
plicity that they are. The time step between two consecutive levels is denoted by
∆t, and the distance between two adjacent points by ∆x. Figure 9.1 is a graph-
ical representation of the computational grid. There is a huge literature on the
subject of finite differencing; the interested reader can see, e.g. [178, 208, 243].
Once we have established the computational grid, the next step is to sub-
stitute our differential equations with a system of algebraic equations. This is
done by approximating the differential operators by finite differences between
the values of our functions at nearby points on the grid. In this way we obtain
an algebraic equation at each grid point for each differential equation. These
algebraic equations involve the values of the functions at the point under con-
sideration and its nearest neighbors. The system of algebraic equations can then
be solved in a simple way, but the price we have paid is that now we have a huge
number of algebraic equations, so that a computer is needed in order to solve
them all.
320 NUMERICAL METHODS
In order to make the ideas more concrete, let us write our differential equation
in the general form
Lu = 0 , (9.2.1)
where u denotes a set of functions of the spacetime coordinates (t, xi ), and L
is some differential operator acting on u. Furthermore, let us denote by u∆ the
discretized approximation to u evaluated at the points of the computational
grid, and by L∆ the finite difference version of the original differential operator.
Here ∆ can represent either ∆x or ∆t, which can in general be assumed to
be proportional to each other. The finite difference version of our differential
equation then takes the form
L∆ u∆ = 0 . (9.2.2)
The notation just introduced serves the purpose of indicating explicitly that
a given finite difference approximation depends on the grid size parameter ∆.
Indeed, since we are approximating a differential equation, the behavior of u∆
in the limit when ∆ vanishes is the crucial property of our approximation.
The truncation error of our finite difference approximation is defined as
τ∆ := L∆ u , (9.2.3)
that is, the result of applying the finite difference operator L∆ to the solution
of the original differential equation. Typically, the truncation error will not be
zero, but it should approach zero as ∆ becomes smaller. A related concept (often
confused with the truncation error) is that of solution error, which is defined
instead as the difference between the exact solution to the differential equation
u and the exact solution to the finite difference equation u∆
∆ := u − u∆ . (9.2.4)
We then look for approximations that, in the continuous limit, approach the
original differential equation and not a different one. When this happens locally
we say that our approximation is consistent. In general, this property is quite
easy to see from the structure of the finite difference approximation, and can
often by checked by “eye”. The important exceptions are situations where the
coordinate system becomes singular, since proving consistency at a singular point
might be non-trivial. For example, it is common for standard finite difference
approximations to fail at the point r = 0 when using spherical coordinates.
Consistency is clearly fundamental for any finite difference approximation.
When it fails, even if it is at just one point, we will not be able to recover
9.2 BASIC CONCEPTS OF FINITE DIFFERENCING 321
the correct solution to the original differential equation. For a consistent finite
difference approximation, in the continuum limit the truncation error typically
approaches zero as a power of the discretization parameter ∆. We then say that
a given approximation is of order n if 92
lim τ∆ ∼ ∆n . (9.2.6)
∆→0
the number of basis functions is increased. This is the main strength of spectral methods over
finite differencing.
322 NUMERICAL METHODS
where the sum is over all points in the spatial grid. This norm is also commonly
known as the root mean square (or simply r.m.s.) norm.
We will now say that the finite difference approximation is stable if for any
t > 0 there exists a constant Ct such that
for all 0 < n∆t < t, in the limit when ∆x and ∆t go to zero. That is, in the
continuum limit the norm of the finite difference solution up to a finite time
t is bounded by the norm at t = 0 times a constant that is independent of
∆x and ∆t. Stability is a property of the system of finite difference equations,
and is essentially the discrete version of the definition of well-posedness for a
system of evolution equations (cf. equation (5.2.2)). An unstable finite difference
approximation is simply useless in practice.
Given an initial value problem that is mathematically well posed, and a finite
difference approximation to it that is consistent, then stability is a necessary and
sufficient condition for convergence.
This theorem is of great importance since it relates the final objective of any
finite difference approximation, namely convergence to the exact solution, with
a property that is usually much easier to prove: stability.
(∆x) 4
2
φnm+1 − 2φnm + φnm−1
∂x2 φ = 2 + ∂x φ + · · · . (9.3.4)
(∆x) 12
We can then approximate the second derivative as
φnm+1 − 2φnm + φnm−1
∂x2 φ 2 . (9.3.5)
(∆x)
From the above expressions we see that the truncation error is of order (∆x)2 ,
so this approximation is second order accurate.
The second derivative of φ with respect to t can be approximated in exactly
the same way. In this way we find the following finite difference approximation
for the wave equation:
φn+1 − 2φnm + φn−1 φnm+1 − 2φnm + φnm−1
m
2
m
− 2 =0. (9.3.6)
(c∆t) (∆x)
We can rewrite this equation in more compact form if we introduce the so-
called Courant parameter, ρ := ∆t/∆x.93 Our approximation then takes the final
form:
n+1
φm − 2φnm + φn−1m − c2 ρ2 φnm+1 − 2φnm + φnm−1 = 0 . (9.3.7)
This equation has a very important property: It involves only one value of
the wave function at the last time level; the value φn+1
m . We can then solve for
this value in terms of values at previous time levels to obtain:
φn+1
m = 2φnm − φn−1
m + c2 ρ2 φnm+1 − 2φnm + φnm−1 . (9.3.8)
Because of this property, the last approximation is known as an explicit approx-
imation. If we know the values of the function φ at the time levels n and n − 1,
93 It is common to define the Courant parameter absorbing the wave speed into it as
ρ = c∆t/∆x. This is very useful in the case of simple scalar equations, but becomes less so for
systems of equations where we can have several different characteristic speeds.
324 NUMERICAL METHODS
we can use the last equation to calculate directly the values of φ at the new time
level n + 1. The process can then be iterated as many times as desired.
It is clear that all that is required in order to start the evolution is to know
the values of the wave function at the first two time levels, and finding these two
first levels is easy to do. As we are dealing with a second order equation, the
initial data must include
For the second time level it is enough to approximate the first time derivative
using finite differences. One possible approximation is given by
φ1m − φ0m
g (m∆x) = , (9.3.11)
∆t
from where we find
φ1m = g (m∆x) ∆t + φ0m . (9.3.12)
The previous expression, however, has one important drawback: From the Taylor
expansion we can easily see that the truncation error for this expression is of order
∆t, so the approximation is only first order. It is clear that we start our evolution
with a first order error, the second order accuracy of the whole scheme will be
lost. However, this problem is quite easy to fix. A second order approximation
to the first time derivative is
φ1m − φ−1
m
g (m∆x) = . (9.3.13)
2∆t
The problem now is that this expression involves the value of the function φ−1
m ,
which is also unknown. But we already have one other equation that makes ref-
erence to φ1m and φ−1
m ; the approximation to the wave equation (9.3.7) evaluated
at n = 0. We can then use these two equations to eliminate φ−1 m and solve for
φ1m . In this way we find the following second order approximation for the second
time level
c2 ρ2 0
φ1m = φ0m + φm+1 − 2φ0m + φ0m−1 + ∆t g (m∆x) . (9.3.14)
2
Equations (9.3.10) and (9.3.14) give us all the information we require in order
to start our evolution.
There is another important point that must be mentioned here. In order to
reduce the total number of variables to a finite number, it is also necessary to
consider a finite region of space with a finite number of points N , the so-called
computational domain. It is therefore crucial to specify the boundary conditions
9.3 THE ONE-DIMENSIONAL WAVE EQUATION 325
This choice, apart from being extremely simple, is equivalent to using the interior
approximation everywhere, so it allows us to concentrate on the properties of the
interior scheme only. We will for the moment assume that our boundaries are
indeed periodic, and will come back to the boundary issue later.
with analogous definitions for temporal difference operators. The centered dif-
ference operator is then defined as
n 1 n
¯ x φnm := 1 ∆+
∆ −
x + ∆x φm = φm+1 − φnm−1 . (9.3.17)
2 2
We can now use these definitions to introduce the second centered difference
operators
− n
x ∆x φm = φm+1 − 2φm + φm−1 .
∆2x φnm := ∆+ n n n
(9.3.18)
(Do notice that with this notation (∆¯ x )2
= ∆2x .)
Having defined these operators, we can now go back to the approximations
we used for the differential operators that appear on the wave equation. Start-
ing again from the Taylor series, it is possible to show that the second spatial
derivative can be approximated more generally as
1 2 θ
n+1
∂x φ
2
2 ∆x 2 φm + φm n−1
+ (1 − θ) φm ,
n
(9.3.19)
(∆x)
with θ some arbitrary parameter. The expression we had before, equation (9.3.5),
can be recovered by taking θ = 0. This new approximation corresponds to taking
an average, with a certain weight, of the finite difference operators acting on
326 NUMERICAL METHODS
different time levels. In the particular case when θ = 1, the contribution from
the middle time level in fact completely disappears.
If we now use the new approximation for the second spatial derivative, but
keep the same approximation as before for the time derivative, we find the fol-
lowing finite difference approximation for the wave equation
θ n+1
∆2t φnm − c2 ρ2 ∆2x φm + φn−1
m + (1 − θ) φn
m =0 . (9.3.20)
2
This is one possible generalization of (9.3.7), but not the only one. It is
clear that we can play this game in many ways to obtain even more general
approximations, all equally valid, and all second order (one can also find ap-
proximations that are fourth order accurate or even higher). The approximation
given by (9.3.20) has a new and very important property: It involves not one,
but three different values of φ at the last time level. This means that it is now
not possible to solve for φ at the last time level explicitly in terms of its values
in the two previous time levels. Because of this, the approximation (9.3.20) is
known as an implicit approximation.
When we consider the equations for all the points in the grid, including the
boundaries, it is possible to solve the full system by inverting a non-trivial matrix,
which is of course a more time-consuming operation than the one needed for the
explicit approximation.94 However, in many cases implicit approximations turn
out to have better properties than explicit ones, in particular related to the
stability of the numerical scheme as we will see below.
un+1 = B un , (9.4.1)
where un is the solution vector at time level n, and B is an update matrix (in
general sparse). It is important to notice that all finite difference approximation
to a linear equation can be written in this way, even those that involve more
than two time levels by simply introducing auxiliary variables. For example, for
the three-level approximations to the wave equation introduced in the previous
Sections we can define unm := φnm and vm n
:= φn−1 n
m , and take the vector u to be
n n n m
given by (u1 , v1 , ..., uN , vN ) with N the total number of grid points.
If the matrix B has a complete set of eigenvectors, then the vector un can be
written as a linear combination of them. The requirement for stability can then
94 In the particular case of one spatial dimension, the resulting matrix is tridiagonal since
each equation involves only a given point and its two nearest neighbors, and there are very
efficient algorithms to invert such matrices. In more than one dimension, however, the matrix
is no longer that simple, though it is still sparse (i.e. most of its entries are zero).
9.4 VON NEWMANN STABILITY ANALYSIS 327
be reduced to asking for the matrix B not to amplify any of its eigenvectors,
that is, we must ask for the magnitude of its largest eigenvalue of B, known as
its spectral radius, to be less than or equal to 1.
The stability analysis based on the idea just described is quite general, but it
requires a knowledge of the entries of B over all space, including the boundary.
There is, however, a very popular stability analysis method that, even though it
can only be shown to give necessary conditions for stability, in many cases turns
out to also give sufficient conditions. This method, originally introduced by von
Newmann, is based on a Fourier decomposition.
To introduce von Newmann’s method, we start by expanding the solution
of (9.4.1) as a Fourier series
un (x) = ũn (k) ei k·x , (9.4.2)
k
where the sum is over all wave vectors k that can be represented on the grid.95
If we now substitute this into the original equation (9.4.1) we find
where now B̃ is the Fourier transform of the original matrix B, also known as
the amplification matrix. The stability condition now corresponds to asking that
no Fourier mode should be amplified, that is, for the spectral radius of B̃ to be
less than or equal to 1. This is von Newmann’s stability condition.
It is important to stress the fact that in order to use this stability criteria
we have assumed two things: 1) The boundary conditions are periodic, since
otherwise we can not make a Fourier expansion, and 2) the entries of the matrix
B are constant (i.e. independent of position), since otherwise it is not possible
to decouple the different Fourier modes.
As an example of von Newmann’s stability analysis we can now study the sta-
bility of the implicit approximation to the wave equation we derived previously,
equation (9.3.20). Consider a Fourier mode of the form
If we substitute this back into the finite difference equation we find, after some
algebra, a quadratic equation for ξ of the form
Aξ 2 + Bξ + C = 0 , (9.4.5)
as the Nyquist limit. This implies that the maximum value that any component of the wave
vector can take is π/∆x.
328 NUMERICAL METHODS
where Zk+
and Zk−
are arbitrary constants.
On the other hand, from the fact that A = C we can easily show that
|ξ+ ξ− | = |C/A| = 1 . (9.4.11)
This is a very important property, it implies that if the system is stable for all
k, that is, if |ξ± (k)| ≤ 1 then the system will necessarily also be non-dissipative
(the Fourier modes not only don’t grow, they don’t decay either). For the system
to be stable we must then ask for
|ξ+ (k)| = |ξ− (k)| = 1 . (9.4.12)
It is easy to see that this will happen as long as
B 2 − 4AC ≤ 0 . (9.4.13)
Substituting now the values of the coefficients A, B and C into this expression
we find the following stability condition:
c2 ρ2 (1 − 2θ) [1 − cos (k∆x)] − 2 ≤ 0 . (9.4.14)
As we want this to hold for all k, we must consider the case when the left
hand side reaches its maximum value. If we take θ < 1/2, this will happen for
k = π/∆x, in which case the stability condition takes the simple form:
c2 ρ2 ≤ 1/ (1 − 2θ) . (9.4.15)
For the explicit scheme we have θ = 0, and the stability condition reduces to the
well known Courant–Friedrich–Lewy (CFL) condition96
cρ ≤ 1 ⇒ c∆t ≤ ∆x . (9.4.16)
The CFL condition has a clear geometric interpretation: The numerical do-
main of dependence must be larger than the physical domain of dependence, and
96 For systems with N spatial dimensions the CFL stability condition is slightly modified and
√
becomes instead cρ ≤ 1/ N .
9.5 DISSIPATION AND DISPERSION 329
Fig. 9.2: CFL stability condition. For c∆t ≤ ∆x, the numerical domain of dependence
is larger than the physical domain of dependence (shaded region), and the system is
stable. For c∆t > ∆x we have the opposite situation, and the system is unstable.
not the other way around (see Figure 9.2). If this weren’t the case, it would be
impossible for the numerical solution to converge to the exact solution, since as
the grid is refined there will always be relevant physical information that would
remain outside the numerical domain of dependence. And, as we have seen, the
Lax theorem implies that if there is no convergence then the system is unstable.
The argument we have just given clearly only applies to explicit schemes.
This is because for an implicit scheme, the numerical domain of dependence is
in fact the whole grid. In that case there is no simple geometric argument that
can tell us what the stability condition should be.
In order to obtain the stability condition (9.4.15) we assumed that θ < 1/2.
If, on the other hand, we take θ ≥ 1/2 then we must go back to the general
condition (9.4.14). However, in this case it is easy to see that the condition is
always satisfied. This means that an implicit scheme with θ ≥ 1/2 is stable for
all values of ρ, that is, it is unconditionally stable.
This takes us to one of the most important lessons of the theory of finite
differencing: Simple schemes not always have the best stability properties.97
where for every real wave number k there exists a complex frequency ω given by
97 This is even more so for systems of equations of parabolic type (such as the heat equation),
for which explicit schemes are practically useless since the stability condition becomes ∆t
∆x2 . The problem with this is that it means that if we reduce ∆x by half, we must reduce
∆t by a factor of four, so integrating to a desired finite time T quickly becomes prohibitive in
terms of computer time.
330 NUMERICAL METHODS
ω = ω (k) . (9.5.2)
The explicit form of this relation will of course be determined by the properties
of our original differential equation. Clearly, the imaginary part of ω will give
the rate of growth or decay of a given Fourier mode with time. A differential
equation that admits solutions for which the imaginary part of ω is negative is
called a dissipative equation.
It is also useful to define the concept of dispersion: A differential equation
is said to be non-dispersive if the real part of ω is a linear function of k, and
dispersive otherwise. For this reason, equation (9.5.2) is known as the dispersion
relation.
The wave equation is one example of a differential equation that admits solu-
tions of the form (9.5.1). In this case the dispersion relation is simply ω = ±ck,
which implies that the wave equation is both non-dissipative and non-dispersive.
The dispersion relation contains within itself very important information
about the behavior of the solution of a given differential equation. Consider, for
example, a non-dissipative differential equation. Equation (9.5.1) implies that we
will have solutions in the form of traveling sinusoidal waves. The speed of each
wave will be given in terms of its wave number k and its frequency ω by the
so-called phase velocity vp (k) defined as
ω
vp (k) := . (9.5.3)
k
In practice, however, we never deal with infinite plane waves but rather with
localized wave packets. A wave packet that has a narrow Fourier transform,
centered at a wavenumber k, doesn’t travel with the phase velocity v(k), but
rather with a speed given by:
dω
vg (k) = , (9.5.4)
dk
and known as the group velocity. Non-dispersive systems like the wave equation
have the property that the phase velocity and group velocity are equal for all
modes, vp (k) = vg (k).
Even when we are dealing with a differential equation that is both non-
dissipative and non-dispersive, it turns out that quite generally these properties
are not preserved when we consider a finite difference approximation. These ap-
proximations are almost always dispersive, and are often also dissipative (notice
that an instability can be interpreted as the result of negative dissipation).
The finite difference approximations to the simple one-dimensional wave
equation discussed in the previous sections provide an excellent example of this
type of phenomenon. Let us again consider the general implicit approximation
to the 1D wave equation (equation (9.3.20)). Comparing equation (9.5.1) with
the Fourier mode decomposition (9.4.4) used for the von Newmann stability,
9.5 DISSIPATION AND DISPERSION 331
1 − c2 ρ2 (θ − 1) [cos (k∆x) − 1]
cos (ω∆t) = . (9.5.5)
1 − c2 ρ2 θ [cos (k∆x) − 1]
Introducing the adimensional quantities K := k∆x and Ω := ω∆t, we can rewrite
the last equation as
1 − c2 ρ2 (θ − 1) [cos (K) − 1]
Ω = ± arccos . (9.5.6)
1 − c2 ρ2 θ [cos (K) − 1]
This expression is the numerical equivalent of the dispersion relation of the wave
equation, and is called the numerical dispersion relation. Notice how it is far
from being a simple linear function. In the limit when K 1 it is not difficult
to prove that this dispersion relation reduces to Ω = ±cρK, or in other words
ω = ±ck, so we recover the correct dispersion relation for the wave equation.
The dispersion relation (9.5.6) can be used to study the dispersion and dissi-
pation properties of our finite differencing approximation for the different values
of the parameters ρ and θ. In the particular case of the explicit approximation
(θ = 0), the relation reduces to
Ω = ± arccos 1 + c2 ρ2 [cos (K) − 1] . (9.5.7)
If we now take also cρ = 1 we find that Ω = ±K, that is, the finite difference
approximation is neither dissipative nor dispersive and in fact has the same
dispersion relation as the original differential equation. This remarkable fact can
be understood if we calculate the truncation error associated with our finite
difference approximation. It turns out that for θ = 0 and cρ = 1, the truncation
error vanishes exactly to all orders, in other words the explicit finite difference
approximation is exact for cρ = 1 (this property of the explicit scheme goes away
when we have more than one spatial dimension).
Figure 9.3 plots Ω as a function of K = k∆x for the explicit scheme (θ = 0)
using three different values of the Courant parameter ρ (for simplicity we have
taken c = 1). For ρ = 1 the dispersion relation is a straight line so that all
modes propagate with the same speed. For the case ρ = 0.8 we see that the slope
of the graph becomes smaller for larger values of K, so that the smaller wave-
lengths propagate more slowly than they should. This implies that a localized
wave packet will disperse as it propagates, leaving a trail of smaller wavelengths
behind. Finally, for the case ρ = 1.2 there are two things to notice. First, the
slope of the graph gets larger for large values of K, so small wavelengths now
propagate faster than they should. A wave packet would then disperse in the
opposite way to before, with smaller wavelengths overtaking the main packet.
Worse still, at K ∼ 2 the slope in fact becomes infinite and the frequency Ω
becomes purely imaginary (the imaginary part is also shown in the plot). Imag-
inary frequencies do not correspond to traveling waves any more but rather to
332 NUMERICAL METHODS
1
l=
3
2.5
1.2
l=
l = 0.8
2
1
1.5
1
l = 1.2
(imaginary part)
0.5
0
0.5 1 1.5 2 2.5 3
k 6x
Fig. 9.3: Dispersion relation Ω(k∆x) for the explicit scheme (θ = 1), and for three
different values of the Courant parameter ρ (taking c = 1).
solutions that grow exponentially with time, i.e. the corresponding modes are
unstable, which is to be expected since we are now violating the CFL condition.
αu + β ∂n u = γ , (9.6.1)
where u is our unknown function (or functions), (α, β, γ) are given functions of
space and time, and ∂n u denotes the derivative normal to the boundary under
consideration. We further classify the different types of boundary condition in
the following way:
infinity, where the exact boundary conditions are often known (e.g. the wave function should
be zero at spatial infinity if we have initial data of compact support). This approach works very
well when we compactify on null hypersurfaces surfaces, or on hypersurfaces that are asymp-
totically null (as in the characteristic [301] and conformal [132, 165] approaches to numerical
relativity). However, when we compactify on spatial hypersurfaces the loss in resolution as
the waves approach the boundary of the grid produces a large numerical backscattering. Still,
compactification on spatial hypersurfaces in combination with a sponge filter has been used in
practice with good results (see e.g. [231]).
334 NUMERICAL METHODS
φn+1
0 = (cρ − 1) φn+1
1 + (1 + cρ) φn1 + (1 − cρ) φn0 . (9.6.5)
1 + cρ
In situations where we have more than one spatial dimension, radiation
boundary conditions become more complicated. For example, for spherical waves
in three dimensions we typically expect the solution to behave as φ = u(r − ct)/r
for large r. This can be expressed in differential terms as
∂t φ + c ∂r φ + c φ/r = 0 . (9.6.6)
If we are using a Cartesian coordinate system, this equation will usually have to
be applied at the boundaries of a cube, where the normal direction corresponds
9.7 NUMERICAL METHODS FOR FIRST ORDER SYSTEMS 335
to one of the Cartesian directions xi . In that case we can use the fact that
for a spherically symmetric function, ∂i φ = (xi /r) ∂r φ, and use as a boundary
condition:
xi cxi
∂t φ + c ∂i φ − 2 φ = 0 . (9.6.7)
r r
This condition can now be applied at each of the Cartesian boundaries by taking
xi equal to (x, y, z). Of course, typically our function φ can not be expected
to be spherically symmetric, but if the boundaries are sufficiently far away the
angular derivatives will be much smaller than the radial derivative and can be
safely ignored without introducing large reflections.
The specific boundary conditions described above only work well for simple
systems like the wave equation. For more complex systems we would need to
carefully consider what are the appropriate boundary conditions at the differ-
ential level, which will of course be determined by the type of problem being
solved. For example, for hyperbolic systems of equations we should first decom-
pose the solution into ingoing and outgoing modes at the boundary, and apply
boundary conditions only to ingoing modes because applying them to outgoing
modes would violate causality. Only after we have found adequate (and well-
posed) boundary conditions at the differential level can we start constructing
finite difference approximations to them.
vρ n
un+1
m = unm − um+1 − unm−1 , (9.7.3)
2
where again ρ stands for the Courant parameter ρ = ∆t/∆x. This approxima-
tion is known as the forward Euler method (also called FTCS for forward time
centered space). This method has one very serious drawback: A von Newmann
stability analysis shows that the method is unconditionally unstable, i.e. it is
unstable for any choice of ∆x and ∆t, so it is useless in practice.
A stable method is obtained by using a backward time difference instead:
vρ n+1
un+1
m = unm − um+1 − un+1
m−1 . (9.7.4)
2
This method, known as backward Euler, is stable but clearly implicit. Both Eu-
ler methods are also only first order accurate because of the off-centered time
differences.
Remarkably, the stability problem with the standard Euler method can be
fixed if instead of unm we take the average (unm+1 + unm−1 )/2 in the first term of
equation (9.7.3):
1 n vρ n
un+1
m = um+1 + unm−1 − um+1 − unm−1 . (9.7.5)
2 2
This method is known as Lax–Friedrichs, and although it is still only first order
accurate, it is explicit and stable provided the CFL condition vρ ≤ 1 is satisfied.
This is another example of how finite difference is sometimes an art as much as
a science.
Other more sophisticated methods can be obtained from a Taylor series ex-
pansion of the form
(∆t)2 2
u(t + ∆t, x) = u(t, x) + ∆t ∂t u + ∂t u + · · ·
2
(∆t)2 2
= u(t, x) − v∆t ∂x u + v 2 ∂x u + · · · , (9.7.6)
2
where in the second line we have used the original advection equation to ex-
change time derivatives for spatial derivatives. Approximating now the spatial
derivatives using centered differences we obtain the so-called Lax–Wendroff finite
difference approximation
vρ n v 2 ρ2 n
un+1
m = unm − um+1 − unm−1 + um+1 − 2unm + unm−1 . (9.7.7)
2 2
Because of the Taylor expansion, the Lax–Wendroff method is second order ac-
curate in both time and space. It is also clearly explicit and turns out to be
stable if the CFL condition is satisfied.
The improved stability of Lax–Wendroff can in fact be intuitively understood.
First, notice that Lax–Wendroff is basically the forward Euler method plus a
9.7 NUMERICAL METHODS FOR FIRST ORDER SYSTEMS 337
v 2 ∆t 2
∂t u + v ∂x u = ∂x u . (9.7.8)
2
But this is nothing more than the advection-diffusion equation. That is, Lax–
Wendroff adds a diffusion (i.e. dissipative) term as a correction to forward Euler,
improving its stability properties. The diffusion term vanishes in the continuum
limit, so in the end we will still converge to the solution of the non-diffusive
advection equation. A similar argument shows that the first order Lax–Friedrichs
method also adds a diffusive
correction
term to forward Euler, only in this case
we are in fact adding (∆x)2 /2∆t ∂x2 u.
There are still other quite standard finite difference approximations to the
advection equation. For example, instead of centered spatial differences we can
use one-sided spatial differences. However, when doing this we must be careful
to use one-sided differences that respect the causal flow of information in order
to obtain a stable system. This leads to the so-called upwind method that takes
the following form
The name of the method comes from the fact that, in order to have stability,
we must take finite differences along the direction of the flow. Upwind is again
stable as long as the CFL condition is satisfied. It is also only first order accurate
in both space and time. Nevertheless, it is at the heart of the modern shock
capturing methods we will discuss in Section 9.10.
un+1 = unm − vρ unm − unm−1 v≥0,
m
n (9.7.9)
um = um − vρ um+1 − um
n+1 n n
v≤0.
Another quite common method is based on the idea of using centered differ-
ences in both space and time. This results in the following method:
un+1
m = un−1
m − vρ unm+1 − unm−1 . (9.7.10)
There are several things to notice about this method. First, in contrast to all
other methods presented so far this is a three-level method. This method is
known as leapfrog, as at each iteration we are using the centered time difference
to “jump” over the middle time level. Leapfrog is second order accurate in both
space and time, and stable if the CFL condition is satisfied. It can also be easily
generalized to non-linear equations (which is not the case for Lax–Wendroff, for
example). Still, it has two main drawbacks: First, since it is a three-level method,
in order to start it we need to know the solution on the first two time levels. But
for first order systems the initial data can only give us the very first time level
and nothing more, so that leapfrog needs to be initialized using one of the other
methods for at least one time step. Also, because of the way it is constructed it
is not difficult to convince oneself that leapfrog in fact produces an evolution on
two distinct, and decoupled, numerical grids, one of them taking even values of
338 NUMERICAL METHODS
Fig. 9.4: Structure of the numerical grid for the leapfrog scheme. Notice how the points
represented by the circles are decoupled from those represented by the stars, so in fact
we have two distinct grids.
m for even values of n and odd values of m for odd values of n, and the other
one doing the opposite (see Figure 9.4). This decoupling of the grid can lead to
the development of numerical errors that have a checker board pattern in space
and time typically known as red-black errors.
Finally, we will introduce one more implicit finite difference approximation
to the advection equation. For this we simply take the average of the forward
and backward Euler methods:
vρ n
un+1
m = unm − um+1 − unm−1 + un+1m+1 − um−1
n+1
. (9.7.11)
4
This approximation is known as the Crank–Nicholson scheme. It turns out to
be second order accurate in both time and space and stable for all values of ρ
(i.e it is unconditionally stable). It also forms the basis for a method that has
become very common in numerical relativity in the past few years and that we
will discuss in the next Section.
Figure 9.5 shows a schematic representation of the different finite difference
schemes introduced so far using the concept of computational stencil or molecule,
i.e. a diagram that shows the relationship between the different grid points used
in the approximation.
Up to this point we have considered only the simple scalar advection equa-
tion. However, the methods discussed above can be easily generalized to linear
hyperbolic systems of the form
∂t u + A ∂x u = 0 , (9.7.12)
where now u denotes a set of unknowns u = (u1 , u2 , ...), and A is some matrix
with constant coefficients (the characteristic matrix). Most of the methods dis-
cussed above can be used directly by simply replacing the scalar function u with
the vector of unknowns and the speed v with the characteristic matrix A. The
CFL condition now takes the form
forward Euler
and backward Euler
Lax-Wendroff
Lax-Friedrichs upwind or
leapfrog Crank-Nicholson
Fig. 9.5: Computational stencils for the different finite difference approximations to the
advection equation.
where λa are the eigenvalues of A, so the time step is now restricted by the
largest characteristic speed of the system. Notice that we are assuming that the
system is hyperbolic, so that all eigenvalues are real.
We must take care, however, with one-sided methods such as upwind which
would only be stable if all the eigenvalues of A had the same sign, and the method
is used along the corresponding direction. In a more general case where A has
eigenvalues of different signs we must first decompose the system into left-going
and right-going fields, which can only be done if we have a strongly hyperbolic
system (i.e. the matrix A has a complete set of eigenvectors, see Chapter 5). In
such a case we separate the eigenvalues of A according to sign by defining
λ+
a = max(λa , 0) , λ−
a = min(λa , 0) , (9.7.14)
and constructing the matrices of positive and negative eigenvalues
Λ+ = diag(λ+
a , 0) , Λ− = diag(λ−
a , 0) . (9.7.15)
A stable upwind method would then take the form
un+1
m = unm − ρ R Λ+ R−1 unm − unm−1 + R Λ− R−1 unm+1 − unm , (9.7.16)
with R the matrix of column eigenvectors of A. That is, each characteristic
field is differentiated along the upwind direction associated with the sign of its
corresponding characteristic speed.
The methods discussed here can in fact also be generalized to the case when
the coefficients of the matrix A are functions of space and time (but now we must
be careful with the Taylor expansion used to obtain the Lax–Wendroff scheme,
as we will pick up terms coming from derivatives of A).
while leaving the time dimension continuous. For concreteness, let us assume
that we have a single scalar partial differential equation of the form
∂t u = S(u) , (9.8.1)
Using Runge–Kutta we can now rewrite our method for solving the advection
equation as
u∗ = un + ∆t S(un )/2 ,
(9.8.6)
un+1 = un + ∆t S(u∗ ) .
which after some algebra becomes
vρ n v 2 ρ2 n
un+1
m = unm − um+1 − unm−1 + um+2 − 2unm + unm−2 , (9.8.7)
2 8
where as before ρ = ∆t/∆x. We can call this method MOL-RK2 for short. If
we now compare the last expression with the Lax–Wendroff method discussed
in the previous Section, equation (9.7.7), we see that it has essentially the same
structure except for the fact that the correction term has now been approximated
using twice the grid spacing. Unfortunately, a von Newmann stability analysis
shows that the MOL-RK2 method just described is unstable for any value of ρ,
so it is useless in practice.
A stable method is obtained if we use fourth order Runge–Kutta [180] instead,
which corresponds to first recursively calculating the quantities
k1 = S(un ) , (9.8.8)
n
k2 = S(u + k1 ∆t/2) , (9.8.9)
k3 = S(un + k2 ∆t/2) , (9.8.10)
k4 = S(un + k3 ∆t) , (9.8.11)
and then taking
∆t
un+1 = un + (k1 + 2k2 + 2k3 + k4 ) . (9.8.12)
6
Notice that fourth order Runge–Kutta requires four evaluations of the sources
to advance one time step. We can call the method of lines obtained in this way
MOL-RK4. Since this method gives us fourth order accuracy in time, it is natural
to use it with fourth order spatial differencing as well.
Another very common choice for the time integration in the method of lines is
the iterative Crank–Nicholson (ICN) scheme. The idea behind this method is to
use an iterative scheme to approach the solution of the standard implicit Crank–
Nicholson scheme described in the last Section (equation (9.7.11)). Viewed as a
method of lines, the iteration can be described as follows
with N ≥ 2. The method takes one initial forward Euler step, uses this to
calculate an approximation to the source in the next time level, and then iterates
342 NUMERICAL METHODS
using the average of the source in the old time level and the latest approximation
to the new time level. It is clear that if the iterations converge in the limit
N → ∞, we will recover the implicit Crank–Nicholson scheme:
un+1 − un 1
N
(∆t)k
un+1 = un + S k (un ) . (9.8.17)
2k−1
k=1
Also, for linear source functions the iterations can be written in an entirely
equivalent way as
∆t
ū∗(l) = un + S(ū∗(l−1) ) , l = 1, ..., N − 1, (9.8.18)
2
un+1 = un + ∆t S(ū∗(N −1) ) , (9.8.19)
with u∗(l) and ū∗(l) related through ū∗(l) = (un + u∗(l) ). The method now takes
a series of half steps and a final full step. Viewed in this way we can see that the
case N = 2 is in fact identical to second order Runge–Kutta. Notice that the two
different versions of ICN are only equivalent for linear equations; for non-linear
equations this equivalence is lost and since there is no a priori reason to prefer
one version over the other it is a matter of personal choice which version is used
in a given code.
For linear systems, it is also possible to show that the ICN method has the
following important properties [14, 283]:
1. In order to obtain a stable scheme we must take at least three steps, that is
N ≥ 3. The case N = 2 is enough to achieve second order accuracy, but it
is unstable (it is equivalent to second order Runge–Kutta). Stability in fact
comes in pairs in terms of the number of steps: 1 and 2 steps are unstable,
3 and 4 are stable, 5 and 6 are unstable, and so on (see e.g. [283]).
2. The iterative scheme itself is only convergent if the standard CFL stability
condition is satisfied, otherwise the iterations diverge [14].
These two results taken together imply that there is no reason (at least from
the point of view of stability) to ever do more than three ICN steps. Three steps
are already second order accurate, and provide us with a scheme that is stable
as long as the CFL condition is satisfied.99 Increasing the number of iterations
will not improve the stability properties of the scheme any further. In particular,
99 Some authors call the N = 3 method a “two-iteration” ICN scheme, since we do an initial
Euler step and then iterate twice. This is identical to the “three-step” scheme discussed here.
9.9 ARTIFICIAL DISSIPATION AND VISCOSITY 343
we will never achieve the unconditional stability properties of the full implicit
Crank–Nicholson scheme, since if we violate the CFL condition the iterations
will diverge.
Three-step ICN became a workhorse method in numerical relativity for a
number of years. Recently, however, the demand for higher accuracy has seen
ICN replaced by fourth order Runge–Kutta with fourth order spatial differencing.
Still, for applications that do not require very high accuracy, three-step ICN has
proven to be a very robust method.
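For comparison, a minimal sketch of the N-step ICN update described above, written with the same interface as the RK4 example; the function name is again only an assumption:

def icn_step(u, S, dt, nsteps=3):
    # N-step iterative Crank-Nicholson: one forward Euler step followed by
    # corrections that average the source between the old time level and the
    # latest approximation; nsteps = 3 is the standard (stable) choice.
    s_old = S(u)
    u_new = u + dt * s_old
    for _ in range(nsteps - 1):
        u_new = u + 0.5 * dt * (s_old + S(u_new))
    return u_new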
9.9 Artificial dissipation and viscosity

Consider a simple finite difference scheme of the form

u^{n+1}_m = u^n_m + Δt S(u^n_m) ,   (9.9.1)

with S(u^n) some spatial finite difference operator. Let us now modify this scheme by adding a term of the form

u^{n+1}_m = u^n_m + Δt S(u^n_m) − ε (−1)^N (Δt/Δx) Δ^{2N}_x (u^n_m) ,   (9.9.2)

with ε > 0, N ≥ 1 an integer, and where Δ^{2N}_x := (Δ^+_x Δ^−_x)^N is the centered difference operator of order 2N. These Δ^{2N}_x operators appear in the finite difference
approximations to the high-order spatial derivatives ∂_x^{2N} u, and are very easy to construct by using the corresponding coefficients of the binomial expansion with alternating signs, for example

Δ^2_x (u_m) = u_{m+1} − 2u_m + u_{m−1} ,
Δ^4_x (u_m) = u_{m+2} − 4u_{m+1} + 6u_m − 4u_{m−1} + u_{m−2} .

Fig. 9.6: Dissipation factor λ as a function of the wave number kΔx for the case when εΔt/Δx = 1/2^{2N} and N = 1, 2, 3. Notice how, as N increases, the dissipation factor approaches a step function at the Nyquist wavelength kΔx = π.
The factor (−1)^N in (9.9.2) guarantees that the extra term is dissipative if we take ε > 0. Now, assume for the moment that S(u^n_m) = 0; our finite difference scheme then becomes

u^{n+1}_m = u^n_m − ε (−1)^N (Δt/Δx) Δ^{2N}_x (u^n_m) .   (9.9.6)
A von Neumann stability analysis shows that this scheme is stable as long as

ε Δt/Δx ≤ 1/2^{2N−1} .   (9.9.7)

Moreover, if we choose ε such that ε Δt/Δx = 1/2^{2N}, then the dissipation factor approaches a step function in Fourier space as N increases, so that it damps very strongly modes with wavelengths close to the grid spacing Δx, and leaves longer wavelengths essentially unaffected (see Figure 9.6).
Going back to the full finite difference scheme (9.9.2), we find that the stability now depends on the explicit form of the operator S, but in general the extra term still introduces dissipation and has the effect of improving the stability of the original scheme for small positive values of ε. To see how the dissipative term affects the accuracy of the scheme, notice that (9.9.2) can be rewritten as

(u^{n+1}_m − u^n_m)/Δt = S(u^n_m) − ε (−1)^N (1/Δx) Δ^{2N}_x (u^n_m) .   (9.9.8)

Since Δ^{2N}_x (u_m) ≃ (Δx)^{2N} ∂_x^{2N} u, to lowest order this corresponds to the modified differential equation

∂_t u = S(u) − ε (−1)^N (Δx)^{2N−1} ∂_x^{2N} u .   (9.9.9)
We then see that we have added a new term to the original differential equation
that vanishes in the continuum limit as (∆x)2N −1 . In other words, if the original
numerical scheme has an accuracy of order m, we would need to use a dissipative
term such that 2N − 1 ≥ m to be sure that we have not spoiled the accuracy
of the scheme. We typically say that the “order” of the dissipative term is given
by the number of derivatives in it, that is 2N , so that if our original scheme
was second order accurate we would need to add fourth order dissipation to it
(4 − 1 = 3 > 2), and if we had a fourth order scheme we must add sixth order
dissipation (6 − 1 = 5 > 4). Higher order dissipative terms clearly have problems
when we get very close to the boundaries since we don’t have enough points to
apply the dissipative operators. This means that, close to the boundaries, we
must either not use artificial dissipation, or we have to use one-sided dissipative
operators.
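As an illustration, here is a sketch of how such a dissipative term can be added to the right-hand side of an evolution equation in the interior of the grid (only N = 1 and N = 2 are included; the points next to the boundaries are simply left untouched, corresponding to the first option mentioned above):

import numpy as np

def add_dissipation(rhs, u, eps, dx, N=2):
    # Add -eps (-1)^N (1/dx) Delta_x^{2N} u to the right-hand side, cf. (9.9.8).
    d = np.zeros_like(u)
    if N == 1:
        d[1:-1] = u[2:] - 2.0 * u[1:-1] + u[:-2]
    elif N == 2:
        d[2:-2] = (u[4:] - 4.0 * u[3:-1] + 6.0 * u[2:-2]
                   - 4.0 * u[1:-3] + u[:-4])
    else:
        raise NotImplementedError("only N = 1, 2 implemented in this sketch")
    return rhs - eps * (-1.0) ** N * d / dx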
The idea of artificial viscosity, originally introduced by von Neumann and Richtmyer in the study of shock waves in hydrodynamics, is to add a viscous contribution Π to the pressure wherever it appears in the hydrodynamic equations:

p → p + Π .   (9.9.10)

This already guarantees that energy conservation will be maintained as the viscous term will transform kinetic energy into internal energy. In the original work of von Neumann and Richtmyer, the artificial viscosity term was taken to be of the form
Π₁ = c₁ ρ (Δx ∂_x v)² ,   (9.9.11)
where c1 > 0 is a constant coefficient, ρ is the density of the fluid at a given grid
point, and v is the flow speed. This form of Π is motivated by considering the loss
of kinetic energy when two fluid particles have a completely inelastic collision.
A different type of artificial viscosity that vanishes less rapidly for small changes
in v, and has the effect of reducing unphysical oscillations behind the shock, has
the form
Π2 = c2 ρ cs ∆x |∂x v| , (9.9.12)
with c2 > 0 again a constant coefficient, and where cs is the local speed of sound.
A more generic form of artificial viscosity will then consist of adding together
both types of terms:
Π = c₁ ρ (Δx ∂_x v)² + c₂ ρ c_s Δx |∂_x v| .   (9.9.13)
The values of the constants c₁ and c₂ are adjusted for each simulation, with c₂ typically an order of magnitude smaller than c₁ (notice that the linear viscosity term introduces a truncation error of order Δx, while the error associated with the quadratic term has order (Δx)² instead). Also, in order to make sure that
viscosity is only added in regions where shocks are likely to form, the coefficients c₁ and c₂ are usually taken to be different from zero only for regions where the flow lines are converging, that is ∂_x v < 0, and are set to zero otherwise.
100 The oscillations associated with such Gibbs phenomena are equivalent to those that arise in the Fourier expansion of a discontinuous function and have the same origin. As the resolution is increased their wavelength becomes smaller and they appear closer and closer to the shock front, but their amplitude remains essentially the same.
The simple form of artificial viscosity just described has been modified over
the years, and considerably more sophisticated versions have been proposed,
though the basic idea has remained (see for example the recent textbook by
Wilson and Mathews [300] for the use of artificial viscosity in relativistic hydro-
dynamical simulations). Here, however, we will not go into the details of those
more advanced artificial viscosity techniques and will refer the interested reader
to any standard textbook on computational fluid dynamics.
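A sketch of the combined viscosity term (9.9.13), switched on only where the flow lines converge as described above; the centered approximation to ∂_x v through numpy.gradient and the default values of c₁ and c₂ are assumptions made for the example:

import numpy as np

def artificial_viscosity(rho, v, cs, dx, c1=2.0, c2=0.2):
    # Pi = c1 rho (dx dv)^2 + c2 rho cs dx |dv|, applied only where dv < 0.
    dvdx = np.gradient(v, dx)
    Pi = c1 * rho * (dx * dvdx) ** 2 + c2 * rho * cs * dx * np.abs(dvdx)
    return np.where(dvdx < 0.0, Pi, 0.0)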
9.10 High resolution schemes

A standard model problem for the study of shock formation is Burgers' equation,

∂_t u + u ∂_x u = 0 .   (9.10.1)

A natural first attempt at a finite difference approximation is the one-sided (upwind) scheme

u^{n+1}_m = u^n_m − (Δt/Δx) u^n_m (u^n_m − u^n_{m−1}) .   (9.10.2)

This method is good for smooth solutions, and stable as long as u > 0 and Δt/Δx ≤ 1. However, it will not converge to the correct solution of the Riemann problem for discontinuous initial data. What we find in practice is that
even though the numerical solution looks qualitatively correct, with a propa-
gating discontinuity smoothed out over a few grid points because of numerical
dissipation, the discontinuity in fact moves at the wrong speed, and this does
not improve with resolution.
Fortunately, there is a simple way to solve this problem. We must use a
numerical method that is written in conservation form. That is, our numerical
method must have the form
u^{n+1}_m = u^n_m − (Δt/Δx) [ F^{n+1/2}_{m+1/2} − F^{n+1/2}_{m−1/2} ] ,   (9.10.3)
with F^{n+1/2}_{m+1/2} a function that depends on the values of u_m and u_{m+1} (plus maybe
a few more nearest neighbors), but that has the same functional form as we move
from one grid point to the next. The function F is known as the numerical flux
function. Equation (9.10.3) can be understood as the numerical analog of the
integral form of the conservation law if we think of unm not as the value of u(x, t)
at a single grid point, but rather as the average over the grid cell.
We will usually obtain numerical methods in conservation form if we start
from the conservative form of the differential equation. For Burgers' equation we
can start from
∂t u + ∂x u2 /2 = 0 , (9.10.4)
which can easily be seen to be equivalent to (9.10.1). This leads to the following
upwind method
u^{n+1}_m = u^n_m − (Δt/2Δx) [ (u^n_m)² − (u^n_{m−1})² ] ,   (9.10.5)

where we have taken F^{n+1/2}_{m+1/2} = (u^n_m)²/2. With this method we now find that
discontinuities propagate at the correct speed.
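A sketch of one step of this conservative upwind scheme, assuming u > 0 everywhere and keeping the first grid point fixed as a crude boundary condition:

import numpy as np

def burgers_step_conservative(u, dt, dx):
    # Conservation form (9.10.3) with flux F_{m+1/2} = (u_m)^2/2, i.e. (9.10.5).
    F = 0.5 * u ** 2
    unew = u.copy()                       # unew[0] left unchanged (inflow boundary)
    unew[1:] = u[1:] - (dt / dx) * (F[1:] - F[:-1])
    return unew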
This brings us to the Lax–Wendroff theorem (1960): For hyperbolic systems
of conservation laws, a numerical approximation written in conservation form,
if it converges, will converge to a weak solution of the original system.
Fig. 9.7: Approximation of a function in piecewise constant form over the grid cells. The
boundaries of the individual cells are halfway between the grid points. With Godunov’s
method we solve Riemann problems at each of the cell boundaries.
With Godunov's method we consider the numerical data at time t^n as a piecewise constant function over the grid cells (see Figure 9.7), and we solve exactly the Riemann problems defined at each cell interface for a time Δt small enough that the waves coming from neighboring Riemann problems do not intersect each other (this is essentially the CFL condition). Finally, we use this solution to compute the numerical flux function in the following way
F^{n+1/2}_{m+1/2} = (1/Δt) ∫_t^{t+Δt} f( ũ^n(x_{m+1/2}, t) ) dt ,   (9.10.6)
with f (u) the exact flux function. Notice that this integral is in fact trivial since,
by construction, ũn is constant along the line x = xm+1/2 . If we denote this
constant value by u∗ (unm , unm+1 ) we will have
F^{n+1/2}_{m+1/2} = f( u*(u^n_m, u^n_{m+1}) ) .   (9.10.7)
The problem now is to compute u∗ (unm , unm+1 ): We know it is constant, but what
is its value? In order to find u∗ we must first determine the full wave structure
of our system of equations and solve the Riemann problem.
For a system of linear conservation laws the Riemann problem is in fact very
easy to solve and we find that Godunov’s method reduces to the upwind method
that takes into account the speed of propagation of each characteristic field.
For a general non-linear scalar conservation law, on the other hand, Godunov’s
method reduces to
F_{m+1/2} = min_{u_l ≤ u ≤ u_r} f(u)   if u_l ≤ u_r ,
F_{m+1/2} = max_{u_l ≥ u ≥ u_r} f(u)   if u_l ≥ u_r .   (9.10.8)
For systems of non-linear conservation laws we must solve the full Riemann
problem first. This can be very difficult in practice, so we often use approximate
Riemann solvers.
Godunov’s method, though quite elegant, does have one important disad-
vantage: Because of the piecewise constant form of the function ũn , the method
is only first order accurate (as can also be seen from the fact that, for linear
systems, it reduces to upwind).
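For a convex flux such as that of Burgers' equation, f(u) = u²/2, the minimization and maximization in (9.10.8) can be carried out explicitly; a sketch:

def godunov_flux_burgers(ul, ur):
    # Godunov flux (9.10.8) for f(u) = u^2/2; the minimum over [ul, ur] is
    # zero whenever the interval contains the sonic point u = 0.
    f = lambda u: 0.5 * u ** 2
    if ul <= ur:
        return 0.0 if ul <= 0.0 <= ur else min(f(ul), f(ur))
    return max(f(ul), f(ur))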
Fig. 9.8: Different limiter function φ(θ): The solid line corresponds to Sweby’s minmod
limiter, the dashed line to Van Leer’s limiter, and the dotted line to Roe’s superbee
limiter.
F^{n+1/2}_{m+1/2} = v u^n_m + (v/2) (1 − vρ) (u^n_{m+1} − u^n_m) φ(θ) ,   (9.10.13)
that is, we have multiplied the second term with the limiter function φ. Notice
that for φ = 1 we recover the Lax–Wendroff flux, while for φ = 0 the expression
reduces to the upwind flux.
How do we choose the form of φ? There are in fact many possibilities that
have been suggested in the literature. For example, Beam and Warming propose
taking φ(θ) = θ, though this is not really a flux limiter and only differs from Lax–
Wendroff in the fact that the dissipative term is not centered but is upwinded
instead. Other more interesting limiter functions are the following (recall that θ denotes the ratio of consecutive slopes, θ = (u^n_m − u^n_{m−1})/(u^n_{m+1} − u^n_m)):

minmod limiter:   φ(θ) = minmod(1, θ) ,   (9.10.14)
van Leer limiter:   φ(θ) = (θ + |θ|)/(1 + |θ|) ,   (9.10.15)
superbee limiter:   φ(θ) = max[ 0, min(1, 2θ), min(2, θ) ] .   (9.10.16)
Notice that all three limiters set φ(θ) = 0 for θ < 0, which implies that at
extrema the scheme will become only first order accurate. Also, all these limiters
are constructed in such a way as to guarantee that the total variation of the
numerical solution, defined as
T V(u) := Σ_m |u_m − u_{m−1}| ,   (9.10.17)

does not increase with time (such schemes are called total variation diminishing, or TVD).
A closely related approach is that of slope limiter methods, where instead of modifying the numerical flux we reconstruct the function inside each grid cell as a piecewise linear function,

ũ^n(x) = u^n_m + σ^n_m (x − x_m) ,   (9.10.18)

with σ_m a slope to be determined in terms of the data. For a linear equation the
obvious choice for this slope is
σ_m = (u_{m+1} − u_m)/Δx .   (9.10.19)
Now, in order to get a second order method we need to calculate the flux at the
point (xm + ∆x/2, tn + ∆t/2). For the advection equation we will have
F^{n+1/2}_{m+1/2} = v u(x_m + Δx/2, t_n + Δt/2)
       = v u(x_m + (Δx − vΔt)/2, t_n)
       = v ( u^n_m + σ^n_m (1 − vρ) Δx/2 )
       = v u^n_m + (v/2) (1 − vρ) (u^n_{m+1} − u^n_m) .   (9.10.20)
We then see that this choice of σm reduces to Lax–Wendroff, which as we know
is bad for discontinuous solutions.
A much better choice is to “limit” the value of the slope σm . For example,
we can choose the slope as
σ_m = (1/Δx) minmod( u_{m+1} − u_m , u_m − u_{m−1} )
    = [ (u_{m+1} − u_m)/Δx ] minmod(1, θ) ,   (9.10.21)
which for linear equations is equivalent to the minmod flux limiter. Similarly, we
can use a slope of the form
σ_m = (1/Δx) [ 2 (u_{m+1} − u_m)(u_m − u_{m−1}) / ( (u_{m+1} − u_m) + (u_m − u_{m−1}) ) ]
    = [ (u_{m+1} − u_m)/Δx ] [ 2θ/(1 + θ) ] ,   (9.10.22)
for θ > 0, and σm = 0 otherwise, which for linear equations reduces to van Leer’s
flux limiter.
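A sketch of one update of the advection equation (v > 0) using the minmod-limited slope (9.10.21) and the corresponding flux (9.10.20), with periodic boundaries assumed for simplicity:

import numpy as np

def minmod(a, b):
    # Returns the argument of smaller absolute value when a and b have the
    # same sign, and zero otherwise.
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def advection_minmod_step(u, v, dt, dx):
    c = v * dt / dx                                       # Courant number
    sigma = minmod(np.roll(u, -1) - u, u - np.roll(u, 1)) / dx
    F = v * (u + 0.5 * (1.0 - c) * sigma * dx)            # flux at m + 1/2
    return u - (dt / dx) * (F - np.roll(F, 1))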
There is also a family of limiter methods that relax the TVD condition in
order to increase accuracy near extrema (where TVD methods become only
first order accurate). These higher order methods are known as essentially non-
oscillatory (ENO) methods, and instead of asking for the total variation to be
non-increasing they allow it to increase as long as its growth is bounded by an
exponential that is independent of the initial data. This means that such methods
are total variation stable. However, we will not discuss ENO methods here.
9.11 Convergence testing

The basic tool for studying the behavior of a finite difference approximation as resolution is increased is the Richardson expansion, which assumes that the numerical solution u^Δ obtained with grid spacing Δ can be expanded as

u^Δ(t, x) = u(t, x) + Σ_{i≥1} e_i(t, x) Δ^i ,   (9.11.1)

where u(t, x) is the solution of the original differential equation, and the e_i(t, x) are error functions at different orders in Δ. For a first order accurate approximation we expect e₁ ≠ 0, for a second order approximation we expect e₁ = 0 and e₂ ≠ 0, etc.
Let us for a moment assume that we know the exact solution to the problem
(as in the case of the one-dimensional wave equation). To test the convergence
of our finite difference approximation we perform a calculation at two different
resolutions ∆1 and ∆2 , with r ≡ ∆1 /∆2 > 1, and calculate the solution error in
each case
ε_{Δ₁} = u − u^{Δ₁} ,   ε_{Δ₂} = u − u^{Δ₂} .   (9.11.2)
Notice that in practice this error can only be calculated at the points correspond-
ing to the numerical grid under consideration. We then find the r.m.s. norm of
each solution error and calculate the ratio of both norms
c(t) := ||ε_{Δ₁}|| / ||ε_{Δ₂}|| .   (9.11.3)
This ratio is clearly a function of time only, and is known as the convergence
factor (again, it can only be calculated at times when both solution errors are
defined). If we have a finite difference approximation of order n we can now use
the Richardson expansion to find that in the continuum limit the convergence
factor will become

lim_{Δ→0} c(t) = (Δ₁/Δ₂)^n ≡ r^n .   (9.11.4)
The convergence test is typically done taking r = 2, so that we expect c(t) ∼ 2n .
In practice we perform the calculation at several different resolutions and look
at the behavior of c(t) as the resolution increases. If it is close (or getting closer)
to the expected value we say that we are in the convergence regime.
Of course, the above procedure will only work if we know the exact solution. This is not true in most cases (after all, if it were true, we would not be using a numerical approximation), so the best thing that we can hope for is to show that the numerical solution converges at the expected rate as the resolution is increased, for example by comparing the relative errors between runs at three different resolutions.
Because we are taking norms of errors, the convergence test that we have just
described is known as a global convergence test. It is also possible to perform
so-called local convergence tests by simply plotting the relative errors u∆1 − u∆2
and u∆2 − u∆3 directly as functions of position in order to compare them by eye.
Since for an order n scheme the Richardson expansion implies that both these
errors will be proportional to en (t, x), we expect that if the relative errors are
rescaled appropriately they should lie approximately one on top of the other.
For example, if we have ∆1 /∆2 = ∆2 /∆3 = r, then after we multiply u∆2 − u∆3
with rn it should coincide (roughly) with u∆1 − u∆2 .
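A sketch of both the global convergence factor and the error estimate of equation (9.11.9) below, assuming that the three numerical solutions have already been interpolated onto a common (coarsest) grid:

import numpy as np

def convergence_factor(u1, u2, u3):
    # Global convergence factor c = ||u1 - u2|| / ||u2 - u3|| from three runs
    # with resolutions d1 > d2 > d3 and d1/d2 = d2/d3 = r; should approach r^n.
    num = np.sqrt(np.mean((u1 - u2) ** 2))
    den = np.sqrt(np.mean((u2 - u3) ** 2))
    return num / den

def error_estimate(u1, u2, r, n):
    # Estimate of the solution error on the finer grid, cf. equation (9.11.9).
    return (u1 - u2) / (r ** n - 1.0)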
Convergence testing not only allows us to verify that the errors are indeed
becoming smaller at the correct rate as the grid is refined, but it also allows us
to estimate the remaining solution error. Assume for example that we have an
order n scheme and two numerical calculations with resolutions ∆1 and ∆2 , then
the Richardson expansion implies that
u^{Δ₁} − u^{Δ₂} = e_n (Δ₁^n − Δ₂^n) + O(Δ^{n+1})
              = e_n Δ₂^n (r^n − 1) + O(Δ^{n+1})
              ≃ ε_{Δ₂} (r^n − 1) .   (9.11.8)
We can then estimate the solution error on our highest resolution grid as
ε_{Δ₂} ≃ (u^{Δ₁} − u^{Δ₂}) / (r^n − 1) .   (9.11.9)
This error estimate is accurate to at least order n+1, and often to order n+2, since for centered schemes there are only even powers of Δ in the Richardson expansion.
101 In some cases we in fact do know the exact solution, at least in part. For example, in the
3+1 formalism we know that the constraints must vanish, so we can check the convergence of
the constraints to zero.
From the Richardson expansion we also have u^{Δ₂} = u + e_n Δ₂^n + O(Δ^{n+1}), so that

u = u^{Δ₂} − (1/(r^n − 1)) (u^{Δ₁} − u^{Δ₂}) + O(Δ^{n+1}) ≃ (r^n u^{Δ₂} − u^{Δ₁}) / (r^n − 1) .   (9.11.11)
We then have an estimate of the final solution that is at least one order higher
than our original approximation. This estimate is known as the Richardson ex-
trapolation and can improve our solution significantly provided that we start
with an error that is already small.102
102 As with any type of extrapolation, Richardson extrapolation is not very good if the initial
error is very large, in which case it might even make things worse. So it should be used with
care.
103 Boyd goes so far as to define an “idiot” as someone who publishes a numerical calculation
10.1 Introduction
In this Chapter we will discuss the application of the techniques of numerical
relativity to some simple spacetimes. We will consider in turn three different
types of example, starting from 1+1 “toy” relativity, and moving on to the case
of spherical and axial symmetry. We will discuss some of the special problems
that arise in each case, such as the issue of gauge shocks in the 1+1 case, and the
regularization of the equations in both spherical and axial symmetry, and will
also give examples of numerical simulations of simple physical systems, such as
the evolution of Schwarzschild, the collapse of a scalar field, and the evolution
of a Brill wave spacetime.
10.2 Toy 1+1 relativity

As a first example, consider a 1+1 dimensional spacetime with a metric of the form

ds² = −α²(t, x) dt² + g(t, x) dx² ,   (10.2.1)

where for simplicity we have taken the shift vector to be zero. Notice that since we only have one spatial dimension, in this case the trace of the extrinsic curvature is simply given by K = K^x_x.104
The ADM evolution equations in the 1+1 case, together with the Bona–Masso
slicing condition, can be written in first order form as
∂t α = −α2 f K , (10.2.2)
∂t g = −2αgK , (10.2.3)
and

∂_t D_α + ∂_x (αf K) = 0 ,   (10.2.4)
∂_t D_g + ∂_x (2αK) = 0 ,   (10.2.5)
∂_t K + ∂_x (αD_α/g) = α ( K² − D_α D_g/2g ) .   (10.2.6)

104 This means, in particular, that maximal slicing is trivial since setting K = K^x_x = 0 in vacuum immediately implies α = 1, so that we do not even have gauge dynamics.
Now, in Chapter 5 it was shown that the ADM evolution equations are not
strongly hyperbolic in the 3+1 case, even when using the Bona–Masso slicing
condition. However, this is no longer true in the case of 1+1 and we find that
the matrix M defined above has eigenvalues
λ₀ = 0 ,   λ± = ±α (f/g)^{1/2} .   (10.2.10)
Since the eigenvalues are clearly real for f > 0, and the two eigenvectors are
linearly independent, the system (10.2.8) is in fact strongly hyperbolic. The
eigenfunctions of the system are defined by (see Chapter 5)
ω = R⁻¹ v ,   (10.2.12)
ω0 = Dα /f − Dg /2 , ω± = K̃ ± Dα /f 1/2 , (10.2.13)
Inverting these relations we find

K̃ = (ω₊ + ω₋)/2 ,   (10.2.14)
D_α = f^{1/2} (ω₊ − ω₋)/2 ,   (10.2.15)
D_g = (ω₊ − ω₋)/f^{1/2} − 2ω₀ .   (10.2.16)
It is important to notice that with the eigenfunctions scaled as above, their
evolution equations also turn out to be conservative and have the simple form
∂_t ω + ∂_x (Λ ω) = 0 ,   (10.2.17)
with Λ := diag{λ_i}. If, however, the eigenfunctions are rescaled as ω̃_i = F_i(α, g) ω_i, then the evolution equations for the ω̃_i will in general no longer be conservative, and non-trivial sources will be present. The important point is
that there is in fact one normalization in which the equations are conservative.
10.2.1 Gauge shocks
Although the 1+1 evolution equations derived above are strongly hyperbolic
and therefore well-posed, it turns out that in the general case they can give rise
to singular solutions (i.e. gauge pathologies). These singular solutions are very
similar to the shock waves that develop in hydrodynamics, and because of this
they are known as gauge shocks [2, 3, 4, 19].
There are at least two different ways in which we can see how gauge shocks
develop. Let us first consider the evolution equations for the traveling eigenfunc-
tions ω± :
∂t ω± + ∂x (λ± ω± ) = 0 . (10.2.18)
We can rewrite these equations as
∂t ω± + λ± ∂x ω± = −ω± ∂x λ± . (10.2.19)
Using now the expressions for λ±, and denoting f′ ≡ df/dα, we find

∂_x λ± = ∓ (αf^{1/2}/2g^{3/2}) ∂_x g ± (f^{1/2}/g^{1/2}) (1 + αf′/2f) ∂_x α
       = λ± [ (1 + αf′/2f) D_α − D_g/2 ]
       = λ± [ (f − 1 + αf′/2) (ω₊ − ω₋)/(2f^{1/2}) + ω₀ ] ,   (10.2.20)

and finally

∂_t ω± + λ± ∂_x ω± = λ± ω± [ (1 − f − αf′/2) (ω₊ − ω₋)/(2f^{1/2}) − ω₀ ] .   (10.2.21)
Assume now that we are in a region such that ω0 = ω− = 0. It is clear that
ω0 will not be excited since it does not evolve, nor will ω− be excited since from
the equation above we see that all its sources vanish. The evolution equation for
ω+ then simplifies to
∂_t ω₊ + λ₊ ∂_x ω₊ = (λ₊/2f^{1/2}) (1 − f − αf′/2) ω₊² .   (10.2.22)
But this equation implies that ω+ will blow up along its characteristics unless
the term in parenthesis vanishes:
1 − f − αf′/2 = 0 .   (10.2.23)
We will call this the shock avoiding condition. This condition can in fact be easily
integrated to give
f (α) = 1 + κ/α2 , (10.2.24)
with κ an arbitrary constant. Notice that harmonic slicing corresponds to f = 1, which is of this form, but f = const. ≠ 1 is not. Notice also that 1+log slicing, for which f = 2/α, though not of the form (10.2.24), nevertheless satisfies
condition (10.2.23) when α = 1 (we will come back to the case of 1+log slicing
below).
In some cases it is even possible to predict exactly when a blow up will
occur. In order to see this we first need to define the rescaled eigenfunctions
Ω± := αω± /g 1/2 . For their evolution equations we now find
∂_t Ω± + λ± ∂_x Ω± = (1 − f − αf′/2) Ω±²/2
                   + (1 − f + αf′/2) Ω± Ω∓/2 .   (10.2.25)
Notice that with this new scaling, all contributions from ω0 to the sources have
disappeared. If we now assume that we have initial data such that Ω− = 0, then
the evolution equation for Ω+ simplifies to
∂_t Ω₊ + λ₊ ∂_x Ω₊ = (1 − f − αf′/2) Ω₊²/2 ,   (10.2.26)
which can be rewritten as
dΩ₊/dt = (1 − f − αf′/2) Ω₊²/2 ,   (10.2.27)
with d/dt the derivative along the characteristic. The last equation has a very
important property: For constant f the coefficient of the quadratic source term
is itself also constant. In that case the equation can be easily integrated to find (assuming f ≠ 1)

Ω₊ = Ω₊⁰ / [ 1 − (1 − f) Ω₊⁰ t/2 ] ,   (10.2.28)
where Ω0+ ≡ Ω+ (t = 0). The solution will then blow up at a finite time given by
t∗ = 2/[(1 − f ) Ω0+ ]. Clearly, this time will be in the future if (1 − f ) Ω0+ > 0,
otherwise it will be in the past. Since, in general, Ω₊⁰ will not be constant in space, the first blow-up will occur at the time

t* = 2 / max_x [ (1 − f) Ω₊⁰(x) ] .   (10.2.29)
This shows that blow-ups can easily develop when we use the Bona–Masso
slicing condition with a function f (α) such that (10.2.23) does not hold. Of
course, this does not imply that such blow-ups will always develop, since their
appearance will clearly depend on the specific form of the initial data being used,
but it does show that we should not be terribly surprised if they do.
There is another way in which we can see how these blow-ups develop that
provides us with a more direct geometrical interpretation of the problem. Con-
sider now the evolution of the eigenspeeds themselves along their corresponding
characteristic lines. From (10.2.10) we find that
∂_t λ± = ±∂_t ( αf^{1/2}/g^{1/2} ) ,   (10.2.30)
α = 1 + ε ,   (10.2.35)

with ε ≪ 1. Notice that the limit above applies to situations that are close to
flat space, but generally does not apply to strong field regions (like the region
close to a black hole) where the lapse can be expected to be very different from 1.
However, in such regions considerations about singularity avoidance are probably
more important. Our aim is to find slicing conditions that can avoid singularities
in strong field regions, and at the same time do not have a tendency to generate
shocks in weak field regions.
We can now expand f in terms of ε as

f = a₀ + 2(1 − a₀) ε + O(ε²)
  = (3a₀ − 2) + 2(1 − a₀) α + O(ε²) .   (10.2.39)
Here we must remember that (10.2.39) is just an expansion for small ε. Any form of the function f(α) that has the same expansion to first order in ε will also satisfy condition (10.2.23) to lowest order. For example, one family of such
functions emerges if we ask for f (α) to have the form
f = p₀ / (1 + q₁ ε) .   (10.2.40)
It is not difficult to show that for this to have an expansion of the form (10.2.39)
we must ask for
p 0 = a0 , q1 = 2 (a0 − 1) /a0 , (10.2.41)
which implies
f = a₀² / [ a₀ + 2(a₀ − 1) ε ] = a₀² / [ (2 − a₀) + 2(a₀ − 1) α ] .   (10.2.42)
In particular, choosing a₀ = 2 in (10.2.42) we find

f = 2/α ,   (10.2.43)
which corresponds to a member of the 1+log family. The crucial observation here
is that, as already mentioned in Chapter 4, this specific member of the 1+log
family is precisely the one that has been found empirically to be very robust
with h some profile function that decays rapidly (e.g. a Gaussian function). It is
then not difficult to show that if we keep x = xM as our spatial coordinate, the
spatial metric and extrinsic curvature turn out to be
Assume now that we want to have waves traveling to the right (the opposite
situation can be done in an analogous way). This means that we want ω− = 0,
which in turn implies that
D_α = √f K̃ = −√f h″/g .   (10.2.47)
In the particular case when f is a constant the above equation can be easily
integrated to find
α = [ (1 − h′)/(1 + h′) ]^{√f/2} .   (10.2.49)
From this we can reconstruct the rescaled eigenfunction Ω+ , and use equa-
tion (10.2.29) to predict the time of shock formation given a specific form of
h(x). In all the simulations shown here, the profile function h has been chosen
to be a simple Gaussian of the form
h(x) = e^{−x²/10} .   (10.2.50)
But do notice that even if we keep h(x) always the same, changing the value of
f will change the initial lapse (10.2.49).
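A sketch of how this initial lapse can be constructed numerically from the Gaussian profile (10.2.50), using the analytic derivative h′ of the profile and a constant f; the grid parameters are arbitrary choices made for the example:

import numpy as np

x = np.linspace(-15.0, 15.0, 4001)
f = 0.5                                      # constant gauge function f
h = np.exp(-x ** 2 / 10.0)                   # profile function (10.2.50)
hp = -(x / 5.0) * h                          # h'(x), computed analytically
alpha = ((1.0 - hp) / (1.0 + hp)) ** (np.sqrt(f) / 2.0)   # initial lapse (10.2.49)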
The evolution code used for the simulations is based on a method of lines
integration with three-step iterative Crank–Nicholson. For the spatial differenti-
ation the code uses a slope limiter method of van Leer’s type (see Chapter 9).
No artificial dissipation is introduced. Since the slope limiter method requires
information about the direction of propagation, the code evolves the eigenfields
(ω0 , ω± ) directly, and the variables (Dα , Dg , K̃) are later reconstructed from
these.
Consider first an evolution with f = 0.5. Figure 10.1 shows the numerical
evolution of the eigenfield ω+ and the lapse function α, using a resolution of
∆x = 0.003125 and a Courant factor of ∆t/∆x = 0.5. For the initial data used
here, according to (10.2.29) a blow-up is expected at time T* ≈ 9.98. The plot
shows both the initial data (dotted lines) and the numerical solution at time
t = 10 (solid lines), just after the expected blow-up. We clearly see how the
eigenfield ω+ has developed a large spike at the center of the pulse, while the
lapse has developed a corresponding sharp gradient (the numerical solution for
ω+ does not in fact become infinite because the numerical method has some
inherent dissipation). Notice also that after t = 10 the pulse has propagated a distance of ∼ 7, in accordance with the fact that the characteristic speed in this case is λ = α (f/g)^{1/2} ∼ √f ≃ 0.707.
Fig. 10.1: Evolution of the eigenfield ω+ and the lapse function α for the case where
f = 0.5. The dotted lines show the initial data, and the solid lines the solution at
t = 10, just after the expected blow-up.
Fig. 10.2: Evolution of the eigenfield ω+ and the lapse function α for the case where
f = 1.5. The dotted lines show the initial data, and the solid lines the solution at
t = 21, just after the expected blow-up.
Next, consider the case where f = 1.5. For the same profile function h(x) as
before, equation (10.2.29) now predicts a blow-up at time T ∗ = 20.84. Figure 10.2
shows the numerical evolution of ω+ and α in this case, using the same resolution
and Courant parameter as before. The plot again shows both the initial data
(dotted lines), and the numerical solution at time t = 21 (solid lines), just after
the expected blow-up. Notice that ω+ has now developed two large spikes at the
front and back of the pulse, with corresponding sharp gradients developing in
the lapse function α. The sign of the spikes in ω+ has also been reversed. Notice
also that in this case the pulse has traveled a distance of ∼ 25 after t = 21, corresponding to a characteristic speed of λ ∼ √f ≃ 1.22.
We then see that for both cases with f = const. ≠ 1, gauge shocks develop as
expected. For completeness, Figure 10.3 shows a simulation of the same initial
data, but now using harmonic slicing corresponding to f = 1. The plot shows
Fig. 10.3: Evolution of the eigenfield ω+ and the lapse function α for the case of har-
monic slicing, corresponding to f = 1.0. The dotted lines show the initial data, and the
solid lines the solution at t = 25.
the solution after t = 25. One can clearly see that the pulse now propagates with
a speed ∼ 1, maintaining its initial profile so that no shock forms.
A more interesting example corresponds to the case where we choose the func-
tion f (α) as a member of the shock avoiding family (10.2.24): f = 1 + κ/α2 . How-
ever, since f is now not constant, if we want to have a purely right-propagating
pulse we can not take a lapse of the form (10.2.49) anymore. Fortunately, it turns
out that in this case we can still integrate equation (10.2.48) exactly. One finds
α = (1/2) [ C ( (1 − h′)/(1 + h′) )^{1/2} − (κ/C) ( (1 + h′)/(1 − h′) )^{1/2} ] ,   (10.2.51)
with C an integration constant. If we now ask for the lapse to become 1 far away, then this constant must take the value C = 1 + √(1 + κ).
Figure 10.4 shows a simulation for a shock avoiding slicing with κ = 0.5. Since α ∼ 1, in this case we expect the pulse to propagate at a speed of λ ∼ √f = √(1 + κ) ≃ 1.22, i.e. essentially the same speed as in the case f = 1.5.
From the figure we see that the pulse propagates maintaining its initial profile,
showing that this form of f does indeed avoid the formation of gauge shocks.
At this point we could worry about the fact that even if the simulations
show sharp gradients and large spikes, this does not in itself prove that a true
singularity has developed at the predicted time. In fact, since the numerical
method has some inherent dissipation, we would not expect to see the true blow-
up numerically (moreover, the slope limiter method is designed precisely so that
such large gradients can be handled without problems). However, there is at least
one way to show numerically that a true blow-up is developing at a specific time:
Since after the blow-up the differential equation itself makes no sense anymore,
we should not expect the numerical solution to converge after this time.
In order to see this we can consider, for example, the convergence of the
constraint Cα := Dα − ∂x ln α. Numerically this constraint will not vanish, but
Fig. 10.4: Evolution of the eigenfield ω+ and the lapse function α for the case of a shock
avoiding slicing of the form (10.2.24) with κ = 0.5. The dotted lines show the initial
data, and the solid lines the solution at t = 21.
it should converge to zero as the resolution is increased. Define now the conver-
gence factor as the ratio of the r.m.s norm of Cα for a run at a given resolution
and another run at twice the resolution. Before the blow-up this convergence
factor should approach 4 (since the evolution code is second order accurate),
but after the blow-up it should drop below 1, indicating loss of convergence. As
an example of this, let us consider again the simulation with f = 0.5 discussed
above. Figure 10.5 shows a plot of the convergence factors as a function of time
using five different resolutions: ∆x = 0.05, 0.025, 0.0125, 0.00625, 0.003125. The
figure shows that as the resolution is increased the convergence factors approach
the expected value of 4 for t < 10 (remember that in this case a shock is expected at T* ≈ 9.98), but after that time they drop below 1, as expected. Moreover,
as the resolution is increased the convergence factor resembles more and more
a step function, corresponding to second order convergence before the blow-up,
and no convergence afterwards. The behavior of the convergence factor is very
similar for the case with f = 1.5, with the loss of convergence now centered
around t ∼ 21, as expected.
Before leaving this Section, there is one final point that should be addressed.
As we have already mentioned, since we are evolving a foliation of Minkowski
spacetime it is clear that the background geometry remains perfectly regular, so
when a gauge shock develops the only thing that can become pathological are
the spatial hypersurfaces that define the foliation. Figure 10.6 shows a compar-
ison of the initial hypersurface and the final hypersurface at t = 10, as seen in
Minkowski spacetime, using again the data from the same numerical simulation
with f = 0.5 discussed above (for an easier comparison the final slice has been
moved back in time so that it lies on top of the initial slice at the boundaries).
The hypersurfaces are reconstructed by explicitly keeping track of the position of
the normal observers in the original Minkowski coordinates during the evolution.
Notice how the initial slice is very smooth (it has a Gaussian profile), while the
Fig. 10.5: Convergence of the constraint Cα for the simulation with f = 0.5 done at the
five different resolutions: ∆x = 0.05, 0.025, 0.0125, 0.00625, 0.003125. As the resolution
increases, the convergence factors approach the expected value of 4 for t < 10, but after
that time they drop sharply indicating loss of convergence.
final slice has developed a sharp kink. This shows that the formation of a gauge
shock indicates that the hypersurface, though still spacelike everywhere, is no
longer smooth (its derivative is now discontinuous).
Fig. 10.6: Initial hypersurface (dotted line) and final hypersurface at t = 10 (solid line),
as seen in Minkowski coordinates, for the simulation with f = 0.5 discussed in the text.
The final slice has been moved back in time so that it lies on top of the initial slice at
the boundaries. The initial slice is smooth, while the final slice has a sharp kink.
10.3.1 Regularization
Let us start by writing the general form of the spatial metric in spherical sym-
metry as
dl2 = A(r, t)dr2 + r2 B(r, t) dΩ2 , (10.3.1)
with A and B positive metric functions, and dΩ2 = dθ2 + sin2 θdϕ2 the solid
angle element. Notice that we have already factored out the r2 dependency from
the angular metric functions. This has the advantage of making explicit the
dependency on r of these geometric quantities, and making the regularization of
the resulting equations easier.
As we will deal with the Einstein equations in first order form, we will intro-
duce the auxiliary quantities:
DA := ∂r ln A , DB := ∂r ln B . (10.3.2)
In order to simplify the equations, we will also work with the mixed components of the extrinsic curvature, K_A := K^r_r , K_B := K^θ_θ = K^φ_φ.
Before writing down the 3+1 evolution equations in spherical symmetry, we
must first remember that the spherical coordinates introduced above are in fact
singular at the origin, and this coordinate singularity can be the source of serious
numerical problems caused by the lack of regularity of the geometric variables
there. The problem arises because of the presence of terms in the evolution
equations that go as 1/r near the origin. The regularity of the metric guarantees
the exact cancellations of such terms at the origin, thus ensuring well behaved
solutions. This exact cancellation, however, though certainly true for analytical
solutions, usually fails to hold for numerical solutions. One then finds that the
1/r terms do not cancel and the numerical solution becomes ill-behaved near
r = 0: It not only fails to converge at the origin, but it can also easily turn out
to be violently unstable after just a few time steps.
The usual way to deal with the regularity problem is to use the so-called areal
(or radial) gauge, where the radial coordinate r is chosen in such a way that the
proper area of spheres of constant r is always 4πr2 (so that B = 1 during the
whole evolution). If, moreover, we also choose a vanishing shift vector we end
up in the so-called polar-areal gauge [48, 97], for which the lapse is forced to
satisfy a certain ordinary differential equation in r. The name polar comes from
the fact that, for this gauge choice, there is only one non-zero component of the
extrinsic curvature tensor, namely Krr [48]. In the polar-areal gauge the problem
of achieving the exact cancellation of the 1/r terms is reduced to imposing the
boundary condition A = 1 at r = 0, which can be easily done if we solve for
A directly from the Hamiltonian constraint (which in this case reduces to an
ordinary differential equation in r), and ignore its evolution equation. If we do
this in vacuum, we end up inevitably with Minkowski spacetime in the usual co-
ordinates (we can also recover Schwarzschild by working in isotropic coordinates
and factoring out the conformal factor analytically).
The main drawback of the standard approach is that the gauge choice has
been completely exhausted. In particular, the polar-areal gauge can not penetrate
apparent horizons, since inside an apparent horizon it is impossible to keep the
areas of spheres fixed without a non-trivial shift vector.105 One would then like
to have a way of dealing with the regularity issue that allows more generic
gauge choices to be made. Because of this I will discuss here a more general
regularization procedure originally introduced in [18, 29].
There are in fact two different types of regularity conditions that the variables
{A, B, DA , DB , KA , KB } must satisfy at r = 0. The first type of conditions are
simply those imposed by the requirement that the different variables should be
well defined at the origin, and imply the following behavior for small r:
A ∼ A⁰ + O(r²) ,   B ∼ B⁰ + O(r²) ,
D_A ∼ O(r) ,   D_B ∼ O(r) ,
K_A ∼ K_A⁰ + O(r²) ,   K_B ∼ K_B⁰ + O(r²) ,

with {A⁰, B⁰, K_A⁰, K_B⁰} in general functions of time. These regularity conditions
are in fact quite easy to implement numerically. For example, we can use a finite
differencing grid that staggers the origin, and then obtain boundary data on
a fictitious point at r = −∆r/2 by demanding for {A, B, KA , KB } to be even
functions at r = 0, and for {DA , DB } to be odd.
105 The polar-areal gauge has in fact been used successfully in many situations, particularly in
the study of critical collapse to a black hole, where the presence of the black hole is identified
by the familiar “collapse of the lapse” even if no apparent horizon can be found [98].
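A minimal sketch of how the parity conditions described above can be imposed on a grid with a single ghost point at r = −Δr/2 (array index 0), with the first physical point at r = +Δr/2 (index 1); the function name and array layout are assumptions made for the example:

def apply_origin_parity(A, B, KA, KB, DA, DB):
    # Even parity at r = 0 for the metric and extrinsic curvature variables,
    # odd parity for the spatial derivatives DA and DB.
    for even in (A, B, KA, KB):
        even[0] = even[1]
    for odd in (DA, DB):
        odd[0] = -odd[1]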
In terms of these variables, the ADM evolution equations in spherical symmetry take the form

∂_t A = −2αAK_A ,   (10.3.3)
∂t B = −2αBKB , (10.3.4)
∂t DA = −2α[KA Dα + ∂r KA ] , (10.3.5)
∂t DB = −2α[KB Dα + ∂r KB ] , (10.3.6)
∂_t K_A = −(α/A) [ ∂_r(D_α + D_B) + D_α² − D_α D_A/2 + D_B²/2 − D_A D_B/2
         − A K_A (K_A + 2K_B) − (1/r)(D_A − 2D_B) ] + 4πα M_A ,   (10.3.7)

∂_t K_B = −(α/2A) [ ∂_r D_B + D_α D_B + D_B² − D_A D_B/2 − (1/r)(D_A − 2D_α − 4D_B)
         − 2(A − B)/(r²B) ] + αK_B (K_A + 2K_B) + 4πα M_B ,   (10.3.8)
where, as before, we have introduced Dα := ∂r ln α, and where MA and MB are
matter terms given by
M_A = 2S_B − S_A − ρ ,   M_B = S_A − ρ ,   (10.3.9)

with ρ the energy density of the matter, S_ij the stress tensor, and where we have defined S_A := S^r_r , S_B := S^θ_θ. On the other hand, the Hamiltonian and
momentum constraints take the form
H = −∂_r D_B + (1/r²B)(A − B) + A K_B (2K_A + K_B)
  + (1/r)(D_A − 3D_B) + D_A D_B/2 − 3D_B²/4 − 8πAρ = 0 ,   (10.3.10)

M = −∂_r K_B + (K_A − K_B)(1/r + D_B/2) − 4πj_A = 0 ,   (10.3.11)
where jA := jr is just the momentum density of matter in the radial direction.
Since {Dα , DA , DB } go as r near the origin, terms of the type D/r are in fact
regular and represent no problem. However, both in the Hamiltonian constraint,
and in the evolution equation for KB there is a term of the form (A − B)/r2 ,
while in the momentum constraint there is a term of the form (KA −KB )/r, and,
given the behavior of these variables near the origin, these terms would seem to
blow up. The reason why this does not in fact happen is that, near the origin,
we must also ask for the extra regularity conditions

A − B ∼ O(r²) ,   K_A − K_B ∼ O(r²) ,   (10.3.12)

or in other words A⁰ = B⁰ and K_A⁰ = K_B⁰. These conditions arise as a consequence of the fact that space must remain locally flat at r = 0. The local flatness condition implies that near r = 0 it must be possible to write the metric as

dl²|_{r∼0} = dR² + R² dΩ² ,   (10.3.13)

with R a radial coordinate that measures proper distance from the origin. A local transformation of coordinates from R to r takes the metric into the form
dl²|_{r∼0} = (dR/dr)²|_{r=0} ( dr² + r² dΩ² ) ,   (10.3.14)
which implies that A⁰ = B⁰ and, since this must hold for all time, also that K_A⁰ = K_B⁰.
Implementing numerically both the symmetry regularity conditions and the
local flatness regularity conditions at the same time is not entirely trivial. The
reason for this is that at r = 0 we now have three boundary conditions for just
two variables, i.e. both the derivatives of A and B must vanish, plus A and
B must be equal to each other (and similarly for KA and KB ). The boundary
conditions for the exact equations are also over-determined, but in that case the
consistency of the equations implies that if they are satisfied initially they remain
satisfied for all time. In the numerical case, however, this is not true owing to
truncation errors, and very rapidly one of the three boundary conditions fails
to hold. Notice also that, from the above equations, we can easily see why the
polar-areal gauge has no serious regularity problem. In that gauge we have B = 1
by construction. If we now impose the boundary condition A(r = 0) = 1, and
solve for A(r) by integrating the Hamiltonian constraint (ignoring the evolution
equations), then the (A − B)/r2 term causes no trouble.
There are several different ways in which we can solve the regularity problem.
When we discuss axial symmetry in the following sections we will introduce a very
general way to deal with the regularity issue, but here we will follow a somewhat
simpler approach. We start by introducing an auxiliary variable defined as [18, 29]
λ := (1/r) ( 1 − A/B ) .   (10.3.15)
Local flatness then implies that the variable λ has the following behavior near
the origin
λ ∼ O(r) . (10.3.16)
In terms of λ, the problematic term in the Hamiltonian constraint can be rewritten as (A − B)/(r²B) = −λ/r, so that the constraint becomes

H = −∂_r D_B − λ/r + A K_B (2K_A + K_B)
  + (1/r)(D_A − 3D_B) + D_A D_B/2 − 3D_B²/4 − 8πAρ .   (10.3.18)

We must also provide an evolution equation for λ itself; from its definition and the evolution equations for A and B we find

∂_t λ = (2αA/B) (K_A − K_B)/r .   (10.3.19)
Unfortunately, this last equation presents us with a new problem since it clearly
has the dangerous term (KA − KB )/r, but this term can be removed with the
help of the momentum constraint (10.3.11) to find
∂_t λ = (2αA/B) [ ∂_r K_B − (D_B/2)(K_A − K_B) + 4πj_A ] ,   (10.3.20)
10.3.2 Hyperbolicity
Having found a regular version of the ADM evolution equations, the next step
is to construct a strongly hyperbolic evolution system. As we have already dis-
cussed in Chapter 5, the 3+1 ADM evolution equations are generally only weakly
hyperbolic and are therefore not well-posed. However, in our discussion on 1+1
we mentioned the fact that, in that simple case, the ADM equations are in fact
strongly hyperbolic. It turns out that in the case of spherical symmetry some-
thing similar happens, and the ADM equations are already strongly hyperbolic
in most cases, though not in all, and the one exception is so important that the
system must still be modified.
It is clear that for the hyperbolicity analysis we still need to say something
about the lapse function, and as usual we will choose the Bona–Masso slicing
condition
∂t α = −α2 f (α)K = −α2 f (α)(KA + 2KB ) , (10.3.21)
which implies
∂t Dα = −∂r [αf (α)(KA + 2KB )] . (10.3.22)
Consider then the evolution system for the variables (Dα , DA , DB , KA , KB )
given by equations (10.3.22) and (10.3.5)–(10.3.8). Notice that for the hyperbol-
icity analysis of this system the regularity issue is not relevant.
One finds that in general the system as it stands is strongly hyperbolic, with
the following characteristic structure:
• There is one eigenfield with eigenspeed λ = 0 given by
However, as can be clearly seen from the above expressions, there is one par-
ticular case where hyperbolicity fails. If we choose harmonic slicing, corresponding to f = 1, then the eigenfields w^f_± become ill-defined. In fact, by multiplying w^f_± with f − 1, we can see that for f = 1 the eigenfields w^f_± and w^l_± become proportional to each other, so that we do not have a complete set anymore and the
system is only weakly hyperbolic. Since harmonic slicing is such an important
gauge condition, having a system that fails to be strongly hyperbolic in that case
is clearly unacceptable.
There are many different ways in which we can modify the evolution equations
to obtain a strongly hyperbolic system for all f > 0. For example, one possibility
would be to use the BSSNOK system adapted to the special case of spherical
symmetry (this would probably be a very good choice, particularly when dealing
with black hole spacetimes). However, for simplicity we will consider here a much
simpler alternative. We start by making a change of variables, so that instead
of using D_A and K_A as fundamental variables, we take D̃ = D_A − 2D_B and
K = KA + 2KB . We then rewrite the evolution equations in terms of the new
variables, and use the Hamiltonian and momentum constraints to eliminate the
with F^r_i = M_ij(u) ∂_r [ f_j(u) B/2αA ]. Using now the fact that

∂_r λ = −(1/r) [ λ + (A/B)(D_A − D_B) ] ,   (10.3.39)

we finally find

∂_t v_i = M_ij(u) ∂_r v_j + p_i(u, v) + λ ( F^r_i − F^t_i )
        − [ M_ij(u) f_j(u) B / (2αAr) ] [ λ + (A/B)(D_A − D_B) ] .   (10.3.40)
This last system is now regular, and has precisely the same characteristic struc-
ture as the original system. What we have done is transform the original evolution
equations for the vi variables into evolution equations for the new vi variables,
for which the principal part is the same and the source terms are regular. Notice
that typically only some of the fi (u) will be different from zero, so we do not
need to transform all variables.
In the particular case of the strongly hyperbolic system introduced above we have only used the momentum constraint to modify the evolution equation for D̃, so that this variable must be replaced with

Ũ := D̃ − 4Bλ/A .   (10.3.41)

The evolution equation for Ũ then becomes

∂_t Ũ = −2α [ ∂_r K + D_α (K − 4K_B) − 2 (K − 3K_B)(D_B − 2λB/A) + 16πj_A ] .   (10.3.42)
As always, we must start with the choice of initial data. It is clear that simply
taking the Schwarzschild metric at t = 0 in standard coordinates is not a good
choice since this metric is singular at the horizon. A much better choice is to use
isotropic coordinates. In such coordinates the spatial metric has the form (see
equation (6.2.3))
dl2 = ψ 4 dr2 + r2 dΩ2 , (10.3.43)
with the conformal factor given by ψ = 1 + M/2r, and where r is related to the standard "areal" Schwarzschild radius through r_Schwar = rψ² = r (1 + M/2r)².
As we have already mentioned in Chapter 6, there are several different ways in
which we could choose to deal with the singularity at r = 0 (remember that this
is a coordinate singularity associated with the compactification of infinity on the
other side of the Einstein–Rosen bridge, and not the physical singularity). For
example, we could place a boundary at the throat, which at t = 0 coincides with
the horizon and is located at r = M/2, and use an isometry boundary condition.
Alternatively, we could excise the black hole interior. For simplicity, however, we
will use here the static puncture evolution technique (see Section 6.3). We then start by extracting analytically the singular conformal factor and defining new dynamical variables as Ã := A/ψ⁴ and B̃ := B/ψ⁴, together with D̃_A := ∂_r ln Ã and D̃_B := ∂_r ln B̃ (notice that the variables K_A and K_B are not rescaled). We now rewrite the ADM
equations, or any equivalent strongly hyperbolic reformulation, in terms of the
new variables. In terms of our rescaled variables, the initial data for Schwarzschild
is simply:
à = B̃ = 1 , D̃A = D̃B = 0 . (10.3.46)
As the Schwarzschild metric is static, and the shift vector is zero in isotropic
coordinates, the initial extrinsic curvature is just
KA = KB = 0 . (10.3.47)
We still have to choose the gauge condition. For the shift vector we simply
choose to keep it equal to zero, while for the lapse we will use either 1+log slicing,
which corresponds to equation (10.3.21) with f = 2/α, or maximal slicing (see
equation (4.2.8)), which in this case reduces to:
(1/Ãψ⁴) [ ∂²_r α + ( 2/r + D̃_B − D̃_A/2 + 2∂_r ln ψ ) ∂_r α ] = K_A² + 2K_B² .   (10.3.48)
completely static situation, it does not penetrate inside the black hole horizon.
Another possibility is to ask for:
∂r α|r=0 = 0 . (10.3.51)
This no longer results in the Schwarzschild lapse and gives us a dynamical situ-
ation (even if the dynamics are just a result of the gauge choice). Also, this case
does penetrate the black hole horizon. This is in fact the choice we will use in
the numerical simulations shown below.
A third possibility is to ask for the lapse to be symmetric at the throat of
the wormhole at r = M/2. This choice also results in dynamical evolution and
penetrates the horizon. In the case of a puncture evolution the throat is not a
natural boundary of the computational domain, and this makes the symmetric
lapse hard to use. However, if instead of a puncture evolution we choose to locate
a boundary at the throat and impose isometry conditions, then the symmetric
lapse becomes a natural choice. Notice that, as was already discussed in Sec-
tion 4.2.3, the maximal slicing equation can in fact be solved analytically in the
case of Schwarzschild, so we know what to expect from a numerical simulation.
Fig. 10.7: Evolution of the lapse function α for Schwarzschild using maximal slicing.
The value of α is shown every t = 1M .
Fig. 10.8: Evolution of the conformal metric variable à for Schwarzschild using maximal
slicing. The value of the metric is shown every t = 1M .
the fact that the normal observers at different distances from the black hole fall
with different accelerations, so the distance between them increases (remember
that since the shift is zero in this simulation, our coordinates are tied to these
observers).
From the figures it is clear that, with the gauge conditions that we have
chosen, the Schwarzschild spacetime does not appear static. This can be seen even
more dramatically if we study the position of the horizon during the evolution.
Figure 10.9 shows the radius of the black hole horizon rh as a function of time.
Fig. 10.10: The area of the horizon remains constant during the evolution, with a value
close to ah = 16π ∼ 50.2.
Fig. 10.11: Evolution of the lapse function α for Schwarzschild using 1+log slicing. The
value of α is shown every t = 1M .
Although the simulations shown below will be restricted to the case of a massless scalar field, for completeness we will consider here the general expressions for a scalar field with an arbitrary self-interaction potential V(φ). We start from the stress-energy
tensor for the scalar field which is given by
T_μν = ∇_μφ ∇_νφ − (g_μν/2) ( ∇^αφ ∇_αφ + 2V ) ,   (10.3.52)
with gµν the spacetime metric. Notice that for a scalar field whose self-interaction
includes only a mass term we will have V (φ) = m2 φ2 /2, while for the massless
case we have V (φ) = 0. Using the spherical metric (10.3.1) we find that
ρ = n^μ n^ν T_μν = (1/2A) ( Π²/B² + Ψ² ) + V ,   (10.3.53)
j_A = −n^μ T_μr = −ΠΨ / (A^{1/2} B) ,   (10.3.54)
S_A = γ^{rr} T_rr = (1/2A) ( Π²/B² + Ψ² ) − V ,   (10.3.55)
S_B = γ^{θθ} T_θθ = (1/2A) ( Π²/B² − Ψ² ) − V ,   (10.3.56)
2A B 2
where we have defined
Π := (A1/2 B/α) ∂t φ , Ψ := ∂r φ . (10.3.57)
Having found the form of the matter terms that appear in the 3+1 evolution
equations, let us go back to the issue of the evolution equation for the scalar field
itself. Starting from the conservation law ∇ν T µν = 0, we can easily show that
the scalar field must evolve through the Klein–Gordon equation:

□φ = ∂_φ V   ⇒   ∂_μ [ (−g)^{1/2} ∂^μ φ ] = (−g)^{1/2} ∂_φ V ,   (10.3.58)
with g the determinant of gµν , which is given in terms of the lapse and the
determinant of the spatial metric as g = −α2 γ. In the particular case of spherical
symmetry the Klein–Gordon equation reduces to
∂_t Π = (1/r²) ∂_r [ (αBr²/A^{1/2}) Ψ ] − αA^{1/2} B ∂_φ V .   (10.3.59)
Notice that this only gives us an evolution equation for Π. The evolution equa-
tion for Ψ, on the other hand, can be obtained from the fact that the partial
derivatives of φ commute. The final system of equations is then
∂_t φ = (α / A^{1/2}B) Π ,   (10.3.60)
∂_t Ψ = ∂_r [ (α / A^{1/2}B) Π ] ,   (10.3.61)
∂_t Π = (1/r²) ∂_r [ (αBr²/A^{1/2}) Ψ ] − αA^{1/2} B ∂_φ V .   (10.3.62)
This system of evolution equations couldn’t look simpler. However, its ap-
proximation in terms of finite differences presents us with a beautiful example of
how the naive use of simple centered spatial differences can sometimes fail to be
consistent at places where the coordinate system becomes singular, such as the
point r = 0 in our spherical coordinates. The problem comes from the evolution
equation for Π. Consider a general term of the form
T = (1/r²) ∂_r [ r² f(r) ] ,   (10.3.63)
with f (r) some arbitrary smooth function of r that for small r behaves as f = ar.
This implies that close to the origin we will have T = 3a. Now consider a centered
finite difference approximation to T , and assume for simplicity that the grid
staggers the origin so that r0 = −∆r/2, r1 = ∆r/2, r2 = 3∆r/2, etc. (similar
results are obtained if the grid is not staggered, but we then have to do something
special at the point r = 0). At the point r1 = ∆r/2 we will have
T₁ = (1/r₁²) (r₂²f₂ − r₀²f₀)/(2Δr) = 7a .   (10.3.64)
But this is clearly very different from the expected value of T = 3a. This means
that, close to the origin, the finite difference approximation has a serious problem,
and even though we can expect this problem to be confined to a very small region
around r = 0, it can still make the entire numerical scheme go unstable.
What has gone wrong? The problem can be easily understood if we no-
tice that, since our finite difference approximation is second order, the trun-
cation error should have the form τ∆ ∼ (∆r)2 ∂r2 T . But close to the origin we
have, to leading order, ∂r2 T ∼ 4f (r)/r3 = 4a/r2 , so the truncation error becomes
τ∆ ∼ a(∆r/r)2 . Now, close to the origin we also have r ∼ ∆r, which implies that
the truncation error remains finite regardless of how small ∆r becomes, so the
finite difference approximation is inconsistent!
Fortunately, there is a well known trick that can fix this problem [127]. Notice
first that, quite generally,
(1/r²) ∂_r = 3 ∂_{r³} .   (10.3.65)
Let us then define T = 3 ∂r3 (r2 f (r)), and consider the centered finite difference
approximation to this new expression. At the point r1 = ∆r/2 we will have
T₁ = 3 (r₂²f₂ − r₀²f₀) / (r₂³ − r₀³) = 3a .   (10.3.66)
We now find the correct value at the origin. Since this trick is consistent for
all r, we can in fact use the above approximation everywhere, even for large
r. Similar problems can also arise when finite differencing the 3+1 evolution equations of general relativity, so we must always be careful about the consistency of our finite difference approximations at places where the coordinate system becomes singular.
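The inconsistency of the naive operator (10.3.64) and the consistency of the rewritten operator (10.3.66) are easy to verify directly; a short numerical check with f(r) = a r on a staggered grid (the specific numbers are arbitrary choices):

import numpy as np

a, dr = 1.0, 0.01
r = (np.arange(6) - 0.5) * dr          # staggered grid: r_0 = -dr/2, r_1 = dr/2, ...
f = a * r
m = 1                                  # the point r_1 = dr/2
T_naive = (r[m+1]**2 * f[m+1] - r[m-1]**2 * f[m-1]) / (2.0 * dr) / r[m]**2
T_fixed = 3.0 * (r[m+1]**2 * f[m+1] - r[m-1]**2 * f[m-1]) / (r[m+1]**3 - r[m-1]**3)
# T_naive evaluates to 7a at r_1 (inconsistent), T_fixed to the correct value 3a.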
This equation is solved numerically for A using the fact that, for our initial
data, ρ = Ψ2 /2A. For our simulations we solve the above equation with a second
order Runge–Kutta method. The integration is done starting from the origin and
taking as boundary condition A(r = 0) = 1.
We will consider first a simulation with a scalar field amplitude of a = 0.001.
This amplitude is in fact already very large, but not quite large enough to cause
the scalar field to collapse to form a black hole. Figure 10.12 shows three snap-
shots of the evolution of the scalar field φ in this case: The solid line shows the
initial data, the dashed line the solution after t = 5, and the dotted line the
solution after t = 20. From the figure we see that the initial pulse first separates
into two smaller pulses traveling in opposite directions, as expected. By t = 5,
the outward moving pulse has moved to r ∼ 10, while the inward moving pulse
has reached the origin. By t = 20, the outward moving pulse has moved to
Fig. 10.12: Evolution of the scalar field φ for an initial configuration with amplitude
a = 0.001.
Fig. 10.13: Central value of the lapse as a function of time for an initial scalar field
configuration with amplitude a = 0.001.
r ∼ 25, while the pulse that was originally moving inward has imploded through
the origin, changing sign in the process, and is now also moving outward having
reached r ∼ 14. The evolution then proceeds with both pulses moving out and
leaving flat space behind.
As we can see, the evolution of φ behaves much as we would expect for an
evolution in flat space, so we might be tempted to think that spacetime is almost
flat during the whole evolution. This is in fact not so, and there is a period when
spacetime has a large curvature as the inward moving pulse reaches the origin.
In order to see this, Figure 10.13 shows the value of the lapse at the origin as
a function of time. Notice that, just after the inward moving pulse reaches the
origin (t ∼ 5), the central value of the lapse drops to ∼ 0.6, indicating that
significant curvature has developed. However, since the scalar field density is not
large enough to collapse to a black hole, the central value of the lapse eventually
returns to one.
Fig. 10.14: Evolution of the scalar field φ for an initial configuration with amplitude
a = 0.002.
Fig. 10.15: Central value of the lapse as a function of time for an initial scalar field
configuration with amplitude a = 0.002.
Fig. 10.16: Coordinate radius and mass of the apparent horizon found for the simulation
with an initial scalar field amplitude of a = 0.002.
What has happened in this case is that the inward going pulse has in fact
collapsed to a black hole. This can again be seen more clearly by looking at
the evolution of the central value of the lapse function which is shown in Fig-
ure 10.15. Notice how, after a couple of bounces, the central lapse collapses to
zero, indicating the presence of a black hole. Since the lapse is now zero in the
central regions, the evolution stops there, freezing the scalar field configuration.
Of course, the collapse of the lapse, though a strong indicator, is not in
itself proof of the presence of a black hole. In order to be sure, we should look
for an apparent horizon, and indeed such an apparent horizon is found during
this simulation. Figure 10.16 shows the coordinate radius rAH and mass MAH
of the apparent
horizon during the evolution (the horizon mass is defined as
M_AH = (A_AH/16π)^{1/2}, with A_AH the horizon area). Notice how the horizon first
appears at t ∼ 9.5 with a coordinate radius of rAH ∼ 0.6, and after that the
horizon grows in coordinate space until at t = 20 it has a radius of rAH ∼ 1.5.
The horizon mass, on the other hand, starts at MAH ∼ 0.24 and rapidly stabilizes
at a value of MAH ∼ 0.27. Notice that for this data set the initial ADM mass
can be easily calculated and turns out to be MADM = 0.54. The fact that the
final black hole has half the initial ADM mass is to be expected, because the
piece of the initial pulse that started moving outward has managed to escape.
The ADM mass can be calculated by using, for example, the “Schwarzschild-
like mass” defined in equation (A.26) of Appendix A, and this gives the exact
ADM mass for a Schwarzschild spacetime at any radius. Since our initial scalar
field configuration is restricted to a small region, it is clear that outside this
region the spacetime will be Schwarzschild and this approach will give us the
correct ADM mass.
A second method for calculating the ADM mass of our spacetime can be
found by first defining a function m(r) that is related to the radial metric A(r)
through
390 EXAMPLES OF NUMERICAL SPACETIMES
Fig. 10.17: Mass function m(r) at t = 0 for the scalar field configuration with amplitude
a = 0.002.
A(r) = \frac{1}{1 - 2m(r)/r} \, ,   (10.3.69)
and then rewriting the Hamiltonian constraint at t = 0, equation (10.3.68), in
terms of m(r). Doing this we find that the Hamiltonian constraint simplifies to:
\partial_r m = 4\pi \rho\, r^2 .   (10.3.70)
Notice that, since at t = 0 we are using the areal radius (B = 1), in the vacuum
region we must recover the Schwarzschild metric, which implies that m(r) must
in fact be constant and equal to the total ADM mass. We can then find the ADM
mass by simply integrating the above equation up to a point where the scalar
field is negligible:
m(r) = \int_0^r 4\pi \rho\, r^2 \, dr \, , \qquad M_{ADM} = \lim_{r\to\infty} m(r) .   (10.3.71)
It is quite remarkable that the above integral is precisely the expression for the
Newtonian mass contained inside a sphere of radius r. This is a general result
in spherical symmetry as long as B = 1 and K_A = K_B = 0. In that case we
also find that m(r) is precisely the Schwarzschild-like mass (A.26), so that both
mass measures coincide for all r. Figure 10.17 shows a plot of the function m(r)
at t = 0. From the figure we can clearly see that m(r) increases monotonically
in the region where the scalar field is non-zero, and settles down to a constant
value of ∼ 0.54.
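A sketch of this construction, assuming the energy density ρ is available as an array on the radial grid (the Gaussian profile below is purely illustrative): integrate ∂_r m = 4πρr² with a cumulative trapezoidal rule and read off M_ADM where m(r) has leveled off.

import numpy as np

def mass_function(r, rho):
    """Integrate dm/dr = 4*pi*rho*r^2 with the trapezoidal rule, taking m(0) = 0."""
    integrand = 4.0 * np.pi * rho * r**2
    dm = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    return np.concatenate(([0.0], np.cumsum(dm)))

# illustrative density profile only; in practice rho comes from the initial data solver
r = np.linspace(0.0, 30.0, 3001)
rho = 1.0e-3 * np.exp(-(r - 5.0)**2)
m = mass_function(r, rho)
M_ADM = m[-1]          # m(r) is constant once rho has dropped to zero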
Simulations similar to those presented here, but focusing on the threshold
of black hole formation (i.e. the smallest value of the amplitude a for which a
black hole is formed), led Choptuik to the discovery of critical phenomena in
gravitational collapse [98]. We will not discuss these important results here, but
the interested reader can look at e.g. [153] and references therein.
10.4 AXIAL SYMMETRY
Spherical coordinates have the advantage of being well adapted to the asymptotic
boundary conditions, as well as to the propagation of gravitational waves, but
they make the regularity conditions more complicated since we have to worry
about regularity both at the origin r = 0, and at the axis of symmetry θ = 0.
Because of this, here we will consider only cylindrical coordinates, for which the
only regularity problems are associated with the axis of symmetry ρ = 0.
The next step is to consider the form of the spatial metric. In the case of
spherical symmetry it was in fact possible to choose the spatial metric as diago-
nal, with the two angular metric components proportional to each other so that
only two metric components were truly independent. One might then think that
such a simplification of the metric is also possible in axisymmetry. Unfortunately,
a little thought can convince us that this is not so in the general case, and that all
six spatial metric components must be considered independently. For example,
asymptotically the metric component γzϕ corresponds to the h× polarization of
gravitational waves, so it will in general not vanish. Also, there is no reason to
assume that the coordinate lines associated with z and ρ should be orthogonal,
so that γρz will in general not vanish. Finally, if there is angular momentum in
the spacetime there will be dragging of inertial frames so that even if initially
we have γρϕ = 0, this will not necessarily remain so during evolution as the ρ
coordinate lines can be dragged differentially around the axis of symmetry. In
summary, all three off-diagonal components of the metric will in general be non-
zero. Nevertheless, it is possible to impose the condition that γρϕ = 0 during the
entire evolution by choosing appropriately the shift vector component β ϕ (see
e.g. [48]), and this is often done in practice.
Here, however, we will follow a different route and simplify the metric by
considering only a restricted class of axisymmetric spacetimes, namely those
that have zero angular momentum and no odd-parity gravitational waves (i.e.
h× = 0), so that we can always take γzϕ = γρϕ = 0. In such a case the spatial
metric can be simplified to
dl^2 = \gamma_{\rho\rho}\, d\rho^2 + \gamma_{zz}\, dz^2 + 2\gamma_{\rho z}\, d\rho\, dz + \gamma_{\varphi\varphi}\, d\varphi^2
     \equiv A\, d\rho^2 + B\, dz^2 + 2\rho C\, d\rho\, dz + \rho^2 T\, d\varphi^2 \, ,   (10.4.5)
with the quantities (A, B, C, T ) functions of (ρ, z) only, and where we have ex-
plicitly extracted some factors of ρ in order to make the regularization procedure
somewhat simpler. We can also assume that β ϕ = 0, so that there are only two
non-zero shift components. Notice that with the above restrictions we can still
study a wide variety of interesting physical systems, such as the collapse of non-
rotating stars, and even non-trivial strong gravitational wave spacetimes with
even-parity such as the Brill waves that we will discuss in the following Section.
Let us now consider the regularity of the metric functions at the axis of
symmetry. Just as in spherical symmetry, there are two different types of reg-
ularity conditions. In the first place, axial symmetry implies that the metric
should remain unchanged under the transformation ρ → −ρ, which implies that
(A, B, C, T ) must all be even (and smooth) functions of ρ. As before, this can be
From the behavior of the different Cartesian metric components near the axis
we then see that
Going back to the definition of the functions (A, B, C, T ), we see that, close
to the axis, A and T are such that A − T ∼ O(ρ2 ), so that they can be written
in general as
A := H + \rho^2 J \, , \qquad T := H - \rho^2 J \, ,   (10.4.24)
where H and J are regular functions that are even in ρ. The expressions above
can be easily inverted to find
H := \frac{A+T}{2} \, , \qquad J := \frac{A-T}{2\rho^2} \, .   (10.4.25)
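On a grid staggered about the axis (ρ_i = (i + 1/2)∆ρ, in analogy with the radial grid used in spherical symmetry; the functions below are toy examples), the change of variables (10.4.24)–(10.4.25) can be applied pointwise without ever dividing by zero:

import numpy as np

drho, N = 0.05, 200
rho = (np.arange(N) + 0.5) * drho          # staggered in rho, so rho = 0 is never hit

# toy even functions with A - T ~ O(rho^2), as regularity on the axis requires
A = 1.0 + rho**2 * np.exp(-rho**2)
T = 1.0 - rho**2 * np.exp(-rho**2)

H = 0.5 * (A + T)                          # regular even function, eq. (10.4.25)
J = 0.5 * (A - T) / rho**2                 # regular even function, finite on the staggered grid

A_back = H + rho**2 * J                    # recover the original variables, eq. (10.4.24)
T_back = H - rho**2 * J
assert np.allclose(A, A_back) and np.allclose(T, T_back)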
Since the above relation between A and T must hold for all time, the components
of the extrinsic curvature must have a similar behavior, so that we can write
(again assuming no angular momentum and no odd-parity gravitational waves)
K_{ij} = \begin{pmatrix} K_A & \rho K_C & 0 \\ \rho K_C & K_B & 0 \\ 0 & 0 & \rho^2 K_T \end{pmatrix} ,   (10.4.26)
The term on the right hand side is now clearly regular since D/ρ ∼ 1 + O(ρ2 ).
The specific form of such terms depends on the details of the formulation being
used (i.e. ADM, BSSNOK, NOR, etc.), and the equations must be carefully
inspected before writing a numerical code (see e.g. [245]).
q\,|_{\rho=0} = 0 \, ,   (10.4.30)
\partial_\rho^n\, q\,|_{\rho=0} = 0 \quad \text{for odd } n \, ,   (10.4.31)
q\,|_{r\to\infty} = O\!\left(r^{-2}\right) ,   (10.4.32)
where r = \sqrt{\rho^2 + z^2}. In order to find the conformal factor Ψ, we first impose
the condition of time symmetry, which implies that the momentum constraints
are identically satisfied. The Hamiltonian constraint, on the other hand, takes
the form
D^2_{\rm flat} \Psi + \frac{1}{4} \left( \partial_\rho^2\, q + \partial_z^2\, q \right) \Psi = 0 \, ,   (10.4.33)
with D^2_{\rm flat} the flat space Laplacian. Since this is an elliptic equation we must
also say something about the boundary conditions. At infinity we must clearly
have Ψ = 1; however, since our computational domain is finite we must impose a
boundary condition at a finite distance. Notice that we expect that far away the
solution should approach a Schwarzschild spacetime, which implies that Ψ must
behave asymptotically as Ψ ∼ 1 + M/2r. This provides us with our boundary
condition for large r.
Once a function q has been chosen, all we need to do is solve the above elliptic
equation numerically for Ψ. Different forms of the function q have been used by
different authors [124, 125, 163, 267]. Here we will consider the one introduced
by Holz and collaborators in [163], which has the form
q = a \rho^2\, e^{-(\rho^2 + z^2)} \, ,   (10.4.34)
with a a constant that determines the initial amplitude of the wave (for small
a the waves should disperse to infinity, while for large a they should collapse to
form a black hole).
Before considering any evolutions, let us first discuss the solution of the initial
data, which involves solving the elliptic equation (10.4.33) numerically. In
Chapter 9 we did not discuss the solution of elliptic equations, but here we can
present a quick and dirty method that is very easy to code (it is also extremely
inefficient, so the reader should not use it for any serious applications).
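The solver itself is not reproduced in this excerpt, so the sketch below should not be read as the author's method; it is one possible "quick and dirty" stand-in: plain Jacobi relaxation of equation (10.4.33) on a uniform (ρ, z) grid staggered about ρ = 0, with even reflection across the axis and the cruder Dirichlet condition Ψ = 1 on the outer boundaries (instead of the Robin condition Ψ ≈ 1 + M/2r discussed above). It is extremely slow, but easy to check.

import numpy as np

a = 3.0                                    # Brill wave amplitude
drho = dz = 0.1
rho = (np.arange(100) + 0.5) * drho        # grid staggered about the axis rho = 0
z = np.arange(-100, 101) * dz              # boundaries at roughly rho, z = +-10
R, Z = np.meshgrid(rho, z, indexing="ij")

q = a * R**2 * np.exp(-(R**2 + Z**2))      # eq. (10.4.34)
d2q = (np.gradient(np.gradient(q, drho, axis=0), drho, axis=0) +
       np.gradient(np.gradient(q, dz, axis=1), dz, axis=1))
c = 0.25 * d2q                             # coefficient multiplying Psi in (10.4.33)

psi = np.ones_like(q)
diag = 2.0 / drho**2 + 2.0 / dz**2 - c
for it in range(20000):                    # plain Jacobi relaxation
    p = np.pad(psi, 1, mode="constant", constant_values=1.0)  # Psi -> 1 on outer boundaries
    p[0, 1:-1] = psi[0, :]                                    # even reflection across the axis
    new = ((p[2:, 1:-1] + p[:-2, 1:-1]) / drho**2 +
           (p[2:, 1:-1] - p[:-2, 1:-1]) / (2.0 * R * drho) +
           (p[1:-1, 2:] + p[1:-1, :-2]) / dz**2) / diag
    if np.max(np.abs(new - psi)) < 1.0e-8:
        psi = new
        break
    psi = new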
Table 10.1: ADM mass for Brill wave initial data with different amplitudes a.

   a     M_ADM
   1     0.0338 ± 0.0006
   2     0.126  ± 0.001
   3     0.270  ± 0.002
   4     0.459  ± 0.003
   5     0.698  ± 0.004
   6     0.991  ± 0.005
  10     2.91   ± 0.01
  12     4.67   ± 0.02
Fig. 10.18: Initial profile of the conformal factor Ψ for a Brill wave with amplitude
a = 3, along both the axis ρ = 0 and the equator z = 0.
Previous studies have shown that Brill waves of this type have an apparent
horizon already present in the initial data for a ≳ 12 [8, 267]. Incidentally, Brill
waves also provide a very nice example of a regular vacuum spacetime that
nevertheless has a non-zero (and positive) ADM mass. These configurations are
also extremely compact: notice that for a = 6 the mass is M ∼ 1 while the
“radius”, which can be estimated as the place where the function q drops to less
than 10% of its maximum value, is of order R ∼ 2.2. For an amplitude of a = 12
we find that the mass is M ∼ 4.7 while the radius remains the same, implying
that M/R ∼ 2, so we shouldn’t be surprised to find that an apparent horizon is
already present in this case.
Here we will only consider one example of a dynamical simulation for the
case of a Brill wave with amplitude a = 3. The initial data for Ψ can be seen in
Figure 10.18, which shows plots along both the axis ρ = 0 and the equator z = 0.
We will evolve this initial data using a 1+log slicing condition and vanishing
shift. The code used here employs second order finite differencing, and is based on
the Nagy–Ortiz–Reula (NOR) formulation discussed in Chapter 5, but adapted
to the case of curvilinear coordinates (see e.g. [245]).
Figure 10.19 shows a plot of the value of the lapse at the origin as a function of
time for three different resolutions ∆ρ = ∆z = (0.1, 0.05, 0.025), using a Courant
parameter of ∆t/∆ρ = 0.2, and with the boundaries located at ρ, z = ±10. The
first thing to notice is that at the lowest resolution of ∆ρ = 0.1 the central
value of the lapse is very different from the other two cases, indicating that this
resolution is still too low to give us a good idea of the correct solution (this
simulation in fact crashed at t ∼ 4.7). At the two higher resolutions, however,
the central value of the lapse already behaves quite similarly, with an initial drop
to a value of just below 0.5, and a later rise back toward one. Figure 10.20, on
the other hand, shows the root mean square of the Hamiltonian constraint as a
function of time for the same three resolutions. The Figure clearly shows that
the code is converging to second order, as expected.
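Since the constraint violation of a second order code should scale as (∆ρ)², the convergence order can be estimated directly from the r.m.s. values at successive resolutions; a minimal sketch (the numbers below are placeholders for illustration, not the values from the figure):

import numpy as np

# r.m.s. of the Hamiltonian constraint at resolutions drho, drho/2, drho/4
rms = np.array([4.0e-2, 1.0e-2, 2.5e-3])      # placeholder values

order = np.log2(rms[:-1] / rms[1:])           # should approach 2 for second order convergence
print(order)                                  # -> [2. 2.] for this idealized example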
Fig. 10.19: Time evolution of the central value of the lapse α for a Brill wave with am-
plitude a = 3, using three different resolutions. The solid line corresponds to ∆ρ = 0.1,
the dashed line to ∆ρ = 0.05, and the dotted line to ∆ρ = 0.025.
Fig. 10.20: Time evolution of the root mean square of the Hamiltonian constraint
for a Brill wave with amplitude a = 3 using three different resolutions. The solid
line corresponds to ∆ρ = 0.1, the dashed line to ∆ρ = 0.05, and the dotted line to
∆ρ = 0.025.
From the fact that the lapse drops to below 0.5 at the early stages of the
evolution we can conclude that a Brill wave with an amplitude of a = 3 is
already very strong. However, since the lapse later returns to 1, we see that
this wave is still not quite strong enough to collapse to a black hole. In fact,
previous numerical studies have determined that the threshold for black hole
Fig. 10.21: The Cartoon approach: The points on the y = 0 plane are evolved using
standard three-dimensional finite differencing. Those on the adjacent planes are then
obtained by rotating around the z axis and interpolating.
\rho' = \rho \, , \qquad \varphi' = \varphi + \varphi_0 \, , \qquad z' = z \, ,   (10.4.37)
106 In fact, the code described above already requires a very large Kreiss–Oliger dissipation
reference to the fact that we use a Cartesian code to do a two-dimensional simulation, that is,
Cartesian-2D or simply “Cartoon”.
APPENDIX A
TOTAL MASS AND MOMENTUM IN GENERAL RELATIVITY
Using now the Hamiltonian and momentum constraints of the 3+1 formalism,
equations (2.4.10) and (2.4.11), plus the fact that, in the weak field limit, the
extrinsic curvature Kij is itself a small quantity, we can rewrite this as (in all
following expressions sum over repeated indices is understood)
M = \frac{1}{16\pi} \int R \, dV \, , \qquad P_i = \frac{1}{8\pi} \int \partial_j \left( K_{ij} - \delta_{ij} K \right) dV \, ,   (A.2)
where R is the Ricci scalar of the spatial metric, which in the linearized theory
is given by
R = \partial_j \left( \partial_i h_{ij} - \partial_j h \right) ,   (A.3)
with h the trace of hij . Notice that both the mass and momentum are now
written as volume integrals of a divergence, so using Gauss’ theorem we can
rewrite them as
M = \frac{1}{16\pi} \oint_S \left( \partial_i h_{ij} - \partial_j h \right) dS^j \, , \qquad P_i = \frac{1}{8\pi} \oint_S \left( K_{ij} - \delta_{ij} K \right) dS^j \, ,   (A.4)
where the integrals are calculated over surfaces outside the matter sources, and
where γij = δij + hij is the spatial metric and dS i = si dA, with si the unit
outward-pointing normal vector to the surface and dA the area element. The
expressions above are called the ADM mass and ADM momentum of the space-
time (ADM stands for Arnowitt, Deser, and Misner [31]). Notice that the ADM
mass depends only on the behavior of the spatial metric, while the ADM mo-
mentum depends on the extrinsic curvature instead. We should stress the fact
that for these expressions to hold we must be working with quasi-Minkowski, i.e.
Cartesian type, coordinates.
In order to derive the ADM integrals above, we started from the weak field
theory and used Gauss’ theorem to transform the volume integrals into surface
integrals. In the case of strong gravitational fields, the total mass and momentum
can not be expected to be given directly by the volume integrals of the energy and
momentum densities, precisely because these integrals fail to take into account
the effect of the gravitational field. However, we in fact define the total mass and
momentum of an isolated system through its gravitational effects on faraway
test masses, and far away the weak field limit does hold, so the surface integrals
above will still give the correct values we physically associate with the total mass
and momentum, but they must be evaluated at infinity to guarantee that we are
in the weak field regime. We then define the ADM mass and momentum in the
general case as
M_{ADM} = \frac{1}{16\pi} \lim_{r\to\infty} \oint_S \left( \delta^{ij}\, \partial_i h_{jk} - \partial_k h \right) dS^k \, ,   (A.5)
P^i_{ADM} = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \left( K^i_{\ j} - \delta^i_{\ j} K \right) dS^j \, ,   (A.6)
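As a sanity check of (A.5), the sketch below evaluates the surface integral numerically for a time-symmetric Schwarzschild slice in isotropic coordinates, where γ_ij = ψ⁴ δ_ij with ψ = 1 + M/2r, so that h_ij = (ψ⁴ − 1) δ_ij (the extraction radius Rs and the angular resolution are arbitrary choices); the result should approach M as the integration radius grows.

import numpy as np

M, Rs = 1.0, 100.0                        # Schwarzschild mass and extraction radius
nth, nph = 200, 400
th = (np.arange(nth) + 0.5) * np.pi / nth
ph = (np.arange(nph) + 0.5) * 2.0 * np.pi / nph
TH, PH = np.meshgrid(th, ph, indexing="ij")
s = np.stack([np.sin(TH) * np.cos(PH),
              np.sin(TH) * np.sin(PH),
              np.cos(TH)])                # unit outward normal s^k on the sphere r = Rs

# isotropic Schwarzschild: psi = 1 + M/(2r), h_ij = (psi^4 - 1) delta_ij
psi = 1.0 + M / (2.0 * Rs)
dpsi = -M / (2.0 * Rs**2) * s             # Cartesian gradient of psi on the sphere

# integrand of (A.5): (d_i h_ij - d_j h) s^j, with d_i h_ij = d_j(psi^4) and d_j h = 3 d_j(psi^4)
integrand = np.einsum("k...,k...->...", 4.0 * psi**3 * dpsi - 12.0 * psi**3 * dpsi, s)

dOmega = (np.pi / nth) * (2.0 * np.pi / nph) * np.sin(TH)
M_ADM = np.sum(integrand * Rs**2 * dOmega) / (16.0 * np.pi)
print(M_ADM)      # ~ M (1 + 3M/(2 Rs)), which tends to M as Rs grows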
Using the same idea we can also construct a measure of the total angular
momentum J^i of the system starting from the density of angular momentum
given by ε_{ijk} x^j j^k, with x^i Cartesian coordinates (this is just \vec{r} \times \vec{j} in standard
three-dimensional vector notation). The ADM angular momentum then becomes
J^i_{ADM} = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \epsilon_{ijk}\, x^j \left( K_{kl} - \delta_{kl} K \right) dS^l \, .   (A.7)
Notice that, in fact, in the limit r → ∞, the term proportional to K can be
dropped since at spatial infinity the Cartesian vector \vec{x} is collinear with the
area element d\vec{S}, so that ε_{ijl} x^j dS^l = 0. The integral can then be rewritten as
J^i_{ADM} = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \epsilon_{ijk}\, x^j K_{kl} \, dS^l \, .   (A.8)
In particular, for the z component we find
J^z_{ADM} = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \epsilon_{zjk}\, x^j K_{kl} \, dS^l
          = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \varphi^k K_{kl} \, dS^l \, ,   (A.9)
with \vec{\varphi} the coordinate basis vector associated with the azimuthal angle \varphi.
There is a very important property of the ADM mass and momentum defined
above. Since these quantities are defined at spatial infinity i0 , and since any
gravitational waves taking energy and momentum away from the system will
instead reach null infinity J + , the ADM mass and momentum will remain
constant during the evolution of the system.
The expression for the ADM mass given in (A.5) has the disadvantage that it
is not covariant and must be calculated in Cartesian-type coordinates. This can
in fact be easily remedied. In an arbitrary curvilinear coordinate system define
the tensor h_{ij} := γ_{ij} − γ^0_{ij}, with γ^0_{ij} the metric of flat space expressed in the same
coordinates. It is easy to convince oneself that hij does indeed transform like a
coordinates. It is easy to convince oneself that hij does indeed transform like a
tensor. We can now rewrite the ADM mass as
M_{ADM} = \frac{1}{16\pi} \lim_{r\to\infty} \oint_S \left( D^0_j h_{jk} - D^0_k h \right) dS^k \, ,   (A.10)
where D^0_i is the covariant derivative with respect to the flat metric γ^0_{ij} in the
corresponding curvilinear coordinates, and where indices are raised and lowered
also with γ^0_{ij}. Notice that the tensor h_{ij} is not unique because we can still make
infinitesimal coordinate transformations that will change h_{ij} without changing
D^0_j h_{jk} − D^0_k h (gauge transformations in the context of linearized theory).
From the last result we can also obtain a particularly useful expression for
the ADM mass:
M_{ADM} = \frac{1}{8\pi} \lim_{r\to\infty} \oint_S \left( k - k^0 \right) dA \, ,   (A.11)
where dA is the area element of the surface S, k is the trace of the extrinsic
curvature of S, and k^0 is the trace of the extrinsic curvature of S when em-
bedded in flat space.108 It is important to mention that the surface S, when
embedded in flat space, must have the same intrinsic curvature as it did when
embedded in curved space. The derivation of this expression is not difficult and
can be obtained by considering an adapted coordinate system, with one of the
coordinates measuring proper distance along the normal direction and the other
two coordinates transported orthogonally off S; see for example [227] (in 3+1
terms, we would have a “lapse” equal to 1 and zero “shift”, but in this case the
orthogonal direction is spacelike and the surface is bi-dimensional).
108 This expression for the ADM mass is due to Katz, Lynden-Bell, and Israel [169] (see
also [227], but notice that our definition of extrinsic curvature has a sign opposite to that
reference).
In the case when the spatial metric is conformally flat, i.e. γij = ψ 4 δij , the
expression for the ADM mass simplifies considerably and reduces to [222]
M_{ADM} = -\frac{1}{2\pi} \lim_{r\to\infty} \oint_S \partial_j \psi \, dS^j \, ,   (A.12)
where we have assumed that far away ψ becomes unity.
covariant derivatives to the Riemann tensor (1.9.3) plus the fact that ξ µ is a
Killing field, and in the third line the Einstein field equations. If we now assume
that we have an isolated source and the cylindrical world-tube is outside it then
the above integral clearly vanishes. Equation (A.16) then implies that
\int_{\Sigma_1} \partial_\nu \left( |g|^{1/2}\, \nabla^\mu \xi^\nu \right) \hat{n}_\mu \, d^3x = \int_{\Sigma_2} \partial_\nu \left( |g|^{1/2}\, \nabla^\mu \xi^\nu \right) \hat{n}_\mu \, d^3x \, ,   (A.18)
This means that the spatial integral is in fact independent of the hypersurface
Σ chosen, or in other words, the spatial integral is a conserved quantity.
To proceed further we choose coordinates such that \hat{n}_\mu = \delta^0_\mu. Using the fact
that \nabla^\mu \xi^\nu is antisymmetric, and applying again Gauss’ theorem, the spatial
integral becomes
\int_\Sigma \partial_i \left( |g|^{1/2}\, \nabla^0 \xi^i \right) d^3x = \oint_{\partial\Sigma} |g|^{1/2}\, \nabla^0 \xi^i\, \hat{s}_i \, d^2x \, ,
where ∂Σ is now the two-dimensional boundary of Σ and sˆµ is the spatial unit
outward-pointing normal vector to ∂Σ. In the last expression we can notice that
|g|1/2 n̂µ ŝν d2 x = nµ sν dA, where now nµ and sµ are unit vectors with respect
to the full curved metric gµν , and dA is the proper area element of ∂Σ. We can
then write the integral as
I_K(\xi) := -\frac{1}{4\pi} \oint_{\partial\Sigma} s_\mu n_\nu \nabla^\mu \xi^\nu \, dA \, .   (A.20)
This is known as the Komar integral [176] and as we have seen is a conserved
quantity (the quantity Aµν = ∇µ ξν is also frequently called the Komar poten-
tial).109 The integral can in fact be calculated over any surface ∂Σ outside the
matter sources. The normalization factor −1/4π has been chosen to ensure that,
for the case of a timelike Killing field that has unit magnitude at infinity, the
Komar integral will coincide with the ADM mass of the spacetime. If, on the
other hand, ξ µ is an axial vector associated with an angular coordinate φ (i.e.
ξ µ = δφµ ), then the Komar integral turns out to be −2J, with J the total angular
momentum (we can check that this is so by considering a Kerr black hole). The
fact that, for a static spacetime, the Komar integral and the ADM mass coincide
can be used to derive a general relativistic version of the virial theorem, but we
will not consider this issue here (the interested reader can see e.g. [69]).
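As a quick consistency check of the normalization (under the conventions used above), for Schwarzschild in static coordinates with ξ = ∂_t, and with n and s the unit timelike and radial normals, one finds
\nabla^r \xi^t = g^{rr}\, \Gamma^t_{rt} = \left( 1 - \frac{2M}{r} \right) \frac{M}{r^2 \left( 1 - 2M/r \right)} = \frac{M}{r^2} \, ,
s_\mu n_\nu \nabla^\mu \xi^\nu = s_r\, n_t\, \nabla^r \xi^t = \frac{1}{\sqrt{1 - 2M/r}} \left( -\sqrt{1 - 2M/r} \right) \frac{M}{r^2} = -\frac{M}{r^2} \, ,
so that
I_K = -\frac{1}{4\pi} \oint_{\partial\Sigma} \left( -\frac{M}{r^2} \right) r^2\, d\Omega = M \, ,
as expected for the total mass.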
109 The Komar integral is usually written in differential form notation as
I_K = -\frac{1}{4\pi} \oint \nabla^\mu \xi^\nu \, dS_{\mu\nu} = -\frac{1}{8\pi} \oint \epsilon_{\mu\nu\alpha\beta}\, \nabla^\mu \xi^\nu \, dx^\alpha \wedge dx^\beta \, .
In fact, in the last expression, the area element dx^\alpha \wedge dx^\beta is frequently not even written.
Incidentally, we can use the same derivation that allowed us to show that the
integral over the timelike cylinder σ vanishes to rewrite the Komar integral as a
volume integral of the stress-energy of matter as
I_K(\xi) = 2 \int_\Sigma \left( T^{\mu\nu} - \frac{1}{2}\, g^{\mu\nu} T \right) n_\mu \xi_\nu \, dV \, ,   (A.21)
with \vec{s} being the unit outward-pointing normal vector to S, and D_i the standard
three-dimensional covariant derivative. The Hawking mass is defined by thinking
that the presence of a mass must cause light rays to converge, and it turns
out that the only gauge invariant measure of this on the surface is given by
H_{in} H_{out}.^{110} We would then expect the mass to be given by an expression of the
form M = A + B ∮ H_{in} H_{out} dS, with the constants A and B fixed by looking at
special cases.
For spheres in Minkowski spacetime the Hawking mass vanishes identically
since Hin = −Hout = 2/r. For spheres in Schwarzschild (in standard coordinates)
we have Hin = −Hout = 2(1 − 2M/r)1/2 /r and A = 4πr2 , so that MH = M .
That is, the Hawking mass gives us the correct mass at all radii, which makes
it a more useful measure of energy than the ADM mass. The Hawking mass in
general is not always positive definite, and is not always monotonic either, but
for sufficiently “round” surfaces (and particularly in spherical symmetry) both
these properties can be shown to be satisfied if the dominant energy condition
holds.
Another less formal but very useful approach to obtaining a local expression
for the mass of the spacetime is based on the fact that many astrophysically
relevant spacetimes will not only be asymptotically flat, but they will also be
asymptotically spherically symmetric. In such a case we know that the spacetime
will approach Schwarzschild far away. For Schwarzschild we can easily prove the
following exact relation between the mass M and the area A of spheres
M = \left( \frac{A}{16\pi} \right)^{1/2} \left[ 1 - \frac{(dA/dr)^2}{16\pi\, g_{rr}\, A} \right] ,   (A.26)
110 More specifically, the product of the Newman–Penrose spin coefficients ρ and ρ′ is gauge
invariant under a spin-boost transformation of the null tetrad, and in general it is given by
ρρ′ = H_{in} H_{out}/8.
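A minimal numerical illustration of (A.26): the metric functions below are filled in analytically for a Schwarzschild slice in areal coordinates, for which the formula should return M at every radius (any numerically generated A, dA/dr and g_rr could be substituted instead).

import numpy as np

def schwarzschild_like_mass(A, dAdr, grr):
    """Quasi-local mass (A.26) from the area A of a sphere, its derivative dA/dr
    and the radial metric component g_rr."""
    return np.sqrt(A / (16.0 * np.pi)) * (1.0 - dAdr**2 / (16.0 * np.pi * grr * A))

M = 1.0
r = np.linspace(3.0, 50.0, 100)                  # radii outside the horizon
A, dAdr = 4.0 * np.pi * r**2, 8.0 * np.pi * r    # areal radial coordinate
grr = 1.0 / (1.0 - 2.0 * M / r)
print(np.max(np.abs(schwarzschild_like_mass(A, dAdr, grr) - M)))   # ~ 0 at every radius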
APPENDIX B
SPACETIME CHRISTOFFEL SYMBOLS IN 3+1 LANGUAGE
In the derivation of 3+1 equations we often need to write the 4-metric of space-
time and its associated Christoffel symbols in 3+1 language. The 4-metric in
terms of 3+1 quantities has the form
g_{00} = -\left( \alpha^2 - \gamma_{ij}\, \beta^i \beta^j \right) ,   (B.1)
g_{0i} = \gamma_{ij}\, \beta^j = \beta_i \, ,   (B.2)
g_{ij} = \gamma_{ij} \, ,   (B.3)
and the corresponding inverse metric is
g^{00} = -1/\alpha^2 \, ,   (B.4)
g^{0i} = \beta^i/\alpha^2 \, ,   (B.5)
g^{ij} = \gamma^{ij} - \beta^i \beta^j/\alpha^2 \, .   (B.6)
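A small sketch (NumPy, with arbitrary sample values for the 3+1 variables) that assembles g_{µν} and g^{µν} according to (B.1)–(B.6) and checks that they are indeed inverses of each other:

import numpy as np

alpha = 1.2
beta_u = np.array([0.1, -0.2, 0.05])                  # beta^i (contravariant shift)
gamma = np.array([[1.3, 0.1, 0.0],
                  [0.1, 1.1, 0.2],
                  [0.0, 0.2, 0.9]])                   # spatial metric gamma_ij
gamma_inv = np.linalg.inv(gamma)
beta_d = gamma @ beta_u                               # beta_i = gamma_ij beta^j

g = np.empty((4, 4))
g[0, 0] = -(alpha**2 - beta_d @ beta_u)               # eq. (B.1)
g[0, 1:] = g[1:, 0] = beta_d                          # eq. (B.2)
g[1:, 1:] = gamma                                     # eq. (B.3)

ginv = np.empty((4, 4))
ginv[0, 0] = -1.0 / alpha**2                          # eq. (B.4)
ginv[0, 1:] = ginv[1:, 0] = beta_u / alpha**2         # eq. (B.5)
ginv[1:, 1:] = gamma_inv - np.outer(beta_u, beta_u) / alpha**2   # eq. (B.6)

assert np.allclose(g @ ginv, np.eye(4))               # consistency check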
From this we can obtain the following expressions for the 4-Christoffel sym-
bols in terms of 3+1 quantities
\Gamma^0_{00} = \left( \partial_t \alpha + \beta^m \partial_m \alpha - \beta^m \beta^n K_{mn} \right) / \alpha \, ,   (B.7)
\Gamma^0_{0i} = \left( \partial_i \alpha - \beta^m K_{im} \right) / \alpha \, ,   (B.8)
\Gamma^0_{ij} = -K_{ij}/\alpha \, ,   (B.9)
\Gamma^l_{00} = \alpha\, \partial^l \alpha - 2\alpha \beta^m K_m^{\ l}
              - \beta^l \left( \partial_t \alpha + \beta^m \partial_m \alpha - \beta^m \beta^n K_{mn} \right) / \alpha
              + \partial_t \beta^l + \beta^m D_m \beta^l \, ,   (B.10)
\Gamma^l_{m0} = -\beta^l \left( \partial_m \alpha - \beta^n K_{mn} \right) / \alpha - \alpha K_m^{\ l} + D_m \beta^l \, ,   (B.11)
\Gamma^l_{ij} = {}^{(3)}\Gamma^l_{ij} + \beta^l K_{ij} / \alpha \, ,   (B.12)
with Di the covariant derivative associated with the spatial metric γij , and (3) Γlij
the corresponding three-dimensional Christoffel symbols. The 3-covariant deriva-
tives of the shift that appear in these expressions come from partial derivatives
of the metric along the time direction contained in the Γαµν .
The contracted Christoffel symbols \Gamma^\alpha := g^{\mu\nu} \Gamma^\alpha_{\mu\nu} then become
\Gamma^0 = -\frac{1}{\alpha^3} \left( \partial_t \alpha - \beta^m \partial_m \alpha + \alpha^2 K \right) ,   (B.13)
\Gamma^i = {}^{(3)}\Gamma^i + \frac{\beta^i}{\alpha^3} \left( \partial_t \alpha - \beta^m \partial_m \alpha + \alpha^2 K \right)
         - \frac{1}{\alpha^2} \left( \partial_t \beta^i - \beta^m \partial_m \beta^i + \alpha\, \partial^i \alpha \right) .   (B.14)
APPENDIX C
BSSNOK WITH NATURAL CONFORMAL RESCALING
where TF denotes the tracefree part of the expression inside the brackets, and
where we have used the Hamiltonian constraint to eliminate the Ricci scalar from
the evolution equation for K, and the momentum constraints to eliminate the
divergence of Ãij from the evolution equation for Γ̃i . In the previous expressions
we have d/dt := ∂t −£β , with £β the Lie derivative with respect to the shift that
must be calculated for tensor densities: ψ, a scalar density of weight 1/6, and γ̃ij
and Ãij , tensor densities with weight −2/3. Also, even though the Γ̃i is strictly
speaking not a vector, its Lie derivative is understood as that corresponding to
a vector density of weight 2/3. Finally, the Ricci tensor is separated into two
contributions in the following way:
R_{ij} = \tilde{R}_{ij} + R^\phi_{ij} \, ,   (C.9)
where \tilde{R}_{ij} is the Ricci tensor associated with the conformal metric \tilde\gamma_{ij}:
\tilde{R}_{ij} = -\frac{1}{2}\, \tilde\gamma^{lm} \partial_l \partial_m \tilde\gamma_{ij} + \tilde\gamma_{k(i} \partial_{j)} \tilde\Gamma^k + \tilde\Gamma^k \tilde\Gamma_{(ij)k}
              + \tilde\gamma^{lm} \left( 2 \tilde\Gamma^k_{l(i} \tilde\Gamma_{j)km} + \tilde\Gamma^k_{im} \tilde\Gamma_{klj} \right) ,   (C.10)
and where R^\phi_{ij} is given in terms of \phi by
R^\phi_{ij} = -2 \tilde{D}_i \tilde{D}_j \phi - 2 \tilde\gamma_{ij} \tilde{D}^k \tilde{D}_k \phi + 4 \tilde{D}_i \phi\, \tilde{D}_j \phi - 4 \tilde\gamma_{ij} \tilde{D}^k \phi\, \tilde{D}_k \phi \, ,   (C.11)
with D̃i the covariant derivative associated with the conformal metric.
The Hamiltonian and momentum constraints also take the form
R = \tilde{A}_{ij} \tilde{A}^{ij} - \frac{2}{3}\, K^2 + 16\pi\rho \, ,   (C.12)
\partial_j \tilde{A}^{ij} = -\tilde\Gamma^i_{jk} \tilde{A}^{jk} - 6 \tilde{A}^{ij} \partial_j \phi + \frac{2}{3}\, \tilde\gamma^{ij} \partial_j K + 8\pi e^{4\phi} j^i \, .   (C.13)
With this new conformal scaling, the evolution equations become instead (do
notice that in some places there is ᾱ and in others just α)
\frac{d}{dt}\, \bar\gamma_{ij} = -2 \bar\alpha \bar{A}_{ij} \, ,   (C.17)
\frac{d}{dt}\, \phi = -\frac{e^{6\phi}}{6}\, \bar\alpha K \, ,   (C.18)
\frac{d}{dt}\, K = -D^i D_i \alpha + \bar\alpha \left[ e^{-6\phi} \bar{A}_{ij} \bar{A}^{ij} + \frac{1}{3}\, e^{6\phi} K^2 + 4\pi e^{6\phi} (\rho + S) \right] ,   (C.19)
\frac{d}{dt}\, \bar{A}_{ij} = e^{2\phi} \left\{ -D_i D_j \alpha + \alpha R_{ij} + 4\pi\alpha \left[ \gamma_{ij} (S - \rho) - 2 S_{ij} \right] \right\}^{TF} - 2 \bar\alpha \bar{A}_{ik} \bar{A}^k_{\ j} \, ,   (C.20)
\frac{d}{dt}\, \bar\Gamma^i = \tilde\gamma^{jk} \partial_j \partial_k \beta^i + \frac{1}{3}\, \tilde\gamma^{ij} \partial_j \partial_k \beta^k - 2 \bar{A}^{ij} \partial_j \bar\alpha
                           + 2 \bar\alpha \left( \tilde\Gamma^i_{jk} \tilde{A}^{jk} - \frac{2}{3}\, e^{6\phi} \bar\gamma^{ij} \partial_j K - 8\pi e^{10\phi} j^i \right) ,   (C.21)
APPENDIX D
SPIN-WEIGHTED SPHERICAL HARMONICS

where dΩ = sin θ dθ dϕ is the standard area element of the sphere. Using the fact
that the associated Legendre functions are such that
P^{l,-m}(z) = (-1)^m\, \frac{(l-m)!}{(l+m)!}\, P^{l,m}(z) \, ,   (D.7)
we can show that the complex conjugate of the Y l,m is given by
\bar{Y}^{l,m}(\theta, \varphi) = (-1)^m\, Y^{l,-m}(\theta, \varphi) \, .   (D.8)
When we work with non-scalar functions defined on the sphere we introduce
the so-called spin-weighted spherical harmonics as generalizations of the ordinary
Because of this, ð and ð̄ are known as the spin raising and spin lowering opera-
tors. We also find that
\bar\eth\, \eth\, {}_sY^{l,m} = -\left[ l(l+1) - s(s+1) \right] {}_sY^{l,m} \, ,   (D.18)
\eth\, \bar\eth\, {}_sY^{l,m} = -\left[ l(l+1) - s(s-1) \right] {}_sY^{l,m} \, ,   (D.19)
so the s Y l,m are eigenfunctions of the operators ð̄ð and ðð̄, which are gen-
eralizations of L2 . For a function with zero spin weight we in fact find that
L2 f = ð̄ðf = ðð̄f .
From the above definitions it is possible to show that the complex conjugate
of the s Y l,m is given by
{}_s\bar{Y}^{l,m}(\theta, \varphi) = (-1)^{s+m}\, {}_{-s}Y^{l,-m}(\theta, \varphi) \, ,   (D.20)
\hat{J}_z = \partial_\varphi \, ,   (D.23)
\hat{J}_\pm = e^{\pm i\varphi} \left[ \pm i\, \partial_\theta - \cot\theta\, \partial_\varphi - i s \csc\theta \right] .   (D.24)
The operators for the x and y components of the angular momentum are then
simply obtained from Jˆ± = Jˆx ± iJˆy , so that we find:
\hat{J}_x = \left( \hat{J}_+ + \hat{J}_- \right)/2 \, , \qquad \hat{J}_y = -i \left( \hat{J}_+ - \hat{J}_- \right)/2 \, .   (D.25)
The s Y l,m can also be constructed in terms of the so-called Wigner rotation
matrices dlms , which are defined in quantum mechanics as the following matrix
elements of the operator for rotations around the y axis:
d^l_{ms}(\theta) := \left\langle l, m \left| e^{-i \hat{J}_y \theta} \right| l, s \right\rangle .   (D.26)
416 SPIN-WEIGHTED SPHERICAL HARMONICS
Closed expressions for the rotation matrices dlms , as well as their principal prop-
erties, are well known but we will not go into the details here (the interested
reader can look at standard textbooks on quantum mechanics, e.g. [202, 294]).
There are several very important properties of the spin-weighted spherical
harmonics that can be obtained directly from the properties of the rotation
matrices dlms . In the first place, just as the ordinary spherical harmonics, the
different s Y l,m are orthonormal,
\int {}_sY^{l,m}(\theta, \varphi)\; {}_{s'}\bar{Y}^{l',m'}(\theta, \varphi)\, d\Omega = \delta_{ss'}\, \delta_{ll'}\, \delta_{mm'} \, .   (D.28)
Also, for a given value of s, the s Y l,m form a complete set. This property can be
expressed in the form
\sum_{l,m} {}_sY^{l,m}(\theta, \varphi)\; {}_s\bar{Y}^{l,m}(\theta', \varphi') = \delta(\varphi - \varphi')\, \delta(\cos\theta - \cos\theta') \, .   (D.29)
In the above expression we have used the Wigner 3-lm symbols, which are related
to the standard Clebsch–Gordan coefficients \langle l_1, m_1, l_2, m_2 | l_3, m_3 \rangle through
\begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix} = \frac{(-1)^{l_1 - l_2 - m_3}}{\sqrt{2 l_3 + 1}}\, \langle l_1, m_1, l_2, m_2 | l_3, -m_3 \rangle \, ,   (D.31)
or equivalently
\langle l_1, m_1, l_2, m_2 | l_3, m_3 \rangle = (-1)^{l_1 - l_2 + m_3} \sqrt{2 l_3 + 1} \begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & -m_3 \end{pmatrix} .   (D.32)
In the above expression the sum runs over all values of k for which the argu-
ments inside the factorials are non-negative. Also, if the particular combination
of {li , mi } is such that the arguments of the factorials outside of the sum are
negative, then the corresponding coefficient vanishes. A more symmetric (though
longer) expression that is equivalent to (D.35) was later derived by Racah [234],
but we will not write it here.
In the general case, (D.35) is rather complicated, but this is not a serious
problem as we can find tables of the most common coefficients in the literature,
and even web-based “Clebsch–Gordan calculators”. Moreover, in some special
cases the coefficients simplify considerably. For example, in the case where m1 =
l1 , m2 = l2 , and l3 = m3 = l1 + l2 we find
\begin{pmatrix} l_1 & l_2 & l_1+l_2 \\ l_1 & l_2 & -(l_1+l_2) \end{pmatrix} = \frac{1}{\sqrt{2(l_1+l_2)+1}} \quad \Rightarrow \quad \langle l_1, l_1, l_2, l_2 | l_1+l_2, l_1+l_2 \rangle = 1 \, .   (D.36)
Another particularly interesting case corresponds to taking l3 = m3 = 0 (i.e.
zero total angular momentum in quantum mechanics). In that case we find
\begin{pmatrix} l_1 & l_2 & 0 \\ m_1 & m_2 & 0 \end{pmatrix} = \langle l_1, m_1, l_2, m_2 | 0, 0 \rangle = \frac{(-1)^{l_1 - m_1}}{\sqrt{2 l_1 + 1}}\, \delta_{l_1, l_2}\, \delta_{m_1, -m_2} \, .   (D.37)
Taking this result, together with (D.20) and the fact that {}_0Y^{0,0} = 1/\sqrt{4\pi}, we
can easily recover the orthonormality condition (D.28) from the integral of three
{}_sY^{l,m}, (D.30).
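These special cases are easy to spot-check with a computer algebra system; for instance, using the Wigner 3j routine in sympy (an external package, not part of the text):

from sympy import sqrt, simplify
from sympy.physics.wigner import wigner_3j

# check (D.37): the 3j symbol with l3 = m3 = 0
l, m = 3, 2
lhs = wigner_3j(l, l, 0, m, -m, 0)
rhs = (-1)**(l - m) / sqrt(2*l + 1)
assert simplify(lhs - rhs) == 0

# check (D.36): the fully stretched case l3 = l1 + l2
l1, l2 = 2, 1
val = wigner_3j(l1, l2, l1 + l2, l1, l2, -(l1 + l2))
assert simplify(val - 1/sqrt(2*(l1 + l2) + 1)) == 0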
The cases with l3 = 1 are also interesting as they appear in the expression
for the momentum radiated by gravitational waves. We find
\begin{pmatrix} l_1 & l_2 & 1 \\ m_1 & m_2 & 0 \end{pmatrix} = (-1)^{l_1 - m_1}\, \delta_{m_1+m_2,0} \left[ \delta_{l_1,l_2}\, \frac{2 m_1}{\left[ (2l_1+2)(2l_1+1)(2l_1) \right]^{1/2}} \right.
  + \delta_{l_1,l_2+1} \left( \frac{(l_1+m_1)(l_1-m_1)}{l_1 (2l_1+1)(2l_1-1)} \right)^{1/2}
  \left. - \delta_{l_1+1,l_2} \left( \frac{(l_2-m_2)(l_2+m_2)}{l_2 (2l_2+1)(2l_2-1)} \right)^{1/2} \right] ,   (D.38)

\begin{pmatrix} l_1 & l_2 & 1 \\ m_1 & m_2 & \pm 1 \end{pmatrix} = (-1)^{l_1 - m_1}\, \delta_{m_1+m_2,\mp 1} \left[ \pm\, \delta_{l_1,l_2} \left( \frac{(l_1 \mp m_1)(l_1 \mp m_2)}{l_1 (2l_1+2)(2l_1+1)} \right)^{1/2} \right.
  + \delta_{l_1,l_2+1} \left( \frac{(l_1 \mp m_1)(l_1 \pm m_2)}{2 l_1 (2l_1+1)(2l_1-1)} \right)^{1/2}
  \left. + \delta_{l_1+1,l_2} \left( \frac{(l_2 \mp m_2)(l_2 \pm m_1)}{2 l_2 (2l_2+1)(2l_2-1)} \right)^{1/2} \right] .   (D.39)
REFERENCES
[1] Abrahams, A., Anderson, A., Choquet-Bruhat, Y., and York, J. Einstein and
Yang-Mills theories in hyperbolic form without gauge-fixing. Phys. Rev.
Lett., 75:3377–3381, 1995.
[2] Alcubierre, M. The appearance of coordinate shocks in hyperbolic formulations
of general relativity. Phys. Rev. D, 55:5981–5991, 1997.
[3] Alcubierre, M. Hyperbolic slicings of spacetime: singularity avoidance and gauge
shocks. Class. Quantum Grav., 20(4):607–624, 2003.
[4] Alcubierre, M. Are gauge shocks really shocks? Class. Quantum Grav., 22:4071–
4082, 2005.
[5] Alcubierre, M., Allen, G., Brügmann, B., Lanfermann, G., Seidel, E., Suen,
W.-M., and Tobias, M. Gravitational collapse of gravitational waves in 3D
numerical relativity. Phys. Rev. D, 61:041501 (R), 2000.
[6] Alcubierre, M., Allen, G., Brügmann, B., Seidel, E., and Suen, W.-M. Towards
an understanding of the stability properties of the 3+1 evolution equations
in general relativity. Phys. Rev. D, 62:124011, 2000.
[7] Alcubierre, M., Benger, W., Brügmann, B., Lanfermann, G., Nerger, L., Seidel,
E., and Takahashi, R. 3D Grazing Collision of Two Black Holes. Phys. Rev.
Lett., 87:271103, 2001.
[8] Alcubierre, M., Brandt, S. R., Brügmann, B., Gundlach, C., Massó, J., Seidel,
E., and Walker, P. Test-beds and applications for apparent horizon finders
in numerical relativity. Class. Quantum Grav., 17:2159–2190, 2000.
[9] Alcubierre, M., Brandt, S. R., Brügmann, B., Holz, D., Seidel, E., Takahashi,
R., and Thornburg, J. Symmetry without symmetry: Numerical simula-
tion of axisymmetric systems using Cartesian grids. Int. J. Mod. Phys. D,
10(3):273–289, 2001.
[10] Alcubierre, M. and Brügmann, B. Simple excision of a black hole in 3+1
numerical relativity. Phys. Rev. D, 63:104006, 2001.
[11] Alcubierre, M., Brügmann, B., Diener, P., Guzmán, F. S., Hawke, I., Hawley,
S., Herrmann, F., Koppitz, M., Pollney, D., Seidel, E., and Thornburg, J.
Dynamical evolution of quasi-circular binary black hole data. Phys. Rev. D,
72(4):044004, 2005.
[12] Alcubierre, M., Brügmann, B., Diener, P., Guzmán, F. S., Hawke, I., Hawley,
S., Herrmann, F., Koppitz, M., Pollney, D., Seidel, E., and Thornburg, J.
Dynamical evolution of quasi-circular binary black hole data. Phys. Rev. D,
72:044004, 5 August 2005.
[13] Alcubierre, M., Brügmann, B., Diener, P., Koppitz, M., Pollney, D., Seidel, E.,
and Takahashi, R. Gauge conditions for long-term numerical black hole
evolutions without excision. Phys. Rev. D, 67:084023, 2003.
[14] Alcubierre, M., Brügmann, B., Dramlitsch, T., Font, J. A., Papadopoulos, P.,
Seidel, E., Stergioulas, N., and Takahashi, R. Towards a stable numerical
evolution of strongly gravitating systems in general relativity: The conformal
treatments. Phys. Rev. D, 62:044034, 2000.
[15] Alcubierre, M., Brügmann, B., Pollney, D., Seidel, E., and Takahashi, R. Black
hole excision for dynamic black holes. Phys. Rev. D, 64:061501(R), 2001.
[16] Alcubierre, M., Corichi, A., González, J. A., Núñez, D., Reimann, B., and Sal-
gado, M. Generalized harmonic spatial coordinates and hyperbolic shift
conditions. Phys. Rev. D, 72:124018, 2005.
[17] Alcubierre, M., Corichi, A., González, J. A., Núñez, D., and Salgado, M. A
hyperbolic slicing condition adapted to killing fields and densitized lapses.
Class. Quantum Grav., 20(18):3951–3968, 21 September 2003.
[18] Alcubierre, M. and González, J. A. Regularization of spherically symmetric
evolution codes in numerical relativity. Comp. Phys. Comm., 167:76–84,
2005. gr-qc/0401113.
[19] Alcubierre, M. and Massó, J. Pathologies of hyperbolic gauges in general rela-
tivity and other field theories. Phys. Rev. D, 57(8):R4511–R4515, 1998.
[20] Alcubierre, M. and Schutz, B. Time–symmetric ADI and causal reconnection:
Stable numerical techniques for hyperbolic systems on moving grids. J.
Comput. Phys., 112:44, 1994.
[21] Anderson, A. and York, J. W. Fixing Einstein’s equations. Phys. Rev. Lett.,
82:4384–4387, 1999.
[22] Anderson, A. and York, J. W. Hamiltonian time evolution for general relativity.
Phys. Rev. Lett., 81:1154–1157, 1998.
[23] Anninos, P., Bernstein, D., Brandt, S., Libson, J., Massó, J., Seidel, E., Smarr,
L., Suen, W.-M., and Walker, P. Dynamics of apparent and event horizons.
Phys. Rev. Lett., 74(5):630–633, 30 January 1995.
[24] Anninos, P., Brandt, S. R., and Walker, P. New coordinate systems for axisym-
metric black hole collisions. Phys. Rev. D, 57:6158–6167, 1998.
[25] Anninos, P., Camarda, K., Massó, J., Seidel, E., Suen, W.-M., and Towns, J.
Three-dimensional numerical relativity: The evolution of black holes. Phys.
Rev. D, 52(4):2059–2082, 1995.
[26] Anninos, P., Daues, G., Massó, J., Seidel, E., and Suen, W.-M. Horizon bound-
ary conditions for black hole spacetimes. Phys. Rev. D, 51(10):5562–5578,
1995.
[27] Anninos, P., Hobill, D., Seidel, E., Smarr, L., and Suen, W.-M. The collision
of two black holes. Phys. Rev. Lett., 71(18):2851–2854, 1993.
[28] Anninos, P., Hobill, D., Seidel, E., Smarr, L., and Suen, W.-M. The head-on
collision of two equal mass black holes. Phys. Rev. D, 52:2044–2058, 1995.
[29] Arbona, A. and Bona, C. Dealing with the center and boundary problems
in 1d numerical relativity. Comput. Phys. Commun., 118:229–235, 1999.
gr-qc/9805084.
[30] Arbona, A., Bona, C., Massó, J., and Stela, J. Robust evolution system for
numerical relativity. Phys. Rev. D, 60:104014, 1999. gr-qc/9902053.
[31] Arnowitt, R., Deser, S., and Misner, C. W. The dynamics of general relativity.
In Witten, L., editor, Gravitation: An introduction to current research, pages
227–265. John Wiley, New York, 1962.
[32] Ashtekar, A., Beetle, C., Dreyer, O., Fairhurst, S., Krishnan, B., Lewandowski,
J., and Wisniewski, J. Generic isolated horizons and their applications.
Phys. Rev. Lett., 85:3564–3567, 2000.
[33] Ashtekar, A., Beetle, C., and Fairhurst, S. Isolated horizons: A generalization
of black hole mechanics. Class. Quantum Grav., 16:L1–L7, 1999.
[34] Ashtekar, A., Beetle, C., and Fairhurst, S. Mechanics of isolated horizons. Class.
Quantum Grav., 17:253–298, 2000.
[35] Ashtekar, A., Fairhurst, S., and Krishnan, B. Isolated horizons: Hamiltonian
evolution and the first law. Phys. Rev. D, 62:104025, 2000.
[36] Ashtekar, A. and Galloway, G. Some uniqueness results for dynamical horizons.
Advances in Theoretical and Mathematical Physics, 9(1):1–30, 2005.
[37] Ashtekar, A. and Krishnan, B. Dynamical Horizons: Energy, angular momen-
tum, fluxes, and balance laws. Phys. Rev. Lett., 89:261101, 2002.
[38] Ashtekar, A. and Krishnan, B. Dynamical horizons and their properties. Phys.
Rev. D, 68:104030, 2003.
[39] Ashtekar, A. and Krishnan, B. Isolated and dynamical horizons and their
applications. Living Rev. Relativity, 7:10, 2004.
[40] Babiuc, M. C., Szilágyi, B., and Winicour, J. Harmonic initial-boundary evolu-
tion in general relativity. Phys. Rev. D, 73:064017, 2006.
[41] Baker, J. and Campanelli, M. Making use of geometrical invariants in black
hole collisions. Phys. Rev. D, 62:127501, 2000.
[42] Baker, J., Brügmann, B., Campanelli, M., Lousto, C. O., and Takahashi, R.
Plunge waveforms from inspiralling binary black holes. Phys. Rev. Lett.,
87:121103, 2001.
[43] Baker, J., Campanelli, M., Lousto, C. O., and Takahashi, R. Modeling gravita-
tional radiation from coalescing binary black holes. Phys. Rev. D, 65:124012,
2002.
[44] Baker, J. G., Centrella, J., Choi, D.-I., Koppitz, M., and van Meter, J. Binary
black hole merger dynamics and waveforms. Phys. Rev. D, 73:104002, 2006.
[45] Baker, J. G., Centrella, J., Choi, D.-I., Koppitz, M., and van Meter, J. Gravi-
tational wave extraction from an inspiraling configuration of merging black
holes. Phys. Rev. Lett., 96:111102, 2006.
[46] Balakrishna, J., Daues, G., Seidel, E., Suen, W.-M., Tobias, M., and Wang,
E. Coordinate conditions in three-dimensional numerical relativity. Class.
Quantum Grav., 13:L135–L142, 1996.
[47] Barcelo, C. and Visser, M. Twilight for the energy conditions? Int. J. Mod.
Phys. D, 11:1553–1560, 2002.
[48] Bardeen, J. M. and Piran, T. General relativistic axisymmetric rotating sys-
tems: Coordinates and equations. Phys. Reports, 196:205–250, 1983.
[49] Baumgarte, T. W. Innermost stable circular orbit of binary black holes. Phys.
Rev. D, 62:024018, 2000.
[159] Hannam, M., Husa, S., Pollney, D., Brugmann, B., and O’Murchadha, N.
Geometry and regularity of moving punctures. Phys. Rev. Lett., 99:241102,
2007.
[160] Hawking, S. W. The event horizon. In DeWitt, C. and DeWitt, B. S., editors,
Black Holes, pages 1–55. Gordon and Breach, New York, 1973.
[161] Hawking, S. W. and Ellis, G. F. R. The large scale structure of spacetime.
Cambridge University Press, Cambridge, England, 1973.
[162] Hayward, S. A. General laws of black hole dynamics. Phys. Rev. D,
49(12):6467–6474, 15 June 1994.
[163] Holz, D., Miller, W., Wakano, M., and Wheeler, J. Coalescence of primal
gravity waves to make cosmological mass without matter. In Hu, B. L. and
Jacobson, T. A., editors, Directions in General Relativity: Proceedings of the
1993 International Symposium, Maryland; Papers in honor of Dieter Brill,
page 339, Cambridge, England, 1993. Cambridge University Press.
[164] Hulse, R. and Taylor, J. Discovery of a pulsar in a binary system. Astrophys.
J., 195:L51–L53, 1975.
[165] Husa, S. Numerical relativity with the conformal field equations. In Fernández,
L. and González, L. M., editors, Proceedings of the 2001 spanish relativity
meeting, volume 617 of Lecture Notes in Physics, pages 159–192. Springer,
2003.
[166] Isaacson, R. Gravitational radiation in the limit of high frequency. II. nonlinear
terms and the effective stress tensor. Phys. Rev., 166:1272–1280, 1968.
[167] Israel, W. and Stewart, J. M. Transient relativistic thermodynamics and ki-
netic theory. Ann. Phys., 118:341, 1979.
[168] Jantzen, R. T. and York, James W., J. New minimal distortion shift gauge.
Phys. Rev. D, 73:104008, 2006.
[169] Katz, J. I., Lynden-Bell, D., and Israel, W. Quasilocal energy in static gravi-
tational fields. Class. Quantum Grav., 5:971–987, 1988.
[170] Kerr, R. P. Gravitational field of a spinning mass as an example of algebraically
special metrics. Phys. Rev. Lett., 11:237–238, 1963.
[171] Kidder, L. E. and Finn, L. S. Spectral methods for numerical relativity. the
initial data problem. Phys. Rev. D, 62:084026, 2000.
[172] Kidder, L. E., Scheel, M. A., and Teukolsky, S. A. Extending the lifetime
of 3D black hole computations with a new hyperbolic system of evolution
equations. Phys. Rev. D, 64:064017, 2001.
[173] Kidder, L. E., Scheel, M. A., Teukolsky, S. A., Carlson, E. D., and Cook, G. B.
Black hole evolution by spectral methods. Phys. Rev. D, 62:084032, 2000.
[174] Kidder, L. E., Lindblom, L., Scheel, M. A., Buchman, L. T., and Pfeiffer,
H. P. Boundary conditions for the Einstein evolution system. Phys. Rev.
D, 71:064020, 2005.
[175] Kokkotas, K. D. and Schmidt, B. G. Quasi-normal modes of stars and black
holes. Living Rev. Relativity, 2:2, 1999. http://www.livingreviews.org/lrr-1999-2.
[215] Nakamura, T., Oohara, K., and Kojima, Y. General relativistic collapse to
black holes and gravitational waves from black holes. Prog. Theor. Phys.
Suppl., 90:1–218, 1987.
[216] Nerozzi, A., Beetle, C., Bruni, M., Burko, L. M., and Pollney, D. Towards
wave extraction in numerical relativity: The quasi-Kinnersley frame. Phys.
Rev. D, 72:024014, 2005.
[217] Newman, E. T., Couch, E., Chinnapared, K., Exton, A., Prakash, A., and
Torrence, R. Metric of a rotating charged mass. J. Math. Phys., 6(6):918–
919, 1965.
[218] Newman, E. T. and Penrose, R. An approach to gravitational radiation by a
method of spin coefficients. J. Math. Phys., 3(3):566–578, 1962. erratum in
J. Math. Phys. 4, 998 (1963).
[219] Newman, E. T. and Penrose, R. Note on the Bondi-Metzner-Sachs group. J.
Math. Phys., 7(5):863–870, May 1966.
[220] von Neumann, J. and Richtmyer, R. D. A method for the numerical calculation
of hydrodynamical shocks. J. Appl. Phys., 21:232, 1950.
[221] Nordström, G. On the energy of the gravitational field in Einstein’s theory.
Proc. Kon. Ned. Akad. Wet., 20:1238–1245, 1918.
[222] O’Murchadha, N. and York, J. W. Gravitational energy. Phys. Rev. D,
10(8):2345–2357, 1974.
[223] Pais, A. ’Subtle is the Lord...’ the science and the life of Albert Einstein. Oxford
University Press, Oxford and New York, 1982.
[224] Petrich, L. I., Shapiro, S. L., and Teukolsky, S. A. Oppenheimer-Snyder col-
lapse with maximal time slicing and isotropic coordinates. Phys. Rev. D,
31(10):2459–2469, 15 May 1985.
[225] Petrov, A. Z. Einstein Spaces. Pergamon Press, Oxford, 1969.
[226] Pfeiffer, H. P. and York, J. W. Extrinsic curvature and the Einstein con-
straints. Phys. Rev. D, 67:044022, 2003.
[227] Poisson, E. A Relativist’s Toolkit: The Mathematics of Black-Hole Mechanics.
Cambridge University Press, 2004.
[228] Pollney, D. et al. Recoil velocities from equal-mass binary black-hole mergers:
a systematic investigation of spin-orbit aligned configurations. Phys. Rev.,
D76:124002, 2007.
[229] Pons, J. A., Martı́, J. M., and Müller, E. The exact solution of the Riemann
problem with non-zero tangential velocities in relativistic hydrodynamics.
J. Fluid Mech., 422:125–139, 2000.
[230] Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. Nu-
merical Recipes. Cambridge University Press, New York, 2nd edition, 1992.
[231] Pretorius, F. Evolution of binary black hole spacetimes. Phys. Rev. Lett.,
95:121101, 2005.
[232] Pretorius, F. Numerical relativity using a generalized harmonic decomposition.
Class. Quantum Grav., 22:425–452, 2005.
[233] Pretorius, F. Simulation of binary black hole spacetimes with a harmonic
evolution scheme. Class. Quantum Grav., 23:S529–S552, 2006.
[234] Racah, G. Theory of complex spectra II. Phys. Rev., 62:438, 1942.
[235] Regge, T. and Wheeler, J. Stability of a Schwarzschild singularity. Phys. Rev.,
108(4):1063–1069, 1957.
[236] Reimann, B. Slice stretching at the event horizon when geodesically slicing
the Schwarzschild spacetime with excision. Class. Quantum Grav., 21:4297–
4304, 2004.
[237] Reimann, B. How slice stretching arises when maximally slicing the
Schwarzschild spacetime with vanishing shift. Class. Quantum Grav.,
22:4563–4587, 2005.
[238] Reimann, B. and Brügmann, B. Late time analysis for maximal slicing of
Reissner–Nordström puncture evolutions. Phys. Rev. D, 69:124009, 2004.
[239] Reimann, B. and Brügmann, B. Maximal slicing for puncture evolutions of
Schwarzschild and Reissner-Nordström black holes. Phys. Rev. D, 69:044006,
2004.
[240] Reissner, H. Über die Eigengravitation des elektrischen Feldes nach der Ein-
steinschen Theorie. Ann. Phys., 50:106–120, 1916.
[241] Reula, O. Hyperbolic methods for Einstein’s equations. Living Rev. Relativity,
1:3, 1998.
[242] Richardson, L. F. The approximate arithmetic solution by finite differences of
physical problems involving differential equations, with applications to the
stresses in a masonry dam. Phil. Trans. Roy. Soc., 210:307–357, 1910.
[243] Richtmyer, R. D. and Morton, K. Difference Methods for Initial Value Prob-
lems. Interscience Publishers, New York, 1967.
[244] Rinne, O. and Stewart, J. M. A strongly hyperbolic and regular reduction of
Einstein’s equations for axisymmetric spacetimes. Class. Quantum Grav.,
22:1143–1166, 2005.
[245] Ruiz, M., Alcubierre, M., and Nunez, D. Regularization of spherical and ax-
isymmetric evolution codes in numerical relativity. Gen. Rel. Grav., 40:159–
182, 2008.
[246] Ruiz, M., Takahashi, R., Alcubierre, M., and Nunez, D. Multipole ex-
pansions for energy and momenta carried by gravitational waves. 2007.
arXiv:0707.4654.
[247] Sachs, R. Gravitational waves in general relativity VIII. Waves in asymptot-
ically flat space-time. Proc. Roy. Soc. London, A270:103–126, 1962.
[248] Salgado, M. General relativistic hydrodynamics: a new approach. Rev. Mex.
Fis., 44:1–8, 1998.
[249] Sarbach, O., Calabrese, G., Pullin, J., and Tiglio, M. Hyperbolicity of the
BSSN system of Einstein evolution equations. Phys. Rev. D, 66:064002,
2002.
[250] Sarbach, O. and Tiglio, M. Exploiting gauge and constraint freedom in hyper-
bolic formulations of Einstein’s equations. Phys. Rev. D, 66:064023, 2002.
[251] Sarbach, O. and Tiglio, M. Boundary conditions for Einstein’s field equa-
tions: analytical and numerical analysis. Journal of Hyperbolic Differential
Equations, 2:839–883, 2005.