
Introduction to

GENERAL RELATIVITY AND


COSMOLOGY
Essential Textbooks in Physics

ISSN: 2059-7630

Published
Vol. 1 Newtonian Mechanics for Undergraduates
by Vijay Tymms

Vol. 2 Introduction to General Relativity and Cosmology


by Christian G. Böhmer
Essential Textbooks in Physics

Introduction to
General Relativity and Cosmology
Christian G. Böhmer
University College London, UK
Published by

World Scientific Publishing Europe Ltd.


57 Shelton Street, Covent Garden, London WC2H 9HE
Head office: 5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

Library of Congress Cataloging-in-Publication Data


Names: Böhmer, Christian G., author.
Title: Introduction to general relativity and cosmology / Christian G. Böhmer
(University College London, UK).
Description: Covent Garden, London ; Hackensack, NJ : World Scientific,
[2016] | Series: Essential textbooks in physics, ISSN 2059-7630 ; vol. 2
Identifiers: LCCN 2016023263| ISBN 9781786341174 (hc ; alk. paper) |
ISBN 1786341174 (hc ; alk. paper) | ISBN 9781786341181 (pbk ; alk.
paper) | ISBN 1786341182 (pbk ; alk. paper)
Subjects: LCSH: General relativity (Physics)--Textbooks. | Cosmology--
Textbooks.
Classification: LCC QC173.6 .B64 2016 | DDC 530.11--dc23
LC record available at https://lccn.loc.gov/2016023263

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2017 by World Scientific Publishing Europe Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any
form or by any means, electronic or mechanical, including photocopying,
recording or any information storage and retrieval system now known or to
be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee
through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers,
MA 01923, USA. In this case permission to photocopy is not required from
the publisher.

Desk Editors: Suraj Kumar/Mary Simpson

Typeset by Stallion Press


Email: [email protected]

Printed in Singapore
To Saffron, Isabelle, Lewis, Zachary, Martin and Wiera
Preface

Why another introduction to General Relativity and Cosmology? The primary
motivation for this book was to present a concise yet detailed introduction to
both General Relativity and Cosmology which first and foremost has the
student reader in mind. When I was a student myself I very much disliked
phrases like ‘this is left as an easy exercise to the reader’ or ‘the reader may
quickly verify that’, and I also was not keen on exercises without any
solutions or with just some numerical answers. If, for instance, my calculation
gave 7 and the answer was 6, how is it possible for a student to judge whether
the entire calculation was fundamentally wrong or whether there was a small
algebraic mistake? It might have been a typo in the book after all. With this in
mind, the presentation of this book is fairly explicit, hopefully without being
excessively concerned with minor details. Moreover, every chapter finishes
with a short section called ‘Further reading’ which contains some references
which would be a natural continuation for further studies.
When writing a textbook, there is always a temptation to add more
material and to present detailed discussions and as many results as possible.
On the other hand, students can easily feel intimidated by a tome covering the
entire subject. From the students’ point of view what is needed is a book with
just the right amount of material. So, the aim was to keep the book within a
page limit of approximately 260 pages and cover the standard material taught
in two 10-week courses with 3 lectures per week. It was important to me to
give detailed solutions to all 80 exercises, which take up almost 60 pages
of this book, so the actual main text is roughly 200 pages long, which should
be a manageable amount. The downside of this approach is that parts of this
book might appear a little short on words, and that some interesting subjects
had to be left out. Black holes are only mentioned in a few words and there
was no room to cover the Kerr solution, other known exact solutions or the
various different mass definitions used in General Relativity. In Cosmology it
would have been wonderful to cover some cosmological perturbation theory
and structure formation in detail; Newtonian cosmology also has some
interesting features.
Chapter 1 covers differential geometry starting with standard vectors in
Euclidean space. Chapter 2 begins with a brief discussion of some Physics
topics, and then motivates the structure of the Einstein field equations of
General Relativity. The weak field approximation and gravitational waves are
discussed. The chapter finishes with deriving the Einstein field equations by
using the variational approach which is particularly elegant. The ‘Further
reading’ section covers some material related to modifications and extension
of Einstein’s theory. In Chap. 3, the Schwarzschild solutions are discussed.
This includes the classical Solar System tests of General Relativity, all of
which are in excellent agreement with the theoretical predictions. An
introduction to Cosmology is given in Chap. 4. This part includes inflation
and should take students to the beginnings of what one now calls Modern
Cosmology. As already stated, the final Chap. 5 contains the fully worked out
solutions to all exercises.
I am keen to get input and feedback from readers. Therefore, there is a
webpage http://book.christianboehmer.co.uk/ where I will post readers’
suggestions, keep a list of typos and corrections, etc. I will also make some
Mathematica files available which can be used to compute the various tensors
needed in General Relativity and Cosmology. I will also keep a list of
additional topics suggested by readers.
It is difficult to mention each and every one who has influenced this
book. Therefore, I would like to thank all my collaborators, and colleagues at
UCL for the many useful discussions we have had over the years. I would
like to thank Laurent Chaminade from World Scientific who motivated me to
start this project. I am also very grateful to Sebastian Bahamonde, Nyein
Chan, Saffron Glenister, Atifah Mussa, Nicola Tamanini and Matthew
Wright for reading through the manuscript and pointing out various typos,
inaccuracies and making suggestions for improvements.

London & Potton, February 2016


C. G. Boehmer
About the Author

Dr. Christian Böhmer is a Reader in Mathematics at University College
London where he teaches various undergraduate and graduate courses. He is
also involved in teaching PhD students at the London Taught Course Centre.
Dr. Böhmer studied physics at the University of Potsdam, the Technical
University of Berlin and University College Dublin. After graduation, he
completed his PhD studies at the Vienna University of Technology. His
research interests are in general relativity and its modifications, cosmology,
and aspects of continuum mechanics. He has over 60 peer-reviewed articles
and has presented his work in many seminars, and at national and
international conferences.
Contents

Preface

About the Author

1. Differential Geometry
1.1 The Concept of a Vector
1.1.1 Vector operations
1.1.2 Projections and basis vectors
1.1.3 Towards tangent space and all that
1.1.4 Index notation
1.2 Manifolds and Tensors
1.2.1 Tangent space and vector fields
1.2.2 Tensors
1.2.3 Manifolds and metric
1.2.4 Examples of metrics
1.2.5 Geodesics
1.2.6 Covariant derivative
1.2.7 Parallel transport and geodesics
1.3 Curvature
1.3.1 Infinitesimal parallelogram
1.3.2 Riemann, Ricci and Weyl tensors
1.3.3 Geodesic deviation equation
1.4 Euler–Lagrange Equations
1.5 Further Reading
1.6 Exercises

2. Einstein Field Equations


2.1 Some Physics Background
2.1.1 Newton’s theory of gravity
2.1.2 Special relativity
2.1.3 Maxwell equations
2.1.4 Matter tensors
2.2 Geometry and Gravity
2.2.1 Geodesics and Newton’s law
2.2.2 Curvature and the Poisson equation
2.2.3 Field equations of General Relativity
2.2.4 The principle of minimal gravitational coupling
2.3 Weak Gravity
2.3.1 Linearised Riemann and Ricci tensors
2.3.2 Gauge transformations
2.3.3 Linearised Einstein field equations
2.3.4 Gravitational waves
2.4 Variational Approach to General Relativity
2.5 Further Reading
2.6 Exercises

3. Schwarzschild Solutions
3.1 Spherical Symmetry and Birkhoff’s Theorem
3.2 The Schwarzschild Solution
3.3 The Schwarzschild Interior Solution
3.4 Geodesics in Schwarzschild Spacetime
3.5 Testing General Relativity — The Classical Tests
3.5.1 Perihelion precession of Mercury
3.5.2 Light deflection by the Sun
3.5.3 Gravitational redshift of light
3.5.4 Radar echo or gravitational time delay
3.6 The Schwarzschild Radius
3.6.1 Radial null geodesics
3.6.2 Eddington–Finkelstein coordinates
3.6.3 Kruskal–Szekeres coordinates
3.6.4 Black holes
3.7 Further Reading
3.8 Exercises
4. Cosmology
4.1 Classical and Modern Cosmology
4.1.1 Cosmological principle
4.1.2 Geometry of constant time hypersurfaces
4.1.3 Friedmann–Lemaître–Robertson–Walker metric
4.1.4 Particle horizons
4.1.5 Field equations
4.2 Cosmological Solutions
4.2.1 Matter-dominated universe
4.2.2 Radiation-dominated universe
4.2.3 The Einstein static universe
4.2.4 De Sitter universe
4.3 Physical Cosmology
4.3.1 Cosmological parameters
4.3.2 Redshift
4.3.3 Distances in cosmology
4.3.4 Distance redshift relationships
4.3.5 The universe today
4.3.6 Shortcomings in cosmology
4.4 Inflation
4.4.1 Accelerated expansion
4.4.2 Scalar fields in cosmology
4.4.3 Slow-roll inflation
4.5 Further Reading
4.6 Exercises

5. Solutions to Exercises
5.1 Solutions: Differential Geometry
The concept of a vector
Manifolds and tensors
Curvature
5.2 Solutions: Einstein Field Equations
Some physics background
Geometry and gravity
Weak gravity
Variational approach to General Relativity
5.3 Solutions: Schwarzschild Solutions
The Schwarzschild solution
The Schwarzschild interior solution
Geodesics in Schwarzschild spacetime
Testing General Relativity — the classical tests
The Schwarzschild radius
5.4 Solutions: Cosmology
Classical and Modern Cosmology
Physical cosmology
Inflation

Bibliography

Index
1
Differential Geometry

The theory of General Relativity is a theory of gravitation based on the
geometric properties of spacetime. Its formulation requires the use of
differential geometry. One of the great difficulties when working with
geometric objects on arbitrary spaces is notation. As far as Cartesian tensors
are concerned, the issue is much easier. However, in order to prepare for the
later parts on Riemannian and Lorentzian geometry, we will introduce most
of the abstract notation of differential geometry in the familiar Euclidean
setting.

1.1. The Concept of a Vector

Let us start with Euclidean space denoted by $\mathbb{E}^3$. A vector $\mathbf{v}$ is a quantity in
$\mathbb{E}^3$ with specified direction and magnitude. The magnitude of this vector is
denoted either by $|\mathbf{v}|$ or $v$. Graphically we can view a vector as an arrow, with
its length representing the magnitude. Various physical quantities are best
represented by vectors; examples are forces, velocities, moments,
displacements and many others. We define the zero vector $\mathbf{0}$ as a vector with
zero magnitude and arbitrary direction. For any non-zero vector $\mathbf{v}$, we can define
a unit vector pointing in the same direction as $\mathbf{v}$ but with magnitude 1, simply by
$\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$. Two vectors are called equal if they have the same direction and
the same magnitude.
The concept of a vector as such does not require the introduction of
coordinates. This is important conceptually: the outcome of an experiment
should not depend on our choice of coordinates. In most applications,
however, a choice of good coordinates which are adapted to the physical
system can considerably simplify subsequent equations. Try for instance
solving the two simple equations $\ddot{x} = 0$, $\ddot{y} = 0$ using polar coordinates; it
becomes difficult.
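To see why polar coordinates complicate matters, substitute $x = r\cos\theta$, $y = r\sin\theta$: the two equations become the coupled pair $\ddot{r} - r\dot{\theta}^2 = 0$ and $r\ddot{\theta} + 2\dot{r}\dot{\theta} = 0$. The following is a minimal numerical sketch, not part of the original text, which checks this for one straight-line motion. The particular line, the evaluation time and the step size `h` are arbitrary choices, and the function names are hypothetical.

```python
import math

def polar(t):
    """Polar coordinates (r, theta) of the straight line x = 1 + 2t, y = 3 - t."""
    x, y = 1.0 + 2.0 * t, 3.0 - t
    return math.hypot(x, y), math.atan2(y, x)

def derivatives(t, h=1e-4):
    """r and the first and second time derivatives of r, theta (central differences)."""
    (r_m, th_m), (r_0, th_0), (r_p, th_p) = polar(t - h), polar(t), polar(t + h)
    r_dot = (r_p - r_m) / (2 * h)
    th_dot = (th_p - th_m) / (2 * h)
    r_ddot = (r_p - 2 * r_0 + r_m) / h**2
    th_ddot = (th_p - 2 * th_0 + th_m) / h**2
    return r_0, r_dot, th_dot, r_ddot, th_ddot

r, r_dot, th_dot, r_ddot, th_ddot = derivatives(0.5)

# Both combinations vanish (up to discretisation error) for force-free motion.
radial = r_ddot - r * th_dot**2
angular = r * th_ddot + 2 * r_dot * th_dot
print(radial, angular)
```

In Cartesian coordinates the solution is immediate, a straight line traversed at constant velocity, whereas the polar form is a nonlinear coupled system.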
However, choosing coordinates and units can also lead to substantial
problems, in particular when two groups working on the same project
assume they use the same coordinates. It was exactly this assumption that
made NASA’s Mars Climate Orbiter mission a failure. Subcontractor
Lockheed Martin designed thruster software that used Imperial units, while
NASA uses metric units in their software. The spacecraft approached Mars at
a much lower altitude than expected and it is likely that atmospheric stresses
destroyed it (NASA, 1999). As trivial as the matter seems to appear, units,
coordinates and their transformation properties are a crucial ingredient of
Engineering, Mathematics and Physics. It also explains why theoretical
physicists are keen to set every possible constant to one.

1.1.1. Vector operations

Having introduced the concept of a vector, we must next define admissible
algebraic operations on vectors. Let $\mathbf{u}$ and $\mathbf{v}$ be two vectors, then we define
their sum $\mathbf{u} + \mathbf{v}$ to be the vector which completes the triangle when the tail of
$\mathbf{v}$ is placed at the tip of $\mathbf{u}$. When $\mathbf{u}$ and $\mathbf{v}$ are interchanged, one arrives at the
same vector (parallelogram rule) and thus $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$. We can also
multiply the vector $\mathbf{v}$ by an arbitrary real number $r \in \mathbb{R}$ to get the vector $r\mathbf{v}$.
This vector has the same direction as $\mathbf{v}$ if $r > 0$ but has opposite direction if
$r < 0$, and it has length $|r||\mathbf{v}|$. When $r = 0$, we get the zero vector. Addition and
the notion of the zero vector allow us to define the inverse (under addition) of
a vector by saying that $\mathbf{w}$ is the inverse of $\mathbf{v}$ if $\mathbf{v} + \mathbf{w} = \mathbf{0}$. We will denote this
simply by $\mathbf{w} = -\mathbf{v}$. We can now state what is meant by the operation $\mathbf{u} - \mathbf{v}$,
namely $\mathbf{u} + (-\mathbf{v})$. This is the vector running from the tail of $\mathbf{u}$ to the tail of $\mathbf{v}$
when the tip of $\mathbf{v}$ is placed at the tip of $\mathbf{u}$.
Two vectors uniquely define a plane. Let us move the vectors such that
their tails join at the same point. Now, in this plane we can define the angle θ,
say, between these two vectors. This angle can also be used to measure the
area of the parallelogram spanned by the vectors. The angle between any two
vectors should be independent of the vectors’ lengths and should only depend
on their orientation. Let us define the scalar product between two vectors $\mathbf{u}$
and $\mathbf{v}$ by $\cos\theta = \mathbf{u}\cdot\mathbf{v}/(|\mathbf{u}||\mathbf{v}|)$; this can be rewritten into the standard
form of the scalar product

$$\mathbf{u}\cdot\mathbf{v} = |\mathbf{u}||\mathbf{v}|\cos\theta, \tag{1.1}$$

where $\theta \in [0, \pi]$. Two vectors are said to be perpendicular or orthogonal if
$\mathbf{u}\cdot\mathbf{v} = 0$. Note that $\mathbf{v}\cdot\mathbf{v} = |\mathbf{v}|^2$.
Having specified a plane spanned by two vectors, we can alternatively
define the same plane by only one (different) vector. Let us take a vector $\mathbf{w}$
which is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$. Then this vector can be used to define
the plane spanned by $\mathbf{u}$ and $\mathbf{v}$. If we combine this with the idea of measuring
the area of the parallelogram spanned by the vectors, we arrive at the
definition of the vector product

$$\mathbf{u}\times\mathbf{v} = |\mathbf{u}||\mathbf{v}|\sin\theta\,\mathbf{e}, \tag{1.2}$$

where $\mathbf{e}$ is a unit vector ($|\mathbf{e}| = 1$) orthogonal to both $\mathbf{u}$ and $\mathbf{v}$, and normal to the
plane spanned by them. Moreover, the orientation of $\mathbf{e}$ is given by the right-hand
rule. This means $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{e}$ form a right-handed set. Should we use a
left-hand rule, then this would introduce an additional minus sign in the
definition. The length $|\mathbf{u}\times\mathbf{v}|$ gives the area of the parallelogram. Note that
$\mathbf{u}\times\mathbf{v} = -\mathbf{v}\times\mathbf{u}$.
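The scalar and vector products can be checked numerically. The following minimal sketch assumes that vectors are stored as tuples of Cartesian components (a representation which the text only introduces in the next subsection). The helper names `dot`, `cross` and `norm` and the sample vectors are illustrative choices.

```python
import math

def dot(u, v):
    """Scalar product of two 3-vectors given by Cartesian components."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Vector product of two 3-vectors (right-handed convention)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(v):
    """Magnitude |v| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

u = (3.0, 0.0, 0.0)
v = (1.0, 2.0, 0.0)

# Angle between u and v from the scalar product, theta in [0, pi].
theta = math.acos(dot(u, v) / (norm(u) * norm(v)))

# |u x v| is the area of the parallelogram and equals |u||v| sin(theta).
area = norm(cross(u, v))
print(theta, area, norm(u) * norm(v) * math.sin(theta))
```

The vector `cross(u, v)` is orthogonal to both inputs, and swapping the arguments flips its sign, in line with the antisymmetry noted above.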

1.1.2. Projections and basis vectors

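The scalar product has a second geometrical interpretation based on the idea
of projections. Let us project $\mathbf{u}$ onto $\mathbf{v}$ and denote the projected vector by $\mathbf{u}_{\mathbf{v}}$.
The length of the projection is given by $|\mathbf{u}_{\mathbf{v}}| = |\mathbf{u}|\cos\theta$ which can be written
using the scalar product as $|\mathbf{u}_{\mathbf{v}}| = \mathbf{u}\cdot\mathbf{v}/|\mathbf{v}|$. Since this projection points in the
direction of $\mathbf{v}$, the projected vector is given by $\mathbf{u}_{\mathbf{v}} = |\mathbf{u}_{\mathbf{v}}|\,\mathbf{v}/|\mathbf{v}|$ which we write in
the form

$$\mathbf{u}_{\mathbf{v}} = \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,\mathbf{v}. \tag{1.3}$$

Therefore, the vector $\mathbf{u}$ can be decomposed into two components with respect
to $\mathbf{v}$. One part, $\mathbf{u}_{\mathbf{v}}$, which is the projection of $\mathbf{u}$ onto $\mathbf{v}$, and
another part orthogonal to $\mathbf{u}_{\mathbf{v}}$ which is given by

$$\mathbf{u}_\perp = \mathbf{u} - \mathbf{u}_{\mathbf{v}}. \tag{1.4}$$

In order to see that $\mathbf{u}_{\mathbf{v}}$ and $\mathbf{u}_\perp$ are orthogonal, it suffices to compute

$$\mathbf{u}_{\mathbf{v}}\cdot\mathbf{u}_\perp = \mathbf{u}_{\mathbf{v}}\cdot\mathbf{u} - \mathbf{u}_{\mathbf{v}}\cdot\mathbf{u}_{\mathbf{v}} = \frac{(\mathbf{u}\cdot\mathbf{v})^2}{\mathbf{v}\cdot\mathbf{v}} - \frac{(\mathbf{u}\cdot\mathbf{v})^2}{\mathbf{v}\cdot\mathbf{v}} = 0. \tag{1.5}$$

Let us assume that $\mathbf{i}$ is a given unit vector. Then, we can uniquely
decompose the vector $\mathbf{v}$ into $\mathbf{v}_{\mathbf{i}}$ and $\mathbf{v}_\perp$. Next, we consider three mutually
orthogonal unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ which satisfy $\mathbf{i}\times\mathbf{j} = \mathbf{k}$, $\mathbf{j}\times\mathbf{k} = \mathbf{i}$ and $\mathbf{k}\times\mathbf{i} = \mathbf{j}$;
we refer to such a set as a basis. Then we can decompose $\mathbf{v}$ uniquely with
respect to those three unit vectors by repeated projection along the three
given vectors

$$\mathbf{v} = (\mathbf{v}\cdot\mathbf{i})\,\mathbf{i} + (\mathbf{v}\cdot\mathbf{j})\,\mathbf{j} + (\mathbf{v}\cdot\mathbf{k})\,\mathbf{k}. \tag{1.6}$$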

We refer to the numbers $(\mathbf{v}\cdot\mathbf{i})$, $(\mathbf{v}\cdot\mathbf{j})$ and $(\mathbf{v}\cdot\mathbf{k})$ as the components of the
vector $\mathbf{v}$ in the basis $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$. Note that it is important to distinguish the
representation of a vector (three real numbers) in a given basis from its
abstract meaning (a directed line segment).
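As a small illustration of the projection formulae, the sketch below splits a vector into its part along a second vector and the orthogonal remainder, and then reads off the components in the standard basis. The function name `project` and the sample vectors are hypothetical choices made for this example.

```python
def dot(u, v):
    """Scalar product of two vectors given by Cartesian components."""
    return sum(a * b for a, b in zip(u, v))

def project(u, v):
    """Projection of u onto v, i.e. (u.v / v.v) v."""
    factor = dot(u, v) / dot(v, v)
    return tuple(factor * c for c in v)

u = (2.0, 3.0, 4.0)
v = (1.0, 1.0, 0.0)

u_par = project(u, v)                            # part of u along v
u_perp = tuple(a - b for a, b in zip(u, u_par))  # remainder, orthogonal to v

# Components of u in the basis {i, j, k} by repeated projection.
i, j, k = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
components = (dot(u, i), dot(u, j), dot(u, k))
print(u_par, u_perp, components)
```

Adding `u_par` and `u_perp` recovers `u`, and their scalar product vanishes, which is the orthogonality statement derived above.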
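Given a basis $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$, we can now compute all possible combinations of
scalar products (1.1) which are

$$\begin{array}{lll}
\mathbf{i}\cdot\mathbf{i} = 1, & \mathbf{i}\cdot\mathbf{j} = 0, & \mathbf{i}\cdot\mathbf{k} = 0,\\
\mathbf{j}\cdot\mathbf{i} = 0, & \mathbf{j}\cdot\mathbf{j} = 1, & \mathbf{j}\cdot\mathbf{k} = 0,\\
\mathbf{k}\cdot\mathbf{i} = 0, & \mathbf{k}\cdot\mathbf{j} = 0, & \mathbf{k}\cdot\mathbf{k} = 1.
\end{array} \tag{1.7}$$

Clearly, whenever two different basis vectors come together, their scalar
product returns zero as any two such vectors are perpendicular. On the other
hand, when the scalar product between a basis vector and itself is computed,
the angle is zero, giving one. Recall that basis vectors are normalised.
An important observation is that the right-hand sides of (1.7) look like the
3 × 3 identity matrix, a point which helps to justify the more compact index
notation which we will introduce shortly.
When the possible vector products (1.2) are computed we find

$$\begin{array}{lll}
\mathbf{i}\times\mathbf{i} = \mathbf{0}, & \mathbf{i}\times\mathbf{j} = \mathbf{k}, & \mathbf{i}\times\mathbf{k} = -\mathbf{j},\\
\mathbf{j}\times\mathbf{i} = -\mathbf{k}, & \mathbf{j}\times\mathbf{j} = \mathbf{0}, & \mathbf{j}\times\mathbf{k} = \mathbf{i},\\
\mathbf{k}\times\mathbf{i} = \mathbf{j}, & \mathbf{k}\times\mathbf{j} = -\mathbf{i}, & \mathbf{k}\times\mathbf{k} = \mathbf{0}.
\end{array} \tag{1.8}$$

Here we should take notice that the right-hand sides look like a
skew-symmetric (anti-symmetric) matrix. The sign of the right-hand side of
the equation can be determined by noting that it is + when the order of the
three vectors forms an even permutation of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ and it is − when the
permutation is odd.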

1.1.3. Towards tangent space and all that

Before introducing the very useful index notation, it seems appropriate to
briefly present a more abstract view of vectors; we will make this more
formal later. This viewpoint will also be useful when we discuss Riemannian
geometry or simply curved spaces. Let us consider the 2-sphere, or plainly
the surface of the Earth. Pick two points and connect them with an arrow
(straight line with orientation) from A to B, say. It is now tempting to refer to
this as a vector; however, it is not. Firstly, if we carry out the aforementioned
procedure honestly, then this straight line has to penetrate the surface of the
Earth and go through parts of its interior. On the other hand, if we insist on
drawing the line on the surface, it would not be a straight line any more, it
would be part of a great circle, provided we choose the shortest possible line
connecting A and B. Thus, we seem to struggle with applying the concept of
a vector to the situation of the sphere, or any other curved surface, see also
Fig. 1.1. Yet, everyday life seems to suggest that our approximation of the
surface of the Earth as flat Euclidean space is working rather well locally.
The reason for this is that at every point of the 2-sphere we can define the
plane tangent to its surface. This plane is Euclidean and locally, this means in
a small neighbourhood around this point, every space looks Euclidean. We
will not consider spaces with corners or sharp edges.
Fig. 1.1 Shown is an arbitrary surface. The arrow which connects the two points does not lie within this
surface and hence cannot be regarded as a vector. Also shown are two curves on this surface and a
tangent vector to one of the curves. Tangent vectors are the building blocks of tangent space.

It is a very convenient coincidence that in Euclidean three-space the
space and its tangent space at every point are isomorphic; this means their
structures are the same. In simple words, a tangent vector to a plane is itself
part of this plane. The above basis vectors {i, j, k} should, strictly speaking,
be viewed as the basis elements of the tangent space. Then there exists a dual
tangent space with a dual basis. The dual space consists of mappings which
take elements of the tangent space and map them to the real numbers. This is
nothing but the scalar product (1.1), which is then interpreted as an inner
product. In Euclidean space the space itself, its tangent space and the dual
tangent space are all isomorphic, in general this is not the case. However,
there will always be a mapping which allows us to map elements of the
tangent space to its dual space. This mapping will turn out to be the metric. It
is the metric which is of great importance for applications of physics. It is the
object which determines how we measure distances.
While we do not require this abstract setting yet, it helps to keep this in
mind as it will make the subsequent introduction of the index notation
clearer.

1.1.4. Index notation

We have seen that every vector $\mathbf{v}$ in a Euclidean space can be decomposed
with respect to its basis vectors, see Eq. (1.6). This procedure can easily be
extended to $n$ dimensions in which case we would have $n$ basis vectors. Let
us start with an arbitrary set of basis vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ which we will
denote by $\mathbf{e}_i$, $i = 1, \ldots, n$. As mentioned in the previous subsection, this is a
basis for the tangent space. Let us denote the elements of the dual basis by $\mathbf{e}^j$,
$j = 1, \ldots, n$; the $\mathbf{e}^j$ are the basis for the dual tangent space. These two bases are
dual in the sense that

$$\mathbf{e}_i\cdot\mathbf{e}^j = \delta_i^j, \tag{1.9}$$

where the meaning of the scalar product is as before and we introduced the
Kronecker delta $\delta_i^j$, which equals 1 if $i = j$ and 0 otherwise. One can think of
$\delta_i^j$ as the components of the identity matrix, see the remark after (1.7).
We write a vector $\mathbf{v}$ in the basis $\mathbf{e}_i$ in the form

$$\mathbf{v} = \sum_{i=1}^{n} v^i\,\mathbf{e}_i. \tag{1.11}$$

We refer to the $v^i$ as the contravariant components of the vector $\mathbf{v}$ in the basis
$\mathbf{e}_i$. Summations over repeated indices will occur at many places in most of the
subsequent equations. We thus introduce the Einstein summation convention
whereby one suppresses the summation symbol and sums over twice repeated
(one upper and one lower) indices. We simply write

$$\mathbf{v} = v^i\,\mathbf{e}_i \tag{1.12}$$

to mean the same as Eq. (1.11). Since the notion of a vector is independent of
our choice of basis, we could equivalently write

$$\mathbf{v} = v_i\,\mathbf{e}^i, \tag{1.13}$$

where we call $v_i$ the covariant components of the vector. Note that in
Euclidean space the numbers $v^i$ and $v_i$ are identical. Since the space itself, its
tangent space and the dual tangent space are all isomorphic to one another, one
does not have to differentiate between an upper and a lower index.
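The previously defined scalar product between two vectors $\mathbf{u}$ and $\mathbf{v}$ can
now be written as follows:

$$\mathbf{u}\cdot\mathbf{v} = (u^i\,\mathbf{e}_i)\cdot(v_j\,\mathbf{e}^j) = u^i v_j\,\delta_i^j = u^i v_i. \tag{1.14}$$

As expected, the scalar product is independent of the choice of our basis. We
also note that it easily extends to $n$ dimensions.
At this point we will introduce the Levi-Civita symbol which is defined
by

$$\varepsilon_{ijk} = \begin{cases} +1 & \text{if } (i, j, k) \text{ is an even permutation of } (1, 2, 3),\\ -1 & \text{if } (i, j, k) \text{ is an odd permutation of } (1, 2, 3),\\ \phantom{+}0 & \text{otherwise.} \end{cases}$$

This can be used to define the vector product using the index notation

$$(\mathbf{u}\times\mathbf{v})_i = \varepsilon_{ijk}\,u^j v^k.$$

Alternatively, we could state $\mathbf{e}_i\times\mathbf{e}_j = \varepsilon_{ijk}\,\mathbf{e}^k$. A crucial point to note here is the
position of the vectors. The object $\mathbf{e}_i\times\mathbf{e}_j$ should really be viewed as an area
element and not a vector as such. In three dimensions, areas and vectors are
dual objects in the sense that a single vector can uniquely characterise an
area, see the discussion after Eq. (1.1). The direction of the vector determines
the orientation of the area and the length of the vector determines its size.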
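The index form of the vector product translates directly into code. The sketch below is an illustration only: it assumes indices run over 0, 1, 2 rather than 1, 2, 3 (the usual programming convention), and the closed-form expression used for the Levi-Civita symbol is one convenient choice among several.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}.

    Returns +1 for even permutations of (0, 1, 2), -1 for odd ones
    and 0 whenever an index is repeated.
    """
    return (j - i) * (k - i) * (k - j) // 2

def cross(u, v):
    """(u x v)_i = eps_ijk u^j v^k, summing over the repeated indices j and k."""
    return tuple(
        sum(levi_civita(i, j, k) * u[j] * v[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    )

print(cross((1, 0, 0), (0, 1, 0)))  # basis vectors: i x j = k
print(cross((1, 2, 3), (4, 5, 6)))
```

The double sum makes the summation convention explicit: every repeated index pair in the formula becomes one loop.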

1.2. Manifolds and Tensors

1.2.1. Tangent space and vector fields

We start with an $n$-dimensional vector space $V$ with coordinates $\{X^1, \ldots, X^n\}$
such that $V$ is an open subset of $\mathbb{R}^n$. If we consider the surface of the sphere
defined by the condition $(X^1)^2 + (X^2)^2 + (X^3)^2 = 1$ then, as discussed earlier,
we cannot easily define vectors which are part of this space. In this example
we view the sphere as embedded in $\mathbb{R}^3$, the latter being a vector space.
In what follows we will need the notion of a smooth function $f$. This is a
function $f : V \to \mathbb{R}$ whose partial derivatives of any order exist. This means we
can differentiate $f$ with respect to any of its variables as many times as
needed. The space of all such functions is denoted by $C^\infty(V)$.

Definition 1.1 (Curve). A curve $C$ is a smooth mapping from an interval,
$C : I \to V$ where $I \subset \mathbb{R}$. We generally write the curve $C$ as $X^i = X^i(\tau)$ where τ is
the parameter of the curve.

When we think about standard calculus on the real line, we can find the
derivative of a function (provided it exists) and use the value of the derivative
at a given point to construct the tangent at this point. We are now extending
this idea to higher dimensions.

Definition 1.2 (Tangent vector of a curve). The tangent vector is given by
$\mathbf{T} = T^i\,\mathbf{e}_i$ where the components $T^i$ of this vector are given by

$$T^i = \frac{dX^i}{d\tau}.$$

This is quite an intuitive definition in agreement with Newtonian
mechanics. We can think of $X^i(\tau)$ as the trajectory of a particle and $T^i(\tau)$
would then correspond to the particle’s velocity. We are tempted to write
$\dot{X}^i(\tau)$. In this case one naturally identifies τ with Newtonian time. It is clear
that the components $T^i$ depend on the choice of coordinates in $V$; however,
the tangent vector to a curve should be independent of that choice.
We could have chosen a different parameter, λ say, for our curve $C$ so that
$X^i(\lambda(\tau))$. Then the tangent vector to the curve is

$$T^i = \frac{dX^i}{d\tau} = \frac{dX^i}{d\lambda}\frac{d\lambda}{d\tau}.$$

It is possible to always parametrise a curve $C$ such that the tangent vector to
this curve has unit length.

Definition 1.3 (Affine parametrisation). The parameter λ of a curve $C$ is
called affine if the tangent vector to this curve has unit length.

Example 1.1 (Affine parametrisation). Consider the curve
$(x, y) = (\sqrt{1 - t^2}, t)$. The tangent vector to this curve is given by
$\mathbf{T} = (-t/\sqrt{1 - t^2}, 1)$ and it has length $|\mathbf{T}| = 1/\sqrt{1 - t^2}$. Let us define a new parameter
$s$ by $t = \sin(s)$; in this new parameter our curve is given by $(x, y) = (\cos(s), \sin(s))$.
In this parametrisation the tangent vector is $\mathbf{T} = (-\sin(s), \cos(s))$
which has unit length. The parameter $s$ is nothing but the arc length of the curve.
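The example can be verified numerically. The short sketch below assumes the curve is the arc $(x, y) = (\sqrt{1 - t^2}, t)$ of the unit circle, which is what the reparametrisation $t = \sin(s)$ implies. The function names and sample values are illustrative.

```python
import math

def tangent_t(t):
    """Tangent of (x, y) = (sqrt(1 - t^2), t), differentiated with respect to t."""
    return (-t / math.sqrt(1.0 - t * t), 1.0)

def tangent_s(s):
    """Tangent of the same curve written as (cos s, sin s), with respect to s."""
    return (-math.sin(s), math.cos(s))

def length(T):
    """Euclidean length of a two-component vector."""
    return math.hypot(T[0], T[1])

for s in (0.1, 0.5, 1.0):
    t = math.sin(s)
    # In the parameter t the length is 1/sqrt(1 - t^2); in s it is exactly 1.
    print(s, length(tangent_t(t)), length(tangent_s(s)))
```

The tangent length in the parameter `t` grows without bound as `t` approaches 1, whereas in the arc-length parameter it stays equal to one along the whole curve.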

Definition 1.4 (Tangent space $T_pV$). Let $p$ be a point in $V$ and consider all
possible curves $C$ which contain $p$. We define the tangent space $T_pV$ to be the
set of all tangent vectors of curves at $p$.

In Euclidean space, we are used to adding and subtracting vectors at different
points, as these vectors were all constant. In curved spaces, vectors of the
same tangent space at some point $p$ can be combined; however, vectors at
different points are independent objects.
Our next question is the following. Our space $V$ came equipped with
some coordinates $X^i$ which suggests that there should exist some natural basis
vectors $\mathbf{e}_i$ for the tangent of a curve. In particular, this choice should be such
that the vector $\mathbf{T}$ is invariant under coordinate changes. Let $f$ be a smooth
function on $V$ and consider the object $F = f(X^i(\tau))$ which is a mapping
$F : \mathbb{R} \to \mathbb{R}$, or simply a function on the real line.
Let us use the chain rule to compute the derivative of $F$ with respect to τ;
we find

$$\frac{dF}{d\tau} = \frac{\partial f}{\partial X^i}\frac{dX^i}{d\tau} = T^i\,\frac{\partial f}{\partial X^i}.$$

Since this is true for all smooth $f$, we are led to identify the basis vectors $\mathbf{e}_i$
with the partial derivatives $\partial/\partial X^i$. Note that this object corresponds to the
directional derivative of the function $f$ along the direction $T^i$. In modern
differential geometry one speaks of the vector field $\mathbf{T}$ which is a mapping
from the smooth functions into the tangent space. In vector calculus one
would write $\mathbf{T}\cdot\nabla f$. To check if this vector $\mathbf{T}$ has indeed the desired
properties we first define what is meant by a coordinate transformation.
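The identification of the tangent vector with a directional derivative can be tested on a concrete case. In the sketch below the function `f` and the curve are arbitrary sample choices, not taken from the text. The chain-rule expression $T^i\,\partial f/\partial X^i$ is compared with a finite-difference derivative of $F(\tau) = f(X^i(\tau))$.

```python
import math

def f(x, y):
    """A sample smooth function on V (arbitrary choice for illustration)."""
    return x * x * y + math.sin(y)

def grad_f(x, y):
    """Partial derivatives (df/dX^1, df/dX^2) of the sample function."""
    return (2.0 * x * y, x * x + math.cos(y))

def curve(tau):
    """A sample curve X^i(tau)."""
    return (math.cos(tau), tau * tau)

def tangent(tau):
    """Components T^i = dX^i/dtau of the tangent vector of the sample curve."""
    return (-math.sin(tau), 2.0 * tau)

def dF_chain(tau):
    """T^i df/dX^i evaluated on the curve (summation over i)."""
    x, y = curve(tau)
    return sum(T * d for T, d in zip(tangent(tau), grad_f(x, y)))

def dF_numeric(tau, h=1e-6):
    """Central finite difference of F(tau) = f(X^i(tau))."""
    return (f(*curve(tau + h)) - f(*curve(tau - h))) / (2.0 * h)

print(dF_chain(0.8), dF_numeric(0.8))
```

The two numbers agree up to discretisation error, which is the statement that $\mathbf{T}$ acts on $f$ as the operator $T^i\,\partial/\partial X^i$.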

Definition 1.5 (Coordinate transformations). Consider the vector space $V$
with coordinates $\{X^1, \ldots, X^n\}$. Consider a set $\{Y^1, \ldots, Y^n\}$ such that the $Y^i$ are
smooth functions of the old coordinates $Y^1(X^1, \ldots, X^n)$, $Y^2(X^1, \ldots, X^n)$, etc. For
$\{Y^1, \ldots, Y^n\}$ to be a coordinate system of $V$, it is required that the matrix of first
derivatives

$$J^i{}_j = \frac{\partial Y^i}{\partial X^j},$$

the Jacobian, is invertible.

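Let us now return to the tangent vector $\mathbf{T}$. The concept of this vector
should be independent of the choice of coordinates. So, let us consider two
coordinate systems $\{X^1, \ldots, X^n\}$ and $\{Y^1, \ldots, Y^n\}$. Using the chain rule we have

$$\frac{\partial f}{\partial X^i} = \frac{\partial f}{\partial Y^j}\frac{\partial Y^j}{\partial X^i}.$$

Therefore, the partial derivatives of $f$ change under coordinate
transformations. For the vector $\mathbf{T}$ to be invariant under coordinate
transformations, it is required that the components $T^i$ transform inversely to
the partial derivatives; this means we require

$$T^j_Y = \frac{\partial Y^j}{\partial X^i}\,T^i_X,$$

where the subscripts $X$, $Y$ indicate that the components $T^i$ are those
corresponding to the coordinates $X^i$ and $Y^i$, respectively. This is a somewhat
involved notation which is only used in the following calculation to
emphasise the underlying mathematics. In this case we have

$$\mathbf{T} = T^i_X\,\frac{\partial}{\partial X^i} = T^i_X\,\frac{\partial Y^j}{\partial X^i}\frac{\partial}{\partial Y^j} = T^j_Y\,\frac{\partial}{\partial Y^j}. \tag{1.22}$$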

We have therefore shown that the vector $\mathbf{T}$ is invariant under coordinate
transformations; however, its components $T^i$ do transform in a specific way.
At this point the calculation (1.22) seems a little daunting, but it will
soon become second nature.
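A concrete instance may help. The sketch below takes $X = (x, y)$ to be Cartesian and $Y = (r, \varphi)$ to be polar coordinates in the plane, transforms the components of a tangent vector with the Jacobian, and checks that the directional derivative of a sample function comes out the same in both systems. The function, the point and the components are arbitrary choices made for this illustration.

```python
import math

def jacobian(x, y):
    """Matrix dY^j/dX^i for Y = (r, phi) and X = (x, y); rows are j, columns are i."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return ((x / r, y / r),
            (-y / r2, x / r2))

def to_polar_components(T_X, x, y):
    """T_Y^j = (dY^j/dX^i) T_X^i, summing over i."""
    J = jacobian(x, y)
    return tuple(sum(J[j][i] * T_X[i] for i in range(2)) for j in range(2))

# Sample data: f(x, y) = x^2 y, so df/dx = 2xy and df/dy = x^2.
x, y = 1.0, 2.0
T_X = (0.5, -1.5)
df_dX = (2.0 * x * y, x * x)

# Partial derivatives of f with respect to (r, phi), obtained via the chain rule.
r, phi = math.hypot(x, y), math.atan2(y, x)
df_dY = (df_dX[0] * math.cos(phi) + df_dX[1] * math.sin(phi),
         r * (-df_dX[0] * math.sin(phi) + df_dX[1] * math.cos(phi)))

T_Y = to_polar_components(T_X, x, y)
Tf_X = sum(a * b for a, b in zip(T_X, df_dX))  # T_X^i df/dX^i
Tf_Y = sum(a * b for a, b in zip(T_Y, df_dY))  # T_Y^j df/dY^j
print(Tf_X, Tf_Y)
```

The components `T_X` and `T_Y` are different sets of numbers, yet the derivative of the function along the vector agrees, as required for a coordinate-independent object.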
Some of the older literature on differential geometry starts the entire
subject with the definitions of transformation properties of vectors. Often the
new coordinate system is denoted by a prime, $X'^i$, which simplifies the
notation. A vector in the primed coordinate system also is indicated by a
prime.

Definition 1.6 (Contravariant and covariant vectors). Let $X^i$ be the
coordinates of the vector space $V$ and let $X'^i$ be a different set of coordinates.
A contravariant vector $T^i$ transforms under coordinate transformation such
that

$$T'^i = \frac{\partial X'^i}{\partial X^j}\,T^j. \tag{1.23}$$

A covariant vector $S_i$ transforms under coordinate transformation such that

$$S'_i = \frac{\partial X^j}{\partial X'^i}\,S_j. \tag{1.24}$$

The covariant components of the vector $S_i$ are defined with a lower index
which indicates we should be able to write the vector $\mathbf{S}$ as follows:

$$\mathbf{S} = S_i\,\mathbf{E}^i, \tag{1.25}$$

where $\mathbf{E}^i$ are some new basis vectors with an upper index. Our next task is to
interpret these basis vectors. Often the same letter is used for both types of
basis vectors $\mathbf{e}_i$ and $\mathbf{e}^i$ where only the index position changes. For now it is
slightly clearer to distinguish them.
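The two transformation rules are designed so that the combination $S_i T^i$ does not depend on the coordinates. A minimal sketch, assuming a linear change of coordinates $X'^i = A^i{}_j X^j$ in two dimensions so that the Jacobian is the constant matrix $A$ (the matrix and the components are arbitrary sample values):

```python
def mat_vec(M, v):
    """Apply a 2 x 2 matrix to a two-component tuple."""
    return tuple(sum(M[i][j] * v[j] for j in range(2)) for i in range(2))

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix with non-zero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def transpose(M):
    """Transpose of a matrix stored as a tuple of rows."""
    return tuple(zip(*M))

# Linear coordinate change X'^i = A^i_j X^j, hence dX'^i/dX^j = A^i_j.
A = ((2.0, 1.0), (0.0, 4.0))
A_inv = inverse_2x2(A)                # dX^j/dX'^i = (A^{-1})^j_i

T = (1.0, 3.0)                        # contravariant components T^j
S = (5.0, -2.0)                       # covariant components S_j

T_new = mat_vec(A, T)                 # T'^i = A^i_j T^j
S_new = mat_vec(transpose(A_inv), S)  # S'_i = (A^{-1})^j_i S_j

pairing_old = sum(s * t for s, t in zip(S, T))
pairing_new = sum(s * t for s, t in zip(S_new, T_new))
print(pairing_old, pairing_new)
```

The contravariant components pick up the Jacobian while the covariant ones pick up its inverse, so the two factors cancel in the sum over the repeated index.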

Definition 1.7 (Dual vector space). Let $V$ be a vector space. The dual vector
space or dual space $V^*$ is defined to be the space of all linear mappings from $V$
into $\mathbb{R}$. This means for $\mathbf{E} \in V^*$ and $\mathbf{e} \in V$ we have $\mathbf{E} : V \to \mathbb{R}$ and denote this
as $\mathbf{E}(\mathbf{e})$ or $\mathbf{E}\cdot\mathbf{e}$.

In Euclidean 3-space $V = \mathbb{E}^3$ we would identify standard column vectors
with the elements $\mathbf{u} \in V$ and row vectors with the elements
$\mathbf{v}^* \in V^*$. We would have

$$\mathbf{v}^*(\mathbf{u}) = \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} \begin{pmatrix} u^1 \\ u^2 \\ u^3 \end{pmatrix} = v_1 u^1 + v_2 u^2 + v_3 u^3 = v_i u^i. \tag{1.26}$$

We note that this is exactly the scalar product in Eq. (1.14). It is worth
mentioning that the Einstein summation convention applies to one upper and
one lower index which ensures that objects from the correct spaces are
matched up to produce meaningful equations.

Note. For readers who have completed Quantum Mechanics, the Dirac ket
vector $|\psi\rangle$ would naturally be associated with an element of $V$ and the bra
vector $\langle\phi|$ with an element of $V^*$. So $\langle\phi|$ is a mapping from $V$ to the complex
numbers in this case. $\langle\phi|\psi\rangle$ is the inner product between the two states.
The spaces appearing in Quantum Mechanics are infinite dimensional and
complex.

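We recall the identification $\mathbf{e}_i = \partial/\partial X^i$ where the $\mathbf{e}_i$ are the basis vectors of
$T_pV$. Therefore the $\mathbf{E}^i$ are the basis vectors of $(T_pV)^*$. Let us denote these by
$\mathbf{E}^i = dX^i$; at this stage this is nothing but a name, and we have

$$dX^i\!\left(\frac{\partial}{\partial X^j}\right) = \delta^i_j, \tag{1.27}$$

meaning that the $\mathbf{E}^i$ and $\mathbf{e}_j$ are dual basis vectors. One could equally write
$\mathbf{E}^i\cdot\mathbf{e}_j = \delta^i_j$ as in Eq. (1.9). Next, we will show that one can indeed interpret
the dual basis vectors as the coordinate differentials.
Let $f$ be a smooth function and $\mathbf{T} = T^i\,\mathbf{e}_i$ be an element of $T_pV$. Let us start
with the usual classical differential

$$df = \frac{\partial f}{\partial X^i}\,dX^i. \tag{1.28}$$

According to our naming, this object should be an element of $(T_pV)^*$. This
all works out nicely when we consider

$$df(\mathbf{T}) = \frac{\partial f}{\partial X^i}\,dX^i\!\left(T^j\frac{\partial}{\partial X^j}\right) = \frac{\partial f}{\partial X^i}\,T^j\,\delta^i_j = T^i\,\frac{\partial f}{\partial X^i}.$$

This means $df(\mathbf{T})$ is nothing but the directional derivative of $f$ along the vector
$\mathbf{T}$. This is a very nice result which connects vector calculus with geometry,
hence the name differential geometry.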
We will now introduce the comma notation whereby partial derivatives
with respect to the coordinate $X^i$ are denoted by ${}_{,i}$ or by $\partial_i$; this means we
define

$$f_{,i} = \partial_i f = \frac{\partial f}{\partial X^i}.$$

This is very convenient, in particular in vector calculus. In this case we
are working in $\mathbb{E}^3$ and have

$$(\operatorname{grad} f)_i = \partial_i f, \qquad \operatorname{div}\mathbf{v} = \partial_i v^i, \qquad (\operatorname{curl}\mathbf{v})_i = \varepsilon_{ijk}\,\partial_j v_k.$$

The components of $df$ in Eq. (1.28) are given by grad $f$. The curl operator is
specific to three (and seven) dimensions and does not exist in this form in
four dimensions for instance. Many vector calculus identities can be proved
very efficiently using the index notation. In vector calculus, in three
dimensions, it is common to use the symbol ∇; however, in differential
geometry this is reserved for the covariant derivative.

1.2.2. Tensors

The definitions of contravariant and covariant vectors can be extended to
objects with more than one index. The motivation for this stems from the fact
that some objects in nature simply cannot be written as scalar or vector
quantities. One such example is the Cauchy stress tensor which is a rank-2
tensor. The rank of a tensor simply counts its number of indices. Another
example is the permittivity tensor in electromagnetism. The permittivity
tensor relates the exterior electric field E to the electric displacement field D
in the medium. In linear elasticity theory the stresses and strains are related
by the material tensor, which is a rank-4 tensor, it relates two rank-2 objects.
In simple words, a tensor is something where every index transforms
correctly under coordinate transformations. More precisely this is written as
follows.

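Definition 1.8 (Tensor of type (p,q)). A tensor of type (p, q) is an object
with p contravariant and q covariant indices, so we write
$T^{a_1 \dots a_p}{}_{b_1 \dots b_q}$. Under coordinate transformations
$X^i \to X'^i$ it transforms according to

$$ T'^{a_1 \dots a_p}{}_{b_1 \dots b_q} = \frac{\partial X'^{a_1}}{\partial X^{c_1}} \cdots \frac{\partial X'^{a_p}}{\partial X^{c_p}} \, \frac{\partial X^{d_1}}{\partial X'^{b_1}} \cdots \frac{\partial X^{d_q}}{\partial X'^{b_q}} \, T^{c_1 \dots c_p}{}_{d_1 \dots d_q} . $$

The rank of a tensor is the number of its indices, r = p + q.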


Tensors of the same type can be added and subtracted, and we can
multiply any tensor by an arbitrary (real) scalar. For tensors with rank r ≥ 2,
there is an additional operation, namely the contraction.

Definition 1.9 (Contraction). Let $T^{i_1 \dots i_p}{}_{j_1 \dots j_q}$ be a
tensor of type (p, q). The object

$$ T^{i_1 \dots k \dots i_p}{}_{j_1 \dots k \dots j_q} $$

is a type (p − 1, q − 1) tensor with rank p + q − 2. We say that we have
contracted over one pair of indices. Note that we sum over the index k.

For a rank-2 tensor, the contraction corresponds to taking the trace of a
square matrix, A for instance. We have $\operatorname{tr} A = A^i{}_i$ which is
the sum of the diagonal elements. One can think of the contraction as a higher
dimensional version of the trace. For instance, if we begin with a rank 2n
tensor, then we can in principle contract n times and arrive at a scalar. For
a tensor of rank 2n + 1 we can also contract n times, however, in this case
one arrives at a vector.
Let $T_{ab}$ be a rank-2 tensor, then we define the symmetric and
skew-symmetric parts as follows:

$$ T_{(ab)} = \tfrac{1}{2} \left( T_{ab} + T_{ba} \right) , \tag{1.36} $$

$$ T_{[ab]} = \tfrac{1}{2} \left( T_{ab} - T_{ba} \right) . \tag{1.37} $$

Every rank-2 tensor can be written as $T_{ab} = T_{(ab)} + T_{[ab]}$. A rank-2
tensor is called symmetric if $T_{ab} = T_{ba}$ and skew-symmetric if
$T_{ab} = -T_{ba}$. This is in complete analogy to matrices. Let A be a square
matrix, then one defines the symmetric part by sym(A) = (A + Aᵀ)/2 and its
skew-symmetric part by skew(A) = (A − Aᵀ)/2. It is natural to view rank-2
tensors as square matrices, however, one has to be very careful since tensor
components transform under coordinate transformations. Note that one can
generalise Eqs. (1.36) and (1.37) to higher rank tensors, however, we will not
use this notation as it is easier to read equations written out explicitly,
for the time being.
In n dimensions any symmetric rank-2 tensor has n(n + 1)/2 independent
components, and any skew-symmetric rank-2 tensor has (n − 1)n/2
independent components. We note that these two numbers add to n² as
expected.
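
The decomposition and the component counts above can be checked numerically. The following is a minimal Python sketch and not part of the original text; the helper names `sym_skew` and `independent_components` and the sample components are illustrative assumptions.

```python
def sym_skew(T):
    """Split a square array T[a][b] into symmetric and skew-symmetric parts."""
    n = len(T)
    sym = [[0.5 * (T[a][b] + T[b][a]) for b in range(n)] for a in range(n)]
    skew = [[0.5 * (T[a][b] - T[b][a]) for b in range(n)] for a in range(n)]
    return sym, skew


def independent_components(n):
    """Return (symmetric, skew-symmetric) component counts in n dimensions."""
    return n * (n + 1) // 2, n * (n - 1) // 2


# Hypothetical sample components T_ab in three dimensions.
T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
sym, skew = sym_skew(T)

# The two parts add up to the original components, T_ab = T_(ab) + T_[ab].
recombined = [[sym[a][b] + skew[a][b] for b in range(3)] for a in range(3)]

if __name__ == "__main__":
    print(recombined == T)
    print(independent_components(4))
```

In four dimensions the counts are 10 and 6, which add up to 16 = 4² as stated in the text.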
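We have seen that $\partial f/\partial X^i$ transforms like a covariant vector
under coordinate transformations. Let us check how the partial derivative of a
contravariant vector $A^i$ transforms under coordinate transformations
$X^i \to X'^i$:

$$ \begin{aligned} \frac{\partial A'^i}{\partial X'^j} &= \frac{\partial X^l}{\partial X'^j} \frac{\partial}{\partial X^l} \left( \frac{\partial X'^i}{\partial X^k} A^k \right) \\ &= \frac{\partial X'^i}{\partial X^k} \frac{\partial X^l}{\partial X'^j} \frac{\partial A^k}{\partial X^l} + \frac{\partial X^l}{\partial X'^j} \frac{\partial^2 X'^i}{\partial X^l \partial X^k} A^k . \end{aligned} \tag{1.38} $$

The first term on this last line transforms correctly, however, we have an
additional second term. Therefore, the object $\partial_j A^i$ is not a (1, 1)
tensor and hence the partial derivative is not a ‘good’ derivative operator.
We need to define a new derivative operator which maps tensors to tensors.
However, we can be clever and consider the object given by

$$ F_{ij} = \partial_i A_j - \partial_j A_i . $$

We can view this as the skew-symmetric part of the partial derivative of a
covariant vector $A_i$ without the factor of 1/2, or the Faraday tensor of
electromagnetism. If we repeat the calculation of Eq. (1.38) for
$\partial_i A_j$, swap the indices i and j, and then compute the
transformation properties of $F_{ij}$, we note that the terms which do not
transform correctly cancel each other, because second partial derivatives are
symmetric. Thus, $F_{ij}$ does transform like a (0, 2) tensor, which is a nice
little result since we have not yet defined a meaningful derivative operator.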
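
The cancellation can be made explicit in a short worked computation. This is a sketch and not the book's own derivation; it assumes that primed quantities refer to the new coordinates $X'^i$ and that $A_i$ is a covariant vector, so that the skew-symmetric combination of its partial derivatives carries two covariant indices.

```latex
\[
\begin{aligned}
\frac{\partial A'_j}{\partial X'^i}
  &= \frac{\partial X^k}{\partial X'^i}\,
     \frac{\partial}{\partial X^k}
     \left( \frac{\partial X^l}{\partial X'^j} A_l \right) \\
  &= \frac{\partial X^k}{\partial X'^i} \frac{\partial X^l}{\partial X'^j}\,
     \partial_k A_l
   + \frac{\partial^2 X^l}{\partial X'^i \partial X'^j}\, A_l ,
\\[4pt]
\partial'_i A'_j - \partial'_j A'_i
  &= \frac{\partial X^k}{\partial X'^i} \frac{\partial X^l}{\partial X'^j}
     \left( \partial_k A_l - \partial_l A_k \right) .
\end{aligned}
\]
```

The inhomogeneous term is symmetric in i and j, so it drops out of the skew-symmetric combination and only the homogeneous tensor transformation law remains.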

1.2.3. Manifolds and metric

Before defining what is meant by a manifold, let us briefly discuss the
2-sphere 𝕊² again. Normally, when we think of this space we view it embedded
in ℝ³. However, the geometrical properties of the sphere should really be
independent of this embedding. The idea of the manifold is to make this more
precise.

Definition 1.10 (Manifold). An n-dimensional smooth manifold is a set ℳ
together with a collection of open subsets {Ωα} and a collection of
coordinates $X^i_\alpha$ with α ∈ ℕ and i = 1,..., n. These satisfy:
(i) Every point p ∈ ℳ is also in at least one Ωα. This means that the
collection {Ωα} will contain every point p, we say that the manifold
is covered by {Ωα}.
(ii) For each α the coordinates $X^i_\alpha$ are bijective functions
$X_\alpha$ : Ωα → Oα ⊂ ℝⁿ where the Oα are also open.
(iii) If any Ωα and Ωβ overlap, then the coordinates $X^i_\alpha$ and
$X^i_\beta$ are related by a coordinate transformation on the intersection
Ωα ∩ Ωβ.

A manifold is something very natural in a way. It means we have some
arbitrarily shaped space we are trying to understand, however, we cannot do
this directly. So, we find some open subsets with coordinates which cover
this space using one patch at a time. These patches all come with some
coordinates which then allow us to locally understand this particular patch.
Whenever two such patches overlap, we can transform coordinates from this
patch to the next and carry on. In this way we are able to study the entire
manifold. Physicists tend to call the $X^i_\alpha$ for a specific α a
coordinate system while mathematicians prefer the word chart. One can also
define complex manifolds by working with ℂⁿ instead of ℝⁿ. Figure 1.2 is
useful in visualising the idea of a manifold.

Fig. 1.2 Manifold covered by two coordinate patches Ωα and Ωβ. The intersection Ωα ∩ Ωβ is
indicated by the shaded regions. There exists a coordinate transformation between highlighted parts of
Uα and Uβ.

Many spaces can be described using the concept of a manifold, examples
are flat spaces ℝⁿ in n dimensions, the n-spheres 𝕊ⁿ and n-tori 𝕋ⁿ. Other
examples are the Möbius strip and Klein’s bottle.
Having defined a manifold on which we can define scalars, vectors and
tensors, we must now add some additional structures. In the end we want to
build a physical theory which means we want to be able to measure distances
between points, lengths of vectors, and also angles between curves for
instance. This additional structure is provided by the metric $g_{ab}$ which
is a symmetric rank-2 tensor with inverse $g^{bc}$ so that
$g_{ab} g^{bc} = \delta_a^c$. One way to motivate the metric is as follows.
Let us start with Pythagoras’ theorem. If the point P has coordinates
(x, y, z) with respect to the origin, then the distance of P from that origin
is given by $s^2 = x^2 + y^2 + z^2$. Next, we consider small values ∆x, ∆y,
... and then pass to the infinitesimal limit and we write

$$ ds^2 = dx^2 + dy^2 + dz^2 . \tag{1.40} $$

This says that the infinitesimal distance between points is position
independent. This space is the same everywhere. When we introduced
vectors earlier we discussed the sphere 𝕊² and noted the difficulties of
defining vectors on the surface of the sphere. In order to define the
‘correct’ distance between two points on the sphere, we cannot simply use
Pythagoras’ theorem as the line connecting the two points would go through
part of the sphere and not be on its surface. In order to be able to measure
distances correctly for all kinds of spaces, we need to generalise Eq. (1.40).
Therefore, we need to introduce arbitrary functions in front of all the
squared infinitesimals and all the possible cross terms. This means we will
write

$$ ds^2 = g_{ij}\, dX^i dX^j , $$

where all the $g_{ij}$ are functions of the coordinates. The object $g_{ij}$
is called the metric tensor, often just called the metric. The object $ds^2$
is called the line element, it is often used as a synonym for metric. Since
$dX^i dX^j$ is symmetric in i and j, $g_{ij}$ must also be symmetric. In 3
dimensions the metric contains 6 functions, in 4 dimensions it contains 10
functions, and in dimension n there are n(n + 1)/2 functions in the metric.
Another way to motivate (define) the metric is by going back to an
arbitrary vector V which we could write in two different ways
$V = V_i E^i = V^i e_i$. So far we avoided the question of how the components
$V_i$ are related to those of $V^i$. We can define $g_{ij}$ to be the mapping
which relates these quantities by requiring $V_i = g_{ij} V^j$. This
corresponds to viewing the metric as the inner product of this space. However,
this will only work if the metric is positive definite and it turns out that
in General Relativity our metric tensor is not positive definite. We will put
this into the following definition.
Definition 1.11 (Metric and line element). Let ℳ be an n-dimensional
smooth manifold. The Riemannian metric tensor $g_{ij}$ defines the positive
definite inner product ⟨·,·⟩ such that

$$ \langle A, B \rangle = g_{ij} A^i B^j . $$

If ⟨·,·⟩ is only non-degenerate,¹ then we call the corresponding metric
pseudo-Riemannian. The line element $ds^2$ is

$$ ds^2 = g_{ij}\, dX^i dX^j , $$

where the $X^i$ are the coordinates in a patch of ℳ.

If we evaluate the metric tensor at some point on the manifold, then,
being a symmetric n × n matrix, it will have n real eigenvalues, taking into
account multiplicity. The number of positive, negative or zero eigenvalues
cannot be changed by coordinate transformations.

Definition 1.12 (Metric signature). The signature of the metric is the
number of positive, negative and zero eigenvalues, however, we will not
encounter a metric with zero eigenvalues. In general, the pair (p, q) of
integers denotes the number of positive and negative eigenvalues,
respectively. Sometimes the signs of the eigenvalues are made explicit,
instead of writing (3, 1) one often writes that the signature is (+, +, +, −)
or (−, +, +, +) as the order can be changed by relabelling coordinates.

Definition 1.13 (Lorentzian metric). A metric is called Lorentzian if it has
signature (p, 1) or (1, q).

The Lorentzian metric is the fundamental variable of General Relativity.
The theory is formulated in four dimensions, one time direction and three
space directions, and the Einstein field equations are a set of 10 nonlinear
partial differential equations in the components of the metric tensor. Solving
the Einstein field equations means finding the metric functions.

1.2.4. Examples of metrics

This section will contain some examples of the Euclidean metric in two and
three dimensions using different common coordinate systems. The reader is
encouraged to work through these examples as they will be used at various
points in the remainder of the book. The standard line elements of Euclidean
space using Cartesian coordinates are

$$ ds^2 = dx^2 + dy^2 , \qquad ds^2 = dx^2 + dy^2 + dz^2 , $$

in two and three dimensions, respectively.

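Example 1.2 (Polar coordinates in two dimensions (2D)). Let us
introduce x = r cos φ and y = r sin φ, then

$$ dx = \cos\varphi\, dr - r \sin\varphi\, d\varphi , \qquad dy = \sin\varphi\, dr + r \cos\varphi\, d\varphi , $$

$$ dx^2 = \cos^2\varphi\, dr^2 - 2 r \sin\varphi \cos\varphi\, dr\, d\varphi + r^2 \sin^2\varphi\, d\varphi^2 , $$

$$ dy^2 = \sin^2\varphi\, dr^2 + 2 r \sin\varphi \cos\varphi\, dr\, d\varphi + r^2 \cos^2\varphi\, d\varphi^2 . $$

Adding up the last two equations shows that Euclidean space in polar
coordinates is given by

$$ ds^2 = dr^2 + r^2 d\varphi^2 . $$

Example 1.3 (Cylindrical coordinates in 3D). We introduce x = ρ cos φ, y =
ρ sin φ and z = z. We can follow the previous calculation and arrive at the
metric of Euclidean space in cylindrical coordinates

$$ ds^2 = d\rho^2 + \rho^2 d\varphi^2 + dz^2 . $$

The use of ρ instead of r is useful as the letter r is generally associated
with the Euclidean distance from the origin. In this case r² = ρ² + z².

Example 1.4 (Spherical coordinates in 3D). Our conventions for the
spherical coordinate system are

$$ x = r \sin\theta \cos\phi , \qquad y = r \sin\theta \sin\phi , \qquad z = r \cos\theta . $$

Computing all the terms dx, dx², dy, ... is slightly lengthy but
straightforward. The result is

$$ ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 . $$

Therefore, we can write the metric tensor as

$$ g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix} . $$

The determinant of this metric tensor $g = \det g_{ij}$ is the product of its
diagonal components, and $\sqrt{g} = r^2 \sin\theta$. This is the Jacobian of
the transformation from Cartesian to spherical coordinates which is well known
in vector calculus.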
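
These coordinate changes can be cross-checked numerically by building the metric from the Jacobian of the map to Cartesian coordinates. The following Python sketch is not from the text; the function names (`jacobian`, `induced_metric`, `spherical`, `polar`), the finite-difference step and the sample point are assumptions, and the spherical convention x = r sin θ cos ϕ, y = r sin θ sin ϕ, z = r cos θ is assumed.

```python
import math


def jacobian(f, q, h=1e-6):
    """Central-difference Jacobian J[a][i] = d f^a / d q^i at the point q."""
    n = len(q)
    m = len(f(q))
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        fp, fm = f(qp), f(qm)
        for a in range(m):
            J[a][i] = (fp[a] - fm[a]) / (2.0 * h)
    return J


def induced_metric(f, q):
    """g_ij = sum_a (dx^a/dq^i)(dx^a/dq^j) for Cartesian x^a = f(q)."""
    J = jacobian(f, q)
    n = len(q)
    return [[sum(row[i] * row[j] for row in J) for j in range(n)]
            for i in range(n)]


def spherical(q):
    r, theta, phi = q
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]


def polar(q):
    r, phi = q
    return [r * math.cos(phi), r * math.sin(phi)]


r, theta, phi = 2.0, 0.9, 0.4  # arbitrary sample point
g = induced_metric(spherical, [r, theta, phi])
expected_diag = [1.0, r**2, (r * math.sin(theta))**2]
# The metric is diagonal, so sqrt(det g) is the root of the diagonal product.
sqrt_g = math.sqrt(g[0][0] * g[1][1] * g[2][2])

if __name__ == "__main__":
    for i in range(3):
        print(g[i][i], expected_diag[i])
    print(sqrt_g, r**2 * math.sin(theta))
```

Up to finite-difference error, the diagonal entries reproduce 1, r² and r² sin²θ, and the square root of the determinant reproduces r² sin θ. Passing `polar` instead of `spherical` gives the two-dimensional result diag(1, r²).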

Following on from the previous examples, without providing further
details at this point, we state that the volume of a space with given metric
tensor $g_{ij}$ is given by

$$ V = \int \sqrt{|g|}\; d^n X . $$

Example 1.5 (Surface of the 2-sphere). Continuing on from the previous
example, we set r = R = const. in order to describe the surface of the sphere.
The corresponding line element now becomes

$$ ds^2 = R^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right) . $$

It is customary in General Relativity to introduce the notation
$d\Omega^2 = d\theta^2 + \sin^2\theta\, d\phi^2$ where $d\Omega^2$ is the line
element of the unit sphere.
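
As a small illustration of the volume formula applied to the 2-sphere, one can integrate the square root of the metric determinant over θ and ϕ. The Python sketch below is not from the text: it assumes the volume formula has the form V = ∫ √g dⁿX, uses √g = R² sin θ for the line element ds² = R²(dθ² + sin²θ dϕ²), and the function name `sphere_area` and the midpoint rule are illustrative choices.

```python
import math


def sphere_area(R, n_theta=2000):
    """Midpoint-rule integral of sqrt(g) = R^2 sin(theta) over the 2-sphere."""
    d_theta = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        sqrt_g = R**2 * math.sin(theta)  # sqrt(det g_ij) for ds^2 = R^2 dOmega^2
        total += sqrt_g * d_theta
    # The integrand does not depend on phi, so that integral contributes 2*pi.
    return 2.0 * math.pi * total


if __name__ == "__main__":
    R = 1.5
    print(sphere_area(R), 4.0 * math.pi * R**2)
```

The numerical value agrees with the familiar surface area 4πR² up to the discretisation error of the midpoint rule.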

Example 1.6 (Poincaré hyperbolic disk). The line element of the Poincaré
hyperbolic disk is given by

$$ ds^2 = \frac{4 \left( dx^2 + dy^2 \right)}{\left( 1 - x^2 - y^2 \right)^2} , $$

with x² + y² < 1. This is a really interesting space which inspired some of
the works of Escher, Circle Limit III (1959) and Circle Limit IV (1960), see
Escher (2015). This is an example of a space with constant negative
curvature, we will define later what this means. Such spaces are important in
Cosmology where we will meet them again.

Example 1.7 (Minkowski space). The line element of Minkowski space is

$$ ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 . $$

This is our first example of a Lorentzian metric with signature (−, +, +, +),
here c is the speed of light. The factor c² is required because we need the
physical units to match. Alternatively, one could have divided the spatial
parts by c² instead of multiplying the time component. However,
conventionally ds has units of length. The metric functions are
dimensionless.
It is fair to say that Minkowski space is the most important space of
theoretical physics. Consequently, the Minkowski metric is denoted by a
separate symbol, namely $\eta_{ij}$.

1.2.5. Geodesics

Let us consider a manifold ℳ and a curve C given by $X^i = X^i(\tau)$ with
parameter τ. Let us simply start with a 2D flat space with line element
$ds^2 = dx^2 + dy^2$ and coordinates $X^i = (x, y)$. Our curve C is now given
in parametric form (x, y) = (x(τ), y(τ)). Let us now eliminate that parameter
in the parametric form and write our curve as y = y(x), just like in calculus.
We will now show that ds, the square root of the line element, corresponds to
the arc length of this curve. Since y = y(x), we have dy = y′(x)dx and
therefore $ds^2 = dx^2 + y'(x)^2 dx^2 = (1 + y'(x)^2)\, dx^2$. Now we can
write

$$ s = \int ds = \int \sqrt{1 + y'(x)^2}\; dx , $$

which is the well-known calculus formula.
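
The arc length formula s = ∫ √(1 + y′(x)²) dx is easy to evaluate numerically. The following Python sketch is an illustration only; the function name `arc_length`, the midpoint rule and the two test curves are assumptions rather than anything taken from the text.

```python
import math


def arc_length(dy_dx, a, b, n=10000):
    """Midpoint-rule approximation of the integral of sqrt(1 + y'(x)^2)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += math.sqrt(1.0 + dy_dx(x) ** 2) * h
    return total


# y = cosh(x) has y' = sinh(x); its arc length on [0, 1] is sinh(1).
s_catenary = arc_length(math.sinh, 0.0, 1.0)
# The straight line y = 2x has length sqrt(5) on [0, 1].
s_line = arc_length(lambda x: 2.0, 0.0, 1.0)

if __name__ == "__main__":
    print(s_catenary, math.sinh(1.0))
    print(s_line, math.sqrt(5.0))
```

For the catenary the integrand reduces to cosh x, so the exact value sinh(1) is available for comparison.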
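Consider an arbitrary curve C given by $X^i = X^i(\lambda)$, where λ is the
affine parameter, see Definition 1.3. In a space with metric $g_{ij}$ we have

$$ s = \int ds = \int \sqrt{g_{ij}\, dX^i dX^j} = \int \sqrt{g_{ij} \dot X^i \dot X^j}\; d\lambda , \tag{1.62} $$

where the dot means differentiation with respect to λ.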


This means, if we have two points which are connected by C, then we can
use the arc length s to determine the distance between these points along that
curve. However, we are much more interested in special curves, namely those
that minimise the distance between our points. In other words, what is the
shortest distance? From a physical point of view this is in fact the most
natural question, related to the principle of least action (Hamilton’s principle)
which is fundamental to modern physics.
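In order to find the shortest lines between any two given points, we treat
Eq. (1.62) as the action functional and the integrand as the Lagrangian
function of the system.² Therefore, we write

$$ L = \sqrt{g_{ij} \dot X^i \dot X^j} , \tag{1.63} $$

where we must remember that the metric is a function of the coordinates
$X^i$. Note that our Lagrangian does not depend on λ explicitly. The
Euler–Lagrange equations are given by

$$ \frac{\partial L}{\partial X^k} = \frac{d}{d\lambda} \frac{\partial L}{\partial \dot X^k} . \tag{1.64} $$

Since we work in n dimensions, we will find n equations, labelled by the
index k. The left-hand side of Eq. (1.64) is easily computed

$$ \frac{\partial L}{\partial X^k} = \frac{1}{2L}\, \partial_k g_{ij}\, \dot X^i \dot X^j . $$

We also find

$$ \frac{\partial L}{\partial \dot X^k} = \frac{1}{2L} \left( g_{kj} \dot X^j + g_{ik} \dot X^i \right) = \frac{1}{L}\, g_{ik} \dot X^i . $$

The prefactor is the result of applying the chain rule to the square root, and
the term in the brackets follows from the product rule. In order to find the
right-hand side of Eq. (1.64), we need to differentiate the latter with
respect to λ. Remembering that our parametrisation is affine we have L = 1
and hence

$$ \frac{d}{d\lambda} \frac{\partial L}{\partial \dot X^k} = g_{ik} \ddot X^i + \partial_j g_{ik}\, \dot X^i \dot X^j . $$

Consequently the complete Euler–Lagrange equations are given by

$$ g_{ik} \ddot X^i + \partial_j g_{ik}\, \dot X^i \dot X^j - \tfrac{1}{2}\, \partial_k g_{ij}\, \dot X^i \dot X^j = 0 . $$

Next, we will rewrite these equations by introducing a new symbol, called
the Christoffel symbol, which plays a paramount role in Differential
Geometry and General Relativity. Since $\dot X^i \dot X^j$ is symmetric in i
and j, we can symmetrise the second term and start with

$$ g_{ik} \ddot X^i + \tfrac{1}{2} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) \dot X^i \dot X^j = 0 , $$

and apply $g^{nk}$ to this equation and arrive at

$$ \ddot X^n + \Gamma^n_{ij}\, \dot X^i \dot X^j = 0 , \tag{1.70} $$

where the Christoffel symbol or connection is given by

$$ \Gamma^n_{ij} = \tfrac{1}{2}\, g^{nk} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) . \tag{1.71} $$

This object is symmetric in the lower pair of indices. Equation (1.70) is the
so-called geodesic equation. Sometimes the Christoffel symbol is denoted
differently, using curly brackets

$$ \begin{Bmatrix} n \\ ij \end{Bmatrix} = \tfrac{1}{2}\, g^{nk} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) , \tag{1.72} $$

which is useful if one needs to distinguish the Christoffel symbol from a
different connection $\Gamma^n_{ij}$. Note that older textbooks tend to use
the curly bracket notation. Let us summarise this result as a definition.
Definition 1.14 (Geodesics). A curve C given by $X^i(\tau)$ with affine
parameter τ is called a geodesic if it satisfies the equation

$$ \frac{d^2 X^n}{d\tau^2} + \Gamma^n_{ij}\, \frac{dX^i}{d\tau} \frac{dX^j}{d\tau} = 0 , $$

where the $\Gamma^n_{ij}$ are given by Eq. (1.71).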

Every curve C which satisfies this equation connects points via shortest
distances. As expected, this depends on the metric functions. In 3D Euclidean
space the metric looks like the identity matrix, in other words the line
element is $ds^2 = dx^2 + dy^2 + dz^2$. Hence, all partial derivatives of the
metric vanish identically and the geodesic equation reduces to
$\ddot X^i = 0$ which we can integrate twice. We find that straight lines are
the shortest lines in flat space, as one would expect. The geodesic equations
also contain the first clue towards a geometrical theory of gravity which we
will discuss in Sec. 2.2.
By definition, the Christoffel symbol is symmetric in its lower pair of
indices. Therefore, this pair gives rise to n(n + 1)/2 components in n
dimensions. There are no other symmetries, so the third index can take any
value and hence the Christoffel symbol has n2(n + 1)/2 independent
components in n dimensions. In dimensions 2, 3, 4 the Christoffel symbol has
6, 18, 40 components, respectively.
For a curve with affine parametrisation, the geodesic equations can also
be derived from considering the simpler Lagrangian

$$ L = g_{ij} \dot X^i \dot X^j , $$

this is Eq. (1.63) without the square root. The advantage of this formulation
is that it provides us with an efficient tool to compute the components of the
Christoffel symbol. We will work through some examples next which show
the two different ways of finding the Christoffel symbols and the geodesic
equations.

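Example 1.8 (Euclidean space with polar coordinates). We choose
coordinates X¹ = r and X² = φ, the line element of this space is given by
$ds^2 = dr^2 + r^2 d\varphi^2$, so that the metric and the inverse metric are

$$ g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} , \qquad g^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & 1/r^2 \end{pmatrix} . \tag{1.75} $$

Let us compute the Christoffel symbol components which are defined by Eq.
(1.71).

$$ \Gamma^1_{ij} = \tfrac{1}{2}\, g^{1k} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) . $$

Since the inverse metric is diagonal, the only contribution from $g^{1k}$
(remember we are summing over k) comes from k = 1, therefore

$$ \Gamma^1_{ij} = \tfrac{1}{2}\, g^{11} \left( \partial_i g_{j1} + \partial_j g_{i1} - \partial_1 g_{ij} \right) . $$

Also the metric is diagonal which simplifies the calculation. We find

$$ \Gamma^1_{11} = 0 , \qquad \Gamma^1_{12} = \Gamma^1_{21} = 0 , \qquad \Gamma^1_{22} = -\tfrac{1}{2}\, \partial_r (r^2) = -r , $$

which follows directly from the previous equation. Now we turn to the upper
index 2, where only k = 2 contributes

$$ \Gamma^2_{ij} = \tfrac{1}{2}\, g^{22} \left( \partial_i g_{j2} + \partial_j g_{i2} - \partial_2 g_{ij} \right) . $$

When i = j = 1 or i = j = 2, all terms vanish. However, if we choose i = 1 and
j = 2 we find

$$ \Gamma^2_{12} = \frac{1}{2 r^2}\, \partial_r (r^2) = \frac{1}{r} , \tag{1.80} $$

therefore, the other components are given by

$$ \Gamma^2_{11} = 0 , \qquad \Gamma^2_{21} = \Gamma^2_{12} = \frac{1}{r} , \qquad \Gamma^2_{22} = 0 . \tag{1.81} $$

We found all six components.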
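
The same components can be obtained numerically from the definition of the Christoffel symbol as one half of the inverse metric contracted with the usual combination of first derivatives of the metric. The Python sketch below is illustrative and not from the text: it assumes a diagonal metric (so that the inverse is just the reciprocal of the diagonal entries), uses central finite differences, and the names `christoffel`, `polar_metric` and `sphere_metric` are hypothetical.

```python
import math


def christoffel(metric, X, h=1e-5):
    """Gamma[n][i][j] = Gamma^n_{ij} at the point X for a DIAGONAL metric.

    metric(X) must return the components g_ij as a nested list.
    """
    n = len(X)
    g = metric(X)
    # dg[k][i][j] approximates the partial derivative of g_ij along X^k.
    dg = []
    for k in range(n):
        Xp, Xm = list(X), list(X)
        Xp[k] += h
        Xm[k] -= h
        gp, gm = metric(Xp), metric(Xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(n)]
                   for i in range(n)])
    Gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                # Diagonal inverse metric: only k = m survives the sum over k.
                Gamma[m][i][j] = 0.5 / g[m][m] * (
                    dg[i][j][m] + dg[j][i][m] - dg[m][i][j])
    return Gamma


def polar_metric(X):
    r, _phi = X
    return [[1.0, 0.0], [0.0, r**2]]


def sphere_metric(X, R=1.0):
    theta, _phi = X
    return [[R**2, 0.0], [0.0, (R * math.sin(theta))**2]]


G_polar = christoffel(polar_metric, [2.0, 0.3])

if __name__ == "__main__":
    print(G_polar[0][1][1])  # close to -r = -2
    print(G_polar[1][0][1])  # close to 1/r = 0.5
```

At r = 2 the two non-vanishing values are close to −r = −2 and 1/r = 0.5, in agreement with the example. For a non-diagonal metric the factor `1 / g[m][m]` would have to be replaced by a full matrix inverse.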

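Example 1.9 (Euclidean space with polar coordinates again). Now we
show the second method of finding the Christoffel symbol components. We
start with $L = g_{ij} \dot X^i \dot X^j$ which for polar coordinates and
metric Eq. (1.75) becomes

$$ L = \dot r^2 + r^2 \dot\varphi^2 . $$

The two Euler–Lagrange equations of this Lagrangian are given by

$$ 2 r \dot\varphi^2 = \frac{d}{d\lambda} \left( 2 \dot r \right) , \qquad 0 = \frac{d}{d\lambda} \left( 2 r^2 \dot\varphi \right) . $$

By applying the product rule to the right-hand sides, we arrive at these two
equations

$$ \ddot r - r \dot\varphi^2 = 0 , \qquad \ddot\varphi + \frac{2}{r}\, \dot r \dot\varphi = 0 . $$

By virtue of the geodesic equations, we can directly read off the Christoffel
symbol components. From the $\ddot r$ equation we find

$$ \Gamma^r_{rr} = 0 , \qquad \Gamma^r_{r\varphi} = 0 , \qquad \Gamma^r_{\varphi\varphi} = -r , $$

while the $\ddot\varphi$ equation results in

$$ \Gamma^\varphi_{rr} = 0 , \qquad \Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = \frac{1}{r} , \qquad \Gamma^\varphi_{\varphi\varphi} = 0 . $$

This is in agreement with the previous example. However, this approach
tends to be more efficient for computing these components.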

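Example 1.10. Show that the geodesic equations based on the line element
$ds^2 = dr^2 + r^2 d\varphi^2$ correspond to straight lines. Following on from
the previous example, the geodesic equations are given by

$$ \ddot r - r \dot\varphi^2 = 0 , \tag{1.89} $$

$$ \ddot\varphi + \frac{2}{r}\, \dot r \dot\varphi = 0 . $$

Consequently, we can integrate the second equation and find
$2 r^2 \dot\varphi = 2\ell$ where ℓ is some constant of integration. This
means that $r^2 \dot\varphi$ is constant along geodesic curves, or more
physically speaking, angular momentum is conserved. We can now replace the
$\dot\varphi$ term in Eq. (1.89) and find

$$ \ddot r - \frac{\ell^2}{r^3} = 0 . $$

One can solve this equation as follows. First, we multiply by $\dot r$ and
integrate which yields

$$ \dot r^2 + \frac{\ell^2}{r^2} = C_1 , $$

where C1 is a constant of integration. Next, separation of variables and
integration leads to

$$ r^2 = \frac{\ell^2}{C_1} + C_1 \left( \lambda + C_2 \right)^2 , \tag{1.92} $$

where C2 is another constant of integration. Lastly, we can find φ(λ) by
integrating $\dot\varphi = \ell / r^2$ which gives

$$ \varphi = \arctan\!\left( \frac{C_1 (\lambda + C_2)}{\ell} \right) + C_3 , \tag{1.93} $$

where C3 is yet another constant of integration. In total we have four
constants of integration. This is consistent with having two second-order
equations to solve. It remains to show that the system of Eqs. (1.92) and
(1.93) are indeed straight lines in polar coordinates. For this we note that

$$ \cos^2(\varphi - C_3) = \frac{1}{1 + \tan^2(\varphi - C_3)} = \frac{\ell^2}{\ell^2 + C_1^2 (\lambda + C_2)^2} = \frac{\ell^2}{C_1 r^2} . $$

Therefore, we can write

$$ r \cos(\varphi - C_3) = \frac{|\ell|}{\sqrt{C_1}} , $$

which looks like the equation of a straight line y = kx + m by choosing x, y
and the constants appropriately.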
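
The claim can also be checked numerically by integrating the polar geodesic equations and converting the result back to Cartesian coordinates. The following Python sketch is not the book's method: it uses a hand-written fourth-order Runge–Kutta integrator, assumes the geodesic equations in the form r̈ = r φ̇² and φ̈ = −(2/r) ṙ φ̇, and all function names, initial data and step counts are illustrative.

```python
import math


def geodesic_rhs(state):
    """Polar geodesic equations written as a first-order system."""
    r, phi, r_dot, phi_dot = state
    return [r_dot,
            phi_dot,
            r * phi_dot**2,               # r'' = r phi'^2
            -2.0 * r_dot * phi_dot / r]   # phi'' = -(2/r) r' phi'


def rk4_step(f, y, h):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(state, lam_end, steps=2000):
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state


# Start at (x, y) = (1, 0); at phi = 0 the Cartesian velocity is
# (r_dot, r * phi_dot) = (0.3, 0.8).
start = [1.0, 0.0, 0.3, 0.8]
r, phi, r_dot, phi_dot = integrate(start, 2.0)
x, y = r * math.cos(phi), r * math.sin(phi)
ell = r**2 * phi_dot  # conserved angular momentum

if __name__ == "__main__":
    print(x, y)   # a straight line would give (1 + 0.3*2, 0.8*2)
    print(ell)
```

A straight line starting at (1, 0) with velocity (0.3, 0.8) reaches (1.6, 1.6) at λ = 2, which is what the integration returns up to numerical error, and r²φ̇ keeps its initial value 0.8.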

This calculation was rather painful and would have been rather trivial if
we worked in appropriate coordinates. There are two things we should take
away from this example: First, whenever we are interested in performing
explicit calculations we should ensure that our choice of coordinates is as
smart as possible. Second, there are many long calculations in differential
geometry which cannot be avoided, irrespective of coordinates, so it is a good
idea to practise them even if this means calculating straight lines in a rather
unpleasant way.
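In cases when we need to compute the trace of the Christoffel symbol
there exists a useful and simple formula based on the determinant of the
metric tensor $g = \det(g_{ij})$. We start recalling the well-known formula

$$ \frac{1}{\det A}\, \partial (\det A) = \operatorname{tr}\!\left( A^{-1} \partial A \right) , $$

which we now rewrite in index notation applied to the metric tensor

$$ \frac{1}{g}\, \partial_i g = g^{nk}\, \partial_i g_{nk} . $$

Going back to Eq. (1.71) and summing over n and j we find

$$ \Gamma^n_{in} = \tfrac{1}{2}\, g^{nk}\, \partial_i g_{nk} = \frac{1}{2g}\, \partial_i g = \partial_i \log \sqrt{g} . \tag{1.100} $$

Example 1.11 (Euclidean space with polar coordinates). We already
computed the Christoffel symbol components, see Eqs. (1.80) and (1.81).
Consequently,

$$ \Gamma^n_{rn} = \Gamma^r_{rr} + \Gamma^\varphi_{r\varphi} = \frac{1}{r} , \qquad \Gamma^n_{\varphi n} = \Gamma^r_{\varphi r} + \Gamma^\varphi_{\varphi\varphi} = 0 . $$

Now let us check this result using Eq. (1.100). The determinant of the metric
is g = r² and hence $\sqrt{g} = r$. Therefore, ∂φ(log r) = 0, and also
∂r(log r) = 1/r, both as expected.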

1.2.6. Covariant derivative

Since the Christoffel symbol depends on the first partial derivatives of the
metric tensor, it will not transform like a tensor, recall Eq. (1.38).
However, this might be quite useful: Can we combine the partial derivative of
a vector $A^i$ with the Christoffel symbol so that this new object transforms
correctly under coordinate transformations?
To answer this question, let us find the transformation properties of
$\Gamma^n_{ij} A^j$. A direct calculation shows that

$$ \Gamma'^n_{ij} = \frac{\partial X'^n}{\partial X^a} \frac{\partial X^b}{\partial X'^i} \frac{\partial X^c}{\partial X'^j}\, \Gamma^a_{bc} - \frac{\partial X^b}{\partial X'^i} \frac{\partial X^c}{\partial X'^j} \frac{\partial^2 X'^n}{\partial X^b \partial X^c} , $$

from which we can compute the transformation properties of
$\Gamma^n_{ij} A^j$, see Exercise 1.9. One observes that the inhomogeneous
terms, those that do not transform correctly, are the same as those of
$\partial_i A^n$ with a different sign. Therefore, we can use the partial
derivative and the Christoffel symbol to create a new derivative operator
which would transform like a tensor under coordinate transformation. Before
defining the so-called covariant derivative, we introduce this more formally
and then show that we arrive at the same conclusions despite following a
rather different route.

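Definition 1.15 (Covariant derivative). A covariant derivative (sometimes
derivative operator) $\nabla_a$ on a manifold ℳ is a mapping which takes a
type (p, q) tensor to a tensor of type (p, q + 1) with the following
properties:

(i) For any smooth function f the covariant derivative coincides with the
partial derivative

$$ \nabla_a f = \partial_a f . $$

(ii) The derivative is linear, this means for all α, β ∈ ℝ

$$ \nabla_a \left( \alpha A^{\dots}{}_{\dots} + \beta B^{\dots}{}_{\dots} \right) = \alpha \nabla_a A^{\dots}{}_{\dots} + \beta \nabla_a B^{\dots}{}_{\dots} , $$

where the dots indicate tensors of arbitrary rank of the same type.
(iii) The derivative satisfies the Leibniz rule (or product rule)

$$ \nabla_a \left( A^{\dots}{}_{\dots} B^{\dots}{}_{\dots} \right) = \left( \nabla_a A^{\dots}{}_{\dots} \right) B^{\dots}{}_{\dots} + A^{\dots}{}_{\dots} \left( \nabla_a B^{\dots}{}_{\dots} \right) . $$

(iv) The derivative commutes with contraction

$$ \nabla_a \left( A^{\dots k \dots}{}_{\dots k \dots} \right) = \left( \nabla_a A \right)^{\dots k \dots}{}_{\dots k \dots} . $$

Let us briefly discuss the implications of the fourth property. For the
derivative to commute with the contraction is equivalent to the requirement
$\nabla_a \delta^i_j = 0$. Recalling that $\delta^i_j = g^{ik} g_{kj}$ means
that

$$ \nabla_a \left( g^{ik} g_{kj} \right) = g_{kj} \nabla_a g^{ik} + g^{ik} \nabla_a g_{kj} = 0 , $$

or equivalently $\nabla_a g^{ki} = -g^{ij} g^{km} \nabla_a g_{jm}$. The
interesting point is that there are two possibilities of satisfying this
relationship. The simplest solution is to require that the covariant
derivative of the metric tensor vanishes. This leads us to the next
definition.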

Definition 1.16 (Metricity and non-metricity). A covariant derivative is
called metric compatible or simply metric if it satisfies
$\nabla_a g_{ij} = 0$ and non-metric otherwise. The object of non-metricity is
defined by $\nabla_a g_{ij} = Q_{aij}$, it is a rank-3 tensor symmetric in the
last two indices.

Next, let us have a closer look at the first property of the covariant
derivative which states that ∇a f = ∂af. Let us take this relation and
differentiate again covariantly. This gives rise to the object ∇b∇a f = ∇b∂a f.
Had we started with the indices the other way round, we would have arrived
at ∇a∇b f = ∇a∂b f and there is no a priori3 reason why these two should be
the same. Hence, covariant derivatives of scalars and tensors in general do
not commute which motivates the next definition.

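Definition 1.17 (Torsion). A covariant derivative is called torsion-free if it
satisfies

$$ \nabla_a \nabla_b f = \nabla_b \nabla_a f $$

for all smooth functions f. It contains torsion if
$\nabla_a \nabla_b f \neq \nabla_b \nabla_a f$. The torsion tensor
$T^c{}_{ab}$ is defined by

$$ \nabla_a \nabla_b f - \nabla_b \nabla_a f = -T^c{}_{ab}\, \partial_c f . $$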

Our next task is to construct this covariant derivative explicitly. We know
how it acts on scalars and we also know that in flat space it must correspond
to partial differentiation. Let us then consider the object $\nabla_i A^n$.
Recall that partial derivatives of vectors do not transform like tensors, so
$\nabla_i A^n$ and $\partial_i A^n$ should differ by some quantity which also
does not transform like a tensor under coordinate transformations. This
difference should only depend on $A^n$ and the geometry of the space, hence we
wish to write

$$ \nabla_i A^n = \partial_i A^n + \Gamma^n_{ij} A^j , $$

and need to determine the coefficients $\Gamma^n_{ij}$. These are called the
connection coefficients or simply the connection.
First, we note that $\nabla_i (A^n B_n) = \partial_i (A^n B_n)$ since
$A^n B_n$ is a scalar. Using the product rule
$(\nabla_i A^n) B_n + A^n (\nabla_i B_n) = (\partial_i A^n) B_n + A^n (\partial_i B_n)$
we can find an expression for $\nabla_i B_n$ which is given by

$$ \nabla_i B_n = \partial_i B_n - \Gamma^j_{in} B_j . $$

Furthermore, again using the product rule, we can deduce that the covariant
derivative acts on each and every index separately.
One of the main results of Riemannian Geometry in the context of
General Relativity is summarised in the following theorem, the proof of
which is important to understand and based on a simple but powerful idea.

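Theorem 1.1 (Uniqueness of the covariant derivative). Let $\nabla_a$ be the
covariant derivative on some manifold ℳ. If the covariant derivative is
metric compatible and torsion-free, then the connection coefficients are
uniquely given by the Christoffel symbol,
$\Gamma^a_{bc} = \begin{Bmatrix} a \\ bc \end{Bmatrix}$.

Proof. Let us begin with $\nabla_a \nabla_b f$ and $\nabla_b \nabla_a f$ which
written out explicitly give

$$ \nabla_a \nabla_b f = \partial_a \partial_b f - \Gamma^c_{ab}\, \partial_c f , \qquad \nabla_b \nabla_a f = \partial_b \partial_a f - \Gamma^c_{ba}\, \partial_c f . $$

One of our assumptions is that the covariant derivative is torsion-free which
means that we assume $\nabla_a \nabla_b f = \nabla_b \nabla_a f$. Since
partial derivatives also commute, we must conclude that
$\Gamma^c_{ab} = \Gamma^c_{ba}$ which means the coefficients are symmetric in
the lower pair of indices. Next, we use that the covariant derivative is
metric compatible, this means $\nabla_a g_{bc} = 0$. We write out this
identity three times with permuted indices

$$ \nabla_a g_{bc} = \partial_a g_{bc} - \Gamma^d_{ab}\, g_{dc} - \Gamma^d_{ac}\, g_{bd} = 0 , \tag{1.114} $$

$$ \nabla_b g_{ca} = \partial_b g_{ca} - \Gamma^d_{bc}\, g_{da} - \Gamma^d_{ba}\, g_{cd} = 0 , \tag{1.115} $$

$$ \nabla_c g_{ab} = \partial_c g_{ab} - \Gamma^d_{ca}\, g_{db} - \Gamma^d_{cb}\, g_{ad} = 0 . \tag{1.116} $$

The idea of the next step is to solve those three equations for the unknown
coefficients. Let us now calculate (1.114) + (1.115) − (1.116) which gives

$$ \partial_a g_{bc} + \partial_b g_{ca} - \partial_c g_{ab} - 2\, \Gamma^d_{ab}\, g_{dc} = 0 , $$

where we took into account that the metric is symmetric and that the
coefficients are symmetric. We apply $g^{nc}$ to this equation and isolate the
connection. This gives

$$ \Gamma^n_{ab} = \tfrac{1}{2}\, g^{nc} \left( \partial_a g_{bc} + \partial_b g_{ac} - \partial_c g_{ab} \right) , $$

which is indeed the desired result as the right-hand side matches Eq. (1.71)
with different index names.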

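Therefore, we have a unique covariant derivative and its connection
coefficients are given by the Christoffel symbol. It is useful to collect some
important formulae when the covariant derivative acts on rank-1 and rank-2
tensors, these will be required regularly

$$ \nabla_a V^b = \partial_a V^b + \Gamma^b_{ac} V^c , \qquad \nabla_a V_b = \partial_a V_b - \Gamma^c_{ab} V_c , $$

$$ \nabla_a T^{bc} = \partial_a T^{bc} + \Gamma^b_{ad} T^{dc} + \Gamma^c_{ad} T^{bd} , $$

$$ \nabla_a T_{bc} = \partial_a T_{bc} - \Gamma^d_{ab} T_{dc} - \Gamma^d_{ac} T_{bd} , $$

$$ \nabla_a T^b{}_c = \partial_a T^b{}_c + \Gamma^b_{ad} T^d{}_c - \Gamma^d_{ac} T^b{}_d . $$

One can easily state the general formula for computing the covariant
derivative of a rank-n tensor, however, we will not need this for what follows
and hence omit this cumbersome equation. Before proceeding, we work
through an example.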

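Example 1.12 (2D Laplacian in polar coordinates). Let us compute the
quantity $g^{ij} \nabla_i \nabla_j f$ where f is a smooth function. We have

$$ g^{ij} \nabla_i \nabla_j f = g^{ij} \left( \partial_i \partial_j f - \Gamma^k_{ij}\, \partial_k f \right) . $$

When working with Cartesian coordinates (X¹, X²) = (x, y), all Christoffel
symbol components are zero and we would find

$$ g^{ij} \nabla_i \nabla_j f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} . $$

In polar coordinates (X¹, X²) = (r, φ), on the other hand, we have

$$ g^{ij} \nabla_i \nabla_j f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \varphi^2} , $$

which is the Laplacian ∆f in polar coordinates.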
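
A quick numerical comparison makes this concrete. The Python sketch below is illustrative only: it assumes the polar form ∆f = ∂²f/∂r² + (1/r) ∂f/∂r + (1/r²) ∂²f/∂φ², evaluates both this and the Cartesian Laplacian by central differences, and uses the hypothetical test function f = x²y + y³ = r³ sin φ, whose Laplacian is 8y.

```python
import math


def laplacian_polar(f, r, phi, h=1e-4):
    """f_rr + f_r / r + f_phiphi / r^2 by central differences."""
    f_r = (f(r + h, phi) - f(r - h, phi)) / (2.0 * h)
    f_rr = (f(r + h, phi) - 2.0 * f(r, phi) + f(r - h, phi)) / h**2
    f_pp = (f(r, phi + h) - 2.0 * f(r, phi) + f(r, phi - h)) / h**2
    return f_rr + f_r / r + f_pp / r**2


def laplacian_cartesian(F, x, y, h=1e-4):
    """F_xx + F_yy by central differences."""
    F_xx = (F(x + h, y) - 2.0 * F(x, y) + F(x - h, y)) / h**2
    F_yy = (F(x, y + h) - 2.0 * F(x, y) + F(x, y - h)) / h**2
    return F_xx + F_yy


def F(x, y):
    """Test function in Cartesian coordinates; its Laplacian is 8*y."""
    return x**2 * y + y**3


def f(r, phi):
    """The same function expressed in polar coordinates, r^3 sin(phi)."""
    return F(r * math.cos(phi), r * math.sin(phi))


r0, phi0 = 1.3, 0.6  # arbitrary sample point
x0, y0 = r0 * math.cos(phi0), r0 * math.sin(phi0)
lap_polar = laplacian_polar(f, r0, phi0)
lap_cart = laplacian_cartesian(F, x0, y0)

if __name__ == "__main__":
    print(lap_polar, lap_cart, 8.0 * y0)
```

Both evaluations agree with 8y at the sample point to within the finite-difference error, as expected for a scalar quantity.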

1.2.7. Parallel transport and geodesics


Definition 1.18 (Parallel transport). Let C be a curve with tangent vector
$T^a$, and let $V^a$ be a vector. We say that $V^a$ is parallelly transported
along the curve if

$$ T^a \nabla_a V^b = 0 \tag{1.127} $$

at every point on the curve. The vector $V^a$ can in principle be replaced by
an arbitrary rank tensor.

The idea behind this definition is to transport the vector $V^a$ along a curve
while changing its orientation as little as possible. Recalling the chain rule
$T^a \partial_a V^b = \dot X^a \partial_a V^b = dV^b/d\lambda$, we can
alternatively write Eq. (1.127) in the form

$$ \frac{dV^b}{d\lambda} + \Gamma^b_{ac}\, \dot X^a V^c = 0 , $$

which is a system of first-order ordinary differential equations in the vector
components $V^i$.

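Example 1.13 (Parallel transport). Let us consider 2D Euclidean space with
polar coordinates (r, φ) and the curve $X^a = (\lambda, k)$ with k being
constant. The tangent vector to this curve is given by $T^a = (1, 0)$, so that
a vector $V^i$ parallelly transported along this curve has to satisfy the two
equations

$$ \frac{dV^1}{d\lambda} = 0 , \qquad \frac{dV^2}{d\lambda} + \frac{1}{\lambda}\, V^2 = 0 . $$

Therefore, V¹ = α and V² = β/λ where α and β are constants which are fixed
by the initial value of the vector $V^i$. Since r(λ) = λ, we can also write
$V^i = (\alpha, \beta/r)$ and check directly whether Eq. (1.127) is indeed
satisfied. We have

$$ T^a \nabla_a V^1 = \partial_r \alpha + \Gamma^1_{1c} V^c = 0 , \qquad T^a \nabla_a V^2 = \partial_r\!\left( \frac{\beta}{r} \right) + \Gamma^2_{12} V^2 = -\frac{\beta}{r^2} + \frac{1}{r} \frac{\beta}{r} = 0 . $$

Hence, $V^i$ is indeed the vector parallelly transported along the curve.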
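
The parallel transport equations can also be integrated numerically, which is useful for curves where no closed-form solution is at hand. The Python sketch below is not from the text. It assumes the transport equation in the form dV^b/dλ = −Γ^b_{ac} Ẋ^a V^c, uses the polar Christoffel symbols Γ^r_{φφ} = −r and Γ^φ_{rφ} = Γ^φ_{φr} = 1/r, and the function names, the Runge–Kutta integrator and the sample values are illustrative.

```python
import math


def christoffel_polar(r):
    """Polar Christoffel symbols stored as G[b][a][c] = Gamma^b_{ac}."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r        # Gamma^r_{phi phi}
    G[1][0][1] = 1.0 / r   # Gamma^phi_{r phi}
    G[1][1][0] = 1.0 / r   # Gamma^phi_{phi r}
    return G


def transport(curve, tangent, V, lam0, lam1, steps=2000):
    """RK4 integration of dV^b/dlam = -Gamma^b_{ac} Xdot^a V^c along a curve."""
    def rhs(lam, V):
        r, _phi = curve(lam)
        G = christoffel_polar(r)
        T = tangent(lam)
        return [-sum(G[b][a][c] * T[a] * V[c]
                     for a in range(2) for c in range(2)) for b in range(2)]

    h = (lam1 - lam0) / steps
    lam = lam0
    for _ in range(steps):
        k1 = rhs(lam, V)
        k2 = rhs(lam + h / 2, [v + h / 2 * k for v, k in zip(V, k1)])
        k3 = rhs(lam + h / 2, [v + h / 2 * k for v, k in zip(V, k2)])
        k4 = rhs(lam + h, [v + h * k for v, k in zip(V, k3)])
        V = [v + h / 6 * (a + 2 * b + 2 * c + d)
             for v, a, b, c, d in zip(V, k1, k2, k3, k4)]
        lam += h
    return V


# Radial curve X^a = (lam, k) with k = 0.5, transported from lam = 1 to 3.
alpha, beta = 0.7, 1.2
V_radial = transport(lambda lam: (lam, 0.5), lambda lam: (1.0, 0.0),
                     [alpha, beta / 1.0], 1.0, 3.0)

# Circle of radius 2, X^a = (2, lam), transported over a quarter turn.
V_circle = transport(lambda lam: (2.0, lam), lambda lam: (0.0, 1.0),
                     [1.0, 0.0], 0.0, math.pi / 2)

if __name__ == "__main__":
    print(V_radial)   # close to (alpha, beta / 3)
    print(V_circle)   # close to (0, -1/2)
```

Along the radial curve the result reproduces V = (α, β/r). Along the circle the polar components change, but converting back with V^x = V^r cos φ − r V^φ sin φ and V^y = V^r sin φ + r V^φ cos φ shows that the Cartesian components stay fixed at (1, 0), as they must in flat space.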

Instead of transporting an arbitrary vector along a curve, we can also
transport the tangent vector itself along the curve which defines it.
Intuitively speaking this corresponds to the straightest possible lines,
sometimes called autoparallels, which are not necessarily identical to our
shortest possible lines, the geodesics mentioned in Sec. 1.2.5. It is an
interesting fact that in general shortest lines and straightest lines differ.
However, in spaces where the connection is metric and torsion-free they are
the same.

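Theorem 1.2. Let $\nabla_a$ be a metric and torsion-free covariant derivative,
then a curve with tangent vector $T^a$ satisfying the equation

$$ T^a \nabla_a T^b = 0 $$

is a geodesic.

Proof. We assume that our curve is parameterised by an affine parameter λ so
that $X^i = X^i(\lambda)$ and $T^i = \dot X^i$. This gives

$$ T^a \nabla_a T^b = \dot X^a\, \partial_a \dot X^b + \Gamma^b_{ac}\, \dot X^a \dot X^c = 0 . $$

The chain rule allows us to write

$$ \dot X^a\, \partial_a \dot X^b = \frac{d \dot X^b}{d\lambda} = \ddot X^b , $$

and we arrive at

$$ \ddot X^b + \Gamma^b_{ac}\, \dot X^a \dot X^c = 0 , $$

which is identical to the geodesic equation (1.70).

Sometimes this fact is used to define geodesics using the concept of
parallel transport.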

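Example 1.14 (Geodesics on the 2-sphere). The line element of the 2-sphere
was given by $ds^2 = R^2 (d\theta^2 + \sin^2\theta\, d\phi^2)$, we use
spherical polar coordinates $X^i = (\theta, \phi)$. We note that the azimuth
ϕ ∈ [0, 2π) corresponds to the longitude. The geographic longitude is usually
in the interval [−π, π) and the sign denotes the direction, West or East.
Likewise the inclination is θ ∈ [0, π], and [−π/2, π/2] for the geographic
latitude.
The Lagrangian is given by
$L = R^2 (\dot\theta^2 + \sin^2\theta\, \dot\phi^2)$ so that we arrive at the
following equations

$$ 2 R^2 \sin\theta \cos\theta\, \dot\phi^2 = \frac{d}{d\lambda} \left( 2 R^2 \dot\theta \right) , \tag{1.136} $$

$$ 0 = \frac{d}{d\lambda} \left( 2 R^2 \sin^2\theta\, \dot\phi \right) , \tag{1.137} $$

which we rewrite slightly so that they take this form

$$ \ddot\theta - \sin\theta \cos\theta\, \dot\phi^2 = 0 , \qquad \ddot\phi + 2 \cot\theta\, \dot\theta \dot\phi = 0 . $$

One can now read off the Christoffel symbol components

$$ \Gamma^\theta_{\theta\theta} = 0 , \qquad \Gamma^\theta_{\theta\phi} = 0 , \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta \cos\theta , $$

and likewise for the ϕ-equation

$$ \Gamma^\phi_{\theta\theta} = 0 , \qquad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta , \qquad \Gamma^\phi_{\phi\phi} = 0 . $$

We will not solve these equations but note that solving them is similar to
Example 1.10. Equation (1.137) yields conservation of angular momentum
and can be used to eliminate $\dot\phi$ from Eq. (1.136). One arrives at great
circles.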
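
Although the equations are not solved analytically here, the great-circle statement can be tested numerically. The following Python sketch is an illustration under stated assumptions: it takes the geodesic equations on the unit sphere in the form θ̈ = sin θ cos θ ϕ̇² and ϕ̈ = −2 cot θ θ̇ ϕ̇, integrates them with a hand-written Runge–Kutta scheme, and checks that the embedded curve stays in the plane through the origin spanned by the initial position and velocity. All names and initial data are hypothetical.

```python
import math


def rhs(state):
    """Geodesic equations on the unit 2-sphere as a first-order system."""
    theta, phi, theta_dot, phi_dot = state
    return [theta_dot,
            phi_dot,
            math.sin(theta) * math.cos(theta) * phi_dot**2,
            -2.0 * math.cos(theta) / math.sin(theta) * theta_dot * phi_dot]


def rk4_step(y, h):
    """One classical Runge-Kutta step for y' = rhs(y)."""
    k1 = rhs(y)
    k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def embed(theta, phi):
    """Point on the unit sphere in Cartesian coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def max_plane_deviation(state, normal, lam_end=5.0, steps=5000):
    """Largest |normal . x| along the integrated curve."""
    h = lam_end / steps
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        x = embed(state[0], state[1])
        worst = max(worst, abs(sum(n * xi for n, xi in zip(normal, x))))
    return worst


# Start on the equator at x0 = (1, 0, 0) with theta_dot = -0.5, phi_dot = 0.5,
# which corresponds to the Cartesian velocity v0 = (0, 0.5, 0.5).
start = [math.pi / 2, 0.0, -0.5, 0.5]
normal = (0.0, -0.5, 0.5)  # cross product x0 x v0
deviation = max_plane_deviation(start, normal)

if __name__ == "__main__":
    print(deviation)
```

The deviation from the plane stays at the level of the integration error, which is consistent with the curve being a great circle. The initial data are chosen so that the curve stays away from the poles, where cot θ is singular in these coordinates.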

Since shortest and straightest lines are not necessarily the same, this also
allows us to introduce the concept of the connection fully independent of the
metric via parallel transport. One could state that the vector $V^i$ changes
according to $\delta V^i = -\Gamma^i_{ab}\, V^b\, dX^a$ when parallelly
displaced from $X^a$ to $X^a + dX^a$. A connection introduced in this way is
often called an affine connection. In n dimensions it has n³ independent
components. On the other hand, we can use the metric to define the shortest
distance between two points which gives us geodesics.

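Definition 1.19 (Geodesics and autoparallels). Let ℳ be a manifold with
metric $g_{ij}$ and independent connection $\Gamma^a_{bc}$. A curve $X^b$ is
called a geodesic if it satisfies

$$ \ddot X^a + \begin{Bmatrix} a \\ bc \end{Bmatrix} \dot X^b \dot X^c = 0 . $$

A curve $Y^b$ is called an autoparallel if it satisfies

$$ \ddot Y^a + \Gamma^a_{bc}\, \dot Y^b \dot Y^c = 0 . \tag{1.143} $$

The object in the curly brackets stands for the Christoffel symbol used in
(1.72).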
Exercise 1.14 shows that the connection in a space with torsion, is given
by

Since $\dot Y^c \dot Y^d$ is symmetric, $\Gamma^b_{cd}\, \dot Y^c \dot Y^d$
depends only on the symmetric part of the connection. It is tempting to think
that torsion will therefore not affect Eq. (1.143); however, this is not
correct. Let us compute the symmetric part of the connection

At first this is counter-intuitive as the skew-symmetric part of the connection
is the torsion tensor, however, the symmetric part of the connection also
depends on the torsion tensor and hence in spaces with torsion geodesics and
autoparallels are different. In other words, shortest lines are different to
straightest lines.

1.3. Curvature

We are now within touching distance of curvature. Let us consider a small
(infinitesimal) parallelogram defined by two vectors at some point p. Now,
we take a third vector and parallelly transport this vector around this
parallelogram. In a flat space, transporting the vector around the
parallelogram will not change the vector. In a curved space, however, the
change of this vector is determined by a rank-4 tensor, the so-called Riemann
curvature tensor, which is built from the Christoffel symbols and their
derivatives. This section is entirely about this object.
Let us start with an example, the surface of a sphere, or for simplicity
imagine the Earth. Start at the equator where it intersects the Greenwich
Meridian (0° longitude) and, looking North, walk to the North pole, pretending
to walk over the water. Keep your head fixed. At the North pole, walk down
to the equator along the W90° meridian (walk sideways as we want to keep our
orientation fixed). Then walk back (this time backwards) along the equator
to where we started. We should now be looking westwards, so we somehow
managed to turn our body by 90° despite keeping our relative orientation
fixed.
If we were to use some tape and a ball to trace our imaginary route out,
we would also notice that we walked along a triangle with three right angles,
this means 270°. This sounds like a contradiction, however, it all fits together
very well. In Euclidean space the sum of the three angles in a triangle is 180°,
in spaces with negative curvature this sum is less than 180°, and in spaces
with positive curvature the sum of the angles is more than 180°. The surface
of a sphere is a space of constant positive curvature.
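
The walk described above can be reproduced with a few lines of code. This is only a sketch under a stated assumption: along a great-circle arc of the unit sphere, parallel transport of a tangent vector agrees with the rigid rotation of three-dimensional space that carries the start of the arc to its end. The helper names and the use of Rodrigues' rotation formula are illustrative and not taken from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transport(p, q, v):
    """Carry the tangent vector v from p to q along the shorter great-circle arc.

    p and q are unit vectors that are neither equal nor antipodal. The vector
    is rotated about the axis p x q by the angle between p and q
    (Rodrigues' rotation formula).
    """
    axis = cross(p, q)
    sine = math.sqrt(dot(axis, axis))
    angle = math.atan2(sine, dot(p, q))
    k = tuple(x / sine for x in axis)
    k_cross_v = cross(k, v)
    k_dot_v = dot(k, v)
    c, s = math.cos(angle), math.sin(angle)
    return tuple(v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c)
                 for i in range(3))

# Corners of the triangle: equator at 0 degrees longitude, North pole,
# equator at 90 degrees West.
start = (1.0, 0.0, 0.0)
pole = (0.0, 0.0, 1.0)
west90 = (0.0, -1.0, 0.0)

looking = (0.0, 0.0, 1.0)          # initially looking North
final = transport(start, pole, looking)
final = transport(pole, west90, final)
final = transport(west90, start, final)

turn = math.degrees(math.acos(max(-1.0, min(1.0, dot(looking, final)))))
print(final, turn)
```

At the starting point the direction East is (0, 1, 0), so a final vector close to (0, -1, 0) points West, and the angle between the initial and the final direction comes out as 90°.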

1.3.1. Infinitesimal parallelogram

We begin with a vector $V^a$ at a point p and coordinates $X^i$. We denote by
$\xi^i$ and $\zeta^i$ two infinitesimal quantities which we can view as the two
directions which define a parallelogram, starting from the point $p(X^i)$. We
also denote $p' = p(X^i + \xi^i)$ and $p'' = p(X^i + \zeta^i)$ and finally
$q = p(X^i + \xi^i + \zeta^i)$, see Fig. 1.3. The following calculation is quite
tricky and it is easy to get lost, so it is advisable to spend some time on
this.
The equation of parallel transport of Va along ξi can be written as

where $V^b$ is the vector at p. However, for $V^b(X^i + \xi^i)$ we also have

\[ V^b(X^i + \xi^i) = V^b(X^i) + \xi^i \partial_i V^b(X^i) , \]

Fig. 1.3 Vector Va parallelly transported along two different paths in an infinitesimal parallelogram.

where we used a Taylor series expansion and neglected terms of higher than
linear order because we assumed ξi to be infinitesimal. Combining both
equations we can now compute the difference of the vector at the two points
p and p′. We define and find

Next, the new vector Eq. (1.147) is parallelly transported along the other
way ζi towards q. This gives

We have to be careful here as the Christoffel symbol is evaluated at the same
point as the vector, this means at p′. Using again the Taylor series, we have

where we used Eq. (1.146) to re-write the first partial derivative term. We
solve for the term with the second derivatives and substitute into Eq. (1.149).
This yields

We now perform another Taylor series expansion on the terms evaluated at p′.
We note that the lowest-order terms cancel on the right-hand side and we
arrive at

As everything is now evaluated at the same point p, we can drop this label
and simply write

We could have parallelly transported along ζi first and secondly along the ξi
direction. This would have resulted in Eq. (1.153) with ξi and ζi interchanged.
We are interested in the difference between those two vectors parallelly
transported along the two parts of the parallelogram

This quantity is given by

The term in the brackets is the so-called Riemann curvature tensor, a rank-4
tensor. However, before discussing this tensor in detail in the next section, let
us make a few observations about Eq. (1.155).
The change of the vector Vb when parallelly transported around the
parallelogram depends on the derivatives of the Christoffel symbols and
terms quadratic in the Christoffel symbols, there are no linear terms. It is not
clear at this point that the term in the bracket is really a tensor, it does contain
combinations of partial derivatives and Christoffel symbols similar to those
in the covariant derivative, however, this needs to be checked. Moreover,
there are some obvious symmetries, interchanging the indices d and a will
change the sign of the expression in the bracket.
This is also a good point to ask the following question: Begin with two
vectors ξi and ζi and parallelly transport ξi along ζi and then ζi along ξi. Does
this process result in a parallelogram? The answer is a somewhat surprising
no, in general. Starting with Eq. (1.148) we would find

Hence the difference between those two is

The right-hand side is zero if and only if the connection is symmetric in the
pair of lower indices. This relates to Definition 1.17 of the torsion tensor and
the interpretation that the torsion tensor measures the failure of a
parallelogram to close. For a given general connection the torsion tensor is
defined to be the skew-symmetric part of the connection

This relation follows from Eq. (1.109) by writing out the covariant
derivatives using the Christoffel symbol. Despite the connection not
transforming like a tensor, we should note that torsion is always a tensor
since the inhomogeneous parts in the transformation will cancel, similar to
Eq. (1.39).

1.3.2. Riemann, Ricci and Weyl tensors

We already mentioned that the term in the bracket of Eq. (1.155) looks like a
covariant derivative acting on a Christoffel symbol. We will now derive this
term differently. Let us start by computing ∇a∇dVb and keeping in mind that
∇dVb is a rank-2 tensor. We have

Next, we write out ∇d∇aVb and subtract both terms from each other, in other
words we are computing the commutator of two covariant derivatives. It
becomes clear that quite a few terms will cancel during this calculation, the
result of which is

Comparing the right-hand side of Eq. (1.161) with Eq. (1.155) we note that
these terms only differ by a minus sign and some renaming of dummy
indices. The good news is that Eq. (1.161) is clearly a tensor-valued object
because we defined the covariant derivative precisely in this way, it maps
tensors to tensors. This in turn implies that the parallel transport of a vector
along a small parallelogram is determined by a tensorial quantity. We can
promote this into a definition.

Definition 1.20 (Riemann curvature tensor). Let $\nabla_a$ be a covariant
derivative and let $V^b$ be a vector. The equation

\[ \nabla_a \nabla_d V^b - \nabla_d \nabla_a V^b = -R_{adi}{}^{b}\, V^i \]

defines the Riemann curvature tensor (short Riemann tensor) $R_{adi}{}^{b}$.


This means we can write Eq. (1.155) as

and for completeness we write out the Riemann tensor explicitly


We note that there are different conventions when defining the Riemann
curvature tensor, for instance, the minus sign in Eq. (1.162). This sign
depends on how one defines the tensor. Had we started with a covariant
vector $V_b$, we would have found that

\[ \nabla_a \nabla_d V_b - \nabla_d \nabla_a V_b = R_{adb}{}^{i}\, V_i . \]

This sign is related to the different signs when applying the covariant
derivative to upper or lower indices. Analogously to the covariant derivative
one can generalise formulae (1.164) and (1.165) to higher-rank tensors.
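Let us briefly revisit Eq. (1.39). We defined $F_{ij} = \nabla_i A_j - \nabla_j A_i$,
and it turns out that the Christoffel symbols coming from the covariant
derivatives will cancel. A direct calculation shows

\[ F_{ij} = \partial_i A_j - \Gamma^k_{ij} A_k - \partial_j A_i + \Gamma^k_{ji} A_k = \partial_i A_j - \partial_j A_i , \]

which leads us to a nice little result which will be needed.

Theorem 1.3. For $F_{ij} = \nabla_i A_j - \nabla_j A_i = -F_{ji}$ the following identity holds

\[ \nabla_i F_{jk} + \nabla_k F_{ij} + \nabla_j F_{ki} = 0 . \]
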
Proof. We begin with

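We add up these three equations and use that $F_{ij} = -F_{ji}$ and
$\Gamma^a_{ij} = \Gamma^a_{ji}$ which gives

\[ \nabla_i F_{jk} + \nabla_k F_{ij} + \nabla_j F_{ki} = \partial_i F_{jk} + \partial_k F_{ij} + \partial_j F_{ki} , \]

where again the Christoffel symbols all cancel. Since $F_{ij} = \partial_i A_j - \partial_j A_i$
and partial derivatives commute, we arrive at the identity.

The Riemann tensor satisfies some important identities which are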
summarised in a theorem. The proofs are based on direct calculations and are
a good exercise to get used to working with the Riemann tensor. The basic
idea is always the same, namely writing out the definition of the Riemann
curvature tensor with permuted indices.

Theorem 1.4. Let $R_{abcd}$ be the Riemann curvature tensor defined by a
torsion-free and metric compatible covariant derivative. $R_{abcd}$ satisfies
the following identities:
(i) $R_{abcd} = -R_{bacd}$.
(ii) $R_{abcd} = -R_{abdc}$.
(iii) $R_{abcd} + R_{cabd} + R_{bcad} = 0$.
(iv) $\nabla_a R_{bcde} + \nabla_c R_{abde} + \nabla_b R_{cade} = 0$ (Bianchi identity).

Proof. (i) This follows from the definition, see for instance Eq. (1.165).
(ii) We start by recalling that ∇agbc = 0. Now we compute ∇a∇bgcd −
∇b∇agcd and express this using the Riemann curvature tensor, we have

which implies the second symmetry property.


(iii) We proceed similarly to the proof of Theorem 1.1. We write out
∇a∇bVc − ∇b∇aVc three times with permuted indices

These three equations are added up next, and we can now use the previous
theorem by introducing the notation ℱab = ∇aVb − ∇bVa and arrive at

We know that the left-hand side of Eq. (1.176) vanishes due to Theorem 1.3.
Since this equation is valid for all vectors Vi, we have proved the third
property.
(iv) Since ∇iVj is a rank-2 tensor, we have

On the other hand, we have Eq. (1.165) and by applying ∇a to this equation
we arrive at the second required equation

where we used the product rule on the right-hand side. The term ∇aRbijn is
the object we need for the Bianchi identity. Hence, we want to write out Eqs.
(1.177) and (1.178) three times with permuted indices. Starting with Eq.
(1.177) we have

Likewise, from Eq. (1.178) one gets

Now, if we calculate (1.179) + (1.180) + (1.181) and separately (1.182) +
(1.183) + (1.184) we note that the left-hand sides of these new equations
coincide, hence also their right-hand sides must be equal. Due to property
(iii) the terms involving $R_{abi}{}^{n}$ will cancel and we find the relation

At this point we note that all terms containing covariant derivatives of the
vector $V_k$ cancel and we are left with

which must hold for all vectors $V_n$ and proves the Bianchi identity.

Theorem 1.5. Let $W_{abcd}$ be a tensor satisfying (i) $W_{abcd} = -W_{bacd}$,
(ii) $W_{abcd} = -W_{abdc}$ and (iii) $W_{abcd} + W_{cabd} + W_{bcad} = 0$, then
$W_{abcd}$ also satisfies

\[ W_{abcd} = W_{cdab} . \]

This means that the three properties (i)–(iii) of the Riemann curvature tensor
imply a fourth algebraic identity.

Proof. The proof is set as Exercise 1.19 and the solution will be given
explicitly. The idea of the proof is to write out identity (iii) four times with
permuted indices and to isolate the required two terms which make up the
fourth algebraic identity.

A rank-4 object with no symmetries has $n^4$ independent components in
general. However, the Riemann curvature tensor has various symmetry
properties and hence the number of independent components is not obvious.
It turns out that in dimensions 2, 3, 4, the Riemann tensor has 1, 6, 20
independent components, respectively. The general formula is
$n^2(n^2 - 1)/12$ and its derivation is given as Exercise 1.20.
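
The numbers quoted above are easy to tabulate. The sketch below evaluates the formula $n^2(n^2 - 1)/12$ and, as an independent cross-check, a counting argument that is an assumption of this sketch rather than something stated in the text. It treats the tensor as a symmetric matrix in the $N = n(n-1)/2$ antisymmetric index pairs and subtracts one constraint from the cyclic identity for every choice of four distinct index values. The function names are illustrative.

```python
from math import comb

def riemann_components(n):
    """Independent components of the Riemann tensor in n dimensions."""
    return n * n * (n * n - 1) // 12

def riemann_components_by_counting(n):
    """Cross-check: symmetric matrix in index pairs minus cyclic constraints."""
    pairs = n * (n - 1) // 2             # antisymmetric index pairs
    symmetric = pairs * (pairs + 1) // 2
    return symmetric - comb(n, 4)

def symmetric_rank2_components(n):
    """Independent components of a symmetric rank-2 tensor such as the Ricci tensor."""
    return n * (n + 1) // 2

for n in (2, 3, 4):
    print(n, riemann_components(n), riemann_components_by_counting(n),
          symmetric_rank2_components(n))
```

In four dimensions this gives 20 components for the Riemann tensor and 10 for a symmetric rank-2 tensor.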
Since the Riemann tensor is a rank-4 tensor, we can consider its
contractions (traces) over pairs of indices. Since it is skew-symmetric in the
first and in the last pair of indices, contracting within either of these pairs
gives zero. However, we can contract over one index from the first pair and
one from the second pair.

Definition 1.21 (Ricci tensor and Ricci scalar). The Ricci tensor is defined
by

\[ R_{ab} = R_{acb}{}^{c} \]

and is a symmetric tensor due to Theorem 1.5. In n dimensions this tensor has
$n(n + 1)/2$ independent components.
The Ricci scalar or scalar curvature is defined by

\[ R = g^{ab} R_{ab} . \]

In four dimensions, for instance, the Ricci tensor has 10 independent
components while the Riemann tensor has 20. The corresponding trace-free
part of the Riemann curvature tensor is called the Weyl tensor, or the
conformal tensor. We mention its definition for sake of completeness but will
not discuss the Weyl tensor in detail henceforth.

Definition 1.22 (Weyl or conformal tensor). The Weyl tensor is

where we recall the notation introduced in Eq. (1.37).

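Example 1.15 (Hyperbolic disk). We write the line element of Poincaré’s
hyperbolic disk with coordinates $X^i = \{\chi, \varphi\}$ in the form

\[ ds^2 = d\chi^2 + \sinh^2\chi\, d\varphi^2 \]

and want to compute the Riemann tensor, Ricci tensor and the Ricci scalar. In
two dimensions the Riemann tensor has one independent component, so it
suffices to compute $R_{1212}$. We begin with computing the Christoffel symbol
components using the Lagrangian approach

\[ L = \dot\chi^2 + \sinh^2\chi\, \dot\varphi^2 . \]

The geodesic equations are given by

\[ \ddot\chi - \sinh\chi\cosh\chi\, \dot\varphi^2 = 0 , \qquad \ddot\varphi + 2\coth\chi\, \dot\chi\dot\varphi = 0 , \]

and hence the Christoffel symbol component from the χ-equation is

\[ \Gamma^\chi_{\varphi\varphi} = -\sinh\chi\cosh\chi , \]

while the φ-equation yields

\[ \Gamma^\varphi_{\chi\varphi} = \Gamma^\varphi_{\varphi\chi} = \coth\chi . \]

The formula for $R_{1212}$ is

\[ R_{1212} = g_{22}\left( -\partial_\chi \Gamma^\varphi_{\chi\varphi} - \Gamma^\varphi_{\chi\varphi}\Gamma^\varphi_{\chi\varphi} \right) = -\sinh^2\chi . \]

By contracting over the second and fourth index, we arrive at the Ricci tensor
with components $R_{11} = -1$, $R_{22} = -\sinh^2\chi$ and zero otherwise. This
means in particular that the Ricci tensor is proportional to the metric tensor,
$R_{ij} = -g_{ij}$. Moreover, the Ricci scalar is simply $R = -2$. This justifies
calling the hyperbolic disk a space of constant negative curvature.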
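
The value R = −2 can be cross-checked numerically. The sketch below relies on a standard result that is not derived in the text and is therefore an assumption here: for a two-dimensional line element of the form $ds^2 = d\chi^2 + f(\chi)^2 d\varphi^2$ the Gaussian curvature is $K = -f''/f$, and in two dimensions the Ricci scalar is $R = 2K$. The second derivative is approximated by a central finite difference. The function name and the step size are illustrative.

```python
import math

def ricci_scalar_2d(f, chi, h=1e-4):
    """Ricci scalar of ds^2 = dchi^2 + f(chi)^2 dphi^2 at the point chi.

    Uses R = -2 f''/f with a central finite difference for f''.
    """
    second_derivative = (f(chi + h) - 2.0 * f(chi) + f(chi - h)) / (h * h)
    return -2.0 * second_derivative / f(chi)

# Hyperbolic disk, f = sinh: constant negative curvature.
hyperbolic = [ricci_scalar_2d(math.sinh, chi) for chi in (0.3, 0.7, 1.5)]

# Unit 2-sphere, f = sin: constant positive curvature, for comparison.
spherical = [ricci_scalar_2d(math.sin, theta) for theta in (0.3, 0.7, 1.5)]

print(hyperbolic)
print(spherical)
```

Up to finite-difference error the first list contains values close to −2 at every point, and the second list values close to +2.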

One tensor of great importance for General Relativity is the Einstein
tensor $G_{ab}$ which forms the basis of the Einstein field equations.

Definition 1.23 (Einstein tensor). The Einstein tensor is given by

\[ G_{ab} = R_{ab} - \frac{1}{2} R\, g_{ab} \]

and is a symmetric rank-2 tensor. In n dimensions this tensor has $n(n + 1)/2$
independent components, so in four dimensions it has 10 independent
components.

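Theorem 1.6. The Einstein tensor satisfies

\[ \nabla^c G_{bc} = 0 . \]

Proof. We start with the Bianchi identities

\[ \nabla_a R_{bcd}{}^{e} + \nabla_c R_{abd}{}^{e} + \nabla_b R_{cad}{}^{e} = 0 . \]

First, we contract over the indices a and e which gives

\[ \nabla_a R_{bcd}{}^{a} - \nabla_c R_{bd} + \nabla_b R_{cd} = 0 , \]

where we used $R_{abd}{}^{a} = -R_{bad}{}^{a} = -R_{bd}$. Next, we apply $g^{cd}$ and get

\[ -\nabla_a R_{b}{}^{a} - \nabla_c R_{b}{}^{c} + \nabla_b R = 0 . \]

We relabel one index, take into account that $\nabla_b R = \nabla^c (g_{bc} R)$ and arrive at

\[ \nabla^c \left( R_{bc} - \frac{1}{2} g_{bc} R \right) = 0 , \]

which implies the property $\nabla^c G_{bc} = 0$. This argument is usually
summarised by saying that the twice contracted Bianchi identities imply this
property.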

1.3.3. Geodesic deviation equation

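In the following we derive the so-called geodesic deviation equation. The
idea is to understand the behaviour of nearby geodesics, and we expect the
curvature of the manifold to determine this. Let us start with a surface
$X^a(\lambda, s)$, where we assume that for each fixed value $s = s_0$, the
curves $X^a(\lambda, s_0)$ are geodesics with affine parameter λ. The tangent
vectors to the geodesics are given by

\[ T^a = \frac{\partial X^a}{\partial \lambda} \]

and the vectors connecting nearby geodesics are given by

\[ N^a = \frac{\partial X^a}{\partial s} . \]

We can always choose λ and s such that $g_{ab} T^a N^b = 0$.
From Eqs. (1.204) and (1.205) we get

\[ \frac{\partial T^a}{\partial s} = \frac{\partial^2 X^a}{\partial s\, \partial \lambda} , \qquad \frac{\partial N^a}{\partial \lambda} = \frac{\partial^2 X^a}{\partial \lambda\, \partial s} . \]

Since partial derivatives commute these last two expressions must be equal.
Furthermore, the Christoffel symbols are symmetric in the lower pair of
indices which then implies

\[ T^i \nabla_i N^a = N^i \nabla_i T^a . \]

If we view $N^a$ as the vector connecting nearby geodesics, then
$T^i \nabla_i N^a$ would describe its rate of change along the geodesic with
tangent vector $T^i$. Hence we can interpret $T^i \nabla_i N^a$ as the relative
velocity between nearby geodesics. Therefore, the object
$T^j \nabla_j (T^i \nabla_i N^a)$ would correspond to the relative acceleration
which we are going to compute.

\[ T^j \nabla_j (T^i \nabla_i N^a) = T^j \nabla_j (N^i \nabla_i T^a) = (T^j \nabla_j N^i) \nabla_i T^a + N^i T^j \nabla_j \nabla_i T^a = (T^j \nabla_j N^i) \nabla_i T^a + N^i T^j \nabla_i \nabla_j T^a - R_{jik}{}^{a}\, T^j N^i T^k , \]

where in the third step we used Eq. (1.162).

Next, we rewrite the first two terms as follows:

\[ (T^j \nabla_j N^i) \nabla_i T^a + N^i T^j \nabla_i \nabla_j T^a = (N^j \nabla_j T^i) \nabla_i T^a + N^i T^j \nabla_i \nabla_j T^a = N^i \nabla_i (T^j \nabla_j T^a) = 0 , \]

because we assumed that $X^a(\lambda, s_0)$ are geodesics which means
$T^i \nabla_i T^a = 0$ by Theorem 1.2. Consequently we find

\[ T^j \nabla_j (T^i \nabla_i N^a) = -R_{jik}{}^{a}\, T^j N^i T^k , \]

which is the geodesic deviation equation. Sometimes one introduces the
notation

\[ \frac{D}{D\lambda} = T^i \nabla_i , \]

this is the derivative along the geodesic so that the geodesic equation can be
written as

\[ \frac{D T^a}{D\lambda} = 0 , \]

and we could define the relative velocity by $V^i = DN^i/D\lambda$ and the
relative acceleration by $A^i = DV^i/D\lambda$, in analogy with mechanics.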
The geodesic deviation equation contains the second clue towards a
geometrical theory of gravity which we will discuss in Sec. 2.2.2.

1.4. Euler–Lagrange Equations

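In classical mechanics the action is defined as

\[ S = \int_{t_1}^{t_2} L(\dot q, q, t)\, dt , \]

where L is called the Lagrangian which is a function of velocity $\dot q$,
position q and time t. The Lagrangian has units of energy, and in classical
mechanics it is the difference between kinetic energy and potential energy
$L = T - V$. Hence, the action S has units of energy times time which happens
to have the same dimension as angular momentum. The equations of motion of the
system described by the Lagrangian L are found by using the calculus of
variations: one wishes to find q such that S is stationary. In order to
find such q, we begin by considering a small change in the position $q + \delta q$
with $\delta q \ll 1$, however, we will not allow changes to q at the end points,
this means $\delta q(t_1) = \delta q(t_2) = 0$. Then, to first order in δq we find

\[ L(\dot q + \delta\dot q, q + \delta q, t) = L(\dot q, q, t) + \frac{\partial L}{\partial q}\, \delta q + \frac{\partial L}{\partial \dot q}\, \delta\dot q . \]

In the following, we will only keep terms up to first order in δq. Let us denote
the change in S due to the change in the position by δS so that

\[ \delta S = \int_{t_1}^{t_2} L(\dot q + \delta\dot q, q + \delta q, t)\, dt - \int_{t_1}^{t_2} L(\dot q, q, t)\, dt , \]

which means we can now write δS explicitly as

\[ \delta S = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q}\, \delta q + \frac{\partial L}{\partial \dot q}\, \delta\dot q \right) dt . \]

As we wish to write the integrand using the small quantity δq we will use
integration by parts on the second term by writing

\[ \frac{\partial L}{\partial \dot q}\, \delta\dot q = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q}\, \delta q \right) - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) \delta q . \]

The first term in this is a total derivative and will not contribute to the
integral because we keep the end points fixed, therefore

\[ \delta S = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \right) \delta q\, dt . \]

The action is stationary if δS = 0 for arbitrary δq, which implies the
Euler–Lagrange equations

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 . \]

For a given Lagrangian L the Euler–Lagrange equations are the equations of
motion of the physical system.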

Example 1.16 (Harmonic oscillator). Let us have a quick look at the
standard example of a harmonic oscillator with kinetic energy $T = m\dot q^2/2$
and potential energy $V = kq^2/2$ so that $L = m\dot q^2/2 - kq^2/2$. We have

\[ \frac{\partial L}{\partial \dot q} = m \dot q , \qquad \frac{\partial L}{\partial q} = -k q , \]

so that the Euler–Lagrange equation is given by

\[ m \ddot q + k q = 0 . \]

This is indeed the differential equation of the harmonic oscillator with spring
constant k and particle mass m.
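
As a quick numerical illustration, one can integrate the equation of motion $m\ddot q + kq = 0$ step by step and compare with the familiar solution $q(t) = q_0 \cos(\omega t)$, $\omega = \sqrt{k/m}$, for a particle released at rest. This is a minimal sketch: the velocity Verlet scheme, the function name and the parameter values are illustrative assumptions and do not come from the text.

```python
import math

def simulate_oscillator(m, k, q0, v0, t_end, dt=1e-3):
    """Integrate m q'' = -k q with the velocity Verlet scheme."""
    q, v = q0, v0
    a = -k * q / m
    for _ in range(round(t_end / dt)):
        q += v * dt + 0.5 * a * dt * dt
        a_new = -k * q / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return q, v

# Hypothetical parameters: m = 2 and k = 8 give omega = 2.
m, k = 2.0, 8.0
omega = math.sqrt(k / m)
t_end = 1.0

q_numeric, v_numeric = simulate_oscillator(m, k, q0=1.0, v0=0.0, t_end=t_end)
q_exact = math.cos(omega * t_end)
v_exact = -omega * math.sin(omega * t_end)
print(q_numeric, q_exact)
print(v_numeric, v_exact)
```

The numerical and the exact values should agree to several decimal places, and the agreement improves as the step size dt is reduced.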

The Euler–Lagrange equation can easily be generalised to systems with
more degrees of freedom. If we consider a system of N particles with
positions $q_i$ and velocities $\dot q_i$, where $i = 1, \dots, N$, then the
system will be governed by N differential equations given by

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 , \]

which means we have one Euler–Lagrange equation per particle.


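When studying classical field theories like electromagnetism, we can no
longer use point particles to describe them, but need to study fields. This
means our variables will also become functions of the spatial coordinates. In
classical mechanics all quantities are functions of time only. In the case of
field theories we begin with a field $\psi = \psi(t, x, y, z)$. Its action is

\[ S = \int \int_\Omega \mathcal{L}(\psi, \partial_i \psi)\, d^3x\, dt , \]

where $\partial_i \psi$ stands for all possible (first) partial derivatives of
the field ψ. The volume is denoted by Ω. If we insist on S having dimensions of
energy multiplied by time, then $\mathcal{L}\, d^3x$ must have units of energy
and in turn $\mathcal{L}$ must have units of energy density, this means energy
per volume. For this reason $\mathcal{L}$ is generally called the Lagrangian
density while the Lagrangian would be

\[ L = \int_\Omega \mathcal{L}\, d^3x , \]

which is a function of time only.

The derivation of the Euler–Lagrange equations is analogous to the single
particle classical mechanics case and is based on considering a small change
δψ which is kept fixed at the boundary ∂Ω. The result is given by

\[ \frac{\partial \mathcal{L}}{\partial \psi} - \frac{\partial}{\partial t} \frac{\partial \mathcal{L}}{\partial (\partial_t \psi)} - \frac{\partial}{\partial x} \frac{\partial \mathcal{L}}{\partial (\partial_x \psi)} - \frac{\partial}{\partial y} \frac{\partial \mathcal{L}}{\partial (\partial_y \psi)} - \frac{\partial}{\partial z} \frac{\partial \mathcal{L}}{\partial (\partial_z \psi)} = 0 \]

and the presence of the extra terms simply follows from integration by parts
with respect to the other variables. We can write this in a much neater way
using the index notation and find

\[ \partial_i \frac{\partial \mathcal{L}}{\partial (\partial_i \psi)} - \frac{\partial \mathcal{L}}{\partial \psi} = 0 . \]

Lastly, if the Lagrangian density depends on several different fields $\psi^A$,
then we would have one Euler–Lagrange equation for each and every field,
similar to Eq. (1.225).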

1.5. Further Reading

One particular topic that was omitted is the so-called Lie derivative. This
is a connection independent derivative operator. It provides a measure of the
change of a vector (or tensor in general) along the flow of another vector
field. Let U and V be two vector fields, then the Lie derivative of V with
respect to U is defined by
$\mathcal{L}_U V^a = U^b \partial_b V^a - V^b \partial_b U^a$. This expression
transforms like a tensor, similar to Eq. (1.39). The Lie derivative is a useful
tool for more advanced topics of General Relativity and Cosmology. It is
discussed in more detail in the recommended books.
Differential Geometry itself is a large research field in Mathematics, there
are plenty of textbooks written which cover this field. Three well-known
books aimed primarily at mathematicians are by Eisenhart (1997), a classic
originally published in 1926, and the books by Bishop and Goldberg (1980)
and do Carmo (1992) both of which make for an interesting and useful read.
It turns out though that the index notation used for most parts of this book has
been superseded in Mathematics by the use of differential forms and the so-
called index-free notation.
However, most of the theoretical physics literature is written in the index
notation. Other recommended books for further reading very much reflect
this point, they are more aimed at theoretical physicists than mathematicians.
The main theme in all of them is the interplay between Physics and
Geometry. The books by Isham (2001) and Nakahara (2003) both aim to
introduce the concepts of modern differential geometry to the reader while
making links with physics. The most comprehensive book which discusses
the concepts of Geometry and Physics in great detail is by Frankel (2012). Its
almost 750 pages cover every important physical theory using the language
of geometry. It is a great book but requires dedication from the reader.
The recommended literature is by no means complete, it simply reflects
the author’s suggestions for further reading.

1.6. Exercises

The concept of a vector

Exercise 1.1. Let a, b, c be given vectors which do not all lie in the same
plane. Show that the volume of the parallelepiped spanned by these vectors is
given by V = |a · (b × c)|.

Exercise 1.2. Show that the so-called scalar triple product satisfies a · (b × c)
= c · (a × b) = b · (c × a).

Exercise 1.3. What is a · (b × c) in index notation?

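Exercise 1.4. One can write $\varepsilon_{ijk}\varepsilon_{lmn}$ as follows:

\[ \varepsilon_{ijk}\varepsilon_{lmn} = \det \begin{pmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{pmatrix} . \]

Here det stands for the determinant of the matrix. Use this to prove the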
following three identities:

Exercise 1.5. Show that a × (b × c) = (a · c)b − (a · b)c. Show this again by
using the index notation.

Exercise 1.6. Show that εijk = (ei × ej) · ek.

Exercise 1.7. Prove the following vector calculus identities in index notation

Are you convinced yet that the index notation is a good thing? If not, prove
the above using the standard approach with explicit vector components.

Manifolds and Tensors

Exercise 1.8. Show that the Christoffel symbol transforms as follows:

Exercise 1.9. Show that transforms as follows:


Exercise 1.10. Rewrite the line element of the hyperbolic disk in polar
coordinates, this should give

\[ ds^2 = \frac{4\left( dr^2 + r^2 d\varphi^2 \right)}{(1 - r^2)^2} . \]

Exercise 1.11 (takes time). Following on from the previous exercise,
introduce the new radial coordinate $\rho = 2r/(1 - r^2)$ and show that the line
element becomes

\[ ds^2 = \frac{d\rho^2}{1 + \rho^2} + \rho^2 d\varphi^2 . \]

Lastly, introduce the hyperbolic ‘angle’ χ by using the new coordinate
$\rho = \sinh\chi$ and show that we can now write

\[ ds^2 = d\chi^2 + \sinh^2\chi\, d\varphi^2 . \]

This somewhat motivates the name hyperbolic disk as this line element is
very similar to Eq. (1.58), with the trigonometric function changed to the
hyperbolic one.

Exercise 1.12. Show that the covariant divergence of the vector Ai can be
written as

Exercise 1.13 (takes time). Consider the line-element

First, compute the Christoffel symbol components using Eq. (1.72), and
second, find them using the geodesic equations obtained from the Euler–
Lagrange equations of Eq. (1.74). Next, solve the geodesic equations and
verify that they are straight lines. Find the coordinate transformation to
Cartesian coordinates.
Exercise 1.14. Follow Theorem 1.1 and show that the connection in a space
with torsion is given by

where is the usual Christoffel symbol and Tijk is the torsion tensor.

Exercise 1.15. Consider a covariant derivative satisfying
$\tilde\nabla_a g_{bc} = Q_a g_{bc}$. Show that this implies
$\tilde\nabla_a g^{bc} = -Q_a g^{bc}$.

Curvature

Exercise 1.16 (takes time). We began Sec. 1.3 by discussing parallel


transport on the surface of a sphere. This exercise is the real thing: Recall the
line-element of the surface of the 2-sphere ds2 = R2(dθ2 + sin2 θdϕ2), for
simplicity we will set R = 1.

(i) Solve the equations of parallel transport along constant longitudes, also
called meridians. This means ϕ = ϕ0 for such curves.
(ii) Solve the equations of parallel transport along constant latitudes
sometimes called circles of latitude. This means θ = θ0 for such curves.
(iii) Parallelly transport the vector Vi = (1, 0), from (π/2, 0) to the North pole,
then to (π/2, π/2), and lastly back to the starting point (π/2, 0).
(iv) Find the angle between and

Exercise 1.17. Consider the line element $ds^2 = v^2 du^2 + u^2 dv^2$. Show that
this space has vanishing Riemann curvature tensor. Recall, it is sufficient to
compute $R_{1212}$.

Exercise 1.18. In the previous exercise we showed that
$ds^2 = v^2 du^2 + u^2 dv^2$ is Euclidean space in awkward coordinates. Hence,
there must exist a coordinate transformation such that we can write this line
element as $ds^2 = dx^2 + dy^2$. Find this coordinate transformation. Note: This
was a prize
question set by Peter Hogan from University College Dublin when I was
learning General Relativity. My prize was a copy of ‘Gems of Hubble’ by
Mitton and Maran!

Exercise 1.19 (hard). Prove Theorem 1.5.

Exercise 1.20. Show that in n dimensions the Riemann curvature tensor has
n2(n2 − 1)/12 independent components.
1 This means for all υ implies that u = 0.
2 For readers unfamiliar with the concepts of variations and Euler–Lagrange equations, a very brief
introduction is given in Sec. 1.4.
3 Literally translated from Latin it means ‘from the one before’ or ‘from the earlier’.

2. Einstein Field Equations

The aim of this chapter is to formulate Einstein’s theory of General
Relativity. While it is easy to state the field equations, it is much harder
combining the physics and mathematics involved to motivate this theory. In
doing so, we will also encounter different theories of gravity which can be
viewed as extensions of General Relativity. As a starting point to this chapter
we should state the main working hypothesis: The framework of differential
geometry is suitable to formulate a consistent and physically meaningful
theory of gravity. It is not clear whether this is possible or not. At this point
this is a huge leap of faith and it might not work out. The idea of
geometrising the different forces in nature has motivated a large amount of
research.

2.1. Some Physics Background

2.1.1. Newton’s theory of gravity

Newton’s law of universal gravitational attraction is usually written in the
form

\[ F = \frac{G M m}{r^2} , \]

where M and m are two idealised point masses and r, with
$r^2 = x^2 + y^2 + z^2$, is the Euclidean distance between those masses. G is
Newton’s gravitational constant. Since forces have directions, the force F is a
vector and one should write more correctly

where $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ is the vector pointing from
the position of the first mass to the second mass. We will now choose
$\mathbf{r}_1$ as the origin of our coordinate system, we also set $M = m_1$ and
$m = m_2$ to simplify the following discussion.
It turns out that Newton’s law of gravity is also valid for spherically
symmetric extended objects, whose gravitational fields behave as if their
masses were concentrated in a single point at their centre of mass.
Newton’s second law states F = mg and hence the force per unit mass, or
gravitational acceleration, can be written in either of two ways

Here r is the Euclidean distance from the origin. We can introduce the
gravitational potential $\varphi_N$ because the gravitational force is
conservative. Recall that this means that the work done by gravity from one
point to another is path-independent or, more mathematically, the force vector
is curl free. The Newtonian potential $\varphi_N$ is defined via the equation

\[ \mathbf{g} = -\nabla \varphi_N , \]

which means the gravitational potential is given by

\[ \varphi_N = -\frac{G M}{r} . \]

The subscript in $\varphi_N$ indicates the Newtonian potential.


Relation (2.5) shows that the gravitational acceleration of a test mass is
independent of the amount of mass which is accelerated. This observation
goes back to Galileo and was well known before Newton formulated his
force law. However, in principle Newton’s theory would also be valid if we
could distinguish between inertial mass, the m on the left-hand side of (2.3)
and the gravitational mass, m on the right-hand side of (2.3). The
gravitational mass is the analogue of the electric charge in electromagnetism.
In Newtonian gravity, the equivalence between inertial mass and gravitational
mass is a fact deduced by experiments. It turns out that in General Relativity
this fact is a necessary part of the theory and not an additional ingredient.
Equation (2.4) also provides us with a conceptual question: Can we, in
principle, distinguish being accelerated by a powerful rocket from being in a
gravitational field with identical magnitude? Einstein’s equivalence principle
states that we cannot distinguish these scenarios, this means we assume the
physical equivalence of an accelerated reference frame and a gravitational
field.
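One can integrate Eq. (2.4) over the volume V with boundary ∂V which
encloses the mass and arrive at Gauss’ law of gravity

\[ \oint_{\partial V} \mathbf{g} \cdot d\mathbf{S} = -4\pi G M . \]

In differential form this yields Poisson’s equation for the gravitational field

\[ \Delta \varphi_N = 4\pi G \rho , \]

where ρ is the density distribution of matter and one can define the mass as
the volume integral of the density $M = \int_V \rho\, dV$. The Laplacian is
denoted by ∆, in Cartesian coordinates it reads
$\Delta = \partial_{xx} + \partial_{yy} + \partial_{zz}$.
For a spherically symmetric source with constant density
$\rho = \rho_0 = \text{const.}$ we choose $\varphi = \varphi(r)$ and have

\[ \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\varphi}{dr} \right) = 4\pi G \rho_0 . \]

This leads to the well-known result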

2.1.2. Special relativity

All pre-relativistic theories of physics were based on the idea of a universal
time which could be used in experiments. From a geometrical point of view,
this means we were working in a flat 3D space with an external clock. The
key idea of special relativity is to incorporate time into a geometrical
framework which yields a 4D spacetime and formulate all physical theories
within this framework.
One way to argue that this is indeed necessary comes from the speed of
light c and our observations that no massive object can be accelerated to the
point where it travels at the speed of light. If we accept that the speed of light
is an upper speed limit, then we are immediately led to the following thought
experiment. Consider a train travelling at speed 3c/4 with a powerful rocket
launcher in one of the carriages which can fire a rocket at 3c/4. According to
the pre-relativistic Galilean principle of relativity an observer at rest should
observe the rocket to have a speed larger than the speed of light. However,
this would violate our assumption that the speed of light is an upper limit and
hence we are forced to reconsider what exactly happens in this kind of set up.
Following this through yields the concept of time dilation which is the
difference of the elapsed time of two events as seen by two observers in
relative motion. In simple words, clocks tick at different rates depending on
their speed. Clearly, this is in gross contradiction to the idea of a universal
time. An event in special relativity is a point p in Minkowski space which is
characterised by four coordinates, the time and three spatial coordinates.
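
The train-and-rocket thought experiment can be made quantitative with the relativistic velocity addition law $w = (u + v)/(1 + uv/c^2)$. This formula is the standard result for collinear velocities and is not stated in the passage above, so it is an assumption of this sketch. Velocities are measured in units of c and exact fractions are used to avoid rounding. The function names are illustrative.

```python
from fractions import Fraction

def galilean_sum(u, v):
    """Pre-relativistic addition of collinear velocities (units of c)."""
    return u + v

def relativistic_sum(u, v):
    """Relativistic addition of collinear velocities (units of c)."""
    return (u + v) / (1 + u * v)

train = Fraction(3, 4)     # speed of the train relative to the ground
rocket = Fraction(3, 4)    # speed of the rocket relative to the train

naive = galilean_sum(train, rocket)
actual = relativistic_sum(train, rocket)
print(naive)    # 3/2
print(actual)   # 24/25
```

The Galilean sum exceeds the speed of light, whereas the relativistic result is 24c/25 and remains below it.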
Historically, the first clues of special relativity came out of Maxwell’s
equations which we will briefly discuss in the next section. The Maxwell
equations are invariant under a particular transformation of the time and the
space coordinates, the so-called Lorentz transformations. It was the work of
Poincaré, Einstein and Minkowski in the early 1900s which made the
connection between the constancy of the speed of light and geometry.
When considering Euclidean 3-space $\mathbb{R}^3$, one notices that this space
is invariant under rotations and under translations. Similarly, Minkowski space
with coordinates $X^i = \{ct, x, y, z\}$

\[ ds^2 = \eta_{ij}\, dX^i dX^j = -c^2 dt^2 + dx^2 + dy^2 + dz^2 \]

is also flat and should have some symmetries. In this case we can make 4D
rotations and translations along each of the axes. It is these 4D rotations
which are called Lorentz transformations. This means we can find matrices
Lij such that

which means that they leave the Minkowski metric invariant. One can use
Eq. (2.12) to derive the mathematical properties of the Lorentz
transformations, for instance we can see immediately that det Lij = ±1. The
Lorentz transformations form a 6D group, the Lorentz group. However, as
this is slightly outside the scope of this book we will only state the explicit
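form of boosts along the x-direction which are given by
$$L^i{}_j = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (2.13)$$
where β = v/c is the velocity normalised by the speed of light, and
$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}$$
is the well-known Lorentz factor which is at the heart of
special relativity. When applied to the coordinates directly we have
$X'^i = L^i{}_j X^j$, so that the Lorentz transformations (2.13) are explicitly given
by
$$ct' = \gamma\,(ct - \beta x), \qquad x' = \gamma\,(x - \beta\,ct), \qquad y' = y, \qquad z' = z. \qquad (2.15)$$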
Fig. 2.1 Visualisation of the Lorentz transformations. The angle α is determined by the velocity β
through β = tan(α). When β → 1, which corresponds to the speed of light, α → π/4, which
corresponds to the diagonals.

We can visualise these transformations in Fig. 2.1. The x'-axis corresponds to
all simultaneous events for which t' = 0, similar to the x-axis which can be
defined by t = 0. This means that the notion of simultaneity depends on the
velocity of the observer and two observers may not agree on two events being
simultaneous.
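
The train-and-rocket thought experiment from the start of this section can be checked numerically. The following is only an illustrative sketch and is not taken from the text: it builds the x-boost of Eq. (2.13) as a plain Python matrix, verifies the invariance condition of Eq. (2.12), and composes two boosts with β = 3/4. The helper names `boost_x`, `matmul`, `preserves_metric` and `velocity_of` are hypothetical.

```python
import math

# Minkowski metric with signature (-, +, +, +), coordinates (ct, x, y, z)
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def boost_x(beta):
    """Boost along x with normalised velocity beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def preserves_metric(L, tol=1e-12):
    """Check L^T eta L = eta, the matrix form of the invariance condition."""
    Lt = [list(row) for row in zip(*L)]
    m = matmul(Lt, matmul(ETA, L))
    return all(abs(m[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))


def velocity_of(L):
    """Read beta off an x-boost: L[0][1] = -beta*gamma, L[0][0] = gamma."""
    return -L[0][1] / L[0][0]


train = boost_x(0.75)             # train relative to the ground
rocket = boost_x(0.75)            # rocket relative to the train
combined = matmul(rocket, train)  # ground frame -> rocket rest frame
beta_total = velocity_of(combined)
```

Under these assumptions the combined boost corresponds to β = 0.96, below the speed of light, instead of the Galilean value 1.5.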
Probably the most important geometrical fact about Minkowski space is
that it contains two distinct regions which are separated by the so-called light
cone. The light cone is defined by $ds^2 = 0$ and corresponds to diagonal lines
which indicate the path that light would take through spacetime. Two points
(events) for which $ds^2 < 0$ are said to be separated by a time-like interval; in
this case we will be able to define proper time. Physically speaking this
means a particle with speed v < c can travel from one event to the other and
we can introduce a time order for these events. In the case of the light cone
one speaks of null intervals or light-like intervals, $ds^2 = 0$. Last, when $ds^2 > 0$
one speaks of space-like intervals. This means that no signal can be
exchanged between these two events; there is no causal relationship between
them. One can visualise Minkowski space as in Fig. 2.2.
Fig. 2.2 Minkowski diagram. The diagonal lines correspond to the path travelled by light emitted at the
origin; they correspond to null intervals $ds^2 = 0$. The dashed hyperbolas correspond to events separated
by constant space-like intervals $ds^2 > 0$, the solid hyperbolas correspond to events separated by
constant time-like intervals $ds^2 < 0$. The dotted straight line corresponds to a massive particle travelling
with constant speed v < c. If we rotate this figure about the vertical axis, the diagonal lines would
become a cone, hence the name light cone.
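
The classification of intervals can be written as a short function. This is a minimal sketch assuming events are given as tuples (ct, x, y, z) and the signature (−, +, +, +) used in the text; the names `interval` and `classify` are hypothetical.

```python
def interval(event_a, event_b):
    """ds^2 between two events given as (ct, x, y, z), signature (-,+,+,+)."""
    dct, dx, dy, dz = (b - a for a, b in zip(event_a, event_b))
    return -dct ** 2 + dx ** 2 + dy ** 2 + dz ** 2


def classify(event_a, event_b, tol=1e-12):
    """Return 'time-like', 'null' or 'space-like' for the separation."""
    s2 = interval(event_a, event_b)
    if abs(s2) <= tol:
        return "null"
    return "time-like" if s2 < 0 else "space-like"


origin = (0.0, 0.0, 0.0, 0.0)
inside_cone = classify(origin, (2.0, 1.0, 0.0, 0.0))
on_cone = classify(origin, (1.0, 1.0, 0.0, 0.0))
outside_cone = classify(origin, (1.0, 2.0, 0.0, 0.0))
```

Only events classified as time-like or null with respect to the origin can be causally connected to it.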

It is possible to formulate any physical theory within the setting of
Minkowski space so that the physical equations respect Lorentz invariance.
This means for instance that no propagation speed should exceed
the speed of light. Moreover, in the limit of small velocities we should always
recover the physical theory in the standard Newtonian formulation.
The proper time seen by a local observer on its local clock is denoted by
dτ and is related to the line element simply by $d\tau^2 = -ds^2/c^2$, which can also
be written as
$$d\tau^2 = -\frac{1}{c^2}\,\eta_{ab}\,dX^a dX^b,$$
where $X^a$ are the coordinates of the local observer. If this local observer is at
rest, the spatial coordinates are constant which means dx = dy = dz = 0 and so
dτ = dt. Therefore, the proper time of a local observer at rest in Minkowski
space agrees with the coordinate time t. However, when the observer moves,
this will change. Consider an observer moving with constant velocity $v_x$
along the x-axis, then we can write $dx = v_x\,dt$ and dy = dz = 0 and the proper
time becomes
$$d\tau^2 = dt^2 - \frac{v_x^2}{c^2}\,dt^2 = \left(1 - \frac{v_x^2}{c^2}\right)dt^2.$$
Using the previously used notation with the Lorentz factor γ, this becomes the
famous time dilation equation of special relativity
$$dt = \gamma\,d\tau. \qquad (2.20)$$
It means that the clock cycle depends on the relative velocity of the observer.
For $0 < v_x < c$ the Lorentz factor is greater than one, γ > 1. Therefore, the
clock cycle is increasing (more time between ticks) and hence the moving
clock appears to be running slower. For velocities much smaller than the
speed of light, this effect is very small and hardly detectable. However,
various experiments are in excellent agreement with the time dilation
formula.
Similarly, let us consider a rod of length $l = x_2 - x_1$ at some fixed time,
where $x_1$ and $x_2$ denote the rod’s endpoints. We assume this rod to be moving
along the positive x-direction. Now we boost along the x-direction using the
speed of the rod and compute $l'$. Since the rod is at rest in the boosted
coordinates, $l'$ is the rod’s length at rest. Equation (2.15) immediately implies
that $l' = x'_2 - x'_1 = \gamma(x_2 - x_1) = \gamma l$. As before, for $0 < v_x < c$ we have γ > 1 and
therefore $l < l'$, which implies that the moving rod is contracted. This effect is
called length contraction.
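
Time dilation and length contraction both follow from the Lorentz factor alone. The sketch below is illustrative only; the function names `lorentz_factor`, `coordinate_time` and `moving_length` are hypothetical, and β = 0.6 is chosen simply because it gives the round value γ = 1.25.

```python
import math


def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2) for a normalised speed |beta| < 1."""
    if not abs(beta) < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def coordinate_time(proper_time, beta):
    """Time dilation: dt = gamma * dtau."""
    return lorentz_factor(beta) * proper_time


def moving_length(rest_length, beta):
    """Length contraction: l = l' / gamma, with l' the length at rest."""
    return rest_length / lorentz_factor(beta)


gamma = lorentz_factor(0.6)
dt = coordinate_time(1.0, 0.6)     # one tick of the moving clock
length = moving_length(1.0, 0.6)   # rod of unit rest length
```

With these inputs one tick of the moving clock takes 1.25 units of coordinate time, and a rod of unit rest length is measured as 0.8 units long.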
Consider a massive point particle and assume its motion is described by
the curve $X^i(\tau)$. The path through spacetime which is described by the curve
$X^i(\tau)$ is called the world line of the particle. The tangent vector to this curve
is $dX^i/d\tau$; when considering point particles in special relativity one often
denotes this tangent vector by $u^i$, calling it the 4-velocity. Moreover, we use Eq.
(2.20) to change the parameter τ to physical time t which gives
$$u^i = \frac{dX^i}{d\tau} = \gamma\,\frac{dX^i}{dt} = \gamma\,(c, \mathbf{v}).$$
In analogy to classical mechanics we define the 4-momentum by $p^i = m u^i$ and
interpret $p^0 = E/c$ as the energy while the spatial components correspond to
momentum. This yields the very well-known formula
$$E = \gamma\,m c^2 = m c^2 + \frac{1}{2}\,m v^2 + \dots,$$
which in lowest order is the mass–energy equivalence relation.

2.1.3. Maxwell equations

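Using Euclidean 3-vectors the Maxwell equations are written using the
electric field E, magnetic field B with appropriate sources ρ and j. The
homogeneous Maxwell equations are
$$\operatorname{div}\mathbf{B} = 0, \qquad \operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0, \qquad (2.23)$$
and the inhomogeneous Maxwell equations are given by
$$\operatorname{div}\mathbf{E} = 4\pi\rho, \qquad \operatorname{curl}\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = \frac{4\pi}{c}\,\mathbf{j}. \qquad (2.24)$$
Here ρ is the charge density and j is the current density, and we work with
Gaussian units. In addition to the Maxwell equations we also need to specify
the force acting on a particle of charge q moving with velocity v; this is the Lorentz force
$$\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right).$$
Interestingly, in Gaussian units the electric and the magnetic field have the
same dimensions, whereas in SI units they differ exactly by units of velocity.
This is no coincidence and points toward special relativity.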
It is also possible to formulate Maxwell’s equations in Minkowski space
with metric ηij. We are working with coordinates Xi = {ct, x, y, z}, i = 0, 1, 2,
3. The Faraday tensor Fij is defined by
it is skew-symmetric Fij = −Fji. This means we can write Eα = F0α and Bα =
−1/2 εαβγ Fβγ. Recall that our time coordinate is X0 = ct, therefore all
components of the Faraday tensor have the same dimensions. If we were to
work with coordinates Xi = {t, x, y, z}, we would have to be careful with
factors of c in Fij and the units of time and space.
The Faraday tensor can be expressed using a 4-vector potential, $F_{ij} = \partial_i A_j - \partial_j A_i$
with $A_i = (-\varphi, \mathbf{A})$. The magnetic field is defined by B = curl A and, in Gaussian units, the
electric field is E = − grad φ − (1/c) ∂A/∂t.
Let us recall Theorem 1.3 which implies that the Faraday tensor must
satisfy the equation
$$\partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij} = 0. \qquad (2.27)$$
These equations turn out to be equivalent to the homogeneous Maxwell
equations (2.23). On the other hand, the inhomogeneous Maxwell equations
become

where the 4-vector current has components $J^i = (c\rho, \mathbf{j})$. Due to the
skew-symmetry of $F^{ij}$, the latter equation directly implies that the current is
conserved. We have $\partial_i\partial_j F^{ij} = 0$ and therefore $\partial_i J^i = 0$, which is the continuity
equation, or the (local) charge conservation equation.
Setting i = 0 in Eq. (2.28) leads to

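which is equivalent to
$$\operatorname{div}\mathbf{E} = 4\pi\rho.$$
Next, setting i = 1 in Eq. (2.28) we find

This is the x-component of the equation
$$\operatorname{curl}\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = \frac{4\pi}{c}\,\mathbf{j},$$
and likewise for the other components, thereby proving that Eq. (2.28) is
indeed equivalent to Eq. (2.24).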
Maxwell’s equations not only provide a link between geometry and
physics, they also contain the fundamental ideas of modern gauge field
theories which we will briefly discuss. The Maxwell equations can be written
compactly using the Faraday tensor which itself can be expressed using the
4-vector potential, $F_{ij} = \partial_i A_j - \partial_j A_i$. If we change $A_i$ by the gradient of a scalar
function χ, say, $A_i \to A_i + \partial_i\chi$, then the Faraday tensor will be invariant under
this change since partial derivatives commute. This means that $A_i$ contains a
non-physical degree of freedom which we may eliminate or fix by choosing a
specific gauge. One popular gauge in electromagnetism is the Lorenz (not
Lorentz) gauge whereby one sets $\partial_i A^i = 0$. Another often-used gauge is the
Coulomb gauge defined by div A = 0.
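
The gauge invariance of the Faraday tensor can be verified in a few lines. The following worked equation is a sketch that uses only the definitions given in the text; the primed symbols $A'_i$ and $F'_{ij}$ are notation introduced here for the transformed quantities.

```latex
\begin{align*}
A'_i &= A_i + \partial_i\chi, \\
F'_{ij} &= \partial_i A'_j - \partial_j A'_i \\
        &= \partial_i A_j - \partial_j A_i
           + \left(\partial_i\partial_j - \partial_j\partial_i\right)\chi \\
        &= F_{ij}.
\end{align*}
```

The last step uses that partial derivatives commute, so the electric and magnetic fields built from $F_{ij}$ do not depend on the choice of χ.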

2.1.4 Matter tensors

Let us begin by recalling the stress tensor or Cauchy stress tensor $\sigma_{ij}$ of
continuum mechanics. The geometrical setting is Euclidean 3-space with
Cartesian coordinates. This is a rank-2 tensor with nine components and it
defines the state of stress at any point inside a deformed material. The
diagonal components of this tensor are usually called the normal stresses,
while the remaining components are called shear stresses. For a hydrostatic
fluid in equilibrium the stress tensor is given by $-p\,\delta_{ij}$ with p being the
hydrostatic pressure; pressure is used instead of stress when the material is
compressible.
Since all known physical theories can be formulated within the
framework of special relativity, the stress tensor is superseded by the
energy–momentum tensor (sometimes called stress–energy tensor or
stress–energy–momentum tensor) $T^{ij}$, also a rank-2 tensor but in Minkowski space. In
addition to the stresses it contains information about the energy density and
energy fluxes of the matter. The energy–momentum tensor satisfies the four
conservation equations $\partial_i T^{ij} = 0$. We can interpret the j = 0 equation as the
energy conservation equation, and the three equations j = 1, 2, 3 as the
momentum conservation equations.
For an ideal fluid in thermodynamic equilibrium the energy–momentum
tensor in the fluid’s reference frame is given by
$$T^{ij} = \operatorname{diag}(\rho c^2, p, p, p). \qquad (2.33)$$
The hydrostatic pressure is p and ρ denotes the fluid’s mass density, so that $\rho c^2$ is its energy density.


As before, we work with coordinates $X^i = \{ct, x, y, z\}$ in Minkowski
space, $\eta_{ij} = \operatorname{diag}(-1, 1, 1, 1)$. Let $u^i$ be a unit time-like vector, $\eta_{ij}u^i u^j = -1$,
representing the velocity of the fluid, then we can write the
energy–momentum tensor as
$$T^{ij} = (\rho c^2 + p)\,u^i u^j + p\,\eta^{ij}, \qquad (2.34)$$
so that we have $T_{ij}u^i u^j = \rho c^2$. In the fluid’s rest frame we simply have $u^i = (1, 0, 0, 0)$.
The latter formulation used in Eq. (2.34) is geometrically useful. Let us
define the tensor $q^i{}_j = \delta^i_j + u^i u_j$, then we can immediately verify the following
relations
$$q^i{}_j\,u^j = 0, \qquad q^i{}_j\,q^j{}_k = q^i{}_k,$$
and also
$$q^i{}_i = 3.$$
Hence the tensor $q^i{}_j$ has a natural interpretation in terms of projections,
namely it projects onto the surface normal to the vector $u^j$. Projections along
a unit normal are frequently used in General Relativity, especially in the
initial value formulation.
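
The projection properties can be checked numerically for a moving fluid. This is a minimal sketch under the conventions of this section (signature (−, +, +, +), unit time-like $u^i$); the 4-velocity of a fluid moving along x with β = 0.5 and the helper names `four_velocity`, `lower` and `projector` are assumptions made for illustration.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric


def four_velocity(beta):
    """Unit time-like vector u^i = gamma * (1, beta, 0, 0)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [g, g * beta, 0.0, 0.0]


def lower(u):
    """Lower the index with the Minkowski metric: u_i = eta_ij u^j."""
    return [ETA_DIAG[i] * u[i] for i in range(4)]


def projector(u):
    """Mixed projection tensor q^i_j = delta^i_j + u^i u_j."""
    u_low = lower(u)
    return [[(1.0 if i == j else 0.0) + u[i] * u_low[j] for j in range(4)]
            for i in range(4)]


u = four_velocity(0.5)
q = projector(u)

norm = sum(a * b for a, b in zip(u, lower(u)))                        # u_i u^i
q_on_u = [sum(q[i][j] * u[j] for j in range(4)) for i in range(4)]    # q^i_j u^j
q_squared = [[sum(q[i][k] * q[k][j] for k in range(4)) for j in range(4)]
             for i in range(4)]                                        # q^i_k q^k_j
trace = sum(q[i][i] for i in range(4))                                 # q^i_i
```

Up to rounding, `q_on_u` vanishes, `q_squared` reproduces `q`, and the trace equals 3, the dimension of the surface normal to $u^i$.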
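Let us have a closer look at the conservation equation $\partial_i T^{ij} = 0$ for the
perfect fluid in Minkowski space. We note that $u_i u^i = -1$ implies $u_i\,\partial_j u^i = 0$.
Written out explicitly, the conservation equation is given by
$$u^i u^j\,\partial_i(\rho c^2 + p) + (\rho c^2 + p)\left(u^j\,\partial_i u^i + u^i\,\partial_i u^j\right) + \partial^j p = 0.$$
We will decompose these j equations into two parts. First, we consider
$-u_j\,\partial_i T^{ij} = 0$ which becomes
$$u^i\,\partial_i(\rho c^2) + (\rho c^2 + p)\,\partial_i u^i = 0. \qquad (2.38)$$
Second, we consider $q^k{}_j\,\partial_i T^{ij} = 0$. Here we find
$$(\rho c^2 + p)\,u^i\,\partial_i u^k + \partial^k p + u^k u^i\,\partial_i p = 0. \qquad (2.39)$$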
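Next, we will consider the non-relativistic limit of those equations. In this
limit $p/c^2 \ll \rho$ and we also assume that velocities are small when compared to
the speed of light. The 4-velocity is given by $u^i = (1, \mathbf{v}/c) = (1, v_x/c, v_y/c, v_z/c)$
and this means we will assume $|\mathbf{v}| \ll c$. Therefore,
$$\partial_i u^i = \frac{1}{c}\operatorname{div}\mathbf{v}, \qquad u^i\,\partial_i = \frac{1}{c}\left(\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right).$$
So Eq. (2.38) can now be written as follows:
$$\frac{\partial\rho}{\partial t} + \mathbf{v}\cdot\nabla\rho + \left(\rho + \frac{p}{c^2}\right)\operatorname{div}\mathbf{v} = 0.$$
We will take into account $p/c^2 \ll \rho$ and arrive at
$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\,\mathbf{v}) = 0.$$
This is the well-known continuity equation of fluid dynamics. It simply
means that the flow of matter into the system equals the flow out of the
system, provided there are no sinks or sources. This is equivalent to the
statement that energy of this system is conserved.
Next, let us consider Eq. (2.39) which, using $p/c^2 \ll \rho$ once more, becomes
$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right) + \nabla p + \frac{\mathbf{v}}{c^2}\left(\frac{\partial p}{\partial t} + \mathbf{v}\cdot\nabla p\right) = 0.$$
We assume that pressure changes over time are slow relative to the speed of
light, this means $(|\mathbf{v}|/c)(\partial p/c\,\partial t) \ll |\nabla p|$. Together with $|\mathbf{v}| \ll c$ we then arrive at
$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right) = -\nabla p.$$
This second equation is the Euler equation of fluid dynamics, which
corresponds to the conservation of momentum. Both results
are very nice as they show how non-relativistic physics naturally emerges
from the relativistic treatment.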
We have already discussed Maxwell’s equations and will now mention
the energy–momentum tensor of the electromagnetic field. Let us consider
the Maxwell equations (2.28) and apply Fik to both sides. This gives

Using Eq. (2.27) we can write

next re-write the right-hand side of Eq. (2.47) in the following way

The term in the square brackets is the energy–momentum tensor of the


electromagnetic field

which satisfies the equation . This means that in the absence of


charges and currents, $J^i = 0$, the energy–momentum tensor is conserved and
satisfies the usual equation $\partial_i T^{ij} = 0$. In the presence of sources, the
energy–momentum of the electromagnetic field alone is not conserved; however, the total
energy–momentum tensor of the field plus the sources is conserved. This equation
means that any deviation from the conservation of $T^{ij}$ is due to the sources $J^i$.
For concreteness let us compute $T^{00}$ explicitly, for which we find
$$T^{00} = \frac{1}{8\pi}\left(\mathbf{E}^2 + \mathbf{B}^2\right).$$
As expected, this is indeed the energy density of the electromagnetic field.


2.2. Geometry and Gravity

2.2.1. Geodesics and Newton’s law

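Newton’s second law of motion in the context of gravitational fields states
$$\frac{d^2 X^\alpha}{dt^2} = g^\alpha, \qquad (2.52)$$
where $g^\alpha$ is the gravitational field, that is, the force per unit mass.
Let us compare this equation with the geodesic equation (1.70) discussed in
Chap. 1. This was given by
$$\frac{d^2 X^i}{d\lambda^2} + \Gamma^i_{jk}\,\frac{dX^j}{d\lambda}\frac{dX^k}{d\lambda} = 0. \qquad (2.53)$$
If it is possible to describe gravity using the language of differential geometry
(our working hypothesis) then we should identify the Christoffel symbols in
the geodesic equation with the force per unit mass in Newton’s second law.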
Let us take into account physical units. We would assign units of length, m
for metres, to our spatial coordinates, we measure time in seconds s, and
masses in kg, using SI units. We denote the physical units of a quantity by
square brackets. The parameter λ in the geodesic equation (2.53) is
dimensionless. We have $[g] = \mathrm{m/s^2}$ and find that $[\Gamma] = 1/\mathrm{m}$. Hence our first
observation is that
$$\frac{[g]}{[\Gamma]} = \frac{\mathrm{m}^2}{\mathrm{s}^2},$$
which has the units of velocity squared. The only fundamental physical
constant with such units is the speed of light c which is of paramount
importance in modern theoretical physics and is the building block of special
relativity. Hence, we will make the following identification

Let us recall that the Christoffel symbol components depend on the
metric and, more importantly, on its first partial derivatives $\partial_k g_{ij}$. On the other
hand, in the Newtonian theory $\mathbf{g} = -\operatorname{grad}\varphi_N$, so the field depends on the
derivatives of the gravitational potential. This would suggest that the metric
components $g_{ij}$ contain the gravitational potential $\varphi_N$ which means

where we recall the factor of 1/2 in the definition of the Christoffel symbol. A
direct consequence of this identification is that under the assumption of a
static and spherically symmetric gravitational field,
Eq. (2.10), we would expect

We can improve this slightly by considering the limit M → 0 in which case
we would expect to find an empty space whose metric is flat; in particular we
would expect to find Minkowski space, see Eq. (1.60), the space of special
relativity. Therefore, up to an overall sign, we arrive at the identification

We will now try to make this identification more precise. Start with
coordinates $X^i = \{ct, x, y, z\}$. Let us consider the static line element
$$ds^2 = -(1 + 2\varphi)\,c^2 dt^2 + (1 - 2\varphi)\left(dx^2 + dy^2 + dz^2\right), \qquad (2.59)$$
assuming that $\varphi = \varphi(x, y, z) \ll 1$, and consider the motion of a freely falling
massive particle with 4-velocity $u^i = dX^i/d\lambda$. The trajectory of the particle is
determined by the geodesic equation and satisfies
$$\frac{du^i}{d\lambda} + \Gamma^i_{jk}\,u^j u^k = 0.$$
For non-relativistic motion $u^0 \gg u^1, u^2, u^3$ so that
$$\frac{du^i}{d\lambda} + \Gamma^i_{00}\,u^0 u^0 = 0.$$
In lowest order in φ one finds that $\Gamma^0_{00} = 0$ and therefore the i = 0 equation
simplifies to $du^0/d\lambda = 0$ or $u^0 = C = \text{const}$. Since $u^0 = c\,dt/d\lambda$ we can express
the geodesic parameter in terms of time as
$$d\lambda = \frac{c}{C}\,dt.$$
The geodesic equation becomes
$$\frac{d^2 X^\alpha}{d\lambda^2} + \Gamma^\alpha_{00}\,C^2 = 0$$
and we can change the independent variable λ to t which results in
$$\frac{d^2 X^\alpha}{dt^2} + c^2\,\Gamma^\alpha_{00} = 0.$$
We note that the component $\Gamma^\alpha_{00}$ of Eq. (2.59) is given by $\Gamma^\alpha_{00} = \partial_\alpha\varphi$ which
leads to
$$\frac{d^2 X^\alpha}{dt^2} = -c^2\,\partial_\alpha\varphi,$$
which is consistent with Eq. (2.52). In turn, this establishes that the metric
defined by Eq. (2.59) correctly reproduces Newton’s law, taking into account
our various approximations. Next, we must take a closer look at curvature
with the aim of getting the Poisson equation out of geometry. We can see that
φ is the Newtonian gravitational potential up to an
additional factor containing the speed of light
$$\varphi_N = c^2\,\varphi, \qquad (2.66)$$
which is also in agreement with Eq. (2.58).

2.2.2. Curvature and the Poisson equation

Next we must address the question of finding the geometric analogue of the
Poisson equation. Let us begin by recalling the geodesic deviation equation
(1.213) given by

with the first two indices interchanged. Let us derive an analogous equation
in the setting of Newtonian mechanics in the presence of a gravitational field.
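We should begin with Euclidean space with Cartesian coordinates $X^\alpha = \{x, y, z\}$,
α = 1, 2, 3 and a family of curves $X^\alpha(t, s)$ where t is time along a curve
and s labels the curves. The tangent vector to any such curve is given by
$V^\alpha = dX^\alpha/dt$ for fixed value $s_0$. The vector connecting infinitesimally close curves
is $N^\alpha = dX^\alpha/ds$ for fixed times. The rate of change of $N^\alpha$ with respect to time
can be interpreted as the relative velocity between nearby curves and its
derivative as the relative acceleration. This is the quantity we wish to find, in
analogy to Eq. (1.209). First, we have
$$\frac{dN^\alpha}{dt} = \frac{d}{dt}\frac{dX^\alpha}{ds} = \frac{dV^\alpha}{ds},$$
and the relative acceleration is given by
$$\frac{d^2 N^\alpha}{dt^2} = \frac{d}{ds}\frac{d^2 X^\alpha}{dt^2} = -\frac{d}{ds}\,\partial^\alpha\varphi_N,$$
where we used Newton’s second law. Next, we apply the chain rule to the
last term
$$\frac{d}{ds}\,\partial^\alpha\varphi_N = \frac{dX^\beta}{ds}\,\partial_\beta\partial^\alpha\varphi_N = N^\beta\,\partial_\beta\partial^\alpha\varphi_N.$$
Therefore, we arrive at the Newtonian equivalent of the geodesic deviation
equation
$$\frac{d^2 N^\alpha}{dt^2} = -\left(\partial^\alpha\partial_\beta\varphi_N\right)N^\beta.$$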
At this point we can change the indices α, β to full spacetime indices a, b by
adding trivial components wherever necessary. This naturally leads to the
identification

In view of Eq. (2.8), we would like to sum over the indices a and i so that the
right-hand side becomes the Laplacian, provided that φ is time-independent.
Using leads to

This identification is not unexpected since the Riemann curvature tensor
contains the second partial derivatives of the metric and we already
established that the metric should contain the gravitational potential. Hence
Eq. (2.73) is consistent with our previous discussion which led to Eq. (2.58).
Consequently, in the absence of any gravitational fields, nearby geodesics
should experience no relative acceleration and therefore we can conjecture
that the vacuum field equations should take the form $R_{jc}T^j T^c = 0$ for all
tangent vectors $T^a$ which is equivalent to $R_{ij} = 0$. This leads to our final
identification

Surprisingly, these are the correct vacuum field equations of General
Relativity. For instance, the Minkowski metric (1.60) satisfies this equation.
In order to discuss the inclusion of matter, we need to recall our discussion of
matter tensors.

2.2.3. Field equations of General Relativity

By comparing the geodesic deviation equation in a Lorentzian space with its
equivalent in the Newtonian setting, we arrived at a sensible guess for the
form of the vacuum field equations of General Relativity. In our discussion of
matter tensors, we encountered the important energy–momentum
conservation equation $\nabla^a T_{ab} = 0$. It seems natural to place the
energy–momentum tensor onto the right-hand side of the Einstein field equations, in
analogy to Eq. (2.8), with an appropriate coupling constant which should also
contain Newton’s gravitational constant. However, this also suggests that the
left-hand side of the Einstein field equations cannot be based on the Ricci
tensor alone, since $\nabla^a R_{ab} \neq 0$ in general. Einstein indeed originally proposed the field
equations $R_{ab} = \kappa T_{ab}$; however, this was soon dismissed due to the mentioned
inconsistency. We require an object which contains the Ricci tensor and
whose covariant divergence vanishes. The key lies in the twice contracted
Bianchi identities, see Theorem 1.6. This shows that $G_{ab} = R_{ab} - R\,g_{ab}/2$ has
the desired property which justifies the name Einstein tensor for this object.
In November 1915, Einstein proposed the gravitational field equations
$$G_{ab} = R_{ab} - \frac{1}{2}R\,g_{ab} = \kappa\,T_{ab}, \qquad (2.75)$$
where κ is a coupling constant which needs to be related to Newton’s
gravitational constant.
Let us have a closer look at these field equations. Applying $g^{ij}$ to both
sides gives −R = κT where $T = g^{ij}T_{ij}$ is the trace of the energy–momentum
tensor, and we needed to use $g^{ij}g_{ij} = 4$. Substituting R = −κT into Eq. (2.75)
gives us an alternative form of the field equations, namely
$$R_{ij} = \kappa\left(T_{ij} - \frac{1}{2}T\,g_{ij}\right).$$
In the absence of matter $T_{ij} = 0$ and the vacuum field equations are simply
$R_{ij} = 0$. For this reason one often refers to metrics satisfying the vacuum field
equations as Ricci flat. Note that this does not imply that the Riemann
curvature vanishes. In fact, there are many known Ricci flat metrics with a
singular Riemann tensor, one of which is the Schwarzschild metric which we
will study in detail in Secs. 3.2 and 3.6.
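
The step from the field equations (2.75) to their alternative form can be written out explicitly. The following worked equation is a sketch that uses only $g^{ij}g_{ij} = 4$ and the definitions above; it introduces no new symbols.

```latex
\begin{align*}
g^{ij}\left(R_{ij} - \tfrac{1}{2}R\,g_{ij}\right)
    &= R - \tfrac{1}{2}\cdot 4\,R = -R = \kappa\,T, \\
R_{ij} &= \kappa\,T_{ij} + \tfrac{1}{2}R\,g_{ij}
        = \kappa\left(T_{ij} - \tfrac{1}{2}T\,g_{ij}\right).
\end{align*}
```

Setting $T_{ij} = 0$, and hence T = 0, in the last line gives the vacuum equations $R_{ij} = 0$ directly.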
Since the Ricci tensor, the metric tensor and the energy–momentum
tensor are all symmetric tensors, the field equations (2.75) are a set of 10
nonlinear coupled partial differential equations in the components of the
metric $g_{ij}$; the sources are given by the energy–momentum tensor. In n
dimensions we would have n(n + 1)/2 equations; however, all 2D metrics
satisfy $G_{ab} = 0$. It should be noted that we can make arbitrary coordinate
transformations which can change the metric; since there are four coordinates
we expect the number of independent equations to be reduced by four.
Therefore, we are led to believe that the Einstein field equations are six
independent equations plus four gauge fixing degrees of freedom.
Shortly after the formulation of the field equations (2.75) Einstein was
interested in a particular solution, now known as the Einstein static universe,
see Sec. 4.2.3. It is a cosmological solution of the field equations which
corresponds to a universe without any dynamics. It turns out that the field
equations do not allow for such a solution if the source is a perfect fluid. This
motivated the introduction of the so-called cosmological constant Λ into the
field equations which then take the form
$$G_{ab} + \Lambda\,g_{ab} = \kappa\,T_{ab}. \qquad (2.77)$$
These field equations allow for such a static solution. One verifies that the
left-hand side of the cosmological field equations still has vanishing
covariant divergence since Λ is assumed to be a constant and the metric has
vanishing covariant derivative. Sometimes Eqs. (2.77) are referred to as the
cosmological Einstein field equations. The physical units of the cosmological
constant are $[\Lambda] = 1/\mathrm{m}^2$ so that $1/\sqrt{\Lambda}$ is a characteristic length scale. The
cosmological constant will be discussed in greater detail in Chap. 4.

2.2.4. The principle of minimal gravitational coupling

The main idea of special relativity was to reformulate physical theories in a
flat 4D spacetime. General Relativity, on the other hand, is based on a curved
manifold. Therefore, we need to address the question of how to make special
relativistic theories compatible with General Relativity.
One way to achieve this is by simply replacing the Minkowski metric by
a general metric, $\eta_{ij} \to g_{ij}$, and by replacing all partial derivatives with
covariant derivatives, $\partial_i \to \nabla_i$. Sometimes this is called the principle of
covariance. This also means that physical laws take the same form in all
coordinate systems which is also referred to as the general principle of
relativity. More technically speaking the theory is diffeomorphism invariant;
this means invariant under general coordinate transformations.
However, this procedure alone does not result in a unique theory. This
can be seen as follows. Assume we have a theory which contains a term of
the form $\partial_i\partial_j v^k$, then following the above rule would give $\nabla_i\nabla_j v^k$. On the
other hand, had we started with $\partial_j\partial_i v^k$ then we would arrive at $\nabla_j\nabla_i v^k$. In
Minkowski space our partial derivatives commute and there is no preferred
order; however, on arbitrary manifolds our two suggested terms differ exactly
by the Riemann curvature tensor, see Eq. (1.165).
This is particularly relevant for Maxwell’s equations, as we will see in the
following. Let us work with the 4-vector potential Ai in the Lorenz gauge
∇iAi = 0. Begin with Eq. (2.28) which gives

and we arrive at

On the other hand, we could have started with $\partial_j(\partial^i A^j - \partial^j A^i) = -\partial_j\partial^j A^i$,
using the Lorenz gauge condition. Then, replacing partial with covariant
derivatives would have given

without the Ricci tensor part. Equations (2.79) and (2.80) are different which
shows that the principle of minimal coupling is not unique. Some additional
input is needed in order to determine which version of Maxwell’s equations
should be used. In typical laboratory settings, the gravitational fields are
weak and we will not be able to distinguish the two possible theories.
However, the continuity equation of the 4-vector current becomes ∇iJi = 0,
using the principle of minimal coupling. Therefore, we can check the
consistency of the proposed equations with the charge conservation equation
which we assume to hold. One needs to apply ∇i to (2.79) and (2.80), and in
both cases one needs to interchange the order of the covariant derivatives,
hereby introducing curvature terms.
We have

where we again used the Lorenz gauge condition.


The covariant divergence of Eq. (2.80) is not identically zero due to the
presence of the Ricci tensor term in Eq. (2.81). On the other hand, this is
precisely the extra term in Eq. (2.79). We can therefore conclude that Eq.
(2.79) should be viewed as the correct Maxwell equation in curved spacetime
since this is consistent with the charge conservation equation.
For matter fields describing particles with half integer spin like electrons,
for instance, the situation is even more complicated. Depending on how a
theory of gravity is constructed, one can arrive at two different theories. Both
theories agree in the weak gravity limit and both predict almost the same
physics. The only difference between these two theories is the treatment of
spin and its coupling to gravity.

2.3. Weak Gravity

One of the main aims of the previous section was to motivate the correct
form of the Einstein field equations as much as possible.1 We will now do the
reverse, namely check whether the Einstein field equations correctly reduce
to the Newtonian equations when we assume slow motions and weak
gravitational fields.
2.3.1. Linearised Riemann and Ricci tensors

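We assume that the spacetime is described by a metric which is nearly the
Minkowski metric, with small perturbation due to gravity, hence we write our
metric as follows:
$$g_{ab} = \eta_{ab} + h_{ab},$$
where $\eta_{ab}$ is the Minkowski metric and $h_{ab}$ are the small deviations from this
flat space; they satisfy $|h_{ab}| \ll 1$.
The inverse metric is given by $g^{ab} = \eta^{ab} - h^{ab}$ which can be seen as
follows:
$$g^{ab}g_{bc} = \left(\eta^{ab} - h^{ab}\right)\left(\eta_{bc} + h_{bc}\right) = \delta^a_c + h^a{}_c - h^a{}_c + O(h^2) = \delta^a_c + O(h^2),$$
where we raise and lower the indices of $h_{ab}$ using the Minkowski metric.
To first order in $h_{ab}$ the Christoffel symbol is simply given by
$$\Gamma^a_{bc} = \frac{1}{2}\,\eta^{ad}\left(\partial_b h_{dc} + \partial_c h_{db} - \partial_d h_{bc}\right).$$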
Next, we wish to compute the Riemann curvature tensor, given by Eq.


(1.164). The terms containing the squares of the Christoffel symbols will be
of second order and only the derivative terms will contribute to the lowest-
order terms. A direct calculation gives

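For the Einstein field equations we need the Ricci tensor and the Ricci scalar
which we are computing next. Summing over the second and fourth index in
Eq. (2.85) gives
$$R_{ab} = \frac{1}{2}\left(\partial_a\partial^s h_{sb} + \partial_b\partial^s h_{sa} - \partial_a\partial_b h - \Box h_{ab}\right), \qquad (2.86)$$
where $h = h^s{}_s$ is the trace of $h_{ab}$. We should pay particular attention to the last
term which is $-\Box h_{ab}/2$. Here $\Box = \partial_s\partial^s = \eta^{ab}\partial_a\partial_b$ is the D’Alembertian or sometimes
called wave operator. In Minkowski space we simply have
$$\Box = -\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \Delta.$$
If φ is time-independent then $\Box\varphi = \Delta\varphi$ and we note that the Ricci tensor
indeed contains the Laplace operator, as expected from Eq. (2.73).
We can introduce the trace reversed tensor $\bar h_{ab} = h_{ab} - h\,\eta_{ab}/2$ whose trace
is given by $\bar h = -h$, as the name suggests. Using this tensor, the
Ricci tensor (2.86) becomes
$$R_{ab} = \frac{1}{2}\left(\partial_a\partial^s\bar h_{sb} + \partial_b\partial^s\bar h_{sa} - \Box\bar h_{ab} + \frac{1}{2}\,\eta_{ab}\,\Box\bar h\right).$$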
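Consequently, the Ricci scalar is given by
$$R = \partial^a\partial^b\bar h_{ab} + \frac{1}{2}\,\Box\bar h,$$
so that the Einstein tensor, in lowest order in $h_{ab}$, is given by
$$G_{ab} = \frac{1}{2}\left(\partial_a\partial^s\bar h_{sb} + \partial_b\partial^s\bar h_{sa} - \Box\bar h_{ab} - \eta_{ab}\,\partial^c\partial^d\bar h_{cd}\right). \qquad (2.90)$$
This Einstein tensor (2.90) would simplify considerably if we could make
terms of the form $\partial^s\bar h_{sa}$ disappear. This looks very similar to the Lorenz gauge
discussed previously in electromagnetism, so we need to investigate whether
a particular coordinate system exists for which these terms do indeed vanish.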
2.3.2. Gauge transformations

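Let us start with a small coordinate transformation
$$X'^a = X^a + \xi^a, \qquad (2.91)$$
where the vector $\xi^a$ is assumed to be small at all points. Then
$$\frac{\partial X'^a}{\partial X^b} = \delta^a_b + \partial_b\xi^a,$$
and hence we have, to first order in $\xi^a$,
$$\frac{\partial X^a}{\partial X'^b} = \delta^a_b - \partial_b\xi^a.$$
Applying the tensor transformation rule (1.34) to the metric tensor $g_{ab} = \eta_{ab} + h_{ab}$
we find that
$$h'_{ab} = h_{ab} - \partial_a\xi_b - \partial_b\xi_a. \qquad (2.95)$$
We assumed that the $\xi^a$ are small and therefore $h'_{ab}$ is also small. The vector $\xi^a$
has 4 degrees of freedom and therefore we will be able to fix 4 values of
$h'_{ab}$. Let us raise both indices in Eq. (2.95) and also differentiate with respect
to $X'^a$. We arrive at
$$\partial_a h'^{ab} = \partial_a h^{ab} - \Box\xi^b - \partial^b\partial_a\xi^a. \qquad (2.96)$$
Using the trace reversed tensor, we have
$\bar h'^{ab} = \bar h^{ab} - \partial^a\xi^b - \partial^b\xi^a + \eta^{ab}\,\partial_c\xi^c$ and hence Eq.
(2.96) becomes
$$\partial_a\bar h'^{ab} = \partial_a\bar h^{ab} - \Box\xi^b. \qquad (2.97)$$
Therefore, we can find a coordinate system where $\partial_a\bar h'^{ab} = 0$ by choosing
$\Box\xi^b = \partial_a\bar h^{ab}$. This means we have an equivalent of the Lorenz gauge in
General Relativity.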

2.3.3. Linearised Einstein field equations

Finally, we are able to state the Einstein field equations linear in the metric
perturbation. Working in the equivalent of the Lorenz gauge we find
$$-\frac{1}{2}\,\Box\bar h_{ab} = \kappa\,T_{ab}. \qquad (2.98)$$
Our aim is to recover the Newtonian equations of gravity, and the simplest
way to achieve this is to assume a static gravitational field where all
quantities are time independent. We should also note that the linearised
Einstein field equations are the starting point to the study of gravitational
waves, another important prediction of General Relativity, which we briefly
introduce in Sec. 2.3.4.
For the matter tensor we assume a perfect fluid given by Eq. (2.33). For
non-relativistic matter $\rho c^2 \gg p$ and so the only non-vanishing component of
the energy–momentum tensor is $T_{00} = \rho c^2$. Therefore, the linearised Einstein
field equations (2.98) reduce to the single equation
$$-\frac{1}{2}\,\Box\bar h_{00} = \kappa\,\rho c^2.$$
For static fields $\Box = \Delta$, with Δ being the Laplacian.
Consequently, we arrive at an equation similar to the Newtonian one,
$\Delta\varphi_N = 4\pi G\rho$, which reads
$$\Delta\bar h_{00} = -2\kappa\,\rho c^2.$$
Taking into account the Newtonian equation (2.8) we are led to make the
identification
$$\bar h_{00} = -\frac{\kappa c^2}{2\pi G}\,\varphi_N.$$
Reverting back to the unbarred quantity, we have
$$h_{00} = h_{11} = h_{22} = h_{33} = \frac{1}{2}\,\bar h_{00} = -\frac{\kappa c^2}{4\pi G}\,\varphi_N,$$
and therefore, the solution of the linearised field equations yields the metric
$$ds^2 = -\left(1 + \frac{\kappa c^2}{4\pi G}\,\varphi_N\right)c^2 dt^2 + \left(1 - \frac{\kappa c^2}{4\pi G}\,\varphi_N\right)\left(dx^2 + dy^2 + dz^2\right).$$
Comparison of this line element with Eqs. (2.59) and (2.66) implies that we
must have
$$\frac{\kappa c^2}{4\pi G}\,\varphi_N = 2\varphi.$$
This is a crucial relation. It fixes the coupling constant κ such that the theory
reproduces correctly the Newtonian limit. Recall Eq. (2.66) which states
$\varphi_N = \varphi c^2$ and implies the following choice for a consistent theory
$$\kappa = \frac{8\pi G}{c^4},$$
which is the main result of this calculation. In summary, we can now state the
Einstein field equations again, with correct coupling constant
$$G_{ab} = R_{ab} - \frac{1}{2}R\,g_{ab} = \frac{8\pi G}{c^4}\,T_{ab}.$$
For most practical purposes it is convenient to set G = c = 1, a convention
used consistently in theoretical physics. The main reason is that it can be
quite tricky to keep track of all the correct factors of G and in particular c in
long calculations and it is often easier to reinsert these quantities at the end.
In what follows we will stick to this convention and only re-introduce
physical constants where needed.
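
To get a feeling for the numbers before G and c are set to one, the coupling constant and the size of the dimensionless potential can be evaluated in SI units. The sketch below is illustrative: it assumes the point-mass Newtonian potential $\varphi_N = -GM/r$, and the values for the Earth's mass and radius are approximate figures supplied here, not taken from the text. The function name `dimensionless_potential` is hypothetical.

```python
import math

G = 6.674e-11          # Newton's constant in m^3 kg^-1 s^-2 (approximate)
C = 299_792_458.0      # speed of light in m/s

# Coupling constant of the field equations, kappa = 8 pi G / c^4
kappa = 8.0 * math.pi * G / C ** 4


def dimensionless_potential(mass_kg, radius_m):
    """phi = phi_N / c^2 with the point-mass potential phi_N = -G M / r."""
    return -G * mass_kg / (radius_m * C ** 2)


# Approximate values for the Earth (assumed for illustration)
EARTH_MASS = 5.972e24    # kg
EARTH_RADIUS = 6.371e6   # m

phi_earth = dimensionless_potential(EARTH_MASS, EARTH_RADIUS)
```

With these inputs κ is of order 10⁻⁴³ in SI units and |φ| at the Earth's surface is below 10⁻⁹, which illustrates why the weak-field assumption $|h_{ab}| \ll 1$ is so well satisfied in laboratory settings.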

2.3.4. Gravitational waves


We begin with the linearised Einstein field equations (2.98) in vacuum. To
simplify the following discussion, we will consider $\bar h_{ab}$ depending on time t
and the z-direction only. Under this assumption the linearised field equations
become, with c = 1,
$$\left(-\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial z^2}\right)\bar h_{ab} = 0,$$
which is a wave equation for the components $\bar h_{ab}$ and corresponds to plane
waves travelling along the z-direction. Similar wave equations describe the
propagation of electromagnetic radiation or the propagation of elastic waves.
We immediately note that the wave speed equals the speed of light, so we can
conclude that these gravitational waves travel at the speed of light.
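
That a profile depending only on t − z solves this wave equation can be checked with finite differences. The following is only a numerical sketch in units with c = 1; the cosine profile, the value ω = 2 and the function names `h` and `wave_residual` are assumptions made for illustration and are not the ansatz of the text.

```python
import math

OMEGA = 2.0  # angular frequency, chosen arbitrarily for the check


def h(t, z):
    """One component of a plane wave travelling in the positive z-direction."""
    return math.cos(OMEGA * (t - z))


def wave_residual(t, z, step=1e-3):
    """Central-difference estimate of (-d^2/dt^2 + d^2/dz^2) h at (t, z)."""
    d2t = (h(t + step, z) - 2.0 * h(t, z) + h(t - step, z)) / step ** 2
    d2z = (h(t, z + step) - 2.0 * h(t, z) + h(t, z - step)) / step ** 2
    return -d2t + d2z


residual = wave_residual(0.3, 0.7)
```

The residual vanishes up to rounding error, whereas each second derivative on its own is close to −ω² h.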
In order to determine the solution of this wave equation, we use the
ansatz

\[ \bar{h}_{ab} = A_{ab}\, e^{\mathrm{i}\omega (z - t)}, \tag{2.108} \]

where Aab is some tensorial amplitude. This corresponds to waves travelling
in the positive z-direction. One could also work with a plus sign, z + t, and
consider waves travelling in the negative z-direction. One can check directly
that Eq. (2.108) satisfies the linearised Einstein field equations.
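
The direct check mentioned above can also be done numerically. The sketch below is only an illustration: it assumes c = 1, takes a real cosine profile depending on z − t with an arbitrarily chosen frequency, and applies a finite-difference version of the operator −∂²/∂t² + ∂²/∂z². All function names are hypothetical.

```python
import math


def right_mover(t, z, omega=2.0):
    """Profile depending only on z - t (travels in the +z direction, speed 1)."""
    return math.cos(omega * (z - t))


def slow_mover(t, z, omega=2.0):
    """Profile travelling at half the speed of light; should NOT solve the equation."""
    return math.cos(omega * (z - 0.5 * t))


def wave_operator(f, t, z, eps=1e-3):
    """Central finite-difference approximation of (-d^2/dt^2 + d^2/dz^2) f."""
    d2t = (f(t + eps, z) - 2.0 * f(t, z) + f(t - eps, z)) / eps**2
    d2z = (f(t, z + eps) - 2.0 * f(t, z) + f(t, z - eps)) / eps**2
    return -d2t + d2z


if __name__ == "__main__":
    print(wave_operator(right_mover, 0.3, 0.7))  # numerically zero
    print(wave_operator(slow_mover, 0.3, 0.7))   # clearly non-zero
```

Only the profile moving with unit speed is annihilated by the operator, which is the statement that these waves travel at the speed of light.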
Our wave ansatz Eq. (2.108) contains the amplitude Aab which is a rank-2
symmetric tensor. In general, this has 10 independent components since we
work in four dimensions. Recall that we assume h̄_ab to satisfy the Lorenz
gauge ∂^a h̄_ab = 0, which will restrict the form of Aab. Let us briefly consider
the more general plane wave ansatz

\[ \bar{h}_{ab} = A_{ab} \exp\!\left(\mathrm{i}\, k_{i} x^{i}\right), \]

where xi = (t, x, y, z) and we introduced the wave (co)vector ki. Then, the
gauge condition becomes

\[ k^{a} A_{ab} = 0. \]

These are four equations in general, and hence Aab can have at most six
independent components. However, we can still make infinitesimal
coordinate transformations of the form (2.91) with □ξ^a = 0 which leave the
Lorenz gauge unchanged, see Eq. (2.97). Since ξ^a is a vector with 4 degrees
of freedom, we can eliminate another four components from Aab. Therefore,
Aab has two independent components which contain the physics of
gravitational waves. In Eq. (2.108) we chose ki = (−ω, 0, 0, ω).
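
The counting argument above is simple bookkeeping and can be written down for a general number of spacetime dimensions. The sketch below is an illustration only; the extension to n dimensions is an assumption that the same argument (one gauge condition and one residual gauge vector, each with n components) carries over unchanged.

```python
def symmetric_components(n):
    """Independent components of a symmetric rank-2 tensor in n dimensions."""
    return n * (n + 1) // 2


def wave_degrees_of_freedom(n):
    """Components left after the gauge condition (n equations)
    and the residual coordinate freedom (n functions) are used."""
    return symmetric_components(n) - n - n


if __name__ == "__main__":
    print(symmetric_components(4))     # 10
    print(wave_degrees_of_freedom(4))  # 2
```

In four dimensions this reproduces 10 − 4 − 4 = 2, the two polarisation states.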
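One can choose the ξ^a such that the gravitational wave solution travelling
in the positive z-direction is given by

\[ \bar{h}_{ab} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & A_{+} & A_{\times} & 0 \\ 0 & A_{\times} & -A_{+} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} e^{\mathrm{i}\omega (z - t)}. \]
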
This particular gauge is called the transverse-traceless gauge because the


wave travels along the z-direction and distorts objects in the x and y
directions, the transverse directions. Traceless refers to the fact that Aab has
zero trace.
The behaviour of freely falling nearby test particles under the influence of
the gravitational wave is determined by the geodesic deviation equation. In
Eq. (1.213) the tangent vector Ti now corresponds to the test particle’s 4-
velocity which we can choose to be Ti = (1, 0, 0, 0). Therefore, only the
components Ri00a of the Riemann curvature tensor enter the geodesic
deviation equation, and one finds in linear approximation

Fig. 2.3 Visualisation of the gravitational wave with polarisation A+.


The other components vanish. Therefore, the vector Ni connecting nearby
particles must have the form Ni = (0, N1, N2, 0), which means that this vector
is contained in the xy-plane, and this plane is perpendicular to the propagation
direction of the gravitational wave. The geodesic deviation equations
become

The solutions to these equations describe harmonic oscillations of some
vector (Ni)initial in the xy-plane. If we imagine a ring of test particles initially
at rest, then the gravitational wave passing through the plane will distort this
ring of particles to an elliptical shape. The lengths of the two semi-axes
change in time while, to first order in the wave amplitude, the area enclosed by
the particles remains constant; this is visualised in Fig. 2.3.
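
The distortion of the ring can be sketched numerically. The code below assumes the standard first-order result for the A+ polarisation, in which separations along x are stretched by the factor 1 + ½A+ cos(ωt) while those along y are squeezed by 1 − ½A+ cos(ωt). Sign and phase conventions may differ from the equations of this section, and the function names are illustrative.

```python
import math


def deform_ring(a_plus, phase, n=8):
    """Positions of n particles, initially on the unit circle, at wave phase omega*t."""
    s = 0.5 * a_plus * math.cos(phase)
    points = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        points.append(((1.0 + s) * math.cos(angle), (1.0 - s) * math.sin(angle)))
    return points


def enclosed_area(a_plus, phase):
    """Area of the ellipse with semi-axes (1 + s) and (1 - s)."""
    s = 0.5 * a_plus * math.cos(phase)
    return math.pi * (1.0 + s) * (1.0 - s)


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2.0, math.pi):
        print(deform_ring(0.2, phase)[0], enclosed_area(0.2, phase) / math.pi)
```

The area differs from π only at second order in the amplitude, which is why it is constant within the linear approximation.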
We should also note that the waves with A+ = 0 and those with A× = 0
correspond to two different polarisations of the waves; these two states are
related by a π/4 or 45° rotation. This is in contrast to electromagnetic waves
where the two polarisations are rotated by 90°. The first direct detection of
gravitational waves, emitted by the inspiral and merger of a pair of black
holes, was reported by Abbott et al. (2016). The first
indirect observation of gravitational waves was due to Hulse and Taylor, who
studied the orbital decay of a binary pulsar system which loses energy in
the form of gravitational waves.

2.4. Variational Approach to General Relativity

It turns out that physical theories can generally be formulated using a
variational approach. This means that one states the action or Lagrangian and
derives the equations governing the theory. These equations are the Euler–
Lagrange equations coming from the action; we briefly discussed this
framework in Sec. 1.4. In classical mechanics the interpretation of the
Lagrangian is very clear as it is the difference between kinetic and potential
energy. In General Relativity, on the other hand, it is difficult to define the
concept of kinetic and potential energy and so it is not clear which form the
Lagrangian should take so that variational calculus will result in the Einstein
field equations.
In 1915, Hilbert noted that the action

\[ S_{\rm EH} = \frac{1}{2\kappa} \int R\, \sqrt{-g}\; d^{4}x \tag{2.116} \]

yields the Einstein field equations when variations with respect to the metric
tensor gij are considered. For this reason action (2.116) is called the Einstein–
Hilbert action. The term √−g d⁴x is nothing but the proper 4-dimensional
volume on the manifold, see Eq. (1.57). Recall that g stands for the
determinant of the metric tensor, and κ = 8πG/c⁴. The presence of κ can be
explained by checking the dimensions of the quantities involved. The Ricci
scalar has dimensions of inverse length squared, 1/m², g = det(gij) is
dimensionless and d⁴x has units of volume multiplied by time, m³ × s.
Therefore, the quantity R √−g d⁴x has dimensions m × s. The coupling
constant κ has units of s²/(kg m), which means that the Einstein–Hilbert action
has the correct dimensions of energy times time.
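
The dimensional argument can be replayed mechanically by tracking exponents of metre, kilogram and second. The following sketch is an illustration with hypothetical helper names; each quantity is represented by a tuple of exponents (m, kg, s).

```python
def times(u, v):
    """Units of a product: add the exponents."""
    return tuple(a + b for a, b in zip(u, v))


def power(u, n):
    """Units of a quantity raised to the integer power n."""
    return tuple(a * n for a in u)


# Exponents of (metre, kilogram, second).
NEWTON_G = (3, -1, -2)     # m^3 kg^-1 s^-2
SPEED_C = (1, 0, -1)       # m s^-1
RICCI_SCALAR = (-2, 0, 0)  # 1/m^2
D4X = (3, 0, 1)            # m^3 s, as stated in the text
JOULE = (2, 1, -2)         # kg m^2 s^-2
SECOND = (0, 0, 1)

KAPPA = times(NEWTON_G, power(SPEED_C, -4))           # units of 8*pi*G/c^4
INTEGRAND = times(RICCI_SCALAR, D4X)                  # units of R sqrt(-g) d^4x
ACTION = times(INTEGRAND, power(KAPPA, -1))           # units of the action
ENERGY_TIMES_TIME = times(JOULE, SECOND)

if __name__ == "__main__":
    print(KAPPA)      # (-1, -1, 2), i.e. s^2 / (kg m)
    print(INTEGRAND)  # (1, 0, 1), i.e. m s
    print(ACTION == ENERGY_TIMES_TIME)
```

The numerical factors 8π and 1/2 are dimensionless and therefore play no role in this check.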
One way to motivate the Ricci scalar R in this action is that it is the
simplest curvature scalar which can be constructed from the Riemann
curvature tensor. This single observation immediately allows us to propose
different theories which could be based on the square of the Ricci tensor or
the squared Riemann tensor.
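We recall the equation for the derivative of the determinant of the metric
(1.99). Since variations and partial derivatives are closely related, we can
write

\[ \delta g = g\, g^{ab}\, \delta g_{ab} = -g\, g_{ab}\, \delta g^{ab}, \tag{2.117} \]

where we took into account that the metric is symmetric, and hence we arrive
at

\[ \delta \sqrt{-g} = -\frac{1}{2} \sqrt{-g}\; g_{ab}\, \delta g^{ab}. \tag{2.118} \]
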
Next, we need to compute the variation of the Ricci scalar with respect to the
metric. We begin by noting that the Ricci scalar is given by gab Rab and that
Rab can be expressed solely in terms of the Christoffel symbols. Let us make
a small change in the Christoffel symbol, a direct calculation using Eqs.
(1.188) and (1.164) shows that we can write

While the Christoffel symbol is not a tensor, the object δΓ is in fact a tensor
as it is the difference between two connections. We used this argument before
when discussing the torsion tensor at the end of Sec. 1.3.1. Inspection of the
right-hand side of (2.119) shows that it can be written as the difference of two
covariant derivatives

Putting these calculations together, we can begin calculating the variations of


the Einstein–Hilbert action with respect to the metric

where we can already recognise the left-hand side (or Einstein tensor) of the
Einstein field equations. In the following we will show that the remaining
term is in fact a total derivative term which will not contribute to the field
equations, analogously to Sec. 1.4 we keep the end points fixed. Using Eq.
(2.120), the last term of (2.121) is

In Exercise 1.12, we showed the identity √−g ∇_a V^a = ∂_a(√−g V^a), which we
can now apply to both terms in the integrand and arrive at

Now, we can clearly see that both terms are total derivatives and therefore
these will not contribute to the equations of motion. Therefore, we have the
desired result

In order to derive the right-hand side of the Einstein field equations, we need
to add an action describing the matter. Formally
we write

which means we can write the variations as

where we assume that the matter Lagrangian depends on some fields ψ and
their first partial derivatives. We define the energy–momentum tensor by

so that we can now formulate the complete field equations using the
variational approach. We consider the total action Stotal = SEH + Smatter, its
variations with respect to the metric are given by

Requiring δStotal = 0 yields the Einstein field equations

\[ G_{ab} = \kappa\, T_{ab}. \]

Additionally, we can consider the variation of Stotal with respect to the matter
fields, as in Eq. (1.229), which gives the equations of motion of the matter.
Recall that these equations are not independent of the Einstein equations
since the twice contracted Bianchi identities imply ∇aTab = 0.

2.5. Further Reading

General Relativity

Early work in General Relativity focused on finding exact solutions of


physical importance to the Einstein field equations. This research moved on
towards a systematic study of exact solutions taking into account various
symmetry properties which can be imposed on the metric which in turn
simplify the field equations. This has been a prosperous activity for many
decades that led to the discovery of a vast number of exact solutions, many of
which are collected in Stephani et al. (2003). The authors of this book state
that they collected a total of over 6,000 references (up to 1999) which dealt
with exact solutions. New exact solutions of the Einstein field equations, with
and without matter sources, are still being discovered.
Substantial progress in General Relativity emerged from a detailed
treatment of the mathematical aspects of the field equations. In particular the
so-called initial value formulation was an important achievement in this
direction. The interested reader should refer to Part II of the book by Wald
(1984) which discusses various advanced topics of General Relativity which
are beyond an introductory course. Also highly recommended are the books
by Weinberg (1972) and Hawking and Ellis (1973), and the very
comprehensive text by Misner et al. (1973), its 1,200 pages can be quite
overwhelming though! Gravitational waves are discussed in detail in the
book by Maggiore (2007). A more recent book which emphasises the many
exciting mathematical aspects of General Relativity is by Choquet-Bruhat
(2008). General Relativity was formulated over 100 years ago; a
centennial perspective of this subject discussing the many current research
activities is by Ashtekar et al. (2015).
Another exciting aspect of General Relativity since its formulation are the
many attempts to modify and extend the original theory. Einstein himself
actively contributed to this field, and so did many others. In the subsequent
text there will be some brief discussions about some of the various extensions
and modifications of Einstein’s original theory. The principal aim is to
highlight the conceptual framework that was established to formulate General
Relativity which itself motivates further studies beyond the actual theory.
More geometry — torsion and non-metricity. We start by recalling the
definitions of torsion and non-metricity. We can define torsion to be the
skew-symmetric part of the connection. We have ∇a∇bf = ∂a∂bf – Γabi∂if

and therefore we can simply write

The object of non-metricity is defined by . As in Theorem 1.1 we


can find the connection and express it in terms of the Christoffel symbol, the
torsion tensor and the non-metricity. A direct calculation gives

This general connection is the basis for so-called metric affine gauge theories
of gravity which have been researched in great detail. The use of gauge
symmetries in gravity is natural in the sense that the other three fundamental
forces are formulated as gauge field theories. A good introductory textbook
into this subject is by Blagojevic (2001) which also covers the basic ideas of
other modifications. Those who wish to read selected original papers with
commentaries, putting those papers into the context of the entire research
field, are well advised to consult Blagojevic and Hehl (2012). In this
context, it is also worth pointing out that there exists an equivalent formulation
of General Relativity based on the torsion tensor rather than the metric. The
interested reader is referred to the article by Maluf (2013) and the book by
Aldrovandi and Pereira (2013).
More dimensions — Kaluza–Klein and beyond. Instead of considering
spaces with torsion and non-metricity, one could also consider a manifold
with metric compatible connection without torsion but with more than four
spacetime dimensions. The first such attempt goes back to Kaluza and Klein
in the 1920s, see again Blagojevic (2001, Chap. 10). The basic idea is quite
simple. Let us consider a 5D manifold, then the metric will have 15
independent components, five more than the 4D metric. One can now
conjecture that the electromagnetic 4-vector potential Ai and an additional
scalar field are the new degrees of freedom introduced in this theory. Since
we only observe three spatial dimensions and one time dimension, one has to
introduce an additional ingredient to reconcile this with a 5D theory. One
generally assumes this extra dimension to be very small. However, there is no
observational evidence for the existence of extra dimensions. Thoughts along
these lines have also motivated additional research by further increasing the
number of dimensions. For instance, String Theory is formulated in a 26D
space which reduces to 10 dimensions when super-symmetry is taken into
account, a good introductory textbook is by Zwiebach (2009). Super-
symmetry, super-gravity and String Theory are also discussed in Blagojevic
(2001, Chaps. 9 and 11).
More curvature — higher-order theories. When discussing the
Einstein–Hilbert action in the variational approach to General Relativity, we
noted that the Ricci scalar R is the simplest possible curvature scalar to be
used in a Lagrangian. This motivates the study of theories which contain
higher order curvature terms in the Lagrangian. One model which has
received considerable attention during the previous decade is based on the
idea of replacing the Ricci scalar by an arbitrary function of the Ricci scalar
in the Einstein–Hilbert action. This function is normally denoted by f(R) and
therefore one speaks of f(R)-gravity, see Felice and Tsujikawa (2010);
Sotiriou and Faraoni (2010) for two reviews on the subject.
The recommended literature is by no means complete, it simply reflects
the author’s suggestions for further reading.

2.6. Exercises

Some physics background

Exercise 2.1. Derive the pressure distribution inside a Newtonian star with
constant density. Can this star be arbitrarily compact? Use M/R as a measure
of compactness.

Exercise 2.2. Show that Eq. (2.27) is equivalent to Eq. (2.23).

Exercise 2.3 (annoying). The positioning of the factors of c in Sec. 2.1.4 is


quite subtle and depends on various choices made. The crucial one being that
we worked with coordinates Xi = (ct, x, y, z) so that our four coordinates have
the same units of length, then the Minkowski metric is ηij = diag(−1, 1, 1, 1).
Let us now insist on coordinates Xi = (t, x, y, z) and ηij = diag(−1, 1, 1, 1).
How does Eq. (2.34) have to be re-written and what is ui in the rest frame and
in general?

Exercise 2.4. Consider Lorentz boosts along the x-direction. The Lorentz
transformation can be thought of as a hyperbolic rotation in the sense that the
transformation can be written as

\[ ct' = ct \cosh\zeta - x \sinh\zeta, \tag{2.133} \]
\[ x' = -ct \sinh\zeta + x \cosh\zeta, \tag{2.134} \]

with y' = y and z' = z. Find the matrix form of Lab and show that det(Lab) = 1.
Next, show that the Minkowski metric is indeed invariant under this
transformation. Find the relationships between β, γ and v, and the parameter ζ
which is called rapidity. Finally, determine the angle α from Fig. 2.1 in terms
of the rapidity.

Exercise 2.5. Follow on from Exercise 2.4. For simplicity we will neglect the
y and z directions and work in the t, x plane only. Instead of working with the
time coordinate ct we will now use the imaginary time coordinate ict. Let
R(θ) be a 2D rotation matrix, show that

corresponds to a Lorentz transformation that is compatible with Eqs. (2.133)


and (2.134).

Exercise 2.6 (takes time). Consider two Lorentz boosts along the x-direction
with respective velocities v1 and v2, and rapidities ζ1 and ζ2. Show that the
rapidity of the overall boost ζ (with velocity v) is simply their sum ζ = ζ1 + ζ2.
Use this to derive the velocity addition formula

\[ v = \frac{v_{1} + v_{2}}{1 + v_{1} v_{2}/c^{2}}. \]

Exercise 2.7. Compute the components of the electromagnetic energy–
momentum tensor (2.50) and express the result in terms of the electric field E
and the magnetic field B. Which components of Tij are related to the Poynting
vector? In physics, the Poynting vector represents the
directional energy flux density (the rate of energy transfer per unit area) of an
electromagnetic field.

Exercise 2.8. The action of electromagnetism with source term is given by

with Fab = ∂aAb−∂bAa. The geometrical setting is Minkowski space. Derive


the Maxwell equations (2.28) by computing variations with respect to Ai and
determine α.

Geometry and gravity

Exercise 2.9. Show that the vacuum (Tij = 0) Einstein field equations imply
that the Ricci scalar vanishes.

Exercise 2.10. Show that Eq. (2.77) implies the energy–momentum


conservation equation ∇iTij = 0.

Exercise 2.11. Find the equivalent to Eq. (2.76) in the presence of the
cosmological constant.

Exercise 2.12. Let Fij be a skew-symmetric tensor Fij = −Fji. Show that the
equation ∇iFij = 0 implies a conservation law. This means it can be written
as ∂iJi = 0 for a suitably chosen Ji.

Exercise 2.13. The quantity 2GM/(c2r) in Eq. (2.58) is dimensionless. Let α


be some number of order one, the combination αΛr2 is also dimensionless.
Suggest the form of the metric with cosmological term and argue for the form
of Newton’s force law with Λ. Determine a simple upper bound on Λ using
the Solar System.

Weak gravity
Exercise 2.14. Show that the definition for the trace reversed tensor
h̄_ab = h_ab − ½ η_ab h implies the relation h̄ = −h.

Exercise 2.15. Derive the transformation property (2.97) for h̄_ab under small
coordinate transformations directly from the transformation property of hab
beginning with Eq. (2.91).
Variational Approach to General Relativity

Exercise 2.16. Before Eq. (2.117) we wrote ‘variations and partial derivatives
are closely related’. Consider a sufficiently smooth function f(x) depending
on the independent variable x. Write the first few terms of the Taylor series
of f(x + δx), assuming that δx ≪ 1. Think of δx as the small quantity h used in
the definition of the derivative. Now introduce the notation δf for the
difference between the changed and unchanged quantity, δf = f(x + δx) − f(x),
and rewrite the expression for the Taylor series. Establish the relationship
between δf/δx and the derivative df/dx.

Exercise 2.17 (hard). Starting with

derive the energy–momentum tensor of the electromagnetic field in General


Relativity using the calculus of variations.

Exercise 2.18. Consider a gravitational action that depends not just on the
Ricci scalar but also the scalar RijRij. Argue that actions of this type yield
theories that contain derivatives of higher than second order.

_______________________
1On a more philosophical note, one cannot derive any physical theory from scratch as ultimately
experiments verify or falsify theories. We model nature as best as we can.
3
Schwarzschild Solutions

In this chapter, we discuss the famous Schwarzschild solution. It was
discovered shortly after the formulation of General Relativity and was the
first non-trivial solution of the Einstein field equations. The key idea was to
assume a static and spherically symmetric space and study the field equations
using these assumptions. It turns out that the field equations simplify
considerably, and we will be able to find explicit solutions with and without
matter.

3.1. Spherical Symmetry and Birkhoff’s Theorem

Any symmetry of a manifold can be rigorously defined in differential
geometry; however, it is possible to argue for the correct form of the metric
differently. Let us begin with the standard spherical polar coordinates and
compute the line element ds² = dx² + dy² + dz² using these coordinates. A direct
calculation (see Example 1.4 in Sec. 1.2.4) gave

\[ ds^{2} = dr^{2} + r^{2} d\theta^{2} + r^{2} \sin^{2}\theta\, d\phi^{2}. \tag{3.1} \]

We can now argue that any spherically symmetric metric in General
Relativity must contain the line element of the surface of the sphere. One
often writes dΩ² = dθ² + sin²θ dϕ² for this line element of the sphere. Hence,
choosing coordinates Xi = {t, r, θ, ϕ}, we should be able to write any such
metric in the form

\[ ds^{2} = -e^{A(t,r)} dt^{2} + e^{B(t,r)} dr^{2} + r^{2} d\Omega^{2}. \tag{3.2} \]

The functions A(t, r) and B(t, r) are determined by solving the Einstein field
equations and they depend on the choice of matter. The use of the
exponential function is purely for convenience, it simplifies the subsequent
field equations and makes them more manageable.
Let us now consider the exterior gravitational field of a spherically
symmetric star for instance. Outside the star, we have a vacuum and therefore
Tab = 0, and the Einstein field equations reduce to Gab = 0. Computing the
Einstein tensor components explicitly is rather involved. First, one uses the
metric (3.2) to find all the nonvanishing Christoffel symbol components.
These can then be used to compute the Riemann tensor components which in
turn lead to the Ricci tensor components. Finally, one can calculate the
components of the Einstein tensor.
First, we consider the off-diagonal equation G01 = 0 which is given by

\[ G_{01} = \frac{\dot{B}}{r} = 0, \tag{3.3} \]

where the dot means differentiation with respect to time t. Therefore, the
function B must be independent of time and hence a function of r only. Next
we consider the combination exp(−A + B)G00 + G11 = 0 which gives

\[ \frac{A' + B'}{r} = 0, \tag{3.4} \]

which implies that A′ = −B′ where the prime means differentiation with
respect to r. Since B is a function of r only, it follows that A′ is a function
of r only; any remaining additive function of t in A can be absorbed by
redefining the time coordinate, so that A can also be taken to be a
function of r only. Therefore, both functions are independent of time and the
vacuum spacetime is static. What we have just shown is part of what is
known as Birkhoff’s theorem which states: Every spherically symmetric
solution of the vacuum field equations is static and asymptotically flat. The
unique exterior solution is the Schwarzschild solution. (We will show the rest
of this statement shortly.) This means that the assumption of spherical
symmetry implies staticity in the vacuum region which has some important
implications.
Let us consider a pulsating (spherically symmetric) star, by this we mean
an object whose radius changes over time. We know that the exterior
gravitational field of this star is static and therefore such a pulsating object
cannot create any gravitational waves. Even the formation of a black hole
would not generate any gravitational radiation provided that spherical
symmetry is perfectly maintained. This also tells us that a good astrophysical
candidate for the emission of gravitational radiation would be a binary
system, two objects orbiting rapidly around their centre of mass.

3.2. The Schwarzschild Solution

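We showed that the most general spherically symmetric metric in vacuum is
of the form

\[ ds^{2} = -e^{A(r)} dt^{2} + e^{B(r)} dr^{2} + r^{2} d\Omega^{2}, \tag{3.5} \]

where our coordinates are as before Xi = {t, r, θ, ϕ}. The Einstein tensor
components for this metric are given by

\[ G_{00} = \frac{e^{A-B}}{r^{2}} \left( r B' + e^{B} - 1 \right), \tag{3.6} \]
\[ G_{11} = \frac{1}{r^{2}} \left( r A' - e^{B} + 1 \right), \tag{3.7} \]
\[ G_{22} = \frac{r^{2} e^{-B}}{4} \left( 2A'' + A'^{2} - A'B' + \frac{2(A' - B')}{r} \right), \tag{3.8} \]
\[ G_{33} = \sin^{2}\theta\; G_{22}. \tag{3.9} \]
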
Our aim is to solve the vacuum field equations Gab = 0. It seems that Eqs.
(3.6)–(3.9) are three equations for the two unknown functions A and B.
However, recall the twice contracted Bianchi identity, Theorem 1.6, which
states that the Einstein tensor has to satisfy additional equations. This implies
that the equation G22 = 0 can in fact be derived from the other two field
equations, and hence it suffices to consider Eqs. (3.6) and (3.7). The
equations Gab = 0 are now equivalent to

\[ r B' + e^{B} - 1 = 0, \tag{3.10} \]
\[ r A' - e^{B} + 1 = 0, \tag{3.11} \]

and it is not difficult to solve these equations.


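We start by noting that Eq. (3.10) can be written as follows:

\[ \left( r e^{-B} \right)' = 1, \tag{3.12} \]

and therefore Eq. (3.10) can be integrated to give

\[ e^{-B} = 1 - \frac{C}{r}, \tag{3.13} \]

where C is a constant of integration. Addition of Eqs. (3.10) and (3.11) gives
r(A′ + B′) = 0 from which we conclude A = −B. Note that there is a second
constant of integration; however, this can always be set to one by rescaling
the time coordinate. Therefore, we arrive at

\[ e^{A} = 1 - \frac{C}{r}, \tag{3.14} \]

which implies that the metric is of the form

\[ ds^{2} = -\left( 1 - \frac{C}{r} \right) dt^{2} + \left( 1 - \frac{C}{r} \right)^{-1} dr^{2} + r^{2} d\Omega^{2}. \tag{3.15} \]
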
This metric depends on the constant C for which we need to find a physical
interpretation. Recall that the line element Eq. (3.15) describes the exterior
gravitational field of any spherically symmetric source. In Newtonian gravity
this field is uniquely characterised by the mass M of the object and so we
suspect C to be related to the mass.
One neat way of arriving at the correct interpretation of C is to re-do the
above calculation with Newton’s constant G and the speed of light c. In this
case our metric Eq. (3.5) should be written in the form

\[ ds^{2} = -e^{A(r)} c^{2} dt^{2} + e^{B(r)} dr^{2} + r^{2} d\Omega^{2}, \tag{3.16} \]

and the constant of integration C we would write as GC/c² so that this C
would have dimensions of mass. As before we find

\[ e^{A} = e^{-B} = 1 - \frac{G C}{c^{2} r}. \tag{3.17} \]

However, let us now compute the Christoffel symbol components for Eq.
(3.16) and consider the limit c → ∞ which corresponds to the Newtonian
limit. It turns out there is only one non-vanishing component which depends
on the constant C. One finds

which is part of Newton’s force law provided we choose 2M = C, compare
with Eq. (2.4). The other non-vanishing Christoffel symbol components are
due to our choice of coordinates. Alternatively, we can recall Eqs. (2.59) and
(2.66) which would result in the same conclusion. With this identification
made, we can now state the famous Schwarzschild solution

\[ ds^{2} = -\left( 1 - \frac{2M}{r} \right) dt^{2} + \left( 1 - \frac{2M}{r} \right)^{-1} dr^{2} + r^{2} d\Omega^{2}. \tag{3.19} \]

One of the first observations is that this metric is singular at r = 2M and at r =


0. At this point it is not clear whether these are coordinate singularities due to
our bad choice of coordinates or whether these are true spacetime
singularities where the curvature tensor components diverge. The vacuum
field equations can be written as Rij = 0, so the Ricci tensor vanishes
identically. However, this does not imply the vanishing of curvature as
expressed by the Riemann curvature tensor. Recall that the Riemann tensor
has 20 independent components in four dimensions while the Ricci tensor has
only 10, the other ‘half’ of the Riemann tensor is encoded in the Weyl tensor,
see Definition 1.22. In order to check for true spacetime singularities one has
to study the entire Riemann curvature tensor.
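
One standard way to probe the full Riemann tensor is through the curvature invariant K = R_abcd R^abcd, often called the Kretschmann scalar. For the Schwarzschild metric it takes the value K = 48M²/r⁶ (with G = c = 1); this result is quoted here without derivation and is not taken from the text above. The sketch simply evaluates it, with an illustrative function name.

```python
def kretschmann_schwarzschild(r, mass=1.0):
    """Curvature invariant K = 48 M^2 / r^6 of the Schwarzschild metric (G = c = 1)."""
    return 48.0 * mass**2 / r**6


if __name__ == "__main__":
    mass = 1.0
    for r in (10.0, 2.0 * mass, 0.1, 0.001):
        print(r, kretschmann_schwarzschild(r, mass))
```

The invariant is perfectly finite at r = 2M, where it equals 3/(4M⁴), but grows without bound as r → 0. This suggests that r = 2M is a coordinate singularity while r = 0 is a true curvature singularity.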
The Schwarzschild solution was the first non-trivial exact solution of the
Einstein field equations. It was found by Karl Schwarzschild in 1916 less
than a year after the Einstein field equations were formulated. This solution
has been extensively studied during the last century and it is fair to say that it
is still regarded as one of the most important solutions of the theory. It can be
used to derive predictions of General Relativity which can be tested in the
Solar System and these predictions are different to those from Newtonian
gravity. Observations are in excellent agreement with Einstein’s theory. The
Schwarzschild solution also is at the heart of black hole physics as it can be
interpreted as the exterior of a non-rotating black hole. In the following we
will look at many interesting aspects of the Schwarzschild solution.
The radius r = 2M where the metric is singular is often called the
Schwarzschild radius. Re-introducing physical constants, we define the
Schwarzschild radius by

\[ r_{S} = \frac{2 G M}{c^{2}}. \]

At this point it is useful to check the significance of this radius for an object
like the Sun. Taking the solar mass to be M⊙ = 1.99 × 10³⁰ kg, we find

\[ r_{S} = \frac{2 G M_{\odot}}{c^{2}} \approx 2.95\ \mathrm{km}. \]

The physical radius of the Sun is R⊙ = 6.955 × 10⁵ km. Hence, the
Schwarzschild radius of the Sun is much smaller than the actual radius of the
Sun. The Schwarzschild solution describes only the exterior gravitational
field of the Sun starting from its boundary, so the Schwarzschild radius has
no physical meaning in this context. The interior of the Sun is described by a
solution to the field equations with matter, we discuss one such solution in
the following section.
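
The arithmetic behind this comparison is easy to reproduce. The sketch below uses the solar mass and radius quoted above together with rounded SI values for G and c, which are assumptions of this example; the function name is illustrative.

```python
G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8      # m s^-1 (rounded)


def schwarzschild_radius(mass_kg):
    """Return 2 G M / c^2 in metres."""
    return 2.0 * G_NEWTON * mass_kg / C_LIGHT**2


if __name__ == "__main__":
    m_sun = 1.99e30    # kg
    r_sun = 6.955e8    # m
    r_s = schwarzschild_radius(m_sun)
    print(r_s / 1000.0)   # Schwarzschild radius in km, close to 3 km
    print(r_s / r_sun)    # a few parts in a million
```

The ratio of the two radii is of order 10⁻⁶, which quantifies how far the Sun is from being a compact object.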
Let us briefly think about a very massive collapsing star. It is possible for
the gravitational force to be stronger than any of the other forces which
would prevent this collapse. In such a situation the collapse continues beyond
the Schwarzschild radius and the rS = 2M surface becomes physically
meaningful. When this happens the object forms what is known as a black
hole. It is widely agreed that (supermassive) black holes should exist at the
centres of galaxies. Note that the absence of any mechanism to stop the
continued collapse implies that eventually the entire mass would be
concentrated at the centre of the black hole.
Using spatial Cartesian coordinates, the Schwarzschild metric can be
written in the form

where r2 = x2 + y2 + z2 is the Euclidean distance from the origin.


3.3. The Schwarzschild Interior Solution

Next we are interested in solving the Einstein field equations in the presence
of some matter. For simplicity we assume this to be a perfect fluid of constant
density; this serves as a rough model of an incompressible star, for instance a
neutron star. As before, we assume a static and spherically symmetric metric
of the form Eq. (3.5). The energy–momentum tensor Eq. (2.34) in a non-flat
space takes the form

\[ T_{ij} = (\rho + p)\, u_{i} u_{j} + p\, g_{ij}. \]

For our given metric the fluid’s 4-velocity is u_0 = −e^{A(r)/2}, u_1 = u_2 = u_3 = 0 so
that the components of Tij are given by T00 = ρ e^A, T11 = p e^{B(r)}, T22 = p r² and
T33 = sin²θ T22. The resulting three independent Einstein field equations are

\[ \frac{e^{A-B}}{r^{2}} \left( r B' + e^{B} - 1 \right) = 8\pi \rho\, e^{A}, \tag{3.24} \]
\[ \frac{1}{r^{2}} \left( r A' - e^{B} + 1 \right) = 8\pi p\, e^{B}, \tag{3.25} \]
\[ \frac{r^{2} e^{-B}}{4} \left( 2A'' + A'^{2} - A'B' + \frac{2(A' - B')}{r} \right) = 8\pi p\, r^{2}. \tag{3.26} \]

The final equation G33 = 8π T33 differs from Eq. (3.26) only by a factor of
sin²θ.
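Let us begin with Eq. (3.24); we divide by the common factor of e^A and
multiply the entire equation by r². This yields

\[ e^{-B} \left( r B' + e^{B} - 1 \right) = 8\pi \rho\, r^{2}. \tag{3.27} \]

Following our observation that led to Eq. (3.12) we can rewrite the previous
equation as

\[ \left( r - r e^{-B} \right)' = 8\pi \rho\, r^{2}, \tag{3.28} \]

which strongly suggests that we should integrate this relation. The right-hand
side is, up to a factor of two, identical to the integrand of the Newtonian mass
definition in spherical symmetry and hence we define

\[ m(r) = 4\pi \int_{0}^{r} \rho(s)\, s^{2}\, ds \tag{3.29} \]

as the mass up to radius r. If, moreover, the density is assumed to be constant
ρ(r) = ρ0, then the mass up to r simply becomes

\[ m(r) = \frac{4\pi}{3} \rho_{0}\, r^{3}. \tag{3.30} \]
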
Using the Newtonian mass definition (3.29), we can solve Eq. (3.28) for
one of the metric functions and find

\[ e^{-B} = 1 - \frac{2 m(r)}{r}, \tag{3.31} \]

which is in nice agreement with the Schwarzschild solution given by (3.19).
Note that we set the constant of integration which appears to zero in order to
avoid a singular solution for small r. Next, assuming a constant density gives
the explicit form

\[ e^{-B} = 1 - \frac{8\pi}{3} \rho_{0}\, r^{2}, \tag{3.32} \]

and we note that the spatial part of the metric describing an incompressible
star is now uniquely determined. We are left with one function to be
determined in the metric

\[ ds^{2} = -e^{A(r)} dt^{2} + \frac{dr^{2}}{1 - \frac{8\pi}{3} \rho_{0}\, r^{2}} + r^{2} d\Omega^{2}. \tag{3.33} \]


We will encounter a very similar spatial part of a metric again when


discussing Cosmology in Chap. 4. We still need to find the pressure function
and the other metric function to fully solve the field equations.
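
The mass function is an ordinary radial integral, so the constant-density result can be checked against a direct numerical quadrature. The sketch below is an illustration only: it uses a simple midpoint rule for 4π∫ρ(s)s² ds, and the function names and the sample density are assumptions of this example.

```python
import math


def mass_within(radius, density, steps=1000):
    """Midpoint-rule approximation of m(r) = 4 pi * integral_0^r rho(s) s^2 ds."""
    h = radius / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += density(s) * s * s
    return 4.0 * math.pi * total * h


def mass_constant_density(radius, rho0):
    """Closed form for constant density: (4 pi / 3) rho0 r^3."""
    return 4.0 * math.pi * rho0 * radius**3 / 3.0


def metric_function(radius, rho0):
    """exp(-B) = 1 - 2 m(r) / r for constant density (G = c = 1)."""
    return 1.0 - 2.0 * mass_constant_density(radius, rho0) / radius


if __name__ == "__main__":
    rho0 = 0.5
    print(mass_within(0.3, lambda s: rho0), mass_constant_density(0.3, rho0))
    print(metric_function(0.3, rho0))
```

The same routine accepts any density profile ρ(s), which is useful once the constant-density assumption is dropped.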
At this point it is best to state the energy–momentum conservation
equation which for our matter source leads to

\[ p' = -\frac{1}{2} A' (\rho + p). \tag{3.34} \]

Note that this equation is not independent of the field equations but a
consequence of them due to the twice contracted Bianchi identities, Theorem
1.6. We will now eliminate the quantity A′ from the conservation equation
(3.34) and the second field equation (3.25). This gives

\[ p' = -\frac{(\rho + p) \left[ e^{B} \left( 1 + 8\pi p\, r^{2} \right) - 1 \right]}{2r}. \tag{3.35} \]

Since we already determined the function e^B, this can be rewritten as

\[ p' = -\frac{(\rho + p) \left( m + 4\pi p\, r^{3} \right)}{r (r - 2m)}, \tag{3.36} \]

which is the famous Tolman–Oppenheimer–Volkoff equation. Let us
consider the non-relativistic limit of this equation by assuming p ≪ ρ and
2m/r ≪ 1 so that it becomes

\[ p' = -\frac{\rho\, m}{r^{2}}, \tag{3.37} \]

which is the structure equation of Newtonian astrophysics, see also Exercise
2.1.
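When assuming a constant density distribution, the Tolman–
Oppenheimer–Volkoff equation simplifies considerably and becomes a
separable first-order ordinary differential equation which reads

\[ p' = -\frac{4\pi}{3}\, r\, \frac{(\rho_{0} + p)(\rho_{0} + 3p)}{1 - \frac{8\pi}{3} \rho_{0}\, r^{2}}. \tag{3.38} \]
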
When solving this equation we choose the constant of integration such that
p(r = 0) = pc because the central pressure of our hypothetical star is a
physically useful quantity. The solution of Eq. (3.38) can be written in the
following form:

Alternatively, we can use the mass equation (3.30) to rewrite Eq. (3.39) as
follows:
The function p(r) is decreasing and takes its maximum value at r = 0, the
centre of the star. From a physical point of view this is very satisfying. The
pressure decreases as one moves from the centre towards the surface of the
star. We will define this surface or boundary of the star to be the vanishing
pressure surface. This means, we define the radius R of the star by the
relation p(r = R) = 0. We also denote the total mass of the star by M so that M
= m(R), capital letters are used for total quantities.
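
The definition of the stellar radius through p(R) = 0 can be tested numerically. The sketch below integrates the constant-density Tolman–Oppenheimer–Volkoff equation outwards from the centre with a fourth-order Runge–Kutta step until the pressure vanishes. It assumes G = c = 1 and writes the right-hand side in the form −(4π/3) r (ρ0 + p)(ρ0 + 3p)/(1 − (8π/3)ρ0 r²), which is regular at r = 0. The comparison value comes from separating variables in that equation, giving √(1 − (8π/3)ρ0R²) = (ρ0 + pc)/(ρ0 + 3pc); this form may be arranged differently from Eq. (3.39). All names are illustrative.

```python
import math


def tov_rhs(r, p, rho0):
    """dp/dr for a constant-density star, G = c = 1."""
    numerator = 4.0 * math.pi * r * (rho0 + p) * (rho0 + 3.0 * p) / 3.0
    return -numerator / (1.0 - 8.0 * math.pi * rho0 * r * r / 3.0)


def stellar_radius_numeric(rho0, p_c, dr=1e-4):
    """Integrate outwards from r = 0 until the pressure changes sign."""
    r, p = 0.0, p_c
    while True:
        k1 = tov_rhs(r, p, rho0)
        k2 = tov_rhs(r + dr / 2.0, p + dr * k1 / 2.0, rho0)
        k3 = tov_rhs(r + dr / 2.0, p + dr * k2 / 2.0, rho0)
        k4 = tov_rhs(r + dr, p + dr * k3, rho0)
        p_next = p + dr * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if p_next <= 0.0:
            # Linear interpolation for the zero of the pressure.
            return r + dr * p / (p - p_next)
        r, p = r + dr, p_next


def stellar_radius_analytic(rho0, p_c):
    """Radius from sqrt(1 - (8 pi / 3) rho0 R^2) = (rho0 + p_c) / (rho0 + 3 p_c)."""
    s = (rho0 + p_c) / (rho0 + 3.0 * p_c)
    return math.sqrt(3.0 * (1.0 - s * s) / (8.0 * math.pi * rho0))


if __name__ == "__main__":
    print(stellar_radius_numeric(1.0, 0.1), stellar_radius_analytic(1.0, 0.1))
```

The two values agree to high accuracy, and the pressure decreases monotonically from pc at the centre to zero at the surface, as described above.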
Using p(r = R) = 0 in Eq. (3.39) allows us to find an expression relating
the total mass M, the radius R and the central pressure pc and energy density
ρ0 which is given by

This equation has one particularly neat implication which has no equivalent
in Newtonian astrophysics. Let us solve this equation for the central pressure
pc, we find

which puts a constraint on the mass–radius ratio of our star provided we
assume pc < ∞. A condition of this form is physically sensible: we would like
to describe the centre of an astrophysical object with some regular form of
matter, and in particular we wish to require that all physical quantities are finite.
For the central pressure to be finite means that the denominator of Eq. (3.42)
must be larger than zero which implies

\[ \frac{M}{R} < \frac{4}{9}. \tag{3.43} \]

Therefore, in General Relativity we cannot arbitrarily increase the mass of an
object while keeping the radius fixed. In other words, there exists a bound on
the compactness of an astrophysical object; this result goes back to Buchdahl.
In turn this implies that an object of a given mass must have a minimal radius
so that the inequality (3.43) is satisfied. Reinstating physical constants, we
would write the Buchdahl inequality as

$$ \frac{2GM}{c^2 R} < \frac{8}{9}. \tag{3.44} $$

Before continuing, let us state, for completeness, the metric of this
interior solution. The metric function A can be found by integrating Eq.
(3.34). For a constant density distribution, this integration yields

where C is a constant of integration. It can be chosen arbitrarily, and its value
can be changed by rescaling the time coordinate used in the metric. A
convenient choice is to take A(r = 0) = 0 so that C = ρ0 + pc. Inserting this
solution into Eq. (3.33) gives the Schwarzschild interior metric

where the function p(r) is given by (3.39) or (3.40).


As in the previous section, let us evaluate this inequality for the Sun. We
have

$$ \frac{2GM_\odot}{c^2 R_\odot} \approx 4.2 \times 10^{-6}, \tag{3.47} $$

which satisfies the Buchdahl inequality. Interestingly, despite the Buchdahl
inequality being satisfied by about five orders of magnitude, very few stars
are known whose masses exceed 200 M⊙. One should keep in mind, though,
that the Buchdahl inequality is valid only for static and spherically
symmetric objects, an assumption which is violated by any realistic
astrophysical object.
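
The estimate for the Sun can be reproduced with a few lines of code. The sketch below assumes the Buchdahl bound in the form 2GM/(c²R) < 8/9 and uses rounded reference values for G, c, the solar mass and the solar radius; these numbers and all function names are assumptions made for illustration.

```python
G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light in m/s (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
R_SUN = 6.957e8    # solar radius in m (assumed value)

def compactness(mass_kg, radius_m):
    """Dimensionless ratio 2GM/(c^2 R)."""
    return 2.0 * G * mass_kg / (C**2 * radius_m)

def satisfies_buchdahl(mass_kg, radius_m):
    """True if the object respects the bound 2GM/(c^2 R) < 8/9."""
    return compactness(mass_kg, radius_m) < 8.0 / 9.0

def minimal_radius(mass_kg):
    """Smallest radius allowed by the bound, R = 9GM/(4c^2)."""
    return 9.0 * G * mass_kg / (4.0 * C**2)
```

With these inputs the Sun's compactness comes out at a few times 10⁻⁶, and the minimal radius for one solar mass is a few kilometres, only slightly larger than the corresponding Schwarzschild radius.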
This has further implications in astrophysics. When observing spiral
galaxies and investigating the motion of objects near their centres, one is
required to place a very heavy, yet small object into their respective centres to
account for that motion. It turns out that this hypothetical object grossly
violates the inequality (3.43). In fact, simple estimates indicate that the
physical radius of this object is smaller than the corresponding Schwarzschild
radius. Therefore, no known matter type would be able to describe such an
object and only a black hole is compatible with observations.

3.4. Geodesics in Schwarzschild Spacetime

The Schwarzschild solution describes the exterior gravitational field of a
spherically symmetric body. We are now interested in understanding the
motion of test particles in the Schwarzschild spacetime. In practical terms, we
assume that the exterior gravitational field of the Sun is well described by the
Schwarzschild metric and wish to find observational effects within the Solar
System. These effects could test the validity of General Relativity. Doing so
requires the study of geodesics; we will derive the geodesic equations using
the Lagrangian approach used in Sec. 1.2.5. Our Lagrangian is given by

$$ L = -f(r)\,\dot{t}^2 + \frac{\dot{r}^2}{f(r)} + r^2\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right), \tag{3.48} $$

with f(r) = 1 − 2M/r. The dot stands for the derivative with respect to the
geodesic parameter λ. Also, for null geodesics, which describe the motion of
massless particles like photons, we have L = 0, while for massive particles we
have L = −1.
Let us begin with the equation of motion for θ. We have

$$ \frac{d}{d\lambda}\left(r^2\dot{\theta}\right) = r^2 \sin\theta\cos\theta\,\dot{\phi}^2. \tag{3.49} $$

This equation can be solved by choosing θ = π/2, which corresponds to
aligning the coordinates so that the motion is confined to the equatorial plane.
This can always be done for the gravitational two-body problem.
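Next, we consider the equation of motion for t. We note that the
Lagrangian (3.48) is independent of time, so we expect a constant of motion
which we will interpret as energy E. We arrive at

$$ \frac{d}{d\lambda}\left(f(r)\,\dot{t}\right) = 0, \tag{3.50} $$

which means that we can write

$$ \dot{t} = \frac{E}{f(r)}. \tag{3.51} $$
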
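Similarly, the Lagrangian (3.48) is also independent of the angular variable ϕ
so we expect a constant of motion related to angular momentum ℓ for which
we have

$$ \dot{\phi} = \frac{\ell}{r^2}. \tag{3.52} $$

Substituting the constants of motion E and ℓ back into the Lagrangian
yields

$$ L = -\frac{E^2}{f(r)} + \frac{\dot{r}^2}{f(r)} + \frac{\ell^2}{r^2}. \tag{3.53} $$
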
This equation has similarities with a classical mechanical system which can
be made more explicit by rewriting

$$ \frac{\dot{r}^2}{2} + \frac{1}{2}\left(1 - \frac{2M}{r}\right)\left(\frac{\ell^2}{r^2} - L\right) = \frac{E^2}{2}. \tag{3.54} $$

We can interpret ṙ²/2 as the kinetic energy of a test particle with energy E²/2.
The remaining term is the effective potential determining the motion of the
particle. This effective potential is given by

$$ V_{\rm eff} = -\frac{L}{2} + \frac{LM}{r} + \frac{\ell^2}{2r^2} - \frac{M\ell^2}{r^3}. \tag{3.55} $$

Some terms in this effective potential are familiar: ℓ²/(2r²) corresponds to the
centrifugal barrier term, while LM/r is the standard Newtonian term. For
massive particles, when L = −1, this becomes −M/r as expected. The term
−Mℓ²/r³ is a new general relativistic term which dominates over the barrier
term for small radii.
The earlier-mentioned equations are the starting point for studying
geodesics in the Schwarzschild spacetime. We know that the Schwarzschild
metric gives a good approximation of the external gravitational field of the
Sun and hence we should be able to make some predictions of general
relativistic effects which might be observable in the Solar System. This is the
subject of Sec. 3.5 and led to the ultimate success of General Relativity.
Before proceeding, we will examine a neat and somewhat counterintuitive
example of one particular type of geodesic motion in the
Schwarzschild spacetime.

Example 3.1 (Photon sphere). We are interested in the question of whether
or not photons (massless particles) can have circular orbits in the
Schwarzschild spacetime. For this to be possible, the effective potential Veff
must have a stationary point. Setting L = 0 in Eq. (3.55), we have

$$ V_{\rm eff} = \frac{\ell^2}{2r^2} - \frac{M\ell^2}{r^3}, \tag{3.56} $$

so that the stationary point r*, say, is determined by

$$ \frac{dV_{\rm eff}}{dr} = -\frac{\ell^2}{r^3} + \frac{3M\ell^2}{r^4} = 0, \tag{3.57} $$

which has the unique physical solution r* = 3M. This means that at this
radius, photons travel on an exact circular trajectory around the central mass.
If we looked along the tangential direction at this point, we would see the
back of our head in front of us.
A direct calculation also shows that d²Veff/dr²(r*) < 0, showing that the
point r* = 3M is a local maximum of the effective potential. This implies that
this point is dynamically unstable. The r = 3M surface of the
Schwarzschild spacetime is often referred to as the photon sphere. The shape
of the potential (3.56) is shown in Fig. 3.1.

Fig. 3.1 Effective potential Veff given by Eq. (3.56). The dot indicates the position of the maximum where
r = r* = 3M.
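
The location and nature of this stationary point can be checked numerically. The sketch below assumes the effective potential in the form Veff = ½(1 − 2M/r)(ℓ²/r² − L), which reproduces the terms discussed above, and uses simple central finite differences; the helper names and step sizes are illustrative choices.

```python
def v_eff(r, M, ell, L):
    """Effective potential (1/2)(1 - 2M/r)(ell^2/r^2 - L), assumed form."""
    return 0.5 * (1.0 - 2.0 * M / r) * (ell**2 / r**2 - L)

def first_derivative(func, x, h=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def second_derivative(func, x, h=1e-4):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / h**2

def photon_potential(r, M=1.0, ell=1.0):
    """Effective potential for null geodesics, L = 0."""
    return v_eff(r, M, ell, 0.0)
```

In units with M = 1 and ℓ = 1, the first derivative of `photon_potential` should vanish at r = 3 while the second derivative should be negative there, in line with the photon sphere being a local maximum.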

Let us now have a closer look at the potential (3.55) for massive particles,
L = −1, so that

$$ V_{\rm eff} = \frac{1}{2} - \frac{M}{r} + \frac{\ell^2}{2r^2} - \frac{M\ell^2}{r^3}. \tag{3.58} $$

The shape of this potential is determined by the ratio ℓ²/M², see Fig. 3.2. The
possible extremal points of the effective potential are found by solving
dVeff/dr = 0 which reduces to a quadratic equation

$$ M r^2 - \ell^2 r + 3M\ell^2 = 0, \tag{3.59} $$

and therefore the extremal points are located at

$$ r_\pm = \frac{\ell^2}{2M}\left(1 \pm \sqrt{1 - \frac{12M^2}{\ell^2}}\right). \tag{3.60} $$

If 12M² > ℓ², then the potential for massive particles does not have any
critical points; for 12M² = ℓ² there is one point and otherwise there are two
points. The smallest possible stationary point for a massive particle is found
when setting 12M² = ℓ², and we obtain rmin = 6M which is twice the radius
of the photon sphere. We can also solve Eq. (3.59) for the angular momentum
ℓ² which gives

$$ \ell^2 = \frac{M r^2}{r - 3M}, \tag{3.61} $$

and we can use this as an approximation for the angular momentum of an
orbit. For an exactly circular stable orbit, this is indeed the correct expression.

Fig. 3.2 Effective potential Veff for massive particles. The dots indicate the positions of the maximum
and minimum.
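
The case distinction in terms of ℓ²/M² can be made concrete with a short sketch. It assumes that the stationary points of the massive-particle potential solve the quadratic M r² − ℓ² r + 3Mℓ² = 0 (units G = c = 1); the function names are hypothetical, and ℓ² is passed directly to avoid rounding issues at the marginal value ℓ² = 12M².

```python
import math

def circular_orbit_radii(M, ell_sq):
    """Roots of M r^2 - ell_sq r + 3 M ell_sq = 0, or () if there are none."""
    disc = ell_sq**2 - 12.0 * M**2 * ell_sq
    if disc < 0.0:
        return ()
    root = math.sqrt(disc)
    r_inner = (ell_sq - root) / (2.0 * M)  # local maximum (unstable orbit)
    r_outer = (ell_sq + root) / (2.0 * M)  # local minimum (stable orbit)
    return (r_inner, r_outer)

def angular_momentum_squared(M, r):
    """ell^2 of a circular orbit at radius r > 3M."""
    return M * r**2 / (r - 3.0 * M)
```

For M = 1, the value ℓ² = 16 gives stationary points at r = 4 and r = 12, the marginal value ℓ² = 12 gives the single radius r = 6, and ℓ² = 9 gives none.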

3.5. Testing General Relativity — The Classical Tests

In the following we will discuss the three classical tests of General Relativity.
These are the perihelion precession of Mercury, the light deflection by the
Sun and finally the gravitational redshift of light. We will also discuss a
fourth effect, namely gravitational time or radar echo delay which was
proposed in the 1960s.

3.5.1. Perihelion precession of Mercury

In Newtonian gravity, planets’ trajectories around a central object like the
Sun are described by exact ellipses. This is highly idealised because the
presence of other objects will perturb those trajectories, which results in a
slight failure of these ellipses to close. This effect is called perihelion shift or
perihelion precession, see Fig. 3.3. It is strongest for those objects closest to
the central object, and so we are interested in particular in the planet Mercury
which is closest to the Sun. The observed perihelion precession for Mercury
is about 5,600 arc seconds per century, of which 43 arc seconds cannot be
accounted for by Newtonian gravity taking into account all known
perturbative effects.
Fig. 3.3 Perihelion precession of an ellipse. Greatly exaggerated for planetary orbits.

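As we are dealing with the geodesic of a massive particle, we set L = −1.
To begin with, we need to find an expression for dr/dϕ which can be found
by combining Eqs. (3.52) and (3.54) to get

$$ \left(\frac{dr}{d\phi}\right)^2 = \frac{r^4}{\ell^2}\left[E^2 - \left(1 - \frac{2M}{r}\right)\left(1 + \frac{\ell^2}{r^2}\right)\right]. \tag{3.62} $$

For problems of this type it is always convenient to introduce a new variable
u = 1/r so that Eq. (3.62) becomes

$$ \left(\frac{du}{d\phi}\right)^2 = \frac{E^2 - 1}{\ell^2} + \frac{2M}{\ell^2}\,u - u^2 + 2Mu^3. \tag{3.63} $$
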
This differential equation can be solved analytically using elliptic functions.
However, since we are only interested in one particular effect described by
this equation, there is no need to delve into special functions. The variable u
can be considered small (inverse radius) for astrophysical objects. Recall that
−Mℓ²/r³ = −Mℓ²u³ is the new general relativistic term which is cubic in the
small quantity u. So, let us study the Newtonian problem first. This means
leaving out the cubic term in Eq. (3.63) and considering

$$ \left(\frac{du}{d\phi}\right)^2 = \frac{E^2 - 1}{\ell^2} + \frac{2M}{\ell^2}\,u - u^2. \tag{3.64} $$

We note that we can complete the square on the right-hand side by
introducing a new variable v = u − M/ℓ² so that our differential equation
simplifies to

$$ \left(\frac{dv}{d\phi}\right)^2 = \frac{E^2 - 1}{\ell^2} + \frac{M^2}{\ell^4} - v^2, \tag{3.65} $$

which is of the well-known form (dv/dϕ)² = const − v² and can be solved using
separation of variables. A direct calculation gives

$$ v = \sqrt{\frac{E^2 - 1}{\ell^2} + \frac{M^2}{\ell^4}}\;\cos(\phi - \phi_0), \tag{3.66} $$

where ϕ0 is the constant of integration. This is the equation of an ellipse,
which is not obvious at first sight, see Exercise 3.16. However, we do note
that this function is 2π periodic.
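In order to find the next-to-leading-order correction to this ellipse, we
substitute v = u − M/ℓ² into Eq. (3.63) and keep terms up to quadratic order in
v instead of u. This results in the new equation

$$ \left(\frac{dv}{d\phi}\right)^2 = \frac{E^2 - 1}{\ell^2} + \frac{M^2}{\ell^4} + \frac{2M^4}{\ell^6} + \frac{6M^3}{\ell^4}\,v - \left(1 - \frac{6M^2}{\ell^2}\right)v^2. \tag{3.67} $$
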
As before, we can now complete the square on the right-hand side by the
substitution

which transforms Eq. (3.67) in the form

and can be integrated using separation of variables. The constant c2 is simply
given by

$$ c_2 = \sqrt{1 - \frac{6M^2}{\ell^2}}, \tag{3.70} $$

while the constant c1 is a more involved expression given by

Therefore, the solution is given by


where ϕ0 is a constant of integration. Expressed in our original variable r, the
result reads

In contrast to Eq. (3.66), this solution is not 2π periodic because of the
additional factor of c2 in the argument of the cosine. Hence, in General
Relativity, the two-body problem does not give rise to closed ellipses. There
is a slight failure of the object to return to its starting point after one turn.
We can express this by computing the shift of the angle ϕ between two
instances where the object is at the same radius r which is

$$ \Delta\phi = \frac{2\pi}{c_2} - 2\pi. \tag{3.74} $$

In case c2 = 1, we find ∆ϕ = 0 and we are back to Newtonian orbits. By
assuming that the quantity M²/ℓ² ≪ 1, we can approximate ∆ϕ using a series
expansion as follows:

$$ \Delta\phi \approx \frac{6\pi M^2}{\ell^2}. \tag{3.75} $$

Last, we use expression (3.61) for the angular momentum which to first order
in M gives ℓ² ≈ rM, so that we arrive at

$$ \Delta\phi \approx \frac{6\pi G M}{c^2 r}, \tag{3.76} $$

where we re-inserted the physical constants G and c. Next, for the distance of
the planet Mercury from the Sun we take the semi-major axis r = 5.79 × 10¹⁰
m, while the solar mass is M⊙ = 1.99 × 10³⁰ kg. Note that this result is
independent of the mass of Mercury; only its distance from the Sun matters.
One computes

$$ \Delta\phi \approx 4.8 \times 10^{-7}, \tag{3.77} $$

which is in units of radians per orbit. The orbital period for Mercury is about
0.24 years. Converting radians into arc seconds and orbits into centuries, we
arrive at

$$ \Delta\phi \approx 41\ \text{arc seconds per century}. \tag{3.78} $$

One can improve this calculation further, in particular by taking into
account the eccentricity of the orbit of Mercury. This results in dividing our
result (3.77) by a factor of (1 − e²) where e is the eccentricity. For Mercury,
e = 0.2 so that (1 − e²) = 0.96 which yields the improved result

$$ \Delta\phi \approx 43\ \text{arc seconds per century}, \tag{3.79} $$

which is precisely the amount which cannot be accounted for using
Newtonian gravity alone. Amazingly, General Relativity predicts this value
for the planet Mercury. Hence, this is a very strong confirmation of
Einstein’s theory.
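
The arithmetic of this estimate is easily reproduced. The following sketch assumes the leading-order shift per orbit in the form 6πGM/(c²r), divided by (1 − e²) for an eccentric orbit, and uses the values for Mercury quoted above together with rounded values of G and c; the function names are illustrative.

```python
import math

G = 6.674e-11   # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8     # speed of light in m/s (assumed value)

def precession_per_orbit(mass_kg, radius_m, eccentricity=0.0):
    """Perihelion shift in radians per orbit, 6 pi G M / (c^2 r (1 - e^2))."""
    return 6.0 * math.pi * G * mass_kg / (
        C**2 * radius_m * (1.0 - eccentricity**2))

def arcsec_per_century(shift_per_orbit, period_years):
    """Convert a shift in radians per orbit into arc seconds per century."""
    orbits_per_century = 100.0 / period_years
    return math.degrees(shift_per_orbit * orbits_per_century) * 3600.0

# Values for Mercury as quoted in the text.
circular = arcsec_per_century(precession_per_orbit(1.99e30, 5.79e10), 0.24)
eccentric = arcsec_per_century(precession_per_orbit(1.99e30, 5.79e10, 0.2), 0.24)
```

With these inputs, `circular` evaluates to roughly 41 arc seconds per century and `eccentric` to roughly 43 arc seconds per century.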

3.5.2. Light deflection by the Sun

When light or radio signals pass by a massive gravitational object, they will
experience a pull towards this object and thus not travel along straight lines.
The angle characterising the deviation from a straight line is the deflection
angle, see Fig. 3.4. This effect can be observed on Earth for signals that pass
near the Sun. For light signals from distant galaxies this can only be done
during a solar eclipse; for radio signals, however, this is possible
continuously. One compares the observed trajectories with those seen half a
year later or earlier when the Sun is no longer between the observer and the
source.
Fig. 3.4 Light deflection by a massive object. Greatly exaggerated for light rays passing by near the
Sun.

We define the deflection angle to be

$$ \Delta\phi = \phi_+ - \phi_- - \pi, \tag{3.80} $$

where the subscripts ± indicate the asymptotic angles, as r becomes very
large. The situation is symmetric with respect to the object and hence we can
use

$$ \Delta\phi = 2\phi_+ - \pi. \tag{3.81} $$

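The starting point to compute the deflection angle is again the equation
for dϕ/dr which is found by combining Eqs. (3.52) and (3.54) with L = 0.
This yields

$$ \frac{d\phi}{dr} = \frac{1}{r^2}\left[\frac{E^2}{\ell^2} - \frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\right]^{-1/2}. \tag{3.82} $$

The minimum distance r0 of the signal from the gravitational object can be
defined by solving dr/dϕ = 0 for which we find

$$ \frac{E^2}{\ell^2} = \frac{1}{r_0^2}\left(1 - \frac{2M}{r_0}\right). \tag{3.83} $$

This follows from setting the expression in square brackets in (3.82) to zero.
The quantity r0 is often called the impact parameter in the literature.
Substituting the energy E for the impact parameter, we arrive at the following
differential equation for the angle

$$ \frac{d\phi}{dr} = \frac{1}{r^2}\left[\frac{1}{r_0^2}\left(1 - \frac{2M}{r_0}\right) - \frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\right]^{-1/2}. \tag{3.84} $$
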
Note that this equation is independent of angular momentum ℓ. At this point
one could follow an approach similar to that used when computing the
perihelion precession. However, one could also use a different technique
based on a Taylor series expansion.
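One can integrate directly, so that ϕ+ is given by

$$ \phi_+ = \int_{r_0}^{\infty} \frac{dr}{r^2}\left[\frac{1}{r_0^2}\left(1 - \frac{2M}{r_0}\right) - \frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\right]^{-1/2}. \tag{3.85} $$

First, we introduce a new independent variable u = 1/r for which the integral
transforms to

$$ \phi_+ = \int_0^{1/r_0} \left[\frac{1}{r_0^2}\left(1 - \frac{2M}{r_0}\right) - u^2\left(1 - 2Mu\right)\right]^{-1/2} du. \tag{3.86} $$
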
Looking at this integral more closely, we see that this is difficult to solve for
general r0 and M. However, as before, we do not need to know the value of
this integral exactly, we are mainly interested in the first-order corrections to
Newtonian gravity and hence will treat M as a small parameter and make a
series expansion of the integrand.
Setting M = 0 in Eq. (3.86) gives

$$ \phi_+ = \int_0^{1/r_0} \frac{du}{\sqrt{1/r_0^2 - u^2}} = \frac{\pi}{2}. \tag{3.87} $$

In the absence of any masses we find ∆ϕ = 2(π/2) − π = 0 so that there is no
angle between the incoming ray and the outgoing ray which means that light
signals travel along straight lines. This is indeed the expected result for flat
space.
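For the first-order correction, we need to compute the derivative of the
integrand of (3.86) with respect to M and evaluate this derivative for M = 0.
This leads to

$$ \int_0^{1/r_0} \frac{1/r_0^3 - u^3}{\left(1/r_0^2 - u^2\right)^{3/2}}\,du. \tag{3.88} $$

One can attack this integral with the substitution u = sin(α)/r0 which reduces
it to a trigonometric integral that can be solved using standard techniques.
The final result is given by

$$ \frac{1}{r_0}\int_0^{\pi/2} \frac{1 - \sin^3\alpha}{\cos^2\alpha}\,d\alpha = \frac{2}{r_0}, \tag{3.89} $$

and we can write the first two terms of the quantity ϕ+ as

$$ \phi_+ \approx \frac{\pi}{2} + \frac{2M}{r_0}, \tag{3.90} $$

where the factor of M comes from the Taylor expansion in the mass term.
Therefore, the deflection angle of the Schwarzschild spacetime is given by

$$ \Delta\phi \approx \frac{4M}{r_0}. \tag{3.91} $$

For light rays or radio signals passing nearby the Sun we approximate the
distance of closest approach by r0 = R⊙. We already computed the necessary
numbers in Eq. (3.47), which with an additional factor of two yields

$$ \Delta\phi \approx \frac{4GM_\odot}{c^2 R_\odot} \approx 8.5 \times 10^{-6}\ \text{rad} \approx 1.75\ \text{arc seconds}. \tag{3.92} $$
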
This deflection angle was first observed during the solar eclipse of 1919 by
an expedition led by Eddington. It was the first experimental confirmation of
an important prediction of General Relativity.
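
Both the trigonometric integral and the final number can be checked numerically. The sketch below assumes that, after the substitution u = sin(α)/r0, the first-order integral takes the form (1/r0)∫(1 − sin³α)/cos²α dα over [0, π/2], and that the deflection angle is 4GM/(c²r0); the constants are rounded reference values and all names are illustrative.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light in m/s (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
R_SUN = 6.957e8    # solar radius in m (assumed value)

def first_order_integral(n=10000):
    """Midpoint rule for the integral of (1 - sin^3 a)/cos^2 a over [0, pi/2]."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h  # midpoints never hit the endpoint a = pi/2
        total += (1.0 - math.sin(a)**3) / math.cos(a)**2
    return total * h

def deflection_angle(mass_kg, r0_m):
    """First-order deflection angle 4GM/(c^2 r0) in radians."""
    return 4.0 * G * mass_kg / (C**2 * r0_m)

def to_arcsec(angle_rad):
    """Convert radians to arc seconds."""
    return math.degrees(angle_rad) * 3600.0
```

The integral evaluates to 2 to good accuracy, and for a ray grazing the Sun the deflection comes out at about 1.75 arc seconds.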
It appears that we have lost the Newtonian case somewhere along the
calculation. When computing the perihelion precession, we first calculated
the Newtonian orbit and next the general relativistic correction to it. Here, our
Taylor series approach gave us the flat space result first and the general
relativistic result second. So what is the deflection angle predicted by
Newtonian gravity? First, one should ask whether a massless particle like the
photon should feel gravitational attraction in the first place. In classical
Newtonian physics where the photon is a massless particle, the answer would
be No. From a modern physics point of view, on the other hand, one would
simply argue that the photon has energy and momentum and hence should
feel the gravitational attraction of massive bodies. Once the finiteness of the
speed of light was established, one could study the photon path in a
Newtonian gravitational field by treating the photon like a real particle
moving at the same speed. The resulting bending angle has been known since
the late 18th century and is given by

$$ \Delta\phi_{\rm N} = \frac{2GM}{c^2 r_0}, \tag{3.93} $$

which is precisely half the deflection angle predicted by General Relativity.

3.5.3. Gravitational redshift of light

Redshift z is a dimensionless quantity used in various fields of physics and is
defined by

$$ z = \frac{\lambda_{\rm obs} - \lambda_e}{\lambda_e}, \tag{3.94} $$

where λobs is the wavelength of the observed signal and λe is the wavelength
of the emitted signal. This redshift will play an important role in Cosmology,
see Secs. 4.3.2 and 4.3.4.
Let us consider a signal emitted with wavelength λe, then this signal will
have local energy hfe where h is Planck’s constant and fe is the frequency of
the photon (recall λf = c). We discussed in Sec. 2.1.2 the proper time of a
local observer. The change of the proper time of the emitter determines the
frequency of the emitted wave while the proper time of the observer
determines the frequency of the observed wave.
In General Relativity, the strength of the gravitational field will affect
clock cycles. Let us consider an emitter and an observer, both of which are at
fixed spatial coordinates, for instance we can think of the Sun emitting some
radiation which is observed on Earth. The local movement of the Earth
relative to the Sun during the wave’s propagation will introduce an additional
Doppler effect, however, for now we are interested in the purely gravitational
effect. Let us consider the Schwarzschild metric (3.19), and assume the
emitter is located at re, θe, ϕe; then the proper time is given by

$$ d\tau_e^2 = \left(1 - \frac{2M}{r_e}\right) dt_e^2. \tag{3.95} $$

Likewise, the proper time of the observer is

$$ d\tau_{\rm obs}^2 = \left(1 - \frac{2M}{r_{\rm obs}}\right) dt_{\rm obs}^2. \tag{3.96} $$

Before proceeding, let us briefly compare these two equations with the time
dilation relation (2.20). We write

$$ dt = \frac{d\tau}{\sqrt{1 - 2M/r}} \tag{3.97} $$

for some fixed radius r and note that the term from the Schwarzschild metric
plays the role of the Lorentz factor. Hence, clocks near massive objects will
also be running slower than clocks far away where the gravitational field is
weak.
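We can interpret dτe and dτobs as the times between two maxima of the
emitted and observed waves, respectively. Combining Eq. (3.94) with (3.95)
and (3.96) gives

$$ 1 + z = \frac{\lambda_{\rm obs}}{\lambda_e} = \frac{d\tau_{\rm obs}}{d\tau_e} = \sqrt{\frac{1 - 2M/r_{\rm obs}}{1 - 2M/r_e}}, \tag{3.98} $$

where we note that the coordinate time differences dte and dtobs are equal for
spatially fixed emitters and observers.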
Let us now consider again the Sun, and estimate the redshift z for a signal
emitted on the surface of the Sun and received on Earth. Since the distance
between the Sun and the Earth is much larger than the radius of the Sun, we
can neglect the term 2M⊙/robs ≪ 1 in Eq. (3.98), therefore

$$ z \approx \left(1 - \frac{2M_\odot}{R_\odot}\right)^{-1/2} - 1 \approx \frac{M_\odot}{R_\odot}, \tag{3.99} $$

where in the final step we assumed M⊙/R⊙ to be small. Taking the physical
mass and radius of the Sun and reinserting the gravitational constant and the
speed of light yields

$$ z \approx \frac{GM_\odot}{c^2 R_\odot} \approx 2.1 \times 10^{-6}. \tag{3.100} $$

This effect is fairly small which makes it difficult to measure. Due to the high
surface temperature of the Sun, the thermal velocities of various atoms are of
the order of 10³ m/s; this will introduce Doppler effects which are of similar
order of magnitude to the redshift z. Nonetheless, observational data is in
excellent agreement with the predictions of General Relativity.
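
To see how good the small-M⊙/R⊙ approximation is, one can compare the two expressions directly. The sketch below assumes the redshift for a distant observer in the form z = (1 − 2GM/(c²R))^(−1/2) − 1 with leading-order approximation GM/(c²R); the constants are rounded reference values and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light in m/s (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
R_SUN = 6.957e8    # solar radius in m (assumed value)

def redshift_exact(mass_kg, r_emit_m):
    """Redshift seen by a distant static observer, emitter at radius r_emit_m."""
    return 1.0 / math.sqrt(1.0 - 2.0 * G * mass_kg / (C**2 * r_emit_m)) - 1.0

def redshift_weak_field(mass_kg, r_emit_m):
    """Leading-order approximation GM/(c^2 r)."""
    return G * mass_kg / (C**2 * r_emit_m)
```

For the Sun both expressions give z of about 2.1 × 10⁻⁶, and they differ only at the level of z², which is far below any realistic measurement accuracy.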
Let us make a small remark about the gravitational redshift. The
discussion is independent of the Einstein field equations, we would have
arrived at the same result using only Eqs. (2.59) and (2.66). In this sense, the
gravitational redshift is not directly testing General Relativity but it is
verifying the principle of equivalence.

3.5.4. Radar echo or gravitational time delay

Strictly speaking the gravitational time delay is not one of the three classical
tests, however, the idea behind this test is very similar to the earlier-
mentioned ones where a signal passes nearby the Sun. An intense
electromagnetic wave is directed to another planet when this planet is almost
opposite to the Earth on the far side of the Sun. A radio telescope on Earth is
detecting the reflection or echo of this signal. As the signal and its reflection
pass by near the Sun, the photons will not travel along straight lines. Hence,
their travel times will differ from those expected from a straight line path.
Now we will estimate this time delay for Earth and Venus, but it equally
applies to other planets.
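The starting point of this calculation is the geodesic equation with L = 0
for photons. We are interested in the coordinate time t, so combining Eq.
(3.51) with Eq. (3.53) gives an expression for dr/dt given by

$$ \frac{dr}{dt} = \left(1 - \frac{2M}{r}\right)\left[1 - \frac{\ell^2}{E^2}\,\frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\right]^{1/2}. \tag{3.101} $$

As in Sec. 3.5.2 we introduce the impact parameter r0 as the distance of
closest approach of the photon to the Sun. This minimum distance is defined
by dr/dt(r0) = 0, so that we find

$$ \frac{E^2}{\ell^2} = \frac{1}{r_0^2}\left(1 - \frac{2M}{r_0}\right), \tag{3.102} $$

which can be substituted back into Eq. (3.101) to eliminate E²/ℓ². This yields

$$ \frac{dr}{dt} = \left(1 - \frac{2M}{r}\right)\left[1 - \frac{r_0^2}{r^2}\,\frac{1 - 2M/r}{1 - 2M/r_0}\right]^{1/2}. \tag{3.103} $$

Now we can separate the variables and integrate to find the travel time
between some radius r1 and the distance of closest approach r0, which is

$$ t(r_1, r_0) = \int_{r_0}^{r_1} \left(1 - \frac{2M}{r}\right)^{-1}\left[1 - \frac{r_0^2}{r^2}\,\frac{1 - 2M/r}{1 - 2M/r_0}\right]^{-1/2} dr. \tag{3.104} $$
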
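Many integrals that appear when studying geodesics in the Schwarzschild
spacetime need to be approximated since one cannot find explicit solutions in
terms of elementary functions in closed form. As in the previous sections, we
assume M to be small. Then the integrand to first order in M is given by

$$ \left(1 - \frac{r_0^2}{r^2}\right)^{-1/2}\left[1 + \frac{2M}{r} + \frac{M r_0}{r\,(r + r_0)}\right], \tag{3.105} $$

and the three resulting terms are standard integrals which lead to the result

$$ t(r_1, r_0) \approx \sqrt{r_1^2 - r_0^2} + 2M \ln\left(\frac{r_1 + \sqrt{r_1^2 - r_0^2}}{r_0}\right) + M\sqrt{\frac{r_1 - r_0}{r_1 + r_0}}. \tag{3.106} $$

The first term corresponds to the travel time of the signal had it followed a
straight line path. The terms proportional to the mass contain the additional
travel time due to the curved trajectory of the photon; this excess time
between two points is therefore given by t(r1, r0) − √(r1² − r0²). Let us now
consider a signal sent from Earth ♁ to Venus ♀, then reflected back to Earth
where it is observed. The excess time compared to the straight line is given
by

$$ \Delta t = 2\left[t(r_{♁}, r_0) + t(r_{♀}, r_0) - \sqrt{r_{♁}^2 - r_0^2} - \sqrt{r_{♀}^2 - r_0^2}\right], \tag{3.107} $$
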
where the factor of 2 comes from the fact that first the signal and then its
reflection have to be considered. In simple words, the photon has to travel to
Venus and then back again.
For signals which pass nearby the Sun, we can assume r0 ≪ r♁ and r0 ≪
r♀, which allows us to further simplify the time delay. To lowest order in r0
we find

$$ \Delta t \approx \frac{4GM_\odot}{c^3}\left[1 + \ln\left(\frac{4\, r_{♁}\, r_{♀}}{r_0^2}\right)\right], \tag{3.108} $$

where we reinserted the physical constants G and c. In principle, we should
take into account that the proper time ∆τ for the signal differs from the
coordinate time ∆t; however, since 2GM♁/(c² r♁) ≪ 1 this effect can be
ignored in the present calculation. We obtain the final result

While there are some technicalities that somewhat limit the accuracy of
measuring this effect, it is fair to say that measurements are again in excellent
agreement with the prediction by General Relativity.
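
An order-of-magnitude estimate of this delay can be obtained as follows. The sketch assumes the first-order excess time per leg in the form 2M ln((r1 + √(r1² − r0²))/r0) + M√((r1 − r0)/(r1 + r0)) with M = GM⊙/c², together with its small-r0 limit for the round trip, (4GM⊙/c³)[1 + ln(4 r♁ r♀/r0²)]. The orbital radii of Earth and Venus, the choice r0 = R⊙ and all function names are assumptions made for illustration.

```python
import math

G = 6.674e-11       # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8         # speed of light in m/s (assumed value)
M_SUN = 1.989e30    # solar mass in kg (assumed value)
R_SUN = 6.957e8     # solar radius in m, used as closest approach (assumed)
R_EARTH = 1.496e11  # orbital radius of Earth in m (assumed value)
R_VENUS = 1.082e11  # orbital radius of Venus in m (assumed value)

def excess_length_one_leg(m_len, r1, r0):
    """First-order excess path length (in metres) between r0 and r1."""
    root = math.sqrt(r1 * r1 - r0 * r0)
    return (2.0 * m_len * math.log((r1 + root) / r0)
            + m_len * math.sqrt((r1 - r0) / (r1 + r0)))

def round_trip_delay(mass_kg, r_a, r_b, r0):
    """Round-trip excess time in seconds for a signal between r_a and r_b."""
    m_len = G * mass_kg / C**2  # mass expressed as a length
    legs = (excess_length_one_leg(m_len, r_a, r0)
            + excess_length_one_leg(m_len, r_b, r0))
    return 2.0 * legs / C

def round_trip_delay_approx(mass_kg, r_a, r_b, r0):
    """Small-r0 limit of the round-trip excess time in seconds."""
    prefactor = 4.0 * G * mass_kg / C**3
    return prefactor * (1.0 + math.log(4.0 * r_a * r_b / r0**2))
```

With these inputs, both functions give a delay of roughly 2.5 × 10⁻⁴ s for a signal grazing the Sun, and they agree to well below one per cent.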

3.6. The Schwarzschild Radius

When we solved the Einstein field equations to find the Schwarzschild
solution, we already noted that this metric is singular when r = 2M or r = 0.
Our first aim is to understand whether the r = 2M surface corresponds to a
real physical singularity or to choosing ‘bad’ coordinates to cover the
manifold. Clearly, when we solved the field equations, we were primarily
interested in finding a solution in some coordinate system which took into
account our requirements of staticity and spherical symmetry, rather than
worrying about the entire manifold.

3.6.1. Radial null geodesics

Let us briefly recall Examples 1.10 and 1.13. In both cases the geodesics of
these spaces proved invaluable to understand their geometric properties.
Therefore, it seems natural to consider radial geodesics of the Schwarzschild
spacetime. Radial geodesics have no angular momentum, so we set ℓ = 0, and
moreover we will be primarily interested in radial null geodesics which
means L = 0. In more physical terms we are studying the propagation of
radial photons. The geodesic equations are (3.51) and (3.54) which become

$$ \dot{t} = \frac{E}{1 - 2M/r}, \tag{3.110} $$

$$ \dot{r}^2 = E^2. \tag{3.111} $$

The second equation becomes ṙ = ±E and can be integrated with respect to
the affine parameter λ. This gives

$$ r = r_0 \pm E\lambda, \tag{3.112} $$

where r0 is a constant of integration which corresponds to the distance of the
photon from the centre when λ = 0. For the positive sign the radius increases
with λ and hence we will speak of outgoing null geodesics, while for the
negative sign the radius decreases with λ and so we will speak of incoming
null geodesics. The most notable implication of Eq. (3.112) is that the r = 2M
surface does not appear to introduce any conceptual problems when dealing
with radial null geodesics. Therefore, any such geodesic will reach the
Schwarzschild radius at finite affine parameter. However, this changes quite
dramatically when working with coordinate time. We have

$$ \frac{dt}{dr} = \pm\left(1 - \frac{2M}{r}\right)^{-1} \tag{3.113} $$

for outgoing and incoming geodesics, respectively. This can be integrated
with respect to r and we arrive at

$$ t = t_0 \pm \left[r + 2M \ln\left|\frac{r}{2M} - 1\right|\right], \tag{3.114} $$

where t0 is a constant of integration. Hence, as r → 2M, the coordinate time t
approaches negative or positive infinity for outgoing or incoming null
geodesics, respectively, in contrast to the finite affine parameter.
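
The contrast between affine parameter and coordinate time can be illustrated numerically. The sketch below assumes that integrating dt/dr = ±(1 − 2M/r)⁻¹ gives t = t0 ± [r + 2M ln|r/(2M) − 1|] (units G = c = 1); the function names and the sample radii are illustrative.

```python
import math

def radial_function(r, M):
    """r + 2M ln|r/(2M) - 1|, the r-dependent part of the coordinate time."""
    return r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0))

def coordinate_time_incoming(r_start, r, M):
    """Coordinate time for an incoming radial photon to travel from r_start to r."""
    return radial_function(r_start, M) - radial_function(r, M)

def affine_parameter_incoming(r_start, r, E=1.0):
    """Affine parameter elapsed between r_start and r for an incoming photon."""
    return (r_start - r) / E

# Approach the horizon of an M = 1 black hole from r = 10.
times = [coordinate_time_incoming(10.0, 2.0 + eps, 1.0)
         for eps in (1e-3, 1e-6, 1e-9)]
```

Each time the remaining distance to r = 2M shrinks by a factor of 1000, the coordinate time grows by about 2M ln(1000), so it increases without bound, whereas the affine parameter stays below r_start − 2M.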

3.6.2. Eddington–Finkelstein coordinates

We can solve Eq. (3.114) for the constant t0 for both signs and arrive at
quantities which are constant for either outgoing or incoming null geodesics.
We define

$$ u = t - r - 2M \ln\left|\frac{r}{2M} - 1\right|, \tag{3.115} $$

$$ v = t + r + 2M \ln\left|\frac{r}{2M} - 1\right|, \tag{3.116} $$

where u is constant for the outgoing null geodesics and v is constant for the
incoming ones. We can visualise the incoming and outgoing geodesics in the
original Schwarzschild coordinates (t, r) in Fig. 3.5.
It is now tempting to introduce new coordinates for the Schwarzschild
spacetime based on u and v. These are called the Eddington–Finkelstein
coordinates: u is the outgoing Eddington–Finkelstein coordinate and v is the
incoming Eddington–Finkelstein coordinate. Let us eliminate the time
coordinate t using the new coordinate u. We have

$$ dt = du + \left(1 - \frac{2M}{r}\right)^{-1} dr, \tag{3.117} $$

so that the Schwarzschild metric (3.19) in outgoing Eddington–Finkelstein
coordinates becomes

$$ ds^2 = -\left(1 - \frac{2M}{r}\right) du^2 - 2\,du\,dr + r^2 d\Omega^2. \tag{3.118} $$

In these coordinates the Schwarzschild metric is no longer singular at r = 2M;
however, the metric remains singular at r = 0. A similar form of the metric is
found when working with ingoing Eddington–Finkelstein coordinates.

Fig. 3.5 Incoming and outgoing geodesics in Schwarzschild coordinates (t, r), the lines correspond to u
= const. and v = const., the Schwarzschild radius r = 2M and the origin r = 0 are indicated by thick
lines. Also included are some light cones.

3.6.3. Kruskal–Szekeres coordinates

In order to find a coordinate system which covers the entire manifold, we
first introduce another set of coordinates given by

which removes the prefactor 1 − 2M/r from the metric. Last, instead of
working with these null coordinates (recall that u and v are constant for
outgoing or incoming null geodesics, and so are U and V ) we will introduce
a new time and a new space coordinate as follows:
In these so-called Kruskal–Szekeres coordinates the Schwarzschild metric
takes the final form

where the radius r is defined implicitly by

Metric (3.121) is now well defined and regular for all r > 0. Therefore, we
can create a spacetime diagram for the entire manifold, see Fig. 3.6.

Fig. 3.6 Incoming and outgoing geodesics in Kruskal–Szekeres coordinates (T, X), the dotted diagonal
lines correspond to the incoming and outgoing null geodesics. The Schwarzschild radius r = 2M and the
origin r = 0 are indicated by thick lines. Some r = const. lines are indicated by dashed lines.

3.6.4. Black holes

After discussing geodesics in the Schwarzschild spacetime, we are now in a
position to be slightly more precise about black holes. The Schwarzschild
radius is often referred to as the event horizon of black holes as it separates
two regions which are of particular interest. From Fig. 3.6 we conclude that
an outgoing null geodesic emitted at any radius r which is between the event
horizon and the centre 0 < r < 2M cannot leave this region. Graphically this
follows from the fact that radial null geodesics travel along diagonal lines and
hence cannot intersect with the r = 2M lines which are also diagonals. On the
other hand, all incoming radial null geodesics will eventually cross the
horizon and approach the singularity at r = 0. The precise mathematical
definition of a black hole in General Relativity is in fact not as simple as it
appears as there is no reason to believe that black holes have to be spherically
symmetric. Since most astrophysical objects are rotating, we would expect a
black hole to also rotate and hence have angular momentum, see for instance
Wald (1984). We mentioned that the centre of the black hole corresponds to a
true spacetime singularity which is ‘hidden’ from the outside by the black
hole horizon. It is widely believed that the gravitational collapse of stars or
other matter sources will produce black holes instead of ‘naked singularities’
which would be a singularity without a horizon hiding it. Research in these
directions is ongoing and some of the questions involved are very subtle.

3.7. Further Reading

This entire chapter focussed on one special solution of the Einstein field
equations. It would therefore be natural to continue with other known exact
solutions to the field equations. Of particular interest is the Kerr solution
which we can view as the rotating generalisation of the Schwarzschild
solution, see for instance Stephani et al. (2003, Chap. 20). Recall that the
Schwarzschild solution was found only a year after the field equations were
discovered. The Kerr solution, on the other hand, was only found in 1963.
The axially-symmetric Einstein field equations are considerably more
difficult than the spherically symmetric ones. There are many other
interesting solutions of the field equations which could form the basis of
further studies, on the more speculative end this would include wormhole
solutions for instance.
An interesting gravitational effect due to the rotation of a massive object
like the Earth is the so-called Lense–Thirring effect, sometimes called frame-
dragging effect. This very small effect would change the spin direction of
gyroscopes that orbit the Earth. This effect was eventually confirmed
experimentally by Gravity Probe B in 2011. The experimental verification of
General Relativity is thoroughly discussed in Will (2014).
Black holes are also a fascinating subject of current research. One can
show that the total area of all black holes in the Universe cannot decrease
which shows strong similarities with the second law of thermodynamics. This
analogy goes much further and one can formulate analogues of the other
thermodynamic laws in the context of black holes, see Wald (1984). In 1974,
Hawking used a quantum field theoretical calculation to show that black
holes emit black body radiation, Hawking radiation, of a specific
temperature, the so-called Hawking temperature.
The previously mentioned centennial perspective by Ashtekar et al.
(2015) would also be a suitable continuation for readers interested in an
overview of current research topics. Large parts of this book should be
accessible to readers at this point; all articles contain references to the
original literature.
For readers who wish to improve their problem and exercise-solving skills,
probably the best book available is by Lightman et al. (1975). It contains
almost 500 problems with fully worked-out solutions.
The recommended literature is by no means complete, it simply reflects
the author’s suggestions for further reading.

3.8. Exercises

The Schwarzschild Solution

Exercise 3.1. Find the Schwarzschild radius of a proton and compare this
with its actual radius.

Exercise 3.2. Find the non-vanishing Christoffel symbol components of
metric (3.5).

Exercise 3.3. Show that Eq. (3.22) is indeed equivalent to Eq. (3.19).

Exercise 3.4 (takes time). Consider the static and spherically symmetric
metric ds² = −e^{2A(r)}dt² + e^{2B(r)}(dr² + r²dθ² + r² sin²θ dϕ²) in isotropic
coordinates. Recall that this r is different from the radial coordinate r used in
standard Schwarzschild coordinates. Compute all non-vanishing Christoffel
symbol components and then compute the non-vanishing Ricci tensor
components.

Exercise 3.5. Use the Ricci tensor components from the previous exercise to
find the non-vanishing components of the Einstein tensor.

Exercise 3.6 (takes time). Finally, continuing from the previous two
exercises solve the vacuum field equations, thereby deriving from first
principles the Schwarzschild solution in isotropic coordinates given by Eq.
(3.22).

Exercise 3.7 (hard). Consider the metric

Show that this is the Schwarzschild metric by finding the coordinate


transformation which transforms this metric into the form (3.19). Find the
relationship between µ and mass parameter M .

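Exercise 3.8. Using the Schwarzschild coordinates (3.5), solve the vacuum
field equations in the presence of the cosmological term. The result is the
so-called Schwarzschild–de Sitter or Kottler solution which is given by

$$ds^2 = -\left(1 - \frac{2M}{r} - \frac{\Lambda}{3}r^2\right)dt^2 + \left(1 - \frac{2M}{r} - \frac{\Lambda}{3}r^2\right)^{-1}dr^2 + r^2 d\Omega^2.$$
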
The Schwarzschild Interior Solution


Exercise 3.9. The spatial part of the Schwarzschild interior metric (3.33) can
be written in the form $ds^2 = dr^2/(1 - kr^2) + r^2 d\Omega^2$. Show that this is the metric
of a 3-sphere.

Exercise 3.10. By solving the field equation with components i = j = 0 in Eq.
(3.6), find the spatial part of the Schwarzschild interior metric with
cosmological constant. Assume constant density.

Exercise 3.11. For the Schwarzschild interior solution we introduced the
mass definition Eq. (3.29) which corresponds to the Newtonian mass. This
definition ignores the fact that we should integrate on a curved manifold and
also ignores that pressure contributes to the gravitational field. The proper
mass is defined by

Find the proper mass for the Schwarzschild interior solution.

Exercise 3.12 (hard). The mass measured at infinity of a static and
spherically symmetric star is given by

Show that this mass equals the total mass M = m(R) defined by Eq. (3.31).

Geodesics in Schwarzschild Spacetime


Exercise 3.13 (A classic). Consider a radio commentator falling radially into
a Schwarzschild black hole. As he approaches the Schwarzschild radius, his
broadcast wavelength strongly redshifts. The radio listener (far away from the
black hole) observes the time dependence of this redshift λobs/λe ≈ exp(−t/µ).
Find the relationship between µ and the mass of the black hole.

Exercise 3.14. The setting of this question is the Schwarzschild spacetime.
Consider a freely falling, massive test particle initially at rest at r = 10M.
Show that the proper time required for this particle to reach the centre r = 0 is
given by

$$\tau = 5\sqrt{5}\,\pi M.$$

Exercise 3.15 (hard). The de Sitter solution in Schwarzschild coordinates is
given by

$$ds^2 = -\left(1 - \frac{\Lambda}{3}r^2\right)dt^2 + \left(1 - \frac{\Lambda}{3}r^2\right)^{-1}dr^2 + r^2 d\Omega^2,$$

which is the M → 0 limit of the Kottler or Schwarzschild–de Sitter solution.
Show that for Λ ≤ 0 this metric is regular everywhere. Identify the radius $r_\Lambda$
where the metric is singular when Λ > 0.
Find the geodesic equations of the de Sitter solution and consider radial
geodesics. Show that a freely falling observer starting at the origin with
velocity v will cross the surface r = rΛ for finite affine parameter, thereby
showing that rΛ corresponds to a coordinate singularity.

Testing General Relativity — The Classical Tests

Exercise 3.16 (hard). Show that Eq. (3.66) is indeed the equation of an
ellipse in this context. Determine the relationship between the constants in
this equation and those characterising the ellipse.

Exercise 3.17. Estimate the deflection angle of light rays or radio signals
passing nearby the surface of Jupiter, and compare it to that of the Sun.

Exercise 3.18. Consider the integrand of Eq. (3.104) and make a series
expansion in the mass parameter M up to linear terms.

The Schwarzschild Radius


Exercise 3.19. Consider the de Sitter metric (3.125). Introduce a new
coordinate u = t − f(r) which replaces the time coordinate. Find f(r) such that
the de Sitter metric becomes regular across $r_\Lambda$. Is this metric regular
everywhere?

Exercise 3.20. Derive the Schwarzschild metric in incoming
Eddington–Finkelstein coordinates.

Exercise 3.21. Derive the Schwarzschild metric using both incoming and
outgoing Eddington–Finkelstein coordinates (u, v) instead of the coordinates
(t, r).

Exercise 3.22. Using Eq. (3.120), find the explicit coordinate transformation
for the Kruskal–Szekeres coordinates T and X in terms of the original
Schwarzschild coordinates t and r. Distinguish between r > 2M and r < 2M
(this part is harder).
4
Cosmology

Cosmology is the study of the universe as a whole. Its study has a very long
history; humanity at all times was interested in the universe and longed to
understand it. Physical cosmology in its modern form began shortly after the
formulation of General Relativity when various researchers were interested in
understanding the cosmological consequences of the theory.

4.1. Classical and Modern Cosmology

4.1.1. Cosmological principle

One can broadly differentiate between classical cosmology and modern
cosmology. Classical cosmology focusses on the study of particular solutions
of the Einstein field equations which could model the universe. It was in
1929 when Hubble observed the redshift of distant galaxies thereby providing
observational evidence for an expanding universe. The next crucial
observation was the discovery of the cosmic microwave background radiation
by Penzias and Wilson in 1964. At this point in time cosmology was still a
relatively small research field, often seen as the more speculative end of
General Relativity.
While the 1964 observation was hugely important, it was much later that
cosmology was transformed into a substantial and independent research field.
In 1998/1999, observations of type Ia supernovae showed that the universe is
not just expanding, but that this expansion is in fact accelerating. The most
straightforward explanation for such a behaviour is to introduce the
cosmological constant Λ into the Einstein field equations. Together with the
COBE (1989) and WMAP (2001) missions, which studied in detail the
structure of the anisotropies of the cosmic microwave background radiation,
the so-called standard model of cosmology was formulated. Modern
Cosmology generally refers to the study of this standard model. In the
following we will discuss the most important aspects of Classical Cosmology
and then introduce the concepts of Modern Cosmology.
We recall that the Einstein field equations are very complicated in
general. However, when studying the static and spherically symmetric case in
Sec. 3.1, it turned out that the field equations simplified considerably because
of this symmetry. Our first task therefore is to identify some suitable
assumptions about the universe so that we arrive at manageable field
equations.
In order to do this, we also need to make some fundamental assumptions
about the laws of physics within our universe. We are unlikely to be able to
directly test the validity of physical laws on cosmological scales, and we also
need to establish what this scale is. As our primary working assumption we
assume that the laws of physics are the same everywhere and at all times, and
that the universe is connected.
Let us imagine looking at the night sky in a certain direction. We
note that some areas contain more bright objects than others. We take
this further by placing a fictitious cube of fixed side length into any part of our
observable universe and counting the number of galaxies in this volume. It turns
out that the number of galaxies in any cube of side length of about 100 Mpc (1
pc ≈ 3.26 ly ≈ $3.09 \times 10^{16}$ m) is approximately the same. On such scales the
universe looks statistically uniform, or homogeneous. On the other hand, the
cosmic microwave background radiation has roughly the same intensity in
every direction; it is isotropic. These two observations are generally promoted
to what is known as the cosmological principle.

Cosmological principle: The universe is homogeneous and isotropic at
all times when viewed on large enough scales.

This means the universe is the same for all observers, independent of their
location. On cosmological scales, test particles and observers are galaxies
modelled as a perfect fluid with energy density ρ and isotropic pressure p. By
a cosmological reference frame we mean a set of coordinates in which
physical quantities are homogeneous and isotropic. In this reference frame we
consider a comoving observer which is at rest in that frame. The proper time t
measured by this comoving observer, starting at t = 0, is called the
cosmological or cosmic time. This cosmological time is very similar to
Newtonian time in classical mechanics.

4.1.2. Geometry of constant time hypersurfaces

At any particular cosmic time $t_1$ the universe defines a 3D manifold, a
space-like hypersurface. It turns out that homogeneity and isotropy imply that this
hypersurface is a space of constant curvature. Any two such spaces of the
same dimension, same metric signature and same value of constant curvature
are locally isometric. Fortunately, spaces of constant curvature are quite
simple. Spaces of constant positive curvature are spheres, spaces with
vanishing curvature are Euclidean spaces and spaces of constant negative
curvature are hyperbolic spaces.
It can be shown that the metric of a 3D space of constant curvature can be
written as

$$ds^2 = \frac{dx^2 + dy^2 + dz^2}{\left(1 + \frac{k}{4}\left(x^2 + y^2 + z^2\right)\right)^2}, \tag{4.1}$$

where x, y, z are the standard Euclidean spatial coordinates, and k is the
curvature constant which can take the values k = {+1, 0, −1}. The Ricci
scalar of this 3-metric is given by

$$R = 6k, \tag{4.2}$$

which is a constant, and hence, the sign of k determines the sign of the scalar
curvature. Recall that we encountered various spaces of constant curvature in
the previous sections. Exercise 1.11 discussed hyperbolic space, while
Exercise 3.9 was about the 3-sphere.
Working with standard Euclidean coordinates in Eq. (4.1) has some
advantages when it comes to interpreting results. However, this choice of
coordinates is somewhat inconvenient since the denominator depends on all
spatial coordinates. To begin with we introduce standard spherical polar
coordinates (1.52)–(1.54). In these coordinates the metric becomes

$$ds^2 = \frac{dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)}{\left(1 + \frac{k}{4}r^2\right)^2}, \tag{4.3}$$

and r is the Euclidean distance from the origin. This metric simplifies
considerably if we introduce the new radial coordinate

$$\rho = \frac{r}{1 + \frac{k}{4}r^2}. \tag{4.4}$$

In the coordinates $X^i = \{\rho, \theta, \phi\}$ metric Eq. (4.1) takes the form

$$ds^2 = \frac{d\rho^2}{1 - k\rho^2} + \rho^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{4.5}$$

which is easier to handle for explicit computations.
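
The change of radial coordinate can be checked numerically. The following is a minimal sketch, assuming the transformation ρ = r/(1 + kr²/4) and the conformally flat radial factor 1/(1 + kr²/4)²; the function names are hypothetical and chosen only for this illustration. It compares the coefficient of dr² obtained from dρ²/(1 − kρ²) with the conformally flat one.

```python
import math

def rho_of_r(r, k):
    """New radial coordinate rho = r / (1 + k r^2 / 4) (assumed form)."""
    return r / (1.0 + 0.25 * k * r * r)

def grr_isotropic(r, k):
    """Coefficient of dr^2 in the conformally flat form of the 3-metric."""
    return 1.0 / (1.0 + 0.25 * k * r * r) ** 2

def grr_from_rho(r, k, h=1e-6):
    """Coefficient of dr^2 implied by d rho^2 / (1 - k rho^2).

    d rho / dr is estimated with a central finite difference.
    """
    drho_dr = (rho_of_r(r + h, k) - rho_of_r(r - h, k)) / (2.0 * h)
    rho = rho_of_r(r, k)
    return drho_dr ** 2 / (1.0 - k * rho * rho)

# Sample points avoid r = 2, where rho = 1 (k = +1) or rho diverges (k = -1).
for k in (1, 0, -1):
    for r in (0.1, 0.5, 1.2):
        assert math.isclose(grr_from_rho(r, k), grr_isotropic(r, k), rel_tol=1e-6)
```

The angular part needs no separate check, since ρ² = r²/(1 + kr²/4)² holds by definition of ρ.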

Note. Many authors, including myself, often write metric (4.5) using r
instead of ρ for the radial coordinate. For the purpose of this text it is cleaner
to use r for the true Euclidean distance and use ρ to emphasise that we are
working with a different radial coordinate, as defined in Eq. (4.4).

When k = 1, metric (4.5) is that of a 3-sphere (see Exercise 3.9) which has
finite volume. Recall that the volume of the manifold is given by Eq. (1.57)
which allowed us to show that $V(S^3) = 2\pi^2$. The respective volumes of
Euclidean space and hyperbolic space are infinite.

4.1.3. Friedmann–Lemaître–Robertson–Walker metric

The discussions of the previous subsection are now combined into a metric
ansatz which allows us to study homogeneous and isotropic cosmological
models in the context of General Relativity. This so-called
Friedmann–Lemaître–Robertson–Walker metric is given by

$$ds^2 = -dt^2 + a(t)^2\left[\frac{d\rho^2}{1 - k\rho^2} + \rho^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right], \tag{4.6}$$

where the unknown function a(t) is called the scale factor or sometimes
expansion parameter. The name scale factor is very natural as this function
‘scales the size’ of the spatial part of that metric. An obvious question is
whether one can generalise this metric by working with an additional
function N(t) so that the temporal part of the metric becomes $-N(t)^2 dt^2$. In
fact, this is equivalent to Eq. (4.6) since we can always introduce a new time
coordinate t′ via dt′ = N(t)dt which will absorb the new function. We noted
that Eq. (4.5) uses a convenient choice of coordinates, however, one can
introduce yet another coordinate system which is particularly elegant and
often used.
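For k = 1, we introduce a third angle by ρ = sin χ, when k = 0 we simply
relabel ρ = χ. When k = −1 we introduce a hyperbolic angle ρ = sinh χ. A
direct calculation shows that metric (4.6) can now be written as follows:

$$ds^2 = -dt^2 + a(t)^2\left[d\chi^2 + \Sigma(\chi)^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]. \tag{4.7}$$

The function Σ(χ) is given by

$$\Sigma(\chi) = \begin{cases} \sin\chi & k = +1, \\ \chi & k = 0, \\ \sinh\chi & k = -1. \end{cases} \tag{4.8}$$

One can also combine these three cases into one function by writing

$$\Sigma(\chi) = \frac{\sin\left(\sqrt{k}\,\chi\right)}{\sqrt{k}}, \tag{4.9}$$

provided one carefully deals with the limit k → 0, and also recalls the
relationship between trigonometric and hyperbolic functions, namely sin(ix)
= i sinh(x). The quantity Σ(χ) can be thought of as the metric distance in this
space and hence we will also write $d_{\rm metric} = \Sigma(\chi)$ when discussing
cosmological distances.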
There are two nice things about this function. First, it somewhat justifies
why one speaks of spherical and hyperbolic geometries (be careful here, in
general one cannot deduce geometrical facts directly from the metric as it
could be a simple space written in very strange coordinates). Second, let us
assume that χ ≪ 1 and write the series expansion of Σ(χ), we find

$$\Sigma(\chi) = \chi - \frac{k}{6}\chi^3 + O(\chi^5). \tag{4.10}$$

We now see one of the key features of differential geometry: locally all
spaces appear to be flat, and curvature is a higher-order effect.
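
As an illustration of how small the curvature correction is, the sketch below evaluates Σ(χ) for the three values of k and compares it with the truncated series χ − kχ³/6. The function names are hypothetical; the only assumption is that Σ(χ) equals sin χ, χ and sinh χ for k = +1, 0, −1, as described above.

```python
import math

def sigma(chi, k):
    """Metric distance function for curvature parameter k in {+1, 0, -1}."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")

def sigma_series(chi, k):
    """Series expansion of sigma up to third order in chi."""
    return chi - k * chi ** 3 / 6.0

chi = 0.1
for k in (1, 0, -1):
    error = abs(sigma(chi, k) - sigma_series(chi, k))
    # The first neglected term is of order chi^5 / 120.
    assert error <= chi ** 5 / 100.0
    print(k, sigma(chi, k), sigma_series(chi, k), error)
```

For χ = 0.1 the cubic term changes Σ by less than 0.2 per cent, and the remaining error is of order 10⁻⁷.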
Another important remark is in order at this point. Even if k = 0 so that
the spatial geometry is Euclidean, the full 4D manifold is not flat. We will see
this more explicitly when discussing the field equations where we will state
the Ricci scalar which is non-zero for k = 0.

4.1.4. Particle horizons

When discussing cosmology in the context of General Relativity we need to
ask the rather natural question: How much of the universe can we in principle
observe? Since signals can only travel at the speed of light, the age of the
universe multiplied by the speed of light gives us a certain length scale which
restricts what we can potentially observe at any given point in cosmological
time. In models where the universe is expanding in time, this issue is even
more pressing. In order to simplify the following discussion, we will briefly
restrict ourselves to k = 0. In this case we write our cosmological metric as

$$ds^2 = -dt^2 + a(t)^2\left(dx^2 + dy^2 + dz^2\right), \tag{4.11}$$

and work with spatial Euclidean coordinates. Make the coordinate
transformation

$$\tau = \int \frac{dt}{a(t)}, \tag{4.12}$$

so that Eq. (4.11) transforms into

$$ds^2 = a(\tau)^2\left(-d\tau^2 + dx^2 + dy^2 + dz^2\right). \tag{4.13}$$

The metric in this form is a multiple of the Minkowski metric and τ is called
conformal time. Hence, we would expect all coordinates to range from −∞ to
+∞. However, for τ being able to take all such values, the integral in Eq.
(4.12) cannot be arbitrary which in turn implies that a(t) is somewhat
restricted. Let us make this more formal.
We would like to establish which observers could have sent a signal
which reaches another observer at or before an event p. The boundary
between those world lines that can reach the second observer and those that
cannot is called the particle horizon at p. This is a local definition as different
observers will have different particle horizons.
Let us assume the universe began at t = 0 and let us consider an observer
at some time $t_{\rm obs}$, which means we are considering a universe of finite age. If
τ diverges as t → 0, the observer will be able to receive signals from each and
every point of the observer’s past. If $a(t) = a_0 t^\alpha$ with a positive constant $a_0$,
then the integral in Eq. (4.12) diverges as t → 0 if α ≥ 1 and hence there will
be no particle horizon. If, on the other hand, the integral converges, α < 1,
then there exists a particle horizon because τ will be bounded from below and
so part of the spacetime described by (4.13) is ‘missing’.
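
The convergence criterion can be made concrete with a short sketch. It assumes the power law a(t) = a₀t^α and evaluates the conformal time elapsed between an early time and the observation time in closed form; the function name and the sample values of α are hypothetical choices for illustration.

```python
import math

def conformal_time_span(alpha, t_start, t_obs, a0=1.0):
    """Integral of dt / (a0 * t**alpha) from t_start to t_obs, in closed form."""
    if alpha == 1.0:
        return math.log(t_obs / t_start) / a0
    p = 1.0 - alpha
    return (t_obs ** p - t_start ** p) / (a0 * p)

# Push the lower limit towards t = 0 and watch whether the integral settles.
for alpha in (0.5, 2.0 / 3.0, 1.0, 1.5):
    spans = [conformal_time_span(alpha, eps, 1.0) for eps in (1e-3, 1e-6, 1e-9)]
    print(alpha, spans)
```

For α = 1/2 and α = 2/3 the values approach 2 and 3 respectively, so τ is bounded from below and a particle horizon exists. For α = 1 and α = 3/2 the values keep growing as the lower limit approaches zero, and there is no particle horizon.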

4.1.5. Field equations

The cosmological Einstein field equations are

$$G_{ij} + \Lambda g_{ij} = 8\pi T_{ij}, \tag{4.14}$$

where Λ is the cosmological constant. Its value is about $\Lambda \approx 10^{-52}\,\mathrm{m}^{-2}$ which
means that it is only of relevance on very large scales. We can use the radius
associated with the cosmological constant as a rough guide to the size of the
universe; we have

$$r_\Lambda = \sqrt{3/\Lambda} \approx 1.7 \times 10^{26}\,\mathrm{m}, \tag{4.15}$$

which is in good agreement with various observations.


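The Ricci scalar for our metric (4.6) is given by

$$R = 6\left(\frac{\ddot a}{a} + \frac{\dot a^2}{a^2} + \frac{k}{a^2}\right), \tag{4.16}$$

where the dots denote differentiation with respect to cosmological time t. The
components of the Einstein tensor are given by

$$G^0{}_0 = -3\,\frac{\dot a^2 + k}{a^2}, \qquad G^1{}_1 = G^2{}_2 = G^3{}_3 = -\left(2\frac{\ddot a}{a} + \frac{\dot a^2 + k}{a^2}\right).$$

Likewise, the non-vanishing components of the energy–momentum tensor of
a perfect fluid are

$$T^0{}_0 = -\rho, \qquad T^1{}_1 = T^2{}_2 = T^3{}_3 = p,$$

where ρ stands for the energy density and p is the pressure.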


The cosmological Einstein field equations are often written in two
different but equivalent forms. The first form is the one directly obtained by
equating the Einstein tensor with the matter tensor and adding the
cosmological term, which results in

$$3\frac{\dot a^2}{a^2} + 3\frac{k}{a^2} - \Lambda = 8\pi\rho, \tag{4.20}$$

$$-2\frac{\ddot a}{a} - \frac{\dot a^2}{a^2} - \frac{k}{a^2} + \Lambda = 8\pi p. \tag{4.21}$$

An alternative form of these equations which is often used in cosmology is
found by eliminating the term ȧ/a from the second equation (4.21) which
yields

$$\frac{\dot a^2}{a^2} + \frac{k}{a^2} = \frac{8\pi}{3}\rho + \frac{\Lambda}{3}, \tag{4.22}$$

$$\frac{\ddot a}{a} = -\frac{4\pi}{3}\left(\rho + 3p\right) + \frac{\Lambda}{3}. \tag{4.23}$$

This set of equations is often referred to as the Friedmann equations. They are
the starting point for analysing cosmological solutions to the Einstein field
equations and are the very basis of cosmology. In principle one could start
the study of cosmology from these equations without much reference to
General Relativity. While this approach is possible up to some point, more
advanced subjects in cosmology will again require the full machinery of
differential geometry.
We recall that the energy–momentum tensor satisfies the conservation
equation $\nabla_j T^{ij} = 0$, and we must keep in mind that this equation is not
independent of the field equations. It is implied by the field equations by
virtue of the twice contracted Bianchi identities, see Theorem 1.6. One can
see this explicitly by differentiating Eq. (4.22) with respect to time t, and
eliminating the curvature parameter k from the Friedmann equations,
which leads to

$$\dot\rho + 3\frac{\dot a}{a}\left(\rho + p\right) = 0. \tag{4.24}$$

One tends to refer to this equation as the cosmological energy–momentum
conservation equation. This is equivalent to the equation $\nabla_j T^{0j} = 0$; the three
equations with i = 1, 2, 3 are identically satisfied.
Cosmological models based on the Friedmann–Lemaître–Robertson–
Walker metric are described by two independent equations, two out of the
three Eqs. (4.22)–(4.24). They contain three unknown functions, namely the
scale factor a(t), and the matter quantities ρ(t) and p(t). Therefore, our equations are
under-determined and we need some additional input to close this system.
The most physical approach is to specify the matter content of the universe;
this means we must choose an equation of state for the matter. In cosmology
one typically considers a perfect fluid with linear equation of state p = wρ
where w is the equation of state parameter. In classical mechanics w would
correspond to the square of the sound speed in this fluid.
When w = 0, the pressure vanishes and one speaks of a pressureless
perfect fluid which is often called a dust. In cosmology this is called a matter-
dominated universe. For w = 1/3 one deals with radiation; recall that radiation
also carries momentum which results in radiation having pressure. For
standard matter the equation of state parameter satisfies 0 ≤ w < 1. The upper
bound means we require the speed of sound to be less than the speed of light.
When w = 1 one speaks of stiff matter, for which the sound speed would be the speed of
light. Cosmology also deals with non-standard equations of state: for −1 ≤ w
< −1/3 one speaks of dark energy, w = −1 corresponds to the cosmological
constant, and for w < −1 the fluid is called a phantom fluid.
For such a linear equation of state p = wρ, the energy–momentum
conservation equation (4.24) can be integrated using separation of variables
which yields

$$\rho = \rho_0\, a^{-3(1+w)}. \tag{4.25}$$

The constant of integration is denoted by $\rho_0$. We will interpret this result in
the following section when discussing cosmological solutions in detail.
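
The scaling of the energy density with the scale factor can be tabulated for the equations of state mentioned above. The sketch below assumes ρ = ρ₀a^(−3(1+w)) and checks it against the conservation equation rewritten with a as the independent variable, dρ/da + 3(1 + w)ρ/a = 0. The function names and the choice ρ₀ = 1 are hypothetical.

```python
def energy_density(a, w, rho0=1.0):
    """Energy density rho = rho0 * a**(-3 (1 + w)) for the equation of state p = w rho."""
    return rho0 * a ** (-3.0 * (1.0 + w))

def conservation_residual(a, w, h=1e-6):
    """Residual of d rho/d a + 3 (1 + w) rho / a, estimated by central differences."""
    drho_da = (energy_density(a + h, w) - energy_density(a - h, w)) / (2.0 * h)
    return drho_da + 3.0 * (1.0 + w) * energy_density(a, w) / a

# Dust, radiation, stiff matter and a cosmological constant.
for name, w in (("dust", 0.0), ("radiation", 1.0 / 3.0), ("stiff", 1.0), ("Lambda", -1.0)):
    print(name, energy_density(2.0, w), conservation_residual(2.0, w))
```

Doubling the scale factor dilutes dust by a factor of 8 and radiation by a factor of 16, while w = −1 leaves the density unchanged, as expected for a cosmological constant.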

4.2. Cosmological Solutions

4.2.1. Matter-dominated universe

For a matter-dominated universe (w = 0) the conservation equation can be
integrated and we find $\rho = \rho_0 a^{-3}$, see Eq. (4.25), which holds independently
of the curvature parameter k. This is, of course, the expected result for matter
when we recall the basic definition of density, namely mass or energy per
volume. The scale factor a corresponds to a length scale and hence we can
interpret $a^3$ as a volume. Since the cosmological Einstein field equations are
two independent equations and we have already solved one of them, we are
left with one more equation to be solved, namely Eq. (4.22). Upon
substitution of the density in terms of the scale factor we have

$$\dot a^2 + k = \frac{8\pi\rho_0}{3a} + \frac{\Lambda}{3}a^2. \tag{4.26}$$

We can solve this equation for ȧ and apply separation of variables which
results in

$$\int \frac{da}{\sqrt{\frac{8\pi\rho_0}{3a} + \frac{\Lambda}{3}a^2 - k}} = t - t_0, \tag{4.27}$$

where $t_0$ is a constant of integration. The positive root will always be chosen
so that the scale factor is positive. Our aim of finding the function a(t) has
been reduced to an integration problem. Unfortunately, integrals of this type
are pretty hard and we cannot state an explicit solution in terms of elementary
functions for all Λ and k. However, we can calculate this integral for some
cases, thereby understanding the qualitative behaviour of the different types
of possible solutions.
Let us begin by considering the simplest case, a matter-dominated
universe with zero curvature parameter k = 0 and vanishing cosmological
constant, Λ = 0. In this case, our integral simplifies considerably and we have

$$\int \sqrt{\frac{3a}{8\pi\rho_0}}\, da = \frac{2}{3}\sqrt{\frac{3}{8\pi\rho_0}}\, a^{3/2} = t - t_0, \tag{4.28}$$

which we can now solve for the scale factor. The result is

$$a(t) = \left(6\pi\rho_0\right)^{1/3}\left(t - t_0\right)^{2/3}, \tag{4.29}$$

and we assume that the universe had zero volume at time $t_0$ so that $a(t_0) = 0$,
moreover, we will set $t_0 = 0$, as the time when the universe began is arbitrary.
The most important aspect of this solution is the scaling of a(t) with respect
to t, namely $a(t) \propto t^{2/3}$. This implies that there exists a particle horizon.
Moreover, $\rho(t) \propto t^{-2}$ so that the energy density diverges as t → 0 which is
called the big bang.
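
The power law can be cross-checked by integrating the Friedmann equation numerically. The sketch below assumes units with G = c = 1, the illustrative value ρ₀ = 1, the equation ȧ = √(8πρ₀/(3a)) for k = 0 and Λ = 0, and the closed form a(t) = (6πρ₀)^(1/3) t^(2/3). It uses a classical fourth-order Runge–Kutta step, which is a choice made here and not taken from the text.

```python
import math

RHO0 = 1.0                        # assumed integration constant rho_0 (G = c = 1)
C = 8.0 * math.pi * RHO0 / 3.0    # combination 8 pi rho_0 / 3

def a_exact(t):
    """Closed-form scale factor for k = 0, Lambda = 0, with a(0) = 0."""
    return (6.0 * math.pi * RHO0) ** (1.0 / 3.0) * t ** (2.0 / 3.0)

def a_dot(a):
    """Friedmann equation for dust with k = 0 and Lambda = 0, solved for da/dt."""
    return math.sqrt(C / a)

def integrate(t_start, t_end, steps=1000):
    """Integrate da/dt = a_dot(a) with RK4, starting on the exact solution."""
    h = (t_end - t_start) / steps
    a = a_exact(t_start)          # start away from the singular point t = 0
    for _ in range(steps):
        k1 = a_dot(a)
        k2 = a_dot(a + 0.5 * h * k1)
        k3 = a_dot(a + 0.5 * h * k2)
        k4 = a_dot(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

a_numeric = integrate(0.1, 2.0)
assert math.isclose(a_numeric, a_exact(2.0), rel_tol=1e-6)
print(a_numeric, a_exact(2.0))
```

With this solution the density ρ₀/a³ equals 1/(6πt²), independent of ρ₀, which makes the divergence as t → 0 explicit.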
Next, we are interested in the matter-dominated universe with k = 1 in
which case we need to integrate

$$\int \frac{da}{\sqrt{\frac{8\pi\rho_0}{3a} - 1}} = t - t_0. \tag{4.30}$$

We can evaluate this integral using the substitution $a = (8\pi\rho_0/3)\sin^2(u/2)$ so
that we arrive at

$$\frac{4\pi\rho_0}{3}\left(u - \sin u\right) = t - t_0. \tag{4.31}$$

It turns out that we cannot solve this equation explicitly for the scale factor
a(t) and have to accept a solution in parametric form which we can write as

$$a(u) = \frac{4\pi\rho_0}{3}\left(1 - \cos u\right), \qquad t(u) - t_0 = \frac{4\pi\rho_0}{3}\left(u - \sin u\right). \tag{4.32}$$

We set $t_0 = 0$ which means that a(t = 0) = 0. The curve described by these
equations is in fact well known in mechanics; it is the standard
parametrisation of the cycloid. The cycloid is the curve traced by a point on
the rim of a circle rolling along a straight line. A universe of this type will
have a maximal size because of Eq. (4.32). The maximum is attained when u
= π which means we have

$$a_{\max} = \frac{8\pi\rho_0}{3}, \tag{4.33}$$

after which the universe will contract and decrease in size. At $t_{\rm end} = 8\pi^2\rho_0/3$
the universe will have shrunk to a = 0. The final phase of this universe is
called the big crunch, as opposed to the big bang. The scale factor for the matter-
dominated universe is shown in Fig. 4.1.

Fig. 4.1 Scale factor a(t) for a matter-dominated universe without cosmological term, for the three
different curvature parameters.

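
The parametric solution of the closed model can be verified directly. The sketch below assumes the cycloid parametrisation a(u) = (4πρ₀/3)(1 − cos u), t(u) = (4πρ₀/3)(u − sin u) with t₀ = 0, the illustrative value ρ₀ = 1 in units G = c = 1, and the Friedmann equation ȧ² + 1 = 8πρ₀/(3a) for k = 1 and Λ = 0. The function names are hypothetical.

```python
import math

RHO0 = 1.0                        # assumed integration constant rho_0
C = 8.0 * math.pi * RHO0 / 3.0    # combination 8 pi rho_0 / 3

def a_of_u(u):
    """Scale factor along the cycloid."""
    return 0.5 * C * (1.0 - math.cos(u))

def t_of_u(u):
    """Cosmological time along the cycloid (t0 = 0)."""
    return 0.5 * C * (u - math.sin(u))

def friedmann_residual(u):
    """a_dot^2 + 1 - C / a, with a_dot = (da/du) / (dt/du)."""
    da_du = 0.5 * C * math.sin(u)
    dt_du = 0.5 * C * (1.0 - math.cos(u))
    a_dot = da_du / dt_du
    return a_dot ** 2 + 1.0 - C / a_of_u(u)

for u in (0.5, 1.0, 2.0, 4.0, 6.0):
    assert abs(friedmann_residual(u)) < 1e-9

print("a_max =", a_of_u(math.pi))          # equals 8 pi rho_0 / 3
print("t_end =", t_of_u(2.0 * math.pi))    # equals 8 pi^2 rho_0 / 3
```

The maximal size is reached at half the total lifetime, reflecting the symmetry of the cycloid about u = π.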
4.2.2. Radiation-dominated universe

Having discussed the matter-dominated solutions first, we now move on to
the radiation-dominated solutions of the Einstein field equations. As before,
we will restrict ourselves to Λ = 0 for now and deal with the cosmological
term later. For a radiation-dominated universe, the energy–momentum
conservation equation, Eq. (4.25) yields $\rho = \rho_0 a^{-4}$. The remaining Einstein
field equation (4.22) now gives

$$\dot a^2 + k = \frac{8\pi\rho_0}{3a^2}, \tag{4.34}$$

which we can integrate explicitly for all three possible curvature parameters.
This results in

$$-\frac{1}{k}\sqrt{\frac{8\pi\rho_0}{3} - k a^2} = t - t_0 \tag{4.35}$$

for k = ±1, while for k = 0 we find

$$\frac{a^2}{2}\sqrt{\frac{3}{8\pi\rho_0}} = t - t_0. \tag{4.36}$$

Let us solve those equations for the scale factor, choosing the constant of
integration such that a(0) = 0. For all three values of k this gives

$$a(t) = \sqrt{2\sqrt{\frac{8\pi\rho_0}{3}}\; t - k t^2}.$$

For a spatially flat universe we have $a(t) \propto t^{1/2}$, $\rho(t) \propto t^{-2}$, and there exists a
particle horizon. As in the matter-dominated case, the spatially-closed
universe has a maximal size $a_{\max}$ and a finite age $t_{\max}$. The hyperbolic case is
characterised by a(t) ∝ t for large t.
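
A quick numerical check of the radiation-dominated solutions is sketched below. It assumes units with G = c = 1, the closed form a(t) = √(2√(8πρ₀/3) t − kt²) with the integration constant fixed by a(0) = 0, and the Friedmann equation ȧ² + k = 8πρ₀/(3a²). The function names and ρ₀ = 1 are hypothetical choices.

```python
import math

def a_radiation(t, k, rho0=1.0):
    """Scale factor of a radiation-dominated universe with Lambda = 0 and a(0) = 0."""
    c = 8.0 * math.pi * rho0 / 3.0
    return math.sqrt(2.0 * math.sqrt(c) * t - k * t * t)

def friedmann_residual(t, k, rho0=1.0, h=1e-6):
    """a_dot^2 + k - (8 pi rho0 / 3) / a^2, with a_dot from central differences."""
    c = 8.0 * math.pi * rho0 / 3.0
    a = a_radiation(t, k, rho0)
    a_dot = (a_radiation(t + h, k, rho0) - a_radiation(t - h, k, rho0)) / (2.0 * h)
    return a_dot ** 2 + k - c / a ** 2

for k in (1, 0, -1):
    for t in (0.5, 1.0, 2.0):
        assert abs(friedmann_residual(t, k)) < 1e-6

# Closed model: the maximum a_max = sqrt(8 pi rho0 / 3) is reached at t = a_max.
c = 8.0 * math.pi / 3.0
print(a_radiation(math.sqrt(c), 1), math.sqrt(c))
```

For k = +1 the expression under the square root vanishes again at t = 2√(8πρ₀/3), which is the finite age of the closed model, while for k = −1 the t² term dominates at late times and a(t) grows linearly.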

4.2.3. The Einstein static universe


The Einstein static universe is a very particular solution of the Einstein field
equations which is mainly of historical interest. Consider the field equations
with k = +1, p = 0 and Λ ≠ 0. Is it possible to find static solutions where $a = a_E$
and $\rho = \rho_E$ are constants? The field equation (4.22) gives

$$\frac{1}{a_E^2} = \frac{8\pi\rho_E}{3} + \frac{\Lambda}{3},$$

which means that the ‘size’ of this universe is determined by the energy
density and the cosmological constant. Moreover, from the other field
equation (4.23) we find

$$\Lambda = 4\pi\rho_E.$$

Thus, the energy density and the cosmological constant cannot be chosen
independently. Note that this solution only exists for positive Λ as energy
densities cannot be negative. Without cosmological term this solution does
not exist.
The most interesting aspect of this solution is that it is unstable with
respect to small homogeneous perturbations of the scale factor and the energy
density. This is a very nice calculation discussed in Exercise 4.8.

4.2.4. De Sitter universe

Let us consider the Einstein field equations with cosmological term Λ ≠ 0,
without any additional matter sources, this means ρ = p = 0.
The one independent field equation is

$$\dot a^2 + k = \frac{\Lambda}{3}a^2, \tag{4.41}$$

which we can solve using separation of variables. This gives

$$\int \frac{da}{\sqrt{\frac{\Lambda}{3}a^2 - k}} = t - t_0, \tag{4.42}$$

which for k = 0 immediately results in

$$a(t) = a_0 \exp\left(\sqrt{\frac{\Lambda}{3}}\, t\right), \tag{4.43}$$

where $a_0$ is our new constant of integration. This solution is generally called
the de Sitter solution; it is particularly interesting in the context of inflation.
We note that the de Sitter universe cannot have started with a big bang at
finite time in the past, unlike the matter and radiation-dominated models.
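The general solution to Eq. (4.42) can be written in the form

$$a(t) = a_0\cosh\left(\sqrt{\frac{\Lambda}{3}}\, t\right) + \sqrt{a_0^2 - \frac{3k}{\Lambda}}\,\sinh\left(\sqrt{\frac{\Lambda}{3}}\, t\right), \tag{4.44}$$

which for k = 0 reduces to (4.43). As before, the constant of integration is
chosen such that $a(0) = a_0$.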
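
One way to write the general solution is a(t) = a₀ cosh(ωt) + √(a₀² − 3k/Λ) sinh(ωt) with ω = √(Λ/3). This form is an assumption made here for illustration, and it requires Λ > 0 and a₀² ≥ 3k/Λ. The sketch below checks that it satisfies ȧ² + k = Λa²/3 and that it reduces to a pure exponential for k = 0. The function names and the sample values are hypothetical.

```python
import math

def a_de_sitter(t, k, lam, a0):
    """Assumed general vacuum solution with a(0) = a0, for lam > 0 and a0^2 >= 3 k / lam."""
    omega = math.sqrt(lam / 3.0)
    b = math.sqrt(a0 * a0 - 3.0 * k / lam)
    return a0 * math.cosh(omega * t) + b * math.sinh(omega * t)

def friedmann_residual(t, k, lam, a0, h=1e-6):
    """a_dot^2 + k - lam a^2 / 3, with a_dot from central differences."""
    a = a_de_sitter(t, k, lam, a0)
    a_dot = (a_de_sitter(t + h, k, lam, a0) - a_de_sitter(t - h, k, lam, a0)) / (2.0 * h)
    return a_dot ** 2 + k - lam * a * a / 3.0

LAM, A0 = 3.0, 2.0    # illustrative values only
for k in (1, 0, -1):
    for t in (-0.5, 0.0, 0.5):
        assert abs(friedmann_residual(t, k, LAM, A0)) < 1e-6

# For k = 0 the cosh and sinh terms combine into a single exponential.
assert math.isclose(a_de_sitter(0.7, 0, LAM, A0), A0 * math.exp(0.7))
```

Since the solution remains positive for all negative t when k = 0, the scale factor never reaches zero at a finite time in the past.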
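For the de Sitter solution with k = 0 the resulting line element is

$$ds^2 = -dt^2 + a_0^2\exp\left(2\sqrt{\frac{\Lambda}{3}}\, t\right)\left(dx^2 + dy^2 + dz^2\right), \tag{4.45}$$

and has some very interesting properties which are not at all obvious at first
sight. First, it appears that this metric is non-static, however, we can show
that it is in fact static. Let us consider a time translation $t \mapsto \tilde t + T$ where T is
some constant, which means the metric becomes

$$ds^2 = -d\tilde t^2 + a_0^2\exp\left(2\sqrt{\frac{\Lambda}{3}}\left(\tilde t + T\right)\right)\left(dx^2 + dy^2 + dz^2\right). \tag{4.46}$$

Since $\exp\left(2\sqrt{\Lambda/3}\,(\tilde t + T)\right) = \exp\left(2\sqrt{\Lambda/3}\, T\right)\exp\left(2\sqrt{\Lambda/3}\,\tilde t\right)$ we note that the
spatial part of the metric is rescaled. Introducing new spatial coordinates
$\tilde x = \exp\left(\sqrt{\Lambda/3}\, T\right) x$ and similarly for the other two coordinates, we find that the
de Sitter metric, after time translation and rescaling of the spatial coordinates,
is given by

$$ds^2 = -d\tilde t^2 + a_0^2\exp\left(2\sqrt{\frac{\Lambda}{3}}\,\tilde t\right)\left(d\tilde x^2 + d\tilde y^2 + d\tilde z^2\right), \tag{4.47}$$

which is identical to Eq. (4.45). Therefore the de Sitter metric is
form-invariant under time translation and hence static. This is slightly
surprising; it turns out that de Sitter space has a larger symmetry group than
the other cosmological metrics we encountered. De Sitter space has some
other interesting mathematical properties, for instance it can be mapped to
part of a 5D hyperboloid. Our choice of coordinates only covers part of the
entire manifold.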

4.3. Physical Cosmology

Our next aim is to connect the underlying geometry with quantities that can
be determined by observations. The Friedmann–Lemaître–Robertson–Walker
metric contains only one function, the scale factor a(t), and the curvature
parameter k. Recalling that one function already contains an infinite number
of degrees of freedom, it is clear that we cannot determine the exact
functional form of the scale factor by observations. This is intuitively clear:
no experiment can access the form of a(t) for $t > t_{\rm today}$. However, a viable
approach would be to make a series expansion of a(t) around $t_{\rm today}$ with the
aim of determining the first few coefficients in this series by observations.
Having said that, we could of course solve the field equations for specific
forms of matter and determine the functional form of a(t) for all times and
compare this with cosmological observations. It turns out that cosmology
follows both these approaches.

4.3.1. Cosmological parameters

Let us begin here with the aforementioned series expansion of a(t) at the time
$t_0 = t_{\rm today}$. Conventionally, quantities with subscript 0 refer to their values
today, where today really refers to the current cosmological time, and we will
write $a_0 = a(t_0)$. Therefore we have

$$a(t) = a_0 + \dot a_0\left(t - t_0\right) + \frac{1}{2}\ddot a_0\left(t - t_0\right)^2 + \ldots \tag{4.48}$$

The current size of the universe $a_0$ is a matter of convention; often this
quantity is simply set to one, however, we will keep $a_0$ arbitrary for now.
Since we cannot determine this number, we will write our series as follows:

$$a(t) = a_0\left[1 + \frac{\dot a_0}{a_0}\left(t - t_0\right) + \frac{1}{2}\frac{\ddot a_0}{a_0}\left(t - t_0\right)^2 + \ldots\right]. \tag{4.49}$$

The quantity $\dot a_0/a_0$ corresponds to an expansion or contraction velocity of the
universe, depending on the sign of $\dot a_0$. We define the Hubble function

$$H(t) = \frac{\dot a(t)}{a(t)}, \tag{4.50}$$

which measures the rate of expansion. We note that this quantity is invariant
under rescaling of a(t) by some constant, a(t) ↦ λa(t) which follows from its
definition. Today’s value of H is denoted by $H_0$ and is called the Hubble
parameter. It is conventional to use units of km/s/Mpc to give the measured
value of $H_0$. This corresponds to units of 1/s which is consistent with the
units in Eq. (4.50).
Since the quantity $H_0$ has units of inverse time, the quantity $x = (t - t_0)H_0$
is dimensionless. Our next step is to re-write Eq. (4.49) using this quantity
which leads to

$$a(t) = a_0\left[1 + x - \frac{1}{2}q_0 x^2 + \ldots\right], \tag{4.51}$$

where we introduced the new dimensionless parameter $q_0 = -\ddot a_0/(a_0 H_0^2)$. The
number $q_0$ is called the deceleration parameter; the minus sign in the
definition and also the name are historically motivated. It measures the
acceleration of the expansion, with $q_0 > 0$ corresponding to a decelerating
universe. In principle one can continue with
the series (4.51) to include higher-order terms in x, and introduce further
dimensionless constants. For the present purpose it is sufficient to discuss the
Hubble parameter and the deceleration parameter. The definitions of these
two parameters are based on the Friedmann–Lemaître–Robertson–Walker
metric and are independent of the Einstein field equations; one could speak of
kinematic quantities.
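
To get a feeling for these kinematic quantities, the sketch below converts a Hubble parameter from km/s/Mpc to 1/s and evaluates the deceleration parameter for a power-law scale factor a ∝ t^α, for which q = (1 − α)/α. The value 70 km/s/Mpc is an illustrative assumption and not a value taken from the text; the function names and the conversion constants are likewise choices made here.

```python
import math

KM_PER_MPC = 3.0857e19      # one megaparsec in kilometres
SECONDS_PER_GYR = 3.156e16  # one billion years in seconds

def hubble_in_inverse_seconds(h0_km_s_mpc):
    """Convert H0 from km/s/Mpc to 1/s."""
    return h0_km_s_mpc / KM_PER_MPC

def hubble_time_gyr(h0_km_s_mpc):
    """The time scale 1/H0 expressed in billions of years."""
    return 1.0 / hubble_in_inverse_seconds(h0_km_s_mpc) / SECONDS_PER_GYR

def deceleration_power_law(alpha):
    """q = -a_ddot a / a_dot^2 for a(t) proportional to t**alpha."""
    return (1.0 - alpha) / alpha

print(hubble_in_inverse_seconds(70.0))    # roughly 2.27e-18 per second
print(hubble_time_gyr(70.0))              # roughly 14 billion years
print(deceleration_power_law(2.0 / 3.0))  # matter-dominated, k = 0
print(deceleration_power_law(0.5))        # radiation-dominated, k = 0
```

Both power-law models decelerate: q is close to 1/2 for a ∝ t^(2/3) and equal to 1 for a ∝ t^(1/2). An accelerating power law would require α > 1.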
On the other hand, let us now recall the Einstein field equation (4.22)
using the Hubble parameter

$$H^2 + \frac{k}{a^2} = \frac{8\pi}{3}\rho + \frac{\Lambda}{3}. \tag{4.52}$$

Division by $H^2$ on both sides of this equation yields an equation which
naturally contains only dimensionless quantities

$$1 + \frac{k}{a^2 H^2} = \frac{8\pi\rho}{3H^2} + \frac{\Lambda}{3H^2}. \tag{4.53}$$

This leads to the definition of the density parameters in cosmology

$$\Omega = \frac{8\pi\rho}{3H^2} = \frac{\rho}{\rho_{\rm crit}}, \qquad \Omega_\Lambda = \frac{\Lambda}{3H^2}. \tag{4.54}$$

Here $\rho_{\rm crit}$ is the so-called critical energy density which is given by
$\rho_{\rm crit} = 3H^2/(8\pi)$ and we should note that this is a function of time in general. In
models with matter and radiation one distinguishes the different density
parameters with additional subscripts.
Next, we introduce the total energy density $\Omega_{\rm total} = \Omega + \Omega_\Lambda$ which allows
us to write the first field equation in the very neat form

$$\Omega_{\rm total} - 1 = \frac{k}{a^2 H^2}. \tag{4.55}$$

Despite its simplicity, this is one of the most important equations in
cosmology. In cosmological models where k ≠ 0, the total density $\Omega_{\rm total}$ is a
function of time. However, if k = 0 then $\Omega_{\rm total} = 1$ at all times. Hence, in
spatially flat cosmological models the total density is determined from the
very beginning. We can also read this equation the other way around. If we
were able to accurately measure or estimate the matter content of the
universe, we could determine the curvature of the constant time
hypersurfaces. This is quite amazing really! Note that we are only interested
in the sign of k since its numerical values can always be rescaled. What we
are witnessing here is the very essence of General Relativity, namely that
geometry and matter are inextricably linked via the Einstein field equations.
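
The link between matter content and spatial curvature can be phrased as a small computation. The sketch below assumes units with G = c = 1, the definitions Ω = 8πρ/(3H²) and Ω_Λ = Λ/(3H²), and the relation Ω_total − 1 = k/(a²H²), so that the sign of k equals the sign of Ω_total − 1. The function names, the tolerance and the sample numbers are hypothetical.

```python
import math

def density_parameters(rho, lam, hubble):
    """Return (Omega, Omega_Lambda) for energy density rho, cosmological constant lam and Hubble function."""
    omega = 8.0 * math.pi * rho / (3.0 * hubble ** 2)
    omega_lambda = lam / (3.0 * hubble ** 2)
    return omega, omega_lambda

def curvature_sign(omega_total, tol=1e-9):
    """Sign of k inferred from Omega_total - 1 = k / (a H)^2."""
    diff = omega_total - 1.0
    if abs(diff) <= tol:
        return 0
    return 1 if diff > 0.0 else -1

# Illustrative numbers: choose rho and lam such that Omega = 0.3 and Omega_Lambda = 0.7 at H = 1.
rho = 0.3 * 3.0 / (8.0 * math.pi)
lam = 0.7 * 3.0
omega, omega_lambda = density_parameters(rho, lam, 1.0)
print(omega, omega_lambda, curvature_sign(omega + omega_lambda))
```

The tolerance reflects the fact that a measured total density can only establish spatial flatness within the accuracy of the measurement.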

4.3.2. Redshift

Electromagnetic signals in General Relativity and Cosmology are described
by null geodesics. In particular we are interested in radial geodesics where θ
and ϕ remain constant. The Lagrangian describing such geodesics follows
from Eq. (4.7) and is given by

$$L = -\dot t^2 + a(t)^2\dot\chi^2, \tag{4.56}$$

where the dot here denotes differentiation with respect to an affine parameter.
Since we are considering null geodesics, L = 0 and we immediately find

$$\int_{\chi_1}^{\chi_0} d\chi = \pm\int_{t_1}^{t_0}\frac{dt}{a(t)}, \tag{4.57}$$

where the sign and the limits of this integral depend on some conventions.
We choose coordinates so that the observer is located at the origin, and as the
signal was emitted before it was received, $t_1 < t_0$. Since we consider signals
travelling towards the observer at the origin, we must choose the negative
sign.
In order to clarify the notation we will now denote $t_e = t_1$, and $t_0 = t_{\rm obs}$
where the subscripts stand for emitter and observer, respectively. Hence, the
final form of our geodesic equation is

$$\chi = \int_{t_e}^{t_{\rm obs}}\frac{dt}{a(t)}. \tag{4.58}$$

The quantity χ is called the comoving distance in cosmology. We already
noted that for a k = 0 universe this is equal to the metric distance, which is
simply $d_{\rm metric} = \chi$. It is no coincidence that this is identical to the newly
introduced conformal time coordinate in Eq. (4.12).
Let us now assume that a first signal is emitted at time te and that a
second signal is sent at time te +δte. The emitter is assumed to be at rest, more
precisely comoving, by which we mean that it follows the expansion of the
universe only and has no intrinsic motion relative to the observer. The
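comoving distance is unchanged between the two signals so that

$$\int_{t_e+\delta t_e}^{t_{\text{obs}}+\delta t_{\text{obs}}} \frac{dt}{a(t)} - \int_{t_e}^{t_{\text{obs}}} \frac{dt}{a(t)} = 0. \tag{4.59}$$

We split the first integral into three parts by changing the limits. Instead of
integrating from te + δte to tobs + δtobs, we integrate from te to tobs, then from
tobs to tobs + δtobs, and then need to subtract the integral from te to te + δte.
The integral from te to tobs cancels the last term in Eq. (4.59) and we arrive at

$$\int_{t_{\text{obs}}}^{t_{\text{obs}}+\delta t_{\text{obs}}} \frac{dt}{a(t)} = \int_{t_e}^{t_e+\delta t_e} \frac{dt}{a(t)}. \tag{4.60}$$

Next we assume that δtobs and δte are small compared with the time scale on
which a(t) changes, so that we can apply the
fundamental theorem of calculus which yields

$$\frac{\delta t_{\text{obs}}}{a(t_{\text{obs}})} = \frac{\delta t_e}{a(t_e)} \tag{4.61}$$

plus corrections of higher order in the small parameters. We can interpret
δtobs and δte as the time intervals between any two maxima of the signal’s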
wave when observed or emitted. Hence we can write δtobs = λobs and δte = λe,
so that Eq. (4.61) simply becomes λobs/λe = a(tobs)/a(te). The standard
definition of the redshift is the relative difference between the observed and
emitted wavelengths; this quantity is usually denoted by z. Applied to our
cosmological setting we have

$$z = \frac{\lambda_{\text{obs}} - \lambda_e}{\lambda_e} = \frac{a(t_{\text{obs}})}{a(t_e)} - 1. \tag{4.62}$$

Therefore, the cosmological redshift due to the universe’s expansion is
determined entirely by the scale factor at the time of emission and time of
observation. This should not be too surprising as our cosmological model is
based on homogeneity and isotropy and only contains one unknown function
in the metric. Our final result is the simple formula

$$1 + z = \frac{a(t_{\text{obs}})}{a(t_e)}. \tag{4.63}$$
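The redshift relation can be checked numerically. The following is a minimal sketch, assuming the flat matter-dominated scale factor a(t) = t^(2/3) with a0 = 1; the function names and the chosen emission time and comoving distance are illustrative only. Two wave crests are emitted a short time apart, and the ratio of their arrival and emission spacings is compared with a(tobs)/a(te).

```python
def scale_factor(t: float) -> float:
    """Scale factor of a flat matter-dominated universe with a0 = 1 (assumed)."""
    return t ** (2.0 / 3.0)


def arrival_time(t_emit: float, chi: float) -> float:
    """Arrival time of a radial light signal emitted at t_emit.

    Inverts chi = integral_{t_emit}^{t_obs} dt / a(t)
                = 3 * (t_obs**(1/3) - t_emit**(1/3)).
    """
    return (chi / 3.0 + t_emit ** (1.0 / 3.0)) ** 3


t_e = 1.0      # emission time of the first crest (illustrative units)
dt_e = 1.0e-6  # spacing between two crests at emission
chi = 3.0      # comoving distance of the emitter (illustrative)

t_obs = arrival_time(t_e, chi)
dt_obs = arrival_time(t_e + dt_e, chi) - t_obs

stretch = dt_obs / dt_e                                # lambda_obs / lambda_e
one_plus_z = scale_factor(t_obs) / scale_factor(t_e)   # a(t_obs) / a(t_e)
print(stretch, one_plus_z)
```

Both numbers agree up to corrections of higher order in δte, as expected from the derivation.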
Let us briefly return to Eq. (4.51) and consider this in linear order in x,
this means we can write

Using our redshift formula, and multiplying the entire equation by the speed
of light c we obtain

The quantity zc has units of velocity and is referred to as the redshift
velocity; it can be interpreted as the recessional velocity of the galaxy
emitting the electromagnetic signal. The quantity (t0 − t)c has units of length
and corresponds to the distance between the emitting galaxy and the
observer. This is nothing but Hubble’s law

$$v = H_0\, d, \tag{4.66}$$

first observed in 1929. This was the first observational evidence to support an
expanding universe, hence supporting a dynamic and not a static universe.
Note that Hubble’s law and our derivation of it are independent of the
Einstein field equations, in the sense that we studied geodesics on a manifold
with metric (4.7) and considered a series expansion of the scale factor.

4.3.3. Distances in cosmology

There are different notions of distances used in cosmology, most of which
will reduce to the Euclidean distance for nearby objects. It is common to
express these distances using the redshift as this is one of the directly
observable quantities in cosmology.
Let us begin by introducing the luminosity distance dL. Consider an
object with absolute luminosity L, which is the total power radiated by that
object. Light and other radiation expand from the object in spherical
wavefronts. Thus, at Euclidean distance dL, the apparent luminosity l, that is
the power received per unit area by an observer, is simply $l = L/(4\pi d_L^2)$, since the
total power L spreads over the sphere of radius dL. We can use this relation as
our definition of the luminosity distance

$$d_L = \sqrt{\frac{L}{4\pi l}}, \tag{4.67}$$

which must be placed into the cosmological setting we are interested in. We
assume the emitting galaxy to have redshift z, at time te and with metric
distance dmetric = Σ(χ). From Eq. (4.7) we know that the effective radius of
the sphere is a(tobs)Σ(χe). Therefore, the electromagnetic signal will have
spread over the area

$$A = 4\pi\, a(t_{\text{obs}})^2\, \Sigma(\chi_e)^2. \tag{4.68}$$

We denote by Lreceived the total power received at the effective radius
a(tobs)Σ(χe), so that we can write the apparent luminosity as l = Lreceived /A.
As the absolute luminosity has units of energy per unit time we can also write
Lreceived = ∆Eobs/∆tobs. Rewriting Eq. (4.62) using frequencies results in νe/
νobs = a(tobs)/a(te) which implies that the emitted and observed energies will
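obey the same relation. Likewise, we have ∆tobs/∆te = a(tobs)/a(te). Hence we
arrive at

$$L_{\text{received}} = \frac{\Delta E_{\text{obs}}}{\Delta t_{\text{obs}}} = \frac{\Delta E_e}{\Delta t_e}\left(\frac{a(t_e)}{a(t_{\text{obs}})}\right)^{2} = \frac{L}{(1+z)^2}, \tag{4.69}$$

which means that the apparent luminosity is given by

$$l = \frac{L}{4\pi\, a(t_{\text{obs}})^2\, \Sigma(\chi_e)^2\, (1+z)^2}. \tag{4.70}$$

Inserting this latter relation into our luminosity distance definition (4.67)
gives

$$d_L = a(t_{\text{obs}})\, \Sigma(\chi_e)\, (1+z), \tag{4.71}$$

which is the equation we wanted to derive. We note that Σ(χe) is the difficult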
part of this equation. Determining this function requires knowledge of the
comoving distance χ at the emitter (see Eq. (4.57)) which in turn needs the
functional form of a(t) which is a solution of the cosmological Einstein field
equations. It is at this point where everything comes together, we will discuss
some concrete examples and also state a generic formula useful in this
context.
Let us introduce the angular diameter distance first. Let us begin with a
galaxy of diameter D and angular diameter δ. If this galaxy has Euclidean
distance dA from the observer, we have

$$\delta = \frac{D}{d_A} \tag{4.72}$$

for small angular diameters. As before, we use this relation as the definition
of the angular diameter distance

$$d_A = \frac{D}{\delta}. \tag{4.73}$$

Next, we must reconsider this in the setting of an expanding universe. This is
much simpler than in the previous case. When measuring the distance across
the diameter of the galaxy we have t = te, χ = χe and we can always align our
coordinates such that ϕ = 0. Hence, the diameter D is given by

$$D = \sqrt{(g_{22})_e}\; \delta = a(t_e)\, \Sigma(\chi_e)\, \delta, \tag{4.74}$$

where (g22)e means that this metric component is evaluated at the emitter.
Therefore, we find for the angular diameter distance

$$d_A = a(t_e)\, \Sigma(\chi_e). \tag{4.75}$$

We can relate the scale factor at the time of emission to that of the time of
observation using the redshift Eq. (4.62) which yields the final form

$$d_A = \frac{a(t_{\text{obs}})\, \Sigma(\chi_e)}{1+z}. \tag{4.76}$$

Comparison of the luminosity distance dL with the angular diameter
distance dA shows that they differ by a factor of $(1+z)^2$; this means we have
the simple relationship

$$d_L = (1+z)^2\, d_A. \tag{4.77}$$

In the following section we will show how to derive explicit formulae for the
distance redshift relationships for specific cosmological models.
Some other distances can be defined in cosmology, see for instance Hogg
(1999), which also contains references to some of the literature.

4.3.4. Distance redshift relationships

After defining the luminosity distance dL and the angular diameter distance
dA, we are now in a position to derive their explicit forms for different
cosmological models.
Let us begin with considering a matter-dominated universe, w = 0,
without cosmological term, Λ = 0, assuming the spatially flat case, k = 0. We
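know from Sec. 4.2.1 that $a(t) = a_0 t^{2/3}$, we also have the Hubble parameter H
= 2/(3t), and recall the definition of the redshift, Eq. (4.63) which reads 1 + z
= a(tobs)/a(te). Our aim is to calculate Σ(χe) explicitly.
Beginning with Eq. (4.8) we have

$$\Sigma(\chi_e) = \chi_e = \int_{t_e}^{t_{\text{obs}}} \frac{dt}{a_0 t^{2/3}} = \frac{3}{a_0}\left(t_{\text{obs}}^{1/3} - t_e^{1/3}\right). \tag{4.78}$$

Since we conduct observations today, we have H0 = 2/(3tobs) and therefore

$$d_L = a(t_{\text{obs}})\, \Sigma(\chi_e)\,(1+z) = 3\, t_{\text{obs}}\,(1+z)\left[1 - \left(\frac{t_e}{t_{\text{obs}}}\right)^{1/3}\right] = \frac{2}{H_0}\,(1+z)\left[1 - \frac{1}{\sqrt{1+z}}\right]. \tag{4.79}$$

In the ultimate step we used the redshift definition together with the explicit
form of a(t). Moving the factor (1 + z) into the bracket leads to our final
result

$$d_L = \frac{2}{H_0}\left[1 + z - \sqrt{1+z}\,\right] \tag{4.80}$$

for the luminosity distance. This also determines the angular diameter
distance

$$d_A = \frac{2}{H_0}\,\frac{1 + z - \sqrt{1+z}}{(1+z)^2}. \tag{4.81}$$
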
Let us emphasise here that both results (4.80) and (4.81) are only valid for the
specific model we chose.
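To get a feeling for the numbers, the two distances of this matter-dominated model can be evaluated directly. This is a minimal sketch, assuming H0 = 70 km/s/Mpc and restoring the speed of light so that distances come out in Mpc; the function names are illustrative.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def d_lum_matter(z: float, h0: float = 70.0) -> float:
    """Luminosity distance in Mpc for the flat matter-dominated model,
    d_L = (2c/H0) * (1 + z - sqrt(1 + z)), with h0 in km/s/Mpc."""
    return 2.0 * C_KM_S / h0 * (1.0 + z - math.sqrt(1.0 + z))


def d_ang_matter(z: float, h0: float = 70.0) -> float:
    """Angular diameter distance in Mpc, d_A = d_L / (1 + z)^2."""
    return d_lum_matter(z, h0) / (1.0 + z) ** 2


for z in (0.01, 0.1, 1.0, 3.0):
    hubble_law = C_KM_S * z / 70.0  # linear estimate c z / H0
    print(z, d_lum_matter(z), d_ang_matter(z), hubble_law)
```

For small redshift the first and last columns nearly coincide, while at larger redshift the linear estimate becomes increasingly poor.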
It is instructive to consider the small z approximation of dL using
$\sqrt{1+z} = 1 + z/2 + \ldots$ which results in

$$d_L = \frac{z}{H_0} + \ldots \tag{4.82}$$

where the dots stand for some higher-order corrections. This is precisely
Hubble’s law (4.66) and at this point it is a consequence of the Einstein field
equations since we worked with an explicit solution.
Similar calculations can be performed for different cosmological models,
however, it might not always be possible to arrive at nice explicit solutions
for these quantities. Hence, let us next study these distances in a more general
setting.
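In Eq. (4.58), we can change the integration variable from time to redshift
by writing

$$dt = \frac{da}{\dot a} = \frac{da}{a H}. \tag{4.83}$$

On the other hand, we have a(z) = a(tobs)/(1 + z) so that

$$da = -\frac{a(t_{\text{obs}})}{(1+z)^2}\, dz, \tag{4.84}$$

which can be combined to

$$dt = -\frac{dz}{(1+z)\, H(z)}. \tag{4.85}$$

We must note that the Hubble parameter is now viewed as a function of the
redshift H(z). We introduce a new function H(z) = H0E(z) with E(0) = 1 so
that we can rewrite Eq. (4.58) as follows:

$$\chi_e = \frac{1}{a(t_{\text{obs}})\, H_0} \int_0^{z} \frac{dz'}{E(z')}. \tag{4.86}$$
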
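Now, we must determine the function E(z) from the field equations.
Using Eq. (4.52) we can write

$$E(z)^2 = \frac{H^2}{H_0^2} = \frac{\kappa\, \rho_{\text{total}}}{3 H_0^2} + \frac{\Lambda}{3 H_0^2} - \frac{k}{a^2 H_0^2}, \tag{4.87}$$

which has similarities with the cosmological density parameters. For the sake
of completeness we also introduce the parameter $\Omega_k = -k/(a(t_{\text{obs}})^2 H_0^2)$. Recall
the critical density evaluated today $(\rho_{\text{crit}})_0 = 3H_0^2/\kappa$. Moreover we assume
ρtotal = ρm + ρrad with their corresponding conservation equations expressed
using redshifts. Then

$$\rho_m = \rho_{m0}\,(1+z)^3, \qquad \rho_{\text{rad}} = \rho_{\text{rad}0}\,(1+z)^4. \tag{4.88}$$

Substituting everything back into Eq. (4.87) gives us the explicit expression

$$E(z)^2 = \Omega_{m0}(1+z)^3 + \Omega_{\text{rad}0}(1+z)^4 + \Omega_k (1+z)^2 + \Omega_{\Lambda 0}, \tag{4.89}$$
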
which only depends on the redshift and the values of the parameters today,
this means at the time of the observations. Since the explicit form of E(z) in
general contains a square root of a polynomial in z, it is clear that the integral
in Eq. (4.86) might be quite difficult to compute.
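When the integral over 1/E(z) has no closed form it can be evaluated numerically. The sketch below is one possible way to do this with a composite Simpson rule; it assumes a spatially flat model (so that Σ(χ) = χ), restores the speed of light, and uses Ωm0 = 0.3, ΩΛ0 = 0.7 and H0 = 70 km/s/Mpc as illustrative default values. All function and parameter names are hypothetical.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def E(z, om=0.3, orad=0.0, ok=0.0, ol=0.7):
    """Dimensionless Hubble function E(z) = H(z)/H0 built from today's
    density parameters (matter, radiation, curvature, Lambda)."""
    return math.sqrt(om * (1 + z) ** 3 + orad * (1 + z) ** 4
                     + ok * (1 + z) ** 2 + ol)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def comoving_distance(z, h0=70.0, **omegas):
    """a(t_obs) * chi in Mpc, i.e. (c/H0) * integral_0^z dz'/E(z')."""
    return C_KM_S / h0 * simpson(lambda x: 1.0 / E(x, **omegas), 0.0, z)


def luminosity_distance_flat(z, h0=70.0, **omegas):
    """d_L = a(t_obs) * chi * (1 + z), valid for k = 0 only."""
    return (1.0 + z) * comoving_distance(z, h0, **omegas)


print(luminosity_distance_flat(1.0))
```

Setting `om=1.0, ol=0.0` reproduces the matter-dominated result, which provides a simple check of the numerical integration.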
We can also integrate Eq. (4.85) directly with respect to the redshift, to
find the difference between the age of the universe at observation tobs and
emission te which gives

$$t_{\text{obs}} - t_e = \frac{1}{H_0}\int_0^{z} \frac{dz'}{(1+z')\,E(z')}. \tag{4.90}$$

This quantity is sometimes called the lookback time. By assuming that the
time of emission of a signal was approximately at the beginning of the
universe, we can set te = 0. Since we assume that a(t) → 0 as t → 0, we also
have z → ∞ as we approach the beginning. Hence, we can determine the age
of the universe for a specific cosmological model by

$$t_{\text{universe}} = \frac{1}{H_0}\int_0^{\infty} \frac{dz}{(1+z)\,E(z)}. \tag{4.91}$$
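The age integral over the redshift runs to infinity, so for numerical work it is convenient to substitute a = 1/(1 + z), which turns it into an integral over the finite interval 0 < a ≤ 1 with integrand 1/(aE). The following sketch assumes a spatially flat model containing only matter and a cosmological term; the function names and the midpoint rule are illustrative choices.

```python
import math


def E_of_a(a, om=0.3, ol=0.7):
    """E = H/H0 for a flat matter + Lambda model, written in terms of
    the normalised scale factor a = 1/(1+z)."""
    return math.sqrt(om / a ** 3 + ol)


def age_in_hubble_times(om=0.3, ol=0.7, n=20000):
    """H0 * t_universe = integral_0^1 da / (a E(a)), midpoint rule."""
    h = 1.0 / n
    return sum(h / (a * E_of_a(a, om, ol))
               for a in ((i + 0.5) * h for i in range(n)))


print(age_in_hubble_times(1.0, 0.0))  # flat matter-dominated: close to 2/3
print(age_in_hubble_times(0.3, 0.7))  # flat model with cosmological term
```

For the flat matter-dominated model this reproduces tuniverse = 2/(3H0), while for Ωm0 = 0.3 and ΩΛ0 = 0.7 the age is roughly 0.96 Hubble times.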

Example 4.1 (Flat radiation-dominated universe and refocusing). Let us
compute the luminosity distance in a spatially flat, k = 0, radiation-dominated
universe, w = 1/3, without cosmological term, Λ = 0. The function E(z) is
given by Eq. (4.89) by setting Ωm0 = ΩΛ0 = Ωk = 0 so that $E(z)^2 = \Omega_{\text{rad}0}(1+z)^4$.
Since radiation is the only matter source, we also have Ωrad0 = 1.
Therefore, the comoving distance χ is given by

$$\chi_e = \frac{1}{a(t_{\text{obs}})\,H_0}\int_0^z \frac{dz'}{(1+z')^2} = \frac{1}{a(t_{\text{obs}})\,H_0}\,\frac{z}{1+z}. \tag{4.92}$$

Since Σ(χe) = χe in a spatially flat universe, we find

$$d_L = \frac{z}{H_0}, \tag{4.93}$$

which looks again like Hubble’s law, but valid for all redshifts.
Consequently, the angular diameter distance is

$$d_A = \frac{z}{H_0\,(1+z)^2}. \tag{4.94}$$

Fig. 4.2 Angular diameter distance redshift relation.

The shape of dA as the redshift changes is quite intriguing, see Fig. 4.2. There
exists a maximum value for the angular diameter distance at z = 1. For larger
values of redshift, the angular diameter distance decreases which seems
rather counterintuitive. Moreover, any given angular diameter distance below
this maximum corresponds to two different redshifts. This effect is called
refocusing in cosmology.
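The refocusing effect is easy to see numerically. The short sketch below works in units of the Hubble length, so that dA = z/(1 + z)^2; the function names are illustrative. Since this expression is unchanged under z → 1/z, every value below the maximum is attained at two redshifts.

```python
def d_ang_radiation(z: float) -> float:
    """Angular diameter distance of the flat radiation-dominated model
    in units of the Hubble length: d_A = z / (1 + z)^2."""
    return z / (1.0 + z) ** 2


def partner_redshift(z: float) -> float:
    """Second redshift with the same d_A, using the symmetry z -> 1/z."""
    return 1.0 / z


for z in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(z, d_ang_radiation(z), partner_redshift(z))
```

The maximum value 1/4 is reached at z = 1, and for instance z = 0.5 and z = 2 give the same angular diameter distance.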

4.3.5. The universe today

After discussing various aspects of cosmology, let us now put this into
context with our universe and current observations which determine the
various physical parameters discussed. One of the most important quantities
is today’s value of the Hubble parameter H0. The physical units of H0 are
inverse time, however, it is usually written as

$$H_0 = 100\, h\ \text{km}\,\text{s}^{-1}\,\text{Mpc}^{-1}, \tag{4.95}$$

where h is a dimensionless constant. Measuring H0 is notoriously difficult
and there are many uncertainties about its value. Planck 2015 data (Ade et al.,
2015) found $H_0 = (67.8 \pm 0.9)\ \text{km}\,\text{s}^{-1}\,\text{Mpc}^{-1}$; however, others have found
higher values, and also lower values have been reported, see [Ade et al.
(2015), Sec. 5.4]. For all practical purposes we can safely assume that 0.6 < h
< 0.8, and for explicit calculations we will use h = 0.7.
The quantity 1/H0 has units of time and is called the Hubble time which is
approximately

$$\frac{1}{H_0} \approx 9.78\, h^{-1} \times 10^{9}\ \text{years}. \tag{4.96}$$

For h = 0.7 this is about $14.0 \times 10^{9}$ years, slightly larger than the actual age of
the universe suggested by the Planck data, $t_{\text{universe}} = 13.8 \times 10^{9}$ years.
When multiplying the Hubble time with the speed of light c we get a
quantity with units of length. This so-called Hubble length is approximately

$$\frac{c}{H_0} \approx 3000\, h^{-1}\ \text{Mpc}. \tag{4.97}$$
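Both quantities are pure unit conversions of H0 = 100h km/s/Mpc. A minimal sketch, assuming rounded conversion factors for the megaparsec and the Julian year (the constant and function names are illustrative):

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec (rounded)
SECONDS_PER_YEAR = 3.15576e7  # Julian year in seconds
C_KM_S = 299_792.458          # speed of light in km/s


def hubble_time_years(h: float) -> float:
    """Hubble time 1/H0 in years for H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_YEAR


def hubble_length_mpc(h: float) -> float:
    """Hubble length c/H0 in Mpc for H0 = 100 h km/s/Mpc."""
    return C_KM_S / (100.0 * h)


print(hubble_time_years(0.7), hubble_length_mpc(0.7))
```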

Next, we need to know the matter content of the universe which is also
linked to its spatial geometry. Let us begin with the curvature parameter for
which observations suggest |Ωk| < 0.005, this means that the curvature of the
constant time slices is very close to zero. If we take this as evidence to set k =
0, then by Eq. (4.55) we will have Ωtotal = 1 at all times. However, we will
see in Sec. 4.3.6 that k close to zero yields some problems in cosmology.
There is strong evidence for the presence of a cosmological term Λ with ΩΛ
making up slightly less than 70% of the contents of the universe, so that the
total matter content is roughly Ωm = 0.3.
The matter density Ωm accounts for all matter which interacts
gravitationally, it includes radiation and neutrinos, for instance, but also any
form of luminous and non-luminous matter. Ordinary matter in the universe
is referred to as baryons because protons and neutrons account for most of its
density. The baryon density is denoted by Ωb and its value is approximately
Ωb = 0.05 which means that ordinary matter only makes up about 5% of the
contents of the universe we observe. In turn this implies that the remaining
25% must be in form of some non-luminous matter which interacts
gravitationally and has only very weak interactions with other matter. This
form of matter is called dark matter and is subject to substantial research.
There exists a large number of different dark matter models which are
inspired by a diverse range of physical theories. In analogy to dark matter,
one speaks of dark energy when theories are explored which mimic the
cosmological constant. Dark matter and dark energy taken together make up
approximately 95% of the matter content of the universe and it is somewhat
fair to say that we do not understand either of them from a theoretical point of
view.
We should also mention the critical density of the universe observed
today which is approximately $(\rho_{\text{crit}})_0 = 1.88\, h^2 \times 10^{-29}\ \text{g}\,\text{cm}^{-3}$. Let us recall
that the proton mass is $m_p = 1.67 \times 10^{-24}$ g, then we can conclude that the
critical density observed today corresponds to only a few protons per cubic
metre (between five and six for h = 0.7).
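The value of the critical density follows from ρcrit = 3H0²/(8πG). The sketch below evaluates it in CGS units and converts it into a number of protons per cubic metre; the rounded physical constants and the function names are assumptions made for this illustration.

```python
import math

G_CGS = 6.674e-8         # Newton's constant in cm^3 g^-1 s^-2 (rounded)
CM_PER_MPC = 3.0857e24   # centimetres in one megaparsec (rounded)
M_PROTON_G = 1.6726e-24  # proton mass in grams (rounded)


def critical_density_cgs(h: float) -> float:
    """Critical density 3 H0^2 / (8 pi G) in g/cm^3 for H0 = 100 h km/s/Mpc."""
    h0 = 100.0 * h * 1.0e5 / CM_PER_MPC  # H0 converted to 1/s
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)


def protons_per_cubic_metre(h: float) -> float:
    """Number of protons per m^3 whose mass equals the critical density."""
    return critical_density_cgs(h) * 1.0e6 / M_PROTON_G


print(critical_density_cgs(1.0), protons_per_cubic_metre(0.7))
```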
In cosmology, it is possible to define the temperature of the universe at
any given time. We will not derive these relations rigorously but only state
the main results which are needed for some of the subsequent discussions.
The Stefan–Boltzmann law of power radiated from a black body reads
$\rho \propto T^4$. Next we recall that in a radiation-dominated universe $\rho \propto a^{-4}$ from
which we conclude the relationship $T \propto a^{-1}$. This allows us to relate the
temperature to the scale factor and by virtue of Eq. (4.63) to the redshift z.
We are led to

$$\frac{T(t_e)}{T(t_{\text{obs}})} = \frac{a(t_{\text{obs}})}{a(t_e)} = 1 + z, \tag{4.98}$$

where T (tobs) generally corresponds to the temperature we observe today and
T (te) corresponds to the temperature of the universe at some time in the past.
The idea here is that we are interested in the temperature of the universe
when a signal was emitted at te which we can observe today.
In a matter-dominated universe we are dealing with nonrelativistic
particles for which we assume the Maxwell–Boltzmann distribution that
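states $\rho \propto T^{3/2}$. Moreover, in a matter-dominated universe we have $\rho \propto a^{-3}$
which implies the relationship $T \propto a^{-2}$. Hence we arrive at

$$\frac{T_{(\text{mat})}(t_e)}{T_{(\text{mat})}(t_{\text{obs}})} = \left(\frac{a(t_{\text{obs}})}{a(t_e)}\right)^{2} = (1+z)^2. \tag{4.99}$$

This immediately implies that matter cools down much faster than radiation
in an expanding universe; we also see that a smaller universe (matter or
radiation dominated) in the past was much hotter.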
At a temperature of about T (trec) = 3 × 103 K photons decouple from
matter which means that hydrogen can form once the universe has cooled
sufficiently. In cosmology this is called decoupling or recombination, note
the misnomer as these objects were never combined previously. We observe
an almost uniform cosmological microwave background radiation which
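corresponds to a temperature of roughly T(rad)(tobs) = 2.73 K. If we interpret
this radiation as the cooled photon gas at decoupling, the moment the photons
could travel freely, we can estimate the redshift of recombination using

$$1 + z_{\text{rec}} = \frac{T(t_{\text{rec}})}{T_{(\text{rad})}(t_{\text{obs}})} \approx \frac{3 \times 10^{3}\ \text{K}}{2.73\ \text{K}} \approx 1100. \tag{4.100}$$

This tells us that the universe was about 1,100 times smaller at the time of
decoupling.
We can also estimate the temperature of the matter part of the universe.
At the time of recombination matter and radiation had the same temperature
so that we have

$$T_{(\text{mat})}(t_{\text{obs}}) = \frac{T(t_{\text{rec}})}{(1+z_{\text{rec}})^2} \approx 2.5 \times 10^{-3}\ \text{K}, \tag{4.101}$$

which is three orders of magnitude cooler than the radiation.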
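These two estimates only use the scalings T ∝ 1/a for radiation and T ∝ 1/a² for matter. A minimal sketch with the temperatures quoted above (variable names are illustrative):

```python
T_REC = 3.0e3  # temperature at recombination in K, as quoted in the text
T_CMB = 2.73   # temperature of the microwave background today in K

# Radiation: T scales like 1/a, so the temperature ratio gives 1 + z_rec.
one_plus_z_rec = T_REC / T_CMB

# Matter: T scales like 1/a^2 after decoupling.
t_matter_today = T_REC / one_plus_z_rec ** 2

print(one_plus_z_rec, t_matter_today, T_CMB / t_matter_today)
```

The last number printed is the ratio of the radiation temperature to the matter temperature today, which is again 1 + zrec.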

4.3.6. Shortcomings in cosmology

(a) The flatness problem

Let us begin by recalling Eq. (4.55) which relates the total energy content of
the universe to its spatial curvature

$$\Omega_{\text{total}} - 1 = \frac{k}{a^2 H^2}.$$

We already noted that in a spatially flat universe Ωtotal = 1 at all times. When
k ≠ 0, the density parameter evolves with time, and this is determined by aH.
For a radiation-dominated universe we found $a \propto t^{1/2}$ and H = 1/(2t), so that
$aH \propto t^{-1/2}$, and for a flat matter-dominated universe $a \propto t^{2/3}$ and H = 2/(3t),
so that $aH \propto t^{-1/3}$. Let us now consider a universe with a small non-vanishing
curvature parameter. We assume that its evolution is close to that
of the flat case which implies $|\Omega_{\text{total}} - 1| \propto t$ in the radiation-dominated case
and $|\Omega_{\text{total}} - 1| \propto t^{2/3}$ in the matter-dominated case. The point here is that the
quantity |Ωtotal − 1| is an increasing function of time. Since we observe
Ωtotal(ttoday) to be of order unity, we must conclude that Ωtotal was very close
to one at earlier cosmological times. An estimate based on nucleosynthesis
calculations requires $|\Omega_{\text{total}}(t_{\text{nucleo}}) - 1| \le 10^{-16}$ at which point the universe
was about one second old. At times earlier than this, the density parameter of
the universe would have to be even closer to one. The flatness problem simply
states that such finely tuned initial conditions for the universe appear to be
unlikely. In other words, let us prescribe some random initial conditions for
the universe at some very early time. It turns out that in almost all cases the
universe will evolve very differently to the universe we observe, and that our
present universe is in fact highly improbable.
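The order of magnitude of this fine tuning can be estimated from the two power laws alone. The sketch below is a rough estimate only: it assumes that the universe switches from radiation to matter domination at about 5 × 10^4 years (a value not taken from the text) and it ignores the recent influence of the cosmological term. The function name is illustrative.

```python
SECONDS_PER_YEAR = 3.15576e7  # Julian year in seconds


def flatness_growth(t_early_s, t_eq_yr=5.0e4, t_today_yr=13.8e9):
    """Factor by which |Omega_total - 1| grows between t_early_s (seconds)
    and today, using |Omega - 1| ~ t during radiation domination and
    |Omega - 1| ~ t^(2/3) during matter domination.

    t_eq_yr is an assumed time of matter-radiation equality.
    """
    t_eq = t_eq_yr * SECONDS_PER_YEAR
    t_today = t_today_yr * SECONDS_PER_YEAR
    radiation_era = t_eq / t_early_s
    matter_era = (t_today / t_eq) ** (2.0 / 3.0)
    return radiation_era * matter_era


growth = flatness_growth(1.0)  # starting one second after the beginning
print(growth)
```

With these assumptions the growth factor is of the order of 10^16, in line with the bound quoted for the time of nucleosynthesis.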

(b) Horizon problem

In Sec. 4.1.4 we discussed the presence of particle horizons. Let us assume
that the universe began at t = 0 with a(0) = 0. Then the present distance of an
object which emitted light at the beginning is given by a0χ(te) so that for a
flat universe we have

$$d_{\text{hor}}(t) = a(t)\int_0^{t} \frac{dt'}{a(t')}.$$

Using the explicit forms of a(t) for a radiation-dominated and
matter-dominated universe, respectively, we find the following particle
horizons

$$d_{\text{hor}}^{\text{(rad)}}(t) = 2t = \frac{1}{H}, \qquad d_{\text{hor}}^{\text{(mat)}}(t) = 3t = \frac{2}{H}.$$

Currently we are in a matter-dominated universe so that today’s particle
horizon is dhor(ttoday) = 2/H0, which is twice the Hubble length that was
mentioned in Eq. (4.97).
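Given that the age of the universe is about $t_{\text{universe}} = 13.8 \times 10^{9}$ yr, we can
find the time of recombination by using Eq. (4.98) which gives

$$t_{\text{rec}} = \frac{t_{\text{universe}}}{(1+z_{\text{rec}})^{3/2}},$$

which, taking $1 + z_{\text{rec}} \approx 10^{3}$, implies approximately $t_{\text{rec}} = 4.4 \times 10^{5}$ yr. Prior to the decoupling of
the photons, the universe was radiation-dominated and hence the particle
horizon at recombination was dhor(trec) = 2 trec. Reinserting the speed of light
c into the distance to the particle horizon gives

$$d_{\text{hor}}(t_{\text{rec}}) = 2\, c\, t_{\text{rec}} \approx 8.8 \times 10^{5}\ \text{light years} \approx 0.27\ \text{Mpc}.$$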

At first sight, the horizon appears to be moving with a velocity faster than the
speed of light which would contradict special relativity. The reason for this is
simply that the universe is expanding while the photon is travelling. Locally
photons move with the speed of light.
When the photons decoupled at trec the distance over which causal
interactions could have occurred is approximately 0.27 Mpc. However, the
cosmic microwave radiation we observe today is fairly homogeneous and
isotropic over much larger scales, namely those corresponding to our particle
horizon scale. This fact is usually referred to as the horizon problem. This
means regions across the sky separated by more than about one to two degrees
($\theta \approx (z_{\text{rec}})^{-1/2} \approx 0.03 \approx 1.7°$) cannot have interacted before recombination and
hence there is no physical reason why these regions should look similar. These
causally disconnected regions should have evolved
independently. However, we observe that these regions are statistically
indistinguishable. Homogeneity and isotropy of the cosmic microwave
background radiation therefore must have been encoded into the universe at
early times of its evolution. As mentioned in the above, it is very unlikely for
a universe with randomly chosen initial conditions to have this property.
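The numbers entering the horizon problem can be reproduced in a few lines. This sketch assumes matter domination between recombination and today, radiation domination before recombination, and a rounded conversion between light years and megaparsecs; the function names are illustrative. The result depends mildly on whether 1 + zrec is taken as 1000 or 1100.

```python
import math

LY_PER_MPC = 3.2616e6  # light years in one megaparsec (rounded)


def t_rec_years(one_plus_z_rec, t_universe_yr=13.8e9):
    """Time of recombination from a ~ t^(2/3): t_rec = t_0 / (1+z)^(3/2)."""
    return t_universe_yr / one_plus_z_rec ** 1.5


def horizon_at_rec_mpc(t_rec_yr):
    """Particle horizon 2 c t_rec of a radiation-dominated universe in Mpc."""
    return 2.0 * t_rec_yr / LY_PER_MPC


def horizon_angle_deg(one_plus_z_rec):
    """Rough angular size of a causally connected patch, (1+z)^(-1/2)."""
    return math.degrees(one_plus_z_rec ** -0.5)


for one_plus_z in (1000.0, 1100.0):
    t_rec = t_rec_years(one_plus_z)
    print(one_plus_z, t_rec, horizon_at_rec_mpc(t_rec),
          horizon_angle_deg(one_plus_z))
```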
There are some additional issues with respect to particles (or topological
defects) forbidden to be present by observations which are predicted by some
theoretical models applicable at very high temperatures. These are referred to
as unwanted relics but we will not discuss them in more detail.

4.4. Inflation

The original motivation for inflationary cosmology came from the desire to
solve these shortcomings in cosmology. However, the most important aspect
of inflation is that it can generate the initial irregularities in the early universe
which led to the formation of larger structures like galaxies. Inflation models
also predict tensor perturbations which would yield gravitational waves.
In simple words, inflation is an early time epoch of the cosmological
evolution where the universe’s acceleration is positive. It turns out that this
single requirement suffices to resolve the flatness problem and the horizon
problem in a very elegant way.

4.4.1. Accelerated expansion

We can formally define inflation by the condition

$$\ddot a > 0. \tag{4.109}$$

Note that neither the scale factor of the matter-dominated universe nor the
radiation-dominated universe satisfy this condition, in both cases the
acceleration is negative. Interestingly, the de Sitter
solution, discussed in Sec. 4.2.4, has positive acceleration and hence is
relevant in the context of inflation. Next, we wish to reformulate the inflation
condition (4.109) using the Hubble parameter as this will provide us with
more physical definitions.
From the definition of H = ȧ/a we have $\ddot a/a = \dot H + H^2$ so that our inflation
condition can also be written as

$$-\frac{\dot H}{H^2} < 1, \tag{4.110}$$

which is always satisfied for positive $\dot H$. Another useful formulation of the
inflation condition begins with ȧ = Ha and again differentiation with respect
to time. We have

$$\frac{d}{dt}\left(\frac{H^{-1}}{a}\right) = \frac{d}{dt}\left(\frac{1}{\dot a}\right) = -\frac{\ddot a}{\dot a^{2}} < 0, \tag{4.111}$$

because of ä > 0 and $\dot a^2 > 0$. The interesting part in this way of defining
inflation is the quantity $H^{-1}/a$. Recall that $H^{-1}$ is the Hubble length (in units
where c = 1). The quantity $H^{-1}/a$ is the comoving Hubble length and this
length scale determines whether two regions of spacetime can communicate
now, in other words it sets the size of the observable universe. This means
that during an epoch of accelerated expansion the observable universe,
measured in comoving coordinates, becomes smaller!
Before discussing the implications of this particular aspect further, we can
give a third definition of inflation based on the matter content of the universe.
Recall Eq. (4.23) without cosmological constant

$$\frac{\ddot a}{a} = -\frac{\kappa}{6}\,(\rho + 3p). \tag{4.112}$$

Hence, the condition ä > 0 immediately implies that the matter must satisfy
the condition

$$\rho + 3p < 0. \tag{4.113}$$

Since the energy density ρ is always assumed to be positive, we conclude that
inflation requires matter to have negative pressure which violates the
so-called strong energy condition. For a linear equation of state p = wρ,
condition (4.113) becomes w < −1/3, a value which one also encounters
when discussing the stability of the Einstein static universe in Exercise 4.9.
In order to understand how inflation solves the shortcomings of standard
cosmology discussed in Sec. 4.3.6 we will primarily need to focus on the
inflation condition (4.111) which states that during inflation the comoving
Hubble length is decreasing. This allows for the following possibility: the
length scale of the particle horizon could be much larger than the comoving
Hubble length today. However, if the size of the observable universe rapidly
decreased during a period of accelerated expansion, then those regions which
are separated by the particle horizon today might have been in causal contact
in the past. Therefore, inflation provides us with a simple solution to the
horizon problem. Moreover, the flatness problem stemmed from the fact that
the function $1/(a^2H^2)$ is increasing for a radiation- or matter-dominated
universe. Inflation is defined precisely by the opposite condition, $1/(a^2H^2)$ is
a decreasing function, and hence |Ωtotal − 1| is driven towards zero rather than
away from it, thereby solving the flatness problem.
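The different behaviour of decelerating and accelerating models can be illustrated with power-law scale factors a(t) = t^n, which is an assumption made purely for this sketch (n = 1/2 for radiation, n = 2/3 for matter, and n > 1 for accelerated expansion). The function names are illustrative.

```python
def acceleration_sign(n: float, t: float = 1.0) -> int:
    """Sign of the second derivative of a(t) = t**n, which equals
    n (n - 1) t**(n - 2)."""
    acc = n * (n - 1.0) * t ** (n - 2.0)
    return (acc > 0) - (acc < 0)


def comoving_hubble_length(n: float, t: float) -> float:
    """Comoving Hubble length 1/(a H) = 1/(da/dt) = t**(1 - n) / n."""
    return t ** (1.0 - n) / n


for n in (0.5, 2.0 / 3.0, 2.0):
    print(n, acceleration_sign(n),
          comoving_hubble_length(n, 1.0), comoving_hubble_length(n, 2.0))
```

For n < 1 the acceleration is negative and the comoving Hubble length grows with time, while for n > 1 the acceleration is positive and the comoving Hubble length shrinks.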

4.4.2. Scalar fields in cosmology

The simplest matter field which satisfies the inflation condition ρ + 3p < 0 is
a scalar field which we shall call ϕ. Despite scalar fields being used in
cosmology for about three decades, the existence of scalar fields like the
Higgs boson in the standard model of particle physics has only been
confirmed by the Large Hadron Collider at CERN in 2012 and 2013.
Independent of the particle nature of the field driving inflation in the early
universe, the scalar field provides an ideal model to study a concrete inflation
model which shows good agreement with observations. Moreover, the scalar
field is mathematically quite simple and so allows the explicit computation of
many effects which would be much harder for a more complicated field.
In Sec. 2.4, we introduced the variational approach to General Relativity
which also contained a definition of the energy–momentum tensor by the
matter Lagrangian, see Eq. (2.127). The scalar field action is given by
where ϕ = ϕ(xi) is the scalar function depending on all the coordinates, and V
(ϕ) is a potential which depends only on the scalar field. Different scalar field
inflation models are characterised by their potentials. The energy–momentum
tensor is given by

Since ϕ is a scalar field, we could write partial derivatives instead of
covariant derivatives, however, the use of the covariant derivative makes it
clear that ∇aϕ is a tensor. Since we are interested in a cosmological scalar
field, we assume ϕ to be a function of time only, and consider the
Friedmann–Lemaître–Robertson–Walker metric. In this case the energy
density and pressure are

$$\rho_\phi = \tfrac{1}{2}\dot\phi^2 + V(\phi), \tag{4.116}$$

$$p_\phi = \tfrac{1}{2}\dot\phi^2 - V(\phi). \tag{4.117}$$

The scalar field has its own equation of motion which we can derive by
considering the variations of Eq. (4.114) with respect to ϕ. Alternatively, we
recall that any cosmological matter satisfies the conservation equation
$\dot\rho + 3H(\rho + p) = 0$ and hence we can compute the conservation equation of the
scalar field directly from ρϕ and pϕ. We have

$$\dot\rho_\phi = \dot\phi\,\ddot\phi + V'(\phi)\,\dot\phi, \qquad \rho_\phi + p_\phi = \dot\phi^2, \tag{4.118}$$

so that the conservation equation yields

$$\dot\phi\left(\ddot\phi + 3H\dot\phi + V'(\phi)\right) = 0. \tag{4.119}$$

This can be satisfied by either a constant scalar field which would then be
equivalent to a cosmological constant, or, provided that $\dot\phi \neq 0$, the scalar field
has to satisfy the so-called Klein–Gordon equation

$$\ddot\phi + 3H\dot\phi + V'(\phi) = 0. \tag{4.120}$$

The scalar field driving inflation is generally referred to as the inflaton.
Using Eqs. (4.116) and (4.117) we can define an effective equation of
state for the scalar field

$$w_\phi = \frac{p_\phi}{\rho_\phi} = \frac{\tfrac12\dot\phi^2 - V(\phi)}{\tfrac12\dot\phi^2 + V(\phi)}, \tag{4.121}$$

which is not a constant but a function of time. This equation also provides us
with support that a scalar field can give rise to a period of accelerated
expansion. If the kinetic energy $\dot\phi^2/2$ is much smaller than the potential
energy V (ϕ), the effective equation of state will give wϕ ≈ −1. This
corresponds to a cosmological constant which we know provides accelerated
expansion.
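The limiting cases of the effective equation of state are quickly checked numerically. In this minimal sketch the function name and the sample values are illustrative; the field velocity and the potential are given in arbitrary but consistent units.

```python
def w_phi(phi_dot: float, potential: float) -> float:
    """Effective equation of state w = p/rho of a homogeneous scalar field
    with kinetic energy phi_dot^2 / 2 and potential energy V."""
    kinetic = 0.5 * phi_dot ** 2
    return (kinetic - potential) / (kinetic + potential)


print(w_phi(0.0, 1.0))   # potential dominated: w = -1
print(w_phi(1.0, 0.0))   # kinetic dominated:   w = +1
print(w_phi(1.0, 1.0))   # phi_dot^2 = V:       w = -1/3
print(w_phi(0.1, 1.0))   # slowly moving field: w close to -1
```

The borderline value w = −1/3 is reached exactly when the square of the field velocity equals the potential, which matches the condition w < −1/3 for accelerated expansion.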
When studying inflation models it is common to work with the reduced
Planck mass MPl instead of the gravitational coupling constant κ. The reduced
Planck mass is defined by

$$M_{\text{Pl}} = \sqrt{\frac{\hbar c}{8\pi G}}, \tag{4.122}$$

which when working in natural units where c = ħ = 1, simply becomes
$M_{\text{Pl}}^{-2} = 8\pi G = \kappa$. In these units the scalar field has dimensions of mass which are
useful in the context of particle physics. Using the Planck mass, the
cosmological field equations, with scalar fields as matter sources, are given
by

$$H^2 = \frac{1}{3M_{\text{Pl}}^2}\left(\tfrac12\dot\phi^2 + V(\phi)\right), \tag{4.123}$$

$$\frac{\ddot a}{a} = -\frac{1}{3M_{\text{Pl}}^2}\left(\dot\phi^2 - V(\phi)\right), \tag{4.124}$$

$$\ddot\phi + 3H\dot\phi + V'(\phi) = 0. \tag{4.125}$$

The first definition of inflation was simply the requirement ä > 0, therefore
Eq. (4.124) implies that this condition is satisfied provided that $\dot\phi^2 < V(\phi)$.
Physically speaking this condition means that we need a model with small
kinetic energy relative to the potential energy, we can think of slow motions
here. For instance a sufficiently flat potential should eventually satisfy this
condition as the particle loses kinetic energy. This leads us directly to
slow-roll inflation which is a very useful approximation method for inflation
models.

4.4.3. Slow-roll inflation

When introducing the effective equation of state (4.121) we noted that we can
achieve accelerated expansion provided that the kinetic energy is much
smaller than the potential energy. This also follows directly from Eq. (4.124).
We will make this more formal and introduce the slow-roll approximation as
follows: we assume $\dot\phi^2 \ll V(\phi)$ and $|\ddot\phi| \ll |3H\dot\phi|$. The second condition, slow
accelerations, is natural when one thinks about slow motions. However, small
velocities do not imply small accelerations. One could have a situation where
the velocity is always small but changes very rapidly. For example, the
velocity v(t) = 0.1 × sin(100t) is small, but the acceleration $\dot v(t) = 10\cos(100t)$
is clearly large compared to the velocity.
In this approximation, our field equations (4.123) and (4.125) simplify to

$$H^2 \simeq \frac{V(\phi)}{3M_{\text{Pl}}^2}, \tag{4.126}$$

$$3H\dot\phi \simeq -V'(\phi). \tag{4.127}$$

The symbol ≃ means that both sides are equal within the slow-roll
approximation. Our original equations were of second order and nonlinear in
a and ϕ. In the slow-roll approximation the equations are first order and can,
in principle, be solved using separation of variables.
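Two convenient parameters are the so-called slow-roll parameters ϵ and η
which are defined by

$$\epsilon = \frac{M_{\text{Pl}}^2}{2}\left(\frac{V'}{V}\right)^{2}, \qquad \eta = M_{\text{Pl}}^2\,\frac{V''}{V}, \tag{4.128}$$

where the prime denotes differentiation with respect to the scalar field. We
say the slow-roll approximation is valid if ϵ ≪ 1 and |η| ≪ 1, and we should
take note that these quantities only depend on the potential V. In particular the
parameter ϵ can be motivated nicely. We recall our inflation condition (4.110)
which read $-\dot H/H^2 < 1$.
Differentiating Eq. (4.126) with respect to time gives

$$2H\dot H \simeq \frac{V'\dot\phi}{3M_{\text{Pl}}^2} \simeq -\frac{V'^2}{9 M_{\text{Pl}}^2 H}, \tag{4.129}$$

where for the second part we used Eq. (4.127). Dividing the latter equation
by $-2H^3$ yields

$$-\frac{\dot H}{H^2} \simeq \frac{V'^2}{18 M_{\text{Pl}}^2 H^4} \simeq \frac{M_{\text{Pl}}^2}{2}\left(\frac{V'}{V}\right)^2 = \epsilon. \tag{4.130}$$

Therefore, our inflation condition (4.110) is satisfied provided that ϵ < 1. One
could also use relation (4.130) as the defining equation for the slow-roll
parameter via $\epsilon = -\dot H/H^2$. It turns out that this quantity only depends on the
shape of the potential in this approximation.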
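Since the slow-roll parameters only involve the potential and its first two derivatives, they can be estimated for any potential with finite differences. The sketch below assumes the standard definitions ϵ = (MPl²/2)(V′/V)² and η = MPl² V″/V and works in units where MPl = 1 by default; the function name, the step size and the two sample potentials are illustrative.

```python
import math


def slow_roll(V, phi, m_pl=1.0, h=1.0e-4):
    """Slow-roll parameters (epsilon, eta) of the potential V at field
    value phi, using central finite differences with step h."""
    v0 = V(phi)
    d1 = (V(phi + h) - V(phi - h)) / (2.0 * h)        # approximates V'
    d2 = (V(phi + h) - 2.0 * v0 + V(phi - h)) / h**2  # approximates V''
    epsilon = 0.5 * m_pl ** 2 * (d1 / v0) ** 2
    eta = m_pl ** 2 * d2 / v0
    return epsilon, eta


# Quadratic potential with m = 1: both parameters equal 2 / phi^2.
print(slow_roll(lambda phi: 0.5 * phi ** 2, 10.0))

# Exponential potential with p = 4: epsilon = 1/p and eta = 2/p everywhere.
p = 4.0
print(slow_roll(lambda phi: math.exp(-math.sqrt(2.0 / p) * phi), 1.0))
```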
During inflation the scale of the universe typically increases by a factor of about
$10^{26}$ which is a very large number, so we are primarily interested in the
exponent. In order to quantify the amount of inflation one considers the ratio
of the scale factor at the end of inflation to its values at some time before the
end of inflation. We define the number of e-foldings by

$$N(t) = \ln\frac{a(t_{\text{end}})}{a(t)}, \tag{4.131}$$

which seems slightly counter-intuitive. We note that N(tend) = 0, therefore the
number of e-foldings at some time t measures the amount of inflation that
still has to occur after that time. Typically, the number of e-foldings is 50–70
in order to solve the flatness and horizon problems satisfactorily.
Within the slow-roll approximation we can explicitly calculate this
quantity without ever needing to solve the cosmological field equations. This
is one of the great features of this approximation, we can check whether a
specific potential V (ϕ) can give rise to inflation. We can define the end of
inflation, for instance using ϵ = 1 or η = 1. Lastly, we can compute the
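amount of inflation that has taken place by computing the number of
e-foldings. To compute N(t), we begin with recalling that H = ȧ/a so that

$$N(t) = \ln\frac{a(t_{\text{end}})}{a(t)} = \int_t^{t_{\text{end}}} H\, dt'. \tag{4.132}$$

However, we can now exploit the slow-roll field equations. From Eq. (4.127)
we find dt = −(3H/V′)dϕ while Eq. (4.126) allows us to express H in terms of
the potential. This yields

$$N = \frac{1}{M_{\text{Pl}}^2}\int_{\phi_{\text{end}}}^{\phi} \frac{V}{V'}\, d\phi, \tag{4.133}$$
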
which again only depends on the potential and does not involve the scale
factor or solutions to field equations. We will finish with two examples to see
the various features of the slow-roll approximation worked out explicitly for
two potentials.
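Before turning to the examples, here is a small numerical sketch of the e-folding integral. It assumes the slow-roll expression N = (1/MPl²) ∫ V/V′ dϕ taken from ϕend to ϕ, works in units where MPl = 1 by default, and uses a simple midpoint rule; the function and argument names are illustrative.

```python
import math


def efoldings(V, dV, phi_end, phi, m_pl=1.0, n=10000):
    """Number of e-foldings between field values phi and phi_end in the
    slow-roll approximation, via the midpoint rule with n subintervals.

    V and dV are the potential and its derivative with respect to phi.
    """
    h = (phi - phi_end) / n
    total = 0.0
    for i in range(n):
        x = phi_end + (i + 0.5) * h
        total += V(x) / dV(x)
    return total * h / m_pl ** 2


# Quadratic potential with m = 1, from phi = 15 down to phi_end = sqrt(2).
n_quadratic = efoldings(lambda x: 0.5 * x ** 2, lambda x: x,
                        math.sqrt(2.0), 15.0)
print(n_quadratic)
```

Only the potential enters this computation, neither the scale factor nor a solution of the field equations is required.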

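Example 4.2 (Massive potential $V(\phi) = m^2\phi^2/2$). Consider the potential
$V(\phi) = m^2\phi^2/2$ where m would correspond to the mass of the scalar field. We have
$V'(\phi) = m^2\phi$, and $V''(\phi) = m^2$ so that

$$\frac{V'}{V} = \frac{2}{\phi}, \qquad \frac{V''}{V} = \frac{2}{\phi^2}. \tag{4.134}$$

We can now compute the slow-roll parameters using their definitions and
find

$$\epsilon = \eta = \frac{2M_{\text{Pl}}^2}{\phi^2}. \tag{4.135}$$

For this particular potential both parameters are identical. Our condition for
inflation ϵ ≪ 1 implies that $\phi^2 \gg 2M_{\text{Pl}}^2$. If we define the end of inflation by
the condition ϵ = 1, we find that $\phi_{\text{end}} = \sqrt{2}\,M_{\text{Pl}}$.
Next, let us compute the total number of e-foldings for this potential. We
have

$$N = \frac{1}{M_{\text{Pl}}^2}\int_{\phi_{\text{end}}}^{\phi_i} \frac{\phi}{2}\, d\phi = \frac{\phi_i^2}{4M_{\text{Pl}}^2} - \frac12. \tag{4.136}$$

This means that choosing ϕi = 15MPl yields about 56 e-foldings. Let us
emphasise again that we did not solve the field equations so far.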
So, let us finish this example by solving the slow-roll approximated field
equations. Equation (4.126) can be solved for the Hubble parameter to get

which we can substitute into the second equation (4.127) to give

We can cancel a factor of mϕ on both sides and integrate directly with respect
to time, the result of which is

where C is a constant of integration. We already know that we require ϕi =


15MPl and for this model to give the right
number of e-foldings. Setting the time ti, when inflation began, to zero ti = 0,
we can fix the constant of integration to give C = 15MPl and find the solution

Since we also know the field value at the end of inflation, we can compute tend
from which yields

We are not done yet as we still need to determine the scale factor a(t). Since
H(t) = ȧ/a = d log(a)/dt we can directly integrate Eq. (4.137) using the
explicit solution for the scalar field. This gives

where D is another constant of integration. Since we assumed that inflation
began at ti = 0, we have a(ti) = D, so that D corresponds to the scale of the
universe at that point in time; we denote this by ai. Hence our final scale
factor is given by

As a final consistency check we will verify that ä/a > 0 during inflation.
Differentiating the result (4.143) twice with respect to time and dividing by
a(t) yields

Clearly, at ti the acceleration is positive while at tend we find ä(tend) = 0,
in agreement with our conceptual framework of inflation.

Example 4.3 (Power-law inflation). Power-law inflation is a solution to the
field equations based on the following potential:

where V0 and p are positive constants. We begin with computing the slow-
roll parameters using and so that we
find

Therefore, the inflation condition is satisfied provided that p > 1. However,
unlike in the previous model, there is no natural end to inflation since p is a
constant. Therefore, if the inflation condition is satisfied at one point in time,
it will be satisfied at all times. For such an inflation model one would need to
introduce an additional mechanism to stop inflation. However, solving the
slow-roll approximated field equations gives a particularly nice solution
which we will now derive.
From the first field equation (4.126) we get

Substitution into the second field equation (4.127) results in

which we can write as follows:

Now both sides can be integrated and we arrive at


Here t0 is some constant of integration. Let us substitute this result back into
Eq. (4.147) in order to find H(t). It turns out that almost all terms cancel
nicely and we are left with

which justifies the name power-law inflation of this model. Recall that
inflation never ends so it would be meaningless to compute the number of e-
foldings without a mechanism to stop it.
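
The statement that inflation never ends can be checked with a few lines of
code. The sketch below assumes the scale factor has the standard power-law form
a(t) ∝ (t − t0)^p, for which ä/a = p(p − 1)/(t − t0)², and compares this with a
finite-difference estimate. The function names and sample values are
illustrative only.

```python
def acceleration_over_a(p, t, t0=0.0, h=1e-3):
    """Central finite-difference estimate of (d^2 a/dt^2)/a for a = (t - t0)^p."""
    def a(s):
        return (s - t0) ** p

    return (a(t + h) - 2.0 * a(t) + a(t - h)) / (h * h * a(t))


def exact_acceleration_over_a(p, t, t0=0.0):
    """Closed form p(p - 1)/(t - t0)^2 for the assumed power-law scale factor."""
    return p * (p - 1.0) / (t - t0) ** 2


# p > 1 accelerates at every time, p < 1 never does.
accelerating = [acceleration_over_a(2.0, t) > 0 for t in (0.5, 1.0, 2.0, 10.0)]
decelerating = [acceleration_over_a(0.5, t) < 0 for t in (0.5, 1.0, 2.0, 10.0)]
```

The sign of ä/a is fixed by p alone and does not change with t, which is the
numerical counterpart of the statement that an extra mechanism is needed to stop
inflation in this model.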

4.5. Further Reading

The theory of inflation was originally introduced to resolve the shortcomings
of big bang cosmology. More importantly, inflation provides us with a
mechanism to create the initial density perturbations (quantum fluctuations)
of the early universe that seed the growth of larger gravitating structures.
These will eventually evolve into the objects we see in today’s night sky. The
theory of inflation is very successful in that it is able to provide us with the
appropriate initial conditions compatible with the universe we observe today.
Connecting the fluctuations of the inflaton in the very early universe with the
Einstein field equations and extracting observable quantities out of this model
is at the heart of modern cosmology. Three books which would be a natural
continuation along these lines are by Dodelson (2003), Weinberg (2008) and
Liddle (2015). A very different book is by Rowan-Robinson (2004) which
emphasises the empirical evidence and observational data which underlies
cosmology.
In order to study how small fluctuations of densities affect the evolution
of the universe and how such fluctuations evolve within an expanding
universe, one needs to study cosmological perturbation theory. This is very
similar to the weak gravity approach we considered in Sec. 2.3. The main
difference is that instead of considering small perturbations about Minkowski
space, one is interested in small perturbations about the Friedmann–Lemaître–
Robertson–Walker metric. As in Sec. 2.3.2 one can choose a particular gauge
in which the perturbed metric takes a particularly nice form. For instance,
when studying scalar perturbations with k = 0 one can write

where Φ and Ψ are functions of all the coordinates and are considered to be
small. Using this metric one can compute the Einstein field equations in
linear order in the perturbations, along the lines of the weak field limit.
Cosmological perturbation theory is an important tool of modern cosmology,
see for instance the book by Gorbunov and Rubakov (2011).
Most topics in modern cosmology focus on the Friedmann–Lemaître–
Robertson–Walker metric and study physics in this setting. However, one
could also study cosmological models which are homogeneous but
anisotropic; this is very interesting in its own right and some of the
shortcomings in cosmology are less of an issue in this framework. The
interested reader is referred to the book by Ryan and Shepley (1975) which
discusses such models in detail.
The recommended literature is by no means complete; it simply reflects
the author’s suggestions for further reading.

4.6. Exercises

Classical and modern cosmology

Exercise 4.1. Consider an infinitely large and eternal universe which is static,
and contains an infinite number of stars. In such a universe the night sky
should be very bright instead of dark. This is known as Olbers’ paradox.
Propose various explanations to solve this paradox.

Exercise 4.2. Construct an argument based on thermodynamics to deduce that
the universe cannot be infinitely old (heat death paradox or Clausius
paradox).

Exercise 4.3. Calculate the non-vanishing Christoffel symbol components of
the Friedmann–Lemaître–Robertson–Walker metric given by Eq. (4.6).

Exercise 4.4. Following on from the previous exercise, compute the Ricci
tensor components of (4.6) and verify that the Ricci scalar is given by Eq.
(4.16).

Exercise 4.5 (hard). Verify that the volume of the 3-sphere S³ is given by
V(S³) = 2π², and also show that V(S²) = 4π and V(S⁴) = 8π²/3. One has to be
slightly careful with the notion of volume at this point as the unit sphere S²
embedded in R³ is usually said to have area 4π. However, when S² is viewed
as a manifold, we can visualise this manifold as the surface of a sphere.
The volume of this manifold corresponds to its surface area, thus the confusion.
The interested reader might wish to push this exercise further and show
that the volume of the general n-dimensional sphere Sⁿ is given by
2π^((n+1)/2)/Γ((n + 1)/2) where Γ is the gamma function, Γ(1/2) = √π, and
Γ(n + 1) = nΓ(n) (no complete solution given for this final part).
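
Before attempting the general case it can help to check the formula
numerically. The sketch below assumes the unit sphere Sⁿ is embedded in
(n + 1)-dimensional Euclidean space, so that both the exponent of π and the
argument of the gamma function are (n + 1)/2. The function name sphere_volume
is illustrative.

```python
import math


def sphere_volume(n):
    """Volume of the unit n-sphere S^n, viewed as an n-dimensional manifold."""
    half = (n + 1) / 2.0
    return 2.0 * math.pi ** half / math.gamma(half)


# The three values quoted in the exercise, plus the circumference of S^1.
v1 = sphere_volume(1)   # circle: 2*pi
v2 = sphere_volume(2)   # ordinary sphere: 4*pi
v3 = sphere_volume(3)   # 3-sphere: 2*pi^2
v4 = sphere_volume(4)   # 4-sphere: 8*pi^2/3
```

The same function can be used to see that the volume of the unit sphere does
not grow indefinitely with n, which is a useful sanity check on any analytic
derivation.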

Cosmological solutions

Exercise 4.6. Solve the cosmological field equations assuming a
matter-dominated universe w = 0, without cosmological term Λ = 0 and with
negative spatial curvature k = −1.

Exercise 4.7. Generalise the Einstein static universe by including pressure.

Exercise 4.8. Show that the Einstein static universe, Sec. 4.2.3, is unstable
with respect to small homogeneous perturbations, aE → aE + δa(t) and ρE →
ρE + δρ(t).

Exercise 4.9. Include pressure in the previous calculation, assume the
equation of state p = wρ, and show that the perturbations can be stable
provided −1 < w < −1/3.

Exercise 4.10. Show that the de Sitter solution for k/Λ > 0 can also be
written in the form
Hint: Start with Eq. (4.42) and use the appropriate substitution.

Physical cosmology

Exercise 4.11. Use Eq. (4.91) to find the age of the universe assuming it is
spatially flat and matter-dominated with Λ = 0. Where have we encountered a
similar result?

Exercise 4.12. Find the luminosity distance in a spatially flat and matter
dominated universe using Eq. (4.86).

Exercise 4.13. In the radiation-dominated universe with Λ = 0 and k ≠ 0,
show that

where Ω0 = Ω(t0) is the present value of the density parameter. Determine the
value of the constant A. Using this result find a relationship between the
Hubble parameter H and the quantities H0, Ω0, z.

Exercise 4.14. Consider the spatially spherical universe, filled with radiation,
without cosmological constant. Consider a light signal travelling along the
azimuth so that χ = θ = π/2. Show that a light ray emitted at the ‘big bang’
will have travelled halfway around the universe by the time of the ‘big
crunch’.

Exercise 4.15. Consider the spatially spherical universe, filled with matter (w
= 0), without cosmological constant. Consider a light signal travelling along
the azimuth so that χ = θ = π/2. Show that a light ray emitted at the ‘big bang’
will have travelled all the way around the universe by the time of the ‘big
crunch’.

Exercise 4.16. The laboratory wavelength of the so-called hydrogen-α line is
656.3 nm. An elliptical galaxy has the visible angular size θ = 4′. The
hydrogen-α line in its spectrum has the wavelength 657.0 nm. Find the distance
to the galaxy and its proper size, using H0 = 60 km/(s Mpc).

Inflation

Exercise 4.17. Show that the cosmological equations are equivalent to the
autonomous nonlinear system

Find the function F(H, ϕ) explicitly. Assuming a constant potential V = V0 > 0,
integrate the cosmological field equations to find H(t) and a(t). Show that
for large t, a(t) ∝ exp(µ(t − t0)) and find µ.

Exercise 4.18. Derive Eq. (4.115) using the variational approach and
definition (2.127).

Exercise 4.19 (takes time). Derive the cosmological Klein–Gordon equation
(4.120) using the variational approach and metric (4.7) with k = 0.

Exercise 4.20. Show that the field equations in the slow-roll approximation imply

and argue for the condition η ≪ 1.


5
Solutions to Exercises

5.1. Solutions: Differential Geometry

The concept of a vector

Exercise 1.1. The area of the parallelogram spanned by the vectors b, c is
A = |b × c| = bc sin ϕ, and the normal to this parallelogram is (b × c)/|b × c|.
We now project a along this normal to find the height of the parallelepiped

$$h = \frac{|a \cdot (b \times c)|}{|b \times c|}.$$

To find the volume, we multiply the area A by the height h and arrive at

$$V = A\,h = |a \cdot (b \times c)|.$$

Exercise 1.2. Following on from the previous exercise, we see that the
volume of the parallelepiped spanned by a, b, c is given by V = |a · (b × c)|.
We can also find the volume by first computing the area of the parallelogram
spanned by the vectors a, b, and then projecting c along the corresponding
normal, which would result in the volume c · (a × b). Lastly, we can begin
with the parallelogram spanned by a, c; however, we have to use the normal
such that it points in the same direction as b. Hence the normal would be c × a
and the volume would be b · (c × a). Alternatively, we could write out all
vectors in terms of basis vectors a = a1i + a2j + a3k and compute the scalar
triple products explicitly using Eqs. (1.7) and (1.8).
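
The cyclic symmetry of the scalar triple product is easy to confirm for a
concrete choice of vectors. The following sketch uses three arbitrarily chosen
integer vectors; the helper names cross and dot are illustrative.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


# Arbitrary sample vectors (an assumption made purely for illustration).
a, b, c = (1, 2, 3), (-1, 0, 4), (2, 5, -2)

t_abc = dot(a, cross(b, c))   # a . (b x c)
t_cab = dot(c, cross(a, b))   # c . (a x b)
t_bca = dot(b, cross(c, a))   # b . (c x a)
volume = abs(t_abc)           # volume of the parallelepiped
```

All three cyclic orderings give the same signed value, and swapping any two
vectors flips the sign, which is why the volume is defined with an absolute
value.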

Exercise 1.3. We have ei · ej = δij and ei × ej = εijk ek and hence

$$a \cdot (b \times c) = \varepsilon_{ijk}\, a_i b_j c_k.$$

This also explains directly the symmetry properties of the scalar triple
product.

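Exercise 1.4. We write out the determinant explicitly

$$\varepsilon_{ijk}\varepsilon_{lmn} = \delta_{il}(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}) - \delta_{im}(\delta_{jl}\delta_{kn} - \delta_{jn}\delta_{kl}) + \delta_{in}(\delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}).$$

Let us sum over the indices k and n which gives

$$\varepsilon_{ijk}\varepsilon_{lmk} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}.$$

Let us continue with this result and sum also over j and m which gives

$$\varepsilon_{ijk}\varepsilon_{ljk} = 3\delta_{il} - \delta_{il} = 2\delta_{il}.$$

Lastly, we arrive at εijk εijk = 2δii = 6,
where we repeatedly used δii = 3 in three dimensions.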
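
These contraction identities can also be confirmed by brute force, summing over
all index values. The sketch below uses indices 0, 1, 2 and a compact closed
form for the Levi-Civita symbol; the helper names are illustrative.

```python
from itertools import product


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


R = range(3)

# Full contraction eps_ijk eps_ijk.
full_contraction = sum(levi_civita(i, j, k) ** 2 for i, j, k in product(R, R, R))

# Single contraction eps_ijk eps_lmk = delta_il delta_jm - delta_im delta_jl.
single_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(l, m, k) for k in R)
    == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
    for i, j, l, m in product(R, R, R, R)
)

# Double contraction eps_ijk eps_ljk = 2 delta_il.
double_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(l, j, k) for j, k in product(R, R))
    == 2 * delta(i, l)
    for i, l in product(R, R)
)
```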

Exercise 1.5. We start with

and then
Next we compute the right-hand side of the identity (a · c)b − (a · b)c which
is

This agrees with the left-hand side we computed. Starting with the left-hand
side again gives

and we are done using the index notation.

Exercise 1.6. We have ei × ej = εijs es, therefore (ei × ej) · ek = εijs es · ek
= εijs δsk = εijk which is what we wanted to show.

Exercise 1.7. The first two identities follow from the symmetry properties of
εijk. We have div curl A = εijk∂k(∂iAj) = 0, where we note that ∂k∂i is
symmetric in the indices ki while εijk is skew-symmetric. Likewise curl grad f
= εijk∂i(∂j f)ek = 0. For the third identity we write

The final one is proved as follows:


Manifolds and tensors
Exercise 1.8. Recall the transformation property of the partial derivative

Likewise we can work out

The first term contains the homogeneous transformation part, the remaining
terms are the inhomogeneous parts. Taking the definition of the Christoffel
symbol into account, there are three terms of the form gab,c. So there will be a
total of six inhomogeneous terms in the transformed Christoffel symbol, four
of which have plus signs and two of which have negative signs. Four of these
terms cancel each other and the remaining two inhomogeneous terms are
added up and cancel the factor 1/2 in the definition of the Christoffel symbol.
This yields the result.

Exercise 1.9. We have A′^j = (∂X′^j/∂X^s) A^s which we can simply apply to both
sides of Eq. (1.238). Then (∂X^c/∂X′^j)(∂X′^j/∂X^s) A^s = δ^c_s A^s = A^c and
likewise for the second term which implies the result.

Exercise 1.10. We showed that ds² = dx² + dy² = dr² + r²dϕ² in polar
coordinates. Then a direct substitution gives

Exercise 1.11. Start with ρ = 2r/(1 − r²) which is a quadratic equation in r
given by r² + 2r/ρ − 1 = 0 which we can solve to find r = (√(1 + ρ²) − 1)/ρ
(the other solution is unphysical). Next we compute dρ for which we have

so that we can write

Once we rewrite the right-hand side in terms of ρ we are done.

The final calculation is

Hence we find the desired result

With ρ = sinh χ we find dρ = cosh χ dχ and dρ² = cosh²χ dχ². Next
1 + ρ² = 1 + sinh²χ = cosh²χ where we used the standard hyperbolic identity.
Therefore, ds² = dχ² + sinh²χ dφ².

Exercise 1.12. We start with and now recall Eq. (1.100) so


that we can write

Exercise 1.13. The metric tensor and inverse metric tensor are given by

Hence, ∂ygij = 0. The only non-vanishing Christoffel symbol component is


The Lagrangian is so that ÿ = 0 and y = cλ
+ d (c, d are constants), while the z equation is

Therefore, we arrive at

and can verify that the only non-vanishing Christoffel symbol component is
Now, we divide this last equation by ż and write

We note that
and can now integrate both sides with respect to λ which gives

Separation of variables leads to

Therefore, using y = cλ + d we can eliminate λ and arrive at

which is the equation of a straight line. This also strongly suggests that one
should introduce a new coordinate of the form x = tanh z. One can check

so that

Exercise 1.14. Write out (1.114) + (1.115) − (1.116) in detail


Finally we apply gic to both sides and arrive at

which is the stated result.

Exercise 1.15. The metric and inverse metric always satisfy Let
us differentiate this using . This gives

where in the second step the product rule was used. Now, we apply gbf and
arrive at

Curvature

Exercise 1.16. Following Example 1.14, our coordinates are X^i = (θ, ϕ) and
the non-vanishing Christoffel symbol components are

$$\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta, \qquad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta.$$

Next, we can recall the equations for parallel transport (1.128), where T^a is
the tangent vector of the curve along which we transport.
(i) Two points with the same longitude have coordinates (θ1, ϕ0) and (θ2,
ϕ0), respectively. A curve C connecting these points is given by X^a(λ) = (θ1 +
λ(θ2 − θ1), ϕ0) and has tangent vector T^a = (θ2 − θ1, 0) which we write as T^a
= (θ̄, 0). Therefore, the equations for parallel transport are

$$\frac{dV^1}{d\lambda} = 0, \qquad \frac{dV^2}{d\lambda} + \bar\theta \cot\theta\, V^2 = 0.$$

The first equation simply yields V^1 = const.; in the second equation we can
separate variables. We use θ(λ) = θ1 + λθ̄ so that

which we can integrate to find

We can determine the value of the constant of integration by the condition


V2(0) = and arrive at the result

(ii) We now repeat this process for parallel transport along constant
latitudes. A curve C connecting two such points is given by X^a(λ) = (θ0, ϕ1 +
λ(ϕ2 − ϕ1)) and has tangent vector T^a = (0, ϕ2 − ϕ1) which we write as T^a =
(0, ϕ̄). The parallel transport equations are

$$\frac{dV^1}{d\lambda} - \bar\phi \sin\theta_0\cos\theta_0\, V^2 = 0, \qquad \frac{dV^2}{d\lambda} + \bar\phi \cot\theta_0\, V^1 = 0.$$

This pair of equations is the harmonic oscillator equation in first-order
formulation. To see this, differentiate the second equation with respect to λ
and eliminate V^1. This gives

$$\frac{d^2V^2}{d\lambda^2} + k^2\bar\phi^2\, V^2 = 0,$$

with k = cos θ0. The solution to this is

which implies that V1 is given by

We can now fix our constants of integration by Vi(0) = . This gives the
second result

(iii) Let us begin by transporting Vinitial = (1, 0) from P1 = (π/2, 0) to P2 =
(ϵ, 0); here ϕ is constant and we find

We cannot transport to the north pole as our coordinates become singular at
this point. Next, we transport from P2 = (ϵ, 0) to P3 = (ϵ, π/2); now θ is
constant with θ0 = ϵ and ϕ̄ = π/2, so that kϕ̄ = (π/2) cos(ϵ), hence

Next, we transport from P3 = (ϵ, π/2) to P4 = (π/2, π/2). This leads to


We can now safely consider the limit ϵ → 0 which yields

Finally, we transport this vector from P4 = (π/2, π/2) back to P1 = (π/2, 0),
this gives k = cos θ0 = 0 and we arrive at the result

This shows that the final vector is rotated by −90°.
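
The constant-latitude transport of part (ii) can be cross-checked numerically.
The sketch below integrates the parallel transport equations on the unit sphere
with a fourth-order Runge–Kutta scheme, assuming the standard Christoffel
symbols Γ^θ_ϕϕ = −sin θ cos θ and Γ^ϕ_θϕ = cot θ and a curve parameter λ running
from 0 to 1. The function name and the sample latitude are illustrative.

```python
import math


def transport_along_latitude(theta0, dphi, V, steps=2000):
    """Parallel transport V = (V^theta, V^phi) along theta = theta0 over an
    azimuthal interval dphi, integrating for lambda in [0, 1] with RK4."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        # dV1/dl = dphi*sin*cos*V2,  dV2/dl = -dphi*cot*V1
        return (dphi * s * c * v[1], -dphi * (c / s) * v[0])

    h = 1.0 / steps
    v = (float(V[0]), float(V[1]))
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v


theta0 = math.pi / 3                      # sample latitude with cos(theta0) = 1/2
V_end = transport_along_latitude(theta0, 2.0 * math.pi, (1.0, 0.0))

# The length g_ab V^a V^b = (V^1)^2 + sin^2(theta0) (V^2)^2 must be preserved.
norm_sq = V_end[0] ** 2 + math.sin(theta0) ** 2 * V_end[1] ** 2
```

For this latitude the oscillator frequency is kϕ̄ = 2π cos θ0 = π, so after one
full circuit the vector comes back pointing in the opposite direction while its
length is unchanged.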

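Exercise 1.17. We begin with the Christoffel symbol of the metric defined by
ds² = v²du² + u²dv² which is given by

$$\Gamma^u_{uv} = \frac{1}{v}, \qquad \Gamma^u_{vv} = -\frac{u}{v^2}, \qquad \Gamma^v_{uu} = -\frac{v}{u^2}, \qquad \Gamma^v_{uv} = \frac{1}{u}.$$
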
We compute R1212 by direct calculation using Eq. (1.164) which in this case
is

Therefore,
and we find the desired result R1212 = 0.

Exercise 1.18. We begin with

where we used u = exp U and v = exp V. Next we introduce the new
coordinates V + U = α, V − U = β so that V = (α + β)/2 and U = (α − β)/2.
Substitution into the line element gives

Next, we introduce

and we arrive at

which is Euclidean space in polar coordinates, see Example 1.2. Putting all
the transformations together, you hopefully arrive at

Finally, it is also a good exercise to show that dx2 + dy2 indeed produces the
required metric.

Exercise 1.19. We write out property (iii) four times as follows:


where we underlined all terms which are part of the identity we wish to
prove. Now we compute (5.71) – (5.72) – (5.73) + (5.74). This yields

where property (ii) was used. All the remaining terms will now cancel. For
instance

and so on for the remaining two pairs. Hence 2Wabcd − 2Wcdab = 0, which we
wanted to show.

Exercise 1.20. One possible proof is as follows: The Riemann curvature
tensor is skew-symmetric in the first and last pair of indices. The first pair
can take (n − 1)n/2 different values, and so can the second pair, so we write
RAB with A, B = {1, ..., (n − 1)n/2} for Rabcd. Next, we know that Rabcd =
Rcdab which becomes RAB = RBA, in other words RAB is symmetric and
therefore has ((n − 1)n/2)((n − 1)n/2 + 1)/2 independent components. We are
left with property (iii) to take into account; this is not easy, and involves
some combinatorics. There are ‘n choose 4’ equations we need to take into
account. This gives

$$\frac{1}{2}\,\frac{n(n-1)}{2}\left(\frac{n(n-1)}{2} + 1\right) - \binom{n}{4}$$

which after a bit of simplification gives the result n²(n² − 1)/12.
Note that the derivation involves a binomial coefficient which is not
defined for n < 4. In the cases n = 2 and n = 3, property (iii) simply does not
give any additional equations.
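
The counting argument can be checked for the first few dimensions. In the
sketch below the function name is illustrative; it relies on the fact that
Python's math.comb(n, 4) returns zero for n < 4, which mirrors the remark that
property (iii) gives no additional equations in those cases.

```python
import math


def riemann_independent_components(n):
    """Count via the argument above: symmetric R_AB minus 'n choose 4' constraints."""
    m = n * (n - 1) // 2               # number of values of a skew-symmetric index pair
    return m * (m + 1) // 2 - math.comb(n, 4)


counts = {n: riemann_independent_components(n) for n in range(1, 11)}
closed_form = {n: n * n * (n * n - 1) // 12 for n in range(1, 11)}
```

In particular the count gives 1, 6 and 20 independent components in two, three
and four dimensions, respectively.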

5.2. Solutions: Einstein Field Equations

Some physics background

Exercise 2.1. We begin with the Euler equation for hydrostatic equilibrium
which is ρg = ∇p. Next, assuming spherical symmetry we have dp/dr =
−ρGm/r². The mass m and the density ρ are related by the standard relation
dm/dr = 4πρr². Since we consider a constant density star ρ = ρ0 = const., we
have m = (4π/3)ρ0r³. Therefore, dp/dr = −(4πG/3)ρ0²r, which we can integrate
to find

$$p(r) = p_c - \frac{2\pi G}{3}\,\rho_0^2\, r^2,$$

where pc is a constant of integration which corresponds to the central
pressure of the Newtonian star. One verifies that the units of Gρ0²r² are those
of pressure, which has units of force per area.
Next, we need to find the boundary of the star which we choose to be the
vanishing pressure surface. Hence, we define R such that p(R) = 0. This gives

$$R^2 = \frac{3 p_c}{2\pi G \rho_0^2},$$

while the compactness measure is M/R = (4π/3)ρ0R². For a Newtonian star
there are no constraints limiting the mass and size of this hypothetical star.
The general relativistic treatment of static and spherically symmetric stars
(Sec. 3.3) with constant density will show that M/R < 4/9 which makes these
solutions more realistic.
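
To get a feeling for the numbers, the sketch below evaluates the Newtonian
pressure profile for a constant-density star with the mass and radius of the
Sun. The numerical constants are rounded reference values and the function
names are illustrative; a real star is of course far from having constant
density.

```python
import math

G = 6.674e-11          # gravitational constant in m^3 kg^-1 s^-2


def central_pressure(rho0, R):
    """Central pressure p_c = (2*pi/3) G rho0^2 R^2 fixed by p(R) = 0."""
    return (2.0 * math.pi / 3.0) * G * rho0**2 * R**2


def pressure(r, rho0, R):
    """Newtonian pressure profile p(r) = p_c - (2*pi/3) G rho0^2 r^2."""
    return central_pressure(rho0, R) - (2.0 * math.pi / 3.0) * G * rho0**2 * r**2


# Illustrative solar values (rounded).
M_sun, R_sun = 1.989e30, 6.957e8
rho0 = M_sun / ((4.0 * math.pi / 3.0) * R_sun**3)

p_c = central_pressure(rho0, R_sun)
p_c_from_mass = 3.0 * G * M_sun**2 / (8.0 * math.pi * R_sun**4)
```

Eliminating ρ0 in favour of the total mass gives pc = 3GM²/(8πR⁴), which is the
second expression evaluated above.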

Exercise 2.2. Let us begin with i = 1, j = 2, k = 3, then ∂1F23 + ∂3F12 + ∂2F31
= 0. Using the components of Fij means that this is ∂1(−Bx) + ∂3(−Bz) +
∂2(−By) = 0 which is the explicit form of div B = 0. For all other index
combinations where i, j, k take values 1, 2, 3 we either arrive at a trivial
identity or the same equation. This is the second equation of (2.23).
Now, let us choose i = 0, j = 1, k = 2, then ∂0F12 + ∂2F01 + ∂1F20 = 0.
Again, using the components we have ∂0(−Bz) + ∂2Ex + ∂1(−Ey ) = 0 or

We recognise the first term as the z-component of the vector curl E, therefore
we have one of the three equations. The choices i = 0, j = 1, k = 3 and i = 0, j
= 2, k = 3 yield the y and x components, respectively. Hence, the first
equation of (2.23) is shown and therefore the equivalence established.

Exercise 2.3. Let us start with ηijuiuj = −1. In the new coordinates, we should
write ηijuiuj = −c² and find ui = (c, 0, 0, 0), similar to before but multiplied by
c. The physical meaning of the components of the energy–momentum tensor
is independent of the choice of coordinates; nonetheless, we need to
compensate for factors of c from the Minkowski metric. We can write Tij = (ρ
+ p/c²)uiuj + pηij, and can check explicitly, T00 = (ρ + p/c²)c² + p(−1) = ρc²,
and T11 = T22 = T33 = p. In general, we would write ui = (c, v) which is nice in
the sense that this vector has units of velocity. Both formulations are related
by the simple transformation ui → cui. We should also note that the choice of
coordinates Xi = (t, x, y, z) means that time and space have the same units.
This gives a first hint that setting c = 1 is a pretty good idea as it is very easy
to get factors of c wrong.

Exercise 2.4. We can read off the matrix form directly from the coordinate
transformation

To compute the determinant, we expand along the fourth and then along the
third row which gives

Next, we need to find LcaLdbηcd. First, we work out

and next one can verify Lca(Ldbηcd) = ηab which is what we wanted to show.
Comparing Eq. (5.82) with Eq. (2.13) gives the following identifications
cosh(ζ) = γ and sinh(ζ) = γβ, and then sinh(ζ)/ cosh(ζ) = tanh(ζ) = β = v/c. The
angle α in Fig. 2.1 is related to the velocity β by β = tan(α) which gives tan(α)
= tanh(ζ) and hence α = arctan(tanh(ζ)).

Exercise 2.5. A 2D rotation matrix R(θ) is generally written as

so that R(iθ) is given by


Now, we can compute the transformation property

which is in agreement with the hyperbolic form of the transformations given
by Eqs. (2.133) and (2.134). This justifies the notion of hyperbolic rotations
for Lorentz boosts.

Exercise 2.6. We begin with Lab from Eq. (5.82) with ζ1 and ζ2, respectively,
and compute

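where we used the hyperbolic identities

$$\cosh(\zeta_1 + \zeta_2) = \cosh\zeta_1\cosh\zeta_2 + \sinh\zeta_1\sinh\zeta_2, \qquad \sinh(\zeta_1 + \zeta_2) = \sinh\zeta_1\cosh\zeta_2 + \cosh\zeta_1\sinh\zeta_2.$$

Therefore, we see that the rapidity of the overall boost is ζ = ζ1 + ζ2.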
We can use this result and combine it with Exercise 2.4 where
we showed that tanh(ζ1) = β1 = v1/c. Since ζ = ζ1 + ζ2 we also have that
tanh(ζ) = tanh(ζ1 + ζ2). Now apply the hyperbolic identity which expands
tanh(ζ1 + ζ2); this yields

$$\tanh(\zeta) = \frac{\tanh\zeta_1 + \tanh\zeta_2}{1 + \tanh\zeta_1\tanh\zeta_2}.$$

Finally, we can replace all hyperbolic tangent terms with their respective
velocities and arrive at the desired result

$$v = \frac{v_1 + v_2}{1 + v_1 v_2/c^2}.$$
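
The additivity of rapidities and the resulting velocity addition law can be
checked numerically. The sketch below works with the 2 × 2 block of the boost
acting on the time coordinate and one spatial coordinate, in units with c = 1.
The sign of the off-diagonal entries is a convention that is assumed here, and
the helper names are illustrative.

```python
import math


def boost(zeta):
    """2x2 block of a Lorentz boost with rapidity zeta (assumed sign convention)."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return ((ch, -sh), (-sh, ch))


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add_velocities(beta1, beta2):
    """Relativistic velocity addition in units with c = 1."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


zeta1 = math.atanh(0.5)                 # rapidity of a boost with v = c/2
zeta2 = math.atanh(0.5)
combined = matmul2(boost(zeta1), boost(zeta2))
direct = boost(zeta1 + zeta2)
beta_total = math.tanh(zeta1 + zeta2)   # velocity of the combined boost
```

Two successive boosts with v = c/2 combine to a boost with v = 0.8c rather than
v = c, as expected from the addition law.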

Exercise 2.7. We already computed the T00 component. Next, let us look at
the T0α components where α = 1, 2, 3 can only take the spatial values. Hence,

because η0α = 0 for all α, and F00 = 0.


Therefore, we find

which we identify as the x-component of the vector product of E and B and


we suspect that cT0α = Sα. A direct calculation will show that this is indeed
the case for the other two components as well.
The spatial components of the energy–momentum tensor are given by

Exercise 2.8. We work in Minkowski space so gij = ηij, and recall Fab = ∂aAb
− ∂bAa. We will make the metric in the action explicit and write ηacηbdFabFcd
so that

Considering Aa + δAa gives

where we integrated by parts and neglected the boundary terms.


Therefore, we arrive at the following equations ∂bFab = −(α/4)Ja which
matches the Maxwell equations (2.27) provided we choose α = 16π/c.

Geometry and gravity

Exercise 2.9. The easiest way to see this is to use Eq. (2.76) as the Einstein
field equation. Then Tij = 0 implies T = 0 and therefore the vacuum field
equations are Rij = 0. Taking the trace gives R = 0 as desired.

Exercise 2.10. As mentioned in the text, we will show this directly by


computing the covariant derivative of the left-hand side. This is

where we used that Λ is a constant and thus not affected by the derivative.
Next, by Theorem 1.6 we have ∇iGij = 0. Moreover, we assume the
covariant derivative to be metric compatible, therefore ∇igij = 0 and
consequently the field equations with cosmological term also imply the
energy–momentum conservation equation.

Exercise 2.11. We begin with applying gij to Eq. (2.77). This gives R − 2R +
4Λ = κT, hence −R = κT − 4Λ which we substitute back into Eq. (2.77). This
gives

Exercise 2.12. We begin by writing out explicitly the covariant derivative

where we used Γ^j_ik F^ik = 0 because Γ is symmetric in the two lower indices
while F is skew-symmetric. Using Eq. (1.100), we have

Next, we can multiply this equation by √(−g) and apply the product rule. This
shows that ∇iFij = 0 is equivalent to

Integration over the entire space results in a conserved quantity.

Exercise 2.13. One could suggest that the simplest metric component is

The sign here does not matter as both α and Λ are not assumed to be positive.
The gravitational field g is related to the Christoffel symbol, hence, the
simplest approach towards a Λ-corrected Newtonian force law would be to
differentiate our suggested metric component with respect to r which gives

Recall that the definition of the Christoffel symbol contains a factor of 1/2,
and also the minus sign when relating the potential to the field. Taking these
into account leads to

Therefore, the cosmological constant contributes linearly to the Newtonian
force law.
Within the Solar System Newtonian gravity is a very good approximation
to the gravitational field of the Sun. The dwarf planet Pluto has a semi-major
axis of about 5.9 × 10¹² m and should not be affected by the term Λ. Since Λ
has units of inverse length squared, we find the simple bound

The observed value of Λ is about 10⁻⁵² m⁻², much smaller than the bound set
by the Solar System. In other words, the contribution of Λ becomes
significant only on cosmological scales.

Weak gravity
Exercise 2.14. We begin with h̄ab = hab − ηab h/2 and take the trace of this
equation which gives h̄ = h − 4h/2 = −h. Here we used η^ab ηab = δ^a_a = 4. Now
we can substitute h by −h̄ and arrive at h̄ab = hab + ηab h̄/2 which is
equivalent to hab = h̄ab − ηab h̄/2.
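
This result says that trace reversal applied twice returns the original tensor.
The sketch below checks this for a sample symmetric perturbation, assuming the
Minkowski metric has signature (−, +, +, +). The matrix h and the helper names
are purely illustrative.

```python
# Minkowski metric with assumed signature (-, +, +, +); its inverse has the
# same components, so it is used for both eta_ab and eta^ab.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def trace(h):
    """Trace eta^{ab} h_ab (only diagonal entries contribute)."""
    return sum(ETA[a][a] * h[a][a] for a in range(4))


def trace_reverse(h):
    """Trace-reversed tensor: h_ab - eta_ab * trace(h) / 2."""
    tr = trace(h)
    return [[h[a][b] - 0.5 * ETA[a][b] * tr for b in range(4)] for a in range(4)]


# Arbitrary symmetric sample perturbation (an assumption for illustration).
h = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 4.0],
     [0.0, 0.0, 4.0, 5.0]]

hbar = trace_reverse(h)
h_back = trace_reverse(hbar)
```

The trace changes sign under the first operation, and the second operation
undoes the first, which is why the name trace reversal is appropriate.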

Exercise 2.15. We start by recalling that gab = ηab + hab implies that g^ab =
η^ab − h^ab, see Eq. (2.83). Next, instead of (2.91) we must consider
from which we directly compute the analogue of (2.93) which reads

Using, as in the main text, Eq. (1.34) gives

Therefore,

which indeed matches Eq. (2.95) with lowered indices. Raising and lowering
indices with η does not affect the signs. Computing the transformation
property of the trace yields hʹ = h − which we can combine with the
previous equation to find the transformation of the trace-reverse tensor, we
have

Lastly, we can differentiate to arrive at

where the last two terms cancelled because the order of partial differentiation
does not matter.

Variational approach to General Relativity


Exercise 2.16. We begin with f(x + δx) = f(x) + f′(x)δx + f″(x)(δx)²/2 + ···. We
move f(x) to the left-hand side and get f(x + δx) − f(x) = f′(x)δx +
f″(x)(δx)²/2 + ··· and use the suggested notation so that

$$\delta f = f'(x)\,\delta x + \frac{1}{2} f''(x)\,(\delta x)^2 + \cdots.$$

Now, we divide by δx and consider the limit δx → 0 and we find that indeed
f′(x) = limδx→0 δf/δx.

Exercise 2.17. To find the energy–momentum tensor we need to make
the metric explicit in the action, we write FabFab = Fab Fcdgcagdb so that the
action (without source terms) is given by

Therefore, the matter Lagrangian is LEM = FabFcdgcagdb . Recall Eq.


(2.118) for the square root term. We can now consider variations with respect
to the metric

Finally, using Eq. (2.127) we arrive at

which is in agreement with Eq. (2.50).


Exercise 2.18. We know that the Ricci scalar R already contains second partial
derivatives of the metric. Naively, we would expect to arrive at fourth-order
differential equations when computing the Euler–Lagrange equations because
we need to integrate by parts twice when varying. However, the Einstein field
equations are only second order because the variations δRab do not contribute
to the field equations, see the discussion after Eq. (2.121).
The moment we consider a Lagrangian which is not linear in the Ricci
scalar, this argument will not work. If our functional contains the term RijRij
then we would arrive at δRijRij + RijδRij. While δRab does give a surface term
δRijRij does not. The integration by parts used to arrive at Eq. (2.123) will
now also act on the Ricci tensor Rij introducing higher-order derivative terms.
The same holds for gravitational actions containing an arbitrary function of
the Ricci scalar which is not linear. All such theories yield fourth-order
theories (or higher order if covariant derivatives are also included).
Additional discussion: This immediately leads to some problems when
the Newtonian limit is considered. In Sec. 2.2.2, we discussed how the Ricci
tensor is related to the Poisson equation of Newtonian gravity. Therefore, a
higher order theory of gravity will not directly reduce to the Poisson equation
but will contain higher derivative terms. These must be very carefully
examined as any gravitational theory has to pass a series of stringent tests to
be compatible with current observational data. It turns out that many
gravitational theories already fail solar system tests, while those that do not
tend to fail on cosmological scales. Constructing a gravitational theory as
successful as General Relativity is indeed a very difficult task.

5.3. Solutions: Schwarzschild Solutions

The Schwarzschild solution

Exercise 3.1. The Schwarzschild radius in physical units is rS = 2GM/c². For
the proton mass we take mp = 1.67 × 10⁻²⁷ kg so that

$$r_S = \frac{2 \times 6.67\times10^{-11} \times 1.67\times10^{-27}}{(3.00\times10^{8})^2}\,\mathrm{m} \approx 2.5\times10^{-54}\,\mathrm{m},$$

which is a very small number! The radius of the proton is about
rp = (0.84–0.87) × 10⁻¹⁵ m with some interesting discrepancies in recent
measurements. This means that the Schwarzschild radius of the proton is almost
40 orders of magnitude smaller than its physical radius, so we can safely
neglect gravitational effects when studying particle physics.
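
The arithmetic is quickly reproduced in code. The sketch below uses rounded SI
values for G and c, which are inserted here for illustration, together with the
proton mass and radius quoted above.

```python
import math

G = 6.674e-11       # gravitational constant in m^3 kg^-1 s^-2 (rounded)
c = 2.998e8         # speed of light in m/s (rounded)
m_p = 1.67e-27      # proton mass in kg, as used in the text
r_p = 0.84e-15      # proton radius in m, lower value quoted in the text

r_s = 2.0 * G * m_p / c**2                 # Schwarzschild radius of the proton
orders_of_magnitude = math.log10(r_p / r_s)
```

The ratio of the two radii is a few times 10³⁸, in line with the statement that
the Schwarzschild radius is almost 40 orders of magnitude smaller.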

Exercise 3.2. We work with Xi = {t, r, θ, ϕ}, i = 0, 1, 2, 3. We recall
definition (1.71) of the Christoffel symbol. Let us begin with n = 0, then

where, in the first step, we took into account that the metric is diagonal,
therefore g0k is non-zero for k = 0 only. Then we used that the metric
components are independent of time, hence ∂0gij = 0.
Therefore,

The terms in the brackets can only give a non-zero contribution if either j = 0,
i = 1 or j = 1, i = 0, thus

The other components are calculated along the same lines and the results are

Exercise 3.3. We begin by introducing standard spherical polar coordinates
which gives

where we also changed the letter r to ρ in order to avoid confusion with the
coordinate names.
Since our angular variables are already in the correct form, we need to
match up the factors in front of dΩ² = dθ² + sin²θ dϕ². Therefore, we
introduce a new radius as follows:

This implies

A direct calculation also shows that

Hence we find
which shows that Eq. (3.22) really is equivalent to the Schwarzschild metric.

Exercise 3.4. In this question we will only state the results as some previous
exercises were quite explicit about the computations required. Doing these
calculations takes some time and patience. We use coordinates Xi = {t, r, θ,
φ} with i = 0, 1, 2, 3. The non-vanishing Christoffel symbol components are

These yield the following Ricci tensor components

Exercise 3.5. The Einstein tensor is defined by Gab = Rab − gabR/2. So, first
we calculate the Ricci scalar
Applying this to the Ricci tensor components from the previous exercise, we
find

Lastly, we add these four terms together and simplify as much as possible to
arrive at

We will show explicitly how to compute G00 and only state the other results.
Due to the factor of g00 in front of the Ricci scalar we note that R00 and g00R
will have the same prefactor, hence

A particularly nice feature of G00 is that in this component the vacuum
equation G00 = 0 will be independent of A, and therefore will only depend on
the function B. This in turn implies that we can solve this differential
equation and immediately find B. Before doing so, we state the other
components of the Einstein tensor. These are

Exercise 3.6. As already mentioned before, the vacuum equation G00 = 0
only depends on Bʹ and Bʺ, so we can introduce a new variable b = Bʹ to
reduce the order of the differential equation by one, which is then given by

Next, we can introduce a new function defined by b = r/β(r), then (5.138)
becomes

where we multiplied by the factor −β/2. This is a general linear first-order
ODE which we can solve using integrating factors. The solution is given by

where C1 is a constant of integration. Let us now go back to our original
variable B. First,

which we can integrate to find B so that

with C2 being another constant of integration. The metric function is e^{2B}, and
we arrive at

Next, we substitute our solution for B(r) into the equation for G11 and solve
for Aʹ, this gives

Integration yields

which gives us the other metric function to be

The constant C3 can always be set to one by rescaling the time coordinate.
Next, as r becomes very large, the metric should approach Minkowski
spacetime. This implies 2C1C2 = 1 or C2 = 1/(2C1) and hence our metric is

and we are left with one constant of integration which has to be connected to
the mass parameter. We know from the weak field approximation that g11 has
to be approximately 1 + 2M/r + ···. This can only be achieved if we choose
C1 = 1/M, which is indeed the choice consistent with the result given by
(3.22).

Exercise 3.7. We begin by relabelling the time and radial coordinates by
using τ and ρ instead of t and r, respectively. Then the metric becomes

The Schwarzschild metric contains the term r2dΩ2 which motivates the first
coordinate transformation

which after differentiation yields

Next, this needs to be substituted into Eq. (5.148) and we find

Let us now write the metric gab in matrix form to visualise the previous
equation
It is now fairly clear that we aim to find coordinates which diagonalise this
matrix. Since our angular coordinates and also the radial coordinate are
already in the correct form, we must introduce a new time coordinate. Let us
try the following transformation

where the prime is the derivative of f with respect to r. Then our metric
becomes

In order to make this line element diagonal, we need to choose fʹ(r) such that
the dt dr term vanishes. This means

One can integrate this directly, however, we do not need to know f(r)
explicitly as only fʹ(r) enters the transformed metric. Lastly, we work out the
transformed dr2 term for which we get

Therefore, we arrive at the final result


which is indeed the Schwarzschild metric, provided we set µ = 4M.

Exercise 3.8. The field equations (3.10) and (3.11) with cosmological
constant become

Addition of both equations yields A = −B as in the Schwarzschild case. Now,
the first equation is equivalent to

where C is a constant of integration. We set C = 2M and solve for e^{−B}, which
results in

This is the Schwarzschild–de Sitter or Kottler solution.
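For orientation, the integration step can be sketched as follows. This assumes that the first field equation has been brought to the total-derivative form familiar from the Schwarzschild case, and that the constant C enters with the sign shown; both are assumptions of this sketch rather than a transcription of the displayed equations.

```latex
\begin{align}
  \bigl(r\,e^{-B}\bigr)' &= 1 - \Lambda r^{2}, \\
  r\,e^{-B} &= r - \frac{\Lambda}{3}\,r^{3} - C, \\
  e^{-B} &= 1 - \frac{2M}{r} - \frac{\Lambda}{3}\,r^{2}
  \qquad (C = 2M).
\end{align}
```

With A = −B, the function e^{A} takes the same form, which is the usual way the Kottler metric is quoted.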

The Schwarzschild interior solution

Exercise 3.9. Without loss of generality we assume that k = 1. Since the energy
density ρ0 cannot be negative, one can always achieve that value by rescaling
the radial coordinate. Let us introduce a new coordinate via r = sin χ which
gives

On the other hand, let us start with the definition of the 3-sphere S³. We choose
coordinates x^i, i = 1, 2, 3, 4, and have
We parameterise the 3-sphere using three angles as follows:

A direct but lengthy calculation yields

Hence, we showed that the spatial part of the Schwarzschild interior solution
has the geometry of a 3-sphere.
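The "direct but lengthy calculation" can be cross-checked numerically. The sketch below assumes one common hyperspherical parameterisation of the unit 3-sphere (the text's own choice of angles may be ordered differently) and pulls back the Euclidean metric of R⁴ with finite differences; the expected result is diag(1, sin²χ, sin²χ sin²θ).

```python
import math


def embed(chi: float, theta: float, phi: float) -> list[float]:
    """Unit 3-sphere in R^4 in hyperspherical angles (a common convention, assumed here)."""
    return [
        math.sin(chi) * math.sin(theta) * math.cos(phi),
        math.sin(chi) * math.sin(theta) * math.sin(phi),
        math.sin(chi) * math.cos(theta),
        math.cos(chi),
    ]


def induced_metric(angles: tuple[float, float, float], h: float = 1e-6) -> list[list[float]]:
    """Pull back the Euclidean metric of R^4 using central finite differences."""
    tangents = []
    for k in range(3):
        plus = list(angles)
        minus = list(angles)
        plus[k] += h
        minus[k] -= h
        xp, xm = embed(*plus), embed(*minus)
        tangents.append([(p - m) / (2 * h) for p, m in zip(xp, xm)])
    return [[sum(a * b for a, b in zip(tangents[i], tangents[j])) for j in range(3)]
            for i in range(3)]


chi, theta, phi = 0.7, 1.1, 0.3   # arbitrary sample point away from the poles
g = induced_metric((chi, theta, phi))
expected = [1.0, math.sin(chi) ** 2, (math.sin(chi) * math.sin(theta)) ** 2]
for i in range(3):
    print([round(g[i][j], 6) for j in range(3)], "expected diagonal:", round(expected[i], 6))
```

The off-diagonal entries should vanish up to finite-difference noise, matching the line element dχ² + sin²χ dΩ².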

Exercise 3.10. The field equations with cosmological term are Gab + Λgab =
8πTab. This means we can readily state the a = b = 0 field equation by adding
the term Λg00 = −Λe^A to the left-hand side of Eq. (3.24). This gives

We move the Λ to the right-hand side and can jump to Eq. (3.28) which
becomes

We can now integrate this equation and solve for e^{−B}. Using the standard
mass definition m(r) = 4πρ0r³/3 we find

which we can also write as 1 − kr² with k = 8πρ0/3 + Λ/3.

Exercise 3.11. For the Schwarzschild interior solution, ρ(r) = ρ0 = const. and
m(r) = (4π/3)ρ0 r³ so that

using standard integration techniques starting with (8π/3)ρ0r² = sin²u.
Extra material: Interestingly, making a series expansion in M yields

Hence, in lowest-order approximation, both mass definitions agree.
Moreover, we can interpret the quantity Mp − M as the gravitational binding
energy of the star.
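The binding-energy interpretation can be illustrated numerically. The sketch below assumes the proper mass is the integral of 4πρ0 r² / √(1 − 2m(r)/r) over the star, works in geometric units, and uses a hypothetical weak-field star with M/R = 0.01. It compares Mp − M with the Newtonian binding energy of a uniform sphere, 3M²/(5R).

```python
import math


def proper_mass(rho0: float, radius: float, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of M_p = int_0^R 4 pi rho0 r^2 / sqrt(1 - 2 m(r)/r) dr."""
    k = 8.0 * math.pi * rho0 / 3.0          # 2 m(r)/r = k r^2 for constant density
    dr = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * rho0 * r * r / math.sqrt(1.0 - k * r * r)
    return total * dr


# Hypothetical weak-field star in geometric units (G = c = 1): M/R = 0.01.
R = 1.0
M = 0.01
rho0 = M / (4.0 * math.pi * R**3 / 3.0)

M_p = proper_mass(rho0, R)
binding = M_p - M
newtonian = 3.0 * M**2 / (5.0 * R)
print(f"M_p - M = {binding:.6e}, 3M^2/(5R) = {newtonian:.6e}")
```

For such a weakly relativistic star the two numbers should agree to about a percent; the residual difference comes from higher orders in M/R.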

Exercise 3.12. For the desired result to be true, we need to show that the
integrand is 4πρ0r². This is a somewhat laborious task. We recall Eqs. (3.32)
and (3.39). We also need Eq. (3.45), however, we need to choose the constant
differently to ρ0 + pc. That particular choice was motivated because it gives e^{A(0)}
= 1 at the origin. The exterior metric is the Schwarzschild metric, which satisfies e^A
= e^{−B}. We will choose our constant accordingly as our mass definition
measures the mass at infinity. This gives

and leads to the choice C2 = (1 − 2M/R).


Next, we will begin this calculation as follows:
We used the notation as shorthand for the entire square root expression
. At r = R this becomes = . Therefore

and we arrive at

From Eq. (3.42) we can solve for which gives

so that we can finally write

which proves the statement about the integrand. Therefore M∞ = M.


Extra task: The motivated reader may wish to prove a more general
statement, namely that M = M∞ is implied by the field equation. This means
that this result is not specific to the Schwarzschild interior solution but
applies to all such solutions, not just the constant density case, see Lightman
et al. (1975, Problem 16.24).

Geodesics in Schwarzschild spacetime


Exercise 3.13. In Lightman et al. (1975), the authors suggest working with
outgoing Eddington–Finkelstein coordinates, which are particularly suited to
this problem. However, one can also attack this more directly working with
the usual Schwarzschild coordinates. In Eq. (3.98), we found the redshift of
the Schwarzschild spacetime. Assuming the observer to be far away from the
Schwarzschild radius, we have

where re (the radial location of the emitter) is the position of the radio
commentator. We will now find re(t) using the geodesic equations (3.51) and
(3.54) with L = −1 (massive particle, the radio commentator is not massless,
his signal is) and l = 0 because we consider radial motion only. Then

where we expanded near r = 2M. The term linear in r is independent of E.


Separation of variables and integration yield

where t0 is a constant of integration. We now have

from which we can find the approximate relationship exp((t + t0)/2M) ≈ 1 − 2M/r.
Substitution of this result into Eq. (5.183) gives

Therefore, we conclude µ = 4M.

Exercise 3.14. We are dealing with a massive particle on a radial geodesic,
and so we can set l = 0 and L = −1. Then we are left with one geodesic
equation, namely Eq. (3.54) which for this case becomes

Our initial condition ṙ = 0 at r = 10M fixes the constant E such that

Consequently, we are left with

where we chose the negative sign as the particle is moving towards the
centre. The proper time λ0 required to reach the centre is therefore given by

In order to evaluate this integral, we use the substitution r = 10M sin²x, dr
= 20M sin x cos x dx, and get

Here we neglected the constant of integration as we will evaluate a definite
integral. Now, r = 0 corresponds to x = 0, while r = 10M corresponds to x =
π/2. Therefore, λ0 = 5√5 πM ≈ 35.1M.
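As a numerical cross-check of this integral, the sketch below assumes the radial equation takes the standard form ṙ² = 2M/r − 2M/r0 for a particle released from rest at r0 = 10M, applies the substitution r = r0 sin²x, and compares a midpoint-rule estimate with the closed form (π/2) r0^{3/2}/√(2M). Times are expressed in units of M.

```python
import math


def infall_proper_time(r0_over_m: float, steps: int = 20_000) -> float:
    """Proper time (in units of M) to fall radially from rest at r0 to r = 0.

    Uses r = r0 sin^2(x), which turns the integral of dr / sqrt(2M/r - 2M/r0)
    into sqrt(r0 / 2M) * 2 r0 * integral of sin^2(x) dx on [0, pi/2].
    """
    prefactor = math.sqrt(r0_over_m / 2.0) * 2.0 * r0_over_m
    dx = (math.pi / 2.0) / steps
    total = sum(math.sin((i + 0.5) * dx) ** 2 for i in range(steps))
    return prefactor * total * dx


tau_numeric = infall_proper_time(10.0)
tau_closed = 5.0 * math.sqrt(5.0) * math.pi   # (pi/2) * r0^(3/2) / sqrt(2M) for r0 = 10M
print(f"numeric: {tau_numeric:.6f} M, closed form: {tau_closed:.6f} M")
```

Both values should come out close to 35.1 M, a finite proper time even though the coordinate time to reach r = 2M diverges.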

Exercise 3.15. For Λ ≤ 0 the metric components are clearly regular. However,
since we are working with spherical coordinates, which do not cover the
entire sphere, we still need to be cautious at the poles.
For Λ > 0 the metric is singular if 1 − (Λ/3)r² = 0, which we use to define rΛ
= √(3/Λ).
The most efficient way to find the geodesic equations is to start with the
Lagrangian

If we now compare with Eq. (3.48), we notice that the only difference is the
functional form of f(r), otherwise we can use the same equations that
followed. In particular, we have

Hence, the final equation is derived from (3.53) with l = 0 (radial motion
through the origin) and L = −1; this yields

which we can solve for ṙ to get

We can fix the constant E by using the initial condition that the velocity at the
origin is v, this gives

Separation of variables and integration yields

where λ0 is a constant of integration. We can set λ0 = 0 which means that r(0)
= 0. Therefore, r(λ) is an increasing function which passes through all values
of the radial variable and so reaches rΛ when λ = rΛ arcsinh(1/v). This is
analogous to the Schwarzschild radius: geodesics are well defined there. However,
when working with coordinate time we note that this would diverge as r →
rΛ.
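The quoted value of λ can be checked with a short script. The sketch assumes that the radial equation reduces to ṙ² = v² + r²/rΛ² (that is, E² = 1 + v²) and that its solution with r(0) = 0 is r(λ) = rΛ v sinh(λ/rΛ); both are reconstructions consistent with the arcsinh result above rather than quotations from the text. The numerical values of rΛ and v are placeholders.

```python
import math

R_LAMBDA = 1.0   # r_Lambda = sqrt(3 / Lambda); set to 1 as a unit of length
V = 0.25         # hypothetical initial velocity dr/dlambda at the origin


def r_of_lambda(lam: float) -> float:
    """Candidate radial geodesic r(lambda) = r_Lambda * v * sinh(lambda / r_Lambda)."""
    return R_LAMBDA * V * math.sinh(lam / R_LAMBDA)


def residual(lam: float, h: float = 1e-5) -> float:
    """Residual of rdot^2 = v^2 + r^2 / r_Lambda^2, with rdot from central differences."""
    rdot = (r_of_lambda(lam + h) - r_of_lambda(lam - h)) / (2.0 * h)
    return rdot**2 - (V**2 + (r_of_lambda(lam) / R_LAMBDA) ** 2)


lam_horizon = R_LAMBDA * math.asinh(1.0 / V)
max_residual = max(abs(residual(0.1 * k)) for k in range(20))
print("max residual:", max_residual)
print("r at lambda = r_Lambda * arcsinh(1/v):", r_of_lambda(lam_horizon))
```

The residual should be at the level of finite-difference noise, and the second line should print a value equal to rΛ, reached at finite affine parameter.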

Testing General Relativity — the classical tests

Exercise 3.16. The standard equation of an ellipse is generally written in the
form x²/a² + y²/b² = 1 where a and b are the semi-major and semi-minor axes
which were chosen to coincide with the Cartesian axes. In order to show that
Eq. (3.66) is indeed the equation of an ellipse, we need to consider the most
general ellipse. Its centre does not have to coincide with the origin and
moreover the major and minor axes do not have to be parallel to the Cartesian
axes.
Recall that the sum of the distances from any point P on an ellipse to
the two focal points F1, F2 is constant, and equal to the major axis. This
means we can define an arbitrary ellipse by |PF1| + |PF2| = 2a. We will now
choose the origin to coincide with the focal point F1. Pythagoras' theorem
gives that the distance f between the centre of the ellipse and the focal points
satisfies f² + b² = a². Let r be the position vector of the point P, then we can
write our ellipse with one focal point at the origin as |r| + |r − f| = 2a,
where the vector f is the position vector of the other focal point. Now we introduce
polar coordinates so that we write r = (r cos(ϕ), r sin(ϕ)) and the vector f = 2f (cos(φ0),
sin(φ0)). Here φ0 is the angle which determines the direction of the second
focal point. Since |r| = r we arrive at

Expanding out the left-hand side gives 4a² − 4ar + r² so that the r² terms
cancel, this yields

Let us state Eq. (3.66) explicitly again, this reads

Finally, we are able to identify the parameters accordingly. We must choose
ϕ0 = −φ0 + π to get the correct sign, and

Hence, we are indeed dealing with an ellipse.
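The intermediate algebra can be sketched as follows. The sketch assumes that the second focal point lies at distance 2f from the origin in the direction φ0 (written \phi_0 below, with \varphi the polar angle of P); the identification with Eq. (3.66) then only requires matching the prefactor and the eccentricity f/a, with the sign of the cosine absorbed by a shift of the reference angle by π.

```latex
\begin{align}
  |\mathbf{r}-\mathbf{f}|^{2} &= (2a-r)^{2}, \\
  r^{2} - 4fr\cos(\varphi-\phi_{0}) + 4f^{2} &= 4a^{2} - 4ar + r^{2}, \\
  r\,\bigl(a - f\cos(\varphi-\phi_{0})\bigr) &= a^{2}-f^{2} = b^{2}, \\
  \frac{1}{r} &= \frac{a}{b^{2}}\Bigl(1 - \frac{f}{a}\cos(\varphi-\phi_{0})\Bigr).
\end{align}
```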

Exercise 3.17. We will use a fairly rough estimate here. The radius of Jupiter
is roughly one-tenth of the Sun’s radius, while the mass of Jupiter is three
orders of magnitude smaller. Since the deflection angle of a grazing light ray
scales as M/R, light deflection near Jupiter is about two orders of magnitude
weaker than light deflection near the Sun.
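The rough estimate can be made concrete with the leading-order deflection angle 4GM/(c²R) for a ray grazing the surface. This is a sketch only; the masses, radii and constants below are rounded reference values and are assumptions of this example, not figures from the text.

```python
import math

G = 6.674e-11   # m^3 kg^-1 s^-2
C = 2.998e8     # m/s
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

# Rounded reference values (assumptions of this sketch, not taken from the text).
BODIES = {
    "Sun": {"mass_kg": 1.989e30, "radius_m": 6.957e8},
    "Jupiter": {"mass_kg": 1.898e27, "radius_m": 6.991e7},
}


def grazing_deflection_arcsec(mass_kg: float, radius_m: float) -> float:
    """Leading-order deflection 4 G M / (c^2 R) of a ray grazing the surface."""
    return 4.0 * G * mass_kg / (C**2 * radius_m) * RAD_TO_ARCSEC


angles = {name: grazing_deflection_arcsec(**body) for name, body in BODIES.items()}
for name, angle in angles.items():
    print(f"{name}: {angle:.4f} arcsec")
print(f"ratio Sun/Jupiter: {angles['Sun'] / angles['Jupiter']:.0f}")
```

The Sun's value should come out near the classic 1.75 arcseconds, with Jupiter's roughly a hundred times smaller.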

Exercise 3.18. It is well known that series can be added, multiplied, etc. So
we should consider the different terms separately. Using the binomial series
(1 + x)^α = 1 + αx + ··· for the first term
we get

Next, we consider the term

where we used the first approximation again. In order to evaluate the entire
second term in the integral of Eq. (3.104) we write it as follows:

Therefore, the complete integrand gives


This final equation is equivalent to Eq. (3.105).

The Schwarzschild radius

Exercise 3.19. From u = t − f(r) we get dt = du + f′dr and dt² = du² + 2f′du
dr + (f′)²dr². Then the de Sitter metric becomes

We can now choose f(r) so that the dr² term vanishes, then the du dr term
will reduce to a constant. We select

and our de Sitter metric in the new coordinates becomes


Clearly, this metric is now regular at r = √(3/Λ), as the metric and inverse
metric are both regular near this radius. However, this metric is not regular
everywhere. This is due to our use of spherical coordinates which cannot be
used to describe the entire sphere; we face the usual problems near the poles.
To find the explicit form of f(r), we need to integrate Eq. (5.212) which
gives

where we neglected the constant of integration.
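A sketch of this integration, assuming the choice f′ = 1/(1 − Λr²/3) (the opposite overall sign works equally well and only flips the sign of the du dr term), valid for r below √(3/Λ):

```latex
\begin{equation}
  f(r) = \int \frac{dr}{1 - \tfrac{\Lambda}{3} r^{2}}
       = \sqrt{\frac{3}{\Lambda}}\,\operatorname{artanh}\!\Bigl(\sqrt{\tfrac{\Lambda}{3}}\;r\Bigr)
       = \frac{1}{2}\sqrt{\frac{3}{\Lambda}}\,
         \ln\!\left(\frac{1 + \sqrt{\Lambda/3}\;r}{1 - \sqrt{\Lambda/3}\;r}\right).
\end{equation}
```

The logarithmic divergence of f at r = √(3/Λ) mirrors the behaviour of the tortoise coordinate at the Schwarzschild radius.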

Exercise 3.20. The incoming Eddington–Finkelstein coordinate v is given by

so that

Then the Schwarzschild metric becomes

Exercise 3.21. From Eq. (3.117) and the previous exercise we have
so that by adding and subtracting these equations we find

We square both equations and multiply by the factor in the Schwarzschild
metric to get

Next we combine these and find

Finally, the Schwarzschild metric becomes

where r is defined implicitly by the equation


Exercise 3.22. We begin with

and likewise

Therefore, we can compute

Strictly speaking this is only valid for r > 2M. If r < 2M we need to be
careful with the exponential of the logarithm. Also, for r < 2M we notice
from Eq. (3.19) that the roles of the radial and time coordinate change.
Hence, for r < 2M the coordinate transformation is given by

5.4. Solutions: Cosmology


Classical and Modern Cosmology

Exercise 4.1.
(i) Probably the simplest way out of this paradox is to assume that the
universe is of finite age and that only finitely many stars can be observed
in a given volume. If the number density of stars is sufficiently low, the
night sky will be dark.
(ii) We could also consider an eternal universe by taking into account that
stars themselves only have a finite lifetime. So, we would have a
situation where new stars appear and old stars disappear. Provided the
number density of the stars stays reasonably low, we would again have a
dark sky at night.
(iii) Another possibility would be to have an eternal and infinite universe that
is expanding. Due to the resulting redshift of the light, only a finite
amount of light can reach an observer and therefore the night sky would
again be dark.

Exercise 4.2. Let us consider an eternal universe. The second law of
thermodynamics states that entropy cannot decrease, Ṡ ≥ 0, which means that
heat will flow from warmer to cooler objects. Therefore, a universe of infinite
age should already have achieved thermal equilibrium. Clearly this is not the
case as the universe contains various structures of differing temperatures.
Therefore, the universe cannot be infinitely old.

Exercise 4.3. As the calculation is straightforward and explicit examples
were provided in earlier parts, only the results will be stated. Using X^a = {t, ρ,
θ, φ}, a = 0, 1, 2, 3, the non-vanishing Christoffel symbol components are
Exercise 4.4. Substitution of the Christoffel symbol components given by
(5.233) into Eq. (1.164) and summing over the indices d and b yields

Therefore, we can compute the Ricci scalar as follows:

which matches Eq. (4.16).

Exercise 4.5. The line element of S² is ds² = dΩ² = dθ² + sin²θ dϕ², so that
det gij = sin²θ. Hence

For the 3-sphere we found ds² = dχ² + sin²χ dΩ² = dχ² + sin²χ (dθ² + sin²θ
dϕ²), therefore det gij = sin⁴χ sin²θ and the volume will be given by

We start to recognise the pattern for higher-dimensional spheres, let us
denote the line element of Sⁿ by , then we find

So, for the 4-sphere we just need another angle ξ, say, and we have
so that det gij = sin⁶ξ sin⁴χ sin²θ. Hence

For the final part of the exercise, let us denote I_n = ∫_0^π sinⁿx dx, then we
rewrite the previous results as V(S²) = 2πI_1, V(S³) = 2πI_2I_1, and V(S⁴) =
2πI_3I_2I_1. So for Sⁿ we would have to evaluate

The integrals I_n are well known and there exist various formulae to find them
explicitly. They are the reason for the appearance of the gamma function.
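The product pattern can be checked numerically against the gamma-function formula. In the sketch below, the integrals I_n = ∫_0^π sinⁿx dx are evaluated with a simple midpoint rule, and the closed form 2π^{(n+1)/2}/Γ((n+1)/2) for the unit n-sphere is the standard result, quoted here as a known fact rather than taken from the text.

```python
import math


def sine_power_integral(n: int, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of I_n = int_0^pi sin(x)^n dx."""
    dx = math.pi / steps
    return sum(math.sin((i + 0.5) * dx) ** n for i in range(steps)) * dx


def sphere_volume_from_integrals(n: int) -> float:
    """V(S^n) = 2 pi * I_1 * I_2 * ... * I_{n-1}, following the pattern in the text."""
    volume = 2.0 * math.pi
    for k in range(1, n):
        volume *= sine_power_integral(k)
    return volume


def sphere_volume_gamma(n: int) -> float:
    """Closed form 2 pi^((n+1)/2) / Gamma((n+1)/2) for the unit n-sphere."""
    half = (n + 1) / 2.0
    return 2.0 * math.pi**half / math.gamma(half)


for n in range(2, 6):
    print(n, round(sphere_volume_from_integrals(n), 6), round(sphere_volume_gamma(n), 6))
```

For n = 2, 3, 4 the two columns should reproduce 4π, 2π² and 8π²/3 respectively.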

Exercise 4.6. For a matter-dominated universe (w = 0) with Λ = 0 and k = −1,
solving the field equations means solving Eq. (4.27) which in this case
becomes

Instead of the trigonometric substitution used for the k = 1 case, we must use
hyperbolic functions here. Start with

so that our integral becomes


As in the k = 1 case we cannot find a(t) explicitly but can write the solution in
parametric form

This solution expands indefinitely, which can be seen as follows. For u ≫ 1
we have cosh u = (e^u + e^{−u})/2 ≈ e^u/2 and sinh u = (e^u − e^{−u})/2 ≈ e^u/2,
and also sinh u ≫ u for u ≫ 1. Then

and therefore we can conclude that a(t) ≈ t for late times. We can see this
behaviour in Fig. 4.1, a(t) approaches the diagonal asymptotically.
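The late-time behaviour can be illustrated with a few sample values. The sketch assumes the parametric solution has the standard form a = A(cosh u − 1), t = A(sinh u − u), where A stands for the constant 4πρ0/3; the value A = 1 used below is a placeholder. It also checks that this form satisfies ȧ² = 2A/a + 1, the k = −1 Friedmann equation for dust in these units.

```python
import math

A = 1.0  # stands for 4*pi*rho_0/3 in the text's units; the value is a placeholder


def scale_factor(u: float) -> float:
    """Scale factor a(u) = A (cosh u - 1)."""
    return A * (math.cosh(u) - 1.0)


def cosmic_time(u: float) -> float:
    """Cosmic time t(u) = A (sinh u - u)."""
    return A * (math.sinh(u) - u)


def friedmann_residual(u: float) -> float:
    """Residual of (da/dt)^2 = 2A/a + 1, using da/dt = (da/du) / (dt/du)."""
    a_dot = math.sinh(u) / (math.cosh(u) - 1.0)
    return a_dot**2 - (2.0 * A / scale_factor(u) + 1.0)


for u in (1.0, 5.0, 10.0, 20.0):
    ratio = scale_factor(u) / cosmic_time(u)
    print(f"u = {u:5.1f}  a/t = {ratio:.8f}  residual = {friedmann_residual(u):.1e}")
```

The ratio a/t should start well above one and approach one as u grows, which is the statement a(t) ≈ t at late times.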

Exercise 4.7. Equation (4.39) is independent of the pressure and therefore
remains unchanged. On the other hand, Eq. (4.23) now becomes Λ = 4π(ρE +
3pE). The presence of pressure changes the value of the cosmological
constant of the Einstein static universe.

Exercise 4.8. Throughout this calculation we assume quantities like δa to be
small, and neglect terms higher than linear order. Also, we will include
pressure and the equation of state p = wρ in order to also answer the next
exercise as the approach is identical. We begin with
Then the conservation equation (4.24) becomes

which we can integrate to find that δρ ∝ −δa(1+ w). Equation (4.23) gives

This implies

which is a standard second-order equation with exponential solutions if (1 +
3w)(1 + w) > 0. For w = 0 we have which implies instability. We can
find oscillating solutions if (1 + 3w)(1 + w) < 0 which implies the range −1 <
w < −1/3. Unfortunately, these values of the equation of state parameter do not
describe a physical matter source. Matter of this type is often referred to as
dark energy.

Exercise 4.9. See previous exercise.

Exercise 4.10. Inspection of Eq. (4.42) suggests the substitution


which yields

which we can solve for a. This immediately leads to the result.


Physical cosmology

Exercise 4.11. The function E(z) for a spatially flat, matter-dominated
universe without Λ is given by E(z) = √(Ωm0 (1 + z)³). Since there are no
other matter sources, we have Ωm0 = 1 so that

This looks similar to the distance of the particle horizon given by Eq. (4.104).
This is not unexpected when comparing Eq. (4.103) with Eq. (4.78).

Exercise 4.12. As in the previous exercise, we have E(z) = (1 + z)^{3/2}, hence

This leads to

which is in agreement with our earlier result given by Eq. (4.80).
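Integrals over 1/E(z) of the kind appearing in this exercise and the previous one are easy to check numerically. The sketch below assumes the two quantities of interest are the line-of-sight comoving distance ∫dz/E(z) and the lookback time ∫dz/((1 + z)E(z)), both in units set by 1/H0, and compares a midpoint-rule estimate with the closed forms that follow from E(z) = (1 + z)^{3/2}. The function names and the sample redshift are illustrative only.

```python
import math


def e_of_z(z: float) -> float:
    """E(z) = (1 + z)^(3/2) for a flat, matter-only universe (Omega_m0 = 1)."""
    return (1.0 + z) ** 1.5


def integrate(func, lower: float, upper: float, steps: int = 20_000) -> float:
    """Plain midpoint rule; adequate for these smooth integrands."""
    dz = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * dz) for i in range(steps)) * dz


def comoving_distance(z: float) -> float:
    """Line-of-sight comoving distance in units of the Hubble length 1/H0."""
    return integrate(lambda x: 1.0 / e_of_z(x), 0.0, z)


def lookback_time(z: float) -> float:
    """Lookback time in units of the Hubble time 1/H0."""
    return integrate(lambda x: 1.0 / ((1.0 + x) * e_of_z(x)), 0.0, z)


z = 3.0
print(comoving_distance(z), 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z)))
print(lookback_time(z), (2.0 / 3.0) * (1.0 - (1.0 + z) ** -1.5))
```

In the limit of large z the lookback time tends to 2/3 of the Hubble time, the familiar age of a flat matter-dominated universe.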

Exercise 4.13. Starting with the definition Ω = 8πρ/(3H²), we get

In the radiation-dominated epoch Since


Λ = 0 we have ΩΛ = 0, and

Putting all this together gives


Now we solve for Ω and find

Therefore A = 2.
Next, from Eq. (5.258) we have

Then, using the previous result we have

which is the desired result.

Exercise 4.14. Recall the solution (4.38) for the k = +1 case. First, we will fix
the constant t0 by requiring that a(0) = 0, which implies The other
time value where the scale factor vanishes is The angle α
travelled for the geodesics is

Using a standard trigonometric substitution, we arrive at

which is exactly halfway around the universe.


Exercise 4.15. As in the previous exercise we have to evaluate

with a(t) implicitly defined in Eqs. (4.32) and (4.33). Therefore, this is less
straightforward than the radiation-dominated case. However, things will
become very simple. Let us change the integration variable from t to the
parameter u. We have dt = (4πρ0/3)(1−cos(u))du from
Eq. (4.33) while a(u) is given by Eq. (4.32). This gives

and we are left to identify the parameter values correctly. As already
discussed in the main text, t = 0 corresponds to u = 0 while t_end is attained
when u = 2π. This implies that α = 2π, which is exactly all the way around the
universe.
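Written out, the change of variables makes the integrand trivial. The sketch assumes that Eq. (4.32) has the form a(u) = (4πρ0/3)(1 − cos u), which is consistent with the expression for dt quoted above:

```latex
\begin{equation}
  \alpha = \int_{0}^{t_{\mathrm{end}}} \frac{dt}{a(t)}
         = \int_{0}^{2\pi} \frac{(4\pi\rho_{0}/3)\,(1-\cos u)}{(4\pi\rho_{0}/3)\,(1-\cos u)}\,du
         = 2\pi .
\end{equation}
```

The parameter u therefore plays the role of conformal time, along which light rays advance by equal angles in equal steps.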

Exercise 4.16. By definition we have

Therefore,

With dL ≈ dA for z = 10⁻³, and dA = R/θ we get


Inflation

Exercise 4.17. Start with The cosmological field equations with


scalar field source can be written as

Their sum gives

On the other hand, from (5.271) we can solve for and get

Accordingly, we have F(H, ϕ) = 3H² − 8πV(ϕ).


When V = V0 we can separate variables in (5.273) and get

where t0 is the constant of integration. This gives

To find a(t) we recall Now, we integrate H(t) and find

and so
where a0 is another constant of integration. When t ≫ t0 we can write cosh(x)
≈ 1/2 exp(x), therefore,

Exercise 4.18. The scalar field action is

and for now we are only interested in variations with respect to the metric.
Recalling that

We rearrange the terms and find

Therefore,
Now, if ϕ = ϕ(t) then ∇aϕ = ∂aϕ which is non-zero if and only if the
derivative with respect to time is considered.

Exercise 4.19. This exercise contains in fact two parts. First, we need to find
the Klein–Gordon equation for an arbitrary metric and then consider the
Friedmann–Lemaître–Robertson–Walker metric. Our Euler–Lagrange
equations are given by Eq. (1.229) with covariant derivatives. Hence

Next, we must compute the explicit form of □ϕ. We have

where we used the Christoffel symbol components (5.233). This gives the final
result

Exercise 4.20. Start by differentiating Eq. (4.127) with respect to time. This
gives

using the product rule on the left and the chain rule on the right. We solve for
3 and arrive at

We can solve Eq. (4.127) for and substitute to get

Now, using Eq. (4.126), we see that the first term contains η and the second
term contains −ϵ. Now we find the desired equation

The slow-roll approximation requires ≪ 1 and therefore, we note that η ≪
1 will guarantee that this is indeed satisfied.
Bibliography

Abbott, B. P. et al. (2016). Observation of gravitational waves from a binary black hole merger, Phys.
Rev. Lett. 116, 061102.
Ade, P. A. R. et al. (2015). Planck 2015 results. XIII. Cosmological parameters, arXiv:1502.01589.
Aldrovandi, R. and Pereira, J. G. (2013). Teleparallel Gravity — An Introduction (Springer,
Heidelberg).
Ashtekar, A., Berger, B. K., Isenberg, J. and MacCallum, M. (2015). General Relativity and
Gravitation — A Centennial Perspective (Cambridge University Press, Cambridge).
Bishop, R. L. and Goldberg, S. I. (1980). Tensor Analysis on Manifolds (Dover Publications, New
York).
Blagojevic, M. (2001). Gravitation and Gauge Symmetries (CRC Press, Taylor & Francis, New York).
Blagojevic, M. and Hehl, F. W. (2012). Gauge Theories of Gravitation — A Reader with Commentaries
(Imperial College Press, World Scientific, London).
Choquet-Bruhat, Y. (2008). General Relativity and the Einstein Equations (Oxford University Press,
Oxford).
do Carmo, M. (1992). Riemannian Geometry (Birkhäuser, Boston).
Dodelson, S. (2003). Modern Cosmology (Academic Press, San Diego).
Eisenhart, L. P. (1997). Riemannian Geometry (Princeton University Press, Princeton).
Escher, M. C. (2015). Gallery — recognition & success, http://www.mcescher.com/gallery/recognition-success/.
Last accessed on 30th June 2016.
Felice, A. D. and Tsujikawa, S. (2010). f(R) Theories, Living Rev. Relativ. 13, 3, doi:10.1007/lrr-2010-3,
http://www.livingreviews.org/lrr-2010-3. Last accessed on 30th June 2016.
Frankel, T. (2012). The Geometry of Physics (Cambridge University Press, Cambridge).
Gorbunov, D. S. and Rubakov, V. A. (2011). Introduction to the Theory of the Early Universe:
Cosmological Perturbations and Inflationary Theory (World Scientific, Singapore).
Hawking, S. W. and Ellis, G. F. R. (1973). The Large Scale Structure of Space-Time (Cambridge
University Press, Cambridge).
Hogg, D. W. (1999). Distance measures in cosmology, eprint astro-ph/9905116.
Isham, C. J. (2001). Modern Differential Geometry for Physicists (World Scientific, Singapore).
Liddle, A. R. (2015). An Introduction to Modern Cosmology (John Wiley & Sons, Chichester).
Lightman, A. P., Press, W. H., Price, R. H. and Teukolsky, S. A. (1975). Problem Book in Relativity
and Gravitation (Princeton University Press, Princeton), http://www.nrbook.com/relativity/. Last
accessed on 30th June 2016.
Maggiore, M. (2007). Gravitational Waves — Volume 1: Theory and Experiments (Oxford University
Press, Oxford).
Maluf, J. W. (2013). The teleparallel equivalent of general relativity, Annalen Phys. 525, 339–357.
Misner, C. W., Thorne, K. S. and Wheeler, J. A. (1973). Gravitation (W. H. Freeman and Company,
San Francisco).
Nakahara, M. (2003). Geometry, Topology and Physics (CRC Press, Taylor & Francis, New York).
NASA (1999). Mars climate orbiter, http://mars.jpl.nasa.gov/msp98/orbiter/. Last accessed on 30th
June 2016.
Rowan-Robinson, M. (2004). Cosmology (Oxford University Press, Oxford).
Ryan, M. P. and Shepley, L. C. (1975). Homogeneous Relativistic Cosmologies (Princeton University
Press, Princeton), http://wwwrel.ph.utexas.edu/Members/larry/RyanShepley.pdf. Last accessed on
30th June 2016.
Sotiriou, T. P. and Faraoni, V. (2010). f(R) Theories of gravity, Rev. Mod. Phys. 82, 451–497.
Stephani, H., Kramer, D., MacCallum, M. A. H., Hoenselaers, C. and Hertl, E. (2003). Exact Solutions
of Einstein’s Field Equations (Cambridge University Press, Cambridge).
Wald, R. M. (1984). General Relativity (The University of Chicago Press, Chicago).
Weinberg, S. (1972). Gravitation and Cosmology: Principles and Applications of the General Theory
of Relativity (John Wiley & Sons, New York).
Weinberg, S. (2008). Cosmology (Oxford University Press, Oxford).
Will, C. M. (2014). The confrontation between general relativity and experiment, Living Rev. Relativ.
17, 4, doi:10.1007/lrr-2014-4, http://www.livingreviews.org/lrr-2014-4. Last accessed on 30th June
2016.
Zwiebach, B. (2009). A First Course in String Theory (Cambridge University Press, Cambridge).
Index

A
affine parameter, 10
angular diameter distance, 176, 178
arc length, 26
autoparallel, 41

B
Bianchi identity, 49
twice contracted, 55
Birkhoff’s theorem, 112
black hole, 113, 117, 123, 144
Buchdahl inequality, 121

C
chart, 19
Christoffel symbol, 27
trace, 33
transformation, 33
classical tests, 127
gravitational redshift, 136
light deflection, 132
perihelion precession, 127
radar echo delay, 138
conformal tensor, 53
conservation equation
charge, 77
cosmological, 161
energy, 78
energy–momentum, 87
momentum, 78
contravariant vector, 13
coordinate system, 19
coordinate transformation, 12
cosmological constant, 160
cosmological principle, 153, 155
cosmology, 153
covariant derivative, 34, 37
covariant vector, 13
critical density, 171, 183
cross product, see vector product
curl, 15
curvature, 42
curve, 10
cylindrical coordinates, 24

D
dark energy, 183
dark matter, 183
de Sitter solution, 168
deceleration parameter, 170
decoupling, see recombination
density parameter, 171
differential, 15
directional derivative, 15
distance redshift relations, 177
div, 15
dual basis vector, 15
dual vector space, 14

E
e-foldings, 194
Eddington–Finkelstein coordinates, 142
Einstein field equations, 88, 103
Einstein summation convention, 8
Einstein tensor, 54
Einstein–Hilbert action, 100, 102
energy-momentum tensor, 78
electromagnetic field, 82
ideal fluid, 79
variational, 103
equivalence principle, 69
Euler equation, 81
Euler–Lagrange equations, 58
event, 70
event horizon, 144

F
Faraday tensor, 18, 76
field equations
linearised, 95
cosmological, 89, 160–161
Einstein, 88
flatness problem, 185, 189
Friedmann–Lemaître–Robertson–
Walker metric, 157

G
geodesic, 28, 40–41
geodesic deviation equation, 56, 98
grad, 15
gravitational
radar echo delay, 138
radiation, 97
redshift, 136–137
time delay, 138
waves, 97

H
Hawking radiation, 146
Hawking temperature, 146
horizon problem, 185, 189
Hubble
function, 170
law, 174
length, 182, 188
length comoving, 188
parameter, 170, 181
time, 182
hyperbolic disk, 25, 54

I
index notation, 7
scalar product, 7
vector, 8
vector product, 9
inflation, 187, 191

J
Jacobian, 12

K
Kerr solution, 146
Klein–Gordon equation, 191
Kronecker delta, 7
Kruskal–Szekeres coordinates, 144

L
Lagrangian, 57, 60
length contraction, 74
Levi-Civita symbol, 9
Lie derivative, 60
light deflection, 132, 135
Newtonian, 135
line element, 22
linearised
Einstein field equations, 95
Einstein tensor, 93
Ricci tensor, 93
Riemann scalar, 93
Riemann tensor, 93
lookback time, 180
Lorentz
boosts, 71
force, 76
transformations, 71
Lorenz gauge, 78, 90, 94–95
luminosity distance, 174, 178

M
manifold, 19
mass, 118
gravitational, 68
inertial, 68
matter action, 103
Maxwell equations, 75, 90
homogeneous, 75
inhomogeneous, 75
metric, 22
Lorentzian, 22
signature, 22
metricity, 35
minimal gravitational coupling, 89
Minkowski space, 25, 71

N
Newton’s theory of gravity, 67
Newtonian potential, 68–69
non-metricity, 35
null geodesics, 123
cosmology, 172
incoming, 142
outgoing, 142
radial, 141

P
parallel transport, 38
particle horizon, 158
perihelion precession, 127, 131
photon sphere, 125
Planck mass, 192
polar coordinates, 23
power-law inflation, 197
projection, 3

R
radar echo delay, 138
recombination, 184
redshift
cosmological, 153, 172
gravitational, 136
Ricci
scalar, 52
tensor, 52
Riemann tensor, 47
identities, 49
Riemann curvature tensor, 47

S
scalar field, 190
scalar product, 3
scale factor, 157
Schwarzschild, 113
effective potential, 124
geodesics, 123
interior metric, 122
interior solution, 117
radius, 116, 141
solution, 115
shortest lines, 26, 41
skew-symmetric part, 17
slow-roll
approximation, 193
inflation, 192
parameters, 193
special relativity, 70
spherical coordinates, 24
spherical symmetry, 111
straightest lines, 41
symmetric part, 17

T
tangent space, 11
tangent vector, 10
temperature, 183
tensor, 16
contraction, 17
definition, 16
rank, 16
time dilation, 74
Tolman–Oppenheimer–Volkoff
equation, 120
torsion, 35, 46
transverse-traceless, 98

U
universe
age, 180
de Sitter, 167
Einstein static, 166
matter dominated, 163
radiation dominated, 165

V
variational approach, 100
vector, 1
addition, 2
basis, 4
contravariant, 13
covariant, 13
direction, 1
field, 12
index notation, 8
magnitude, 1
unit, 1
zero, 1
vector product, 3
volume, 24

W
weak gravity, 92
Weyl tensor, see conformal tensor
world line, 75
