Differential Geometry I Script 15-11-24 NEW


Differential Geometry I

Lecture by Tom Ilmanen

Maciej Swiatek

15.01.2024
Contents

Preface 9

I Curves and Surfaces 11


1 Curves 15
1.1 Definition and some Restrictions . . . . . . . . . . . . . . . . . . . 15
1.2 Arc-length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3 Geometric quantities . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3.1 The tangent vector . . . . . . . . . . . . . . . . . . . . . . 19
1.3.2 The curvature vector . . . . . . . . . . . . . . . . . . . . . 22
1.4 Curves in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4.1 The main interpretations of the curvature scalar . . . . . . . 29
1.4.2 Curvature determines curve up to rigid motion . . . . . . . . 34
1.4.3 The interaction between global and local . . . . . . . . . . . 35
1.5 Curves in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.5.1 First remarks . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.5.2 The curvature scalar in three dimensions . . . . . . . . . . . 39
1.5.3 Torsion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.5.4 How Curvature and Torsion determine curve . . . . . . . . . 43
1.5.5 The Frenet Frame . . . . . . . . . . . . . . . . . . . . . . 44
1.5.6 Global theorems for curves in R3 . . . . . . . . . . . . . . . 46

2 Surfaces 49
2.1 Some definitions and basic quantities . . . . . . . . . . . . . . . . . 49
2.2 The curvature of surfaces in R3 . . . . . . . . . . . . . . . . . . . . 54
2.3 The Geometric Definition of Curvature on Surfaces . . . . . . . . . 55
2.3.1 A bit about Qp . . . . . . . . . . . . . . . . . . . . . . . . 57
2.3.2 The independence of Qp (v) from the curve we choose . . . 59
2.3.3 Proof of the theorem . . . . . . . . . . . . . . . . . . . . . 61
2.4 The second fundamental form . . . . . . . . . . . . . . . . . . . . 64
2.4.1 Simplifying the second fundamental form . . . . . . . . . . 64
2.4.2 The mean and Gauss curvatures . . . . . . . . . . . . . . . 66
2.5 Symmetry and Curvature . . . . . . . . . . . . . . . . . . . . . . . 70


2.5.1 Examples: Determining Ap quickly through Symmetry . . . 73


2.6 Interlude: A bit about Differentiation . . . . . . . . . . . . . . . . . 77
2.6.1 Vector Fields on R3 . . . . . . . . . . . . . . . . . . . . . . 78
2.6.2 Vector fields on a surface . . . . . . . . . . . . . . . . . . 80
2.7 Another characterisation of curvature: The Weingarten Map . . . . 83
2.7.1 The Weingarten Map . . . . . . . . . . . . . . . . . . . . . 84
2.7.2 Proof of connection between the Weingarten map and Ap . 87
2.8 *Other formulas for curvature . . . . . . . . . . . . . . . . . . . . . 91
2.8.1 Curvature of a parametrized surface . . . . . . . . . . . . . 91
2.8.2 The curvature of a graph . . . . . . . . . . . . . . . . . . . 94
2.9 Intrinsic Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
2.9.1 The intrinsic metric of M . . . . . . . . . . . . . . . . . . . 96
2.9.2 The first fundamental form . . . . . . . . . . . . . . . . . . 97
2.10 Intrinsic Isometries and Gaussian curvature . . . . . . . . . . . . . . 98
2.11 Two Important theorems . . . . . . . . . . . . . . . . . . . . . . . 105
2.11.1 The Gauss-Bonnet-Theorem . . . . . . . . . . . . . . . . . 105
2.12 The unification theorem . . . . . . . . . . . . . . . . . . . . . . . . 107

II Manifolds 109
3 Topology and topological manifolds 113
3.1 Humble beginnings . . . . . . . . . . . . . . . . . . . . . . . . . . 113
3.2 A topological space . . . . . . . . . . . . . . . . . . . . . . . . . . 119
3.3 Charts: Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.4 The Hausdorff Condition . . . . . . . . . . . . . . . . . . . . . . . 124
3.5 The ant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
3.6 Interlude: Useful topology for the course . . . . . . . . . . . . . . . 126

4 Smooth Manifolds 127


4.1 Charts II: Compatible charts . . . . . . . . . . . . . . . . . . . . . 127
4.2 Examples of smooth manifolds . . . . . . . . . . . . . . . . . . . . 130
4.3 The maximal atlas . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.4 The final definition of a smooth manifold . . . . . . . . . . . . . . 137
4.5 Cartesian Products, Smoothness . . . . . . . . . . . . . . . . . . . 137
4.5.1 The Cartesian product of two manifolds . . . . . . . . . . . 138
4.5.2 Smooth functions between manifolds . . . . . . . . . . . . . 138
4.6 Diffeomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
4.6.1 What you can get from diffeomorphisms . . . . . . . . . . . 141

5 Tangent Vectors 143


5.1 Tangent vectors from charts . . . . . . . . . . . . . . . . . . . . . 144
5.1.1 Vectors and Maps in Rn . . . . . . . . . . . . . . . . . . . 145
5.1.2 Back to manifolds . . . . . . . . . . . . . . . . . . . . . . . 145
5.1.3 The definition . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.2 Tangent vectors as directional derivatives . . . . . . . . . . . . . . 150

5.2.1 The basis of Tp M with the second definition. . . . . . . . . 153


5.2.2 The coefficients of a vector . . . . . . . . . . . . . . . . . . 154
5.2.3 Proving that (∂/∂x^i)_{i=1,...,n} is a basis . . . . . . . . . . . . . 155
5.3 Tangent vectors from curves . . . . . . . . . . . . . . . . . . . . . 158
5.4 Tangent vectors as operators that satisfy the product rule. . . . . . 160
5.5 Change of coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.6 Differentiation of a function between manifolds . . . . . . . . . . . 163
5.7 The chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.8 The coordinate expression for df (p) . . . . . . . . . . . . . . . . . 166

6 Tangent Spaces and Tangent Bundles 169


6.1 First example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.2 Definitions and Conditions . . . . . . . . . . . . . . . . . . . . . . 170
6.2.1 The projection map . . . . . . . . . . . . . . . . . . . . . . 172
6.3 Vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.4 A few intriguing examples . . . . . . . . . . . . . . . . . . . . . . . 175
6.4.1 The sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.4.2 The circle, again . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5 The sphere, torus and Klein bottle . . . . . . . . . . . . . . . . . . 178
6.6 A few last things about spheres . . . . . . . . . . . . . . . . . . . . 178

III Creating and Embedding Manifolds 181


7 Submanifolds and how to get them 185
7.1 Definition of a submanifold, Compatibility with manifolds . . . . . . 185
7.2 Examples of submanifolds. . . . . . . . . . . . . . . . . . . . . . . 186
7.3 Construction and Verification of submanifolds . . . . . . . . . . . . 188
7.4 Immersions, Submersions and Embeddings . . . . . . . . . . . . . . 189
7.4.1 Examples of Immersions, Submersions and Embeddings . . . 190
7.4.2 Embeddings are injective immersions . . . . . . . . . . . . . 192
7.5 The Inverse function theorem. . . . . . . . . . . . . . . . . . . . . 193
7.6 Consequences of the inverse function theorem . . . . . . . . . . . . 195
7.7 A few more examples . . . . . . . . . . . . . . . . . . . . . . . . . 196
7.8 What comes next . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

8 Coverings 199

9 Orientations 201
9.1 Orientations on Vectorspaces . . . . . . . . . . . . . . . . . . . . . 201
9.1.1 Why the determinant? . . . . . . . . . . . . . . . . . . . . 203
9.2 Orientations and linear maps . . . . . . . . . . . . . . . . . . . . . 207
9.3 Orientations on Manifolds . . . . . . . . . . . . . . . . . . . . . . . 207
9.4 Orientation double cover . . . . . . . . . . . . . . . . . . . . . . . 209

10 Technicalities: Countability 213


10.1 Second countability, σ-compactness and para-compactness . . . . . 213
10.1.1 Second Countability . . . . . . . . . . . . . . . . . . . . . 214
10.1.2 σ-compactness . . . . . . . . . . . . . . . . . . . . . . . . . 214
10.1.3 Paracompactness . . . . . . . . . . . . . . . . . . . . . . . 215
10.2 How the three definitions interplay . . . . . . . . . . . . . . . . . . 218
10.3 The Long line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

11 Tools: Bumps, Cutoffs and partitions of unity 221


11.1 Bumps and Cut-offs . . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.2 Partition of Unity . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
11.2.1 Local finiteness . . . . . . . . . . . . . . . . . . . . . . . . 225
11.2.2 Partition of unity, definition . . . . . . . . . . . . . . . . . 227

12 Immersions: Local Immersion Theorem 231


12.1 A local answer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
12.2 A global answer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
12.3 Proper Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

13 Embeddings: Whitney’s Theorem 243


13.1 Whitney’s embedding theorem . . . . . . . . . . . . . . . . . . . . 243
13.2 Refinements of Whitney’s Embedding Theorem . . . . . . . . . . . 244
13.2.1 Additional refinements . . . . . . . . . . . . . . . . . . . . 244
13.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
13.4 Proof of Whitney’s Embedding Theorem . . . . . . . . . . . . . . . 245

14 Submersions: Local Submersion Theorem 249


14.1 Submersions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
14.2 Back to the question . . . . . . . . . . . . . . . . . . . . . . . . . 250
14.3 The local submersion theorem . . . . . . . . . . . . . . . . . . . . 252
14.3.1 Consequences of the Local Submersion Theorem . . . . . . 252
14.4 Critical Points and Values . . . . . . . . . . . . . . . . . . . . . . . 254

IV Vectorfields, Lie Bracket, Flows and Lie Derivative 259


15 Lie Brackets and Vectorfields 263
15.1 Vectorfields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
15.2 Why DX Y is not so trivial. . . . . . . . . . . . . . . . . . . . . . . 265
15.2.1 What we can do . . . . . . . . . . . . . . . . . . . . . . . . 265
15.3 Lie Bracket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
15.4 Local and Differential Operators . . . . . . . . . . . . . . . . . . . 268
15.5 Lie Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

16 Flows 271

17 Lie Derivative 273


Disclaimer

Currently, the script is not finished. In particular, the part about Flows and Lie
Derivatives is missing (currently still in progress). Additionally, a few examples are
missing, for example the Long line, the Helicoid and a few examples of smooth
manifolds, so if you are learning for the exam, don’t forget that these exist.
Additionally, I cannot guarantee that the script is error-free; in fact, it has many
errors. I tried to make sure that the big ones are corrected, but some will remain.
As always, remember to verify everything yourself; anything that differs from your
notes could be wrong. The lecture takes precedence when it comes to the exam
and grading.

Preface

Sections with an asterisk in front of them are not mandatory to read; you can
skip them as you like (”You should know these ideas exist, but don’t need to learn
them”). Big thanks go out to Anastasia Sandamirskaya and Ji Zhexian for helping
with writing the first lecture about surfaces.
Disclaimer: We generally stay true to the notation of the lecture. The only real
exception to this (so far) is the symbols we use for charts: ψ, χ were used in the
lecture as general charts, whereas we use Ch1 and Ch2 (literally: chart 1 and
chart 2) to make the equations a bit more direct. Other than that, the symbol
T1→2 is also a product of this script; it denotes the transition/overlap map, which in
the lecture was always written out with the charts (χ ◦ ψ −1 ). We did this to make a
few equations a bit more readable.
Note: There are currently a few pages, particularly where there are many figures
in the text, that have very wide gaps. This changes every time one adds any text,
because LaTeX re-chooses where to put things like figures and definitions and then
tries to fit the text around them. I would need to fix this manually at every such
point and will do so after the script is done. For now, I hope you can get past this.

Part I

Curves and Surfaces


We start with some of the most intuitive examples of the type of manifolds we
will be working with, that is, with curves and surfaces embedded in some form of
Rn .
Chapter 1

Curves

In this chapter, we will deal with curves. We first define what we mean by a curve,
and impose some restrictions on the kind of curve we want to deal with. We won’t
prove all the things we claim in this chapter, as some of these things you should
have already seen in a Calculus class and this is only a quick overview.

1.1 Definition and some Restrictions


Given that this is Differential Geometry, we do not want to work with discontinuous
curves. We therefore choose to work with smooth curves.
Definition 1.1.1 (Smooth curve). We define a smooth curve in N dimensions
to be a smooth function from some interval I to RN . Mathematically:

A smooth curve is a function γ : I → RN with γ ∈ C ∞ (I, RN )

The interval can be any sort of interval you want: open, closed, half-open, etc.
We also allow things like (−∞, 0]. You can see an example in Figure 1.1. The
thing we are most interested in in Differential Geometry is not actually the
parametrization of the curve. What we usually mean by the curve is the actual line
in RN that you can draw on a piece of paper: the real geometric line, not the
function that assigns it a parameter value. That is why the parametrization is not
the main player in Differential Geometry. The curve exists independent of the
parametrization; it is (mathematically) the image of the function γ. We will use
names like γ for the image of the function, not just for the function, as the image
is what we care about the most.
Smoothness is not the only property we will (usually) want a curve we work
with to have, because smoothness in the sense above does not guarantee that the
image of the curve is a smooth object. (It only requires the parametrization to be
smooth.) We can see this with the example below.
Example 1.1.1 (Smooth parametrization doesn’t imply smooth image). Take
the curve γ : t ∈ R → (t2 , t3 ) ∈ R2 . It is clear that this is a smooth curve. (The


Figure 1.1: An example of a smooth curve. The Interval I gets mapped onto a
curve in R3 with the function γ

Figure 1.2: The curve γ : t ∈ R → (t2 , t3 ) ∈ R2 . At t = 0 we see that the image


of the curve is not a smooth object

coefficients are polynomials in t; all the derivatives exist and are continuous.)
But look at Figure 1.2. The image of the curve is obviously not smooth in R2
at t = 0, or equivalently x = (0, 0). What is happening there? Well, it
resembles the absolute value function a bit: that curve also has a sort of sharp
bend at a point. The problem there was with the derivative. It simply did not
exist, which gave the curve its weird behavior (the sharp bend). Similarly,
here the problem is also with the derivative. It exists, obviously, since this is a
smooth curve, but it becomes 0 at the problem point (t = 0, or the origin). A
curve with such a bend is not something we really want to work with, therefore
we put another restriction on the curves we work with. We eliminate curves like
the one from this example simply by refusing to work with curves whose
derivative becomes 0 anywhere.
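The failure of regularity at t = 0 can be checked in one line with a computer algebra system. The following sketch (using sympy; not part of the lecture) differentiates the parametrization and evaluates at the problem point:

```python
# Sketch (not from the lecture): verify that gamma(t) = (t^2, t^3) is smooth
# but fails to be regular at t = 0, where its derivative vanishes.
import sympy as sp

t = sp.symbols('t')
gamma = sp.Matrix([t**2, t**3])

velocity = gamma.diff(t)     # dgamma/dt = (2t, 3t^2), smooth everywhere
print(velocity.subs(t, 0))   # the zero vector: the curve is not regular at t = 0
```

Every derivative of γ exists (the components are polynomials), yet the velocity vanishes at the cusp, which is exactly the situation the definition below rules out.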

As we saw in the previous example, we run into problems if the derivative of the
curve with respect to the parameter is zero. We therefore define regular curves as
those for which this doesn’t happen, or in other words, whose velocity never
vanishes.

Definition 1.1.2 (Regular curve). A smooth curve is called a regular curve, if:

dγ/dt ≠ 0 for all t ∈ I (1.1)

where γ is the smooth curve and I is the interval it is defined on.

Note 1. We will use various notations for the derivative of a curve. These
include:

dγ/dt = γt = γ̇ (1.2)

1.2 Arc-length
Now that we have said what we mean by a curve and restricted it so as to not run
into problems like the one in the example above, we can start with the geometry.
Undeniably, one of the most important quantities in geometry is length. If you
know the lengths in a problem, you already know quite a bit of the geometry. So
what is the length of a (piece of a) curve?
Well, we already restricted ourselves to working with regular curves, so our
motivation will be more on the intuitive side.
Imagine you have any curve, like the one in Figure 1.3. The idea is that we
divide the curve into very small almost-straight parts, calculate the length of each
part by approximating it as a straight line, and then sum all of those back up
together. We can do this for reasonable (i.e. regular) curves. Of course, in reality
what we do is go infinitesimal, at which point this becomes an integral.
For the small piece seen in the figure, we have ∆s = |dγ/dt| ∆t, where by s we
mean the length and by ∆s the very small length of that very small part. Afterwards

we add all of these up, and in the continuum limit we get an integral:

s(t) = ∫_γ ds = ∫_{t0}^{t} |dγ/dt| dt (1.3)

Figure 1.3: A curve and our intuitive way to understand the definition of the
arc-length. We zoom in on a very small part of the curve, between t′ and t′ +∆t.
There, if ∆t is small enough, the line will be approximately straight and we can
use the velocity vector to calculate the length of that piece approximately. Note
that the velocity vector is drawn in way smaller than it would actually be for
any reasonable ∆t, just so that the picture is clearer.

Definition 1.2.1 (Arc-length). We define the arc-length of a curve γ : I → Rn
by first choosing a specific point t0 ∈ I and its image γ(t0 ) as a reference point.
Then the arc-length s(t) between t and t0 is:

s(t) = ∫_{t0}^{t} |dγ/dt|_{Rn} dt (1.4)

Note that in the definition we did not assume that t > t0 ; a negative arc-length
is possible, simply by going in the opposite direction of the parametrization of the
curve.
We already mentioned that the geometrically interesting object is the image of
γ, not the function γ (i.e. the parametrization) itself. We will care mostly about
things we can define on the image of γ that do not depend on that parametrization.
The arc-length is something independent of the parametrization.1
1 Of course, we can always choose to parameterize the curve in the other direction, which
changes the sign of the arc-length. We can also choose a different reference point other than
γ(t0 ). But these choices are rather trivial and we won’t really mention them from
now on.
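As a quick numerical illustration (a sketch written for this edition, not part of the lecture; the helper name arc_length is made up), one can approximate the integral in equation 1.4 for the circle γ(t) = (R cos t, R sin t), where |dγ/dt| = R and hence s(t) = R(t − t0):

```python
# Sketch: approximate s(t) = int_{t0}^{t} |dgamma/dt| dt with a trapezoid sum.
import numpy as np

def arc_length(gamma_dot, t0, t1, n=10000):
    """Numerically integrate the speed |dgamma/dt| from t0 to t1."""
    ts = np.linspace(t0, t1, n)
    speeds = np.linalg.norm(gamma_dot(ts), axis=0)   # |dgamma/dt| at each sample
    return float(np.sum(0.5 * (speeds[:-1] + speeds[1:]) * np.diff(ts)))

R = 2.0
gamma_dot = lambda ts: np.array([-R * np.sin(ts), R * np.cos(ts)])

s = arc_length(gamma_dot, 0.0, np.pi)   # half the circle
print(s)                                # approximately pi * R
```

The speed of this parametrization is constant, so the numerical answer matches πR essentially exactly; for a non-constant speed the same sum still converges to the arc-length.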

In line with this philosophy, we can define a very convenient, but also more
geometrically ”real” parametrization. The idea is that the arc-length is a geometric
object independent of parametrization, and that, for regular curves, we can use the
arc-length to parameterize the curve.

Lemma 1.2.1 (Reparametrization of a regular curve). Let γ : t ∈ I → RN be
a regular curve. Then we can re-parameterize it to a new (regular) curve β(s),
so that

|dβ/ds| = 1 (1.5)

In other words, we parameterize it by the arc-length s.

Proof. We will only sketch the proof, as this is something rather simple and you
very likely already saw the proof in a calculus class.2
The first step is to take the arc-length and see it as a function of t:

s = f (t) = ∫_{t0}^{t} |dγ/dt| dt (1.6)

We can then take the inverse of this function, call it g = f −1 , and express
t as a function of s. If we take β = γ ◦ g, i.e. β(s) = γ(g(s)), we have found the right
parametrization. The only thing left to convince yourself of is that the
velocity is really of unit length. (You can do this using the chain rule.)

Of course, the curve is still regular, and all the properties like smoothness are
still obeyed by the curve. The image of β and γ is of course exactly the same, i.e.,
you can’t change the curve simply by re-parameterizing. You might find Figure 1.4
helpful in visualizing this.
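For the circle this reparametrization can be written down explicitly and checked numerically. The sketch below (an illustration of the lemma, not from the lecture) uses s = f(t) = Rt, hence g(s) = s/R, and confirms |dβ/ds| = 1 with a central difference:

```python
# Sketch: beta(s) = gamma(g(s)) with g(s) = s/R is the arc-length
# parametrization of the circle of radius R; its speed is identically 1.
import numpy as np

R = 3.0
beta = lambda s: np.array([R * np.cos(s / R), R * np.sin(s / R)])

h = 1e-6
speeds = []
for s in [0.0, 1.0, 2.5]:
    dbeta_ds = (beta(s + h) - beta(s - h)) / (2 * h)   # numerical dbeta/ds
    speeds.append(np.linalg.norm(dbeta_ds))
print(speeds)   # each entry is approximately 1
```

Note that the image is unchanged: β traces exactly the same circle as γ(t) = (R cos t, R sin t), only the ticks move.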

1.3 Geometric quantities


We continue our search for geometric quantities (other than the arc-length) that we
can find in connection to curves. We will find that the tangent vector and the
curvature vector (see definitions below) are both independent of the specific
parametrization of our curve and that they tell us a lot of geometric information.

1.3.1 The tangent vector


You have, throughout your studies, seen many functions and many curves. Your
intuition from Calculus about the tangent vector will probably make you very quickly
say that the tangent vector (here called τ ) should look like this:

τ (t) =? dγ/dt (1.7)
2 If not, try it yourself as an exercise.

Figure 1.4: Different parametrizations of the same curve. The curve is drawn in
green; the ticks are the points on the curve with the parameter values written
next to them. In (a) you see a typical non-special parametrization (i.e. the ”t”),
in (b) you find the curve parameterized by the arc-length (s). It is intuitively
clear why the parametrization is not something geometrically interesting: the
real curve (green line) exists independent of the ticks. In (c) you find another
parametrization by the arc-length, except with a different choice of reference
point on the curve.

A bit of thought, however, reveals that this cannot be true. Why? Well, it is not
independent of parametrization. Imagine, for example, you were to go twice as fast
along the curve. Then your velocity vector (= tangent vector in this proposal) would
be twice as big at every point. But if we want the tangent vector to be something
fundamentally independent of the parametrization, then equation 1.7 cannot be the
correct definition of the tangent vector.
How can we fix this? Well, look at Figure 1.5. It shows the same curve, param-
eterized in three different ways, with the ”fake” tangent vectors from equation 1.7
drawn in. The thing that should jump out at you is that, while the length of the
vectors does change, the direction does not 3 . The tangent vector as we defined it
in equation 1.7 is not the geometrically real thing; rather, the unit tangent vector
is, which is exactly how we choose to define it below.

Figure 1.5: The ”fake” tangent vectors from equation 1.7 for different
parametrizations of the same curve (green). In (a) you see a random
parametrization (black ticks) and its ”fake” tangent vector (blue) from equation
1.7. In (b) you have the same situation, only that this time you go twice as fast
along the curve. Notice that the vectors (the physically drawn arrows) change.
In (c) you have the same thing, but this time parameterized with arc-length.

Definition 1.3.1 (Tangent vector of a curve). We define the tangent vector τ
of a curve γ : t ∈ I → RN to be the vector:

τ (t) = (dγ/dt) / |dγ/dt| (1.8)

If we parameterize by the arc-length, then the formula for the tangent vector
becomes:

τ (s) = dγ/ds (1.9)

since |dγ/ds| = 1.

3 At least to the precision of my drawing skills

In this sense, we get something geometric, independent of the parametrization.


You can also think of this as us choosing the tangent vector to be the one from
figure 1.5(c).
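The parametrization-independence of definition 1.3.1 is easy to see numerically. In the sketch below (a hypothetical example written for this edition, not from the lecture) we traverse the unit circle at two different speeds and compare the unit tangents at the same geometric point:

```python
# Sketch: the "fake" tangent vector dgamma/dt doubles when we go twice as
# fast, but the normalized tangent vector of definition 1.3.1 is unchanged.
import numpy as np

h = 1e-6

def unit_tangent(curve, t):
    v = (curve(t + h) - curve(t - h)) / (2 * h)   # numerical dgamma/dt
    return v / np.linalg.norm(v)

gamma = lambda t: np.array([np.cos(t), np.sin(t)])   # unit circle
gamma_fast = lambda t: gamma(2 * t)                  # same image, twice as fast

# gamma(1.0) and gamma_fast(0.5) are the same point on the circle
tau_slow = unit_tangent(gamma, 1.0)
tau_fast = unit_tangent(gamma_fast, 0.5)
print(np.allclose(tau_slow, tau_fast))   # True
```

The raw velocity of gamma_fast at that point is twice as long, but after normalization the two tangents agree, which is exactly the situation drawn in figure 1.5.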
Note 2. We abused notation a bit. We write γ(s) instead of β(s) since, as we
already mentioned, we only really care about the images of those, and they are
the same; it also avoids cluttered notation. We also write dγ/ds for dβ/ds for the
same reason. By the chain rule, we get dβ/ds = (dγ/dt)(dg/ds), which we also
write as (dγ/dt)(dt/ds) to avoid introducing another symbol, g, since g is just the
function that expresses the parameter t in terms of s.

1.3.2 The curvature vector


We now come to the curvature vector. It is the fundamental object that describes
how much the tangent vector of the curve changes. Look at, for example, figure
1.5(c). It depicts the tangent vector for the curve, which is parameterized by its arc-
length. This vector does not change its length (it is by definition of unit length),
but it does rotate as you follow along the curve. Notice also that the more the
tangent vector rotates, the more ”curved”4 the curve is. That’s where the name
comes from.
Definition 1.3.2 (The Curvature Vector of a Curve). Let γ : s ∈ I → RN be
a regular curve, parameterized by the arc-length. Then we define the curvature
vector κ to be:

κ = dτ/ds = d²γ/ds² (1.10)

It is clear that, because s is independent of any sort of parametrization, dτ/ds is
too.
Figure 1.6 shows a curve with the curvature vector drawn in. Notice that the
curvature vector seems to be orthogonal to the tangent vector5 . This turns out to
be true universally, as we will now prove.
4 At least in the intuitive sense, for well-behaved curves.
5 The physics students among you might find this very similar to how in physics we some-
times separate acceleration into a parallel and a perpendicular part, the former changing the
speed, the latter curving the trajectory. Since here we don’t change the speed, only the curving
part is left.

Lemma 1.3.1 (κ ⊥ τ ). The curvature vector κ is orthogonal to the tangent
vector τ .

Proof. The proof is strikingly simple. We know that the length of τ is set to
one. Therefore ⟨τ, τ ⟩ = 1 and (d/ds)⟨τ, τ ⟩ = 0, since the length (and therefore
the scalar product) doesn’t change along the trajectory. We can use the product
rule:

0 = (d/ds)⟨τ, τ ⟩ = 2⟨dτ/ds, τ ⟩ = 2⟨κ, τ ⟩ (1.11)

Therefore, the scalar product of the two vectors is 0, i.e. they are orthogonal,
as claimed.
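The lemma can also be sanity-checked numerically. The sketch below (not from the lecture) does this on the unit circle, which is already parameterized by arc-length, approximating τ and κ by finite differences:

```python
# Sketch: on gamma(s) = (cos s, sin s), approximate tau = dgamma/ds and
# kappa = d^2 gamma / ds^2 by finite differences and check <kappa, tau> = 0.
import numpy as np

gamma = lambda s: np.array([np.cos(s), np.sin(s)])
h = 1e-4

dots = []
for s in [0.0, 0.7, 2.0]:
    tau = (gamma(s + h) - gamma(s - h)) / (2 * h)                 # dgamma/ds
    kappa = (gamma(s + h) - 2 * gamma(s) + gamma(s - h)) / h**2   # d^2gamma/ds^2
    dots.append(float(np.dot(kappa, tau)))
print(dots)   # each inner product is approximately 0
```

On the circle one can also see this analytically: τ(s) = (−sin s, cos s) and κ(s) = (−cos s, −sin s), whose inner product vanishes identically.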

Figure 1.6: A curve with τ and κ drawn in. Notice that κ is orthogonal to τ .

This is something physics students are very familiar with. The situation is very
analogous to the trajectory of a particle. The speed of the particle doesn’t change,
so the only direction the acceleration (= curvature vector) can have is perpendicular
to the curve.

Note 3. We want to make a quick check on the units of all the quantities that
we have described so far. Let’s assume that our RN carries some sort of length unit,
like the cm, which we will write as [L]. Let’s also assume the parameter of
our parametrization has units of time, like sec, which we denote [T ]. Then
both γ and s have units [L], so the tangent vector dγ/ds has units of [L]/[L] = 1
and is unit-less. This is something we want explicitly: the geometric object
should not be dependent on the parametrization, which means it should also be
independent of the unit of the parametrization [T ]. The ”fake” tangent vector
we defined before has, on the other hand, units of [L]/[T ].
The curvature vector d²γ/ds² has units of [L]/[L]² = 1/[L].

Until now, we have only given a formula for the curvature vector in the arc-
length parametrization. We will now write down the formula for the curvature
vector in an arbitrary parametrization.

Lemma 1.3.2 (Curvature Vector in arbitrary Parametrization). Let γ : t ∈
I → RN be any regular curve and τ (t) = γt (t)/|γt (t)| = (dγ/dt)/|dγ/dt| the
tangent vector. Then the curvature vector κ(t) can be written as: 6

κ = (1/|γt |²) ( γtt − ⟨γtt , γt /|γt |⟩ γt /|γt | ) (1.12)

where γtt is d²γ/dt².

Before we go on to prove this, we first want to talk about what each part of the
equation means.
We know that κ = d²γ/ds² and therefore expect it to have something to do with
d²γ/dt². This turns out to be the case: the first term is indeed γtt . But there is a
correction term of −⟨γtt , γt /|γt |⟩ γt /|γt |, which has a nice geometric explanation.
It projects γtt onto the normal plane of the tangent vector. See Figure 1.7 for
a visual example. After we have projected γtt onto the normal plane, we still divide
it by |γt |². You can see this as just a factor that makes sure that the units work out.
We can see this simply by comparing units. The part that projects γtt onto the
normal plane has the same unit as γtt , so we can just look at γtt . (Because the two
are added together, which doesn’t change the units.) The units of γtt = d²γ/dt² are
clearly [L]/[T ]², while the unit of κ is 1/[L], as we saw above. Therefore, to get a
consistent formula, we need something that has units of [T ]²/[L]². 1/|γt |² is exactly
a factor like that.

Note 4. We call the normal plane a plane, even though that is technically only
correct if we have a curve in R3 . In R2 it is a line, in R4 a hyperplane and in
general an (N − 1)-dimensional vector space.

Proof. We now prove equation 1.12. The proof consists, in its most basic form, of
just taking the definition of κ in the arc-length parametrization and switching to
the t-parametrization, using the normal rules of derivatives (chain rule / product
rule). We start with the chain rule:

κ = dτ/ds = (dt/ds)(dτ/dt) (1.13)

where by dt/ds we of course mean dg/ds, where g = f −1 and f (t) = ∫_{t0}^{t} |dγ/dt| dt.
Therefore:

dt/ds = dg/ds = (df/dt)−1 = |dγ/dt|−1 = 1/|γt | (1.14)

6 If you already have some experience of Differential Geometry or you are rereading this after
already learning further chapters, you might notice how this is the covariant derivative of
the tangent vector

Figure 1.7: A curve with τ and κ drawn in, as well as γtt . Because we move
along the curve faster and faster (the ticks are more spread out), γtt has a
component in the ”forward” direction, which we cancel out in equation 1.12.
The vector is still too long though, which is why we need to divide by |γt |².

If we insert the definition of τ in the t-parametrization, we get:

κ = (dt/ds)(dτ/dt) = (1/|γt |) d/dt ( γt /|γt | ) (1.15)

Now we need to use the quotient rule and the fact7 that d|γt |/dt = ⟨γtt , γt /|γt |⟩:

κ = (1/|γt |) d/dt ( γt /|γt | ) (1.16)
  = (1/|γt |) ( γtt |γt | − γt (d|γt |/dt) ) / |γt |² (1.17)
  = (1/|γt |) ( γtt |γt | − γt ⟨γtt , γt /|γt |⟩ ) / |γt |² (1.18)
  = (1/|γt |²) ( γtt − ⟨γtt , γt /|γt |⟩ γt /|γt | ) (1.19)

And we get the result as promised.
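Equation 1.12 can be tested on a curve whose curvature is known in closed form. The sketch below (a check written for this edition, not from the lecture; the helper name curvature_vector is made up) uses the parabola γ(t) = (t, t²), whose curvature magnitude is the classical 2/(1 + 4t²)^{3/2}:

```python
# Sketch: curvature vector of equation 1.12 via finite-difference derivatives,
# checked against the known curvature of the parabola gamma(t) = (t, t^2).
import numpy as np

def curvature_vector(curve, t, h=1e-5):
    g_t = (curve(t + h) - curve(t - h)) / (2 * h)               # gamma_t
    g_tt = (curve(t + h) - 2 * curve(t) + curve(t - h)) / h**2  # gamma_tt
    u = g_t / np.linalg.norm(g_t)               # unit tangent gamma_t / |gamma_t|
    normal_part = g_tt - np.dot(g_tt, u) * u    # project onto the normal line
    return normal_part / np.linalg.norm(g_t)**2

gamma = lambda t: np.array([t, t**2])
t = 0.5
k = np.linalg.norm(curvature_vector(gamma, t))
print(k)   # approximately 2 / (1 + 4 t^2)**1.5
```

Here the projection term is genuinely needed: the parametrization does not have constant speed, so γtt has a "forward" component, exactly as in figure 1.7.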

1.4 Curves in R2
We have, by now, defined exactly what we mean by a curve, seen the concept of
what sort of object is geometric, and defined a few such objects, like the arc-length
and the tangent and curvature vectors. We will now use all of these concepts to
describe curves in the two-dimensional plane.
The main idea that makes this a lot simpler is that the curvature vector κ
reduces to a number. This is because the direction of the curvature is always
predetermined in two dimensions by the direction of the tangent vector.
To see this, we note that, as we showed before, the curvature vector κ lies in
the normal ”plane” of the tangent vector, which in two dimensions means that it
lies on a straight line perpendicular to τ . Therefore, we only need to specify one
number8 to determine the curvature vector.
Let’s say we are at a point on a curve, like the one drawn in figure 1.8. We
can construct a right-handed basis of R2 at that point by taking τ as our first
basis vector, together with the vector that one gets by rotating τ by 90 deg (in the
positive sense), which we will call N . Since τ is of unit length and we get N by
rotating τ by 90 deg, this is an orthonormal basis (one that is right-handed). Notice
that it immediately follows that:

κ = kN (1.20)

for some k ∈ R, because we know that κ and τ are orthogonal. We call this k the
curvature scalar. It is an important quantity in differential geometry, and we will
find its equivalents for different geometric objects throughout the subject.
7 You should recognize this from Calculus II, given maybe in a different notation: with r = |x|, dr/dt = ⟨∇r, dx/dt⟩ = ⟨x/r, dx/dt⟩.
8 At each point of the curve.
Figure 1.8: A curve with its tangent, curvature and normal vectors drawn in at a point on the curve. As you can see, the curvature vector is just some number times the normal vector. Note, however, that k is not simply the length of κ: it is the negative of that length if κ points opposite to N.
Example 1.4.1. Our first example is the simplest curve that is not a straight line (because a straight line, of course, has no curvature9), which is a circle of radius R.
The curvature of that circle is

k = 1/R (1.21)

if the circle is parameterized in the counter-clockwise direction. Before you


dive into algebra, let’s consider why this result makes a lot of sense. Take the
few circles in figure 1.9. It is intuitively clear that the bigger the radius of
the circle, the less curved it gets. As you get progressively bigger radii, the
circles look more and more ”flat” at the top, or in other words, less curved.
The biggest circle (only drawn in partially) is almost flat, and if you were to
draw something like R = 100 you could probably not see the difference anymore
between a straight line and the circle. So the result, that the curvature scalar
is the inverse of this radius makes a lot of sense.
The proof of this claim is a very good exercise in converting parametrizations
and getting geometric information out of coordinates, and we will therefore leave
it as an exercise.

9 If you don’t immediately believe this, convince yourself of it.
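The exercise can also be checked symbolically. The following sketch (our own, using SymPy, not from the script) applies the general-parametrization formula (1.19) to the counter-clockwise circle and recovers k = 1/R.

```python
import sympy as sp

t, R = sp.symbols('t R', positive=True)
gamma = sp.Matrix([R*sp.cos(t), R*sp.sin(t)])   # counter-clockwise circle of radius R
g1, g2 = gamma.diff(t), gamma.diff(t, 2)
T = g1 / sp.sqrt(g1.dot(g1))                    # unit tangent tau
kappa = (g2 - g2.dot(T)*T) / g1.dot(g1)         # eq. (1.19)
N = sp.Matrix([-T[1], T[0]])                    # tau rotated by +90 degrees
k = sp.simplify(kappa.dot(N))
print(k)  # 1/R
```

Reversing the orientation (t → −t) flips the sign of N and hence of k, which is why the counter-clockwise convention matters in the statement above.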



Figure 1.9: A few circles with different radii, with their respective τ, κ, N drawn
at the point (0, R) of each curve. The bigger the circle, the less curved it is, as
reflected by the formula k = 1/R.
1.4.1 The main interpretations of the curvature scalar

The curvature scalar has a lot of interpretations. Let’s first state them, then discuss
their consequences.

Proposition 1.4.1 (Interpretations of the curvature scalar). As always, let γ


be a two dimensional curve, with all the properties we already discussed. Then
the curvature scalar has the following interpretations:

1. The curvature scalar is the rate of change of the angle the tangent vector makes with the x-axis. Mathematically, let θ(s) = arctan(τ2(s)/τ1(s)) be exactly that angle. Then:

    k = dθ/ds    (1.22)

2. The absolute value of the curvature scalar k determines the radius of the osculating circle, the unique circle that agrees with the curve up to order two:

    |k(s)| = 1/R(s)    (1.23)

where R(s) is the radius of that circle at the point of the curve whose parameter value is s.

The first interpretation of the curvature scalar should make a lot of sense intuitively. We know that the tangent vector cannot change its length, since per construction it is of unit length. Therefore the only thing that can really change is the direction, i.e. the angle it makes with the x-axis. This, along with the fact that the curvature vector describes how the tangent vector changes, makes the first part of the proposition rather intuitive. See figure 1.10 for a visualisation.
The proof is not too complicated: you just need to differentiate θ(s) and remember that (1) the derivative of arctan(x) is 1/(1 + x²) and (2) the normal vector in terms of the components of τ is N = (−τ2, τ1).

Proof of the first interpretation. As we said, you only need to differentiate θ. Let's

Figure 1.10: A curve, with its tangent vectors drawn in, and a table that shows
how the tangent vector rotates.

start:

    dθ/ds = (d/ds) arctan(τ2/τ1)    (1.24)
          = [d(arctan(x))/dx] · (d/ds)(τ2/τ1)    (1.25)
          = 1/(1 + x²) · (τ̇2 τ1 − τ2 τ̇1)/τ1²    (1.26)
          = 1/(1 + τ2²/τ1²) · (τ̇2 τ1 − τ2 τ̇1)/τ1²    (1.27)
          = 1/(τ1² + τ2²) · ⟨(τ̇1, τ̇2), (−τ2, τ1)⟩    (1.28)
          = (1/1) ⟨κ, N⟩    (1.29)
          = k    (1.30)

The factor in the fraction is one because it is the square of the length of τ, which is one.
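A finite-difference check of dθ/ds = k (our own sketch, not from the lecture): the Euler spiral has unit-speed tangent τ(s) = (cos(s²/2), sin(s²/2)), so θ(s) = s²/2 and the proposition predicts k(s) = s.

```python
import numpy as np

def tau(s):
    # unit tangent of the Euler spiral: theta(s) = s**2 / 2
    return np.array([np.cos(s**2 / 2), np.sin(s**2 / 2)])

s, h = 1.3, 1e-6
kappa = (tau(s + h) - tau(s - h)) / (2 * h)   # dtau/ds by central differences
N = np.array([-tau(s)[1], tau(s)[0]])         # tau rotated by +90 degrees
k = float(np.dot(kappa, N))
print(k)  # approximately 1.3, i.e. dtheta/ds at s = 1.3
```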

Now, what about the second interpretation? Well, you can imagine a circle, going along the curve, that locally looks like the curve. (The curve tries to be as similar to the circle as possible, but because the radius of the osculating circle changes with s, it doesn't become a circle.)
Figure 1.11 gives a picture of a curve and its osculating circles at different points of the curve. As you will hopefully agree, the bigger the radius of the circle, the straighter the curve will be at that point (as both of them agree to order two, they locally behave quite similarly). Therefore we expect that the second interpretation is correct, that is, the curvature scalar is the inverse of the radius of the osculating circle.

Figure 1.11: A curve with its osculating circle drawn in at a few places along the
curve. (The biggest one only partially drawn in) It is clear that the bigger the
osculating circle is, the straighter the curve will be, which gives the connection
to the curvature scalar.

Proof of the second interpretation. We will not prove this rigorously, as it is quite simple, but we will sketch a proof. The osculating circle agrees with γ up to order two. Therefore, we can expect that the second derivatives (i.e. the k's) agree for the curve and the osculating circle (which we can see as a second curve) at that point. We know that at that point, the circle has kcircle = 1/Rcircle, and therefore this should also be true for the first curve. The only missing parts of the proof are (1) the proof that an osculating circle exists, which it does10, and (2) a more rigorous way of presenting the above argument.

There is actually also a third interpretation of the curvature scalar, for a special
kind of curve. Let’s say, that the curve is the graph of a function y = u(x) that
assigns a y-value to every x-value, like the one in figure 1.12. Consider the second
10 A straight line is a circle of infinite radius in many aspects of geometry, and this is also true here: if the curve is locally straight at a point, the radius of its osculating circle will blow up and the circle will become a straight line, but the theorem will still hold. For the mathematicians: 1/∞ = 0 in this case.

Figure 1.12: A graph of a function as a curve.

derivative of u. Can we connect it to k, which is also a second derivative? Yes. In fact, this is a very common theme that will accompany you throughout differential geometry: curvature is a second derivative, and a second derivative is curvature in some sense11. The relationship between k and uxx is not trivial, however: k ≠ uxx! The actual relationship is:

    k = uxx / (1 + ux²)^(3/2)    (1.31)

We have to compensate, because x is not s. The proof of this is quite simple, if somewhat long. The basic strategy is, as with many of these proofs, to differentiate until you get to where you want to be.

Proof. Let's start by collecting different terms that might be useful. Firstly, γ(x) = (x, u(x)) and therefore:

    γx = (1, ux)    (1.32)
    |γx| = (1 + ux²)^(1/2)    (1.33)
    τ = γx/|γx| = (1, ux)/(1 + ux²)^(1/2)    (1.34)
    N = (−ux, 1)/(1 + ux²)^(1/2)    (1.35)

The first three should be rather clear, coming straight from the definition. The last one comes from the fact that N is just τ rotated by 90°, which means we switch the two entries of the vector and put a minus sign in front of the first one12.
11 Conditions apply, as always.
12 If this is not clear to you, try it out with the rotation matrix for a positive 90° rotation. You'll see that this is correct.

We can now just use the definition of κ and the chain rule and calculate until we get there.

    κ = (dτ/dx)(dx/ds)    (1.36)
    dx/ds = (ds/dx)⁻¹ = |γx|⁻¹    (1.37)
    → κ = (1/|γx|) dτ/dx    (1.38)
        = (1/|γx|) (d/dx)(γx/|γx|)    (1.39)

Before we continue, there is something to note about what we already found. Inside the derivative, we already normalize once, and then again outside of the derivative. In this sense, κ is a normalized version of a second derivative.

    ... = (1/|γx|) (d/dx)((1, ux)/|γx|)    (1.40)
        = (1/|γx|) [ (0, uxx)/|γx| + (1, ux) (d/dx)(1/|γx|) ]    (1.41)

(We used the product rule.) Now, we know that k = ⟨κ, N⟩. The last term in the equation above is proportional to (1, ux), which is proportional to τ, which means that when we form the scalar product to get k, it drops out, since τ is orthogonal (per construction) to N. We get:

    k = ⟨κ, N⟩    (1.42)
      = ⟨(1/|γx|) (0, uxx)/|γx|, (−ux, 1)/|γx|⟩    (1.43)
      = uxx/|γx|³ = uxx/(1 + ux²)^(3/2)    (1.44)

Here, we used that the aforementioned second term is orthogonal to N and left it out. At the end we just collected terms.
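The whole computation can also be delegated to a computer algebra system. The following SymPy sketch (ours, not from the script) reproduces formula (1.31) for an arbitrary graph γ(x) = (x, u(x)) by applying the general-parametrization formula (1.19).

```python
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)
gamma = sp.Matrix([x, u])                     # the graph curve gamma(x) = (x, u(x))
g1, g2 = gamma.diff(x), gamma.diff(x, 2)
T = g1 / sp.sqrt(g1.dot(g1))                  # unit tangent
kappa = (g2 - g2.dot(T)*T) / g1.dot(g1)       # eq. (1.19), with t = x
N = sp.Matrix([-T[1], T[0]])                  # T rotated by +90 degrees
k = kappa.dot(N)
expected = u.diff(x, 2) / (1 + u.diff(x)**2)**sp.Rational(3, 2)
print(sp.simplify(k - expected))  # 0
```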
Now, after seeing how much manual computation this took, you might wonder why. The reason is the same reason why any time you actually want to compute something in differential geometry, it usually turns into a mess of derivatives: we are turning something fundamentally coordinate-based13 (uxx)
into something geometric (k). Coordinate-based objects usually have, as you might
imagine, a lot of information in them that is only related to the choice of our
coordinates and we have to filter that information out when we do the conversion.
This is the reason why there is so much to compute, even if the steps aren’t too
complicated.
13 To make this discussion more general, we write coordinate-based, even though right now

it’s just a parametrization. You can see a parametrization as coordinates on the curve.

1.4.2 Curvature determines curve up to rigid motion


There is one more thing we want to discuss about the curvature scalar before we move on: how much the curvature (scalar) actually tells us about a curve, or to what degree it determines the curve. That is, suppose you have a function k(s) which you declare to be the curvature scalar of a curve, and ask yourself how much freedom you still have left. It will turn out that the curvature determines the curve up to its position at s = 0 and the angle of the tangent vector at that point. This mirrors Newton's law a lot, the reason being that both are differential equations of second order14. You can also see this as having the freedom to perform any rigid motion (a rotation or translation, but no mirroring or stretching) and still getting a curve with the same curvature scalar. You can look at figure 1.13 for an example.

Figure 1.13: You can perform a rigid motion and not change anything about the curvature of the curve.

Theorem 1.4.1 (Curvature determines curve up to a rigid motion). The curvature k(s) of a curve determines that curve up to a rigid motion.
Proof. We will, again, not prove this rigorously. We will give you a sketch, from
which it should be clear that a proof can be constructed.
The basic idea is to integrate twice.
Firstly, integrate the equation:

    k(s) = dθ/ds    (1.45)

to get:

    θ(s) = θ(0) + ∫₀ˢ k(s′) ds′    (1.46)

where θ(0) is a constant we can choose freely.

Figure 1.14: A picture of the theorem.

The next step is to integrate the equation (where we identify R² with C):

    dγ/ds (s) = τ(s) = e^(iθ(s))    (1.47)

and get:

    γ(s) = γ₀ + ∫₀ˢ e^(iθ(s′)) ds′    (1.48)

giving us yet another constant γ₀, which we can choose freely. To get a more rigorous proof, you would need to show that these are solutions (just differentiate them) and that they are the only solutions (use a uniqueness theorem from calculus).
14 The only difference to Newton's law is that the speed (with respect to s) can't change for a curve; that is why we don't get to pick an arbitrary tangent vector.
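The two integrations of the sketch translate directly into a numerical reconstruction (our own illustration; the function name is ours). Feeding in the constant curvature k ≡ 1 should reproduce a unit circle, so the reconstructed curve closes up after length 2π.

```python
import numpy as np

def curve_from_curvature(k, L, n=20000, theta0=0.0, gamma0=(0.0, 0.0)):
    """Integrate theta'(s) = k(s) and gamma'(s) = (cos theta, sin theta),
    as in (1.46) and (1.48), with a crude left-endpoint rule."""
    s = np.linspace(0.0, L, n)
    ds = s[1] - s[0]
    theta = theta0 + np.cumsum(k(s)) * ds
    x = gamma0[0] + np.cumsum(np.cos(theta)) * ds
    y = gamma0[1] + np.cumsum(np.sin(theta)) * ds
    return x, y

# constant k = 1 should give a unit circle: the curve closes up after length 2*pi
x, y = curve_from_curvature(lambda s: np.ones_like(s), 2 * np.pi)
gap = float(np.hypot(x[-1] - x[0], y[-1] - y[0]))
print(gap)  # close to 0
```

Changing theta0 or gamma0 applies exactly the rigid motion the theorem allows: the reconstructed curve is rotated or translated, but its curvature function is unchanged.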

1.4.3 The interaction between global and local


Before we close the subject of curves in R², we want to talk about a general theme of differential geometry that shows up throughout the subject, and apply it to curves in R². The theme is the interaction between global and local properties of geometric objects. The idea is that local properties (like curvature), which only feel a tiny piece of the object (around any point), put constraints on (or even determine) global properties. Local things are things that only need a small neighborhood of a point to be defined at that point, like the curvature vector, or later the metric. Global things are typically integral quantities, which are often related to topology. We will give an example (without proof) of a theorem along the lines of this idea, but before we do that, we need to define two further restrictions, which we will from now on assume the curves we work with obey.

Figure 1.15: A problem case of a closed curve, for which the tangent vector at the beginning is not the same as the tangent vector at the end. We want to avoid this, so we simply exclude these kinds of curves (and ones where higher derivatives don't match up) from the set of curves we consider.

Definition 1.4.1 (more restrictions, simple curves).

1. An N-dimensional smooth curve γ is called simple if it has no self-intersections, i.e. if γ(s) = γ(t) then s = t. The one exception we make are the endpoints, as we don't want to call a closed curve, like a circle, self-intersecting just because it returns back to its beginning.

2. Similarly, a curve is closed if it is defined on an interval [a, b] and γ(a) = γ(b).

3. If we are working with closed curves, we will want them to have the (nice)
property that, if we extend them periodically to a curve from R → RN ,
they are smooth. This is to avoid annoying situations like the one in figure
1.15

With these restrictions, we can state the theorem.

Theorem 1.4.2. Let γ be a two dimensional regular closed curve that obeys the above restrictions. Then:

    ∫_γ k ds = 2πn    (1.49)

for some n ∈ Z.

The integral quantity is the global quantity we mentioned before, while k is the curvature scalar, which is a local quantity. Since ∫_γ k ds = ∫_a^b (dθ/ds) ds, the global quantity is just the total angle (with signs) by which the tangent vector rotated, a profoundly global thing.
It makes sense that this would be so. If the curve is closed (and is smooth on the edge, if made periodic), then the angle τ rotates by must be some multiple of a whole rotation, since it ends up where it started.

Proposition 1.4.2. If the curve is simple, n = ±1

This proposition tells us that a non-intersecting curve’s τ can only turn once in
total. See the examples in figure 1.16
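We can test the turning-number theorem numerically (our own sketch; the ellipse is an arbitrary test curve, not from the text). For a plane curve in an arbitrary parametrization, k ds = (ẋÿ − ẏẍ)/(ẋ² + ẏ²) dt, which follows from eq. (1.19) together with k = ⟨κ, N⟩ and ds = |γt| dt.

```python
import numpy as np

def turning_number(x, y, t):
    """(1/2pi) * integral of k ds over a closed plane curve, using
    k * |gamma_t| = (x' y'' - y' x'') / (x'^2 + y'^2)."""
    xt, yt = np.gradient(x, t), np.gradient(y, t)
    xtt, ytt = np.gradient(xt, t), np.gradient(yt, t)
    f = (xt * ytt - yt * xtt) / (xt**2 + yt**2)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))   # trapezoid rule
    return integral / (2 * np.pi)

t = np.linspace(0, 2 * np.pi, 4001)
n = turning_number(3 * np.cos(t), np.sin(t), t)   # an ellipse, traversed once
print(int(round(n)))  # 1
```

A figure-eight curve would give n = 0 and a doubly-traversed circle n = 2, so the integer really does count signed full turns of τ, not self-intersections.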
With this we conclude the topic of two-dimensional curves, and move on to
three-dimensional curves.

1.5 Curves in R3
1.5.1 First remarks
Very early in our discussion of two-dimensional curves we figured out that in two dimensions, the curvature vector is not really necessary for the description of how the curve curves. Better said, the curvature scalar (the signed length of the curvature vector) held all the information about curvature; the direction of the curvature vector

Figure 1.16: Two simple curves and the graph that shows by how much the
tangent vector rotated.

was always predetermined. This is very different for curves in three dimensions.
Here, the vector character of the curvature vector really stands out.
We saw at the beginning of this chapter that the curvature vector lies in the
normal plane of the tangent vector. In two dimensions this helped us, by letting us
forget about the direction of the curvature vector and only consider the curvature
scalar. This time, we cannot do this, as we have an entire plane that the curvature
vector could lie in.
In the two dimensional case, we defined a moving frame composed of τ and
N , which was a right handed orthogonal basis. This is the idea we will use to get
further in three dimensions.
Definition 1.5.1 (The moving frame). Let, as always, γ be a curve, this time in
R3 with all the properties we already mentioned before (smoothness, regularity
etc.) We define three vectors at each point, that will compose the moving frame
we will usually use.

1. The first is the tangent vector τ . Remember that it is normalized.


2. The second one is the normalized curvature vector, which we will call N: N = κ/|κ|.

3. The third vector will be defined by the equation β = N × τ. Because both τ and N are of unit length, β is too, and all three together form a right-handed orthonormal basis.

The second vector will be called the normal vector15 , while the third vector is
called the bi-normal vector.

Together, N and β span the normal plane of τ . For a visualisation of the moving
frame, look to figure 1.17.

Figure 1.17: A three dimensional curve, with the moving frame drawn in. N is
κ, but normalized, β is the cross product of N and τ .

1.5.2 The curvature scalar in three dimensions


Because the curvature in three dimensions is really a vector, unlike in two dimen-
sions, we will not find that we can describe the entire curve just with the curvature
scalar. We still introduce it, as |κ|. But this time, it cannot change signs, which
means the two dimensional version is not exactly the same thing as the three di-
mensional.

Definition 1.5.2 (Curvature scalar for curves in three dimensions). The cur-
vature scalar is simply the absolute value of the curvature vector, defined as
k = |κ|

Now, you might have noted that to define N , we divided by the curvature scalar
and this becomes a problem, if k is zero. This is an actual problem and happens
for any curve that is (to second order) straight at some point. We will exclude this,
simply by adding another restriction on our definition of a curve.
15 Even though it is not the only normal vector to τ, it is a very special normal vector, since it points in the direction of κ. Therefore we call it the normal vector.


40 CHAPTER 1. CURVES

Definition 1.5.3 (ordinary). A curve γ is called ordinary if the curvature vector never vanishes, which means that for all points on the curve |κ| ≠ 0.

With this aside, let's go back to the curvature vector. We saw that the curvature scalar alone simply does not provide enough information to paint a complete picture of our curve. We will need another agent, which will be called the torsion.

1.5.3 Torsion
Here we will define what we mean by the new agent we said we needed in the
previous section, called torsion. Torsion will also be another geometric object16 . It
will tell us, in a sense, how much the curvature vector changes.

Definition 1.5.4. Let γ be an ordinary three dimensional curve. We define the torsion vector λ to be the projection of the derivative of the normal vector onto the bi-normal:

    λ = ⟨dN/ds, β⟩ β    (1.50)

and the torsion scalar l to be:

    l = ⟨dN/ds, β⟩    (1.51)

Why do we do this? Why don't we just define the torsion vector to be dN/ds? Well, it turns out that this has a lot of unnecessary information.
Consider its τ and N components. For the τ component we get (product rule):

    ⟨dN/ds, τ⟩ = (d/ds)⟨N, τ⟩ − ⟨N, dτ/ds⟩ = 0 − ⟨N, κ⟩ = −k    (1.52)
which is just (minus) the curvature scalar, which we already know, and for the other
component we get (Product rule again):

    2⟨dN/ds, N⟩ = (d/ds)⟨N, N⟩ = (d/ds) 1 = 0    (1.53)
since a unit vector can't change in its own direction (otherwise its length would change).
That is why we take the projection: it provides us with the only new information. The information we want is how the normal plane changes, but only in the direction of the bi-normal vector.
We can form a table with all the objects we have so-far introduced and a few
things to note on them. See table 1.1.
One thing that you might find surprising at first is that the units of λ are not 1/[L]². The reason is that we normalize κ before differentiating, which means we multiplied the units by [L].
16 As a reminder, something is a geometric object or geometrically invariant when it does

not change if we change the parametrization, or rotate RN .


1.5. CURVES IN R3 41

Figure 1.18: A curve, and its normal vector and plane changing along the curve.

object | formula | name/interpretation | derivatives | units
γ | – | position | 0 | [L]
τ | τ = dγ/ds | tangent vector / velocity | 1 | [ ]
κ | κ = kN | curvature vector / acceleration | 2 | 1/[L]
λ | λ = lβ | torsion / "jerk" | 3 | 1/[L]

Table 1.1: The main geometric objects we have defined up till now.

We can see what we are doing as a Taylor-expansion, which we stop after 4


terms (3 derivatives.)
We can draw a parallel between k and l. k measures how much γ deviates from being a straight line. Look at figure 1.19(a): k is the value that measures how γ1 deviates from a straight line. In the same figure, in (b), you see a curve that lives entirely in a plane (even though it might have been defined in three dimensions). Its l value is zero17. In (c), you see a curve γ3, which lives almost in a plane in the vicinity of a particular point on the curve, but which deviates from that plane. For it, l is not zero18.
Imagine the red plane in the figure is the xy-plane. Then the one curve would
look like γ2 (s) = (x2 (s), y2 (s), 0) and the other curve would look like γ3 (s) =
(x3 (s), y3 (s), z3 (s)), i.e it would have a non-zero third component.
We can expand z3 (s) (which we will call z(s)) around s = 0, which we can
assume to be the parameter of the thick point in the picture. Then, since the
xy-plane is the one that the curve ”almost” lives in, z(0) = 0, of course, but also
z ′ (0) = 0, z ′′ (0) = 0. This means that both the tangent and curvature vector lie in
the xy-plane. The first non-zero derivative will be the third, and if we Taylor-expand

17 Think about why.


18 It might be zero at some particular points of an arbitrary curve, where to high order the curve locally really almost lives in a plane, and only deviates a bit further on.

Figure 1.19: The parallel between k and l. k measures how close a curve is to a straight line (part (a)). In part (b) you see a curve that lives entirely in a plane, while in (c) you see a curve that deviates from the plane it is almost in, at least in the direct vicinity of the thick point with the moving frame drawn in. l is the quantity that measures this.
1.5. CURVES IN R3 43

z around 0, we will get something like:

    z(s) = 0 + 0·s + 0·s² + c·l·s³ + ...    (1.54)

where l is the torsion scalar and c is some universal constant. This is why the torsion scalar measures how much the curve deviates from living in a plane. As an exercise, you should prove all the claims we did not prove in this discussion and find c. (You can take the curve γ = (s, as², bs³) as a first example and see where in the Taylor series around 0 you find k and l.)
You can also prove that if k = 0 for all s, then the curve is a straight line, and if l = 0 for all s, the curve lies in a plane. (You can do this as an exercise, or decide that the above discussion has convinced you of it.)
In summary, torsion measures how much a curve twists away from a plane and
into the third dimension.
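The suggested exercise curve can be handled symbolically. The sketch below (ours, not from the script) implements the definitions verbatim — arc-length derivative d/ds = (1/|γt|) d/dt, N = κ/k, β = N × τ — for γ(t) = (t, t², t³), i.e. a = b = 1. Note that the sign of l depends on the script's convention β = N × τ; with the other common convention B = τ × N the sign flips.

```python
import sympy as sp

t = sp.symbols('t')
gamma = sp.Matrix([t, t**2, t**3])     # the exercise curve with a = b = 1
speed = sp.sqrt(gamma.diff(t).dot(gamma.diff(t)))

def d_ds(v):                           # arc-length derivative: (1/|gamma_t|) d/dt
    return v.diff(t) / speed

tau = gamma.diff(t) / speed
kappa = d_ds(tau)
k = sp.sqrt(kappa.dot(kappa))          # curvature scalar k = |kappa|
N = kappa / k
beta = N.cross(tau)                    # the script's convention beta = N x tau
l = d_ds(N).dot(beta)                  # torsion scalar, eq. (1.51)

print(sp.simplify(k.subs(t, 0)), sp.simplify(l.subs(t, 0)))  # 2 and -3
```

So at the origin k(0) = 2a and |l(0)| = 3b/a for this family, consistent with k appearing at order s² and l at order s³ of the Taylor expansion.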

Example 1.5.1. What is the curve with constant curvature and torsion? We
won’t show it here, but it can be shown that it is a helix. A helix, by the way,
can be parameterized by (R cos(t), R sin(t), mt) where m is some constant.

Figure 1.20: A helix

Why does it make sense that it’s a helix? Well, we want constant curvature,
which means a circle is involved, but we also want it to twist out of the plane
it lives in, at a constant rate, which is why we get a helix.
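The claim can be verified symbolically; the sketch below (our own, using SymPy and the script's conventions for N and β) shows that both scalars are independent of t for the helix. The sign of l again reflects the convention β = N × τ.

```python
import sympy as sp

t, R, m = sp.symbols('t R m', positive=True)
gamma = sp.Matrix([R*sp.cos(t), R*sp.sin(t), m*t])   # the helix
speed = sp.simplify(sp.sqrt(gamma.diff(t).dot(gamma.diff(t))))
d_ds = lambda v: v.diff(t) / speed                   # arc-length derivative

tau = gamma.diff(t) / speed
kappa = d_ds(tau)
k = sp.simplify(sp.sqrt(kappa.dot(kappa)))
N = sp.simplify(kappa / k)
l = sp.simplify(d_ds(N).dot(N.cross(tau)))           # beta = N x tau

print(k, l)  # k = R/(R^2 + m^2) and l = -m/(R^2 + m^2): both constant in t
```

For m → 0 this degenerates to the circle values k = 1/R, l = 0, matching the two-dimensional picture.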

1.5.4 How Curvature and Torsion determine curve


We have gotten a good intuition about curvature and torsion by now. Are these all the things we need to know about a curve in R³? It turns out that the answer is, pretty much, yes, very similarly to how in two dimensions we only really needed the curvature scalar.
44 CHAPTER 1. CURVES

This is stated in the following theorem, which can be proven similarly to its
equivalent in two dimensions.
Theorem 1.5.1 (k and l determine curve up to a rigid motion). Let k(s) ≥ 0
and l(s) be any smooth functions. If we set the curvature scalar and torsion
scalar to these functions, respectively, we determine the curve uniquely in R3
up to a rigid motion in R3 .
As we said before, we won't prove this. But we hope you see how interesting the result is. We need only two real functions to describe a curve in three dimensions, and only one real function in two dimensions. In both cases, we reduced our description by an entire function.

1.5.5 The Fréchet-Frame


We want to talk a bit more about τ, N and β. We already said that these form a right-handed orthonormal moving frame, basically a coordinate system that moves with the curve. Now, you can define arbitrary moving frames, but this one is somewhat special, not least because two of its components are normalized versions of geometrically important vectors (κ and λ). That is why this choice has a special name: it's called the Fréchet-frame (sometimes Frenet–Serret frame).
Before we talk about what makes this choice of moving frame special, we want
to talk a bit about general moving frames.
Let’s say we have some curve γ in three-dimensional space, and some moving
frame (not necessarily the Fréchet-frame) attached to it, that is orthonormal. See
figure 1.21.
We will abuse notation a bit and write the three basis vectors of the frame e1(s), e2(s), e3(s) as a column vector:

    (e1(s), e2(s), e3(s))ᵀ    (1.55)

Then, as we will show in a second, for a general moving frame we can write:

    (d/ds)(e1(s), e2(s), e3(s))ᵀ = A(s) (e1(s), e2(s), e3(s))ᵀ    (1.56)

where A(s) is a 3 × 3 matrix.
This matrix has a special property, generally, for any moving frame.
Proposition 1.5.1. Let γ be any curve and (e1 (s), e2 (s), e3 (s)) an orthonormal
moving frame. Then the matrix A(s) from equation 1.56 is anti-symmetric.
Where does the anti-symmetry come from? Well, there are two parts of the
anti-symmetry. Firstly, the diagonal is zero, which comes simply from the fact that
a vector of unit length cannot change in its own direction, otherwise the length
would change. Then there is the anti-symmetry of the other components, which
Figure 1.21: A curve with its Fréchet-frame and a second picture of the same
curve with a random moving frame.

stems from the orthogonality. If one vector changes, the others have to change in
a way that they all stay orthogonal to each other.

Proof. We know that ⟨ei(s), ej(s)⟩ = δij, independent of s. Therefore:

    0 = (d/ds)⟨ei(s), ej(s)⟩ = ⟨dei/ds, ej(s)⟩ + ⟨ei(s), dej/ds⟩ = Aij + Aji    (1.57)

Aha, this is exactly the condition for anti-symmetry.

Now we will be able to see why the Fréchet-frame is special.

Theorem 1.5.2 (Fréchet-Frame-derivatives). For a Fréchet-Frame, we get:

    (d/ds) (τ, N, β)ᵀ = [ 0 k 0 ; −k 0 l ; 0 −l 0 ] (τ, N, β)ᵀ    (1.58)

where k = k(s), l = l(s) are the curvature and torsion scalars, respectively.
This is a particularly simple matrix. The special thing about a Fréchet-frame is
that it reduces the three independent matrix-components of A to two, specifically
two components we already know.

Proof. Many of these are definitions (for example, the first equation is just the
definition of N ), the rest aren’t too hard to check and we leave them to you as
an exercise.
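The three rows of (1.58) can be checked symbolically for a concrete curve. The following sketch (ours; the helix with R = 2, m = 1 is our own test case) builds the frame from the script's definitions and verifies each row.

```python
import sympy as sp

t = sp.symbols('t')
gamma = sp.Matrix([2*sp.cos(t), 2*sp.sin(t), t])     # a helix with R = 2, m = 1
speed = sp.simplify(sp.sqrt(gamma.diff(t).dot(gamma.diff(t))))
d_ds = lambda v: v.diff(t) / speed                   # arc-length derivative

tau = gamma.diff(t) / speed
kappa = d_ds(tau)
k = sp.simplify(sp.sqrt(kappa.dot(kappa)))
N = sp.simplify(kappa / k)
beta = sp.simplify(N.cross(tau))                     # beta = N x tau
l = sp.simplify(d_ds(N).dot(beta))

# the three rows of eq. (1.58): each difference is the zero vector
print(sp.simplify(d_ds(tau) - k*N))
print(sp.simplify(d_ds(N) - (-k*tau + l*beta)))
print(sp.simplify(d_ds(beta) + l*N))
```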

1.5.6 Global theorems for curves in R3


We already introduced the principle that local and global quantities influence each other when we discussed the topic for curves in R². The same principle, of course, applies to curves in R³.
We will not fully prove any of the below, just state them. The first one will be
familiar from curves in two dimensions.

Theorem 1.5.3 (Fenchel's theorem). Let γ be a closed curve in R³ (RN works as well). Then:

    ∫_γ |k| ds ≥ 2π    (1.59)

The main steps in proving Fenchel’s theorem are the following:


(i) The curve τ has image in S² not contained in any open hemisphere. It is contained in a closed hemisphere iff γ is a plane curve.
(ii) Any curve of length ≤ 2π in S² is contained in a closed hemisphere, and any curve of length < 2π is strictly contained in an open hemisphere.
1.5. CURVES IN R3 47

Proof. (see exercise sheet 1, nr 1c)


(i) Suppose the image of τ is contained in an open hemisphere. By rotating R³ we can assume that the last coordinate of τ(s) is > 0. Then (as τ is the derivative of γ) the last coordinate of γ(t) is strictly increasing in t, so γ(0) = γ(L) is not possible, which contradicts γ being a closed curve. If the last coordinate of τ(s) is only ≥ 0 for all s, then it actually must be = 0 for all s: if it were > 0 for some s, there would also need to be an s̃ with the last coordinate of τ(s̃) < 0 to get a closed curve (we need to lose height again), contradicting ≥ 0. But if the last coordinate of τ(s) is 0 everywhere, the curve stays in a plane with constant last coordinate.
(ii) is left as an exercise to the reader.

The next theorem says something about the same quantity as above, for knotted
curves. Knotted curves are curves that cannot smoothly be transformed into a circle,
without crossing themselves. You can see a few examples in figure 1.22

Figure 1.22: The trefoil, figure-8-knot and unknot. The first two are knotted,
the last one is not, even though it might look like it at first.

Theorem 1.5.4 (Milnor's theorem). Let γ be any three-dimensional curve that is knotted, closed and simple. Then:

    ∫_γ |k| ds ≥ 4π    (1.60)

The bound in Milnor's theorem is sharp: for the trefoil knot, the infimum of the total absolute curvature over all representatives is exactly 4π (though it is not attained).
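We can probe both bounds numerically (our own sketch; the trefoil parametrization below is a standard choice, not from the text). The total absolute curvature is ∫|k| ds = ∫ |γ′ × γ″|/|γ′|² dt in an arbitrary parametrization.

```python
import numpy as np

def total_curvature(points, t):
    """Approximate integral of |k| ds = |g' x g''| / |g'|^2 dt for a space curve."""
    g1 = np.gradient(points, t, axis=0)
    g2 = np.gradient(g1, t, axis=0)
    f = np.linalg.norm(np.cross(g1, g2), axis=1) / np.sum(g1**2, axis=1)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))   # trapezoid rule

t = np.linspace(0, 2 * np.pi, 20001)
trefoil = np.stack([np.sin(t) + 2*np.sin(2*t),
                    np.cos(t) - 2*np.cos(2*t),
                    -np.sin(3*t)], axis=1)
circle = np.stack([np.cos(t), np.sin(t), 0*t], axis=1)

print(total_curvature(circle, t) / np.pi)    # approx 2.0 (Fenchel's bound, attained)
print(total_curvature(trefoil, t) / np.pi)   # > 4, consistent with Milnor's bound
```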

With these two examples, we conclude our discussion of curves and move on to the second type of object we want to discuss before treating differential geometry in a general manner: surfaces in R³.
Chapter 2

Surfaces

In this chapter, we deal with surfaces, which are the obvious next step after curves in
our discussion. We will not treat surfaces in the more general RN , but just surfaces
in R3 , as they are more intuitive and are enough for the purposes of this lecture.
We start with the definition and then define a few basic properties, before moving further to a discussion of the geometry and curvature.

2.1 Some definitions and basic quantities


The easiest type of surface in R³ is simply the graph of a function of x and y (like the one in figure 2.1(a)). We have a function f(x, y) and we set this to be the z-coordinate. You can have a look at the specific example below.

Example: The surface z = f(x, y) = x² + y²

In figure 2.1(b), you can see the graph of the function z = f(x, y) = x² + y². It is, of course, a surface (in every intuitive way, but also in the more general definition we will give below).

It is clear that the surface in the example should be a surface. The idea that a surface should always be the graph of a function is, however, not a good one. There are two simple examples that should definitely be surfaces, but wouldn't be if that were our definition. The first is the xz-plane. Obviously, it should be a surface. If anything should be a surface, the xz-plane should be. And yet, it is quite easy to see that you can't write it as a graph of a function of x and y. Well, you might say that there
is nothing special about x and y in R3 and that we should be allowed to choose
any plane to describe our surface. For example, we could take y = f (x, z) = 0 to
describe the xz-plane. Yes, that is a possibility, but still we find a problem. Take
the sphere. It should definitely be included in our definition of a surface. But I dare
you to find any plane for which you can write the whole sphere as a graph of a
function. It should be very clear, that this is not possible.

49
50 CHAPTER 2. SURFACES


Figure 2.1: A picture of several surfaces that explain our definition of a surface. In (a) you can see a graph of some random function of x and y. The picture in (b) is simply the graph of f(x, y) = x² + y². In (c) you see the first problem with the naive definition, because we cannot write the xz-plane as the graph of a function of x and y. In (d) you see the further problem that, even if we allow the function to be defined on an arbitrary plane, the sphere can simply never be of such form globally; but it can be made such locally (e).

Can we repair this situation? Yes, if we recognize that, while we cannot write
surfaces like the sphere as a graph of a function, at every point on the surface, in
some (possibly very small) region of the surface, we can write that region as the
graph of a function (on some plane in R3 ). That is we can write the surfaces locally
as the graph of a function1 . This leads us to our full definition.

Definition 2.1.1: Smooth surface in R3

A smooth surface M in R3 is a subset of R3 such that every point has a
neighbourhood (within M ) which can be written as a smooth graph of
some function in some direction.

Now that we have settled on what we mean by surfaces mathematically, we need
to be able to describe derivatives, since we are, after all, doing differential geometry.
You can imagine that you are a small ant living on this surface, and you, as an
ant, can only move on the surface, never out of it2 . Then everything you can see
and feel is two dimensional, but the usual calculus we know in R3 is obviously three
dimensional. We somehow need to reduce the dimensions. An intuitive idea is that
smooth surfaces locally (perhaps very locally) look like a plane, the same
way that a smooth curve locally looks like a straight line. We can take this plane,
called the tangent plane, as the space of directions, and do our differentiation there,
because in the limit (of a very small area around the point of interest), the surface
and plane converge to the same thing. See figure 2.2.
We define the tangent plane a bit differently however, through curves that live
in M . This is mostly because it is this definition that we will use to define the
tangent space when talking of manifolds generally, in later chapters.

Definition 2.1.2: Tangent plane

The tangent plane to M at p is the set of all vectors based at p and tangent
to M at the point p. We can formally define it as:

$$T_p M = \{ v \in \mathbb{R}^3 : v = \gamma_t(0) \text{ for some smooth curve } \gamma \text{ on } M \text{ with } \gamma(0) = p \} \qquad (2.1)$$
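As a small sanity check of this definition, here is a sketch (the helper names and the sample curve are our own, not from the text): for a graph surface z = f (x, y), the velocity of any curve lifted onto M must lie in the plane spanned by (1, 0, f_x) and (0, 1, f_y), i.e. it satisfies v_z = f_x v_x + f_y v_y.

```python
def f(x, y):
    return x**2 + y**2  # the paraboloid from figure 2.1(b)

def velocity(curve, h=1e-6):
    """Central-difference velocity at t = 0 of the lifted curve
    gamma(t) = (x(t), y(t), f(x(t), y(t))) on the graph surface."""
    def gamma(t):
        x, y = curve(t)
        return (x, y, f(x, y))
    a, b = gamma(h), gamma(-h)
    return tuple((p - q) / (2 * h) for p, q in zip(a, b))

# At p = (1, 1, 2) we have f_x = f_y = 2, so T_pM is the plane
# v_z = 2*v_x + 2*v_y; every curve on M through p obeys this:
v = velocity(lambda t: (1 + t, 1 - 3 * t))
print(v)  # approximately (1, -3, -4), and -4 = 2*1 + 2*(-3)
```

Any other curve through p with the same x- and y-velocities produces the same z-velocity, which is exactly why T_p M is a plane.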

Apart from tangent vectors, we also have normal vectors. These are vectors that
are normal to M , which, of course, means that they are also normal to Tp M .

Definition 2.1.3: Unit normal vector

The unit normal vector to M at p, denoted N (p), is a normalised vector
perpendicular to the tangent plane at p.

You can look at figure 2.3 for an example.


1 A function that depends on the point!
2 You are an ant. You cannot fly or jump or anything like that.

Figure 2.2: A surface M , with a point p and the tangent plane Tp M drawn
in, at different levels of ”zoom”. Globally, M and Tp M are very different,
but the more we zoom in, the more they coincide, until they become
”almost” the same.

Figure 2.3: A surface, its tangent plane, and a normal vector.



We have, of course, at every point, two choices of normal vectors: we can
either choose N to point in one direction or the other. When we were discussing
curves, we also had this choice, and simply resolved it by saying that we would always
take N in a way such that we get a right-handed coordinate system with the other
important vectors3 . Here, without any further important objects, we cannot really
do this. There is no obvious choice of preferred directions in the tangent plane,
so we cannot just choose N ’s orientation to get a right-handed coordinate system.
With closed surfaces (like a sphere), we can at least talk of N pointing ”inwards” or
”outwards”, but for something that isn’t closed (like a plane), we don’t even have
this luxury. What we will do is simply choose some direction for N whenever we
need it, and use that one consistently.
But we need not only talk of a normal vector at a point; we can talk of the
normal vectors at all the points of the surface, see figure 2.4.

Figure 2.4: The normal field N of some manifold M

Definition 2.1.4: Unit normal field

The unit normal field of M is the function N : M → R3 which maps points


p in M onto the unit normal vector at p. We require N to be continuous
(or sometimes smooth).

Some remarks on the unit normal field:

• We have two possible choices for a unit normal vector at a point (”inwards”
or ”outwards”). Since we require N to be continuous, it must point the same
way on each connected set, so overall we get a total of 2^k possible unit
normal fields, where k is the number of connected components of M .
• In some cases we can only define N locally. A classical example is the
Möbius strip4 , which you can see in figure 2.5.
3 In 2d: τ . In 3d: τ , β.
4 You can easily build a Möbius strip from a strip of paper. You just need to tape the ends

• If M is the boundary of an open set in R3 , then N can be defined globally.

Figure 2.5: The Möbius strip. You can try to draw a normal field by starting
at some point and drawing the next normal vector and then the next, until
you get back to where you started; but you will find that when you come
back, your normal points in the opposite direction to the one you started
with (so the field is not even continuous, let alone smooth).
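The sign flip along the Möbius strip can also be seen numerically. The sketch below uses a standard half-twist parametrisation (the formulas are our own choice) and compares the normal vector obtained at the start and at the end of one trip around the strip, at the same point of the surface.

```python
import math

def sigma(t, s, r=1.0):
    """A standard Moebius-strip parametrisation (half-twist as t runs 0..2*pi)."""
    w = r + s * math.cos(t / 2)
    return (w * math.cos(t), w * math.sin(t), s * math.sin(t / 2))

def normal(t, h=1e-6):
    # numerical partial derivatives on the centre circle (s = 0), then cross product
    dt = [(a - b) / (2 * h) for a, b in zip(sigma(t + h, 0), sigma(t - h, 0))]
    ds = [(a - b) / (2 * h) for a, b in zip(sigma(t, h), sigma(t, -h))]
    return (dt[1] * ds[2] - dt[2] * ds[1],
            dt[2] * ds[0] - dt[0] * ds[2],
            dt[0] * ds[1] - dt[1] * ds[0])

n0, n1 = normal(0.0), normal(2 * math.pi)
print(n0, n1)  # same point on the strip, but the two normals point oppositely
```

At t = 0 the normal comes out as approximately (0, 0, -1), at t = 2π as approximately (0, 0, 1): following the normal continuously around the strip flips it, so no global continuous unit normal field exists.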

Now that we have developed the most basic tools we need to do differential
geometry on surfaces (the analogues of τ and N for curves), we can proceed to
talk about curvature.

2.2 The curvature of surfaces in R3

We will find that the curvature of a surface at a point is described by an
object called the second fundamental form5 . There are three equivalent ways to
define the curvature of a surface, which are summarized below.
A Through the curvature of curves through p (geometric definition, Qp )
B By using the Hessian of M , regarded as a graph over its own tangent plane
Tp M (2nd fundamental form, Ap (X, Y ))
C By defining the Weingarten map Wp
Furthermore, we will prove two useful formulas for explicitly computing curvature:
D Formula via an appropriate parametrisation
E Formula for a graph of a function
4 (cont.) of the strip of paper together, but in a way that the strip turns once. If you never
did this, you should try.
5 You might wonder whether there is a first fundamental form. Yes there is, but for
pedagogical reasons, we will introduce it a bit later.



We shall start with the geometric definition of curvature, since it is the most intuitive
one.

2.3 The Geometric Definition of Curvature on Surfaces

The first way to define curvature on a surface is using curves that lie on the surface.
If the surface is curved, then any curve passing through the point of interest p
has to be curved as well. We cannot just take the curvature of some curve on
M going through p, simply because this will give us different values for different
curves. (There are infinitely many curves going through p that live on M ,
all curved differently, see figure 2.6.) There is, however, a specific amount of curving
that a curve must do at the point p to stay on M ; otherwise it would leave the
surface. This amount of curvature is the component in the direction of N . Why?
Well, in the tangent directions, we can have pretty much any curvature we want;
the curve can be as curved as it wants within M . That pretty much leaves only the
direction of N . From this simple thought follows the next definition.

Figure 2.6: A surface M and a point p. There are many curves going through
p, some less, some more curved. (They all have to live on M though.)

Definition 2.3.1: Geometric definition of curvature

Let M be a surface and N the unit normal vector field. Let p ∈ M , v ∈ Tp M
with |v| = 1, and let γ(t) be any curve in M with some parametrisation t,
such that γ(0) = p, γt (0) = v. Then define:

$$Q_p^N(v) = Q_p(v) = \langle \kappa_\gamma(0), N(p) \rangle$$

What we basically do is define a function that gives us the normal component
of the curvature of a curve going through p on M , with its tangent vector matching
v. See figure 2.7.

Figure 2.7: The idea of the previous definition. The function Qp takes a
(normalized) v from the tangent plane, takes a curve through p with a
matching tangent vector, and puts out the normal component of κγ .

There are two questions you might already have asked yourself. Firstly, why do
we only define this when |v| = 1? That is an easy question to answer: mostly
convenience. It simplifies the proofs and calculations while not leaving out any
real information. Let’s say you want to know Qp (v) for some vector of length
2. Then you can do this entire procedure for the same vector, but normalized, and
simply parametrise the curve γ in such a way that you move twice as fast along it.
The second question is whether this definition makes any sense, whether it
is well-defined, because we have, after all, many curves going through p with their
tangent vector equaling v, and it is not obvious that we always get the same number
for Qp (v) for any choice of appropriate curve. This will turn out to be true, however,
as we will prove shortly. For now, we will assume that this is true
and talk about the object Qp a bit.

2.3.1 A bit about Qp

We want to record a few notes and intuitive ideas about what Qp is and what it
describes at this point.

• Firstly, Qp (v) is not standard notation; it will turn out that Qp is
simply the second fundamental form in a slightly different guise, which
will be denoted Ap (X, Y ), and which we will introduce shortly.

• Qp (v) will turn out to be a quadratic form in the components of v, i.e. it will
be of the form $Q_p(v) = \sum_{i,j=1}^{2} a_{ij} v^i v^j$, where the aij are some numbers that
come from the surface’s geometry and v^i is the i-th component6 of v.

• Qp (v) = Qp (−v). This follows from the fact that for an appropriate curve
which we use to calculate Qp (v), we can simply take the same curve, but
reverse its parametrisation. Then the tangent vector changes direction (or
sign), but the curvature vector stays the same, and therefore by definition
Qp (v) does too7 .

• Qp (v) can be either positive or negative; there are no general restrictions on
its sign. The signs hold geometric information however. Also, if we use −N
as our normal field, the sign of Qp (v) flips, simply because N is in the scalar
product.

• The largest and smallest values of Qp belong to two normalized vectors e1 , e2
that are orthogonal to each other. Knowing these two directions and their
Qp values is also enough to specify Qp (v) for all v in Tp M with unit length.
These values are most often denoted kmax = k1 and kmin = k2 . The
quantities k1 and k2 are called the principal curvatures at p, while the two
directions e1 and e2 are called the principal directions/axes of curvature.

We will prove many of these claims soon; for now we just want to familiarize you
with these properties.

6 Notice the index is upstairs. There is a reason for this sort of notation, where upstairs
and downstairs indices have different meanings. For now, however, there really is no difference,
but we will use this notation to get you accustomed.
7 Prove that the tangent vector changes sign, but the curvature doesn’t.
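A minimal numerical sketch of the first two bullet points, with made-up coefficients a_ij (not derived from any particular surface): a form built purely out of products v^i v^j is automatically even in v, which is the algebraic face of Qp (v) = Qp (−v).

```python
# Hypothetical symmetric coefficients a_ij of the quadratic form Q_p:
a = [[1.5, -0.5],
     [-0.5, 3.0]]

def Q(v):
    # Q_p(v) = sum over i, j of a_ij * v^i * v^j
    return sum(a[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

v = (0.6, 0.8)  # a unit vector
# Replacing v by -v leaves every product v^i v^j unchanged:
print(Q(v), Q((-v[0], -v[1])))  # equal
```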

Figure 2.8: A sphere, the north pole, a random unit tangent vector and a
great circle.

Example: A sphere of radius R

As always, a sphere is a good way to find an intuitive understanding of
what an object like Qp (v) describes. Let us assume that for all v in Tp M
of unit length it really is enough to calculate Qp (v) for a single curve, and
that this number will be the same for all other appropriate curves. Then
we can calculate Qp (v) very easily. Because the sphere is very symmetric,
we can get away with just calculating Qp at the north pole, and expect
it to be the same (geometrically) at all other pointsa . The Cartesian
coordinates of the north pole are (0, 0, R) and the normal vector (we choose
the inward normal vector) is N = (0, 0, −1). Let’s say we are given a
v = (v^1 , v^2 , 0) to calculate Qp (v) of. For this v, we can choose the curve to
be the appropriately parametrised great circle lying in the plane spanned
by N and v. The curvature vector of this great circle at p must, of course,
be (1/R)N , so that

$$Q_p(v) = \langle \kappa_\gamma, N \rangle = \left\langle \tfrac{1}{R} N, N \right\rangle = \frac{1}{R} \qquad (2.2)$$

Also, this is independent of v, which really just says something about the
(local) symmetry of the sphere.
a You can alternatively see this as us using special (Cartesian) coordinates, for
which the point of interest p on the sphere has coordinates (0, 0, R). Then,
since Qp is defined by a scalar product of two fundamentally geometric vectors, it
should transform geometrically.

Example: Independence of Qp from the curve on S^2

We have calculated Qp for the sphere. We will now do the same thing
again, but with a different curve, and observe how it is that we get the same
result. We have the same situation as in the last example, but this time we
pick a different curve that is not a great circle. Instead, let’s take a circle
on the sphere that passes through p and has the tangent vector v, but is
smaller than a great circle. See figure 2.9. Because the circle is smaller,
κβ is a lot larger than κγ from the last example. But κβ also points in a
different direction, not in the direction of the normal vector of the sphere
at the north pole. When we projecta κβ back onto N , we get the same
result (prove this). In this way, the projection onto N ensures that we get
the same result, no matter the curve.
a Because we are calculating a scalar product

Figure 2.9: Here we have the same situation as in the previous figure, but
instead of a great circle, we take a smaller circle to compute Qp (v). We
have however rotated the sphere (v is pointing towards you), so that it is
simpler to see what is going on, namely that the curvature vector of the
smaller circle is far larger, but that its projection onto N is still the same.
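The curve-independence claimed here can be checked numerically. The sketch below (our own setup) uses the fact, proved just below, that ⟨κγ (0), N (p)⟩ = ⟨γtt (0), N (p)⟩ for curves with unit tangent: writing the sphere near the north pole as a graph, this scalar product is just −z''(0).

```python
R = 2.0  # example sphere radius

def Qp(curve, h=1e-4):
    """<gamma_tt(0), N(p)> for a planar path t -> (x(t), y(t)) lifted onto the
    sphere near the north pole p = (0, 0, R); with the inward normal
    N = (0, 0, -1), the scalar product picks out -z''(0)."""
    def z(t):
        x, y = curve(t)
        return (R**2 - x**2 - y**2) ** 0.5
    return -(z(h) - 2 * z(0.0) + z(-h)) / h**2

# Two different curves with the same unit tangent v = (1, 0, 0) at t = 0:
q_great = Qp(lambda t: (t, 0.0))   # the great circle in the xz-plane
q_other = Qp(lambda t: (t, t**2))  # a curve that also bends away in y

print(q_great, q_other)  # both approximately 1/R = 0.5
```

The second curve has a much larger curvature vector, but its extra bending is tangential; the normal component is the same 1/R.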

2.3.2 The independence of Qp (v) from the curve we choose


Now that we have a bit more intuition/information about Qp (v), we want to prove
that it is independent of our choice of curve, so long as the curve lies on M and
has the tangent vector v.

Theorem 2.3.1: Qp is well-defined

Let M be a surface, p some point on the surface and v some unit vector
in Tp M . Then Qp (v) = ⟨κγ (0), N (p)⟩ is independent of our choice of the
appropriate curve γ. (Appropriate means that γ lies on M , γ(0) = p and
that the tangent vector γt (0) = v.)

To prove this theorem, we will need the lemma below.

Lemma 2.3.1
Let everything be the same as in the theorem above. Then

⟨κγ (0), N (p)⟩ = ⟨γtt (0), N (p)⟩ (2.3)

What the lemma says is basically that instead of the geometric curvature κ of
the curve, we can use the second derivative of γ.
The proof is quite simple.
Proof. Let γ be any appropriate curve. We simply compute ⟨κγ (0), N (p)⟩, using
the fact that

$$\kappa_\gamma = \frac{1}{|\gamma_t|^2}\left(\gamma_{tt} - \langle \gamma_{tt}, \gamma_t \rangle \frac{\gamma_t}{|\gamma_t|^2}\right)$$

We can use the fact that |γt | = |v| = 1, which simplifies the formula for κ:

$$\langle \kappa_\gamma(0), N(p) \rangle = \langle \gamma_{tt} - \langle \gamma_{tt}, \gamma_t \rangle \gamma_t , N(p) \rangle = \langle \gamma_{tt}(0), N(p) \rangle \qquad (2.4)$$

where in the last step we have used the geometric fact that γt = v is in the
tangent plane, and therefore orthogonal to N .

Curvature is a second derivative

Before we go on to prove the theorem, we will discuss a big idea the proof relies on,
which is an overarching theme in differential geometry: the idea that curvature is
a second derivative. This is something that is fully universal throughout differential
geometry. This fact, however, shows up only if you either use some smart ”natural”
coordinates or differentiate with respect to a geometric quantity. In general coordi-
nates, it is usually an equation containing a second derivative but with some other
non-geometric ”decorations”.
For a curve, we defined the curvature vector to be:

$$\kappa = \frac{d^2\gamma}{ds^2} \qquad (2.5)$$

It is manifestly a second derivative, and because we are differentiating with respect
to a geometric quantity (s), it is simply just the second derivative. Earlier, while
discussing curves, we saw that, if you can write the curve as a graph of a function
y = u(x), then the curvature becomes:

$$k = \frac{u_{xx}}{(1 + u_x^2)^{3/2}} \qquad (2.6)$$

At any point whose tangent is parallel to the x-axis, ux = 0 and:

$$k = u_{xx} \qquad (2.7)$$

It is clear that we can also (locally) write the curve as a graph over its tangent, as
seen in figure 2.10, where we again find that the curvature scalar becomes a second
derivative.
We will see that the curvature of a surface also becomes a second derivative
when you write the surface as the graph of a function over its tangent plane (locally).
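As a sketch of this idea, one can check numerically that the graph formula for k reproduces the constant curvature of a circle, whose tangent is horizontal only at one point (the function and step size below are our own choices):

```python
R = 3.0  # example radius

def u(x):
    return (R**2 - x**2) ** 0.5  # upper half of a circle, as a graph

def k(x, h=1e-5):
    """Curvature of the graph y = u(x) via k = u_xx / (1 + u_x^2)^(3/2)."""
    ux = (u(x + h) - u(x - h)) / (2 * h)
    uxx = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
    return uxx / (1 + ux**2) ** 1.5

# At x = 0 the tangent is horizontal, so k reduces to u_xx there;
# away from x = 0 the denominator corrects for the tilted tangent:
print(k(0.0), k(1.0))  # both approximately -1/R (sign from orientation)
```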

2.3.3 Proof of the theorem


We will now, finally, get to the proof.
Proof.

Step 1 We start by choosing an orthonormal basis of R3 , so that p = (0, 0, 0),
Tp M is the xy-plane, and N = (0, 0, 1). This is mostly a matter
of convenience, and certainly doesn’t change ⟨κγ (0), N (p)⟩. Then we can
write the surface (locally) as the graph of a function, z = f (x, y). For a
picture see figure 2.11.
Step 2 Now let v ∈ Tp M . Then it will be of the form v = (v^1 , v^2 , 0) ∼ (v^1 , v^2 )
(we associate the xy-plane in R3 with R2 in the usual way).
Step 3 We now let γ be any curve on M with γ(0) = p and γt (0) = v and compute
⟨κγ (0), N (p)⟩. To do this, we notice that γ(t) = (x(t), y(t), z(t)), and
γtt (t) = (x′′ (t), y ′′ (t), z ′′ (t)), obviously. Then:

$$\langle \kappa_\gamma(0), N(p) \rangle = \langle \gamma_{tt}(0), N \rangle = \langle (x''(0), y''(0), z''(0)), (0, 0, 1) \rangle = z''(0) \qquad (2.8)$$

where we used the lemma from above. We have found the theme of the last
subsection: curvature is a second derivative. What we have basically
done is write (a local part of) the surface as a graph over its tangent plane
at p, and found that, similarly to curves, curvature becomes the second
derivative in those coordinates.
Step 4 Somehow, we need to use the fact that γ lies on the surface, which we have
not yet used. Of course, because γ does lie on the surface, its coordinates
have to be γ(t) = (x(t), y(t), f (x(t), y(t))), because the z-component (for
a local part, again) is not independent of x and y. If we now evaluate
z ′′ (0) we get:

$$z_t(t) = f_x(x(t), y(t))\, x_t(t) + f_y(x(t), y(t))\, y_t(t) \qquad (2.9)$$

$$z_{tt}(t) = f_{xx} x_t x_t + f_{xy} x_t y_t + f_x x_{tt} + f_{yx} x_t y_t + f_{yy} y_t y_t + f_y y_{tt} \qquad (2.10)$$

But we only need it at t = 0, where it simplifies a lot. Firstly, xt =
v^1 , yt = v^2 , because we are using a curve whose tangent vector is v. But

Figure 2.10: A curve that is just the graph of the function u(x). When
written as the graph of a function over its tangent, k becomes the second
derivative.

Figure 2.11: The coordinates we are using for the proof of the theorem.
The point p is at the origin and the tangent plane is the xy-plane.

also fx (0) = fx (x(0), y(0)) has to be 0 (and likewise fy (0) = 0), because
the xy-plane, which is Tp M , is tangent to the surface, which we describe
with the function f . So what we get is:

$$z_{tt}(0) = f_{xx} v^1 v^1 + 2 f_{xy} v^1 v^2 + f_{yy} v^2 v^2 \qquad (2.11)$$

Firstly, we have just proven that Qp (v) is a quadratic form, but we have
also proven that Qp (v) is independent of the curve we choose: the quan-
tities fxx , fxy and fyy have nothing to do with our choice of curve, only
with the underlying surface.

We can rewrite our result quite a bit and see another way of looking at it.

$$Q_p(v) = f_{xx} v^1 v^1 + 2 f_{xy} v^1 v^2 + f_{yy} v^2 v^2 \qquad (2.12)$$

$$= \begin{pmatrix} v^1 & v^2 \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \end{pmatrix} = v^T H v \qquad (2.13)$$

where H is the Hessian matrix of f evaluated at the point p. If we write x =
x^1 , y = x^2 (and z = x^3 ), we can write it as one sum:

$$Q_p(v) = \sum_{i,j=1}^{2} \frac{\partial^2 f}{\partial x^i \partial x^j} v^i v^j \qquad (2.14)$$

which makes the quadratic-form-ness of Qp a bit more visible. (The partials are of
course evaluated at p.)
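A small numerical sketch of this result (the surface f below is a made-up example with f_x = f_y = 0 at the origin): two different curves with the same tangent vector give the same z''(0), and both match the quadratic form v^T H v.

```python
# Hypothetical surface z = f(x, y) with f_x(0,0) = f_y(0,0) = 0, so the
# tangent plane at p = (0, 0, 0) is the xy-plane and N = (0, 0, 1):
def f(x, y):
    return x**2 - x * y + 3 * y**2

def Qp(curve, h=1e-4):
    """z''(0) = <gamma_tt(0), N> for gamma(t) = (x(t), y(t), f(x(t), y(t)))."""
    def z(t):
        x, y = curve(t)
        return f(x, y)
    return (z(h) - 2 * z(0.0) + z(-h)) / h**2

v = (0.6, 0.8)  # a unit tangent vector

q1 = Qp(lambda t: (v[0] * t, v[1] * t))                        # straight path
q2 = Qp(lambda t: (v[0] * t + 5 * t**2, v[1] * t - 3 * t**2))  # same tangent

# Both agree with the quadratic form v^T H v, H = Hessian of f at 0:
H = [[2.0, -1.0], [-1.0, 6.0]]
vHv = sum(H[i][j] * v[i] * v[j] for i in range(2) for j in range(2))
print(q1, q2, vHv)  # all approximately 3.6
```

The second curve has nonzero x_tt and y_tt, but those terms in (2.10) are multiplied by f_x = f_y = 0, which is exactly why the answer does not depend on the curve.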

2.4 The second fundamental form

Now that we have seen the geometric definition, we can manipulate it a tiny bit
and define the second fundamental form.

Definition 2.4.1
Let M and p be, as always, a surface and a point on it. Let us also use
coordinates in which we locally write the surface as the graph of a function f
over its tangent plane Tp M at p. Then the second fundamental form is defined as:

$$A_p(X, Y) = \sum_{i,j=1}^{2} \frac{\partial^2 f}{\partial x^i \partial x^j} X^i Y^j \qquad (2.15)$$

as a function Ap : Tp M × Tp M → R. The vectors X, Y are both in the
tangent space.

We can very quickly note that the second fundamental form is symmetric, in
the sense that Ap (X, Y ) = Ap (Y, X), simply because the Hessian is a symmetric
matrix. You can also see the second fundamental form as the first Taylor coefficient
of f that actually tells us anything geometric. The zeroth coefficient just tells
us where the surface is, the first only how the tangent plane is oriented, both of
which are not geometric things. Only the second one tells us anything geometric,
that is, the curvature at the point p.
Note 5. The formula above can only work if we use coordinates where p =
(0, 0, 0) and the xy-plane is the tangent space, because there fx and fy vanish.
We need to carefully pick this coordinate system to extract the geometric infor-
mation. There is a general formula in arbitrary coordinates, but it is long and
tedious, which is the reason why we won’t derive it here.
To connect the second fundamental form to curvature, we note that:

⟨κγ , N (p)⟩ = Ap (γt (0), γt (0)) (2.16)

2.4.1 Simplifying the second fundamental form

The second fundamental form is a bilinear form whose matrix is the Hessian of
the surface in our special choice of coordinates. It is also symmetric. This means
that we can diagonalize it by choosing good coordinates of Tp M . We can do that
since our choice of coordinates has an arbitrariness, in the sense that we only really
fixed the xy-plane to be Tp M and the z-axis to be along N , but we can still rotate
or flip around the z-axis and get another coordinate system that works exactly as
well as the original one. In linear algebra terms, we can perform an orthogonal
transformation (a rotation, a flip, or both) on Tp M (the xy-plane). We will use this
to our advantage. If you had a good course on linear algebra, you probably know
the singular value decomposition, which you can see summarized below.

Proposition 2.4.1: Singular Value Decomposition

Let’s say you have a scalar product (not necessarily the standard Euclidean
one) and a bilinear form. That is, you have the maps

$$\langle \cdot , \cdot \rangle : V \times V \to \mathbb{R} \qquad (2.17)$$
$$B : V \times V \to \mathbb{R} \qquad (2.18)$$

where the first has the usual properties of a scalar product and the second
is linear in both entries and symmetrica . Then there exists an orthonormal
basis e1 , e2 , . . . , en of the vector space V , so that Bij = B(ei , ej ) = λi δij
for some numbers λ1 , λ2 , . . . , λn . In other words, the matrix Bij is diagonal
with the λ’s on the diagonal. The λ’s are called the singular or principal
values.
a B(X, Y ) = B(Y, X)

If you have not seen the singular value decomposition before, it simply states
that you can rotate and flip your coordinate system so as to turn a particular bilinear
form into the form:

$$B(v, w) = \lambda_1 v^1 w^1 + \lambda_2 v^2 w^2 + \cdots + \lambda_n v^n w^n \qquad (2.20)$$

where v^i and w^i are the components of v, w in the new orthonormal system.
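For the 2×2 case relevant to Tp M , the diagonalizing rotation can be written down explicitly. The following sketch (our own helper, not from the text) finds the principal values and the rotation angle for a symmetric matrix [[a, b], [b, c]]:

```python
import math

def principal_values(a, b, c):
    """Diagonalize the symmetric bilinear form with matrix [[a, b], [b, c]]
    by a rotation of the plane: returns (lam1, lam2, theta) with
    lam1 >= lam2, where the orthonormal eigenbasis is
    e1 = (cos theta, sin theta), e2 = (-sin theta, cos theta)."""
    theta = 0.5 * math.atan2(2 * b, a - c)
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean + radius, mean - radius, theta

lam1, lam2, theta = principal_values(2.0, 1.0, 2.0)
print(lam1, lam2)  # 3.0 1.0, with theta = pi/4
```

Evaluating the form along e1 returns lam1, along e2 returns lam2, so in the rotated basis the matrix is diagonal, exactly as the proposition promises.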


This is what we want to use on the second fundamental form, since it is a
symmetric bilinear form. If we do find that special orthonormal coordinates system
of Tp M , we find that the matrix of Ap becomes:

k1 0
 
Ap = (2.21)
0 k2

where k1 and k2 are some numbers that only depend on the surface. They are the
principal curvatures we mentioned when discussing Qp (v), which you can see quite
simply. If we work in that coordinate system, the second fundamental form used on
the same vector twice becomes:

Qp (v) = Ap (v, v) = k1 v 1 v 1 + k2 v 2 v 2 (2.22)


or, if we use that |v| = 1 (as we used while first introducing the principal curvatures),
we can write v1 = cos(θ), v2 = sin(θ) and get:

Qp (v) = Ap (v, v) = k1 cos2 (θ) + k2 sin2 (θ) (2.23)

You can verify (exercise) that the maxima and minima happen when v = ±e1 , ±e2
in those special coordinates, which verifies that (1) k1 , k2 are the extreme values of
Qp (v) and that (2) the directions of the principal axes of curvature are orthogonal
to each other (e1 ,e2 are orthogonal per construction).
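A quick numerical sketch of this exercise (the values k1, k2 are our own example): sweeping θ confirms that the extreme values of k1 cos²(θ) + k2 sin²(θ) are exactly k1 and k2, attained a quarter turn apart.

```python
import math

k1, k2 = 2.0, -1.0  # example principal curvatures, k1 > k2

def Q(theta):
    return k1 * math.cos(theta)**2 + k2 * math.sin(theta)**2

thetas = [2 * math.pi * i / 10000 for i in range(10000)]
values = [Q(th) for th in thetas]
print(max(values), min(values))  # 2.0 and -1.0: exactly k1 and k2
```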

2.4.2 The mean and Gauss curvatures

The last two sections introduced the second fundamental form and its connection
to the principal curvatures. We will now define two new curvatures constructed
from k1 and k2 and give all four numbers some more meaning.
In figure 2.12 you can see the geometric meaning of k1 and k2 . The first one
tells us the maximal curvature (and e1 its direction), while the second tells us the
minimal curvature (and e2 its direction). From these two we construct the following
two numbers:

$$H = k_1 + k_2 \qquad (2.24)$$
$$K = k_1 k_2 \qquad (2.25)$$

Figure 2.12: The principal curvatures and axes of curvature on a rather
standard surface. We have k1 , which is positive, and k2 , which is negative.
Note that which is which (max/min) depends on the direction of N we
chose; for this picture, since k1 is the positive one, N points up. Also, the
principal axes are orthogonal to each other, even if it might not look like it
in the picture because of perspective.

The first, H, is called the mean curvature, while the second, K, is called the
Gauss curvature. You might wonder why H is called the mean curvature when it
should be $\frac{k_1 + k_2}{2}$. That used to be the old definition, but over time, people got sick
of writing that 2 everywhere, and redefined H.



What is the meaning of H? To understand it, we hope that by now you are
convinced that the matrix of Ap in our special coordinates is just the Hessian
of the function that describes our surface, and that when we rotate Tp M , this
does not change. What does change is that the Hessian becomes diagonal in these
coordinates (which we will denote x′ , y ′ ). So what we get for that matrix is:

$$A_p = \begin{pmatrix} \frac{\partial^2 f}{\partial x'^2} & 0 \\ 0 & \frac{\partial^2 f}{\partial y'^2} \end{pmatrix} = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} \qquad (2.26)$$

So that:

$$H = k_1 + k_2 = \frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2} \qquad (2.27)$$

which is the two-dimensional Laplacian of f . Now, the Laplacian tells us quite a bit
about average quantities, which also makes it clearer why H is called the mean
curvature. It tells us how much the function f deviates (on average) from its value
at (0, 0) in our coordinates, which is, of course, 0. It makes sense that this would be
a good value to describe part of the curvature. Because the Laplacian is inevitably
tied up in (physical) problems like drums and soap films, it first came up well before
curvature in this sense was discussed. Surfaces with H = 0 are called minimal
surfaces, not only because of the history of the Laplacian, but also because they
are usually surfaces of minimal area for some boundary problem.

Figure 2.13: The signs of k1 and k2 change with the other choice of N .
If we choose N1 as in the picture, k1 is positive, while if we use N2 , it is
negative. The absolute value stays the same however.

Now what about the Gauss curvature? Well, it has a very nice property: its sign
is independent of our choice of N , unlike k1 , k2 , and H. First regard k1 and k2 .

Figure 2.14: The Laplacian at a point is (up to a factor) the average difference
between f (x0 ) and f (x0 + ϵe), where e ranges over all possible directions
and ϵ is a parameter that goes to 0.

Look at figure 2.13. There you should see how k1 changes depending on the choice
of our normal. If we choose N1 , then the normal and the curvature vector of the
curve belonging to k1 point in the same direction; if we choose the other direction,
they point in opposite directions.
The fact that the sign of H tells us nothing geometric either can be seen alge-
braically quite easily:

$$H = k_1 + k_2 \qquad (2.28)$$

Therefore, if we change the signs of k1 and k2 , it is clear that the sign of H changes.
You can also look at this through the meaning of H: H is the two-dimensional
Laplacian of the function that the surface is a graph of (in good coordinates).
If we reverse the direction of N , we practically just take −f as the function M is a
graph of (locally). But the Laplacian is linear, which means it changes sign too, and
since H is just the Laplacian, it does too. The Gauss curvature, on the other hand,
does not change sign: if you change the signs of k1 and k2 , their product, which is
the Gauss curvature, does not change. The sign of the Gauss curvature captures
the relationship between the signs of the principal curvatures. If they have the same
sign (++ or −−), then K will be positive; otherwise it will be negative. The sign
relationship of the principal curvatures is a very geometric thing. In particular, if
both principal curvatures have the same sign, then the curves they belong to curve
in the same direction, and if they have opposite signs, in opposite directions. In the
first case you have something that looks like a bulge, in the other something that
looks like a saddle point, as you can see in figure 2.15.

Figure 2.15: Typical example for each of the possible signs of the Gauss
curvature. If it is positive, both curves curve in the same direction, creating
a bulge. Otherwise, they curve in opposite directions and create a saddle.

There is another way to express H and K that might be more enlightening. If,
as always, we call the matrix of the second fundamental form (so the Hessian of f )
A, then we get:

$$H = \operatorname{tr}(A) \qquad (2.29)$$
$$K = \det(A) \qquad (2.30)$$

as you should check yourself (exercise). These formulas are important, as they
illustrate the geometric character of H and K.
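These formulas can be sketched directly (the example Hessians below are our own, for a bulge and a saddle written as graphs over their tangent plane at the origin):

```python
# Made-up example Hessians of graphs over their tangent plane at the
# origin (f_x = f_y = 0 there, so the matrix of A_p is just the Hessian):
surfaces = {
    "bulge":  [[2.0, 0.0], [0.0, 1.0]],   # e.g. f = x^2 + y^2 / 2
    "saddle": [[2.0, 0.0], [0.0, -1.0]],  # e.g. f = x^2 - y^2 / 2
}

for name, A in surfaces.items():
    H = A[0][0] + A[1][1]                      # mean curvature  H = tr(A)
    K = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # Gauss curvature K = det(A)
    print(name, H, K)
# bulge:  H = 3.0, K = 2.0  (K > 0: the principal curvatures agree in sign)
# saddle: H = 1.0, K = -2.0 (K < 0: they disagree)
```

Replacing A by −A (the other choice of N ) flips the trace but leaves the 2×2 determinant unchanged, which is the algebraic reason the sign of K is geometric while the sign of H is not.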

Example: The Sphere

We start with the standard example of a surface, a sphere. Let it have
radius R and a normal N = −x/R that points into the sphere. Then, for
any direction we pick, a great circle will have curvature vector (1/R)N ,
which means that the second fundamental form will be Ap (X, Y ) = (1/R)⟨X, Y ⟩
and the principal curvatures will be k1 = k2 = 1/R. The mean curvature
will be H = k1 + k2 = 2/R and the Gauss curvature will be K = 1/R^2 .
From the sign of K we would expect a (locally) bulge-like surface, which
the sphere of course is. To illustrate the power of the geometric approach,
we invite you to also derive the results above purely algebraically,
which will take a bit longer. You can restrict yourself to the north pole
p = (0, 0, R) and N = (0, 0, −1), and calculate the Hessian of f , where the
sphere is written as a graph over its tangent plane (with the vertical axis
along N ). It is a simple calculation, but quite error-prone and rather
annoying. You can do it yourself to see it, and probably already did quite
a few times throughout your studies. The result is that:

$$D^2(f) = \begin{pmatrix} 1/R & 0 \\ 0 & 1/R \end{pmatrix} \qquad (2.31)$$

from which our results from above follow. If you actually did the calculation,
you are probably happy about the power of geometry.
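The ”annoying” calculation can also be delegated to finite differences; the sketch below (our own setup) computes the Hessian of the north-pole cap written as a graph z = sqrt(R² − x² − y²). Note the sign convention: with this z-axis the Hessian is −(1/R)I, and pairing with the inward normal N = (0, 0, −1) flips it to +(1/R)I.

```python
R = 2.0  # example radius

def f(x, y):
    # north-pole cap of the sphere of radius R, as a graph
    return (R**2 - x**2 - y**2) ** 0.5

def hessian(g, x, y, h=1e-4):
    """Numerical Hessian of g at (x, y) via central differences."""
    gxx = (g(x + h, y) - 2 * g(x, y) + g(x - h, y)) / h**2
    gyy = (g(x, y + h) - 2 * g(x, y) + g(x, y - h)) / h**2
    gxy = (g(x + h, y + h) - g(x + h, y - h)
           - g(x - h, y + h) + g(x - h, y - h)) / (4 * h**2)
    return [[gxx, gxy], [gxy, gyy]]

Hf = hessian(f, 0.0, 0.0)
print(Hf)
# approximately [[-1/R, 0], [0, -1/R]]; with the inward normal this gives
# A_p = (1/R) * I, hence k1 = k2 = 1/R, H = 2/R, K = 1/R^2.
```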

2.5 Symmetry and Curvature

Now that we have found the right tool to understand the curvature of a surface, that
is, the second fundamental form, we want to make our computations as quick as
possible. A way to do this is to exploit the symmetry that a surface has. Take the
sphere, for example. It is the most symmetric surface you can think of: any rotation
or reflection fixing the origin preserves the sphere. We used symmetry in
the previous section to very quickly calculate the second fundamental form and the
different curvatures. We now want to state and prove a theorem that allows us
to use a specific kind of symmetry to make the problem almost trivial, whenever the
surface has that kind of symmetry.

Theorem 2.5.1: Symmetry and Curvature

Let M and p, as always, be a surface and a point on the surface. Let S be some isometry of R3 which preserves M; in particular, let S be a reflection. There are a few requirements for the idea to work:

• The point p for which we are calculating the second fundamental form is a fixed point: S(p) = p.
• The tangent plane Tp M is preserved by S, which means S(Tp M) = Tp M. In other words, the plane P across which we reflect our space is perpendicular to Tp M.

Then, as I hope you can agree, S acts on the tangent plane like a reflection through a line L which passes through the origin. The direction of this line, as well as the direction orthogonal to it, are then the principal axes of curvature.

You can see the setup of the situation in figure 2.16 and the way the symmetry acts on the tangent plane in figure 2.17. I hope you can appreciate how useful this

Figure 2.16: A surface that is symmetric under a reflection across the plane
P . The plane P is perpendicular to the tangent plane Tp M , and the line of
intersection of the two planes (L) is also drawn in.

theorem is. It is very easy to recognize a reflective symmetry like the one we need,
and if we do find it, we have pretty much immediately found the axes of curvature.
We then only need to calculate Ap for two combinations of e∥ and e⊥ to specify
Ap completely, and often these are quite easy to find (like with the sphere).

Figure 2.17: Here we drew the tangent plane and the principal axes of curvature. The symmetry acts as a reflection through the line L, and the two directions e∥, e⊥ turn out to be the axes of curvature (per the theorem we prove in this section).

Proof. Let e⊥ and e∥ be an orthonormal basis of Tp M so that the first vector is orthogonal to L and the second is parallel to it. We need to prove that in these two axes A is diagonal. First, note that if k1 = k2, then A = k1 I, which is already diagonal, so we can assume k1 ≠ k2. We will show that any orthonormal basis diagonalizing A must, up to signs, be {e∥, e⊥}.⁸

Step 1 Our first claim is that Ap cannot change under the symmetry as it is a
geometric object. As a formula, we claim

Ap (S(X), S(Y )) = Ap (X, Y ) (2.32)

This is easy to see geometrically, because the curves flip under the sym-
metry.
Step 2 Let's assume there is an orthonormal basis of Tp M which diagonalizes A, and let's call these basis vectors e1, e2. Then S(e1), S(e2) also diagonalize A. This is easy to see because of the previous step: A(S(e1), S(e2)) = A(e1, e2) = 0, and A(S(e1), S(e1)) = A(e1, e1) = k1, and similarly for the other combinations. In that basis, A has to have the form

\[
A = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} \tag{2.33}
\]

Step 3 We now want to show that S(e1) cannot lie in the direction of e2, and vice versa. Well, we know that the maximal and minimal values of curvature lie in directions that are perpendicular to each other; see figure 2.18 for a picture of an Ap with k1 ≠ k2. It should be clear that the maximum (k1) cannot occur for a vector between e1 and e2, and the same goes for the minimum. Therefore S(e1) can also not have a component in the e2 direction.

⁸ This is why we excluded the case k1 = k2: in that case any orthonormal basis diagonalizes A.

Step 4 But then S(e1 ) has to lie along e1 and be of unit length (since S is an
isometry) and the same goes for e2 . So we get the possibilities:

S(e1) = ±e1 (2.34)

S(e2) = ±e2 (2.35)

Step 5 But which vectors have this property? Well, S is a reflection through L, so the only vectors with this property are the ones lying along L, which get mapped onto themselves, and the ones lying on the line perpendicular to L (L⊥), as you can see in figure 2.19. Therefore:

e1, e2 ∈ {±e∥, ±e⊥} (2.37)

But this is exactly what we wanted to prove9 .

This theorem is very useful when computing curvatures and we will now provide
a few examples.

2.5.1 Examples: Determining Ap quickly through Symmetry

We want to discuss a few examples of surfaces and their curvatures.

⁹ The fact that we can also have a ± in front of the vectors in the set is not important, as all these combinations diagonalize A.



Figure 2.18: A typical way Ap looks if k1 ≠ k2, plotted over the circle of unit vectors in Tp M. Here we took one where both curvatures are positive to see it better, but you can easily visualize the other possibilities by moving the curve up and down.

Figure 2.19: The tangent plane (drawn twice) and the way vectors get mapped onto other vectors when you reflect through L. The vectors drawn in purple are the only ones that get mapped onto the same line they already lay on.

Proposition: The cylinder

We start with an example slightly more complicated than the sphere: a cylinder of radius R and a point p on it, as in figure ??, for which we want to calculate the curvature. Unlike for the sphere, we cannot take any plane as a symmetry, but the cylinder does have two planes (for the point p) that can be used. Firstly, we have the xy-plane. It is clear that it is perpendicular to Tp M, that p stays where it is, and that the horizontal line drawn in purple also gets mapped to itself. Therefore we know that this is our first principal axis. The other one has to be orthogonal to it in Tp M, so it has to point up. We can then very easily calculate the principal curvatures. The first one is the curvature of a straight line, which is zero: k1 = 0. The other one is the curvature of a circle of radius R, so k2 = 1/R. We also have H = 1/R and K = 0 (verify all of these). Because K = 0, the cylinder turns out to be flat! An intuitive explanation is that you can use scissors to cut along the cylinder and roll it out onto a flat piece of paper.
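The "verify all of these" can also be done by machine, in the same graph setup as for the sphere (our own choice of setup: near p = (R, 0, 0), the cylinder is a graph over its tangent plane, with coordinates y and z):

```python
import sympy as sp

y, z, R = sp.symbols('y z R', positive=True)

# Near p = (R, 0, 0), the cylinder x^2 + y^2 = R^2 is the graph, over T_pM
# (coordinates y, z), of the inward deviation f(y, z) = R - sqrt(R^2 - y^2).
f = R - sp.sqrt(R**2 - y**2)

hess = sp.hessian(f, (y, z)).subs(y, 0)  # second fundamental form at p
k1, k2 = hess[0, 0], hess[1, 1]          # principal curvatures
print(k1, k2, k1 + k2, k1 * k2)          # 1/R 0 1/R 0
```

The vertical direction contributes nothing (f does not depend on z), which is exactly the flat direction of the cylinder.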

Proposition: The Catenoid

The catenoid is the surface of revolution of the function cosh(z). You take the graph of cosh(z) as a function of z and rotate it around the z-axis to get a surface that looks a bit like a sci-fi picture of a wormhole, as you can see in figure 2.21. We have drawn in a point p lying on the "original" graph of cosh(z), on the side pointing in the x-direction; its position on that line is arbitrary, though. For that point, we can use the xz-plane exactly like with the cylinder (although the orientation is different) and get that the line the reflection preserves is the one tangent to the graph. So one of the principal directions is tangent to the graph. The other one is parallel to the y-axis, since it needs to be orthogonal and lie in the tangent plane. We leave it as an exercise to you to show that the principal curvatures have opposite signs, namely k1 = 1/cosh²z and k2 = −1/cosh²z, so that H = 0 (the surface is a minimal surface) and K = −1/cosh⁴z is negative, which means the surface is locally saddle-shaped.

The catenoid is historically quite a special surface, because it is based on the function cosh. The graph of cosh is the catenary, which first appeared as the solution of the chain problem: quite simply, what shape does a chain take if you hold it at two points and let it hang?
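The exercise above can be checked symbolically. The sketch below is our own: it parameterizes the catenoid as (cosh v cos u, cosh v sin u, v) (a parameterization consistent with "rotate the graph of cosh around the z-axis"), builds the first and second fundamental forms, and evaluates the principal curvatures numerically at an arbitrary point. The individual signs of k1, k2 depend on the choice of normal.

```python
import sympy as sp

u, v = sp.symbols('u v')

# Catenoid, parameterized by rotating the graph of cosh around the z-axis.
F = sp.Matrix([sp.cosh(v)*sp.cos(u), sp.cosh(v)*sp.sin(u), v])

X1, X2 = F.diff(u), F.diff(v)              # basis of the tangent plane
n = X1.cross(X2) / X1.cross(X2).norm()     # unit normal
g = sp.Matrix([[X1.dot(X1), X1.dot(X2)],
               [X2.dot(X1), X2.dot(X2)]])  # first fundamental form
A = sp.Matrix([[F.diff(u, u).dot(n), F.diff(u, v).dot(n)],
               [F.diff(v, u).dot(n), F.diff(v, v).dot(n)]])  # second fundamental form
k1, k2 = (g.inv() * A).eigenvals()         # principal curvatures

pt = {u: 0.3, v: 0.5}                      # an arbitrary sample point
vals = sorted(float(k.subs(pt)) for k in (k1, k2))
print(vals, float(1/sp.cosh(0.5)**2))      # the two values are +-1/cosh(z)^2
```

In particular k1 + k2 = 0, confirming H = 0: the catenoid is a minimal surface.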

Proposition: The helicoid

The helicoid is another surface we want to introduce; it is an example for which the symmetry argument does not work.

2.6 Interlude: A bit about Differentiation

Before continuing our discussion of curvature, we want to stop for a minute and
discuss how one defines derivatives on surfaces and fix the notation. We start by
introducing standard calculus notation and then defining derivatives on surfaces.

Definition 2.6.1: Differential of f : Rn → R

Let f : Rn → R be any smooth function that assigns to each point x = (x1, . . . , xn) ∈ Rn a value in R. We define the differential of f at x to be:

\[
df(x) = Df(x) := \left( \frac{\partial f}{\partial x^1}, \dots, \frac{\partial f}{\partial x^n} \right)\bigg|_x : \mathbb{R}^n \to \mathbb{R} \tag{2.38}
\]

Definition 2.6.2: Differential of f : Rn → Rm

Let f : Rn → Rm be a smooth function, where we write f = (f^α)_{α=1}^m = (f^1, . . . , f^m). Similarly to the previous definition, we define the differential of f at x to be:

\[
df(x) = Df(x) = \begin{pmatrix} \frac{\partial f^1}{\partial x^1} & \cdots & \frac{\partial f^1}{\partial x^n} \\ \vdots & & \vdots \\ \frac{\partial f^m}{\partial x^1} & \cdots & \frac{\partial f^m}{\partial x^n} \end{pmatrix}\Bigg|_x : \mathbb{R}^n \to \mathbb{R}^m \tag{2.39}
\]

In both cases we defined the differential to be a linear map between the same spaces as f; the affine map y ↦ f(x) + Df(x)(y − x) agrees with f to first order near x. We now want to define the directional derivative.

Definition 2.6.3: Directional derivative


Let f : Rn → Rm be a smooth function like in the above definition, x ∈ Rn the point at which we want to evaluate the directional derivative, and X ∈ Rn the direction in which we want to evaluate it. (The first is a point, the second is a vector.) Then we define the directional derivative as:

\[
(D_X f)(x) = Df(x)(X) = \begin{pmatrix} \frac{\partial f^1}{\partial x^1} & \cdots & \frac{\partial f^1}{\partial x^n} \\ \vdots & & \vdots \\ \frac{\partial f^m}{\partial x^1} & \cdots & \frac{\partial f^m}{\partial x^n} \end{pmatrix}\Bigg|_x \begin{pmatrix} X^1 \\ \vdots \\ X^n \end{pmatrix} \in \mathbb{R}^m \tag{2.40}
\]

which we can rewrite like this:

\[
D_X f(x) = \sum_{\alpha=1}^{m} \sum_{j=1}^{n} \frac{\partial f^\alpha}{\partial x^j} X^j \, e_\alpha \tag{2.41}
\]

where eα is the standard basis of Rm.

Proposition

Let γ(t) ∈ Rn be any smooth curve with γ(0) = x and γt(0) = X. Then:

\[
D_X f(x) = \frac{d}{dt}\Big|_{t=0} f(\gamma(t)) \tag{2.42}
\]

This equation is very useful when calculating derivatives, but it is also quite intuitive. Simply put, to differentiate a function in the direction of X, take any curve that has (at the point of interest) its tangent vector equal to X and differentiate along that curve. Locally, they will look the same, after all. You can look at figure 2.23 for an example.

Proof. The proof of this Proposition is very simple. You basically use the chain
rule once, and get exactly what you need.

\[
\frac{d}{dt}\Big|_{t=0} f(\gamma(t)) = Df(\gamma(0))\,\gamma_t(0) = Df(x)\,X = D_X f(x) \tag{2.43}
\]

The last equation is just the definition of DX f (x). Notice that the multiplication
is a matrix multiplication.
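The proposition is also easy to see numerically. In the sketch below, f, the point, the direction, and the curve are all arbitrary choices of ours; the point of the experiment is that the extra t² wiggle of the curve does not change the derivative at t = 0.

```python
import numpy as np

# A sample function f : R^2 -> R^2 and its Jacobian, written out by hand.
def f(v):
    return np.array([v[0]**2 * v[1], np.sin(v[1])])

def Df(v):
    return np.array([[2*v[0]*v[1], v[0]**2],
                     [0.0,         np.cos(v[1])]])

x = np.array([1.0, 2.0])    # point of evaluation
X = np.array([0.5, -1.0])   # direction

# A curve through x with tangent X at t = 0, plus an arbitrary t^2 wiggle
# that must not affect the derivative at t = 0.
def gamma(t):
    return x + t * X + t**2 * np.array([3.0, -7.0])

h = 1e-6
numeric = (f(gamma(h)) - f(gamma(-h))) / (2 * h)   # d/dt f(gamma(t)) at t = 0
exact = Df(x) @ X                                   # D_X f(x), equation (2.40)
print(np.max(np.abs(numeric - exact)))  # a tiny number (discretization error)
```

Any other curve with the same position and tangent at t = 0 would give the same answer, which is exactly the content of (2.42).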

2.6.1 Vector Fields on R3


Now that we have fixed the notation of derivatives in Rn , we want to talk about
vector fields, restricting ourselves to the case of R3 .

Figure 2.21: The catenoid. It is the surface of revolution of the function cosh(z).

Definition 2.6.4: Vector field on R3

A vector field on R3 is a function X : U ⊂ R3 → R3, where U is some open^a subset of R3.

^a We choose U open so that we can differentiate at every point.

We can also define the derivative of one vector field in the direction of another vector field, which we do pointwise.

Definition 2.6.5: Derivative along a vector-field

Let X, Y be two smooth vector fields on R3. Then we define the derivative of Y along X as:

\[
D_X Y(p) = D_{X(p)} Y = (DY)(p)\,(X(p)) = \sum_{i,j=1}^{3} X^i(p)\, \frac{\partial Y^j}{\partial x^i}\bigg|_p \, e_j \tag{2.44}
\]

where ej is the standard basis of R3.

What we get is another vector field. Note that the derivative at p depends on the value of X only at p, but that the same is not true for Y. This is simply because we are interested in the change of Y along X(p): from Y we need its values in a neighborhood of p, so that we can calculate the change, while from X we only need X(p) for the direction.

2.6.2 Vector fields on a surface

Now that we have fixed the notation of vector fields on R3, we want to see how we can apply the ideas of vector fields to surfaces. (In this section M is again a surface and p a typical point on M.)
The difference between R3 and a surface is that, unlike in R3, there are two kinds of vector fields, which behave quite differently from one another and have different geometric meanings.

Definition 2.6.6: Vectorfields on M


We can define two kinds of vector fields. Let's start with the simple generalization:
• We call a smooth function X : M → R3 a vector field along M.
• If X(p) ∈ Tp M for all p ∈ M, then we call the vector field tangent to M.

The first kind of vector field of course contains the second, but the second is a bit special. It is special insofar as geometrically small tangent vectors (|v| < ϵ for some ϵ) approximately lie in the surface, because locally the surface looks like the tangent plane. You can see the difference between the two types in figure 2.26, and the sense in which tangent vectors live in the surface in figure 2.27.

Figure 2.22: The catenoid and the lines we need to calculate the principal axes of curvature using symmetry, for some point p.
How can we differentiate a vector field along a surface M? Let's say we have vector fields X, Y defined on M and want to differentiate Y with respect to X, where X of course has to be tangent to M. How can we proceed? There are a few possibilities. We could, for example, take a curve that at t = 0 is at p and has tangent vector X(p), use the proposition from two sections ago, and repeat this for all points. We will however use a different (but equivalent) way. We will extend X, Y to smooth vector fields X̃, Ỹ on an open set U ⊂ R3, so that X̃, Ỹ restricted to M match X, Y, and then define the derivative to be DX Y = DX̃ Ỹ.

Definition 2.6.7: Derivative of Y along X on M

Let X be a smooth vector field tangent to M, and Y a smooth vector field along M. Then we define the derivative DX Y of Y along X by first extending X and Y to smooth vector fields X̃ : U → R3 and Ỹ : U → R3 on an open subset U ⊂ R3, and defining:

DX Y (p) = DX̃ Ỹ (p) (2.45)

This definition is independent of the extensions X̃, Ỹ , which we prove now:



Figure 2.23: An example for the intuition behind the Proposition, for an
f : R2 → R. If you move along X and the curve γ with a very tiny step (h
is very small) the values you get will be very similar, which is why you get
the same result.

Figure 2.24: A typical vectorfield in R3



Proof. We know that DX̃ Ỹ (p) depends only on the value X̃(p) = X(p) and on the values of Ỹ in a small neighborhood of p (a neighborhood in R3). Therefore the independence from the extension of X is clear. We also know that:

\[
D_{\tilde X} \tilde Y(p) = \frac{d}{dt}\Big|_{t=0} \tilde Y(\gamma(t)) \tag{2.46}
\]

for any γ(t) with γ(0) = p and γt(0) = X(p). We can use any smooth curve that satisfies the two conditions; in particular we can take a curve in M, so that the expression becomes:

\[
D_{\tilde X} \tilde Y(p) = \frac{d}{dt}\Big|_{t=0} \tilde Y(\gamma(t)) = \frac{d}{dt}\Big|_{t=0} Y(\gamma(t)) \tag{2.47}
\]

since we are using a curve that lies in M and Ỹ|M = Y.

With this we are done with the interlude, and we return to curvature.

2.7 Another characterisation of curvature: The


Weingarten Map
We have already seen two different ways to look at curvature: first a very geometric definition (what we called Qp), and then the second fundamental form (Ap). We now turn to the last major description of curvature we will cover, called the Weingarten map. The idea is that the way tangent vector fields have to change (locally) to stay tangent to M is another way to detect curvature. Let X, Y be two vector fields tangent to M. Then, because they are tangent, there is an amount they have to change to stay in the tangent plane, similarly to how curves on M have to curve to stay on M in our geometric definition. You can see an example in figure 2.25. From this thought emerged the following theorem.

Figure 2.25: Two vector fields X, Y and the derivative DX Y.

Theorem 2.7.1: Vectorfield characterization of Ap

Let X, Y be two tangent vectorfields on a surface M , and N a unit normal


field (the same one as used in the definition of Ap ). Then we can rewrite
Ap in the following way:

Ap (X, Y ) = ⟨DX Y, N ⟩ = −⟨DX N, Y ⟩ (2.48)

The first equality says that we can characterize curvature by how vector fields change along M, specifically by the component of that change normal to the surface. The second one tells us that we can characterize curvature by how the normal vector field changes along M.
We will prove this theorem, but we first want to discuss a few ways to see it intuitively. We already saw that a tangent vector field has to change to stay tangent to M. We can also see this not as Y changing, but as the tangent plane rotating as you move on M, which you can see in figure 2.29. This should be clear, since the tangent plane is in one sense the first-order approximation of M, and if M curves then the tangent plane also has to rotate. You can also see this as all the possible tangent vectors changing to accommodate M's curving, and since the tangent plane is built out of these vectors, it has to change as well. You can also describe the change of the tangent plane by describing how the normal vector changes, since it has to change with the tangent plane. This is how you get the second equality. It is clear that DX N has all the information about how N, and therefore Tp M, changes, and this is why this map gets a special name: it's called the Weingarten map.

2.7.1 The Weingarten Map


We write down the definition of the Weingarten map, which we just motivated, explicitly for clarity.

Definition 2.7.1: The Weingarten Map

Let M be a surface and N a choice of unit normal field. Then we define


the Weingarten map to be:

Wp : Tp M → Tp M, X(p) ↦ DX N (p) (2.49)

By the theorem we then get that:

Ap (X, Y ) = −⟨Wp (X), Y ⟩ (2.50)

There are a few things to note about the Weingarten map that one can see quite easily. Firstly, Wp is self-adjoint, because Ap is symmetric:

⟨Wp(X), Y⟩ = −Ap(X, Y) = −Ap(Y, X) = ⟨Wp(Y), X⟩ = ⟨X, Wp(Y)⟩ (2.51)



Figure 2.26: A vector field along M in (a), and a vector field tangent to M in (b). The first type sticks out of the surface, while the second type lies along the surface.

Figure 2.27: Vectors tangent to M live in the tangent plane at the point they come out of; locally, you can imagine them as living in the surface.

Secondly, we can once again use the fact that a unit normal vector cannot change in its own direction, and conclude that Wp really maps into Tp M, because ⟨Wp(X), N⟩ = 0. You can see this by applying the same calculation we have already done quite often. We know that ⟨N, N⟩ = 1, since the unit normal is of unit length. We can therefore calculate:

0 = DX ⟨N, N⟩ = 2⟨DX N, N⟩ (2.52)

which is exactly the claim we just described.

Proposition: The Sphere

We turn to our standard example of a surface, the sphere. What is the Weingarten map of the sphere? Let's again choose the inner normal field N = −x/R, where R is the radius of the sphere. Then we can compute Ap rather quickly using the theorem:

\[
A_p(X, Y) = -\langle D_X N, Y \rangle = \Big\langle D_X \frac{x}{R}, Y \Big\rangle = \frac{1}{R}\,\langle D_X(x), Y \rangle \tag{2.53}
\]

We need to compute DX(x) = DX((x1, x2, x3)). We get:

\[
D_X(x) = D((x^1, x^2, x^3))(X) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X^1 \\ X^2 \\ X^3 \end{pmatrix} = X \tag{2.54–2.56}
\]

and therefore, by inserting:

\[
A_p(X, Y) = \frac{1}{R}\,\langle X, Y \rangle \tag{2.57}
\]

The Weingarten map is just Wp(X) = −(1/R) X.

The example showed how you can sometimes use the Weingarten map to calculate something really fast. Imagine doing the calculation of Ap using the Hessian matrix: there are so many derivatives that you'd probably make a small mistake, and at the very least it would take much longer.
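The same computation can be checked with sympy. Here we extend the inner unit normal off the sphere as N = −x/|x| (our choice of extension; by the interlude, the result does not depend on it) and apply its Jacobian to a tangent vector at the north pole:

```python
import sympy as sp

x1, x2, x3, R = sp.symbols('x1 x2 x3 R', positive=True)
xvec = sp.Matrix([x1, x2, x3])

# Inner unit normal field of the sphere, extended off the sphere as -x/|x|.
N = -xvec / sp.sqrt(x1**2 + x2**2 + x3**2)

# D_X N at the north pole p = (0, 0, R): Jacobian of N at p, applied to X.
J = N.jacobian([x1, x2, x3]).subs({x1: 0, x2: 0, x3: R})

a, b = sp.symbols('a b')
X = sp.Matrix([a, b, 0])        # a tangent vector at p
print(sp.simplify(J * X))       # equals -X/R, i.e. W_p(X) = -(1/R) X
```

The third component stays zero, confirming along the way that Wp really maps Tp M into itself.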

2.7.2 Proof of the connection between the Weingarten map and Ap

Now that we have gotten some intuition about the Weingarten map and seen a first example, we want to prove the theorem. There are two things we need to prove.

• We need to show that Ap (X, Y ) = ⟨DX Y, N ⟩.

• We also need to show that ⟨DX Y, N ⟩ = −⟨DX N, Y ⟩



We will actually prove the second part first, as it is both easier and we need it to
prove the first part.

Proof. Let us first prove the second claim; it is not too hard, being the same calculation we have already done very often. We know that ⟨Y, N⟩ = 0 everywhere, because Y is a tangent vector field and N is a unit normal field, and they are, by construction, orthogonal to each other. Then we get:

0 = DX (⟨Y, N⟩) = ⟨DX Y, N⟩ + ⟨DX N, Y⟩ (2.58)

by the product rule, and therefore:

⟨DX Y, N⟩ = −⟨DX N, Y⟩ (2.59)

as promised.

There is an immediate consequence of this. We know that DX Y (p) depends only on the value of X at the point p, but for Y it depends on a neighborhood of p, since you are calculating changes of Y. But because of the equality, one can clearly see that the expression ⟨DX N, Y⟩ only depends on X(p) and Y(p), and definitely not on Y in a small neighborhood. This is quite surprising at first, because ⟨DX Y, N⟩ looks, at first sight, like it would depend on more than just Y(p). We will use this in the proof of the first claim.
The remaining part of the proof involves a lot of calculation, but has one basic idea. Ap is, in essence, just the Hessian D²f, if you write M as the graph of f. So on the one side we have a second derivative, and on the other a term with DX Y. In a vague sort of sense, we will see that Y is roughly like DY f, at least in this specific case.

Proof. The setup of our proof is drawn in figure 2.31(a), which is the same as in the proof that Qp is the same as Ap. This time we want to extend Y to a vector that is in the tangent plane of q; in (b) we see that our first step will be extending Y to be constant on the tangent plane.

Step 1 We begin by fixing p ∈ M , and using coordinates of R3 where p = (0, 0, 0),


Tp M = R2 × {0} and x3 = f (x1 , x2 ) describes the surface. This is the
exact same construction we used when we proved that Qp and Ap are the
same thing, in section 2.3.3.

Step 2 We let X = (X1, X2, 0) and Y = (Y1, Y2, 0) be two vectors in the tangent plane, and want to calculate ⟨DX Y, N⟩ at the point p. We want to do this algebraically, in the coordinate system we chose, so we will actually need to extend X, Y to some extensions X̃, Ỹ (on M). We note already here that we have shown that ⟨DX Y, N⟩ depends only on X(p), Y(p); therefore any suitable extension we pick will be good enough, and the whole thing won't depend on the extension.

Figure 2.28: Even the most boring tangent vectorfield like Y has to change
a minimal specific amount to stay on M .

Step 3 We now think about which extension might be good to work with. We want to extend X, Y to a neighborhood of p on M that contains the typical point q ∈ M. The choice we will make ensures that the extensions X̃(q), Ỹ (q) are both tangent to M at q, or in other words, that they are elements of Tq M.
We construct our vectors¹⁰ first by extending them to constant vectors on Tp M. So we take Ỹ (q) = (Y1, Y2, ?), where we don't know the third component yet. We know that q = (x1, x2, f(x1, x2)) if it has coordinates x1, x2 on Tp M. When is a vector in q's tangent space? Well, if we have a graph like f, we can see what we have to do by quickly looking at a one-dimensional example, which you can see in figure 2.32. Very similarly to the figure, a vector (Y1, Y2, Y3) will be in the tangent space Tq M if Y3 = df(x1,x2)(Y1, Y2). So our extension becomes:

\[
\tilde Y(x^1, x^2, f(x^1, x^2)) = \big( Y^1,\, Y^2,\, df_{(x^1,x^2)}(Y^1, Y^2) \big) \tag{2.60}
\]

This is what we meant by "Y is somehow analogous to DY f": it contains df, which will turn into the second derivative in the next steps.
Step 4 Now extend Ỹ to a vector field on the whole of R3, and in particular make it independent¹¹ of x3.
Step 5 Do the above procedure for X.
¹⁰ We will only explain the extension of Y; the one of X is exactly the same.
¹¹ We can do this, and we do it for convenience.

Step 6 Calculate DX Y |p. We get:

\[
\begin{aligned}
D_X Y \big|_p = D_{\tilde X} \tilde Y \big|_p
&= \sum_{i,j=1}^{3} \tilde X^i \, \frac{\partial \tilde Y^j}{\partial x^i} \bigg|_p e_j \tag{2.61} \\
&= \sum_{i=1}^{3} X^i \frac{\partial}{\partial x^i} \bigg|_p \big( \tilde Y^1, \tilde Y^2, \tilde Y^3 \big) \tag{2.62} \\
&= \sum_{i=1}^{3} X^i \frac{\partial}{\partial x^i} \bigg|_p \Big( Y^1,\, Y^2,\, \sum_{j=1}^{2} \frac{\partial f}{\partial x^j}(x^1, x^2)\, Y^j \Big) \tag{2.63} \\
&= \Big( 0,\, 0,\, \sum_{i=1}^{3} \sum_{j=1}^{2} X^i \, \frac{\partial^2 f}{\partial x^i \partial x^j} \bigg|_p Y^j \Big) \tag{2.64} \\
&= \big( 0,\, 0,\, X^T H Y \big) \tag{2.65}
\end{aligned}
\]

where H is the Hessian of f at p. Because N = (0, 0, 1), we get:

\[
\langle D_X Y, N \rangle = X^T H Y = A_p(X, Y) \tag{2.66}
\]

which is exactly the result we wanted.
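The bookkeeping in Steps 3–6 can be sanity-checked on a concrete example. In the sketch below, f, X and Y are arbitrary choices of ours (with df(0) = 0, so the graph is tangent to the x1x2-plane at the origin); we build the extension Ỹ from Step 3 and compare ⟨DX Ỹ, N⟩ at the origin with XᵀHY.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# A sample graph x3 = f(x1, x2) tangent to the x1x2-plane at the origin.
f = x1**2 + 3*x1*x2 - x2**2

X = sp.Matrix([2, 1])    # constant tangent vectors at p = 0
Y = sp.Matrix([1, -1])

# Step 3's extension: third component df(Y), so Ytilde stays tangent.
Ytilde = sp.Matrix([Y[0], Y[1], f.diff(x1)*Y[0] + f.diff(x2)*Y[1]])

# D_X Ytilde at p: Jacobian of Ytilde applied to X, evaluated at the origin.
DXY = (Ytilde.jacobian([x1, x2]) * X).subs({x1: 0, x2: 0})

lhs = DXY[2]                                   # <D_X Y, N>, with N = (0, 0, 1)
rhs = (X.T * sp.hessian(f, (x1, x2)) * Y)[0]   # X^T H Y = A_p(X, Y)
print(lhs, rhs)  # 3 3
```

The first two components of DXY vanish, just as in (2.65).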

Figure 2.29: The tangent-plane of M changing as M curves.



Figure 2.30: The unit normal vector has to change as M curves.

2.8 *Other formulas for curvature


We have promised you five different ways to describe the curvature of surfaces. We have given you three already, all based on different ways of thinking about curvature. The other two, we claimed, are simply formulas for computation. We present them here. You don't need to know them by heart, but it is worth knowing that they exist, should you need them sometime. Both of these formulas are analogous to the two formulas for the curvature of a curve we derived before. For clarity, we write them down here:

\[
\vec\kappa = \frac{1}{|\gamma_t|^2} \left( \gamma_{tt} - \Big\langle \gamma_{tt}, \frac{\gamma_t}{|\gamma_t|} \Big\rangle \frac{\gamma_t}{|\gamma_t|} \right) \tag{2.67}
\]

\[
k = \frac{u_{xx}}{(1 + u_x^2)^{3/2}} \tag{2.68}
\]

The first, in principle, just describes the curvature vector of a parameterized curve. The second one describes the curvature of a curve written as a graph. We will start with the first.

2.8.1 Curvature of a parameterized surface

Here we will give the formula for parameterized surfaces analogous to formula 2.67. Firstly, we need to describe what exactly we mean by a parameterized surface, which is not too difficult:

Definition 2.8.1: Parameterized Surface


We define a parameterized surface, which we call M , in the usual way, as a
map F : U ⊂ R2 → R3 , where U is called the coordinate space. We want
it to have a non-singular Jacobian too. In equation:

F (x1 , x2 ) = (F 1 (x1 , x2 ), F 2 (x1 , x2 ), F 3 (x1 , x2 )) (2.69)

We can get a basis of Tp M at each point p by taking the basis vectors e1, e2 of the coordinate space and applying dF |p to them. That way we get two new vectors (at each p):

Xi = dF |p(ei) (2.70)

which (you should check) turn out to be basis vectors of Tp M. We can also write the Xi out in components:

\[
X_i = \Big( \frac{\partial F^\alpha}{\partial x^i} \Big)_{\alpha=1}^{3} \tag{2.71}
\]

We change notation a bit (to make the formulas we will get readable in the end), so that for ∂F^α/∂x^i we write F_i^α. Similarly, we use F_ij^α for second derivatives, etc.

We can then define the metric, which is a tool that tells us how to transform
from changes of coordinates to changes in lengths.

Definition 2.8.2: The metric of a parameterized surface

We define the metric of a parameterized surface as the map (from the coordinate space or M, whichever is more convenient):

\[
g_{ij} = g(X_i, X_j) = \langle X_i, X_j \rangle = \sum_{\alpha} X_i^\alpha X_j^\alpha \tag{2.72}
\]

In actuality, gij is a collection of four functions, but it will turn out to be a tensor (in later chapters), and a mighty object in differential geometry, so we call it one map. Let us also define g^ij to be the inverse of gij, viewed as a matrix. Then for a parameterized surface we get the following formulas for the curvatures:

Figure 2.31: The setup of our proof is drawn in (a), which is the same as in the proof that Qp is the same as Ap. This time we want to extend Y to a vector that is in the tangent plane of q. In (b) we see that our first step will be extending Y to be constant on the tangent plane.

Proposition 2.8.1: Curvature of a parameterized surface

The Weingarten map is:

\[
W = g^{-1} \, D^2 F \cdot n \tag{2.73}
\]

where n is the unit normal, which we can write (pointwise) as n = (X1 × X2)/|X1 × X2|. Written out we get:

\[
W_k^i = \sum_{j=1}^{2} \sum_{\alpha=1}^{3} g^{ij} \, \frac{\partial^2 F^\alpha}{\partial x^j \partial x^k} \, n^\alpha \tag{2.74}
\]

For the mean curvature we get:

\[
H = \operatorname{tr}(W) \tag{2.75}
\]

or written out:

\[
H = \sum_{i,j} \sum_{\alpha} g^{ij} F_{ij}^\alpha n^\alpha \tag{2.76}
\]

or, fully written out:

\[
H = n^\alpha \, \frac{F_2^\beta F_2^\beta F_{11}^\alpha - 2 F_1^\beta F_2^\beta F_{12}^\alpha + F_1^\beta F_1^\beta F_{22}^\alpha}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \tag{2.77}
\]

where all indices repeated twice are summed over (Einstein convention). For the Gauss curvature K we get a similar result:

\[
K = \det(W) = \frac{F_{11}^\alpha n^\alpha \, F_{22}^\beta n^\beta - (F_{12}^\alpha n^\alpha)^2}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \tag{2.78}
\]

The formulas might look horrible, which in a way they are. But they are certainly useful, because often you know F and can simply plug the derivatives in to get H and K. They are also very easy to implement on a computer.
By the way, if you know H and K, you can figure out the principal curvatures quite easily: k1 and k2 are the roots of

\[
0 = (x - k_1)(x - k_2) = x^2 - (k_1 + k_2)x + k_1 k_2 = x^2 - Hx + K \tag{2.79}
\]

which you can of course solve easily for k1, k2.
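Even this last step can be left to the machine: solving x² − Hx + K = 0 symbolically gives the familiar quadratic-formula expressions, whose sum and product are H and K again.

```python
import sympy as sp

# The principal curvatures are the roots of x^2 - H x + K = 0.
H, K, x = sp.symbols('H K x')
k1, k2 = sp.solve(x**2 - H*x + K, x)
print(sp.simplify(k1 + k2), sp.expand(k1 * k2))  # H K
```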

2.8.2 The curvature of a graph


We will now provide you with the formulas for the mean and Gauss-curvatures for
a graph.

Proposition 2.8.2: Curvature of a graph

Let's say you have a function z = f(x, y), so that the graph of f is the surface M for which you want to calculate H and K. Let's denote the partial derivative with respect to x by fx, as usual. Then you can write H and K as:

\[
H = \frac{(1 + f_y^2)\, f_{xx} - 2 f_x f_y\, f_{xy} + (1 + f_x^2)\, f_{yy}}{(1 + f_x^2 + f_y^2)^{3/2}} \tag{2.80}
\]

\[
K = \frac{f_{xx} f_{yy} - f_{xy}^2}{(1 + f_x^2 + f_y^2)^2} \tag{2.81}
\]

As in the previous case, if you know H and K, you can solve for the principal curvatures by the same calculation.
Notice the similarity of both cases to the formulas for a curve. Specifically, notice how both of the formulas for the case of a graph have a correction term (in the denominator) similar to the one in the case of curves.
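A quick machine check of these formulas (our own sketch): on a sphere of radius R written as a graph, they should give H = 2/R and K = 1/R² at every point, not just at the point of tangency; we evaluate at one arbitrary point.

```python
import sympy as sp

x, y, R = sp.symbols('x y R', positive=True)

def graph_H_K(f):
    # Mean and Gauss curvature of the graph z = f(x, y), formulas (2.80)/(2.81).
    fx, fy = f.diff(x), f.diff(y)
    fxx, fxy, fyy = f.diff(x, x), f.diff(x, y), f.diff(y, y)
    w = 1 + fx**2 + fy**2
    H = ((1 + fy**2)*fxx - 2*fx*fy*fxy + (1 + fx**2)*fyy) / w**sp.Rational(3, 2)
    K = (fxx*fyy - fxy**2) / w**2
    return H, K

# Lower hemisphere of a sphere of radius R, written as a graph.
H, K = graph_H_K(R - sp.sqrt(R**2 - x**2 - y**2))
point = {R: 2, x: sp.Rational(3, 10), y: sp.Rational(2, 5)}
print(H.subs(point).evalf(6), K.subs(point).evalf(6))  # 2/R = 1 and 1/R^2 = 0.25
```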

2.9 Intrinsic Geometry


Up until now, all our considerations of surfaces were based on the assumption that we are three-dimensional creatures, who have knowledge about all of R3 and can find information about a surface from the three-dimensional ambient space. A lot of what we did was extrinsic, that is, figured out using the ambient space. We want to amend this now.

Figure 2.32: The idea of the third step of the proof, but in the one-dimensional case. A vector (x1, x2) is in the tangent space (lies in the tangent line of f at p) if x2 = df(x1) = (df/dx) x1.

Imagine that you are an ant on some surface, like the one in figure 2.9, and you cannot see "outside", into the third dimension. You stay entirely confined to the surface, which is your world. Imagine also that you are a curious, very smart ant. You have a measuring tape, with which you can measure lengths (infinitely precisely), and you have access to infinite computational power; as we said, you are very, very smart. In this world, light can obviously not go in straight lines (unless the surface is flat), so we say it goes as straight as it can, and what the ant sees is a result of this. You can see all of this drawn in figure 2.9. From this idea, that is, what the ant can understand about the world it lives in from measuring lengths and looking around, we can define what we mean by an intrinsic quantity on the surface. Intrinsic quantities, simply said, are those that you can infer from measuring lengths only.

2.9.1 The intrinsic metric of M


We promised, some time ago, that we would come to the first fundamental form
after covering the second fundamental form. We fulfill this promise in this section.
Let's start simply. Let us say you are the ant, and you go around measuring the length of a curve γ on the surface that connects two points p and q¹².
Then the thing we, as three-dimensional creatures, would call the length would simply be:

\[
L(\gamma) = \int_a^b |\gamma_t| \, dt \tag{2.82}
\]

where |γt| uses the norm in three-dimensional space. We know from our treatment of curves that this is independent of the parameterization and only depends on the image of γ. It is something intrinsic, which the ant can measure.
From this we can define the distance between two points on the surface M. We simply pick the curve connecting the two points that has the smallest length, if it exists; otherwise we take the infimum:

dM (p, q) = LM (p, q) = inf L(γ) (2.83)

over curves γ that connect p and q. Usually a curve whose length realizes the distance between the two points exists, but sometimes it does not; that is why we define the distance as an infimum¹³. We call a curve with the above property a geodesic.
The surface, equipped with these lengths as distances, is a metric space, as is
quite easy to show (exercise).

The two metrics


Here we are really talking about two different metrics, and it is important that you
keep them distinct. There is the metric that we get just from the surface, defined by
the lengths our ant can measure, and then there is the metric that comes from the
12 By this we of course mean that γ : [a, b] → M with γ(a) = p, γ(b) = q
13 Exercise: Find a surface and a pair of points, for which no such curve exists.

external structure of the R3 that the surface might live in, which is not something
the ant can ever experience. You can see the difference in the following figure. We

Figure 2.33: The way we extend Y is by using the same (but two-
dimensional) construction from figure 2.7.2.

can now go on to define the first fundamental form.

2.9.2 The first fundamental form


The first fundamental form, also called the Riemannian metric of M or just the
metric, is (at each point on M ) the restriction of the scalar-product onto Tp M .

Definition 2.9.1: The metric


The metric g is defined, at each p, as the map g(p) : T_p M × T_p M → R which is the
restriction of the three-dimensional scalar product ⟨·, ·⟩_R³ onto T_p M × T_p M:

g(p)(X, Y) = ⟨X, Y⟩_R³,  X, Y ∈ T_p M    (2.84)

This g in turn defines both d_M and L(γ) on the surface, i.e. it is the only thing
you need to calculate L(γ) for any curve γ on the surface. Conversely, it is determined
by L(γ) or d_M. This can be proven, but we will abstain from doing so until
later chapters. Our ant can therefore find it; it exists without any reference to the
ambient space R³. We call anything that can be deduced from g in a sensible way
intrinsic.
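To make this concrete, here is a small numerical sketch (our own construction, with made-up names) that computes the components g_ij = ⟨X_i, X_j⟩ of the first fundamental form for the standard longitude/latitude parametrization of the unit sphere, using finite differences for the coordinate tangent vectors X_i:

```python
import numpy as np

# Standard parametrization of the unit sphere: P(phi, theta).
def P(phi, theta):
    return np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta)])

def metric(phi, theta, h=1e-6):
    # Coordinate tangent vectors X_i = dP(e_i), by central differences.
    X1 = (P(phi + h, theta) - P(phi - h, theta)) / (2 * h)
    X2 = (P(phi, theta + h) - P(phi, theta - h)) / (2 * h)
    # g_ij = <X_i, X_j> taken in R^3.
    return np.array([[X1 @ X1, X1 @ X2],
                     [X2 @ X1, X2 @ X2]])

print(metric(0.7, 1.1))   # ~ [[sin^2(1.1), 0], [0, 1]]
```

The diagonal form diag(sin²θ, 1) is the familiar round metric of the unit sphere in these coordinates.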

Note 6. Notice how both g(p) and A(p) are bilinear forms on T_p M. This is why
the first is called the first fundamental form, and the second the second
fundamental form. If you are wondering, there is also a third fundamental form,
introduced by Gauss, but it is not used much anymore, because it does not
really add any new information and can be calculated from the first two.

2.10 Intrinsic Isometries and Gaussian Curvature

We continue with our discussion of an ant, and intrinsic geometry. In particular,


we want to talk about when the ant can make out the difference between two
different surfaces it lives on. Take a very simple example, two spheres. One is
centered at the origin, the other is centered around some point far away from the
origin. These two are different surfaces, in the sense that they are different sets of
points, mathematically. It is clear, however, that this difference only arises because
of the ambient space R3 , specifically because of the choice of coordinates we made
and that an ant living on either of these spheres could not tell them apart in any
way. They are, geometrically, the same thing. You can just translate one to the
other in the ambient space; their position in the ambient space, in other words,
is certainly not something intrinsic. The idea we want to develop in this section
goes beyond this simple case. We want to abstract away the effects of the ambient
space, take away anything our tiny smart ant cannot perceive, and see what is left.
In particular, we want to see whether two surfaces that we perceive as different in the
ambient space can nevertheless not be told apart by the ant. The tool we will develop
is called an intrinsic (local) isometry.

Figure 2.34: The parameterization of the sphere. The lines we usually draw
on the sphere are, as you know, the lines of constant longitude and latitude,
which are lines inherited from the coordinate space. We can also get basis
vectors of T_p M for all p from the coordinate space, using the vectors X_i =
dF|_p(e_i) as basis vectors of T_p M at p.

Figure 2.35: A surface on which our very smart ant lives. It has a measuring
tape and (somehow) access to infinitely much computational power, so that
it can figure out as much about the world it lives in as possible.

Definition 2.10.1: Intrinsic Isometry

Let (M, gM ) and (N, gN ) be two surfaces, with their own respective metrics.
We call a function ϕ : M → N an (intrinsic) isometry, if it is bijective,
smooth, and preserves distances. That is, for any two p, q ∈ M , if p̃ = ϕ(p)
and q̃ = ϕ(q), then
dM (p, q) = dN (p̃, q̃) (2.85)
or equivalently, if for any curve γ in M :

LM (γ) = LN (ϕ(γ)) (2.86)

or equivalently, if the metric is preserved:

gM (p)(X, Y ) = gN (p̃)(X̃, Ỹ ) (2.87)

where X̃ = dϕ_p(X) is the vector in N corresponding to X. Similarly, a
local isometry is one that is an isometry from an openᵃ subset of M to an
open subset of N.
ᵃ Here, open means open in M, not R³; that is, a two-dimensional set.

We can now make the meaning of something being (geometrically) intrinsic a bit
clearer. Something is intrinsic, simply, when it is invariant under (local)
isometries. Here the power of isometries shows itself quite clearly. Isometries
preserve the metric, but forget all the unnecessary information that comes from
the ambient space. They only see what the ant can see, and nothing else.

Proposition: The cylinder

Take a piece of paper. It is clearly flat; it could be nothing else. Roll it
up. This is a (local) isometry. You can see this by taking a very tiny
part of the paper and noticing that you don't stretch or disturb it. The local
lengths, i.e. the metric, stay the same. Now, take a look at the different
kinds of curvatures we defined in this chapter. Because the piece of paper
is flat, k₁, k₂, H and K are all zero. But for the cylinder, which is locally
isometric to the piece of paper, k₁ = 0, k₂ = 1/R, H = 1/R and K = 0.
Therefore, k₂ and H cannot, in general, be intrinsic. k₁ happens to be the
same here, but you can roll the paper in the other direction, finding another
isometry where it is 1/R. So the first three cannot be intrinsic quantities. This is a
very important point: three out of the four curvature quantities depend on
the ambient space. What about the Gaussian curvature? It is zero on both
surfaces, so we have not ruled out its intrinsic nature. Is it intrinsic?
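The rolling map can be checked numerically. The sketch below (our own construction, with hypothetical helper names) maps flat coordinates (u, v) onto a cylinder of radius R and verifies that the induced metric is the identity, just as for the flat sheet, even though k₂ jumps from 0 to 1/R:

```python
import numpy as np

# Roll the flat (u, v)-sheet onto a cylinder of radius R without stretching.
R = 2.0
def roll(u, v):
    return np.array([R * np.cos(u / R), R * np.sin(u / R), v])

def induced_metric(F, u, v, h=1e-6):
    # g_ij = <X_i, X_j> with coordinate tangent vectors from central differences.
    Xu = (F(u + h, v) - F(u - h, v)) / (2 * h)
    Xv = (F(u, v + h) - F(u, v - h)) / (2 * h)
    return np.array([[Xu @ Xu, Xu @ Xv], [Xv @ Xu, Xv @ Xv]])

print(induced_metric(roll, 0.3, 1.7))   # ~ identity matrix: same as the sheet
# The cylinder has k1 = 0, k2 = 1/R, so K = k1 * k2 = 0 -- the same K as the
# flat sheet, consistent with K being intrinsic.
```
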

Figure 2.36: A surface and the two different distances we can define. The
first one, in (a), comes from the surface itself and can, at least in principle,
be measured by our ant. The second one, in (b), comes from the structure
of R³ and is, in general, not equal to the one in (a); the ant can never
feel this length.

Figure 2.37: Two spheres at different places in R³ are indistinguishable from
each other to the ant. This is the simplest example of an intrinsic isometry.

Proposition: The cone

The cone, similarly to the cylinder, can, locally, be unfurled into a flat piece
of paper. Exercise: Try both examples with paper.

In general, any surface that can be unrolled into a flat piece of paper is called
developable. Through any point of such a surface there is a straight line (in the
ambient-space sense) lying on the surface, so that k₁ = 0 and K = 0.
This hints at the following result:

Theorem 2.10.1: Theorema Egregium (Gauss)

The Gaussian curvature K is intrinsic. In other words, it does not change


under (local) isometries.

This is an extremely important result for surfaces. An ant cannot measure k1 , k2


or H, but what it can measure is K!
In fact, there is an extremely intuitive way for the ant to measure curvature. Let
the ant make the closest thing to a triangle it can in the space it lives in. By that
we mean it takes the lines, which are as straight as they can possibly be (geodesic)
and crosses them to make a triangle. Then let it measure the angles. It will find
that, if the space it lives in is not flat, in general, the sum of angles won’t be 180◦ !
104 CHAPTER 2. SURFACES

Figure 2.38: An isometry is a map from one surface M to another surface


N , so that distance is preserved. A local isometry is one that only works
locally.

Theorem 2.10.2: Gauss-Bonnet - Triangle Version

Let T be a triangle on a surface, that is, a triangle constructed out of three
geodesics (straightest possible lines on the surface). Then:

α + β + γ = π + ∫_T K    (2.88)

where α, β, γ are the angles of the triangle, K is the Gaussian curvature,
and the integral is an area integral (not an integral over the boundary of the
triangle).
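A quick arithmetic check of (2.88), assuming the unit sphere (K = 1): the octant triangle bounded by the equator and two meridians, as in figure 2.41, has three right angles and covers one eighth of the sphere:

```python
import numpy as np

# Octant triangle on the unit sphere: three right angles, area = (4*pi)/8.
alpha = beta = gamma = np.pi / 2
area = 4 * np.pi / 8
integral_K = 1.0 * area        # K = 1 everywhere on the unit sphere

print(alpha + beta + gamma)    # 3*pi/2 ~ 4.712
print(np.pi + integral_K)      # the same value: angle sum = pi + ∫_T K
```
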

So by measuring angles, the ant can figure out if the space she lives in is curved
or flat! Another way she can do this is by using circles, not triangles. A circle, in
her world, is the set of points equally distant ("radius r") from some point, which
she of course calls the centre of the circle. If she lives in flat space, she will find
that the area of the circle will be πr². But if she lives in a curved space, then this
fact also doesn't have to be true; take a look at figure 2.42.
One can show that the area will be:

A_p(r) = πr² − (π/12) K(p) r⁴ + · · ·    (2.89)

which is not πr²!
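On the unit sphere (K = 1) the geodesic circle of radius r has area 2π(1 − cos r), so the expansion (2.89) can be sanity-checked directly (our own numerical illustration):

```python
import numpy as np

# Compare the exact geodesic-disk area on the unit sphere with (2.89).
for r in (0.1, 0.2, 0.4):
    exact = 2 * np.pi * (1 - np.cos(r))                  # area on the sphere
    approx = np.pi * r**2 - (np.pi / 12) * 1.0 * r**4    # expansion, K(p) = 1
    print(r, exact, approx)
# The two agree up to order r^6; both fall short of the flat value pi*r^2.
```
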
Note 7. There is a formula for K in terms of the metric, but it is rather long
and not very useful for our purposes right now. Schematically,

K = g^{ij} g^{kl} ∂²g_{ij}/∂x^k ∂x^l + ...    (2.90)

Figure 2.39: A piece of paper can be rolled up into a cylinder without
disturbing local lengths. The principal curvatures and the mean curvature
all change, but the Gauss curvature does not.

2.11 Two Important Theorems

Before finishing with surfaces, we want to mention two very important theorems about
surfaces, connecting differential geometry with topology. A simplified version of the
first one, the Gauss-Bonnet theorem, was already presented in the last section.
It describes a very specific way in which curvature and topology interact. The other
theorem is the Uniformization Theorem, which describes the way in which you can
bring topological surfaces into differential geometry. Both are global theorems.

2.11.1 The Gauss-Bonnet-Theorem


We want to connect the Gauss curvature to a result from topology. Topology tells
us that we can characterize every (orientable compact) surface by a number called
the genus. You can see a few examples in figure 2.43. The genus is the number of
tunnels the surface has. A sphere has no tunnels, so its genus is zero. A torus has
one, so its genus is one. A double torus has two, so its genus is two.
Topology also has the idea of the Euler characteristic, which measures genus
through triangles. The idea of the Euler characteristic is to triangulate the surface
and compare the number of triangles with the number of edges and vertices. If we
call the Euler characteristic of a surface M by χ(M), then:

χ(M) = #Triangles − #Edges + #Vertices    (2.91)

This number is, incidentally, independent of triangulation, which we won’t prove


here. It also turns out that:

χ(M ) = 2 − 2genus(M ) (2.92)
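For example, two standard triangulations (our choice, not from the script) confirm both (2.91) and (2.92): the boundary of a tetrahedron triangulates the sphere, and the classical 7-vertex triangulation covers the torus:

```python
# chi(M) = #Triangles - #Edges + #Vertices, as in (2.91).
def chi(faces, edges, vertices):
    return faces - edges + vertices

# Tetrahedron boundary: a triangulation of the sphere (genus 0).
print(chi(4, 6, 4))      # 2 == 2 - 2*0

# The 7-vertex (Moebius) triangulation of the torus (genus 1).
print(chi(14, 21, 7))    # 0 == 2 - 2*1
```
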



Figure 2.40: A cone can also be unrolled into a flat piece of paper.

Figure 2.41: An example of the Gauss-Bonnet theorem. If you form a
geodesic triangle on the sphere like in the figure, you can create a triangle
with three angles of π/2, totaling 270°. If you calculate the (area) integral
of the Gauss curvature over the triangle, you get the angle excess!

The Gauss-Bonnet theorem connects the Euler-characteristic to curvature in the


following way:

Theorem 2.11.1: Gauss-Bonnet

Let (M, g) be a compact Riemannian 2-dimensional manifold (so a surface that
is either an abstract surface with a metric or an ordinary surface in R³). Then:

∫_M K dA = 2πχ(M)    (2.93)

So you can get the Euler-characteristic of a surface through the Gauss-curvature!


This is an extremely important bridge between topology and differential geometry.
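A sketch of this check for the round sphere of radius R (our own numerical illustration), using K = 1/R² and the area element dA = R² sin θ dθ dϕ:

```python
import numpy as np

# Midpoint-rule integration of K dA over the round sphere of radius R.
R, n = 3.0, 100_000
theta = (np.arange(n) + 0.5) * np.pi / n        # midpoints in (0, pi)
K = 1 / R**2
integral = 2 * np.pi * np.sum(K * R**2 * np.sin(theta)) * (np.pi / n)

print(integral)                  # ~ 4*pi = 2*pi*chi(M)
print(integral / (2 * np.pi))    # ~ 2.0, the Euler characteristic of the sphere
```

Note that R drops out, as it must: K scales like 1/R² while the area scales like R².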

2.12 The Uniformization Theorem

The Uniformization Theorem deals with the way one can geometrize a topological surface
by adding a metric onto it. It is a bridge between topology and differential geometry
even deeper than the Gauss-Bonnet theorem.

Theorem 2.12.1: Uniformization Theorem

Let M be any compact topological surface. Then one can construct a metric
g on M that has constant Gauss curvature, specifically:

K_M = +1 if χ > 0,  K_M = 0 if χ = 0,  K_M = −1 if χ < 0.    (2.94)

This is obvious in the case of the sphere (it has genus zero, so χ(S) = 2): you
can just give it the usual metric of the sphere. Very surprising is the result that
one can do this with the torus. You cannot do it by embedding the torus in three
dimensions, but you can do it by embedding it in four.
The idea is presented in figures ?? and ??.
With this last theorem, we move away from curves and surfaces, and move
towards the theory of differential geometry on manifolds, making a short stop along
the way to catch up on important concepts from topology.

Figure 2.42: On a sphere, the area of what the ant would call a circle is not
πr2 .

Figure 2.43: Examples for surfaces with genus 0, 1, 2. The genus is the
number of tunnels the surface has.
Part II

Manifolds


We have discussed the geometry of curves and surfaces extensively in the first
part of the lecture. We saw many ideas, like curvature and intrinsicness, that became
very big themes and useful tools. The goal of this lecture is to extend these ideas
to general (geometric) spaces and see some of the fantastic results and tools that
this approach brings. Before we can do that, however, we need a bit of technical
knowledge. For curves and surfaces, to define our tools, we needed to use certain
concepts constantly. Continuity, open sets, neighbourhoods, vectors, tangent vectors
and coordinates are only some of these. Usually, these were rather easy to handle,
because we stuck to the ambient space Rⁿ, and we know all of these concepts in
Rⁿ quite well from courses like calculus. A vector is, in its simplest form, just
an arrow in Rn and is easy to understand. But we want to go beyond this simple
idea of our geometric things sitting in Rⁿ. As a particular example, our world,
according to general relativity, is a four-dimensional space that is not R⁴ and also
does not live in any Rⁿ. We want to extend ideas like surroundings and vectors
to abstract spaces which do not necessarily sit in some embedding space. This is
the topic of this part of the lecture. Our first step will be open sets since we need
them for basic things like continuity. The topic of open sets belongs to a broad
field in mathematics, called topology, which we will explore briefly14 . Afterwards,
we will handle coordinates and charts and define exactly what we mean by a smooth
manifold. Finally, we will talk about vectors in the settings of manifolds.

14 Any mathematician who has had a good lecture on topology can skip that part safely.
Chapter 3

Topology and Topological Manifolds

We want to start our discussion of manifolds with a discussion of topology. A


topology gives a space some spatial structure, which will definitely be useful in our
discussion of geometry.

3.1 Humble beginnings


What is the least we can definitely say about a geometric space (be it a curve or
surface or something more complicated)?
Well, at the very very least, at the most basic level, it will be a set of all points
that belong to that space. The thing we start out with is a set. There is not a lot
of geometry you can do with just a set, so our description needs to go further. We
need some sort of sense of spatiality to describe where things are in relation to each
other. How do we do this in Rⁿ? Well, in Rⁿ we have a metric, which is something
that tells us how far apart points are. We don't have that here yet, and we don't
want to introduce it just yet. But with that metric, we can define balls of some
radius centred at a point p, and from balls we can define sets that are open or closed
and work from there with where things are spatially.
Let us write down the definitions of open and closed sets in Rn .

Definition 3.1.1: open set

Let X ⊂ Rn be a set. We call it open, if for any point p ∈ X, there is a


ball centred around p of some radius, which is entirely contained in X.

Definition 3.1.2: closed set

A set Y ⊂ Rn is closed, if its complement Rn \ Y is open.


Where do these definitions come from? Well, the first idea is that an open set,
intuitively speaking, is one that contains no part of its border, while a closed one does.
You can clearly see that for simple sets like the ones in figures 3.1–3.3, this is true.

Figure 3.1: A set without a border. No matter where, no matter how close
to the border, you can always find a small enough radius so that the ball
around that point is still entirely contained in X.

We can therefore see, that at least in these examples, the definition matches
our intuition and we can therefore accept them and see what other sets are open
and closed in that case.

We can easily see that a set consisting of a single point is closed (it is its own
border) and that often (but not always) the question of open/closed comes down
to whether we include a border in our set or not.

We want to point out a few ideas that follow from our definitions below.

Figure 3.2: If you have a piece of the border, then you cannot do the same
thing as we did in the previous example. Any ball around a point on the
border will, by definition, always contain a bit of X and a bit of the rest of
Rⁿ. So if you have a piece of the border, the set cannot be open, which
matches our intuition.

Figure 3.3: If you have a set with a border, you can quite easily convince
yourself that its complement is open, since no part of the border is in the
complement.

Proposition 3.1.1: Some consequences

Here are some of the consequences of our definitions.

• The empty set ∅ and the whole of Rⁿ are open sets.

• Any union of open sets is an open set. This is simple to see
(and prove). If you have a point p in the union, it must have
come from one of the open sets you were uniting, and if you can
find a ball there that works, then it also works in the union.
Adding other material does not change this simple fact, whether we
are uniting two, three, infinitely many, or uncountably many
open sets.

• However, only finite intersections have to still be open. This is simple
to see. If you take infinitely many balls (centred at 0) of radius
converging to zero, then the intersection will be the set containing
only the point 0, which is not open anymore.

• Similarly, only finite unions of closed sets are guaranteed to be
closed. The reason for this is the last example, really. You can do the
same thing with the complements of the balls, and in the end, if
you unite all of these, you get all of Rⁿ without the origin, which
is not closed.

• Any intersection of closed sets is closed. This follows from the second
item and the definition.

You can see the two examples given in the proposition in figure 3.4.
We can also express continuity of a function through open sets.

Proposition 3.1.2: Continuity through open sets

A function f : Rⁿ → Rᵐ is continuous if and only if the preimage of any
open set in Rᵐ is open. In formula:

∀ V ⊂ Rᵐ open: f⁻¹(V) is open    (3.1)

This can quite easily be proven to be equivalent to the usual ε–δ definition of
continuity, and is left for the reader as an exercise.
Continuity is something very local, and the usual ε–δ definition makes this
very clear and uses the spatial relations between points a lot.
This means that in Rⁿ, we can define continuity (and in the end spatiality)
only through the use of open (and/or closed) sets. This is what we will generalize.

Figure 3.4: On the left you have the union of a few open sets. Any
point in the combined set is still surrounded by a ball (circle) containing only
points of the combined set, because the ball (circle) from the original set the
point comes from works for this. On the right, you have the intersection of
many balls (circles) whose radius is getting smaller and smaller (all open).
After intersecting all of them, you get a set containing only the origin, which
is not open anymore.

3.2 A topological space


Now we can take these concepts from Rⁿ and use them to give our space some more
structure, more than the only piece of information we had until now: "it's a set".
We can encode spatiality by having open sets on the space. For a general
set, we cannot construct the open sets; they need to be given. In the case of Rⁿ,
we constructed them using balls; in the general case we do not have balls we can
use to construct open sets with, because balls are geometrical objects!
We need to choose some axioms for how we want open sets to behave, otherwise
there is no difference between open sets and arbitrary subsets of a space X. We mirror
the case of Rⁿ completely here, by simply taking the first three items from the
proposition of the last section as the axioms. This we turn into a definition.

Definition 3.2.1: Topological space and topology

A set X is called a topological space if it is equipped with a set of "open"
sets, called a topology T. The topology has to have the following properties:

• ∅, X ∈ T

• If Uᵢ ∈ T for all i in some index set I, then ⋃_{i∈I} Uᵢ ∈ T.

• If Uᵢ ∈ T for all i in a finite index set I, then ⋂_{i∈I} Uᵢ ∈ T.

This definition might look slightly different, but it captures exactly the three ideas
we wanted to steal from Rⁿ. An index set here means simply a set (for
example {1, 2, 3, . . . }) whose elements we use as indices; finite means that it has
a finite number of elements.
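For a finite set X the axioms can be brute-forced. The helper below (our own, hypothetical) checks a candidate topology given as a set of frozensets; for a finite collection, closure under pairwise unions and intersections already implies closure under all the required unions and finite intersections:

```python
def is_topology(X, T):
    # Axiom 1: the empty set and X itself are open.
    if frozenset() not in T or frozenset(X) not in T:
        return False
    # Axioms 2 and 3: closure under unions and (finite) intersections.
    # For a finite T, checking all pairs suffices.
    return all(A | B in T and A & B in T for A in T for B in T)

X = {1, 2, 3}
T_good = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(X)}
T_bad = {frozenset(), frozenset({1}), frozenset({2}), frozenset(X)}
print(is_topology(X, T_good))   # True
print(is_topology(X, T_bad))    # False: the union {1, 2} is missing
```
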
We can immediately define continuity of a function between two topological
spaces.

Definition 3.2.2: Continuity for topological spaces

Let X with TX and Y with TY be two topological spaces and f : X → Y


a function between the two. Then we call f continuous, if for any open
set V ∈ TY , f −1 (V ) ∈ TX , that is if the preimage of any open set (in the
Y -sense) is open (in the X-sense).

This definition is, of course, exactly like the axioms, stolen from Rⁿ.
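The definition can be tested directly on finite spaces. A sketch (our own names), using the two-point Sierpiński-style space with open sets ∅, {1} and X:

```python
# f is continuous iff the preimage of every open set of Y is open in X.
def preimage(f, X, V):
    return frozenset(x for x in X if f(x) in V)

def is_continuous(f, X, TX, Y, TY):
    return all(preimage(f, X, V) in TX for V in TY)

TX = {frozenset(), frozenset({1}), frozenset({0, 1})}
TY = TX
print(is_continuous(lambda x: x, {0, 1}, TX, {0, 1}, TY))      # True
print(is_continuous(lambda x: 1 - x, {0, 1}, TX, {0, 1}, TY))  # False:
# the preimage of the open set {1} under x -> 1 - x is {0}, which is not open.
```
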


This is the first structure we add to our space. We want it to be a topological
space, that is, we want to be able to speak about spatial relations, like continuity.
There is a lot you can do with a topological space without adding any more
structure to it. We will talk about these in the interlude at the end of the chapter,
which is recommended for anyone not familiar with topology and includes concepts
ranging from convergence, paths and connectedness to separation axioms.

But before we go into all that detail, which, while interesting, is not the direct
subject of the lecture, we want to talk about a few more ideas that will be more
directly relevant to us.

3.3 Charts: Part I


Our ultimate goal will be to do differential geometry, or very simply said, geometry
with derivatives. But the concept of a derivative needs numbers somewhere in the
problem. We need some descriptions of the space we work with that uses numbers,
otherwise known as coordinates, parametrizations or charts. These three concepts
are very similar. Let’s start with charts.
What is a chart? Very simply put, it is a map (in the geography sense, not in
the math sense). Let us take a two-dimensional space as an example. Then a chart
is a map in the intuitive way. It is a piece of paper which represents the space we
describe. We have points on the piece of paper and some way of knowing which
points these correspond to in the space we are describing. Like with real-life maps,
the map does not need to describe all of the space at once, but can simply represent
a part of the space. Like with real-life maps, the map does not need to represent
the geometry of the space well, often it is even impossible to construct a map that
represents, for example, both the lengths and angles accurately.
How do we transform this into a functional definition?

Figure 3.5: The earth and a map of the earth. We can have a map of only
part of the earth and as we know all of the maps we have cannot represent
the geometry of the earth well. A common example is that the size of
Greenland in the Mercator map is similar in size to Africa, even though in
reality it is about one-fifteenth Africa’s size

Well, mathematically, what we need is (1) the piece of the space the chart
describes, (2) the piece of Rn you draw the map in and (3) a way to assign every
point in (1) a point in (2). The latter is of course just the description of a function.

Definition 3.3.1: A chart (topological definition)

Let X be a topological space as defined above and U ⊂ X open. A function
Ch is called a chart if it is a function Ch : U → Rⁿ for some n, and as
a function Ch : U → Ch(U) it is continuous and bijective. We also require
Ch(U) ⊂ Rⁿ to be open.

We sometimes call Ch the chart and sometimes the tuple (U, Ch) the chart,
depending on which one is most convenient.¹
With this definition, we have fulfilled both (1) and (3), and (2) is just Ch(U ).
As you can see in the definition, we require bijectivity, that is, we don’t want our
map to have two points on the map corresponding to the same point in the space,
nor do we want two points on the space being shown as one on the map.
We can also see this more topologically, by introducing the notion of a
homeomorphism.

Definition 3.3.2: Homeomorphism

Let X, Y be two topological spaces with topologies TX , TY . We define a


function f : X → Y to be a homeomorphism, if it is:

• bijective

• continuous (if V ∈ T_Y, then f⁻¹(V) ∈ T_X)

• such that its inverse f⁻¹ is also continuous (if U ∈ T_X, then f(U) ∈ T_Y)

Continuity of course refers to topological continuity.

In this sense, a chart is a homeomorphism between an open subset of X and an


open subset of Rn .
A chart gives us (local) coordinates for X. You can see this in figure 3.6.
These coordinates are simply the numbers that the chart maps points to.

Definition 3.3.3: Coordinates

We can write a chart as a function that takes points p ∈ X to Ch(p) =
(x¹(p), x²(p), . . . , xⁿ(p)). The functions x¹, x², . . . , xⁿ are called
the coordinates associated with the chart, and evaluated at a specific point
p they are called the coordinates of p.

We can construct the parametrization P : Ch(U ) → U simply by taking the


inverse of Ch, which has to exist since we require bijectivity. So we get P = Ch−1 :
Ch(U ) → U .
¹ Formally, a chart is defined as the tuple, so if you need to show something formally, use
the tuple.

Figure 3.6: A topological space X with an open subset U and its map/chart
onto a portion Ch(U) of Rⁿ. The coordinate lines from the map can be
projected back onto the space, giving us coordinates for the space.

Definition 3.3.4: Parametrization

We can construct a parametrization of U from a chart (U, Ch) simply by
taking the inverse of Ch, restricted to Ch(U).

P = Ch⁻¹ : Ch(U) → U    (3.2)

Then P is of course a function P(x¹, x², . . . , xⁿ) = p ∈ U and parameterizes
U in the usual sense.



Example 3.3.1: The Sphere, again

The sphere, as a subset of R³ and a two-dimensional surface, is a topological
space if you equip it with the usual open sets that come from R³.
Specifically, you can write the topology as:

T_{S²} = { U ⊂ S² | ∃ V ⊂ R³ : V ∈ T_{R³} and V ∩ S² = U }    (3.3)

Or, simply put, each open set on S² is matched on the sphere by a bigger
open subset of R³.
There exists a standard chart, which is simply the chart that describes the
sphere by longitude and latitude. It's easier to write the parametrization
down first, so we will start with that.

P : (0, 2π) × (0, π) → S² \ (S² ∩ {(x, 0, z) | x ≥ 0, z ∈ R})    (3.4)

(ϕ, θ) ↦ (cos(ϕ) sin(θ), sin(ϕ) sin(θ), cos(θ))    (3.5)

From this, we can construct a chart by taking the inverse. You can work this
out quite easily. If we call U the subset of the sphere that the parametrization
maps onto, then we get:

Ch : U → (0, 2π) × (0, π)    (3.6)

(x, y, z) ↦ (arctan(y/x), arccos(z))    (3.7)

with the arctan extended to a full angle in (0, 2π) (taking the signs of x and
y into account). You can check (or should know) that this is a bijective
continuous function, so (U, Ch) is a chart. There are, however, a myriad of
other charts you could use for the sphere, some of which we will introduce later.
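As a sanity check (our own, using atan2 as one concrete "extended arctan"), one can verify numerically that Ch inverts P:

```python
import numpy as np

# Parametrization P and chart Ch of the standard sphere chart.
def P(phi, theta):
    return np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta)])

def Ch(p):
    x, y, z = p
    # atan2 handles x <= 0; the modulus maps the angle into (0, 2*pi).
    return np.arctan2(y, x) % (2 * np.pi), np.arccos(z)

phi, theta = 2.3, 0.9
p = P(phi, theta)
print(np.allclose(Ch(p), (phi, theta)))    # True: Ch is the inverse of P
print(np.isclose(np.linalg.norm(p), 1.0))  # True: the image lies on S^2
```
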

We have now developed the tools to add another condition on what we want to work
with, calling the new type of space a topological manifold. The new requirement
will be quite simple: we will want to be able to use charts. We will require
that the whole space can be covered by charts, i.e., that there is no point in the
space for which you cannot construct a chart on some surrounding of it.² We do not
require, however, that there is one chart that covers the entire space. You can
easily see why in examples like example 3.3.1.
Before we add this extra structure, however, we will want to talk about another
condition that we will want to have, which eliminates (geometrically) ”pathological”
examples that we will definitely not want to work with, at least in this lecture.

² A surrounding is an open set containing the point it surrounds.



3.4 The Hausdorff Condition

We will want our new space to have a condition called the Hausdorff condition to
eliminate some weird examples we don’t want to have. Consider the example below
for a pathological case of a topological space that we won’t want to work with.

Figure 3.7: We can create the real line with two origins by gluing together
two real lines everywhere except at the origins of the two lines. This is
shown in (a). Sets that would be open on a real line (and have the origin
in them) are open if they contain at least one of the origins.

Example 3.4.1: The real line with two origins

We can construct a pathological example we want to avoid by taking two real
lines and gluing them togetherᵃ everywhere except at the origin. What we
get is the real line with two origins, as shown in figure 3.7. It is a topological
space, and you can even show that you can cover it with chartsᵇ, but it is
not something we want to work with. That is why we build the Hausdorff
condition into our new definition.
ᵃ In jargon: we take the quotient space.
ᵇ You can just project onto the real axis, always picking one or the other origin.

What is the problem with cases such as the real line with two origins? Well, in
some sense, there are two points where there should be one. You can't separate
them: in topological terms, there are no two disjoint open sets that each contain
one of the points. The way we will avoid this is simply by requiring this
separation property and throwing away all other cases.

Definition 3.4.1: The Hausdorff condition


Let X be a topological space. X is called a Hausdorff space if any two
points are separable by two open sets. More precisely, X is a Hausdorff
space if for any two distinct points p, q ∈ X there exist open sets O₁, O₂ ⊂ X
so that p ∈ O₁, q ∈ O₂ and O₁ ∩ O₂ = ∅.

We will require this of any space we work with and build this into our new
definition. We will call the new kind of space a topological manifold.

Definition 3.4.2: Topological Manifold

A topological manifold M is a topological space (M, TM ), which obeys the


Hausdorff condition and can be covered by charts.

There are many examples of topological manifolds; any curve or surface will do.
The circle is a great example, and so is the torus.

3.5 The ant


We have now added many conditions, some rather technical and we want to conclude
this chapter by coming back to our old friend, the ant. You can imagine our
progressive addition of requirements onto our space as additional abilities the ant
has. In the first stage, we had a simple set and our ant couldn’t do anything really,
it just knew the places that exist in the world it lives in. In a topological space,
the ant can go a bit further, it can ”see”. It can tell where things are spatially in
relationship to each other, but it cannot (yet) tell anything as intricate as distances.
In a topological manifold, it can draw charts and it has enough of these to cover its

entire space, but it cannot yet do much with these charts, in particular things like
derivatives are still out of its reach. Our next goal will be to enable the ant to tell
how things change when it changes the coordinates. This will be the topic of the
next chapter.

3.6 Interlude: Useful topology for the course


Will be updated throughout the course.
Chapter 4

Smooth Manifolds

We have taught the ant how to recognize where things are spatially. We now want
to prepare to teach it calculus. We won’t teach it calculus just yet, but we will
prepare it to do so, by making sure the charts it uses are compatible with each
other, which is the main topic of this chapter.

4.1 Charts II: Compatible charts

We have seen that a topological manifold can be covered by continuous (in the
topological sense) charts. To differentiate, we need something a bit better: a
differentiable, or preferably smooth, structure. Right now, there does not seem to
be an obvious, geometric way to define a derivative on the space without making
grand ad hoc assumptions that we are not prepared to make, since we want to
stay quite general. But let us, for the sake of the argument, say that we have
found a way to do it. There is an immediate way that differentiating could go
wrong if we do not pose further restrictions on our charts.

Imagine we have two charts, (U1 , Ch1 ) and (U2 , Ch2 ), either covering the same
region or covering regions that overlap somewhere (U1 ∩ U2 ̸= ∅) and we have some
function, let’s call it f : M → R. Whatever our derivative should be, the one
thing we will want is that if f is differentiable in the new sense, and if the charts
are sensible, then f should be differentiable as a function of the coordinates. But
this should hold for any chart we want, so in particular it should hold for Ch1 and
Ch2 . We can guarantee this if the transition map from one chart to another is
differentiable. This will be the topic of this chapter.


Definition 4.1.1: The transition map

Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts, with U1 ∩ U2 not empty, that
is, there is a region on M that both charts cover. Call this region U . Then
we define the transition map from the first to the second chart as:

T1→2 = Ch2 ◦ Ch1⁻¹ = Ch2 ◦ P1 (4.1)

More specifically, we define it as T1→2 : Ch1 (U ) → Ch2 (U ), since anything
else is senselessᵃ. The proper definition is: T1→2 = Ch2 |U ◦ P1 |Ch1 (U ) .
ᵃ We cannot define it on any bigger set, obviously.

You can see a picture with all the objects we are using right now in figure 4.1.

Figure 4.1: A picture showing all the main players of this chapter. We have
(two) charts Ch1 , Ch2 each covering a region U1 , U2 of the manifold. Each
chart has its own inverse, P1 , P2 associated with it, which are parametriza-
tions of the manifolds. On the region where U1 and U2 overlap (called U )
we can define transition maps T1→2 , T2→1 , which change charts.
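To make the transition map concrete, here is a small numerical sketch (the charts and their names are our own illustration, not part of the script): two graph charts on the circle S 1 , one projecting the upper semicircle to its x-coordinate, one projecting the right semicircle to its y-coordinate. On the overlap, the transition map Ch2 ◦ P1 is t ↦ √(1 − t²), which is smooth on (0, 1).

```python
import math

# Two hypothetical graph charts on the circle S^1 (our own example):
# Ch1 projects the upper semicircle to its x-coordinate,
# Ch2 projects the right semicircle to its y-coordinate.
def P1(t):
    # parametrization inverse to Ch1, for t in (-1, 1)
    return (t, math.sqrt(1 - t**2))

def Ch2(p):
    # chart on the right semicircle
    x, y = p
    return y

def T_1_to_2(t):
    # transition map Ch2 ∘ P1, defined on Ch1(U) = (0, 1)
    return Ch2(P1(t))

# On the overlap the transition map is t ↦ sqrt(1 - t²),
# which is smooth on the open interval (0, 1).
print(T_1_to_2(0.6))   # ≈ 0.8, since (0.6, 0.8) lies on S^1
```

Note that the transition map never mentions the circle itself: it goes from one piece of R to another, which is exactly the point of the definition.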

From now on, since we will be working with charts a lot and want to declutter
notation, we will not write on which sets each chart is defined; these are implied
to be the ”reasonable” ones. It is, however, a good exercise to keep track of
these, especially for the exam. We will also call U1 and Ch1 (U1 ) the same thing,
even though they are definitely not. Our reasoning is that Ch1 (U1 ) is our chart
representation of U1 : in the same way that you can point at a map and say ”Here
is America” even though you are pointing at a chart of America, you can call
Ch1 (U1 ) simply U1 .
We can now formalize our idea by calling two charts compatible if their transition
maps are differentiable. The only thing we will change is to require smoothness,
for convenience.

Definition 4.1.2: Compatible charts

Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts of M . We call the two charts
(smoothly) compatible if either U1 and U2 don’t overlap, or otherwise, if
their transition functions T1→2 and T2→1 are smooth.

With this, we can define an atlas and a preliminary definition of a smooth


manifold. An atlas, simply put, is a collection of charts that cover M and are all
compatible with each other. A smooth manifold (for now) is a topological manifold
with an atlas.

Definition 4.1.3: Atlas

An atlas is a set A of charts (Ui , Chi ) which cover M and are all smoothly
compatible with each other (all the transition maps between the charts are
smooth).

Definition 4.1.4: Smooth Manifold: Preliminary Definition

A smooth manifold M is a topological manifold equipped with an atlas. We


summarize all the requirements in the following list:

• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.

• The space is equipped with an atlas A of charts, which are smoothly


compatible.
• The charts are all homeomorphisms. (This comes from the definition
of a topological manifold.)

4.2 Examples of smooth manifolds


There are many examples of smooth manifolds, and now that we have a preliminary
definition we can give some examples in this chapter.

Example 4.2.1: Smooth manifolds

• Given that we started our journey with Rn as an inspiration and


generalized some of its properties, it is no wonder that Rn is an n-
dimensional smooth manifold.
• Any open subset of Rn is also a smooth manifold.

• The graph of a smooth function f : U ⊂ Rn → Rm with U open is a


smooth n-dimensional manifold.

Figure 4.2: The graph of a smooth function is a smooth manifold.


We can create a smooth chart Ch that covers all of the manifold
quite easily, by projecting (x, f (x)) down onto x.

• Any subset of RN that can be written, locally as the graph of a


smooth function (in some orthogonal coordinate system) is, of course,
a smooth manifold.

The last two examples aren’t too surprising. They are completely analogous to
their equivalents we saw while discussing surfaces and since spaces like surfaces are
what we want to generalize, it looks like we are on the right track here.
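As a tiny sketch of the graph example: for a hypothetical smooth function f (x) = x², the chart is the projection (x, f (x)) ↦ x and the parametrization is x ↦ (x, f (x)).

```python
# A sketch of the graph chart (f is our own example): the chart projects
# a point (x, f(x)) of the graph down to x, and the parametrization
# sends x back up to (x, f(x)).
def f(x):
    return x**2

def parametrization(x):
    # P : R -> graph(f) ⊂ R²
    return (x, f(x))

def chart(p):
    # Ch : graph(f) -> R, drops the second coordinate
    return p[0]

# Chart and parametrization invert each other, so a single chart
# covers the whole graph.
print(chart(parametrization(3.0)))   # → 3.0
```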
Now that we have the general examples done, we give a few more specific ones.

Example 4.2.2: The n-dimensional sphere S n

• The n-dimensional spheres S n are all manifolds. (S n is defined as


the set of all points in Rn+1 with Euclidean distance one from the
origin). You might ask yourself how many charts we need to cover S n
completely. Your first idea might be to do something similar to how
we covered S 2 in chapter 2 and describe half the sphere as a graph.
If you do this, you get (2n + 2) charts (exercise: verify this). You can
see the two examples S 1 and S 2 in the following figure.

Figure 4.3: The six different charts (only the open sets are shown)
that you need to map S 2 with the first method.

It turns out that you don’t actually need this many charts to cover
S n ; you can do with just two. What you need for this is the
stereographic projection. You can see the stereographic projection
of S 2 onto the plane in figure 4.4. The way you project a point
p = (x1 , . . . , xn , xn+1 ) from the sphere onto the (hyper)-plane is by
drawing a straight line from the north pole N = (0, . . . , 0, 1) through
p. The point at which the line hits Rn is the point it gets mapped on.
Notice that in two dimensions it is easy to see that the southern hemi-
sphere gets mapped onto the disc inside the sphere, while the northern
one takes up all of the rest of the plane. This way you can create a
chart that covers the entire sphere except for the north pole. This is
the first chart. The second one is also the stereographic projection,
but this time you do it from the south pole, and that covers the entire
sphere except for the south pole. Together you have an atlas of size
two.

The formulas for the two projections are:

ChN (x1 , . . . , xn , xn+1 ) = (x1 , . . . , xn , 0)/(1 − xn+1 ) (4.2)
ChS (x1 , . . . , xn , xn+1 ) = (x1 , . . . , xn , 0)/(1 + xn+1 ) (4.3)

(Note the denominator 1 − xn+1 for the projection from the north
pole: the image runs off to infinity as the point approaches N .)
As an exercise, you should derive these formulas and show that they
are compatible.

Figure 4.4: The stereographic projection. To produce the stereo-
graphic projection of a point p on S n , you draw a line between the
north pole N and the point p. You then follow the line to the point
where it crosses Rn × {0}. This point is then the projection of the
point p.

You can also check that the charts from the first method (2n + 2
charts) are compatible with the charts obtained from the stereographic
projection.
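As a numerical illustration of that exercise (not a proof), one can check on sample points of S 2 that the two stereographic charts agree up to the transition map y ↦ y/|y|², which is smooth away from the origin. The code and point samples below are our own sketch:

```python
import math, random

# Stereographic charts on S² and their transition map (a numerical
# illustration of the exercise; the sample points are our own choice).
def ch_N(p):
    # projection from the north pole, defined away from N
    x1, x2, x3 = p
    return [x1 / (1 - x3), x2 / (1 - x3)]

def ch_S(p):
    # projection from the south pole, defined away from S
    x1, x2, x3 = p
    return [x1 / (1 + x3), x2 / (1 + x3)]

def transition(y):
    # expected transition map Ch_S ∘ P_N : y ↦ y/|y|², smooth away from 0
    r2 = sum(yi * yi for yi in y)
    return [yi / r2 for yi in y]

random.seed(0)
for _ in range(20):
    v = [random.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    p = [c / n for c in v]   # a random point on S²
    for a, b in zip(ch_S(p), transition(ch_N(p))):
        assert abs(a - b) <= 1e-8 * max(1.0, abs(a))
print("Ch_S agrees with (y ↦ y/|y|²) ∘ Ch_N on the overlap")
```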

Example 4.2.3: Möbius Strip

The Möbius strip is another example of a smooth manifold. You can see
the Möbius strip in a bit of a different form in the figure below.

Figure 4.5: The Möbius strip.

4.3 The maximal atlas

We have given a preliminary definition of a smooth manifold, by saying that a


manifold is a space with a topology and an atlas, composed of charts which are
compatible with each other and cover the whole space. We now come to a practical
problem with our definition. In Example 4.2.2, we saw that we can construct two
atlases for the sphere, both of which are good enough in the sense that the charts
of each individual atlas cover the sphere and are compatible among each other (in
each atlas, individually).

But which atlas should we choose to ”define” the sphere as a manifold? And
do we get different geometries from different atlases like that? In the case of the
sphere, it certainly would be weird if we got different geometric results depending
on whether we used the stereographic atlas or the graph atlas. Added to this is the
fact that working in only one of them is not really comfortable. For example, if you
choose the graph atlas, there is no chart on which you can see both the north and
south pole on a single map; you need to use at least two charts, which is both more
complicated and somehow seems like an unnecessary problem. There is an easy solution to
this. Throw both of these atlases together. We already asked you to show that the
charts you get from the graphs and the ones from the stereographic projections are
compatible, so if you throw all these together, you still get a fully functional atlas.

In fact, while we are at it, why not just throw all possible compatible charts
together into one atlas, call it a maximal atlas and be done with it? This is exactly
what we choose to do and will be the modification to our definition.

Definition 4.3.1: The maximal atlas

Let A be an atlas of some smooth manifold (as per our preliminary defini-
tion). We can construct a maximal atlas by collecting together all possible
compatible charts with A. The maximal atlas is defined as:

Ā = {all charts (U, Ch) that are compatible with all (UA , ChA ) ∈ A }
(4.5)

Of course, to construct the maximal atlas we need some atlas to start with, and
the maximal atlas will depend on this. If you have two atlases that are compatible
with each other, however, they of course produce the same maximal atlas. In that
case we call the two atlases equivalent.

It should make sense, of course, that Ā is an atlas in its own right, all charts in
Ā are compatible with each other. We will give the proof, because it is a proof that
is similar to many other proofs in differential geometry of this kind and it is good
to have seen its kind once.

We will need two ideas for the proof, which seem quite trivial and are not too
hard to prove.

Lemma 4.3.1: Smoothness is a local property

Let U be an open region of some manifold M and (Ui )i∈I open sets that
cover U (which means that ⋃i∈I Ui = U ). Then a function f is smooth if
and only if f |Ui is smooth for all i.

Figure 4.6: It should be clear that smoothness is a local property.


A function should be smooth on U exactly when it is smooth on a
covering of U (a splitting of U into smaller parts).

This idea is quite easy to accept, since smoothness should be a local property,
after all, it is a generalization of the ϵ − δ kind of continuity and differentiation.
The second idea is even simpler.

Lemma 4.3.2: Composition of smooth functions is smooth

Let U, V, W be open sets of manifolds N, M, P respectively, and f : U → V
and g : V → W two smooth functions. Then g ◦ f is a smooth function.

Again this should feel obvious and the proof is not hard, it’s another one of the
typical chain rule proofs of differential geometry.
We will leave these two claims unproven, since their proofs are neither hard nor
illuminating and focus on the original idea we want to prove.

Proof. We want to show that Ā, which is a maximal atlas constructed from A,
is an atlas in its own right, that is, every two charts in Ā are compatible with
each other.

We start with choosing two charts in Ā, and we call them (V, ChV ) and
(W, ChW ). We want to show that they are compatible.
Let Z = V ∩ W ; we can assume that it is not empty, since otherwise we are
done. We want to show that TV →W (on Z) is a smooth map. The basic idea of

Figure 4.7: The charts we use for the proof.

the proof is that we do not transition from the first chart to the other directly,
but go over the charts from A (see figure 4.7). This is why we needed the second
lemma. The first we need because, in general, A need not contain a single chart
that covers all of Z, so we cut Z into parts and prove smoothness on each part,
which is where our first lemma will come in handy. Let’s start.
Firstly, cut Z up into pieces on each of which some chart in A exists that
covers that portion of Z (and maybe more beyond Z). By construction of the
maximal atlas, ChV and ChW are compatible with each (Ui , Chi ) ∈ A. So
define the sets Zi = Z ∩ Ui , which are open and cover Z. Then ChV (Z) =
⋃i∈I ChV (Zi ), and all of these are open as well, since charts are homeomorphisms.

Take TV →W and split it up into smaller maps over the ChV (Zi ). Because
of the first lemma, we only need to show that TV →W |ChV (Zi ) is smooth for all
i; the smoothness of TV →W as a whole map then follows from the lemma.
But we can use that:

TV →W |ChV (Zi ) = ChW ◦ PV (4.6)
= ChW ◦ (Pi ◦ Chi ) ◦ PV (4.7)
= (ChW ◦ Pi ) ◦ (Chi ◦ PV ) (4.8)
= TUi →W ◦ TV →Ui (4.9)

where all the functions are, of course, restricted to either Zi (all the charts) or
Chi (Zi ) (all the parametrizations and transition maps), which has been left
out for readability.
But ChV and ChW have to be compatible with all the Chi , so the two transition
maps in the last line are smooth. Since the composition of two smooth functions
is smooth, TV →W |ChV (Zi ) is smooth for every i, and since the Zi cover Z, the
whole transition map TV →W is smooth.

4.4 The final definition of a smooth manifold


We can now tweak our preliminary definition to get the full, proper, definition of a
smooth manifold. The only change we make is that the atlas has to be maximal.

Definition 4.4.1: Smooth Manifold: Final Definition


A smooth manifold M is a topological manifold equipped with a maximal
atlas. We summarize all the requirements in the following list:

• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
• The space is equipped with a maximal atlas Ā of charts, which are
smoothly compatible.
• The charts are all homeomorphisms. (This comes from the definition
of a topological manifold.)

4.5 Cartesian Products, Smoothness


In this section, we want to talk about a few more ideas that we need to understand
smooth functions. In particular, we want to discuss what happens if you take the

cartesian product of smooth manifolds and also the relationship between smooth
functions between manifolds and atlases.

4.5.1 The Cartesian product of two manifolds


One thing we would like to do is take small manifolds and create larger ones out
of these. A very natural way to do this is by using the Cartesian product. Going
back to Rn : if you take a copy of Rn and a copy of Rm and combine them using a
cartesian product, you of course get Rn+m , which is again a smooth manifold.
It turns out that the same happens when you take the cartesian product of two
smooth manifolds. The resulting space is a smooth manifold as well, in a very
natural way.
You can for example take two circles (two S 1 ) and combine them by the cartesian
product. The result is the torus.
The main question is how you can construct an atlas for a cartesian product of
two smooth manifolds. It turns out to be quite simple, you just take the cartesian
product of the charts of each atlas.

Definition 4.5.1: Atlas for a cartesian product

Let M and N be two manifolds, with dimensions m and n. Then you can
construct an atlas for M × N out of the two (maximal) atlases ĀM and ĀN
that belong to M and N . If (UM , ChM ) and (UN , ChN ) are two charts
from the individual atlases, then you can create a chart for UM × UN by
taking the cross product:

ChM ×N = ChM × ChN (4.10)

You can check that with this atlas (or the maximal version of it) you get a
smooth manifold, by checking all the conditions for a smooth manifold.
This definition also forces the dimension of M × N to be m + n as you would
expect, which you should convince yourself of.
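The product chart can be sketched concretely for the torus S¹ × S¹, using an angle chart on each circle factor (the angle chart is our own choice of example):

```python
import math

# A sketch of the product chart on the torus S¹ × S¹, built from an
# angle chart on each factor (the angle chart is our own choice).
def angle_chart(p):
    # chart on S¹ minus the point (-1, 0), with values in (-π, π)
    x, y = p
    return math.atan2(y, x)

def product_chart(pq):
    # Ch_{M×N} = Ch_M × Ch_N : just chart each factor separately
    p, q = pq
    return (angle_chart(p), angle_chart(q))

# A point of the torus is a pair of circle points; its chart image is a
# pair of angles, so dim(S¹ × S¹) = 1 + 1 = 2, as expected.
pt = ((0.0, 1.0), (1.0, 0.0))
print(product_chart(pt))   # ≈ (π/2, 0.0)
```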

4.5.2 Smooth functions between manifolds


We want to talk about smooth function between two manifolds, say M and N , with
dimensions m and n.
The first question is how we want to define smoothness on manifolds. We use
charts of course! But we need a bit more detail. Let’s say we have f : M → N ,
and a point p ∈ M so that f (p) = q ∈ N . Since smoothness is a local property,
we only need to check it locally. The general idea is to pick two charts, one of M
that charts a surrounding of p, and a second one for N that charts a surrounding of
q, and see if f is smooth as a function of the coordinates of the charts.
Firstly, however, we need a small condition on the pair of charts we use.

Definition 4.5.2: Admissible pairs

Let (U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN be two charts and f : M → N
the function whose smoothness we want to check. The two charts are called
an admissible pair if f (U ) ⊂ V , that is if the whole set U gets mapped into
V.

Definition 4.5.3: Smoothness of f : M → N

Let f : M → N be a function between two manifolds M and N . We say it


is smooth if for every point p ∈ M we can find at least one admissible pair
(U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN where p ∈ U so that the function in
coordinates is smooth. In other words, fc = ChV ◦ f |U ◦ PU : ChU (U ) →
ChV (V ) is smooth, where the c in fc stands for ”coordinates”.

Notice first that we did not define f to have to be smooth in coordinates for all
admissible pairs, only that one exists. You might ask yourself if this means that a
smooth f might not be smooth for an admissible pair that wasn’t used to check its
smoothness. It turns out that the answer is no. You only need to check smoothness
in one atlas, not the maximal atlas, and the definition then forces f to be smooth
as a function of coordinates for all other admissible pairs. The proof of this claim
is very similar to the proof that all charts in the maximal atlas are compatible with
each other. You take two charts that are an admissible pair, break U up into small
pieces for which there is a chart where it is smooth (as per the definition), and use
the fact that smoothness of functions between two Rn is local. We therefore choose
not to give it here.
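To see the definition in action, here is a sketch with charts and a map of our own choosing: the doubling map on the circle, written in coordinates through angle charts, where it becomes the visibly smooth function θ ↦ 2θ.

```python
import math

# A sketch of "f in coordinates": the doubling map on the circle,
# f(cos θ, sin θ) = (cos 2θ, sin 2θ). The angle charts and the
# admissible pair below are our own choices, not the script's.
def P(theta):
    # parametrization of S¹
    return (math.cos(theta), math.sin(theta))

def Ch(p):
    # angle chart with values in (-π, π)
    return math.atan2(p[1], p[0])

def f(p):
    # the doubling map, written on S¹ itself via double-angle formulas
    x, y = p
    return (x * x - y * y, 2 * x * y)

def f_c(theta):
    # f_c = Ch ∘ f|_U ∘ P, the map "in coordinates"
    return Ch(f(P(theta)))

# On U = {|θ| < π/4} the pair is admissible (f(U) stays inside the
# chart's domain) and f_c(θ) = 2θ, which is visibly smooth.
print(f_c(0.3))   # ≈ 0.6
```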

From this definition, you can immediately see that the following corollary has to
be true.

Corollary 4.5.1: Smoothness is a local property

The smoothness of a function f : M → N is a local property.



Example 4.5.1: Smooth functions between manifolds

We want to give a few examples of smooth functions between manifolds


which you already know, but might not have thought about in this way
before (except for the last two). The first three rely on the fact that Rn is
a manifold.

• Smooth curves on a manifold γ : (a, b) → M are in this sense smooth


functions, since (a, b) is a smooth manifold.

• Charts and parameterizations are smooth functions in the above sense,


which should convince you that our definition of a smooth function is
sensible.
• Functions u : M → R can be smooth.

• The definition of smooth maps for surfaces that we gave in Chapter


2 is equivalent to the new one.

The last two points should make it reasonable that our definition makes
sense.

Another thing we definitely would like to have is that the composition of two
smooth functions is a smooth function and that smooth functions in the above sense
are also continuous in the topological sense, both of which turn out to be true.

Proposition 4.5.1: Composition of smooth functions is smooth

Let f : M → N and g : N → P be two smooth functions and M, N, P


smooth manifolds. Then g ◦ f : M → P is also smooth.

Proposition 4.5.2: Smooth functions are continuous

Let f : M → N be a smooth function in the manifold sense. Then it is


continuous in the topological sense.

Since both of these propositions seem very natural and their proofs aren’t too
interesting, we leave them out. You can show the first one easily by using charts,
similar to other proofs in this chapter, and the second by using that continuity is a
local property and that charts are homeomorphisms.

4.6 Diffeomorphisms
Now that we have done a lot of technical detail, we want to talk about when (at
this stage) you cannot tell the difference between two manifolds, similar to how
two topological spaces are pretty much the same thing (equivalent, homeomorphic)

if there is a homeomorphism between them. The main point back then was that
we needed a bijective continuous map between the two spaces. The idea was that
we have a structure on the space, and if the structure is the same between two
spaces, we cannot tell the difference with any tool we have that comes from these
structures. The only new structure we have at this stage is a smooth atlas, so you
are probably not too surprised that the bijection will need to be smooth and its
inverse as well. When we have such a map, we call it a diffeomorphism and the two
spaces are diffeomorphic.

Definition 4.6.1: diffeomorphism

Let f : M → N be a function. It is called a diffeomorphism if it satisfies


the following conditions.

• f is a bijection.
• f is a homeomorphism. If the two spaces are supposed to be ”the
same space”, then we shouldn’t be able to tell them apart by their
topologies, so this condition makes sense.
• f is smooth, and its inverse f −1 is also smooth. This is to make sure
that the maximal atlases are ”pretty much the same” and we cannot
tell M and N apart from their smooth structure.

If such an f exists, we call M and N diffeomorphic.

There are a few things to note about the definition. Firstly, it forces dim(M ) =
dim(N ), which should make sense. It should not be possible for the circle and the
sphere to be the same thing, and they are not. Secondly, the second requirement is
not actually necessary, because charts are homeomorphisms and the second condition
follows from the other two. A diffeomorphism is automatically a homeomorphism
without the second condition; we just added it so that you can see very clearly
how a diffeomorphism respects the entire structure, not just the atlas.

4.6.1 What you can get from diffeomorphisms


We want to mention a few results that you can get from studying diffeomorphisms,
without going into too much detail.
The set of self-diffeomorphisms Diff(M ) of a manifold M forms a group, and
this group is huge, usually infinite-dimensional.
Starting in dimension four, not all topological manifolds possess a smooth
structure (a maximal atlas). Other topological manifolds have more than one
maximal atlas, and these need not be diffeomorphic to each other. The last one
might seem like something that happens only in some weird edge cases, but it
actually even happens to R4 . (Result of Freedman/Donaldson in the 1980s.)
Chapter 5

Tangent Vectors

Now that we have defined what a smooth manifold is, we want to be able to do
something inside of it. It is all well and good to talk about diffeomorphisms and
charts, but we need some objects in the manifold to have interesting results about.
The obvious first candidate for something interesting on a smooth manifold is a
vector. After all, Rn , curves and surfaces all have vectors associated with them and
a lot of the nice results from the first part of the lecture had something to do with
vectors.
There is a small problem with just ”lifting” the definition of vectors from Rn to
manifolds directly, without any thought. The problem is that a manifold does not, in
general, have a natural vector space structure. You can’t ”add” points on a general
manifold in a very sensible way. What is the north pole plus the south pole on a
sphere, which is not embedded in R3 ? This question does not even seem sensible,
and any addition you would define at this stage would seem very arbitrary and definitely
not natural. Now, there are many ways to look at the idea of what a vector in Rn
is. You have the obvious one, a vector is ”an arrow” or the mathematical one of it
being an element of a set with an addition and scalar multiplication which obey the
following axioms...
But neither of these help us right now. Of course, we always want to draw
vectors as arrows, and mathematically, the object we want to work with should be
vectors in the vector-space-vector definition, but they are not helpful yet.
One idea is clear even at this stage. Whatever concept of a vector we choose
to generalize, we will only generalize tangent vectors (for example of curves/surfaces).
The reason is simply that all other vectors on curves and surfaces (normal
vectors) did not come from the curve/surface itself but from the Rn that we embedded
the curve/surface in. So only tangent vectors are appropriate if we don’t want to
have the influence of some ambient space that our manifold lives in.
There are four different ideas we can generalize out of Rn that end up being
equivalent.

• The first one is very simple: we treat vectors in charts. Vectors make sense
in charts (since charts go to Rn ), and we can point at a vector in one chart,
look at another chart, and check through the transition map which vector it
is there. This definition defines tangent vectors through equivalence classes
of vectors on charts. We don’t know what they are on M , but we can chart
them.

• A bit more on the computational side, we can define vectors through direc-
tional derivatives of smooth functions. We know that you can differentiate a
smooth function f : Rn → R in the direction of X and get

DX f = X 1 ∂f /∂x1 + X 2 ∂f /∂x2 + · · · + X n ∂f /∂xn (5.1)

This derivative contains the same information as what we would usually call
the vector, that is, the quantities (X 1 , X 2 , . . . , X n ).

• More on the geometric side, we can use curves to define tangent vectors. We
can use the fact that from a geometric standpoint, a vector tangent to a
curve (multiplied with a small ϵ) looks like a small piece of the curve itself, in
the usual Rn cases. This way we define a tangent vector as a ”small piece of
a curve”. Since many curves have the same tangent vector, we will also use
equivalence classes here.

• On the more abstract side, we can define tangent vectors as linear operators
on smooth functions to R from the manifold, which satisfy the product rule.

X op (f g) = X op (f )g + f X op (g) (5.2)

These four ways are all equivalent to each other and can all be used to define
tangent vectors. They all represent different ways to think about tangent vectors
(charts, computation, geometric (curves), abstract) and this variety gives you a lot
of ways to tackle a mathematical problem. Sometimes the geometric picture will
be more applicable, sometimes the computational etc.
Of particular note is the second definition, which, because of its relationship
with partial derivatives, has fostered a notation in differential geometry which can
be confusing at first, but to which one gets used quite fast.
For our purposes, the first two definitions will be most useful, and we will take
the most time discussing them.
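The directional-derivative picture from the second idea above can be sketched numerically; the function f and the vector X below are our own examples, and the central difference approximates the derivative along X.

```python
# A numerical sketch of the directional derivative
# D_X f = X¹ ∂f/∂x¹ + ... + Xⁿ ∂f/∂xⁿ, via a central difference
# along X (f and X are our own examples).
def f(x, y):
    return x**2 * y

def D_X(f, p, X, h=1e-6):
    # central difference of t ↦ f(p + tX) at t = 0
    plus = f(p[0] + h * X[0], p[1] + h * X[1])
    minus = f(p[0] - h * X[0], p[1] - h * X[1])
    return (plus - minus) / (2 * h)

# At p = (1, 1) with X = (1, 2): ∂f/∂x = 2xy = 2 and ∂f/∂y = x² = 1,
# so D_X f = 1·2 + 2·1 = 4.
print(D_X(f, (1.0, 1.0), (1.0, 2.0)))   # ≈ 4.0
```

Knowing D_X f for all smooth f pins down the components (X 1 , . . . , X n ), which is why this operator can serve as the vector itself.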

5.1 Tangent vectors from charts


The first possibility of defining tangent vectors on a manifold is by using charts.
We know how vectors work in Rn and can therefore use this knowledge to define
something resembling a vector on a manifold. We will recapitulate how vectors
behave with maps in Rn , and then go on to define them properly on manifolds.

5.1.1 Vectors and Maps in Rn


Imagine you have two copies of Rn (or open subsets) that are connected via a
(smooth) map f : Rn → Rn , which you can see as a coordinate change for the
underlying space. Then pick a point p and a vector X ∈ Tp Rn ≈ Rn . Where does
the vector X get mapped to? That is a simple idea from calculus: it gets mapped
to dfp (X), because the Jacobian (df ) tells us how the map f behaves locally, and
a vector is a local object (it sits at the point p).
You can see the situation in figure 5.1.

Figure 5.1: A vector X in Rn gets mapped to dfp (X) when f is applied to


Rn .

5.1.2 Back to manifolds


Now what can we say when working with manifolds? Well, we can use the situation
from the last section, by saying that the first Rn (or more precisely an open subset
U1 ⊂ Rn ) is the coordinate space for a chart Ch1 and the second one for Ch2 ,

which are connected by T1→2 . Then if you are working in the first chart, which
means you are working in the first Rn , you know what a vector is in the chart,
it’s just a tuple (X 1 , . . . , X n ) ∈ Rn situated at p and you can work with it in the
chart. What happens when you use the second chart? Well, the coordinates of the
vector X, which previously lived in the first chart, will just be the Jacobian of the
transition map at p applied to X:

(X ′1 , . . . , X ′n ) = dT1→2 ((X 1 , . . . , X n )) (5.3)

or written out:

X ′i = ∑_{j=1}^{n} (∂T i /∂xj ) X j (5.4)

where we wrote T = (T 1 (x), . . . , T n (x)) instead of T1→2 , and when clear will
continue to do so.
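As a sketch of the transformation law (5.4), take the familiar transition map T from polar to Cartesian coordinates (our own choice of example) and push a vector through its Jacobian, approximated here by central differences:

```python
import math

# A sketch of the transformation law X'^i = Σ_j (∂T^i/∂x^j) X^j for the
# hypothetical transition map T from polar to Cartesian coordinates,
# with the Jacobian approximated by central differences.
def T(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def push_forward(X, p, h=1e-6):
    # apply the Jacobian dT_p to the vector X
    r, th = p
    dT_dr = [(a - b) / (2 * h) for a, b in zip(T(r + h, th), T(r - h, th))]
    dT_dth = [(a - b) / (2 * h) for a, b in zip(T(r, th + h), T(r, th - h))]
    return [dT_dr[i] * X[0] + dT_dth[i] * X[1] for i in range(2)]

# The radial vector X = (1, 0) at (r, θ) = (2, π/2) should become the
# Cartesian vector (cos θ, sin θ) = (0, 1).
Xp = push_forward([1.0, 0.0], (2.0, math.pi / 2))
print([round(c, 6) for c in Xp])   # → [0.0, 1.0]
```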
By taking all other charts, you can find which vector X corresponds to in the
other charts, and what you get is a working definition of a vector without having
really talked about the manifold at all. Figure 5.2 might make this clearer.
Let’s say we have a plane flying over the point p with a physically real velocity
described by the vector X on the Mercator chart (square). We can describe the
path of the plane perfectly well in that chart without any reference to the actual
Earth. If we want to switch to a new chart, maybe because it is easier to see
something, or to represent our country as bigger than others, we can easily do that
with the transition map and bring the velocity vector to the new chart using the
Jacobian of the transition map. This is the idea of this definition: we take vectors
in charts and use them only in charts, really, with no mention of the manifold.

5.1.3 The definition

We want to pack the idea of using vectors in charts into a working definition. The
way is quite simple. We say a vector is simply the collection of all vectors in charts
that transform into each other, and we can take one representative (in one chart)
as the example of the vector we talk about.

Figure 5.2: The Mercator and stereographic projection of the earth (/parts
of the earth). Imagine at a real point p on the earth, there is a plane flying
with a velocity described by the (blue) tangent vector X. We can talk of
its velocity vector as an arrow on both of these pictures (charts) without
making any reference to the actual manifold, that is the earth. We can
describe the path of the plane and all the information about it we would like
using only the charts and nothing else.

Definition 5.1.1: Tangent vector definition 1: Through charts

Our first definition is through charts. Let M be a manifold and p a point


on it as always. A tangent vector X at p is defined as an equivalence class
of vectors in charts identified with each other through transition maps. Two
vectors X ′ , X ′′ in different charts Ch1 , Ch2 are equivalent in this sense if:
• The point p is covered by both charts p ∈ U1 ∩ U2
• The second vector is just the Jacobian of the transition map acting
on the first: X ′′ = dT1→2,p (X ′ ).

We always have the point p be a part of the vector; a tangent vector never
exists without a point p it sits at.

It should not surprise you that the equivalence relation we presented is an
actual equivalence relation in the mathematical sense (reflexivity, symmetry,
transitivity). If you want, you can prove it; the proof is another one of those
"chain rule" proofs we see quite often.
You might wonder why we require the point p to always be included with the
vector. The reason is that we don’t really have a good way of saying two vectors,
sitting on two different points p, q are the same vector but shifted. In Rn , this is
simple. We can simply translate the arrows to get one to another, and by this,
we can have a very natural way of determining if two vectors are the same. But
have a look at the sphere (figure 5.3). This clearly doesn't work here. Simply
translating, even in a way where locally the vectors look parallel,¹ you can get back
to the original point and get a different vector! Transporting vectors on manifolds
is clearly not as simple as in Rn and requires more thought. In fact, this is the first
hint at curvature.
This is why we always carry the point around with the vector: we cannot
yet do anything with vectors sitting at two different points.
We can also, along the lines of our discussion of surfaces, define the tangent
space, which we have already mentioned a few times in this chapter (specifically
the one of Rn ), but have not yet properly introduced. The definition is quite simple
and doesn’t require much talk.

Definition 5.1.2: Tangent-space

We define the tangent space Tp M at a point p of the manifold M to be the


set of all of its tangent vectors.

You really shouldn’t be too surprised that Tp M is a vector space of dimension n.


A heuristic argument would be that we have glued together all the tangent spaces
of Rn at the points Chi (p) for all the charts together, so we are really just left with
one glued-together copy of the tangent space of Rn , which is just {p} × Rn and all
1 after all, the sphere locally looks like R2 , as you should know from living on the earth.

Figure 5.3: In Rn , you can transport vectors and compare them without
problems, we therefore don’t need to always say which point the vector is
situated at. But if we move onto the sphere, this is not the case anymore.
You can transport a vector on the sphere so that it locally stays parallel to
itself, and after performing a loop, come back and get a different vector!

the rules from there get copied over to Tp M .

5.2 Tangent vectors as directional derivatives

In the last section, we introduced tangent-vectors as things that we know how to


describe in charts, but don’t really understand on the manifold. We know how to
map them, but we don’t know what they are. This definition gives a more compu-
tational view of things. We take inspiration from the fact that we can differentiate
functions along the direction of a vector in Rn .
We know that the derivative of a (smooth) function u : Rn → R in the direction
of X (vector) at p is:

    D_X u(p) = Du(p)(X) = \sum_{i=1}^{n} \frac{\partial u}{\partial x^i}\, X^i    (5.5)
    = X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n}    (5.6)

We can use this. The components of X are very explicit in this equation. Notice also
that if you view this over all possible smooth functions, the directional derivatives
are operators on smooth functions. Even more so, every tangent vector produces
its own derivative operator, and two different tangent vectors produce two different
operators.
We can also rewrite the above equation using curves. If γ is a curve so that
γ(0) = p, γ ′ (0) = X, then:

    \frac{d}{dt}\, u(\gamma(t)) \Big|_{t=0} = \gamma'(0)^1 \frac{\partial u}{\partial x^1} + \gamma'(0)^2 \frac{\partial u}{\partial x^2} + \dots + \gamma'(0)^n \frac{\partial u}{\partial x^n}    (5.8)
    = X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n} = D_X u(p)    (5.9)

Again, notice that if we take all possible smooth functions, different vectors produce
different operators. We can use the last equation very easily to define vectors on
manifolds, we don’t need any other structure. We just take the last equation as the
definition.
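The equality of the two expressions above can be verified numerically. In the following sketch, the function u, the point p, the vector X, and the curve are all invented for illustration:

```python
import numpy as np

# Hypothetical example: check D_X u(p) = d/dt u(gamma(t)) at t = 0
# for a curve with gamma(0) = p and gamma'(0) = X.
def u(x):
    return x[0] ** 2 * x[1] + np.sin(x[1])

def grad_u(x):
    return np.array([2 * x[0] * x[1], x[0] ** 2 + np.cos(x[1])])

p = np.array([1.0, 2.0])
X = np.array([3.0, -1.0])

# Left side: equation (5.5), the sum over partial derivatives
DXu = grad_u(p) @ X

# Right side: differentiate u along a curve with velocity X at p;
# the quadratic term does not affect the derivative at t = 0
gamma = lambda t: p + t * X + t ** 2 * np.array([5.0, -7.0])
h = 1e-6
rate = (u(gamma(h)) - u(gamma(-h))) / (2 * h)

print(abs(rate - DXu) < 1e-5)
```

Running this prints True: any curve with the right position and velocity produces the same directional derivative.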

Definition 5.2.1: Tangent vector definition 2: Through derivatives

Let p be a point on M . We can define a tangent vector as an operator:

X = (p, X op ) (5.11)

where X op is a linear operator on the smooth functions on M , C ∞ (M ),


which arises as the derivative of functions along smooth curves.

    X^{op} : C^\infty(M) \to \mathbb{R}    (5.12)

    X^{op}(u) = \frac{d}{dt}\, u(\gamma(t)) \Big|_{t=0}    (5.13)

where γ is some curve on M , but always the same curve for all the functions u.

Notice how, as with the first definition, we use external objects (there charts,
here smooth functions) to define things on M . This is quite common in differential
geometry. We will drop the op from X op from now on and just call it X as well,
but always have it in the back of our head that we need a point p where the vector
sits, for the same reason as with the first definition. As a reminder we will also
sometimes write Xp instead of X.
We can define the tangent space again.

Definition 5.2.2: Tangent space 2

The tangent space Tp M at p according to the second definition is the set


of all vectors X = (p, X op ) in the sense of the above definition.

Notice that this time Tp M ⊂ {p} × Hom(C∞(M ), R), which is a vector space.
This time, however, it is totally not obvious that Tp M is a vector space. It is in no
way obvious that, for example, if X, Y are in Tp M , then X + Y is in it too, because
we do not know how to add curves on M . In Rn it is obvious, but certainly not
on a general manifold. Imagine even just the earth and two curves, for example the
ones of a plane flying from Zurich to London and from Zurich to New York. There
is no reasonable way of adding them. What would that even mean? Would you end
up in Greenland?
Where we can do this is in charts, however. The one important point here,
though, is that the curve we get is dependent on the chart: depending on the
chart, if you add the two curves from before, you can get anywhere from Greenland
to Brazil, but locally the two curves will be the same, in the sense that they will
have the same tangent vector.
We will not prove that this version of Tp M satisfies all the axioms of a vector
space; rather, and more interestingly, we will show that it is a whole vector space
in the sense that it has a basis and that basis spans the entire vector space, leaving

Figure 5.4: We can define tangent vectors by differentiating smooth func-


tions along curves.

none out and spanning no operator that is not a tangent vector.²

Figure 5.5: We don't know how to sensibly add curves on a manifold
directly. We can do it in charts, but depending on which chart we use, the
resulting curves will look different. Here, on this example of the earth, the addition of
the two paths of our plane lands you in Cuba if you use the Mercator pro-
jection, or somewhere in Antarctica if you use the stereographic projection.
That is not an ambiguity we want if we don't want our vacation ruined.
Notice, however, that both curves result in the same tangent vector.

5.2.1 The basis of Tp M with the second definition.


It should be obvious that there is no natural (canonical) basis of Tp M , we don’t
have preferred directions in any way, at least not with the structures we have now.
2 No linear operator which does not come from a curve

So we are forced to use charts. Let us fix a point p and a chart Ch which covers p.
We can construct a basis of Tp M by using the basis of Rn . For simplicity, let
us say that p gets mapped to (0, 0, . . . , 0). Then we have the basis vectors of Rn
at the origin (= p̃ = Ch(p)), which we can call e1 , e2 , . . . , en . Which curves could
we use to construct the basis of Tp M ? Well, the coordinate lines of course! The
coordinate lines are defined as follows:

    \tilde{\beta}_i(t) = t\, e_i    (5.14)

where β̃i is the i-th coordinate line. Then we can project them back onto the
manifold, βi (t) = PCh (β̃i (t)), as you can see in figure 5.6.
We can then define the i-th basis vector of Tp M as the one that one gets from
the i-th coordinate line, projected back onto the manifold. In the coordinate space,
this vector has coefficients (X^1, . . . , X^n) = e_i, so it belongs to the operator:

    X^1 \frac{\partial}{\partial x^1} + X^2 \frac{\partial}{\partial x^2} + \dots + X^n \frac{\partial}{\partial x^n} = \frac{\partial}{\partial x^i}    (5.15)

This motivates a new notation for this basis. We can now write:

    \left(\frac{\partial}{\partial x^i}\right)^{op}_{p,\,Ch} = \text{the vector gotten from the i-th coordinate line}    (5.16)

where we remind ourselves that it sits at p, that it comes from the chart Ch, and
that it is an operator. (In the future, we will leave out all these little reminders.)
We can turn this into a definition.

Definition 5.2.3: Standard basis of Tp M for a chart Ch

We define the standard basis of Tp M with respect to a chart Ch as:

    \left(\frac{\partial}{\partial x^i}\right)^{op}_{p,\,Ch}(u) = \frac{d}{dt}\, u(\beta_i(t)) \Big|_{t=0}    (5.17)

where βi is the i-th coordinate line of the coordinate space Ch(U ), projected
back onto the manifold.

What is left to do now is to prove that this is a basis. For this, we would first like
to find the coefficients of our vectors, to make the work a bit easier.

5.2.2 The coefficients of a vector


If we ever want to go computational with vectors, we need their coefficients. We
need some tuples to work with. Getting to the coefficients is not hard, however,
we just need a chart Ch and use the standard basis of Tp M induced by that chart.
Then we can write X as:
    X \cdot u = \frac{d}{dt}\, u(\gamma(t)) \Big|_{t=0}    (5.18)

for some curve γ which is appropriate for X. Now, this is a map from R to R, going
over the manifold. We can eliminate the manifold, by going to the coordinate space
and back with a chart and parametrization.

    X \cdot u = \frac{d}{dt} \Big( (u \circ Ch^{-1}) \circ (Ch \circ \gamma) \Big)(t) \Big|_{t=0}    (5.19)

The first (from the right) is simply the curve γ, drawn into the coordinate space,
not onto the manifold, which we will also call γ̃. The second one is simply the
function u, but as a function of the coordinates, not of the points on M , which we
will call ũ. We can now use the chain rule:

    X^{op} \cdot u = \frac{d}{dt}\, (\tilde{u} \circ \tilde{\gamma})(t) \Big|_{t=0}    (5.20)
    = \sum_{i=1}^{n} \frac{\partial \tilde{u}}{\partial x^i} \frac{d\tilde{\gamma}^i}{dt}(0) = \sum_{i=1}^{n} \frac{d\tilde{\gamma}^i}{dt} \frac{\partial \tilde{u}}{\partial x^i}    (5.21)
    = \sum_{i=1}^{n} \frac{d\tilde{\gamma}^i}{dt} \left(\frac{\partial}{\partial x^i}\right)^{op}(u)    (5.22)

where we have used the fact that ∂ũ/∂x^i is simply the derivative of u in the i-th
direction in coordinate space, which allowed us to introduce the operator in the


next line. We have clearly found the coefficients of X, and they simply turn out to
be the tangent-vector of γ, but in coordinate space, which should show that what
we are doing is not nonsense. What else could they even have been?
We can summarize this in a box.

Definition 5.2.4: Coefficients of vector in standard basis

Let X be a vector (in the sense of the second definition) at p on M and γ a curve
that X belongs to, so that X · u = d/dt u(γ(t))|_{t=0}. Then the coefficients of X are:

    (X^1, \dots, X^n) = \left( \frac{d\tilde{\gamma}^1}{dt}, \dots, \frac{d\tilde{\gamma}^n}{dt} \right)    (5.23)

where γ̃ = Ch ◦ γ is the curve γ, but in coordinate space, not on the manifold.
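As a small symbolic illustration of this definition (the chart and the curve below are made-up examples, not from the lecture):

```python
import sympy as sp

# Hypothetical chart and curve: on the upper unit circle in R^2, use the
# chart Ch(x, y) = x (project to the first coordinate) and the curve
# gamma(t) = (cos(t0 + t), sin(t0 + t)). By (5.23), the coefficient of the
# induced tangent vector is the t-derivative of gamma_tilde = Ch o gamma.
t, t0 = sp.symbols('t t0', real=True)
gamma = (sp.cos(t0 + t), sp.sin(t0 + t))   # the curve on the manifold
gamma_tilde = gamma[0]                     # Ch o gamma, a curve in R
X1 = sp.diff(gamma_tilde, t).subs(t, 0)    # the single coefficient X^1
print(X1)  # -sin(t0)
```

The coefficient depends on the base point t0, as it should: the same chart assigns different coordinate velocities at different points of the circle.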

 

5.2.3 Proving that {∂/∂x^i}_{i=1,...,n} is a basis

We now want to show that {∂/∂x^i}_{i=1,...,n} is a basis. This means we need to
show three things: firstly, that all vectors in Tp M are spanned by the basis;
secondly, that all vectors spanned by the basis are in Tp M ; and thirdly, that the basis
is linearly independent.

Figure 5.6: We can use the coordinate lines from a chart Ch to define our
basis-vectors of Tp M , by projecting them onto the manifold.



Proposition 5.2.1: The vectors {∂/∂x^i}_{i=1,...,n} form a basis of Tp M

The vectors {∂/∂x^i}_{i=1,...,n} form a basis of Tp M , that is:

• Tp M ⊂ span(∂/∂x^1 , . . . , ∂/∂x^n )

• span(∂/∂x^1 , . . . , ∂/∂x^n ) ⊂ Tp M

• The basis is linearly independent.

We note that dim(Tp M ) = n = dim(M ), which is not too surprising, since
Tp M consists of all the directions in which you can vary (differentiate), locally, on
M.

But we have already shown the first claim in the last section, when we found
the coefficients of a general vector in Tp M , because we wrote X out as a linear
combination of this basis. So we only have the second and third claims to prove.
The proof of the second claim is easy to understand. We need to show that
any X = \sum_{i=1}^{n} X^i\, (∂/∂x^i) is in Tp M , which means we need to find
a curve that generates these coefficients. But the curve (in coordinate space)
γ̃(t) = t(X^1 , . . . , X^n ) will do.³ Figure 5.7 should make this almost obvious.

Figure 5.7: We can use the curve that just "extends" X in coordinate space
for our proof.

The only thing left to show is that the basis is linearly independent, which we
leave to you as an exercise, it’s not too hard.

³ Here, we are still using the mentioned simplification, which is that Ch(p) = 0. If you
don't want this, the curve would just be p̃ + t(X^1 , . . . , X^n ).



5.3 Tangent vectors from curves


The third definition we introduce is, intuitively, that tangent vectors are "small"
pieces of curves. You can understand this geometrically quite easily; have a look
at figure 5.8, for example.

Figure 5.8: We can see vectors as small pieces of curves, locally, if we think
of them geometrically.

Mathematically, however, it is not as simple as that; specifically, which part of
a curve we call small or large is not clear, so we need to attack this with
a different approach. Instead of finding some measure of small/local and having
a headache with that, we can do the exact opposite. Let's take all the curves, no
matter their "length", which we have not yet even defined. That definitely leads to
the problem of us having too many curves: there will be many curves that locally
look the same. How can two curves locally look the same? Well, we can recycle
definition two, and say that they locally look the same if
they induce the same differential operator X op . With that, we can get the "small"
part of the curve by using an equivalence relation, which tears away anything that
is non-local.

Definition 5.3.1: Tangent vector definition 3: Through curves

Let γ and β be two curves on M , so that γ(0) = β(0) = p. Then we call
these equivalent if they induce the same operator, as per the second definition
of tangent vectors:

    \gamma \sim \beta \iff X_\gamma^{op} = X_\beta^{op}    (5.24)

Then we can define vectors as the equivalence classes of this equivalence
relation.

This definition is a lot more geometric and is useful to think about in very
pictorial settings. You can see the equivalence relation in figure 5.9.

Figure 5.9: A lot of curves, even very wild ones, are equivalent in our defini-
tion. These wild behaviours are cut off by the equivalence relation, so
we have a way of talking about "small" parts of the curve.
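A quick numerical sketch of this equivalence, with invented curves and test functions:

```python
import numpy as np

# Hypothetical curves (illustration only): a straight line gamma and a
# bending curve beta through the same point with the same velocity induce
# the same operator u -> d/dt u(curve(t))|_0, so gamma ~ beta in the sense
# of Definition 5.3.1, even though the curves differ away from t = 0.
p = np.array([0.5, -1.0])
X = np.array([2.0, 1.0])

gamma = lambda t: p + t * X
beta = lambda t: p + np.sin(t) * X + (1 - np.cos(t)) * np.array([3.0, 4.0])
# beta(0) = p and beta'(0) = X, but beta soon wanders far from gamma.

def induced_op(curve, u, h=1e-6):
    # the operator X^op applied to u, via a central difference at t = 0
    return (u(curve(h)) - u(curve(-h))) / (2 * h)

test_functions = [lambda x: x[0] * x[1], lambda x: np.exp(x[0]) + x[1] ** 2]
print(all(abs(induced_op(gamma, u) - induced_op(beta, u)) < 1e-5
          for u in test_functions))
```

This prints True: only the position and velocity at t = 0 survive the equivalence relation, which is exactly the "small part" of the curve.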

5.4 Tangent vectors as operators that satisfy the


product rule.
There is one last definition of tangent vectors, which we want to mention, but not
go into very much detail about. This one is very much more on the abstract side,
compared to the last few, and we only really want to show that it exists. The idea
we generalize is the Leibniz or product rule, which should be very familiar to you.
For first derivatives, in general, an equation of this form is correct:

D(uv) = uD(v) + D(u)v (5.25)

for some functions u, v. It turns out that, like the directional derivative, we can also
generalize this. We can take the space of all linear transformations that take smooth
functions on M as the input and output a real number, Hom(C ∞ (M ), R), which
contains things like tangent vectors (as derivatives), but also things like multiplying
functions with the number two, which is clearly not a derivative-operator.
We can then notice, that derivative operators should satisfy the product rule,
while non-derivative operators do not. For example, with multiplying by two, in
general, we have:
    2 \cdot (uv) \neq u \cdot (2v) + v \cdot (2u) = 4uv    (5.26)

With this we get to our definition:

Definition 5.4.1: Tangent vector definition 4: The Leibniz-rule

Let X op be an element of Hom(C ∞ (M ), R). We call X = (p, X op ) a


tangent vector at p, if X op satisfies the Leibniz rule at p, that is if:

X op (uv) = X op (u)v(p) + u(p)X op (v) (5.27)

for all u, v ∈ C ∞ (M )

It is quite tricky to work with this definition, because we don’t have charts or
derivatives in it, so one needs to be very algebraic and it all turns into a tricky,
non-digestible mess quickly, so we won’t pursue this further.
What we want to mention is that the vectors from the previous definitions do
satisfy the Leibniz rule, so you can use it comfortably. The proof is similar
to what we did before: lots of chain rules, working in charts, and applying the
Rn Leibniz rule.
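A small sketch of this distinction in Python (the point and the functions are made-up examples): a directional-derivative operator passes the Leibniz test of Definition 5.4.1, while the harmless-looking functional "twice the value at p" fails it.

```python
# Hypothetical example: at p in R^2, a directional derivative satisfies the
# Leibniz rule, while the functional u -> 2*u(p) does not (cf. (5.26)).
p = (1.0, 2.0)
h = 1e-6

def X_op(u):
    # derivative of u in the x-direction at p, by a central difference
    return (u(p[0] + h, p[1]) - u(p[0] - h, p[1])) / (2 * h)

def times_two(u):
    # a perfectly good linear functional that is NOT a derivation
    return 2 * u(*p)

u = lambda x, y: x * y
v = lambda x, y: x + y
uv = lambda x, y: u(x, y) * v(x, y)

def satisfies_leibniz(D):
    return abs(D(uv) - (D(u) * v(*p) + u(*p) * D(v))) < 1e-4

print(satisfies_leibniz(X_op), satisfies_leibniz(times_two))
```

This prints True False, matching the discussion above.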

5.5 Change of coordinates


We want to see what happens to tangent vectors when you change from one chart
to another. This is quite useful since often one thing might be easy to calculate in
one chart, but then another thing might be easier to calculate in another chart but
uses the first. You are probably familiar with the situation from calculus or physics,

where often for example, something is easier in polar coordinates, and sometimes
in cartesian. We want to generalize this to any coordinates.
More specifically, we want to look at how tangent vectors transform. For this,
fix M, p and two charts Ch1 , Ch2 and a tangent vector X.
    X = \sum_{i=1}^{n} X^i \left(\frac{\partial}{\partial x^i}\right)_{Ch_1}    (5.28)

where X 1 , . . . , X n are the components of X in the basis of the first chart. What
happens if we change the charts? That is, what are the components in the other
chart, Ch2 ?
We can start by answering first how we can express the basis vectors of the old
chart as the basis vectors of the new chart.
We do this by unravelling the definition of the differential operator
(∂/∂x^i)_{Ch_1}.
We know that:

    \left(\frac{\partial}{\partial x^i}\right)_{Ch_1} \cdot u = \frac{\partial}{\partial x^i}\,(u \circ Ch_1^{-1})    (5.29)
in the first coordinate space. But we know how to transform coordinates. We call
the coordinates in the second chart y 1 , . . . , y n . We can change to the second chart
by inserting an identity, that is by going to the second coordinate space and back
and then using the Rn chain rule:

    \left(\frac{\partial}{\partial x^i}\right)_{Ch_1} \cdot u = \frac{\partial}{\partial x^i}\,(u \circ Ch_1^{-1})    (5.30)
    = \frac{\partial}{\partial x^i} \big( u \circ (Ch_2^{-1} \circ Ch_2) \circ Ch_1^{-1} \big)    (5.31)
    = \frac{\partial}{\partial x^i} \big( (u \circ Ch_2^{-1}) \circ (Ch_2 \circ Ch_1^{-1}) \big)    (5.32)
    = \sum_{j=1}^{n} \frac{\partial (u \circ Ch_2^{-1})}{\partial y^j} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}    (5.33)
    = \sum_{j=1}^{n} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left(\frac{\partial}{\partial y^j}\right)_{Ch_2} \cdot u    (5.34)

We have found the transformation of the basis vectors of one chart into the vectors
of another. We can thus see how a vector X, which is of course nothing less than
a linear combination of the basis vectors in the first chart, transforms:

    X = \sum_{i=1}^{n} X^i \left(\frac{\partial}{\partial x^i}\right)_{Ch_1}
      = \sum_{i=1}^{n} \sum_{j=1}^{n} X^i \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left(\frac{\partial}{\partial y^j}\right)_{Ch_2}    (5.35)
      = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} X^i \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \right) \left(\frac{\partial}{\partial y^j}\right)_{Ch_2}    (5.36)

So we have found how the coefficients change. If we call the coefficients in the
second basis (in the second chart) X̂ 1 , . . . , X̂ n , then we can write them as:
    \hat{X}^j = \sum_{i=1}^{n} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}\, X^i    (5.38)

Note that this is just a matrix multiplication, and we can write it more simply using our
transformation notation, since Ch2 ◦ Ch1^{-1} = T1→2 :

    X_{Ch_2} = dT_{1\to 2}\, X_{Ch_1}    (5.39)

where X_{Ch_2} is the vector X written as a column in the second basis, and X_{Ch_1} is the
same vector written in the first basis.
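As an illustration with a familiar pair of coordinate systems (the charts below are assumptions for the example, not from the lecture), SymPy can compute the matrix dT_{1→2} symbolically and apply it to a column of components:

```python
import sympy as sp

# Hypothetical charts: Ch1 gives Cartesian coordinates (x1, x2), Ch2 gives
# polar coordinates (r, theta). The matrix dT_{1->2} in (5.39) is the
# Jacobian of the transition map T_{1->2} = Ch2 o Ch1^{-1}.
x1, x2 = sp.symbols('x1 x2', positive=True)
T = sp.Matrix([sp.sqrt(x1**2 + x2**2), sp.atan2(x2, x1)])  # T_{1->2}
J = T.jacobian([x1, x2])                                   # dT_{1->2}

X_ch1 = sp.Matrix([3, 4])          # components of X in the Cartesian chart
X_ch2 = (J * X_ch1).subs({x1: 1, x2: 1})
print([sp.simplify(c) for c in X_ch2])
```

At the point with Cartesian coordinates (1, 1), the components (3, 4) come out as 7√2/2 and 1/2 in the polar chart: the radial component mixes both Cartesian components, the angular one measures their difference scaled by 1/r².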
At this point, we want to introduce a new notation, which is very common in
much of differential geometry literature, and understandably so. We can think of
the coordinates of the first charts as functions:

x1 (p), x2 (p), . . . xn (p) (5.40)

which gives every point on the covered subset of the manifold a real number. That
is xi : U1 ⊂ M → R.
We do the same thing with the other charts and call them

y 1 (p), y 2 (p), . . . , y n (p) (5.41)

We can then do the above calculations with these as actual partials. But what is
the transition map, then? Well, as you can imagine, it takes the coordinates of a
point in the first chart and spits out the corresponding coordinates in the second
chart:
    T = \begin{pmatrix} T^1(x^1, \dots, x^n) \\ T^2(x^1, \dots, x^n) \\ \vdots \\ T^n(x^1, \dots, x^n) \end{pmatrix}
      = \begin{pmatrix} y^1(x^1, \dots, x^n) \\ y^2(x^1, \dots, x^n) \\ \vdots \\ y^n(x^1, \dots, x^n) \end{pmatrix}    (5.42)

so we can write:

    \frac{\partial y^j}{\partial x^i} \quad \text{for} \quad \frac{\partial T_{1\to 2}^j}{\partial x^i} = \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}    (5.43)

and the formula as:

    X^j_{Ch_2} = \sum_{i=1}^{n} \frac{\partial y^j}{\partial x^i}\, X^i_{Ch_1}    (5.44)

which looks very similar to the real-number chain rule. Note, however, that there
is a bit of an abuse of notation here: it is certainly not mathematically rigorous
to apply the chain rule to these symbols without thought; there is more behind them.

5.6 Differentiation of a function between mani-


folds

We finally want to address the question of a proper real derivative of a function


between two manifolds. We have used many derivatives up till now, but they were
all either inside a chart or maybe a parametrization of a curve.

Let us now go to the case where we have two manifolds, M, N and a smooth
function f : M → N between them. What is the derivative of f at p, df (p)? We expect
it to be similar to a Jacobian in Rn , taking tangent vectors to tangent vectors.

Figure 5.10: You can see two manifolds, M and N and a function between
them. We want a derivative df , that is a function that takes tangent vectors
of M (X) to tangent vectors of N (Y ).

How can we do this? The geometric idea is quite simple. We know f and
want to know what happens with a vector X ∈ Tp M . That vector is created by
some curve on M , call it α through the second definition. We know what happens
with the curve if we use f on the manifold, it gets mapped to some other curve
β = f ◦ α. But then the vector associated with β at p̃ = f (p) should be the vector
the derivative maps X to!

Definition 5.6.1: The derivative df (p) of a function at p

Let f : M → N be a smooth function and p ∈ M a fixed point. Then we


define the derivative of f at the point p, df (p) as a function of the tangent
space at p, Tp M , to the corresponding tangent space of N at p̃ = f (p),
Tp̃ N , by the following relationship. If X ∈ Tp M is created by the curve α,
that is, if for all u ∈ C ∞ (M ):

    X \cdot u = \frac{d}{dt}\, u(\alpha(t)) \Big|_{t=0}    (5.45)

then

    Y = df(p)(X) \in T_{\tilde{p}} N    (5.46)

is defined by:

    Y \cdot v = \frac{d}{dt}\, v(\beta(t)) \Big|_{t=0}    (5.47)

for all v ∈ C∞(N ), where β is the curve β = f ◦ α.

Notice that nowhere in the definition did we use any charts whatsoever! This is
a purely geometric, chart-independent object we have.
There are, however, a few things we need to make sure of if this definition is
to make sense.

• Firstly and most obviously, df (p)(X) needs to be independent of the choice
  of α; it needs to be well defined.

• Secondly, since it is a derivative, we want it to be linear.

The second claim is easy to prove if you do it in charts, so we leave it to you as an


exercise.
The first one is a bit more interesting and introduces a new idea.
Let us see what happens, if we evaluate Y · v for some v ∈ C ∞ (N ) and a Y
gotten from the definition for a particular choice of α.

    Y \cdot v = \frac{d}{dt}\, v(\beta(t)) \Big|_{t=0}    (5.48)
    = \frac{d}{dt}\, v(f(\alpha(t))) \Big|_{t=0}    (5.49)
    = X \cdot (v \circ f)    (5.50)

So Y · v = X · (v ◦ f ), the right side of which is definitely independent of α, which


means that Y is too.
The new idea we get is that:

    Y \cdot v = X \cdot (v \circ f)    (5.51)

which we will call proposition X from now on. We will use it
in the next section to prove the chain rule for functions between
manifolds.

5.7 The chain rule


To remind you: if you have Rm , Rn , Rp and smooth maps f : Rm → Rn and
g : Rn → Rp , then the chain rule says that:

    d(g \circ f)_p = dg_{f(p)} \circ df_p    (5.52)

where d is the Jacobian.
We will show that this result is true in the case of manifolds as well.

Theorem 5.7.1: The chain rule


Let M, N, P be manifolds and f : M → N and g : N → P be smooth
functions. Then for p ∈ M the chain rule holds:

    d(g \circ f)_p = dg_{f(p)} \circ df_p    (5.53)

You can show this in two ways. You can either express this whole thing in
coordinates and ”inherit” the chain rule from Rn into the whole thing or you can
do it directly and abstractly. Neither is better, but since many of our proofs until
now have been leaning more toward the first type, we will do it with the second
method instead.
Proof. We need a bit of setup, since we have a lot of players in this proof. We
have the manifolds, the maps, the vectors and the general smooth functions we
need for the vectors to act on. We show all the players in figure 5.11.
Our strategy is to write each of the parts of the chain rule equation using
proposition X and then collect them together.
    df(p)(X) \cdot v = X \cdot (v \circ f)    (5.54)
    dg(q)(Y) \cdot w = Y \cdot (w \circ g)    (5.55)
    d(g \circ f)(p)(X) \cdot w = X \cdot (w \circ (g \circ f))    (5.56)
We can now set Y = df (p)(X) and q = f (p) and insert into the right side of the
chain rule.
    dg(f(p))(df(p)(X)) \cdot w = df(p)(X) \cdot (w \circ g)    (5.57)
    = X \cdot ((w \circ g) \circ f)    (5.58)
    = d(g \circ f)(p)(X) \cdot w    (5.59)
So we get the desired equation:
    d(g \circ f)_p = dg_{f(p)} \circ df_p    (5.60)
since w, X and p were all general.

Figure 5.11: You can see all the actors we need in the proof in this figure.

5.8 The coordinate expression for df (p)


We have until now been on a completely abstract level, not using coordinates what-
soever. We want to now complete our discussion of the derivative of a function by
deriving a coordinate expression.
Let us say we have two manifolds again, M and N , a general vector X in
Tp M , and two charts ChM and ChN which cover at least the relevant parts of M and
N . What is the coordinate representation of df (p) : Tp M → Tq N , where q = f (p)?
Let us write X = (X^1 , . . . , X^m ) and Y = df (p)(X) = (Y^1 , . . . , Y^n ) for the
vectors X and Y in the standard bases at p and q of the charts ChM and ChN , respectively.
We need to compute Y = df (p)(X) in coordinates. Let's start by writing
out the definition and using proposition X (here v ∈ C∞(N )):

    Y \cdot v = X \cdot (v \circ f)    (5.61)
    = \sum_{i=1}^{m} X^i \left(\frac{\partial}{\partial x^i}\right)_{p,\,Ch_M} (v \circ f)    (5.62)
    = \sum_{i=1}^{m} X^i \frac{\partial}{\partial x^i}\Big|_{\tilde{p}}\,(v \circ f \circ Ch_M^{-1})    (5.63)
    = \sum_{i=1}^{m} X^i \frac{\partial (\tilde{v} \circ \tilde{f})}{\partial x^i}\Big|_{\tilde{p}}    (5.64)

where in the last expression, the tilde means that the functions are their represen-
tations in charts, and the partial derivative becomes the simple Rn partial we all
know and love. We can then use the Rn chain rule.

Figure 5.12: We present the standard picture with functions between man-
ifolds again, with the charts ChM and ChN . Our goal is to find the
coordinate representation of df (p)

    = \sum_{i=1}^{m} \sum_{j=1}^{n} X^i \frac{\partial \tilde{v}}{\partial y^j}\Big|_{\tilde{q}} \frac{\partial \tilde{f}^j}{\partial x^i}\Big|_{\tilde{p}}    (5.65)

where the partials in y are in the chart of N and the ones in x are the ones belonging
to the chart of M . If we rearrange a bit, and realize that we can turn ∂ṽ/∂y^j back
into the operator, and that these are simply the standard basis vectors at q̃ in ChN ,
we get:

    = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \frac{\partial \tilde{f}^j}{\partial x^i}\Big|_{\tilde{p}}\, X^i \right) \frac{\partial \tilde{v}}{\partial y^j}\Big|_{\tilde{q}}    (5.67)

So the coefficients of Y are then:

    Y^j = \sum_{i=1}^{m} \frac{\partial \tilde{f}^j}{\partial x^i}\, X^i    (5.68)

Again, we find a result that parallels the case of Rn : if f were a map
from some Rm to some Rn , we would get the exact same result. The (column)
vector Y we get is simply the Jacobian (in the charts) applied to X (in the chart)!
We can introduce a new matrix notation for the chart Jacobian of df (p). We
can write:

    df(p)^j_i = \big( df(p)_{Ch_M,\,Ch_N} \big)^j_i = \frac{\partial \tilde{f}^j}{\partial x^i}    (5.69)

Then we can write the above result as:

    Y^j = \sum_{i=1}^{m} df(p)^j_i\, X^i    (5.70)

We can also rewrite the chain rule in this notation. It becomes:

    d(g \circ f)(p)^k_i = \sum_{j=1}^{n} dg(f(p))^k_j\, df(p)^j_i    (5.71)
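The chain rule in coordinates can be checked numerically. The chart representations f̃ and g̃ below are invented for the example:

```python
import numpy as np

# Hypothetical chart representations: f_t and g_t stand for the maps f and
# g written in charts. We check the chain rule: the chart Jacobian of
# g o f equals the matrix product of the individual chart Jacobians.
f_t = lambda x: np.array([x[0] * x[1], x[0] + np.sin(x[1])])
g_t = lambda y: np.array([np.exp(y[0]), y[0] * y[1]])

def jacobian(F, x, h=1e-6):
    # numerical Jacobian dF^j/dx^i, one column per coordinate direction
    cols = []
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        cols.append((F(x + e) - F(x - e)) / (2 * h))
    return np.stack(cols, axis=1)

p = np.array([0.7, 1.3])
lhs = jacobian(lambda x: g_t(f_t(x)), p)           # d(g o f)(p) in charts
rhs = jacobian(g_t, f_t(p)) @ jacobian(f_t, p)     # dg(f(p)) . df(p)
print(np.allclose(lhs, rhs, atol=1e-4))
```

This prints True: composing the maps and then differentiating gives the same matrix as multiplying the two chart Jacobians, which is exactly the content of the boxed theorem.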
Chapter 6

Tangent Spaces and Tangent Bundles

The clear continuation of the discussion of what a vector is on a manifold is the


discussion of vector fields. Vector fields are ubiquitous throughout many calculus
results in more than one dimension, and it shouldn’t be too surprising that they are
also fascinating on manifolds. Unlike Rn , however, we will not just quickly define
a vector field to be a function which takes a point on the manifold and spits out a
vector in Tp M . There is a whole lot of geometry we can do and get an interesting
view of vector fields as things that live in a bigger space than the manifold, called
the tangent bundle.

6.1 First example


We want to start this chapter not with a definition, but an example, which will
make the definitions a lot clearer. Let us choose the first example of a manifold as
the circle of radius one, S 1 . Let us still think of it as embedded in R2 . We want
to play around with tangent spaces and vector fields. Let us start with a circle and
some (smooth) vector field as you can see in figure 6.1.
We have drawn the tangent spaces (a few examples) into the figure as well.
This is the space tangent-vector fields live in, it is the collection of all the tangent
spaces.
But geometrically, we can do a trick. If we flip all of the tangent spaces at once
(all in a consistent direction), we get a cylinder! Have a look at figure 6.2. Now, if
we connect the dots between the arrows, we can see that a vector field on the circle
is nothing else than a special kind of curve on the cylinder. What is special about
this curve? Well, it cannot be just any curve on the cylinder: it needs to wrap around
as a ring, since otherwise there would be points on the circle to which the vector
field assigns two or more vectors.
We call this cylinder, in the general case, the tangent bundle. In general, vector
fields become substructures (like curves, surfaces, etc) of this tangent bundle, and


Figure 6.1: A vector field on the circle. We can draw the tangent spaces,
the Tp M 's, into the picture (only a few shown). This whole space (with all
of the tangent spaces) is where vector fields live.

in some sense, we have found that vector fields aren’t all that different from (sub-)
manifolds themselves.

6.2 Definitions and Conditions


We will now provide the definitions of the ideas from the last section. The tangent
bundle is the first one.

Definition 6.2.1: The tangent bundle T M

Let M be a smooth manifold. The tangent bundle T M of M is the collection


of all of its tangent spaces.
    TM = \bigcup_{p \in M} T_p M    (6.1)

This union is disjoint, since all vectors always carry the point p as a part of them.
Mathematically, the tangent bundle is a subset of M × Hom(C∞(M ), R).
This definition is made clear by the example with the circle from the last section: we
simply take all possible tangent spaces together at once and, later, consider objects
on it. Clearly, as you can see in figure 6.1, the manifold M itself is a subset of T M ,
trivially in the geometric sense, and mathematically as M × {0}, where 0 refers to
the linear map that takes a smooth function on M and spits out the number 0.

Figure 6.2: If you turn all of the tangent spaces, you end up with a cylinder!
Tangent vector fields simply become curves on a cylinder (albeit ones with
a few special properties). Even though we draw the vectors as pointing up,
they are still tangent vectors living in tangent spaces; this flip is only a trick
we do in our heads, there is no mathematical transformation here.

Proposition 6.2.1: The tangent bundle is a manifold

The tangent bundle T M of a manifold M is in a natural way a smooth
manifold; you can construct charts for T M out of charts for M . Its dimension
is 2n, where n is the dimension of M .

This shouldn’t be too surprising. After all, we found that for the circle at least,
the tangent bundle was the cylinder, which is a 2-dimensional manifold.

Proof. We want to show that T M is a 2n-dimensional manifold. For this, we
need to construct charts, and we want to do this using charts of the manifold.
How can we do this? Well, the construction isn't too complicated, as you might
imagine: we just use the coordinates of the points on the manifold and the
coordinates of the vectors and put them into one tuple. Let's say we have a chart
(U, Ch) on the manifold M . We first need to find the region of T M that we can
describe. This should be rather clear if you look at figure 6.3.
We know how to describe points on U and how to find the coordinates
of vectors (of any length) situated at any point of U .
So the region we can chart is:
T U = ⋃_{p∈U} Tp M    (6.2)

Now the coordinates of a point of T U are given very simply. If the coordinates
of a point p ∈ U are x1 , . . . , xn and the components of a vector X situated at p
are X 1 , . . . , X n , then the new chart, which we will call ChT U , is:

ChT U ((p, X)) = (x1 , . . . , xn , X 1 , . . . X n ) (6.3)

We can also write this as:

ChT U ((p, X)) = (Ch(p), dCh(X)) (6.4)

The only part left to prove is that this is truly a chart and that these charts form
an atlas. Then you can take the maximal atlas and you have yourself a manifold.
None of these steps is too enlightening, so we leave them unproven, hoping that
the above charts are reasonable enough to convince you that it all works.
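As a sanity check on this construction, here is a small numerical sketch of ChT U for
T S 1 , built from an angle chart on part of the circle. The chart names are ours, not
from the lecture, and the half-circle domain is an illustrative assumption.

```python
import math

# Hypothetical angle chart on the open upper half U of S^1:
# Ch(cos t, sin t) = t, for t in (0, pi).
def Ch(p):
    x, y = p
    return math.atan2(y, x)

# The induced chart on T(U) from equation (6.4): a tangent vector X at
# p = (cos t, sin t) is a multiple of the unit tangent (-sin t, cos t),
# and dCh(X) is exactly that multiple.
def Ch_TU(p, X):
    t = Ch(p)
    dCh_X = -X[0] * math.sin(t) + X[1] * math.cos(t)
    return (t, dCh_X)

t0 = 1.0
p = (math.cos(t0), math.sin(t0))
X = (-2.0 * math.sin(t0), 2.0 * math.cos(t0))  # twice the unit tangent at p
coords = Ch_TU(p, X)
print(coords)  # approximately (1.0, 2.0): one coordinate each for p and X
```

The output pairs the point coordinate θ with the vector coordinate dCh(X), exactly
as in equation (6.3).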

6.2.1 The projection map


A tool which often comes in handy while working with tangent bundles is something
called the projection map. It is the function that projects a vector in T M onto the
point it sits at.

Figure 6.3: We can construct a chart for T S 1 from a chart of S 1 . If U
is the part of M that Ch charts, then we can chart the part of
T M which contains all of the tangent spaces of U , because we know the
coordinates of p and X for these, and only these, points.

Definition 6.2.2: The projection map π

The projection map π takes vectors in T M and spits out the point on M
they live at. More precisely:

π : TM → M (6.5)
(p, X) → p (6.6)

Proposition 6.2.2: The projection map is smooth

The projection map π(p, X) = p is smooth.

Proof. Per our definition of smoothness, we only need to find, for any point
(p, X) ∈ T M , an admissible pair of charts for which the function is smooth in
coordinates. As you might remember, we only need to show that one admissible
pair exists per point. This is enough to guarantee that the map is smooth in
all admissible charts. Let (p, X) ∈ T M be a vector in the tangent bundle.
Let (U, Ch) be a chart so that p ∈ U . We can take the admissible pair of charts
(U, Ch) and the chart (T U, ChT U ) constructed from it, since π(T U ) = U and U is
trivially a subset of U . In these coordinates, the map π is:
π̃ : (x1 , . . . , xn , X 1 , . . . , X n ) → (x1 , . . . , xn ) (6.7)
This map (as a map from R2n to Rn ) is obviously smooth, and since we found
such a pair of charts for every (p, X), we are finished.
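In these coordinates the projection is nothing but forgetting the second half of the
tuple; a one-line sketch (the helper name is ours):

```python
# pi in the coordinates of equation (6.7): keep the point coordinates
# (x^1, ..., x^n), forget the vector coordinates (X^1, ..., X^n).
def pi_in_coords(coords, n):
    return coords[:n]

print(pi_in_coords((0.5, 1.2, 3.0, -1.0), n=2))  # (0.5, 1.2)
```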

6.3 Vector fields


We have done half of the work we need to properly explain the example from the first
section (S 1 and T S 1 ): we have found a working definition of the tangent bundle,
which in the example was the cylinder. Now we need a definition of a vector
field. The definition is simple. A vector field is a function on M going to the
tangent bundle, with the two conditions that it is smooth and that the vector p gets
mapped to actually sits at p, i.e. that it is in Tp M .

Definition 6.3.1: Vector field


Let M be a smooth manifold and T M its tangent bundle. Then a smooth
vector field is a smooth function:

X : M → TM (6.8)

so that X(p) ∈ Tp M .

There are three different ways to think of vector fields in this context. Firstly,
you can think of them as arrows on the space, where each point gets its own vector.

This is the classical calculus approach to vectors. You can also think of them as
objects on the tangent bundle: for the circle, for example, they are curves
on a cylinder (= T S 1 ) which satisfy a few restrictions. Both of these were
already explored. The last way is in coordinates.
Let’s say you have a chart (U, Ch) and a vector field X. What does that mean
in coordinates? You shouldn’t be too surprised that this simply means that the
coefficients of the vectors (at each point) are functions of the points on M .
n  
X ∂
X(p) = i
XCh (p) (6.9)
i=1
∂xi p,Ch

We will not distinguish between the maps X i : M → R, which give the
coefficients as functions of the point on the manifold, and the functions (called the
same) X i : Rn → R, which give them as functions of the coordinates of the points.
We want to end this section by saying a thing about smoothness.

Proposition 6.3.1: Smoothness of vector fields

A vector field X is smooth if and only if its coordinate functions are smooth
in some chart near every point p. This is also equivalent to them being
smooth in every chart.

The proof is a very straightforward consequence of our definition of
smoothness. You might remember that a function is smooth if
around every point you can find an admissible pair of charts in which it is smooth,
and that this already forces the function to be smooth in all admissible pairs
of charts. The proof of the proposition is basically that last sentence, except that
you need to do a bit more unravelling of the definition, which we leave to you.
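To illustrate (with a made-up field, not one from the text): in an angle chart on
S 1 , a vector field is given by a single coefficient function X 1 (θ), and smoothness of
the field is just smoothness of that function.

```python
import math

# Coefficient of the hypothetical field X = sin(theta) d/dtheta on S^1.
def X1(theta):
    return math.sin(theta)

# The field is smooth because its coefficient function is; as a spot check,
# a central finite difference at theta = 1 recovers the derivative cos(1).
h = 1e-6
deriv = (X1(1.0 + h) - X1(1.0 - h)) / (2 * h)
print(abs(deriv - math.cos(1.0)) < 1e-9)  # True
```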

6.4 A few intriguing examples


If you are a non-mathematician, or even if you are one, you might wonder why we have
introduced this whole notion of a tangent bundle when, at first sight, it does not
look like much. After all, for the circle, the tangent bundle is in some sense simply
S 1 × R. Generally, it seems like every tangent bundle is just something like M × Rn ,
since every tangent space is (in the coordinate definition) really just Rn . This is wrong
though! Oh, you might counter, this sounds like something that fails only for
some weird constructed examples which never happen in real life. Well, let us
counter you with the sphere.
It turns out that T S 2 ̸= S 2 × R2 as a manifold!

6.4.1 The sphere


You are probably wondering why the tangent bundle of the sphere
is not just S 2 × R2 . It has to do with topology, a hedgehog and hairstyles (not
kidding).

There is a well-known theorem from topology, that states that you cannot comb
a hedgehog without giving it a bald spot. If you are confused, maybe you should
look at the theorem below.

Theorem 6.4.1: You can't comb a hedgehog without giving it a bald spot

There does not exist a smooth non-zero (tangent) vector field on the sphere.

If you are still confused, let us explain with an example. Imagine that vectors,
which are usually drawn as arrows, are hairs/spikes, and that the sphere is, in reality,
a curled-up hedgehog. If the hedgehog is curled up perfectly and you comb it
(make the hairs/vectors tangent to the hedgehog/sphere), it will have a bald spot
(a place where X = 0). This is the analogy some mathematicians came
up with for this theorem, and it has stuck, especially because of how visual the
theorem is. We won't prove it here, but we will give an example which should be
somewhat convincing.

Example 6.4.1: Hair on a hedgehog

Let X(x, y, z) = (−y, x, 0) be a vector field on R3 , which we restrict to the
sphere.

Figure 6.4: The vector field from the example

You can hopefully see that this vector field is smooth, and where the
hair/hedgehog analogy comes from. Now notice that the closer you get to
the north pole, the smaller the hairs become, and at the north pole itself
you get a length of zero, i.e. a bald spot. (The same thing happens at the
south pole.)
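We can verify the two claims of the example numerically; a quick sketch:

```python
import math

def X(p):  # the vector field from the example
    x, y, z = p
    return (-y, x, 0.0)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

# X(p) is orthogonal to p at every point of the sphere, so it really is a
# tangent vector field on S^2 ...
for t in (0.3, 1.0, 2.5):
    for s in (0.0, 1.0, 4.0):
        p = (math.sin(t) * math.cos(s), math.sin(t) * math.sin(s), math.cos(t))
        assert abs(dot(X(p), p)) < 1e-12

# ... but it vanishes at both poles: the bald spots.
print(X((0.0, 0.0, 1.0)), X((0.0, 0.0, -1.0)))  # zero vectors at the poles
```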

What this means for the tangent bundle

Now what does this hedgehog theorem have to do with the tangent bundle of the
sphere? Well, let us for a minute imagine that, in fact, T S 2 = S 2 × R2 as a
manifold. Then take the function:

X(p) = (p, (1, 0)) (6.10)

This is a smooth vector field if we assume that T S 2 = S 2 × R2 . But it is also
nonzero everywhere, which violates the hedgehog theorem! The only
assumption we made is that T S 2 = S 2 × R2 , so that assumption just cannot be
true. In truth, there is a twist in the tangent bundle of the sphere, making it
something different from S 2 × R2 . You can see the idea in figure 6.5. For the upper
and the lower half of the sphere it is true that the tangent bundle of each half is
simply that part of the sphere cross R2 , but to connect them into the whole tangent
bundle of the sphere, you need to build in a twist. That twist is the reason why we
can't have a smooth nonzero vector field on S 2 .

Figure 6.5: To get the tangent bundle of the sphere, you need to take the
sphere apart into a lower and an upper half, create the tangent bundles for
these two, which are just U × R2 , and then glue them together with
a twist, which ensures that the hedgehog theorem holds.

What can we see from this example? Firstly, the main lesson is that the tangent
bundle is not a trivial object: it is not always just the space times Rn ,
and this has big implications for the sort of vector fields that can even exist
on the manifold. Another thing you might have already guessed is that locally,
however, the tangent bundle is just U × Rn for an open set U .

Proposition 6.4.1: Locally, T U = U × Rn

Let M be a smooth manifold. Then for every point p ∈ M there exists a


small enough open subset U ⊂ M , with p ∈ U , so that:

T U = U × Rn (6.11)

The proposition is simple to prove in charts, and since there always
exists a chart covering a small region around p, it holds in general.

6.4.2 The circle, again


Now, we want to quickly mention an assumption we made at the beginning of the
chapter. As our first example, we picked a circle, constructed its tangent bundle
out of the tangent spaces, which were lines, and said that what came out was a
cylinder. Was there ever a different possibility? Yes, in some sense. The only other
reasonable possibility would have been a Möbius strip (one that extends to infinity):
the wall of the cylinder could have had a flip in it, similar to the tangent bundle of
the sphere with its twist. But on the circle we can construct a nonzero vector
field, namely ∂/∂ϕ, so the tangent bundle could never have been the Möbius strip,
since there, like on T S 2 , there are no nonzero vector fields (Exercise: why?).

6.5 The sphere, torus and Klein bottle


The question we want to ask and answer is: how many smooth, linearly independent
vector fields do the sphere, the torus and the Klein bottle have?
First, what is the Klein bottle? We already showed how you can construct a
torus from a piece of paper glued together in a special way. The Klein bottle is
constructed similarly, except for how you glue it: one of the directions is reversed,
as you can see in figure 6.6.
Figure 6.6 should make it plausible that the torus has two linearly independent
vector fields (the most it could have as a two-dimensional surface, since
Tp M ≈ R2 ). The Klein bottle has only one: for the second direction you get a
discontinuity, so you cannot build a nonzero smooth vector field that is linearly
independent of the first. The sphere has none, per the hedgehog theorem,
since 0 is not linearly independent.
This seems astounding if you think about it. We cannot construct a smooth
global basis for some manifolds, even ones as simple as a sphere.
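For the torus, the two fields can be written down explicitly. A numerical sketch
with the standard embedded torus (radii R = 2, r = 1 chosen arbitrarily): the two
coordinate directions, "around the hole" and "around the tube", are everywhere
nonzero and linearly independent.

```python
import math

R, r = 2.0, 1.0  # major and minor radius of the embedded torus

def d_theta(theta, phi):  # direction "around the hole"
    return (-(R + r * math.cos(phi)) * math.sin(theta),
            (R + r * math.cos(phi)) * math.cos(theta),
            0.0)

def d_phi(theta, phi):    # direction "around the tube"
    return (-r * math.sin(phi) * math.cos(theta),
            -r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))

def cross_norm(a, b):     # |a x b| > 0 iff a and b are linearly independent
    c = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    return math.sqrt(sum(x * x for x in c))

independent = all(cross_norm(d_theta(t, p), d_phi(t, p)) > 0.5
                  for t in (0.0, 1.0, 2.0, 3.0)
                  for p in (0.0, 1.5, 3.0, 4.5))
print(independent)  # True: two independent fields at every sampled point
```

In fact |d_theta × d_phi| = (R + r cos φ) · r, which with these radii is at least 1
everywhere, not just at the sampled points.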

6.6 A few last things about spheres


We have explored S 1 and S 2 and have found very different results. The first has
a simple tangent bundle, the second doesn’t. The first has smooth nonzero vector
fields, the second doesn’t. Can we somehow figure things like this out for S n ?

Figure 6.6: The construction of the torus and the Klein bottle. The only difference
between them is that one of the arrows on the Klein bottle is flipped.
From this you can see that while the first linearly independent vector
field works on both (you get no discontinuities), the second type does not:
if you go over the seam on the Klein bottle, you switch directions, and when
you return (to the middle) you find you have a discontinuity.

Yes, we can. Your first guess might be that it has something to do with n being
even or odd, since 2 is even and 1 is odd. Extrapolating from two data points is
dangerous, but it pays off sometimes. You would be partially right: the result about
whether or not S n has at least one smooth nonzero vector field carries over like this.
The result about the structure of T S n does not, however.

Proposition 6.6.1: T S n = S n × Rn only for n = 1, 3, 7

The structure of T S n is trivial (T S n = S n × Rn ) only for n = 1, 3, 7.

Why? Well, from the depths of linear algebra you might vaguely remember
that these are the dimensions in which you can have a cross product, and that
this has something to do with extending the complex numbers. The reason is that S 1
is the set of unit complex numbers, S 3 the unit quaternions and S 7 the unit octonions,
and these are the only such extensions of the complex numbers.

Proposition 6.6.2: When does S n have nonzero vectorfields

The manifold S n has at least one non-zero smooth vector field if and only
if n is odd. If n is even, it has none.

You might ask how many linearly independent non-zero vector fields there are for
each odd n. That is a complicated question, which has been answered, though. It has,
remarkably, something to do with the binary expansion of n, which is almost
unbelievable. Why would vector fields on spheres count the number of 1's in the
binary expansion of n? You can google this result if your curiosity is piqued.
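For the "if" direction of the proposition there is in fact an explicit construction,
which we sketch here: pairing up the 2k coordinates of R^{2k} gives a nowhere-zero
tangent field on S^{2k−1}, generalizing the ∂/∂ϕ field on S 1 .

```python
import math

# On S^(2k-1), rotate within each coordinate pair:
# X(x1, y1, ..., xk, yk) = (-y1, x1, ..., -yk, xk).
def X(p):
    out = []
    for i in range(0, len(p), 2):
        out += [-p[i + 1], p[i]]
    return out

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

p = [0.5, 0.5, 0.5, 0.5]            # a point of S^3 (the squares sum to 1)
tangency = dot(X(p), p)             # 0: X(p) is tangent to the sphere
length = math.sqrt(dot(X(p), X(p))) # 1: |X(p)| = |p|, so X never vanishes
print(tangency, length)  # 0.0 1.0
```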
Part III

Creating and Embedding


Manifolds


We have developed manifolds in great detail in the second part of this course,
but we didn't really talk about where you get the manifolds you work with. We also
didn't talk about things like manifolds living inside other manifolds, or orientability,
which are the topics of this part of the lecture. We want to get our hands on some
manifolds, to be able to work with them in later parts of the lecture.
Chapter 7

Submanifolds and how to get them

We want to have a quick discussion of submanifolds. Submanifolds are, of course,
manifolds that live inside a bigger manifold. A few examples include curves on a
manifold and surfaces embedded in R3 , but also submanifolds of the same dimension,
like the full sphere living in a bigger full sphere. We want to put forward a reasonable
definition, talk about a few examples and then move on to constructing submanifolds
through various tools, constraints or functions.

7.1 Definition of a submanifold, compatibility with manifolds
We begin with a definition, which in some sense just says that a submanifold is a
set that locally, in charts, looks like Rm .

Definition 7.1.1: Submanifold

Let N n be a (smooth) manifold of dimension n, and M m a subset of N .
M is a smooth submanifold of dimension m if, for all points p ∈ M , there
exists a chart (U, Ch) such that:

• p ∈ U , of course
• The charted piece of M gets mapped onto the copy of Rm living in Rn :
Ch(M ∩ U ) = (Rm × {0}) ∩ Ch(U )

You can see a picture in figure 7.1.


As a consistency check, one should verify that a submanifold as defined above
really is also a manifold. That's one of those questions that, if you mention it to
someone from outside the math department, they laugh at you, but we
do want consistency.

Proposition 7.1.1: Consistency check

A submanifold is a manifold in a natural way.

We won’t prove it, but we will point out a few steps. The main idea is to build
(M )
an atlas from all the charts of the above definition. Let AN be the collection of
charts (not an atlas, even though named A) that satisfy the second condition of
the definition. Then we can reduce each of these charts to ones of M :

(U, Ch) → (V, ChM ) (7.1)

where:
V = U ∩ M and ChM = Ch M ∩U
: M ∩ U → Rm (7.2)
and create and atlas AM for M . These charts cover M , since M is a submanifold
and therefore we find a chart for every p ∈ M . The only thing left to prove is that
these are charts and a few technicalities.
A noteworthy thing is that the maximality of the atlas gets inherited. AM is
already a maximal atlas.

7.2 Examples of submanifolds.


We want to now give a few examples of submanifolds, some of which should be
rather familiar.

Figure 7.2: The types of charts we take for the atlas of M . They are all the
charts which satisfy the second condition of the definition of a submanifold,
namely that they map a piece of M onto Rm ⊂ Rn .

Example 7.2.1: Examples of submanifolds

• Any open set M ⊂ N is a submanifold. This makes sense since the


property of being a manifold is local, and therefore open subsets of N
should be submanifolds.

• Curves on M are one-dimensional submanifolds of M . Actually, this
is in some sense the definition of a curve.

• Surfaces in R3 are submanifolds of R3 . Any set M ⊂ Rn which can
be locally written as a graph is of course a manifold embedded in Rn .

• S n in S n+1 is a submanifold. You can find the circle in the two-dimensional
sphere, and in general the n-dimensional sphere in the
(n + 1)-dimensional sphere.

Figure 7.3: You can find S n in S n+1

• Classical groups of linear algebra, O(n), U (n), SL2 (R), etc., are
submanifolds of the matrix space Rn×n = Rn² .

7.3 Construction and Verification of submanifolds
While the definition of a submanifold technically gives you a way to check whether
something is a submanifold, it is clearly not very practical to construct what is
basically an atlas for the submanifold every time you want to check whether some
curve is a submanifold and not somehow problematic. The definition also requires
the set to be given in some accessible way, which is not always the case. And where
do you even get submanifolds from in the first place?
We will restructure these questions in the next list, already hinting at some of
the answers.
• When does a parametrization give a submanifold? That is, when is the image
of a map a submanifold?

• When can we get a submanifold from constraints, for example f (x, y, z) =
x2 + y 2 + z 2 − 1 = 0? That is, when is the preimage of a map a submanifold?

We can also write this a bit more precisely.

• Let f : L → N be a smooth map. When is M = f (L) a submanifold of N ?

• Let g : N → P be a smooth map. If L ⊂ P is a submanifold, is g −1 (L) a


submanifold of N ?

The tools we will develop for this are immersions and submersions, which will
be partial answers to the first and second questions respectively.

7.4 Immersions, Submersions and Embeddings


The first question we had was basically about when what we normally think of as a
parametrization is actually a good parametrization. The idea is similar to the one
we presented for curves in the very first chapter. There, one of the main criteria in the
definition of a curve was that the derivative of the parametrization does not vanish.
This was important since otherwise we would get spots which aren't smooth.
Here the same idea holds, although the condition will be slightly different: we
require the derivative of the parametrization to be injective. Such a map is called
an immersion.

Definition 7.4.1: Immersion


Let f : M m → N n be a smooth map. It is called an immersion if for every
p ∈ M , dfp : Tp M → Tf (p) N is injective.

This is of course equivalent to the kernel of dfp being trivial for all p. It also
forces m ≤ n, since the kernel of a linear map into a lower-dimensional space cannot
be trivial. This is obviously sensible, since you cannot parametrize into a space of
smaller dimension than the number of variables you have.
The helping tool for the second question will be, in some sense, the "dual" of an
immersion, namely a submersion: a map whose derivative is surjective.

Definition 7.4.2: Submersion


A smooth map f : M m → N n is called a submersion if dfp is surjective for every p ∈ M .

If you did a lot of linear algebra and recognize the similarity between the two questions
we are asking, you will not be surprised that if injective linear maps are involved
in one, then surjective ones are involved in the other.
An idea similar to the above is that of an embedding. We have already seen
many embeddings: in particular, we saw curves and surfaces embedded in R3
a lot in the first part of the lecture. An embedding is basically a very good way
of putting a manifold into a bigger manifold while preserving all of its major
properties and not losing anything.

Definition 7.4.3: Embeddings

A function f : M → N is an embedding of M into N if f : M → f (M ) is a
diffeomorphism.

The definition basically states that we cannot tell M and
f (M ) ⊂ N apart, since the restricted version of f is a diffeomorphism. There is no
intrinsic attribute of the smooth manifold structure that you could find in one but not
the other. This is, however, a very strong condition. A diffeomorphism truly can't
see any difference; it is not just something that is "good enough".

7.4.1 Examples of Immersions, Submersions and Embeddings

We want to give a few examples of immersions, submersions and embeddings to
help you visualize what they are.

Example 7.4.1: Immersions

• A parametrization of a smooth curve is an immersion. Let γ : I → Rn
be a smooth curve in the sense of the first chapter. Then, since one of
the requirements was that |γt | ≠ 0, the map dγ is certainly injective,
so γ is an immersion.
• The above is very strict, however. We can also take a curve that
intersects itself, for example:

γ(t) = (t2 , t3 − t) (7.3)

parametrized over the whole of R. The curve intersects itself at (1, 0),
since γ(1) = γ(−1) = (1, 0). However, the property of being an
immersion is clearly local with respect to the first space (here, time),
and the differential γt is injective everywhere. This means that
we cannot guarantee that the images of immersions are always smooth
manifolds; additional conditions will be needed.

Figure 7.4: The curve of this example.

• The usual inclusion map of S n into S n+1 is an immersion.
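The claims about γ(t) = (t2 , t3 − t) from the second bullet are easy to check; a
quick sketch:

```python
# gamma(t) = (t^2, t^3 - t): an immersion of R into R^2 that is not injective.
def gamma(t):
    return (t * t, t ** 3 - t)

def dgamma(t):
    return (2 * t, 3 * t * t - 1)

# Self-intersection: gamma(1) == gamma(-1) == (1, 0).
print(gamma(1), gamma(-1))  # (1, 0) (1, 0)

# The differential never vanishes: if 2t = 0 then 3t^2 - 1 = -1 != 0,
# so dgamma(t) is injective for every t and gamma is an immersion.
assert all(dgamma(t) != (0, 0) for t in [k / 10 for k in range(-30, 31)])
```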

Example 7.4.2: Submersions

• Projections (with conditions) are good examples of submersions. For
example, you can project Rn onto Rm in the trivial way if n ≥ m. The
differential of that map is certainly surjective, so it is a submersion.
• We can submerse Rn \ {0} onto S n−1 by projecting the rays in Rn
onto the sphere:

Rn \ {0} → S n−1 (7.4)
x → x/|x| (7.5)

Figure 7.5: The projection of Rn \ {0} onto S n−1 .



• The projection π : T M → M is also a submersion, submersing the
tangent bundle back onto the manifold.

Figure 7.6: The canonical projection of T M onto M .
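For the radial projection from the second bullet, we can check numerically that the
differential is surjective; a sketch using finite differences at one sample point:

```python
import math

def f(x):  # the radial projection of R^3 \ {0} onto S^2
    n = math.sqrt(sum(c * c for c in x))
    return [c / n for c in x]

def df(x, v, h=1e-6):  # directional derivative df_x(v), by central difference
    xp = [a + h * b for a, b in zip(x, v)]
    xm = [a - h * b for a, b in zip(x, v)]
    return [(a - b) / (2 * h) for a, b in zip(f(xp), f(xm))]

x = [1.0, 2.0, 2.0]  # |x| = 3

# df_x kills the radial direction ...
radial = df(x, x)
print(max(abs(c) for c in radial) < 1e-6)  # True

# ... and sends v orthogonal to x to v/|x|, so the image of df_x is the
# whole tangent plane of the sphere at f(x): f is a submersion there.
v = [2.0, -1.0, 0.0]  # orthogonal to x
tangent = df(x, v)
print(max(abs(3.0 * c - w) for c, w in zip(tangent, v)) < 1e-5)  # True
```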

Example 7.4.3: Embeddings

• You have seen many embeddings in this class already, particularly in
the first part, where we embedded many curves into Rn or onto a surface,
and many surfaces into R3 . (Exercise: why are they embeddings?
Find a counterexample.)
• The Veronese map RP 2 → R3 is not an embedding. It immerses RP 2
into R3 as Boy's surface but does not embed RP 2 .

7.4.2 Embeddings are injective immersions

The last examples may have led you to think that an embedding is just an injective
immersion and vice versa; after all, the only problem with the curve γ(t) = (t2 , t3 − t)
was that it was not injective. This is not quite right. While the first claim is true in
general (an embedding is an injective immersion), the converse is not true in general.
It requires another condition, which we will talk about later. For
now, we can prove the first claim.

Theorem 7.4.1: Embeddings are injective immersions

Let f : M → N be an embedding. Then f is an injective immersion.

Proof. There are two claims. Firstly, f has to be injective. But this follows from
the definition, since f : M → f (M ) has to be a diffeomorphism and therefore
also bijective, which means that f : M → N is injective. We still need to
prove that f is an immersion, i.e. that df (p) is injective for all p ∈ M . We
can prove this using the chain rule. Firstly, note that since f : M → f (M ) is a
diffeomorphism, it is clear that:

f −1 ◦ f = idM (7.6)

We can use the chain rule on this equation:

df −1 (f (p)) · df (p) = idTp M (7.7)

Similarly, we can get:

df (p) · df −1 (f (p)) = idTf (p) f (M ) (7.8)

But then df (p) is an isomorphism onto Tf (p) f (M ), and in particular injective,
i.e. f is an immersion.

With this said, we can go on to a discussion of the inverse function theorem,


which is a key theorem for this part of the lecture.

7.5 The Inverse function theorem.

The inverse function theorem is a key result in this part of the lecture. It gives
rise to two other key theorems, the local immersion theorem and the local
submersion theorem, which will answer the question of how we can build
submanifolds.
The inverse function theorem tells us something about local diffeomorphisms.
Local diffeomorphisms are functions between manifolds which are, locally,
diffeomorphisms.

Definition 7.5.1: Local Diffeomorphism

Let f : M → N be a smooth function. We call f a local diffeomorphism if


for all p ∈ M the following conditions hold.

• We can find an open set U ⊂ M with p ∈ U , such that f (U ) is open


(in N ).
• and f restricted to these sets, f |U : U → f (U ), is a diffeomorphism.

A local diffeomorphism is different from a (global) diffeomorphism in that
it doesn't need to be a diffeomorphism globally. It can very well be that
f is a local diffeomorphism but not injective. You can imagine taking a disc
and overlapping it with itself, like in figure 7.8, in a way that the smooth
structure is locally preserved but the map is not injective, and therefore not
a global diffeomorphism.

Figure 7.7: The idea of a local diffeomorphism is that you have a function
relating two manifolds in such a way that at every point p of M , you can
find a (small) neighbourhood which is diffeomorphic to its image under f .

The inverse function theorem tells us that, for a function f , if dfp is an isomorphism
(= nonzero determinant in charts), then f is a local diffeomorphism around that
point.

Figure 7.8: A local diffeomorphism does not have to be injective and there-
fore does not have to be a global diffeomorphism.

Theorem 7.5.1: Inverse function theorem


Let f : M → N be a smooth function and p ∈ M a point such that
dfp : Tp M → Tf (p) N is an isomorphism. Then there exist open sets U ⊂ M
and V ⊂ N , such that p ∈ U , f (p) ∈ V and:

f |U : U → V (7.9)

is a diffeomorphism.

Proof. The theorem is true for M = N = Rn , as you have probably seen in a
calculus class. It transfers directly to M and N via charts.
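A concrete illustration (our own example, not from the lecture): the polar-coordinates
map has an invertible differential everywhere on r > 0, so the theorem makes it a
local diffeomorphism at every point; yet it is not injective, which is exactly the
overlapping-disc situation of figure 7.8.

```python
import math

def f(r, t):  # f(r, t) = (r cos t, r sin t), defined for r > 0
    return (r * math.cos(t), r * math.sin(t))

def det_df(r, t):
    # df = [[cos t, -r sin t], [sin t, r cos t]] has determinant r != 0
    return math.cos(t) * (r * math.cos(t)) - (-r * math.sin(t)) * math.sin(t)

print(abs(det_df(2.0, 1.3) - 2.0) < 1e-12)  # True: det(df) = r

# f is not injective: t and t + 2*pi give (numerically) the same image.
a, b = f(1.0, 0.5), f(1.0, 0.5 + 2 * math.pi)
print(max(abs(u - v) for u, v in zip(a, b)) < 1e-12)  # True
```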

Many things follow from the inverse function theorem, some of which we want
to address here.

7.6 Consequences of the inverse function theorem
An immediate consequence of the inverse function theorem is a condition for a
function to be a local diffeomorphism.

Corollary 7.6.1: Condition for local diffeomorphism

A function f : M → N is a local diffeomorphism if and only if df (p) is
an isomorphism for all p, which in turn holds if and only if f is both a
submersion and an immersion.

This corollary is a direct application of the inverse function theorem; one could
even see it as just a reformulation of it.

Corollary 7.6.2: Diffeomorphisms

f : M → N is a diffeomorphism if and only if:

• f is bijective and f is a local diffeomorphism, or equivalently

• f is bijective and dfp is bijective for all p.

The proof of this is very similar to many proofs we have already seen, using a covering
of the relevant spaces and the interplay between global and local properties.

7.7 A few more examples


We want to give a few more examples, which might shed a bit more light on the
interplay between immersions and embeddings.

Example 7.7.1: Immersions and Embeddings:1

We already saw that the injectivity of an immersion plays a big part in
its image being a submanifold. But there are other conditions too, the need
for which we see in this example. Regard the curve in figure 7.9.

Figure 7.9: This curve approaches itself arbitrarily closely, and in turn
its image is not a submanifold of R2 , even though the map is an injective
immersion.

The curve comes arbitrarily close to itself at a point. Thus its image is not
a submanifold, since it doesn't look (locally) like R there. But the
map is a smooth injective immersion. This shows that injective immersions
are not necessarily embeddings; some further condition needs to be
satisfied.

Example 7.7.2: Immersions and Embeddings: 2

To see that being an immersion is definitely necessary, we can look at the curve in
figure 7.10.

Figure 7.10: The curve γ(t) = (t2 , t3 ) is injective, but its image is not a
submanifold. It fails at the point t = 0, where dγ is not injective, hence
γ is not an immersion.

The curve is described by γ(t) = (t2 , t3 ), and we already saw it in chapter
one as something we did not like. It shows up here for essentially the same
reason. The point at which the curve is not smooth is t = 0, and it is not
smooth there precisely because dγ is not injective there (i.e. γ is not an
immersion). This shows that the immersion condition is necessary, and not
just for some complicated cases.
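The failure is visible immediately from the differential; a two-line check:

```python
# gamma(t) = (t^2, t^3) from figure 7.10: injective, but not an immersion.
def dgamma(t):
    return (2 * t, 3 * t * t)

print(dgamma(0))  # (0, 0): the differential vanishes exactly at the cusp
print(dgamma(1))  # (2, 3): away from t = 0 it is injective
```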

Example 7.7.3: Immersions and Embeddings: Klein bottle

We already met the Klein bottle in a previous chapter, as a square in R2
which is glued together in a specific way. That was not an embedding,
however, since the sides that were supposed to be connected were in fact
not. We can put the Klein bottle into R3 by the construction in figure 7.11.

Figure 7.11: This is how you can construct the immersion of the Klein
bottle into R3 . Note that because the constructed bottle intersects
itself, this cannot be an embedding. In fact, it is impossible to embed
the Klein bottle into R3 .
By the way the bottle looks, you should be convinced that this is an immer-
sion, but it is not injective, so it is also not an embedding. At the points
of intersection, the surface does not locally look like R2 .

7.8 What comes next


We have introduced the topic of submanifolds and how to find them in this chapter,
but the question we set out to answer at the beginning has not yet been truly
answered. The local immersion theorem and the local submersion theorem will
provide this answer. Before we can talk about them in detail, however, we will
need a bit of preparation: we need to talk about coverings, orientations,
countability, bumps and cutoffs, and partitions of unity before we can have a full
discussion of these two theorems. These will be the topics of the rest of this part
of the lecture.
Chapter 8

Coverings

File corrupted.

Chapter 9

Orientations

We now come to the topic of orientation. An orientation, intuitively speaking, is
a consistent global notion of left- and right-handedness. Such a notion is not always
possible. The Möbius strip is the canonical example of a surface which is non-
orientable, see figure 9.1. The problem with the Möbius strip is that if you start
with an orientation anywhere on it, consistently copy it from one spot to the next,
and go around the strip like this, you will end up exactly where you started, but
with the "other" orientation. Thus, globally, you cannot orient the Möbius strip.
The orientation in the figure is a two-dimensional one. A three-dimensional
orientation is the concept that separates your left hand from your right hand, also
called chirality. You cannot, by a simple rigid motion, get your left hand to look like
your right hand. Try this yourself.
They are the same object, but one is the reflection of the other across a plane,
which means that they have different orientations, and thus you cannot get from one
to the other by a simple rotation or translation.
We will develop this concept of orientation rigorously, by first exploring orienta-
tions on a vectorspace.

9.1 Orientations on Vectorspaces


Let's start with a finite-dimensional (for simplicity) real vectorspace V . (You can best
imagine this as Rn , but everything works for any real finite-dimensional vectorspace.)
This vectorspace has a basis, and if its dimension is n, then that basis has n
elements. Take two choices of basis, E and E ′ .

E = {e1 , e2 , . . . , en } (9.1)

E ′ = {e′1 , e′2 , . . . , e′n } (9.2)

where e1 , . . . , en are the basis vectors of the first basis, E, and e′1 , . . . , e′n are the
basis vectors of the second basis E ′ . How can we express if they are orientated the
same way or not?


Figure 9.1: The Möbius strip is a good example of a non-orientable surface.


This is because if you start at some point, pick an orientation, and smoothly
go around the loop, you will come back to the same point with the opposite
orientation.

Figure 9.2: Your hands are two objects which have different orientations.
You cannot simply turn one into the other by a rigid motion.

Well, we can write E ′ in terms of E by a linear transformation:


e′i = Σ_{j=1}^{n} aij ej (9.3)

and then say that E ′ and E have the same orientation if the matrix aij has a
positive determinant, and opposite orientations if it has a negative determinant.
(The case det(aij ) = 0 is not possible since E ′ and E are bases.)
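To make this concrete, here is a small numerical sketch (the function name and the row-vector convention are our own, not from the lecture): two bases of Rn are written as the rows of matrices, the basis-change matrix A of equation (9.3) is solved for, and the sign of its determinant is tested.

```python
import numpy as np

def same_orientation(E, Eprime):
    """Rows of E and Eprime are basis vectors of R^n.
    Returns True iff both bases have the same orientation, i.e. the
    basis-change matrix A with e'_i = sum_j a_ij e_j (row convention,
    as in eq. 9.3) has positive determinant."""
    A = Eprime @ np.linalg.inv(E)  # solve A @ E = Eprime for A
    d = np.linalg.det(A)
    assert abs(d) > 1e-12  # d = 0 is impossible for two genuine bases
    return d > 0

E = np.eye(2)                  # standard basis (e1, e2)
R = np.array([[0.0, 1.0],      # basis rotated by 90 degrees
              [-1.0, 0.0]])
F = np.array([[-1.0, 0.0],     # e1 reflected: orientation should flip
              [0.0, 1.0]])
print(same_orientation(E, R))  # True: a rotation preserves orientation
print(same_orientation(E, F))  # False: a reflection reverses it
```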

9.1.1 Why the determinant?

Figure 9.3: There are two different kinds of orientations. If you switch
bases, then two things can happen. Either you switch orientation, or you
don’t. To switch, you will need to reflect in some way.

Why, you might ask, does the determinant have something to do with this? Well,
let's start with figure 9.3, which shows a two-dimensional example. As you can see,

if the map is a reflection, then we definitely have a reversal of orientation. But


what about slightly more complicated basis changes? We can use our knowledge of
matrix decompositions to help us. Specifically, we will use the QR-decomposition
of A = (aij ). From linear algebra, we know that such a decomposition exists,
and since A is invertible, we can choose R to have a positive diagonal.

A = QR (9.4)

What does this mean geometrically? Well, R is an upper triangular matrix with
a positive diagonal, which means that it is a shear. A shear is a transformation
that tilts one basis vector after the other, towards the previous basis vectors, so it
cannot be orientation reversing. The Q is an orthogonal map, i.e. either a rotation
or reflection. If it is the former, it also preserves orientation. If it is the latter, it
reverses it.

Figure 9.4: The QR-decomposition in two dimensions. You have a shear,
and then either a reflection or a rotation. The shear preserves orientation,
so it is up to Q to decide whether you have an orientation reversal or not.

What about the determinant? You probably remember that the determinant of
rotations is +1 and of reflections −1. What about R? Well, it is an upper triangular
matrix with a positive diagonal, so its determinant is definitely positive (Why?).
By:
det(A) = det(QR) = det(Q)det(R) (9.5)

Figure 9.5: The QR-decomposition in three dimensions. You have two
shears, and then either a reflection or a rotation. The shears preserve
orientation, so it is up to Q to decide whether you have an orientation reversal
or not.

we see that det(A) is positive if and only if there is no reflection
in the transformation, and negative if and only if there is one.
Therefore the following definition makes sense.
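A quick numerical check of this argument, as a sketch with numpy. Note that `numpy.linalg.qr` does not itself promise a positive diagonal on R, so the signs are flipped to match the convention of equation (9.4); the helper name is ours.

```python
import numpy as np

def qr_positive(A):
    """QR decomposition A = Q R with the diagonal of R made positive,
    matching the convention of equation (9.4)."""
    Q, R = np.linalg.qr(A)
    # np.linalg.qr does not fix the signs on R's diagonal, so flip them
    # with S = diag(s), S^2 = I:  A = (Q S)(S R).
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    return Q * s, (R.T * s).T

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q, R = qr_positive(A)
assert np.allclose(Q @ R, A)   # still a valid decomposition
assert np.all(np.diag(R) > 0)  # the shear part has det(R) > 0,
# so the sign of det(A) is decided entirely by det(Q) = +-1:
assert np.sign(np.linalg.det(A)) == np.sign(np.linalg.det(Q))
```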

Definition 9.1.1: Same and different orientations

Let E, E ′ be two choices of bases for V . We say E and E ′ have the same
orientation, if the basis-change matrix A has a positive determinant and
different orientations if the determinant is negative.

We can now define an orientation for V . We can group all the bases of V into
two groups, having either one orientation or the other. (Which one is which is a
matter of choice). We do this through equivalence classes.

Definition 9.1.2: Orientation on V

Let V be a finite-dimensional real vectorspace and E, E ′ two choices of
basis. The equivalence relation

E ≃ E ′ ⇐⇒ E and E ′ have the same orientation (9.6)

tells us that two bases are equivalent if they have the same orientation. We
define an orientation of V to be an equivalence class under ≃.

o = [E] = [(e1 , . . . , en )] (9.7)


An orientation on a vectorspace is just a choice of equivalence class. We also
write O(V ) for the set of orientations of V .

There are a few small points about orientations we want to mention.

• O(V ) contains 2 elements, that is, there are two ways to orient V .

• If we call these two ways o1 and o2 , then we can write o1 = −o2 .

• Any odd permutation of (e1 , . . . , en ) reverses the orientation, any even per-
mutation preserves it.

• Replacing e1 by −e1 (reflection) reverses orientation.
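The last two bullet points can be checked numerically by writing each basis as the rows of a matrix, so that the determinant of that matrix (the change of basis from the standard basis) decides the orientation. This is just an illustrative sketch:

```python
import numpy as np

E = np.eye(3)                   # (e1, e2, e3), rows are basis vectors
swap = E[[1, 0, 2]]             # odd permutation (e2, e1, e3)
cycle = E[[1, 2, 0]]            # even permutation (e2, e3, e1)
flip = E.copy(); flip[0] *= -1  # (-e1, e2, e3)

print(np.linalg.det(swap))   # -1.0: odd permutation reverses orientation
print(np.linalg.det(cycle))  #  1.0: even permutation preserves it
print(np.linalg.det(flip))   # -1.0: reflecting e1 reverses orientation
```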

We can also pack the vectorspace and the orientation into a tuple.

Definition 9.1.3: A

oriented vectorspace is a pair (V, o) where o ∈ O(V ).

We basically commit ourselves to one choice of orientation.



9.2 Orientations and linear maps


When motivating the definition, we already spoke of the change of basis from E to
E ′ being orientation preserving or reversing. We want to formalize this discussion.

Definition 9.2.1: Orientation of linear maps

Let L : (V, oV ) → (W, oW ) be a linear map between two (possibly
different) oriented vectorspaces. L is called orientation preserving (OP ) if it takes
oV → oW and orientation reversing (OR) if it takes oV → −oW .

Notice that this definition requires us to choose the orientations beforehand.


Orientation of maps is not canonical (if V ̸= W ).

9.3 Orientations on Manifolds


We want to extend the notion of orientation from a vectorspace onto a manifold.
How can we do this? The first thing to notice is that orientation (i.e. small circular
arrows in 2d, hands in 3d, etc.) is something seemingly local. I don’t need the
whole manifold I live in to decide whether I am talking about my right or left hand.
This, along with the fact that we already know orientations on vectorspaces, should
not make it too surprising that we will use Tp M .
We write:

Op M = O(Tp M ) = { the 2 orientations of Tp M } (9.8)


OM = ⋃_{p∈M} Op M (9.9)

similarly to how we defined tangent spaces. It is supposed to be understood that


orientations carry the point p with them, like vectors, and the union in the above
equation is disjoint.
With this, we already can piece together bits of the definition. An orientation
for the whole of M is simply a consistent choice of orientation for every Tp M . But
what do we mean by consistent? Well, somehow the orientations need to be chosen
in a ”smooth” way. We can make this idea of ”smooth choice of orientations” a bit
more clear if we introduce bases.

Definition 9.3.1: Smooth (/continuous) local frames

Let U ⊂ M be an open set and let V1 , . . . , Vn ∈ C ∞ (U, T M ) be smooth
vector fields. If V1 (p), . . . , Vn (p) form a basis of Tp M for every p ∈ U , then
we call this choice of vector fields a smooth local frame for M over U .

You can see this in figure 9.6. Basically, a frame is a (local) system of basis
vectors for all the Tp M 's, such that these basis vectors vary smoothly from one place
to another.

Figure 9.6: The vector fields (blue, red, orange) form at each point
a basis of Tp M and vary smoothly; thus they are a frame for M . Notice
that they don't need to be orthogonal.

With this, we can complete our definition of orientations on M .

Definition 9.3.2: Orientation on Manifold


We call a map o : M → OM , which chooses an orientation for every Tp M , an
orientation, if for all p ∈ M you can find an open U with p ∈ U and a smooth
local frame V1 , . . . , Vn on U that induces the orientation by:

o(p) = [(V1 (p), . . . , Vn (p))] (9.10)

Definition 9.3.3: Orientability of M

We call M orientable if there exists an orientation of M , and conversely,


non-orientable if no such orientation exists. We call the pair (M, o) an
oriented manifold if o is a (global) orientation of M .

We now want to discuss orientations in connection with smooth maps.



Definition 9.3.4: Orientation preserving and reversing maps

Let f : (M, oM ) → (N, oN ) be a local diffeomorphism between oriented manifolds.

• We call f orientation preserving (OP ) if for all p ∈ M : df (p) :
Tp M → Tf (p) N is OP .

• We call f orientation reversing (OR) if for all p ∈ M : df (p) : Tp M →
Tf (p) N is OR.

What this definition means intuitively is that f is orientation preserving if it is
orientation preserving locally (as dfp on Tp M ), in the vectorspace sense, everywhere.
There are a few things to note.
There are a few things to note.

Lemma 9.3.1
• If M is connected, then f is either OP or OR and cannot be anything
else.
• The obvious relations for preserving/reversing a property hold:

OP · OP = OP (9.11)
OR · OP = OR (9.12)
OP · OR = OR (9.13)
OR · OR = OP (9.14)

If f : U ⊂ Rn → Rn is a local diffeomorphism, then:

• f is OP ⇐⇒ det(df (p)) > 0 for all p ∈ U

• f is OR ⇐⇒ det(df (p)) < 0 for all p ∈ U

and the case det(df (p)) = 0 is not possible (exercise: why?)
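As an illustration of this criterion, det(df (p)) can be approximated numerically; `jacobian_det` is our own helper, not something from the lecture:

```python
import numpy as np

def jacobian_det(f, p, h=1e-6):
    """Numerical determinant of df(p) for f: R^n -> R^n,
    using central differences in each coordinate direction."""
    p = np.asarray(p, dtype=float)
    n = p.size
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (f(p + e) - f(p - e)) / (2 * h)
    return np.linalg.det(J)

rotate = lambda v: np.array([-v[1], v[0]])   # 90-degree rotation of R^2
reflect = lambda v: np.array([v[0], -v[1]])  # reflection in the x-axis
print(jacobian_det(rotate, [1.0, 2.0]) > 0)   # True: a rotation is OP
print(jacobian_det(reflect, [1.0, 2.0]) < 0)  # True: a reflection is OR
```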

You can also specify the orientation of a manifold by specifying a subatlas of M
such that all of the overlap maps are orientation preserving. This is another way
of saying that you can specify orientations in charts.

9.4 Orientation double cover


You might have noticed that we have introduced similar notations for orientations
OM, Op M in this chapter as for the tangent bundle and tangent spaces T M, Tp M .
This was not by accident. T M and OM are, in some sense, very similar as they
are both bigger manifolds we can build from M , without other structures. In this
section, we explore this, first by looking at S 1 .

Example 9.4.1: The orientations of S 1 , OS 1

Let S 1 be the circle as we know and love it. What does OS 1 look like?
Well, it is quite simple, really. It looks like two copies of a circle, one with
one orientation, and one with the other. No twisting or anything weird like
with T S 2 . You can see it in the figure below.

Figure 9.7: OS 1 is just two copies of the circle, with different orien-
tations, as you would expect, and as you should!

As you can see in the example, OS 1 is simply S 1 × {±1}, which is just a
way of saying that OS 1 consists of two copies of the circle, one oriented the "one
way", one the "other". This is clearly not true for a general M , just like it is not
true that T M = M × Rn for all M . One obvious counterexample is the Möbius
strip, and this should make sense. If it were otherwise, that is, if OM = M × {±1},
then we would have two copies of the Möbius strip, one with "one" orientation,
one with the other, but that cannot be, since the Möbius strip, as we saw, is not
orientable.
This idea is very general, as is the idea that OM is a smooth manifold.

Theorem 9.4.1: Orientation space OM and orientability

Let M be a smooth manifold. Then OM is naturally a smooth manifold of


the same dimension. The map π : OM → M is a smooth covering map of
degree two. OM is called the orientation double cover of M .
Additionally, M is orientable, if and only if OM = M × {±1}.

The first claim should be clear, since locally OM does look like M × {±1}:
M , being a manifold, locally looks like Rn , and we can therefore always find
exactly two possible orientations for a small local patch, as you can see in figure 9.8.
The second claim can be explained as follows. If M is orientable,
then OM has to have this structure, since you can find two global orientations +M
and −M , and they are disconnected in OM . The other direction is also simple, since
you can just call one component of OM the +1 orientation and the other the
−1, and you have found the two global orientations of M .

Figure 9.8: Locally, OM looks like M × {±1}, since a manifold locally looks
like Rn , and thus you can find two orientations for a small patch around
any point p. By choosing one to be the +1 orientation and the other to be
the −1, we see clearly that locally, over the patch around p, OM does
look like Patch × {±1}.

This theorem shows that, as with T M , the global manifold structure of OM


can tell us a lot about the underlying geometry (here orientability) of M .

Proposition 9.4.1: Orientations of M

Choosing an orientation of M is equivalent to choosing a smooth map o :


M → OM with o(p) ∈ Op M for all p ∈ M

This proposition is intuitively obvious, especially if you think of the
Möbius strip.
The orientation double cover OM itself also has an interesting property, orientation-wise.

Proposition 9.4.2: OM is orientable

Let M be a manifold, which might or might not be orientable. Then OM is
automatically orientable.

This is quite astounding if you think about it. It means that even for non-
orientable manifolds, like the Möbius strip, OM is orientable.
You can see this for the Möbius strip in figure 9.9.

Figure 9.9: On the one hand, you have the Möbius strip, which is non-
orientable, due to the twist it has. If you take the orientation double cover
OM , you can see that it has two twists, and therefore (as you can trace
with your finger) it is orientable. If you are unsure why OM looks like it
does, think about what it represents.
Chapter 10

Technicalities:
Countability

Another topic we want to discuss before going into the real stuff is countability.
The definition of a smooth manifold still allows for some extremely wild examples
that sometimes even feel absurd. This results from our definition of a smooth
manifold simply being a topological space (M, T ) with a smooth atlas, the
only restriction on the topological space being that it is Hausdorff. This is not yet
enough for us, since it includes some weird examples that we want to exclude.
The main idea is that these "pathological" examples happen if your topology
(= the set of all open sets) cannot somehow be reduced to something countable,
i.e. if it is built non-trivially out of an uncountable number of "things" and thus
can show very wild behaviour.
There are three main ways of saying that something is countable that we will use. They
are called second countability, σ-compactness and paracompactness. There are
many more, but these will be enough for our purposes here.

10.1 Second countability, σ-compactness and para-
compactness

The three different ways of making this vague idea precise can
be seen as approaching the problem from three different sides. The first, second
countability, has to do with "bases" of the topology, which are conceptually similar to
bases in linear algebra, that is, in a very particular sense, you can construct the entire
space from them. The second, σ-compactness, talks about which compact sets the
topological space is built of, not via an abstract notion, but
simply by set unions. Paracompactness is a bit more difficult, and talks about how
you can cover the space, and how you can refine these covers.


10.1.1 Second Countability


We said that second countability has something to do with "bases" of T . We need
to define this concept first. The idea, simply, is that a basis is a subset of the
topology such that for any open set U ∈ T on the space, and any point p ∈ U , you
can find a set in the basis which also contains p and fits inside U .

Definition 10.1.1: (Countable) Basis of a Topology

Let (M, T ) be a topological space (not necessarily a manifold). A basis
B is a subset of T consisting of open sets, such that for every open set
U ∈ T and every p ∈ U , there exists a V ∈ B with p ∈ V and V ⊂ U .

Equivalently, you could define it so that every U ∈ T can be written as a union
of sets in B.
In this sense, the basis, B, generates the topology, because you can construct
every open set in the topology from the sets in the basis.
A second countable topological space is simply one that has a countable basis.

Definition 10.1.2: Second countability

We call (M, T ) a second countable space if it has a countable basis B.

Figure 10.1: You can see the defining property of a basis of a topology in
this figure.

10.1.2 σ-compactness
The concept of σ-compactness is even simpler than that of paracompactness.

Definition 10.1.3: σ-compactness

A topological space (M, T ) is σ-compact if it is the union of countably many


compact sets.

For example, R is σ-compact, since it is the countable union of the compact
sets [n, n + 1] for n ∈ Z.

Figure 10.2: The real line is σ-compact, because it is the union of the
sets . . . , [−1, 0], [0, 1], [1, 2], . . . .

10.1.3 Paracompactness
To define paracompactness, we need something called an open cover. An open
cover, as you might imagine, is just a lot of open sets, that, together, cover the
space in question.

Definition 10.1.4: Open cover

Let A ⊂ M be a set. An open cover of A is a collection of open sets O
which cover A, i.e. such that:

A ⊂ ⋃_{U ∈O} U (10.1)

You can see the idea in figure 10.3.


Clearly, there is a notion of quality for open covers. Have a look at figure 10.4.
In the figure, the open cover P looks more efficient, because, intuitively, it wastes less
space. There is a lot that O covers that P doesn't. The way we make
this precise is by saying that every set V ∈ P is contained in a (bigger) set U ∈ O.

Definition 10.1.5: Refinements

Let O, P be two open covers. We call P a refinement of O and write P << O,
if for all V ∈ P, there exists a U ∈ O such that V is contained in U , i.e.
V ⊂ U.

Figure 10.3: An open cover of A is simply a collection of open sets, that,


if you take them together, cover A

Figure 10.4: Clearly, the cover P is better (more refined) than O, because
it "wastes less space", i.e. every V ∈ P is contained in a bigger set U ∈ O.

There is another way to "waste" your effort. You could also simply be using
more open sets to cover your set than you need, as you can see in figure 10.5. We

Figure 10.5: You can be very wasteful by covering with far too many
sets. In the worst case, you could be using infinitely many sets you don't
need, like here with the point p, which does not even lie on A.

can avoid infinite wastage with the following conditions.

Definition 10.1.6: Local finiteness

We say that an open cover O is locally finite if, for all p ∈ M , there exists
an open set U with p ∈ U , such that the number of sets in O that U meets is finite.

You can see a picture in figure 10.6 to get a feeling for the definition.
With this said, we can finally define paracompactness.

Definition 10.1.7: Paracompact space

A topological space (M, T ) is called paracompact if every open cover of


(M, T ) has a locally finite refinement.

Basically, paracompactness guarantees that we will not end up in the situa-
tion of figure 10.5, where we have an open cover with infinitely much "wastage"
and cannot refine it away without losing the covering of our space.
This criterion will be especially useful when we will work with partitions of unity.

Figure 10.6: An example of the criterion of local finiteness at p. You can


pick the set U , and it only meets finitely many (3) sets in the open cover.

10.2 How the three definitions interplay

We want to discuss how these three definitions interplay, first when you have just
a topological space, and then when you have a manifold. We shall see that the atlas
makes the interplay easier.

The particularly interesting fact is that all second-countable spaces are para-
compact, but not all paracompact spaces are second-countable.

2nd-countable ⇒ paracompact (10.2)
paracompact ⇏ 2nd-countable (10.3)

We can get a feeling for this with a few examples.



Example 10.2.1: Countability

We will explain the difference in "size" between spaces that are merely paracompact
and those that are also second countable, using Rn .
• As you can imagine, R2 is both. It is second countable because its
topology can be shown to be generated by all balls with rational
centers and rational radii. It is also
not too hard to convince yourself that it is paracompact.
• Rn is both paracompact and second countable.

• A countable union of copies of Rn is also both paracompact and second


countable.
• On the other hand, an uncountable union of copies of Rn is not second
countable, but it is still paracompact. It is not second count-
able because you need several basis open sets for each copy, but
there is an uncountable number of these copies, so you can
not find a countable basis.
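The rational-balls argument for R can be made concrete with a small sketch (the helper `rational_ball` is ours): every point of an open interval lies in a ball with rational center and rational radius that fits inside the interval, and there are only countably many such balls.

```python
from fractions import Fraction

def rational_ball(p, a, b):
    """For p in the open interval (a, b), return a rational center q and a
    rational radius r > 0 with a < q - r < p < q + r < b.
    Every float is exactly rational, so Fraction(p) is a legitimate
    rational center."""
    assert a < p < b
    q = Fraction(p)
    gap = min(q - Fraction(a), Fraction(b) - q)
    r = gap / 2  # rational, and small enough that the ball stays in (a, b)
    return q, r

q, r = rational_ball(0.3, 0.0, 1.0)
print(Fraction(0) < q - r and q + r < Fraction(1))  # True: ball fits in (0, 1)
print(q - r < Fraction(0.3) < q + r)                # True: and contains p
```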

So as you can see, paracompact spaces don't necessarily have to be "small", like
second countable ones do.
The difference between a general paracompact space and a second countable one is
not too big, though: you just need one more condition on the paracompact space for it to
be second countable. It has to be separable.

Definition 10.2.1: Separable Space

A topological space is called separable if it has a countable dense subset.

This is quite a dense (pun intended) definition. But you can see it with
R: it is separable since it has a countable dense subset, namely Q. Q is a subset,
obviously; it is one of the first examples of a countable set that you usually learn about,
via Cantor's counting trick; and it is dense in R.

Proposition 10.2.1: 2nd countable and separable

Every second countable space is separable.

Proposition 10.2.2: Second countable and paracompact

A space is second countable if and only if it is paracompact and separable.

Neither paracompactness nor separability alone is enough to guarantee second
countability; you need both. If you are wondering, there is a counterexample which is
separable but not second countable, but we won't give it here.

If we now move to manifolds, the distinctions become less complicated.

Proposition 10.2.3: Countability for manifolds

Let M now be a smooth manifold. Then the conditions that M is second
countable, that M is σ-compact, and that M has a countable atlas,
are all equivalent.

We won’t prove this here, as there was not enough time.

Proposition 10.2.4: 2nd-countable and paracompact for manifolds

For any locally compact space, second countable implies paracompact. For
any connected manifold, paracompact implies second countable.

The proofs of these two claims are both clever, and certainly not easy, but since
we only really need the result, we will not give it here.
We also have to explain what we mean by locally compact space.

Definition 10.2.2: Locally compact space

A space X is called locally compact if for all p ∈ X, there exist V open
and K compact such that p ∈ V and V ⊂ K.

The upshot of the previous propositions is that for manifolds, second countable
means simply a countable number of second countable connected components, while
paracompact means any number of second countable connected components. This
is a lot simpler than with just topological spaces.
With all the definitions and propositions out of the way, we come to a truly
weird example.

10.3 The Long line


Will come soon
Chapter 11

Tools: Bumps, Cutoffs and


partitions of unity

We now come to a completely different topic again: very useful
tools for actually proving results, especially ones involving concrete objects, like
the metric. In particular, we will be talking about ways to smoothly "localize" stuff
(whatever the stuff might be), particularly in charts.
We will talk about bump functions and cut-off functions, which are functions
that look vaguely like bumps (think Gaussians), that you can multiply with an object
(e.g. a function) so that it becomes zero everywhere except where the bump is. We then go
on to talk about partitions of unity, which are extremely useful when introducing, for
example, integration on manifolds.

11.1 Bumps and Cut-offs

A bump is something that looks vaguely like a, well, bump. It is a function which
is zero everywhere except for a (small) open set, where it is strictly positive.

Definition 11.1.1: Bump function

A bump function for the open set U ⊂ M is a function b ∈ C ∞ (M ) such


that:
• b(x) = 0 if x ∉ U .
• b(x) > 0 on U .


Example 11.1.1: Examples of bump functions

On R, we can create a bump function through the following idea. Let's start
with the function:

g1 (t) = e−1/t for t > 0, and g1 (t) = 0 for t ≤ 0 (11.1)
You probably remember from a calculus class that this function, astounding as it
looks, being exactly zero and then suddenly growing, is smooth.
We can construct a bump function for (−1, 1) from it as follows:

g2 (t) = g1 (1 + t)g1 (1 − t) (11.2)

Since we are only multiplying, g2 is still smooth, and it is nonzero exactly on (−1, 1),
as you should be able to convince yourself. We can even use it to make
a bump function for B1 ⊂ Rn , by taking:

g3 (x) = g2 (|x|2 ) (11.3)

This is positive exactly on the open ball of radius one and zero outside. You might
wonder whether it is smooth at 0, since it involves the absolute value. You can check
that it is, but it would not be if there weren't a square in the function.

Figure 11.2: The functions from this example. First you have the
function g1 , from which we construct bump functions for (−1, 1) and
for B1 ⊂ Rn .
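The construction in the example can be typed up directly; this sketch uses g1 (1 + t)·g1 (1 − t) for the bump on (−1, 1), since both factors are positive exactly when −1 < t < 1:

```python
import math

def g1(t):
    """exp(-1/t) for t > 0, and 0 for t <= 0; smooth on all of R."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g2(t):
    """Bump function for (-1, 1): positive there, zero outside.
    Both factors are positive exactly when -1 < t < 1."""
    return g1(1.0 + t) * g1(1.0 - t)

def g3(x):
    """Bump function for the open unit ball in R^n; x is a point."""
    return g2(sum(xi * xi for xi in x))

print(g2(0.0) > 0)         # True: positive inside (-1, 1)
print(g2(1.0))             # 0.0: vanishes at the boundary
print(g3((0.5, 0.5)) > 0)  # True: |x|^2 = 0.5 < 1
print(g3((1.0, 1.0)))      # 0.0: |x|^2 = 2 lies outside the ball
```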

We can also define something similar called a cut-off function. The idea is that
the function is 1 on a set A, and zero outside of a slightly bigger set U , falling off
smoothly in between.

We need a few topological ideas to write down the definition.

Definition 11.1.2: compactly contained

Let U ⊂ M be an open set. We say a set A is compactly contained in U ,


if Ā ⊂ U and Ā is compact. In this case, we write A << U and also say
that A avoids the edges of U .
Generally, we say A is precompact if Ā is compact.

Figure 11.3: An example for the definition

Definition 11.1.3: Support of a function

Let u : M → R. The support of u is the closure of the set of all points at
which u(x) ̸= 0:

supp(u) = spt(u) = {u ̸= 0} (11.4)

Figure 11.4: An example for the definition

These definitions are basically a way of defining a (small) edge around a set, in
a topologically sensible way.

A cutoff function for a set A ⊂ U is a function which is one on A, zero
outside of U , and falls off smoothly in between.

Definition 11.1.4: Cutoff function

Let A ⊂ U be a set and U open. We say that c is a cutoff function for A in U if:
• c ∈ C ∞ (M )
• 0≤c≤1
• Ā ⊂ {c = 1}◦ , i.e. the closure of A is in the interior of the
set where c = 1.
• The support of c is a subset of U : supp(c) ⊂ U .

Figure 11.5: An example for the definition

Example 11.1.2: Cutoff functions

We can construct cutoff functions for various sets from a bump function.
So let g be a bump function for the interval (1/4, 3/4). We can define a
function h1 by:

h1 (t) = ∫_{−∞}^{t} g(s)ds / ∫_{−∞}^{∞} g(s)ds (11.5)

which is basically a normalized integral. Notice that the function is 0
below 1/4 and 1 above 3/4, so it is a cutoff function for the set (1, ∞). We
can turn it into a cutoff for (−1, 1) in (−2, 2) by taking the function:

h2 (t) = h1 (t + 2)h1 (2 − t) (11.6)

and into a cutoff for B1 in B2 by:

h3 (x) = h2 (|x|2 ) (11.7)

exactly as we did with the bumps.



Figure 11.6: All the functions from this example.
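The same construction can be sketched numerically. The integrals of equation (11.5) are approximated with a plain midpoint rule, and the second factor of h2 is taken as h1 (2 − t) so that the product is 1 on [−1, 1] (with h1 (−t − 2), the two factors could never be positive at the same time):

```python
import math

def g(t):
    """A bump for (1/4, 3/4), built as in the previous example."""
    def g1(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g1(t - 0.25) * g1(0.75 - t)

def integral(f, a, b, n=2000):
    """Plain midpoint rule; good enough for a smooth integrand."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

TOTAL = integral(g, 0.25, 0.75)  # g vanishes outside (1/4, 3/4)

def h1(t):
    """0 for t <= 1/4, 1 for t >= 3/4, smooth in between (eq. 11.5)."""
    if t <= 0.25:
        return 0.0
    return integral(g, 0.25, min(t, 0.75)) / TOTAL

def h2(t):
    """Cutoff: 1 on [-1, 1], 0 outside (-2, 2) (eq. 11.6)."""
    return h1(t + 2.0) * h1(2.0 - t)

print(h1(0.2), h1(1.0))  # 0.0 1.0
print(h2(0.0), h2(3.0))  # 1.0 0.0
```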

11.2 Partition of Unity


We now want to use these bumps and cutoffs to introduce an extremely useful tool
called the partition of unity. A partition of unity lets us localize
"something", whatever that something might be, e.g. a function, a vector field or a
tensor field (we have not introduced those yet). The basic idea is quite simple. We
take a bunch of smooth functions which are bumps for some open sets (chosen
in advance) and which sum up to one. If this sentence makes little sense to
you, I invite you to think of it like this. Take your manifold and divide it into many
tiny, non-overlapping parts. For each part, define a function which is 1 on the part
and 0 outside of it, i.e. the characteristic function of that part. These
functions are then weights, and if we multiply a function f with one of them, we get a
localized version f ′ , which equals f (x) on the tiny part but is zero otherwise. This is
quite a simple procedure, except for the problem that these characteristic functions
are not smooth and we cannot work with them easily (think: derivatives). So we
want a smooth version of this idea, and that smooth version is called a
partition of unity. One application of partitions of unity we will see is the
definition of the metric.

11.2.1 Local finiteness


We need a few technicalities before we go on, in particular, we need to clear up a
few rules for calculating with things like partitions of unity.

Figure 11.7: The basic idea of a partition of unity. We will be working
with a smooth version, since we want to differentiate, but the concept is
slightly simpler in the non-smooth version. What you want is a collection of
functions that are characteristic functions for a chosen set of parts of your
manifold (here, intervals in R), or smooth functions similar to
the characteristic functions, so that if you multiply these functions with an
object, for example another function or a vector field, you get a localized
version (one for each of the characteristic functions), and if you sum all of these
localized versions of f up, you get f again.

Definition 11.2.1: Locally finite sums

Let {ξα }α∈A be a set of functions M → R, indexed by A, where A can be any
set: finite, countable, uncountable, etc. The sum of all of these functions,

    ∑_{α∈A} ξα        (11.8)

is called locally finite if for every point p ∈ M there exists an open neighborhood U
of p on which only finitely many of the functions are non-zero, so that the sum
on U is a finite sum. In other words:

    #{α | ξα |U ̸= 0} < ∞        (11.9)

These functions don't necessarily need to be smooth.

The consequence of this is that locally, the sum looks just like a finite
sum, hence the name. If these functions are all differentiable, then the sum
can also be differentiated, term by term.

Proposition 11.2.1: Differentiability

If ξα is smooth for all α ∈ A, then you can differentiate ∑_{α∈A} ξα term by
term:

    ( ∑_{α∈A} ξα )′ = ∑_{α∈A} ξα′        (11.10)

If ξα ∈ C ∞ (M ) for all α, then ∑_{α∈A} ξα is smooth too.

Sketch. At every point, restrict the sum to a small set U on which only
finitely many of the functions are non-zero. Ignore all the rest, since they are zero on
U and thus contribute nothing to the derivative. Then differentiate the finite
sum that remains, and you get your derivative. The second part follows from
similar ideas.

With this, we can define a partition of unity.

11.2.2 Partition of unity: definition


We can now define a partition of unity. We will follow the ideas from the beginning
of this section.

Definition 11.2.2: Partition of Unity

Let O be an open cover of M. A partition of unity subordinate to O is a
collection of functions {ξα }α∈A such that:

• Every function is smooth and between zero and one: ξα ∈ C ∞ (M ), 0 ≤ ξα ≤ 1.

• (Subordinate to O) Every function has a set in O outside of which it is
zero. I.e. for all α, there exists a Wα ∈ O such that supp(ξα ) ⊂ Wα .

• The sum ∑_{α∈A} ξα is locally finite.

• ∑_{α∈A} ξα = 1.

The usage is, as we have already said, to localize things. Let T : M → X be
some map from M to X, be it R (a function), T M (a vector field), or a tensor field.
Then we have:

    T = 1 · T = ( ∑_{α∈A} ξα ) T = ∑_{α∈A} (ξα T ) = ∑_{α∈A} Tα        (11.11)

where we have defined Tα = ξα T , which are localized versions of
T (and smooth, if T is). This way, we can transfer to smaller sets, for example ones that we can cover
with a single chart, i.e. we can turn our global calculations into local ones in
charts.
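As a toy illustration of (11.11), here is a hypothetical 1-D sketch in Python (not from the lecture): we build a smooth partition of unity on an interval of R out of bumps μ_k supported on (k − 1, k + 1), localize f = sin with it, and check that the localized pieces sum back to f. The helper functions `bump` and `mu` are our own choices of smooth bumps.

```python
import numpy as np

def bump(s):
    # exp(-1/s) for s > 0, extended by 0: smooth on all of R
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    out[s > 0] = np.exp(-1.0 / s[s > 0])
    return out

def mu(t, k):
    # a smooth bump supported exactly on the interval (k-1, k+1)
    return bump(1.0 - (t - k) ** 2)

ks = range(-4, 5)                       # enough indices to cover [-3, 3]
t = np.linspace(-3, 3, 601)
total = sum(mu(t, k) for k in ks)       # locally finite: at most 2 terms nonzero per t
xi = {k: mu(t, k) / total for k in ks}  # normalized weights, summing to 1

f = np.sin(t)
pieces = {k: xi[k] * f for k in ks}     # localized versions T_k = xi_k * f
recon = sum(pieces.values())

assert np.allclose(sum(xi.values()), 1.0)  # partition of unity
assert np.allclose(recon, f)               # the pieces sum back to f
```

Each piece `pieces[k]` vanishes outside (k − 1, k + 1), yet the pieces reassemble the original function exactly, which is the whole point of the tool.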
Before we can use them as we like, we need to prove that partitions of
unity even exist. Luckily, the conditions for this to be true are very lax.

Theorem 11.2.1: Existence of Partitions of Unity

Let M be a paracompact manifold and O an open cover of M. Then there
exists a partition of unity subordinate to O.

The condition of paracompactness was, historically, constructed exactly
for partitions of unity: we really need them to do calculus, and this is why we
even introduced paracompactness.
We will need a lemma to prove the theorem.
We will need a lemma to prove the theorem.

Lemma 11.2.1
Assume the following:
• M is paracompact,
• O is an open cover of M,
• B is a base for the topology of M.
Then there exists an open cover P of M such that the following are true:
• P refines O, written P << O,
• P is locally finite,
• P ⊂ B.

The first two follow from the definition of paracompactness. We will skip the
proof as it is a bit tricky.
We can now prove the theorem.

Proof. This proof is actually a bit routine.

Step 1 We start by defining a base B of the topology of M. Call an open set U ⊂ M admissible
if

– Ū is compact,
– there exists a V ⊂ M such that (U, V ) is diffeomorphic to (B1 , B2 ) ⊂
Rn under a single diffeomorphism. (See figure 11.8.)

Then let B = {U | U is admissible}.

Step 2 Let O be any open cover of M . Then by the lemma, there exists a refine-
ment P , that is locally finite and which is a subset of the basis B.

Step 3 We can write this open cover P as {Uα }α∈A for some index set A.
Step 4 Let g be a cut-off for B1 in B2 . For each Uα select a Vα such that (Uα , Vα )
are diffeomorphic to (B1 , B2 ) by the diffeomorphism φα and define:
    μα = { g ◦ φα   on Vα
         { 0        on M \ supp(g ◦ φα )        (11.12)

which is smooth (exercise: why?). See figure 11.9 for a visualization.


Step 5 Notice that μα ≥ 0 and that μα > 0 on Uα . The Uα 's cover M.
Since P is locally finite, the sum ∑_{α∈A} μα is also locally finite
(exercise: why?).
Step 6 Notice that ∑_{α∈A} μα > 0 everywhere.

Step 7 This already looks like a partition of unity; the only part we have not
yet got is the normalization. We can get this easily by defining ξα to be:

    ξα = μα / ∑_{β∈A} μβ        (11.13)

Now, all the ξα 's taken together form a partition of unity, subordinate to O.

Figure 11.8: The choices of U and V needed for the construction in the
proof.

Figure 11.9: The construction of the functions µα .

With this, we have proven that a partition of unity exists if M is paracompact.
We will be using them a lot.
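The analytic engine behind Step 4 of the proof is the cutoff g for B1 in B2. A minimal numerical sketch of one standard construction (the specific exp(−1/s)-based formula and the radii 1 and 2 are our choices; the lecture only assumes some such g exists):

```python
import numpy as np

def h(s):
    # exp(-1/s) for s > 0, else 0: smooth, vanishing to infinite order at 0
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    out[s > 0] = np.exp(-1.0 / s[s > 0])
    return out

def g(x):
    """Radial cutoff on R^n: equals 1 on B1 (|x| <= 1), 0 outside B2 (|x| >= 2)."""
    r = np.linalg.norm(x, axis=-1)
    # numerator vanishes for r >= 2, second summand vanishes for r <= 1,
    # and the denominator is positive everywhere, so g is smooth
    return h(2.0 - r) / (h(2.0 - r) + h(r - 1.0))

pts = np.array([[0.5, 0.0], [1.5, 0.0], [2.5, 0.0]])
vals = g(pts)   # -> 1 inside B1, strictly between 0 and 1 in the annulus, 0 outside B2
```

Composing this g with the chart diffeomorphisms φα and extending by zero is exactly the μα of Step 4.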
Chapter 12

Immersions: Local
Immersion Theorem

After a lot of (technical) preparations, and a few side topics that fit in but were
not strictly needed, we come to answer one of the
questions of how to get submanifolds.
Let f : L → N be a smooth map. When is M = f (L) a submanifold of N ?
We already saw that this has something to do with immersions, the definition of
which we include here as a reminder.

Definition 12.0.1: Immersion


Let f : M m → N n be a smooth map. It is called an immersion if for every
p ∈ M , dfp : Tp M → Tf (p) N is injective.

We have also seen that embeddings are always injective immersions, but the
opposite is not necessarily true.
These questions will be answered, in parts, by the main theorem of this chapter,
the local immersion theorem.
To state it, we need a certain concept, which sounds a bit trivial/weird to even
state.

Definition 12.0.2: The Canonical Linear Immersion

The canonical linear immersion i = i_{m,n} : Rm → Rn , where m ≤ n, is the
map that trivially inserts Rm into Rn :

    i(x1 , . . . , xm ) = (x1 , . . . , xm , 0, . . . , 0)        (12.1)
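As a sanity check, the canonical linear immersion is literally just zero-padding; a minimal sketch (the function name is our own):

```python
import numpy as np

def canonical_immersion(x, n):
    """i_{m,n}: pad an m-vector with n - m zeros (assumes n >= len(x))."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, np.zeros(n - len(x))])

v = canonical_immersion([1.0, 2.0], 4)   # -> [1., 2., 0., 0.]
```

The map is linear with matrix (I_m over a zero block), so its differential is itself and is injective, as the name promises.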


Figure 12.1: The canonical linear immersion. It is the map that
simply puts Rm into Rn .

You might protest at this definition. Why in the world would we give what is
basically "the identity" between Rm and Rn a name? What is so special about it?
Well, what is special about it is that it is an extremely simple, practically
trivial map. The reason we even give it a name is the following theorem:

Theorem 12.0.1: Local Immersion Theorem


Let f : M → N be smooth and p ∈ M . Suppose the differential of f at
p is injective, i.e. dfp : Tp M → Tf (p) N is injective. Then there exist
local coordinates for M and N around p and f (p) in which f looks like the
canonical linear immersion i.

This is an extremely important result. Let us restate it: any smooth map
whose differential at a point is injective locally looks just like (x1 , . . . , xm ) ↦
(x1 , . . . , xm , 0, . . . , 0).
The proof of this theorem will rely on the inverse function theorem, which states
that if g : P → Q is smooth and dg(p) is bijective, then there is an open
neighborhood U of p such that g|U : U → g(U ) is a diffeomorphism.

Proof. We want to use the inverse function theorem to prove this result, but
that requires manifolds of the same dimension, while our condition is m ≤ n.
The basic idea is to augment M until it has dimension n (artificially), use
the inverse function theorem, and then throw away the extra dimensions we
added. Let's start by fixing a p ∈ M . The theorem is entirely local, so we can
without loss of generality work in a chart. In this chart, let f : U → V where
U ⊂ Rm , V ⊂ Rn and f (U ) ⊂ V , of course. We know, from the hypothesis, that
df (p) : Tp Rm = Rm → Tf (p) Rn = Rn is injective, so without loss of generality
we can transform Rm and Rn so that p = 0, f (p) = 0, df (p) = i : Rm → Rn ,
simply by using some linear algebra (a rigid motion, then Gaussian elimination).

So we get the result already, but only at p, not locally on U . That’s the part
we will need the inverse function theorem for.
To use the inverse function theorem, we will need to add n − m dimensions
to M somehow. We do this, by ”stacking” M enough times. What this means it
that we take, instead of the manifold M , the manifold M × Rn−m , or in charts:

    x = (x′ , x′′ ) = (x1 , . . . , xm , xm+1 , . . . , xn )        (12.2)

where the first coordinates refer to those on M , the rest to Rn−m . We can also
write N (in the chart Rn ) like this:

    y = (y ′ , y ′′ ) = (y 1 , . . . , y m , y m+1 , . . . , y n )        (12.3)

You can see a visualization of what we are doing in figure 12.2. We can also
replace the function f by a function F : Rn → Rn going from the chart of
M × Rn−m to the chart of N . We cannot just choose F (x′ , x′′ ) = f (x′ ), since
its differential would not be bijective (all partials with respect to x′′ would be zero).
So we need to choose differently. We need to add something which keeps F
not too different from f , but makes the differential bijective. We can simply choose:

    F (x′ , x′′ ) = f (x′ ) + (0, x′′ )        (12.4)

which is the simplest choice we can get. The first term, f (x′ ), is independent of
x′′ , and similarly the second term, (0, x′′ ) ≡ j(x′′ ), is independent of x′ . The
second term shifts things up.
To use the inverse function theorem, we need to show that dF is bijective at
p = 0. Let us compute:

    dFp (X ′ , X ′′ ) = dfp (X ′ ) + djp (X ′′ )        (12.5)
                    = i(X ′ ) + j(X ′′ )               (12.6)
                    = (X ′ , 0) + (0, X ′′ )           (12.7)
                    = (X ′ , X ′′ )                    (12.8)
                    = X                                (12.9)

Remember that X = (X ′ , X ′′ ) is a vector. So what we get is the simple result
that:

    dFp = In        (12.10)

I.e. the differential of F at p is the identity on Rn . It is clearly bijective, so we can
use the inverse function theorem.
The inverse function theorem tells us that F is invertible near p, i.e. that
there exists a small open rectangle W = U ′ × U ′′ ⊂ U × Rn−m such that
F |W : W → F (W ) is a diffeomorphism, which means that (per definition of

Figure 12.2: The construction we use in the proof of the local immersion
theorem.

Figure 12.3: Another view of what we are doing to M .



diffeomorphism) F (W ) is open, F |W is invertible and its inverse G = (F |W )−1
is smooth. Since F is locally invertible, G constitutes a valid chart for F (W ).
We can use it to construct new coordinates which fit our purpose. The
coordinates of M × Rn−m we get at p by the inverse G can be seen in figure
12.4.

Figure 12.4: The coordinates of M × Rn−m we get by the inverse of G.


Note that we see the left side as the manifold and the right side (which is a
chart of N ) as the Rn with which we chart M .

In these special coordinates, f is nothing but the canonical linear immersion.
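The mechanism of the proof can be checked numerically on a toy example (our own choice, not from the lecture): take the immersion f(t) = (t, t^2) of R into R^2, which already satisfies df_0 = i, form the augmented map F(x′, x′′) = f(x′) + (0, x′′) from (12.4), and verify that its Jacobian at the origin is the identity, as in (12.10). The finite-difference Jacobian is our own helper.

```python
import numpy as np

def f(t):
    # an immersion R -> R^2 with df_0 = (1, 0)^T, i.e. already in "good" position
    return np.array([t, t ** 2])

def F(x):
    # the augmented map of the proof: F(x', x'') = f(x') + (0, x'')
    t, s = x
    return f(t) + np.array([0.0, s])

def jacobian(fun, x, eps=1e-6):
    # central finite differences, column by column
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        cols.append((fun(x + e) - fun(x - e)) / (2 * eps))
    return np.column_stack(cols)

J = jacobian(F, [0.0, 0.0])
assert np.allclose(J, np.eye(2), atol=1e-9)   # dF_0 = I_2, so F is a local diffeo
```

Since dF_0 is invertible, F is a local diffeomorphism, and f is recovered as F restricted to the slice x′′ = 0, exactly as in the proof.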

This theorem is extremely important and many things follow from it. As an
example, it implies that f (U ′ ) can locally be written as a graph near f (p) in the
coordinates (y 1 , . . . , y n ) of N .

Theorem 12.0.2: Graphable image theorem

In the setting of the local immersion theorem, f (U ′ ) can locally be
written as a graph around f (p), in the coordinates of N :

    y ′′ = g(y ′ ) = (f (G(y ′ , 0)′ ))′′ = ((f ◦ πRn →Rm ◦ G ◦ iRm →Rn )(y ′ ))′′        (12.11)

Other consequences are:



Corollary 12.0.1

• If you fix p ∈ M and f : M → N smooth such that dfp is injective at p,
then dfq is injective for q near p (i.e. in a (small) open subset of M
containing p).

• The set of all points where the differential of f is injective,
I = {q ∈ M | dfq injective}, is thus open.

• f restricted to the set I is an immersion.

We can also use it to finally find an answer to the question that started this
chapter.

12.1 A local answer


Corollary 12.1.1: The Answer to the first question

The image under an immersion of any sufficiently small open set of M is a
submanifold of N . Indeed, f : U → f (U ) is an embedding of U into N .

The proof is left as an exercise. You can see the grand idea of this in figure 12.5.

The end result can be summarized as follows. An immersion by itself is not
enough to make f (U ) a submanifold of N , but it is enough to guarantee a local
version: if we pick U small enough, then f (U ) will be a submanifold
of N . Another way of saying it: an immersion is locally an embedding.

12.2 A global answer


We have a local answer. We can also get a global answer. It turns out you only
need to check a purely topological condition on f .

Theorem 12.2.1: Global Answer


Let f : M → N be a smooth immersion. If f is a homeomorphism
onto f (M ), then f (M ) is a smooth submanifold of N and f : M → f (M ) is a
diffeomorphism. In other words, f is an embedding.

The problem with this global answer is that it is hard to check for some f . We
can also give a different answer.

Figure 12.5: An immersion (shown here) does not necessarily make f (M )
a submanifold of N , as you can see here: we have an
overlap at f (p), which means that at this point f (M ) does not look like a
manifold and thus cannot be a submanifold. But locally, every immersion
looks like an embedding. You can see this by taking a small interval U ′
around the problem point p; if it is small enough, you get a submanifold.
The global problem here does not pose a local problem, because we picked
U ′ small enough that the loop does not return to f (p) and cause a
problem.

Theorem 12.2.2: Embeddings of compact manifolds

If f : M → N is a smooth injective immersion and M is compact, then


f (M ) is a submanifold of N and f : M → N is an embedding.

For the first global answer, you need to check whether f is a homeomorphism onto
its image, which means checking whether it is bijective onto f (M ) (i.e. injectivity),
whether it is continuous, and whether f −1 is continuous. Requiring M
to be a compact manifold is much simpler.

12.3 Proper Maps


We want to give a slightly different version of the answer, by putting a different
kind or restriction on f . The restriction is that we require f to be proper.

Definition 12.3.1: Proper function

Let f : M → N be a smooth function. We say f is proper, if for all K ⊂ N


which are compact, f −1 (K) ⊂ M is also compact.

Example 12.3.1: Proper and nonproper functions

Let f : R → R be the function f (t) = tanh(t), which you can see in the
figure. f is not proper, as you can easily see: f −1 ([0, 2]) = [0, ∞), and
since [0, 2] is compact and [0, ∞) is not, f cannot be proper. Let
F : R → R2 now be the function F (t) = (t2 , t(t2 − 1)). This function is
proper, for the simple reason that if |t| → ∞, then |F (t)| → ∞, so you
cannot find a compact set in R2 in which the curve stays for a non-compact
interval.

Figure 12.6: The two functions in the example. The first is not
proper, the second is.
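The two examples can be probed numerically; a rough sketch (the sampling grid and the bounds are arbitrary choices of ours):

```python
import numpy as np

t = np.linspace(-50, 50, 100001)

# f(t) = tanh(t): the preimage of the compact set [0, 2] is [0, oo), not compact.
# Numerically: arbitrarily large t still lands in [0, 2].
f = np.tanh(t)
pre = t[(f >= 0) & (f <= 2)]
assert pre.max() == t.max()

# F(t) = (t^2, t^3 - t): |F(t)| -> oo as |t| -> oo, so preimages of bounded
# sets are bounded (hence compact, being closed): F is proper.
F = np.stack([t ** 2, t ** 3 - t], axis=-1)
norms = np.linalg.norm(F, axis=-1)
pre2 = t[norms <= 10.0]
assert pre2.min() > t.min() and pre2.max() < t.max()
```

Of course a finite sample proves nothing; the point is only to see the two behaviours (escaping vs. bounded preimages) side by side.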

If we require f to be proper, then the only other thing we need to check is
whether it is injective (otherwise f (M ) can obviously not be a submanifold in
"usual" cases).

Theorem 12.3.1: Embeddings/Immersions and properness

Let f : M → N be a proper injective immersion. Then f is an embedding.

We now give a particularly nice example to see a lot of the concepts interplaying.

Example 12.3.2

Let us take a function from the real line to the torus T 2 . Specifically, let's
take the function:

    f (t) = [(t, αt)]        (12.12)

where [ · ] denotes the equivalence class for the torus T 2 ≃ R2 /Z2 (basically, it
sends the point (a, b) to a point on the square by adding and subtracting
integers to the coordinates until it gets to something in the [0, 1] × [0, 1]
square).
If α is an irrational number, then the line you get on T 2 never meets itself,
that is, it never closes up. Notice though that f is an injective immersion.
There are a few things that happen because of this.
There are a few things that happen because of this.
• f (R) is dense in T 2 .
• f (R) is not a submanifold, and f : R → f (R) is not a homeomorphism.
• f is not an embedding.
All of this happens because f is not proper.

Figure 12.7: The map from this example. If α is irrational, then a lot
of weird things happen. In particular, f (R) never closes, becomes
dense in T 2 and is not a submanifold of T 2 . All of this happens,
because f is not proper.
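A quick numerical illustration of the density claim (the slope α = √2, the target point and the sampling scheme are our own choices): the sampled orbit comes very close to an arbitrary point of the torus.

```python
import math

alpha = math.sqrt(2)          # an irrational slope
target = (0.37, 0.81)         # an arbitrary target point on the torus [0,1)^2

def torus_dist(p, q):
    # distance on R^2/Z^2: take the shortest representative per coordinate
    return math.hypot(*(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q)))

best = float("inf")
for k in range(200000):
    t = 0.001 * k
    point = (t % 1.0, (alpha * t) % 1.0)
    best = min(best, torus_dist(point, target))

# best is tiny: the line keeps returning arbitrarily close to the target
```

For rational α the picture is completely different: the curve closes up after finitely many windings and is an embedded circle.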

Theorem 12.3.2

Let f : M → N be an embedding. Then f is proper if and only if f (M ) ⊂ N


is closed.

To prove this theorem one needs a few lemmas, which we will just state here.

Lemma 12.3.1
• In a Hausdorff space, compact sets are closed
• In a compact set, closed sets are compact
• In any topological space, if K is compact and C closed, then K ∩ C
is compact.

Lemma 12.3.2
If M is a manifold, then for all U open and all p ∈ U , there exists a V
open, with p ∈ V , such that V̄ is compact, V̄ ⊂ U , and V is a chart.

Definition 12.3.2
If U ⊂ M is open then we call A ⊂ U compactly contained in U if:

• Ā ⊂ U
• Ā is compact
and we write A << U .

The idea of A being compactly contained in U is that it keeps away from the
boundary.

Lemma 12.3.3
A manifold is locally compact, that is, for all p ∈ M there exist an open
V and a compact K such that p ∈ V ⊂ K.

Lemma 12.3.4
Let Y be a locally compact Hausdorff space and let D ⊂ Y . Then D is
closed iff for all compact K ⊂ Y , K ∩ D is compact.

Lemma 12.3.5
Assume the following:
• X is a topological space

• Y is a locally compact Hausdorff space


• f : X → Y is proper.
Then f (X) is closed in Y .

The theorem above follows immediately from these lemmas.


Chapter 13

Embeddings: Whitney’s
Theorem

In the last chapter, we talked a lot about immersions, embeddings and their
interplay. We saw that embeddings are at the very least injective immersions, and
we saw that immersions are locally embeddings. We also saw that if M is compact,
then an injective immersion is an embedding, and that a proper injective
immersion is an embedding. We want to leave this interplay for now (except
as a tool) and talk a little bit about embeddings themselves, in particular a very
important theorem that lifts the mystique around manifolds and realizes them as
something more tangible. This theorem is called Whitney's embedding theorem, and
this chapter is all about it.

13.1 Whitney’s embedding theorem


The goal of this chapter is to shed some light on the idea of a manifold. As it stands,
we know only that a general smooth manifold is "some abstract space for which we
can build charts". This is very abstract. Whitney's embedding theorem takes
manifolds and connects them to something we know really well, namely Euclidean
space Rn (even more concretely than the definition of smooth manifolds itself).
Whitney's theorem tells us that we can embed manifolds in Euclidean space.

Theorem 13.1.1: Whitney’s embedding theorem: Version 1

Every second-countable smooth manifold can be embedded in some Rk .

The k in Rk is there so that you don't accidentally think that this theorem
says you can embed any n-dimensional manifold in Rn , because that is simply not
possible.
Before we throw the proof at you, there are a few refinements and variations of
the theorem we want to discuss.


13.2 Refinements of Whitney's Embedding Theorem
The theorem, as it stands, tells us only that we can find some Rk into which we
embed our manifold. It does not put an upper bound on k, so situations like a
one-dimensional manifold (a curve) needing k = 10, 1000, 100000000, or even 10^100 are
still imaginable (in principle), even if these scenarios feel absurd.
It makes sense that there are bounds, and the refined version makes this clear.

Theorem 13.2.1: Whitney’s embedding theorem: Refined Version

Every n-dimensional second-countable smooth manifold can be embedded


in R2n .

This tells us that we need at most twice the number of dimensions to
embed a manifold. In particular, it tells us we never need more than R2 to embed
any curve and R4 for any surface. For both of these, it makes sense that these are
upper bounds and also that they can't be made smaller: a circle needs
at least two dimensions to be embedded in, and for surfaces, well, we already met the
Klein bottle, which cannot be embedded in three dimensions. 2n is however not
the best we can do for all n.
What is the idea behind this refinement? Well, from the first version of the
theorem, we know that any manifold can be embedded in some Rk . We can
take this embedding and project it linearly down onto lower-dimensional linear
subspaces. It turns out that almost any projection down to R2n+1 is still an
injective immersion, and hence works. To get down to R2n is another matter and
requires a bit more trickery, as it is not clear how to avoid double points. This
requires a trick, aptly named Whitney's trick, which we will not cover here.

13.2.1 Additional refinements


You might wonder how good we can get. For large dimensions, 2n seems like
a lot: think of 3-manifolds, for which it gives six dimensions.
This is one of the weirdest results from math:

Proposition 13.2.1

You can immerse every compact second-countable manifold of dimension n in
R2n−b(n) , where b(n) is the number of ones in the binary expansion of n.
This is also the best you can get.

If you are wondering why manifolds somehow count the ones in the
binary expansion of n (i.e. 8 in binary is 1000, so b(8) = 1), then you are not alone;
this connection is quite confusing.

The optimal embedding dimensions are not known in general! They are known
for a few values of n, but not for all.

13.3 Examples
We can give a few examples. For n = 2, we know we can embed every surface
in R2n = R4 . You can even classify all the surfaces that embed in R3 . Since smooth
manifolds don't have a rigid structure (no metric or similar), these categories are
really the same as the ones in topology.

Figure 13.1: The classes of all surfaces (up to diffeomorphism)
that you can embed in R3 .

The case n = 3 is quite interesting. It is known that any 3-manifold can
be embedded in R5 . This gives two classes: those that can be embedded in
R4 , and those that cannot, which need R5 . They actually show quite different
characteristics. A final quite interesting result: if you take any 3-manifold
and allow the removal of an open ball, then the resulting object can be
immersed in R3 , which feels truly weird.

13.4 Proof of Whitney’s Embedding Theorem


We want to prove Whitney's embedding theorem for the case of a compact
manifold. The general case is not harder, it just requires more notation.

Proof. Let M be compact. Then we can cover M with a finite number of


coordinate charts, as you can see in the figure.

Figure 13.2: We can cover the compact manifold with k coordinate
patches (charts), since it is compact. Each of these goes to Rn .

Let's say you have these charts and there are k of them. We can construct
a map into Rn·k = Rn × Rn × · · · × Rn (k times) by taking these charts and
combining them with cutoff functions; that is the basic idea of the proof.
Let's fill out the details.
• First, cover M by open sets (Up , Vp ), where p ∈ M and Up << Vp .
• Then pick a diffeomorphism φp of these, which takes (Up , Vp ) → (B1 , B2 ) ⊂
Rn .
• Since M is compact, there exists a finite subcover of (Up )p∈M . We can
relabel these as Ui , Vi , φi , where i ∈ {1, 2, . . . , k}. This is where we are
using compactness.
• For each i, we can use φi to define a cutoff function (to get the embedding
later on). We take any cutoff function for (B1 , B2 ) and transfer
it through the diffeomorphism φi onto (Ui , Vi ). We call this function (on
M ) ξi . It is smooth, since φi is a diffeomorphism and the cutoff for
(B1 , B2 ) we chose is smooth as well.
• In addition, we can find a bump function for Ui , which is nonzero exactly
on Ui and zero otherwise, in a similar fashion. Let's call it hi .
• We can now construct an embedding f from M → R(n+1)k = Rn+1 ×
Rn+1 × · · · × Rn+1 (k times). We write φi = (x1i , x2i , . . . , xni ) : Vi → B2 ⊂
Rn . Then we can define the functions:

    gi := { ξi φi   on Vi
          { 0       on M \ supp(ξi )        (13.1)

which is basically just a cutoff version of the chart, extended
(by zero) to all of M . gi is also still smooth (because ξi is a cutoff for
Ui in Vi ).
• Note that since the cutoff ξi = 1 on Ui , gi is still a coordinate system on
Ui .
• Note also that hi , the bump function, is nonzero exactly on Ui .
• Now we can define the embedding. Let

    f = (g1 , h1 , g2 , h2 , . . . , gk , hk ) :
        M → Rn × R1 × Rn × R1 × · · · × Rn × R1 = R(n+1)k        (13.2)

• Claim: f is an injective immersion; hence, since M is compact, it is an
embedding.
We now need to show two things. Firstly, that f is injective. Secondly, that dfp
is injective at every point (immersivity).
Let's start with the injectivity. Let p, q ∈ M and assume that f (p) = f (q).
Now, we know that p is in one of the coordinate patches, let's say in Ui . Then
hi (p) > 0, and since f (p) = f (q), also hi (q) > 0. As hi is nonzero exactly on Ui ,
both p and q lie in the same patch, p, q ∈ Ui . Now,
since ξi = 1 on Ui and f (p) = f (q), we know that φi (p) = φi (q), so both
points have the same coordinates in the chart. But on the coordinate patch,
φi : Ui → φi (Ui ) ⊂ Rn is a diffeomorphism (because it is a chart), and therefore
injective. So p has to equal q.
Now we need to show immersivity. The idea is quite similar. Fix p ∈ M ; we
need to show that dfp is injective. We know that φi is an immersion because
it is a chart. Find p in the chart covering, i.e. fix i such that p ∈ Ui .
Now consider that f = (ξ1 φ1 , h1 , . . . , ξk φk , hk ), and in particular take the
component ξi φi . The claim is that the differential of this component is
injective, which forces dfp to be injective: it suffices that dfp separates tangent
vectors in some block of coordinates, and it does so in the coordinates belonging
to ξi φi if d(ξi φi ) is injective. Let's show this. First let's use the
product rule on d(ξi φi ):

    d(ξi φi )|p = ξi (p) dφi |p + dξi |p φi (p)        (13.4)
                = 1 · dφi |p + 0 · φi (p)             (13.5)
                = dφi |p                              (13.6)

The second line follows from the fact that ξi is a cutoff function for Ui in Vi , so
it is constant 1 on Ui ; since p ∈ Ui , we get ξi (p) = 1 and dξi |p = 0. What we have
shown is that d(ξi φi )|p = dφi |p . But we know that φi : Ui → φi (Ui ) = B1 ⊂ Rn
is a diffeomorphism, so:

    dφi |p : Tp Ui (= Tp M ) → Tφi (p) Rn (= Rn )        (13.7)



is a bijection. So dfp is injective. Since this holds for all p, f is an immersion.

So we have shown that f is an injective immersion of a compact manifold,
and thus an embedding of M in R(n+1)k .
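The whole construction can be carried out concretely for the circle S1 (a compact 1-manifold) with k = 2 angle charts. The chart sizes, the exp(−1/s)-based cutoffs and the grid-based injectivity check below are all our own choices, so this is a sketch of the idea rather than the proof itself; the map lands in R(n+1)k = R4.

```python
import numpy as np

def h(s):
    # exp(-1/s) for s > 0, else 0: the standard smooth bump building block
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    out[s > 0] = np.exp(-1.0 / s[s > 0])
    return out

def cutoff(u, a, b):
    # smooth on R, equals 1 for |u| <= a and 0 for |u| >= b (our choice of formula)
    u = np.abs(u)
    return h(b - u) / (h(b - u) + h(u - a))

def wrap(theta):
    # wrap an angle into [-pi, pi): chart coordinate on the circle
    return (theta + np.pi) % (2 * np.pi) - np.pi

def embed(theta):
    """Whitney-style map S^1 -> R^4 from 2 charts, cutoffs xi_i and bumps h_i."""
    x1 = wrap(theta)                 # chart 1: centered at angle 0
    x2 = wrap(theta - np.pi)         # chart 2: centered at angle pi
    xi1, xi2 = cutoff(x1, 1.6, 2.0), cutoff(x2, 1.6, 2.0)
    h1 = h(1.6 - np.abs(x1))         # bump, > 0 exactly on U_1 = {|x1| < 1.6}
    h2 = h(1.6 - np.abs(x2))         # bump, > 0 exactly on U_2 = {|x2| < 1.6}
    return np.stack([xi1 * x1, h1, xi2 * x2, h2], axis=-1)  # (g1, h1, g2, h2)

thetas = np.linspace(0, 2 * np.pi, 400, endpoint=False)
pts = embed(thetas)
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)
assert d.min() > 1e-3   # distinct sample angles map to well-separated points
```

The sample check mirrors the injectivity argument of the proof: if two angles share a patch, the chart component g_i separates them; if not, the bump component h_i does.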
Chapter 14

Submersions: Local
Submersion Theorem

Now we come back to the second question we wanted to answer about submanifolds.
Let f : M → N be a smooth function. If L ⊂ N is a submanifold of N , is f −1 (L)
a submanifold of M ? The answer will, as the title hints, have something to
do with submersions. There is a theorem very similar to the local immersion
theorem, called the local submersion theorem.

14.1 Submersions
We want to recall the definition of a submersion, and give a few examples to make
it a bit clearer as to what we are working with.

Definition 14.1.1: Submersion


A smooth map f : M m → N n is called a submersion if dfp is surjective
for all p ∈ M .

Example 14.1.1: Some Submersions

• A simple example is the following projection. Take as your manifold
M × N . Then the (obvious) projections πM : M × N → M and
πN : M × N → N , which just project onto the corresponding points
in M and N respectively, are submersions. (Every tangent vector on
M gets hit by dπM ; similarly for N .)
• We can create a submersion from the Mobius strip to S 1 by the procedure
in the figure below. You simply project the Mobius strip down
onto the circle. The map dπp is surjective since you can get any vector
in Tp S 1 by moving horizontally on the Mobius strip.


• The canonical projection π : T M → M is also a submersion by a
similar thought.

Figure 14.1: The construction of the submersion of the Mobius strip
onto S 1 from this example. We simply project down. The differential
is surjective since moving horizontally on the strip (i.e. horizontal
tangent vectors) gets you all the tangent vectors in Tp S 1 .

14.2 Back to the question


When is the preimage of a submanifold a submanifold? Another way to pose this
question: when is the zero set of a system of equations a smooth submanifold?
The answer will be similar to the first question, except a lot simpler:
the function needs to be a (smooth) submersion. Nothing else. No
properness or anything.
We will proceed similarly to immersions, and first will define the counterpart to
the canonical linear immersion, which is aptly named the canonical linear submer-
sion.

Definition 14.2.1: The canonical linear submersion


Take Rm , Rn , but this time with m > n. The canonical linear submersion
π = πm,n : Rm → Rn is the map:

    π(x1 , . . . , xn , xn+1 , . . . , xm ) = (x1 , . . . , xn )        (14.1)

Figure 14.2: The canonical linear submersion.

As before, we only name this function because it will show up in the theorem
that follows. Before we state it, however, we want to give an example
illustrating it (being a submersion is enough).

Example 14.2.1

Take the curve γ(t) = (t2 , t3 ). We have seen it often before. It is clearly
not smooth at t = 0. We already described this behaviour by examining γ
as a curve, as one of the first counterexamples in chapter 1. We now give a
different view of it, seeing it as a broken submersion. We can write the image
of γ as the graph of y = ±x3/2 , which is the zero set of the function
f (x, y) = x3 − y 2 . We know f −1 (0) is not a submanifold, since by construction
it is just the image of γ, which is not smooth at the origin. And indeed, f is
not a submersion: at (0, 0) we get df |(0,0) = (0, 0), which cannot be
surjective onto T0 R = R.

Figure 14.3: The curve γ.
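The failure of submersivity at the cusp can be checked numerically; the finite-difference gradient is our own helper.

```python
import numpy as np

def f(x, y):
    # the zero set of f is the cusp curve (t^2, t^3), i.e. y^2 = x^3
    return x ** 3 - y ** 2

def grad_f(x, y, eps=1e-6):
    # central finite-difference gradient of f at (x, y)
    return np.array([
        (f(x + eps, y) - f(x - eps, y)) / (2 * eps),
        (f(x, y + eps) - f(x, y - eps)) / (2 * eps),
    ])

# df is surjective (the gradient is nonzero) away from the origin...
assert np.linalg.norm(grad_f(1.0, 1.0)) > 0.1
# ...but it vanishes at (0, 0): f is not a submersion there, and indeed
# f^{-1}(0) fails to be a submanifold exactly at the cusp point
assert np.linalg.norm(grad_f(0.0, 0.0)) < 1e-9
```

This matches the exact computation df = (3x², −2y), which is (0, 0) only at the origin.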

14.3 The local submersion theorem


We now go on to the local submersion theorem.

Theorem 14.3.1: Local submersion Theorem


Let f : M → N be smooth and suppose that at a certain
p ∈ M , dfp : Tp M → Tf (p) N is surjective (which forces m ≥ n). Then
there exist coordinates of M near p and of N near f (p) in which f looks like
the canonical linear submersion.

We will not do the proof, as it is extremely similar to that of the local immersion
theorem. The basic idea is, again, to use the inverse function theorem, this time
by adding m − n variables to N and finding the good coordinates from this.

14.3.1 Consequences of the Local Submersion Theorem


Similarly to the local immersion theorem, the local submersion theorem has a lot of
consequences, some of which we will write out here.

Corollary 14.3.1

If dfp is surjective, then dfq is surjective for q near enough to p. Alternatively:
the set U = {p ∈ M | dfp surjective} is open.

The proof of this is immediate from the local submersion theorem, since on a
small open set around p, f looks like the canonical linear submersion, whose
differential is surjective everywhere.

Corollary 14.3.2

Fix p ∈ M and set q = f (p). If dfp is surjective, then there exists a small
open set U ⊂ M such that f −1 (q) ∩ U is a submanifold of M of dimension
m − n.

Figure 14.4: You can see the idea of this corollary. If f is a
submersion at p, then you can find a small U where, at least in that
piece, f −1 (q) is a submanifold.

Proof: Exercise. What about a more global version? That also exists.

Theorem 14.3.2: Submersions and submanifolds

Let f : M → N be a submersion. Then for all q ∈ N , f −1 (q) is a
submanifold of M of dimension m − n. It is even a proper submanifold of M .

Corollary 14.3.3: Implicit function theorem / ”graphical preimage theorem”

Let f : Rm → Rn be a smooth function with m > n, f (p) = 0, and suppose the
restriction df (p)|Rn : Rn → Rn (to the last n coordinates, say) is
nonsingular, i.e. det(df (p)|Rn ) ≠ 0. Then f −1 (0) can locally be written as
a graph.

Figure 14.5: The idea of this corollary.

This corollary easily follows from the local submersion theorem. It is
especially important because it lets us define (sub)manifolds through
equations. We have now, finally, learned a very concrete way of defining
manifolds: we can do so through equations. Note how far we have gotten. We had
the very abstract, non-concrete way of defining a manifold by an atlas, which
is impractical but conceptually enlightening. We now have the tools to do it
through equations and functions, tools that are a world away from the atlas.
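As an illustration of defining a manifold by an equation, consider the sphere S2 = f −1 (1) for f (x, y, z) = x2 + y2 + z2 . A small sympy sketch (our own, not from the script) checks that 1 is a regular value, so the preimage theorem applies:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
f = x**2 + y**2 + z**2

# The differential df = (2x, 2y, 2z).
df = [sp.diff(f, v) for v in (x, y, z)]

# df vanishes only at the origin ...
crit = sp.solve(df, [x, y, z], dict=True)
print(crit)                        # [{x: 0, y: 0, z: 0}]

# ... and the origin does not lie in f^{-1}(1), so 1 is a regular value:
# S^2 = f^{-1}(1) is a submanifold of R^3 of dimension 3 - 1 = 2.
print(f.subs({x: 0, y: 0, z: 0}))  # 0, not 1
```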

14.4 Critical Points and Values

We want to end this part of the lecture by talking about critical points and
values, a favourite and much-used tool of analysis. We can use much of what we
have learned in this part here.

Definition 14.4.1: Critical points and values

• Let p ∈ M be a point and f : M → N a smooth map, as usual. We call p
a critical point if dfp is not surjective, and we call it a regular
point otherwise.

• q ∈ N is a critical value if there exists a critical point p ∈ M such
that f (p) = q.

• In other words, the set of critical values is the image under f of the
set of critical points.

• q ∈ N is called a regular value if for every p ∈ f −1 (q), dfp is
surjective, i.e. every such p is regular. (A regular value is one none
of whose preimage points is critical.)

It is easy to see (from the local submersion theorem) that:

Corollary 14.4.1

The set of all regular points is open, while the set of all critical points is
closed.

Example 14.4.1

Let f : R2 → R be the function f (x, y) = x2 − y 2 . We can see that:

df |(x,y) = (2x, −2y) (14.2)

so df |(0,0) = (0, 0). So the origin is a critical point. It is also, clearly, the
only critical point. The regular points are all of R2 \ {0}.

Figure 14.6: The function from this example (specifically, its contour
lines). As you can see, only the level set through the origin fails to
be a manifold near that point, a result of the function’s differential
not being surjective there.
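We can let a computer redo this example. The sketch below (sympy; illustrative only) finds the critical points and critical values of f (x, y) = x2 − y2 :

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 - y**2

# Critical points: where df = (2x, -2y) vanishes.
df = [sp.diff(f, x), sp.diff(f, y)]
crit_pts = sp.solve(df, [x, y], dict=True)
print(crit_pts)                    # [{x: 0, y: 0}]

# The only critical value is f(0, 0) = 0.
crit_vals = {f.subs(p) for p in crit_pts}
print(crit_vals)                   # {0}
```

Every level set f −1 (c) with c ≠ 0 is therefore a regular level set (a hyperbola), while f −1 (0) is the pair of crossing lines seen in the figure.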

There is a theorem, which is more related to analysis, but which also works well
in differential geometry. It tells us that ”almost all” values are regular values. For
this, we need to define ”almost all”.

Definition 14.4.2: Almost All/Every and Almost None

A set X ⊂ N has ”measure zero” if it has measure zero in every chart of some
atlas. The measure, for those who didn’t take measure theory, is for our
purposes just the n-dimensional volume. A property holds for almost every
point if the set of points without the property has measure zero, and for
almost no points if the set of points with the property has measure zero.

Figure 14.7: The idea of this definition.

This is quite weird if you think about it. We have not introduced volume yet,
since we don’t have a metric yet, but we can still define something as having
”zero volume”. You might wonder whether this makes sense and is independent of
charts. It is! As a remark, we say a set X has ”full measure” (i.e. the
volume of the manifold) if the complement M \ X has measure zero. With this,
we can state the theorem.

Theorem 14.4.1: Sard’s Theorem

Let f : M → N be smooth, M second countable and N at least para-
compact. Then almost every q ∈ N is a regular value.

The proof is rather technical and we won’t do it here. The basic idea is that
f squishes the volume down a lot near every critical point, and you can sum
over this volume and take a limit to show that it is zero. This also means
that almost every f −1 (q) is a smooth (sub)manifold! This is quite a result.
Almost every. Well, if only life were this easy. Most often, you find that
exactly at the q you are interested in, it’s not.
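A one-dimensional illustration of Sard’s theorem (our own example, computed with sympy): for f (x) = x3 − 3x, the set of critical values is finite, hence of measure zero, so almost every value is regular:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**3 - 3*x

# Critical points: f'(x) = 3x^2 - 3 = 0.
crit_pts = sp.solve(sp.diff(f, x), x)
print(crit_pts)                              # [-1, 1]

# The critical values form a finite set, hence a set of measure zero;
# every other value of f is a regular value.
crit_vals = sorted(f.subs(x, p) for p in crit_pts)
print(crit_vals)                             # [-2, 2]
```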
Part IV

Vectorfields, Lie Bracket, Flows and Lie Derivative


Now that we have essentially built the tools to construct a large supply of
manifolds, we can go on to talk about what you can do with them. In
particular, we are going to return to vector fields. The goal of this part is
to develop a derivative of one vector field in the direction of another. It
will turn out that this is nontrivial: on a smooth manifold, a derivative like
this does not exist without further structure.
Chapter 15

Lie Brackets and Vectorfields

We start with a bit of a discussion of vector fields: a reminder of how they
work and a few things that will be useful throughout this part. We then go on
to discuss the difficulty of defining a derivative of one vector field in the
direction of another. We then turn aside from this and talk about the Lie
Bracket, which, as we will see, will play an important role in the development
of a kind of derivative, called the Lie Derivative, at the end of this part.

15.1 Vectorfields

We have talked about vectors, vectorfields and T M extensively, so we will only


remind you of a few of the most important concepts and add a few new ones.

Firstly, we want to get a bit of notation.


Definition 15.1.1: Vectorfields, Γ(T M )

We denote the set of all vector fields on a manifold M by Γ(T M ). This is
not the same as C ∞ (T M ), because Γ(T M ) contains truly all vector fields,
even discontinuous ones. A vector field, in this sense, is simply a map:

X : M → T M, X(p) ∈ Tp M (15.1)

In coordinates, we can write this as:

X(p) = Σ_{i=1}^n X^i(p) (∂/∂x^i)|_p (15.2)
X(p) = Σ_{i=1}^n X^i(x^1, . . . , x^n) (∂/∂x^i)|_p (15.3)

where the functions X^i are the components of X in that specific chart.

We can also define k-th differentiable vector fields. This is akin to our definition
of smooth vector fields in the second part of the lecture.

Definition 15.1.2: C k -Vectorfields

We define the set C k (T M ) to be the set of all C k -vectorfields (k-times
continuously differentiable vectorfields) on M ; C 0 (T M ) is the set of all
continuous vectorfields on M . A C k -vectorfield is one whose components X i
are C k in every chart.

This is well-defined because the overlap maps are all smooth, per the
definition of a smooth manifold, so if the X i are C k in one chart, then
they are C k in every other chart.

Definition 15.1.3: Applying a vectorfield to a function

We can apply a vectorfield X ∈ C ∞ (T M ) to a function u ∈ C ∞ (M ) to
obtain another smooth function (X · u) ∈ C ∞ (M ) by the following:

(X · u)|p = X(p) · u (15.4)

where X(p) · u is the action of the tangent vector X(p) as a differential
operator on the function u.

We would need to show that this new function (X · u) is actually in C ∞ (M ),
but the proof is immediate: just go to a chart. There the function is:

(X · u)(p) = Σ_{i=1}^n X^i(x^1, . . . , x^n) ∂u/∂x^i (15.5)

all of which is smooth.
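As a concrete instance of this chart formula (a sketch with sympy; the Euler field below is our own choice of example), applying X = x ∂/∂x + y ∂/∂y to u = x2 y gives 3x2 y, as expected for a homogeneous function of degree 3:

```python
import sympy as sp

x, y = sp.symbols('x y')
u = x**2 * y

# The Euler vector field X = x d/dx + y d/dy, given by its chart components.
components = {x: x, y: y}

# X * u = sum_i X^i du/dx^i, as in the chart formula above.
Xu = sum(comp * sp.diff(u, var) for var, comp in components.items())
print(sp.expand(Xu))   # 3*x**2*y
```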

15.2 Why DX Y is not so trivial.


Let us see what goes wrong if we want to define a derivative DX Y of a vector
field Y in the direction X. The idea, of course, somehow needs to be that you
look at how Y changes in the direction of X. We know how to differentiate the
components of Y in Rn . In Rn , we can do the following:

D_X^{R^n} Y(x) = (D_X Y^1, . . . , D_X Y^n) (15.6)
= dY(x)(X) = Σ_{i,j=1}^n X^i(x) (∂Y^j/∂x^i) ∂/∂x^j (15.7)

This is just the normal derivative we know and love from calculus in Rn . The
resulting object, DX Y , is another smooth vector field, and the map:

D· · : C ∞ (T Rn ) × C ∞ (T Rn ) → C ∞ (T Rn ) (15.8)

is bilinear. How can we try to get something like this on manifolds? The
simplest idea is just to use what we already know: we know how to calculate
D_X^{R^n} Y . Why not, you might ask, just calculate DX Y in a chart, take
the components you get, and use these to define the vector DX Y ? The problem
is that you don’t get something intrinsic to the manifold, i.e. the DX Y you
calculate differs from chart to chart! (That is, the abstract vector
DX Y ∈ Tp M is a different one for different charts; the components don’t
transform like a vector should.) (Exercise: verify this.)
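The chart dependence is easy to see already on M = R with the two charts x and y = x + x3 (a diffeomorphism, since y′ = 1 + 3x2 > 0). The sketch below (sympy; a one-dimensional version of the exercise, with our own names) computes the naive chart-wise derivative of Y = ∂/∂x along X = ∂/∂x in both charts and transports the second answer back to x-components; the results disagree:

```python
import sympy as sp

x = sp.symbols('x', real=True)
phi = x + x**3          # overlap map y = phi(x); phi' = 1 + 3x^2 > 0
dphi = sp.diff(phi, x)

# The field X = Y = d/dx has component 1 in the x-chart
# and component phi'(x) in the y-chart.
Xx = sp.Integer(1)
Xy = dphi

# Naive chart-wise derivative: X-component times derivative of the component.
naive_x = Xx * sp.diff(Xx, x)            # in the x-chart: 0
naive_y = Xy * (sp.diff(Xy, x) / dphi)   # in the y-chart (d/dy = (1/phi') d/dx)

# Transport the y-chart result back to x-components (divide by phi'):
naive_y_in_x = sp.simplify(naive_y / dphi)
print(naive_x, naive_y_in_x)   # 0 vs 6*x/(3*x**2 + 1): the charts disagree
```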

15.2.1 What we can do


There are two solutions to this problem, i.e. two kinds of derivatives you can
define, which generalize different underlying ideas. These are the Lie
Derivative LX Y and the covariant derivative DX Y . The Lie Derivative is
strange: unlike what you would expect of a derivative, it depends not just on
X(p), but on the first derivatives of X too, and similarly for Y . So it is a
bit unintuitive at first. It is not really the ”right” generalization, but it
is still extremely useful. In contrast, DX Y , the covariant derivative, is
the ”right” generalization and depends only on X(p) and the partials of Y .
It is very similar to the derivative we have constructed from charts, but it
has a second term that makes it intrinsic. It, however, needs additional
structure on the manifold, something called a connection. This makes DX Y
something intrinsic to M , but defined by that further structure. For the
curious amongst you: a metric is enough to specify a DX Y , because you can
get a connection from it.
There is, however, a deep connection between LX Y and DX Y .
There are a couple of ways to define LX Y . One uses Lie Brackets, and one
uses flows. We start with Lie Brackets.

15.3 Lie Bracket


We start out with one way to define the Lie Derivative, through the Lie
Bracket [X, Y ]. It turns out that LX Y = [X, Y ], so we are not really taking
a detour here. We will only show this two chapters from now, though.
Let us remember that if X, Y ∈ C ∞ (T M ), then X · u and Y · u are functions
(for u ∈ C ∞ (M )). We write X · (Y · u) as XY u. The operator XY is clearly
a second-order differential operator.

Proposition 15.3.1

Let X, Y ∈ C ∞ (T M ). Then there exists a unique vectorfield Z ∈
C ∞ (T M ) such that:

Z · u = [XY − Y X] · u (15.9)

for all u ∈ C ∞ (M ). We write this vectorfield as Z = [X, Y ] and call it
the Lie Bracket.

This seems strange. Firstly, if anything, Z should be a second-order operator,
since XY and Y X are second-order operators. So somehow, the second
derivatives have to cancel. The second weird thing is that, unless Z = 0, we
have lost Schwarz’s theorem: partials are not interchangeable!

Proof. Let us do this in charts. Let X = Σ_i X^i ∂/∂x^i and
Y = Σ_j Y^j ∂/∂x^j. Then we can calculate XY · u:

XY u = Σ_i X^i [∂/∂x^i (Σ_j Y^j ∂u/∂x^j)] (15.10)
= Σ_{i,j} X^i [Y^j ∂²u/∂x^i∂x^j + (∂Y^j/∂x^i)(∂u/∂x^j)] (15.11)

So:

[XY − Y X] · u (15.12)
= Σ_{i,j} X^i Y^j ∂²u/∂x^i∂x^j + X^i (∂Y^j/∂x^i)(∂u/∂x^j) − Y^j (∂X^i/∂x^j)(∂u/∂x^i) − Σ_{i,j} X^i Y^j ∂²u/∂x^i∂x^j (15.13)
= Σ_{i,j} X^i (∂Y^j/∂x^i)(∂u/∂x^j) − Y^j (∂X^i/∂x^j)(∂u/∂x^i) (15.14)

We can extract the components by relabelling the indices:

[XY − Y X] · u = Σ_{i,j} X^i (∂Y^j/∂x^i)(∂u/∂x^j) − Y^j (∂X^i/∂x^j)(∂u/∂x^i) (15.15)
= Σ_j ( Σ_i X^i ∂Y^j/∂x^i − Y^i ∂X^j/∂x^i ) ∂u/∂x^j (15.16)

We can take the expressions in parentheses as the components of Z. This shows
a few things. Firstly, it shows that Z is smooth, because of its coordinate
representation. Since the defining property Z · u = [XY − Y X] · u determines
Z, it is also well defined, independent of the chart, and unique.
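The cancellation of the second derivatives can be watched happening in an example. The sketch below (sympy; the fields X = y ∂/∂x and Y = x ∂/∂y are our own choice) computes the components of [X, Y] from formula (15.16) and checks the defining property Z · u = XY u − Y X u on a test function:

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]
X = [y, sp.Integer(0)]   # X = y d/dx
Y = [sp.Integer(0), x]   # Y = x d/dy

def apply_field(V, w):
    """V * w in a chart: sum_i V^i dw/dx^i."""
    return sum(Vi * sp.diff(w, ci) for Vi, ci in zip(V, coords))

# Components of [X, Y]: [X, Y]^j = X(Y^j) - Y(X^j), as in (15.16).
bracket = [apply_field(X, Yj) - apply_field(Y, Xj) for Xj, Yj in zip(X, Y)]
print(bracket)           # [-x, y], i.e. [X, Y] = -x d/dx + y d/dy

# The second derivatives of a test function really cancel:
u = sp.sin(x * y**2)
lhs = apply_field(bracket, u)
rhs = apply_field(X, apply_field(Y, u)) - apply_field(Y, apply_field(X, u))
print(sp.simplify(lhs - rhs))   # 0
```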

Proposition 15.3.2: Lie Bracket properties

Let X, Y, Z ∈ C ∞ (T M ) be vectorfields. Then the following are true.

• The map [·, ·] : C ∞ (T M ) × C ∞ (T M ) → C ∞ (T M ), (X, Y ) ↦ [X, Y ],
is bilinear.
• The Lie Bracket is anticommutative: [X, Y ] = −[Y, X].
• The Jacobi Identity holds, that is, the sum of cyclic permutations of
X, Y, Z in the brackets is zero:

[[X, Y ], Z] + [[Y, Z], X] + [[Z, X], Y ] = 0 (15.18)

The proof of the first two is immediate; the Jacobi identity is quite
straightforward, you just spell everything out and see that everything
cancels. It is long, though. Before we do it, we will talk about what the
Jacobi Identity means. The Jacobi Identity, really, is a way to say that the
bracket operation is not associative. To see this, note that if [·, ·] were
associative, then:
[X, [Y, Z]] − [[X, Y ], Z] = 0 (15.19)
since the placement of the brackets wouldn’t matter. Now write
[X, [Y, Z]] = −[[Y, Z], X]. Then the Jacobi identity tells us that:
[X, [Y, Z]] − [[X, Y ], Z] = −[[Y, Z], X] − [[X, Y ], Z] = [[Z, X], Y ] (15.20)
which is not zero in general! So the bracket is not associative.
Proof. The proof is not too hard, it just requires writing everything out. We
know:
[[X, Y ], Z] = [(XY − Y X), Z] = (XY − Y X)Z − Z(XY − Y X)
= XY Z − Y XZ − ZXY + ZY X (15.21)
Now we can permute X → Y , Y → Z, Z → X to get the other two terms.
[[Y, Z], X] = Y ZX − ZY X − XY Z + XZY (15.22)

and again:
[[Z, X], Y ] = ZXY − XZY − Y ZX + Y XZ (15.23)

If we now add all of them, we get:

XY Z−Y XZ−ZXY +ZY X+Y ZX−ZY X−XY Z+XZY +ZXY −XZY −Y ZX+Y XZ = 0
(15.24)
as you should verify.

We know that [X, Y ] is bilinear; in particular, for a, b ∈ R, [aX, bY ] =
ab[X, Y ]. You might wonder whether scaling the vectors pointwise, i.e.
multiplying X and Y by functions u, v ∈ C ∞ (M ), has the same effect. Is it
true that [uX, vY ] = uv[X, Y ]? No, it is not. It is almost true, in the
sense that there are just two extra ”error” terms.

Proposition 15.3.3: Bilinearity of [X, Y ] fails over functions

Let X, Y be vectorfields and u, v functions. Then:

[uX, vY ] = uv[X, Y ] + u(X · v)Y − v(Y · u)X (15.25)

So you can pull out the functions, but you pick up error terms.
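Proposition 15.3.3 can be verified symbolically for concrete choices. In the sketch below (sympy; the particular fields and functions are arbitrary choices of ours), both sides of (15.25) are computed and compared componentwise:

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

def apply_field(V, w):
    """V * w in a chart: sum_i V^i dw/dx^i."""
    return sum(Vi * sp.diff(w, ci) for Vi, ci in zip(V, coords))

def bracket(V, W):
    """Components of [V, W]: V(W^j) - W(V^j)."""
    return [apply_field(V, Wj) - apply_field(W, Vj) for Vj, Wj in zip(V, W)]

X = [y, x]                  # X = y d/dx + x d/dy
Y = [x**2, sp.Integer(1)]   # Y = x^2 d/dx + d/dy
u, v = sp.sin(x), x * y     # the scaling functions

# Left side: [uX, vY] computed directly.
lhs = bracket([u * Xi for Xi in X], [v * Yi for Yi in Y])

# Right side: uv[X, Y] + u(X*v)Y - v(Y*u)X, componentwise.
rhs = [u * v * Bi + u * apply_field(X, v) * Yi - v * apply_field(Y, u) * Xi
       for Bi, Xi, Yi in zip(bracket(X, Y), X, Y)]

print([sp.simplify(l - r) for l, r in zip(lhs, rhs)])   # [0, 0]
```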

15.4 Local and Differential Operators


We now come to a very important idea in differential geometry. It is the interplay
between local and differential operators. In some sense, this is very intuitive. Of
course, every differential operator is something local. But the deep idea in differen-
tial geometry is that (conditions apply) every local operator is also some differential
operator.
But before we continue, what is a local operator?

Definition 15.4.1: Local Operator

Let L : C ∞ (M ) → C ∞ (M ) be an operator. (Not necessarily linear or


anything). L is a local operator if for any open set U ⊂ M , if u|U is known,
then L(u)|U is known.

Alternatively, you can define it so that if u1 |U = u2 |U , then
L(u1 )|U = L(u2 )|U . You can repeat this definition not just for functions
but for everything, from vectorfields to operators, etc.
Both (X, u) → X · u and (X, Y ) → [X, Y ] are local operators (exercise:
prove this).

Definition 15.4.2: Differential Operator

A differential operator is a map E : C ∞ (M ) → C ∞ (M ) for which there
exist a k > 0 and a smooth function F such that in local coordinates:

E(u)(p) = F(u, ∂u/∂x^i, . . . , ∂^k u/∂x^{i_1} · · · ∂x^{i_k}) (15.26)

i.e. the operator is just a function of u and its first k partials.

For example, E(u) = ∂u/∂x + sin(∂²u/∂x²) is a differential operator.
Directional derivatives u → X · u are of course differential operators. It is
also true that every differential operator is a local operator, simply because
partials are local.

Theorem 15.4.1: Deep Theorem

On a compact manifold, every local operator is a differential operator.

15.5 Lie Algebras


The algebraic structure that Lie Brackets create has a special name: a Lie
Algebra. It can be generalized in the following way.

Definition 15.5.1: Lie Algebra

A vectorspace V over a field K, equipped with a bilinear operation
[·, ·] : V × V → V which, for all X, Y, Z ∈ V , satisfies:
• Anticommutativity: [X, Y ] = −[Y, X],
• the Jacobi Identity,
is called a Lie Algebra.

A Lie algebra is in general nonassociative. The main examples (in differential
geometry) are the vectorfields on a manifold with the Lie Bracket, and, for a
vectorspace V , the space Hom(V, V ) with the bracket [A, B] = AB − BA. Lie
Algebras are the way forward to the study of Lie Groups.
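The Hom(V, V) example is easy to test numerically. The sketch below (numpy; random 4 × 4 matrices of our own choosing) checks anticommutativity and the Jacobi identity for the commutator bracket:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((4, 4)) for _ in range(3))

def br(P, Q):
    """The commutator bracket on Hom(V, V)."""
    return P @ Q - Q @ P

# Anticommutativity: [A, B] = -[B, A].
print(np.allclose(br(A, B), -br(B, A)))    # True

# Jacobi identity: the cyclic sum vanishes.
jac = br(br(A, B), C) + br(br(B, C), A) + br(br(C, A), B)
print(np.allclose(jac, 0))                 # True
```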
Chapter 16

Flows

Will come soon.

Chapter 17

Lie Derivative

Will come soon.

