Differential Geometry I Script 15-11-24 NEW
Maciej Swiatek
15.01.2024
Contents

Preface

2 Surfaces
  2.1 Some definitions and basic quantities
  2.2 The curvature of surfaces in R3
  2.3 The Geometric Definition of Curvature on Surfaces
    2.3.1 A bit about Qp
    2.3.2 The independence of Qp(v) from the curve we choose
    2.3.3 Proof of the theorem
  2.4 The second fundamental form
    2.4.1 Simplifying the second fundamental form
    2.4.2 The mean and Gauss curvatures
  2.5 Symmetry and Curvature
II Manifolds

3 Topology and topological manifolds
  3.1 Humble beginnings
  3.2 A topological space
  3.3 Charts: Part I
  3.4 The Hausdorff Condition
  3.5 The ant
  3.6 Interlude: Useful topology for the course

8 Coverings

9 Orientations
  9.1 Orientations on Vectorspaces
    9.1.1 Why the determinant?
  9.2 Orientations and linear maps
  9.3 Orientations on Manifolds
  9.4 Orientation double cover
16 Flows
Currently, the script is not finished. In particular, the part about Flows and Lie Derivatives is missing (still in progress). Additionally, a few examples are missing, for example the Long line, the Helicoid, and a few examples of smooth manifolds, so if you are learning for the exam, don't forget that these exist.
Additionally, I cannot guarantee that the script is error-free; in fact, it has many errors. I tried to make sure that the big ones are corrected, but some will remain. As always, remember to verify everything yourself, and anything that differs from your notes could be wrong. The lecture takes precedence when it comes to the exam and grading.
Preface
Sections with an asterisk in front of them are not mandatory to read; you can skip them as you like ("You should know these ideas exist, but don't need to learn them"). Big thanks go out to Anastasia Sandamirskaya and Ji Zhexian for helping with writing the first lecture about surfaces.
Disclaimer: We generally stay true to the notation of the lecture. The only real exception (so far) is the symbols we use for charts: ψ, χ were used in the lecture for general charts, whereas we use Ch1 and Ch2 (literally: chart 1 and chart 2) to make the equations a bit more direct. Other than that, the symbol T1→2 for the transition/overlap map is also a product of this script; in the lecture it was always written out with the charts (χ ∘ ψ⁻¹). We did this to make a few equations a bit more readable.
Note: There are currently a few pages, particularly where there are many figures in the text, that have very wide gaps. This changes every time one adds any text, because LaTeX re-chooses where to put things like figures and definitions and then tries to fit the text around them. I would need to fix this manually at every such point and will do it after the script is done. For now, I hope you can get past this.
Part I
We start with some of the most intuitive examples of the type of manifolds we
will be working with, that is, with curves and surfaces embedded in some form of
Rn .
Chapter 1
Curves
In this chapter, we will deal with curves. We first define what we mean by a curve, and impose some restrictions on the kind of curves we want to deal with. We won't prove everything we claim in this chapter, as you should have already seen some of it in a Calculus class; this is only a quick overview.
Figure 1.1: An example of a smooth curve. The interval I is mapped onto a curve in R3 by the function γ.
coefficients are polynomials in t, all the derivatives exist and are continuous.)
But look at Figure 1.2. The image of the curve is obviously not smooth in R2 at t = 0, or equivalently at x = (0, 0). What is happening there? Well, it resembles the absolute value function a bit: that one also had a sort of sharp bend at a point. The problem back then was with the derivative. It simply did not exist, which gave the curve its weird behavior (the sharp bend). Similarly, here the problem is also with the derivative. It exists, obviously, since this is a smooth curve, but it becomes 0 at the problem point (t = 0, i.e. the origin). A curve with such a bend is not something we really want to work with, therefore we put another restriction on the curves we work with. We eliminate curves like the one from this example simply by saying we don't work with curves whose derivative becomes 0 anywhere.
As we saw in the previous example, we get into problem situations if the deriva-
tive of the curve with respect to the parametrization parameter is zero. We therefore
define regular curves as those for which this doesn’t happen, or in other words, where
the velocity never vanishes.
Definition 1.1.2 (regular curves). A smooth curve is called a regular curve if

dγ/dt ≠ 0 for all t ∈ I    (1.1)

where γ is the smooth curve and I is the interval it is defined on.
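To see the failure of regularity concretely, here is a tiny numerical illustration (the exact curve of Figure 1.2 is not reproduced in this script, so we assume the standard cusp example γ(t) = (t², t³) as a stand-in):

```python
def gamma(t):
    """A smooth but non-regular curve: both components are polynomials,
    so every derivative exists, yet the image has a cusp at the origin."""
    return (t * t, t ** 3)

def velocity(t):
    """The exact derivative of gamma: dgamma/dt = (2t, 3t^2)."""
    return (2 * t, 3 * t * t)

# Definition 1.1.2 fails exactly at t = 0, the location of the sharp bend:
print(velocity(0.0))  # (0.0, 0.0)
print(velocity(0.5))  # (1.0, 0.75), nonzero away from the cusp
```

So the map is perfectly smooth, but the vanishing velocity at t = 0 is what produces the sharp bend in the image.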
Note 1. We will use various notations for the derivative of a curve. These include:

dγ/dt = γt = γ̇    (1.2)
1.2 Arc-length
Now that we have said what we mean by a curve and restricted it so as to not run into problems like the one in the example above, we can start with the geometry. Undeniably, one of the most important quantities in geometry is length. If you know the lengths in a problem, you already know quite a bit of its geometry. So what is the length of a (piece of a) curve?
Well, we already restricted ourselves to working with regular curves, so our motivation will be more on the intuitive side.
Imagine you have any curve, like the one in Figure 1.3. The idea is that we divide the curve into very small, almost-straight parts, calculate the length of each part by approximating it as a straight line, and then sum all of those back together. We can do this for reasonable (i.e. regular) curves. Of course, in reality what we do is go infinitesimal, at which point this becomes an integral.
For the small piece seen in the figure, we have ∆s = |dγ/dt| ∆t, where by s we mean the length and by ∆s the very small length of that very small part. Afterwards we add all of these up, and in the continuum limit we get an integral:

s(t) = ∫_γ ds = ∫_{t0}^{t} |dγ/dt| dt    (1.3)
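As a quick sanity check of equation 1.3, the following numerical sketch (our own illustration; the function names are not from the lecture) approximates the integral with a midpoint Riemann sum and finite differences. For a circle of radius R = 2 traversed with unit angular speed, the arc over [0, π] should be half the circumference, πR:

```python
import math

def arc_length(gamma, t0, t1, n=20_000):
    """Approximate s(t1) = integral over [t0, t1] of |dgamma/dt| dt
    (equation 1.3) with a midpoint Riemann sum."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h            # midpoint of the i-th sub-interval
        eps = 1e-6                        # central difference for dgamma/dt
        x1, y1 = gamma(t - eps)
        x2, y2 = gamma(t + eps)
        total += math.hypot(x2 - x1, y2 - y1) / (2 * eps) * h
    return total

R = 2.0
circle = lambda t: (R * math.cos(t), R * math.sin(t))
print(arc_length(circle, 0.0, math.pi))   # ~ pi * R = 6.2832
```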
Figure 1.3: A curve and our intuitive way to understand the definition of the arc-length. We zoom in on a very small part of the curve, between t′ and t′ + ∆t. There, if ∆t is small enough, the line will be approximately straight and we can use the velocity vector to calculate the length of that piece approximately. Note that the velocity vector is drawn much smaller than it would actually be for any reasonable ∆t, just so that the picture is clearer.
Note that in the definition we did not assume that t > t0; a negative arc-length is possible, simply by going in the direction opposite to the parametrization of the curve.
We already mentioned that the geometrically interesting object is the image of γ, not the function γ (i.e. the parametrization) itself. We will care mostly about things we can define on the image of γ that do not depend on that parametrization. The arc-length is something independent of the parametrization¹.
¹ Of course, we can always choose to parameterize the curve in the other direction, which changes the arc-length by a sign. We can also choose a different reference point other than γ(t0). But these choices are rather trivial and we won't really mention them from now on.
1.3 Geometric Quantities
In line with this philosophy, we can define a very convenient, but also more
geometrically ”real” parametrization. The idea is that the arc-length is a geometric
object independent of parametrization, and that, for regular curves, we can use the
arc-length to parameterize the curve.
Proof. We will only sketch the proof, as it is rather simple and you very likely already saw it in a calculus class².
The first step is to take the arc-length and see it as a function of t:

s = f(t) = ∫_{t0}^{t} |dγ/dt| dt    (1.6)

We can then take the inverse of this function, call it g(s) = f⁻¹(s), and express t as a function of s. If we take β = γ ∘ g, i.e. β(s) = γ(g(s)), we have found the right parametrization. The only thing left to convince yourself of is that the velocity is really of unit length. (You can do this using the chain rule.)
Of course, the curve is still regular, and all the properties like smoothness are still obeyed by the curve. The images of β and γ are of course exactly the same, i.e. you can't change the curve simply by re-parameterizing. You might find Figure 1.4 helpful in visualizing this.
τ(t) ≟ dγ/dt    (1.7)
2 If not, try it yourself as an exercise.
Figure 1.4: Different parametrizations of the same curve. The curve is drawn in green; the ticks are the points on the curve with the parameter values written next to them. In (a) you see a typical, non-special parametrization (i.e. the "t"); in (b) you find the curve parameterized by the arc-length (s). It is intuitively clear why the parametrization is not something geometrically interesting: the real curve (green line) exists independently of the ticks. In (c) you find another parametrization by the arc-length, except with a different choice of reference point on the curve.
A bit of thought, however, reveals that this cannot be true. Why? Well, it is not independent of the parametrization. Imagine, for example, you were to go twice as fast along the curve. Then your velocity vector (= tangent vector in this example) would be twice as big at every point. But if we want the tangent vector to be something fundamentally independent of the parametrization, then equation 1.7 cannot be the correct definition of the tangent vector.
How can we fix this? Well, look at Figure 1.5. It shows the same curve, parameterized in three different ways, with the "fake" tangent vectors (from equation 1.7) drawn in. The thing that should jump out at you is that, while the length of the vectors does change, the direction does not³. The tangent vector as we defined it is not the geometrically real thing; rather, the unit tangent vector is, which is exactly how we choose to define it below.
Figure 1.5: The ”fake” tangent vectors from equation 1.7 for different
parametrizations of the same curve (green). In (a) you see a random
parametrization (black ticks) and its ”fake” tangent vector (blue) from equation
1.7. In (b) you have the same situation, only that this time you go twice as fast
along the curve. Notice that the vectors (the physically drawn arrows) change.
In (c) you have the same thing, but this time parameterized with arc-length.
τ(t) = (dγ/dt) / |dγ/dt|    (1.8)

If we parameterize by the arc-length, then the formula for the tangent vector becomes:

τ(s) = dγ/ds    (1.9)

since |dγ/ds| = 1.
avoid introducing another symbol, g, since it is just the function that expresses the parameter t in terms of s.
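Here is a small numerical illustration (ours, not the lecture's) of why equation 1.8 is the right definition: two parametrizations of the same circle, one twice as fast as the other, give different velocity vectors but the same unit tangent at the same point.

```python
import math

def tangent(gamma, t, eps=1e-6):
    """The unit tangent of equation 1.8: (dgamma/dt) / |dgamma/dt|."""
    x1, y1 = gamma(t - eps)
    x2, y2 = gamma(t + eps)
    vx, vy = (x2 - x1) / (2 * eps), (y2 - y1) / (2 * eps)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

slow = lambda t: (math.cos(t), math.sin(t))          # unit circle
fast = lambda t: (math.cos(2 * t), math.sin(2 * t))  # same image, twice as fast

# Both pass through the point (cos 1, sin 1); the unit tangents agree there,
# even though the velocity of `fast` is twice as long:
tau_slow = tangent(slow, 1.0)
tau_fast = tangent(fast, 0.5)
print(tau_slow)
print(tau_fast)
```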
Proof. The proof is strikingly simple. We know that the length of τ is set to one. Therefore ⟨τ, τ⟩ = 1 and (d/ds)⟨τ, τ⟩ = 0, since the length (and therefore the scalar product) doesn't change along the trajectory. We can use the product rule:

0 = (d/ds)⟨τ, τ⟩ = 2⟨dτ/ds, τ⟩ = 2⟨κ, τ⟩    (1.11)

Therefore the scalar product of the two vectors is 0, i.e. they are orthogonal, as claimed.
Figure 1.6: A curve with τ and κ drawn in. Notice that κ is orthogonal to τ .
This is something physics students are very familiar with. The situation is very
analogous to the trajectory of a particle. The speed of the particle doesn’t change,
so the only direction the acceleration (= curvature vector) can have is perpendicular
to the curve.
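The orthogonality of κ and τ is easy to check numerically. The sketch below (our own; a helix is assumed purely as a convenient test curve) parameterizes a helix by arc-length and approximates τ and κ with central differences:

```python
import math

R, m = 1.0, 0.5
c = math.sqrt(R * R + m * m)  # the helix (R cos t, R sin t, m t) has constant speed c

def beta(s):
    """The helix re-parameterized by arc-length, so its speed is identically 1."""
    t = s / c
    return (R * math.cos(t), R * math.sin(t), m * t)

def d_ds(f, s, eps=1e-4):
    a, b = f(s - eps), f(s + eps)
    return tuple((y - x) / (2 * eps) for x, y in zip(a, b))

s = 0.7
tau = d_ds(beta, s)                        # unit tangent
kappa = d_ds(lambda u: d_ds(beta, u), s)   # curvature vector d tau / ds
dot = sum(t_i * k_i for t_i, k_i in zip(tau, kappa))
print(dot)   # numerically ~ 0, i.e. kappa is orthogonal to tau
```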
Note 3. We want to make a quick check on the units of all the quantities we have described so far. Let's assume that our RN carries some sort of length unit, like the cm, which we will write as [L]. Let's also assume the parameter of our parametrization has units of time, like sec, which we denote [T]. Then both γ and s have units [L], so the tangent vector dγ/ds has units of [L]/[L] = 1 and is unit-less. This is something we want explicitly: the geometric object should not depend on the parametrization, which means it should also be independent of the unit of the parametrization [T]. The "fake" tangent vector we defined before has, on the other hand, units of [L]/[T].
The curvature vector d²γ/ds² has units of [L]/[L]² = 1/[L].
Until now, we have only given a formula for the curvature vector in the arc-
length-parametrization. We will now write down the formula for the curvature
vector with any parametrization.
Lemma 1.3.2 (Curvature Vector in arbitrary Parametrization). Let γ : I → RN be any curve, let γt = dγ/dt, and let τ(t) = (dγ/dt)/|dγ/dt| be the tangent vector. Then the curvature vector κ(t) can be written as:⁶

κ = (1/|γt|²) ( γtt − ⟨γtt, γt/|γt|⟩ γt/|γt| )    (1.12)

where γtt is d²γ/dt².
Before we go on to prove this, we first want to talk about what each part of the equation means.
We know that κ = d²γ/ds² and therefore expect it to have something to do with d²γ/dt². This turns out to be the case: the first term is indeed γtt. But there is a correction term of −⟨γtt, γt/|γt|⟩ γt/|γt|, which has a nice geometric explanation. It projects γtt onto the normal plane of the tangent vector. See Figure 1.7 for a visual example. After we have projected γtt onto the normal plane, we still divide it by |γt|². You can see this as just a factor that makes sure that the units work out.
We can see this simply by comparing units. The part that projects γtt onto the normal plane has the same unit as γtt, so we can just look at γtt. (Because we add them, which doesn't change the units.) The units of γtt = d²γ/dt² are clearly [L]/[T]², while the unit of κ is 1/[L], as we saw above. Therefore, to get a consistent formula, we need something that has units of [T]²/[L]². 1/|γt|² is exactly such a factor.
Note 4. We call the normal plane a plane, even though that is technically only correct if we have a curve in R3. In R2 it is a line, in R4 a hyperplane, and in general an (N − 1)-dimensional vector space.
Proof. We now prove equation 1.12. In its most basic form, the proof consists of just taking the definition of κ in the arc-length-parametrization and switching to the t-parametrization, using the normal rules of derivatives (chain rule / product rule). We start with the chain rule:

κ = dτ/ds = (dt/ds) (dτ/dt)    (1.13)

where by dt/ds we of course mean dg/ds, where g = f⁻¹ and f(t) = ∫_{t0}^{t} |dγ/dt| dt. Therefore:

dt/ds = dg/ds = (df/dt)⁻¹ = |dγ/dt|⁻¹ = 1/|γt|    (1.14)
⁶ If you already have some experience with Differential Geometry, or you are rereading this after learning further chapters, you might notice that this is the covariant derivative of the tangent vector.
Figure 1.7: A curve with τ and κ drawn in, as well as γtt. Because we move along the curve faster and faster (the ticks are more spread out), γtt has a component in the "forward" direction, which we cancel out in equation 1.12. The vector is still too long though, which is why we need to divide by |γt|².
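Even though the proof above is only sketched, equation 1.12 is easy to test numerically. In the sketch below (our own), we take a circle of radius R = 2 with a deliberately uneven parametrization; the formula should still return a curvature vector of length 1/R:

```python
import math

def diff(f, t, eps=1e-4):
    a, b = f(t - eps), f(t + eps)
    return tuple((y - x) / (2 * eps) for x, y in zip(a, b))

def curvature_vector(gamma, t):
    """Equation 1.12: kappa = (gamma_tt - <gamma_tt, u> u) / |gamma_t|^2,
    with u = gamma_t / |gamma_t|."""
    gt = diff(gamma, t)
    gtt = diff(lambda u: diff(gamma, u), t)
    speed = math.sqrt(sum(x * x for x in gt))
    u = tuple(x / speed for x in gt)
    proj = sum(a * b for a, b in zip(gtt, u))
    return tuple((a - proj * b) / speed ** 2 for a, b in zip(gtt, u))

R = 2.0
# a circle, but traversed with the uneven angle phi(t) = t^3 + t:
gamma = lambda t: (R * math.cos(t ** 3 + t), R * math.sin(t ** 3 + t))

kappa = curvature_vector(gamma, 0.4)
print(math.sqrt(sum(x * x for x in kappa)))  # ~ 1/R = 0.5
```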
1.4 Curves in R2
We have, by now, defined exactly what we mean by a curve, seen the concept of
what sort of object is geometric, and defined a few of these, like the arc-length,
tangent and curvature vectors. We will now use all of these concepts to describe
curves in the two dimensional plane.
The main idea that makes this a lot simpler is that the curvature vector κ reduces to a number. This is because, in two dimensions, the direction of the curvature vector is always predetermined by the direction of the tangent vector.
To see this, we note that, as we showed before, the curvature vector κ lies in the normal "plane" of the tangent vector, which in two dimensions means that it lies on a straight line perpendicular to τ. Therefore, we only need to specify one number⁸ to determine the curvature vector.
Let's say we are at a point on a curve, like the one drawn in Figure 1.8. We can construct a right-handed basis of R2 at that point by taking τ as our first basis vector, together with the vector one gets by rotating τ by 90 deg (in the positive sense), which we will call N. Since τ is of unit length and we get N by rotating τ by 90 deg, this is an orthonormal basis (a right-handed one). Notice that it immediately follows that:

κ = kN    (1.20)

for some k ∈ R, because we know that κ and τ are orthogonal. We call this k the curvature scalar. It is an important quantity in differential geometry, and we will find its equivalents for different geometric objects throughout the subject.
⁷ You should recognize this from Calculus II, given maybe in a different notation: dr/dt = (∇r) · d⃗x/dt = (⃗x/r) · d⃗x/dt
⁸ At each point of the curve
Figure 1.8: A curve with its tangent, curvature and normal vectors drawn in at a point on the curve. As you see, the curvature vector is just some number times the normal vector. Note, however, that k is not just the absolute value of κ: it can also be the negative of its length, if κ points in the other direction.
Example 1.4.1. Our first example is the simplest curve that is not a straight line (because a straight line, of course, has no curvature⁹), which is a circle of radius R. The curvature of that circle is

k = 1/R    (1.21)
Figure 1.9: A few circles with different radii, with their respective τ, κ, N drawn
at the point (0, R) of each curve. The bigger the circle, the less curved it is, as
reflected by the formula k = 1/R.
The curvature scalar has a lot of interpretations. Let’s first state them, then discuss
their consequences.
1. The curvature scalar is the rate of change of the angle the tangent vector makes with the x-axis. Mathematically, let θ(s) = arctan(τ₂(s)/τ₁(s)) be exactly that angle. Then:

k = dθ/ds    (1.22)

2. The absolute value of the curvature scalar k tells us the radius of the osculating circle, which is the unique circle that agrees with the curve up to order two:

|k(s)| = 1/R(s)    (1.23)

where R(s) is the radius of that circle at the point of the curve whose parameter value is s.
The first interpretation of the curvature scalar should make a lot of sense intuitively. We know that the tangent vector cannot change its length, since per construction it is of unit length. Therefore the only thing that can really change is its direction, i.e. the angle it makes with the x-axis. This, along with the fact that the curvature vector describes how the tangent vector changes, makes the first part of the proposition rather intuitive. See Figure 1.10 for a visualisation.
The proof is not too complicated; you just need to differentiate θ(s) and remember that (1) the derivative of arctan(x) is 1/(1 + x²) and (2) the normal vector in terms of τ is N = (−τ₂, τ₁).
Figure 1.10: A curve, with its tangent vectors drawn in, and a table that shows how the tangent vector rotates.

Proof of the first interpretation. As we said, you only need to differentiate θ. Let's start:

dθ/ds = (d/ds) arctan(τ₂/τ₁)    (1.24)
= (d(arctan(x))/dx) · (d/ds)(τ₂/τ₁)    (1.25)
= (1/(1 + x²)) · (τ̇₂ τ₁ − τ₂ τ̇₁)/τ₁²    (1.26)
= (1/(1 + τ₂²/τ₁²)) · (τ̇₂ τ₁ − τ₂ τ̇₁)/τ₁²    (1.27)
= (1/(τ₁² + τ₂²)) ⟨(τ̇₁, τ̇₂), (−τ₂, τ₁)⟩    (1.28)
= (1/1) ⟨κ, N⟩    (1.29)
= k    (1.30)

The factor in the fraction is one because it is the square of the length of τ, which is one.
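The first interpretation can also be checked numerically. The sketch below (ours; all names assumed) builds θ(s) from the finite-difference tangent of an arc-length parameterized circle of radius 3 and differentiates it; the result should be k = 1/3:

```python
import math

def d(f, s, eps=1e-4):
    return (f(s + eps) - f(s - eps)) / (2 * eps)

R = 3.0
gamma = lambda s: (R * math.cos(s / R), R * math.sin(s / R))  # arc-length parameterized

def theta(s):
    """Angle of the tangent vector with the x-axis."""
    tx = d(lambda u: gamma(u)[0], s)
    ty = d(lambda u: gamma(u)[1], s)
    return math.atan2(ty, tx)

# Away from the branch cut of atan2, d theta / ds should equal k = 1/R:
print(d(theta, 1.0))   # ~ 1/3
```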
Now, what about the second interpretation? Well, you can imagine a circle going along the curve that locally looks like the curve. (The curve tries to be as similar to the circle as possible, but because the radius of the osculating circle changes with s, it doesn't become a circle.)
Figure 1.11 gives a picture of a curve and its osculating circles at different points of the curve. As you will hopefully agree, the bigger the radius of the circle, the straighter the curve will be at that point (as both of them agree to order two, so they locally behave quite similarly). Therefore we expect the second interpretation to be correct, that is, the curvature scalar is the inverse of the radius of the osculating circle.
Figure 1.11: A curve with its osculating circle drawn in at a few places along the curve (the biggest one only partially drawn in). It is clear that the bigger the osculating circle is, the straighter the curve will be, which gives the connection to the curvature scalar.
Proof of the second interpretation. We will not prove this fully, but we will sketch a proof. The osculating circle agrees with γ up to order two. Therefore, we can expect that the second derivatives (i.e. the k's) agree for the curve and the osculating circle (which we can see as a second curve) at that point. We know that at that point the circle has kcircle = 1/Rcircle, and therefore this should also be true for the first curve. The only missing parts of the proof are (1) the proof that an osculating circle exists, which it does¹⁰, and (2) a more rigorous way of presenting the above argument.
There is actually also a third interpretation of the curvature scalar, for a special kind of curve. Let's say that the curve is the graph of a function y = u(x) that assigns a y-value to every x-value, like the one in figure 1.12. Consider the second derivative uxx. In this case, the curvature scalar turns out to be k = uxx/(1 + ux²)^(3/2), as we prove below.
¹⁰ A straight line is a circle of infinite radius in many aspects of geometry. This is also true here: if the curve is locally straight at a point, the radius of its osculating circle will blow up and the circle will become a straight line, but the theorem will still hold. For the mathematicians: 1/∞ = 0 in this case.
Proof. Let's start by collecting different terms that might be useful. Firstly, γ(x) = (x, u(x)) and therefore:

γx = (1, ux)    (1.32)
|γx| = (1 + ux²)^(1/2)    (1.33)
τ = γx/|γx| = (1, ux) / (1 + ux²)^(1/2)    (1.34)
N = (−ux, 1) / (1 + ux²)^(1/2)    (1.35)
The first three should be rather clear, coming straight from the definition. The last one comes from the fact that N is just τ rotated by 90 deg, which means we switch the two entries of the vector and put a minus in front of the first one¹².
¹¹ Conditions apply, as always.
¹² If this is not clear to you, try it out with the rotation matrix of positive 90 deg. You'll see that this is correct.
We can now just use the definition of κ and the chain rule and calculate until we get there.

κ = (dx/ds) (dτ/dx)    (1.36)
dx/ds = (ds/dx)⁻¹ = |γx|⁻¹    (1.37)
→ κ = (1/|γx|) dτ/dx    (1.38)
= (1/|γx|) (d/dx) (γx/|γx|)    (1.39)
Before we continue, there is something to note about what we already found.
Inside the derivative, we already normalize once, and then again outside of the
derivative. In this sense, κ is a normalized version of a second derivative.
... = (1/|γx|) (d/dx) ( (1, ux)/|γx| )    (1.40)
= (1/|γx|) ( (0, uxx)/|γx| + (1, ux) (d/dx)(1/|γx|) )    (1.41)

(We used the product rule.) Now, we know that k = ⟨κ, N⟩. The last term in the equation above is proportional to (1, ux), which is proportional to τ, which means that when we form the scalar product to get k, it drops out, since τ is orthogonal (per construction) to N. We get:

k = ⟨κ, N⟩    (1.42)
= ⟨ (1/|γx|) (0, uxx)/|γx| , (−ux, 1)/|γx| ⟩    (1.43)
= uxx/|γx|³ = uxx / (1 + ux²)^(3/2)    (1.44)

Here we used that the aforementioned second term is orthogonal to N and left it out. At the end we just collected terms.
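A quick numerical cross-check of equation 1.44 (our own sketch): the graph of u(x) = √(1 − x²) is the upper half of the unit circle traversed left to right, so it bends clockwise and the formula should give k = −1 everywhere:

```python
import math

def d1(f, x, eps=1e-4):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def graph_curvature(u, x):
    """Equation 1.44: k = u_xx / (1 + u_x^2)^(3/2) for the graph y = u(x)."""
    ux = d1(u, x)
    uxx = d1(lambda t: d1(u, t), x)
    return uxx / (1 + ux * ux) ** 1.5

u = lambda x: math.sqrt(1 - x * x)   # upper half of the unit circle
print(graph_curvature(u, 0.3))       # ~ -1.0
```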
Now, after seeing how much manual computation this took, you might be a bit astounded as to why. The reason is the same reason why, any time you actually want to compute something in differential geometry, it usually turns into a mess of derivatives. We are turning something fundamentally coordinate-based¹³ (uxx) into something geometric (k). Coordinate-based objects usually carry, as you might imagine, a lot of information that is only related to the choice of our coordinates, and we have to filter that information out when we do the conversion. This is why there is so much to compute, even if the steps aren't too complicated.
13 To make this discussion more general, we write coordinate-based, even though right now
it’s just a parametrization. You can see a parametrization as coordinates on the curve.
Figure 1.13: You can perform a rigid motion and not change anything about the curvature of the curve.
where θ(0) is a constant we can choose freely. The next step is to integrate the equation

dγ/ds (s) = τ(s) = e^{iθ(s)}    (1.47)

and get:

γ(s) = γ₀ + ∫_{s₀}^{s} e^{iθ(s′)} ds′    (1.48)

giving us yet another constant γ₀, which we can choose freely. To get a more rigorous proof, you would need to show that these are solutions (just differentiate them) and that they are the only solutions (use a uniqueness theorem from calculus).
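Equations 1.47 and 1.48 can be turned into a little reconstruction algorithm (a sketch of ours, using the complex-number notation of the text): integrate θ′ = k and then γ′ = e^{iθ}. Feeding in the constant curvature k = 1/R must reproduce a circle of radius R, so after arc-length 2πR the curve closes up:

```python
import cmath

def curve_from_k(k, s_max, n=20_000, theta0=0.0, gamma0=0j):
    """Integrate theta' = k(s) and gamma' = exp(i*theta(s)) with a midpoint
    rule, i.e. equations (1.47)-(1.48). Returns the sampled points gamma(s)."""
    h = s_max / n
    theta, gamma = theta0, gamma0
    points = [gamma]
    for i in range(n):
        s_mid = (i + 0.5) * h
        theta_mid = theta + 0.5 * h * k(s_mid)   # theta at the midpoint
        gamma += h * cmath.exp(1j * theta_mid)
        theta += h * k(s_mid)
        points.append(gamma)
    return points

R = 2.0
pts = curve_from_k(lambda s: 1.0 / R, s_max=2 * cmath.pi * R)
print(abs(pts[-1] - pts[0]))   # ~ 0: the curve closes after one circumference
```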
Figure 1.15: A problem case of a closed curve, for which the tangent vector at the beginning is not the same as the tangent vector at the end. We want to avoid this, so we simply exclude these kinds of curves (and ones where higher derivatives don't match up) from the set of curves we consider.
3. If we are working with closed curves, we will want them to have the (nice) property that, if we extend them periodically to a curve from R → RN, they are smooth. This is to avoid annoying situations like the one in figure 1.15.

Theorem 1.4.2. Let γ be a two-dimensional regular closed curve that obeys the above restrictions. Then:

∫_γ k ds = 2πn    (1.49)

for some n ∈ Z.
The integral is the global quantity we mentioned before, while k is the curvature scalar, a local quantity. Since ∫_γ k ds = ∫_a^b (dθ/ds) ds, the global quantity is just the total angle (with signs) by which the tangent vector rotated, a profoundly global thing.
It makes sense that this would be so. If the curve is closed (and smooth on the edge, if made periodic), then the angle τ rotates by must be some multiple of a whole rotation, since it ends up where it started.
For a non-self-intersecting closed curve, τ can moreover only turn once in total, i.e. n = ±1. See the examples in figure 1.16.
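Theorem 1.4.2 can be observed numerically by accumulating the turning of the tangent direction around a closed curve; the sketch below (our own) does this for an ellipse, where n should come out as 1:

```python
import math

def total_turning(gamma, n=4000):
    """Accumulate the change of the tangent angle around a closed curve
    gamma : [0, 2*pi] -> R^2, i.e. the integral of k ds in equation 1.49."""
    total, prev = 0.0, None
    for i in range(n + 1):
        t = 2 * math.pi * i / n
        eps = 1e-6
        # a vector proportional to the velocity is enough to get its angle:
        vx = gamma(t + eps)[0] - gamma(t - eps)[0]
        vy = gamma(t + eps)[1] - gamma(t - eps)[1]
        ang = math.atan2(vy, vx)
        if prev is not None:
            step = ang - prev
            # unwrap jumps across the atan2 branch cut:
            if step > math.pi:
                step -= 2 * math.pi
            elif step < -math.pi:
                step += 2 * math.pi
            total += step
        prev = ang
    return total

ellipse = lambda t: (2 * math.cos(t), math.sin(t))
print(total_turning(ellipse) / (2 * math.pi))   # ~ 1.0, i.e. n = 1
```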
With this we conclude the topic of two-dimensional curves, and move on to
three-dimensional curves.
1.5 Curves in R3
1.5.1 First remarks
Figure 1.16: Two simple curves and the graph that shows by how much the tangent vector rotated.

Very early in our discussion of two-dimensional curves we figured out that in two dimensions, the curvature vector is not really necessary for the description of how the curve curves. Better put, the curvature scalar (the signed length of the curvature vector) held all the information about curvature; the direction of the curvature vector was always predetermined.
Here, the vector character of the curvature vector really stands out.
We saw at the beginning of this chapter that the curvature vector lies in the
normal plane of the tangent vector. In two dimensions this helped us, by letting us
forget about the direction of the curvature vector and only consider the curvature
scalar. This time, we cannot do this, as we have an entire plane that the curvature
vector could lie in.
In the two-dimensional case, we defined a moving frame composed of τ and N, which was a right-handed orthonormal basis. This is the idea we will use to get further in three dimensions.
Definition 1.5.1 (The moving frame). Let, as always, γ be a curve, this time in R3, with all the properties we already mentioned before (smoothness, regularity, etc.). We define three vectors at each point, which will compose the moving frame we will usually use.
The first vector is the tangent vector τ. The second vector, N = κ/k (the normalized curvature vector), will be called the normal vector¹⁵, while the third vector, β = τ × N, is called the bi-normal vector.
Together, N and β span the normal plane of τ . For a visualisation of the moving
frame, look to figure 1.17.
Figure 1.17: A three-dimensional curve, with the moving frame drawn in. N is κ, but normalized; β is the cross product of τ and N.
Definition 1.5.2 (Curvature scalar for curves in three dimensions). The cur-
vature scalar is simply the absolute value of the curvature vector, defined as
k = |κ|
Now, you might have noted that to define N we divided by the curvature scalar, and this becomes a problem if k is zero. This is an actual problem and happens for any curve that is (to second order) straight at some point. We will exclude this, simply by adding another restriction to our definition of a curve.
¹⁵ Even though it is not the only normal vector to τ, it is a very special one.
With this aside, let’s go back to the curvature vector. We saw that the curvature
scalar will simply not provide enough information about our curve that we can paint
a complete picture. We will need another agent, which will be called the torsion.
1.5.3 Torsion
Here we will define what we mean by the new agent we said we needed in the
previous section, called torsion. Torsion will also be another geometric object16 . It
will tell us, in a sense, how much the curvature vector changes.
⟨dN/ds, τ⟩ = (d/ds)⟨N, τ⟩ − ⟨N, dτ/ds⟩ = 0 − ⟨N, κ⟩ = −k    (1.52)
which is just (minus) the curvature scalar, which we already know, and for the other
component we get (Product rule again):
2⟨dN/ds, N⟩ = d⟨N, N⟩/ds = (d/ds) 1 = 0    (1.53)

since a unit vector can't change in its own direction (otherwise its length would change).
That is why we take the projection. It provides us with the only new information: how the normal plane changes, but only in the direction of the bi-normal vector.
We can form a table with all the objects we have so-far introduced and a few
things to note on them. See table 1.1.
One thing that you might find surprising at first is that the units of l are not 1/[L]². The reason is that we normalize κ before differentiating, which multiplies the units by [L].
16 As a reminder, something is a geometric object or geometrically invariant when it does
Figure 1.18: A curve, and it’s normal vector and plane changing along the curve.
Table 1.1: The main geometric objects we have defined up-til now.
Figure 1.19: The parallel between k and l. k measures how close a curve is to a straight line (part (a)). In part (b) you see a curve that lives entirely in a plane, while in (c) you see a curve that deviates from the plane it is almost in, at least in the direct vicinity of the thick point with the moving frame drawn in. l is the thing that measures this.
where l is the torsion scalar and c is some universal constant. This is why the torsion scalar measures how much the curve deviates from living in a plane. As an exercise, you should prove all the claims we did not prove in this discussion and find c. (You can take the curve γ = (s, as², bs³) as a first example and see where in the Taylor series around 0 you find k and l.)
You can also prove that if k = 0 for all s, then the curve is a straight line, and
if l = 0 for all s, the curve lies in a plane. (You can do this as an exercise, or decide
that the above discussion convinced you of this.)
In summary, torsion measures how much a curve twists away from a plane and
into the third dimension.
Example 1.5.1. What is the curve with constant curvature and torsion? We
won’t show it here, but it can be shown that it is a helix. A helix, by the way,
can be parameterized by (R cos(t), R sin(t), mt) where m is some constant.
Why does it make sense that it’s a helix? Well, we want constant curvature,
which means a circle is involved, but we also want it to twist out of the plane
it lives in, at a constant rate, which is why we get a helix.
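If you want to convince yourself of this numerically, here is a small Python sketch (not part of the script; the values R = 2, m = 0.5 are an arbitrary choice). It evaluates the standard formulas k = |γ′ × γ′′|/|γ′|³ and l = ⟨γ′ × γ′′, γ′′′⟩/|γ′ × γ′′|² at several parameter values and finds the same constants every time:

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def helix_derivs(t, R, m):
    # first three derivatives of gamma(t) = (R cos t, R sin t, m t), computed by hand
    d1 = (-R * math.sin(t),  R * math.cos(t), m)
    d2 = (-R * math.cos(t), -R * math.sin(t), 0.0)
    d3 = ( R * math.sin(t), -R * math.cos(t), 0.0)
    return d1, d2, d3

def curvature_torsion(d1, d2, d3):
    # standard formulas for a regular (not necessarily unit-speed) curve in R^3
    c = cross(d1, d2)
    k = norm(c) / norm(d1) ** 3
    l = dot(c, d3) / norm(c) ** 2
    return k, l

R, m = 2.0, 0.5
vals = [curvature_torsion(*helix_derivs(t, R, m)) for t in (0.0, 0.7, 2.0)]
# every entry should be (R/(R^2+m^2), m/(R^2+m^2)), independently of t
```

Both k and l come out as the same number at every t, which is exactly the "constant curvature and torsion" property of the helix.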
This is stated in the following theorem, which can be proven similarly to its
equivalent in two dimensions.
Theorem 1.5.1 (k and l determine curve up to a rigid motion). Let k(s) ≥ 0
and l(s) be any smooth functions. If we set the curvature scalar and torsion
scalar to these functions, respectively, we determine the curve uniquely in R3
up to a rigid motion in R3 .
As we said before, we won't prove this. But we hope you see how interesting the
result is. We need only two real functions to describe a curve in three dimensions,
and only one real function in two dimensions. In both of these cases, we reduced
our description by an entire function.
    (e1(s), e2(s), e3(s))^T   (1.55)
Then, as we will show in a second, for a general moving frame we can write:
    d/ds (e1(s), e2(s), e3(s))^T = A(s) (e1(s), e2(s), e3(s))^T   (1.56)
where A(s) is a 3 × 3 matrix.
This matrix has a special property, generally, for any moving frame.
Proposition 1.5.1. Let γ be any curve and (e1 (s), e2 (s), e3 (s)) an orthonormal
moving frame. Then the matrix A(s) from equation 1.56 is anti-symmetric.
Where does the anti-symmetry come from? There are two parts to it. Firstly,
the diagonal is zero, which comes simply from the fact that
a vector of unit length cannot change in its own direction, otherwise the length
would change. Then there is the anti-symmetry of the other components, which
Figure 1.21: A curve with its Fréchet-frame and a second picture of the same
curve with a random moving frame.
46 CHAPTER 1. CURVES
stems from the orthogonality. If one vector changes, the others have to change in
a way that they all stay orthogonal to each other.
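Here is a quick numerical sketch of the proposition (not part of the script; the particular rotating frame is our own choice). We build an orthonormal frame e_i(s) by rotating the standard basis, approximate A_ij = ⟨de_i/ds, e_j⟩ by finite differences, and check that A + Aᵀ vanishes:

```python
import math

def frame(s):
    # orthonormal moving frame: rotate the standard basis by Rz(s), then Rx(2s)
    cz, sz = math.cos(s), math.sin(s)
    cx, sx = math.cos(2 * s), math.sin(2 * s)
    def rot(v):
        x, y, z = v
        x, y = cz * x - sz * y, sz * x + cz * y   # rotation about z by s
        y, z = cx * y - sx * z, sx * y + cx * z   # rotation about x by 2s
        return (x, y, z)
    return [rot(e) for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]

def A_matrix(s, h=1e-6):
    # A_ij = <e_i'(s), e_j(s)>, with e_i' approximated by a central difference
    e, ep, em = frame(s), frame(s + h), frame(s - h)
    d = [[(ep[i][c] - em[i][c]) / (2 * h) for c in range(3)] for i in range(3)]
    return [[sum(d[i][c] * e[j][c] for c in range(3)) for j in range(3)]
            for i in range(3)]

A = A_matrix(0.4)
skew_defect = max(abs(A[i][j] + A[j][i]) for i in range(3) for j in range(3))
# skew_defect should be zero up to finite-difference error
```

The anti-symmetry shows up no matter which s you plug in, because it only uses ⟨e_i, e_j⟩ = δ_ij, exactly as in the argument above.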
         ⎛τ⎞   ⎛ 0   k   0⎞ ⎛τ⎞
    d/ds ⎜N⎟ = ⎜−k   0   l⎟ ⎜N⎟   (1.58)
         ⎝β⎠   ⎝ 0  −l   0⎠ ⎝β⎠
where k = k(s), l = l(s) are the curvature and torsion scalars, respectively.
This is a particularly simple matrix. The special thing about a Fréchet-frame is
that it reduces the three independent matrix-components of A to two, specifically
two components we already know.
Proof. Many of these are definitions (for example, the first equation is just the
definition of N ), the rest aren’t too hard to check and we leave them to you as
an exercise.
The next theorem says something about the same quantity as above, for knotted
curves. Knotted curves are curves that cannot be smoothly transformed into a circle
without crossing themselves. You can see a few examples in figure 1.22.
Figure 1.22: The trefoil, figure-8-knot and unknot. The first two are knotted,
the last one is not, even though it might look like it at first.
The bound in Milnor's theorem is sharp: for example, trefoil knots can be made
with total absolute curvature arbitrarily close to 4π (though no smooth knot attains it).
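You can also see Milnor's bound at work numerically. The sketch below (our own choice of trefoil parameterization, not from the script) inscribes a fine polygon in a trefoil and sums its exterior angles, which is the polygonal notion of total curvature; for a knotted polygon this sum must exceed 4π:

```python
import math

def trefoil(t):
    # a standard (2,3) torus-knot parameterization of the trefoil
    return ((2 + math.cos(3 * t)) * math.cos(2 * t),
            (2 + math.cos(3 * t)) * math.sin(2 * t),
            math.sin(3 * t))

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))

n = 4000
pts = [trefoil(2 * math.pi * k / n) for k in range(n)]

# total curvature of the inscribed polygon = sum of exterior angles
total = 0.0
for k in range(n):
    u = sub(pts[(k + 1) % n], pts[k])
    v = sub(pts[(k + 2) % n], pts[(k + 1) % n])
    c = dot(u, v) / (norm(u) * norm(v))
    total += math.acos(max(-1.0, min(1.0, c)))
# total should come out larger than 4*pi, as the theorem demands
```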
With these two examples, we conclude our discussion of curves and move on to
the second type of object we want to discuss before talking of differential geometry
in a general matter, those objects being surfaces in R3 .
Chapter 2
Surfaces
In this chapter, we deal with surfaces, which are the obvious next step after curves in
our discussion. We will not treat surfaces in the more general RN , but just surfaces
in R3 , as they are more intuitive and are enough for the purposes of this lecture.
We start with the definition and then define a few basic properties, before
moving further to a discussion of the geometry and curvature.
In figure 2.1(b), you can see the graph of the function z = f(x, y) =
x² + y². It is, of course, a surface (in every intuitive way, but also in the
more general definition we will give below.)
It is clear that the surface in the example should be a surface. The idea that a
surface should always be the graph of a function is, however, not a good one. There
are two simple examples that should definitely be surfaces, but wouldn't be if that
were our definition. The first is the xz-plane. Obviously, it should be a surface. If
anything should be a surface, the xz-plane should be. And yet, it is quite easy to
see that you can't write it as a function of x and y. Well, you might say that there
is nothing special about x and y in R3 and that we should be allowed to choose
any plane to describe our surface. For example, we could take y = f(x, z) = 0 to
describe the xz-plane. Yes, that is a possibility, but still we find a problem. Take
the sphere. It should definitely be included in our definition of a surface. But I dare
you to find any plane for which you can write the whole sphere as the graph of a
function. It should be very clear that this is not possible.
49
Figure 2.1: A picture of many surfaces that explain our definition of a surface. In
(a) you can see a graph of some random function of x and y. The picture in (b) is
simply the graph of f(x, y) = x² + y². In (c) you see the first problem with the
naive definition, because we cannot write the xz-plane as the graph of a function
of x and y. In (d) you see the further problem that, even if we allow the
function to be defined on an arbitrary plane, the sphere can simply never be of
such form globally, but it can be made such locally (e).
2.1. SOME DEFINITIONS AND BASIC QUANTITIES 51
Can we repair this situation? Yes, if we recognize that, while we cannot write
surfaces like the sphere as a graph of a function, at every point on the surface, in
some (possibly very small) region of the surface, we can write that region as the
graph of a function (on some plane in R3). That is, we can write the surface locally
as the graph of a function1. This leads us to our full definition.
The tangent plane to M at p is the set of all vectors based at p and tangent
to M at point p. We can formally define it as:
    Tp M = { γt(0) : γ is a smooth curve in M with γ(0) = p }   (2.1)
Apart from tangent vectors, we also have normal vectors. These are vectors that
are normal to M, which, of course, means that they are also normal to Tp M.
Figure 2.2: A surface M, with a point p and the tangent plane Tp M drawn
in, at different levels of "zoom". Globally, M and Tp M are very different,
but the more we zoom in, the more they coincide, until they become
"almost" the same.
Figure 2.5: The Möbius strip. You can try to draw a normal field by starting
at some point and drawing the next normal vector and then the next until
you get back to where you started, but you will find that when you come
back, your normal will point in the opposite direction to the one you started
with (= not smooth).
Now that we have developed the most basic tools we could use to do differential
geometry on surfaces (The analogues of τ and N for curves) we can proceed to
talk about curvature.
We shall start with the geometric definition of curvature, since it is the most intuitive
one.
The first way to define curvature on a surface is using curves that lie on the surface.
If the surface is curved, then any curve passing through the point of interest p
has to be curved as well. We cannot just take the curvature of some curve on
M going through p, simply because this will give us different values for different
curves. (There are infinitely many curves going through p that live on M,
all curved differently, see figure 2.3.) There is, however, a specific amount of curving
that a curve has to do at the point p to still stay on M, otherwise it would leave.
This amount of curvature will be in the direction of N. Why? Well, in the tangent
direction, we can have pretty much any curvature we want, the curve can be as
curved as it wants on M. That pretty much leaves only the direction of N. From this
simple thought follows the next definition.
Figure 2.6: A surface M and a point p. There are a lot of curves going through
p, some less, some more curved. (They all have to live on M though.)
Figure 2.7: The idea of the previous definition. The function Qp takes a
(normalized) v from the tangent plane, takes a curve through p with a
matching tangent vector, and puts out the normal component of κγ .
There are two questions you might have already asked yourself. Firstly, why do
we only define this when |v| = 1? That is an easy question to answer. Mostly
convenience. It will simplify the proof and calculations, while not leaving out any
real information. Let’s say you want to know Qp (v) for some vector with length
2. Then you can do this entire procedure for the same vector, but normalized, and
simply parameterize the curve γ in such a way that you move twice as fast along it.
The second question is whether this definition makes any sense, whether it
is well-defined, because we have, after all, many curves going through p with their
tangent vector equaling v, and it is not obvious that we always get the same number
for Qp(v) for any choice of appropriate curve. This will turn out to be true, however,
as we will prove shortly. For now, we will simply assume that this is true
and talk about the object Qp a bit.
2.3. THE GEOMETRIC DEFINITION OF CURVATURE ON SURFACES57
We want to take a few notes and intuitive ideas about what Qp is and describes at
this point.
• Firstly, Qp (v) is not standard notation, because it will turn out that Qp will
simply be the second fundamental form in a slightly different form, which
will be denoted as Ap (X, Y ), and which we will introduce shortly.
• Qp (v) will turn out to be a quadratic form in the components of v, i.e. it will
be of the form Qp(v) = Σ_{i,j=1}^2 a_ij v^i v^j, where the a_ij are some numbers that
come from the surface's geometry and v^i is the i-th component6 of v.
• Qp (v) = Qp (−v). This follows from the fact that for an appropriate curve
which we use to calculate Qp (v), we can simply take the same curve, but
reverse its parametrization. Then the tangent vector changes direction (or
sign), but the curvature vector stays the same, and therefore, by definition,
Qp (v) does too7.
We will prove many of these claims soon, for now we just wanted to familiarize you
with these properties.
6 Notice the index is upstairs. There is a reason for this sort of notation, where upstairs and
downstairs indices have different meanings. For now, however, there really is no difference,
but we will use this notation to get you accustomed to it.
7 Prove that the tangent-vector changes sign, but the curvature doesn’t.
Figure 2.8: A sphere, the north pole, a random unit tangent vector and a
great circle.
    Qp(v) = ⟨κγ, N⟩ = ⟨(1/R) N, N⟩ = 1/R   (2.2)
Also, this is independent of v, which really just says something about the
(local) symmetry of the sphere.
a You can alternatively see this as us using special (Cartesian) coordinates, for
which the point on the sphere that is of interest p has coordinates (0, 0, R). Then,
since Qp is defined by a scalar product of two fundamentally geometric vectors, it
should transform geometrically.
Figure 2.9: Here we have the same situation as in the previous figure, but
instead of a great circle, we take a smaller circle to compute Qp (v). We
have, however, rotated the sphere (v is pointing towards you), so that it would
be simpler to see what is going on, namely that the curvature vector of the
smaller circle is far larger, but that the projection onto N is still the same.
Lemma 2.3.1
Let everything be the same as in the theorem above. Then
    Qp(v) = ⟨γtt(0), N(p)⟩
What the lemma says is basically that instead of the geometric curvature κ of
the curve, we can use the second derivative of γ.
The proof is quite simple.
The proof is quite simple.
Proof. Let γ be any appropriate curve. We simply compute ⟨κγ(0), N(p)⟩, using
the fact that κγ = (1/|γt|²) (γtt − ⟨γtt, γt⟩ γt / |γt|²). We can use the fact that
|γt| = |v| = 1, which simplifies the formula for κ:
    ⟨κγ(0), N(p)⟩ = ⟨γtt − ⟨γtt, γt⟩γt, N(p)⟩ = ⟨γtt(0), N(p)⟩   (2.4)
where in the last step we have used the geometric fact that γt = v is in the
tangent plane, and therefore orthogonal to N.
It is clear that we can also (locally) write the curve as a graph over its tangent, as
seen in figure 2.3.2, where we also find that the curvature scalar becomes a second
derivative:
    k = uxx   (2.7)
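Here is a small numerical sketch of this point for curves (not from the script; the example u(x) = cosh(x) is our own choice). The full curvature of a graph y = u(x) is k = u_xx/(1 + u_x²)^(3/2), and it collapses to the bare second derivative exactly where u_x = 0, i.e. where the x-axis is the tangent line:

```python
import math

# example graph: the catenary u(x) = cosh(x), with derivatives by hand
def u(x):   return math.cosh(x)
def ux(x):  return math.sinh(x)
def uxx(x): return math.cosh(x)

def k(x):
    # full curvature-scalar formula for a graph y = u(x)
    return uxx(x) / (1 + ux(x) ** 2) ** 1.5

k0, k1 = k(0.0), k(1.0)
# at x = 0 the tangent is horizontal (ux = 0), so k equals uxx there;
# away from that point the denominator makes k strictly smaller than uxx
```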
We will see that the curvature of a surface will also become a second derivative,
when you write the surface as the graph of a function over its tangent plane (locally).
    ⟨κγ(0), N(p)⟩ = ⟨γtt(0), N⟩ = ⟨(x′′(0), y′′(0), z′′(0)), (0, 0, 1)⟩ = z′′(0)   (2.8)
where we used the lemma from above. We found the theme of the last
subsection. Curvature is a second derivative. What we have basically
done is write (a local part of) the surface as a graph over its tangent plane
at p, and found that, similarly to curves, curvature became the second
derivative in those coordinates.
Step 4 Somehow, we need to use the fact that γ lies on the surface, which we have
not yet used. Of course, because γ does lie on the surface, its coordinates
have to be: γ(t) = (x(t), y(t), f (x(t), y(t))), because the z component (for
a local part, again) is not independent of x and y. If we now evaluate
z ′′ (0) we get:
Figure 2.10: A curve that is just the graph of the function u(x). When
written as the graph of a function over its tangent, k becomes the second
derivative.
Figure 2.11: The coordinates we are using for the proof of the theorem.
The point p is at the origin and the tangent plane is the xy-plane
Firstly, we have just proven that Qp (v) is a quadratic form, but we have
also proven that Qp (v) is independent of the curve we choose. The quan-
tities fxx , fxy and fyy have nothing to do with our choice of curve, only
with the underlying surface.
We can rewrite our result quite a bit and see another way of looking at it:
    Qp(v) = (v¹ v²) ⎛fxx  fxy⎞ ⎛v¹⎞
                    ⎝fxy  fyy⎠ ⎝v²⎠
which makes the quadratic-form-ness of Qp a bit more visible. (The partials are, of
course, evaluated at p.)
Definition 2.4.1
Let M and p be, as always, a surface and a point on it. Let us also use
coordinates where locally we write the surface as the graph of a function f
over its tangent plane Tp M at p. Then the second fundamental form is defined as:
    Ap(X, Y) = Σ_{i,j=1}^2 (∂²f / ∂x^i ∂x^j) X^i Y^j   (2.15)
We can very quickly note that the second fundamental form is symmetric, in
the sense that Ap (X, Y ) = Ap (Y, X), simply because the Hessian is a symmetric
matrix. You can also see the second fundamental form as the first Taylor coefficient
of f that actually tells us anything geometric. The zeroth coefficient just tells
us where the surface is, the first only how the tangent plane is oriented, both of
which are not geometric things. Only the second one tells us anything geometric,
that is, the curvature at the point p.
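The formula in the definition is easy to evaluate on a computer. Below is a sketch (helper names and the example f are our own) that builds the Hessian of a graph function by finite differences and plugs it into Ap(X, Y), checking the symmetry along the way:

```python
# example graph in the adapted coordinates: f(0,0) = 0 and grad f(0,0) = 0
def f(x, y):
    return x * x + 0.5 * x * y - 2.0 * y * y

def hessian(f, x, y, h=1e-4):
    # second partials by central finite differences
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return ((fxx, fxy), (fxy, fyy))

def A_p(X, Y, H):
    # A_p(X, Y) = sum_ij H_ij X^i Y^j, equation (2.15)
    return sum(H[i][j] * X[i] * Y[j] for i in range(2) for j in range(2))

H = hessian(f, 0.0, 0.0)
X, Y = (1.0, 2.0), (-3.0, 0.5)
sym_ok = abs(A_p(X, Y, H) - A_p(Y, X, H)) < 1e-9   # symmetry of A_p
```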
Note 5. The formula above can only work if we use coordinates where p =
(0, 0, 0) and the xy-plane is the tangent space, because there fx and fy vanish.
We need to carefully pick this coordinate system to extract the geometric infor-
mation. There is a general formula in arbitrary coordinates, but it is long and
tedious, which is the reason why we won't derive it here.
To connect the second fundamental form to curvature, we note that:
Let's say you have a scalar product (not necessarily the standard euclidean
one) and a bilinear form. That is, you have the maps
    ⟨ , ⟩ : V × V → R   (2.17)
    B : V × V → R   (2.18)
where the first has the usual properties of a scalar product and the second one
is linear in both entries and symmetrica. Then there exists an orthonormal
basis e1 , e2 , . . . , en of the vectorspace V, so that Bij = B(ei , ej ) = λi δij
for some numbers λ1 , λ2 , . . . , λn . In other words, the matrix Bij is diagonal
with the λ's as the diagonal. The λ's are called the singular or principal
values.
a B(X, Y ) = B(Y, X)
If you have not seen the singular value decomposition before, it simply states
that you can rotate and flip your coordinate system so as to turn a particular bilinear
form into the form:
    B(v, w) = λ1 v¹w¹ + λ2 v²w² + · · · + λn vⁿwⁿ   (2.20)
    Ap = ⎛k1   0⎞   (2.21)
         ⎝ 0  k2⎠
where k1 and k2 are some numbers that only depend on the surface. They are the
principal curvatures we mentioned when discussing Qp (v), which you can see quite
simply. If we work in that coordinate system, the second fundamental form used on
the same vector twice becomes:
    Qp(v) = Ap(v, v) = k1 (v¹)² + k2 (v²)²
You can verify (exercise) that the maxima and minima occur when v = ±e1, ±e2
in those special coordinates, which verifies that (1) k1, k2 are the extreme values of
Qp(v) and that (2) the directions of the principal axes of curvature are orthogonal
to each other (e1, e2 are orthogonal by construction).
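If you would rather not do the exercise by hand, here is a numerical sketch (the example matrix for Ap is our own choice). It sweeps Qp(v) over unit vectors v = (cos θ, sin θ) and compares the observed extremes with the eigenvalues of the symmetric matrix:

```python
import math

# example matrix of A_p in some orthonormal basis (symmetric by construction)
H = ((3.0, 1.0), (1.0, 1.0))

def Q(theta):
    # Q_p(v) = A_p(v, v) for the unit vector v = (cos theta, sin theta)
    v = (math.cos(theta), math.sin(theta))
    return sum(H[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

samples = [Q(2 * math.pi * k / 3600) for k in range(3600)]
q_max, q_min = max(samples), min(samples)

# eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]] in closed form
a, b, c = H[0][0], H[0][1], H[1][1]
disc = math.sqrt(((a - c) / 2) ** 2 + b * b)
k1, k2 = (a + c) / 2 + disc, (a + c) / 2 - disc
# q_max should match k1 and q_min should match k2 up to sampling error
```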
H = k1 + k2 (2.24)
K = k1 k2 (2.25)
The first, H, is called the mean curvature, while the second, K, is called the
Gauss curvature. You might wonder why H is called the mean curvature when it
should be (k1 + k2)/2. That used to be the old definition, but over time, people got
sick of carrying the factor 1/2 around and dropped it.
What is the meaning of H? To understand it, we hope that by now you are
convinced that the matrix form of Ap in our special coordinates is just the Hessian
of the function that describes our surface, and that when we rotate Tp M, this
does not change. What does change is that the Hessian becomes diagonal in these
coordinates (which we will denote x′, y′). So what we get for that matrix is:
    Ap = ⎛∂²f/∂x′²      0     ⎞ = ⎛k1   0⎞   (2.26)
         ⎝    0      ∂²f/∂y′² ⎠   ⎝ 0  k2⎠
So that:
    H = k1 + k2 = ∂²f/∂x′² + ∂²f/∂y′²   (2.27)
which is the two-dimensional Laplacian of f. Now, the Laplacian tells us quite a bit
about average quantities, which also makes it clearer why H is called the mean
curvature. It tells us how much the function f deviates from its value at (0, 0) in
our coordinates, which is, of course, 0. It makes sense that this would be a good
value to describe a part of the curvature. Because the Laplacian is inevitably tied
up in (physical) problems like drums and soap films, it first came up way before
curvature in this sense was discussed. Surfaces with H = 0 are called minimal
surfaces, not only because of the history of the Laplacian, but also because they
are usually surfaces of minimal area for some boundary problem.
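The "average difference" intuition for the Laplacian can be checked directly. In two dimensions, the average of f over a small circle of radius ϵ exceeds f at the center by about (ϵ²/4)·Δf (the constant is 1/(2n) with n = 2). Here is a sketch with an example f of our own choosing:

```python
import math

# example function with Laplacian f_xx + f_yy = 2 + 6 = 8 everywhere
def f(x, y):
    return x * x + 3.0 * y * y + x * y + x

eps, n = 1e-3, 10000
# average of f over n equally spaced points on the circle of radius eps
avg = sum(f(eps * math.cos(2 * math.pi * k / n),
            eps * math.sin(2 * math.pi * k / n)) for k in range(n)) / n

# invert the mean-value relation: Laplacian ~ 4 * (avg - f(0,0)) / eps^2
lap_estimate = 4 * (avg - f(0.0, 0.0)) / eps ** 2
# lap_estimate should come out very close to 8
```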
Figure 2.13: The signs of k1 and k2 change with the other choice of N .
If we choose N1 as in the picture, k1 is positive, while if we use N2 , it is
negative. The absolute value stays the same however.
Now what about the Gauss curvature? Well, it has a very nice property. Its sign
is independent of our choice of N, unlike the signs of k1, k2, and H. First regard k1 and k2.
Figure 2.14: The Laplacian locally is just the average difference of f(x0)
and f(x0 + ϵe), where e ranges over all possible directions and ϵ is a parameter
that goes to 0.
Look at figure 2.4.2. There you should see how k1 changes depending on the choice
of our normal. If we choose N1 , then the normal and the curvature vector of the
curve belonging to k1 point in the same direction, if we choose the other direction,
they point in opposite directions.
The fact that the sign of H tells us nothing geometric either can be seen
algebraically quite easily.
H = k1 + k2 (2.28)
Therefore if we change the signs of k1 and k2 it is clear that the sign of H changes.
You can also look at this through the meaning of H. H is the two dimensional
Laplacian of the function that the surface is a graph of (in good coordinates it is.)
If we reverse the direction of N , we practically just take −f as the function M is a
graph of (locally). But the Laplacian is linear, which means it changes signs too and
since H is just the Laplacian, it does too. The Gauss curvature, on the other hand,
does not change sign. The important thing to note is that if you change the signs
of k1 and k2, their product, which is the Gauss curvature, does not change. The sign of the
Gauss curvature captures the relationship of the signs of the principal curvatures. If
they have the same sign (++ or −−), then K will be positive. Otherwise it will be
negative. The sign relationship of the principal curvatures is a very geometric thing.
In particular, if both principal curvatures have the same sign, then the curves that
they belong to curve in the same direction, and if they have opposite signs, in
opposite directions. In the first case you have something that looks like a bulge, in the
other something that looks like a saddle point, as you can see in figure 2.4.2.
Figure 2.15: Typical example for each of the possible signs of the Gauss
curvature. If it is positive, both curves curve in the same direction, creating
a bulge. Otherwise, they curve in opposite directions and create a saddle.
There is another way to express H and K, that might be more enlightening. If,
as always, we call the matrix of the second fundamental form (so the hessian of f )
A, then we get the result:
H = tr(A) (2.29)
K = det(A) (2.30)
as you should check yourself (exercise). These formulas are important, as they
illustrate the geometric character of H and K.
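As a quick sanity check of the sign discussion above, the sketch below (example graphs of our own choosing) computes H = tr(A) and K = det(A) for a bulge and a saddle, and shows that flipping N (i.e. replacing f by −f, which negates the Hessian) flips H but not K:

```python
def H_and_K(fxx, fxy, fyy):
    # H = trace and K = determinant of the Hessian matrix A
    return fxx + fyy, fxx * fyy - fxy * fxy

H_bulge, K_bulge = H_and_K(2.0, 0.0, 2.0)     # f = x^2 + y^2: both curve up
H_saddle, K_saddle = H_and_K(2.0, 0.0, -2.0)  # f = x^2 - y^2: saddle

# reversing N replaces f by -f, i.e. negates every Hessian entry
H_flip, K_flip = H_and_K(-2.0, 0.0, -2.0)
# expect: K_bulge > 0, K_saddle < 0, H_flip = -H_bulge, K_flip = K_bulge
```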
    D²(f) = ⎛1/R   0 ⎞   (2.31)
            ⎝ 0   1/R⎠
from which our results from above follow. If you actually did the calculation,
you are probably happy about the power of geometry.
• The point p for which we are calculating the second fundamental form
is a fixed point: S(p) = p.
• The tangent plane Tp M is preserved by S, which means S(Tp M) =
Tp M. In other words, the plane P through which we reflect our space is
perpendicular to Tp M.
Then, as I hope you can agree, S acts on the tangent-plane like a reflection
through a line L, which passes through the origin. Then the direction of
this line as well as the direction orthogonal to it are the principal axes of
curvature.
You can see the setup of the situation in figure 2.5 and the way the symmetry
acts on the tangent plane in figure 2.5. I hope you can appreciate how useful this
Figure 2.16: A surface that is symmetric under a reflection across the plane
P . The plane P is perpendicular to the tangent plane Tp M , and the line of
intersection of the two planes (L) is also drawn in.
theorem is. It is very easy to recognize a reflective symmetry like the one we need,
and if we do find it, we have pretty much immediately found the axes of curvature.
We then only need to calculate Ap for two combinations of e∥ and e⊥ to specify
Ap completely, and often these are quite easy to find (like with the sphere).
Figure 2.17: Here we drew the tangent plane and the principal axes of cur-
vature. The symmetry acts as a reflection along the line L and the two
directions e∥ , e⊥ turn out to be the axes of curvature (per the theorem we
will prove in this section.)
Proof. Let e⊥ and e∥ be an orthonormal system so that the first vector is
orthogonal to L and the second is parallel. We need to prove that in these two
axes, A is diagonal. We will do this by contradiction. First, note that if k1 = k2,
then A = k1 I, which already is diagonal, so we can assume k1 ̸= k2. We will
show that any other orthonormal basis cannot diagonalize A8.
Step 1 Our first claim is that Ap cannot change under the symmetry, as it is a
geometric object. As a formula, we claim
    Ap(S(X), S(Y)) = Ap(X, Y)
This is easy to see geometrically, because the curves flip under the sym-
metry.
Step 2 Let's assume there is another orthonormal basis of Tp M which diagonal-
izes A, and let's call these basis vectors e1, e2. Then S(e1), S(e2) also diago-
nalize A. This is easy to see because of the previous step: A(S(e1), S(e2)) =
A(e1, e2) = 0 and A(S(e1), S(e1)) = A(e1, e1) = k1, and similarly for the
other combinations. A in that basis has to have the form
    A = ⎛k1   0⎞   (2.33)
        ⎝ 0  k2⎠
Step 3 We now want to show that S(e1 ) cannot lie in the direction of e2 and vice
versa. Well, we know that the maximal and minimal values of curvature
8 This is why we excluded the case k1 = k2, because any orthonormal basis diagonalizes A
in that case.
lie in directions that are perpendicular to each other, see figure 2.5 for
a picture of an Ap with k1 ̸= k2 . It should be clear that the maximum
(k1 ) cannot occur for a vector between e1 and e2 and the same goes for
the minimum. Therefore S(e1 ) can also not have a component in the e2
direction.
Step 4 But then S(e1) has to lie along e1 and be of unit length (since S is an
isometry), and the same goes for e2. So we get the possibilities:
    S(e1) = ±e1,   S(e2) = ±e2
Step 5 But which vectors have this property? Well, S is a reflection along L, so
the only vectors with this property are the ones that lie along L, which
get mapped onto themselves, and the ones lying on the line perpendicular
to L (L⊥), as you can see in figure 2.5. Therefore:
    e1, e2 ∈ {±e∥, ±e⊥}   (2.37)
This theorem is very useful when computing curvatures and we will now provide
a few examples.
9 The fact that we can also have a ± before the vectors in the set is not important, as we
can always replace a basis vector by its negative without changing A(ei, ej).
Figure 2.19: The tangent plane (drawn twice) and the way vectors get
mapped onto other vectors when you reflect along L. The vectors drawn
in purple are the only ones that get mapped onto the same line that they
already lay on.
The catenoid is the surface of revolution of the function cosh(z). You take
the graph of cosh(z) as a function of z and rotate it around the z-axis to get
a surface that looks a bit like a sci-fi picture of a wormhole, as you can see
in figure 2.5.1. We have drawn in a point p lying on the "original" graph of
cosh(z), the one that points in the x direction. Its position on that line is arbitrary
though. For that point, we can use the xz-plane exactly like with the cylinder
(although the orientation is different) and get that the line tangent to the
graph is the one the reflection preserves. So we know one of the principal
directions: the one tangent to the graph. The other one is parallel to the y-axis,
since it needs to be orthogonal and lie in the tangent plane. We leave it as an
exercise to you to show that the principal curvatures have opposite signs, that
their values are k1 = 1/cosh²(z), k2 = −1/cosh²(z), and that H = 0 (the surface
is minimal). Since K < 0, the surface locally looks like a saddle and never
bulges locally.
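You can check the claim H = 0 numerically. The sketch below (our own parameterization choice σ(z, t) = (cosh z cos t, cosh z sin t, z), with finite-difference partials) evaluates the classical formula H = (eG − 2fF + gE)/(2(EG − F²)) at an arbitrary point:

```python
import math

def sigma(z, t):
    # catenoid: revolve the graph of cosh around the z-axis
    return (math.cosh(z) * math.cos(t), math.cosh(z) * math.sin(t), z)

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def mean_curvature(z, t, h=1e-4):
    s = sigma(z, t)
    # first and second partial derivatives via central differences
    sz  = tuple((p - q) / (2*h) for p, q in zip(sigma(z+h, t), sigma(z-h, t)))
    st  = tuple((p - q) / (2*h) for p, q in zip(sigma(z, t+h), sigma(z, t-h)))
    szz = tuple((p - 2*r + q) / h**2 for p, r, q in zip(sigma(z+h, t), s, sigma(z-h, t)))
    stt = tuple((p - 2*r + q) / h**2 for p, r, q in zip(sigma(z, t+h), s, sigma(z, t-h)))
    szt = tuple((sigma(z+h, t+h)[i] - sigma(z+h, t-h)[i]
                 - sigma(z-h, t+h)[i] + sigma(z-h, t-h)[i]) / (4*h**2) for i in range(3))
    nvec = cross(sz, st)
    nl = math.sqrt(dot(nvec, nvec))
    nvec = tuple(x / nl for x in nvec)              # unit normal
    E, F, G = dot(sz, sz), dot(sz, st), dot(st, st)  # first fundamental form
    e, f, g = dot(szz, nvec), dot(szt, nvec), dot(stt, nvec)  # second fundamental form
    return (e * G - 2 * f * F + g * E) / (2 * (E * G - F * F))

Hval = mean_curvature(0.5, 1.0)
# Hval should vanish up to finite-difference error: the catenoid is minimal
```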
Before continuing our discussion of curvature, we want to stop for a minute and
discuss how one defines derivatives on surfaces and fix the notation. We start by
introducing standard calculus notation and then defining derivatives on surfaces.
Proposition
Let f be a smooth function, and let γ be a curve with γ(0) = x and γt(0) = X. Then
    DX f(x) = d/dt f(γ(t))|t=0
This equation is very useful when calculating derivatives, but also quite intuitive.
Simply put, to differentiate a function in the direction of X, take any curve that
has (at the point of interest) its tangent vector equal to X and differentiate along that
curve. Locally, they will look the same, after all. You can look at figure 2.6 for an
example.
Proof. The proof of this Proposition is very simple. You basically use the chain
rule once, and get exactly what you need.
    d/dt f(γ(t))|t=0 = Df(γ(0)) γt(0) = Df(x) X = DX f(x)   (2.43)
The last equation is just the definition of DX f (x). Notice that the multiplication
is a matrix multiplication.
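The "any curve will do" point can be illustrated numerically. The sketch below (example f, p, and X of our own choosing) differentiates f along a straight line and along a deliberately bent curve with the same tangent vector at t = 0; both match ∇f · X:

```python
# example function and its gradient, computed by hand
def f(x, y):
    return x * x * y + y

p, X = (1.0, 2.0), (0.5, 1.0)
grad = (2 * p[0] * p[1], p[0] * p[0] + 1)   # grad f at p
exact = grad[0] * X[0] + grad[1] * X[1]     # D_X f(p)

def along(curve, h=1e-6):
    # central difference of t -> f(curve(t)) at t = 0
    return (f(*curve(h)) - f(*curve(-h))) / (2 * h)

# two curves with the same position and tangent at t = 0
line = lambda t: (p[0] + t * X[0], p[1] + t * X[1])
bent = lambda t: (p[0] + t * X[0] + 3 * t * t, p[1] + t * X[1] - t * t)
# along(line) and along(bent) both approximate `exact`
```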
We can also define the derivative of one vectorfield in the direction of another
vectorfield, which we do pointwise.
What we get is another vectorfield. Note that the derivative at p depends on the
value of X only at p, but that the same is not true for Y. This is simply because
we are interested in the change of Y along X(p). From Y we need a few
values so we can calculate the change (that is, a neighborhood of p), while we only
need X(p) for the direction.
The first kind of vectorfields of course contains the second, but the second is
a bit special. It is special insofar as geometrically small vectors (|v| < ϵ for
2.6. INTERLUDE: A BIT ABOUT DIFFERENTIATION 81
Figure 2.22: The catenoid and the lines we need to calculate the
principal axes of curvature using symmetry, for some point p.
some ϵ) lie in the surface, because locally the surface looks like the tangent plane.
You can see the difference between the two types in figure 2.6.2.
How can we differentiate a vectorfield along a surface M? Let's say we have
vectorfields X, Y defined on M and want to differentiate Y with respect to X,
where X of course has to be tangent to M. How can we proceed? There are a
few possibilities. We could, for example, take a curve that at t = 0 is at p and has
the tangent vector X(p), use the proposition from two sections ago, and repeat
for all the points. We will however use a different (but equivalent) way. We will
extend X, Y to smooth vectorfields X̃, Ỹ on an open set U ⊂ R3, so that X̃, Ỹ
restricted to M match X, Y, and then define the derivative to be DX Y = DX̃ Ỹ.
Figure 2.23: An example for the intuition behind the Proposition, for an
f : R2 → R. If you move along X and the curve γ with a very tiny step (h
is very small) the values you get will be very similar, which is why you get
the same result.
Proof. We know that DX̃ Ỹ depends only on the value X̃(p) = X(p) and on the
values of Ỹ in a small neighborhood of p (a neighborhood in R3). Therefore the
independence of the extension of X is given. We also know that:
    DX̃ Ỹ = d/dt Ỹ(γ(t))|t=0   (2.46)
for some γ(t) with γ(0) = p and γt (0) = X(p). We can use any smooth curve
that satisfies the two conditions, in particular we can take a curve in M , so that
the expression becomes:
With this, we are done with the interlude and return to curvature. We can
characterize curvature via these derivatives, as stated in the following theorem.
The first equality says that we can characterize curvature by how vectorfields
change along M, specifically by the component of that change normal to the surface.
The second one tells us that we can characterize curvature by how the normal
vectorfield changes along M.
We will prove this theorem, but we want to first discuss a few ways to see
this intuitively. We already saw that a tangent vectorfield has to change to stay
tangent to M . We can also see this, not as Y changing, but the tangent-plane
rotating as you move on M , which you can see in figure 2.7. This should be clear,
since the tangent plane is in one sense the first linear approximation of M, and if
M curves then the tangent plane also has to rotate. You can also see this as all
the possible tangent vectors changing to accommodate M's curving, and since the
tangent plane is built out of all of these vectors, it has to change as well. You can
also describe the change of the tangent plane by describing how the normal vector
changes, since it has to change with the tangent plane. This is how you get the
second equality. It is clear that DX N has all the information about how N, and
therefore Tp M, changes, and this is why this map gets a special name: it is called
the Weingarten map.
There are a few things to note about the Weingarten map, that one can see
quite easily. Firstly, Wp is self-adjoint, because Ap is symmetric.
Figure 2.27: Vectors tangent to M live in the tangent plane at the point they
come out of, which locally, you can imagine, as them living in the surface.
2.7. ANOTHER CHARACTERISATION OF CURVATURE: THE WEINGARTEN MAP87
Secondly, we can once again use that a unit normal vector cannot change in its own
direction, and conclude that Wp really maps into Tp M, because ⟨Wp(X), N⟩ = 0.
You can see this, by applying the same calculation we already did quite often. We
know that, ⟨N, N ⟩ = 1 since the unit normal is of unit length. We can therefore
calculate:
0 = DX ⟨N, N ⟩ = 2⟨DX N, N ⟩ (2.52)
which is exactly the claim we just described.
The example showed the way in which you can sometimes use the Weingarten
map to calculate something really fast. Imagine doing the calculation of Ap by
using the Hessian matrix. There are so many derivatives that you'd probably make
a small mistake, and at the very least it would take way longer.
We will actually prove the second part first, as it is both easier and we need it to
prove the first part.
Proof. Let us first prove the second claim. It is not too hard; it's the same
calculation we have already done many times. We know that ⟨Y, N ⟩ = 0
everywhere, because Y is a tangent vector field and N is a unit normal field,
and they are, by construction, orthogonal to each other. Then we get:
as promised.
Proof. The setup of our proof is drawn in (a), which is the same as in the proof
that Qp is the same as Ap . This time we want to extend Y to a vector that is
in the tangent-plane of q. In (b) we find that our first step will be extending Y
to be constant on the tangent-plane.
Figure 2.28: Even the most boring tangent vectorfield like Y has to change
a minimal specific amount to stay on M .
Step 3 We now think about which extension might be good to work with. We
want to extend X, Y to a surrounding of p on M that contains the typical
point q ∈ M . The choice we will make is to ensure that the extensions
X̃(q), Ỹ (q) are both tangent to M at q, or in other words, that they are
elements of Tq M .
We can construct our vectors10 , first, by extending them to constant vec-
tors on Tp M . So we take Ỹ (q) = (Y 1 , Y 2 , ?), where we don’t know the
third component yet. We know that q = (x1 , x2 , f (x1 , x2 )) if it has coor-
dinates x1 , x2 on Tp M . When is a vector in q’s tangent space? Well, if we
have a graph, like f , we can see what we have to do by quickly looking at
a one-dimensional example, which you can see in figure 2.7.2. Very simi-
larly to the figure, a vector (Y 1 , Y 2 , Y 3 ) will be in the tangent space
Tq M of q if Y 3 = df (Y 1 , Y 2 ). So our extension becomes:
\[ \tilde Y(q) = \Big(Y^1,\ Y^2,\ \sum_{j=1}^{2} \frac{\partial f}{\partial x^j}(x^1, x^2)\, Y^j\Big) \]
Step 6 Calculate \(D_X Y\big|_p\). We get:
\[
D_X Y\big|_p = D_{\tilde X}\tilde Y\big|_p = \sum_{i,j=1}^{3} \tilde X^i \left.\frac{\partial \tilde Y^j}{\partial x^i}\right|_p e_j \tag{2.61}
\]
\[
= \sum_{i=1}^{3} X^i \left.\frac{\partial}{\partial x^i}\right|_p \big(\tilde Y^1, \tilde Y^2, \tilde Y^3\big) \tag{2.62}
\]
\[
= \sum_{i=1}^{3} X^i \left.\frac{\partial}{\partial x^i}\right|_p \Big(Y^1,\ Y^2,\ \sum_{j=1}^{2} \frac{\partial f}{\partial x^j}(x^1, x^2)\, Y^j\Big) \tag{2.63}
\]
\[
= \sum_{i=1}^{3} \sum_{j=1}^{2} \Big(0,\ 0,\ X^i \left.\frac{\partial}{\partial x^i}\frac{\partial f}{\partial x^j}(x^1, x^2)\right|_p Y^j\Big) \tag{2.64}
\]
\[
= (0,\ 0,\ X H Y) \tag{2.65}
\]
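The computation can be checked numerically. The sketch below is our own construction, with an arbitrarily chosen f whose Hessian at the origin is [[2, 3], [3, 4]]: it builds the extension Ỹ from step 3 and verifies by a central difference that D_X Y at p = 0 is (0, 0, XHY), as in (2.65).

```python
# f = x^2 + 3xy + 2y^2 has f(0) = 0, df(0) = 0, Hessian [[2, 3], [3, 4]];
# this choice is purely illustrative.
def fx(x, y): return 2 * x + 3 * y    # partial derivatives of f, by hand
def fy(x, y): return 3 * x + 4 * y

X = (1.0, 2.0)    # tangent vectors at p = 0 (their third component vanishes there)
Y = (0.5, -1.0)

def Y_ext(x, y):
    # the extension of step 3: constant in the first two components,
    # third component df_q(Y^1, Y^2) so that Y_ext(q) is tangent at q
    return (Y[0], Y[1], fx(x, y) * Y[0] + fy(x, y) * Y[1])

# D_X Y at p = 0 via a central difference along the direction X
h = 1e-6
a = Y_ext(h * X[0], h * X[1])
b = Y_ext(-h * X[0], -h * X[1])
DXY = tuple((u - v) / (2 * h) for u, v in zip(a, b))

# compare with (0, 0, X^T H Y) from eq. (2.65)
Hf = ((2.0, 3.0), (3.0, 4.0))
xhy = sum(X[i] * Hf[i][j] * Y[j] for i in range(2) for j in range(2))
print(DXY, xhy)  # DXY is approximately (0, 0, -7.0); xhy = -7.0
```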
\[
k = \frac{u_{xx}}{\big(1 + u_x^2\big)^{3/2}} \tag{2.68}
\]
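As a quick sanity check of (2.68) (our own examples, not from the script): for the parabola u = x² at x = 0 the formula gives k = 2, and for the graph u = √(1 − x²) of the upper unit half-circle it gives k = −1 at every x, matching a circle of radius 1 up to the sign coming from u_xx < 0.

```python
import math

def k(ux, uxx):
    # curvature of the graph y = u(x), eq. (2.68)
    return uxx / (1 + ux ** 2) ** 1.5

# parabola u = x^2 at x = 0: u_x = 0, u_xx = 2
print(k(0.0, 2.0))  # -> 2.0

# upper unit half-circle u = sqrt(1 - x^2), derivatives computed by hand
x = 0.3
ux = -x / math.sqrt(1 - x ** 2)
uxx = -1.0 / (1 - x ** 2) ** 1.5
print(k(ux, uxx))  # -> -1.0 up to rounding
```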
The first, in principle, just describes the curvature vector of a parameterized curve.
The second one describes the curvature of a curve as a graph. We will start with
the first.
We can get a basis of Tp M at each point p, by taking the basis vectors of the
coordinate space e1 , e2 and applying dF |p onto them. That way we get two new
vectors (at each p):
Xi = dF |p(ei ) (2.70)
which (you should check) turn out to be basis-vectors of Tp M . We can also write
the Xi ’s out:
\[ X_i = \left(\frac{\partial F^\alpha}{\partial x^i}\right)_{\alpha=1}^{3} \tag{2.71} \]
We change notation a bit (to make the formula we will get readable in the end), so
that for \(\partial F^\alpha / \partial x^i\) we write \(F_i^\alpha\). Similarly, we use \(F_{ij}^\alpha\) for second derivatives, etc.
We can then define the metric, which is a tool that tells us how to transform
from changes of coordinates to changes in lengths:
\[ g_{ij} = \langle X_i, X_j \rangle = F_i^\alpha F_j^\alpha \]
In actuality, gij is a collection of four maps, but it will turn out to be a tensor
(in later chapters), and a mighty object in differential geometry, so we call it one
map. Let us also define g ij to be the inverse of gij , if you see gij as a matrix. Then
for a parameterized surface we get the following formulas for some curvatures:
Figure 2.31: The setup of our proof is drawn in (a), which is the same
as in the proof that Qp is the same as Ap . This time we want to extend
Y to a vector that is in the tangent-plane of q. In (b) we find that our
first step will be extending Y to be constant on the tangent-plane.
\[ W = g^{-1} D^2 F \cdot n \tag{2.73} \]
\[ H = \operatorname{tr}(W) \tag{2.75} \]
or written out:
\[ H = \sum_{i,j}\sum_{\alpha} g^{ij} F_{ij}^\alpha n^\alpha \tag{2.76} \]
or written out:
\[
H = n^\alpha\, \frac{F_2^\beta F_2^\beta F_{11}^\alpha - 2\, F_1^\beta F_2^\beta F_{12}^\alpha + F_1^\beta F_1^\beta F_{22}^\alpha}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \tag{2.77}
\]
where all indexes repeated twice are summed over (Einstein convention).
For the Gauss-curvature K we get a similar result:
\[
K = \det(W) = \frac{F_{11}^\alpha n^\alpha\, F_{22}^\beta n^\beta - (F_{12}^\alpha n^\alpha)^2}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \tag{2.78}
\]
The formulas might look horrible, which in a way they are. But they are certainly
useful, because often you know F and can simply plug the derivatives in and get H
and K. It’s a formula which is very easy to implement on a computer.
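To back up that claim, here is a sketch of such an implementation. It is entirely our own: the unit-sphere parametrization and the finite-difference derivatives are illustrative choices, not the script's. With the normal F₁ × F₂ pointing outward, the unit sphere should give K = 1 and H = −2 (the sign of H flips with the choice of normal).

```python
import math

def F(t, p):
    # unit-sphere parametrization, chosen only for the test; any F works here
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

h = 1e-5

def d(f, i, t, p):
    # central-difference partial derivative of a tuple-valued f
    dt, dp = (h, 0.0) if i == 0 else (0.0, h)
    a, b = f(t + dt, p + dp), f(t - dt, p - dp)
    return tuple((x - y) / (2 * h) for x, y in zip(a, b))

def dd(i, j, t, p):
    # second partials F_ij by nesting the first-derivative stencil
    return d(lambda t_, p_: d(F, j, t_, p_), i, t, p)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def curvatures(t, p):
    F1, F2 = d(F, 0, t, p), d(F, 1, t, p)
    nr = cross(F1, F2)
    n = tuple(x / math.sqrt(dot(nr, nr)) for x in nr)          # unit normal
    g = [[dot(F1, F1), dot(F1, F2)], [dot(F2, F1), dot(F2, F2)]]
    b = [[dot(dd(i, j, t, p), n) for j in (0, 1)] for i in (0, 1)]
    det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    # H = tr(g^{-1} b) as in (2.75)/(2.76), K = det(b)/det(g) as in (2.78)
    H = (g[1][1] * b[0][0] - 2 * g[0][1] * b[0][1] + g[0][0] * b[1][1]) / det_g
    K = (b[0][0] * b[1][1] - b[0][1] * b[1][0]) / det_g
    return H, K

H, K = curvatures(1.0, 0.5)
print(H, K)  # for the unit sphere: H close to -2, K close to 1
```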
By the way, if you know H and K, you can figure out the principal curvatures
k1 and k2 quite easily. Then you can realize that:
\[ H = k_1 + k_2, \qquad K = k_1 k_2 \]
so k1 and k2 are the two roots of \(\lambda^2 - H\lambda + K = 0\), that is
\(k_{1,2} = \tfrac{1}{2}\big(H \pm \sqrt{H^2 - 4K}\big)\).
Let's say you have a function z = f (x, y), so that the graph of f is the
surface M for which you want to calculate H and K. Let's denote the
partial derivative with respect to x by fx , as usual. Then you can write H and K as:
\[
H = \frac{(1 + f_y^2)\, f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2)\, f_{yy}}{\big(1 + f_x^2 + f_y^2\big)^{3/2}}, \qquad
K = \frac{f_{xx} f_{yy} - f_{xy}^2}{\big(1 + f_x^2 + f_y^2\big)^{2}}
\]
As in the previous case, if you know H and K, you can solve for the principal
curvatures by the same calculation.
Notice the similarity of both cases to the formulas for a curve. Specifically, notice
how both of the formulas for the case of a graph have a correction term (in the
denominator) similar to the case of curves.
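The graph case can be sketched in code as well. The formulas below are the standard graph-of-a-function formulas, stated with the H = tr(W ) = k₁ + k₂ convention used above; they are not quoted from the script.

```python
def graph_curvatures(fx, fy, fxx, fxy, fyy):
    # standard H and K of the graph z = f(x, y) from its derivatives,
    # with the convention H = k1 + k2
    w = 1.0 + fx * fx + fy * fy
    H = ((1 + fy * fy) * fxx - 2 * fx * fy * fxy + (1 + fx * fx) * fyy) / w ** 1.5
    K = (fxx * fyy - fxy * fxy) / w ** 2
    return H, K

# paraboloid f = (x^2 + y^2) / 2 at the origin: k1 = k2 = 1
print(graph_curvatures(0.0, 0.0, 1.0, 0.0, 1.0))  # -> (2.0, 1.0)
```

Note how the denominator terms (1 + f_x² + f_y²) play exactly the role of (1 + u_x²) in the curve formula (2.68).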
Figure 2.32: The idea of the third step of the proof, but in the one-
dimensional case. A vector (x1 , x2 ) is in the tangent space (lies in the
tangent of f at p) if \(x^2 = df(x^1) = \frac{df}{dx}\, x^1\).
Imagine that you are an ant on some surface, like the one in figure 2.9 and you
cannot see ”outside”, into the third dimension. You stay entirely confined to the
surface, which is your world. Imagine also that you are a curious, very smart ant.
You also have a measuring tape, with which you can measure lengths (infinitely
precisely) and have access to infinite computational power, as we said, you are very
very smart. In this world, light can obviously not go in straight lines (unless the
surface is flat), so we say it goes as straight as it can, and what the ant sees is a
result of this. You can see all of this drawn in figure 2.9. From this idea, that is,
what the ant can understand about the world it lives in from measuring lengths and
looking around, we can define what we mean as an intrinsic quantity on the surface.
Intrinsic quantities, simply said, are those that you can infer from measuring lengths
only.
The most basic such quantity is the length of a curve γ : [a, b] → M on the surface:
\[ L(\gamma) = \int_a^b |\dot\gamma(t)|\, dt \]
where \(|\dot\gamma(t)|\) uses the norm in three-dimensional space. We know from our treatment
of curves that this is independent of parametrization, and only depends on the
image of γ. It is something intrinsic, that the ant can measure.
From this we can define the distance between two points on the surface M .
We simply pick the curve connecting two points that has the smallest length (if it
exists), otherwise the infimum.
\[ d_M(p, q) = \inf_{\gamma} L(\gamma) \]
for curves γ that connect p and q. Usually, a curve connecting the two with the
property that its length is the distance between the points exists, but sometimes it
does not. That is why we define it as an infimum13 . We call a curve with the above
property a geodesic.
The surface, equipped with these lengths as distances, is a metric space, as is
quite easy to show (exercise).
external structure of the R3 that the surface might live in, which is not something
the ant can ever experience. You can see the difference in the following figure. We
Figure 2.33: The way we extend Y is by using the same (but two-
dimensional) construction from figure 2.7.2.
This g in turn defines both dM and Lγ on the surface, i.e. it is the only thing
you need to calculate L(γ) on the surface for any curve γ. It is also determined
by L(γ) or dM . This can be proven, but we will abstain from doing so until
later chapters. Our ant can therefore find it; it exists without any reference to the
ambient space R3 . We call anything that can be deduced from g in a sensible way
intrinsic.
Note 6. Notice how both g(p) and A(p) are bilinear forms on Tp M . This is why
the first is called the first fundamental form, and the second is called the second
fundamental form. If you are wondering, there is also a third fundamental form,
which was introduced by Gauss, but it is not used much nowadays, because it
doesn't really add any new information and can be calculated from the first two.
Figure 2.34: The parameterization of the sphere. The lines we usually draw
on the sphere, are, as you know, the lines of constant longitude and latitude,
which are lines inherited from the coordinate space. We can also get basis-
vectors of Tp M for all p from the coordinate space, using the vectors Xi =
dF |p (ei ) as basis vectors of Tp M at p.
Figure 2.35: A surface, on which our very smart ant lives. It has a measuring
tape, and (somehow) access to infinitely much computational power, so that
it can figure out as much about the world it lives in, as possible.
2.10 Intrinsic Isometries and Gaussian Curvature
Let (M, gM ) and (N, gN ) be two surfaces, with their own respective metrics.
We call a function ϕ : M → N an (intrinsic) isometry, if it is bijective,
smooth, and preserves distances. That is, for any two p, q ∈ M , if p̃ = ϕ(p)
and q̃ = ϕ(q), then
dM (p, q) = dN (p̃, q̃) (2.85)
or equivalently, if for any curve γ in M :
\[ L_M(\gamma) = L_N(\varphi \circ \gamma) \]
Figure 2.36: A surface and the two different distances we can define. The
first one in (a) comes from the surface itself, and can at least in principle
be measured by our ant. The second one in (b) comes from the structure
of R3 and is, in general, not equal to the one in (a), and the ant can never
feel this length.
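The difference can be made concrete on the unit sphere (our own example; the closed-form great-circle distance arccos⟨p, q⟩ is a standard fact we take for granted, not something derived in the script): the ant's intrinsic distance is the great-circle arc length, while the ambient R³ distance is the chord.

```python
import math

def chord(p, q):
    # the distance in the ambient R^3: straight through the sphere
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def great_circle(p, q):
    # the intrinsic distance on the unit sphere: along the surface
    return math.acos(sum(a * b for a, b in zip(p, q)))

p = (1.0, 0.0, 0.0)
q = (0.0, 1.0, 0.0)
print(chord(p, q), great_circle(p, q))  # -> sqrt(2) = 1.414..., pi/2 = 1.570...
```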
The cone, similarly to the cylinder, can, locally, be unfurled into a flat piece
of paper. Exercise: Try both examples with paper.
In general, any surface that can be unrolled into a piece of paper is called
developable, and through any point there will be a straight line (in the ambient
space sense) on the surface, so that k1 = 0 and K = 0.
This hints at the following result:
So by measuring angles, the ant can figure out if the space she lives in is curved
or flat! Another way she can do this is by using circles, not triangles. A circle, in
her world, is the set of points equally distant from some point (”radius r”), which
she of course calls the centre of the circle. If she lives in flat space, she will find
that the area of the circle will be πr2 . But if she lives in a curved space, this no
longer has to be true; take a look at figure 2.10.
One can show that the area will be:
\[
A_p(r) = \pi r^2 - \frac{\pi}{12} K(p)\, r^4 + \cdots \tag{2.89}
\]
which is not πr2 !
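The ant can even turn (2.89) around and estimate K from a measured area: for small r, K(p) ≈ 12(πr² − A_p(r))/(πr⁴). On the unit sphere, the area of a geodesic circle of intrinsic radius r is 2π(1 − cos r) (a standard fact we assume here, not derived in the script), and the estimate indeed recovers K = 1:

```python
import math

r = 0.01
A = 2 * math.pi * (1 - math.cos(r))     # geodesic-circle area on the unit sphere
K_est = 12 * (math.pi * r ** 2 - A) / (math.pi * r ** 4)
print(K_est)  # -> close to 1, the Gauss curvature of the unit sphere
```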
Note 7. There is a formula for K in terms of the metric, but it is rather long
and not very useful for our purposes right now.
\[
K = g^{ij} g^{kl} \frac{\partial^2 g_{ij}}{\partial x^k\, \partial x^l} + \dots \tag{2.90}
\]
This is obvious for the case of the sphere (it has genus zero, so χ(S) = 2), and
you can just give it the usual metric of the sphere. Very surprising is the result that
one can do this with the torus. You cannot do it by embedding the torus in three
dimensions, but you can do it by embedding it in four.
The idea is presented in figures ?? and ??.
With this last theorem, we move away from curves and surfaces, and move
towards the theory of differential geometry on manifolds, making a short stop along
the way to catch up on important concepts from topology.
Figure 2.42: On a sphere, the area of what the ant would call a circle is not
πr2 .
Figure 2.43: Examples for surfaces with genus 0, 1, 2. The genus is the
number of tunnels the surface has.
Part II
Manifolds
We have discussed the geometry of curves and surfaces extensively in the first
part of the lecture. We saw many ideas, like curvature and intrinsicness, that became
very big themes and useful tools. The goal of this lecture is to extend these ideas
to general (geometric) spaces and see some of the fantastic results and tools that
this approach brings. Before we can do that, however, we need a bit of technical
knowledge. For curves and surfaces, to define our tools, we needed to use certain
concepts constantly. Continuity, open sets, neighbourhoods, vectors, tangent vec-
tors and coordinates are only some of these. Usually, they were rather trivial to handle,
because we stuck to the ambient space Rn and we know all of these concepts in
Rn quite well from courses like calculus. A vector is, in the most simple form, just
an arrow in Rn and is easy to understand. But we want to go beyond this simple
idea of our geometric things sitting in Rn . As a particular example, our world,
according to general relativity is a four-dimensional space that is not R4 but also
does not live in any Rn . We want to extend ideas like surroundings and vectors
to abstract spaces which do not necessarily sit in some embedding space. This is
the topic of this part of the lecture. Our first step will be open sets since we need
them for basic things like continuity. The topic of open sets belongs to a broad
field in mathematics, called topology, which we will explore briefly14 . Afterwards,
we will handle coordinates and charts and define exactly what we mean by a smooth
manifold. Finally, we will talk about vectors in the settings of manifolds.
14 Any mathematician who has had a good lecture on topology can skip that part safely.
Chapter 3

Topology and topological manifolds
Where do these definitions come from? Well, the first idea is that an open set,
intuitively speaking, is one that has no border, while a closed one does. You can
clearly see that for simple sets, like the ones in figures 3.1 - 3.1, this is true.
Figure 3.1: A set without a border. No matter where, no matter how close
to the border, you can always find a small enough radius so that the ball
around that point is still entirely contained in X.
We can therefore see, that at least in these examples, the definition matches
our intuition and we can therefore accept them and see what other sets are open
and closed in that case.
We can easily see that a set consisting of a single point is closed (it is its own
border) and that often (but not always) the question of open/closed comes down
to whether we include a border in our set or not.
We want to point out a few ideas that follow from our definitions below.
Figure 3.2: If you have a piece of the border, then you cannot do the same
thing as we did in the previous example. Any ball around a point on the border
will, by definition, always contain a bit of X and a bit of the rest of Rn . So if
you have a piece of the border, the set cannot be open, which matches our
intuition.
Figure 3.3: If you have a set with a border, you can quite easily convince
yourself that its complement is open, since no part of the border is in the
complement.
• Any intersection of closed sets is closed. This follows from the second
item and the definition.
You can see the two examples given in the proposition in figure 3.1.
We can also express continuity of a function through open sets.
Figure 3.4: On the left you have the union of a few open sets. Any
point in the combined set is still surrounded by a ball (circle) containing only
points of the combined set because the ball (circle) from the original set the
points of the combined set because the ball (circle) from the original set the
point comes from works for this. On the right, you have the intersection of
many balls (circles) whose radius is getting smaller and smaller. (All open).
After intersecting all of them, you get a set containing only the origin, which
is not open anymore.
• ∅, X ∈ T
• If Ui ∈ T for some indexation I, then \(\bigcup_{i \in I} U_i \in T\).
• If U1 , . . . , Uk ∈ T for finitely many indexes, then \(U_1 \cap \dots \cap U_k \in T\).
This definition might look slightly different, but it is exactly the three ideas we
wanted to steal from Rn . Indexation here simply means that we have a set (for
example {1, 2, 3, . . . }) that we use as indexes; finite means that the index set has
a finite number of elements.
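For a finite set, you can check the three requirements directly by brute force. A toy example (entirely our own): X = {1, 2, 3} with T = {∅, {1}, {1, 2}, X}.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
T = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

assert frozenset() in T and X in T            # first requirement
for k in range(1, len(T) + 1):                # unions (all of them, since T is finite)
    for sets in combinations(T, k):
        assert frozenset().union(*sets) in T
for A in T:                                   # finite intersections: pairs suffice
    for B in T:
        assert A & B in T
print("T is a topology on X")
```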
We can immediately define continuity of a function between two topological
spaces.
But before we go into all that detail, which, while interesting, is not the direct
subject of the lecture, we want to talk about a few more ideas that will be more
directly relevant to us.
Figure 3.5: The earth and a map of the earth. We can have a map of only
part of the earth and as we know all of the maps we have cannot represent
the geometry of the earth well. A common example is that the size of
Greenland in the Mercator map is similar in size to Africa, even though in
reality it is about one-fifteenth of Africa's size.
Well, mathematically, what we need is (1) the piece of the space the chart
describes, (2) the piece of Rn you draw the map in and (3) a way to assign every
point in (1) a point in (2). The latter is of course just the description of a function.
We sometimes call Ch the chart and sometimes (U, Ch) the chart (as a tuple),
depending on which one is most convenient1 .
With this definition, we have fulfilled both (1) and (3), and (2) is just Ch(U ).
As you can see in the definition, we require bijectivity, that is, we don’t want our
map to have two points on the map corresponding to the same point in the space,
nor do we want two points on the space being shown as one on the map.
We can also see this more topologically, by introducing the notion of a homeo-
morphism.
Figure 3.6: A topological space X with an open subset U and its map/chart
onto a portion Ch(U ) of Rn . The coordinate lines from the map can be
projected back onto the space, giving us coordinates for the space.
Or, simply put, the open sets on S 2 are exactly the pieces that bigger open
subsets of R3 cut out on the sphere.
There exists a standard chart, which is simply the chart that describes the
sphere by longitude and latitude. It’s easier to write the parametrization
down first, so we will start with that.
\[ P(\theta, \phi) = \big(\cos(\phi)\sin(\theta),\ \sin(\phi)\sin(\theta),\ \cos(\theta)\big) \]
From this, we can construct a chart, by taking the inverse. You can work this
out quite easily. If we call the subset of the sphere, that the parametrization
points to, U , then we get:
\[ Ch(x, y, z) = (\theta, \phi) = \big(\arccos(z),\ \arctan(y/x)\big) \]
with the extension of the arctan to the case where x = 0. You can check
(or should know) that this is a bijective continuous function so (U, Ch) is
a chart. There are, however, a myriad of other charts you could use for the
sphere, some of which we will introduce later.
We have developed the tools to add another condition for what we want to work
with, calling the new type of space a topological manifold. The new requirement
will be quite simple. We will want to be able to use charts. We want to require
that the whole space can be covered by charts, i.e., that there is no point in the
space, where you cannot construct a chart for any of its surroundings2 . We do not
require, however, that there is one chart that covers the entire space. You can
easily see why in examples like example 3.3.
Before we add this extra structure, however, we want to talk about another
condition, one which eliminates (geometrically) ”pathological” examples that we
definitely do not want to work with, at least in this lecture. This condition is
called the Hausdorff condition. Consider the example below for a pathological
case of a topological space that we won't want to work with.
Figure 3.7: We can create the real line with two origins by gluing together
two real lines everywhere except at the origins of the two lines. This is
shown in (a). Sets that would be open on a real line (and have the origin
in them) are open if they contain at least one of the origins.
What is the problem with cases such as the real line with two origins? Well, in
some sense, there are two points where there should be one. You can’t separate
them. In a topological way of speaking, there aren’t two open sets that each contain
one of the points, but that don’t intersect. The way we will avoid this is simply by
requiring it and throwing away all other cases.
We will require this of any space we work with and build this into our new
definition. We will call the new kind of space a topological manifold.
There are many examples of topological manifolds; any curve or surface will do.
The circle is a great example, and so is the torus.
entire space, but it cannot yet do much with these charts, in particular things like
derivatives are still out of its reach. Our next goal will be to enable the ant to tell
how things change when she changes the coordinates. This will be the topic of the
next chapter.
Smooth Manifolds
We have taught the ant how to recognize where things are spatially. We now want
to prepare to teach it calculus. We won’t teach it calculus just yet, but we will
prepare it to do so, by making sure the charts it uses are compatible with each
other, which is the main topic of this chapter.
We have seen that a topological manifold can be covered by continuous (in the
topological sense) charts. To differentiate, we would need something a bit better:
some differentiable, or preferably smooth, structure. Right now, there does not seem
to be an obvious, geometric way to define a derivative
on the space without making grand ad hoc assumptions that we are not prepared
to make, since we want to stay quite general. But let us, for the sake of the
argument, say that we have found a way to do it. There is an immediate way that
differentiating could go wrong if we do not pose further restrictions on our charts.
Imagine we have two charts, (U1 , Ch1 ) and (U2 , Ch2 ), either covering the same
region or covering regions that overlap somewhere (U1 ∩ U2 ̸= ∅) and we have some
function, let’s call it f : M → R. Whatever our derivative should be, the one
thing we will want is that if f is differentiable in the new sense, and if the charts
are sensible, then f should be differentiable as a function of the coordinates. But
this should hold for any chart we want, so in particular it should hold for Ch1 and
Ch2 . We can guarantee this if the transition map from one chart to another is
differentiable. This will be the topic of this chapter.
Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts, with U1 ∩ U2 not empty, that
is, there is a region on M that both charts cover. Call this region U . Then
we define the transition map from the first to the second chart as:
\[ T_{1\to 2} = Ch_2 \circ Ch_1^{-1} : Ch_1(U) \to Ch_2(U) \]
You can see a picture with all the objects we are using right now in figure 4.1.
Figure 4.1: A picture showing all the main players of this chapter. We have
(two) charts Ch1 , Ch2 each covering a region U1 , U2 of the manifold. Each
chart has its own inverse, P1 , P2 associated with it, which are parametriza-
tions of the manifolds. On the region where U1 and U2 overlap (called U )
we can define transition maps T1→2 , T2→1 , which change charts.
From now on, since we will be working with charts a lot and want to declutter
notation, we will not write on which sets each chart is defined, these are implied
to be the ”reasonable” ones. It is, however, a good exercise, to keep track of
these, especially for the exam. We will also call U1 and Ch1 (U1 ) the same thing,
even though they are definitely not. Our reasoning is that Ch1 (U1 ) is our chart
representation of U1 : in the same way that you can point at a map and say ”Here
is America”, even though you are pointing at a chart of America, you can call
Ch1 (U1 ), U1 .
We can now formalize our idea by calling two charts compatible if their
transition maps are differentiable. The only thing we will change is to require
smoothness, for convenience.
Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts of M . We call the two charts
(smoothly) compatible if either U1 and U2 don’t overlap, or otherwise, if
their transition functions T1→2 and T2→1 are smooth.
An atlas is a set A of charts (Ui , Chi ) which cover M and are all smoothly
compatible with each other (all the transition maps between the charts are
smooth).
• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
The last two examples aren’t too surprising. They are completely analogous to
their equivalents we saw while discussing surfaces and since spaces like surfaces are
what we want to generalize, it looks like we are on the right track here.
Now that we have the general examples done, we give a few more specific ones.
Figure 4.3: The six different charts (only the open sets are shown)
that you need to map S 2 with the first method.
It turns out that you don't actually need this many charts to cover
S n ; you can do with just two. What you need for this is the
stereographic projection. You can see the stereographic projection
of S 2 onto the plane in figure 4.2. The way you project a point
p = (x1 , . . . , xn , xn+1 ) from the sphere onto the (hyper)-plane is by
drawing a straight line from the north pole N = (0, . . . , 0, 1) through
p. The point at which the line hits Rn is the point it gets mapped on.
Notice that in two dimensions it is easy to see that the southern hemi-
sphere gets mapped onto the disc inside the sphere, while the northern
one takes up all of the rest of the plane. This way you can create a
chart that covers the entire sphere except for the north pole. This is
the first chart. The second one is also the stereographic projection,
but this time you do it from the south pole, and that covers the entire
sphere except for the south pole. Together you have an atlas of size
two.
As an exercise, you should derive these formulas and show that they
are compatible.
You can also check that the charts from the first method (2n + 2
charts) are compatible with the charts obtained from the stereographic
projection.
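The two stereographic charts and their transition map can be written down and tested directly. The sketch below uses the common convention (projection from N = (0, 0, 1) onto the plane z = 0); this convention is our assumption, and the script's formulas may be scaled differently. The transition map comes out as u ↦ u/|u|², which is smooth away from the origin — exactly what compatibility asks for.

```python
def ch_north(p):
    # stereographic projection from the north pole
    x, y, z = p
    return (x / (1 - z), y / (1 - z))

def ch_south(p):
    # stereographic projection from the south pole
    x, y, z = p
    return (x / (1 + z), y / (1 + z))

def inv_north(u, v):
    # parametrization inverse to ch_north
    s = u * u + v * v
    return (2 * u / (s + 1), 2 * v / (s + 1), (s - 1) / (s + 1))

u, v = 0.8, -0.4
p = inv_north(u, v)
assert abs(sum(c * c for c in p) - 1.0) < 1e-12   # p really lies on S^2

# transition map T = ch_south o inv_north, compared against u -> u / |u|^2
s = u * u + v * v
print(ch_south(p), (u / s, v / s))  # the two pairs agree
```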
The Mobius strip is another example of a smooth manifold. You can see
the Mobius strip in a bit of a different form in the figure below.
4.3 The Maximal Atlas
But which atlas should we choose to ”define” the sphere as a manifold? And
do we get different geometries from different atlases like that? In the case of the
sphere, it certainly would be weird if we got different geometric results if we used
the stereographic atlas or the graph atlas. On top of that, working
in one or the other is not really too comfortable. For example, if you choose the
graph atlas, there is no chart on which you can see both the north and south pole on
a single map, you need to use at least two charts, which is both more complicated
and somehow seems like an unnecessary problem. There is an easy solution to
this. Throw both of these atlases together. We already asked you to show that the
charts you get from the graphs and the ones from the stereographic projections are
compatible, so if you throw all these together, you still get a fully functional atlas.
In fact, while we are at it, why not just throw all possible compatible charts
together into one atlas, call it a maximal atlas and be done with it? This is exactly
what we choose to do and will be the modification to our definition.
Let A be an atlas of some smooth manifold (as per our preliminary defini-
tion). We can construct a maximal atlas by collecting together all possible
compatible charts with A. The maximal atlas is defined as:
Ā = {all charts (U, Ch) that are compatible with all (UA , ChA ) ∈ A }
(4.5)
Of course, to construct the maximal atlas we need some atlas to start with, and
the maximal atlas will depend on this. If you have two atlases that are compatible
with each other, however, they of course produce the same maximal atlas. In that
case we call the two atlases equivalent.
It should make sense, of course, that Ā is an atlas in its own right: all charts in
Ā are compatible with each other. We will give the proof, because it is similar to
many other proofs of this kind in differential geometry, and it is good
to have seen its kind once.
We will need two ideas for the proof, which seem quite trivial and are not too
hard to prove.
This idea is quite easy to accept, since smoothness should be a local property,
after all, it is a generalization of the ϵ − δ kind of continuity and differentiation.
The second idea is even simpler.
Again this should feel obvious and the proof is not hard, it’s another one of the
typical chain rule proofs of differential geometry.
We will leave these two claims unproven, since their proofs are neither hard nor
illuminating and focus on the original idea we want to prove.
Proof. We want to show that Ā, which is a maximal atlas constructed from A,
is an atlas in its own right, that is, every two charts in Ā are compatible with
each other.
We start with choosing two charts in Ā, and we call them (V, ChV ) and
(W, ChW ). We want to show that they are compatible.
Let Z = V ∩ W ; we can assume that it is not empty, otherwise we are
done. We want to show that TV →W (on Z) is a smooth map. The basic idea of
the proof is that we do not transition from the first chart to the other directly,
but go over the charts from A (see figure 4.3). This is why we needed the second
lemma. The first one we need because, in general, A need not contain
a chart that works on all of Z, so we have to cut Z into parts and prove smoothness
for each part, which is where our lemma will come in handy. Let's start.
Firstly, cut Z up into all the pieces where a chart in A exists that covers
that portion of Z (and maybe more beyond Z). By construction of the maximal
atlas, ChV and ChW are compatible with each (Ui , Chi ) ∈ A. That is, define the sets
Zi = Z ∩ Ui , which are open and cover Z. Then \(Ch_V(Z) = \bigcup_{i \in I} Ch_V(Z_i)\), and all
of these are open as well, since ChV has to be continuous.
Take TV →W and split it up into smaller maps over the Ch(Zi ). Because
of the first lemma, we only need to show that TV →W |Ch(Zi ) is smooth for all
Ch(Zi ), the smoothness of TV →W as a whole map follows from the lemma.
But we can use that:
\[ T_{V \to W} = T_{U_i \to W} \circ T_{V \to U_i} \]
where all the functions are, of course, restricted to either Zi (all the charts) or
Chi (Zi ) (all the parameterizations and transition maps), which has been left
out for readability.
But ChV and ChW have to be compatible with all the Chi , so the two tran-
sition maps in the above equation need to be smooth, but since the composition
of two smooth functions is smooth, TV →W |Ch(Zi ) has to be smooth for all charts
in the atlas, and since the Zi cover Z, the whole transition map TV →W has to
be smooth.
• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
• The space is equipped with a maximal atlas Ā of charts, which are
smoothly compatible.
• The charts are all homeomorphisms. (This comes from the definition
of a topological manifold.)
cartesian product of smooth manifolds and also the relationship between smooth
functions between manifolds and atlases.
Let M and N be two manifolds, with dimensions m and n. Then you can
construct an atlas for M × N out of the two (maximal) atlases ĀM and ĀN
that belong to M and N . If (UM , ChM ) and (UN , ChN ) are two charts
from the individual atlases, then you can create a chart for UM × UN by
taking the cross product:
You can check that with this atlas (or the maximal version of it) you get a
smooth manifold, by checking all the conditions for a smooth manifold.
This definition also forces the dimension of M × N to be m + n as you would
expect, which you should convince yourself of.
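As a tiny sketch (all names here are our own, purely for illustration) of how the product chart concatenates coordinates, which is exactly what forces dim(M × N ) = m + n:

```python
def product_chart(ch_m, ch_n):
    # chart for M x N built from charts on the factors:
    # (p, q) -> (Ch_M(p), Ch_N(q)), a point of R^(m+n)
    def ch(pq):
        p, q = pq
        return ch_m(p) + ch_n(q)   # tuple concatenation of coordinates
    return ch

# two toy one-dimensional "charts" (purely illustrative)
ch_m = lambda p: (2.0 * p,)
ch_n = lambda q: (q - 1.0,)
ch = product_chart(ch_m, ch_n)
print(ch((0.5, 3.0)))  # -> (1.0, 2.0): one coordinate from each factor
```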
Let (U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN be two charts and f : M → N
the function whose smoothness we want to check. The two charts are called
an admissible pair if f (U ) ⊂ V , that is if the whole set U gets mapped into
V.
Notice first that we did not define f to have to be smooth in coordinates for all
admissible pairs, only that one exists. You might ask yourself if this means that a
smooth f might not be smooth for an admissible pair that wasn’t used to check its
smoothness. It turns out that the answer is no. You only need to check smoothness
in one atlas, not the maximal atlas, and the definition then forces f to be smooth
as a function of coordinates for all other admissible pairs. The proof of this claim
is very similar to the proof that all charts in the maximal atlas are compatible with
each other. You take two charts that are an admissible pair, break U up into small
pieces for which there is a chart where it is smooth (as per the definition), and use
the fact that smoothness of functions between two Rn is local. We therefore choose
not to give it here.
From this definition, you can immediately see that the following corollary has to
be true.
The last two points should make it reasonable that our definition makes
sense.
Another thing we definitely would like to have is that the composition of two
smooth functions is a smooth function and that smooth functions in the above sense
are also continuous in the topological sense, both of which turn out to be true.
Since both of these propositions seem very natural and their proofs aren't too interesting, we leave them out. You can show the first one easily by using charts, similar to other proofs in this chapter, and the second by using that continuity is a local property and charts are homeomorphisms.
4.6 Diffeomorphisms
Now that we have done a lot of technical detail, we want to talk about when (at
this stage) you cannot tell the difference between two manifolds, similar to how
two topological spaces are pretty much the same thing (equivalent, homeomorphic)
if there is a homeomorphism between them. The main point back then was that
we needed a bijective continuous map between the two spaces. The idea was that
we have a structure on the space, and if the structure is the same between two
spaces, we cannot tell the difference with any tool we have that comes from these
structures. The only new structure we have at this stage is a smooth atlas, so you
are probably not too surprised that the bijection will need to be smooth and its
inverse as well. When we have such a map, we call it a diffeomorphism and the two
spaces are diffeomorphic.
• f is a bijection.
• f is a homeomorphism. If the two spaces are supposed to be "the
same space", then we shouldn't be able to tell them apart by their
topologies, so this condition makes sense.
• f is smooth, and it’s inverse f −1 is also smooth. This is to make sure
that the maximal atlases are ”pretty much the same” and we cannot
tell M and N apart from their smooth structure.
There are a few things to note about the definition. Firstly, it forces dim(M ) =
dim(N ), which should make sense. It should not be possible for the circle and the
sphere to be the same thing, and they are not. Secondly, the second requirement is not actually necessary, because charts are homeomorphisms and the second condition follows from the other two. A diffeomorphism is automatically a homeomorphism without the second condition; we just added it so that you can see very clearly how a diffeomorphism respects the entire structure, not just the atlas.
Tangent Vectors
Now that we have defined what a smooth manifold is, we want to be able to do
something inside of it. It is all well and good to talk about diffeomorphisms and
charts, but we need some objects in the manifold to have interesting results about.
The obvious first candidate for something interesting on a smooth manifold is a
vector. After all, Rn , curves and surfaces all have vectors associated with them and
a lot of the nice results from the first part of the lecture had something to do with
vectors.
There is a small problem with just "lifting" the definition of vectors from $\mathbb{R}^n$ to manifolds directly, without any thought. The problem is that a manifold does not, in general, have a natural vectorspace structure. You can't "add" points on a general
manifold in a very sensible way. What is the north pole plus the south pole on a
sphere, which is not embedded in R3 ? This question does not even seem sensible
and any addition you would add at this stage would seem very arbitrary and definitely
not natural. Now, there are many ways to look at the idea of what a vector in Rn
is. You have the obvious one, a vector is ”an arrow” or the mathematical one of it
being an element of a set with an addition and scalar multiplication which obey the
following axioms...
But neither of these help us right now. Of course, we always want to draw
vectors as arrows, and mathematically, the object we want to work with should be
vectors in the vector-space-vector definition, but they are not helpful yet.
One idea is clear even at this stage. Whatever concept of a vector we will choose
to generalize, we will only generalize tangent vectors (for example of curves/sur-
faces). The reason is simply that all other vectors on curves and surfaces (normal
vectors) did not come from the curve/surface itself but the Rn that we embedded
the curve/surface in. So only tangent-vectors are appropriate if we don’t want to
have the influence of some ambient space that our manifold lives in.
There are four different ideas we can generalize out of Rn that end up being
equivalent.
• The first one is very simple. We treat vectors in charts. Vectors are vectors
in charts, where vectors make sense (since charts go to Rn ) and we can point
144 CHAPTER 5. TANGENT VECTORS
at a vector in a chart and look to another chart and through the transition
map check which vector it is there. This definition defines tangent-vectors
through equivalence classes of vectors on charts. We don’t know what they
are on M , but we can chart them.
• A bit more on the computational side, we can define vectors through direc-
tional derivatives of smooth functions. We know that you can differentiate a
smooth function f : Rm → R in the direction of X and get
$$D_X f = X^1 \frac{\partial f}{\partial x^1} + X^2 \frac{\partial f}{\partial x^2} + \dots + X^n \frac{\partial f}{\partial x^n} \qquad (5.1)$$
This derivative contains the same information as what we would usually call
the vector, that is the quantities (X 1 , X 2 , . . . , X n ).
• More on the geometric side, we can use curves to define tangent vectors. We
can use the fact that from a geometric standpoint, a vector tangent to a
curve (multiplied with a small ϵ) looks like a small piece of the curve itself, in
the usual Rn cases. This way we define a tangent vector as a ”small piece of
a curve”. Since many curves have the same tangent vector, we will also use
equivalence classes here.
• On the more abstract side, we can define tangent vectors as linear operators
on smooth functions to R from the manifold, which satisfy the product rule.
$$X^{op}(fg) = X^{op}(f)\,g + f\,X^{op}(g) \qquad (5.2)$$
These four ways are all equivalent to each other and can all be used to define
tangent vectors. They all represent different ways to think about tangent vectors
(charts, computation, geometric (curves), abstract) and this variety gives you a lot
of ways to tackle a mathematical problem. Sometimes the geometric picture will
be more applicable, sometimes the computational etc.
Of particular note is the second definition, which, because of its relationship with partial derivatives, has fostered a notation in differential geometry which can be confusing at first, but to which one gets used quite fast.
For our purposes, the first two definitions will be most useful, and we will take
the most time discussing them.
Suppose we have two charts, which are connected by $T_{1\to 2}$. Then if you are working in the first chart, which
means you are working in the first Rn , you know what a vector is in the chart,
it’s just a tuple (X 1 , . . . , X n ) ∈ Rn situated at p and you can work with it in the
chart. What happens when you use the second chart? Well, the coordinates of the vector X, which previously lived on the first chart, will just be the Jacobian of the transition map at p applied to X.
or written out:
$$X'^i = \sum_{j=1}^{n} \frac{\partial T^i}{\partial x^j}\, X^j \qquad (5.4)$$
where we wrote T = (T 1 (x), . . . , T n (x)) instead of T1→2 , and when clear will
continue to do so.
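To make the transformation rule concrete, here is a small numerical sketch (not something the script itself uses): we take polar and Cartesian coordinates on a piece of the plane as a stand-in for two charts, and check that the components in the second chart are the Jacobian of the transition map applied to the components in the first chart. The specific maps, point and vector are all invented for illustration.

```python
import numpy as np

# Hypothetical transition map between two charts of a piece of the plane:
# chart 1 uses polar coordinates (r, phi), chart 2 Cartesian (x, y).
def T(p):
    r, phi = p
    return np.array([r * np.cos(phi), r * np.sin(phi)])

def jacobian(f, p, h=1e-6):
    """Numerical Jacobian of f at p via central differences."""
    p = np.asarray(p, dtype=float)
    cols = [(f(p + h * e) - f(p - h * e)) / (2 * h) for e in np.eye(len(p))]
    return np.column_stack(cols)

p = np.array([2.0, np.pi / 3])   # the point, in chart-1 coordinates
X = np.array([1.0, -0.5])        # components of the vector X in chart 1

# X'^i = sum_j (dT^i/dx^j) X^j -- the Jacobian applied to the components
X_prime = jacobian(T, p) @ X
```

The sanity check is that pushing a short line segment through $T$ gives the same answer as multiplying by the Jacobian, which is exactly equation (5.4).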
By taking all other charts you can find which vector in the other charts X
corresponds to and what you get is a working definition of a vector, without having
really talked about the manifold, at all. Figure 5.1.2 might make this clearer.
Let’s say we have a plane flying over the point p with a physically real velocity
described by the vector X on the Mercator chart (square). We can describe the
path of the plane perfectly well in that chart without any reference to the actual
Earth. If we want to switch to a new chart, maybe because it is easier to see
something or represent our country as bigger than others, we can easily do that with the transition map and bring the velocity vector to the new chart using the Jacobian of the transition map. This is the idea of this definition: we take vectors in charts and use them only in charts, really, with no mention of the manifold.
We want to pack the idea of using vectors in charts into a working definition. The
way is quite simple. We say a vector is simply the collection of all vectors in charts
that transform into each other, and we can take one representative (in one chart)
as the example of the vector we talk about.
5.1. TANGENT VECTORS FROM CHARTS 147
Figure 5.2: The Mercator and stereographic projection of the earth (/parts
of the earth). Imagine at a real point p on the earth, there is a plane flying
with a velocity described by the (blue) tangent vector X. We can talk of
its velocity vector as an arrow on both of these pictures (charts) without
making any reference to the actual manifold, that is the earth. We can
describe the path of the plane and all the information about it we would like
using only the charts and nothing else.
We always have the point p be a part of the vector, a tangent vector never
exists without a point p it sits at.
Figure 5.3: In Rn , you can transport vectors and compare them without
problems, we therefore don’t need to always say which point the vector is
situated at. But if we move onto the sphere, this is not the case anymore.
You can transport a vector on the sphere so that it locally stays parallel to
itself, and after performing a loop, come back and get a different vector!
$$D_X u(p) = Du(p)(X) = \sum_{i=1}^{n} \frac{\partial u}{\partial x^i}\, X^i \qquad (5.5)$$
$$= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n} \qquad (5.6)$$
We can use this. The components of X are very explicit in this equation. Notice also
that if you view this over all possible smooth functions, the directional derivatives
are operators on smooth functions. Even more so, every tangent vector produces
its own derivative operator, and two different tangent vectors produce two different
operators.
We can also rewrite the above equation using curves. If γ is a curve so that
γ(0) = p, γ ′ (0) = X, then:
$$\left.\frac{d\,u(\gamma(t))}{dt}\right|_{t=0} = \gamma'(0)^1 \frac{\partial u}{\partial x^1} + \gamma'(0)^2 \frac{\partial u}{\partial x^2} + \dots + \gamma'(0)^n \frac{\partial u}{\partial x^n} \qquad (5.8)$$
$$= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n} = D_X u(p) \qquad (5.9)$$
Again, notice that if we take all possible smooth functions, different vectors produce
different operators. We can use the last equation very easily to define vectors on
manifolds, we don’t need any other structure. We just take the last equation as the
definition.
5.2. TANGENT VECTORS AS DIRECTIONAL DERIVATIVES 151
$$X = (p, X^{op}) \qquad (5.11)$$
$$X^{op} : C^\infty(M) \to \mathbb{R} \qquad (5.12)$$
$$X^{op}(u) = \left.\frac{d\,u(\gamma(t))}{dt}\right|_{t=0} \qquad (5.13)$$
where γ is some curve on M , but always the same curve for all the functions
u.
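Here is a quick numerical sanity check of this definition, with the chart taken to be the identity on $\mathbb{R}^2$ for simplicity; the function u, the curve γ and the point are all invented for illustration. Note that the $t^2$ term in the curve does not change the result — only $\gamma(0)$ and $\gamma'(0)$ matter, which is exactly the equivalence-class idea.

```python
import numpy as np

# Invented example; the chart is the identity on R^2, so points and
# coordinates coincide.
p = np.array([1.0, 0.5])
X = np.array([2.0, -1.0])

def u(q):                       # a smooth test function
    x, y = q
    return x**2 * np.sin(y)

def gamma(t):                   # gamma(0) = p, gamma'(0) = X; the t^2 term
    return p + t * X + t**2 * np.array([3.0, -1.0])   # is invisible at t = 0

h = 1e-6
X_op_u = (u(gamma(h)) - u(gamma(-h))) / (2 * h)  # X^op(u) = d/dt u(gamma(t))|_0

# the classical directional derivative sum_i X^i du/dx^i, for comparison
grad = np.array([(u(p + h * e) - u(p - h * e)) / (2 * h) for e in np.eye(2)])
directional = grad @ X
```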
Notice how, as with the first definition, we use external objects (there charts,
here smooth functions) to define things on M . This is quite common in differential
geometry. We will drop the op from X op from now on and just call it X as well,
but always have it in the back of our head that we need a point p where the vector
sits, for the same reason as with the first definition. As a reminder we will also
sometimes write Xp instead of X.
We can define the tangent space again.
Notice that this time $T_pM \subset \{p\} \times \mathrm{Hom}(C^\infty(M), \mathbb{R})$, which is a vector space. This time, however, it is totally not obvious that $T_pM$ is a vector space. It is in no way obvious that, for example, if X, Y are in $T_pM$, then X + Y is in it too, because we do not know how to add curves on M. In $\mathbb{R}^n$ it is obvious, but certainly not on a general manifold. Imagine even just the earth and two curves, for example, the ones of a plane flying from Zurich to London and from Zurich to New York. There is no reasonable way of adding them. What would that even mean? Would you end up in Greenland?
Where we can do this is in charts, however. The one important point here, though, is that the curve we get is dependent on the chart, and depending on the chart, if you add the two curves from before, you can get anywhere from Greenland to Brazil. But locally the two curves will be the same, in the sense that they will have the same tangent vector.
We will not prove that this version of Tp M satisfies all the axioms of a vec-
torspace, rather and more interestingly, we will show that it is a whole vectorspace
in the sense that it has a basis and that basis spans the entire vectorspace, leaving
Figure 5.5: We don't know how to really add curves sensibly on a manifold directly. We can do it in charts, but depending on which chart we use, the curves will look different. Here, on this example of the earth, the addition of the two paths of our plane lands you in Cuba, if you use the Mercator projection, or somewhere in Antarctica if you use the stereographic projection. That is not an ambiguity we want if we don't want our vacation ruined. Notice, however, that both curves result in the same tangent vector.
So we are forced to use charts. Let us fix a point p and a chart Ch, which also
charts p.
We can construct a basis of Tp M by using the basis of Rn . For simplicity, let
us say that p gets mapped to (0, 0, . . . , 0). Then we have the basis vectors of Rn
at the origin (=p̃ = Ch(p)), which we can call e1 , e2 , . . . , en . Which curves could
we use to construct the basis of Tp M ? Well, the coordinate lines of course! The
coordinate lines are defined as follows:
$$\tilde\beta_i(t) = t\,e_i \qquad (5.14)$$
where $\tilde\beta_i$ is the i-th coordinate line. Then we can project them back onto the manifold, $\beta_i(t) = P_{Ch}(\tilde\beta_i(t))$, as you can see in figure 5.2.2.
We can then define the i-th basis vector of $T_pM$ as the one that one gets from the i-th coordinate axis, projected back onto the manifold. In the coordinate space, this vector has the components $X^j = \delta^j_i$, so it would simply belong to the operator:
$$X^1 \frac{\partial}{\partial x^1} + X^2 \frac{\partial}{\partial x^2} + \dots + X^n \frac{\partial}{\partial x^n} = \frac{\partial}{\partial x^i} \qquad (5.15)$$
This motivates a new notation for this basis. We can now write:
$$\left.\frac{\partial}{\partial x^i}\right|^{op}_{p,\,Ch} = \text{the vector obtained from the } i\text{-th coordinate line} \qquad (5.16)$$
where we remind ourselves that it sits at p and that it is definitely something that
comes from Ch and that it is an operator. (In future, we will leave out all these
little reminders.)
We can turn this into a definition.
where βi is the i-th coordinate line of the coordinate space Ch(U ), projected
back onto the manifold.
What is left to do now is to prove that this is a basis. For this, we first write out the coefficients of our vectors, to make the work a bit easier.
for some curve γ which is appropriate for X. Now, this is a map from R to R, going
over the manifold. We can eliminate the manifold, by going to the coordinate space
and back with a chart and parametrization.
$$X \cdot u = \left.\frac{d}{dt}\right|_{t=0} \big(u \circ Ch^{-1}\big) \circ \big(Ch \circ \gamma\big)(t) \qquad (5.19)$$
The first (from the right) is simply the curve γ, drawn into the coordinate space,
not onto the manifold, which we will also call γ̃. The second one is simply the
function u, but as a function of the coordinates, not of the points on M , which we
will call ũ. We can now use the chain rule:
$$X^{op} \cdot u = \left.\frac{d}{dt}\right|_{t=0} \tilde{u} \circ \tilde{\gamma}(t) \qquad (5.20)$$
$$= \sum_{i=1}^{n} \frac{\partial \tilde{u}}{\partial x^i}\, \frac{d\tilde{\gamma}^i}{dt}(0) = \sum_{i=1}^{n} \frac{d\tilde{\gamma}^i}{dt}\, \frac{\partial \tilde{u}}{\partial x^i} \qquad (5.21)$$
$$= \sum_{i=1}^{n} \frac{d\tilde{\gamma}^i}{dt} \left(\frac{\partial}{\partial x^i}\right)^{op}(u) \qquad (5.22)$$
where we have used the fact that $\frac{\partial \tilde u}{\partial x^i}$ is simply the derivative of u in the i-th coordinate direction. Reading off the coefficients, we find:
$$(X^1, \dots, X^n) = \left(\frac{d\tilde\gamma^1}{dt}, \dots, \frac{d\tilde\gamma^n}{dt}\right) \qquad (5.23)$$
where γ̃ = Ch ◦ γ is the curve γ, but in coordinate space, not the manifold.
5.2.3 Proving that $\left\{\frac{\partial}{\partial x^i}\right\}_{i=1,\dots,n}$ is a basis
We now want to show that $\left\{\frac{\partial}{\partial x^i}\right\}_{i=1,\dots,n}$ is a basis. This means we need to show three things. Firstly, that all vectors in $T_pM$ are spanned by the basis, secondly that all vectors spanned by the basis are in $T_pM$, and thirdly that the basis is linearly independent.
Figure 5.6: We can use the coordinate lines from a chart Ch to define our
basis-vectors of Tp M , by projecting them onto the manifold.
Proposition 5.2.1: The vectors $\left\{\frac{\partial}{\partial x^i}\right\}_{i=1,\dots,n}$ form a basis of $T_pM$
The vectors $\left\{\frac{\partial}{\partial x^i}\right\}_{i=1,\dots,n}$ form a basis of $T_pM$, that is:
• $T_pM \subset \mathrm{span}\left(\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}\right)$
• $\mathrm{span}\left(\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}\right) \subset T_pM$
• the vectors $\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}$ are linearly independent
But we have already shown the first claim in the last section, when we found
the coefficients of a general vector in Tp M , because we wrote X out as a linear
combination of this basis. So we only have the second and third claims to prove.
The proof of the second claim is easy to understand. We need to show that any $(X^1, \dots, X^n) = \sum_{i=1}^{n} X^i \frac{\partial}{\partial x^i}$ is in $T_pM$, which means we need to find a curve that generates these coefficients. But the curve (in coordinate space) $\tilde\gamma(t) = t\,(X^1, \dots, X^n)$ will do³. Figure 5.2.3 should make this almost obvious.
Figure 5.7: We can use the curve that just ”extends” X in coordinate space
for our proof
The only thing left to show is that the basis is linearly independent, which we
leave to you as an exercise, it’s not too hard.
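The curve used in the proof of the second claim can also be checked numerically: in coordinate space (with the simplification $Ch(p)=0$), the straight-line curve really does reproduce any prescribed coefficient tuple. A minimal sketch, with an arbitrary made-up tuple:

```python
import numpy as np

# In coordinate space, with Ch(p) = 0, the straight-line curve
# gamma~(t) = t * (X^1, ..., X^n) realizes any prescribed coefficients.
X = np.array([1.5, -2.0, 0.25])          # an arbitrary coefficient tuple

def gamma_tilde(t):
    return t * X

h = 1e-6
coefficients = (gamma_tilde(h) - gamma_tilde(-h)) / (2 * h)  # d(gamma~)/dt at 0
```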
3 Here, we are still using the mentioned simplification, which is that Ch(p) = 0. If you
Figure 5.8: We can see vectors as small pieces of curves, locally, if we think
of them geometrically.
This definition is a lot more geometric and is useful to think about in very
pictorial settings. You can see the equivalence relationship in figure 5.3
Figure 5.9: A lot of curves, even very wild ones are equivalent in our defini-
tions. These wild behaviours are cut off by the equivalence relationship, so
we have a way of talking about ”small” parts of the curve
for some functions u, v. It turns out that, like the directional derivative, we can also
generalize this. We can take the space of all linear transformations that take smooth
functions on M as the input and output a real number, Hom(C ∞ (M ), R), which
contains things like tangent vectors (as derivatives), but also things like multiplying
functions with the number two, which is clearly not a derivative-operator.
We can then notice, that derivative operators should satisfy the product rule,
while non-derivative operators do not. For example, with multiplying by two, in
general, we have:
$$2 \cdot (uv) \neq (2u)\,v + u\,(2v) = 4uv \qquad (5.26)$$
With this we get to our definition:
for all u, v ∈ C ∞ (M )
It is quite tricky to work with this definition, because we don’t have charts or
derivatives in it, so one needs to be very algebraic and it all turns into a tricky,
non-digestible mess quickly, so we won’t pursue this further.
What we want to mention is that the vectors from the previous definitions we have do satisfy the Leibniz rule, so you can use it comfortably. The proof is similar to what we did before, using lots of chain rules, working in charts and applying the $\mathbb{R}^n$ Leibniz rule.
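One can also see the dividing line of the Leibniz rule numerically. In the sketch below (chart again identified with $\mathbb{R}^2$, all functions and points invented), a directional-derivative operator satisfies the product rule up to numerical error, while the "multiply the value by two" operator visibly fails it:

```python
import numpy as np

# An invented point, direction and test functions.
p = np.array([0.3, 1.2])
X = np.array([1.0, 2.0])
h = 1e-6

def X_op(f):      # directional derivative at p (a derivation)
    return (f(p + h * X) - f(p - h * X)) / (2 * h)

def two_op(f):    # a perfectly good linear operator that is NOT a derivation
    return 2.0 * f(p)

u = lambda q: q[0] ** 2 + q[1]
v = lambda q: np.exp(q[0] * q[1])
uv = lambda q: u(q) * v(q)

leibniz_gap = X_op(uv) - (X_op(u) * v(p) + u(p) * X_op(v))     # ~ 0
two_gap = two_op(uv) - (two_op(u) * v(p) + u(p) * two_op(v))   # = -2 u(p) v(p)
```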
In $\mathbb{R}^n$, it is often the case that, for example, something is easier in polar coordinates, and sometimes in Cartesian. We want to generalize this to any coordinates.
More specifically, we want to look at how tangent vectors transform. For this,
fix M, p and two charts Ch1 , Ch2 and a tangent vector X.
$$X = \sum_{i=1}^{n} X^i \left.\frac{\partial}{\partial x^i}\right|_{Ch_1} \qquad (5.28)$$
where X 1 , . . . , X n are the components of X in the basis of the first chart. What
happens if we change the charts? That is, what are the components in the other
chart, Ch2 ?
We can start by answering first how we can express the basis vectors of the old
chart as the basis vectors of the new chart.
We do this by unravelling the definition of the differential operator $\left.\frac{\partial}{\partial x^i}\right|_{Ch_1}$. We know that:
$$\left.\frac{\partial}{\partial x^i}\right|_{Ch_1} \cdot\, u = \frac{\partial}{\partial x^i}\big(u \circ Ch_1^{-1}\big) \qquad (5.29)$$
in the first coordinate space. But we know how to transform coordinates. We call
the coordinates in the second chart y 1 , . . . , y n . We can change to the second chart
by inserting an identity, that is by going to the second coordinate space and back
and then using the Rn chain rule:
$$\left.\frac{\partial}{\partial x^i}\right|_{Ch_1} \cdot\, u = \frac{\partial}{\partial x^i}\big(u \circ Ch_1^{-1}\big) \qquad (5.30)$$
$$= \frac{\partial}{\partial x^i}\big(u \circ (Ch_2^{-1} \circ Ch_2) \circ Ch_1^{-1}\big) \qquad (5.31)$$
$$= \frac{\partial}{\partial x^i}\big((u \circ Ch_2^{-1}) \circ (Ch_2 \circ Ch_1^{-1})\big) \qquad (5.32)$$
$$= \sum_{j=1}^{n} \frac{\partial(u \circ Ch_2^{-1})}{\partial y^j}\, \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \qquad (5.33)$$
$$= \sum_{j=1}^{n} \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left.\frac{\partial}{\partial y^j}\right|_{Ch_2} \cdot\, u \qquad (5.34)$$
We have found the transformation of the basis vectors of one chart into the vectors
of another. We can thus see how a vector X, which is of course nothing less than
a linear combination of the basis vectors in the first chart, transforms.
$$X = \sum_{i=1}^{n} X^i \left.\frac{\partial}{\partial x^i}\right|_{Ch_1} = \sum_{i=1}^{n} \sum_{j=1}^{n} X^i\, \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left.\frac{\partial}{\partial y^j}\right|_{Ch_2} \qquad (5.35)$$
$$= \sum_{j=1}^{n} \left( \sum_{i=1}^{n} X^i\, \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \right) \left.\frac{\partial}{\partial y^j}\right|_{Ch_2} \qquad (5.36)$$
So we have found how the coefficients change. If we call the coefficients in the
second basis (in the second chart) X̂ 1 , . . . , X̂ n , then we can write them as:
$$\hat{X}^j = \sum_{i=1}^{n} \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}\, X^i \qquad (5.38)$$
Note that this is just a matrix multiplication, and we can write it more simply using our transformation notation, since $Ch_2 \circ Ch_1^{-1} = T_{1\to 2}$.
Where XCh2 is the vector X as a column in the second basis, while XCh1 equiva-
lently in the first basis.
At this point, we want to introduce a new notation, which is very common in much of the differential geometry literature, and understandably so. We can think of the coordinates of the first chart as functions:
which gives every point on the covered subset of the manifold a real number. That
is xi : U1 ⊂ M → R.
We do the same thing with the other charts and call them
We can then do the above calculations with these as actual partials. But what is
the transition map, then? Well, as you can imagine, it takes the coordinates of a
point in the first chart and spits out the corresponding coordinates in the second
chart:
$$T = \begin{pmatrix} T^1(x^1, \dots, x^n) \\ T^2(x^1, \dots, x^n) \\ \vdots \\ T^n(x^1, \dots, x^n) \end{pmatrix} = \begin{pmatrix} y^1(x^1, \dots, x^n) \\ y^2(x^1, \dots, x^n) \\ \vdots \\ y^n(x^1, \dots, x^n) \end{pmatrix} \qquad (5.42)$$
so we can write:
$$\frac{\partial y^j}{\partial x^i} \quad \text{for} \quad \frac{\partial T^j_{1\to 2}}{\partial x^i} = \frac{\partial(Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \qquad (5.43)$$
and the formula, as:
$$X^j_{Ch_2} = \sum_{i=1}^{n} \frac{\partial y^j}{\partial x^i}\, X^i_{Ch_1} \qquad (5.44)$$
which looks very similar to the real-number chain rule. Note, however, that there is a bit of an abuse of notation here, and it is certainly not mathematically rigorous to use the chain rule with these symbols without thought; there is more behind them.
5.6. DIFFERENTIATION OF A FUNCTION BETWEEN MANIFOLDS 163
Let us now go to the case where we have two manifolds, M, N and a smooth function f : M → N between them. What is the derivative of f, df(p)? We expect it to be similar to a Jacobian in $\mathbb{R}^n$, taking tangent vectors to tangent vectors.
Figure 5.10: You can see two manifolds, M and N and a function between
them. We want a derivative df , that is a function that takes tangent vectors
of M (X) to tangent vectors of N (Y ).
How can we do this? The geometric idea is quite simple. We know f and
want to know what happens with a vector X ∈ Tp M . That vector is created by
some curve on M , call it α through the second definition. We know what happens
with the curve if we use f on the manifold, it gets mapped to some other curve
β = f ◦ α. But then the vector associated with β at p̃ = f (p) should be the vector
the derivative maps X to!
$$X \cdot u = \left.\frac{d}{dt}\right|_{0} u(\alpha(t)) \qquad (5.45)$$
then
$$Y = df(p)(X) \in T_{\tilde p} N \qquad (5.46)$$
is defined by:
$$Y \cdot v = \left.\frac{d}{dt}\right|_{0} v(\beta(t)) \qquad (5.47)$$
for all v ∈ C ∞ (N ) and where β is the curve β = f ◦ α.
Notice that nowhere in the definition did we use any charts whatsoever! This is
a purely geometric, chart-independent object we have.
There are, however, a few things we need to make sure work if this definition is to make sense.
$$Y \cdot v = \left.\frac{d}{dt}\right|_{0} v(\beta(t)) \qquad (5.48)$$
$$= \left.\frac{d}{dt}\right|_{0} v(f(\alpha(t))) \qquad (5.49)$$
$$= X \cdot (v \circ f) \qquad (5.50)$$
$$Y \cdot v = X \cdot (v \circ f) \qquad (5.51)$$
5.7. THE CHAIN RULE 165
which we will call proposition X from now on, whenever we use it, which will be
in the next section when we use it to prove the chain rule for functions between
manifolds.
You can show this in two ways. You can either express this whole thing in
coordinates and ”inherit” the chain rule from Rn into the whole thing or you can
do it directly and abstractly. Neither is better, but since many of our proofs until
now have been leaning more toward the first type, we will do it with the second
method instead.
Proof. We need a bit of setup since we have a lot of players in this proof. We
have the manifolds, the maps, the vectors and the general smooth functions we
need for the vectors to act on. We show all the players in figure ??
Our strategy is to write each of the parts of the chain rule equation using
proposition X and then collect them together.
$$df(p)(X) \cdot v = X \cdot (v \circ f) \qquad (5.54)$$
$$dg(q)(Y) \cdot w = Y \cdot (w \circ g) \qquad (5.55)$$
$$d(g \circ f)(p)(X) \cdot w = X \cdot (w \circ (g \circ f)) \qquad (5.56)$$
We can now set Y = df (p)(X) and q = f (p) and insert into the right side of the
chain rule.
$$dg(f(p))(df(p)(X)) \cdot w = df(p)(X) \cdot (w \circ g) \qquad (5.57)$$
$$= X \cdot ((w \circ g) \circ f) \qquad (5.58)$$
$$= d(g \circ f)(p)(X) \cdot w \qquad (5.59)$$
So we get the desired equation:
$$d(g \circ f)_p = dg_{f(p)} \circ df_p \qquad (5.60)$$
since w, X and p were all general.
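In charts, the chain rule is just the statement that Jacobians compose, which is easy to verify numerically. The maps f and g below are arbitrary stand-ins for coordinate representations of smooth maps between manifolds:

```python
import numpy as np

# f and g stand in for coordinate representations of smooth maps
# "M -> N -> P"; both are invented examples.
def f(p):
    x, y = p
    return np.array([x * y, x + y, np.sin(x)])

def g(q):
    a, b, c = q
    return np.array([a + b * c, a * c])

def jac(F, p, h=1e-6):
    """Numerical Jacobian of F at p via central differences."""
    p = np.asarray(p, dtype=float)
    cols = [(F(p + h * e) - F(p - h * e)) / (2 * h) for e in np.eye(len(p))]
    return np.column_stack(cols)

p = np.array([0.7, -0.4])
lhs = jac(lambda x: g(f(x)), p)    # d(g o f)(p)
rhs = jac(g, f(p)) @ jac(f, p)     # dg(f(p)) composed with df(p)
```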
Figure 5.11: You can see all the actors we need in the proof in this figure.
$$Y \cdot v = X \cdot (v \circ f) \qquad (5.61)$$
$$= \sum_{i=1}^{m} X^i \left.\frac{\partial}{\partial x^i}\right|_{p,\, Ch_M} (v \circ f) \qquad (5.62)$$
$$= \sum_{i=1}^{m} X^i \left.\frac{\partial}{\partial x^i}\right|_{\tilde p,\, Ch_M} \big(v \circ f \circ Ch_M^{-1}\big) \qquad (5.63)$$
$$= \sum_{i=1}^{m} X^i \left.\frac{\partial}{\partial x^i}\right|_{\tilde p,\, Ch_M} \big(\tilde v \circ \tilde f\big) \qquad (5.64)$$
where in the last expression, the tilde means that the functions are their represen-
tations in charts, and the partial derivative becomes the simple Rn partial we all
know and love. We can then use the Rn chain rule.
5.8. THE COORDINATE EXPRESSION FOR df (p) 167
Figure 5.12: We present the standard picture with functions between man-
ifolds again, with the charts ChM and ChN . Our goal is to find the
coordinate representation of df (p)
$$= \sum_{i=1}^{m} \sum_{j=1}^{n} X^i \left.\frac{\partial \tilde v}{\partial y^j}\right|_{\tilde q} \left.\frac{\partial \tilde f^j}{\partial x^i}\right|_{\tilde p} \qquad (5.65)$$
where the partials in y are in the chart of N and the ones in x are the ones belonging to the chart of M. If we rearrange a bit, and realize that we can turn $\left.\frac{\partial \tilde v}{\partial y^j}\right|_{\tilde q}$ back into the operator, and that these are simply the standard basis vectors at $\tilde q$ in $Ch_N$, we get:
$$= \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \left.\frac{\partial \tilde f^j}{\partial x^i}\right|_{\tilde p} X^i \right) \left.\frac{\partial \tilde v}{\partial y^j}\right|_{\tilde q} \qquad (5.67)$$
$$Y^j = \sum_{i=1}^{m} \frac{\partial \tilde f^j}{\partial x^i}\, X^i \qquad (5.68)$$
Again, we find a result that is parallel to the case of $\mathbb{R}^n$, since if f were a map from some $\mathbb{R}^m$ to some $\mathbb{R}^n$, we would get the exact same result. The (column) vector Y we get is simply the Jacobian (in the chart) applied to X (in the chart)!
We can introduce a new matrix notation for the chart Jacobian of df(p). We can write:
$$df(p)^j_i = \big(df(p)_{Ch_M, Ch_N}\big)^j_i = \frac{\partial \tilde f^j}{\partial x^i} \qquad (5.69)$$
Then we can write the above result as:
$$Y^j = df(p)^j_i\, X^i \qquad (5.70)$$
170 CHAPTER 6. TANGENT SPACES AND TANGENT BUNDLES
Figure 6.1: A vector field on the circle. We can draw the tangent spaces,
the Tp M ’s into the picture (only a few shown). This whole space (with all
of the tangent spaces) is where vector fields live
in some sense, we have found that vector fields aren't all that different from (sub-)manifolds themselves.
This union is disjoint since all vectors always have the point p as a part of them. Mathematically, the tangent bundle is a subset of $M \times \mathrm{Hom}(C^\infty(M), \mathbb{R})$.
This definition is made clear by the example with the circle from the last section: we simply take all possible tangent spaces together at once and, later, consider objects on it. Clearly, as you can see in figure 6.1, the manifold M itself is a subset of TM, trivially in the geometric sense, and mathematically as M × {0}, where 0 refers to the linear map that takes a smooth function on M and spits out the number 0.
6.2. DEFINITIONS AND CONDITIONS 171
Figure 6.2: If you turn all of the tangent spaces, you end up with a cylinder! Tangent vector fields simply become curves on a cylinder (albeit ones with a few special properties). Even though we draw the vectors as pointing up, they are still tangent vectors living in tangent spaces; this trick is only one that we do in our head, there is no mathematical transformation here.
This shouldn’t be too surprising. After all, we found that for the circle at least,
the tangent bundle was the cylinder, which is a 2-dimensional manifold.
Now the coordinates of a point of T U are given very simply. If the coordinates
of a point p ∈ U are x1 , . . . , xn and the components of a vector X situated at p
are X 1 , . . . , X n , then the new chart, which we will call ChT U is:
The only part that is left to prove is that this is truly a chart and that you get an atlas from these charts. Then you can take the maximal atlas and you have yourself a manifold. None of these parts are too enlightening, so we leave them unproven, hoping that the above charts are reasonable enough to convince you that it will work.
The projection map π takes vectors in TM and spits out the point on M they live at. More precisely:
$$\pi : TM \to M \qquad (6.5)$$
$$(p, X) \mapsto p \qquad (6.6)$$
Proof. Per our definition of smoothness, we only need to find, for any point (p, X) ∈ TM, an admissible pair of charts for which the function is smooth in coordinates. As you might remember, we only need to show that one admissible pair exists per point. This is enough to guarantee that the map is smooth in all admissible charts. Let (p, X) ∈ TM be a vector in the tangent bundle. Let (U, Ch) be a chart so that p ∈ U. We can take the admissible pair consisting of (U, Ch) and the chart $(TU, Ch_{TU})$ constructed from it, since π(TU) = U and U is trivially a subset of U. In these coordinates, the map π is:
$$\tilde\pi : (x^1, \dots, x^n, X^1, \dots, X^n) \mapsto (x^1, \dots, x^n) \qquad (6.7)$$
This map (as a map from $\mathbb{R}^{2n} \to \mathbb{R}^n$) is obviously smooth, and since we found a map like this for any pair (p, X), we are finished.
X : M → TM (6.8)
so that X(p) ∈ Tp M .
There are three different ways to think of vector fields in this context. Firstly,
you can think of it as arrows on the space, where each point gets its own vector.
6.4. A FEW INTRIGUING EXAMPLES 175
This is the classical calculus approach to vectors. You can also think of them in terms of objects on the tangent bundle. For the circle, for example, they are curves on a cylinder (= $TS^1$), which have a few restrictions on them. Both of these were already explored. The last way is in coordinates.
Let’s say you have a chart (U, Ch) and a vector field X. What does that mean
in coordinates? You shouldn’t be too surprised that this simply means that the
coefficients of the vectors (at each point) are functions of the points on M .
$$X(p) = \sum_{i=1}^{n} X^i_{Ch}(p) \left.\frac{\partial}{\partial x^i}\right|_{p,\, Ch} \qquad (6.9)$$
We will not distinguish between the maps $X^i : M \to \mathbb{R}$, which give the coordinates as functions of the point on the manifold, and the functions (called the same) $X^i : \mathbb{R}^n \to \mathbb{R}$, which give them as functions of the coordinates of the points.
We want to end this section by saying a thing about smoothness.
A vector field X is smooth if and only if the coordinate functions are smooth in some chart near every point p. This is also equivalent to them being smooth in every chart.
There is a well-known theorem from topology, that states that you cannot comb
a hedgehog without giving it a bald spot. If you are confused, maybe you should
look at the theorem below.
There does not exist a smooth non-zero (tangent) vector field on the sphere.
If you are still confused, let us explain with an example. Imagine that vectors,
which are usually drawn as arrows, are hairs/spikes, and that the sphere is, in reality,
a curled-up hedgehog. If the hedgehog is really curled up perfectly and you comb it
(make the hairs/vectors tangent to the hedgehog/sphere), it will have a bald spot
(a place where X = 0). This was the analogy that some mathematicians came
up with for this theorem, and it has stuck, especially because of how visual the
theorem is. We won’t prove it here, but we will give an example which should be a
bit convincing.
You can hopefully imagine that this vector field is smooth, and where the
hair/hedgehog analogy comes from. Now notice that the closer you get to
the north pole, the smaller the hairs become, and at the north pole itself
you get a length of zero, i.e. a bald spot. (The same thing happens at the
south pole.)
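The bald-spot field described in the box can be written down explicitly. The rotation field below is one standard choice (not taken from the script): it is tangent to the sphere everywhere and vanishes exactly at the two poles — a minimal numerical sketch.

```python
import math

def hedgehog_field(p):
    # rotation around the z-axis: V(x, y, z) = (-y, x, 0)
    x, y, z = p
    return (-y, x, 0.0)

def length(v):
    return math.sqrt(sum(c * c for c in v))

# a few points on the unit sphere, from the equator up to the north pole
points = [(1.0, 0.0, 0.0),
          (0.5, 0.5, math.sqrt(0.5)),
          (0.0, 0.0, 1.0)]
for p in points:
    v = hedgehog_field(p)
    tangency = sum(a * b for a, b in zip(p, v))  # v . p = 0 means v is tangent
    print("|V| =", round(length(v), 3), " V.p =", round(tangency, 3))
# the hairs shrink towards the poles; at (0, 0, 1) the field vanishes: the bald spot
```

Note that |V| = sqrt(x² + y²), which is exactly the "hairs getting shorter towards the poles" from the analogy.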
Now what does this hedgehog theorem have to do with the tangent bundle of the
sphere? Well, let us for a minute imagine that, in fact, T S 2 = S 2 × R2 , as a
manifold. Then take the function:
Figure 6.5: To get the tangent bundle of the sphere, you need to take the
sphere apart into a lower and an upper half, create the tangent bundles for
these two, which are just U × R2 , and then glue them together, but with
a twist, which ensures that the hedgehog theorem is true.
What can we see from this example? Firstly, the main lesson is that the tangent
bundle is not a trivial object: it is not always just the space times R2 , and this
has big implications for the sort of vector fields that can even exist on the
manifold. Another thing you might have already guessed is that locally, however,
the tangent bundle over a chart domain U is just U × Rn .
T U = U × Rn (6.11)
The proposition at the end is simple to prove in charts, and since there always
exists a chart covering a small region around any p, it holds in general.
Figure 6.6: The construction of the torus and the Klein bottle. The only differ-
ence between them is that one of the arrows on the Klein bottle is flipped.
From this, you can see that while the first linearly independent vector
field works on both (you get no discontinuities), the second type does not.
If you go over the seam on the Klein bottle, you switch directions, and when
you return (to the middle) you find you have a discontinuity.
Yes, we can. Your first guess might be that it has something to do with n being
even or odd, since 2 is even and 1 is not. Extrapolating from two data points is
dangerous, but sometimes pays off. You would be partially right. The result about
whether or not S n has at least one smooth non-zero vector field carries over like this.
The result about the structure of T S n does not, however.
Why? Well, from the depths of linear algebra, you might vaguely remember
that these are the dimensions in which you can have a vector product, and that
this has something to do with extending the complex numbers. The reason is that S 1
consists of the unit complex numbers, S 3 of the unit quaternions and S 7 of the unit
octonions, and these are the only such extensions of the complex numbers.
The manifold S n has at least one non-zero smooth vector field if and only
if n is odd. If n is even, it has none.
You might ask how many linearly independent non-zero vector fields there are for
each odd n. That is a complicated question, which has nevertheless been answered. It
has, remarkably, something to do with the binary structure of n (how many times 2
divides it), which is almost unbelievable. Why would vector fields on spheres care
about the binary expansion of n? You can google this result if your curiosity is piqued.
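The remark above can be made precise by Adams' theorem on vector fields on spheres. The function below computes the Radon–Hurwitz count ρ(n) − 1 of pointwise linearly independent fields on S^(n−1); this is standard material not contained in the script, so treat it as a sketch.

```python
def radon_hurwitz_fields(n):
    """rho(n) - 1: the maximal number of pointwise linearly independent
    smooth vector fields on the sphere S^(n-1), by Adams' theorem.
    Write n = (odd) * 2^b and b = c + 4d with 0 <= c <= 3;
    then rho(n) = 2^c + 8d."""
    b = 0
    while n % 2 == 0:
        n //= 2
        b += 1
    c, d = b % 4, b // 4
    return 2 ** c + 8 * d - 1

# S^n has a non-zero vector field iff this count is >= 1, i.e. iff n is odd:
for n in range(1, 9):
    print("S^%d:" % n, radon_hurwitz_fields(n + 1), "independent field(s)")
```

In particular S^1, S^3, S^7 give 1, 3, 7 — they are parallelizable, matching the complex/quaternion/octonion remark above.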
Part III
We developed manifolds in great detail in the second part of this course,
but we didn't really talk about where the manifolds you work with come from. We also
didn't talk about things like manifolds living inside manifolds, or orientability, which are
the topics of this part of the lecture. We want to get our hands on some manifolds, to be
able to work with them in later parts of the lecture.
Chapter 7
Submanifolds and how to get them
• p ∈ U , of course
• The piece of M inside U gets mapped onto the copy of Rm living in Rn :
Ch(M ∩ U ) = (Rm × {0}) ∩ Ch(U )
Figure 7.1: A curve (= submanifold M ) locally gets mapped onto the real
line, if we find a chart like the one that we need in the definition.
do want consistency.
We won’t prove it, but we will point out a few steps. The main idea is to build
an atlas from all the charts of the above definition. Let A_N^{(M)} be the collection of
charts (not an atlas, even though named A) that satisfy the second condition of
the definition. Then we can reduce each of these charts to ones of M :
where:
V = U ∩ M and Ch_M = Ch|_{M ∩ U} : M ∩ U → Rm   (7.2)
and create an atlas A_M for M . These charts cover M , since M is a submanifold
and we therefore find such a chart for every p ∈ M . The only thing left to prove is
that these really are charts, plus a few technicalities.
A noteworthy point is that the maximality of the atlas gets inherited: A_M is
already a maximal atlas.
Figure 7.2: The types of charts we take for the atlas of M . They are all the
charts which satisfy the second condition of the definition of a submanifold,
namely that they take a piece of M to Rm ⊂ Rn .
• Classical groups of linear algebra, O(n), U (n), SL2 (R) etc., are sub-
manifolds of R^{n×n} = R^{n²}
• When can we get a submanifold from constraints, for example f (x, y, z) =
x2 + y 2 + z 2 − 1 = 0? That is, when is the preimage of a map a submanifold?
The tools we will develop for this will be immersions and submersions, which will
be the partial answers to the first and second questions respectively.
This is of course equivalent to the kernel of dfp being trivial ({0}) for all p. It also
forces m ≤ n, since a linear map into a lower-dimensional space cannot have trivial
kernel. This is obviously sensible, since you can't (injectively) parameterise into a
space of smaller dimension than the number of variables you have.
The helping tool for the second question will be, in some sense, the ”dual” of an
immersion, that is a submersion, which is a map whose derivative is surjective.
If you did a lot of linear algebra and recognize the similarity of the two questions
we are asking, you will not be surprised that if injective linear maps are involved
in one, then surjective ones will be involved in the other.
An idea similar to the above is that of an embedding. We have already seen
many embeddings: specifically, curves and surfaces embedded in R3 appeared a lot
in the first part of the lecture. An embedding is basically a very good way
of putting the manifold into a bigger manifold while preserving all of its major
properties and not losing anything.
A function f : M → N is an embedding of M into N if f : M → f (M ) is a
diffeomorphism.
parametrized with the whole of R. The curve intersects itself at (1, 0),
since γ(1) = γ(−1) = (1, 0). However, the property of being an
immersion is clearly local with respect to the domain (here, time),
and the differential dγt is injective everywhere. But this means that
we cannot guarantee that images of immersions are always
submanifolds; additional conditions will be needed.
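The two claims about γ(t) = (t², t³ − t) — it self-intersects, yet its differential never vanishes — can be checked directly. A quick numerical sketch (not part of the script):

```python
def gamma(t):
    # the curve from the text
    return (t * t, t ** 3 - t)

def dgamma(t):
    # derivative (2t, 3t^2 - 1); gamma is an immersion iff this is never (0, 0)
    return (2 * t, 3 * t * t - 1)

# self-intersection: two different times hit the same point (1, 0)
assert gamma(1.0) == gamma(-1.0) == (1.0, 0.0)
# dgamma(t) = (0, 0) would need t = 0 and 3t^2 = 1 at once -- impossible,
# so the differential is injective everywhere despite the self-intersection
for t in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    assert dgamma(t) != (0.0, 0.0)
print("gamma is an immersion but not injective")
```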
Rn \ {0} → S n−1   (7.4)
x ↦ x/|x|   (7.5)
The last examples may have led you to think that an embedding is just an injective
immersion and vice versa. The only problem with the curve γ(t) = (t2 , t3 − t) was
the fact that it was not injective, after all. This is not quite true. While the first
claim is generally true (an embedding is an injective immersion), the converse is not
true in general. It requires another condition, which we will talk about later. For
now, we can prove the first claim.
Proof. There are two claims. Firstly, f has to be injective. But this follows from
the definition, since f : M → f (M ) has to be a diffeomorphism and therefore
also bijective, which means that f : M → N is injective. It remains to
prove that f is an immersion, i.e. that df (p) is injective for all p ∈ M . We
can prove this using the chain rule. Firstly, note that since f : M → f (M ) is a
diffeomorphism, it is clear that:
f −1 ◦ f = idM (7.6)
Differentiating at p with the chain rule gives d(f −1 )_{f(p)} ◦ df (p) = id_{T_p M}, so
df (p) has a left inverse and is therefore injective.
The Inverse function theorem is a key theorem in this part of the lecture. It will
give rise to two other key theorems, the local immersion theorem and the local
submersion theorem, which will be the answer to the question of how we can build
submanifolds.
The Inverse function theorem tells us something about local diffeomorphisms.
Local diffeomorphisms are functions between manifolds which are, locally around
every point, diffeomorphisms onto their image.
Figure 7.7: The idea of a local diffeomorphism is that you have a function
relating two manifolds, in such a way that every point p on M has a (small)
neighbourhood which is mapped diffeomorphically to its image under f .
The inverse function theorem tells us that, for a function f , if dfp is an isomor-
phism (= determinant non-zero in charts), then f is a local diffeomorphism at that
point.
Figure 7.8: A local diffeomorphism does not have to be injective and there-
fore does not have to be a global diffeomorphism.
f |_U : U → V   (7.9)
is a diffeomorphism.
Many things follow from the inverse function theorem, some of which we want
to address here.
This corollary is a direct application of the inverse function theorem, one could
even see it as just a reformulation of it.
The proof of this is very similar to many proofs we have already seen, using a covering
of the relevant spaces and the interplay between global and local properties.
We already saw that the injectivity of an immersion plays a big part in
its image being a submanifold. But other conditions are needed too, as this
example shows. Regard the curve in figure 7.9.
Figure 7.9: This curve comes back infinitesimally close to itself and in turn
is not a submanifold of R2 , even though the map is an injective
immersion.
The curve comes infinitesimally close to itself at a point. Thus the curve is not
a submanifold, since it doesn't look (locally) like R there. But the
map is a smooth injective immersion. This shows that injective immersions
are not necessarily embeddings, but that some further condition needs to be
satisfied.
To see that being an immersion is definitely necessary, we can look at the curve in
figure 7.10.
Figure 7.10: The curve γ(t) = (t2 , t3 ) is injective, but not a sub-
manifold. It fails at the point t = 0, where dγ is not injective, hence
γ is not an immersion.
Figure 7.11: This is how you can construct the immersion of the Klein
bottle into R3 . Note that because the constructed bottle intersects
itself, this cannot be an embedding. In fact, it is impossible to embed
the Klein bottle into R3 .
By the way the bottle looks, you should be convinced that this is an immer-
sion, but it is not injective, so it is also not an embedding. For the points
of intersection, the surface does not locally look like R2 .
Chapter 8
Coverings
File corrupted.
Chapter 9
Orientations
E = {e1 , e2 , . . . , en } (9.1)
E ′ = {e′1 , e′2 , . . . , e′n } (9.2)
where e1 , . . . , en are the basis vectors of the first basis, E, and e′1 , . . . , e′n are the
basis vectors of the second basis E ′ . How can we express whether they are orientated
the same way or not?
Figure 9.2: Your hands are two objects which have different orientations.
You cannot simply turn one into the other by a rigid motion.
and then say that E ′ and E have the same orientation if the matrix aij has a
positive determinant, and opposite orientations if it has a negative determinant.
(The case det(aij ) = 0 is not possible since E ′ and E are bases.)
Figure 9.3: There are two different kinds of orientations. If you switch
bases, then two things can happen. Either you switch orientation, or you
don’t. To switch, you will need to reflect in some way.
Why, you might ask, does the determinant have anything to do with this? Well,
let's start with the two-dimensional example in the figure. As you can see,
A = QR (9.4)
What does this mean geometrically? Well, R is an upper triangular matrix with
a positive diagonal, which means that it is a combination of shears and positive
scalings. A shear is a transformation that tilts one basis vector after the other
towards the previous basis vectors, so it cannot be orientation-reversing, and neither
can a positive scaling. Q is an orthogonal map, i.e. either a rotation
or a reflection. If it is the former, it also preserves orientation. If it is the latter, it
reverses it.
What about the determinant? You probably remember that the determinant of
rotations is +1 and of reflections −1. What about R? Well, it is an upper triangular
matrix with a positive diagonal, so it is definitely positive (Why?).
By:
det(A) = det(QR) = det(Q)det(R) (9.5)
we get the fact that the sign of det(A) is positive if and only if there are no reflections
in the transformation, and negative if and only if there is one.
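The determinant argument can be tried out concretely. Below is a hand-rolled 2 × 2 Gram–Schmidt QR (kept dependency-free for illustration; a library routine would do the same), showing that det R > 0 always, so the sign of det A sits entirely in Q:

```python
import math

def qr_2x2(a):
    """Gram-Schmidt QR of a 2x2 matrix a (columns = the basis vectors).
    Returns (q, r) with q orthogonal and r upper triangular with a
    positive diagonal, so sign(det a) = sign(det q)."""
    (a11, a12), (a21, a22) = a
    n1 = math.hypot(a11, a21)            # length of the first column
    q1 = (a11 / n1, a21 / n1)
    r12 = q1[0] * a12 + q1[1] * a22      # projection of column 2 onto q1
    u2 = (a12 - r12 * q1[0], a22 - r12 * q1[1])
    n2 = math.hypot(*u2)
    q2 = (u2[0] / n2, u2[1] / n2)
    q = [[q1[0], q2[0]], [q1[1], q2[1]]]
    r = [[n1, r12], [0.0, n2]]
    return q, r

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

same = [[2.0, 1.0], [0.0, 1.0]]          # a shear: same orientation as (e1, e2)
flip = [[0.0, 1.0], [1.0, 0.0]]          # swap e1 <-> e2: a reflection
for a in (same, flip):
    q, r = qr_2x2(a)
    assert r[0][0] > 0 and r[1][1] > 0   # det r > 0 always
    print("det A =", det2(a), " det Q =", round(det2(q), 6))
```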
Therefore the definition makes sense.
Let E, E ′ be two choices of bases for V . We say E and E ′ have the same
orientation, if the basis-change matrix A has a positive determinant and
different orientations if the determinant is negative.
We can now define an orientation for V . We can group all the bases of V into
two groups, having either one orientation or the other. (Which one is which is a
matter of choice). We do this through equivalence classes.
tells us that two bases are equivalent if they have the same orientation. We
define an orientation of V to be an equivalence class under ≃.
• O(V ) contains 2 elements; that is, there are two ways to orient V .
• Any odd permutation of (e1 , . . . , en ) reverses the orientation, any even per-
mutation preserves it.
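The permutation statement can be checked by counting inversions — a small sketch (with 0-based indices, unlike the e1, ..., en in the text):

```python
def permutation_sign(perm):
    # sign via inversion count: +1 for even permutations, -1 for odd ones;
    # this equals the determinant of the corresponding permutation matrix
    n = len(perm)
    inversions = sum(1 for i in range(n)
                     for j in range(i + 1, n) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

# swapping two basis vectors reverses orientation ...
assert permutation_sign((1, 0, 2)) == -1
# ... while a cyclic shift of three (two transpositions) preserves it
assert permutation_sign((1, 2, 0)) == +1
```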
We can also pack the vector space and the orientation into a tuple.
Definition 9.1.3: A
You can see this in the figure below. Basically, a frame is a (local) system of basis
vectors for all the Tp M 's, such that these basis vectors vary smoothly from one place
to another.
Figure 9.6: The vector fields (blue, red, orange) are already at each point
a basis of Tp M and vary smoothly, thus they are a frame for M . Notice
that they don’t need to be orthogonal.
Lemma 9.3.1
• If M is connected, then f is either OR or OP, and cannot be anything
else.
• The obvious relations for preserving/reversing a property hold:
OP · OP = OP (9.11)
OR · OP = OR (9.12)
OP · OR = OR (9.13)
OR · OR = OP (9.14)
Let S 1 be the circle as we know and love it. What does OS 1 look like?
Well, it is quite simple, really. It looks like two copies of a circle, one with
one orientation, and one with the other. No twisting or anything weird like
with T S 2 . You can see it in the figure below.
Figure 9.7: OS 1 is just two copies of the circle, with different orien-
tations, as you would expect, and as you should!
As you can see in the example, OS 1 is simply S 1 × {±1}, really, which is just a
way of saying that OS 1 consists of two copies of the circle, one oriented the ”one
way”, one the ”other”. This is clearly not true for a general M , just as it is not
true that T M = M × Rn for all M . One obvious counter-example is the Möbius
strip, and this should make sense. If it were otherwise, that is, if OM = M × {±1},
then we would have two copies of the Möbius strip, one with ”one” orientation,
one with the other, but that cannot be, since the Möbius strip, as we saw, is not
orientable.
This idea is very general, as is the idea that OM is a smooth manifold.
The first claim should be clear: locally, OM does look like M × {±1},
since, M being a manifold, it locally looks like Rn , and we can therefore always
find exactly two possible orientations for a small local patch, as you can see in
figure 9.8. The second claim can be explained as follows. If M is orientable,
then OM has to have this structure, since you can find two global orientations +M
and −M , and they are disconnected in OM . The other direction is also simple, since
you can just call one component of OM the +1 orientation and the other the
−1, and you have found the two global orientations of M .
Figure 9.8: Locally OM looks like M × {±1}, since a manifold locally looks
like Rn , and thus you can find two orientations for a small patch around
any point p. By choosing one to be the +1 orientation and the other to be
the −1, we can see clearly that locally, over the patch around p, OM does
look like Patch × {±1}, even for the Möbius strip.
The double cover OM of M also has an interesting property, orientation-wise.
This is quite astounding if you think about it. It means that even for non-
orientable manifolds, like the Möbius strip, OM is orientable.
You can see this for the Möbius strip in figure 9.9.
Figure 9.9: On the one hand, you have the Möbius strip, which is non-
orientable due to the twist it has. If you take the orientation double cover
OM , you can see that it has two twists, and therefore (as you can trace
with your finger) it is orientable. If you are unsure why OM looks like it
does, think about how it is constructed and what it represents.
Chapter 10
Technicalities:
Countability
Another topic we want to discuss before going into the real stuff is countability.
The definition of a smooth manifold still allows for some extremely wild examples
that sometimes even feel absurd. This results from our definition of a smooth
manifold simply being a topological space (M, T ) with a smooth atlas, with the
only restriction on the topological space being that it is Hausdorff. This is not yet
enough for us, since it includes some weird examples that we want to exclude.
The main idea is that these ”pathological” examples happen if your topology
(= the set of all open sets) cannot somehow be reduced to something countable,
i.e. that it is built non-trivially out of an uncountable number of ”things” and can
thus show very wild behaviour.
There are three main notions of countability that we will use. They
are called second countability, σ-compactness and paracompactness. There are
many more, but these will be enough for our purposes here.
Figure 10.1: You can see the defining property of a basis of a topology in
this figure.
10.1.2 σ-compactness
The concept of σ-compactness is even simpler than that of paracompactness.
Figure 10.2: The real line is σ-compact, because it is the union of the
compact sets . . . , [−1, 0], [0, 1], [1, 2], . . . .
10.1.3 Paracompactness
To define paracompactness, we need something called an open cover. An open
cover, as you might imagine, is just a lot of open sets, that, together, cover the
space in question.
Figure 10.4: Clearly, the cover P is better (more refined) than O, because
it ”wastes less space”, i.e. every V ∈ P is contained in a bigger set U ∈ O.
There is another way to ”waste” your effort. You could also simply be using
more open sets to cover your set than you need, as you can see in figure 10.5.
Figure 10.5: You can be very wasteful by covering with far too many
sets. In the worst case, you could be using infinitely many sets you don't
need, like here with the point p, which does not even lie in A.
You can see a picture in figure 10.6 to get a feeling for the definition.
With this said, we can finally define paracompactness.
Basically, paracompactness guarantees that we will not end up in a situation
like in figure 10.5, where we have an open cover with infinitely much ”wastage”
that we cannot refine away without losing the covering of our space.
This criterion will be especially useful when we will work with partitions of unity.
We want to discuss how these three definitions interplay, first when you have just
a topological space, and then when you have a manifold. We shall see that the atlas
makes the interplay easier.
The particularly interesting fact is that (for the locally compact spaces we care
about) second-countable spaces are paracompact, but not all paracompact spaces
are second-countable.
We will explain the difference in ”size” between merely paracompact and also
second-countable spaces using Rn .
• As you can imagine, R2 is both. It is second countable because its
topology can be shown to be generated by the balls centred at points
with rational coordinates and with rational radius. It is also
not too hard to convince yourself that it is paracompact.
• In fact, any Rn is both paracompact and second countable.
So as you can see, paracompact spaces don't necessarily have to be ”small” like
second-countable ones do.
The difference between a general paracompact space and a second-countable one
is not too big though: you need just one extra condition on the paracompact space
for it to be second countable. It has to be separable.
This is quite a dense (pun intended) definition. But you can see it with
R: it is separable, since it has a countable dense subset, namely Q. It is a subset,
obviously; it is one of the first examples of a countable set that you usually learn,
via Cantor's counting trick; and it is dense in R.
For any locally compact space, second countable implies paracompact. For
any connected manifold, paracompact implies second countable.
The proofs of these two claims are both clever, and certainly not easy, but since
we only really need the results, we will not give them here.
We also have to explain what we mean by a locally compact space.
The upshot of the previous propositions is that for manifolds, second countable
simply means a countable number of second-countable connected components, while
paracompact means any number of second-countable connected components. This
is a lot simpler than for general topological spaces.
With all the definitions and propositions out of the way, we come to a truly
weird example.
We now come to a completely different topic again, talking about very useful
tools for actually proving results, especially ones involving concrete objects, like
the metric. In particular, we will be talking about ways to smoothly ”localize” stuff
(whatever the stuff might be), particularly in charts.
We will talk about bump functions and cut-off functions, which are functions
that look vaguely like bumps (think Gaussians), that you can multiply with an object
(e.g. a function) so that it becomes zero everywhere except where the bump is. We
then go on to talk of partitions of unity, which are extremely useful when introducing,
for example, integration on manifolds.
A bump is something that looks vaguely like a, well, bump. It is a function which
is zero everywhere except for a (small) open set, where it is strictly positive.
On R, we can create a bump function through the following idea. Let's start
with the function:
g_1(t) = \begin{cases} e^{-1/t} & t > 0 \\ 0 & t \le 0 \end{cases}   (11.1)
You probably remember from a calculus class that this function, astounding
as it looks (being exactly zero and then suddenly growing), is smooth.
We can construct a bump function for (−1, 1) from it by the following:
Since we are only multiplying, g2 is still smooth, and non-zero exactly on (−1, 1),
as you should be able to convince yourself of. We can even use it to make
a bump function for B1 ⊂ Rn , by taking:
g_3(x) = g_2(|x|^2)   (11.3)
This is non-zero exactly on the ball of radius one. You might wonder whether
it is smooth at 0, since it contains the absolute value. You can check that it is,
but it would not be if there weren't a square in the function.
Figure 11.2: The functions from this example. First you have the
function g1 , from which we construct bump functions for (−1, 1) and
for B1 ⊂ Rn .
We can also define something similar called a cut-off function. The idea is that
the function is 1 on a set A, and zero outside of a slightly bigger set U , falling off
smoothly in between.
Let u : M → R. We call the set of all points at which u(x) ̸= 0, together
with that set's boundary, the support of u:
supp(u) = spt(u) = \overline{\{u \ne 0\}}   (11.4)
These definitions are basically a way of defining a (small) edge around a set, in
a topologically sensible way.
We can construct cutoff functions for various sets from a bump function.
So let g be a bump function for the interval (1/4, 3/4). We can define a
function h1 by the following:
h_1(t) = \frac{\int_{-\infty}^{t} g(s)\,ds}{\int_{-\infty}^{\infty} g(s)\,ds}   (11.5)
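Numerically, the whole construction fits in a few lines. Since equation (11.2) did not survive in the script, the product g2(t) = g1(1 − t) · g1(1 + t) below is an assumed reconstruction (consistent with the remark that ”we are only multiplying”), and h1 implements the ratio of integrals from (11.5) with a simple trapezoid rule, using g2 in place of the bump g:

```python
import math

def g1(t):
    # smooth but not analytic: flat to infinite order at t = 0
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g2(t):
    # assumed bump for (-1, 1): non-zero exactly where 1 - t and 1 + t are both > 0
    return g1(1.0 - t) * g1(1.0 + t)

def h1(t, steps=2000):
    # cut-off rising smoothly from 0 (t <= -1) to 1 (t >= 1),
    # built by integrating the bump g2 with the trapezoid rule
    def integral(a, b):
        h = (b - a) / steps
        s = 0.5 * (g2(a) + g2(b)) + sum(g2(a + k * h) for k in range(1, steps))
        return s * h
    if t <= -1.0:
        return 0.0
    return integral(-1.0, min(t, 1.0)) / integral(-1.0, 1.0)

print(g2(-2.0), g2(0.0) > 0, g2(2.0))   # zero outside (-1, 1), positive inside
print(h1(-2.0), round(h1(0.0), 3), h1(2.0))
```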
is called locally finite if, for any point p ∈ M , there exists an open subset U
so that to calculate the sum on U , you only have a finite number of non-zero
functions to sum up. In other words, if:
The consequence of this is that everywhere, the sum locally looks just like a finite
sum, hence the name. If the functions are all differentiable, then the sum
can also be differentiated term by term.
Sketch. At every point, restrict the sum to the small set U on which only a
finite number of the functions are non-zero. Ignore all the rest, since they are zero
on U and thus cannot contribute anything to the derivative. Then differentiate the
finitely many that aren't, and you get your derivative. The second part follows from
similar ideas.
• \sum_{α∈A} ξ_α = 1
T = 1 \cdot T = \left( \sum_{α∈A} ξ_α \right) T = \sum_{α∈A} (ξ_α T) = \sum_{α∈A} T_α   (11.11)
where we have defined Tα = ξα T , which are now all smooth, localized versions of
T . This way, we can transfer to smaller sets, for example, ones that we can chart
with a single chart, i.e. we can transfer our global calculations into local ones in
charts.
Before we can use them as we like, we would need to prove that partitions of
unity even exist. Luckily, the conditions for this to be true are very lax.
Lemma 11.2.1
Assume the following:
• M is paracompact
• O is an open cover of M
• B is a basis for the topology T_M of M
Then there exists an open cover P of M , such that the following are true:
• P refines O, P << O.
• P is locally finite
• P⊂B
The first two follow from the definition of paracompactness. We will skip the
proof as it is a bit tricky.
We can now prove the theorem.
– Ū is compact
– There exists a V ⊂ M such that (U, V ) is diffeomorphic to (B1 , B2 ) ⊂
Rn by the same diffeomorphism. (See figure 11.8)
Step 2 Let O be any open cover of M . Then by the lemma, there exists a refine-
ment P that is locally finite and a subset of the basis B.
Step 3 We can write this open cover P as {U_α}_{α∈A} for some index set A.
Step 4 Let g be a cut-off for B1 in B2 . For each Uα select a Vα such that (Uα , Vα )
are diffeomorphic to (B1 , B2 ) by the diffeomorphism φα and define:
µ_α = \begin{cases} g ◦ φ_α & \text{on } V_α \\ 0 & \text{on } M \setminus \bar{U}_α \end{cases}   (11.12)
Step 7 This already looks like a partition of unity; the only part we have not
yet got is the normalization. We can get this easily by defining ξ_α to be:
ξ_α = \frac{µ_α}{\sum_{β∈A} µ_β}   (11.13)
Figure 11.8: The choices of U and V needed for the construction in the
proof.
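To see the normalization step (11.13) in action, here is a miniature partition of unity on R, built from translated bumps; the bump shape and the centers are illustrative choices, not the ones from the proof:

```python
import math

def bump(t):
    # smooth bump supported on (-1, 1)
    return math.exp(-1.0 / (1.0 - t * t)) if -1.0 < t < 1.0 else 0.0

def partition(t, centers=(-1.0, 0.0, 1.0, 2.0)):
    """Normalized family xi_c(t) = mu_c(t) / sum_c mu_c(t), where
    mu_c(t) = bump(t - c). On the region covered by the bumps, the xi_c
    are smooth, non-negative, and sum to 1 -- a partition of unity in
    miniature, following the normalization (11.13)."""
    mus = [bump(t - c) for c in centers]
    total = sum(mus)
    return [m / total for m in mus]

xis = partition(0.3)
print([round(x, 3) for x in xis], "sum =", round(sum(xis), 6))
```

Each ξ is supported in one small interval, so multiplying an object by ξ ”localizes” it there, exactly as described for T = Σ T_α above.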
Chapter 12
Immersions: Local Immersion Theorem
After a lot of (technical) preparations, and a few side topics that fit here but were
not strictly needed, we come to answer one of the questions of how to get
submanifolds.
Let f : L → N be a smooth map. When is M = f (L) a submanifold of N ?
We already saw that this has something to do with immersions, the definition of
which we include here as a reminder.
We have also seen that embeddings are always injective immersions, but the
opposite is not necessarily true.
These questions will be answered, in parts, by the main theorem of this chapter,
the local immersion theorem.
To state it, we need a certain concept, which sounds a bit trivial/weird to even
state.
You might protest at this definition. Why in the world would we give, what is
basically ”the identity” between Rm and Rn , a name? What is so special about it?
Well, the thing that is special about it is that it is an extremely simple, practically
trivial map. The reason why we even give it a name is because of this:
This is an extremely important result. Let us restate it. Any (smooth) map
whose differential at a point is injective, locally looks just like (x1 , . . . , xm ) →
(x1 , . . . , xm , 0, . . . , 0).
The proof of this theorem will rely on the inverse function theorem, which states
that if g : P → Q is a map and dg(p) is bijective, then g restricted to some open
set U around p is a diffeomorphism onto g(U ).
Proof. We want to use the inverse function theorem to prove this result, but
that requires manifolds of the same dimension, while our condition is m ≥ n.
The basic idea is to augment M until it has dimension n (artificially), use
the inverse function theorem, and then throw away the extra dimensions we
added. Let’s start by fixing a p ∈ M . The theorem is entirely local, so we can
without loss of generality work in a chart. In this chart, let f : U → V where
U ⊂ Rm , V ⊂ Rn and f (U ) ⊂ V , of course. We know, from the condition, that
df (p) : Tp Rm = Rm → Tf (p) Rn = Rn is injective, so without loss of generality,
we can transform Rm and Rn , so that p = 0, f (p) = 0, df (p) = i : Rm → Rn ,
simply by using some linear algebra (Rigid motion, then Gauss-decomposition).
So we get the result already, but only at p, not locally on U . That’s the part
we will need the inverse function theorem for.
To use the inverse function theorem, we will need to add n − m dimensions
to M somehow. We do this by ”stacking” M enough times. What this means is
that we take, instead of the manifold M , the manifold M × Rn−m , or in charts:
where the first coordinates refer to those on M , the rest to Rn−m . We can also
write N (in the chart Rn ) like this:
y = (y', y'') = (y^1 , . . . , y^m , y^{m+1} , . . . , y^n )   (12.3)
You can see a visualization of what we are doing in figure 12.2. We can also
replace the function f by a function F : Rn → Rn going from the chart of
M × Rn−m to the chart of N . We cannot just choose F (x′ , x′′ ) = f (x′ ), since
its differential would not be bijective (all partials with respect to x′′ would be
zero). So we need to choose differently. We need to add something which will
make F not too different from f , but its differential bijective. We can simply choose:
F (x′ , x′′ ) = f (x′ ) + (0, x′′ )
which is the simplest choice we can get. The first term, f (x′ ) is independent of
x′′ , and similarly, the second term, (0, x′′ ) ≡ j(x′′ ) is independent of x′ . The
second term shifts things up.
To use the inverse function theorem, we need to show that dF is bijective at
p = 0. Let us compute:
Figure 12.2: The construction we use in the proof of the local immersion
theorem.
This theorem is extremely important and many things follow from it. As an
example, it implies that f (U ′ ) can locally be written as a graph near f (p) in the
coordinates (y^1 , . . . , y^n ) of N :
y′′ = g(y′) = f (G(y′, 0)′)′′ = (f ◦ π_{Rn→Rm} ◦ G ◦ i_{Rm→Rn})(y′)   (12.11)
Corollary 12.0.1
We can also use it to finally find an answer to the question that started this
chapter.
The proof is left as an exercise. You can see the grand idea of it in the figure.
The problem with this global answer is that it is hard to check for some f . We
can also give a different answer.
For the first, global answer, you need to check whether f is a homeomorphism,
which means checking whether it is bijective (i.e. checking injectivity and
surjectivity), whether it is continuous, and whether f −1 is continuous. Requiring M
to be a compact manifold is much simpler.
Let f : R → R be the function f (t) = tanh(t), which you can see in the
figure. f is not proper, as you can easily see: f −1 ([0, 2]) = [0, ∞), and
since [0, 2] is compact and [0, ∞) is not, f cannot be proper. Let
F : R → R2 now be the function F (t) = (t2 , t(t2 − 1)). This function is
proper, for the simple reason that if |t| → ∞, then |F (t)| → ∞, so you
cannot find a compact set in R2 in which the curve stays for a non-compact
interval.
Figure 12.6: The two functions in the example. The first is not
proper, the second is.
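The two properness claims in this example can be checked numerically. A minimal sketch (the sample points and tolerances are our own choices, not part of the script):

```python
import math

# f(t) = tanh(t): the preimage of the compact interval [0, 2] is [0, oo),
# which is unbounded -- so f is not proper.
def f(t):
    return math.tanh(t)

# Arbitrarily large t still satisfy 0 <= f(t) <= 2, so they all belong to
# f^{-1}([0, 2]): the preimage cannot be compact.
assert all(0 <= f(t) <= 2 for t in [1, 10, 100, 1000])

# F(t) = (t^2, t(t^2 - 1)): |F(t)| -> oo as |t| -> oo, so preimages of
# bounded sets are bounded (and closed by continuity), hence compact.
def F(t):
    return (t**2, t * (t**2 - 1))

def norm(v):
    return math.hypot(*v)

# |F(t)| grows without bound, so the curve escapes every compact set.
assert norm(F(10)) > norm(F(5)) > norm(F(2))
assert norm(F(1000)) > 1e6
```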
We now give a particularly nice example to see a lot of the concepts interplaying.
Example 12.3.2
Let us take a function of the real line to the torus, T 2 . Specifically, let’s
take the function:
f (t) = [(t, αt)] (12.12)
where [·] denotes the equivalence class in the torus T 2 ≃ R2 /Z2 (basically, it
sends the point (a, b) to a point on the square by adding and subtracting
integers to the coordinates until it lands in the square [0, 1] × [0, 1]).
If α is an irrational number, then the line you get on T 2 never meets itself,
that is, it never closes up. Notice though, that f is an injective immersion.
There are a few things that happen because of this.
• f (R) is dense in T 2 .
Figure 12.7: The map from this example. If α is irrational, then a lot
of weird things happen. In particular, f (R) never closes up, becomes
dense in T 2 and is not a submanifold of T 2 . All of this happens
because f is not proper.
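The density of f (R) can at least be illustrated numerically. This is an illustration, not a proof; the target point and the sampling scheme are our own choices:

```python
import math

alpha = math.sqrt(2)  # an irrational slope

def point_on_torus(t):
    # f(t) = [(t, alpha * t)] in T^2 = R^2 / Z^2: reduce each coordinate mod 1
    return (t % 1.0, (alpha * t) % 1.0)

def circle_dist(a, b):
    # distance between two points of R / Z
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

# Pick an arbitrary target point on the torus.
target = (0.3, 0.7)

# At times t = k + 0.3 the first coordinate is exactly 0.3, and the second
# coordinate alpha * (k + 0.3) mod 1 equidistributes because alpha is
# irrational -- so the image comes arbitrarily close to the target.
best = min(
    circle_dist(point_on_torus(k + 0.3)[1], target[1])
    for k in range(20000)
)
assert best < 0.01  # and it keeps shrinking as we take more k
```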
Theorem 12.3.2
To prove this theorem one needs a few lemmas, which we will just state here.
240 CHAPTER 12. IMMERSIONS: LOCAL IMMERSION THEOREM
Lemma 12.3.1
• In a Hausdorff space, compact sets are closed
• In a compact set, closed sets are compact
• In any topological space, if K is compact and C closed, then K ∩ C
is compact.
Lemma 12.3.2
If M is a manifold, then for all U open and all p ∈ U , there exists an open V
with p ∈ V , such that V̄ is compact, V̄ ⊂ U , and V is contained in a chart.
Definition 12.3.2
If U ⊂ M is open then we call A ⊂ U compactly contained in U if:
• Ā ⊂ U
• Ā is compact
and we write A << U .
The idea of A being compactly contained in U is that it keeps away from the
boundary.
Lemma 12.3.3
A manifold is locally compact, that is, for all p ∈ M there exist an open
V and a compact K such that p ∈ V ⊂ K.
Lemma 12.3.4
Let Y be a locally compact Hausdorff space and let D ⊂ Y . Then D is
closed iff for every compact K ⊂ Y , the intersection K ∩ D is compact.
Lemma 12.3.5
Assume the following:
• X is a topological space
Embeddings: Whitney’s
Theorem
In the last chapter, we have talked a lot about immersions, embeddings and their
interplay. We saw that embeddings are at the very least injective immersions, and
we saw that immersions are locally embeddings. We also saw that if M is compact,
then an injective immersion is an embedding and that if f is a proper injective
immersion, it is an embedding. We want to leave this interplay for now (except
as a tool) and talk a little bit about embeddings themselves, in particular, a very
important theorem, that lifts the mystique around manifolds and realizes them as
something more tangible. This theorem is called Whitney’s embedding theorem and
this chapter is all about it.
The k in Rk is there so that you don’t accidentally think that this theorem
says you can embed any n-dimensional manifold in Rn , because that is simply not
possible.
Before we throw the proof at you, there are a few refinements and variations of
the theorem we want to discuss.
244 CHAPTER 13. EMBEDDINGS: WHITNEY’S THEOREM
This tells us that we need at most twice the number of dimensions to
embed a manifold. In particular, it tells us we never need more than R2 to embed
any curve and R4 for any surface. For both of these, it makes sense that these are
upper bounds and also that they can’t be made smaller. For curves, a circle needs
at least two dimensions to be embedded in; for surfaces, well, we already met the
Klein bottle, which cannot be embedded in three dimensions. 2n is however not
the best we can do for all n.
What is the idea behind this refinement? Well, from the first version of the
theorem, we know that any manifold can be embedded in some Rk . We can
take this embedding and project the whole thing linearly down onto lower-dimensional
linear subspaces. It turns out that a projection down to R2n+1 will work, provided it
remains an injective immersion. To get down to R2n is another matter and requires
a bit more trickery, as it is not clear how to avoid creating double points. This
requires a trick, which is aptly named Whitney’s trick, but which we will not cover here.
Proposition 13.2.1
If you are wondering why manifolds somehow count the number of digits in the
binary expansion of n (e.g. 8 in binary is 1000, so b(8) = 4), then you are not alone;
this connection is quite confusing.
The optimal embedding dimensions are not known! There are a few n’s for
which they are known, but not for all.
13.3 Examples
We can give a few examples. For n = 2, we know we can embed every surface
in R2n = R4 . You can even classify all the surfaces that can be embedded in R3 . Since smooth
manifolds don’t carry a rigid structure (no metric or similar), these classifications are
really the same as the ones in topology.
Let’s say you have these charts and there are k of them. We can construct
a map into Rn·k = Rn × Rn × · · · × Rn (k times) by taking these charts and
combining them with cutoff functions; that is the basic idea of the proof.
Let’s fill out the details.
• First, cover M by open sets (Up , Vp ), where p ∈ M , and Up << Vp .
• Then pick a diffeomorphism of these, φp , which takes (Up , Vp ) → (B1 , B2 ) ⊂
Rn .
• Since M is compact, there exists a finite subcover of (Up )p∈M . We can
relabel these as Ui , Vi , φi , where i ∈ {1, 2, . . . , k}. This is where we are
using compactness.
• For each i, we can use φi to define a cutoff function (to get the embedding
later on). We need a cutoff function (any one will do) for (B1 , B2 ) and can transfer
it through the diffeomorphism φi onto (Ui , Vi ). We call this function (defined on all
of M ) ξi . It is smooth as well, since φi is a diffeomorphism and the cutoff for
(B1 , B2 ) we choose is smooth.
• In addition, we can find a bump function for Ui , which is nonzero exactly
on Ui , and zero otherwise, in a similar fashion. Let’s call it hi .
• We can now construct an embedding f from M → R(n+1)k = Rn+1 ×
Rn+1 × · · · × Rn+1 (k times). We write φi = (x_i^1 , x_i^2 , . . . , x_i^n ) : Vi → B2 ⊂
Rn . Then we can define the functions:

gi := { ξi φi  on Vi
      { 0      on M \ supp(ξi )        (13.1)
13.4. PROOF OF WHITNEY’S EMBEDDING THEOREM 247
f = (g1 , h1 , g2 , h2 , . . . , gk , hk ) : (13.2)
M → Rn × R1 × Rn × R1 × · · · × Rn × R1 = R(n+1)k (13.3)
The first line follows from the fact that ξi is a cutoff function for Ui in Vi , so
it is 1 on Ui ; and since p ∈ Ui , dξi |p = 0, because ξi ≡ 1 on Ui . What we have
shown is that d(ξi φi )|p = dφi |p . But we know that φi : Ui → φi (Ui ) = B1 ⊂ Rn
is a diffeomorphism, so:
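The script leaves the concrete cutoff for (B1 , B2 ) unspecified. A standard construction in one radial variable, built from e^{−1/x} (the particular constants are our choice; this is a sketch, not the script’s definition):

```python
import math

def h(x):
    # smooth on R, identically 0 for x <= 0, positive for x > 0
    # (smoothness at 0 is the classical e^{-1/x} argument, not checked here)
    return math.exp(-1.0 / x) if x > 0 else 0.0

def cutoff(r):
    # xi(r) = 1 for r <= 1, xi(r) = 0 for r >= 2, smooth in between:
    # exactly the kind of cutoff needed for the pair of balls (B_1, B_2)
    return h(2.0 - r) / (h(2.0 - r) + h(r - 1.0))

assert cutoff(0.5) == 1.0           # identically 1 on B_1
assert cutoff(2.5) == 0.0           # identically 0 outside B_2
assert 0.0 < cutoff(1.5) < 1.0      # smooth interpolation in between
```

Transported through a chart diffeomorphism φi , this gives the function ξi used in the proof.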
Submersions: Local
Submersion Theorem
Now we come back to the second question we wanted to answer about submanifolds.
Let f : M → N be a smooth function. If L ⊂ N is a submanifold of N , is f −1 (L)
a submanifold of M ? The answer will have, as the title hints at, something to
do with submersions. There will be a theorem very similar to the local immersion
theorem, called the local submersion theorem.
14.1 Submersions
We want to recall the definition of a submersion, and give a few examples to make
it a bit clearer as to what we are working with.
250 CHAPTER 14. SUBMERSIONS: LOCAL SUBMERSION THEOREM
As before, we only name this function because it will show up in the theorem
that will follow. Before we state it, however, we want to give an example of the
theorem (submersions are enough).
Example 14.2.1
Take the curve γ(t) = (t2 , t3 ). We have seen it often before. Its image is
clearly not smooth at t = 0. We described this behaviour already, by examining γ
as a curve, as one of the first counterexamples in chapter 1. We now give a
different view of it, seeing it as a failed submersion. We can write γ as the
graph of y = ±x3/2 , which is the zero set of the function f (x, y) = x3 − y 2 .
We know f −1 (0) is not a submanifold, since by construction it is just the image of
γ, which is not smooth at the origin. We can see that f is also not a
submersion: at (0, 0), we get df |(0,0) = (0, 0), which cannot be
surjective onto T0 R = R.
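The claims in this example can be verified symbolically, with the same f and γ as above (a small sympy check, not part of the script):

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
f = x**3 - y**2                      # zero set = image of gamma(t) = (t^2, t^3)

# gamma lands in the zero set of f:
assert sp.simplify(f.subs({x: t**2, y: t**3})) == 0

# df = (3x^2, -2y) vanishes at the origin, so df|_(0,0) : T_(0,0)R^2 -> T_0 R
# is the zero map -- not surjective, so f fails to be a submersion there.
df = sp.Matrix([[sp.diff(f, x), sp.diff(f, y)]])
assert df.subs({x: 0, y: 0}) == sp.Matrix([[0, 0]])

# Away from the origin the differential is nonzero, hence surjective onto R:
assert df.subs({x: 1, y: 1}) != sp.Matrix([[0, 0]])
```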
Corollary 14.3.1
The proof of this is immediate from the local submersion theorem: on a
small open set around p, f looks like the canonical linear
submersion.
14.3. THE LOCAL SUBMERSION THEOREM 253
Corollary 14.3.2
Fix p ∈ M and call q = f (p). If dfp is surjective, then there exists a small
open set U ⊂ M such that f −1 (q) ∩ U is a submanifold of M of dimension
m − n.
Proof: Exercise. What about a more global version? That also exists.
This corollary easily follows from the local submersion theorem. This one is
especially important because it lets us define (sub)manifolds through equations. We
have now, finally, learned a very concrete way of defining manifolds: we can do so
through equations. Note how far we have gotten. We had the very abstract, non-concrete
way of doing it by defining an atlas, which is impractical but conceptually
enlightening. We have now gotten the tools to do so through equations and
functions, tools that are a world away from the atlas.
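As a concrete instance of defining a manifold through an equation, consider the standard example f (x, y, z) = x2 + y 2 + z 2 on R3 , whose level set f −1 (1) is the sphere S 2 (the example and the chosen point are ours, for illustration):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + y**2 + z**2               # f : R^3 -> R

df = sp.Matrix([[sp.diff(f, v) for v in (x, y, z)]])   # df = (2x, 2y, 2z)

# 1 is a regular value: df is nonzero (hence surjective onto T R = R) at
# every point of f^{-1}(1), since |p| = 1 excludes the origin.
p = {x: sp.Rational(1, 3), y: sp.Rational(2, 3), z: sp.Rational(2, 3)}
assert f.subs(p) == 1                                  # p lies on the level set
assert df.subs(p) != sp.Matrix([[0, 0, 0]])

# So f^{-1}(1) = S^2 is a submanifold of R^3 of dimension m - n = 3 - 1 = 2.
# The only critical point of f is the origin, where f = 0 != 1:
crit = sp.solve([sp.diff(f, v) for v in (x, y, z)], (x, y, z))
assert crit == {x: 0, y: 0, z: 0}
```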
We want to end this part of the lecture by talking about critical points and values,
a favourite and much-used tool of analysis. We can use much of what we have learned
in this part here.
14.4. CRITICAL POINTS AND VALUES 255
Corollary 14.4.1
The set of all regular points is open, while the set of all critical points is
closed.
Example 14.4.1
so df |(0,0) = (0, 0). So the origin is a critical point. It is also, clearly, the
only critical point. The regular points are all of R2 \ {0}.
Figure 14.6: The function from this example (specifically, its contour
lines). As you can see, only near the origin does the level set fail
to be a manifold locally, which is a result of the function’s differential
not being surjective at that point.
There is a theorem, which is more related to analysis, but which also works well
in differential geometry. It tells us that ”almost all” values are regular values. For
this, we need to define ”almost all”.
This is quite weird if you think about it. We have not introduced volume yet,
since we don’t have a metric yet, but we can still define something as having ”zero
volume”. You might wonder if this makes sense and is independent of charts. It is!
As a remark, we say a set X has ”full measure” (i.e. the volume of the
manifold) if the complement M \ X has measure zero. With this, we can state the
theorem.
The proof is rather technical and we won’t do it here. The basic idea is that f
squishes the volume down a lot near every critical point, and you can sum over this
volume and take a limit to show that it is zero. This also means that almost every
f −1 (q) is a smooth (sub)manifold! This is quite a result. Almost every. Well, if
only life were this easy. Most often, you find that exactly at the q you are interested
in, it’s not.
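A worked instance of the regular/critical value dichotomy (the function f (x, y) = x2 − y 2 is our own choice for illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 - y**2

# Critical points: where df = (2x, -2y) vanishes -- only the origin.
crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], (x, y))
assert crit == {x: 0, y: 0}

# The single critical value f(0, 0) = 0 has measure zero in R, in line
# with the theorem: almost every q is a regular value.
assert f.subs({x: 0, y: 0}) == 0

# For a regular value like q = 1, f^{-1}(q) is the hyperbola x^2 - y^2 = 1,
# two smooth branches y = +-sqrt(x^2 - 1); for the critical value q = 0 the
# level set is the pair of lines y = +-x crossing at the origin: not a
# manifold exactly at the critical point.
level_1 = sp.solve(sp.Eq(f, 1), y)
assert set(level_1) == {sp.sqrt(x**2 - 1), -sp.sqrt(x**2 - 1)}
level_0 = sp.solve(sp.Eq(f, 0), y)
assert set(level_0) == {x, -x}
```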
Part IV
We start with a bit of a discussion of vector fields, a reminder of how they work
and a few things that will be useful throughout this part. We then go on to discuss
the problems with defining a derivative of one vector field in the direction of another. We
then stray away from this and talk about the Lie bracket, which, as we will see,
will play an important role in the development of a kind of derivative, called the Lie
derivative, at the end of this part.
15.1 Vectorfields
264 CHAPTER 15. LIE BRACKETS AND VECTORFIELDS
We call the set of all vector fields on a manifold M , Γ(T M ). This is not
C ∞ (T M ), because Γ(T M ) contains truly all vector fields, even discontinuous
ones. A vector field, in this sense, is simply a map:
X : M → T M, X(p) ∈ Tp M (15.1)
We can also define k-times differentiable (C k ) vector fields. This is akin to our definition
of smooth vector fields in the second part of the lecture.
This notion is well-defined because the overlap maps are all smooth, per the
definition of a smooth manifold, and so if the components X i are all C k in one chart, then
they will be C k in all other charts.
This is just the normal derivative we know and love from calculus in Rn . The
resulting object, DX Y , is another smooth vector field, and the map:
D· · : C ∞ (T Rn ) × C ∞ (T Rn ) → C ∞ (T Rn ) (15.8)
is bilinear. How can we try to get something like this on manifolds? The simplest
idea is just to use what we already know. We know how to calculate D^{Rn}_X Y . Why
not, you might ask, just calculate DX Y in a chart, take the components you
get, and use these to define the vector DX Y ? The problem is that you don’t get
something intrinsic to the manifold, i.e. the DX Y you calculate differs
from chart to chart! (That is, the abstract vector DX Y ∈ Tp M is a different one for
different charts; the components don’t transform like a vector should.) (Exercise:
verify this.)
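The exercise can be carried out symbolically. The sketch below (our own choice of fields X = ∂/∂x, Y = x ∂/∂x + y ∂/∂y and of the Cartesian/polar chart pair) shows the chart-wise derivative disagreeing between charts, while, as a preview of the next section, the combination XY − Y X does transform consistently:

```python
import sympy as sp

xs, ys = sp.symbols('x y')
r, th = sp.symbols('r theta', positive=True)

def naive_D(Xcomp, Ycomp, coords):
    # chart-wise "derivative": (D_X Y)^i = sum_j X^j * dY^i / d(coord^j)
    return sp.Matrix([sum(Xcomp[j] * sp.diff(Ycomp[i], coords[j])
                          for j in range(2)) for i in range(2)])

# X = d/dx, Y = x d/dx + y d/dy, written in Cartesian components:
Xc, Yc = sp.Matrix([1, 0]), sp.Matrix([xs, ys])
D_cart = naive_D(Xc, Yc, (xs, ys))                     # components (1, 0)

# Change to the polar chart x = r cos(theta), y = r sin(theta):
phi = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])
J = phi.jacobian([r, th])                              # d(x,y)/d(r,theta)
to_polar = lambda v: sp.simplify(J.inv() * v.subs({xs: phi[0], ys: phi[1]}))

Xp, Yp = to_polar(Xc), to_polar(Yc)    # Xp = (cos th, -sin th / r), Yp = (r, 0)

# Naive derivative computed in the polar chart, pushed back to Cartesian:
D_polar_in_cart = sp.simplify(J * naive_D(Xp, Yp, (r, th)))

# The two charts disagree -- the chart-wise derivative is not intrinsic:
assert D_cart == sp.Matrix([1, 0])
assert sp.simplify(D_polar_in_cart
                   - sp.Matrix([sp.cos(th)**2,
                                sp.sin(th) * sp.cos(th)])) == sp.Matrix([0, 0])

# The combination [X, Y]^i = X^j dY^i/dx^j - Y^j dX^i/dx^j, by contrast,
# gives the same vector in both charts:
bracket = lambda X, Y, c: naive_D(X, Y, c) - naive_D(Y, X, c)
B_cart = bracket(Xc, Yc, (xs, ys))                     # (1, 0) = d/dx
B_polar_in_cart = sp.simplify(J * bracket(Xp, Yp, (r, th)))
assert sp.simplify(B_polar_in_cart
                   - B_cart.subs({xs: phi[0], ys: phi[1]})) == sp.Matrix([0, 0])
```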
Proposition 15.3.1
Z · u = [XY − Y X] · u (15.9)

XY u = Σ_i X^i ∂/∂x^i ( Σ_j Y^j ∂u/∂x^j ) (15.10)
     = Σ_{i,j} X^i ( Y^j ∂²u/∂x^i∂x^j + (∂Y^j/∂x^i)(∂u/∂x^j) ) (15.11)

So:

[XY − Y X] · u = (15.12)
= Σ_{i,j} ( X^i Y^j ∂²u/∂x^i∂x^j + X^i (∂Y^j/∂x^i)(∂u/∂x^j) − Y^j (∂X^i/∂x^j)(∂u/∂x^i) − Y^j X^i ∂²u/∂x^j∂x^i ) (15.13)
= Σ_{i,j} ( X^i (∂Y^j/∂x^i)(∂u/∂x^j) − Y^j (∂X^i/∂x^j)(∂u/∂x^i) ) (15.14)

since the two second-derivative sums cancel (partial derivatives commute).
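The cancellation of the second-derivative terms, and the resulting first-order formula of eq. (15.14), can be checked symbolically on concrete fields (the fields below are our own illustrative choice; any smooth fields work):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
u = sp.Function('u')(x1, x2)

# Concrete vector fields X = x2 d/dx1, Y = x1 d/dx2:
X = [x2, sp.Integer(0)]
Y = [sp.Integer(0), x1]
coords = (x1, x2)

def apply_field(V, w):
    # V . w = sum_i V^i dw/dx^i
    return sum(V[i] * sp.diff(w, coords[i]) for i in range(2))

# XYu - YXu: the second derivatives of u cancel...
lhs = sp.expand(apply_field(X, apply_field(Y, u))
                - apply_field(Y, apply_field(X, u)))

# ...leaving the first-order operator whose components are eq. (15.14):
Z = [sum(X[i] * sp.diff(Y[j], coords[i]) - Y[i] * sp.diff(X[j], coords[i])
         for i in range(2)) for j in range(2)]
rhs = sp.expand(apply_field(Z, u))

assert sp.simplify(lhs - rhs) == 0
```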
15.3. LIE BRACKET 267
The proof of the first two is immediate; the Jacobi identity is quite straightforward,
you just spell everything out and see that everything cancels. It is long, though.
Before we do it, we will talk about what the Jacobi identity means. The Jacobi
identity, really, is just a way of saying that the bracket operation is not associative. To
see this, note that if [·, ·] were associative, then:
[X, [Y, Z]] − [[X, Y ], Z] = 0 (15.19)
since the placement of the brackets wouldn’t matter. Now write [X, [Y, Z]] = −[[Y, Z], X]. Then
the Jacobi identity tells us that:
[X, [Y, Z]] − [[X, Y ], Z] = −([[X, Y ], Z] + [[Y, Z], X]) = [[Z, X], Y ] (15.20)
and this is not zero in general! So the bracket is not associative.
Proof. The proof is not too hard, it just requires writing everything out. We
know:
[[X, Y ], Z] = [(XY −Y X), Z] = (XY −Y X)Z−Z(XY −Y X) = XY Z−Y XZ−ZXY +ZY X
(15.21)
Now we can permute X → Y , Y → Z, Z → X to get the other two terms.
[[Y, Z], X] = Y ZX − ZY X − XY Z + XZY (15.22)
and again:
[[Z, X], Y ] = ZXY − XZY − Y ZX + Y XZ (15.23)
Summing the three expansions:
XY Z − Y XZ − ZXY + ZY X + Y ZX − ZY X − XY Z + XZY + ZXY − XZY − Y ZX + Y XZ = 0
(15.24)
as you should verify.
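The verification can be delegated to a computer algebra system: treating X, Y, Z as noncommuting symbols mirrors the composition of the differential operators used in the proof.

```python
import sympy as sp

# X, Y, Z as noncommuting symbols, standing in for composed operators
X, Y, Z = sp.symbols('X Y Z', commutative=False)

def bracket(A, B):
    return A * B - B * A

# Each double bracket expands into four of the words XYZ, YXZ, ...,
# matching eq. (15.21):
assert sp.expand(bracket(bracket(X, Y), Z)) == X*Y*Z - Y*X*Z - Z*X*Y + Z*Y*X

# The cyclic sum of the three double brackets: all twelve words cancel.
jacobi = (bracket(bracket(X, Y), Z)
          + bracket(bracket(Y, Z), X)
          + bracket(bracket(Z, X), Y))
assert sp.expand(jacobi) == 0
```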
So you can pull out the functions, but you get error terms.
Alternatively, you can define locality so that if u1 |U = u2 |U then L(u1 )|U = L(u2 )|U .
You can repeat this definition not just for functions, but generally for everything, from
vectors to operators, etc.
Both (X, u) → X · u and (X, Y ) → [X, Y ] are local operators (exercise: prove
this).
15.5. LIE ALGEBRAS 269
For example, E(u) = ∂u/∂x + sin(∂²u/∂x²) is a differential operator. Directional
derivatives u → X · u are of course differential operators. It is also true that every
differential operator is a local operator, simply because partial derivatives are local.
A vector space with a bracket [·, ·] satisfying:
• Bilinearity
• Antisymmetry
• Jacobi Identity
is called a Lie Algebra.
Flows
272 CHAPTER 16. FLOWS
Chapter 17
Lie Derivative